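If you call masked_fill with ex_mask directly (where 0 marks padding), it fills the non-padding positions (those where ex_mask is 1) with 0 and leaves the padding untouched, which will lead to bad performance. The mask has to be inverted first, so that the positions to fill are the ones where ex_mask is 0.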
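To make the direction of the mask concrete, here is a minimal sketch in plain Python. The `masked_fill` helper below is a hypothetical stand-in written for illustration, not the PyTorch function; it only assumes the same convention, namely that positions where the mask is true get overwritten. The `tokens` values are made up.

```python
def masked_fill(values, mask, fill_value):
    """Illustrative stand-in: replace values[i] wherever mask[i] is truthy."""
    return [fill_value if m else v for v, m in zip(values, mask)]


tokens = [0.5, 1.2, -0.7, 0.9]  # hypothetical per-position values
ex_mask = [1, 1, 0, 0]          # 1 = real token, 0 = padding

# Passing ex_mask as-is overwrites the real tokens and keeps the padding.
wrong = masked_fill(tokens, ex_mask, 0.0)

# Inverting the mask first overwrites only the padding positions.
pad_mask = [m == 0 for m in ex_mask]
right = masked_fill(tokens, pad_mask, 0.0)

print(wrong)  # [0.0, 0.0, -0.7, 0.9]
print(right)  # [0.5, 1.2, 0.0, 0.0]
```

With a PyTorch tensor the same idea would presumably be written as `x.masked_fill(ex_mask == 0, 0)`, assuming `ex_mask` has a shape that broadcasts against `x`.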