[PATCH] D130919: [MRI] isConstantPhysReg should also check if the register is clobbered by a RegMask

Guozhi Wei via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 28 14:17:45 PDT 2022


Carrot added a comment.

In D130919#3891206 <https://reviews.llvm.org/D130919#3891206>, @foad wrote:

> Why are there so many changes in AMDGPU codegen, even in tests that do not contain function calls?

In a previous comment I mentioned I'm completely ignorant with AMDGPU, the instructions looks very different when compared with traditional CPU. So it's better to have an AMDGPU expert to have a look at these changes.

I tried to debug the simplest one wwm-reserved.ll. The difference comes from pass si-pre-allocate-wwm-regs, the related code is

  ...
  160B      dead $sgpr30_sgpr31 = SI_CALL %8:sreg_64, @called, <regmask $noreg $exec $exec_hi $exec_lo $flat_scr $flat_scr_hi $flat_scr_hi_ci $flat_scr_hi_vi $flat_scr_lo $flat_scr_lo_ci $flat_scr_lo_vi $flat_scr_ci $flat_scr_vi $fp_reg $lds_direct $mode $pc_reg $private_rsrc_reg $scc $sgpr_null $sgpr_null_hi $sp_reg $src_execz $src_pops_exiting_wave_id $src_private_base $src_private_limit $src_scc $src_shared_base $src_shared_limit $src_vccz $tba $tba_hi $tba_lo and 6831 more...>, implicit $sgpr0_sgpr1_sgpr2_sgpr3, implicit $vgpr0, implicit-def $vgpr0
  168B      %10:vgpr_32 = COPY $vgpr0
  ...

When this pass processing the COPY instruction, function SIPreAllocateWWMRegs::processDef tries to find a physical register for virtual register %10. It tries every physical register in class vgpr_32 to see if it fits.

  ...
    for (MCRegister PhysReg : RegClassInfo.getOrder(MRI->getRegClass(Reg))) {
      if (!MRI->isPhysRegUsed(PhysReg) &&                                                                             // Different result
          Matrix->checkInterference(LI, PhysReg) == LiveRegMatrix::IK_Free) {
        Matrix->assign(LI, PhysReg);
        assert(PhysReg != 0); 
        RegsToRewrite.push_back(Reg);
        return true;
      }   
    }
  ...

When PhysReg is $vgpr1, MRI->isPhysRegUsed(PhysReg) computes different result because of this patch. Function MRI->isPhysRegUsed checks regmask before register operands. But regmask information was not collected until very late, so at this time MRI doesn't have any regmask information, and there is no explicit $vgpr1 in the function, so MRI->isPhysRegUsed returns true for $vgpr1, and it is assigned to %10.

With this patch, MRI regmask information is updated whenever there is a regmask change, and recollected in reg-usage-propagation pass. So when $vgpr1 is passed to MRI->isPhysRegUsed, false is returned because $vgpr1 is used in @called and propagated to MRI in current function, it means $vgpr1 is not appropriate for %10, and next physical register $vgpr2 is tried.

Although the original behavior is correct for this specific function, but it is very dangerous. If we have another function call to @called inside the live range of %10, wrong code can be generated.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130919/new/

https://reviews.llvm.org/D130919



More information about the llvm-commits mailing list