[PATCH] D101405: [AMDGPU] Change FLAT SADDR to VADDR form in moveToVALU

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 30 02:46:45 PDT 2021


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:5027
+
+  MachineRegisterInfo &MRI = Inst.getParent()->getParent()->getRegInfo();
+  MachineOperand &SAddr = Inst.getOperand(OldSAddrIdx);
----------------
Use `Inst.getMF()` ?


================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:5050
+  if (OldVAddrIdx == NewVAddrIdx) {
+    assert(OldVAddrIdx == NewVAddrIdx);
+    MachineOperand &NewVAddr = Inst.getOperand(NewVAddrIdx);
----------------
Redundant assert, you've just tested this condition.


================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:5055-5056
+    Inst.RemoveOperand(OldSAddrIdx);
+    MRI.removeRegOperandFromUseList(&NewVAddr);
+    MRI.addRegOperandToUseList(&NewVAddr);
+  } else {
----------------
Are these last two lines really necessary? Hasn't MRI.moveOperands already handled this?


================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:5062
+
+  if (VAddrDef && MRI.use_nodbg_empty(VAddrDef->getOperand(0).getReg()))
+    VAddrDef->eraseFromParent();
----------------
VAddrDef can't be 0 here, that was already checked earlier.


================
Comment at: llvm/test/CodeGen/AMDGPU/move-load-addr-to-valu.mir:35
+      %3:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+      %4:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR %1, %3, 0, 0, implicit $exec
+      %2:sreg_64 = S_AND_B64 %1, 1, implicit-def $scc
----------------
rampitec wrote:
> arsenm wrote:
> > Can you also add an IR test where this matters? I'm not understanding why the DAG divergence analysis would let this happen
> The test case shall be more than 100 instructions long, this is the threshold for memory dependency analysis to give up on the "noclobber" check. The divergence analysis tells it is ok, it is uniform. But noclobber is not set, so the address registers are not known not to be clobberd and we are here.
> 
> It really happens in large basic blocks. Probably it can also happen if we just had no SALU instructions to do some computations, so ended up with VALU which was pripagated, but that is not the case I have started with. I.e. it is really uniform, and readfirstlane is valid, but we don't want to issue it.
> 
> I am not really sure we want a test of that size. It will be more than obscure in the first place.
You could test with -memdep-block-scan-limit=1 ?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101405/new/

https://reviews.llvm.org/D101405



More information about the llvm-commits mailing list