[PATCH] D101405: [AMDGPU] Change FLAT SADDR to VADDR form in moveToVALU

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 30 11:27:09 PDT 2021


rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:5055-5056
+    Inst.RemoveOperand(OldSAddrIdx);
+    MRI.removeRegOperandFromUseList(&NewVAddr);
+    MRI.addRegOperandToUseList(&NewVAddr);
+  } else {
----------------
foad wrote:
> Are these last two lines really necessary? Hasn't MRI.moveOperands already handled this?
It did not. I have added the comment.


================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:5062
+
+  if (VAddrDef && MRI.use_nodbg_empty(VAddrDef->getOperand(0).getReg()))
+    VAddrDef->eraseFromParent();
----------------
foad wrote:
> VAddrDef can't be 0 here, that was already checked earlier.
Right, this is a part of D101408. Removed from this patch.


================
Comment at: llvm/test/CodeGen/AMDGPU/move-load-addr-to-valu.mir:35
+      %3:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+      %4:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR %1, %3, 0, 0, implicit $exec
+      %2:sreg_64 = S_AND_B64 %1, 1, implicit-def $scc
----------------
foad wrote:
> rampitec wrote:
> > arsenm wrote:
> > > Can you also add an IR test where this matters? I'm not understanding why the DAG divergence analysis would let this happen
> > The test case shall be more than 100 instructions long, this is the threshold for memory dependency analysis to give up on the "noclobber" check. The divergence analysis tells it is ok, it is uniform. But noclobber is not set, so the address registers are not known not to be clobberd and we are here.
> > 
> > It really happens in large basic blocks. Probably it can also happen if we just had no SALU instructions to do some computations, so ended up with VALU which was pripagated, but that is not the case I have started with. I.e. it is really uniform, and readfirstlane is valid, but we don't want to issue it.
> > 
> > I am not really sure we want a test of that size. It will be more than obscure in the first place.
> You could test with -memdep-block-scan-limit=1 ?
I will check if I can produce a reasonable IR test.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101405/new/

https://reviews.llvm.org/D101405



More information about the llvm-commits mailing list