[PATCH] D44364: [AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE.

Michael Bedy via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Mar 11 12:05:38 PDT 2018


mjbedy added inline comments.


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:920-921
   // Copy dst, if it is present in original then should also be present in SDWA
   MachineOperand *Dst = TII->getNamedOperand(MI, AMDGPU::OpName::vdst);
+  auto DstIdx = AMDGPU::getNamedOperandIdx(SDWAOpcode, AMDGPU::OpName::vdst);
+
----------------
arsenm wrote:
> You should use either getNamedOperand or getNamedOperandIdx, not both
The Dst operand is for the original instruction (which in my current understanding could be for example V_MOV_B32_e32 or V_MOV_B32_sdwa) and the DstIdx is for the replacement instruction (V_MOV_B32_sdwa). Is it true that vdst will always be the same index for both _e32 and _sdwa versions?

Will TII->getNamedOperand(SDWAInst, AMDGPU::OpName::vdst) return NULL if the operand has not yet been assigned? My assumption was that it would return NULL, which is why I used an index here. I could change the existing code to use indexes as well.

Or perhaps I should have just called it SDWADstIdx to make it clearer that it's potentially not the same index as the operand pointed to by Dst.


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:930
     SDWAInst.add(*Dst);
+    DstIdx = AMDGPU::getNamedOperandIdx(SDWAOpcode, AMDGPU::OpName::sdst);
   } else {
----------------
arsenm wrote:
> Why is it interesting to use check sdst for this? You can't partially write the SGPR output?
> 
> I'm also not sure why the existing code treats sdst as an alternative to vdst. What about add with carry out?
The original version of this change was copying any tied operand no matter what, instead of limiting it to the UNUSED_PRESERVE case, which is why this line is here. I will remove this.

I believe add with carry out (and similar) has to use vcc if SDWA is used. VOPC can use SDWA with an sdst.

One thing I was curious about in this function: it seems to assume a certain order of source operands. (it calls SDWAInst.add() for everything, which seems to just append.) Is this typical?


Repository:
  rL LLVM

https://reviews.llvm.org/D44364





More information about the llvm-commits mailing list