[PATCH] D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands

David Stuttard via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 19 10:18:24 PDT 2019


dstuttard marked 9 inline comments as done.
dstuttard added inline comments.


================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:798-799
+        // normally
+        // count as using the constant bus twice - but in this case it is
+        // allowed as the lane selector doesn't count as a use of the constant
+        // bus). However, it is still required to abide by the 1 SGPR rule Apply
----------------
arsenm wrote:
> I think there's a missing word here; " as the lane selector and doesn't"
I think this might be a difference in language - so I've changed it to make it less confusing


================
Comment at: lib/Target/AMDGPU/SIFixSGPRCopies.cpp:816-818
+          // Check for trivially easy constant prop into one of the operands
+          // If this is the case then perform the operation now to resolve SGPR
+          // issue
----------------
arsenm wrote:
> Is this really necessary? Will SIFoldOperands not get this for some reason?
Yes, SIFoldOperands can replace the operand with the immediate, but the problem is that this code is forced to change one of the SGPR operands to M0 if it doesn't know if the other can be immediate.
If we just detect that an immediate will be folded, but don't do it, then the verification step will fail.

So, the simplest thing to do is detect trivial cases and change the operand to an immediate.
In some cases it is possible that it will not detect an immediate, replace the second SGPR with M0 and then the SIFoldOperands will replace the other SGPR operand with an immediate. This case is rare and functionally correct, just slightly less efficient.

I've reverted moving the foldToImm function from the SDWA peephole pass and implemented a simpler detection of immediates.


================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:6494
+  if (Op.isReg()) {
+    for (const MachineOperand &Def : MRI->def_operands(Op.getReg())) {
+      if (!isSameReg(Op, Def))
----------------
arsenm wrote:
> I would expect this to handle only a single vreg def
I reverted moving the foldToImm function to SIInstrInfo - it was simpler to write a new one that attempted to do less.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D51932/new/

https://reviews.llvm.org/D51932





More information about the llvm-commits mailing list