[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.

Thomas Symalla via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 18 07:12:15 PST 2022


tsymalla added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:326-330
+        if (VData && VData->getReg() == Reg)
+          return nullptr;
+
+        if (Addr && Addr->isReg() && Addr->getReg() == Reg)
+          return nullptr;
----------------
sebastian-ne wrote:
> I’m not sure what I thought before, but a buffer_store obviously only changes memory and not its register arguments, so it’s completely fine if a buffer_store is between the v_cmp and the s_and_saveexec. Checking modifiesRegister should be enough.
You are correct, any memory load or store will just update the contents behind the address stored in the VGPR. Thanks for that. I did not think thoroughly about this change.


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:356
+      TII->getNamedOperand(SaveExec, AMDGPU::OpName::src0);
+  if (!SaveExecSrc0->isReg() || SaveExecSrc0->getSubReg())
+    return nullptr;
----------------
sebastian-ne wrote:
> There’s quite a lot of these isReg and !getSubReg checks in this patch. What are they guarding? I replaced them by asserts and there was no amdgpu codegen test that failed. (Leaving aside the isReg test for v_cmp arguments, which can also be constants.)
Regarding the v_cmp arguments, these are constants as you mentioned so these checks make sense.
However, I think we can safely assume that the check for s_and_saveexec will always pass as the arguments are registers, so I'm going to remove these.


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:645
+        SaveExecInstr->eraseFromParent();
+        VCmpInstr->eraseFromParent();
+      }
----------------
sebastian-ne wrote:
> Not sure if this was already mentioned, but do you make sure that the v_cmp is not used anywhere outside the s_saveexec?
> 
> I guess it would still be beneficial to insert a v_cmpx if the v_cmp has more uses, but it shouldn’t be removed in that case.
No, I did not take this into account (it should mostly just write to VCC as this is checking sequences of s_and_saveexec and v_cmp).
I am going to add some more control flow for cases like this.

I don't really get your last comment though. Why would it beneficial to add the v_cmpx in addition to the v_cmp in cases like this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119696/new/

https://reviews.llvm.org/D119696



More information about the llvm-commits mailing list