[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.

Thomas Symalla via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 15 06:49:58 PST 2022


tsymalla marked 3 inline comments as done.
tsymalla added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:326
+      const Register &VCmpDest = I.getOperand(0).getReg();
+      if (MRI.hasOneUse(VCmpDest) && VCmpDest == Reg) {
+        VCmp = &I;
----------------
nhaehnle wrote:
> This doesn't work how you want it to work because after SSA elimination, if you need to iterate over defs then almost certainly you have multiple uses as well.
> 
> Do you have liveness information available and can you use it to get to the relevant defining instruction in a more targeted way?
> 
> Alternatively: this transform is really only helpful if the V_CMP and S_AND_SAVEEXEC are close together. If the V_CMP appears much earlier in the code (perhaps even in a different basic block!) then we shouldn't do the transform. So a better approach may be to do a bounded backwards walk through instructions. (This interacts with the point about checking for intermediate EXEC writes in a later comment.)
Working with backwards checking now, which should work. Thanks.


================
Comment at: llvm/lib/Target/AMDGPU/VOPCInstructions.td:222
+             VCMPXNoSDstTable<1, opName#"_e32">,
+             VCMPVCMPXTable<0, opName#"_e32"> {
     let Defs = !if(DefExec, [VCC, EXEC], [VCC]);
----------------
nhaehnle wrote:
> Shouldn't the is_vcmp field here logically be 1?
I renamed the variables. What I want to achieve is here is to map v_cmpx instructions to their v_cmp pendants in the mapping.


================
Comment at: llvm/lib/Target/AMDGPU/VOPCInstructions.td:259
+             VCMPXNoSDstTable<0, opName#"_e32">,
+             VCMPVCMPXTable<!ne(!subst("v_cmpx", "v_cmp", opName), opName), !subst("v_cmpx", "v_cmp", opName#"_e32")> {
     let Defs = [EXEC];
----------------
nhaehnle wrote:
> Why the complicated expression for is_vcmp? Shouldn't that just be 0?
> 
> More to the point, VOPC_Pseudo already derives from VCMPVCMPXTable, so this is rather dodgy. I'd suggest changing the table to have an is_cmpx field and using a `let` here to override the field's default value from the VOPC_Pseudo.
Using just 0 would not work as there are other VOPC instructions which would then be inserted into the result table, which is not wanted.

The idea is the following:

Evaluate if subst(op, v_cmpx, v_cmp) != v_cmpx which should yield true if the substitution succeeds. This is, if the instruction starts with v_cmpx. This would be way easier to do if TableGen would support string operations like "beginsWith". However, the way you proposed should also work, so I'm going to change this!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119696/new/

https://reviews.llvm.org/D119696



More information about the llvm-commits mailing list