[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.
Thomas Symalla via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 15 06:49:58 PST 2022
tsymalla marked 3 inline comments as done.
tsymalla added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:326
+ const Register &VCmpDest = I.getOperand(0).getReg();
+ if (MRI.hasOneUse(VCmpDest) && VCmpDest == Reg) {
+ VCmp = &I;
----------------
nhaehnle wrote:
> This doesn't work how you want it to work because after SSA elimination, if you need to iterate over defs then almost certainly you have multiple uses as well.
>
> Do you have liveness information available and can you use it to get to the relevant defining instruction in a more targeted way?
>
> Alternatively: this transform is really only helpful if the V_CMP and S_AND_SAVEEXEC are close together. If the V_CMP appears much earlier in the code (perhaps even in a different basic block!) then we shouldn't do the transform. So a better approach may be to do a bounded backwards walk through instructions. (This interacts with the point about checking for intermediate EXEC writes in a later comment.)
Working with backwards checking now, which should work. Thanks.
================
Comment at: llvm/lib/Target/AMDGPU/VOPCInstructions.td:222
+ VCMPXNoSDstTable<1, opName#"_e32">,
+ VCMPVCMPXTable<0, opName#"_e32"> {
let Defs = !if(DefExec, [VCC, EXEC], [VCC]);
----------------
nhaehnle wrote:
> Shouldn't the is_vcmp field here logically be 1?
I renamed the variables. What I want to achieve is here is to map v_cmpx instructions to their v_cmp pendants in the mapping.
================
Comment at: llvm/lib/Target/AMDGPU/VOPCInstructions.td:259
+ VCMPXNoSDstTable<0, opName#"_e32">,
+ VCMPVCMPXTable<!ne(!subst("v_cmpx", "v_cmp", opName), opName), !subst("v_cmpx", "v_cmp", opName#"_e32")> {
let Defs = [EXEC];
----------------
nhaehnle wrote:
> Why the complicated expression for is_vcmp? Shouldn't that just be 0?
>
> More to the point, VOPC_Pseudo already derives from VCMPVCMPXTable, so this is rather dodgy. I'd suggest changing the table to have an is_cmpx field and using a `let` here to override the field's default value from the VOPC_Pseudo.
Using just 0 would not work as there are other VOPC instructions which would then be inserted into the result table, which is not wanted.
The idea is the following:
Evaluate if subst(op, v_cmpx, v_cmp) != v_cmpx which should yield true if the substitution succeeds. This is, if the instruction starts with v_cmpx. This would be way easier to do if TableGen would support string operations like "beginsWith". However, the way you proposed should also work, so I'm going to change this!
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119696/new/
https://reviews.llvm.org/D119696
More information about the llvm-commits
mailing list