[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.

Sebastian Neubauer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 22 02:25:38 PST 2022


sebastian-ne added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:362-363
+  assert(VCmpDest && "Should have an sdst operand!");
+  if (isLiveOut(*VCmp->getParent(), VCmpDest->getReg()))
+    return nullptr;
+  
----------------
As far as I recall, the stall between v_cmp and s_and_saveexec is quite long, so
```
v_cmp
s_mov
v_cmpx
```
is probably faster than
```
v_cmp
s_and_saveexec
```
and it’s worth to do this transformation and keep the v_cmp around.
@foad?


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:356
+      TII->getNamedOperand(SaveExec, AMDGPU::OpName::src0);
+  if (!SaveExecSrc0->isReg() || SaveExecSrc0->getSubReg())
+    return nullptr;
----------------
tsymalla wrote:
> sebastian-ne wrote:
> > There’s quite a lot of these isReg and !getSubReg checks in this patch. What are they guarding? I replaced them by asserts and there was no amdgpu codegen test that failed. (Leaving aside the isReg test for v_cmp arguments, which can also be constants.)
> Regarding the v_cmp arguments, these are constants as you mentioned so these checks make sense.
> However, I think we can safely assume that the check for s_and_saveexec will always pass as the arguments are registers, so I'm going to remove these.
Did you remove any of these checks? I still see a lot of isReg and getSubReg checks


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119696/new/

https://reviews.llvm.org/D119696



More information about the llvm-commits mailing list