[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.
Nicolai Hähnle via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 22 05:40:29 PST 2022
nhaehnle added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:362-363
+ assert(VCmpDest && "Should have an sdst operand!");
+ if (isLiveOut(*VCmp->getParent(), VCmpDest->getReg()))
+ return nullptr;
+
----------------
sebastian-ne wrote:
> As far as I recall, the stall between v_cmp and s_and_saveexec is quite long, so
> ```
> v_cmp
> s_mov
> v_cmpx
> ```
> is probably faster than
> ```
> v_cmp
> s_and_saveexec
> ```
> and it’s worth doing this transformation and keeping the v_cmp around.
> @foad?
IIRC the VALU->SALU stall is a simple conservative stall based on outstanding SGPR writes from the VALU. The v_cmp -> s_mov in the first snippet would already have the same stall as v_cmp -> s_and_saveexec, whether it uses the result of the v_cmp or not. So it's better to choose the code sequence with s_and_saveexec.
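To make the argument concrete, here is a minimal toy model of the rule as described above. It is only a sketch under stated assumptions: the `Inst` struct, the `countSaluStalls` helper and the one-stall-drains-everything behaviour are hypothetical illustrations, not code from this patch and not a description of the real GFX10.3 hardware.

```cpp
#include <string>
#include <vector>

// Toy model only. All names and rules below are illustrative assumptions.
enum class Unit { VALU, SALU };

struct Inst {
  std::string Name;
  Unit Exec;       // Unit that executes the instruction.
  bool WritesSGPR; // True if a VALU instruction writes an SGPR (e.g. v_cmp sdst).
                   // The exec write of v_cmpx is deliberately not modelled.
};

// Counts SALU instructions that stall under the assumed conservative rule:
// an SALU instruction stalls whenever any VALU SGPR write is outstanding,
// whether or not it reads that SGPR. One stall is assumed to drain all
// outstanding writes.
int countSaluStalls(const std::vector<Inst> &Seq) {
  bool Outstanding = false;
  int Stalls = 0;
  for (const Inst &I : Seq) {
    if (I.Exec == Unit::SALU && Outstanding) {
      ++Stalls;
      Outstanding = false;
    }
    if (I.Exec == Unit::VALU && I.WritesSGPR)
      Outstanding = true;
  }
  return Stalls;
}

// First snippet from the quoted comment: v_cmp, s_mov, v_cmpx.
std::vector<Inst> withSMovAndVCmpx() {
  return {{"v_cmp", Unit::VALU, true},
          {"s_mov", Unit::SALU, false},
          {"v_cmpx", Unit::VALU, false}};
}

// Second snippet from the quoted comment: v_cmp, s_and_saveexec.
std::vector<Inst> withSAndSaveexec() {
  return {{"v_cmp", Unit::VALU, true},
          {"s_and_saveexec", Unit::SALU, false}};
}
```

Under these assumptions, `countSaluStalls` reports one stall for each sequence, so the stall count does not separate them and the two-instruction s_and_saveexec form is simply shorter.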
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119696/new/
https://reviews.llvm.org/D119696