[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 22 05:40:29 PST 2022


nhaehnle added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp:362-363
+  assert(VCmpDest && "Should have an sdst operand!");
+  if (isLiveOut(*VCmp->getParent(), VCmpDest->getReg()))
+    return nullptr;
+  
----------------
sebastian-ne wrote:
> As far as I recall, the stall between v_cmp and s_and_saveexec is quite long, so
> ```
> v_cmp
> s_mov
> v_cmpx
> ```
> is probably faster than
> ```
> v_cmp
> s_and_saveexec
> ```
> and it’s worth doing this transformation and keeping the v_cmp around.
> @foad?

IIRC the VALU->SALU stall is a simple conservative stall based on outstanding SGPR writes from the VALU. The v_cmp -> s_mov in the first snippet would already have the same stall as v_cmp -> s_and_saveexec, whether it uses the result of the v_cmp or not. So it's better to choose the code sequence with s_and_saveexec.
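
To make the argument concrete, here is a toy cost model of the two sequences. It is only a sketch under the assumption stated above (any SALU instruction issued after a VALU SGPR write pays the stall, whether or not it reads that SGPR). The names `Inst`, `estimateCycles`, `keepVCmpSequence` and `saveexecSequence` are hypothetical, and the one-cycle issue cost and the stall length are made-up parameters, not measured GFX10.3 numbers.

```cpp
#include <cassert>
#include <vector>

// Toy model only: unit kinds and cycle counts are illustrative assumptions,
// not measured hardware behaviour.
enum class Unit { VALU, SALU };

struct Inst {
  const char *Name;
  Unit U;
  bool WritesSGPR; // true if the instruction writes an SGPR (or exec)
};

// Each instruction costs one issue cycle. An SALU instruction issued while a
// VALU SGPR write is outstanding pays StallCycles, regardless of whether it
// reads that SGPR (the "conservative" part of the assumption).
int estimateCycles(const std::vector<Inst> &Seq, int StallCycles) {
  assert(StallCycles >= 0 && "stall must be non-negative");
  int Cycles = 0;
  bool OutstandingVALUSGPRWrite = false;
  for (const Inst &I : Seq) {
    if (I.U == Unit::SALU && OutstandingVALUSGPRWrite) {
      Cycles += StallCycles;
      OutstandingVALUSGPRWrite = false; // the wait drains the pending write
    }
    Cycles += 1;
    if (I.U == Unit::VALU && I.WritesSGPR)
      OutstandingVALUSGPRWrite = true;
  }
  return Cycles;
}

// First snippet: keep the v_cmp, then s_mov + v_cmpx.
std::vector<Inst> keepVCmpSequence() {
  return {{"v_cmp", Unit::VALU, true},
          {"s_mov", Unit::SALU, true},
          {"v_cmpx", Unit::VALU, true}};
}

// Second snippet: v_cmp followed by s_and_saveexec.
std::vector<Inst> saveexecSequence() {
  return {{"v_cmp", Unit::VALU, true},
          {"s_and_saveexec", Unit::SALU, true}};
}
```

In this model both sequences pay the stall exactly once (at s_mov or at s_and_saveexec), so for any stall length the first sequence is one instruction longer and never faster.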


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119696/new/

https://reviews.llvm.org/D119696


