[PATCH] D123367: [AMDGPU] Increase detection range for s_mov, v_cmpx transformation.

Thomas Symalla via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 8 00:54:40 PDT 2022


tsymalla created this revision.
tsymalla added reviewers: foad, sebastian-ne.
Herald added subscribers: hsmhsm, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl, arsenm.
Herald added a project: All.
tsymalla requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

We found that it can be beneficial for the SIOptimizeExecMasking pass
to detect more cases where v_cmp, s_and_saveexec patterns can be
transformed to s_mov, v_cmpx patterns. The search range for finding a
fitting v_cmp instruction is currently 5 instructions; this patch
doubles it to 10.
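
To illustrate why the limit matters, here is a rough, standalone sketch
of a bounded backward search. It is not the code in
SIOptimizeExecMasking.cpp: the names Instr, findBackwards and
makeExampleBlock are hypothetical, instructions are modelled as plain
opcode strings, and the register-clobber checks driven by
NonModifiableRegs are left out entirely. The exact way the real helper
counts iterations is also an assumption here.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction: only an opcode name.
struct Instr {
  std::string Opcode;
};

// Walk backwards from the instruction at OriginIdx and inspect at most
// MaxInstructions earlier instructions. Returns the index of the first
// instruction that satisfies Pred, or -1 if none is found in range.
// Assumption: every inspected instruction counts towards the limit.
int findBackwards(const std::vector<Instr> &Block, int OriginIdx,
                  const std::function<bool(const Instr &)> &Pred,
                  unsigned MaxInstructions) {
  unsigned Inspected = 0;
  for (int I = OriginIdx - 1; I >= 0 && Inspected < MaxInstructions; --I) {
    ++Inspected;
    if (Pred(Block[I]))
      return I;
  }
  return -1;
}

// Mirrors the shape of the new test case: one v_cmp, six unrelated
// instructions, then the instruction the search starts from.
std::vector<Instr> makeExampleBlock() {
  std::vector<Instr> Block;
  Block.push_back({"V_CMP_LT_F32_e64"});
  for (int I = 0; I < 6; ++I)
    Block.push_back({"V_WRITELANE_B32"});
  Block.push_back({"S_AND_SAVEEXEC_B64"});
  return Block;
}
```

In this model, a search starting at index 7 with a limit of 5 stops
before it reaches the v_cmp at index 0, while a limit of 10 finds it.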


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D123367

Files:
  llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
  llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx.mir


Index: llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx.mir
+++ llvm/test/CodeGen/AMDGPU/vcmp-saveexec-to-vcmpx.mir
@@ -1,6 +1,5 @@
 # RUN: llc -march=amdgcn -mcpu=gfx1010 -mattr=-wavefrontsize32,+wavefrontsize64 -run-pass=si-optimize-exec-masking -verify-machineinstrs %s -o - | FileCheck --check-prefixes=GCN,GFX1010 %s
 # RUN: llc -march=amdgcn -mcpu=gfx1030 -mattr=-wavefrontsize32,+wavefrontsize64 -run-pass=si-optimize-exec-masking -verify-machineinstrs %s -o - | FileCheck --check-prefixes=GCN,GFX1030 %s
-
 ---
 
 # After the Optimize exec masking (post-RA) pass, there's a change of having v_cmpx instructions
@@ -62,3 +61,29 @@
     $sgpr2_sgpr3 = COPY $exec, implicit-def $exec
     $sgpr2_sgpr3 = S_AND_B64 killed renamable $sgpr2_sgpr3, killed renamable $sgpr0_sgpr1, implicit-def dead $scc
     $exec = S_MOV_B64_term killed renamable $sgpr2_sgpr3
+...
+
+---
+
+# Check if the sequence will be optimized even with more than 5 (unrelated) instructions in between the v_cmp and s_and_saveexec.
+
+# GCN-LABEL: name: vcmp_saveexec_to_mov_vcmpx_check_many_instrs
+# GFX1010: V_CMP_LT_F32_e64
+# GFX1010: S_AND_SAVEEXEC_B64
+# GFX1030: S_MOV_B64
+# GFX1030: V_CMPX_LT_F32_nosdst_e64 0, 953267991, 2
+name: vcmp_saveexec_to_mov_vcmpx_check_many_instrs
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $vgpr0, $sgpr2, $vgpr1
+    renamable $sgpr0_sgpr1 = V_CMP_LT_F32_e64 0, 953267991, 2, $vgpr0, 0, implicit $mode, implicit $exec
+    $vgpr1 = V_WRITELANE_B32 0, $sgpr2, $vgpr1
+    $vgpr1 = V_WRITELANE_B32 0, $sgpr2, $vgpr1
+    $vgpr1 = V_WRITELANE_B32 0, $sgpr2, $vgpr1
+    $vgpr1 = V_WRITELANE_B32 0, $sgpr2, $vgpr1
+    $vgpr1 = V_WRITELANE_B32 0, $sgpr2, $vgpr1
+    $vgpr1 = V_WRITELANE_B32 0, $sgpr2, $vgpr1
+    $sgpr2_sgpr3 = COPY $exec, implicit-def $exec
+    $sgpr2_sgpr3 = S_AND_B64 killed renamable $sgpr2_sgpr3, killed renamable $sgpr0_sgpr1, implicit-def dead $scc
+    $exec = S_MOV_B64_term killed renamable $sgpr2_sgpr3
Index: llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
+++ llvm/lib/Target/AMDGPU/SIOptimizeExecMasking.cpp
@@ -302,7 +302,7 @@
 findInstrBackwards(MachineInstr &Origin,
                    std::function<bool(MachineInstr *)> Pred,
                    ArrayRef<MCRegister> NonModifiableRegs,
-                   const SIRegisterInfo *TRI, unsigned MaxInstructions = 5) {
+                   const SIRegisterInfo *TRI, unsigned MaxInstructions = 10) {
   MachineBasicBlock::reverse_iterator A = Origin.getReverseIterator(),
                                       E = Origin.getParent()->rend();
   unsigned CurrentIteration = 0;


