[PATCH] D146131: [AMDGPU] Avoid constant bus limitation on V_BFE GISel pattern

Wed Mar 15 05:28:47 PDT 2023

Pierre-vh created this revision.
Pierre-vh added a reviewer: arsenm.
Herald added subscribers: kosarev, foad, kerbowa, hiraditya, tpr, dstuttard, yaxunl, jvesely, kzhuravl.
Herald added a project: All.
Pierre-vh requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

For D141247 <https://reviews.llvm.org/D141247>: when that pattern is selected by GISel, the constant bus limitation is exceeded in some cases because the offset and width operands are materialized with S_MOV instead of V_MOV.

The V_MOVs will likely get scalarized (or their values will simply become immediate operands) at some point later in the pipeline anyway. There are no regressions in any codegen test.
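
As a rough illustration of why the S_MOV operands are a problem, here is a simplified model of the constant bus check. It is only a sketch: the Operand class, the helper names, and the default limit of 1 are assumptions made for illustration (the real limit depends on the subtarget, and the real verifier has more rules than this), not code from the AMDGPU backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    """Hypothetical operand description, not an LLVM type."""
    kind: str  # "vgpr", "sgpr", "inline_imm", or "literal"
    name: str  # register name or the constant's text


def constant_bus_uses(operands):
    """Count distinct constant bus reads.

    Simplified rule (assumption): each unique SGPR or literal constant
    costs one read; VGPRs and inline immediates cost nothing.
    """
    seen = set()
    for op in operands:
        if op.kind in ("sgpr", "literal"):
            seen.add((op.kind, op.name))
    return len(seen)


def fits_constant_bus(operands, limit=1):
    """True if the instruction stays within the assumed limit."""
    return constant_bus_uses(operands) <= limit


src = Operand("vgpr", "%src")

# Old pattern: offset and width each come from a separate S_MOV_B32 result.
old_pattern = [src, Operand("sgpr", "%s_mov_0"), Operand("sgpr", "%s_mov_16")]

# New pattern: offset and width come from V_MOV_B32 results instead.
new_pattern = [src, Operand("vgpr", "%v_mov_0"), Operand("vgpr", "%v_mov_16")]

print(fits_constant_bus(old_pattern))  # two distinct SGPR reads
print(fits_constant_bus(new_pattern))  # no constant bus reads
```

With two distinct SGPR sources the old pattern needs two constant bus reads, which does not fit where only one is allowed; the V_MOV form needs none.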


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D146131

Files:
  llvm/lib/Target/AMDGPU/VOP3Instructions.td
  llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sext.mir
  llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zext.mir


Index: llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zext.mir
===================================================================

--- llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zext.mir
+++ llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zext.mir
@@ -165,7 +165,9 @@
     ; GCN-NEXT: {{  $}}
     ; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
     ; GCN-NEXT: [[V_AND_B32_e32_:%[0-9]+]]:vgpr_32 = V_AND_B32_e32 1, [[COPY]], implicit $exec
-    ; GCN-NEXT: [[V_BFE_I32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_I32_e64 [[V_AND_B32_e32_]], 0, 16, implicit $exec
+    ; GCN-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 16, implicit $exec
+    ; GCN-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+    ; GCN-NEXT: [[V_BFE_I32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_I32_e64 [[V_AND_B32_e32_]], [[V_MOV_B32_e32_1]], [[V_MOV_B32_e32_]], implicit $exec
     ; GCN-NEXT: $vgpr0 = COPY [[V_BFE_I32_e64_]]
     %0:vgpr(s32) = COPY $vgpr0
     %1:vgpr(s1) = G_TRUNC %0
Index: llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sext.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sext.mir
+++ llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sext.mir
@@ -208,7 +208,9 @@
     ; GCN: liveins: $vgpr0
     ; GCN-NEXT: {{  $}}
     ; GCN-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
-    ; GCN-NEXT: [[V_BFE_I32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_I32_e64 [[COPY]], 0, 16, implicit $exec
+    ; GCN-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 16, implicit $exec
+    ; GCN-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+    ; GCN-NEXT: [[V_BFE_I32_e64_:%[0-9]+]]:vgpr_32 = V_BFE_I32_e64 [[COPY]], [[V_MOV_B32_e32_1]], [[V_MOV_B32_e32_]], implicit $exec
     ; GCN-NEXT: $vgpr0 = COPY [[V_BFE_I32_e64_]]
     %0:vgpr(s32) = COPY $vgpr0
     %1:vgpr(s16) = G_TRUNC %0
Index: llvm/lib/Target/AMDGPU/VOP3Instructions.td
===================================================================
--- llvm/lib/Target/AMDGPU/VOP3Instructions.td
+++ llvm/lib/Target/AMDGPU/VOP3Instructions.td
@@ -263,7 +263,7 @@
 
 def : GCNPat<
   (i32 (DivergentUnaryFrag<sext> i16:$src)),
-  (i32 (V_BFE_I32_e64 $src, (S_MOV_B32 (i32 0)), (S_MOV_B32 (i32 0x10))))
+  (i32 (V_BFE_I32_e64 i16:$src, (V_MOV_B32_e32 (i32 0)), (V_MOV_B32_e32 (i32 0x10))))
 >;
 
 let isReMaterializable = 1 in {

