[PATCH] D118461: [AMDGPU] Introduce new ISel combine for trunc-srl patterns
Thomas Symalla via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 28 05:32:50 PST 2022
tsymalla created this revision.
Herald added subscribers: foad, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, jvesely, kzhuravl, arsenm.
tsymalla requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.
In some cases, when selecting a (trunc (srl)) pattern, the srl gets translated
to a v_lshrrev_b32_e64 instruction, whereas the truncation gets selected to
a sequence of v_and_b32_e64 and v_cmp_eq_u32_e64. In the final ISA, this appears
as selecting the nth bit:
  v_lshrrev_b32_e32 v0, 2, v1
  v_and_b32_e32 v0, 1, v0
  v_cmp_eq_u32_e32 vcc_lo, 1, v0
However, when the shift amount of the right shift is known at compile time, the
whole sequence can be reduced to two VALU instructions by adjusting the constant
operand of both the v_and and the v_cmp_eq to (1 << lshrrev_operand):
  v_and_b32_e32 v0, (1 << 2), v1
  v_cmp_eq_u32_e32 vcc_lo, (1 << 2), v0
In the example above, the following pseudo-code:
  v0 = (v1 >> 2)
  v0 = v0 & 1
  vcc_lo = (v0 == 1)
would be translated to:
  v0 = v1 & 0b100
  vcc_lo = (v0 == 0b100)
which should yield an equivalent result.
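
The equivalence claimed above can be checked outside the compiler. The following
is a minimal C++ sketch, not part of the patch; the function names
bitSetViaShift, bitSetViaMask and formsAgree are hypothetical and only model the
two pseudo-code sequences on 32-bit unsigned values:

```cpp
#include <cstdint>

// Models the original three-instruction form (hypothetical helper):
// shift right by n, mask with 1, compare with 1.
bool bitSetViaShift(uint32_t v, unsigned n) {
  uint32_t t = v >> n;
  t &= 1u;
  return t == 1u;
}

// Models the combined two-instruction form (hypothetical helper):
// mask with (1 << n), compare with (1 << n).
bool bitSetViaMask(uint32_t v, unsigned n) {
  const uint32_t mask = 1u << n;
  return (v & mask) == mask;
}

// Compares both forms for every shift amount in [0, 32) over a few
// sample values; returns false on the first disagreement.
bool formsAgree() {
  const uint32_t samples[] = {0u,      1u,          4u,          5u,
                              0x8000u, 0xFFFFu,     0x80000000u, 0xFFFFFFFFu,
                              0xDEADBEEFu};
  for (uint32_t v : samples)
    for (unsigned n = 0; n < 32; ++n)
      if (bitSetViaShift(v, n) != bitSetViaMask(v, n))
        return false;
  return true;
}
```

Both forms test exactly bit n of the input, so they agree for any n in [0, 32);
the sample values are only a spot check, not an exhaustive proof.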
This is a little bit hard to test as one needs to force the SelectionDAG to
contain the nodes before instruction selection, but the test sequence was
roughly derived from a production shader.
To avoid additional VGPR pressure from the shifted constant, this pattern only
applies when the shift amount of the srl is greater than 0 and less than 16, as
it was observed that for (1 << 16) an additional VGPR was used to hold the
constant value.
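
As a rough illustration of this restriction, the sketch below mirrors the range
check and the constant transform from the patch (IMMBitSelRange and
IMMBitSelConst) as plain C++. The names inBitSelRange and bitSelMaskOrZero are
hypothetical, and returning 0 for "pattern does not apply" is a convention of
this sketch only:

```cpp
#include <cstdint>

// Mirrors the IMMBitSelRange predicate from the patch:
// only shift amounts with 0 < shift < 16 are accepted.
bool inBitSelRange(int64_t shift) { return shift > 0 && shift < 16; }

// Mirrors IMMBitSelConst: turns the shift amount into the mask (1 << shift)
// used by both the AND and the compare. Returns 0 when the shift amount is
// outside the accepted range (0 is never a valid mask here).
uint32_t bitSelMaskOrZero(int64_t shift) {
  if (!inBitSelRange(shift))
    return 0u;
  return uint32_t{1} << shift;
}
```

With this sketch, a shift amount of 2 gives the mask 4 (0b100) from the example
above, while 0 and 16 are rejected. A shift amount of 0 is presumably left to
the existing trunc pattern, which already uses the constant 1.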
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D118461
Files:
llvm/lib/Target/AMDGPU/SIInstructions.td
llvm/test/CodeGen/AMDGPU/dagcombine-lshr-and-cmp.ll
Index: llvm/test/CodeGen/AMDGPU/dagcombine-lshr-and-cmp.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/dagcombine-lshr-and-cmp.ll
@@ -0,0 +1,20 @@
+; RUN: llc -march=amdgcn -mtriple=amdgcn-- -stop-after=amdgpu-isel -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s
+
+; GCN-LABEL: bb.0.entry:
+; GCN-NOT: V_LSHRREV_B32_e64
+; GCN: V_AND_B32_e64 2
+; GCN: V_CMP_EQ_U32_e64 killed {{.*}}, 2
+define i32 @opt_lshr_and_cmp(i32 %x) {
+entry:
+ %0 = and i32 %x, 2
+ %1 = icmp eq i32 %0, 0
+ %2 = xor i1 %1, -1
+ br i1 %2, label %out.true, label %out.else
+
+out.true:
+ %3 = shl i32 %x, 2
+ ret i32 %3
+
+out.else:
+ ret i32 %x
+}
Index: llvm/lib/Target/AMDGPU/SIInstructions.td
===================================================================
--- llvm/lib/Target/AMDGPU/SIInstructions.td
+++ llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -2269,6 +2269,31 @@
(V_CMP_EQ_U32_e64 (V_AND_B32_e64 (i32 1), $a), (i32 1))
>;
+// Restrict the range to prevent using an additional VGPR
+// for the shifted value.
+def IMMBitSelRange : ImmLeaf <i32, [{
+ return Imm > 0 && Imm < 16;
+}]>;
+
+def IMMBitSelConst : SDNodeXForm<imm, [{
+ return CurDAG->getTargetConstant((1 << N->getZExtValue()), SDLoc(N),
+ MVT::i32);
+}]>;
+
+// Selecting SRL and TRUNC separately when the TRUNC consumes
+// the SRL result generates three instructions. However, when
+// the shift amount is a constant, the shift can be folded into
+// the mask of the V_AND_B32_e64 and the immediate of the
+// V_CMP_EQ_U32_e64, so no V_LSHRREV_B32_e64 is needed:
+// (i1 (trunc (srl i32 $a, imm $b))) ->
+// v_and_b32_e64 tmp, (1 << $b), $a
+// v_cmp_eq_u32_e64 vcc, (1 << $b), tmp
+def : GCNPat <
+ (i1 (trunc (i32 (srl i32:$a, IMMBitSelRange:$b)))),
+ (V_CMP_EQ_U32_e64 (V_AND_B32_e64 (i32 (IMMBitSelConst $b)), $a),
+ (i32 (IMMBitSelConst $b)))
+>;
+
def : GCNPat <
(i1 (DivergentUnaryFrag<trunc> i64:$a)),
(V_CMP_EQ_U32_e64 (V_AND_B32_e64 (i32 1),