[llvm] [SLP] Make getSameOpcode support interchangeable instructions. (PR #127450)

Han-Kuan Chen via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 24 06:21:28 PST 2025


================
@@ -5,11 +5,19 @@ define i64 @test() {
 ; CHECK-LABEL: define i64 @test() {
 ; CHECK-NEXT:  [[ENTRY:.*:]]
 ; CHECK-NEXT:    [[OR54_I_I_6:%.*]] = or i32 0, 0
-; CHECK-NEXT:    [[TMP0:%.*]] = insertelement <16 x i32> poison, i32 [[OR54_I_I_6]], i32 8
-; CHECK-NEXT:    [[TMP1:%.*]] = call <16 x i32> @llvm.vector.insert.v16i32.v8i32(<16 x i32> [[TMP0]], <8 x i32> zeroinitializer, i64 0)
-; CHECK-NEXT:    [[TMP2:%.*]] = shufflevector <16 x i32> [[TMP1]], <16 x i32> poison, <16 x i32> <i32 0, i32 0, i32 1, i32 1, i32 2, i32 2, i32 3, i32 3, i32 4, i32 4, i32 5, i32 5, i32 6, i32 7, i32 7, i32 8>
-; CHECK-NEXT:    [[TMP3:%.*]] = zext <16 x i32> [[TMP2]] to <16 x i64>
-; CHECK-NEXT:    [[TMP4:%.*]] = call i64 @llvm.vector.reduce.or.v16i64(<16 x i64> [[TMP3]])
+; CHECK-NEXT:    [[CONV193_1_I_6:%.*]] = zext i32 [[OR54_I_I_6]] to i64
+; CHECK-NEXT:    [[CONV193_I_7:%.*]] = zext i32 0 to i64
+; CHECK-NEXT:    [[TMP0:%.*]] = call <4 x i64> @llvm.vector.extract.v4i64.v8i64(<8 x i64> zeroinitializer, i64 0)
+; CHECK-NEXT:    [[RDX_OP:%.*]] = or <4 x i64> [[TMP0]], zeroinitializer
+; CHECK-NEXT:    [[TMP1:%.*]] = call <8 x i64> @llvm.vector.insert.v8i64.v4i64(<8 x i64> zeroinitializer, <4 x i64> [[RDX_OP]], i64 0)
+; CHECK-NEXT:    [[OP_RDX:%.*]] = call i64 @llvm.vector.reduce.or.v8i64(<8 x i64> [[TMP1]])
----------------
HanKuanChen wrote:

before this PR, SLP are trying to vectorize VL with size 12.
```%xor148.2.i = xor i32 0, 0
%xor148.2.i.1 = xor i32 0, 0
%xor148.2.i.2 = xor i32 0, 0
%xor148.2.i.3 = xor i32 0, 0
%xor148.2.i.4 = xor i32 0, 0
%xor148.2.i.5 = xor i32 0, 0
%xor148.2.i.6 = xor i32 0, 0
%xor148.2.i.7 = xor i32 0, 0
%or54.i.i.6 = or i32 %xor148.2.i.6, 0
i32 poison
i32 poison
i32 poison
```
the code has a check like this
```
      if (VL.size() <= 2 || LoadEntriesToVectorize.contains(Idx) ||
          !(!E.hasState() || E.getOpcode() == Instruction::Load ||
            E.isAltShuffle() || !allSameBlock(VL)) ||
          allConstant(VL) || isSplat(VL))
        continue;
```
VL is alternate shuffle (xor and or) and SLP will try to combine from VL[0] to VL[7]. (`CombinedEntriesWithIndices`)
VL is NOT alternate shuffle when this PR is applied.

https://github.com/llvm/llvm-project/pull/127450


More information about the llvm-commits mailing list