[llvm] [SLP] Check the Operands of Copyable elements as well in getBestOperand() (PR #182443)
Ryan Buchner via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 9 09:36:50 PDT 2026
================
@@ -357,21 +340,14 @@ entry:
define void @test_add_udiv_sub_commuted(ptr %arr1, ptr %arr2, i32 %a0, i32 %a1, i32 %a2, i32 %a3) {
; CHECK-LABEL: @test_add_udiv_sub_commuted(
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[GEP1_2:%.*]] = getelementptr i32, ptr [[ARR1:%.*]], i32 2
-; CHECK-NEXT: [[GEP1_3:%.*]] = getelementptr i32, ptr [[ARR1]], i32 3
-; CHECK-NEXT: [[V2:%.*]] = load i32, ptr [[GEP1_2]], align 4
-; CHECK-NEXT: [[V3:%.*]] = load i32, ptr [[GEP1_3]], align 4
-; CHECK-NEXT: [[Y2:%.*]] = sub i32 [[A2:%.*]], 42
-; CHECK-NEXT: [[TMP0:%.*]] = load <2 x i32>, ptr [[ARR1]], align 4
-; CHECK-NEXT: [[RES2:%.*]] = udiv i32 [[V2]], [[Y2]]
-; CHECK-NEXT: [[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[A0:%.*]], i32 0
+; CHECK-NEXT: [[TMP0:%.*]] = load <4 x i32>, ptr [[ARR1:%.*]], align 4
+; CHECK-NEXT: [[TMP4:%.*]] = insertelement <4 x i32> <i32 1, i32 1, i32 poison, i32 1>, i32 [[A2:%.*]], i32 2
+; CHECK-NEXT: [[TMP6:%.*]] = sub <4 x i32> [[TMP4]], <i32 0, i32 0, i32 42, i32 0>
+; CHECK-NEXT: [[TMP1:%.*]] = insertelement <4 x i32> <i32 poison, i32 poison, i32 0, i32 poison>, i32 [[A0:%.*]], i32 0
; CHECK-NEXT: [[TMP2:%.*]] = insertelement <4 x i32> [[TMP1]], i32 [[A1:%.*]], i32 1
; CHECK-NEXT: [[TMP3:%.*]] = insertelement <4 x i32> [[TMP2]], i32 [[A3:%.*]], i32 3
-; CHECK-NEXT: [[TMP4:%.*]] = insertelement <4 x i32> [[TMP3]], i32 [[RES2]], i32 2
-; CHECK-NEXT: [[TMP5:%.*]] = add nsw <4 x i32> [[TMP4]], <i32 1146, i32 146, i32 0, i32 0>
-; CHECK-NEXT: [[TMP6:%.*]] = insertelement <4 x i32> <i32 poison, i32 poison, i32 0, i32 poison>, i32 [[V3]], i32 3
-; CHECK-NEXT: [[TMP7:%.*]] = shufflevector <2 x i32> [[TMP0]], <2 x i32> poison, <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
-; CHECK-NEXT: [[TMP8:%.*]] = shufflevector <4 x i32> [[TMP6]], <4 x i32> [[TMP7]], <4 x i32> <i32 4, i32 5, i32 2, i32 3>
+; CHECK-NEXT: [[TMP5:%.*]] = add nsw <4 x i32> <i32 1146, i32 146, i32 0, i32 0>, [[TMP3]]
+; CHECK-NEXT: [[TMP8:%.*]] = udiv <4 x i32> [[TMP0]], [[TMP6]]
----------------
bababuck wrote:
The test uses `-slp-threshold=-200`. There is some discussion about this in the other PR (see [here](https://github.com/llvm/llvm-project/pull/181731#discussion_r2813816245)). My understanding is that the reordering heuristic aims to maximize vectorization opportunities, without considering whether those opportunities are cost-effective on a given architecture.
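
For readers following the updated CHECK lines, here is a minimal sketch of what the new vector `sub`/`udiv` pair computes per lane. It is only a model of the three lines shown in the hunk (`TMP4`, `TMP6`, `TMP8`), not of the SLP vectorizer itself; the function names and the sample inputs are hypothetical. It assumes the non-`sub` lanes are padded with the identity pair `1 - 0`, so that `udiv` by `1` passes the loaded value through unchanged.

```python
MASK = 0xFFFFFFFF  # model i32 wraparound for the sub


def scalar_reference(v, a2):
    """Old shape: only lane 2 does sub + udiv; lanes 0, 1, 3 are plain loads."""
    y2 = (a2 - 42) & MASK          # [[Y2]] = sub i32 [[A2]], 42
    return [v[0], v[1], v[2] // y2, v[3]]


def vectorized_model(v, a2):
    """New shape: one <4 x i32> sub feeding one <4 x i32> udiv."""
    tmp4 = [1, 1, a2, 1]           # insertelement <1, 1, poison, 1>, a2, lane 2
    consts = [0, 0, 42, 0]         # second operand of the vector sub
    tmp6 = [(x - c) & MASK for x, c in zip(tmp4, consts)]
    # Lanes 0, 1, 3 divide by 1, so they reproduce the loaded values.
    return [x // d for x, d in zip(v, tmp6)]


if __name__ == "__main__":
    loaded = [10, 20, 300, 40]     # hypothetical contents of arr1[0..3]
    print(vectorized_model(loaded, 52))
    print(scalar_reference(loaded, 52))
```

Both functions agree lane by lane whenever `a2 != 42` (the case where the original scalar `udiv` would itself divide by zero). Whether the extra identity lanes are worth their cost is exactly what the negative threshold takes out of the picture in this test.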
https://github.com/llvm/llvm-project/pull/182443