[llvm] 9a3aedb - [SLP]Do not try to schedule bundle with non-schedulable parent with commutable instructions

Alexey Bataev via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 4 12:57:25 PDT 2025


Author: Alexey Bataev
Date: 2025-09-04T12:57:14-07:00
New Revision: 9a3aedb093008b55fa2018396cebb9b4606a7453

URL: https://github.com/llvm/llvm-project/commit/9a3aedb093008b55fa2018396cebb9b4606a7453
DIFF: https://github.com/llvm/llvm-project/commit/9a3aedb093008b55fa2018396cebb9b4606a7453.diff

LOG: [SLP]Do not try to schedule bundle with non-schedulable parent with commutable instructions

Commutable instructions can be reordered during tree building, and if
the parent node is not scheduled, its ScheduleData elements are
considered independent and the compiler does not look for reordered
operands. Scheduling of copyables must be canceled in this case.
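
For readers unfamiliar with the surrounding code, the guard added by this
patch can be approximated by the standalone sketch below. It is an
illustration only, not the code in SLPVectorizer.cpp: the Opcode, Scalar and
ParentNode types and the mustCancelCopyableScheduling helper are hypothetical
stand-ins, and commutativity is checked per scalar opcode rather than against
the node's main/alternate operation as the real patch does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class Opcode { Add, Sub, AShr, Poison };

struct Scalar {
  Opcode Op;
};

struct ParentNode {
  bool IsGather;              // models TreeEntry::isGather()
  bool HasState;              // models TreeEntry::hasState()
  bool DoesNotNeedToSchedule; // models TreeEntry::doesNotNeedToSchedule()
  std::vector<Scalar> Scalars;
};

// Simplifying assumption: only Add is treated as commutative here.
bool isCommutativeOp(Opcode Op) { return Op == Opcode::Add; }

// Returns true when scheduling of a copyable element should be canceled:
// the parent is a real (non-gather) node with state that is not scheduled
// and holds at least one non-poison commutative scalar.
bool mustCancelCopyableScheduling(bool IsCopyable, const ParentNode *Parent) {
  if (!IsCopyable || !Parent || Parent->IsGather || !Parent->HasState ||
      !Parent->DoesNotNeedToSchedule)
    return false;
  return std::any_of(Parent->Scalars.begin(), Parent->Scalars.end(),
                     [](const Scalar &S) {
                       return S.Op != Opcode::Poison && isCommutativeOp(S.Op);
                     });
}

// Small usage example exercising both outcomes.
void checkExamples() {
  ParentNode Adds{false, true, true,
                  {Scalar{Opcode::Add}, Scalar{Opcode::Poison}}};
  ParentNode Shifts{false, true, true,
                    {Scalar{Opcode::AShr}, Scalar{Opcode::AShr}}};
  assert(mustCancelCopyableScheduling(true, &Adds));    // commutative parent
  assert(!mustCancelCopyableScheduling(true, &Shifts)); // non-commutative
  assert(!mustCancelCopyableScheduling(false, &Adds));  // not a copyable
}
```

In this model, a non-copyable element or a parent that is itself scheduled
never triggers the bail-out, which mirrors the short-circuit order of the
condition in the diff below.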

Added: 
    llvm/test/Transforms/SLPVectorizer/X86/copyable-with-non-scheduled-parent-node.ll

Modified: 
    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 0929f04df49e4..805e7ea118eb7 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -20853,7 +20853,23 @@ BoUpSLP::BlockScheduling::tryScheduleBundle(ArrayRef<Value *> VL, BoUpSLP *SLP,
   for (Value *V : VL) {
     if (S.isNonSchedulable(V))
       continue;
-    if (!extendSchedulingRegion(V, S)) {
+    // For copyables with parent nodes that do not need to be scheduled, the
+    // parents should not be commutative; otherwise deps may be handled
+    // incorrectly because of potential reordering of commutative operations.
+    if ((S.isCopyableElement(V) && EI.UserTE && !EI.UserTE->isGather() &&
+         EI.UserTE->hasState() && EI.UserTE->doesNotNeedToSchedule() &&
+         any_of(EI.UserTE->Scalars,
+                [&](Value *V) {
+                  if (isa<PoisonValue>(V))
+                    return false;
+                  auto *I = dyn_cast<Instruction>(V);
+                  return isCommutative(
+                      (I && EI.UserTE->isAltShuffle())
+                          ? EI.UserTE->getMatchingMainOpOrAltOp(I)
+                          : EI.UserTE->getMainOp(),
+                      V);
+                })) ||
+        !extendSchedulingRegion(V, S)) {
       // If the scheduling region got new instructions at the lower end (or it
       // is a new region for the first bundle). This makes it necessary to
       // recalculate all dependencies.

diff  --git a/llvm/test/Transforms/SLPVectorizer/X86/copyable-with-non-scheduled-parent-node.ll b/llvm/test/Transforms/SLPVectorizer/X86/copyable-with-non-scheduled-parent-node.ll
new file mode 100644
index 0000000000000..fbfc05f40d63a
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/copyable-with-non-scheduled-parent-node.ll
@@ -0,0 +1,42 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S --passes=slp-vectorizer -mtriple=i686-unknown-linux-android29 -mattr=+sse2 < %s | FileCheck %s
+
+define i64 @test(ptr %a) {
+; CHECK-LABEL: define i64 @test(
+; CHECK-SAME: ptr [[A:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:    [[TMP1:%.*]] = add i64 0, 0
+; CHECK-NEXT:    [[TMP2:%.*]] = load i64, ptr [[A]], align 4
+; CHECK-NEXT:    [[TMP3:%.*]] = add i64 [[TMP2]], 0
+; CHECK-NEXT:    [[TMP4:%.*]] = add i64 1, [[TMP1]]
+; CHECK-NEXT:    [[TMP5:%.*]] = ashr i64 0, 1
+; CHECK-NEXT:    [[TMP6:%.*]] = ashr i64 0, 0
+; CHECK-NEXT:    br label %[[BB7:.*]]
+; CHECK:       [[BB7]]:
+; CHECK-NEXT:    [[TMP8:%.*]] = phi i64 [ [[TMP3]], [[TMP0:%.*]] ]
+; CHECK-NEXT:    [[TMP9:%.*]] = phi i64 [ 0, [[TMP0]] ]
+; CHECK-NEXT:    [[TMP10:%.*]] = phi i64 [ [[TMP6]], [[TMP0]] ]
+; CHECK-NEXT:    [[TMP11:%.*]] = phi i64 [ [[TMP5]], [[TMP0]] ]
+; CHECK-NEXT:    [[TMP12:%.*]] = phi i64 [ 0, [[TMP0]] ]
+; CHECK-NEXT:    [[TMP13:%.*]] = phi i64 [ [[TMP4]], [[TMP0]] ]
+; CHECK-NEXT:    ret i64 0
+;
+  %1 = add i64 0, 0
+  %2 = load i64, ptr %a, align 4
+  %3 = add i64 0, 0
+  %4 = add i64 %2, 0
+  %5 = add i64 0, 0
+  %6 = add i64 1, %1
+  %7 = ashr i64 0, 1
+  %8 = add i64 0, 0
+  %9 = ashr i64 %8, 0
+  br label %10
+
+10:
+  %11 = phi i64 [ %4, %0 ]
+  %12 = phi i64 [ %3, %0 ]
+  %13 = phi i64 [ %9, %0 ]
+  %14 = phi i64 [ %7, %0 ]
+  %15 = phi i64 [ %5, %0 ]
+  %16 = phi i64 [ %6, %0 ]
+  ret i64 0
+}


        

