[llvm] d0e3a57 - [SLP]Fix PR64519: Unexpected reordering of gathers.

Alexey Bataev via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 8 08:10:00 PDT 2023


Author: Alexey Bataev
Date: 2023-08-08T08:07:25-07:00
New Revision: d0e3a571e7f7eced43d85891181dad3fb6c0eaa1

URL: https://github.com/llvm/llvm-project/commit/d0e3a571e7f7eced43d85891181dad3fb6c0eaa1
DIFF: https://github.com/llvm/llvm-project/commit/d0e3a571e7f7eced43d85891181dad3fb6c0eaa1.diff

LOG: [SLP]Fix PR64519: Unexpected reordering of gathers.

The issue is actually related to ScatterVectorize nodes. If such a node
gets reordered during bottom-to-top reordering, it may have associated
non-empty ReorderIndices. Such nodes need to be handled the same way as
regular Vectorize nodes, not as NeedToGather nodes: the ReorderIndices
array must be reordered rather than the scalars.
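
For illustration only, the two updated conditions can be modeled outside of
LLVM as plain predicates. This is a minimal sketch, not the SLPVectorizer
code: the Entry struct, the State enum, and the function names
goesToGatherOps, skippedAsGather and checkExamples are hypothetical stand-ins,
and only the three fields that the conditions read are modeled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the three tree-entry states named in the log.
enum class State { Vectorize, ScatterVectorize, NeedToGather };

// Hypothetical stand-in for a tree entry; only the fields read by the two
// changed conditions are modeled.
struct Entry {
  State St;
  std::vector<int> ReuseShuffleIndices;
  std::vector<unsigned> ReorderIndices;
};

// Models the updated condition in canReorderOperands: an operand is added to
// the gather list only if it is not a Vectorize node and has neither a reuse
// mask nor a reorder mask.
bool goesToGatherOps(const Entry &E) {
  return E.St != State::Vectorize && E.ReuseShuffleIndices.empty() &&
         E.ReorderIndices.empty();
}

// Models the updated condition in reorderBottomToTop: a non-Vectorize node is
// skipped unless it is a ScatterVectorize node with non-empty ReorderIndices.
bool skippedAsGather(const Entry &E) {
  return E.St != State::Vectorize &&
         (E.St != State::ScatterVectorize || E.ReorderIndices.empty());
}

// Exercises the four cases that matter for the fix.
void checkExamples() {
  Entry ScatterReordered{State::ScatterVectorize, {}, {1, 0}};
  Entry ScatterPlain{State::ScatterVectorize, {}, {}};
  Entry Gather{State::NeedToGather, {}, {}};
  Entry Vec{State::Vectorize, {}, {}};

  // A reordered scatter node now takes the same path as a Vectorize node.
  assert(!goesToGatherOps(ScatterReordered));
  assert(!skippedAsGather(ScatterReordered));

  // A scatter node without ReorderIndices keeps the previous behavior.
  assert(goesToGatherOps(ScatterPlain));
  assert(skippedAsGather(ScatterPlain));

  // Gather and Vectorize nodes are unaffected by the change.
  assert(goesToGatherOps(Gather) && skippedAsGather(Gather));
  assert(!goesToGatherOps(Vec) && !skippedAsGather(Vec));
}
```

In this model the only entry whose classification differs from the old
conditions is the ScatterVectorize entry with non-empty ReorderIndices, which
matches the case described above.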

Added: 
    llvm/test/Transforms/SLPVectorizer/X86/scatter-vectorize-reorder-non-empty.ll

Modified: 
    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 85bd9eff1cbe11..7dfa74bbb41596 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -4563,7 +4563,8 @@ bool BoUpSLP::canReorderOperands(
       // simply add to the list of gathered ops.
       // If there are reused scalars, process this node as a regular vectorize
       // node, just reorder reuses mask.
-      if (TE->State != TreeEntry::Vectorize && TE->ReuseShuffleIndices.empty())
+      if (TE->State != TreeEntry::Vectorize &&
+          TE->ReuseShuffleIndices.empty() && TE->ReorderIndices.empty())
         GatherOps.push_back(TE);
       continue;
     }
@@ -4792,7 +4793,9 @@ void BoUpSLP::reorderBottomToTop(bool IgnoreReorder) {
           continue;
         }
         // Gathers are processed separately.
-        if (TE->State != TreeEntry::Vectorize)
+        if (TE->State != TreeEntry::Vectorize &&
+            (TE->State != TreeEntry::ScatterVectorize ||
+             TE->ReorderIndices.empty()))
           continue;
         assert((BestOrder.size() == TE->ReorderIndices.size() ||
                 TE->ReorderIndices.empty()) &&

diff  --git a/llvm/test/Transforms/SLPVectorizer/X86/scatter-vectorize-reorder-non-empty.ll b/llvm/test/Transforms/SLPVectorizer/X86/scatter-vectorize-reorder-non-empty.ll
new file mode 100644
index 00000000000000..3bece6b7cf9a7e
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/scatter-vectorize-reorder-non-empty.ll
@@ -0,0 +1,31 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 2
+; RUN: opt -passes=slp-vectorizer -S < %s -mtriple=x86_64-unknown-linux-gnu -mcpu=skylake -slp-threshold=-10 | FileCheck %s
+
+define double @test01() {
+; CHECK-LABEL: define double @test01
+; CHECK-SAME: () #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:    [[TMP1:%.*]] = load <2 x i32>, ptr null, align 8
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr double, <2 x ptr> zeroinitializer, <2 x i32> [[TMP1]]
+; CHECK-NEXT:    [[TMP3:%.*]] = call <2 x double> @llvm.masked.gather.v2f64.v2p0(<2 x ptr> [[TMP2]], i32 8, <2 x i1> <i1 true, i1 true>, <2 x double> poison)
+; CHECK-NEXT:    [[TMP4:%.*]] = shufflevector <2 x double> [[TMP3]], <2 x double> <double 0.000000e+00, double poison>, <2 x i32> <i32 2, i32 0>
+; CHECK-NEXT:    [[TMP5:%.*]] = fadd <2 x double> [[TMP4]], [[TMP4]]
+; CHECK-NEXT:    [[TMP6:%.*]] = fadd <2 x double> [[TMP3]], [[TMP5]]
+; CHECK-NEXT:    [[TMP7:%.*]] = extractelement <2 x double> [[TMP6]], i32 0
+; CHECK-NEXT:    [[TMP8:%.*]] = extractelement <2 x double> [[TMP6]], i32 1
+; CHECK-NEXT:    [[TMP9:%.*]] = fadd double [[TMP7]], [[TMP8]]
+; CHECK-NEXT:    ret double [[TMP9]]
+;
+  %1 = load i32, ptr null, align 8
+  %2 = load i32, ptr getelementptr inbounds (i32, ptr null, i32 1), align 4
+  %3 = getelementptr double, ptr null, i32 %2
+  %4 = load double, ptr %3, align 8
+  %5 = getelementptr double, ptr null, i32 %1
+  %6 = load double, ptr %5, align 8
+  %7 = fadd double %6, %6
+  %8 = fadd double %4, %7
+  %9 = fadd double 0.000000e+00, 0.000000e+00
+  %10 = fadd double %6, %9
+  %11 = fadd double %10, %8
+  ret double %11
+}
+


        

