[llvm] 247d3ea - [SLP] Expand non-power-of-two bailout in TryToFindDuplicates

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 5 13:51:30 PDT 2024


Author: Philip Reames
Date: 2024-09-05T13:51:11-07:00
New Revision: 247d3ea843cb20d8d75ec781cd603c8ececf8934

URL: https://github.com/llvm/llvm-project/commit/247d3ea843cb20d8d75ec781cd603c8ececf8934
DIFF: https://github.com/llvm/llvm-project/commit/247d3ea843cb20d8d75ec781cd603c8ececf8934.diff

LOG: [SLP] Expand non-power-of-two bailout in TryToFindDuplicates

This fixes a crash noticed when doing a downstream merge.  The
test case has been reduced, and is included in this commit.

The existing bailout for non-power-of-two vectors in TryToFindDuplicates
did not consider the case where the list being vectorized had no
root node.  This allowed reshuffled scalars to slip through to code
which does not yet expect to handle them.

This was an existing bug (likely introduced by my ed03070e), but was
made easier to hit by 63e8a1b1.

Added: 
    

Modified: 
    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
    llvm/test/Transforms/SLPVectorizer/RISCV/vec3-base.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 74bb529b2526e7..a77d236413a968 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -6989,7 +6989,8 @@ void BoUpSLP::buildTree_rec(ArrayRef<Value *> VL, unsigned Depth,
       ReuseShuffleIndices.clear();
     } else {
       // FIXME: Reshuffing scalars is not supported yet for non-power-of-2 ops.
-      if (UserTreeIdx.UserTE && UserTreeIdx.UserTE->isNonPowOf2Vec()) {
+      if ((UserTreeIdx.UserTE && UserTreeIdx.UserTE->isNonPowOf2Vec()) ||
+          !llvm::has_single_bit(VL.size())) {
         LLVM_DEBUG(dbgs() << "SLP: Reshuffling scalars not yet supported "
                              "for nodes with padding.\n");
         newTreeEntry(VL, std::nullopt /*not vectorized*/, S, UserTreeIdx);

diff  --git a/llvm/test/Transforms/SLPVectorizer/RISCV/vec3-base.ll b/llvm/test/Transforms/SLPVectorizer/RISCV/vec3-base.ll
index 4e8e019e155dba..faffe16f8e9cd9 100644
--- a/llvm/test/Transforms/SLPVectorizer/RISCV/vec3-base.ll
+++ b/llvm/test/Transforms/SLPVectorizer/RISCV/vec3-base.ll
@@ -762,6 +762,36 @@ define double @dot_product_fp64(ptr %a, ptr %b) {
   ret double %add.1
 }
 
+;; Covers a case where SLP would previous crash due to a
+;; missing bailout in TryToFindDuplicates for the case
+;; where a VL=3 list was vectorized directly (without
+;; a root instruction such as a store or reduce).
+define double @no_root_reshuffle(ptr  %ptr) {
+; CHECK-LABEL: @no_root_reshuffle(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = load double, ptr [[PTR:%.*]], align 8
+; CHECK-NEXT:    [[MUL:%.*]] = fmul fast double [[TMP0]], [[TMP0]]
+; CHECK-NEXT:    [[ARRAYIDX2:%.*]] = getelementptr inbounds i8, ptr [[PTR]], i64 8
+; CHECK-NEXT:    [[TMP1:%.*]] = load double, ptr [[ARRAYIDX2]], align 8
+; CHECK-NEXT:    [[ARRAYIDX3:%.*]] = getelementptr inbounds i8, ptr [[PTR]], i64 16
+; CHECK-NEXT:    [[TMP2:%.*]] = load double, ptr [[ARRAYIDX3]], align 8
+; CHECK-NEXT:    [[TMP3:%.*]] = fmul fast double [[TMP2]], [[TMP2]]
+; CHECK-NEXT:    [[MUL6:%.*]] = fmul fast double [[TMP3]], [[TMP1]]
+; CHECK-NEXT:    [[ADD:%.*]] = fadd fast double [[MUL6]], [[MUL]]
+; CHECK-NEXT:    ret double [[ADD]]
+;
+entry:
+  %0 = load double, ptr %ptr, align 8
+  %mul = fmul fast double %0, %0
+  %arrayidx2 = getelementptr inbounds i8, ptr %ptr, i64 8
+  %1 = load double, ptr %arrayidx2, align 8
+  %arrayidx3 = getelementptr inbounds i8, ptr %ptr, i64 16
+  %2 = load double, ptr %arrayidx3, align 8
+  %3 = fmul fast double %2, %2
+  %mul6 = fmul fast double %3, %1
+  %add = fadd fast double %mul6, %mul
+  ret double %add
+}
 
 declare float @llvm.fmuladd.f32(float, float, float)
 


        

