[llvm] f42b930 - [SLP] Pessimistically handle unknown vector entries in SLP vectorizer (#75438)

via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 14 06:48:27 PST 2023


Author: Maurice Heumann
Date: 2023-12-14T09:48:23-05:00
New Revision: f42b930af976e73fcf52e28d77ac6d94883bf950

URL: https://github.com/llvm/llvm-project/commit/f42b930af976e73fcf52e28d77ac6d94883bf950
DIFF: https://github.com/llvm/llvm-project/commit/f42b930af976e73fcf52e28d77ac6d94883bf950.diff

LOG: [SLP] Pessimistically handle unknown vector entries in SLP vectorizer (#75438)

SLP Vectorizer can discard vector entries at unknown positions. This
example shows the behaviour:

https://godbolt.org/z/or43EM594

The following instruction inserts an element at an unknown position:

```
%2 = insertelement <3 x i64> poison, i64 %value, i64 %position
```

The position depends on an argument that is unknown at compile time.

After running SLP, no instruction referencing `%position` remains in
the output.

This happens as SLP parallelizes the two adds in the example. It then
needs to merge the original vector with the new vector.
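
As a rough illustration of that merge step, here is a minimal C++ model
of how a two-operand shuffle picks lanes. It is only a sketch of the IR
semantics, not LLVM code: the `Lane` alias, `kPoisonIdx`, and `shuffle`
are hypothetical names, and `std::nullopt` stands in for poison.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of a vector lane: std::nullopt stands in for poison.
using Lane = std::optional<long>;

// Hypothetical marker for a poison mask element.
constexpr int kPoisonIdx = -1;

// Model of a two-operand shuffle: mask index i < N selects A[i],
// index i >= N selects B[i - N], and kPoisonIdx yields a poison lane.
std::vector<Lane> shuffle(const std::vector<Lane> &A,
                          const std::vector<Lane> &B,
                          const std::vector<int> &Mask) {
  assert(A.size() == B.size() && "both operands must have the same width");
  const int N = static_cast<int>(A.size());
  std::vector<Lane> Out;
  Out.reserve(Mask.size());
  for (int Idx : Mask) {
    if (Idx == kPoisonIdx) {
      Out.push_back(std::nullopt);
      continue;
    }
    assert(Idx >= 0 && Idx < 2 * N && "mask index out of range");
    Out.push_back(Idx < N ? A[Idx] : B[Idx - N]);
  }
  return Out;
}
```

With a mask of `<3, 4, 2>`, lanes 0 and 1 come from the second operand
and lane 2 comes from the first. If the first operand is assumed to be
entirely poison, whatever the original vector held in lane 2 is lost.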

Within `isUndefVector`, the SLP vectorizer constructs a bitmap
indicating which elements of the original vector are poison values. It
does this by walking the insertElement instructions.

If it encounters an insert with a non-constant position, that insert is
ignored. As a result, every entry that has no insert with a constant
position is treated as a poison value.

However, as the position is unknown, the element could be anywhere.
Therefore, I think it is only safe to assume none of the entries are
poison values and to simply take them all over when constructing the
shuffleVector instruction.
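
The difference between the two behaviours can be sketched with a small
standalone C++ model of the bitmap walk. This is a simplification under
stated assumptions, not the code in `SLPVectorizer.cpp`: `Insert`,
`poisonLanes`, and the `Pessimistic` flag are hypothetical, and the
`UseMask` handling of the real `isUndefVector` is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for one insertelement in the chain. Lane is
// std::nullopt when the position is not a compile-time constant.
struct Insert {
  std::optional<std::size_t> Lane;
};

// Simplified model of the walk. Returns one flag per lane, where true
// means "assumed to be poison". The chain is assumed to start from a
// poison vector.
//   Pessimistic == false: inserts at unknown positions are skipped.
//   Pessimistic == true:  an unknown position clears every flag.
std::vector<bool> poisonLanes(std::size_t NumLanes,
                              const std::vector<Insert> &Chain,
                              bool Pessimistic) {
  std::vector<bool> Res(NumLanes, true);
  for (const Insert &I : Chain) {
    if (!I.Lane) {
      if (Pessimistic)
        return std::vector<bool>(NumLanes, false); // value could be anywhere
      continue; // position unknown, insert is ignored
    }
    assert(*I.Lane < NumLanes && "constant lane out of range");
    Res[*I.Lane] = false; // this lane holds a real value
  }
  return Res;
}
```

For a three-lane chain containing a single insert at an unknown
position, the skipping variant reports all three lanes as poison, while
the pessimistic variant reports none, so the merge has to keep the
original vector's lanes.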

This fixes #75437

Added: 
    llvm/test/Transforms/SLPVectorizer/X86/unknown-entries.ll

Modified: 
    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index fe2aac78e5ab0d..4bc65067473eef 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -391,8 +391,10 @@ static SmallBitVector isUndefVector(const Value *V,
         if (isa<T>(II->getOperand(1)))
           continue;
         std::optional<unsigned> Idx = getInsertIndex(II);
-        if (!Idx)
-          continue;
+        if (!Idx) {
+          Res.reset();
+          return Res;
+        }
         if (*Idx < UseMask.size() && !UseMask.test(*Idx))
           Res.reset(*Idx);
       }

diff  --git a/llvm/test/Transforms/SLPVectorizer/X86/unknown-entries.ll b/llvm/test/Transforms/SLPVectorizer/X86/unknown-entries.ll
new file mode 100644
index 00000000000000..fc22280c2b8ada
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/unknown-entries.ll
@@ -0,0 +1,25 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes=slp-vectorizer -S | FileCheck %s
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define <3 x i64> @ahyes(i64 %position, i64 %value) {
+; CHECK-LABEL: define <3 x i64> @ahyes(
+; CHECK-SAME: i64 [[POSITION:%.*]], i64 [[VALUE:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = insertelement <2 x i64> poison, i64 [[VALUE]], i32 0
+; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <2 x i64> [[TMP0]], <2 x i64> poison, <2 x i32> zeroinitializer
+; CHECK-NEXT:    [[TMP2:%.*]] = add <2 x i64> [[TMP1]], <i64 1, i64 2>
+; CHECK-NEXT:    [[TMP3:%.*]] = insertelement <3 x i64> poison, i64 [[VALUE]], i64 [[POSITION]]
+; CHECK-NEXT:    [[TMP4:%.*]] = shufflevector <2 x i64> [[TMP2]], <2 x i64> poison, <3 x i32> <i32 0, i32 1, i32 poison>
+; CHECK-NEXT:    [[TMP5:%.*]] = shufflevector <3 x i64> [[TMP3]], <3 x i64> [[TMP4]], <3 x i32> <i32 3, i32 4, i32 2>
+; CHECK-NEXT:    ret <3 x i64> [[TMP5]]
+;
+entry:
+  %0 = add i64 %value, 1
+  %1 = add i64 %value, 2
+  %2 = insertelement <3 x i64> poison, i64 %value, i64 %position
+  %3 = insertelement <3 x i64> %2, i64 %0, i64 0
+  %4 = insertelement <3 x i64> %3, i64 %1, i64 1
+  ret <3 x i64> %4
+}


        

