[llvm] [SLP] Pessimistically handle unknown vector entries in SLP vectorizer (PR #75438)

Maurice Heumann via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 13 23:31:52 PST 2023


https://github.com/momo5502 created https://github.com/llvm/llvm-project/pull/75438

The SLP vectorizer can discard vector elements that were inserted at unknown positions. This example shows the behaviour:

https://godbolt.org/z/or43EM594

The following instruction inserts an element at an unknown position:

```
%2 = insertelement <3 x i64> poison, i64 %value, i64 %position
```

The position depends on an argument that is unknown at compile time.

After SLP has run, no remaining instruction references `%position`.

This happens because SLP vectorizes the two adds in the example and then has to merge the original vector with the newly built vector.
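To make the merge step concrete, here is a small stand-alone C++ model of it. It is only a sketch, not LLVM code: `Lane`, `Vec3`, `insertElement`, `shuffle`, `original`, `merged` and `checkMerge` are hypothetical names, poison is modelled as an empty `std::optional`, and the mask `<3, 4, 2>` is taken from the CHECK lines of the regression test in the patch. It compares the scalar chain against a merge that keeps the original vector and against one that treats the original vector as entirely poison.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// A lane holds either a concrete value or poison (modelled as std::nullopt).
using Lane = std::optional<std::int64_t>;
using Vec3 = std::array<Lane, 3>;

// Model of `insertelement` with a run-time index. An out-of-range index
// makes the whole result poison.
Vec3 insertElement(Vec3 V, std::int64_t Elt, std::uint64_t Idx) {
  if (Idx >= V.size())
    return Vec3{}; // every lane is poison
  V[Idx] = Elt;
  return V;
}

// Model of `shufflevector` on two 3-lane operands: mask values 0..2 select
// from A, mask values 3..5 select from B.
Vec3 shuffle(const Vec3 &A, const Vec3 &B,
             const std::array<unsigned, 3> &Mask) {
  Vec3 R{};
  for (std::size_t I = 0; I < Mask.size(); ++I)
    R[I] = Mask[I] < 3 ? A[Mask[I]] : B[Mask[I] - 3];
  return R;
}

// The scalar insertelement chain from the example.
Vec3 original(std::int64_t Value, std::uint64_t Position) {
  Vec3 V = insertElement(Vec3{}, Value, Position);
  V = insertElement(V, Value + 1, 0);
  return insertElement(V, Value + 2, 1);
}

// The merge after vectorization. KeepOriginal == false models a merge that
// wrongly treats the original vector as all poison.
Vec3 merged(std::int64_t Value, std::uint64_t Position, bool KeepOriginal) {
  Vec3 Orig = KeepOriginal ? insertElement(Vec3{}, Value, Position) : Vec3{};
  Vec3 Adds = {Value + 1, Value + 2, std::nullopt}; // widened <2 x i64> adds
  return shuffle(Orig, Adds, {3, 4, 2});
}

// Hypothetical self-check: for position 2 only the merge that keeps the
// original vector agrees with the scalar chain.
void checkMerge() {
  assert(original(10, 2) == merged(10, 2, true));
  assert(original(10, 2) != merged(10, 2, false));
}
```

In this model the two merges differ only when `Position` is 2: lanes 0 and 1 are always overwritten by the vectorized adds, so dropping the original vector loses exactly the lane that the run-time insert may have written.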

Within `isUndefVector`, the SLP vectorizer constructs a bitmap indicating which elements of the original vector are poison values. It does this by walking the chain of `insertelement` instructions.

If it encounters an insert with a non-constant position, it ignores that insert. As a result, every entry that has no insert with a constant position is treated as poison.

However, because the position is unknown, the element could be in any lane. Therefore, I think the only safe assumption is that none of the entries are poison, and that all of them should be carried over when constructing the `shufflevector` instruction.
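The difference between skipping the insert and giving up on it can be sketched with a simplified stand-alone model of the bitmap walk. This is not the code from `SLPVectorizer.cpp`: the `Insert` struct, the `poisonLanes` function and its `Pessimistic` flag are hypothetical, `std::vector<bool>` stands in for `SmallBitVector`, the base vector is assumed to be poison, and the `UseMask` handling of the real function is left out on purpose.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// One insertelement in the chain. Index is empty when the position is not
// a compile-time constant.
struct Insert {
  std::optional<unsigned> Index;
};

// Returns one flag per lane; true means "assumed to be poison".
// Chain is ordered from the first insert to the last, and the walk goes
// from the last insert back towards the (poison) base vector.
std::vector<bool> poisonLanes(const std::vector<Insert> &Chain,
                              std::size_t NumLanes, bool Pessimistic) {
  std::vector<bool> Res(NumLanes, true);
  for (auto It = Chain.rbegin(); It != Chain.rend(); ++It) {
    if (!It->Index) {
      if (!Pessimistic)
        continue; // behaviour before the patch: skip the insert
      Res.assign(NumLanes, false); // patched: no lane may be assumed poison
      return Res;
    }
    if (*It->Index < NumLanes)
      Res[*It->Index] = false; // this lane is written with a known value
  }
  return Res;
}
```

For the chain in the example (one insert at an unknown position, then inserts at lanes 0 and 1), the non-pessimistic walk leaves lane 2 flagged as poison, while the pessimistic walk clears every flag, so lane 2 has to be taken from the original vector.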

This fixes #75437

From de9f27f21e06c12bb1c0c63b9c4247d7afd76ad6 Mon Sep 17 00:00:00 2001
From: Maurice Heumann <maurice.heumann at wibu.com>
Date: Thu, 14 Dec 2023 07:05:57 +0100
Subject: [PATCH] [SLP] Pessimistically handle unknown vector entries in SLP
 vectorizer

SLP Vectorizer can discard vector entries at unknown positions.
This example shows the behaviour:

https://godbolt.org/z/or43EM594

The following instruction inserts an element at an unknown position:

```
%2 = insertelement <3 x i64> poison, i64 %value, i64 %position
```

The position depends on an argument that is unknown at compile time.

After SLP has run, no remaining instruction references `%position`.

This happens because SLP vectorizes the two adds in the example and
then has to merge the original vector with the newly built vector.

Within `isUndefVector`, the SLP vectorizer constructs a bitmap
indicating which elements of the original vector are poison values.
It does this by walking the chain of `insertelement` instructions.

If it encounters an insert with a non-constant position, it ignores
that insert. As a result, every entry that has no insert with a
constant position is treated as poison.

However, because the position is unknown, the element could be in any
lane. Therefore, I think the only safe assumption is that none of the
entries are poison, and that all of them should be carried over when
constructing the `shufflevector` instruction.
---
 .../Transforms/Vectorize/SLPVectorizer.cpp    |  6 +++--
 .../SLPVectorizer/X86/unknown-entries.ll      | 25 +++++++++++++++++++
 2 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/Transforms/SLPVectorizer/X86/unknown-entries.ll

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index fe2aac78e5ab0d..4bc65067473eef 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -391,8 +391,10 @@ static SmallBitVector isUndefVector(const Value *V,
         if (isa<T>(II->getOperand(1)))
           continue;
         std::optional<unsigned> Idx = getInsertIndex(II);
-        if (!Idx)
-          continue;
+        if (!Idx) {
+          Res.reset();
+          return Res;
+        }
         if (*Idx < UseMask.size() && !UseMask.test(*Idx))
           Res.reset(*Idx);
       }
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/unknown-entries.ll b/llvm/test/Transforms/SLPVectorizer/X86/unknown-entries.ll
new file mode 100644
index 00000000000000..fc22280c2b8ada
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/unknown-entries.ll
@@ -0,0 +1,25 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes=slp-vectorizer -S | FileCheck %s
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define <3 x i64> @ahyes(i64 %position, i64 %value) {
+; CHECK-LABEL: define <3 x i64> @ahyes(
+; CHECK-SAME: i64 [[POSITION:%.*]], i64 [[VALUE:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = insertelement <2 x i64> poison, i64 [[VALUE]], i32 0
+; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <2 x i64> [[TMP0]], <2 x i64> poison, <2 x i32> zeroinitializer
+; CHECK-NEXT:    [[TMP2:%.*]] = add <2 x i64> [[TMP1]], <i64 1, i64 2>
+; CHECK-NEXT:    [[TMP3:%.*]] = insertelement <3 x i64> poison, i64 [[VALUE]], i64 [[POSITION]]
+; CHECK-NEXT:    [[TMP4:%.*]] = shufflevector <2 x i64> [[TMP2]], <2 x i64> poison, <3 x i32> <i32 0, i32 1, i32 poison>
+; CHECK-NEXT:    [[TMP5:%.*]] = shufflevector <3 x i64> [[TMP3]], <3 x i64> [[TMP4]], <3 x i32> <i32 3, i32 4, i32 2>
+; CHECK-NEXT:    ret <3 x i64> [[TMP5]]
+;
+entry:
+  %0 = add i64 %value, 1
+  %1 = add i64 %value, 2
+  %2 = insertelement <3 x i64> poison, i64 %value, i64 %position
+  %3 = insertelement <3 x i64> %2, i64 %0, i64 0
+  %4 = insertelement <3 x i64> %3, i64 %1, i64 1
+  ret <3 x i64> %4
+}


