[llvm] [AArch64][NFC] Add test as a representative of scalarizing a vector i… (PR #114107)

via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 29 10:51:03 PDT 2024


llvmbot wrote:



@llvm/pr-subscribers-llvm-transforms

Author: Sushant Gokhale (sushgokh)

<details>
<summary>Changes</summary>

…nteger division

Scalarization is the last resort when vectorizing a bundle of integer divisions. Currently, the cost estimate for scalarizing a vector division can be considerably overestimated, as this motivating test case shows: the vector cost should not deviate much from the scalar cost.

A future patch will try to improve the scalarization cost estimate.
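
To make the reasoning above concrete, here is a minimal toy model of how a scalarized vector division might be costed against the equivalent scalar bundle. It is only a sketch: the names `UnitCosts`, `scalar_bundle_cost`, `scalarized_vector_cost`, and `slp_cost_delta` are hypothetical, the unit costs are assumed placeholders rather than AArch64 cost-model values, and the model does not attempt to reproduce the cost reported by the test in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitCosts:
    """Hypothetical per-operation costs; NOT taken from the AArch64 cost model."""
    extract: int = 1     # assumed cost of reading one lane out of a vector
    insert: int = 1      # assumed cost of writing one lane into a vector
    scalar_div: int = 1  # assumed cost of one scalar integer division


def scalar_bundle_cost(vf: int, costs: UnitCosts) -> int:
    """Cost of keeping the bundle as `vf` independent scalar divisions."""
    return vf * costs.scalar_div


def scalarized_vector_cost(vf: int, num_vector_operands: int, costs: UnitCosts) -> int:
    """Cost of a vector division that is carried out lane by lane.

    Each lane pays for extracting its non-constant operands, one scalar
    division, and inserting the result back into the result vector.
    """
    extracts = vf * num_vector_operands * costs.extract
    divisions = vf * costs.scalar_div
    inserts = vf * costs.insert
    return extracts + divisions + inserts


def slp_cost_delta(vf: int, num_vector_operands: int, costs: UnitCosts) -> int:
    """Vector cost minus scalar cost; positive means vectorizing looks unprofitable."""
    return scalarized_vector_cost(vf, num_vector_operands, costs) - scalar_bundle_cost(vf, costs)


if __name__ == "__main__":
    costs = UnitCosts()
    # Four lanes, both operands are vectors that must be extracted lane by lane.
    print(slp_cost_delta(vf=4, num_vector_operands=2, costs=costs))
    # Four lanes, constant divisors: only the dividend needs extracting.
    print(slp_cost_delta(vf=4, num_vector_operands=1, costs=costs))
```

In this model the gap between the two costs consists entirely of extract and insert overhead, which is the part a tighter scalarization estimate would aim to shrink.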

---
Full diff: https://github.com/llvm/llvm-project/pull/114107.diff


1 File Affected:

- (added) llvm/test/Transforms/SLPVectorizer/AArch64/scalarizing-vector-cost.ll (+23) 


``````````diff
diff --git a/llvm/test/Transforms/SLPVectorizer/AArch64/scalarizing-vector-cost.ll b/llvm/test/Transforms/SLPVectorizer/AArch64/scalarizing-vector-cost.ll
new file mode 100644
index 00000000000000..12bbf6e043ca91
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/AArch64/scalarizing-vector-cost.ll
@@ -0,0 +1,23 @@
+; RUN: opt -mtriple=aarch64 -passes=slp-vectorizer -debug-only=SLP -S -disable-output < %s 2>&1 | FileCheck %s
+
+define <4 x i8> @v4i8(<4 x i8> %a, <4 x i8> %b)
+{
+; CHECK: SLP: Found cost = 18 for VF=4
+  %a0 = extractelement <4 x i8> %a, i64 0
+  %a1 = extractelement <4 x i8> %a, i64 1
+  %a2 = extractelement <4 x i8> %a, i64 2
+  %a3 = extractelement <4 x i8> %a, i64 3
+  %b0 = extractelement <4 x i8> %b, i64 0
+  %b1 = extractelement <4 x i8> %b, i64 1
+  %b2 = extractelement <4 x i8> %b, i64 2
+  %b3 = extractelement <4 x i8> %b, i64 3
+  %1 = sdiv i8 %a0, undef
+  %2 = sdiv i8 %a1, 1
+  %3 = sdiv i8 %a2, 2
+  %4 = sdiv i8 %a3, 4
+  %r0 = insertelement <4 x i8> poison, i8 %1, i32 0
+  %r1 = insertelement <4 x i8> %r0, i8 %2, i32 1
+  %r2 = insertelement <4 x i8> %r1, i8 %3, i32 2
+  %r3 = insertelement <4 x i8> %r2, i8 %4, i32 3
+  ret <4 x i8> %r3
+}

``````````

</details>


https://github.com/llvm/llvm-project/pull/114107

