[llvm] [LoadStoreVectorizer] Postprocess and merge equivalence classes (PR #114501)

Vyacheslav Klochkov via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 8 14:02:43 PST 2024


================
@@ -0,0 +1,63 @@
+; RUN: opt %s -mtriple=x86_64-unknown-linux-gnu -passes=load-store-vectorizer -mcpu=skx -S -o %t.out.ll
+; RUN: FileCheck -input-file=%t.out.ll %s
+
+; This test verifies that the vectorizer can handle an extended sequence of
+; getelementptr instructions and generate longer vectors. With special handling,
+; some elements can still be vectorized even if they require looking up the
+; common underlying object deeper than 6 levels from the original pointer.
+
+; The test below is the simplified version of actual performance oriented
+; workload; the offsets in getelementptr instructions are similar or same for
+; the test simplicity.
+
+define void @v1_v2_v4_v1_to_v8_levels_6_7_8_8(i32 %arg0, ptr align 16 %arg1) {
+; CHECK-LABEL: @v1_v2_v4_v1_to_v8_levels_6_7_8_8
+; CHECK: store <8 x half>
+
+  %level1 = getelementptr inbounds i8, ptr %arg1, i32 917504
+  %level2 = getelementptr i8, ptr %level1, i32 %arg0
+  %level3 = getelementptr i8, ptr %level2, i32 32768
+  %level4 = getelementptr inbounds i8, ptr %level3, i32 %arg0
+  %level5 = getelementptr i8, ptr %level4, i32 %arg0
----------------
v-klochkov wrote:

There were some in the original workload. This test is the result of big simplification.
Those `inbound` are not important for the test, so I removed them as suggested: https://github.com/llvm/llvm-project/pull/114501/commits/427983a9378f8612c054ad219023564ecf803643

https://github.com/llvm/llvm-project/pull/114501


More information about the llvm-commits mailing list