[llvm] [LAA] Compute pointer bounds for pattern with urem operation (PR #106574)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 15 06:00:49 PDT 2024


================
@@ -0,0 +1,90 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -disable-output -passes='print<access-info>' %s 2>&1 | FileCheck %s
+
+define void @test_stride_1(ptr writeonly %dst, ptr readonly %src, i64 %n, i64 %offset) {
+; CHECK-LABEL: 'test_stride_1'
+; CHECK-NEXT:    loop:
+; CHECK-NEXT:      Memory dependences are safe with run-time checks
+; CHECK-NEXT:      Dependences:
+; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP1:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %arrayidx1 = getelementptr inbounds i8, ptr %dst, i64 %i
+; CHECK-NEXT:        Against group ([[GRP2:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %arrayidx = getelementptr inbounds i8, ptr %src, i64 %rem
+; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP1]]:
+; CHECK-NEXT:          (Low: %dst High: (%n + %dst))
+; CHECK-NEXT:            Member: {%dst,+,1}<nuw><%loop>
+; CHECK-NEXT:        Group [[GRP2]]:
+; CHECK-NEXT:          (Low: %src High: (%n + %src))
+; CHECK-NEXT:            Member: ((-1 * ({%offset,+,1}<nw><%loop> /u %n) * %n) + {(%offset + %src),+,1}<nw><%loop>)
+; CHECK-EMPTY:
+; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT:      SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT:      Expressions re-written:
+;
+entry:
+  %cmp = icmp sgt i64 %n, 0
+  br i1 %cmp, label %loop, label %exit
+
+loop:
+  %i = phi i64 [ %inc, %loop ], [ 0, %entry ]
+  %add = add i64 %i, %offset
+  %rem = urem i64 %add, %n
+  %arrayidx = getelementptr inbounds i8, ptr %src, i64 %rem
----------------
david-arm wrote:

I'm trying to understand how we can vectorise this safely today. Surely there is a point where the pointer `%arrayidx` wraps back to `%src`. This could happen mid-way through a vector load, so how does the vectoriser handle it? I think the only way the vectoriser can safely vectorise such a loop is to reduce the trip count by `%offset` to avoid wrapping. Do you have an example of what the vectorised code looks like?
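
To make the wrapping concern concrete, here is a minimal scalar sketch in C of what the loop appears to compute. The trip count of `n` is an assumption, since the quoted IR is cut off before the exit condition, and the names `copy_rotated` and `first_wrap_iteration` are hypothetical helpers for illustration only, not part of the patch.

```c
#include <stdint.h>

/* Scalar model of the loop in test_stride_1.
   Assumption: the loop runs for i = 0 .. n-1; the quoted IR does not
   show the exit condition, so the real trip count may differ. */
void copy_rotated(uint8_t *dst, const uint8_t *src, uint64_t n,
                  uint64_t offset) {
    for (uint64_t i = 0; i < n; ++i) {
        /* Mirrors: %add = add %i, %offset; %rem = urem %add, %n */
        dst[i] = src[(i + offset) % n];
    }
}

/* Hypothetical helper: returns the first iteration i >= 1 whose source
   index is not exactly one past the previous iteration's index, i.e.
   the point where %arrayidx jumps back to %src. Returns n if the
   source accesses stay contiguous for the whole loop. */
uint64_t first_wrap_iteration(uint64_t n, uint64_t offset) {
    for (uint64_t i = 1; i < n; ++i) {
        uint64_t prev = (i - 1 + offset) % n;
        uint64_t cur = (i + offset) % n;
        if (cur != prev + 1) {
            return i;
        }
    }
    return n;
}
```

For example, with `n = 8` and `offset = 3` the source indices are 3, 4, 5, 6, 7, 0, 1, 2, so the access stops being contiguous at iteration 5. A vector load that starts before that iteration and extends past it would not read the same bytes as the scalar loop.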

https://github.com/llvm/llvm-project/pull/106574

