[llvm] [LAA] Compute pointer bounds for pattern with urem operation (PR #106574)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 15 06:00:49 PDT 2024
================
@@ -0,0 +1,90 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -disable-output -passes='print<access-info>' %s 2>&1 | FileCheck %s
+
+define void @test_stride_1(ptr writeonly %dst, ptr readonly %src, i64 %n, i64 %offset) {
+; CHECK-LABEL: 'test_stride_1'
+; CHECK-NEXT: loop:
+; CHECK-NEXT: Memory dependences are safe with run-time checks
+; CHECK-NEXT: Dependences:
+; CHECK-NEXT: Run-time memory checks:
+; CHECK-NEXT: Check 0:
+; CHECK-NEXT: Comparing group ([[GRP1:0x[0-9a-f]+]]):
+; CHECK-NEXT: %arrayidx1 = getelementptr inbounds i8, ptr %dst, i64 %i
+; CHECK-NEXT: Against group ([[GRP2:0x[0-9a-f]+]]):
+; CHECK-NEXT: %arrayidx = getelementptr inbounds i8, ptr %src, i64 %rem
+; CHECK-NEXT: Grouped accesses:
+; CHECK-NEXT: Group [[GRP1]]:
+; CHECK-NEXT: (Low: %dst High: (%n + %dst))
+; CHECK-NEXT: Member: {%dst,+,1}<nuw><%loop>
+; CHECK-NEXT: Group [[GRP2]]:
+; CHECK-NEXT: (Low: %src High: (%n + %src))
+; CHECK-NEXT: Member: ((-1 * ({%offset,+,1}<nw><%loop> /u %n) * %n) + {(%offset + %src),+,1}<nw><%loop>)
+; CHECK-EMPTY:
+; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT: SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT: Expressions re-written:
+;
+entry:
+ %cmp = icmp sgt i64 %n, 0
+ br i1 %cmp, label %loop, label %exit
+
+loop:
+ %i = phi i64 [ %inc, %loop ], [ 0, %entry ]
+ %add = add i64 %i, %offset
+ %rem = urem i64 %add, %n
+ %arrayidx = getelementptr inbounds i8, ptr %src, i64 %rem
----------------
david-arm wrote:
I'm trying to understand how we can vectorise this safely today. Surely there is a point where the pointer `%arrayidx` wraps back to `%src`. This could happen midway through a vector load, so how does the vectoriser handle it? I think the only way the vectoriser can safely vectorise such a loop is to reduce the trip count by `%offset` to avoid wrapping. Do you have an example of what the vectorised code looks like?
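
To make the concern concrete, here is a rough Python model of the `%src` index computed in the loop, `(i + offset) urem n`. It is only a sketch: the helper names, the sample values and the vector width of 4 are made up for illustration, and it assumes `n > 0`, a trip count of `n`, and no overflow in `i + offset`. It is not taken from the patch or from the vectoriser.

```python
def src_indices(offset, n, start, vf):
    """Byte offsets into %src touched by vf consecutive iterations from `start`.

    Models `%rem = urem i64 (add i64 %i, %offset), %n` (hypothetical helper).
    """
    return [(start + k + offset) % n for k in range(vf)]


def is_contiguous(indices):
    """True if the offsets form one consecutive run, i.e. one wide load is valid."""
    return all(b - a == 1 for a, b in zip(indices, indices[1:]))


def first_wrap_iteration(offset, n):
    """First iteration i > 0 at which the index wraps back to 0.

    A result equal to n means the wrap falls outside a loop of n iterations.
    """
    return n - (offset % n)


# Illustrative values only: n = 16, offset = 6, vector width 4.
n, offset, vf = 16, 6, 4
wrap = first_wrap_iteration(offset, n)    # 10
before = src_indices(offset, n, 4, vf)    # [10, 11, 12, 13]
across = src_indices(offset, n, 8, vf)    # [14, 15, 0, 1]
```

With these values the vector block starting at `i = 8` straddles the wrap at `i = 10`, so a single contiguous load from `%src + 14` would read the wrong bytes for the last two lanes. Only the first `n - (offset urem n)` iterations are wrap-free in this model.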
https://github.com/llvm/llvm-project/pull/106574