[llvm-branch-commits] [llvm] [LAA] Support different strides & non constant dep distances using SCEV. (PR #88039)

Florian Hahn via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Fri Apr 19 06:00:35 PDT 2024


https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/88039

>From 110e5ea24d4b23a153b5f602460b81e5228c700f Mon Sep 17 00:00:00 2001
From: Florian Hahn <flo at fhahn.com>
Date: Thu, 4 Apr 2024 12:36:27 +0100
Subject: [PATCH 1/5] [LAA] Support different strides & non constant dep
 distances using SCEV.

Extend LoopAccessAnalysis to support different strides and as a
consequence non-constant distances between dependences using SCEV to
reason about the direction of the dependence.

In multiple places, logic to rule out dependences using the stride has
been updated to only be used if StrideA == StrideB, i.e. there's a
common stride.

We now also may bail out at multiple places where we may have to set
FoundNonConstantDistanceDependence. This is done when we need to bail
out and the distance is not constant to preserve original behavior.

I'd like to call out the changes in global_alias.ll in particular.
In the modified mayAlias01, and mayAlias02 tests, there should be no
aliasing in the original versions of the test, as they are accessing 2
different 100 element fields in a loop with 100 iterations.

I moved the original tests to noAlias15 and noAlias16 respectively,
while updating the original tests to use a variable trip count.

In some cases, like different_non_constant_strides_known_backward_min_distance_3,
we now vectorize with runtime checks, even though the runtime checks
will always be false. I'll also share a follow-up patch, that also uses
SCEV to more accurately identify backwards dependences with non-constant
distances.

Fixes https://github.com/llvm/llvm-project/issues/87336
---
 llvm/lib/Analysis/LoopAccessAnalysis.cpp      | 115 ++++++++++++------
 .../Transforms/Scalar/LoopLoadElimination.cpp |   4 +-
 .../non-constant-strides-backward.ll          |  90 +++++++++-----
 .../non-constant-strides-forward.ll           |  10 +-
 .../Transforms/LoopVectorize/global_alias.ll  | 102 ++++++++++++++--
 .../single-iteration-loop-sroa.ll             |  40 +++++-
 6 files changed, 274 insertions(+), 87 deletions(-)

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index c25eede96a1859..314484e11c4a7c 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -1923,8 +1923,9 @@ isLoopVariantIndirectAddress(ArrayRef<const Value *> UnderlyingObjects,
 // of various temporary variables, like A/BPtr, StrideA/BPtr and others.
 // Returns either the dependence result, if it could already be determined, or a
 // tuple with (Distance, Stride, TypeSize, AIsWrite, BIsWrite).
-static std::variant<MemoryDepChecker::Dependence::DepType,
-                    std::tuple<const SCEV *, uint64_t, uint64_t, bool, bool>>
+static std::variant<
+    MemoryDepChecker::Dependence::DepType,
+    std::tuple<const SCEV *, uint64_t, uint64_t, uint64_t, bool, bool>>
 getDependenceDistanceStrideAndSize(
     const AccessAnalysis::MemAccessInfo &A, Instruction *AInst,
     const AccessAnalysis::MemAccessInfo &B, Instruction *BInst,
@@ -1982,7 +1983,7 @@ getDependenceDistanceStrideAndSize(
   // Need accesses with constant stride. We don't want to vectorize
   // "A[B[i]] += ..." and similar code or pointer arithmetic that could wrap
   // in the address space.
-  if (!StrideAPtr || !StrideBPtr || StrideAPtr != StrideBPtr) {
+  if (!StrideAPtr || !StrideBPtr) {
     LLVM_DEBUG(dbgs() << "Pointer access with non-constant stride\n");
     return MemoryDepChecker::Dependence::Unknown;
   }
@@ -1992,8 +1993,8 @@ getDependenceDistanceStrideAndSize(
       DL.getTypeStoreSizeInBits(ATy) == DL.getTypeStoreSizeInBits(BTy);
   if (!HasSameSize)
     TypeByteSize = 0;
-  uint64_t Stride = std::abs(StrideAPtr);
-  return std::make_tuple(Dist, Stride, TypeByteSize, AIsWrite, BIsWrite);
+  return std::make_tuple(Dist, std::abs(StrideAPtr), std::abs(StrideBPtr),
+                         TypeByteSize, AIsWrite, BIsWrite);
 }
 
 MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
@@ -2011,68 +2012,108 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
   if (std::holds_alternative<Dependence::DepType>(Res))
     return std::get<Dependence::DepType>(Res);
 
-  const auto &[Dist, Stride, TypeByteSize, AIsWrite, BIsWrite] =
-      std::get<std::tuple<const SCEV *, uint64_t, uint64_t, bool, bool>>(Res);
+  const auto &[Dist, StrideA, StrideB, TypeByteSize, AIsWrite, BIsWrite] =
+      std::get<
+          std::tuple<const SCEV *, uint64_t, uint64_t, uint64_t, bool, bool>>(
+          Res);
   bool HasSameSize = TypeByteSize > 0;
 
+  uint64_t CommonStride = StrideA == StrideB ? StrideA : 0;
+  if (isa<SCEVCouldNotCompute>(Dist)) {
+      FoundNonConstantDistanceDependence = true;
+    LLVM_DEBUG(dbgs() << "LAA: Dependence because of uncomputable distance.\n");
+    return Dependence::Unknown;
+  }
+
   ScalarEvolution &SE = *PSE.getSE();
   auto &DL = InnermostLoop->getHeader()->getModule()->getDataLayout();
-  if (!isa<SCEVCouldNotCompute>(Dist) && HasSameSize &&
+  if (HasSameSize && CommonStride &&
       isSafeDependenceDistance(DL, SE, *(PSE.getBackedgeTakenCount()), *Dist,
-                               Stride, TypeByteSize))
+                               CommonStride, TypeByteSize))
     return Dependence::NoDep;
 
   const SCEVConstant *C = dyn_cast<SCEVConstant>(Dist);
-  if (!C) {
-    LLVM_DEBUG(dbgs() << "LAA: Dependence because of non-constant distance\n");
-    FoundNonConstantDistanceDependence = true;
-    return Dependence::Unknown;
-  }
-
-  const APInt &Val = C->getAPInt();
-  int64_t Distance = Val.getSExtValue();
 
   // Attempt to prove strided accesses independent.
-  if (std::abs(Distance) > 0 && Stride > 1 && HasSameSize &&
-      areStridedAccessesIndependent(std::abs(Distance), Stride, TypeByteSize)) {
-    LLVM_DEBUG(dbgs() << "LAA: Strided accesses are independent\n");
-    return Dependence::NoDep;
+  if (C) {
+    const APInt &Val = C->getAPInt();
+    int64_t Distance = Val.getSExtValue();
+
+    if (std::abs(Distance) > 0 && CommonStride > 1 && HasSameSize &&
+        areStridedAccessesIndependent(std::abs(Distance), CommonStride,
+                                      TypeByteSize)) {
+      LLVM_DEBUG(dbgs() << "LAA: Strided accesses are independent\n");
+      return Dependence::NoDep;
+    }
+  }
+
+  // Write to the same location with the same size.
+  if (SE.isKnownPredicate(CmpInst::ICMP_EQ, Dist,
+                          SE.getZero(Dist->getType()))) {
+    if (HasSameSize)
+      return Dependence::Forward;
+    LLVM_DEBUG(
+        dbgs() << "LAA: Zero dependence difference but different type sizes\n");
+    return Dependence::Unknown;
   }
 
   // Negative distances are not plausible dependencies.
-  if (Val.isNegative()) {
+  if (SE.isKnownNonPositive(Dist)) {
+    if (!SE.isKnownNegative(Dist) && !HasSameSize) {
+      LLVM_DEBUG(dbgs() << "LAA: possibly zero dependence difference but "
+                           "different type sizes\n");
+      return Dependence::Unknown;
+    }
+
     bool IsTrueDataDependence = (AIsWrite && !BIsWrite);
     // There is no need to update MaxSafeVectorWidthInBits after call to
     // couldPreventStoreLoadForward, even if it changed MinDepDistBytes,
     // since a forward dependency will allow vectorization using any width.
-    if (IsTrueDataDependence && EnableForwardingConflictDetection &&
-        (!HasSameSize || couldPreventStoreLoadForward(Val.abs().getZExtValue(),
-                                                      TypeByteSize))) {
-      LLVM_DEBUG(dbgs() << "LAA: Forward but may prevent st->ld forwarding\n");
-      return Dependence::ForwardButPreventsForwarding;
+    if (IsTrueDataDependence && EnableForwardingConflictDetection) {
+      if (!C) {
+        FoundNonConstantDistanceDependence = true;
+        return Dependence::Unknown;
+      }
+      if (!HasSameSize ||
+          couldPreventStoreLoadForward(C->getAPInt().abs().getZExtValue(),
+                                       TypeByteSize)) {
+        LLVM_DEBUG(
+            dbgs() << "LAA: Forward but may prevent st->ld forwarding\n");
+        return Dependence::ForwardButPreventsForwarding;
+      }
     }
 
     LLVM_DEBUG(dbgs() << "LAA: Dependence is negative\n");
     return Dependence::Forward;
   }
 
-  // Write to the same location with the same size.
-  if (Val == 0) {
-    if (HasSameSize)
-      return Dependence::Forward;
-    LLVM_DEBUG(
-        dbgs() << "LAA: Zero dependence difference but different type sizes\n");
+  if (!SE.isKnownPositive(Dist)) {
+    if (!C)
+      FoundNonConstantDistanceDependence = true;
     return Dependence::Unknown;
   }
 
-  assert(Val.isStrictlyPositive() && "Expect a positive value");
-
   if (!HasSameSize) {
     LLVM_DEBUG(dbgs() << "LAA: ReadWrite-Write positive dependency with "
                          "different type sizes\n");
+    if (!C)
+      FoundNonConstantDistanceDependence = true;
     return Dependence::Unknown;
   }
 
+  if (!C) {
+    LLVM_DEBUG(dbgs() << "LAA: Dependence because of non-constant distance\n");
+    FoundNonConstantDistanceDependence = true;
+    return Dependence::Unknown;
+  }
+
+  // The logic below currently only supports StrideA ==  StrideB, i.e. there's a common stride.
+  if (!CommonStride)
+    return Dependence::Unknown;
+
+  const APInt &Val = C->getAPInt();
+  int64_t Distance = Val.getSExtValue();
+
   // Bail out early if passed-in parameters make vectorization not feasible.
   unsigned ForcedFactor = (VectorizerParams::VectorizationFactor ?
                            VectorizerParams::VectorizationFactor : 1);
@@ -2108,7 +2149,7 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
   // the minimum distance needed is 28, which is greater than distance. It is
   // not safe to do vectorization.
   uint64_t MinDistanceNeeded =
-      TypeByteSize * Stride * (MinNumIter - 1) + TypeByteSize;
+      TypeByteSize * CommonStride * (MinNumIter - 1) + TypeByteSize;
   if (MinDistanceNeeded > static_cast<uint64_t>(Distance)) {
     LLVM_DEBUG(dbgs() << "LAA: Failure because of positive distance "
                       << Distance << '\n');
@@ -2157,7 +2198,7 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
 
   // An update to MinDepDistBytes requires an update to MaxSafeVectorWidthInBits
   // since there is a backwards dependency.
-  uint64_t MaxVF = MinDepDistBytes / (TypeByteSize * Stride);
+  uint64_t MaxVF = MinDepDistBytes / (TypeByteSize * CommonStride);
   LLVM_DEBUG(dbgs() << "LAA: Positive distance " << Val.getSExtValue()
                     << " with max VF = " << MaxVF << '\n');
   uint64_t MaxVFInBits = MaxVF * TypeByteSize * 8;
diff --git a/llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp b/llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
index edddfb1b92402f..059900f357e64b 100644
--- a/llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
@@ -126,8 +126,10 @@ struct StoreToLoadForwardingCandidate {
 
     // We don't need to check non-wrapping here because forward/backward
     // dependence wouldn't be valid if these weren't monotonic accesses.
-    auto *Dist = cast<SCEVConstant>(
+    auto *Dist = dyn_cast<SCEVConstant>(
         PSE.getSE()->getMinusSCEV(StorePtrSCEV, LoadPtrSCEV));
+    if (!Dist)
+      return false;
     const APInt &Val = Dist->getAPInt();
     return Val == TypeByteSize * StrideLoad;
   }
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll b/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
index 416742a94e0d36..8102611174c854 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
@@ -45,15 +45,21 @@ exit:
 define void @different_non_constant_strides_known_backward_distance_larger_than_trip_count(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_distance_larger_than_trip_count'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-; CHECK-NEXT:  Unknown data dependence.
+; CHECK-NEXT:      Memory dependences are safe with run-time checks
 ; CHECK-NEXT:      Dependences:
-; CHECK-NEXT:        Unknown:
-; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
-; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
-; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP1:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.1024, i64 %iv.mul.2
+; CHECK-NEXT:        Against group ([[GRP2:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP1]]:
+; CHECK-NEXT:          (Low: (1024 + %A)<nuw> High: (3068 + %A))
+; CHECK-NEXT:            Member: {(1024 + %A)<nuw>,+,8}<nuw><%loop>
+; CHECK-NEXT:        Group [[GRP2]]:
+; CHECK-NEXT:          (Low: %A High: (1024 + %A)<nuw>)
+; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -83,15 +89,21 @@ exit:
 define void @different_non_constant_strides_known_backward_min_distance_16(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_min_distance_16'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-; CHECK-NEXT:  Unknown data dependence.
+; CHECK-NEXT:      Memory dependences are safe with run-time checks
 ; CHECK-NEXT:      Dependences:
-; CHECK-NEXT:        Unknown:
-; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
-; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
-; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP3:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.16, i64 %iv.mul.2
+; CHECK-NEXT:        Against group ([[GRP4:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP3]]:
+; CHECK-NEXT:          (Low: (16 + %A)<nuw> High: (2060 + %A))
+; CHECK-NEXT:            Member: {(16 + %A)<nuw>,+,8}<nuw><%loop>
+; CHECK-NEXT:        Group [[GRP4]]:
+; CHECK-NEXT:          (Low: %A High: (1024 + %A))
+; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -121,15 +133,21 @@ exit:
 define void @different_non_constant_strides_known_backward_min_distance_15(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_min_distance_15'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-; CHECK-NEXT:  Unknown data dependence.
+; CHECK-NEXT:      Memory dependences are safe with run-time checks
 ; CHECK-NEXT:      Dependences:
-; CHECK-NEXT:        Unknown:
-; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
-; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
-; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP5:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.15, i64 %iv.mul.2
+; CHECK-NEXT:        Against group ([[GRP6:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP5]]:
+; CHECK-NEXT:          (Low: (15 + %A)<nuw> High: (2059 + %A))
+; CHECK-NEXT:            Member: {(15 + %A)<nuw>,+,8}<nuw><%loop>
+; CHECK-NEXT:        Group [[GRP6]]:
+; CHECK-NEXT:          (Low: %A High: (1024 + %A))
+; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -159,15 +177,21 @@ exit:
 define void @different_non_constant_strides_known_backward_min_distance_8(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_min_distance_8'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-; CHECK-NEXT:  Unknown data dependence.
+; CHECK-NEXT:      Memory dependences are safe with run-time checks
 ; CHECK-NEXT:      Dependences:
-; CHECK-NEXT:        Unknown:
-; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
-; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
-; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP7:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.8, i64 %iv.mul.2
+; CHECK-NEXT:        Against group ([[GRP8:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP7]]:
+; CHECK-NEXT:          (Low: (8 + %A)<nuw> High: (2052 + %A))
+; CHECK-NEXT:            Member: {(8 + %A)<nuw>,+,8}<nuw><%loop>
+; CHECK-NEXT:        Group [[GRP8]]:
+; CHECK-NEXT:          (Low: %A High: (1024 + %A))
+; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -197,15 +221,21 @@ exit:
 define void @different_non_constant_strides_known_backward_min_distance_3(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_min_distance_3'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-; CHECK-NEXT:  Unknown data dependence.
+; CHECK-NEXT:      Memory dependences are safe with run-time checks
 ; CHECK-NEXT:      Dependences:
-; CHECK-NEXT:        Unknown:
-; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
-; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
-; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP9:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.3, i64 %iv.mul.2
+; CHECK-NEXT:        Against group ([[GRP10:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP9]]:
+; CHECK-NEXT:          (Low: (3 + %A)<nuw> High: (2047 + %A))
+; CHECK-NEXT:            Member: {(3 + %A)<nuw>,+,8}<nuw><%loop>
+; CHECK-NEXT:        Group [[GRP10]]:
+; CHECK-NEXT:          (Low: %A High: (1024 + %A))
+; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-forward.ll b/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-forward.ll
index aa22a2143352d2..7afbae2854374c 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-forward.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-forward.ll
@@ -8,10 +8,9 @@ declare void @llvm.assume(i1)
 define void @different_non_constant_strides_known_forward(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_forward'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-; CHECK-NEXT:  Unknown data dependence.
+; CHECK-NEXT:      Memory dependences are safe
 ; CHECK-NEXT:      Dependences:
-; CHECK-NEXT:        Unknown:
+; CHECK-NEXT:        Forward:
 ; CHECK-NEXT:            %l = load i32, ptr %gep.mul.2, align 4 ->
 ; CHECK-NEXT:            store i32 %add, ptr %gep, align 4
 ; CHECK-EMPTY:
@@ -45,10 +44,9 @@ exit:
 define void @different_non_constant_strides_known_forward_min_distance_3(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_forward_min_distance_3'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-; CHECK-NEXT:  Unknown data dependence.
+; CHECK-NEXT:      Memory dependences are safe
 ; CHECK-NEXT:      Dependences:
-; CHECK-NEXT:        Unknown:
+; CHECK-NEXT:        Forward:
 ; CHECK-NEXT:            %l = load i32, ptr %gep.mul.2, align 4 ->
 ; CHECK-NEXT:            store i32 %add, ptr %gep, align 4
 ; CHECK-EMPTY:
diff --git a/llvm/test/Transforms/LoopVectorize/global_alias.ll b/llvm/test/Transforms/LoopVectorize/global_alias.ll
index 336e462a4cf63c..84aee9ec62cf27 100644
--- a/llvm/test/Transforms/LoopVectorize/global_alias.ll
+++ b/llvm/test/Transforms/LoopVectorize/global_alias.ll
@@ -491,22 +491,101 @@ for.end:                                          ; preds = %for.body
   ret i32 %1
 }
 
+; /// Different objects, swapped induction, alias at the end
+; int noAlias15 (int a) {
+;   int i;
+;   for (i=0; i<SIZE; i++)
+;     Foo.A[i] = Foo.B[SIZE-i-1] + a;
+;   return Foo.A[a];
+; }
+; CHECK-LABEL: define i32 @noAlias15(
+; CHECK:      vector.memcheck:
+; CHECK-NEXT:    br i1 false, label %scalar.ph, label %vector.ph
+; CHECK: add nsw <4 x i32>
+; CHECK: ret
+
+define i32 @noAlias15(i32 %a) nounwind {
+entry:
+  br label %for.body
+
+for.body:                                         ; preds = %entry, %for.body
+  %i.05 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
+  %sub1 = sub nuw nsw i32 99, %i.05
+  %arrayidx = getelementptr inbounds %struct.anon, ptr @Foo, i32 0, i32 2, i32 %sub1
+  %0 = load i32, ptr %arrayidx, align 4
+  %add = add nsw i32 %0, %a
+  %arrayidx2 = getelementptr inbounds [100 x i32], ptr @Foo, i32 0, i32 %i.05
+  store i32 %add, ptr %arrayidx2, align 4
+  %inc = add nuw nsw i32 %i.05, 1
+  %exitcond.not = icmp eq i32 %inc, 100
+  br i1 %exitcond.not, label %for.end, label %for.body
+
+for.end:                                          ; preds = %for.body
+  %arrayidx3 = getelementptr inbounds [100 x i32], ptr @Foo, i32 0, i32 %a
+  %1 = load i32, ptr %arrayidx3, align 4
+  ret i32 %1
+}
+
+; /// Different objects, swapped induction, alias at the beginning
+; int noAlias16 (int a) {
+;   int i;
+;   for (i=0; i<SIZE; i++)
+;     Foo.A[SIZE-i-1] = Foo.B[i] + a;
+;   return Foo.A[a];
+; }
+; CHECK-LABEL: define i32 @noAlias16(
+; CHECK:      entry:
+; CHECK-NEXT:   br i1 false, label %scalar.ph, label %vector.ph
+
+; CHECK: add nsw <4 x i32>
+; CHECK: ret
+
+define i32 @noAlias16(i32 %a) nounwind {
+entry:
+  br label %for.body
+
+for.body:                                         ; preds = %entry, %for.body
+  %i.05 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
+  %arrayidx = getelementptr inbounds %struct.anon, ptr @Foo, i32 0, i32 2, i32 %i.05
+  %0 = load i32, ptr %arrayidx, align 4
+  %add = add nsw i32 %0, %a
+  %sub1 = sub nuw nsw i32 99, %i.05
+  %arrayidx2 = getelementptr inbounds [100 x i32], ptr @Foo, i32 0, i32 %sub1
+  store i32 %add, ptr %arrayidx2, align 4
+  %inc = add nuw nsw i32 %i.05, 1
+  %exitcond.not = icmp eq i32 %inc, 100
+  br i1 %exitcond.not, label %for.end, label %for.body
+
+for.end:                                          ; preds = %for.body
+  %arrayidx3 = getelementptr inbounds [100 x i32], ptr @Foo, i32 0, i32 %a
+  %1 = load i32, ptr %arrayidx3, align 4
+  ret i32 %1
+}
+
 
 ;; === Now, the tests that we could vectorize with induction changes or run-time checks ===
 
 
 ; /// Different objects, swapped induction, alias at the end
-; int mayAlias01 (int a) {
+; int mayAlias01 (int a, int N) {
 ;   int i;
-;   for (i=0; i<SIZE; i++)
+;   for (i=0; i<N; i++)
 ;     Foo.A[i] = Foo.B[SIZE-i-1] + a;
 ;   return Foo.A[a];
 ; }
 ; CHECK-LABEL: define i32 @mayAlias01(
-; CHECK-NOT: add nsw <4 x i32>
+; CHECK:      vector.memcheck:
+; CHECK-NEXT:   [[MUL:%.+]] = shl i32 %N, 2
+; CHECK-NEXT:   [[SCEVGEP0:%.+]] = getelementptr i8, ptr @Foo, i32 [[MUL]]
+; CHECK-NEXT:   [[SUB:%.+]] = sub i32 804, [[MUL]]
+; CHECK-NEXT:   [[SCEVGEP1:%.+]] = getelementptr i8, ptr @Foo, i32 [[SUB]]
+; CHECK-NEXT:   [[BOUND:%.+]] = icmp ult ptr [[SCEVGEP1]], [[SCEVGEP0]]
+; CHECK-NEXT:   br i1 [[BOUND]], label %scalar.ph, label %vector.ph
+
+; CHECK:       add nsw <4 x i32>
 ; CHECK: ret
 
-define i32 @mayAlias01(i32 %a) nounwind {
+define i32 @mayAlias01(i32 %a, i32 %N) nounwind {
 entry:
   br label %for.body
 
@@ -519,7 +598,7 @@ for.body:                                         ; preds = %entry, %for.body
   %arrayidx2 = getelementptr inbounds [100 x i32], ptr @Foo, i32 0, i32 %i.05
   store i32 %add, ptr %arrayidx2, align 4
   %inc = add nuw nsw i32 %i.05, 1
-  %exitcond.not = icmp eq i32 %inc, 100
+  %exitcond.not = icmp eq i32 %inc, %N
   br i1 %exitcond.not, label %for.end, label %for.body
 
 for.end:                                          ; preds = %for.body
@@ -529,17 +608,20 @@ for.end:                                          ; preds = %for.body
 }
 
 ; /// Different objects, swapped induction, alias at the beginning
-; int mayAlias02 (int a) {
+; int mayAlias02 (int a, int N) {
 ;   int i;
-;   for (i=0; i<SIZE; i++)
+;   for (i=0; i<N; i++)
 ;     Foo.A[SIZE-i-1] = Foo.B[i] + a;
 ;   return Foo.A[a];
 ; }
 ; CHECK-LABEL: define i32 @mayAlias02(
-; CHECK-NOT: add nsw <4 x i32>
+; CHECK:      vector.memcheck:
+; CHECK-NEXT:   br i1 false, label %scalar.ph, label %vector.ph
+
+; CHECK: add nsw <4 x i32>
 ; CHECK: ret
 
-define i32 @mayAlias02(i32 %a) nounwind {
+define i32 @mayAlias02(i32 %a, i32 %N) nounwind {
 entry:
   br label %for.body
 
@@ -552,7 +634,7 @@ for.body:                                         ; preds = %entry, %for.body
   %arrayidx2 = getelementptr inbounds [100 x i32], ptr @Foo, i32 0, i32 %sub1
   store i32 %add, ptr %arrayidx2, align 4
   %inc = add nuw nsw i32 %i.05, 1
-  %exitcond.not = icmp eq i32 %inc, 100
+  %exitcond.not = icmp eq i32 %inc, %N
   br i1 %exitcond.not, label %for.end, label %for.body
 
 for.end:                                          ; preds = %for.body
diff --git a/llvm/test/Transforms/PhaseOrdering/single-iteration-loop-sroa.ll b/llvm/test/Transforms/PhaseOrdering/single-iteration-loop-sroa.ll
index 3975a0dd33748b..2dc8a59fb7f899 100644
--- a/llvm/test/Transforms/PhaseOrdering/single-iteration-loop-sroa.ll
+++ b/llvm/test/Transforms/PhaseOrdering/single-iteration-loop-sroa.ll
@@ -12,9 +12,43 @@ define i16 @helper(i16 %0, i64 %x) {
 ; CHECK-NEXT:    [[DATA:%.*]] = alloca [2 x i8], align 2
 ; CHECK-NEXT:    store i16 [[TMP0:%.*]], ptr [[DATA]], align 2
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i8, ptr [[DATA]], i64 1
+; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[X:%.*]], 12
+; CHECK-NEXT:    [[TMP9:%.*]] = sub i64 1, [[X]]
+; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr i8, ptr [[TMP1]], i64 [[TMP9]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp ugt ptr [[TMP3]], [[TMP1]]
+; CHECK-NEXT:    [[OR_COND:%.*]] = select i1 [[MIN_ITERS_CHECK]], i1 true, i1 [[TMP4]]
+; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i8, ptr [[DATA]], i64 [[X]]
+; CHECK-NEXT:    [[TMP5:%.*]] = sub i64 2, [[X]]
+; CHECK-NEXT:    [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[DATA]], i64 [[TMP5]]
+; CHECK-NEXT:    [[BOUND1:%.*]] = icmp ult ptr [[SCEVGEP1]], [[SCEVGEP]]
+; CHECK-NEXT:    [[OR_COND7:%.*]] = or i1 [[OR_COND]], [[BOUND1]]
+; CHECK-NEXT:    br i1 [[OR_COND7]], label [[BB6_I_I_PREHEADER:%.*]], label [[VECTOR_PH:%.*]]
+; CHECK:       vector.ph:
+; CHECK-NEXT:    [[N_VEC:%.*]] = and i64 [[X]], -4
+; CHECK-NEXT:    [[INVARIANT_GEP:%.*]] = getelementptr i8, ptr [[TMP1]], i64 -3
 ; CHECK-NEXT:    br label [[BB6_I_I:%.*]]
+; CHECK:       vector.body:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[BB6_I_I]] ]
+; CHECK-NEXT:    [[TMP6:%.*]] = sub nsw i64 0, [[INDEX]]
+; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr inbounds [0 x i8], ptr [[DATA]], i64 0, i64 [[INDEX]]
+; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i8>, ptr [[TMP7]], align 2, !alias.scope [[META0:![0-9]+]], !noalias [[META3:![0-9]+]]
+; CHECK-NEXT:    [[GEP:%.*]] = getelementptr [0 x i8], ptr [[INVARIANT_GEP]], i64 0, i64 [[TMP6]]
+; CHECK-NEXT:    [[WIDE_LOAD3:%.*]] = load <4 x i8>, ptr [[GEP]], align 2, !alias.scope [[META3]]
+; CHECK-NEXT:    [[REVERSE:%.*]] = shufflevector <4 x i8> [[WIDE_LOAD3]], <4 x i8> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
+; CHECK-NEXT:    store <4 x i8> [[REVERSE]], ptr [[TMP7]], align 2, !alias.scope [[META0]], !noalias [[META3]]
+; CHECK-NEXT:    [[REVERSE4:%.*]] = shufflevector <4 x i8> [[WIDE_LOAD]], <4 x i8> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
+; CHECK-NEXT:    store <4 x i8> [[REVERSE4]], ptr [[GEP]], align 2, !alias.scope [[META3]]
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[TMP8]], label [[MIDDLE_BLOCK:%.*]], label [[BB6_I_I]], !llvm.loop [[LOOP5:![0-9]+]]
+; CHECK:       middle.block:
+; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[X]]
+; CHECK-NEXT:    br i1 [[CMP_N]], label [[EXIT:%.*]], label [[BB6_I_I_PREHEADER]]
+; CHECK:       bb6.i.i.preheader:
+; CHECK-NEXT:    [[ITER_SROA_0_07_I_I_PH:%.*]] = phi i64 [ 0, [[START:%.*]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
+; CHECK-NEXT:    br label [[BB6_I_I1:%.*]]
 ; CHECK:       bb6.i.i:
-; CHECK-NEXT:    [[ITER_SROA_0_07_I_I:%.*]] = phi i64 [ [[TMP2:%.*]], [[BB6_I_I]] ], [ 0, [[START:%.*]] ]
+; CHECK-NEXT:    [[ITER_SROA_0_07_I_I:%.*]] = phi i64 [ [[TMP2:%.*]], [[BB6_I_I1]] ], [ [[ITER_SROA_0_07_I_I_PH]], [[BB6_I_I_PREHEADER]] ]
 ; CHECK-NEXT:    [[_40_I_I:%.*]] = sub nsw i64 0, [[ITER_SROA_0_07_I_I]]
 ; CHECK-NEXT:    [[TMP2]] = add nuw nsw i64 [[ITER_SROA_0_07_I_I]], 1
 ; CHECK-NEXT:    [[_34_I_I:%.*]] = getelementptr inbounds [0 x i8], ptr [[DATA]], i64 0, i64 [[ITER_SROA_0_07_I_I]]
@@ -23,8 +57,8 @@ define i16 @helper(i16 %0, i64 %x) {
 ; CHECK-NEXT:    [[TMP2_0_COPYLOAD_I_I_I_I:%.*]] = load i8, ptr [[_39_I_I]], align 1
 ; CHECK-NEXT:    store i8 [[TMP2_0_COPYLOAD_I_I_I_I]], ptr [[_34_I_I]], align 1
 ; CHECK-NEXT:    store i8 [[TMP_0_COPYLOAD_I_I_I_I]], ptr [[_39_I_I]], align 1
-; CHECK-NEXT:    [[EXITCOND_NOT_I_I:%.*]] = icmp eq i64 [[TMP2]], [[X:%.*]]
-; CHECK-NEXT:    br i1 [[EXITCOND_NOT_I_I]], label [[EXIT:%.*]], label [[BB6_I_I]]
+; CHECK-NEXT:    [[EXITCOND_NOT_I_I:%.*]] = icmp eq i64 [[TMP2]], [[X]]
+; CHECK-NEXT:    br i1 [[EXITCOND_NOT_I_I]], label [[EXIT]], label [[BB6_I_I1]], !llvm.loop [[LOOP8:![0-9]+]]
 ; CHECK:       exit:
 ; CHECK-NEXT:    [[DOTSROA_0_0_COPYLOAD:%.*]] = load i16, ptr [[DATA]], align 2
 ; CHECK-NEXT:    ret i16 [[DOTSROA_0_0_COPYLOAD]]

>From 95c329b523d8cb5e84f642bc2320b712a8cfe856 Mon Sep 17 00:00:00 2001
From: Florian Hahn <flo at fhahn.com>
Date: Mon, 8 Apr 2024 21:45:59 +0100
Subject: [PATCH 2/5] !fixup fix formatting.

---
 llvm/lib/Analysis/LoopAccessAnalysis.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 314484e11c4a7c..762d5f0cfcdcf8 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -2020,7 +2020,7 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
 
   uint64_t CommonStride = StrideA == StrideB ? StrideA : 0;
   if (isa<SCEVCouldNotCompute>(Dist)) {
-      FoundNonConstantDistanceDependence = true;
+    FoundNonConstantDistanceDependence = true;
     LLVM_DEBUG(dbgs() << "LAA: Dependence because of uncomputable distance.\n");
     return Dependence::Unknown;
   }
@@ -2107,7 +2107,8 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
     return Dependence::Unknown;
   }
 
-  // The logic below currently only supports StrideA ==  StrideB, i.e. there's a common stride.
+  // The logic below currently only supports StrideA ==  StrideB, i.e. there's a
+  // common stride.
   if (!CommonStride)
     return Dependence::Unknown;
 

>From 2e5a8077252007a5273df6dcf32ae945d539ec25 Mon Sep 17 00:00:00 2001
From: Florian Hahn <flo at fhahn.com>
Date: Wed, 10 Apr 2024 13:51:40 +0100
Subject: [PATCH 3/5] !fixup fix formatting, address comments.

---
 llvm/lib/Analysis/LoopAccessAnalysis.cpp      | 32 +++++-----
 .../non-constant-strides-backward.ll          | 58 ++++++++++---------
 2 files changed, 48 insertions(+), 42 deletions(-)

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 212f463534fb2f..07270bcbc36a65 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -1926,11 +1926,11 @@ struct DepDistanceStrideAndSizeInfo {
   bool AIsWrite;
   bool BIsWrite;
 
-  DepDistanceStrideAndSizeInfo(const SCEV *Dist, uint64_t StrideA, uint64_t StrideB,
-                               uint64_t TypeByteSize, bool AIsWrite,
-                               bool BIsWrite)
-      : Dist(Dist), StrideA(StrideA), StrideB(StrideB), TypeByteSize(TypeByteSize),
-        AIsWrite(AIsWrite), BIsWrite(BIsWrite) {}
+  DepDistanceStrideAndSizeInfo(const SCEV *Dist, uint64_t StrideA,
+                               uint64_t StrideB, uint64_t TypeByteSize,
+                               bool AIsWrite, bool BIsWrite)
+      : Dist(Dist), StrideA(StrideA), StrideB(StrideB),
+        TypeByteSize(TypeByteSize), AIsWrite(AIsWrite), BIsWrite(BIsWrite) {}
 };
 } // namespace
 
@@ -2009,9 +2009,9 @@ getDependenceDistanceStrideAndSize(
       DL.getTypeStoreSizeInBits(ATy) == DL.getTypeStoreSizeInBits(BTy);
   if (!HasSameSize)
     TypeByteSize = 0;
-  uint64_t Stride = std::abs(StrideAPtr);
-  return DepDistanceStrideAndSizeInfo(Dist, std::abs(StrideAPtr), std::abs(StrideBPtr), TypeByteSize, AIsWrite,
-                                      BIsWrite);
+  return DepDistanceStrideAndSizeInfo(Dist, std::abs(StrideAPtr),
+                                      std::abs(StrideBPtr), TypeByteSize,
+                                      AIsWrite, BIsWrite);
 }
 
 MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
@@ -2060,16 +2060,16 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
       LLVM_DEBUG(dbgs() << "LAA: Strided accesses are independent\n");
       return Dependence::NoDep;
     }
-  }
 
   // Write to the same location with the same size.
-  if (SE.isKnownPredicate(CmpInst::ICMP_EQ, Dist,
-                          SE.getZero(Dist->getType()))) {
-    if (HasSameSize)
-      return Dependence::Forward;
-    LLVM_DEBUG(
-        dbgs() << "LAA: Zero dependence difference but different type sizes\n");
-    return Dependence::Unknown;
+    if (C->isZero()) {
+      if (HasSameSize)
+        return Dependence::Forward;
+      LLVM_DEBUG(
+          dbgs()
+          << "LAA: Zero dependence difference but different type sizes\n");
+      return Dependence::Unknown;
+    }
   }
 
   // Negative distances are not plausible dependencies.
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll b/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
index 8102611174c854..d0e717c22cece5 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
@@ -8,15 +8,21 @@ declare void @llvm.assume(i1)
 define void @different_non_constant_strides_known_backward(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
-; CHECK-NEXT:  Unknown data dependence.
+; CHECK-NEXT:      Memory dependences are safe with run-time checks
 ; CHECK-NEXT:      Dependences:
-; CHECK-NEXT:        Unknown:
-; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
-; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
-; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP1:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A, i64 %iv.mul.2
+; CHECK-NEXT:        Against group ([[GRP2:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP1]]:
+; CHECK-NEXT:          (Low: %A High: (2044 + %A))
+; CHECK-NEXT:            Member: {%A,+,8}<nuw><%loop>
+; CHECK-NEXT:        Group [[GRP2]]:
+; CHECK-NEXT:          (Low: %A High: (1024 + %A))
+; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -49,15 +55,15 @@ define void @different_non_constant_strides_known_backward_distance_larger_than_
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
 ; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP1:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Comparing group ([[GRP3:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.1024, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP2:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP4:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP1]]:
+; CHECK-NEXT:        Group [[GRP3]]:
 ; CHECK-NEXT:          (Low: (1024 + %A)<nuw> High: (3068 + %A))
 ; CHECK-NEXT:            Member: {(1024 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP2]]:
+; CHECK-NEXT:        Group [[GRP4]]:
 ; CHECK-NEXT:          (Low: %A High: (1024 + %A)<nuw>)
 ; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
@@ -93,15 +99,15 @@ define void @different_non_constant_strides_known_backward_min_distance_16(ptr %
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
 ; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP3:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Comparing group ([[GRP5:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.16, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP4:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP6:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP3]]:
+; CHECK-NEXT:        Group [[GRP5]]:
 ; CHECK-NEXT:          (Low: (16 + %A)<nuw> High: (2060 + %A))
 ; CHECK-NEXT:            Member: {(16 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP4]]:
+; CHECK-NEXT:        Group [[GRP6]]:
 ; CHECK-NEXT:          (Low: %A High: (1024 + %A))
 ; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
@@ -137,15 +143,15 @@ define void @different_non_constant_strides_known_backward_min_distance_15(ptr %
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
 ; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP5:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Comparing group ([[GRP7:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.15, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP6:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP8:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP5]]:
+; CHECK-NEXT:        Group [[GRP7]]:
 ; CHECK-NEXT:          (Low: (15 + %A)<nuw> High: (2059 + %A))
 ; CHECK-NEXT:            Member: {(15 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP6]]:
+; CHECK-NEXT:        Group [[GRP8]]:
 ; CHECK-NEXT:          (Low: %A High: (1024 + %A))
 ; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
@@ -181,15 +187,15 @@ define void @different_non_constant_strides_known_backward_min_distance_8(ptr %A
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
 ; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP7:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Comparing group ([[GRP9:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.8, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP8:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP10:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP7]]:
+; CHECK-NEXT:        Group [[GRP9]]:
 ; CHECK-NEXT:          (Low: (8 + %A)<nuw> High: (2052 + %A))
 ; CHECK-NEXT:            Member: {(8 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP8]]:
+; CHECK-NEXT:        Group [[GRP10]]:
 ; CHECK-NEXT:          (Low: %A High: (1024 + %A))
 ; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
@@ -225,15 +231,15 @@ define void @different_non_constant_strides_known_backward_min_distance_3(ptr %A
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
 ; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP9:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Comparing group ([[GRP11:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.3, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP10:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP12:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP9]]:
+; CHECK-NEXT:        Group [[GRP11]]:
 ; CHECK-NEXT:          (Low: (3 + %A)<nuw> High: (2047 + %A))
 ; CHECK-NEXT:            Member: {(3 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP10]]:
+; CHECK-NEXT:        Group [[GRP12]]:
 ; CHECK-NEXT:          (Low: %A High: (1024 + %A))
 ; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:

>From 53a1ea4eee648dd6c49963385e6259545cb15a08 Mon Sep 17 00:00:00 2001
From: Florian Hahn <flo at fhahn.com>
Date: Thu, 11 Apr 2024 07:22:18 +0100
Subject: [PATCH 4/5] !fixup Update and merge.

---
 llvm/lib/Analysis/LoopAccessAnalysis.cpp      |  14 +--
 .../non-constant-strides-backward.ll          | 108 ++++++------------
 2 files changed, 42 insertions(+), 80 deletions(-)

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 07270bcbc36a65..b3745fb29f39c4 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -1934,7 +1934,7 @@ struct DepDistanceStrideAndSizeInfo {
 };
 } // namespace
 
-// Get the dependence distance, stride, type size and whether it is a write for
+// Get the dependence distance, strides, type size and whether it is a write for
 // the dependence between A and B. Returns a DepType, if we can prove there's
 // no dependence or the analysis fails. Outlined to lambda to limit he scope
 // of various temporary variables, like A/BPtr, StrideA/BPtr and others.
@@ -2035,7 +2035,7 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
 
   uint64_t CommonStride = StrideA == StrideB ? StrideA : 0;
   if (isa<SCEVCouldNotCompute>(Dist)) {
-    FoundNonConstantDistanceDependence = true;
+    FoundNonConstantDistanceDependence |= CommonStride;
     LLVM_DEBUG(dbgs() << "LAA: Dependence because of uncomputable distance.\n");
     return Dependence::Unknown;
   }
@@ -2086,7 +2086,7 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
     // since a forward dependency will allow vectorization using any width.
     if (IsTrueDataDependence && EnableForwardingConflictDetection) {
       if (!C) {
-        FoundNonConstantDistanceDependence = true;
+        FoundNonConstantDistanceDependence |= CommonStride;
         return Dependence::Unknown;
       }
       if (!HasSameSize ||
@@ -2103,22 +2103,20 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
   }
 
   if (!SE.isKnownPositive(Dist)) {
-    if (!C)
-      FoundNonConstantDistanceDependence = true;
+    FoundNonConstantDistanceDependence |= !C && CommonStride;
     return Dependence::Unknown;
   }
 
   if (!HasSameSize) {
     LLVM_DEBUG(dbgs() << "LAA: ReadWrite-Write positive dependency with "
                          "different type sizes\n");
-    if (!C)
-      FoundNonConstantDistanceDependence = true;
+    FoundNonConstantDistanceDependence |= !C && CommonStride;
     return Dependence::Unknown;
   }
 
   if (!C) {
     LLVM_DEBUG(dbgs() << "LAA: Dependence because of non-constant distance\n");
-    FoundNonConstantDistanceDependence = true;
+    FoundNonConstantDistanceDependence |= CommonStride;
     return Dependence::Unknown;
   }
 
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll b/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
index d0e717c22cece5..416742a94e0d36 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll
@@ -8,21 +8,15 @@ declare void @llvm.assume(i1)
 define void @different_non_constant_strides_known_backward(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Memory dependences are safe with run-time checks
+; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
+; CHECK-NEXT:  Unknown data dependence.
 ; CHECK-NEXT:      Dependences:
+; CHECK-NEXT:        Unknown:
+; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
+; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
+; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
-; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP1:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP2:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP1]]:
-; CHECK-NEXT:          (Low: %A High: (2044 + %A))
-; CHECK-NEXT:            Member: {%A,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP2]]:
-; CHECK-NEXT:          (Low: %A High: (1024 + %A))
-; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -51,21 +45,15 @@ exit:
 define void @different_non_constant_strides_known_backward_distance_larger_than_trip_count(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_distance_larger_than_trip_count'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Memory dependences are safe with run-time checks
+; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
+; CHECK-NEXT:  Unknown data dependence.
 ; CHECK-NEXT:      Dependences:
+; CHECK-NEXT:        Unknown:
+; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
+; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
+; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
-; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP3:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.1024, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP4:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP3]]:
-; CHECK-NEXT:          (Low: (1024 + %A)<nuw> High: (3068 + %A))
-; CHECK-NEXT:            Member: {(1024 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP4]]:
-; CHECK-NEXT:          (Low: %A High: (1024 + %A)<nuw>)
-; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -95,21 +83,15 @@ exit:
 define void @different_non_constant_strides_known_backward_min_distance_16(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_min_distance_16'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Memory dependences are safe with run-time checks
+; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
+; CHECK-NEXT:  Unknown data dependence.
 ; CHECK-NEXT:      Dependences:
+; CHECK-NEXT:        Unknown:
+; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
+; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
+; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
-; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP5:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.16, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP6:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP5]]:
-; CHECK-NEXT:          (Low: (16 + %A)<nuw> High: (2060 + %A))
-; CHECK-NEXT:            Member: {(16 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP6]]:
-; CHECK-NEXT:          (Low: %A High: (1024 + %A))
-; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -139,21 +121,15 @@ exit:
 define void @different_non_constant_strides_known_backward_min_distance_15(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_min_distance_15'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Memory dependences are safe with run-time checks
+; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
+; CHECK-NEXT:  Unknown data dependence.
 ; CHECK-NEXT:      Dependences:
+; CHECK-NEXT:        Unknown:
+; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
+; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
+; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
-; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP7:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.15, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP8:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP7]]:
-; CHECK-NEXT:          (Low: (15 + %A)<nuw> High: (2059 + %A))
-; CHECK-NEXT:            Member: {(15 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP8]]:
-; CHECK-NEXT:          (Low: %A High: (1024 + %A))
-; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -183,21 +159,15 @@ exit:
 define void @different_non_constant_strides_known_backward_min_distance_8(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_min_distance_8'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Memory dependences are safe with run-time checks
+; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
+; CHECK-NEXT:  Unknown data dependence.
 ; CHECK-NEXT:      Dependences:
+; CHECK-NEXT:        Unknown:
+; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
+; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
+; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
-; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP9:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.8, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP10:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP9]]:
-; CHECK-NEXT:          (Low: (8 + %A)<nuw> High: (2052 + %A))
-; CHECK-NEXT:            Member: {(8 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP10]]:
-; CHECK-NEXT:          (Low: %A High: (1024 + %A))
-; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -227,21 +197,15 @@ exit:
 define void @different_non_constant_strides_known_backward_min_distance_3(ptr %A) {
 ; CHECK-LABEL: 'different_non_constant_strides_known_backward_min_distance_3'
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Memory dependences are safe with run-time checks
+; CHECK-NEXT:      Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
+; CHECK-NEXT:  Unknown data dependence.
 ; CHECK-NEXT:      Dependences:
+; CHECK-NEXT:        Unknown:
+; CHECK-NEXT:            %l = load i32, ptr %gep, align 4 ->
+; CHECK-NEXT:            store i32 %add, ptr %gep.mul.2, align 4
+; CHECK-EMPTY:
 ; CHECK-NEXT:      Run-time memory checks:
-; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP11:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep.mul.2 = getelementptr inbounds i32, ptr %A.3, i64 %iv.mul.2
-; CHECK-NEXT:        Against group ([[GRP12:0x[0-9a-f]+]]):
-; CHECK-NEXT:          %gep = getelementptr inbounds i32, ptr %A, i64 %iv
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP11]]:
-; CHECK-NEXT:          (Low: (3 + %A)<nuw> High: (2047 + %A))
-; CHECK-NEXT:            Member: {(3 + %A)<nuw>,+,8}<nuw><%loop>
-; CHECK-NEXT:        Group [[GRP12]]:
-; CHECK-NEXT:          (Low: %A High: (1024 + %A))
-; CHECK-NEXT:            Member: {%A,+,4}<nuw><%loop>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:

>From 0accad17f933b75296c92b582c87c07b0fc8aa71 Mon Sep 17 00:00:00 2001
From: Florian Hahn <flo at fhahn.com>
Date: Thu, 11 Apr 2024 10:21:24 +0100
Subject: [PATCH 5/5] !fixup fix formatting

---
 llvm/lib/Analysis/LoopAccessAnalysis.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index b3745fb29f39c4..ffcb6cd9dba6a1 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -2061,7 +2061,7 @@ MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
       return Dependence::NoDep;
     }
 
-  // Write to the same location with the same size.
+    // Write to the same location with the same size.
     if (C->isZero()) {
       if (HasSameSize)
         return Dependence::Forward;



More information about the llvm-branch-commits mailing list