[llvm] 87ddd3a - [LAA] Rename and fix semantics of MaxSafeDepDistBytes to MinDepDistBytes

Michael Maitland via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 16 09:53:55 PDT 2023


Author: Michael Maitland
Date: 2023-08-16T09:53:35-07:00
New Revision: 87ddd3a191321c41045b1b9d737583e27a3f6c2a

URL: https://github.com/llvm/llvm-project/commit/87ddd3a191321c41045b1b9d737583e27a3f6c2a
DIFF: https://github.com/llvm/llvm-project/commit/87ddd3a191321c41045b1b9d737583e27a3f6c2a.diff

LOG: [LAA] Rename and fix semantics of MaxSafeDepDistBytes to MinDepDistBytes

`MaxSafeDepDistBytes` was not correct based on its name an semantics
in instances when there was a non-unit stride loop. For example,

```
for (int k = 0; k < len; k+=3) {
  a[k] = a[k+4];
  a[k+2] = a[k+6];
}
```

Here, the smallest dependence distance is 24 bytes, but only vectorizing 8 bytes
is safe. `MaxSafeVectorWidthInBits` reported the correct number of bits
that could be vectorized as 64 bits.

The semantics of of `MaxSafeDepDistBytes` should be:
  The smallest dependence distance in bytes in the loop. This may not be
  the same as the maximum number of bytes that are safe to operate on
  simultaneously.

The name of this variable should reflect those semantics and
its docstring should be updated accordingly, `MinDepDistBytes`.

A debug message that used `MaxSafeDepDistBytes` to signify to the user
how many bytes could be accessed in parallel is updated to use
`MaxSafeVectorWidthInBits` instead. That way, the same message if
communicated to the user, just in different units.

This patch makes sure that when `MinDepDistBytes` is modified in a way
that should impact `MaxSafeVectorWidthInBits`, that we update the latter
accordingly. This patch also clarifies why `MaxSafeVectorWidthInBits`
does not to be updated when `MinDepDistBytes` is (i.e. in the case of a
forward dependency).

Differential Revision: https://reviews.llvm.org/D156158

Added: 
    

Modified: 
    llvm/include/llvm/Analysis/LoopAccessAnalysis.h
    llvm/lib/Analysis/LoopAccessAnalysis.cpp
    llvm/test/Analysis/LoopAccessAnalysis/depend_diff_types.ll
    llvm/test/Analysis/LoopAccessAnalysis/depend_diff_types_opaque_ptr.ll
    llvm/test/Analysis/LoopAccessAnalysis/max_safe_dep_dist_non_unit_stride.ll
    llvm/test/Analysis/LoopAccessAnalysis/safe-with-dep-distance.ll

Removed: 
    


################################################################################
diff  --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
index 08397fa0485ba7..c04a633d6daba6 100644
--- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -126,8 +126,8 @@ class MemoryDepChecker {
       ForwardButPreventsForwarding,
       // Lexically backward.
       Backward,
-      // Backward, but the distance allows a vectorization factor of
-      // MaxSafeDepDistBytes.
+      // Backward, but the distance allows a vectorization factor of dependent
+      // on MinDepDistBytes.
       BackwardVectorizable,
       // Same, but may prevent store-to-load forwarding.
       BackwardVectorizableButPreventsForwarding
@@ -197,10 +197,6 @@ class MemoryDepChecker {
     return MaxSafeVectorWidthInBits == UINT_MAX;
   }
 
-  /// The maximum number of bytes of a vector register we can vectorize
-  /// the accesses safely with.
-  uint64_t getMaxSafeDepDistBytes() const { return MaxSafeDepDistBytes; }
-
   /// Return the number of elements that are safe to operate on
   /// simultaneously, multiplied by the size of the element in bits.
   uint64_t getMaxSafeVectorWidthInBits() const {
@@ -274,8 +270,10 @@ class MemoryDepChecker {
   /// The program order index to be used for the next instruction.
   unsigned AccessIdx = 0;
 
-  // We can access this many bytes in parallel safely.
-  uint64_t MaxSafeDepDistBytes = 0;
+  /// The smallest dependence distance in bytes in the loop. This may not be
+  /// the same as the maximum number of bytes that are safe to operate on
+  /// simultaneously.
+  uint64_t MinDepDistBytes = 0;
 
   /// Number of elements (from consecutive iterations) that are safe to
   /// operate on simultaneously, multiplied by the size of the element in bits.
@@ -310,7 +308,7 @@ class MemoryDepChecker {
   /// This function checks  whether there is a plausible dependence (or the
   /// absence of such can't be proved) between the two accesses. If there is a
   /// plausible dependence but the dependence distance is bigger than one
-  /// element access it records this distance in \p MaxSafeDepDistBytes (if this
+  /// element access it records this distance in \p MinDepDistBytes (if this
   /// distance is smaller than any other distance encountered so far).
   /// Otherwise, this function returns true signaling a possible dependence.
   Dependence::DepType isDependent(const MemAccessInfo &A, unsigned AIdx,
@@ -321,7 +319,7 @@ class MemoryDepChecker {
   /// forwarding.
   ///
   /// \return false if we shouldn't vectorize at all or avoid larger
-  /// vectorization factors by limiting MaxSafeDepDistBytes.
+  /// vectorization factors by limiting MinDepDistBytes.
   bool couldPreventStoreLoadForward(uint64_t Distance, uint64_t TypeByteSize);
 
   /// Updates the current safety status with \p S. We can go from Safe to
@@ -678,8 +676,6 @@ class LoopAccessInfo {
   unsigned NumLoads = 0;
   unsigned NumStores = 0;
 
-  uint64_t MaxSafeDepDistBytes = -1;
-
   /// Cache the result of analyzeLoop.
   bool CanVecMem = false;
   bool HasConvergentOp = false;

diff  --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index e15fd2b769f64a..0878ce2cc23478 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -1676,7 +1676,7 @@ bool MemoryDepChecker::couldPreventStoreLoadForward(uint64_t Distance,
   const uint64_t NumItersForStoreLoadThroughMemory = 8 * TypeByteSize;
   // Maximum vector factor.
   uint64_t MaxVFWithoutSLForwardIssues = std::min(
-      VectorizerParams::MaxVectorWidth * TypeByteSize, MaxSafeDepDistBytes);
+      VectorizerParams::MaxVectorWidth * TypeByteSize, MinDepDistBytes);
 
   // Compute the smallest VF at which the store and load would be misaligned.
   for (uint64_t VF = 2 * TypeByteSize; VF <= MaxVFWithoutSLForwardIssues;
@@ -1696,10 +1696,10 @@ bool MemoryDepChecker::couldPreventStoreLoadForward(uint64_t Distance,
     return true;
   }
 
-  if (MaxVFWithoutSLForwardIssues < MaxSafeDepDistBytes &&
+  if (MaxVFWithoutSLForwardIssues < MinDepDistBytes &&
       MaxVFWithoutSLForwardIssues !=
           VectorizerParams::MaxVectorWidth * TypeByteSize)
-    MaxSafeDepDistBytes = MaxVFWithoutSLForwardIssues;
+    MinDepDistBytes = MaxVFWithoutSLForwardIssues;
   return false;
 }
 
@@ -1897,6 +1897,9 @@ MemoryDepChecker::isDependent(const MemAccessInfo &A, unsigned AIdx,
   // Negative distances are not plausible dependencies.
   if (Val.isNegative()) {
     bool IsTrueDataDependence = (AIsWrite && !BIsWrite);
+    // There is no need to update MaxSafeVectorWidthInBits after call to
+    // couldPreventStoreLoadForward, even if it changed MinDepDistBytes,
+    // since a forward dependency will allow vectorization using any width.
     if (IsTrueDataDependence && EnableForwardingConflictDetection &&
         (couldPreventStoreLoadForward(Val.abs().getZExtValue(), TypeByteSize) ||
          !HasSameSize)) {
@@ -1967,8 +1970,9 @@ MemoryDepChecker::isDependent(const MemAccessInfo &A, unsigned AIdx,
     return Dependence::Backward;
   }
 
-  // Unsafe if the minimum distance needed is greater than max safe distance.
-  if (MinDistanceNeeded > MaxSafeDepDistBytes) {
+  // Unsafe if the minimum distance needed is greater than smallest dependence
+  // distance distance.
+  if (MinDistanceNeeded > MinDepDistBytes) {
     LLVM_DEBUG(dbgs() << "LAA: Failure because it needs at least "
                       << MinDistanceNeeded << " size in bytes\n");
     return Dependence::Backward;
@@ -1990,15 +1994,24 @@ MemoryDepChecker::isDependent(const MemAccessInfo &A, unsigned AIdx,
   // is 2. Then we analyze the accesses on array A, the minimum distance needed
   // is 8, which is less than 2 and forbidden vectorization, But actually
   // both A and B could be vectorized by 2 iterations.
-  MaxSafeDepDistBytes =
-      std::min(static_cast<uint64_t>(Distance), MaxSafeDepDistBytes);
+  MinDepDistBytes =
+      std::min(static_cast<uint64_t>(Distance), MinDepDistBytes);
 
   bool IsTrueDataDependence = (!AIsWrite && BIsWrite);
+  uint64_t MinDepDistBytesOld = MinDepDistBytes;
   if (IsTrueDataDependence && EnableForwardingConflictDetection &&
-      couldPreventStoreLoadForward(Distance, TypeByteSize))
+      couldPreventStoreLoadForward(Distance, TypeByteSize)) {
+    // Sanity check that we didn't update MinDepDistBytes when calling
+    // couldPreventStoreLoadForward
+    assert(MinDepDistBytes == MinDepDistBytesOld &&
+           "An update to MinDepDistBytes requires an update to "
+           "MaxSafeVectorWidthInBits");
     return Dependence::BackwardVectorizableButPreventsForwarding;
+  }
 
-  uint64_t MaxVF = MaxSafeDepDistBytes / (TypeByteSize * Stride);
+  // An update to MinDepDistBytes requires an update to MaxSafeVectorWidthInBits
+  // since there is a backwards dependency.
+  uint64_t MaxVF = MinDepDistBytes / (TypeByteSize * Stride);
   LLVM_DEBUG(dbgs() << "LAA: Positive distance " << Val.getSExtValue()
                     << " with max VF = " << MaxVF << '\n');
   uint64_t MaxVFInBits = MaxVF * TypeByteSize * 8;
@@ -2010,7 +2023,7 @@ bool MemoryDepChecker::areDepsSafe(DepCandidates &AccessSets,
                                    MemAccessInfoList &CheckDeps,
                                    const DenseMap<Value *, const SCEV *> &Strides) {
 
-  MaxSafeDepDistBytes = -1;
+  MinDepDistBytes = -1;
   SmallPtrSet<MemAccessInfo, 8> Visited;
   for (MemAccessInfo CurAccess : CheckDeps) {
     if (Visited.count(CurAccess))
@@ -2399,7 +2412,6 @@ void LoopAccessInfo::analyzeLoop(AAResults *AA, LoopInfo *LI,
     LLVM_DEBUG(dbgs() << "LAA: Checking memory dependencies\n");
     CanVecMem = DepChecker->areDepsSafe(
         DependentAccesses, Accesses.getDependenciesToCheck(), SymbolicStrides);
-    MaxSafeDepDistBytes = DepChecker->getMaxSafeDepDistBytes();
 
     if (!CanVecMem && DepChecker->shouldRetryWithRuntimeCheck()) {
       LLVM_DEBUG(dbgs() << "LAA: Retrying with memory checks\n");
@@ -2764,9 +2776,10 @@ LoopAccessInfo::LoopAccessInfo(Loop *L, ScalarEvolution *SE,
 void LoopAccessInfo::print(raw_ostream &OS, unsigned Depth) const {
   if (CanVecMem) {
     OS.indent(Depth) << "Memory dependences are safe";
-    if (MaxSafeDepDistBytes != -1ULL)
-      OS << " with a maximum dependence distance of " << MaxSafeDepDistBytes
-         << " bytes";
+    const MemoryDepChecker &DC = getDepChecker();
+    if (!DC.isSafeForAnyVectorWidth())
+      OS << " with a maximum safe vector width of "
+         << DC.getMaxSafeVectorWidthInBits() << " bits";
     if (PtrRtChecking->Need)
       OS << " with run-time checks";
     OS << "\n";

diff  --git a/llvm/test/Analysis/LoopAccessAnalysis/depend_
diff _types.ll b/llvm/test/Analysis/LoopAccessAnalysis/depend_
diff _types.ll
index 3d89c051b3f94b..8c436de4c3f6ac 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/depend_
diff _types.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/depend_
diff _types.ll
@@ -9,7 +9,7 @@
 
 ; CHECK-LABEL: function 'backdep_type_size_equivalence':
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Memory dependences are safe with a maximum dependence distance of 800 bytes
+; CHECK-NEXT:      Memory dependences are safe with a maximum safe vector width of 3200 bits
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:        Forward:
 ; CHECK-NEXT:            %ld.f32 = load float, ptr %gep.iv, align 8 ->

diff  --git a/llvm/test/Analysis/LoopAccessAnalysis/depend_
diff _types_opaque_ptr.ll b/llvm/test/Analysis/LoopAccessAnalysis/depend_
diff _types_opaque_ptr.ll
index b62b6fdcb7bde4..e424683bbef341 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/depend_
diff _types_opaque_ptr.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/depend_
diff _types_opaque_ptr.ll
@@ -8,7 +8,7 @@
 
 ; CHECK-LABEL: function 'backdep_type_size_equivalence':
 ; CHECK-NEXT:    loop:
-; CHECK-NEXT:      Memory dependences are safe with a maximum dependence distance of 800 bytes
+; CHECK-NEXT:      Memory dependences are safe with a maximum safe vector width of 3200 bits
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:        Forward:
 ; CHECK-NEXT:            %ld.f32 = load float, ptr %gep.iv, align 8 ->

diff  --git a/llvm/test/Analysis/LoopAccessAnalysis/max_safe_dep_dist_non_unit_stride.ll b/llvm/test/Analysis/LoopAccessAnalysis/max_safe_dep_dist_non_unit_stride.ll
index 946d9b7cfd9b83..0364248b004cf8 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/max_safe_dep_dist_non_unit_stride.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/max_safe_dep_dist_non_unit_stride.ll
@@ -10,7 +10,7 @@
 define void @foo(i64  %len, ptr %a) {
 ; CHECK-LABEL: Loop access info in function 'foo':
 ; CHECK-NEXT:  loop:
-; CHECK-NEXT:    Memory dependences are safe with a maximum dependence distance of 24 bytes
+; CHECK-NEXT:   Memory dependences are safe with a maximum safe vector width of 64 bits
 ; CHECK-NEXT:    Dependences:
 ; CHECK-NEXT:      BackwardVectorizable:
 ; CHECK-NEXT:          store i32 %0, ptr %arrayidx2, align 4 ->

diff  --git a/llvm/test/Analysis/LoopAccessAnalysis/safe-with-dep-distance.ll b/llvm/test/Analysis/LoopAccessAnalysis/safe-with-dep-distance.ll
index 23fdcd5d2a1dfe..efa3100464759e 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/safe-with-dep-distance.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/safe-with-dep-distance.ll
@@ -4,7 +4,7 @@
 ;   for (i = 0; i < n; i++)
 ;    A[i + 4] = A[i] * 2;
 
-; CHECK: Memory dependences are safe with a maximum dependence distance of 8 bytes
+; CHECK: Memory dependences are safe with a maximum safe vector width of 64 bits
 
 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-apple-macosx10.10.0"


        


More information about the llvm-commits mailing list