[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

Thu Aug 7 08:27:48 PDT 2025

================
@@ -58,14 +58,55 @@ bool inferAlignment(Function &F, AssumptionCache &AC, DominatorTree &DT) {
   }
 
   // Compute alignment from known bits.
+  auto InferFromKnownBits = [&](Instruction &I, Value *PtrOp) {
+    KnownBits Known = computeKnownBits(PtrOp, DL, &AC, &I, &DT);
+    unsigned TrailZ =
+        std::min(Known.countMinTrailingZeros(), +Value::MaxAlignmentExponent);
+    return Align(1ull << std::min(Known.getBitWidth() - 1, TrailZ));
+  };
+
+  // Propagate alignment between loads and stores that originate from the
+  // same base pointer.
+  DenseMap<Value *, Align> BestBasePointerAligns;
+  auto InferFromBasePointer = [&](Value *PtrOp, Align LoadStoreAlign) {
+    APInt OffsetFromBase(DL.getIndexTypeSizeInBits(PtrOp->getType()), 0);
+    PtrOp = PtrOp->stripAndAccumulateConstantOffsets(DL, OffsetFromBase, true);
+    // Derive the base pointer alignment from the load/store alignment
+    // and the offset from the base pointer.
+    Align BasePointerAlign =
+        commonAlignment(LoadStoreAlign, OffsetFromBase.getLimitedValue());
+
+    auto [It, Inserted] =
+        BestBasePointerAligns.try_emplace(PtrOp, BasePointerAlign);
+    if (!Inserted) {
+      // If the stored base pointer alignment is better than the
+      // base pointer alignment we derived, we may be able to use it
+      // to improve the load/store alignment. If not, store the
+      // improved base pointer alignment for future iterations.
+      if (It->second > BasePointerAlign) {
+        Align BetterLoadStoreAlign =
+            commonAlignment(It->second, OffsetFromBase.getLimitedValue());
+        return BetterLoadStoreAlign;
+      }
+      It->second = BasePointerAlign;
+    }
+    return LoadStoreAlign;
+  };
+
   for (BasicBlock &BB : F) {
+    // We need to reset the map for each block because alignment information
----------------
dakersnar wrote:

Good call out, thank you for pointing that out! This tracks with my understanding of why a backwards propagation worked in the LSV but doesn't work here: the LSV analyzes within the scope of what it calls a "pseudo basic block", which is defined as follows:

```
  /// Runs the vectorizer on a "pseudo basic block", which is a range of
  /// instructions [Begin, End) within one BB all of which have
  /// isGuaranteedToTransferExecutionToSuccessor(I) == true.
```

Anyway, I can adjust the comment to call out your example. And just to confirm, the hypothetical dominator tree approach described in my comment would still be correct, right? 

https://github.com/llvm/llvm-project/pull/145733