[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)
Drew Kersnar via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 7 08:27:48 PDT 2025
================
@@ -58,14 +58,55 @@ bool inferAlignment(Function &F, AssumptionCache &AC, DominatorTree &DT) {
}
// Compute alignment from known bits.
+ auto InferFromKnownBits = [&](Instruction &I, Value *PtrOp) {
+ KnownBits Known = computeKnownBits(PtrOp, DL, &AC, &I, &DT);
+ unsigned TrailZ =
+ std::min(Known.countMinTrailingZeros(), +Value::MaxAlignmentExponent);
+ return Align(1ull << std::min(Known.getBitWidth() - 1, TrailZ));
+ };
+
+ // Propagate alignment between loads and stores that originate from the
+ // same base pointer.
+ DenseMap<Value *, Align> BestBasePointerAligns;
+ auto InferFromBasePointer = [&](Value *PtrOp, Align LoadStoreAlign) {
+ APInt OffsetFromBase(DL.getIndexTypeSizeInBits(PtrOp->getType()), 0);
+ PtrOp = PtrOp->stripAndAccumulateConstantOffsets(DL, OffsetFromBase, true);
+ // Derive the base pointer alignment from the load/store alignment
+ // and the offset from the base pointer.
+ Align BasePointerAlign =
+ commonAlignment(LoadStoreAlign, OffsetFromBase.getLimitedValue());
+
+ auto [It, Inserted] =
+ BestBasePointerAligns.try_emplace(PtrOp, BasePointerAlign);
+ if (!Inserted) {
+ // If the stored base pointer alignment is better than the
+ // base pointer alignment we derived, we may be able to use it
+ // to improve the load/store alignment. If not, store the
+ // improved base pointer alignment for future iterations.
+ if (It->second > BasePointerAlign) {
+ Align BetterLoadStoreAlign =
+ commonAlignment(It->second, OffsetFromBase.getLimitedValue());
+ return BetterLoadStoreAlign;
+ }
+ It->second = BasePointerAlign;
+ }
+ return LoadStoreAlign;
+ };
+
for (BasicBlock &BB : F) {
+ // We need to reset the map for each block because alignment information
----------------
dakersnar wrote:
Good callout, thank you! This matches my understanding of why backward propagation worked in the LSV (LoadStoreVectorizer) but doesn't work here: the LSV only analyzes within what it calls a "pseudo basic block", which is defined as follows:
```
/// Runs the vectorizer on a "pseudo basic block", which is a range of
/// instructions [Begin, End) within one BB all of which have
/// isGuaranteedToTransferExecutionToSuccessor(I) == true.
```
Anyway, I can adjust the comment to call out your example. And just to confirm, the hypothetical dominator tree approach described in my comment would still be correct, right?
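
For readers following the hunk above, here is a minimal standalone model of the forward propagation that `InferFromBasePointer` performs within one block. It is only a sketch and not the LLVM code: `commonAlign` is a hypothetical stand-in for `llvm::commonAlignment`, base pointers are modeled as plain integer ids, and alignments are plain `uint64_t` powers of two rather than `Align`.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for llvm::commonAlignment: the alignment that is
// guaranteed at (pointer aligned to A) + Offset. A must be a power of two.
uint64_t commonAlign(uint64_t A, uint64_t Offset) {
  if (Offset == 0)
    return A;
  uint64_t OffsetAlign = Offset & (~Offset + 1); // lowest set bit of Offset
  return std::min(A, OffsetAlign);
}

// Models the per-block map from the diff. Base pointers are integer ids here
// (an assumption for illustration; the pass keys on Value *).
struct BaseAlignTracker {
  std::unordered_map<int, uint64_t> Best; // base id -> best known base align

  // Returns the alignment suggested for an access at Base + Offset whose
  // current alignment is AccessAlign.
  uint64_t infer(int Base, uint64_t Offset, uint64_t AccessAlign) {
    // Alignment of the base implied by this access alone.
    uint64_t BaseAlign = commonAlign(AccessAlign, Offset);
    auto [It, Inserted] = Best.try_emplace(Base, BaseAlign);
    if (!Inserted) {
      // An earlier access proved a better base alignment: reuse it.
      if (It->second > BaseAlign)
        return commonAlign(It->second, Offset);
      // Otherwise remember the improved base alignment for later accesses.
      It->second = BaseAlign;
    }
    return AccessAlign;
  }
};
```

In this model, an `align 16` access at offset 0 followed by an `align 4` access at offset 8 from the same base yields 8 for the second access. Clearing `Best` at each block boundary corresponds to the per-block reset in the hunk.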
https://github.com/llvm/llvm-project/pull/145733