[llvm] [SROA] Support load-only promotion with dynamic offset loads (PR #135609)

Mon Apr 14 03:23:12 PDT 2025

https://github.com/nikic created https://github.com/llvm/llvm-project/pull/135609

If we do load-only promotion, it is okay if we leave some loads alone. We only need to know all stores that affect a specific location.

As such, we can handle loads with unknown offset via the "escaped read-only" code path.

This is something we already support in LICM load-only promotion, but doing this in SROA is much better from a phase ordering perspective.

Fixes https://github.com/llvm/llvm-project/issues/134513.

>From b96efb3e556e2a6440271396ef8b0d6e14463b91 Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Mon, 14 Apr 2025 12:03:34 +0200
Subject: [PATCH] [SROA] Support load-only promotion with dynamic offset loads

If we do load-only promotion, it is okay if we leave some loads
alone. We only need to know all stores that affect a specific
location.

As such, we can handle loads with unknown offset via the "escaped
read-only" code path.

This is something we already support in LICM load-only promotion,
but doing this in SROA is much better from a phase ordering
perspective.

Fixes https://github.com/llvm/llvm-project/issues/134513.
---
 llvm/lib/Transforms/Scalar/SROA.cpp            | 4 +++-
 llvm/test/Transforms/SROA/readonlynocapture.ll | 8 ++++++--
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp b/llvm/lib/Transforms/Scalar/SROA.cpp
index 7d49b63a8e4f6..a4e373d395b90 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -1114,8 +1114,10 @@ class AllocaSlices::SliceBuilder : public PtrUseVisitor<SliceBuilder> {
     assert((!LI.isSimple() || LI.getType()->isSingleValueType()) &&
            "All simple FCA loads should have been pre-split");
 
+    // If there is a load with an unknown offset, we can still perform store
+    // to load forwarding for other known-offset loads.
     if (!IsOffsetKnown)
-      return PI.setAborted(&LI);
+      return PI.setEscapedReadOnly(&LI);
 
     TypeSize Size = DL.getTypeStoreSize(LI.getType());
     if (Size.isScalable())
diff --git a/llvm/test/Transforms/SROA/readonlynocapture.ll b/llvm/test/Transforms/SROA/readonlynocapture.ll
index 90ca72462f7c1..5752fadd76d48 100644
--- a/llvm/test/Transforms/SROA/readonlynocapture.ll
+++ b/llvm/test/Transforms/SROA/readonlynocapture.ll
@@ -501,6 +501,8 @@ define i32 @twoalloc_with_lifetimes() {
 
 declare void @use.i32(i32)
 
+; We can promote the %i load, even though there is an unknown offset load
+; in the loop. It is sufficient that we know all stores.
 define void @load_dyn_offset(ptr %ary) {
 ; CHECK-LABEL: @load_dyn_offset(
 ; CHECK-NEXT:    [[A:%.*]] = alloca { i64, [4 x i32] }, align 8
@@ -509,8 +511,8 @@ define void @load_dyn_offset(ptr %ary) {
 ; CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr [[GEP]], ptr [[ARY:%.*]], i64 16, i1 false)
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
-; CHECK-NEXT:    [[I:%.*]] = load i64, ptr [[A]], align 4
-; CHECK-NEXT:    [[I_NEXT:%.*]] = add i64 [[I]], 1
+; CHECK-NEXT:    [[I:%.*]] = phi i64 [ [[I_NEXT:%.*]], [[LOOP]] ], [ 0, [[TMP0:%.*]] ]
+; CHECK-NEXT:    [[I_NEXT]] = add i64 [[I]], 1
 ; CHECK-NEXT:    store i64 [[I_NEXT]], ptr [[A]], align 4
 ; CHECK-NEXT:    [[GEP_I:%.*]] = getelementptr i32, ptr [[GEP]], i64 [[I]]
 ; CHECK-NEXT:    [[VAL:%.*]] = load i32, ptr [[GEP_I]], align 4
@@ -540,6 +542,8 @@ exit:
   ret void
 }
 
+; Same as previous test, but with an unknown-offset store. We can't promote in
+; that case.
 define void @store_dyn_offset(ptr %ary) {
 ; CHECK-LABEL: @store_dyn_offset(
 ; CHECK-NEXT:    [[A:%.*]] = alloca { i64, [4 x i32] }, align 8