[llvm] [NVPTX][InferAS] assume alloca instructions are in local AS (PR #121710)
Alex MacLean via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 7 09:38:59 PST 2025
================
@@ -9,38 +9,43 @@ define dso_local void @caller_St8x4(ptr nocapture noundef readonly byval(%struct
; CHECK-NEXT: .local .align 8 .b8 __local_depot0[32];
; CHECK-NEXT: .reg .b32 %SP;
; CHECK-NEXT: .reg .b32 %SPL;
-; CHECK-NEXT: .reg .b32 %r<2>;
-; CHECK-NEXT: .reg .b64 %rd<13>;
+; CHECK-NEXT: .reg .b32 %r<4>;
+; CHECK-NEXT: .reg .b64 %rd<17>;
; CHECK-EMPTY:
; CHECK-NEXT: // %bb.0:
; CHECK-NEXT: mov.u32 %SPL, __local_depot0;
; CHECK-NEXT: cvta.local.u32 %SP, %SPL;
; CHECK-NEXT: ld.param.u32 %r1, [caller_St8x4_param_1];
+; CHECK-NEXT: add.u32 %r3, %SPL, 0;
; CHECK-NEXT: ld.param.u64 %rd1, [caller_St8x4_param_0+24];
-; CHECK-NEXT: st.u64 [%SP+24], %rd1;
+; CHECK-NEXT: st.local.u64 [%r3+24], %rd1;
; CHECK-NEXT: ld.param.u64 %rd2, [caller_St8x4_param_0+16];
-; CHECK-NEXT: st.u64 [%SP+16], %rd2;
+; CHECK-NEXT: st.local.u64 [%r3+16], %rd2;
; CHECK-NEXT: ld.param.u64 %rd3, [caller_St8x4_param_0+8];
-; CHECK-NEXT: st.u64 [%SP+8], %rd3;
+; CHECK-NEXT: st.local.u64 [%r3+8], %rd3;
; CHECK-NEXT: ld.param.u64 %rd4, [caller_St8x4_param_0];
-; CHECK-NEXT: st.u64 [%SP], %rd4;
+; CHECK-NEXT: st.local.u64 [%r3], %rd4;
+; CHECK-NEXT: ld.u64 %rd5, [%SP+8];
+; CHECK-NEXT: ld.u64 %rd6, [%SP];
+; CHECK-NEXT: ld.u64 %rd7, [%SP+24];
+; CHECK-NEXT: ld.u64 %rd8, [%SP+16];
----------------
AlexMaclean wrote:
These loads are here because the kernel parameter is copied to local memory and then loaded back so that it can be passed to the callee. Previously, SDAG was able to see that the stores above were to the same addresses as the loads, but now that the stores go to local memory this is no longer possible. To fix this, we should:
- Look through addrspacecasts in DAG Combiner, though I'm not sure if this is valid for all targets
- In InferAS, propagate new addrspaces into byval args of calls, since the pointer type doesn't actually matter to the callee.
Overall, while this change may interfere with this sort of load elimination for byval callees, this is a fairly minor edge case that can be addressed in a follow-up.
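
For illustration, the IR shape behind this might look roughly like the reduced sketch below. It is not taken from the test in this PR: the names `%struct.S`, `@callee`, `@caller`, `%tmp` and `%tmp.local` are hypothetical, and it assumes address space 5 is the NVPTX local address space.

```llvm
; Hypothetical reduced example; none of these names come from the actual test.
%struct.S = type { [4 x i64] }

declare void @callee(ptr byval(%struct.S) align 8)

define void @caller(ptr byval(%struct.S) align 8 %in) {
  ; Local copy of the incoming byval parameter.
  %tmp = alloca %struct.S, align 8
  ; Once the alloca is assumed to be in the local address space, the store
  ; can be rewritten to go through a local (addrspace(5)) pointer.
  %tmp.local = addrspacecast ptr %tmp to ptr addrspace(5)
  %v = load i64, ptr %in, align 8
  store i64 %v, ptr addrspace(5) %tmp.local, align 8
  ; The byval call operand is still the generic pointer, so lowering the call
  ; reads the copy back through a generic load that no longer obviously
  ; matches the local store above.
  call void @callee(ptr byval(%struct.S) align 8 %tmp)
  ret void
}
```

Under the second suggestion, the call operand would presumably become the `addrspace(5)` pointer as well, so that the store and the reload use the same address expression again.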
https://github.com/llvm/llvm-project/pull/121710