[llvm] [NVPTX] Improve kernel byval parameter lowering (PR #136008)

Alex MacLean via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 16 16:35:38 PDT 2025


================
@@ -148,18 +139,40 @@ entry:
 
 ; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(argmem: readwrite)
 define dso_local ptx_kernel void @read_only_gep_asc0(ptr nocapture noundef writeonly %out, ptr nocapture noundef readonly byval(%struct.S) align 4 %s) local_unnamed_addr #0 {
-; COMMON-LABEL: define dso_local ptx_kernel void @read_only_gep_asc0(
-; COMMON-SAME: ptr noundef writeonly captures(none) [[OUT:%.*]], ptr noundef readonly byval([[STRUCT_S:%.*]]) align 4 captures(none) [[S:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
-; COMMON-NEXT:  [[ENTRY:.*:]]
-; COMMON-NEXT:    [[S1:%.*]] = alloca [[STRUCT_S]], align 4
-; COMMON-NEXT:    [[S2:%.*]] = addrspacecast ptr [[S]] to ptr addrspace(101)
-; COMMON-NEXT:    call void @llvm.memcpy.p0.p101.i64(ptr align 4 [[S1]], ptr addrspace(101) align 4 [[S2]], i64 8, i1 false)
-; COMMON-NEXT:    [[B:%.*]] = getelementptr inbounds nuw i8, ptr [[S1]], i64 4
-; COMMON-NEXT:    [[ASC:%.*]] = addrspacecast ptr [[B]] to ptr addrspace(101)
-; COMMON-NEXT:    [[ASC0:%.*]] = addrspacecast ptr addrspace(101) [[ASC]] to ptr
-; COMMON-NEXT:    [[I:%.*]] = load i32, ptr [[ASC0]], align 4
-; COMMON-NEXT:    store i32 [[I]], ptr [[OUT]], align 4
-; COMMON-NEXT:    ret void
+; SM_60-LABEL: define dso_local ptx_kernel void @read_only_gep_asc0(
+; SM_60-SAME: ptr noundef writeonly captures(none) [[OUT:%.*]], ptr noundef readonly byval([[STRUCT_S:%.*]]) align 4 captures(none) [[S:%.*]]) local_unnamed_addr #[[ATTR0]] {
+; SM_60-NEXT:  [[ENTRY:.*:]]
+; SM_60-NEXT:    [[S1:%.*]] = alloca [[STRUCT_S]], align 4
+; SM_60-NEXT:    [[S2:%.*]] = call ptr addrspace(101) @llvm.nvvm.internal.noop.addrspacecast.p101.p0(ptr [[S]])
+; SM_60-NEXT:    call void @llvm.memcpy.p0.p101.i64(ptr align 4 [[S1]], ptr addrspace(101) align 4 [[S2]], i64 8, i1 false)
+; SM_60-NEXT:    [[B:%.*]] = getelementptr inbounds nuw i8, ptr [[S1]], i64 4
+; SM_60-NEXT:    [[ASC:%.*]] = addrspacecast ptr [[B]] to ptr addrspace(101)
+; SM_60-NEXT:    [[ASC0:%.*]] = addrspacecast ptr addrspace(101) [[ASC]] to ptr
+; SM_60-NEXT:    [[I:%.*]] = load i32, ptr [[ASC0]], align 4
----------------
AlexMaclean wrote:

I'm not sure exactly what is going on here. The problematic `addrspacecast` instructions are in the source of this test; they aren't being inserted by NVPTXLowerArgs. I'd lean towards removing the casts from this test, as I don't think there is a situation in which we'd expect to see them coming into NVPTXLowerArgs. Do you remember anything about your motivation for this test in d0615a93bb6d7aedc43323dc8957fe57e86ed8ae?

https://github.com/llvm/llvm-project/pull/136008