[llvm] 21cea3f - AMDGPU: Stop promoting allocas with addrspacecast users (#104051)

via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 14 10:53:42 PDT 2024


Author: Matt Arsenault
Date: 2024-08-14T21:53:38+04:00
New Revision: 21cea3f3be51d5c3856b87fed301b5fc3dddade4

URL: https://github.com/llvm/llvm-project/commit/21cea3f3be51d5c3856b87fed301b5fc3dddade4
DIFF: https://github.com/llvm/llvm-project/commit/21cea3f3be51d5c3856b87fed301b5fc3dddade4.diff

LOG: AMDGPU: Stop promoting allocas with addrspacecast users (#104051)

We cannot promote this case unless we know the value is only
observed through flat operations. We cannot analyze this through
a call. PointerMayBeCaptured was an imprecise check for this.
A callee with a nocapture attribute may still cast to private and
observe the address space, so really we need a different notion
of nocapture.

I doubt this was of any use anyway. The promotable cases should
have optimized out addrspacecast to begin earlier.

Fixes #66669
Fixes #104035

Added: 
    

Modified: 
    llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
    llvm/test/CodeGen/AMDGPU/promote-alloca-addrspacecast.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index 33474e7de0188..76553e99431c1 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -1195,14 +1195,10 @@ bool AMDGPUPromoteAllocaImpl::collectUsesWithPtrTypes(
       WorkList.push_back(ICmp);
     }
 
-    if (UseInst->getOpcode() == Instruction::AddrSpaceCast) {
-      // Give up if the pointer may be captured.
-      if (PointerMayBeCaptured(UseInst, true, true))
-        return false;
-      // Don't collect the users of this.
-      WorkList.push_back(User);
-      continue;
-    }
+    // TODO: If we know the address is only observed through flat pointers, we
+    // could still promote.
+    if (UseInst->getOpcode() == Instruction::AddrSpaceCast)
+      return false;
 
     // Do not promote vector/aggregate type instructions. It is hard to track
     // their users.

diff  --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-addrspacecast.ll b/llvm/test/CodeGen/AMDGPU/promote-alloca-addrspacecast.ll
index 8a467812ec485..bf4e02d8d7e1c 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-alloca-addrspacecast.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-addrspacecast.ll
@@ -3,10 +3,10 @@
 ; The types of the users of the addrspacecast should not be changed.
 
 ; CHECK-LABEL: @invalid_bitcast_addrspace(
-; CHECK: [[GEP:%[0-9]+]] = getelementptr inbounds [256 x [1 x i32]], ptr addrspace(3) @invalid_bitcast_addrspace.data, i32 0, i32 %{{[0-9]+}}
-; CHECK: [[ASC:%[a-z0-9]+]] = addrspacecast ptr addrspace(3) [[GEP]] to ptr
-; CHECK: [[LOAD:%[a-z0-9]+]] = load <2 x i16>, ptr [[ASC]]
-; CHECK: bitcast <2 x i16> [[LOAD]] to <2 x half>
+; CHECK: alloca
+; CHECK: addrspacecast
+; CHECK: load
+; CHECK: bitcast
 define amdgpu_kernel void @invalid_bitcast_addrspace() #0 {
 entry:
   %data = alloca [1 x i32], addrspace(5)
@@ -16,4 +16,22 @@ entry:
   ret void
 }
 
+; A callee use is not promotable even if it has a nocapture attribute.
+define void @nocapture_callee(ptr nocapture noundef writeonly %flat.observes.addrspace) #0 {
+  %private.ptr = addrspacecast ptr %flat.observes.addrspace to ptr addrspace(5)
+  store i32 1, ptr addrspace(5) %private.ptr, align 4
+  ret void
+}
+
+; CHECK-LABEL: @kernel_call_nocapture(
+; CHECK: alloca i32
+; CHECK-NEXT: addrspacecast
+; CHECK-NEXT: call
+define amdgpu_kernel void @kernel_call_nocapture() #0 {
+  %alloca = alloca i32, align 4, addrspace(5)
+  %flat.alloca = addrspacecast ptr addrspace(5) %alloca to ptr
+  call void @nocapture_callee(ptr noundef %flat.alloca)
+  ret void
+}
+
 attributes #0 = { nounwind "amdgpu-flat-work-group-size"="1,256" }


        


More information about the llvm-commits mailing list