[PATCH] D102954: [AMDGPU] Lower kernel LDS into a sorted structure

Mahesha S via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 21 22:55:30 PDT 2021


hsmhsm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp:250
     // so remove the variables from these lists before replaceAllUsesWith
     removeFromUsedLists(M, LocalVars);
 
----------------
We end-up erasing and re-constructing "llvm.used" as many times as number of kernels which have undergone LDS lowering within them, plus for a very first module level lowering. Not sure about the side-effect it. Also may not worth, erasing and re-constructing "llvm.used" so many times in my openion. But it is left to you.


================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPULDSUtils.cpp:39
+    return all_of(U->users(), [F](const User *U) {
+      return isUsedOnlyFromFunction(U, F);
+    });
----------------
hsmhsm wrote:
> For me, this logic looks incomplete in general sense. Constants are constants throughout compilation phase, meaning, once a constant is created, it is reused in multiple contexts when required. So this is possible case.
> 
> void kern_0() {
>   Inst_1(Const_1(Const_3(LDS)));
> }
> 
> void kern_1() {
>   Inst_1(Const_2(Const_3(LDS)));
> }
> 
> Here Inst_1 and Inst_2 are two different instances, even if they are exactly same, but Const_3 is one-and-the same for both kern_0() and kern_1().  Will this case works here?
Further, I verified that below test-case misses lowering because of above incomplete logic, but lowering is required here.

```
@kern = addrspace(3) global float undef, align 4

define amdgpu_kernel void @k0() {
  %ld = load i32, i32* inttoptr (i64 add (i64 ptrtoint (i32* addrspacecast (i32 addrspace(3)* bitcast (float addrspace(3)* @kern to i32 addrspace(3)*) to i32*) to i64), i64 ptrtoint (i32* addrspacecast (i32 addrspace(3)* bitcast (float addrspace(3)* @kern to i32 addrspace(3)*) to i32*) to i64)) to i32*), align 4
  ret void
}

define amdgpu_kernel void @k1() {
  %ld = load i32, i32* inttoptr (i64 add (i64 ptrtoint (i32* addrspacecast (i32 addrspace(3)* bitcast (float addrspace(3)* @kern to i32 addrspace(3)*) to i32*) to i64), i64 ptrtoint (i32* addrspacecast (i32 addrspace(3)* bitcast (float addrspace(3)* @kern to i32 addrspace(3)*) to i32*) to i64)) to i32*), align 4
  ret void
}
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102954/new/

https://reviews.llvm.org/D102954



More information about the llvm-commits mailing list