[PATCH] D102954: [AMDGPU] Lower kernel LDS into a sorted structure
Mahesha S via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri May 21 22:55:30 PDT 2021
hsmhsm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp:250
// so remove the variables from these lists before replaceAllUsesWith
removeFromUsedLists(M, LocalVars);
----------------
We end-up erasing and re-constructing "llvm.used" as many times as number of kernels which have undergone LDS lowering within them, plus for a very first module level lowering. Not sure about the side-effect it. Also may not worth, erasing and re-constructing "llvm.used" so many times in my openion. But it is left to you.
================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPULDSUtils.cpp:39
+ return all_of(U->users(), [F](const User *U) {
+ return isUsedOnlyFromFunction(U, F);
+ });
----------------
hsmhsm wrote:
> For me, this logic looks incomplete in general sense. Constants are constants throughout compilation phase, meaning, once a constant is created, it is reused in multiple contexts when required. So this is possible case.
>
> void kern_0() {
> Inst_1(Const_1(Const_3(LDS)));
> }
>
> void kern_1() {
> Inst_1(Const_2(Const_3(LDS)));
> }
>
> Here Inst_1 and Inst_2 are two different instances, even if they are exactly same, but Const_3 is one-and-the same for both kern_0() and kern_1(). Will this case works here?
Further, I verified that below test-case misses lowering because of above incomplete logic, but lowering is required here.
```
@kern = addrspace(3) global float undef, align 4
define amdgpu_kernel void @k0() {
%ld = load i32, i32* inttoptr (i64 add (i64 ptrtoint (i32* addrspacecast (i32 addrspace(3)* bitcast (float addrspace(3)* @kern to i32 addrspace(3)*) to i32*) to i64), i64 ptrtoint (i32* addrspacecast (i32 addrspace(3)* bitcast (float addrspace(3)* @kern to i32 addrspace(3)*) to i32*) to i64)) to i32*), align 4
ret void
}
define amdgpu_kernel void @k1() {
%ld = load i32, i32* inttoptr (i64 add (i64 ptrtoint (i32* addrspacecast (i32 addrspace(3)* bitcast (float addrspace(3)* @kern to i32 addrspace(3)*) to i32*) to i64), i64 ptrtoint (i32* addrspacecast (i32 addrspace(3)* bitcast (float addrspace(3)* @kern to i32 addrspace(3)*) to i32*) to i64)) to i32*), align 4
ret void
}
```
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D102954/new/
https://reviews.llvm.org/D102954
More information about the llvm-commits
mailing list