[PATCH] D103655: [AMDGPU] Handle constant LDS uses from different kernels

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 4 11:45:01 PDT 2021


rampitec added a comment.

> In D103655#2797809 <https://reviews.llvm.org/D103655#2797809>, @rampitec wrote:
>
>> I think from this we are just one step from removing module lds except for potentially indirect functions. Everything else can be moved into kernel lds structure.
>>
>> We do have a problem with excessive use of lds in rocBLAS because of the module lds already, so technically we have a regression with it. There is a w/a but it is better be solved sooner rather than later.
>
> I am not sure, if we can completely eliminate module lds. My understanding is that it is still required to handle within non-kernel function used LDS.  But, instead of directly dealing with LDS, it should deal with pointers, hence the patch https://reviews.llvm.org/D103225.

Not completely, but we can omit module lds for functions which are only used from a single kernel at the very least.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp:266
+        // current kernel.
+        for (Value::user_iterator UI = GV->user_begin(), E = GV->user_end();
+             UI != E;) {
----------------
foad wrote:
> Can this be `for (auto &U : make_early_inc_range(GV->users()))`?
Thanks!


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp:279
+          Instruction *I = dyn_cast<Instruction>(U.getUser());
+          return I && I->getFunction() == F;
         });
----------------
hsmhsm wrote:
> Probably, we can change
> 
> ```
>   Instruction *I = dyn_cast<Instruction>(U.getUser());
>   return I && I->getFunction() == F;
> ```
> 
> to
> 
> ```
>   Instruction *I = cast<Instruction>(U.getUser());
>   return I->getFunction() == F;
> ```
Not really, there can be non-instruction uses from outside of the kernel. In fact assert above is also not correct, we may not have converted non-kernel constant exprs.


================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPULDSUtils.cpp:53
 
-  return false;
+  collectFunctionUses(C, F, InstUsers);
+  for (Instruction *I : InstUsers) {
----------------
hsmhsm wrote:
> Within the pointer replacement patch https://reviews.llvm.org/D103225, we already have this functionality in place - please take a look at the utility function getFunctionToInstsMap(). Probably we can reuse it here. Once we get FunctionToInstsMap, we can consider only key F, and ignore all others.
> 
> If you think it is a good idea, then, I can create new patch by taking out this utility function from https://reviews.llvm.org/D103225. I leave it to you.
It is not the same, I am only interested in instructions with constant expr uses, a much smaller set.


================
Comment at: llvm/test/CodeGen/AMDGPU/lower-module-lds-constantexpr.ll:43-45
+; CHECK: %4 = bitcast float addrspace(3)* getelementptr inbounds (%llvm.amdgcn.kernel.timestwo.lds.t, %llvm.amdgcn.kernel.timestwo.lds.t addrspace(3)* @llvm.amdgcn.kernel.timestwo.lds, i32 0, i32 0) to i32 addrspace(3)*
+; CHECK: %5 = addrspacecast i32 addrspace(3)* %4 to i32*
+; CHECK: %6 = ptrtoint i32* %5 to i64
----------------
foad wrote:
> Why do %4, %5, %6 need to be Instructions? Couldn't they could be left as ConstantExprs?
It converts the whole constant expr, this is how D103661 works (and you have already commented there). I am not sure that is important though. Do you see any benefits of having long constant expressions vs instructions? We have to admit the testcase is quite degenerate too, more a torture test than a real life use.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103655/new/

https://reviews.llvm.org/D103655



More information about the llvm-commits mailing list