[PATCH] D100481: [AMDGPU] Disable forceful inline of non-kernel functions which use LDS.

Mahesha S via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 14 08:40:38 PDT 2021


hsmhsm added a comment.

In D100481#2689032 <https://reviews.llvm.org/D100481#2689032>, @JonChesterfield wrote:

> I agree with the intent. This would be a good change to run through the internal CI before landing.

I agree to some extent that it will be good change to run through the internal CI before landing, but, not sure, why you did not take care of it while you were landing your original patch. Because, without this change, your patch is practically not much applicable.

Can we upstream it and watch out how your patch really performs with this change?

@arsenm what would be your suggestion here?



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp:125
+
+  if (!AMDGPUTargetMachine::EnableLowerModuleLDS) {
+    for (GlobalVariable &GV : M.globals()) {
----------------
JonChesterfield wrote:
> What's REGION_ADDRESS? LowerModuleLDS doesn't to do anything with it so this looks like an incidental behaviour change
Just like LOCAL_ADDRESS is specific to work-group,  REGION_ADDRESS is specific to device, but it is not yet implemented in AMDHSA.  Please have a look at the link - https://llvm.org/docs/AMDGPUUsage.html#address-spaces.

DS instructions can be used to access region memory too in addition to local memory. And, I guess, this is the region that code for local memory and region memory co-exist together at some places.

@arsenm might give better answer here about if it will result in an incidental behaviour change.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:396
 bool AMDGPUTargetMachine::EnableFixedFunctionABI = false;
+bool AMDGPUTargetMachine::EnableLowerModuleLDS = false;
 
----------------
JonChesterfield wrote:
> Why does the cl::opt instance initialize to true, while this initializes to false?
> 
> I'm not clear why the storage needs to become external here, but do see that the same thing has been done for the similar variables in this block.
cl::opt instance initialize it to true, because, by default the flag is enabled, and also this same instance make sure that AMDGPUTargetMachine::EnableLowerModuleLDS is correctly initialized to default true. Please have a look at cl::location(AMDGPUTargetMachine::EnableLowerModuleLDS).

However, since AMDGPUTargetMachine::EnableLowerModuleLDS is a static variable, it is initialized to "false" at class scope.

The storage needs to be external, because it needs to used at different TU, like this one used within AMDGPUAlwaysInlinePass.cpp.

For example, please take a look at the flag EnableAMDGPUFunctionCallsOpt, and its corresponding static member - AMDGPUTargetMachine::EnableFunctionCalls.

I just followed the same philosophy here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100481/new/

https://reviews.llvm.org/D100481



More information about the llvm-commits mailing list