[PATCH] D100481: [AMDGPU] Disable forceful inline of non-kernel functions which use LDS.
Jon Chesterfield via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 14 09:51:43 PDT 2021
JonChesterfield requested changes to this revision.
JonChesterfield added a comment.
This revision now requires changes to proceed.
Change requested is to not change the handling of region_address. I haven't looked up what that is but I am sure the module lds pass doesn't do anything with it.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAlwaysInlinePass.cpp:125
+
+ if (!AMDGPUTargetMachine::EnableLowerModuleLDS) {
+ for (GlobalVariable &GV : M.globals()) {
----------------
hsmhsm wrote:
> JonChesterfield wrote:
> > What's REGION_ADDRESS? LowerModuleLDS doesn't to do anything with it so this looks like an incidental behaviour change
> Just like LOCAL_ADDRESS is specific to work-group, REGION_ADDRESS is specific to device, but it is not yet implemented in AMDHSA. Please have a look at the link - https://llvm.org/docs/AMDGPUUsage.html#address-spaces.
>
> DS instructions can be used to access region memory too in addition to local memory. And, I guess, this is the region that code for local memory and region memory co-exist together at some places.
>
> @arsenm might give better answer here about if it will result in an incidental behaviour change.
Let's refactor this so that it doesn't change behaviour for REGION_ADDRESS, as otherwise we're introducing at least one of confusion or a latent bug.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:396
bool AMDGPUTargetMachine::EnableFixedFunctionABI = false;
+bool AMDGPUTargetMachine::EnableLowerModuleLDS = false;
----------------
hsmhsm wrote:
> JonChesterfield wrote:
> > Why does the cl::opt instance initialize to true, while this initializes to false?
> >
> > I'm not clear why the storage needs to become external here, but do see that the same thing has been done for the similar variables in this block.
> cl::opt instance initialize it to true, because, by default the flag is enabled, and also this same instance make sure that AMDGPUTargetMachine::EnableLowerModuleLDS is correctly initialized to default true. Please have a look at cl::location(AMDGPUTargetMachine::EnableLowerModuleLDS).
>
> However, since AMDGPUTargetMachine::EnableLowerModuleLDS is a static variable, it is initialized to "false" at class scope.
>
> The storage needs to be external, because it needs to used at different TU, like this one used within AMDGPUAlwaysInlinePass.cpp.
>
> For example, please take a look at the flag EnableAMDGPUFunctionCallsOpt, and its corresponding static member - AMDGPUTargetMachine::EnableFunctionCalls.
>
> I just followed the same philosophy here.
I would say you've copied a mistake from enable function calls. Look at EnableFixedFunctionABI for a case where the initializer is consistent between the two places.
It looks like none of these need explicit initializers to me, but if we're going to have one, lets have it match the default for the command line argument.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D100481/new/
https://reviews.llvm.org/D100481
More information about the llvm-commits
mailing list