[PATCH] D91516: [AMDGPU][WIP] Lower Function Local LDS Variables.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 19 12:15:07 PST 2021
arsenm added a comment.
You can just use the done checkbox, you don't need to comment on each point
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1510
+ // Collect all the amdgpu kernels defined within the current module.
+ SetVector<Function *> Kernels;
+ for (auto &F : M.functions())
----------------
hsmhsm wrote:
> arsenm wrote:
> > hsmhsm wrote:
> > > arsenm wrote:
> > > > Don't see why this is a SetVector, the ordering shouldn't matter
> > > I intentionally used SetVector in order to make sure that the TRAVERSAL ORDER is preserved during multiple traversals of this set in different places.
> > >
> > > Is it harm to use SetVector here? Is it causes noticible performance issues? If not, why not we use it when it would provide additional benifit of preserving TRAVERSAL ORDER?
> > >
> > Traversal order shouldn't matter though
> Agree, but what is the harm in preserving traversal order? If you insist, I will change, but just curious to know, if it is really a serious comment to be taken care.
You should only try to preserve things that are important, otherwise you are adding cost and complexity for no benefit
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUDeviceScopeSharedVariable.cpp:1512
+ for (auto &F : M.functions())
+ if ((F.getCallingConv() == CallingConv::AMDGPU_KERNEL) &&
+ !F.isDeclaration())
----------------
hsmhsm wrote:
> arsenm wrote:
> > Probably should handle all entry function CCs
> I assume, the only kind entry functions are KERNELS (for hip programming language). And, this pass is targetted only for HIP. With this context, is there anything that we are missing?
The IR is language independent and none of the constructs here are tied to a language
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h:43
static bool EnableFixedFunctionABI;
+ static bool EnableDeviceScopeSharedVariable;
----------------
hsmhsm wrote:
> arsenm wrote:
> > hsmhsm wrote:
> > > arsenm wrote:
> > > > I don't think this needs to be exposed here, there's no reason other places would need to inspect this
> > > This is needed in the function `AMDGPUAlwaysInlinePass.cpp`. Please look at the changes being made to this file.
> > This should be a required pass without an option to disable it that the pass would need to be aware of. Is this just for debug/bringup?
> Current idea and common consensus (as I understand it) is to guard this feature by a flag, and disable it by default. HIP application programmer need to pass the option to enable this feature, at least in the begining. May be in future, we can think of getting rid of the option.
This isn't a user exposed flag, and there shouldn't be a need for users to set one.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D91516/new/
https://reviews.llvm.org/D91516
More information about the llvm-commits
mailing list