[PATCH] D102401: [AMDGPU] Allocate LDS globals in sorted order of their alignment and size.

Mon May 17 09:38:39 PDT 2021

arsenm added a comment.

In D102401#2763512 <https://reviews.llvm.org/D102401#2763512>, @hsmhsm wrote:

> In D102401#2763394 <https://reviews.llvm.org/D102401#2763394>, @arsenm wrote:
>
>> If we already have a pass that condenses the LDS globals into a single variable to access, what is the advantage of this? Why can't we just always do that compacting? Codegen would then not have to worry about optimal LDS packing, since it would only see the one global.
>
> But do we really have one right now? As you might know, "LowerModuleLDSPass" will not guarantee it (at least as it exists today). We need to have something working until we have a kind of perfect solution(s). And I strongly believe that this patch is really useful, in the sense that it will definitely increase the probability of an unaligned ds access being aligned at runtime. Moreover, it only touches kernels, not every single function in the module, as you said in one of the earlier patches. I do not see any harm in having this patch, unless I am missing something.

I'm saying you're putting a nontrivial IR inspection into a codegen pass. It would be cleaner if we could leave the allocation optimizations entirely in an IR pass.
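
For readers following the thread, here is a minimal sketch of why allocation order affects LDS packing. It is not code from the patch: `LdsVar`, `alignTo`, `layoutSize`, and `sortedForPacking` are hypothetical names, and the ordering shown (descending alignment, then descending size) is an assumption about what "sorted order of their alignment and size" means.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an LDS global: only size and alignment matter here.
struct LdsVar {
  uint32_t size;  // size in bytes
  uint32_t align; // alignment in bytes, assumed to be a power of two
};

// Round offset up to the next multiple of align (power of two assumed).
inline uint32_t alignTo(uint32_t offset, uint32_t align) {
  return (offset + align - 1) & ~(align - 1);
}

// Total bytes consumed when the variables are laid out in the given order,
// inserting padding wherever the next variable's alignment requires it.
inline uint32_t layoutSize(const std::vector<LdsVar> &vars) {
  uint32_t offset = 0;
  for (const LdsVar &v : vars)
    offset = alignTo(offset, v.align) + v.size;
  return offset;
}

// Assumed ordering: descending alignment, ties broken by descending size.
inline std::vector<LdsVar> sortedForPacking(std::vector<LdsVar> vars) {
  std::stable_sort(vars.begin(), vars.end(),
                   [](const LdsVar &a, const LdsVar &b) {
                     if (a.align != b.align)
                       return a.align > b.align;
                     return a.size > b.size;
                   });
  return vars;
}
```

With variables of (size, align) = (1,1), (8,8), (2,2), (4,4) laid out in that order, padding brings the total to 24 bytes; laid out in the sorted order, the same variables fit in 15 bytes. The same computation could live in either an IR pass or codegen, which is the placement question being discussed here.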

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102401/new/

https://reviews.llvm.org/D102401