[PATCH] D103261: [AMDGPU] Fix natural alignment of LDS globals during LDS lowering.

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 27 12:24:42 PDT 2021


rampitec added a comment.

In D103261#2785495 <https://reviews.llvm.org/D103261#2785495>, @JonChesterfield wrote:

> What's the use case for this? It will increase memory use if it increases the alignment of any variables

It may increase fragmentation if we have a lot of small underaligned arrays, but in general superalignment should give better performance. We are trading memory for performance. Changes exposed by the ISA tests are mostly positive.

We may want to add a threshold on FoundLocalVars.size() to inhibit superalignment, since fragmentation is a function of the number of variables. Let's say that in the worst case we waste 15 bytes per variable. To stay below 1 KB the threshold would be 68, which is plenty of variables. In reality fragmentation will be even lower, as not every variable will waste the maximum.

The other option is to compute the total allocation and only superalign if ST.getOccupancyWithLocalMemSize() does not drop (and we do not exceed getLocalMemorySize(), of course). The latter is more expensive but should work better than a simple threshold. AMDGPUSubtarget::getMaxLocalMemSizeWithWaveCount() can be used to simplify the logic (see AMDGPUPromoteAlloca.cpp for the usage).
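The threshold arithmetic above can be sketched in standalone C++. This is only an illustration, not code from the patch: LDSVar, roundUp, maxVarsWithinBudget, layoutSize and paddingBytes are hypothetical names, and the model assumes variables are laid out in the order given with power-of-two alignments.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one LDS variable: size and alignment in bytes.
struct LDSVar {
  uint64_t Size;
  uint64_t Align; // assumed to be a power of two
};

// Round Offset up to the next multiple of Align.
inline uint64_t roundUp(uint64_t Offset, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Offset + Align - 1) & ~(Align - 1);
}

// Largest variable count whose worst-case padding still fits in BudgetBytes.
constexpr uint64_t maxVarsWithinBudget(uint64_t BudgetBytes,
                                       uint64_t WorstWastePerVar) {
  return BudgetBytes / WorstWastePerVar;
}

// Total bytes used when the variables are laid out in the given order.
inline uint64_t layoutSize(const std::vector<LDSVar> &Vars) {
  uint64_t Offset = 0;
  for (const LDSVar &V : Vars)
    Offset = roundUp(Offset, V.Align) + V.Size;
  return Offset;
}

// Bytes lost to alignment padding in that layout.
inline uint64_t paddingBytes(const std::vector<LDSVar> &Vars) {
  uint64_t Payload = 0;
  for (const LDSVar &V : Vars)
    Payload += V.Size;
  return layoutSize(Vars) - Payload;
}
```

With a 1024-byte budget and 15 bytes of worst-case waste per variable, maxVarsWithinBudget(1024, 15) gives 68, matching the figure above. The 15-byte worst case corresponds to a 16-byte-aligned variable placed right after a single byte, e.g. paddingBytes for {1, 1} followed by {16, 16}.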

I'd say I am in favor of this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103261/new/

https://reviews.llvm.org/D103261


