[llvm] [AMDGPU] Use table strategy for LowerModuleLDSPass at O0 (PR #160181)

Jon Chesterfield via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 30 13:41:30 PDT 2025


JonChesterfield wrote:

GitHub won't let me comment at the right place in the diff.

set_is_subset(K.second, HybridModuleRootKernels)

That^ is too optimistic: it needs to be set equality, not subset. Switching to equality will reduce how often the module path is taken, which will make some kernels slower. So the fix is probably to change that check to equality first, and then have a second patch work harder to recover the anticipated performance loss.
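To make the difference between the two checks concrete, here is a minimal standalone sketch. It does not use LLVM's `set_is_subset`; it uses `std::set` and `std::includes` instead, and the names `KernelSet`, `isSubset`, `isEqual`, `example` and the kernel names are hypothetical. It assumes, from my reading of the comment above, that `K.second` is the set of kernels that reach a given variable and that `HybridModuleRootKernels` is the set of kernels the module path allocates for.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the kernel sets compared in the pass.
using KernelSet = std::set<std::string>;

// Subset check: every kernel that reaches the variable is a root kernel.
// std::includes needs sorted ranges, which std::set guarantees.
bool isSubset(const KernelSet &Reaching, const KernelSet &Roots) {
  return std::includes(Roots.begin(), Roots.end(),
                       Reaching.begin(), Reaching.end());
}

// Equality check: the variable is reached by exactly the root kernels.
bool isEqual(const KernelSet &Reaching, const KernelSet &Roots) {
  return Reaching == Roots;
}

void example() {
  KernelSet Roots = {"k0", "k1", "k2"};
  KernelSet Reaching = {"k0"}; // only one root kernel uses the variable

  // The subset check accepts this variable for the module path...
  assert(isSubset(Reaching, Roots));
  // ...but the equality check rejects it, because k1 and k2 never use it.
  assert(!isEqual(Reaching, Roots));
}
```

In this sketch the subset check accepts a variable that only `k0` touches, which under the stated assumption would mean `k1` and `k2` also carry the allocation. The equality check only accepts variables that every root kernel reaches.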

Essentially, we currently have a path that chooses faster instruction execution over minimising allocation. That's not a deliberate design choice, more an oversight in the original implementation that went unnoticed. Thank you for picking up on it!

https://github.com/llvm/llvm-project/pull/160181

