[llvm] [AMDGPU] Limit promoting allocas that have users with dynamic index above a threshold on number of elements (PR #170327)

Kevin Choi via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 3 08:50:38 PST 2025


choikwa wrote:

> > dynamic indexing will blow up compile-time in GreedyRA.
> 
> Have you done some further investigation why it causes issues in GreedyRA?
> 
> Note that by adding another limit, we are also making the pass less useful in alloca promotion. Do you have the runtime performance and compile-time numbers with and without this change for your case? `8` sounds too small, maybe `16`? (since the case you cared has `32` elements).

Yes, we had an MLIR testcase (SWDEV-559837) whose compile time blew up when promote alloca tried to create <128 x i8> with <16 x i8> users. After rejecting those cases, compile time dropped from ~2min to 0.5s in my sandbox. Investigation showed that a long chain of extract/insert elements with a dynamic index ends up creating 35x more LiveIntervals for GreedyRA to deal with, and GreedyRA then gets bogged down in interference checks during the eviction phase.
I've discussed this with colleagues, and the hope is that this fix is surgical enough to avoid dropping runtime perf while targeting compile time. Internally, we are tracking runtime perf and thought that this change was too small to warrant a custom request.
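
For illustration only, the kind of check being discussed can be sketched as below. This is not the code in the PR: the struct names, the function `shouldPromoteToVector`, and the constant `kDynIndexEltLimit` are hypothetical, and the sketch assumes the threshold is compared against the alloca's element count. The default of 8 is the value mentioned in the review comment quoted above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of a promotion candidate.
// These are NOT LLVM's real data structures.
struct AllocaUser {
  bool hasDynamicIndex;   // index is not a compile-time constant
  std::size_t accessElts; // elements touched by this user, e.g. 16 for <16 x i8>
};

struct AllocaCandidate {
  std::size_t numElts;           // e.g. 128 for <128 x i8>
  std::vector<AllocaUser> users; // loads/stores that access the alloca
};

// Assumed name; 8 is the value questioned in the review.
constexpr std::size_t kDynIndexEltLimit = 8;

// Returns false when the candidate is larger than the limit and at least one
// user indexes it dynamically; every other candidate is still promoted.
bool shouldPromoteToVector(const AllocaCandidate &c,
                           std::size_t limit = kDynIndexEltLimit) {
  if (c.numElts <= limit)
    return true; // small vectors are promoted regardless of indexing
  for (const AllocaUser &u : c.users)
    if (u.hasDynamicIndex)
      return false; // large vector with a dynamic index: skip promotion
  return true;      // large vector, constant indices only
}
```

Under these assumptions, the <128 x i8> case with dynamically indexed <16 x i8> users is rejected, while the same alloca with only constant-index users, or any alloca at or below the limit, is still promoted.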

https://github.com/llvm/llvm-project/pull/170327

