[llvm] [AMDGPU] Introduce "amdgpu-sw-lower-lds" pass to lower LDS accesses. (PR #87265)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 22 08:46:08 PDT 2024
b-sumner wrote:
> > Which runtime are you talking about? The firmware or trap handler? And where are they going to place the pointer? Or are you even considering a new architected or reserved register to hold it and a new ABI?
>
> Presumably the implicit kernel arguments, and whatever is setting that up. It's essentially a partner to the queue pointer, which is also in the implicit kernargs.
OK. Suppose the launch has a million work groups. How much memory should the runtime allocate, and how will workgroup J decode which part of that memory to use? It can certainly be done, but I'm wondering whether we really need to do it now. And how much do we really need an independently working SW LDS?
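
To make the sizing question concrete, here is a minimal sketch of the most naive scheme one could imagine: the runtime allocates a single buffer with one fixed-size slot per workgroup, and each workgroup indexes it by its flattened workgroup ID. Everything here is an assumption for illustration only: the function names are hypothetical, the 64 KiB slot size is an assumed per-workgroup LDS budget, and nothing in this PR or in any existing runtime is claimed to work this way.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical naive scheme: one fixed-size slot per workgroup in a single
// runtime-allocated buffer. None of these names come from the PR.

// Total bytes the runtime would have to allocate up front.
constexpr std::uint64_t naiveTotalBytes(std::uint64_t numWorkgroups,
                                        std::uint64_t bytesPerWorkgroup) {
  return numWorkgroups * bytesPerWorkgroup;
}

// Flatten a 3D workgroup ID (x, y, z) in a grid that has numX * numY
// workgroups per z-slice.
constexpr std::uint64_t flatWorkgroupId(std::uint64_t x, std::uint64_t y,
                                        std::uint64_t z, std::uint64_t numX,
                                        std::uint64_t numY) {
  return (z * numY + y) * numX + x;
}

// Byte offset of workgroup J's slot inside the buffer.
constexpr std::uint64_t slotOffset(std::uint64_t flatId,
                                   std::uint64_t bytesPerWorkgroup) {
  return flatId * bytesPerWorkgroup;
}

// Assumed slot size: 64 KiB per workgroup (an illustration, not a spec).
constexpr std::uint64_t kAssumedSlotBytes = 64 * 1024;

// One million workgroups at 64 KiB each is 65,536,000,000 bytes.
static_assert(naiveTotalBytes(1000000, kAssumedSlotBytes) == 65536000000ULL,
              "naive per-workgroup allocation size");
```

Under these assumptions a million-workgroup launch would need roughly 65.5 GB, which is why a one-slot-per-workgroup layout is unlikely to be practical and some form of slot reuse among resident workgroups would be needed instead.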
https://github.com/llvm/llvm-project/pull/87265