[llvm] [AMDGPU] Lower `llvm.amdgcn.queue.ptr` intrinsic to using implicit kernel argument if feasible (PR #103490)

Austin Kerbow via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 15 10:24:41 PDT 2024


kerbowa wrote:

> That is an indirect way of doing that.
> 
> What this patch does is allow us to avoid setting `.hidden_queue_ptr` unnecessarily. Currently we set it whenever the function attribute `amdgpu-no-queue-ptr` is absent, even for GFX9+ with COV5+. Since we already handle the use of the queue pointer for aperture base and trap handling correctly based on COV, it should be safe not to set it. However, we still do. That is probably based on the assumption that a function can call the intrinsic, since we lower it to read the SGPR in any case. If we can lower it to an implicit kernel argument for COV5+, we can safely drop `.hidden_queue_ptr` for COV5+.

What do you mean by `.hidden_queue_ptr`, the preloaded SGPR pair? That should be set with the kernel descriptor (KD) field ENABLE_SGPR_QUEUE_PTR.

Currently the lowering for this is completely broken even without this patch. There is an explicit COV5 check that decides whether to allocate SGPRs for the queue_ptr, and we are not allocating anything. So, as far as I can tell, we end up loading from arbitrary, incorrect SGPRs.
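
To make the mismatch concrete, here is a toy C++ model of the two decisions involved. It is only a sketch and not LLVM code: every name in it is hypothetical, and it assumes the allocation decision depends only on the code object version, ignoring the `amdgpu-no-queue-ptr` attribute and the target generation.

```cpp
#include <cassert>

// Toy model only: none of these names are LLVM APIs.
enum class QueuePtrSource { PreloadedSGPR, ImplicitKernarg };

// Allocation side, as described above: with COV5+ no SGPR pair is
// reserved for the queue pointer. (Assumption: the attribute and the
// subtarget are ignored to keep the model small.)
bool allocatesQueuePtrSGPRs(int codeObjectVersion) {
  assert(codeObjectVersion > 0 && "code object version must be positive");
  return codeObjectVersion < 5;
}

// A lowering that always reads the preloaded SGPR pair is only valid
// when that pair was actually allocated. This returns true when the
// read would hit registers that were never set up for the queue pointer.
bool sgprReadIsUnbacked(int codeObjectVersion) {
  return !allocatesQueuePtrSGPRs(codeObjectVersion);
}

// The direction proposed in the PR: choose a source that matches what
// was allocated, using the implicit kernel argument for COV5+.
QueuePtrSource chooseQueuePtrSource(int codeObjectVersion) {
  return allocatesQueuePtrSGPRs(codeObjectVersion)
             ? QueuePtrSource::PreloadedSGPR
             : QueuePtrSource::ImplicitKernarg;
}
```

In this model `sgprReadIsUnbacked(5)` is true, which is the broken case described above, while `chooseQueuePtrSource(5)` selects the implicit kernel argument instead.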

https://github.com/llvm/llvm-project/pull/103490

