[PATCH] D120265: AMDGPU: Use the implicit kernargs for code object version 5
Changpeng Fang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 21 21:53:12 PST 2022
cfang added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4876
+ Register LoadAddr;
+ B.materializePtrAdd(LoadAddr, KernargPtrReg, LLT::scalar(64), Offset);
+ // Load address
----------------
arsenm wrote:
> You're repeating this long sequence to get the queue pointer in two places, should common these into a function to get the queue pointer. Alternatively, emit the intrinsic and move this expansion into a lowering of the queue pointer intrinsic
We are loading different implicit kernel arguments in these two places: one is queue_ptr, and the other is private_base/shared_base. I can try to figure out whether some common part can be factored out.
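To make the possible common part concrete, here is a standalone sketch of it. It is not LLVM code and not what the patch does. It only models the shared step: add a per-argument offset to the start of the implicit kernargs and load from that address. All names (loadImplicitKernarg, getQueuePtr, getApertureBase), the offset values, and the field widths are assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical offsets into the implicit kernarg block. These are placeholder
// values for illustration only; they are not taken from the patch.
enum ImplicitArgOffset : std::size_t {
  PrivateBaseOffset = 0, // assumed 32-bit field
  SharedBaseOffset = 4,  // assumed 32-bit field
  QueuePtrOffset = 8,    // assumed 64-bit field
};

// Common part: compute "implicit-arg start + field offset" within the kernarg
// segment and load a value of type T from that address.
template <typename T>
T loadImplicitKernarg(const std::vector<std::uint8_t> &KernargSegment,
                      std::size_t ImplicitArgStart, std::size_t FieldOffset) {
  const std::size_t Addr = ImplicitArgStart + FieldOffset;
  assert(Addr + sizeof(T) <= KernargSegment.size() && "load out of bounds");
  T Value;
  std::memcpy(&Value, KernargSegment.data() + Addr, sizeof(T));
  return Value;
}

// First caller: the queue pointer.
std::uint64_t getQueuePtr(const std::vector<std::uint8_t> &KernargSegment,
                          std::size_t ImplicitArgStart) {
  return loadImplicitKernarg<std::uint64_t>(KernargSegment, ImplicitArgStart,
                                            QueuePtrOffset);
}

// Second caller: private_base or shared_base, selected by IsPrivate.
std::uint32_t getApertureBase(const std::vector<std::uint8_t> &KernargSegment,
                              std::size_t ImplicitArgStart, bool IsPrivate) {
  return loadImplicitKernarg<std::uint32_t>(
      KernargSegment, ImplicitArgStart,
      IsPrivate ? PrivateBaseOffset : SharedBaseOffset);
}
```

In this shape the two call sites differ only in the offset and the loaded type, so the address computation and the load live in one place.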
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120265/new/
https://reviews.llvm.org/D120265