[PATCH] D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 14 12:04:51 PST 2022
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:55
+ bool HasApertureRegs, bool SupportsGetDoorBellID ) {
+ unsigned CodeObjectVersion = AMDGPU::getAmdhsaCodeObjectVersion();
switch (ID) {
----------------
The code object version really ought to be something read from the IR.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:429-430
+ // implicit kernargs.
+ if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+ removeAssumedBits(IMPLICIT_ARG_PTR);
+ else
----------------
This should recognize both the intrinsic and a load from the specific offset from the implicitarg ptr, similar to the new hostcall handling. We should still be able to infer that no queue ptr is needed when it lives in memory.
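A minimal, self-contained sketch of the kind of check described above. Everything here is hypothetical and is not LLVM API: `UseKind`, `UseSketch`, and `needsQueuePtr` are invented for illustration, and the queue ptr slot's offset and size are passed in by the caller as assumptions rather than taken from the ABI.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of one use found while scanning a function.
enum class UseKind { QueuePtrIntrinsic, ImplicitArgLoad, Other };

struct UseSketch {
  UseKind Kind;
  std::uint64_t Offset; // byte offset from the implicitarg ptr (loads only)
  std::uint64_t Size;   // number of bytes loaded (loads only)
};

// Returns true if any use requires the queue ptr. QueuePtrOffset and
// QueuePtrSize describe the queue ptr slot inside the implicit kernargs;
// both are caller-supplied assumptions, not real ABI values.
bool needsQueuePtr(const std::vector<UseSketch> &Uses,
                   std::uint64_t QueuePtrOffset, std::uint64_t QueuePtrSize) {
  for (const UseSketch &U : Uses) {
    // Direct use of the queue ptr intrinsic.
    if (U.Kind == UseKind::QueuePtrIntrinsic)
      return true;
    // Load through the implicitarg ptr that overlaps the queue ptr slot.
    if (U.Kind == UseKind::ImplicitArgLoad &&
        U.Offset < QueuePtrOffset + QueuePtrSize &&
        QueuePtrOffset < U.Offset + U.Size)
      return true;
  }
  // Neither form was seen, so "no queue ptr" can be inferred.
  return false;
}
```

With this shape, a function that only loads other implicit kernarg fields still gets the "no queue ptr" inference, even though the queue ptr is stored in memory.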
================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h:653
+ bool needsQueuePtrUserSGPRs() const {
+ return QueuePtr && AMDGPU::getAmdhsaCodeObjectVersion() < 5;
+ }
----------------
I don't see why this query needs to change; the code object version can be considered when setting QueuePtr initially.
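A rough sketch of that alternative, using a hypothetical stand-in rather than the real SIMachineFunctionInfo. The struct name, the constructor parameters, and the accessor name are assumptions for illustration; the `< 5` condition mirrors the one in the quoted diff.

```cpp
// Hypothetical stand-in for SIMachineFunctionInfo; not the real class.
struct FunctionInfoSketch {
  bool QueuePtr = false;

  // The code object version is folded in once, when QueuePtr is first set.
  // Assumption: with code object v5 no user SGPR is needed for the queue ptr.
  FunctionInfoSketch(bool UsesQueuePtr, unsigned CodeObjectVersion)
      : QueuePtr(UsesQueuePtr && CodeObjectVersion < 5) {}

  // The query stays a plain accessor with no version check.
  bool hasQueuePtr() const { return QueuePtr; }
};
```

Keeping the version check at initialization means every later consumer of the flag sees a single consistent answer.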
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119762/new/
https://reviews.llvm.org/D119762