[PATCH] D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 14 14:27:33 PST 2022
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:429-430
+ // implicit kernargs.
+ if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+ removeAssumedBits(IMPLICIT_ARG_PTR);
+ else
----------------
cfang wrote:
> arsenm wrote:
> > This should recognize both the intrinsic and load from the specific offset from the implicitarg ptr, similar to the new hostcall handling. We still should be able to infer no queue ptr with it in memory
> This is different from the case of hostcall handling. We are handling aperture bases in the backend. We do not have explicit intrinsic call for implicitarf ptr.
It's not different because some subtargets still use the queue pointer from here (pre gfx9)
================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h:653
+ bool needsQueuePtrUserSGPRs() const {
+ return QueuePtr && AMDGPU::getAmdhsaCodeObjectVersion() < 5;
+ }
----------------
cfang wrote:
> arsenm wrote:
> > I don't see why this query needs to change, the code object version can be considered when setting QueuePtr initially
> I intend to factor out same thing like QuterPre && (CodeObjectVersion < 5).
> Can we introduce a global function in AMDGPU space like:
> bool AMDGPU::needsQueuePreUserSGPRs(MachineFunctionInfo MFI) to achieve that purpose?
No, I mean it's inappropriate to check the version here. The version+subtarget should change whether QueuePtr is set, not add an additional check here
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119762/new/
https://reviews.llvm.org/D119762
More information about the llvm-commits
mailing list