[PATCH] D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 21 17:49:59 PST 2022
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:429-430
+ // implicit kernargs.
+ if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+ removeAssumedBits(IMPLICIT_ARG_PTR);
+ else
----------------
arsenm wrote:
> cfang wrote:
> > arsenm wrote:
> > > cfang wrote:
> > > > arsenm wrote:
> > > > > This should recognize both the intrinsic and a load from the specific offset from the implicitarg ptr, similar to the new hostcall handling. We should still be able to infer that no queue ptr is needed when it lives in memory.
> > > > This is different from the hostcall handling. We handle aperture bases in the backend, and we do not have an explicit intrinsic call for the implicitarg ptr.
> > > It's not different, because some subtargets (pre-gfx9) still use the queue pointer from here.
> > I know some subtargets still use the queue pointer. However, you suggest we use an approach similar to the one for hostcall, and that is a different case. For hostcall, we use implicitarg_ptr + offset, but for aperture bases, we do not have an implicitarg_ptr intrinsic call at all.
> The logical queue pointer value still exists and we can infer that it's not needed, just like hostcall in this case.
We should still be tracking the logical queue pointer.
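
As a rough illustration of what tracking the logical queue pointer could mean, here is a minimal standalone sketch. It is a toy model, not the AMDGPUAttributor code: the `Use` struct, `touchesQueuePtr`, `neededImplicitArgs`, and the offset constants are all hypothetical names and placeholder values invented for the example. It treats the queue pointer as needed when either the intrinsic is seen or a load from the implicitarg ptr overlaps the queue pointer's slot, and as not needed otherwise.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical bitmask of implicit inputs. The real attributor tracks
// "assumed not needed" bits and removes them; this toy model collects
// "needed" bits instead, which is the same information inverted.
enum ImplicitArg : uint32_t {
  QUEUE_PTR = 1u << 0,
  IMPLICIT_ARG_PTR = 1u << 1,
};

// Placeholder layout of the queue pointer slot inside the implicit kernarg
// segment. These values are assumptions made for illustration only.
constexpr uint64_t kQueuePtrOffset = 200;
constexpr uint64_t kQueuePtrSize = 8;

// Toy stand-in for an IR use that the analysis inspects.
struct Use {
  enum Kind { QueuePtrIntrinsic, ImplicitArgLoad, Other } kind;
  uint64_t offset; // byte offset from the implicitarg ptr (loads only)
  uint64_t size;   // number of bytes loaded (loads only)
};

// A use touches the logical queue pointer if it is the intrinsic itself, or
// a load whose byte range [offset, offset + size) overlaps the slot.
bool touchesQueuePtr(const Use &U) {
  switch (U.kind) {
  case Use::QueuePtrIntrinsic:
    return true;
  case Use::ImplicitArgLoad:
    return U.offset < kQueuePtrOffset + kQueuePtrSize &&
           kQueuePtrOffset < U.offset + U.size;
  case Use::Other:
    return false;
  }
  return false;
}

// Start from "nothing is needed" and add a bit for every input that some
// use actually requires.
uint32_t neededImplicitArgs(const std::vector<Use> &Uses) {
  uint32_t Needed = 0;
  for (const Use &U : Uses) {
    if (U.kind == Use::ImplicitArgLoad)
      Needed |= IMPLICIT_ARG_PTR;
    if (touchesQueuePtr(U))
      Needed |= QUEUE_PTR;
  }
  return Needed;
}
```

With this shape, a kernel that only loads some other implicit argument keeps the implicitarg ptr but still lets the queue pointer be inferred as unneeded, regardless of which form the queue pointer access would have taken.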
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119762/new/
https://reviews.llvm.org/D119762