[PATCH] D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary

Fri Feb 25 16:38:19 PST 2022

cfang added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:429-430
+      // implicit kernargs.
+      if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+        removeAssumedBits(IMPLICIT_ARG_PTR);
+      else
----------------
arsenm wrote:
> arsenm wrote:
> > cfang wrote:
> > > arsenm wrote:
> > > > cfang wrote:
> > > > > arsenm wrote:
> > > > > > This should recognize both the intrinsic and a load from the specific offset from the implicitarg ptr, similar to the new hostcall handling. We should still be able to infer no queue ptr with it in memory.
> > > > > This is different from the hostcall handling case. We are handling aperture bases in the backend. We do not have an explicit intrinsic call for the implicitarg ptr.
> > > > It's not different, because some subtargets (pre-gfx9) still use the queue pointer from here.
> > > I know some subtargets still use the queue pointer. However, you suggest we use an approach similar to how we handle hostcall, but the two cases actually differ. For hostcall, we use implicitarg_ptr + offset, but for aperture bases, we do not have an implicitarg_ptr intrinsic call at all.
> > The logical queue pointer value still exists and we can infer that it's not needed, just like hostcall in this case.
> We should still be tracking the logical queue pointer
Can you be explicit about what the "logical queue pointer" is here? And why do we need to track it?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119762/new/

https://reviews.llvm.org/D119762