[PATCH] D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 15:13:45 PST 2022


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:429-430
+      // implicit kernargs.
+      if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+        removeAssumedBits(IMPLICIT_ARG_PTR);
+      else
----------------
cfang wrote:
> arsenm wrote:
> > cfang wrote:
> > > arsenm wrote:
> > > > This should recognize both the intrinsic and a load at the specific offset from the implicitarg ptr, similar to the new hostcall handling. We should still be able to infer that no queue ptr is needed when it lives in memory.
> > > This is different from the hostcall handling. We handle aperture bases in the backend, and there is no explicit intrinsic call for the implicitarg ptr.
> > It's not different, because some subtargets (pre-gfx9) still use the queue pointer from here.
> I know some subtargets still use the queue pointer. However,
> you suggest we use an approach similar to the hostcall handling,
> but this is a different case. For hostcall, we use implicitarg_ptr + offset, but for aperture bases we do not have an implicitarg_ptr
> intrinsic call at all.
The logical queue pointer value still exists, and we can infer that it's not needed, just like hostcall in this case.
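
To make the suggestion concrete, here is a minimal standalone sketch of the inference being asked for. It is a toy model, not the AMDGPUAttributor code: the `AccessKind`, `Access`, `ByteRange` and `needsQueuePtr` names are hypothetical, and the slot offsets are placeholders rather than values taken from the code object ABI. The idea is that the queue pointer counts as needed if the intrinsic is called, if a load at a known offset from the implicitarg ptr touches the queue pointer slot, or if the implicitarg ptr is used in a way that cannot be analyzed.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical toy model; none of these names exist in LLVM.
enum class AccessKind {
  QueuePtrIntrinsic,  // direct call to a queue pointer intrinsic
  ImplicitArgLoad,    // load at a constant offset from the implicitarg ptr
  ImplicitArgUnknown  // implicitarg ptr escapes or the offset is not constant
};

struct Access {
  AccessKind Kind;
  std::uint64_t Offset; // byte offset, meaningful for ImplicitArgLoad only
  std::uint64_t Size;   // bytes read, meaningful for ImplicitArgLoad only
};

// Half-open byte range [Begin, End) inside the implicit kernarg block.
struct ByteRange {
  std::uint64_t Begin;
  std::uint64_t End;
};

// Placeholder slot position, NOT taken from the real ABI.
constexpr std::uint64_t kPlaceholderSlotBegin = 64;
constexpr std::uint64_t kPlaceholderSlotEnd = 72;

// True if the bytes read by A intersect the range R.
bool overlaps(const Access &A, ByteRange R) {
  return A.Offset < R.End && R.Begin < A.Offset + A.Size;
}

// Returns true unless every access provably avoids the queue pointer.
bool needsQueuePtr(const std::vector<Access> &Accesses, ByteRange Slot) {
  for (const Access &A : Accesses) {
    switch (A.Kind) {
    case AccessKind::QueuePtrIntrinsic:
    case AccessKind::ImplicitArgUnknown:
      return true; // explicit use, or be conservative when unsure
    case AccessKind::ImplicitArgLoad:
      if (overlaps(A, Slot))
        return true;
      break;
    }
  }
  return false;
}
```

Under this model a kernel whose only implicitarg accesses land outside the slot is inferred not to need the queue pointer, while any unanalyzable use keeps it conservatively enabled.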


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119762/new/

https://reviews.llvm.org/D119762


