[PATCH] D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary

Changpeng Fang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 14:59:43 PST 2022


cfang added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:429-430
+      // implicit kernargs.
+      if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+        removeAssumedBits(IMPLICIT_ARG_PTR);
+      else
----------------
arsenm wrote:
> cfang wrote:
> > arsenm wrote:
> > > This should recognize both the intrinsic and load from the specific offset from the implicitarg ptr, similar to the new hostcall handling. We still should be able to infer no queue ptr with it in memory
> > This is different from the case of hostcall handling. We are handling aperture bases in the backend. We do not have an explicit intrinsic call for the implicitarg ptr.
> It's not different because some subtargets still use the queue pointer from here (pre gfx9)
I know some subtargets still use the queue pointer. However,
you suggested using an approach similar to the hostcall handling,
and the two cases are different. For hostcall, the access is implicitarg_ptr + offset, but for aperture bases, there is no implicitarg_ptr
intrinsic call at all.
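
To make the distinction concrete, here is a minimal, self-contained C++ sketch. It is a toy model, not the AMDGPUAttributor code: `Op`, `Inst`, `inferAssumedBits`, the `NO_*` masks, and `kToyHostcallOffset` are all hypothetical names, and the offset is a placeholder rather than the real ABI value. The non-v5 branch (dropping the queue pointer bit) is an assumption about what the elided `else` in the patch does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical "assumed not needed" bits, loosely modeled on the
// removeAssumedBits() call in the patch. Not the pass's real encoding.
constexpr uint32_t NO_QUEUE_PTR        = 1u << 0;
constexpr uint32_t NO_IMPLICIT_ARG_PTR = 1u << 1;
constexpr uint32_t NO_HOSTCALL_PTR     = 1u << 2;
constexpr uint32_t ALL_ASSUMED =
    NO_QUEUE_PTR | NO_IMPLICIT_ARG_PTR | NO_HOSTCALL_PTR;

// Placeholder offset for illustration only; not the real ABI offset.
constexpr uint32_t kToyHostcallOffset = 16;

// Toy instruction kinds standing in for what the analysis can see in IR.
enum class Op {
  LoadFromImplicitArg, // load at implicitarg_ptr + offset (visible in IR)
  AddrSpaceCast,       // needs an aperture base, materialized in the backend
  Other
};

struct Inst {
  Op op;
  uint32_t offset; // meaningful only for LoadFromImplicitArg
};

// Start optimistic (nothing needed) and drop an assumption whenever an
// instruction shows that the corresponding input is required.
uint32_t inferAssumedBits(const std::vector<Inst> &insts,
                          bool apertureInImplicitArgs) {
  uint32_t assumed = ALL_ASSUMED;
  for (const Inst &inst : insts) {
    switch (inst.op) {
    case Op::LoadFromImplicitArg:
      // Hostcall-style use: an explicit pointer plus a known offset.
      assumed &= ~NO_IMPLICIT_ARG_PTR;
      if (inst.offset == kToyHostcallOffset)
        assumed &= ~NO_HOSTCALL_PTR;
      break;
    case Op::AddrSpaceCast:
      // Aperture-base use: no intrinsic call exists in the IR, so the
      // requirement is inferred from the cast itself.
      assumed &= ~(apertureInImplicitArgs ? NO_IMPLICIT_ARG_PTR
                                          : NO_QUEUE_PTR);
      break;
    case Op::Other:
      break;
    }
  }
  return assumed;
}
```

In this model the hostcall case is detected from an explicit load at a known offset, while the aperture case has no such load to match and must be keyed off the cast itself, which is the difference I am pointing at.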


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119762/new/

https://reviews.llvm.org/D119762


