[PATCH] D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 14:27:33 PST 2022


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:429-430
+      // implicit kernargs.
+      if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+        removeAssumedBits(IMPLICIT_ARG_PTR);
+      else
----------------
cfang wrote:
> arsenm wrote:
> > This should recognize both the intrinsic and load from the specific offset from the implicitarg ptr, similar to the new hostcall handling. We still should be able to infer no queue ptr with it in memory
> This is different from the case of hostcall handling. We are handling aperture bases in the backend.  We do not have explicit intrinsic call for implicitarf ptr.
It's not different because some subtargets still use the queue pointer from here (pre gfx9)


================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h:653
+  bool needsQueuePtrUserSGPRs() const {
+    return QueuePtr && AMDGPU::getAmdhsaCodeObjectVersion() < 5;
+  }
----------------
cfang wrote:
> arsenm wrote:
> > I don't see why this query needs to change, the code object version can be considered when setting QueuePtr initially
> I intend to factor out same thing like QuterPre && (CodeObjectVersion < 5).
> Can we introduce a global function in AMDGPU space like:
> bool AMDGPU::needsQueuePreUserSGPRs(MachineFunctionInfo MFI) to achieve that purpose? 
No, I mean it's inappropriate to check the version here. The version+subtarget should change whether QueuePtr is set, not add an additional check here


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119762/new/

https://reviews.llvm.org/D119762



More information about the llvm-commits mailing list