[PATCH] D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 12:04:51 PST 2022


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:55
+                    bool HasApertureRegs, bool SupportsGetDoorBellID ) {
+  unsigned CodeObjectVersion = AMDGPU::getAmdhsaCodeObjectVersion();
   switch (ID) {
----------------
The code object version really ought to be read from the IR.
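
One way to read this suggestion, as a minimal standalone sketch: the version is looked up as a per-module property instead of being fetched from process-wide state. Everything here is hypothetical and only models the idea: `ModuleSketch`, the flag name `code_object_version`, and `getCodeObjectVersion` are illustration names, not LLVM APIs.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an IR module carrying named integer flags.
// This is NOT llvm::Module; it only models "metadata attached to the IR".
struct ModuleSketch {
  std::map<std::string, unsigned> Flags;
};

// Read the code object version from the module itself, falling back to
// Default when the module does not record one. The flag name is an
// assumption made for this sketch.
unsigned getCodeObjectVersion(const ModuleSketch &M, unsigned Default) {
  auto It = M.Flags.find("code_object_version");
  return It == M.Flags.end() ? Default : It->second;
}
```

With this shape, two modules compiled in the same process can carry different versions, which a single global query cannot express.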


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:429-430
+      // implicit kernargs.
+      if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+        removeAssumedBits(IMPLICIT_ARG_PTR);
+      else
----------------
This should recognize both the intrinsic and a load from the specific offset from the implicitarg ptr, similar to the new hostcall handling. We should still be able to infer no queue ptr with it in memory.
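
A rough sketch of the inference described here, under simplifying assumptions: each use of the implicit argument pointer is modeled as either the queue pointer intrinsic, a load of a known byte range, or something unknown. The types, the function name, and the offset and size parameters are all hypothetical; the real queue pointer offset is deliberately left as an input rather than hard-coded.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of one use of the implicit argument pointer.
struct ImplicitArgUse {
  enum Kind { QueuePtrIntrinsic, LoadAtOffset, Unknown } K;
  uint64_t Offset; // byte offset of the load; meaningful for LoadAtOffset only
  uint64_t Size;   // number of bytes loaded; meaningful for LoadAtOffset only
};

// Returns true if any use may read the queue pointer. QueuePtrOffset and
// QueuePtrSize describe where the queue pointer lives in the implicit
// arguments; both are assumptions supplied by the caller.
bool mayNeedQueuePtr(const std::vector<ImplicitArgUse> &Uses,
                     uint64_t QueuePtrOffset, uint64_t QueuePtrSize) {
  for (const ImplicitArgUse &U : Uses) {
    switch (U.K) {
    case ImplicitArgUse::QueuePtrIntrinsic:
      return true; // direct request for the queue pointer
    case ImplicitArgUse::Unknown:
      return true; // cannot reason about this use, so stay conservative
    case ImplicitArgUse::LoadAtOffset:
      // The load matters only if its byte range overlaps the queue pointer.
      if (U.Offset < QueuePtrOffset + QueuePtrSize &&
          QueuePtrOffset < U.Offset + U.Size)
        return true;
      break;
    }
  }
  return false; // no use touches the queue pointer
}
```

Loads of other implicit arguments leave the result false, so the queue pointer can still be inferred as unused even though it is reachable through memory.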


================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h:653
+  bool needsQueuePtrUserSGPRs() const {
+    return QueuePtr && AMDGPU::getAmdhsaCodeObjectVersion() < 5;
+  }
----------------
I don't see why this query needs to change; the code object version can be considered when setting QueuePtr initially.
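
A minimal sketch of that alternative, assuming the same `< 5` condition the patch uses: the version check is folded into the flag once, and the query remains a plain read. `FunctionInfoSketch` and `hasQueuePtr` are hypothetical stand-ins, not the actual SIMachineFunctionInfo interface.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-function info class in the patch.
struct FunctionInfoSketch {
  bool QueuePtr;

  // The code object version is considered once, when the flag is set.
  FunctionInfoSketch(bool UsesQueuePtr, unsigned CodeObjectVersion)
      : QueuePtr(UsesQueuePtr && CodeObjectVersion < 5) {}

  // The query itself does not need to know about code object versions.
  bool hasQueuePtr() const { return QueuePtr; }
};
```

Callers that ask whether the queue pointer user SGPRs are needed then get a consistent answer without each query re-checking the version.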


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119762/new/

https://reviews.llvm.org/D119762


