[PATCH] D119762: AMDGPU: Set up User SGPRs for queue_ptr only when necessary

Changpeng Fang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 14:25:45 PST 2022


cfang added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:429-430
+      // implicit kernargs.
+      if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+        removeAssumedBits(IMPLICIT_ARG_PTR);
+      else
----------------
arsenm wrote:
> This should recognize both the intrinsic and load from the specific offset from the implicitarg ptr, similar to the new hostcall handling. We still should be able to infer no queue ptr with it in memory
This is different from the hostcall handling case. We handle the aperture bases in the backend, and there is no explicit intrinsic call for the implicitarg ptr.


================
Comment at: llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h:653
+  bool needsQueuePtrUserSGPRs() const {
+    return QueuePtr && AMDGPU::getAmdhsaCodeObjectVersion() < 5;
+  }
----------------
arsenm wrote:
> I don't see why this query needs to change, the code object version can be considered when setting QueuePtr initially
My intent is to factor out the repeated check QueuePtr && (CodeObjectVersion < 5).
Could we introduce a function in the AMDGPU namespace, such as
bool AMDGPU::needsQueuePtrUserSGPRs(MachineFunctionInfo MFI), to achieve that?
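
For concreteness, here is a rough sketch of what such a helper might look like. It is not code from the patch: FunctionInfoSketch is a hypothetical stand-in for the real machine function info class, the helper name assumes "needsQueuePtrUserSGPRs" is the intended spelling, and the code object version is passed as a plain parameter rather than queried from AMDGPU::getAmdhsaCodeObjectVersion().

```cpp
// Hypothetical stand-in for the per-function state; this is NOT the real
// SIMachineFunctionInfo class, only the one flag the check depends on.
struct FunctionInfoSketch {
  bool QueuePtr; // true when the function was found to need the queue pointer
};

// Assumed name and signature. The condition is the one from the diff:
// user SGPRs for queue_ptr are set up only when the queue pointer is needed
// and the code object version is below 5.
constexpr bool needsQueuePtrUserSGPRs(const FunctionInfoSketch &MFI,
                                      unsigned CodeObjectVersion) {
  return MFI.QueuePtr && CodeObjectVersion < 5;
}
```

Keeping the condition in one place would let every caller share the same check instead of repeating the version comparison.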


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119762/new/

https://reviews.llvm.org/D119762


