[PATCH] D120265: AMDGPU: Use the implicit kernargs for code object version 5
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 21 17:40:43 PST 2022
arsenm added inline comments.
================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:16252-16254
+ const unsigned XOffset = IsCOV_5 ? 12 : 4;
+ auto *DP =
+ IsCOV_5 ? EmitAMDGPUImplicitArgPtr(CGF) : EmitAMDGPUDispatchPtr(CGF);
----------------
Given that it's an offset from a different base, I think it would be cleaner to just branch around the two cases
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:410-412
+ // We are going to use the implicit kernarg for V5.
+ if (AMDGPU::getAmdhsaCodeObjectVersion() == 5)
+ removeAssumedBits(IMPLICIT_ARG_PTR);
----------------
This isn't covered by any test changes
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h:326-328
+ PRIVATE_BASE,
+ SHARED_BASE,
+ QUEUE_PTR,
----------------
You shouldn't merge these into the same enum. This enum should be renamed, this is for a different clover ABI
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4876
+ Register LoadAddr;
+ B.materializePtrAdd(LoadAddr, KernargPtrReg, LLT::scalar(64), Offset);
+ // Load address
----------------
You're repeating this long sequence to get the queue pointer in two places, should common these into a function to get the queue pointer. Alternatively, emit the intrinsic and move this expansion into a lowering of the queue pointer intrinsic
================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:155
+unsigned getAmdhsaCodeObjectVersion() { return AmdhsaCodeObjectVersion; }
+
----------------
The code object version should probably come from the IR, not a global opt
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120265/new/
https://reviews.llvm.org/D120265
More information about the llvm-commits
mailing list