[PATCH] D130784: [AMDGPU] Support LDS spilling
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 4 19:16:27 PDT 2022
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:890
- ProgInfo.LDSSize = MFI->getLDSSize();
+ unsigned WorkGroupSize = STM.getFlatWorkGroupSizes(F).second;
+ unsigned LDSSpillSize = MFI->getLdsSpill().TotalSize * WorkGroupSize;
----------------
Rename to MaxWorkGroupSize
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:893
+
+ ProgInfo.LDSSize = MFI->getLDSSize() + LDSSpillSize;
+
----------------
Does this need alignment padding up to 4?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:799-800
+unsigned GCNSubtarget::getLdsSpillLimitDwords(const MachineFunction &MF) const {
+ const Function &F = MF.getFunction();
+ return AMDGPU::getIntegerAttribute(F, "amdgpu-lds-spill-limit-dwords", 0);
+}
----------------
I don't see why we would need a new limit for this and just rely on the remaining LDS in the occupancy budget
================
Comment at: llvm/lib/Target/AMDGPU/SIFrameLowering.cpp:406
+// m0 for LDS spilling if possible.
+void SIFrameLowering::setupLDSSpilling(MachineFunction &MF,
+ MachineBasicBlock &MBB,
----------------
Could we do this in a post-RA pass before LiveIntervals is discarded? I was thinking we should copy what SC does and reserve more registers, and try to reallocate them in such a pass. The same place could have smarter management of m0
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:2565
+ auto OrigArg = Fn.getArg(Arg.getOrigArgIndex());
+ if (OrigArg->getName().equals("MultiDispatchInfo")) {
+ CCValAssign &VA = ArgLocs[i];
----------------
piotr wrote:
> rampitec wrote:
> > Matching argument by name is bad, especially if such name can be used by an user.
> This is controlled by the front-end - graphics compute shaders do not have inputs that could be specified by a shader writer and end up landing here.
>
> Having said that, I am not 100% happy with the way this is handled here either. I guess a better idea could be to rely on a function attribute set by the front-end that would say - "the user sgpr you want to use for multi dispatch info is at nth location".
>
> Just to note, input sgprs are handled differently between graphics and compute. I need to "just" preload WorkGroupInfo, so that I could later use that sgpr in the spill code. In kernels, it is treated as a system sgpr (allocateSystemSGPRs), so preloading it should not be a problem. However, in graphics we treat it as a user sgpr and pass in the list of arguments (with the name "MultiDispatchInfo").
>
>
Argument names are totally meaningless. opt -strip will now break this, so you need something that doesn't rely on the name
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130784/new/
https://reviews.llvm.org/D130784
More information about the llvm-commits
mailing list