[PATCH] D155240: [amdgpu][lds] Use amdgpu-lds-size instead of llvm.donothing

Thu Jul 13 16:15:40 PDT 2023

JonChesterfield added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:955
+
+  if (CurrentLocalMemUsage == 0) {
+    // Sort to try to estimate the worst case alignment padding
----------------
JonChesterfield wrote:
> arsenm wrote:
> > arsenm wrote:
> > > Is there now a path to introducing LDS outside of a kernel?
> > > 
> > > I'm slightly nervous about assuming amdgpu-lds-size is a source of truth in case something else introduces new LDS globals. Such as this pass, which is not updating the value 
> > If it already pre-packed the globals into a single big one, won't the search find the case of 1 and work fine without this?
> PromoteAlloca sometimes adds a global accessed by a kernel. It could be extended to work on non-entry functions with some effort. That's currently the only thing downstream of lowermodulelds that might add a global.
> 
> amdgpu-frame-size is a reasonable contender for a source of truth. We can append to it and mark the new global with an absolute address to get something the backend handles correctly. The awkward case is dynamic lds, which currently needs to block appending.
> 
> I think the lowering pass could be reworked to compose cleanly based on that attribute and append-only model
> 
> I'm thinking of moving graphics to the same lowering as compute and then making any lds variable without an absolute address assigned before codegen a fatal error.
It misses the big struct if it isn't directly used by the kernel. This pass crawls the globals but not the call graph. Thus the donothing hack

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155240/new/

https://reviews.llvm.org/D155240