[PATCH] D82496: [amdgpu] Add codegen support for HIP dynamic shared memory.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 17 14:44:42 PDT 2020


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:2230-2231
+          // Adjust alignment for that dynamic shared memory array.
+          MFI->setDynLDSAlign(
+              B.getDataLayout().getABITypeAlign(ATy->getElementType()));
+          LLT S32 = LLT::scalar(32);
----------------
hliao wrote:
> arsenm wrote:
> > hliao wrote:
> > > arsenm wrote:
> > > > Should use the aggregate alignment, not the element
> > > Propose to add `preFinalizeLowering` before pseudo instruction expansion so that both GISel and DAGISel have the chance to adjust LDS size.
> > It's really cleaner to just have this compute what you care about at the point you care about it. Having a point where this needs to be made consistent is both worse from a serialization perspective, and from an optimization point since theoretically we could spill into the padding later
> But, as we discuss this is a short-term solution before the linker could perform the per-kernel LDS resolution. As the spilling to LDS is not implemented yet, this short-term solution should be kept as simple as possible and finally reverted. To pad LDS for the shared memory array, we have to wait until all static LDS ones are allocated. That's should be the point where `amdgcn_groupstatcisize` is about to be expanded, i.e. before finalizing lowering.
Short term workaround or not, the less mutable state the better. I see no advantage to accumulating the allocated + padding size in a single variable vs. keeping the two separate. You have to track both in MFI anyway, and accumulating like this loses information.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82496/new/

https://reviews.llvm.org/D82496



More information about the llvm-commits mailing list