[PATCH] D150609: [AMDGPU] Do not assume stack size for PAL code object indirect calls

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 16 07:34:10 PDT 2023

arsenm added inline comments.

Comment at: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:120-122
+  if (AMDGPU::getCodeObjectVersion(M) >= AMDGPU::AMDHSA_COV5 &&
+      !AssumedStackSizeForDynamicSizeObjects.getNumOccurrences())
+      AssumedStackSizeForDynamicSizeObjects = 0;
sebastian-ne wrote:
> arsenm wrote:
> > sebastian-ne wrote:
> > > arsenm wrote:
> > > > Dynamic objects should be treated identically.
> > > I think the logic here is that we treat PAL code objects the same as AMDHSA code objects with version >= 5 (PAL code objects have no amdgpu_code_object_version).
> > > If there is an indirect call, something outside the compiler needs to compute the stack size; in the case of PAL/graphics, that is the driver.
> > > 
> > > So we should probably use `if (AMDGPU::getCodeObjectVersion(M) >= AMDGPU::AMDHSA_COV5 || STI.getTargetTriple().getOS() == Triple::AMDPAL)` for both `AssumedStackSizeForDynamicSizeObjects` and `AssumedStackSizeForExternalCall`.
> > Is some equivalent of the dynamic bit in the metadata emitted for PAL?
> We emit per-function metadata (https://llvm.org/docs/AMDGPUUsage.html#amdgpu-amdpal-code-object-shader-function-map-table).
> An application needs to specify the recursion depth it uses and the driver uses this information to compute the needed scratch allocation.
That metadata should be checked in the test.
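
For illustration only, the condition proposed earlier in this thread could be sketched as below. This is a standalone model, not the code in AMDGPUResourceUsageAnalysis.cpp: `OSKind`, `StackSizeOption`, `kCOV5`, `stackSizeComputedExternally` and `effectiveAssumedStackSize` are hypothetical names, the value 5 for the code object version threshold is an assumption, and `SetByUser` stands in for a non-zero `getNumOccurrences()`.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the LLVM types discussed in the review.
enum class OSKind { AMDHSA, AMDPAL, Other };

// Assumed to play the role of AMDGPU::AMDHSA_COV5.
constexpr unsigned kCOV5 = 5;

// Models a command-line option holding an assumed stack size.
struct StackSizeOption {
  uint32_t Value;  // current (default or user-provided) value
  bool SetByUser;  // models getNumOccurrences() != 0
};

// True when something outside the compiler (the runtime for code object
// version >= 5, the driver for PAL) is expected to compute the stack size.
bool stackSizeComputedExternally(unsigned CodeObjectVersion, OSKind OS) {
  return CodeObjectVersion >= kCOV5 || OS == OSKind::AMDPAL;
}

// Returns the assumed stack size to use: 0 when the size is computed
// externally and the user did not override the option, otherwise the
// option's value. The same rule would apply to both the dynamic-object
// and the external-call options.
uint32_t effectiveAssumedStackSize(StackSizeOption Opt,
                                   unsigned CodeObjectVersion, OSKind OS) {
  if (stackSizeComputedExternally(CodeObjectVersion, OS) && !Opt.SetByUser)
    return 0;
  return Opt.Value;
}
```

Under these assumptions, a PAL target drops the assumed size to 0 regardless of the code object version, while an explicit user setting is always preserved.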

  rG LLVM Github Monorepo


