[PATCH] D150609: [AMDGPU] Do not assume stack size for PAL code object indirect calls
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 16 07:34:10 PDT 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp:120-122
+  if (AMDGPU::getCodeObjectVersion(M) >= AMDGPU::AMDHSA_COV5 &&
+      !AssumedStackSizeForDynamicSizeObjects.getNumOccurrences())
+    AssumedStackSizeForDynamicSizeObjects = 0;
----------------
sebastian-ne wrote:
> arsenm wrote:
> > sebastian-ne wrote:
> > > arsenm wrote:
> > > > Dynamic objects should be treated identically
> > > I think the logic here is that we treat PAL code objects the same as AMDHSA code objects >= 5 (PAL code objects have no amdgpu_code_object_version).
> > > If there is an indirect call, something outside the compiler needs to compute the stack size; in the case of PAL/graphics, that is the driver.
> > >
> > > So probably we should use `if (AMDGPU::getCodeObjectVersion(M) >= AMDGPU::AMDHSA_COV5 || STI.getTargetTriple().getOS() == Triple::AMDPAL)` for both `AssumedStackSizeForDynamicSizeObjects` and `AssumedStackSizeForExternalCall`.
> > Is some equivalent of the dynamic bit in the metadata emitted for PAL?
> We emit per-function metadata (https://llvm.org/docs/AMDGPUUsage.html#amdgpu-amdpal-code-object-shader-function-map-table).
> An application needs to specify the recursion depth it uses and the driver uses this information to compute the needed scratch allocation.
That metadata should be checked in the test.
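
For illustration, the condition proposed in the quoted thread can be sketched as a small, self-contained C++ model. This is only a sketch and does not use LLVM's headers: `OSKind`, `ModuleInfo`, `StackSizeOption`, `stackSizeResolvedExternally`, and `effectiveAssumedStackSize` are hypothetical stand-ins, and the default value is whatever the caller supplies, not something taken from the patch.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for the real LLVM types; names are illustrative only.
enum class OSKind { AMDHSA, AMDPAL, Other };

struct ModuleInfo {
  OSKind OS;
  unsigned CodeObjectVersion; // PAL code objects carry no version; use 0 here
};

// Models a command-line option: the default applies unless the user set a
// value explicitly (the role getNumOccurrences() plays in the patch).
struct StackSizeOption {
  uint32_t Default;
  std::optional<uint32_t> UserValue;
};

// True when something outside the compiler (the runtime for code object v5
// and later, the driver for PAL) computes the final stack size.
bool stackSizeResolvedExternally(const ModuleInfo &M) {
  constexpr unsigned COV5 = 5;
  return M.CodeObjectVersion >= COV5 || M.OS == OSKind::AMDPAL;
}

// An explicit user value always wins; otherwise assume 0 when the size is
// resolved externally, and fall back to the option's default if it is not.
uint32_t effectiveAssumedStackSize(const ModuleInfo &M,
                                   const StackSizeOption &Opt) {
  if (Opt.UserValue)
    return *Opt.UserValue;
  return stackSizeResolvedExternally(M) ? 0 : Opt.Default;
}
```

Under these assumptions, the same helper would be applied to both the dynamic-object option and the external-call option, so a PAL module and a code object v5 module both get 0 unless the option was set explicitly, while an older AMDHSA module keeps the default.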
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D150609/new/
https://reviews.llvm.org/D150609