[PATCH] D84194: [AMDGPU] Correct the number of SGPR blocks used for GFX9
Scott Linder via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 23 15:50:36 PDT 2020
scott.linder added a comment.
I discussed this with Tony today, and I was thinking about it the wrong way.
SPI does not require the granule count to be even; it just rounds the granule count up before actually performing the allocation. This means that the compiler, when it calculates things like `AMDGPU::IsaInfo::getMaxNumSGPRs`, must consider the "allocation" granule size (`IsaInfo::getSGPRAllocGranule`). Conversely, the assembler/disassembler must consider the "encoding" granule size (`IsaInfo::getSGPREncodingGranule`). It is perfectly OK to have a GFX9 code object with a granulated SGPR count of `1`, and we should allow emitting that in the assembler so that the disassembler can accurately reproduce those code objects.
I don't think any fix is needed here; we already separate these two concepts and apply them correctly elsewhere. I think I just led you astray in the disassembly patch: you should only be using the encoding granule size, and shouldn't need any special handling for e.g. GFX9 to account for the allocation and encoding granule sizes being unequal.
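To make the distinction above concrete, here is a rough standalone sketch, not the actual LLVM code. The helper names (`encodedGranules`, `decodedSGPRs`, `allocatedSGPRs`) are hypothetical, and the granule sizes of 8 (encoding) and 16 (allocation) are assumed values for a GFX9-like target, chosen only for illustration. It also counts whole granules and ignores any bias the real descriptor field may apply.

```cpp
// Assumed granule sizes for illustration only; the real values come from
// IsaInfo::getSGPREncodingGranule and IsaInfo::getSGPRAllocGranule.
constexpr unsigned kEncodingGranule = 8;
constexpr unsigned kAllocGranule = 16;

// Round Value up to the next multiple of Align.
constexpr unsigned alignTo(unsigned Value, unsigned Align) {
  return (Value + Align - 1) / Align * Align;
}

// Assembler view (hypothetical helper): how many encoding granules are
// needed to describe NumSGPRs. Odd results are allowed.
constexpr unsigned encodedGranules(unsigned NumSGPRs) {
  return alignTo(NumSGPRs == 0 ? 1 : NumSGPRs, kEncodingGranule) /
         kEncodingGranule;
}

// Disassembler view (hypothetical helper): recover an SGPR count from the
// encoded granule count using only the encoding granule size.
constexpr unsigned decodedSGPRs(unsigned Granules) {
  return Granules * kEncodingGranule;
}

// Compiler view (hypothetical helper): how many SGPRs the hardware would
// actually reserve once the request is rounded up to the allocation granule.
constexpr unsigned allocatedSGPRs(unsigned NumSGPRs) {
  return alignTo(NumSGPRs == 0 ? 1 : NumSGPRs, kAllocGranule);
}

// A request for 8 SGPRs encodes as a single (odd) granule and round-trips
// exactly, even though 16 SGPRs would be reserved under these assumptions.
static_assert(encodedGranules(8) == 1, "one encoding granule");
static_assert(decodedSGPRs(encodedGranules(8)) == 8, "round trip");
static_assert(allocatedSGPRs(8) == 16, "allocation rounds up");
```

Under these assumptions the encode/decode pair never needs to know the allocation granule; only occupancy-style calculations such as the maximum SGPR count do.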
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D84194/new/
https://reviews.llvm.org/D84194