[PATCH] D84194: [AMDGPU] Correct the number of SGPR blocks used for GFX9
Ronak Chauhan via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Jul 25 03:13:27 PDT 2020
rochauha marked an inline comment as done.
rochauha added a comment.
In D84194#2170882 <https://reviews.llvm.org/D84194#2170882>, @scott.linder wrote:
> I discussed with Tony today, and I was thinking about this the wrong way.
>
> SPI does not require the granule count to be even; it just rounds up the granule count before actually performing the allocation. This means that, from the compiler's perspective, when it is calculating things like `AMDGPU::IsaInfo::getMaxNumSGPRs`, it must consider the "allocation" granule size (`IsaInfo::getSGPRAllocGranule`). Conversely, the assembler/disassembler must consider the "encoding" granule size (`IsaInfo::getSGPREncodingGranule`). It is perfectly OK to have a GFX9 code object with a granulated SGPR count of `1`, and we should allow emitting that in the assembler so that the disassembler can accurately reproduce those code objects.
>
> I don't think there is any fix needed here; we already separate these two concepts and correctly apply them elsewhere. I think I just led you astray in the disassembly patch: you should only be using the encoding granule size, and shouldn't need any special handling for e.g. GFX9 to account for the fact that the allocation and encoding granule sizes are not equal.
Correct me if I'm wrong: so the disassembler must not take the inverse of the GFX9 calculation mentioned earlier (the one where we divide by 16 before rounding up), because that calculation is based on the allocation granule size? And hence the disassembly computation will be the same for GFX6-8 and GFX9, because the encoding granule size is the same?
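To make the question concrete, here is a minimal sketch of the distinction as I understand it. It is not the actual LLVM code: the granule values (8 for encoding, 16 for GFX9 allocation), the "number of granules minus one" field convention, and all helper names are assumptions for illustration. The real values come from `IsaInfo::getSGPREncodingGranule` and `IsaInfo::getSGPRAllocGranule`.

```cpp
#include <cassert>

// Assumed values, for illustration only.
constexpr unsigned EncodingGranule = 8;   // assumed identical for GFX6-8 and GFX9
constexpr unsigned Gfx9AllocGranule = 16; // assumed, per the "divide by 16" calculation

// Round Value up to the next multiple of Granule.
constexpr unsigned alignUp(unsigned Value, unsigned Granule) {
  return (Value + Granule - 1) / Granule * Granule;
}

// Assembler side (hypothetical helper): SGPR count -> granulated field.
// Assumes the field stores "number of encoding granules minus one".
constexpr unsigned encodeSGPRBlocks(unsigned NumSGPRs) {
  return alignUp(NumSGPRs == 0 ? 1 : NumSGPRs, EncodingGranule) /
             EncodingGranule - 1;
}

// Disassembler side (hypothetical helper): the inverse of the encoding above.
// Only the encoding granule is used, so there is no GFX9 special case.
constexpr unsigned decodeSGPRCount(unsigned Blocks) {
  return (Blocks + 1) * EncodingGranule;
}

// What the allocation would round the request up to on GFX9. This matters
// for compiler-side limits, not for encoding or decoding the field.
constexpr unsigned allocatedSGPRsOnGfx9(unsigned NumSGPRs) {
  return alignUp(NumSGPRs, Gfx9AllocGranule);
}

// A few worked values under the assumptions above.
inline void checkExamples() {
  assert(encodeSGPRBlocks(16) == 1);      // an odd field value is representable
  assert(decodeSGPRCount(1) == 16);       // and it round-trips
  assert(decodeSGPRCount(0) == 8);        // field 0 decodes to 8 SGPRs...
  assert(allocatedSGPRsOnGfx9(8) == 16);  // ...which allocation rounds up to 16
}
```

Under these assumptions, decoding followed by encoding returns the original field value for both even and odd values, which is what the disassembler needs in order to reproduce a code object exactly.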
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D84194/new/
https://reviews.llvm.org/D84194