[PATCH] D84194: [AMDGPU] Correct the number of SGPR blocks used for GFX9
Scott Linder via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 21 15:08:12 PDT 2020
scott.linder added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:348
+ // 16 for GFX9, 8 for GFX6-8
+ return isGFX9(*STI) ? 16 : 8;
}
----------------
I'm not sure this is actually accurate. I think the reason for the "2 *" in the equation for GFX9 is not that the allocation granule is 16. The granule is still 8 for gfx9, but there is an additional constraint that you must allocate an even number of granules.
It is a bit confusing, and I would like @kzhuravl to weigh in, as IIRC he was the one who originally helped me understand this when we were updating the assembler.
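To make the distinction concrete, here is a minimal sketch of the two readings. The helper names (`alignUp`, `allocWithGranule16`, `allocWithEvenGranules`) are hypothetical and not part of AMDGPUBaseInfo; the second function only models my understanding of the "even number of granules" constraint, which is the part I would like confirmed.
```cpp
#include <cassert>

// Hypothetical helpers for illustration only; none of these exist in LLVM.

// Round Value up to the next multiple of Align (Align must be nonzero).
constexpr unsigned alignUp(unsigned Value, unsigned Align) {
  return (Value + Align - 1) / Align * Align;
}

// Reading A (what the patch encodes): the granule itself is 16 on GFX9.
constexpr unsigned allocWithGranule16(unsigned NumSGPRs) {
  return alignUp(NumSGPRs, 16);
}

// Reading B (assumed here): the granule stays 8, but the number of
// granules allocated must be even.
constexpr unsigned allocWithEvenGranules(unsigned NumSGPRs) {
  unsigned Granules = alignUp(NumSGPRs, 8) / 8; // granules of 8 SGPRs
  Granules = alignUp(Granules, 2);              // force an even count
  return Granules * 8;
}

// Both readings reserve the same number of SGPRs for a given request.
static_assert(allocWithGranule16(20) == allocWithEvenGranules(20),
              "the two readings agree on the total allocation");
```
Under these assumptions the total allocation is identical either way, so the difference is only in what value a "granule" query should report, which is why I would rather not change the granule function itself.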
================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:437
// SGPRBlocks is actual number of SGPR blocks minus 1.
- return NumSGPRs / getSGPREncodingGranule(STI) - 1;
+ unsigned NumSGPRBlocks = NumSGPRs / getSGPREncodingGranule(STI) - 1;
+ return isGFX9(*STI) ? NumSGPRBlocks * 2 : NumSGPRBlocks;
----------------
rochauha wrote:
> foad wrote:
> > Why have you changed this?
> To follow the computation of `GRANULATED_WAVEFRONT_SGPR_COUNT` for GFX9, as mentioned in https://llvm.org/docs/AMDGPUUsage.html#amdgpu-amdhsa-compute-pgm-rsrc1-gfx6-gfx10-table
If my comment above is correct and the granule for gfx9 is in fact 8, then I would move all of the handling of the "even" requirement into this function, i.e. change this to:
```
unsigned NumSGPRBlocks = NumSGPRs / (isGFX9(*STI) ? 2 * getSGPREncodingGranule(STI) : getSGPREncodingGranule(STI)) - 1;
```
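As a rough standalone sketch of that line (not the actual LLVM function), with hypothetical parameters `IsGFX9` and `Granule` standing in for `isGFX9(*STI)` and `getSGPREncodingGranule(STI)`. It assumes the caller has already made `NumSGPRs` a nonzero multiple of the effective granule; otherwise the integer division truncates and the subtraction can wrap.
```cpp
#include <cassert>

// Hypothetical standalone form of the suggested line; IsGFX9 and Granule
// replace the subtarget queries used in the real code.
// Assumption: NumSGPRs is a nonzero multiple of the effective granule.
constexpr unsigned numSGPRBlocksSketch(unsigned NumSGPRs, bool IsGFX9,
                                       unsigned Granule = 8) {
  // On GFX9 the "even number of granules" rule doubles the effective step.
  const unsigned EffectiveGranule = IsGFX9 ? 2 * Granule : Granule;
  // Same "blocks minus 1" form as the existing computation.
  return NumSGPRs / EffectiveGranule - 1;
}

static_assert(numSGPRBlocksSketch(32, /*IsGFX9=*/true) == 1,
              "32 SGPRs are two 16-SGPR steps on GFX9");
static_assert(numSGPRBlocksSketch(32, /*IsGFX9=*/false) == 3,
              "32 SGPRs are four 8-SGPR granules otherwise");
```
Whether the result still needs the patch's final `* 2` for GFX9 depends on how the encoded field is defined, so treat the sketch as covering only the division step.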
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D84194/new/
https://reviews.llvm.org/D84194