[PATCH] D84194: [AMDGPU] Correct the number of SGPR blocks used for GFX9

Scott Linder via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 21 15:08:12 PDT 2020


scott.linder added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:348
+  // 16 for GFX9, 8 for GFX6-8
+  return isGFX9(*STI) ? 16 : 8;
 }
----------------
I don't know if this is actually accurate. I think the reason for the "2 *" in the equation for GFX9 is not that the allocation granule is 16: it is still 8 for gfx9, but there is an additional constraint that you must allocate an even number of granules.

It is a bit confusing, and I would like @kzhuravl to weigh in, as IIRC he was the one who originally helped me understand this when we were updating the assembler.
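
To make the "even number of granules" reading concrete, here is a minimal sketch. It assumes the granule really is 8 on both targets and that gfx9 only adds the even-count constraint; `allocatedSGPRs` and `RequireEven` are hypothetical names for illustration and do not exist in AMDGPUBaseInfo.cpp.

```cpp
// Hypothetical helper, not part of LLVM.
// Assumption: SGPRs are allocated in granules of 8, and when RequireEven
// is set (the gfx9 case as described above) the granule count must be even.
unsigned allocatedSGPRs(unsigned NumSGPRs, bool RequireEven) {
  const unsigned Granule = 8;
  // Round the request up to a whole number of granules.
  unsigned Granules = (NumSGPRs + Granule - 1) / Granule;
  // Bump an odd granule count to the next even one.
  if (RequireEven && Granules % 2 != 0)
    ++Granules;
  return Granules * Granule;
}
```

Under these assumptions a request for 17 SGPRs occupies 24 registers without the constraint and 32 with it, which is the same result a granule of 16 would give. That may be why the two explanations are easy to confuse.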


================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:437
   // SGPRBlocks is actual number of SGPR blocks minus 1.
-  return NumSGPRs / getSGPREncodingGranule(STI) - 1;
+  unsigned NumSGPRBlocks = NumSGPRs / getSGPREncodingGranule(STI) - 1;
+  return isGFX9(*STI) ? NumSGPRBlocks * 2 : NumSGPRBlocks;
----------------
rochauha wrote:
> foad wrote:
> > Why have you changed this?
> To follow the computation of `GRANULATED_WAVEFRONT_SGPR_COUNT` for GFX9, as mentioned in https://llvm.org/docs/AMDGPUUsage.html#amdgpu-amdhsa-compute-pgm-rsrc1-gfx6-gfx10-table
If the above is true, and the granule for gfx9 is in fact 8, then I would just move all of the handling of the "even" requirement into this function, i.e. change this to:

```
unsigned Granule = getSGPREncodingGranule(STI);
unsigned NumSGPRBlocks = NumSGPRs / (isGFX9(*STI) ? 2 * Granule : Granule) - 1;
```
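
For anyone who wants to try the suggested expression outside of LLVM, a standalone sketch follows. The names are stand-ins: `IsGFX9` replaces `isGFX9(*STI)`, `SGPREncodingGranule` is hard-coded to 8 on the assumption discussed above, and the explicit round-up is an added assumption so that the division is exact and the `- 1` cannot wrap around.

```cpp
// Stand-in for getSGPREncodingGranule(STI); assumed to be 8 on all targets.
const unsigned SGPREncodingGranule = 8;

// Hypothetical standalone version of the suggested computation.
unsigned numSGPRBlocksSketch(unsigned NumSGPRs, bool IsGFX9) {
  // On gfx9, fold the "even number of granules" rule into the divisor.
  unsigned Divisor = IsGFX9 ? 2 * SGPREncodingGranule : SGPREncodingGranule;
  // Assumption: treat 0 as 1 and round up to a multiple of Divisor first.
  unsigned Count = NumSGPRs == 0 ? 1 : NumSGPRs;
  unsigned Aligned = (Count + Divisor - 1) / Divisor * Divisor;
  // The result is the number of blocks minus 1.
  return Aligned / Divisor - 1;
}
```

With the divisor doubled, the gfx9 result counts pairs of granules: 104 SGPRs gives 12 without the gfx9 rule and 6 with it.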


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84194/new/

https://reviews.llvm.org/D84194




