[PATCH] D84194: [AMDGPU] Correct the number of SGPR blocks used for GFX9

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 22 00:32:57 PDT 2020


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:437
   // SGPRBlocks is actual number of SGPR blocks minus 1.
-  return NumSGPRs / getSGPREncodingGranule(STI) - 1;
+  unsigned NumSGPRBlocks = NumSGPRs / getSGPREncodingGranule(STI) - 1;
+  return isGFX9(*STI) ? NumSGPRBlocks * 2 : NumSGPRBlocks;
----------------
scott.linder wrote:
> rochauha wrote:
> > foad wrote:
> > > Why have you changed this?
> > To follow the computation of `GRANULATED_WAVEFRONT_SGPR_COUNT` for GFX9, as mentioned in https://llvm.org/docs/AMDGPUUsage.html#amdgpu-amdhsa-compute-pgm-rsrc1-gfx6-gfx10-table
> If the above is true, and the granule for gfx9 is in fact 8, then I would just move all of the handling of the "even" requirement into this function, i.e. change this to:
> 
> ```
> unsigned NumSGPRBlocks = NumSGPRs / (isGFX9(*STI) ? 2 * getSGPREncodingGranule(STI) : getSGPREncodingGranule(STI)) - 1;
> ```
The current patch does have the advantage that it closely matches the documentation that Ronak pointed to, though I suppose we could update the documentation too.
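
For comparison, here is a minimal standalone sketch of the two spellings discussed above. It is not the LLVM code: `blocksPatchForm` and `blocksSuggestedForm` are hypothetical names, the granule is passed in as a plain parameter instead of coming from `getSGPREncodingGranule(STI)`, and `NumSGPRs` is assumed to be already aligned and at least one (possibly doubled) granule in size.

```cpp
#include <cassert>

// Hypothetical stand-in for the form in the current patch:
// compute "blocks minus 1" with the plain granule, then double it for GFX9.
static unsigned blocksPatchForm(unsigned NumSGPRs, unsigned Granule,
                                bool IsGFX9) {
  assert(Granule != 0 && NumSGPRs >= Granule && "sketch assumes aligned input");
  unsigned NumSGPRBlocks = NumSGPRs / Granule - 1;
  return IsGFX9 ? NumSGPRBlocks * 2 : NumSGPRBlocks;
}

// Hypothetical stand-in for the one-liner suggested in the review:
// double the granule for GFX9, then compute "blocks minus 1".
static unsigned blocksSuggestedForm(unsigned NumSGPRs, unsigned Granule,
                                    bool IsGFX9) {
  unsigned Divisor = IsGFX9 ? 2 * Granule : Granule;
  assert(Divisor != 0 && NumSGPRs >= Divisor && "sketch assumes aligned input");
  return NumSGPRs / Divisor - 1;
}
```

Assuming a granule of 8 and `NumSGPRs == 32`, the patch form gives `(32 / 8 - 1) * 2 = 6` for GFX9, while the suggested form gives `32 / 16 - 1 = 1`; for non-GFX9 both give 3. So under these assumptions the two spellings are not interchangeable as written, and whichever one we pick should be checked against the formula in the documentation.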


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84194/new/

https://reviews.llvm.org/D84194




