[PATCH] D138443: [AMDGPU] Fix GCNSubtarget::getMinNumVGPRs, add unit test to check consistency between GCNSubtarget's getMinNumVGPRs, getMaxNumVGPRs and getOccupancyWithNumVGPRs.

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 23 03:22:51 PST 2022


foad added a comment.

>   /// \returns Minimum number of VGPRs that meets given number of waves per
>   /// execution unit requirement supported by the subtarget.
>   unsigned getMinNumVGPRs(unsigned WavesPerEU) const;
>   
>   /// \returns Maximum number of VGPRs that meets given number of waves per
>   /// execution unit requirement supported by the subtarget.
>   unsigned getMaxNumVGPRs(unsigned WavesPerEU) const;

I think your patch shows that this documentation needs improving here. My understanding is:

getMaxNumVGPRs returns the maximum number of VGPRs that you can use and still achieve //at least// the specified occupancy.
getMinNumVGPRs returns the minimum number of VGPRs that will prevent you from achieving //more// than the specified occupancy.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:629
+unsigned GCNSubtarget::getOccupancyWithNumVGPRs(unsigned NumVGPRs) const {
+  return AMDGPU::IsaInfo::getNumWavesWithNumVGPRs(this, NumVGPRs);
 }
----------------
Why change terminology from "Occupancy" to "NumWaves"? AMDGPUBaseInfo seems to use "WavesPerEU" a lot instead of just "NumWaves".


================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:998
+    return MaxWaves;
+  unsigned RoundedRegs = ((NumVGPRs + Granule - 1) / Granule) * Granule;
+  return std::min(std::max(getTotalNumVGPRs(STI) / RoundedRegs, 1u), MaxWaves);
----------------
This is `divideCeil(NumVGPRs, Granule) * Granule`. But how about adding `uint64_t alignUp(uint64_t Value, uint64_t Align)` to MathExtras.h (like `alignDown` but no need for the `Skew` parameter) and using it here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138443/new/

https://reviews.llvm.org/D138443



More information about the llvm-commits mailing list