[PATCH] D138443: [AMDGPU] Fix GCNSubtarget::getMinNumVGPRs, add unit test to check consistency between GCNSubtarget's getMinNumVGPRs, getMaxNumVGPRs and getOccupancyWithNumVGPRs.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 29 12:42:43 PST 2022
arsenm accepted this revision.
arsenm added a comment.
This revision is now accepted and ready to land.
LGTM with nits
================
Comment at: llvm/lib/Target/AMDGPU/GCNSubtarget.h:1219
- /// \returns Minimum number of VGPRs that meets given number of waves per
- /// execution unit requirement supported by the subtarget.
+ /// \returns the minimum number of VGPRs that will prevent you from achieving
+ /// more than the specified number of waves \p WavesPerEU.
----------------
Remove "you from"
================
Comment at: llvm/lib/Target/AMDGPU/GCNSubtarget.h:1225
- /// \returns Maximum number of VGPRs that meets given number of waves per
- /// execution unit requirement supported by the subtarget.
+ /// \returns the maximum number of VGPRs that you can use and still achieve at
+ /// least the specified number of waves \p WavesPerEU.
----------------
s/"you can use"/"can be used"
================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp:998
+ return MaxWaves;
+ unsigned RoundedRegs = ((NumVGPRs + Granule - 1) / Granule) * Granule;
+ return std::min(std::max(getTotalNumVGPRs(STI) / RoundedRegs, 1u), MaxWaves);
----------------
vpykhtin wrote:
> foad wrote:
> > This is `divideCeil(NumVGPRs, Granule) * Granule`. But how about adding `uint64_t alignUp(uint64_t Value, uint64_t Align)` to MathExtras.h (like `alignDown` but no need for the `Skew` parameter) and using it here?
> I just didn't want to change the function body since I moved it but if I changed the name the body can be also imporved :)
> MathExtras.h already has //alignTo//, I'll use it here.
> ```
> inline uint64_t alignTo(uint64_t Value, uint64_t Align) {
> assert(Align != 0u && "Align can't be 0.");
> return (Value + Align - 1) / Align * Align;
> }
> ```
>
I'd somewhat prefer non-moving changes to be separate, but doesn't really matter
================
Comment at: llvm/unittests/Target/AMDGPU/AMDGPUUnitTests.cpp:100
+
+ std::map<std::string, SmallVector<std::string>> TablePerCPUs;
+ for (auto CPUName : CPUs) {
----------------
StringMap?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138443/new/
https://reviews.llvm.org/D138443
More information about the llvm-commits
mailing list