[llvm] [AMDGPU] Allow amdgpu-waves-per-eu to lower target occupancy range (PR #168358)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 29 03:41:14 PST 2025
================
@@ -190,23 +190,29 @@ std::pair<unsigned, unsigned> AMDGPUSubtarget::getEffectiveWavesPerEU(
// sizes limits the achievable maximum, and we aim to support enough waves per
// EU so that we can concurrently execute all waves of a single workgroup of
// maximum size on a CU.
- std::pair<unsigned, unsigned> Default = {
+ std::pair<unsigned, unsigned> WavesPerEU = {
getWavesPerEUForWorkGroup(FlatWorkGroupSizes.second),
getOccupancyWithWorkGroupSizes(LDSBytes, FlatWorkGroupSizes).second};
- Default.first = std::min(Default.first, Default.second);
-
- // Make sure requested minimum is within the default range and lower than the
- // requested maximum. The latter must not violate target specification.
- if (RequestedWavesPerEU.first < Default.first ||
- RequestedWavesPerEU.first > Default.second ||
- RequestedWavesPerEU.first > RequestedWavesPerEU.second ||
- RequestedWavesPerEU.second > getMaxWavesPerEU())
- return Default;
-
- // We cannot exceed maximum occupancy implied by flat workgroup size and LDS.
- RequestedWavesPerEU.second =
- std::min(RequestedWavesPerEU.second, Default.second);
- return RequestedWavesPerEU;
+ WavesPerEU.first = std::min(WavesPerEU.first, WavesPerEU.second);
+
+ // Requested minimum must not violate subtarget's specifications and be no
+ // greater than maximum.
+ if (RequestedWavesPerEU.first &&
+ (RequestedWavesPerEU.first < getMinWavesPerEU() ||
+ RequestedWavesPerEU.first > RequestedWavesPerEU.second))
+ return WavesPerEU;
+ // Requested maximum must not violate subtarget's specifications.
+ if (RequestedWavesPerEU.second > getMaxWavesPerEU())
+ return WavesPerEU;
----------------
arsenm wrote:
Not in the low-level query; I think we already emit such a remark in the asm printer.
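
For readers following the hunk above, the two new rejection checks can be sketched in isolation. This is only an illustration under stated assumptions, not the code in the PR: `isRequestRejected` is a hypothetical helper name, and `MinWaves` / `MaxWaves` are parameters standing in for `getMinWavesPerEU()` and `getMaxWavesPerEU()`, whose real values depend on the subtarget. The hunk appears to treat a requested minimum of zero as "no minimum requested", and the sketch mirrors that reading.

```cpp
#include <utility>

// Hypothetical helper (not part of LLVM): returns true when the request would
// hit one of the two early "return WavesPerEU;" paths shown in the hunk.
// MinWaves and MaxWaves stand in for getMinWavesPerEU() / getMaxWavesPerEU().
bool isRequestRejected(std::pair<unsigned, unsigned> Requested,
                       unsigned MinWaves, unsigned MaxWaves) {
  // A non-zero requested minimum must not be below the subtarget minimum
  // and must not exceed the requested maximum.
  if (Requested.first &&
      (Requested.first < MinWaves || Requested.first > Requested.second))
    return true;

  // The requested maximum must not exceed the subtarget maximum.
  if (Requested.second > MaxWaves)
    return true;

  return false;
}
```

Note that, unlike the removed code, neither check compares the requested minimum against the default range derived from workgroup size and LDS usage, which matches the PR title's goal of letting the attribute lower the target occupancy range. What happens to an accepted request lies past the end of the quoted hunk and is not modeled here.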
https://github.com/llvm/llvm-project/pull/168358