[llvm] [AMDGPU] Occupancy w.r.t. workgroup size range is also a range (PR #123748)
Lucas Ramirez via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 23 05:45:05 PST 2025
================
@@ -55,13 +55,15 @@ AMDGPUSubtarget::getMaxLocalMemSizeWithWaveCount(unsigned NWaves,
   return getLocalMemorySize() / WorkGroupsPerCU;
 }
-std::pair<unsigned, unsigned>
-AMDGPUSubtarget::getOccupancyWithWorkGroupSizes(uint32_t LDSBytes,
-                                                const Function &F) const {
-  // FIXME: Is there an allocation granularity for the LDS? If so we would need
-  // to make sure the amount of bytes is aligned on that granularity.
-
+std::pair<unsigned, unsigned> AMDGPUSubtarget::getOccupancyWithWorkGroupSizes(
+    uint32_t LDSBytes, const Function &F, const TargetMachine &TM) const {
   // Compute occupancy restriction based on LDS usage.
+  if (TM.getTargetTriple().getArch() == Triple::amdgcn) {
----------------
lucas-rami wrote:
I reverted all changes related to allocation granularity and will address it properly for all subtargets in a separate PR (probably through a new protected member variable on AMDGPUSubtarget). I also reintroduced the FIXME in the function.
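
For readers following the thread, here is a minimal sketch of what the FIXME is asking about: rounding the LDS byte count up to an allocation granularity before deriving the LDS-based workgroup limit. Everything in it is an assumption for illustration only. `SubtargetSketch`, `LDSAllocGranularity`, `alignLDSBytes`, and `maxWorkGroupsByLDS` are hypothetical names, not part of AMDGPUSubtarget or of this PR, and the numbers used below are not real hardware values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a subtarget class; not the real AMDGPUSubtarget.
class SubtargetSketch {
protected:
  // Assumed per-subtarget LDS allocation granularity in bytes (must be > 0).
  uint32_t LDSAllocGranularity;
  // Total LDS available, in bytes.
  uint32_t LocalMemorySize;

public:
  SubtargetSketch(uint32_t LocalMemorySize, uint32_t LDSAllocGranularity)
      : LDSAllocGranularity(LDSAllocGranularity),
        LocalMemorySize(LocalMemorySize) {}

  // Round LDSBytes up to the next multiple of the allocation granularity.
  // The arithmetic is done in 64 bits so the intermediate sum cannot wrap.
  uint32_t alignLDSBytes(uint32_t LDSBytes) const {
    uint64_t Units =
        (uint64_t(LDSBytes) + LDSAllocGranularity - 1) / LDSAllocGranularity;
    return uint32_t(Units * LDSAllocGranularity);
  }

  // Number of workgroups whose aligned LDS allocations fit in local memory.
  // A request of zero bytes is treated as one granule here, purely to avoid
  // dividing by zero in this sketch.
  unsigned maxWorkGroupsByLDS(uint32_t LDSBytes) const {
    uint32_t Aligned = alignLDSBytes(LDSBytes == 0 ? 1 : LDSBytes);
    return LocalMemorySize / Aligned;
  }
};
```

With an assumed 65536 bytes of LDS and an assumed 512-byte granularity, a request of 1000 bytes is rounded up to 1024, so 64 workgroups fit rather than the 65 that an unaligned division (65536 / 1000) would report.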
https://github.com/llvm/llvm-project/pull/123748