[llvm] [AMDGPU] Occupancy w.r.t. workgroup size range is also a range (PR #123748)

Thu Jan 23 05:45:05 PST 2025

================
@@ -55,13 +55,15 @@ AMDGPUSubtarget::getMaxLocalMemSizeWithWaveCount(unsigned NWaves,
   return getLocalMemorySize() / WorkGroupsPerCU;
 }
 
-std::pair<unsigned, unsigned>
-AMDGPUSubtarget::getOccupancyWithWorkGroupSizes(uint32_t LDSBytes,
-                                                const Function &F) const {
-  // FIXME: Is there an allocation granularity for the LDS? If so we would need
-  // to make sure the amount of bytes is aligned on that granularity.
-
+std::pair<unsigned, unsigned> AMDGPUSubtarget::getOccupancyWithWorkGroupSizes(
+    uint32_t LDSBytes, const Function &F, const TargetMachine &TM) const {
   // Compute occupancy restriction based on LDS usage.
+  if (TM.getTargetTriple().getArch() == Triple::amdgcn) {
----------------
lucas-rami wrote:

I reverted all changes related to allocation granularity and will address granularity properly for all subtargets in a separate PR (probably through a new protected member variable on AMDGPUSubtarget). I also reintroduced the FIXME in the function.
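
For illustration only, the follow-up could look roughly like the sketch below. Everything in it is an assumption rather than code from this PR or from LLVM: `SubtargetSketch` stands in for AMDGPUSubtarget, and the `LDSAllocGranularity` member and `alignLDSBytes` helper are hypothetical names. The idea is that each subtarget sets the granularity once, and the occupancy computation rounds the LDS byte count up to it before dividing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for AMDGPUSubtarget; not the real LLVM class.
class SubtargetSketch {
protected:
  // Assumed LDS allocation granularity in bytes. A value of 1 means
  // "no rounding"; real per-subtarget values are not given in this thread.
  uint32_t LDSAllocGranularity = 1;

public:
  explicit SubtargetSketch(uint32_t Granularity)
      : LDSAllocGranularity(Granularity) {
    assert(Granularity != 0 && "granularity must be non-zero");
  }

  // Round LDSBytes up to the next multiple of the allocation granularity.
  // Overflow for values close to UINT32_MAX is not handled in this sketch.
  uint32_t alignLDSBytes(uint32_t LDSBytes) const {
    uint32_t G = LDSAllocGranularity;
    return (LDSBytes + G - 1) / G * G;
  }
};
```

With a helper like this, the occupancy code would call `alignLDSBytes(LDSBytes)` first and use the rounded value in the rest of the computation, which keeps the triple check out of the shared function.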

https://github.com/llvm/llvm-project/pull/123748