[llvm] [AMDGPU] Occupancy w.r.t. workgroup size range is also a range (PR #123748)

Thu Jan 23 05:02:54 PST 2025

================
@@ -55,13 +55,15 @@ AMDGPUSubtarget::getMaxLocalMemSizeWithWaveCount(unsigned NWaves,
   return getLocalMemorySize() / WorkGroupsPerCU;
 }
 
-std::pair<unsigned, unsigned>
-AMDGPUSubtarget::getOccupancyWithWorkGroupSizes(uint32_t LDSBytes,
-                                                const Function &F) const {
-  // FIXME: Is there an allocation granularity for the LDS? If so we would need
-  // to make sure the amount of bytes is aligned on that granularity.
-
+std::pair<unsigned, unsigned> AMDGPUSubtarget::getOccupancyWithWorkGroupSizes(
+    uint32_t LDSBytes, const Function &F, const TargetMachine &TM) const {
   // Compute occupancy restriction based on LDS usage.
+  if (TM.getTargetTriple().getArch() == Triple::amdgcn) {
----------------
arsenm wrote:

I would leave the allocation granularity for a follow-up. There has always been an allocation granularity for r600, but that's not handled here.

Also, just move this whole thing into GCNSubtarget; don't do the triple check and downcast.
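
A rough sketch of the shape this suggestion points at, not the actual patch: if the occupancy query lives on the GCN-specific subtarget class, it can read GCN-only state directly, so no `TargetMachine` parameter, triple check, or downcast is needed. Every name below (`SubtargetSketch`, `GCNSubtargetSketch`, `toyOccupancyRange`) is a hypothetical stand-in, and the formula is a toy assumption rather than the real LLVM computation.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for the common subtarget base class.
class SubtargetSketch {
  unsigned LocalMemorySize; // LDS bytes available per compute unit (assumed)

public:
  explicit SubtargetSketch(unsigned LDSSize) : LocalMemorySize(LDSSize) {}
  unsigned getLocalMemorySize() const { return LocalMemorySize; }
};

// Hypothetical stand-in for the GCN-specific subtarget.
class GCNSubtargetSketch : public SubtargetSketch {
  unsigned MaxWaves; // assumed per-unit wave limit, known only to GCN

public:
  GCNSubtargetSketch(unsigned LDSSize, unsigned MaxWaves)
      : SubtargetSketch(LDSSize), MaxWaves(MaxWaves) {}

  // Because this is a member of the derived class, it can use MaxWaves
  // directly. The base class never has to ask "am I amdgcn?" and cast.
  // Toy formula: occupancy = workgroups that fit in LDS * waves per group,
  // clamped to MaxWaves, evaluated at both ends of the workgroup size range.
  std::pair<unsigned, unsigned>
  toyOccupancyRange(uint32_t LDSBytes, unsigned MinWavesPerGroup,
                    unsigned MaxWavesPerGroup) const {
    unsigned GroupsThatFit =
        LDSBytes == 0 ? MaxWaves : getLocalMemorySize() / LDSBytes;
    auto Clamp = [&](unsigned WavesPerGroup) {
      return std::min(MaxWaves, GroupsThatFit * WavesPerGroup);
    };
    return {Clamp(MinWavesPerGroup), Clamp(MaxWavesPerGroup)};
  }
};
```

Callers that already hold the GCN subtarget would call the member directly, and r600 code would simply never see it.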

https://github.com/llvm/llvm-project/pull/123748