[PATCH] D119510: AMDGPU: Clamp min value of effective waves-per-eu instead of discarding

Thu Feb 10 20:00:38 PST 2022

arsenm created this revision.
arsenm added reviewers: kzhuravl, t-tye, rampitec, AMDGPU.
Herald added subscribers: foad, kerbowa, hiraditya, tpr, dstuttard, yaxunl, nhaehnle, jvesely.
arsenm requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.

If the flat work group size implied a larger minimum than the
requested one, the whole amdgpu-waves-per-eu request was discarded in
favor of the default, so the requested maximum was ignored as well.
This was interfering with the logic to propagate amdgpu-waves-per-eu
when accounting for the inferred flat workgroup size. Just clamp the
minimum instead so we still preserve the requested maximum.
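
To illustrate the difference, here is a minimal stand-alone sketch of the
old and new behavior. It is not the code in AMDGPUSubtarget.cpp: the
function names, the WavesPerEU alias and the numeric values are
hypothetical, and only the final compatibility check from the diff is
modeled.

```cpp
#include <utility>

// Hypothetical model: (minimum, maximum) waves per EU.
using WavesPerEU = std::pair<unsigned, unsigned>;

// Old behavior: if the requested minimum is below the minimum implied by
// the flat work group size, drop the whole request and return the default.
WavesPerEU oldEffectiveWavesPerEU(WavesPerEU Requested, WavesPerEU Default,
                                  unsigned MinImpliedByFlatWorkGroupSize) {
  if (Requested.first < MinImpliedByFlatWorkGroupSize)
    return Default;
  return Requested;
}

// New behavior: raise only the minimum, keeping the requested maximum.
// This sketch does not handle a clamped minimum that exceeds the maximum.
WavesPerEU newEffectiveWavesPerEU(WavesPerEU Requested,
                                  unsigned MinImpliedByFlatWorkGroupSize) {
  if (Requested.first < MinImpliedByFlatWorkGroupSize)
    Requested.first = MinImpliedByFlatWorkGroupSize;
  return Requested;
}
```

With a made-up request of (1, 4), a default of (1, 10) and an implied
minimum of 2, the old version yields (1, 10) and loses the requested
maximum of 4, while the new version yields (2, 4).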

Plus I'm not really sure what the point of the minimum is or what it
does. A few IR passes (AMDGPUPromoteAlloca and TTI) query it to get a
number of VGPRs, but everything else uses the maximum.

No test here since I don't think this is a directly observable
property, but it fixes a problem in a future patch which propagates
amdgpu-waves-per-eu.


https://reviews.llvm.org/D119510

Files:
  llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp


Index: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
===================================================================

--- llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
@@ -560,7 +560,7 @@
   // Make sure requested values are compatible with values implied by requested
   // minimum/maximum flat work group sizes.
   if (Requested.first < MinImpliedByFlatWorkGroupSize)
-    return Default;
+    Requested.first = MinImpliedByFlatWorkGroupSize;
 
   return Requested;
 }

