[PATCH] D124678: [AMDGPU] Allow for MFMA Inst Clustering

Fri May 6 15:03:44 PDT 2022

kerbowa accepted this revision.
kerbowa added a comment.
This revision is now accepted and ready to land.

LGTM with nits. Thanks!

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.cpp:23
+
+#define DEBUG_TYPE "amdgpu-subtarget"
+
----------------
Can you give this its own debug type or else use machine-scheduler?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.cpp:33
+    MFMAClusterSize("amdgpu-mfma-cluster-size", cl::init(5), cl::Hidden,
+                    cl::desc("The maximum number of MFMA insts to "
+                             "attempt to cluster together."));
----------------
NIT: insts->instructions on help text

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.cpp:113
+    for (; NextIdx < End; ++NextIdx) {
+      if (ClusterSize >= MFMAClusterSize || NextIdx >= End)
+        break;
----------------
NIT: Maybe this should be 'MaxMFMAClusterSize'.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.cpp:152
+  // If our occupancy doesn't support multi waves, bypass clustering
+  if (!ST.hasMAIInsts() || MFI->getOccupancy() < 2)
+    return;
----------------
I actually don't think this should be gated by having multiple waves. This transformation could also be useful for intra-wave scheduling and I think we should leave it up to the users when it is enabled. If we need more precision we could eventually allow for specifying functions where the clustering should be enabled. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124678/new/

https://reviews.llvm.org/D124678