[PATCH] D124678: [AMDGPU] Allow for MFMA Inst Clustering
Austin Kerbow via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri May 6 15:03:44 PDT 2022
kerbowa accepted this revision.
kerbowa added a comment.
This revision is now accepted and ready to land.
LGTM with nits. Thanks!
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.cpp:23
+
+#define DEBUG_TYPE "amdgpu-subtarget"
+
----------------
Can you give this its own debug type or else use machine-scheduler?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.cpp:33
+ MFMAClusterSize("amdgpu-mfma-cluster-size", cl::init(5), cl::Hidden,
+ cl::desc("The maximum number of MFMA insts to "
+ "attempt to cluster together."));
----------------
NIT: insts->instructions on help text
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.cpp:113
+ for (; NextIdx < End; ++NextIdx) {
+ if (ClusterSize >= MFMAClusterSize || NextIdx >= End)
+ break;
----------------
NIT: Maybe this should be 'MaxMFMAClusterSize'.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.cpp:152
+ // If our occupancy doesn't support multi waves, bypass clustering
+ if (!ST.hasMAIInsts() || MFI->getOccupancy() < 2)
+ return;
----------------
I actually don't think this should be gated by having multiple waves. This transformation could also be useful for intra-wave scheduling and I think we should leave it up to the users when it is enabled. If we need more precision we could eventually allow for specifying functions where the clustering should be enabled.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124678/new/
https://reviews.llvm.org/D124678
More information about the llvm-commits
mailing list