[PATCH] D124678: [AMDGPU] Allow for MFMA Inst Clustering
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 29 10:18:40 PDT 2022
rampitec added a comment.
One thing to mention: transferring all successors and predecessors technically does not guarantee that an independent instruction will not be scheduled in between. I see that you are adding SDep::Cluster, but I remember it didn't use to work in the very similar scenario before. Not sure it is still so. It can happen that in a bigger program the cluster may be broken.
Also we probably need to consider dropping MFMA clustering along with load clustering during GCNScheduleDAGMILive::UnclusteredReschedule stage.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:898
+ if (MFMAOpa->NodeNum > MFMAOpb->NodeNum)
+ std::swap(MFMAOpa, MFMAOpb);
+
----------------
What if one feeds another and you swap them?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:900
+
+ DAG->addEdge(MFMAOpb, SDep(MFMAOpa, SDep::Cluster));
+
----------------
I believe you need to check the result of addEgde.
================
Comment at: llvm/test/CodeGen/AMDGPU/mfma-cluster.mir:3
+# RUN: llc -march=amdgcn -mcpu=gfx90a -start-before=machine-scheduler %s -o - -amdgpu-mfma-cluster=1 -amdgpu-mfma-cluster-size=2 --misched-bottomup --debug-only=amdgpu-subtarget,machine-scheduler 2>&1 | FileCheck -check-prefix=TWOLIMIT %s
+
+
----------------
You need '; REQUIRES: asserts' if you want to inspect debug output.
================
Comment at: llvm/test/CodeGen/AMDGPU/mfma-cluster.mir:33
+# TWOLIMIT-NEXT: Copy Pred SU(6)
+
+---
----------------
You also need to check resulting MIR.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124678/new/
https://reviews.llvm.org/D124678
More information about the llvm-commits
mailing list