[PATCH] D124678: [AMDGPU] Allow for MFMA Inst Clustering
Jeffrey Byrnes via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 3 12:43:50 PDT 2022
jrbyrnes added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/mfma-cluster.mir:154
+ ; BOTHSCHEDPASS-NEXT: $agpr0_agpr1_agpr2_agpr3 = V_MFMA_F32_4X4X1F32_e64 $vgpr1, $vgpr0, killed $agpr0_agpr1_agpr2_agpr3, 0, 0, 0, implicit $mode, implicit $exec
+ ; BOTHSCHEDPASS-NEXT: $vgpr2 = V_MOV_B32_e32 1, implicit $exec
+ ; BOTHSCHEDPASS-NEXT: $agpr4_agpr5_agpr6_agpr7 = V_MFMA_F32_4X4X1F32_e64 $vgpr3, $vgpr4, killed $agpr4_agpr5_agpr6_agpr7, 0, 0, 0, implicit $mode, implicit $exec
----------------
rampitec wrote:
> So the cluster does not really hold?
Currently, clusters will be broken by: 1. higher priority instructions, or by 2. independent instructions.
Here we see an independent instruction filling in the gap caused by hardware hazard. I have tried disabling fillMFMAShadow but this does not change the behavior. I think if we want unbroken clusters using SDep::Cluster, we need to address this via a different SchedStrategy (specifically, the logic in tryCandidate and pickNode). Should I start thinking about this?
In the context of CK -- broken clusters will cause problems if clusters of different type blend together. However, I think this won't happen due to dependencies -- hard to say without sample MIR.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124678/new/
https://reviews.llvm.org/D124678
More information about the llvm-commits
mailing list