[PATCH] D124678: [AMDGPU] Allow for MFMA Inst Clustering

Wed May 4 10:15:45 PDT 2022

jrbyrnes updated this revision to Diff 427056.
jrbyrnes added a comment.
Herald added a subscriber: mgrang.

Fix algorithmic flaws:

1. Use a chain as the cluster shape (A->B->C->D) instead of a fanout (A->{B,C,D}). In a chain, each node has at most one cluster successor, so the scheduler will not miss cluster edges because a node has multiple cluster successors.
2. Create artificial edges in the cluster. This coerces the scheduler into starting from either the root or the leaf of the cluster rather than potentially selecting a node in the middle. In post-RA scheduling, if the scheduler selects the middle, it loses the cluster prefix.
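
To illustrate the difference between the two cluster shapes in point 1, here is a minimal standalone sketch. It is not code from the patch and does not use any LLVM API: NodeId, Edge, buildFanout, buildChain and maxClusterSuccs are hypothetical names chosen for this example, and nodes are assumed to be plain integer IDs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for scheduling-DAG nodes and cluster edges;
// these are NOT LLVM types.
using NodeId = unsigned;
using Edge = std::pair<NodeId, NodeId>; // (predecessor, successor)

// Fanout shape: the first node gets a cluster edge to every other node,
// i.e. A->{B,C,D}.
std::vector<Edge> buildFanout(const std::vector<NodeId> &Nodes) {
  std::vector<Edge> Edges;
  for (std::size_t I = 1; I < Nodes.size(); ++I)
    Edges.emplace_back(Nodes[0], Nodes[I]);
  return Edges;
}

// Chain shape: each node gets a cluster edge only to the next node,
// i.e. A->B->C->D.
std::vector<Edge> buildChain(const std::vector<NodeId> &Nodes) {
  std::vector<Edge> Edges;
  for (std::size_t I = 1; I < Nodes.size(); ++I)
    Edges.emplace_back(Nodes[I - 1], Nodes[I]);
  return Edges;
}

// Largest number of cluster successors attached to any single node.
std::size_t maxClusterSuccs(const std::vector<Edge> &Edges) {
  std::map<NodeId, std::size_t> Count;
  std::size_t Max = 0;
  for (const Edge &E : Edges)
    Max = std::max(Max, ++Count[E.first]);
  return Max;
}
```

For four nodes, both shapes produce three edges, but in the fanout the root carries all three cluster successors, while in the chain no node carries more than one. That is the property point 1 relies on.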

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124678/new/

https://reviews.llvm.org/D124678

Files:
  llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.cpp
  llvm/lib/Target/AMDGPU/AMDGPUMFMAClustering.h
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  llvm/lib/Target/AMDGPU/CMakeLists.txt
  llvm/test/CodeGen/AMDGPU/mfma-cluster-edges.mir
  llvm/test/CodeGen/AMDGPU/mfma-cluster.mir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D124678.427056.patch
Type: text/x-patch
Size: 42439 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220504/1cbfde1e/attachment.bin>