[llvm] [AMDGPU]:: Minor Unpacking Fixes. (PR #163992)
Jeffrey Byrnes via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 17 12:21:55 PDT 2025
================
@@ -787,22 +769,26 @@ bool SIPreEmitPeephole::run(MachineFunction &MF) {
// TODO: Fold this into previous block, if possible. Evaluate and handle any
// side effects.
- for (MachineBasicBlock &MBB : MF) {
- // Unpack packed instructions overlapped by MFMAs. This allows the compiler
- // to co-issue unpacked instructions with MFMA
- auto SchedModel = TII->getSchedModel();
- SetVector<MachineInstr *> InstrsToUnpack;
- for (auto &MI : make_early_inc_range(MBB.instrs())) {
- if (!SIInstrInfo::isMFMA(MI))
- continue;
- const MCSchedClassDesc *SchedClassDesc =
- SchedModel.resolveSchedClass(&MI);
- uint16_t NumMFMACycles =
- SchedModel.getWriteProcResBegin(SchedClassDesc)->ReleaseAtCycle;
- collectUnpackingCandidates(MI, InstrsToUnpack, NumMFMACycles);
- }
- for (MachineInstr *MI : InstrsToUnpack) {
- performF32Unpacking(*MI);
+
+ // Perform the extra MF scans only for supported archs
+ if (ST.hasGFX950Insts() || ST.hasGFX940Insts()) {
----------------
jrbyrnes wrote:
Also, you can just use `hasGFX940Insts()` here: it is also true on gfx950, so the separate `hasGFX950Insts()` check is redundant.
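
A minimal sketch of why the two conditions are equivalent, assuming (per the comment above) that gfx950 also reports the GFX940 instruction feature. `MockSubtarget` and the `make*` helpers are hypothetical stand-ins for illustration only; they are not the real `GCNSubtarget` API.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget; NOT the real GCNSubtarget.
struct MockSubtarget {
  bool GFX940Insts;
  bool GFX950Insts;
  bool hasGFX940Insts() const { return GFX940Insts; }
  bool hasGFX950Insts() const { return GFX950Insts; }
};

// Assumption taken from the review comment: gfx950 also sets the
// GFX940 feature, so GFX950Insts is never set without GFX940Insts.
MockSubtarget makeGFX940() { return MockSubtarget{true, false}; }
MockSubtarget makeGFX950() { return MockSubtarget{true, true}; }
MockSubtarget makeOther() { return MockSubtarget{false, false}; }

// Condition as written in the patch under review.
bool originalCheck(const MockSubtarget &ST) {
  return ST.hasGFX950Insts() || ST.hasGFX940Insts();
}

// Condition suggested in the review.
bool simplifiedCheck(const MockSubtarget &ST) {
  return ST.hasGFX940Insts();
}

// Both conditions agree on every mock target modelled here.
void verifyEquivalence() {
  const MockSubtarget Targets[] = {makeGFX940(), makeGFX950(), makeOther()};
  for (const MockSubtarget &ST : Targets)
    assert(originalCheck(ST) == simplifiedCheck(ST));
}
```

The equivalence holds only under the stated assumption; a target with the GFX950 feature set but not the GFX940 one would make the two checks differ.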
https://github.com/llvm/llvm-project/pull/163992