[llvm] [AArch64][SME] Extend FORM_TRANSPOSED pseudos to all multi-vector intrinsics (PR #124258)
Sander de Smalen via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 3 09:25:03 PST 2025
================
@@ -225,6 +226,83 @@ bool SMEPeepholeOpt::optimizeStartStopPairs(
   return Changed;
 }
+// Using the FORM_TRANSPOSED_REG_TUPLE pseudo can improve register allocation
+// of multi-vector intrinsics. However, the pseudo should only be emitted if
+// the input registers of the REG_SEQUENCE are copy nodes where the source
+// register is in a StridedOrContiguous class. For example:
+//
+// %3:zpr2stridedorcontiguous = LD1B_2Z_IMM_PSEUDO ..
+// %4:zpr = COPY %3.zsub1:zpr2stridedorcontiguous
+// %5:zpr = COPY %3.zsub0:zpr2stridedorcontiguous
+// %6:zpr2stridedorcontiguous = LD1B_2Z_PSEUDO ..
+// %7:zpr = COPY %6.zsub1:zpr2stridedorcontiguous
+// %8:zpr = COPY %6.zsub0:zpr2stridedorcontiguous
+// %9:zpr2mul2 = REG_SEQUENCE %5:zpr, %subreg.zsub0, %8:zpr, %subreg.zsub1
+//
+// -> %9:zpr2mul2 = FORM_TRANSPOSED_REG_TUPLE_X2_PSEUDO %5:zpr, %8:zpr
+//
+bool SMEPeepholeOpt::visitRegSequence(MachineInstr &MI) {
+  MachineRegisterInfo &MRI = MI.getMF()->getRegInfo();
+
+  switch (MRI.getRegClass(MI.getOperand(0).getReg())->getID()) {
+  case AArch64::ZPR2RegClassID:
+  case AArch64::ZPR4RegClassID:
+  case AArch64::ZPR2Mul2RegClassID:
+  case AArch64::ZPR4Mul4RegClassID:
+    break;
+  default:
+    return false;
+  }
+
+  // The first operand is the register class created by the REG_SEQUENCE.
+  // Each operand pair after this consists of a vreg + subreg index, so
+  // for example a sequence of 2 registers will have a total of 5 operands.
+  if (MI.getNumOperands() != 5 && MI.getNumOperands() != 9)
+    return false;
+
+  MCRegister SubReg = MCRegister::NoRegister;
+  for (unsigned I = 1; I < MI.getNumOperands(); I += 2) {
+    MachineOperand &MO = MI.getOperand(I);
+
+    if (!MI.getOperand(I).isReg())
+      return false;
+
+    MachineOperand *Def = MRI.getOneDef(MO.getReg());
+    if (!Def || !Def->getParent()->isCopy())
+      return false;
+
+    const MachineOperand &CopySrc = Def->getParent()->getOperand(1);
+    unsigned OpSubReg = CopySrc.getSubReg();
+    if (SubReg == MCRegister::NoRegister)
+      SubReg = OpSubReg;
+
+    MachineOperand *CopySrcOp = MRI.getOneDef(CopySrc.getReg());
----------------
sdesmalen-arm wrote:
nit: `getOneDef` only returns a value when the MIR is in SSA form. There is an assert for this in `runOnMachineFunction`, but maybe it's worth adding one to this function too, in case that one accidentally gets removed?
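
A minimal, self-contained sketch of that kind of local guard is below. `ToyRegInfo` and `visitRegSequenceSketch` are hypothetical stand-ins (not LLVM types) so the snippet compiles on its own, and the assert message is illustrative. In the pass itself the check would presumably be an `assert(MRI.isSSA() && ...)` at the top of `visitRegSequence`.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::MachineRegisterInfo; only the SSA query
// is modelled here.
struct ToyRegInfo {
  bool SSA = true;
  bool isSSA() const { return SSA; }
};

// Hypothetical stand-in for SMEPeepholeOpt::visitRegSequence. The point is
// only the placement of the precondition check, not the peephole logic.
bool visitRegSequenceSketch(const ToyRegInfo &MRI) {
  // State the precondition locally so it still holds if the caller's
  // assert is ever removed. Compiled out when NDEBUG is defined.
  assert(MRI.isSSA() && "Expected to be run on SSA form!");
  (void)MRI; // avoid an unused-parameter warning in NDEBUG builds

  // ... the operand walk relying on single-def lookups would follow here ...
  return true;
}
```

Since `assert` compiles away in release builds, this documents and checks the assumption in debug builds only; it does not change behaviour for non-SSA input when assertions are disabled.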
https://github.com/llvm/llvm-project/pull/124258