[llvm] [AArch64][SME] Extend FORM_TRANSPOSED pseudos to all multi-vector intrinsics (PR #124258)

Sander de Smalen via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 3 09:25:03 PST 2025


================
@@ -225,6 +226,83 @@ bool SMEPeepholeOpt::optimizeStartStopPairs(
   return Changed;
 }
 
+// Using the FORM_TRANSPOSED_REG_TUPLE pseudo can improve register allocation
+// of multi-vector intrinsics. However, the pseudo should only be emitted if
+// the input registers of the REG_SEQUENCE are copy nodes where the source
+// register is in a StridedOrContiguous class. For example:
+//
+//   %3:zpr2stridedorcontiguous = LD1B_2Z_IMM_PSEUDO ..
+//   %4:zpr = COPY %3.zsub1:zpr2stridedorcontiguous
+//   %5:zpr = COPY %3.zsub0:zpr2stridedorcontiguous
+//   %6:zpr2stridedorcontiguous = LD1B_2Z_PSEUDO ..
+//   %7:zpr = COPY %6.zsub1:zpr2stridedorcontiguous
+//   %8:zpr = COPY %6.zsub0:zpr2stridedorcontiguous
+//   %9:zpr2mul2 = REG_SEQUENCE %5:zpr, %subreg.zsub0, %8:zpr, %subreg.zsub1
+//
+//   ->  %9:zpr2mul2 = FORM_TRANSPOSED_REG_TUPLE_X2_PSEUDO %5:zpr, %8:zpr
+//
+bool SMEPeepholeOpt::visitRegSequence(MachineInstr &MI) {
+  MachineRegisterInfo &MRI = MI.getMF()->getRegInfo();
+
+  switch (MRI.getRegClass(MI.getOperand(0).getReg())->getID()) {
+  case AArch64::ZPR2RegClassID:
+  case AArch64::ZPR4RegClassID:
+  case AArch64::ZPR2Mul2RegClassID:
+  case AArch64::ZPR4Mul4RegClassID:
+    break;
+  default:
+    return false;
+  }
+
+  // The first operand is the register class created by the REG_SEQUENCE.
+  // Each operand pair after this consists of a vreg + subreg index, so
+  // for example a sequence of 2 registers will have a total of 5 operands.
+  if (MI.getNumOperands() != 5 && MI.getNumOperands() != 9)
+    return false;
+
+  MCRegister SubReg = MCRegister::NoRegister;
+  for (unsigned I = 1; I < MI.getNumOperands(); I += 2) {
+    MachineOperand &MO = MI.getOperand(I);
+
+    if (!MI.getOperand(I).isReg())
+      return false;
+
+    MachineOperand *Def = MRI.getOneDef(MO.getReg());
+    if (!Def || !Def->getParent()->isCopy())
+      return false;
+
+    const MachineOperand &CopySrc = Def->getParent()->getOperand(1);
+    unsigned OpSubReg = CopySrc.getSubReg();
+    if (SubReg == MCRegister::NoRegister)
+      SubReg = OpSubReg;
+
+    MachineOperand *CopySrcOp = MRI.getOneDef(CopySrc.getReg());
----------------
sdesmalen-arm wrote:

nit: `getOneDef` only returns a value when the MIR is in SSA form. There is an assert for this in `runOnMachineFunction`, but maybe it's worth adding one to this function too, in case that one accidentally gets removed?
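
To illustrate the concern, here is a minimal standalone sketch rather than the actual LLVM code. `MockDef`, `MockRegInfo` and `visitWithSSACheck` are hypothetical stand-ins; only the names `isSSA` and `getOneDef` mirror `MachineRegisterInfo`, and the mock derives SSA-ness from the recorded defs as a simplification. The assert message is an assumption, not a quote from the patch.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for a defining operand: which vreg it defines and
// which instruction it belongs to.
struct MockDef {
  unsigned Reg;
  unsigned InstrId;
};

// Hypothetical stand-in for MachineRegisterInfo. Only the single-definition
// lookup and the SSA query are modelled.
class MockRegInfo {
  std::map<unsigned, std::vector<MockDef>> Defs;
  bool IsSSA = true;

public:
  void addDef(unsigned Reg, unsigned InstrId) {
    std::vector<MockDef> &List = Defs[Reg];
    List.push_back({Reg, InstrId});
    if (List.size() > 1)
      IsSSA = false; // a second def of the same vreg breaks SSA form
  }

  bool isSSA() const { return IsSSA; }

  // Returns the def only when the register has exactly one; otherwise
  // nullptr. This is the behaviour the review comment is pointing at.
  const MockDef *getOneDef(unsigned Reg) const {
    auto It = Defs.find(Reg);
    if (It == Defs.end() || It->second.size() != 1)
      return nullptr;
    return &It->second.front();
  }
};

// Hypothetical visitor that states its SSA precondition locally instead of
// relying only on an assert in the caller.
bool visitWithSSACheck(const MockRegInfo &MRI, unsigned Reg) {
  assert(MRI.isSSA() && "Expected to be run on SSA form!");
  return MRI.getOneDef(Reg) != nullptr;
}
```

With a local assert, a non-SSA input fails loudly in asserts-enabled builds; without it, `getOneDef` would just return `nullptr` and the visitor would silently bail out.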

https://github.com/llvm/llvm-project/pull/124258

