[llvm] [AArch64][GlobalISel] Push ADD/SUB through Extend Instructions (PR #90964)

Thu May 23 01:14:42 PDT 2024

================
@@ -554,6 +554,55 @@ void applyExtUaddvToUaddlv(MachineInstr &MI, MachineRegisterInfo &MRI,
   MI.eraseFromParent();
 }
 
+// Pushes ADD/SUB through extend instructions to decrease the number of extend
+// instruction at the end by allowing selection of {s|u}addl sooner
+
+// i32 add(i32 ext i8, i32 ext i8) => i32 ext(i16 add(i16 ext i8, i16 ext i8))
+bool matchPushAddSubExt(MachineInstr &MI, MachineRegisterInfo &MRI,
+                        Register DstReg, Register SrcReg1, Register SrcReg2) {
+  assert(MI.getOpcode() == TargetOpcode::G_ADD ||
+         MI.getOpcode() == TargetOpcode::G_SUB &&
+             "Expected a G_ADD or G_SUB instruction\n");
+
+  // Deal with vector types only
+  LLT DstTy = MRI.getType(DstReg);
+  if (!DstTy.isVector())
+    return false;
+
+  // Return true if G_{S|Z}EXT instruction is more than 2* source
+  Register ExtDstReg = MI.getOperand(1).getReg();
+  LLT ExtDstTy = MRI.getType(ExtDstReg);
+  LLT Ext1SrcTy = MRI.getType(SrcReg1);
+  LLT Ext2SrcTy = MRI.getType(SrcReg2);
+  if (((Ext1SrcTy.getScalarSizeInBits() == 8 &&
+        ExtDstTy.getScalarSizeInBits() == 32) ||
+       ((Ext1SrcTy.getScalarSizeInBits() == 8 ||
+         Ext1SrcTy.getScalarSizeInBits() == 16) &&
+        ExtDstTy.getScalarSizeInBits() == 64)) &&
+      Ext1SrcTy == Ext2SrcTy)
----------------
davemgreen wrote:

I don't think there is much in terms of free instructions in this combine. This is the GISel equivalent of https://reviews.llvm.org/D128426, which always felt quite AArch64 specific to me, due to the way the saddl/saddl2/sshll instructions are legalized when extending types. Splitting the type up early can be profitable, but I'm not sure how to generally say that would be the case on different architectures.

https://github.com/llvm/llvm-project/pull/90964