[PATCH] D138817: [AArch64] Optimize muls with operands having enough sign bits.
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 28 13:40:50 PST 2022
dmgreen added a comment.
Could this apply to umull too? We should also look into improving GlobalISel (not necessarily in this commit); I think it has enough info nowadays to perform the same ComputeNumSignBits check.
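For reference, a rough sketch of what the unsigned variant might look like (not part of this patch; the umullwithzerobits name is made up for illustration, and this is untested). Instead of sign bits, it would require the high 32 bits of both operands to be known zero:

// Unsigned analogue of smullwithsignbits: the top 32 bits of both
// operands must be known zero for a 32x32->64 umull to be equivalent.
def umullwithzerobits : PatFrag<(ops node:$l, node:$r), (mul node:$l, node:$r), [{
  return CurDAG->MaskedValueIsZero(N->getOperand(0),
                                   APInt::getHighBitsSet(64, 32)) &&
         CurDAG->MaskedValueIsZero(N->getOperand(1),
                                   APInt::getHighBitsSet(64, 32));
}]>;

def : Pat<(i64 (umullwithzerobits GPR64:$Rn, GPR64:$Rm)),
          (UMADDLrrr (i32 (EXTRACT_SUBREG GPR64:$Rn, sub_32)),
                     (i32 (EXTRACT_SUBREG GPR64:$Rm, sub_32)), XZR)>;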
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.td:1902
def : Pat<(i64 (ineg (mul (sext_inreg GPR64:$Rn, i32), (s64imm_32bit:$C)))),
(SMSUBLrrr (i32 (EXTRACT_SUBREG GPR64:$Rn, sub_32)),
(MOVi32imm (trunc_imm imm:$C)), XZR)>;
----------------
Could the same thing be done for SMSUBLrrr too, and for the additional add/sub patterns below?
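Roughly like this, reusing the same PatFrag (a sketch, not code from the patch):

// smaddl computes $Ra + sext(Wn) * sext(Wm);
// smsubl computes $Ra - sext(Wn) * sext(Wm).
def : Pat<(i64 (add GPR64:$Ra, (smullwithsignbits GPR64:$Rn, GPR64:$Rm))),
          (SMADDLrrr (i32 (EXTRACT_SUBREG GPR64:$Rn, sub_32)),
                     (i32 (EXTRACT_SUBREG GPR64:$Rm, sub_32)), GPR64:$Ra)>;
def : Pat<(i64 (sub GPR64:$Ra, (smullwithsignbits GPR64:$Rn, GPR64:$Rm))),
          (SMSUBLrrr (i32 (EXTRACT_SUBREG GPR64:$Rn, sub_32)),
                     (i32 (EXTRACT_SUBREG GPR64:$Rm, sub_32)), GPR64:$Ra)>;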
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.td:1925
+// Mul with enough sign-bits.
+def smullwithsignbits : PatFrag<(ops node:$l, node:$r), (mul node:$l, node:$r), [{
+ return CurDAG->ComputeNumSignBits(N->getOperand(0)) > 32 &&
----------------
Can the Pat be moved into the block above, and the PatFrag moved closer to the existing add_and_or_is_add PatFrag?
================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.td:1926
+def smullwithsignbits : PatFrag<(ops node:$l, node:$r), (mul node:$l, node:$r), [{
+ return CurDAG->ComputeNumSignBits(N->getOperand(0)) > 32 &&
+ CurDAG->ComputeNumSignBits(N->getOperand(1)) > 32;
----------------
I think it may need to be 33 bits; sign-bit counts are always off by one. Can you add some tests for the edge cases?
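For example, a value produced by sext i32 %x to i64 has 33 sign bits, so a > 32 check accepts it, while a value only known to fit in 33 signed bits (e.g. via a sext_inreg to i33) has just 32 sign bits and must be rejected, since a 32x32 smull would drop its 33rd bit. Those would make good edge-case tests.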
================
Comment at: llvm/test/CodeGen/AArch64/aarch64-mull-masks.ll:92
+ %sext4 = sext i8 %x1 to i64
+ %mul = mul nsw i64 %sext, %sext4
+ ret i64 %mul
----------------
We can remove the nsw from the mul.
Do you have any tests for the commuted form of this too?
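(I.e. a variant of this test with the mul operands in the opposite order, so the pattern also has to match when the sign-extended value appears on the other side; just a suggestion for what such a test could look like.)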
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D138817/new/
https://reviews.llvm.org/D138817