[PATCH] D135441: [AArch64][SelectionDAG] Lower multiplication by a constant to shl+add+shl+add

Mon Oct 17 09:34:40 PDT 2022

dmgreen added a comment.

Can you add some tests for multiplying by larger values? Maybe 165 and 297.
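
For reference, both suggested constants factor as (2^M + 1) * (2^N + 1), so they should exercise the shl+add+shl+add form. The sketch below only checks that arithmetic; `mulByShlAdd` is a hypothetical name for illustration and is not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the patch: computes
// X * (2^M + 1) * (2^N + 1) using only shl+add+shl+add.
constexpr uint64_t mulByShlAdd(uint64_t X, unsigned M, unsigned N) {
  uint64_t MVal = (X << M) + X; // X * (2^M + 1)
  return (MVal << N) + MVal;    // MVal * (2^N + 1)
}

// 165 = (2^2 + 1) * (2^5 + 1) = 5 * 33
static_assert(mulByShlAdd(7, 2, 5) == 7u * 165u, "165 = 5 * 33");
// 297 = (2^3 + 1) * (2^5 + 1) = 9 * 33
static_assert(mulByShlAdd(7, 3, 5) == 7u * 297u, "297 = 9 * 33");
```

Both constants need a shift of 5 for the 33 factor, which is larger than the "<= 3 places" range that LSLFast describes.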

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15044
+      unsigned ShiftN1 = CVNMinus1.logBase2();
+      SDValue MVal = Add(Shl(N0, ShiftM1), N0);
+      // LSLFast implicate that Shifts <= 3 places are fast
----------------
Can this be moved into the `if..`?

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15038
+               isPowPlusPlusConst(ConstValue, CVM, CVN)) {
+      // TOTO: The latency can vary depending on the shirt amount, so need
+      // construct an MCInst to get more detail information.
----------------
Allen wrote:
> dmgreen wrote:
> > -> TODO, shift.
> > 
> > The documentation for LSLFast says that Shifts <= 3 places are fast, which is the limit for most address offsets. Modern cores usually have free shifts <= 4 places. (They tend to have cheap multiplies too, if they can perform fast shifts).
> > 
> > I was considering putting a LSLFast4 option in when I recently enabled LSLFast for Arm cores, but as far as I understand the LSLFast option currently doesn't actually apply to Add instructions like it should. We can check that ShiftM1 and ShiftN1 are <= 3 here though, and maybe change the subtarget feature for shifts of 4?
> Applied your comment, thanks.
You can remove the comment now then.
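
To make the suggested `<= 3` check concrete, here is a minimal standalone sketch of the decomposition with a shift cap. It is an assumption about the intended behaviour, not the code in the patch: `findShlAddPair` and `MaxShift` are hypothetical names, and the patch's own `isPowPlusPlusConst` may be structured differently.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical helper, not the patch's isPowPlusPlusConst: looks for
// C == (2^M + 1) * (2^N + 1) with 1 <= M <= N <= MaxShift.
// Returns the (M, N) pair, or std::nullopt if no such pair exists.
std::optional<std::pair<unsigned, unsigned>>
findShlAddPair(uint64_t C, unsigned MaxShift) {
  for (unsigned M = 1; M <= MaxShift && M < 63; ++M) {
    uint64_t FactorM = (uint64_t{1} << M) + 1;
    if (C % FactorM != 0)
      continue;
    uint64_t Rest = C / FactorM;
    // Only try N >= M so each pair is considered once.
    for (unsigned N = M; N <= MaxShift && N < 63; ++N)
      if (Rest == (uint64_t{1} << N) + 1)
        return std::make_pair(M, N);
  }
  return std::nullopt;
}
```

With `MaxShift = 3`, 45 decomposes as (2, 3), while 165 and 297 are rejected because their 33 factor needs a shift of 5. Raising the cap to 4 does not change that for these two values; they only decompose once the cap reaches 5.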

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135441/new/

https://reviews.llvm.org/D135441