[PATCH] D135441: [AArch64][SelectionDAG] Lower multiplication by a constant to shl+add+shl+add

Tue Oct 18 08:00:05 PDT 2022

Allen marked an inline comment as done.
Allen added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15044
+      unsigned ShiftN1 = CVNMinus1.logBase2();
+      SDValue MVal = Add(Shl(N0, ShiftM1), N0);
+      // LSLFast implicate that Shifts <= 3 places are fast
----------------
dmgreen wrote:
> Can this be moved into the `if..`?
Done, thanks

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15038
+               isPowPlusPlusConst(ConstValue, CVM, CVN)) {
+      // TOTO: The latency can vary depending on the shirt amount, so need
+      // construct an MCInst to get more detail information.
----------------
dmgreen wrote:
> Allen wrote:
> > dmgreen wrote:
> > > -> TODO, shift.
> > > 
> > > The documentation for LSLFast says that Shifts <= 3 places are fast, which is the limit for most address offsets. Modern cores usually have free shifts <= 4 places. (They tend to have cheap multiplies too, if they can perform fast shifts).
> > > 
> > > I was considering putting a LSLFast4 option in when I recently enabled LSLFast for Arm cores, but as far as I understand the LSLFast option currently doesn't actually apply to Add instructions like it should. We can check that ShiftM1 and ShiftN1 are <= 3 here though, and maybe change the subtarget feature for shifts of 4?
> > Applied your comment, thanks
> You can remove the comment now then.
Done
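
For readers following the thread, the shift-limit check discussed above can be sketched outside of SelectionDAG. This is a standalone illustration under stated assumptions, not the code in the patch: `decomposePowPlusPlus` and `mulByShiftAdd` are hypothetical names, the constant is assumed to have the form (2^M + 1) * (2^N + 1), and the `MaxShift` parameter is an assumed stand-in for the "shifts <= 3 are fast" limit (or 4, for the LSLFast4 idea).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the patch's isPowPlusPlusConst): try to write
// C as (2^M + 1) * (2^N + 1) with both shift amounts in [1, MaxShift].
// M and N are only written on success.
bool decomposePowPlusPlus(uint64_t C, unsigned &M, unsigned &N,
                          unsigned MaxShift = 3) {
  for (unsigned m = 1; m <= MaxShift; ++m) {
    uint64_t Factor = (uint64_t(1) << m) + 1; // 2^m + 1
    if (C % Factor != 0)
      continue;
    uint64_t Quotient = C / Factor;
    for (unsigned n = 1; n <= MaxShift; ++n) {
      if (Quotient == (uint64_t(1) << n) + 1) { // 2^n + 1
        M = m;
        N = n;
        return true;
      }
    }
  }
  return false;
}

// Mirrors the shl+add+shl+add shape on plain integers:
//   MVal   = (X << M) + X     == X * (2^M + 1)
//   result = (MVal << N) + MVal == MVal * (2^N + 1)
uint64_t mulByShiftAdd(uint64_t X, unsigned M, unsigned N) {
  uint64_t MVal = (X << M) + X;
  return (MVal << N) + MVal;
}
```

With the default limit of 3, 45 = (1 + 4) * (1 + 8) is accepted, while 51 = (1 + 2) * (1 + 16) is rejected because it needs a shift of 4; passing `MaxShift = 4` accepts it, which is the case a separate subtarget feature for shifts of 4 would cover.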

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135441/new/

https://reviews.llvm.org/D135441