[PATCH] D135441: [AArch64][SelectionDAG] Lower multiplication by a constant to shl+add+shl+add

Mon Oct 17 00:19:22 PDT 2022

dmgreen added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15038
+               isPowPlusPlusConst(ConstValue, CVM, CVN)) {
+      // TOTO: The latency can vary depending on the shirt amount, so need
+      // construct an MCInst to get more detail information.
----------------
-> TODO, shift.

The documentation for LSLFast says that shifts of <= 3 places are fast, which is the limit for most address offsets. Modern cores usually have free shifts of <= 4 places. (They tend to have cheap multiplies too, if they can perform fast shifts.)

I was considering putting in an LSLFast4 option when I recently enabled LSLFast for Arm cores, but as far as I understand, the LSLFast option currently doesn't apply to Add instructions like it should. We can check that ShiftM1 and ShiftN1 are <= 3 here though, and maybe change the subtarget feature for shifts of 4?
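
To make the suggested check concrete, here is a rough standalone sketch, not code from the patch. It assumes the constant is decomposed as C = (2^M + 1) * 2^N + 1, which is what shl+add+shl+add would compute, and it rejects any decomposition whose shift amounts exceed a limit. The helper name decomposeShlAddShlAdd and the MaxShift parameter are hypothetical; the patch itself uses isPowPlusPlusConst, and its exact logic may differ.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical helper (not the patch's isPowPlusPlusConst): tries to write
// C as ((2^M + 1) << N) + 1 with 1 <= M, N <= MaxShift, so that
//   x * C == (((x << M) + x) << N) + x
// Returns {M, N} on success. MaxShift = 3 models the LSLFast limit
// discussed above; MaxShift = 4 models a possible LSLFast4-style feature.
std::optional<std::pair<unsigned, unsigned>>
decomposeShlAddShlAdd(uint64_t C, unsigned MaxShift = 3) {
  if (C < 2)
    return std::nullopt;
  uint64_t Rest = C - 1; // must equal (2^M + 1) << N
  for (unsigned N = 1; N <= MaxShift; ++N) {
    // The low N bits of Rest must be zero for it to be a multiple of 2^N.
    if (Rest & ((uint64_t(1) << N) - 1))
      continue;
    uint64_t Q = Rest >> N; // must equal 2^M + 1
    for (unsigned M = 1; M <= MaxShift; ++M)
      if (Q == (uint64_t(1) << M) + 1)
        return std::make_pair(M, N);
  }
  return std::nullopt;
}
```

With this sketch, 11 = (4 + 1) * 2 + 1 is accepted with M = 2, N = 1, while 35 = (16 + 1) * 2 + 1 is rejected at the default limit of 3 and only accepted when MaxShift is raised to 4, which is the case a separate subtarget feature would cover.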

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135441/new/

https://reviews.llvm.org/D135441