[PATCH] D25966: [AArch64] Lower multiplication by a constant int to shl+add+shl

Thu Oct 27 19:29:25 PDT 2016

Gerolf added a comment.

Hi Haicheng,

I just have a few observations/food for thought:

- Nit: In your Summary I think you swapped n and m in your code snippets vs your formulas. Your code is correct though.
- The (2^N-1) * 2^M reduction increases code size, so it should not fire under Oz. Otherwise, similar considerations apply as for your main case.
- The (2^N+1) * 2^M reduction increases schedule height (at least on most processors). It might also increase code size when, e.g., an add+mul could be combined into a madd. But when code size is *not* a concern and latency(lsl) + 1 is less than both latency(mul) and latency(madd), it should always be a win. That target dependence is not checked in your code yet, though.
- I would look at the machine combiner only for cases that need more global scheduling context to decide.
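
To make the two reductions and the latency condition concrete, here is a minimal standalone sketch. It is not code from the patch: the function names, the `optForMinSize` flag, and the latency parameters are hypothetical, and the predicate is only one reading of the condition above (lsl + 1 cheaper than both mul and madd), not a check against any real scheduling model.

```cpp
#include <cstdint>

// x * ((2^N + 1) * 2^M) rewritten as ((x << N) + x) << M.
constexpr uint64_t mulPow2Plus1(uint64_t x, unsigned n, unsigned m) {
  return ((x << n) + x) << m;
}

// x * ((2^N - 1) * 2^M) rewritten as ((x << N) - x) << M.
constexpr uint64_t mulPow2Minus1(uint64_t x, unsigned n, unsigned m) {
  return ((x << n) - x) << m;
}

// Hypothetical profitability check: never fire when optimizing for minimum
// size, and otherwise require latency(lsl) + 1 to beat both mul and madd.
// The latencies are plain parameters here; a real implementation would have
// to query the target for them.
constexpr bool isShiftAddProfitable(bool optForMinSize, unsigned lslLatency,
                                    unsigned mulLatency,
                                    unsigned maddLatency) {
  return !optForMinSize && lslLatency + 1 < mulLatency &&
         lslLatency + 1 < maddLatency;
}
```

For example, with N = 3 and M = 2 the first form multiplies by 36 and the second by 28, so mulPow2Plus1(7, 3, 2) and 7 * 36 agree, as do mulPow2Minus1(7, 3, 2) and 7 * 28.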

Like Renato, I'm also curious about your gains. How big are they, and on which benchmarks?

Cheers
Gerolf

Repository:
  rL LLVM

https://reviews.llvm.org/D25966