[PATCH] D25966: [AArch64] Lower multiplication by a constant int to shl+add+shl
Gerolf Hoflehner via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 27 19:29:25 PDT 2016
Gerolf added a comment.
I just have a few observations/food for thought:
- Nit: In your Summary I think you swapped n and m in your code snippets vs your formulas. Your code is correct though.
- The (2^N-1) * 2^M reduction increases code size, so it should not fire under Oz. Otherwise, similar considerations to your major case apply.
- The (2^N+1) * 2^M reduction increases schedule height (at least on most processors). It might also increase code size, e.g. when add+mul could otherwise be combined into madd. But when code size is *not* a concern and latency(lsl) + 1 < latency(mul), latency(madd), it should always be a win. That target dependence is not checked in your code yet, though.
- I would look at the machine combiner only for cases that need more global scheduling context to decide.
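
To make the two constant shapes discussed above concrete, here is a small standalone sketch (plain C++, not LLVM code and not taken from the patch) that tests whether a constant has the form (2^N+1) * 2^M or (2^N-1) * 2^M and evaluates the resulting shift/add/shift sequence. The names ShiftAddForm, decomposeMulConst and applyForm are hypothetical and exist only for illustration; the sketch also ignores the Oz and latency checks raised in the points above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical description of a constant C = (2^N +/- 1) * 2^M.
struct ShiftAddForm {
  unsigned N;  // inner shift amount
  unsigned M;  // outer shift amount
  bool IsSub;  // false: (2^N + 1) * 2^M, true: (2^N - 1) * 2^M
};

// Returns log2 of V, assuming V is a nonzero power of two.
static unsigned log2PowerOfTwo(uint64_t V) {
  unsigned L = 0;
  while (V > 1) {
    V >>= 1;
    ++L;
  }
  return L;
}

static bool isPowerOfTwo(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Try to express C in one of the two forms. Plain powers of two (and zero)
// are rejected, since a single shift already handles them.
std::optional<ShiftAddForm> decomposeMulConst(uint64_t C) {
  if (C == 0)
    return std::nullopt;
  unsigned M = 0;
  while ((C & 1) == 0) {  // strip the 2^M factor
    C >>= 1;
    ++M;
  }
  uint64_t Lo = C - 1;  // C == 2^N + 1  <=>  C - 1 is a power of two >= 2
  if (Lo >= 2 && isPowerOfTwo(Lo))
    return ShiftAddForm{log2PowerOfTwo(Lo), M, false};
  uint64_t Hi = C + 1;  // C == 2^N - 1  <=>  C + 1 is a power of two >= 4
  if (Hi >= 4 && isPowerOfTwo(Hi))
    return ShiftAddForm{log2PowerOfTwo(Hi), M, true};
  return std::nullopt;
}

// Evaluate ((X << N) +/- X) << M, i.e. the shl+add+shl (or shl+sub+shl) shape.
uint64_t applyForm(uint64_t X, const ShiftAddForm &F) {
  uint64_t T = X << F.N;
  T = F.IsSub ? T - X : T + X;
  return T << F.M;
}
```

For example, 40 = (2^2+1) * 2^3 decomposes with N = 2, M = 3, while 28 = (2^3-1) * 2^2 takes the subtract form with N = 3, M = 2. A constant such as 11 fits neither shape and would be left as a multiply.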
Like Renato I'm also curious about your gains. How big? Which benchmarks?