[llvm-commits] [llvm] r103881 - in /llvm/trunk: lib/Target/ARM/ARMISelLowering.cpp test/CodeGen/ARM/mul_const.ll

Sun May 16 13:36:44 PDT 2010

On May 16, 2010, at 1:06 PM, Dale Johannesen wrote:

> 
> On May 16, 2010, at 12:31 PM, Chris Lattner wrote:
> 
>> On May 16, 2010, at 10:32 AM, Jakob Stoklund Olesen wrote:
>>> 
>>> But it is not clear that this transform would be a benefit to all targets, and the first one is better than the second for X86 because it partially matches an addressing mode.
>> 
>> The best way to handle this IMO is to have target-independent code do this, and have TargetLower* expose hooks that the code can query for its cost model.
>> 
>> -Chris
> 
> I agree (and, fwiw, that's what gcc does).  You can do a reasonable job knowing the relative cycle counts for multiply-by-constant (which depends on the value of the constant in some hardware), add, sub, and shift.

It sounds doable.

An extra quirk is that X86 can execute up to 3 add/shl/lea instructions in parallel, dependencies allowing, but it only has one multiplier for imul. So if you can build a DAG with some parallel instructions, that would be extra good.

An approach would be to have a target hook to provide the cost of operations, and a hook to suggest a way of breaking up a constant. A given constant multiplication (mul x, N) can be broken up recursively depending on N:

Keep it: (mul x, N)
Factors: (mul (mul x, N/K), K), where K divides N
Terms: (add (mul x, N-K), (mul x, K))
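
A rough sketch of that recursion, just to make the idea concrete. This is not LLVM code: CostModel, mulCost and the choice of candidate K values are all made up for illustration, and a real version would get both the costs and the suggested K from target hooks. The terms case relies on x*N = x*(N-K) + x*K. The cost is a plain sum of instruction costs, so it ignores both parallel issue and register pressure.

```cpp
#include <algorithm>
#include <cstdint>
#include <initializer_list>

// Hypothetical per-target costs; a real target hook would supply these.
struct CostModel {
  unsigned Mul;    // multiply by an arbitrary constant
  unsigned Shift;  // shl by a constant
  unsigned AddSub; // add or sub
};

static bool isPowerOf2(uint32_t N) { return N != 0 && (N & (N - 1)) == 0; }

// Estimated cost of (mul x, N): the cheapest of "keep it", "factors"
// and "terms", applied recursively to the smaller constants.
unsigned mulCost(uint32_t N, const CostModel &CM) {
  if (N <= 1)
    return 0; // x*0 and x*1 need no arithmetic on x
  if (isPowerOf2(N))
    return CM.Shift; // a single shl

  unsigned Best = CM.Mul; // Keep it: (mul x, N)

  // Factors: (mul (mul x, N/K), K). The candidate list is an assumption;
  // 3, 5 and 9 are the multipliers a single x86 lea can produce.
  for (uint32_t K : {3u, 5u, 9u})
    if (K < N && N % K == 0)
      Best = std::min(Best, mulCost(N / K, CM) + mulCost(K, CM));

  // Terms: (add (mul x, N-K), (mul x, K)), with K = lowest set bit of N.
  uint32_t K = N & (~N + 1);
  Best = std::min(Best, mulCost(N - K, CM) + mulCost(K, CM) + CM.AddSub);

  return Best;
}
```

With CostModel{5, 1, 1}, mulCost(45, ...) comes out as 4 (45 = 9 * 5, each factor being a shift plus an add), while with a cheaper multiplier, CostModel{3, 1, 1}, keeping the mul wins at 3.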

I am not sure how to balance parallel instructions vs register pressure.

Another problem is that such an expanded constant multiplication would look really yummy to the DAG combiner, which may well try to fold it right back into a single mul.

/jakob