[llvm-commits] [PATCH][FastMath, InstCombine] Fadd/Fsub optimizations

Tue Dec 11 15:51:43 PST 2012

> Correct me if I'm wrong, but as far as I can tell, you aren't actually
> storing any useful state across calls to simplifyFAdd.

I do *NOT* store state.  The pre-created Addend instances are used and 
discarded but not destroyed.
So, the APFloat fields in Addends instance are valid valid APFloat. But 
we don't care their value.

>
>> The avoidance of APFloat construction cost is the MOST difficult part of
>> this change.
>> It took me about 70% time to strike the balance between the clarity the
>> efficiency.
> I understand where you're coming from, but I'm not convinced you're
> right.  Constructing an APFloat simply isn't that expensive,
> especially for floating-point types which don't have more than 64
> bits.  And having a bunch of special-case code also has a performance
> cost.
It is expensive in the sense:
   1. more often than not, and expr cannot be optimized, and
   2. try to optimize an expr need at about 9 Addends.

That means we need call APFloat::APFloat() for 9 * n * iteration where
the n is the # of unsafe add, and <iteration> is # of iterations 
instCombine take to
optimize instruction.

If 80% of instructions can be optimized, that might justify the cost.
However, this is not the case.

On the other hand,  I just create some instances to save the cost. There 
is no hand-shake between
functions, there is no state machine and other nasty stuff to make 
maintenance hard.

>
> Anyway, if you really care about it being fast, matching
> PatternMatching.h-style patterns is going to be faster than any data
> structure.
>
>
I start with pattern match, and gave up later on. It seems to be too messy.