[llvm-commits] [PATCH][FastMath, InstCombine] Fadd/Fsub optimizations

Tue Dec 11 16:14:15 PST 2012

On Tue, Dec 11, 2012 at 3:51 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
>> Correct me if I'm wrong, but as far as I can tell, you aren't actually
>> storing any useful state across calls to simplifyFAdd.
>
>
> I do *NOT* store state.  The pre-created Addend instances are used and
> discarded but not destroyed.
> So, the APFloat fields in Addends instance are valid valid APFloat. But we
> don't care their value.

Then what cost are you trying to avoid by saving a pointer to the
FAddCombine?  Constructing an FAddCombine should be basically free.

>>
>>> The avoidance of APFloat construction cost is the MOST difficult part of
>>> this change.
>>> It took me about 70% time to strike the balance between the clarity the
>>> efficiency.
>>
>> I understand where you're coming from, but I'm not convinced you're
>> right.  Constructing an APFloat simply isn't that expensive,
>> especially for floating-point types which don't have more than 64
>> bits.  And having a bunch of special-case code also has a performance
>> cost.
>
> It is expensive in the sense:
>   1. more often than not, and expr cannot be optimized, and
>   2. try to optimize an expr need at about 9 Addends.
>
> That means we need call APFloat::APFloat() for 9 * n * iteration where
> the n is the # of unsafe add, and <iteration> is # of iterations instCombine
> take to
> optimize instruction.
>
> If 80% of instructions can be optimized, that might justify the cost.
> However, this is not the case.
>
> On the other hand,  I just create some instances to save the cost. There is
> no hand-shake between
> functions, there is no state machine and other nasty stuff to make
> maintenance hard.

My high-level concern with this patch is basically that this is a lot
of code, the algorithm is complex, and the implementation is hard to
reason about; I'm trying to find places where can simplify if
possible.  And I'm immediately skeptical of any performance claim
without numbers.

>> Anyway, if you really care about it being fast, matching
>> PatternMatching.h-style patterns is going to be faster than any data
>> structure.
>>
>>
> I start with pattern match, and gave up later on. It seems to be too messy.

Okay.

-Eli