[llvm-commits] PATCH: Teach LLVM to form minimal multiply DAGs

Chandler Carruth chandlerc at gmail.com
Wed Apr 25 21:20:19 PDT 2012


On Tue, Apr 24, 2012 at 4:47 PM, Evan Cheng <evan.cheng at apple.com> wrote:

> Interesting stuff. But it seems like we need some performance data. Are
> they coming? :-)
>

Indeed!

This has almost no impact on LNT. The only regressions look like noise to
me (<3%, and on only two benchmarks).

However, SPASS speeds up by 8%!!! =D

Look good?


>
> Evan
>
> On Apr 22, 2012, at 3:07 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
>
> On Sat, Apr 21, 2012 at 8:39 PM, Chandler Carruth <chandlerc at gmail.com>wrote:
>
>> After reading the implementation of Reassociate much more carefully, I've
>> convinced myself it belongs there. =] There is already quite a bit of logic
>> in reassociate to minimize the operations required to compute a result when
>> that computation happens to be reassociative. There was even a commented
>> out case for optimizing multiplications accordingly. I've refactored this
>> logic to fit into that model, which required substantial changes. I've also
>> fixed it such that it actually reaches the fixed point immediately. We run
>> reassociate on each layer of the minimized computation, and in fact can run
>> it on the final computation, and nothing else changes. That seems to make
>> it a nice canonical representation. I've also tweaked it to modify the
>> existing computation as little as possible.
>>
>> I think this is reasonable to commit in its current form, but as Richard
>> has pointed out on IRC, it doesn't quite find all opportunities for
>> minimizing the number of multiplications. I'm looking into an algorithmic
>> tweak to catch the remaining cases, but I don't expect that would change
>> the fundamental formulation of the optimization.
>>
>
> Ok, I've done my homework now, and I think this is probably as far as I'm
> interested in taking this.
>
> We could use one of the fancy algorithms for generating near-minimal
> addition-chain sequences for multi-exponentiation (such as Pippenger's),
> but I'm not convinced it's worth the substantial complexity. The basic
> binary exponentiation technique this patch uses should be sufficient for
> most of the (small) products we see in real programs.
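>
> For reference, here is a tiny standalone sketch of the binary
> (repeated-squaring) method's multiply counts; it is purely illustrative
> and not code from the patch:
>
>   #include <cstdio>
>
>   // Count the multiplies the binary (repeated-squaring) method needs for
>   // x^n, versus the n-1 multiplies a naive left-linear chain x*x*...*x
>   // would use.
>   static unsigned mulsForPow(unsigned N) {
>     unsigned Count = 0;
>     bool HaveResult = false;
>     for (;;) {
>       if (N & 1) {
>         if (HaveResult)
>           ++Count;         // fold this square into the running result
>         HaveResult = true; // first set bit: the result is just the square
>       }
>       N >>= 1;
>       if (!N)
>         return Count;
>       ++Count;             // square the running power x^(2^i)
>     }
>   }
>
>   int main() {
>     for (unsigned N = 2; N <= 16; ++N)
>       std::printf("x^%u: naive %u multiplies, binary %u\n", N, N - 1,
>                   mulsForPow(N));
>     return 0;
>   }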
>
> At most, it might be interesting to get CSE to look through chains of
> reassociative operations more aggressively -- it doesn't always seem
> possible for the reassociate pass to align things just right for it...
>
>
>>
>> -Chandler
>>
>> On Fri, Apr 20, 2012 at 2:23 AM, Chandler Carruth <chandlerc at gmail.com>wrote:
>>
>>> Hello,
>>>
>>> Richard and I hacked up a simple patch to form minimal multiply DAGs
>>> when intermediate results are unused. The code should handle fully generic
>>> formation of a minimal DAG of multiplies, sharing intermediate values
>>> wherever possible. Check out the test cases for some of the fun
>>> simplifications it performs.
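>>>
>>> To make that concrete, here is a hypothetical example of the kind of
>>> rewrite (not one of the actual test cases): a single-use chain
>>> a*a*a*a*a*a*a takes 6 multiplies written naively, while a minimal DAG
>>> that reuses its own intermediate values needs only 4:
>>>
>>>   static double pow7(double A) {
>>>     double A2 = A * A;   // a^2
>>>     double A3 = A2 * A;  // a^3
>>>     double A4 = A2 * A2; // a^4, reusing A2
>>>     return A3 * A4;      // a^7 in 4 multiplies instead of 6
>>>   }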
>>>
>>> The only real problem is where to put it. Currently I've put it in
>>> InstCombine, but not because that's really a good home for it (in fact, it
>>> can't be the home, read on). The most "natural" home for this type of
>>> transform is the Reassociate pass, but that has its own problems....
>>>
>>> The reassociate pass does a really nice job of canonicalizing chains of
>>> operations into left-linear trees that fully expose each intermediate
>>> operation for CSE and combining with other operations. This is a really
>>> nice property that seems a shame to lose. If we form the minimal multiply
>>> DAG in Reassociate eagerly, we may miss chances to combine non-minimal
>>> intermediate steps with other operations.
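>>>
>>> As a hypothetical illustration (invented names, not from the patch):
>>> written left-linear, both products below contain the intermediate
>>> (a*b)*c, so CSE can compute it once; if the larger product were eagerly
>>> regrouped as, say, (a*b)*(c*d), that common intermediate would never
>>> exist for CSE to find.
>>>
>>>   static double f(double A, double B, double C, double D) {
>>>     double X = ((A * B) * C) * D; // left-linear: exposes (A*B)*C
>>>     double Y = (A * B) * C;       // matches the exposed prefix above
>>>     return X + Y;
>>>   }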
>>>
>>> However, if we use a cleanup pass like instcombine to mop up one-use
>>> left-linear trees like this patch does, we can easily get Reassociate and
>>> InstCombine dueling over the canonical form, and into an inf-loop we go.
>>>
>>> One possible fix is to have InstCombine form a left-linear tree after it
>>> forms the minimal folding to try to keep Reassociate from undoing its
>>> work... but this all feels a bit hackish. Especially the
>>> deferred-processing hack required to ensure InstCombine performs this
>>> optimization on the entire tree at once. Suggestions?