[LLVMdev] Optimization passes organization and tradeoffs
Chris Lattner
sabre at nondot.org
Wed May 21 13:48:57 PDT 2008
On Wed, 21 May 2008, Nicolas Capens wrote:
> Thanks for the detailed explanations. I have a few remaining questions:
>
> Am I correct that ScalarReplAggregates is hardly more expensive than Mem2Reg
> and therefore generally preferable?
Right.
> What would be the code quality implications of using "-dce -simplifycfg"
> instead of -adce? As far as I understand the algorithms involved, -dce would
> hardly ever miss a dead instruction if dead stores have already been
> eliminated, right?
-dce -simplifycfg should be basically the same as -adce. ADCE was recently
changed to not delete output-free loops, and deleting such loops was the major
advantage it had over the -dce/-simplifycfg combination. We now have an
explicit loop deletion pass as part of the optimizer that covers that case.
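To make the comparison concrete, here is a toy sketch (hypothetical names, not LLVM's actual data structures) of the "trivial" dead-code elimination that -dce performs: an instruction with no side effects and no uses is deleted, and deleting it can make its operands dead in turn:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR: each instruction may use up to two others. */
typedef struct Inst {
    const char *name;
    bool has_side_effects;   /* stores, calls, etc. are never deleted */
    struct Inst *ops[2];     /* operands (may be NULL) */
    int num_uses;            /* how many live instructions use this one */
    bool deleted;
} Inst;

/* Delete an instruction if it is trivially dead, then recurse into its
 * operands, which may have just lost their last use. */
static void erase_if_dead(Inst *i) {
    if (i == NULL || i->deleted) return;
    if (i->has_side_effects || i->num_uses > 0) return;
    i->deleted = true;
    for (int k = 0; k < 2; k++) {
        if (i->ops[k] != NULL) {
            i->ops[k]->num_uses--;
            erase_if_dead(i->ops[k]);
        }
    }
}
```

The key contrast with ADCE is the direction of reasoning: this sketch assumes everything is live until proven unused, whereas ADCE assumes everything is dead until proven live via (post-dominance-based) control dependence, which is what made it able to delete whole output-free loops.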
> For my application it is safe to reduce x+0.0 to x. I checked and my current
> set of passes keeps the add. Is there a way to control the floating-point
> model in some way? Lots of compilers can select 'strict', 'precise', 'fast',
> etc. Even adding and then subtracting the same value, which doesn't always
> result in exactly the original value, is almost always safe to eliminate for my
> app...
You can tell the code generator that it is safe to do this (see
llvm/Target/TargetOptions.h), but there isn't currently a way to tell the
optimizers about it. My tentative plan is to split "add" into
"fadd"/"iadd" and then make the "fp" operations have some sort of flags to
parameterize the optimizations allowed for them. This would allow us to
propagate down the equivalent of GCC's -ffast-math into the IR.
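Both hazards are easy to demonstrate in plain C (no LLVM involved; the helper names below are just for illustration). Folding x+0.0 to x changes the sign bit when x is -0.0, and add-then-subtract can round away the original value entirely:

```c
#include <math.h>

/* Would folding x + 0.0 -> x change the observable result?  It does for
 * x = -0.0: under IEEE 754 round-to-nearest, -0.0 + 0.0 == +0.0, so the
 * sign bit flips.  This is why the optimizer must keep the add by default. */
static int add_zero_flips_sign(double x) {
    return signbit(x) != signbit(x + 0.0);
}

/* (x + y) - y is not always x either: when y is much larger than x, the
 * addition rounds x's contribution away, so the round trip loses it. */
static double add_then_sub(double x, double y) {
    return (x + y) - y;
}
```

For example, add_then_sub(1.0, 1e16) returns 0.0 rather than 1.0, because 1e16 + 1.0 rounds back to 1e16 in double precision. A "fast"/"relaxed" mode is precisely a promise from the programmer that such differences don't matter.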
> What's the difference between GVN and GCSE, if they both perform common
> subexpression elimination?
GVN does more, and is a better algorithm. GCSE is basically deprecated
and should be removed at some point.
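As a rough illustration of the common-subexpression-elimination part (a toy local value-numbering sketch, not LLVM's GVN algorithm): identical (opcode, operand, operand) triples are assigned the same value number, so the second occurrence of an expression can reuse the first instead of recomputing it:

```c
#define MAX_VALUES 64

/* One table entry per distinct expression seen so far. */
typedef struct { char op; int lhs, rhs; } Expr;

static Expr vn_table[MAX_VALUES];
static int num_values = 0;

/* Return the value number for (op, lhs, rhs), reusing an existing number
 * when the same expression was already computed.  That reuse is exactly
 * the "common subexpression elimination" a CSE/GVN pass performs. */
static int value_number(char op, int lhs, int rhs) {
    for (int i = 0; i < num_values; i++) {
        if (vn_table[i].op == op && vn_table[i].lhs == lhs &&
            vn_table[i].rhs == rhs)
            return i;                /* CSE hit: reuse the earlier value */
    }
    vn_table[num_values].op = op;
    vn_table[num_values].lhs = lhs;
    vn_table[num_values].rhs = rhs;
    return num_values++;
}
```

GVN "does more" than this in several directions: it reasons across basic blocks, understands loads and stores, and can prove two differently-written expressions equal, not just textually identical ones.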
> So far LLVM is working great for me. I'm impressed by the overall
> architecture and the fair learning curve. I'm generating complex but
> well-optimized code in a matter of days. I haven't found any problems in the
> generated code yet, but I'll let you know as soon as I notice a glitch.
Awesome!
-Chris
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
> Behalf Of Chris Lattner
> Sent: Tuesday, 20 May, 2008 20:28
> To: LLVM Developers Mailing List
> Subject: Re: [LLVMdev] Optimization passes organization and tradeoffs
>
>
> On May 20, 2008, at 8:57 AM, David Greene wrote:
>
>> On Tuesday 20 May 2008 07:03, Nicolas Capens wrote:
>>
>>> 1) Does ScalarReplAggregates totally supersede
>>> PromoteMemoryToRegister? I
>>
>> Nope, they are different. Mem2Reg is really important if you want
>> register
>> allocation.
>
> Actually SROA does fully subsume Mem2Reg. It iterates between breaking
> up aggregates and promoting them to registers.
>
>>> think I need it to optimize small arrays, but what is the expected
>>> added
>>> complexity?
>>
>> I shouldn't think it would be very expensive at all.
>
> Agreed.
>
>>> 2) Does SCCP also eliminate multiplying/dividing by 1 and
>>> adding/subtracting 0?
>>
>> That's probably more the purview of instcombine.
>
> Right.
>
>>> 4) Is DeadStoreElimination still necessary when we have
>>> AggressiveDCE?
>>
>> Probably, but I'll let others give the definitive answer.
>
> They are orthogonal. DSE touches stores, AggressiveDCE does mostly
> scalars. ADCE basically assumes all stores are live.
>
>>> 5) What are the tradeoffs between the different dead code elimination
>>> variants (why not always use the aggressive one)?
>>
>> Others can speak to this.
>
> ADCE is actually quite expensive from the compile-time perspective,
> because it requires building post dominance information. I'd stick
> with a combination of "-dce -simplifycfg" instead of -adce.
>
>>> 6) Is there a better place for CFGSimplification? Should I perform
>>> it at
>>> multiple points?
>>
>> I think once is probably enough. Earlier would probably be better
>> as it will
>> simplify later passes and potentially help them run faster.
>
> I'd suggest running it earlier as you suggest, but then also late to
> clean up. It's a very fast pass.
>
>>> Also, my code will frequently have vectors, that are either
>>> initialized to
>>> all 0.0 or 1.0. This offers a lot of opportunity for eliminating many
>>> multiplications and additions, but I was wondering which passes I
>>> need for
>>> this (probably a Reassociate pass, what else)? And I also wonder
>>> whether
>>> these passes actually work with vectors?
>>
>> I would assume they work with vectors. Anything expression-related
>> is good
>> to capture these opportunities (reassociation, folding, instcombine,
>> etc.).
>
> Note that a lot of optimizations don't apply to floating point
> values. E.g. "x+0.0" can't be eliminated (in general), because
> "-0.0+0.0" is 0.0, not -0.0. All the optimizations should tolerate
> vectors, but may miss certain optimizations on them. If you notice
> something (e.g. reassociation) that isn't being done aggressively with
> vectors, please file a bug. We do a lot of vector optimizations
> though, so we are probably only missing details, not whole classes of
> important optimizations.
>
>>> Is there any other highly recommended pass for this kind of
>>> applications
>>> that I'm forgetting? Any passes that I better avoid due to poor
>>> gains and
>>> long optimization time?
>>
> Another important optimization is GVN, which does common subexpression
> elimination (for example).
>
>>> Sorry for the many question marks. :-) I don't expect there is an
>>> absolute
>>> definite answer to each of them, but some guidelines and insights
>>> would be
>>> very welcome and much appreciated.
>>
>> Phase ordering is one of the trickiest parts to tune. It's often
>> highly
>> code-dependent. If your target sources are limited in number, you
>> might be
>> able to get away with a funky ordering that would be disastrous on
>> general-purpose integer codes, for example. It's often a matter of
>> trial and
>> error. Several people have done research on using genetic
>> algorithms and
>> other tricks to find an optimal phase ordering. A google search
>> should turn
>> up interesting stuff.
>
> "What he said" :). Also, if you see specific problems in the
> generated code, let us know.
>
> -Chris
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
-Chris
--
http://nondot.org/sabre/
http://llvm.org/