[LLVMdev] RFC: Should we have (something like) -extra-vectorizer-passes in -O2?
Hal Finkel
hfinkel at anl.gov
Tue Oct 14 15:50:57 PDT 2014
----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Robin Morisset" <morisset at google.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "James Molloy" <james at jamesmolloy.co.uk>, "LLVM Developers Mailing List"
> <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, October 14, 2014 4:09:11 PM
> Subject: Re: [LLVMdev] RFC: Should we have (something like) -extra-vectorizer-passes in -O2?
>
>
> Yes, this is a much needed feature. [snip] I'm just a bit worried about
> scope creep here is all.
I don't think this is a prerequisite for the particular pass-manager change you're proposing, but I'd like to make sure we're on the same page regarding what should happen in the future.
> I have and will continue to push
> back on trying to add it until we at least get a sane framework for
> our *existing* pass pipelines though.
And this makes me even more convinced that we might not all be on the same page ;) -- I really would like you to clarify exactly what this means, and what it has to do with registration of cleanup passes; or to put it another way, why cleanup-pass registration is not part of bringing further sanity to the configuration.
> It's definitely something I want to see
> happen long-term.
Good.
Thanks again,
Hal
>
>
> On Tue, Oct 14, 2014 at 1:45 PM, Robin Morisset < morisset at google.com
> > wrote:
>
>
>
> For what it is worth, I agree with the usefulness of having a concept
> of "cleanup pass". Another example of a situation where it would be
> nice is in the fence elimination patch I sent for review recently:
> the pass is rather expensive because it relies on several analysis
> passes, and is only useful if AtomicExpand introduced fences. Being
> able to say "Only run this pass if the code was modified by this
> previous pass" would be awesome.
> Yet another example is the EnableAtomicTidy flag on ARM that runs
> simplifyCFG after AtomicExpand to cleanup some of the
> load-linked/store-conditional loops: if no such loops have been
> introduced, the cleanup is unnecessary.
>
>
>
>
> On Tue, Oct 14, 2014 at 12:24 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
>
>
> ----- Original Message -----
> > From: "Andrew Trick" < atrick at apple.com >
> > To: "Chandler Carruth" < chandlerc at google.com >
> > Cc: "James Molloy" < james at jamesmolloy.co.uk >, "LLVM Developers
> > Mailing List" < llvmdev at cs.uiuc.edu >
> > Sent: Tuesday, October 14, 2014 1:21:21 PM
> > Subject: Re: [LLVMdev] RFC: Should we have (something like)
> > -extra-vectorizer-passes in -O2?
> >
> >
> > I’ll summarize your responses as: The new pipeline produces better
> > results than the old, and we currently have no good mechanism for
> > reducing the compile time overhead.
> >
> >
> > I’ll summarize my criticism as: In principle, there are better ways
> > to clean up after the vectorizer without turning it into a
> > complicated megapass, but no one has done the engineering. I don’t
> > think cleaning up after the vectorizer should incur any noticeable
> > overhead if the vectorizer never runs, and it would be avoidable
> > with a sensibly designed passes that aren’t locked into the current
> > pass manager design.
> >
> > I don’t have the data right now to argue against enabling the new
> > pipeline under O2. Hopefully others who care about clang compile
> > time will jump in.
> >
> > As for the long-term plan to improve compile-time, all I can do now
> > is to advocate for a better approach.
>
> Sure, but we should also have a plan ;) -- I think this is important,
> and we should make sure this doesn't get dropped. I don't think that
> the vectorizers are the only thing for which the concept of
> 'cleanup' passes are relevant. The critical piece here is that, if
> we don't need to run a cleanup pass, we also don't need to compute
> any analysis passes on which the pass might depend.
>
> While I'm in favor of the new passes, I also want us to have a plan
> in this area. I understand that the pass manager is being rewritten,
> and maybe putting effort into the old one is not worthwhile now, but
> the new one should certainly have a design compatible with this
> requirement. Chandler?
>
> -Hal
>
>
>
> >
> >
> > -Andy
> >
> >
> >
> >
> >
> > On Oct 14, 2014, at 10:56 AM, Chandler Carruth <
> > chandlerc at google.com
> > > wrote:
> >
> >
> >
> >
> >
> > On Tue, Oct 14, 2014 at 10:11 AM, Andrew Trick < atrick at apple.com >
> > wrote:
> >
> >
> >
> > >> + correlated-propagation
> >
> > A little worried about this.
> >
> > >> + instcombine
> >
> > I'm *very* concerned about rerunning instcombine, but understand it
> > may help cleanup the vectorized preheader.
> >
> >
> >
> > Why are you concerned? Is instcombine that slow? I usually don't
> > see
> > huge overhead from re-running it on nearly-canonical code. (Oh, I
> > see you just replied to Hal here, fair enough.
> >
> >
> >
> >
> > >> + licm
> > >> + loop-unswitch
> >
> > These should limited to the relevant loop nest.
> >
> >
> >
> > We have no way to do that currently. Do you think they will in
> > practice be too slow? If so, why? I would naively expect unswitch
> > to
> > be essentially free unless it can do something, and LICM not much
> > more expensive.
> >
> >
> >
> >
> > >> + simplifycfg
> >
> > OK if the CFG actually changed.
> >
> >
> >
> > Again, we have no mechanism to gate this. Frustratingly, the only
> > thing I want here is to delete dead code formed by earlier passes.
> > We just don't have anything cheaper (and I don't have any
> > measurements indicating we need something cheaper).
> >
> >
> >
> >
> > >> + instcombine
> >
> > instcombine again! This can’t be good.
> >
> >
> >
> > I actually have no specific reason to think we need this other than
> > the fact that we run instcombine after simplifycfg in a bunch of
> > other places. If you're looking for one to rip out, this would be
> > the first one I would rip out because I'm doubtful of its value.
> >
> >
> >
> > On a separate note:
> >
> >
> >
> >
> >
> > >> + early-cse
> >
> > Passes like loop-vectorize should be able to do their own CSE
> > without
> > much engineering effort.
> >
> > >> slp-vectorize
> > >> + early-cse
> >
> > SLP should do its own CSE.
> >
> > I actually agree with you in principle, but I would rather run the
> > pass now (and avoid hacks downstream to essentially do CSE in the
> > backend) than hold up progress on the hope of advanced on-demand
> > CSE
> > layers being added to the vectorizers. I don't know of anyone
> > actually working on that, and so I'm somewhat concerned it will
> > never materialize.
> >
>
>
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
More information about the llvm-dev
mailing list