[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

Mon Jul 29 16:24:19 PDT 2013

----- Original Message -----
> 
> On Jul 27, 2013, at 5:47 PM, Shuxin Yang <shuxin.llvm at gmail.com>
> wrote:
> 
> > Hi, Sean:
> > 
> >   I'm sorry, I lied. I didn't mean to. I did try to avoid making a
> > *BIG* change to the IPO pass ordering for now. However, when I made a
> > minor change to populateLTOPassManager() by separating module passes
> > from non-module passes, I saw quite a few performance differences,
> > most of them degradations. Attacking these degradations one by one in
> > a piecemeal manner is a waste of time. We might as well define the
> > pass ordering for the Pre-IPO, IPO and Post-IPO phases now, hopefully
> > once and for all.
> >    
> >  To repair my image as a liar, I am posting some preliminary results
> > on this cozy Saturday afternoon, which I normally devote to
> > daydreaming :-)
> > 
> >  So far I have only measured the MultiSource benchmarks on my iMac
> > (late 2012 model). The command to run the benchmarks is
> >  "make TEST=simple report OPTFLAGS='-O3 -flto'".
> > 
> >  In terms of execution time, some benchmarks degrade but more improve,
> > and a few of the changes are quite substantial. User time is used for
> > comparison. I measured the results twice, and they are very stable. As
> > far as I can tell from the results, the proposed pass ordering is a
> > change in the right direction.
> > 
> >  Interestingly, if I combine populatePreIPOPassMgr() as the Pre-IPO
> > phase (see the patch) with the original populateLTOPassManager() for
> > both IPO and Post-IPO, I see a significant improvement in
> > "Benchmarks/Trimaran/netbench-crc/netbench-crc"
> > (about 94%, 0.5665s before vs. 0.0295s after). As I write this mail, I
> > have not yet had a chance to figure out why this combination improves
> > this benchmark so much.
> > 
> >  In terms of compile time, the results report that my change improves
> > compile time by about 2x, which is nonsense. I guess the test script
> > doesn't count link time.
> > 
> >   The new pass orderings for Pre-IPO, IPO, and Post-IPO are defined by
> > populate{PreIPO|IPO|PostIPO}PassMgr().
> > 
> >   I will talk with Andy next Monday to make sure this is consistent
> > with the pass-ordering design he envisions, then measure more
> > benchmarks and post the patch and results to the community for
> > discussion and approval.
> > 
> > Thanks
> > Shuxin
> 
> I don't have any objection to this as long as your compile times are
> comparable.
> 
> The major differences that I could spot are:
> 
> You've moved the second iteration of some scalar opts into post-IPO:
> - JumpThreading
> - CorrelatedValueProp
> 
> You no longer run InstCombine after the first round of scalar opts
> (in preIPO) and before the second round (in PostIPO).
> 
> You now have an extra (3rd) SROA in PostIPO.
> 
> I don't see a problem, but I'd like to understand the rationale. I
> think it would be valuable to capture some of the motivation behind
> the standard pass ordering and any changes we make to it. Sometimes
> part of the design becomes obsolete but no one can be sure.

Out of curiosity, has anyone tried to optimize the pass ordering in some (quasi-)automated way? Naively, a genetic algorithm seems like a perfect fit for this.
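
To make the idea concrete, here is a minimal sketch of what such a search could look like. Everything in it is an assumption for illustration: the pass names are plain strings, each pass appears exactly once, and toy_cost() is a made-up stand-in for "build with this ordering and time the benchmark". It is not tied to any LLVM API.

```python
import random

# Illustrative pass names only; plain strings, not LLVM objects.
PASSES = ["SROA", "InstCombine", "JumpThreading",
          "CorrelatedValueProp", "GVN", "SimplifyCFG"]


def toy_cost(order):
    """Hypothetical stand-in for compiling and timing a benchmark.

    Counts pairs of passes that are inverted relative to PASSES, so the
    list PASSES itself has cost 0. A real search would measure run time.
    """
    rank = {p: k for k, p in enumerate(PASSES)}
    return sum(1
               for x in range(len(order))
               for y in range(x + 1, len(order))
               if rank[order[x]] > rank[order[y]])


def order_crossover(a, b, rng):
    """Keep a random slice of `a` in place; fill the rest in `b`'s order."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    kept = a[i:j]
    rest = [p for p in b if p not in kept]
    return rest[:i] + kept + rest[i:]


def mutate(order, rng, rate=0.2):
    """With probability `rate`, swap two positions in a copy of `order`."""
    order = list(order)
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order


def evolve(passes, cost, generations=50, pop_size=20, seed=0):
    """Return the lowest-cost ordering found; the best half always survives."""
    rng = random.Random(seed)
    population = [rng.sample(passes, len(passes)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(order_crossover(a, b, rng), rng))
        population = survivors + children
    return min(population, key=cost)


if __name__ == "__main__":
    best = evolve(PASSES, toy_cost)
    print(best, toy_cost(best))
```

In practice the expensive part would be the cost function, since every evaluation means a full build and benchmark run, and real pipelines repeat passes, which this permutation-only encoding does not model.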

 -Hal

> Shall we
> start a new doc under LLVM subsystems?
> 
> -Andy
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory