[LLVMdev] Instruction Scheduling
Evan Cheng
evan.cheng at apple.com
Mon Mar 3 00:30:14 PST 2008
It's hard to say. As I remember, -pre-RA-sched=none (when it used to
exist) did a depth-first traversal of the DAG and translated the
nodes in that order. It's not particularly good at anything.
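
As a rough illustration of that kind of ordering (this is not the code LLVM
had, which isn't shown here), a depth-first walk over a small DAG can be
sketched as below. The dictionary layout, the node names, and the choice to
emit operands before their users (post-order) are all assumptions made for
the example.

```python
def dfs_schedule(operands, root):
    """Return nodes in depth-first post-order, starting from `root`.

    `operands` maps a node name to the list of nodes it uses.
    Post-order is an assumption here: it guarantees that every
    operand is emitted before the node that consumes it.
    """
    order = []
    visited = set()

    def visit(node):
        if node in visited:
            return  # shared DAG nodes are emitted only once
        visited.add(node)
        for op in operands.get(node, ()):
            visit(op)
        order.append(node)

    visit(root)
    return order


# Hypothetical DAG: store (load_a + load_b) through a shared address node.
dag = {
    "store": ["add", "addr"],
    "add": ["load_a", "load_b"],
    "load_a": ["addr"],
    "load_b": ["addr"],
}
order = dfs_schedule(dag, "store")
print(order)
```

Nothing in this walk looks at register pressure or latency; the order falls
out of how the operands happen to be listed.
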
I assume your target of choice is x86. In that case, yes, the default
is burr. On modern x86 CPUs, it's far more important to avoid
register spills / restores. Scheduling for latency before register
allocation hasn't proven to be a win. Benchmarking x86 is very, very
tricky. There are many cases where obviously better code ended up
being slower. Hidden hazards like loop alignment, instructions
crossing the instruction dispatch buffer, etc. are very hard to
model. If the scheduler ended up reducing the number of instructions
(and loads and stores), then it's doing its job. It's probably more
important to look at those than at the actual runtime.
Also, not all x86 CPUs perform the same. Are you seeing these
results on a current generation of x86 CPUs? Are you using the latest
llvm release (I guess not, since -pre-RA-sched=none is gone)?
Evan
On Feb 29, 2008, at 10:52 PM, Fernando Magno Quintao Pereira wrote:
>
> Hi, guys,
>
> I am comparing the performance of the default scheduler (which seems
> to be the one that minimizes register pressure) with no scheduler
> (-pre-RA-sched=none), and I got these numbers. The ratio is
> low_reg_pressure/none, that is, the lower the number, the better the
> performance with low register pressure:
>
> CFP2000/177.mesa/177.mesa         1.00
> CFP2000/179.art/179.art           0.98
> CFP2000/183.equake/183.equake     1.00
> CFP2000/188.ammp/188.ammp         0.98
> CINT2000/164.gzip/164.gzip        0.97
> CINT2000/175.vpr/175.vpr          0.97
> CINT2000/176.gcc/176.gcc          n/a   // crashed!
> CINT2000/181.mcf/181.mcf          1.02
> CINT2000/186.crafty/186.crafty    1.00
> CINT2000/197.parser/197.parser    1.01
> CINT2000/252.eon/252.eon          n/a   // never runs
> CINT2000/253.perlbmk/253.perlbmk  1.05
> CINT2000/254.gap/254.gap          0.97
> CINT2000/255.vortex/255.vortex    1.00
> CINT2000/256.bzip2/256.bzip2      0.98
> CINT2000/300.twolf/300.twolf      0.92
>
> In three cases, I got a ratio above 1 [Must mean: scheduling had a
> negative impact on performance.] I ran it only once, but I was
> wondering if this could make sense, or if I am setting up the tests
> incorrectly. I am running the nightly test Makefile on a 32-bit x86
> Linux machine.
>
> best,
>
> Fernando
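
One hedged way to summarize the ratios quoted above is a geometric mean over
the benchmarks that produced a number, plus a count of the ones above 1. The
sketch below copies the 14 numeric ratios from the table (176.gcc and 252.eon
are left out because they have no result). The variable names are made up for
the illustration, and with a single run per benchmark, differences of a few
percent may well be noise.

```python
import math

# Ratios low_reg_pressure/none copied from the quoted table;
# the two "n/a" entries (176.gcc, 252.eon) are omitted.
ratios = {
    "177.mesa": 1.00, "179.art": 0.98, "183.equake": 1.00,
    "188.ammp": 0.98, "164.gzip": 0.97, "175.vpr": 0.97,
    "181.mcf": 1.02, "186.crafty": 1.00, "197.parser": 1.01,
    "253.perlbmk": 1.05, "254.gap": 0.97, "255.vortex": 1.00,
    "256.bzip2": 0.98, "300.twolf": 0.92,
}

# Geometric mean: exp of the average of the logs. It is the usual
# choice for averaging ratios because it treats 0.5 and 2.0 symmetrically.
geomean = math.exp(sum(math.log(r) for r in ratios.values()) / len(ratios))

# Benchmarks where the low-register-pressure scheduler was slower.
slower = sorted(name for name, r in ratios.items() if r > 1.0)

print(f"geometric mean ratio: {geomean:.3f}")
print(f"ratio above 1: {slower}")
```

A geometric mean below 1 would mean the default scheduler comes out slightly
ahead on these numbers overall, even with a few individual ratios above 1.
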