[llvm-commits] [patch] Change how clang produces computed gotos

Mon Jun 6 10:57:40 PDT 2011

I tried compiling Firefox with both clang and Apple's gcc. The 
performance was close. Profiling with Shark showed that the difference 
was that the clang-compiled build was spending 2x as much time in the JS 
interpreter.

There seems to be a register allocator problem that I hope to report 
shortly, but another problem is the way the computed gotos are produced.

Currently clang produces a branch to a common basic block that holds the 
only indirectbr in the function. This is normally optimized away, but the 
optimizers are not kicking in for jsinterp.o, and it has a single 
'jmp *%rax' in the generated binary.

The main reason for the status quo, I think, is a gradual evolution from 
the days when there was no indirectbr in LLVM. Producing n indirectbr 
instructions does increase the size of the CFG, but I think the results 
are worth it:

jsinterp.o goes from 1876288 to 1684448 bytes. The dromaeo score of 
Firefox goes from 1628.59 runs/s to 1674.68 runs/s. For comparison, gcc 
is 1667.49 runs/s. This is the first time I have gotten Firefox to go 
faster when compiled with clang :-)
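
For anyone who has not looked at the two shapes side by side, here is a 
rough source-level model in GNU C (labels as values, so gcc or clang 
only). It is not taken from the patch or from jsinterp: the three-opcode 
bytecode, the function names and the NEXT() macro are all made up for 
illustration. run_shared() funnels every handler through one block with 
a single indirect branch, and run_replicated() ends every handler with 
its own indirect branch. The optimizers are free to turn one shape into 
the other, so this only shows the intent, not the final code.

```c
#include <assert.h>

/* Hypothetical three-opcode bytecode, for illustration only. */
enum { OP_INC, OP_DEC, OP_HALT };

/* Shape 1: all handlers jump back to one shared block that contains
   the only indirect branch in the function. */
int run_shared(const unsigned char *pc) {
    static void *const table[] = { &&do_inc, &&do_dec, &&do_halt };
    int acc = 0;
    void *target;

dispatch:
    assert(*pc <= OP_HALT);      /* reject unknown opcodes */
    target = table[*pc++];
    goto *target;                /* the single indirect branch */

do_inc:
    acc++;
    goto dispatch;
do_dec:
    acc--;
    goto dispatch;
do_halt:
    return acc;
}

/* Shape 2: every handler ends with its own indirect branch. */
#define NEXT() do { assert(*pc <= OP_HALT); goto *table[*pc++]; } while (0)

int run_replicated(const unsigned char *pc) {
    static void *const table[] = { &&do_inc, &&do_dec, &&do_halt };
    int acc = 0;

    NEXT();                      /* initial dispatch */
do_inc:
    acc++;
    NEXT();                      /* one indirect branch per handler */
do_dec:
    acc--;
    NEXT();
do_halt:
    return acc;
}
```

Both functions compute the same result for the same program. The second 
shape gives each handler its own branch site, which is the usual 
argument for why it can predict better in an interpreter loop.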

Build time on files that use computed goto does suffer. jsinterp.o with 
-O3 -g goes from 63s to 140s. Given that this is one of only two files 
in all of Mozilla that use it, and that it is really performance 
critical, I think it is worth it.

A question about indirectbr: it has a list of every label it could branch 
to. It doesn't look like we optimize that list, and we avoid inlining 
functions that use computed goto. Wouldn't it be better to make the list 
implicit: every label in this function that has its address taken?

Cheers,
Rafael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: indirect.patch
Type: text/x-patch
Size: 6342 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110606/908b8175/attachment.bin>