[cfe-dev] Compilation benchmark: bzip2

Sun Dec 23 02:28:45 PST 2007

Chris Lattner wrote:
> On Dec 22, 2007, at 8:49 PM, Sanghyeon Seo wrote:
> 
>> I decided to measure clang's performance by compiling bzip2.
>> bzip2 is available from http://www.bzip.org/
>> Makefile used for benchmark is here:
>> http://sparcs.kaist.ac.kr/~tinuviel/devel/llvm/Makefile.bzip2
> 
> Cool
> 
>> Result (Minimum of 3):
>>
>> tinuviel@debian:~/clang$ tar zxf src/bzip2-1.0.4.tar.gz
>> tinuviel@debian:~/clang$ cp make/Makefile.bzip2 bzip2-1.0.4/Makefile
>> tinuviel@debian:~/clang$ cd bzip2-1.0.4
>>
>> tinuviel@debian:~/clang/bzip2-1.0.4$ make -s
>>
>> GCC: real 0.613s user 0.556s sys 0.052s
>> tcc: real 0.046s user 0.028s sys 0.012s
>> GCC -S: real 0.555s user 0.488s sys 0.052s
>> clang: real 0.298s user 0.248s sys 0.040s
>> clang+llvm-as: real 0.636s user 0.576s sys 0.048s
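The post does not show how the "Minimum of 3" figures were collected. The following is a minimal sketch of one way to take such a measurement, assuming wall-clock time is enough (it does not separate user and sys time the way the shell's time does). The helper name best_of and the stand-in command are hypothetical, not part of the original benchmark.

```python
import subprocess
import sys
import time


def best_of(cmd, repeats=3):
    """Run cmd `repeats` times and return the fastest wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails,
        # so a broken build is never recorded as a fast one.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return min(samples)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; for the benchmark
    # this would be something like ["make", "-s"] in the bzip2 directory.
    cmd = [sys.executable, "-c", "pass"]
    print(f"best of 3: {best_of(cmd):.3f}s")
```

Taking the minimum rather than the mean filters out runs disturbed by other system activity. For a real build, the object files would have to be removed between repetitions, otherwise later runs have nothing left to compile.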
> 
> Just so I understand what is going on here:
>    "GCC" -> "gcc -O0 -c"
>    "GCC -S" -> "gcc -O0 -S"
>    "tcc" -> "tcc -c"
>    "clang" -> "clang -emit-llvm"
>    "clang+llvm-as" -> "clang -emit-llvm | llvm-as"
> 
> These are interesting numbers, but not very relevant.  GCC -S is doing  
> a *lot* more than clang -emit-llvm.  To get a useful comparison  
> between gcc vs clang codegen, you'd need to link the llvm code  
> generator into clang to get it to emit a native .s file.  Likewise, if  
> you want "clang emission of llvm bytecode", you should link the  
> bytecode writer into clang, instead of using llvm-as (which is  
> obviously not very fast).  A really rough functional approximation  
> would be "clang -emit-llvm | llvm-as | llc -fast -regalloc=local", but  
> this will obviously be much slower than linking llc components into  
> clang.
> 
> To me, the one interesting thing out of this is that the difference  
> between gcc -c and gcc -S is ~14%.  That's a pretty big cost just for  
> an assembler.  Maybe someone should work on finishing the llvm .o file  
> emitter at some point ;-).

Darn. I was hoping it was further along. I was planning on using it. :-)
How much work do you think is left?

> 
>> clang+llvm-as spent about half the time in assembler.
>> gcc spent less than 10% time in assembler.
> 
> Right, but these assemblers are not the same thing at all :)

Indeed!
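For reference, the percentages quoted in this thread can be recomputed from the posted timings. This is a rough sketch, assuming the assembler's cost is approximated by the difference between the two rows of each pair; the function names are made up for illustration.

```python
# Timings in seconds, copied from the table in the original post.
TIMINGS = {
    "GCC": {"real": 0.613, "user": 0.556},
    "GCC -S": {"real": 0.555, "user": 0.488},
    "clang": {"real": 0.298, "user": 0.248},
    "clang+llvm-as": {"real": 0.636, "user": 0.576},
}


def stage_share(full, partial, clock="real"):
    """Fraction of the full pipeline's time not covered by the partial one."""
    total = TIMINGS[full][clock]
    return (total - TIMINGS[partial][clock]) / total


def stage_overhead(full, partial, clock="user"):
    """Extra time of the full pipeline, relative to the partial one."""
    base = TIMINGS[partial][clock]
    return (TIMINGS[full][clock] - base) / base


if __name__ == "__main__":
    # Share of wall-clock time spent after the compiler proper.
    print(f"gcc assembler share:     {stage_share('GCC', 'GCC -S'):.1%}")
    print(f"llvm-as share:           {stage_share('clang+llvm-as', 'clang'):.1%}")
    # Cost of gcc -c over gcc -S, measured in user time.
    print(f"gcc -c over gcc -S:      {stage_overhead('GCC', 'GCC -S'):.1%}")
```

On real time, the gcc assembler step comes to a little under 10% of the total and llvm-as to a little over half. The ~14% figure matches the user-time difference taken relative to gcc -S.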

> 
> At this point, llvm -O0 code generation is not nearly as fast as it  
> should be.  That said, our -O2 or -O3 codegen is a lot faster than GCC  
> in most cases.  In the future, we'll put more effort into making -O0  
> codegen fast.
> 

Speed is a Really Good Thing, of course. That's one reason why I'm 
writing the ellsif driver (getting everything done in in-memory passes).

Having said that, in the *long* time I've worked in environments 
involving people who use compilers (I've been both a compiler vendor and 
a compiler user), I've heard only one complaint about compilation speed, 
and that involved linking. That complaint was from some guys using my linker 
on an early nineties workstation (either Sparc or hppa, I can't remember 
which). Systems are a bit faster today. ;-) People are much more 
concerned about compiler correctness, informative error messages, and 
debug capability. I cringe every time I get one of those g++ overload 
messages that force me to expand my window to 300 columns just to read 
it. ;-)

-Rich