[LLVMdev] current llvm vs. current gcc code size

regehr regehr at cs.utah.edu
Wed Aug 20 20:50:28 PDT 2008


This is a bit random but perhaps interesting:

Attached is a histogram of code sizes for about 50,000 random C programs
compiled by recent versions of llvm and gcc.  The x-axis is bytes in the
text segment and the y-axis is the number of programs in each 100-byte
bucket.
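
As a rough sketch of the bucketing (not the script used for the attached
plot), each text-segment size can be mapped to the lower edge of its
100-byte bucket and counted.  The function name and the sample sizes are
made up for illustration.

```python
from collections import Counter

BUCKET_BYTES = 100  # bucket width stated in the post


def bucket_counts(text_sizes, width=BUCKET_BYTES):
    """Count programs per bucket, keyed by the bucket's lower edge in bytes."""
    return Counter((size // width) * width for size in text_sizes)


# Hypothetical sizes, for illustration only (not data from the experiment).
sample_sizes = [812, 845, 899, 900, 1510]
for lower_edge, count in sorted(bucket_counts(sample_sizes).items()):
    print(f"{lower_edge}-{lower_edge + BUCKET_BYTES - 1}: {count}")
```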

Code size for a program is taken to be the smallest code size across 
-O0, -O1, -O2, -O3, and -Os, for a given compiler.
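
A minimal sketch of that measurement, assuming a gcc-compatible compiler
driver and the GNU binutils `size` tool are on PATH.  The helper names
are hypothetical, and the actual scripts behind the histogram may well
measure the text segment differently.

```python
import os
import subprocess
import tempfile

OPT_FLAGS = ["-O0", "-O1", "-O2", "-O3", "-Os"]  # levels listed in the post


def text_size(compiler, flag, source_path):
    """Compile one file at one optimization level and return its text size.

    Assumption: `compiler` accepts gcc-style flags and `size` is GNU size.
    """
    with tempfile.TemporaryDirectory() as tmp:
        exe = os.path.join(tmp, "a.out")
        subprocess.run([compiler, flag, source_path, "-o", exe], check=True)
        out = subprocess.run(["size", exe], check=True,
                             capture_output=True, text=True).stdout
    # Default (Berkeley) output: a header line, then a line whose first
    # field is the text size in bytes.
    return int(out.splitlines()[1].split()[0])


def smallest_size(sizes_by_flag):
    """Return (flag, size) for the smallest text size among the levels."""
    flag = min(sizes_by_flag, key=sizes_by_flag.get)
    return flag, sizes_by_flag[flag]


def code_size(compiler, source_path):
    """Code size of a program for one compiler: the minimum over OPT_FLAGS."""
    sizes = {flag: text_size(compiler, flag, source_path)
             for flag in OPT_FLAGS}
    return smallest_size(sizes)[1]
```

Taking the minimum means each compiler is credited with its best result
for a program, whichever flag produced it.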

I don't know why each histogram has two peaks, or why llvm's left-hand
peak is at a lower code size than gcc's left-hand peak.  Perhaps the
left-hand peaks correspond to trivial random programs that are optimized
away completely, in which case llvm's lower peak would just reflect a
more concise crt0.

Anyway, the interesting effect here is the rightward shift of the main
bulk of the binaries produced by llvm relative to those produced by gcc.
This would seem to indicate that there are optimization opportunities
being missed by llvm.  Ideally, there would be some automatic way to
identify these opportunities, perhaps using delta-debugging techniques.
If this seems interesting to people I can look into it more...
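
To make the delta-debugging idea concrete, here is a simplified
ddmin-style sketch (complement removal only, not the full algorithm).
It shrinks a list of program chunks while a caller-supplied predicate
keeps holding.  In this setting the predicate would be something like
"llvm's output is still much larger than gcc's", but that predicate and
the chunking of the program are assumptions left to the caller.

```python
def reduce_chunks(chunks, still_interesting):
    """Drop chunks while `still_interesting` holds for what remains.

    Returns a list from which no single chunk can be removed without
    making the predicate fail.
    """
    assert still_interesting(chunks), "input must satisfy the predicate"
    n = 2  # number of pieces the list is currently split into
    while len(chunks) >= 2:
        size = max(1, len(chunks) // n)
        reduced = False
        for start in range(0, len(chunks), size):
            # Try the input with one piece removed.
            candidate = chunks[:start] + chunks[start + size:]
            if candidate and still_interesting(candidate):
                chunks = candidate
                n = max(n - 1, 2)  # coarsen again after a success
                reduced = True
                break
        if not reduced:
            if size == 1:
                break  # no single chunk can be removed
            n = min(n * 2, len(chunks))  # refine the split
    return chunks


# Stand-in predicate for illustration: chunks 3 and 7 together are what
# make the input "interesting".
result = reduce_chunks(list(range(10)), lambda c: 3 in c and 7 in c)
print(result)
```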

This is targeting x86/Linux.

John
-------------- next part --------------
A non-text attachment was scrubbed...
Name: size.png
Type: image/png
Size: 4094 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080820/1e376bba/attachment.png>

