[LLVMdev] llvm-gcc -O0 compile times

Chris Lattner clattner at apple.com
Sat Jun 21 14:33:02 PDT 2008

I've started investigating -O0 -g compile times with llvm-gcc, which  
are pretty important for people in development mode (e.g. all debug  
builds of llvm itself!).

I've found some interesting things.  I'm testing with mainline as of
r52596 in a Release build and with checking disabled in the front-end.
My testcase is a large C++ source file: my friend
InstructionCombining.cpp.  I build it the normal way we build it in
debug mode, but with the output redirected to /dev/null, which is:

time llvm-g++ -I/Users/sabre/llvm/include \
  -I/Users/sabre/llvm/lib/Transforms/Scalar \
  -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -g -fno-exceptions \
  -Woverloaded-virtual -pedantic -Wall -W -Wwrite-strings \
  -Wno-long-long -Wunused -Wno-unused-parameter -c -MMD -MP \
  -MF "/Users/sabre/llvm/lib/Transforms/Scalar/Debug/InstructionCombining.d.tmp" \
  -MT "InstructionCombining.lo" \
  -MT "/Users/sabre/llvm/lib/Transforms/Scalar/Debug/InstructionCombining.o" \
  -MT "/Users/sabre/llvm/lib/Transforms/Scalar/Debug/InstructionCombining.d" \
  InstructionCombining.cpp -o /dev/
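
For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a timing harness. The helper name time_command and the best-of-N policy are my own assumptions, not something used for the numbers in this mail, and it only measures wall-clock time (the shell's time also reports user and system time).

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return the best wall-clock time in seconds.

    Hypothetical helper: taking the minimum is one common way to reduce
    noise from other activity on the machine.
    """
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken build is
        # never silently timed.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere; substitute the
    # compiler invocation being measured.
    placeholder = [sys.executable, "-c", "pass"]
    print(f"best of 3: {time_command(placeholder):.2f}s")
```

Passing the full compiler command line as a list of arguments in place of the placeholder gives one number per configuration to compare.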

One thing that is interesting is that we are significantly slower than  
g++-4.2 on this testcase.  I'm seeing these timings:

GCC 4.2 -c: 4.27s
GCC 4.2 -S: 3.59s
LLVM4.2 -c: 9.30s
LLVM4.2 -S: 8.40s

One thing I noticed is that with llvm-gcc, the assembler is taking
longer than with gcc 4.2 (0.9s vs 0.68s).  This turns out to be because
we produce much larger output than GCC does:

gcc.s  -> 8943786
llvm.s -> 13424378
gcc.o  -> 2055892
llvm.o -> 3044512
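
The arithmetic behind these numbers can be sketched as follows. This assumes that the assembler time is approximated by the difference between the -c and -S timings (which is consistent with the 0.9s and 0.68s figures); the dictionary layout and function names are just for illustration.

```python
# Timings (seconds) and output sizes (bytes) copied from the figures above.
timings = {
    "gcc":  {"-c": 4.27, "-S": 3.59},
    "llvm": {"-c": 9.30, "-S": 8.40},
}
sizes = {
    "gcc":  {".s": 8943786, ".o": 2055892},
    "llvm": {".s": 13424378, ".o": 3044512},
}


def assembler_time(compiler):
    # -c runs the compiler and then the assembler, -S stops after emitting
    # assembly, so the difference approximates the time spent assembling.
    t = timings[compiler]
    return round(t["-c"] - t["-S"], 2)


def ratio(a, b):
    return round(a / b, 2)


for mode in ("-c", "-S"):
    print(mode, "slowdown:", ratio(timings["llvm"][mode], timings["gcc"][mode]))
for compiler in ("gcc", "llvm"):
    print(compiler, "assembler time:", assembler_time(compiler))
for ext in (".s", ".o"):
    print(ext, "size ratio:", ratio(sizes["llvm"][ext], sizes["gcc"][ext]))
```

On these inputs the llvm-gcc output comes out at roughly 1.5x the size of the GCC output for both the .s and the .o file.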

Why is this? Let's look at the contents:

$ sdiff -w 120 gcc.size llvm.size
Segment : 1495968                                 |  Segment : 2211617
    Section (__TEXT, __text): 251661              |      Section (__TEXT, __text):
    Section (__DWARF, __debug_frame): 82752       |      Section (__DWARF, __debug_frame): 80240
    Section (__DWARF, __debug_info): 671478       |      Section (__DWARF, __debug_info): 1240778
    Section (__DWARF, __debug_abbrev): 3241       |      Section (__DWARF, __debug_abbrev): 1535
    Section (__DWARF, __debug_aranges): 48        |      Section (__DWARF, __debug_aranges): 0
    Section (__DWARF, __debug_macinfo): 0                Section (__DWARF, __debug_macinfo): 0
    Section (__DWARF, __debug_line): 126106       |      Section (__DWARF, __debug_line): 149797
    Section (__DWARF, __debug_loc): 0                    Section (__DWARF, __debug_loc): 0
    Section (__DWARF, __debug_pubnames): 168873   |      Section (__DWARF, __debug_pubnames): 165104
    Section (__DWARF, __debug_pubtypes): 32449    |
    Section (__DWARF, __debug_str): 17541         |      Section (__DWARF, __debug_str): 0
    Section (__DWARF, __debug_ranges): 456        |      Section (__DWARF, __debug_ranges): 0
    Section (__DATA, __const): 100                |      Section (__DATA, __const): 136
    Section (__TEXT, __cstring): 11543            |      Section (__TEXT, __cstring): 12678
    Section (__DATA, __data): 64                  |      Section (__DATA, __data): 76
    Section (__DATA, __const_coal): 48            |
    Section (__TEXT, __const_coal): 128           |
    Section (__DATA, __mod_init_func): 4          |      Section (__DATA, __mod_init_func): 4
    Section (__DATA, __bss): 32                   |      Section (__DATA, __bss): 65
    Section (__TEXT, __textcoal_nt): 116324       |      Section (__TEXT, __textcoal_nt): 168920
    Section (__TEXT, __literal8): 8               |      Section (__TEXT, __eh_frame):
    Section (__TEXT, __StaticInit): 147           |      Section (__TEXT, __StaticInit): 166
    Section (__IMPORT, __jump_table): 12790       |      Section (__IMPORT, __jump_table): 12410
    Section (__IMPORT, __pointers): 136           |      Section (__IMPORT, __pointers): 128
    total 1495929                                 |      total 2211546
total 1495968                                     |  total 2211617

There are several problems here:

1. We're emitting __eh_frame even though it is being built with
-fno-exceptions: http://llvm.org/PR2481.  Just the excess labels alone give
the assembler a lot more work to do.
2. The __debug_info section is twice as big and the __debug_line  
section is a bit bigger: http://llvm.org/PR2482
3. We aren't outputting text or data __const_coal sections.  I'm not  
sure what these are, but they seem preferable to __textcoal_nt: http://llvm.org/PR2483

Also, we have no __debug_pubtypes, __debug_aranges, __debug_str, or
__debug_ranges sections.  I have no idea what these are, but this could
be a problem :)
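
To see which sections account for the growth, the sizes from the listing can be tallied as in the sketch below. It only includes the larger sections that have a legible size on both sides; the rows whose llvm value is cut off in the listing (__text and __eh_frame) are left out rather than guessed. The names sections, growth and total_growth are mine.

```python
# Section sizes in bytes as (gcc, llvm), copied from the sdiff listing.
# Rows with a truncated llvm value (__text, __eh_frame) are omitted.
sections = {
    "__debug_frame": (82752, 80240),
    "__debug_info": (671478, 1240778),
    "__debug_abbrev": (3241, 1535),
    "__debug_line": (126106, 149797),
    "__debug_pubnames": (168873, 165104),
    "__cstring": (11543, 12678),
    "__textcoal_nt": (116324, 168920),
    "__jump_table": (12790, 12410),
}
SEGMENT_GCC, SEGMENT_LLVM = 1495968, 2211617


def growth(name):
    """Bytes by which the llvm-gcc section exceeds the GCC one."""
    gcc, llvm = sections[name]
    return llvm - gcc


total_growth = SEGMENT_LLVM - SEGMENT_GCC

# Largest growth first, with each section's share of the segment growth.
for name in sorted(sections, key=growth, reverse=True):
    gcc, llvm = sections[name]
    share = 100.0 * growth(name) / total_growth
    print(f"{name:18} {growth(name):+8d} bytes  x{llvm / gcc:.2f}  "
          f"{share:5.1f}% of segment growth")
```

Sorted this way, __debug_info comes out on top by a wide margin, followed by __textcoal_nt and __debug_line.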

Fixing these is important for a couple of reasons.  Generating more
output takes more time, both in the assembler and in the compiler
to push all this around.

Moving up from the assembler, according to -ftime-report, our time in  
cc1plus is basically going into:

LLVM Passes:
   2.65s -> X86 DAG->DAG Instruction Selection (all selectiondag stuff)
   0.54s -> X86 AT&T-Style Assembly Printer
   0.42s -> Live Variable Analysis
   0.19s -> Local Register Allocator

C++ Front-end time:
   - 2.22s Tree to LLVM translator
   - 1.94s parser
   - 2.07s name lookup
   - 0.66s preprocessor
   - 0.20s gimplify

This doesn't add up to 8.4s because -ftime-report adds significant  
overhead.  It isn't to be trusted, but is a decent indicator.
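
As a quick check of the mismatch, the sketch below sums the -ftime-report entries listed above and compares them with the measured 8.40s for -S. The variable and function names are only for illustration.

```python
# Seconds per phase, copied from the -ftime-report figures above.
llvm_passes = {
    "X86 DAG->DAG Instruction Selection": 2.65,
    "X86 AT&T-Style Assembly Printer": 0.54,
    "Live Variable Analysis": 0.42,
    "Local Register Allocator": 0.19,
}
front_end = {
    "Tree to LLVM translator": 2.22,
    "parser": 1.94,
    "name lookup": 2.07,
    "preprocessor": 0.66,
    "gimplify": 0.20,
}
MEASURED_S_TIME = 8.40  # the "LLVM4.2 -S" timing without -ftime-report


def total(bucket):
    # Round to hundredths to hide floating-point noise in the sum.
    return round(sum(bucket.values()), 2)


reported = round(total(llvm_passes) + total(front_end), 2)
print("LLVM passes:", total(llvm_passes))
print("front-end:  ", total(front_end))
print("reported:   ", reported)
print("over measured:", round(reported - MEASURED_S_TIME, 2))
```

The listed phases alone sum to more than the measured wall time, which fits the point that the report's own overhead inflates the numbers.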

From this, it looks like there is significant room for improvement in
many of the LLVM pieces.  The two that stick out are the tree to llvm
translator and the selection dag related stuff.  However, even the  
asmprinter is taking a significant amount of time.  This is partially  
because it has to output a ton of stuff, but even then it could be  

For example, picking on the frontend for a bit, we spend 10% of
"-emit-llvm -O0 -g -c" time in DebugInfo::EmitFunctionStart, most of which is
spent recursively walking the debug info with DISerializer.  We also  
spend 9.3% of the time in DebugInfo::EmitDeclare, 10% of the time in  
eraseLocalLLVMValues, 12% of the time writing the .bc file (which  
isn't relevant to normal use), 21% of time parsing (which we can't  

Anyone interested in picking off a piece and tackling it?

