[LLVMbugs] [Bug 4252] New: Compiling interp.i takes 54 seconds: 78% time spent in Simple Register Coalescing

bugzilla-daemon at cs.uiuc.edu bugzilla-daemon at cs.uiuc.edu
Sat May 23 05:42:42 PDT 2009


http://llvm.org/bugs/show_bug.cgi?id=4252

           Summary: Compiling interp.i takes 54 seconds: 78% time spent in
                    Simple Register Coalescing
           Product: new-bugs
           Version: unspecified
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: new bugs
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: edwintorok at gmail.com
                CC: llvmbugs at cs.uiuc.edu


Created an attachment (id=3017)
 --> (http://llvm.org/bugs/attachment.cgi?id=3017)
llc interp.bc

Compiling attached file takes more than 54 seconds with clang -O.
Compiling same file with gcc takes 0.002s.

You can reproduce by:
$ llc -time-passes interp.bc

Or by running clang -O:
$ clang interp.i -c -O -ftime-report                                            
===-------------------------------------------------------------------------=== 
                      Instruction Selection and Scheduling                      
===-------------------------------------------------------------------------=== 
  Total Execution Time: 0.2400 seconds (0.2590 wall clock)                      

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  ---
Name ---
   0.0480 ( 21.4%)   0.0040 ( 25.0%)   0.0520 ( 21.6%)   0.0644 ( 24.8%) 
Instruction Scheduling
   0.0400 ( 17.8%)   0.0000 (  0.0%)   0.0400 ( 16.6%)   0.0458 ( 17.7%)  DAG
Legalization      
   0.0480 ( 21.4%)   0.0040 ( 24.9%)   0.0520 ( 21.6%)   0.0443 ( 17.1%)  Type
Legalization     
   0.0200 (  8.9%)   0.0040 ( 25.0%)   0.0240 ( 10.0%)   0.0320 ( 12.3%) 
Instruction Selection 
   0.0360 ( 16.0%)   0.0040 ( 24.9%)   0.0400 ( 16.6%)   0.0298 ( 11.5%) 
Instruction Creation  
   0.0120 (  5.3%)   0.0000 (  0.0%)   0.0120 (  5.0%)   0.0199 (  7.7%)  DAG
Combining 1       
   0.0120 (  5.3%)   0.0000 (  0.0%)   0.0120 (  4.9%)   0.0137 (  5.3%)  DAG
Combining 2       
   0.0080 (  3.5%)   0.0000 (  0.0%)   0.0080 (  3.3%)   0.0079 (  3.0%) 
Instruction Scheduling Cleanup
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0006 (  0.2%)  DAG
Combining after legalize types
   0.2240 (100.0%)   0.0160 (100.0%)   0.2400 (100.0%)   0.2590 (100.0%)  TOTAL 

===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...                     
===-------------------------------------------------------------------------===
  Total Execution Time: 54.6514 seconds (54.6741 wall clock)                   

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  ---
Name ---
  42.7026 ( 78.2%)   0.0240 ( 21.4%)  42.7266 ( 78.1%)  42.7355 ( 78.1%) 
Simple Register Coalescing
   3.7242 (  6.8%)   0.0040 (  3.5%)   3.7282 (  6.8%)   3.7292 (  6.8%) 
Unswitch loops            
   1.4080 (  2.5%)   0.0000 (  0.0%)   1.4080 (  2.5%)   1.4124 (  2.5%) 
Canonicalize natural loops
   1.1840 (  2.1%)   0.0000 (  0.0%)   1.1840 (  2.1%)   1.1913 (  2.1%) 
Canonicalize natural loops
   0.8320 (  1.5%)   0.0000 (  0.0%)   0.8320 (  1.5%)   0.8321 (  1.5%) 
Simplify the CFG          
   0.6840 (  1.2%)   0.0000 (  0.0%)   0.6840 (  1.2%)   0.6861 (  1.2%) 
Optimize for code generation
   0.6080 (  1.1%)   0.0000 (  0.0%)   0.6080 (  1.1%)   0.6070 (  1.1%) 
Simplify the CFG            
   0.5440 (  0.9%)   0.0000 (  0.0%)   0.5440 (  0.9%)   0.5444 (  0.9%) 
Linear Scan Register Allocator
   0.4720 (  0.8%)   0.0320 ( 28.5%)   0.5040 (  0.9%)   0.5015 (  0.9%)  X86
DAG->DAG Instruction Selection
   0.3640 (  0.6%)   0.0000 (  0.0%)   0.3640 (  0.6%)   0.3662 (  0.6%)  Break
critical edges in CFG       
   0.3280 (  0.6%)   0.0000 (  0.0%)   0.3280 (  0.6%)   0.3248 (  0.5%) 
Global Value Numbering            
   0.1920 (  0.3%)   0.0000 (  0.0%)   0.1920 (  0.3%)   0.1925 (  0.3%) 
Control Flow Optimizer            
   0.1120 (  0.2%)   0.0000 (  0.0%)   0.1120 (  0.2%)   0.1104 (  0.2%) 
Eliminate PHI nodes for register allocation
   0.1120 (  0.2%)   0.0040 (  3.5%)   0.1160 (  0.2%)   0.1081 (  0.1%) 
Canonicalize Induction Variables           
   0.1040 (  0.1%)   0.0000 (  0.0%)   0.1040 (  0.1%)   0.0961 (  0.1%)  Loop
Strength Reduction                    
   0.0680 (  0.1%)   0.0000 (  0.0%)   0.0680 (  0.1%)   0.0906 (  0.1%) 
Induction Variable Users                   
   0.0600 (  0.1%)   0.0120 ( 10.7%)   0.0720 (  0.1%)   0.0764 (  0.1%)  Live
Interval Analysis                     
   0.0480 (  0.0%)   0.0200 ( 17.8%)   0.0680 (  0.1%)   0.0714 (  0.1%)  Live
Variable Analysis                     
   0.0720 (  0.1%)   0.0040 (  3.5%)   0.0760 (  0.1%)   0.0709 (  0.1%) 
Loop-Closed SSA Form Pass                  
   0.0680 (  0.1%)   0.0000 (  0.0%)   0.0680 (  0.1%)   0.0685 (  0.1%) 
Simplify the CFG                           
   0.0640 (  0.1%)   0.0000 (  0.0%)   0.0640 (  0.1%)   0.0641 (  0.1%) 
Combine redundant instructions             
   0.0600 (  0.1%)   0.0000 (  0.0%)   0.0600 (  0.1%)   0.0593 (  0.1%) 
Combine redundant instructions             
   0.0520 (  0.0%)   0.0000 (  0.0%)   0.0520 (  0.0%)   0.0528 (  0.0%) 
Promote Memory to Register                 
   0.0520 (  0.0%)   0.0000 (  0.0%)   0.0520 (  0.0%)   0.0511 (  0.0%) 
Loop-Closed SSA Form Pass                  
   0.0360 (  0.0%)   0.0000 (  0.0%)   0.0360 (  0.0%)   0.0475 (  0.0%) 
Induction Variable Users                   
   0.0480 (  0.0%)   0.0000 (  0.0%)   0.0480 (  0.0%)   0.0465 (  0.0%) 
Simplify the CFG                           
   0.0400 (  0.0%)   0.0000 (  0.0%)   0.0400 (  0.0%)   0.0392 (  0.0%) 
Sparse Conditional Constant Propagation    
   0.0400 (  0.0%)   0.0000 (  0.0%)   0.0400 (  0.0%)   0.0392 (  0.0%) 
Dominator Tree Construction                
   0.0320 (  0.0%)   0.0000 (  0.0%)   0.0320 (  0.0%)   0.0317 (  0.0%) 
Two-Address instruction pass               
   0.0280 (  0.0%)   0.0040 (  3.5%)   0.0320 (  0.0%)   0.0290 (  0.0%)  Loop
Invariant Code Motion                 
   0.0240 (  0.0%)   0.0000 (  0.0%)   0.0240 (  0.0%)   0.0231 (  0.0%) 
Simplify the CFG                           
   0.0200 (  0.0%)   0.0000 (  0.0%)   0.0200 (  0.0%)   0.0222 (  0.0%) 
Combine redundant instructions             
   0.0200 (  0.0%)   0.0000 (  0.0%)   0.0200 (  0.0%)   0.0205 (  0.0%) 
Combine redundant instructions             
   0.0240 (  0.0%)   0.0000 (  0.0%)   0.0240 (  0.0%)   0.0204 (  0.0%) 
Dominance Frontier Construction            
   0.0160 (  0.0%)   0.0000 (  0.0%)   0.0160 (  0.0%)   0.0195 (  0.0%) 
Dominance Frontier Construction            
   0.0160 (  0.0%)   0.0000 (  0.0%)   0.0160 (  0.0%)   0.0153 (  0.0%)  Break
critical edges in CFG                
   0.0160 (  0.0%)   0.0000 (  0.0%)   0.0160 (  0.0%)   0.0148 (  0.0%)  Stack
Slot Coloring                        
   0.0120 (  0.0%)   0.0000 (  0.0%)   0.0120 (  0.0%)   0.0141 (  0.0%) 
Dominator Tree Construction                
   0.0120 (  0.0%)   0.0000 (  0.0%)   0.0120 (  0.0%)   0.0129 (  0.0%) 
MachineDominator Tree Construction         
   0.0119 (  0.0%)   0.0000 (  0.0%)   0.0119 (  0.0%)   0.0126 (  0.0%) 
Simplify the CFG                           
   0.0120 (  0.0%)   0.0000 (  0.0%)   0.0120 (  0.0%)   0.0124 (  0.0%) 
Dominator Tree Construction                
   0.0200 (  0.0%)   0.0000 (  0.0%)   0.0200 (  0.0%)   0.0115 (  0.0%) 
Delete dead loops                          
   0.0079 (  0.0%)   0.0000 (  0.0%)   0.0079 (  0.0%)   0.0111 (  0.0%) 
MachineDominator Tree Construction         
   0.0079 (  0.0%)   0.0040 (  3.5%)   0.0120 (  0.0%)   0.0105 (  0.0%) 
Machine Code Deleter                       
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0096 (  0.0%) 
Aggressive Dead Code Elimination           
   0.0079 (  0.0%)   0.0000 (  0.0%)   0.0079 (  0.0%)   0.0095 (  0.0%)  Jump
Threading                             
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0092 (  0.0%) 
Machine code sinking                       
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0078 (  0.0%) 
Dominator Tree Construction                
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0076 (  0.0%) 
Dominator Tree Construction                
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0075 (  0.0%) 
Combine redundant instructions             
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0071 (  0.0%)  Find
Used Types                            
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0069 (  0.0%) 
Dominator Tree Construction                
   0.0079 (  0.0%)   0.0000 (  0.0%)   0.0079 (  0.0%)   0.0068 (  0.0%) 
Natural Loop Information                   
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0068 (  0.0%) 
Dominator Tree Construction                
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0067 (  0.0%)  X86
AT&T-Style Assembly Printer            
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0061 (  0.0%) 
Prolog/Epilog Insertion & Frame Finalization
   0.0000 (  0.0%)   0.0040 (  3.5%)   0.0040 (  0.0%)   0.0058 (  0.0%) 
Dominance Frontier Construction             
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0057 (  0.0%) 
Remove unreachable blocks from the CFG      
   0.0080 (  0.0%)   0.0000 (  0.0%)   0.0080 (  0.0%)   0.0054 (  0.0%) 
Remove unreachable machine basic blocks     
   0.0079 (  0.0%)   0.0000 (  0.0%)   0.0079 (  0.0%)   0.0053 (  0.0%) 
Natural Loop Information                    
   0.0039 (  0.0%)   0.0000 (  0.0%)   0.0039 (  0.0%)   0.0051 (  0.0%) 
Conditional Propagation                     
   0.0039 (  0.0%)   0.0000 (  0.0%)   0.0039 (  0.0%)   0.0046 (  0.0%) 
MachineDominator Tree Construction          
   0.0039 (  0.0%)   0.0000 (  0.0%)   0.0039 (  0.0%)   0.0040 (  0.0%) 
Dominance Frontier Construction             
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0040 (  0.0%) 
Dominance Frontier Construction             
   0.0039 (  0.0%)   0.0000 (  0.0%)   0.0039 (  0.0%)   0.0038 (  0.0%) 
Combine redundant instructions              
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0038 (  0.0%) 
Combine redundant instructions              
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0033 (  0.0%)  Dead
Store Elimination                      
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0032 (  0.0%) 
Virtual Register Map                        
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0029 (  0.0%) 
Machine Natural Loop Construction           
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0027 (  0.0%) 
Machine Natural Loop Construction           
   0.0039 (  0.0%)   0.0000 (  0.0%)   0.0039 (  0.0%)   0.0026 (  0.0%)  X86
FP Stackifier                           
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0023 (  0.0%) 
Loop-Closed SSA Form Pass                   
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0020 (  0.0%) 
Module Verifier                             
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0019 (  0.0%) 
Exception handling preparation              
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0018 (  0.0%) 
Reassociate expressions                     
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0018 (  0.0%) 
MemCpy Optimization                         
   0.0039 (  0.0%)   0.0000 (  0.0%)   0.0039 (  0.0%)   0.0017 (  0.0%) 
Conditional Propagation                     
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0012 (  0.0%)  Dead
Global Elimination                     
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0012 (  0.0%) 
Rotate Loops                                
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0010 (  0.0%) 
Machine Natural Loop Construction           
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0007 (  0.0%)  Tail
Call Elimination                       
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0006 (  0.0%) 
Subregister lowering instruction pass       
   0.0040 (  0.0%)   0.0000 (  0.0%)   0.0040 (  0.0%)   0.0006 (  0.0%) 
Memory Dependence Analysis                  
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0005 (  0.0%) 
Scalar Evolution Analysis                   
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%) 
Machine Instruction LICM                    
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%) 
Scalar Evolution Analysis                   
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0003 (  0.0%)  Label
Folder                                
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0002 (  0.0%) 
Memory Dependence Analysis                  
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Code
Placement Optimizater                  
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0001 (  0.0%)  Basic
CallGraph Construction                
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Live
Stack Slot Analysis                    
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Remove unused exception handling info       
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Inliner for always_inline functions         
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  X86
Maximal Stack Alignment Calculator      
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Simplify well-known library calls           
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Deduce function attributes                  
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Memory Dependence Analysis                  
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Dead
Argument Elimination                   
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Scalar Replacement of Aggregates            
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Dead
Type Elimination                       
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Global Variable Optimizer                   
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Preliminary module verification             
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Interprocedural constant propagation        
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  X86
FP_REG_KILL inserter
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Basic
Alias Analysis (default AA impl)
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Promote Memory to Register
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Analyze Machine Code For Garbage Collection
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Strip
Unused Function Prototypes
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Insert stack protectors
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Target Data Layout
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Raise
allocations from calls to instructions
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Lower
Garbage Collection Instructions
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%) 
Delete Garbage Collector Information
  54.5394 (100.0%)   0.1120 (100.0%)  54.6514 (100.0%)  54.6741 (100.0%)  TOTAL

clang-cc: /home/edwin/llvm-svn/llvm/include/llvm/Support/Timer.h:158:
llvm::TimerGroup::~TimerGroup(): Assertion `NumTimers == 0 && "TimerGroup
destroyed before all contained timers!"' failed.                                
0   clang-cc        0x00000000010ec94f                                          
1   clang-cc        0x00000000010ecd49                                          
2   libpthread.so.0 0x0000003b0a80e7b0                                          
3   libc.so.6       0x0000003b09c32065 gsignal + 53                             
4   libc.so.6       0x0000003b09c35153 abort + 387                              
5   libc.so.6       0x0000003b09c2b159 __assert_fail + 233                      
6   clang-cc        0x0000000000dbcc5c llvm::TimerGroup::~TimerGroup() + 124    
7   libc.so.6       0x0000003b09c367dd exit + 157                               
8   libc.so.6       0x0000003b09c1e5ad __libc_start_main + 237                  
9   clang-cc        0x000000000042d479


-- 
Configure bugmail: http://llvm.org/bugs/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



More information about the llvm-bugs mailing list