[LLVMbugs] [Bug 4252] New: Compiling interp.i takes 54 seconds: 78% time spent in Simple Register Coalescing
bugzilla-daemon at cs.uiuc.edu
Sat May 23 05:42:42 PDT 2009
http://llvm.org/bugs/show_bug.cgi?id=4252
Summary: Compiling interp.i takes 54 seconds: 78% time spent in Simple Register Coalescing
Product: new-bugs
Version: unspecified
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: new bugs
AssignedTo: unassignedbugs at nondot.org
ReportedBy: edwintorok at gmail.com
CC: llvmbugs at cs.uiuc.edu
Created an attachment (id=3017)
--> (http://llvm.org/bugs/attachment.cgi?id=3017)
llc interp.bc
Compiling the attached file takes more than 54 seconds with clang -O.
Compiling the same file with gcc takes 0.002s.
You can reproduce this with:
$ llc -time-passes interp.bc
Or by running clang -O:
$ clang interp.i -c -O -ftime-report
===-------------------------------------------------------------------------===
Instruction Selection and Scheduling
===-------------------------------------------------------------------------===
Total Execution Time: 0.2400 seconds (0.2590 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0480 ( 21.4%) 0.0040 ( 25.0%) 0.0520 ( 21.6%) 0.0644 ( 24.8%) Instruction Scheduling
0.0400 ( 17.8%) 0.0000 ( 0.0%) 0.0400 ( 16.6%) 0.0458 ( 17.7%) DAG Legalization
0.0480 ( 21.4%) 0.0040 ( 24.9%) 0.0520 ( 21.6%) 0.0443 ( 17.1%) Type Legalization
0.0200 ( 8.9%) 0.0040 ( 25.0%) 0.0240 ( 10.0%) 0.0320 ( 12.3%) Instruction Selection
0.0360 ( 16.0%) 0.0040 ( 24.9%) 0.0400 ( 16.6%) 0.0298 ( 11.5%) Instruction Creation
0.0120 ( 5.3%) 0.0000 ( 0.0%) 0.0120 ( 5.0%) 0.0199 ( 7.7%) DAG Combining 1
0.0120 ( 5.3%) 0.0000 ( 0.0%) 0.0120 ( 4.9%) 0.0137 ( 5.3%) DAG Combining 2
0.0080 ( 3.5%) 0.0000 ( 0.0%) 0.0080 ( 3.3%) 0.0079 ( 3.0%) Instruction Scheduling Cleanup
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0006 ( 0.2%) DAG Combining after legalize types
0.2240 (100.0%) 0.0160 (100.0%) 0.2400 (100.0%) 0.2590 (100.0%) TOTAL
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 54.6514 seconds (54.6741 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
42.7026 ( 78.2%) 0.0240 ( 21.4%) 42.7266 ( 78.1%) 42.7355 ( 78.1%) Simple Register Coalescing
3.7242 ( 6.8%) 0.0040 ( 3.5%) 3.7282 ( 6.8%) 3.7292 ( 6.8%) Unswitch loops
1.4080 ( 2.5%) 0.0000 ( 0.0%) 1.4080 ( 2.5%) 1.4124 ( 2.5%) Canonicalize natural loops
1.1840 ( 2.1%) 0.0000 ( 0.0%) 1.1840 ( 2.1%) 1.1913 ( 2.1%) Canonicalize natural loops
0.8320 ( 1.5%) 0.0000 ( 0.0%) 0.8320 ( 1.5%) 0.8321 ( 1.5%) Simplify the CFG
0.6840 ( 1.2%) 0.0000 ( 0.0%) 0.6840 ( 1.2%) 0.6861 ( 1.2%) Optimize for code generation
0.6080 ( 1.1%) 0.0000 ( 0.0%) 0.6080 ( 1.1%) 0.6070 ( 1.1%) Simplify the CFG
0.5440 ( 0.9%) 0.0000 ( 0.0%) 0.5440 ( 0.9%) 0.5444 ( 0.9%) Linear Scan Register Allocator
0.4720 ( 0.8%) 0.0320 ( 28.5%) 0.5040 ( 0.9%) 0.5015 ( 0.9%) X86 DAG->DAG Instruction Selection
0.3640 ( 0.6%) 0.0000 ( 0.0%) 0.3640 ( 0.6%) 0.3662 ( 0.6%) Break critical edges in CFG
0.3280 ( 0.6%) 0.0000 ( 0.0%) 0.3280 ( 0.6%) 0.3248 ( 0.5%) Global Value Numbering
0.1920 ( 0.3%) 0.0000 ( 0.0%) 0.1920 ( 0.3%) 0.1925 ( 0.3%) Control Flow Optimizer
0.1120 ( 0.2%) 0.0000 ( 0.0%) 0.1120 ( 0.2%) 0.1104 ( 0.2%) Eliminate PHI nodes for register allocation
0.1120 ( 0.2%) 0.0040 ( 3.5%) 0.1160 ( 0.2%) 0.1081 ( 0.1%) Canonicalize Induction Variables
0.1040 ( 0.1%) 0.0000 ( 0.0%) 0.1040 ( 0.1%) 0.0961 ( 0.1%) Loop Strength Reduction
0.0680 ( 0.1%) 0.0000 ( 0.0%) 0.0680 ( 0.1%) 0.0906 ( 0.1%) Induction Variable Users
0.0600 ( 0.1%) 0.0120 ( 10.7%) 0.0720 ( 0.1%) 0.0764 ( 0.1%) Live Interval Analysis
0.0480 ( 0.0%) 0.0200 ( 17.8%) 0.0680 ( 0.1%) 0.0714 ( 0.1%) Live Variable Analysis
0.0720 ( 0.1%) 0.0040 ( 3.5%) 0.0760 ( 0.1%) 0.0709 ( 0.1%) Loop-Closed SSA Form Pass
0.0680 ( 0.1%) 0.0000 ( 0.0%) 0.0680 ( 0.1%) 0.0685 ( 0.1%) Simplify the CFG
0.0640 ( 0.1%) 0.0000 ( 0.0%) 0.0640 ( 0.1%) 0.0641 ( 0.1%) Combine redundant instructions
0.0600 ( 0.1%) 0.0000 ( 0.0%) 0.0600 ( 0.1%) 0.0593 ( 0.1%) Combine redundant instructions
0.0520 ( 0.0%) 0.0000 ( 0.0%) 0.0520 ( 0.0%) 0.0528 ( 0.0%) Promote Memory to Register
0.0520 ( 0.0%) 0.0000 ( 0.0%) 0.0520 ( 0.0%) 0.0511 ( 0.0%) Loop-Closed SSA Form Pass
0.0360 ( 0.0%) 0.0000 ( 0.0%) 0.0360 ( 0.0%) 0.0475 ( 0.0%) Induction Variable Users
0.0480 ( 0.0%) 0.0000 ( 0.0%) 0.0480 ( 0.0%) 0.0465 ( 0.0%) Simplify the CFG
0.0400 ( 0.0%) 0.0000 ( 0.0%) 0.0400 ( 0.0%) 0.0392 ( 0.0%) Sparse Conditional Constant Propagation
0.0400 ( 0.0%) 0.0000 ( 0.0%) 0.0400 ( 0.0%) 0.0392 ( 0.0%) Dominator Tree Construction
0.0320 ( 0.0%) 0.0000 ( 0.0%) 0.0320 ( 0.0%) 0.0317 ( 0.0%) Two-Address instruction pass
0.0280 ( 0.0%) 0.0040 ( 3.5%) 0.0320 ( 0.0%) 0.0290 ( 0.0%) Loop Invariant Code Motion
0.0240 ( 0.0%) 0.0000 ( 0.0%) 0.0240 ( 0.0%) 0.0231 ( 0.0%) Simplify the CFG
0.0200 ( 0.0%) 0.0000 ( 0.0%) 0.0200 ( 0.0%) 0.0222 ( 0.0%) Combine redundant instructions
0.0200 ( 0.0%) 0.0000 ( 0.0%) 0.0200 ( 0.0%) 0.0205 ( 0.0%) Combine redundant instructions
0.0240 ( 0.0%) 0.0000 ( 0.0%) 0.0240 ( 0.0%) 0.0204 ( 0.0%) Dominance Frontier Construction
0.0160 ( 0.0%) 0.0000 ( 0.0%) 0.0160 ( 0.0%) 0.0195 ( 0.0%) Dominance Frontier Construction
0.0160 ( 0.0%) 0.0000 ( 0.0%) 0.0160 ( 0.0%) 0.0153 ( 0.0%) Break critical edges in CFG
0.0160 ( 0.0%) 0.0000 ( 0.0%) 0.0160 ( 0.0%) 0.0148 ( 0.0%) Stack Slot Coloring
0.0120 ( 0.0%) 0.0000 ( 0.0%) 0.0120 ( 0.0%) 0.0141 ( 0.0%) Dominator Tree Construction
0.0120 ( 0.0%) 0.0000 ( 0.0%) 0.0120 ( 0.0%) 0.0129 ( 0.0%) MachineDominator Tree Construction
0.0119 ( 0.0%) 0.0000 ( 0.0%) 0.0119 ( 0.0%) 0.0126 ( 0.0%) Simplify the CFG
0.0120 ( 0.0%) 0.0000 ( 0.0%) 0.0120 ( 0.0%) 0.0124 ( 0.0%) Dominator Tree Construction
0.0200 ( 0.0%) 0.0000 ( 0.0%) 0.0200 ( 0.0%) 0.0115 ( 0.0%) Delete dead loops
0.0079 ( 0.0%) 0.0000 ( 0.0%) 0.0079 ( 0.0%) 0.0111 ( 0.0%) MachineDominator Tree Construction
0.0079 ( 0.0%) 0.0040 ( 3.5%) 0.0120 ( 0.0%) 0.0105 ( 0.0%) Machine Code Deleter
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0096 ( 0.0%) Aggressive Dead Code Elimination
0.0079 ( 0.0%) 0.0000 ( 0.0%) 0.0079 ( 0.0%) 0.0095 ( 0.0%) Jump Threading
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0092 ( 0.0%) Machine code sinking
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0078 ( 0.0%) Dominator Tree Construction
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0076 ( 0.0%) Dominator Tree Construction
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0075 ( 0.0%) Combine redundant instructions
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0071 ( 0.0%) Find Used Types
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0069 ( 0.0%) Dominator Tree Construction
0.0079 ( 0.0%) 0.0000 ( 0.0%) 0.0079 ( 0.0%) 0.0068 ( 0.0%) Natural Loop Information
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0068 ( 0.0%) Dominator Tree Construction
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0067 ( 0.0%) X86 AT&T-Style Assembly Printer
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0061 ( 0.0%) Prolog/Epilog Insertion & Frame Finalization
0.0000 ( 0.0%) 0.0040 ( 3.5%) 0.0040 ( 0.0%) 0.0058 ( 0.0%) Dominance Frontier Construction
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0057 ( 0.0%) Remove unreachable blocks from the CFG
0.0080 ( 0.0%) 0.0000 ( 0.0%) 0.0080 ( 0.0%) 0.0054 ( 0.0%) Remove unreachable machine basic blocks
0.0079 ( 0.0%) 0.0000 ( 0.0%) 0.0079 ( 0.0%) 0.0053 ( 0.0%) Natural Loop Information
0.0039 ( 0.0%) 0.0000 ( 0.0%) 0.0039 ( 0.0%) 0.0051 ( 0.0%) Conditional Propagation
0.0039 ( 0.0%) 0.0000 ( 0.0%) 0.0039 ( 0.0%) 0.0046 ( 0.0%) MachineDominator Tree Construction
0.0039 ( 0.0%) 0.0000 ( 0.0%) 0.0039 ( 0.0%) 0.0040 ( 0.0%) Dominance Frontier Construction
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0040 ( 0.0%) Dominance Frontier Construction
0.0039 ( 0.0%) 0.0000 ( 0.0%) 0.0039 ( 0.0%) 0.0038 ( 0.0%) Combine redundant instructions
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0038 ( 0.0%) Combine redundant instructions
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0033 ( 0.0%) Dead Store Elimination
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0032 ( 0.0%) Virtual Register Map
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0029 ( 0.0%) Machine Natural Loop Construction
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0027 ( 0.0%) Machine Natural Loop Construction
0.0039 ( 0.0%) 0.0000 ( 0.0%) 0.0039 ( 0.0%) 0.0026 ( 0.0%) X86 FP Stackifier
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0023 ( 0.0%) Loop-Closed SSA Form Pass
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0020 ( 0.0%) Module Verifier
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0019 ( 0.0%) Exception handling preparation
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0018 ( 0.0%) Reassociate expressions
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0018 ( 0.0%) MemCpy Optimization
0.0039 ( 0.0%) 0.0000 ( 0.0%) 0.0039 ( 0.0%) 0.0017 ( 0.0%) Conditional Propagation
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0012 ( 0.0%) Dead Global Elimination
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0012 ( 0.0%) Rotate Loops
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0010 ( 0.0%) Machine Natural Loop Construction
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0007 ( 0.0%) Tail Call Elimination
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0006 ( 0.0%) Subregister lowering instruction pass
0.0040 ( 0.0%) 0.0000 ( 0.0%) 0.0040 ( 0.0%) 0.0006 ( 0.0%) Memory Dependence Analysis
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0005 ( 0.0%) Scalar Evolution Analysis
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0003 ( 0.0%) Machine Instruction LICM
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0003 ( 0.0%) Scalar Evolution Analysis
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0003 ( 0.0%) Label Folder
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0002 ( 0.0%) Memory Dependence Analysis
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0001 ( 0.0%) Code Placement Optimizater
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0001 ( 0.0%) Basic CallGraph Construction
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Live Stack Slot Analysis
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Remove unused exception handling info
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Inliner for always_inline functions
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) X86 Maximal Stack Alignment Calculator
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Simplify well-known library calls
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Deduce function attributes
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Memory Dependence Analysis
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Dead Argument Elimination
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Scalar Replacement of Aggregates
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Dead Type Elimination
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Global Variable Optimizer
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Preliminary module verification
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Interprocedural constant propagation
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) X86 FP_REG_KILL inserter
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Basic Alias Analysis (default AA impl)
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Promote Memory to Register
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Analyze Machine Code For Garbage Collection
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Strip Unused Function Prototypes
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Insert stack protectors
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Target Data Layout
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Raise allocations from calls to instructions
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Lower Garbage Collection Instructions
0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0000 ( 0.0%) Delete Garbage Collector Information
54.5394 (100.0%) 0.1120 (100.0%) 54.6514 (100.0%) 54.6741 (100.0%) TOTAL
clang-cc: /home/edwin/llvm-svn/llvm/include/llvm/Support/Timer.h:158: llvm::TimerGroup::~TimerGroup(): Assertion `NumTimers == 0 && "TimerGroup destroyed before all contained timers!"' failed.
0 clang-cc 0x00000000010ec94f
1 clang-cc 0x00000000010ecd49
2 libpthread.so.0 0x0000003b0a80e7b0
3 libc.so.6 0x0000003b09c32065 gsignal + 53
4 libc.so.6 0x0000003b09c35153 abort + 387
5 libc.so.6 0x0000003b09c2b159 __assert_fail + 233
6 clang-cc 0x0000000000dbcc5c llvm::TimerGroup::~TimerGroup() + 124
7 libc.so.6 0x0000003b09c367dd exit + 157
8 libc.so.6 0x0000003b09c1e5ad __libc_start_main + 237
9 clang-cc 0x000000000042d479
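
Several passes in the pass execution timing report above (for example "Simplify the CFG") run more than once, so their cost is spread over multiple rows. The following is a minimal sketch, not part of LLVM, that sums the wall-clock column per pass name. It assumes Python 3 and assumes the row layout shown above (four "seconds ( percent%)" fields followed by the pass name, which may be wrapped onto the next line). The names `parse_rows` and `wall_time_by_pass` are hypothetical helpers made up for this illustration.

```python
import re
from collections import defaultdict

# One timing field looks like "42.7026 ( 78.2%)"; a row has four of them.
FIELD = r"(\d+\.\d+)\s*\(\s*\d+\.\d+%\)"
ROW = re.compile(r"^\s*" + r"\s+".join([FIELD] * 4) + r"\s*(.*)$")


def parse_rows(report):
    """Return [wall_seconds, pass_name] pairs from one timing table.

    Parsing stops at the TOTAL row, so text that follows the table
    (such as a crash backtrace) is ignored.
    """
    rows = []
    for line in report.splitlines():
        match = ROW.match(line)
        if match:
            name = match.group(5).strip()
            if name == "TOTAL":
                break
            rows.append([float(match.group(4)), name])
        elif rows and line.strip() and line.lstrip()[0].isalpha():
            # Assumed mail wrapping: the pass name continues on this line.
            rows[-1][1] = (rows[-1][1] + " " + line.strip()).strip()
    return rows


def wall_time_by_pass(report):
    """Sum wall-clock seconds per pass name, largest total first."""
    totals = defaultdict(float)
    for wall, name in parse_rows(report):
        totals[name] += wall
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # A few rows copied from the report above, in their wrapped form.
    sample = """\
42.7026 ( 78.2%) 0.0240 ( 21.4%) 42.7266 ( 78.1%) 42.7355 ( 78.1%)
Simple Register Coalescing
1.4080 ( 2.5%) 0.0000 ( 0.0%) 1.4080 ( 2.5%) 1.4124 ( 2.5%)
Canonicalize natural loops
1.1840 ( 2.1%) 0.0000 ( 0.0%) 1.1840 ( 2.1%) 1.1913 ( 2.1%)
Canonicalize natural loops
54.5394 (100.0%) 0.1120 (100.0%) 54.6514 (100.0%) 54.6741 (100.0%) TOTAL
"""
    for name, seconds in wall_time_by_pass(sample):
        print(f"{seconds:9.4f}  {name}")
```

Grouping by name is a choice made for this sketch only; it hides which individual run of a repeated pass was slow, so the per-row report remains the reference.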