<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Very slow compilation at -O2 but not -O3 (long time spent in Virtual Register Rewriter)"
href="https://bugs.llvm.org/show_bug.cgi?id=40241">40241</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Very slow compilation at -O2 but not -O3 (long time spent in Virtual Register Rewriter)
</td>
</tr>
<tr>
<th>Product</th>
<td>clang
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>C++
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedclangbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>tomasz.sniatowski+llvm@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>blitzrakete@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=21296" name="attach_21296" title="preprocessed source (gzipped to fit)">attachment 21296</a> <a href="attachment.cgi?id=21296&action=edit" title="preprocessed source (gzipped to fit)">[details]</a></span>
preprocessed source (gzipped to fit)
Building parts of v8 with non-default compiler flags for that project (-O / -g
levels tweaked), I get:
$ time ../../third_party/llvm-build/Release+Asserts/bin/clang++ \
    -DV8_EMBEDDED_BUILTINS -DV8_TARGET_ARCH_ARM -I../.. -I../../v8 \
    -Iclang_x86_v8_arm/gen/v8 -fPIC -m32 -g2 -O2 -std=c++14 -fno-exceptions \
    -c ../../v8/src/builtins/setup-builtins-internal.cc -o foo.o
real 18m53.837s
user 18m38.992s
sys 0m14.452s
With -O3, or with any -O level below -O2, the build is fast (7-14s on the same
machine). Dropping the debug info level below -g2, or dropping either
-fno-exceptions or -fPIC, also makes the build fast.
I'm attaching the preprocessed source that reproduces the issue on a recent
trunk (cdc28a6f803270bc24866026344cd100584ec118 / 350491) as follows:
$ clang++ -fPIC -m32 -g2 -O2 -std=c++14 -fno-exceptions -c \
    setup-builtins-internal.ii -o foo.o
The file is large because this is a performance issue rather than a crash, so
the minimization tools aren't immediately useful. It is at least somewhat
minimized from the original issue, which happened in a jumbo (unity build)
compile where the unit containing this source took over *50* minutes to build
at -O2 (and 30-ish seconds at -O3).
Notably, -ftime-report shows that most of the time is spent in Virtual Register
Rewriter:
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 1126.7400 seconds (1127.1255 wall clock)
   ---User Time---    --System Time--    --User+System--    ---Wall Time---   --- Name ---
  1023.2560 ( 92.0%)   3.3200 ( 23.5%)  1026.5760 ( 91.1%)  1026.6591 ( 91.1%)  Virtual Register Rewriter
    32.5960 (  2.9%)  10.2840 ( 72.8%)    42.8800 (  3.8%)    43.1881 (  3.8%)  X86 Assembly Printer
    37.9080 (  3.4%)   0.2120 (  1.5%)    38.1200 (  3.4%)    38.1153 (  3.4%)  Greedy Register Allocator</pre>
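<pre>
For comparing runs at different -O levels, the pass timing rows can be ranked
by wall time with a small script. This is only a minimal sketch: the names
rank_passes, COLUMN and SAMPLE are hypothetical, SAMPLE copies the three rows
quoted in the report, and the parser assumes every row has four
"seconds (percent)" columns followed by the pass name.

```python
import re

# One "seconds ( percent%)" column as printed in the pass timing report.
COLUMN = re.compile(r"(\d+\.\d+)\s*\(\s*\d+\.\d+%\)")


def rank_passes(report_text):
    """Return (wall_seconds, pass_name) pairs, slowest first.

    Assumption: each row has four timing columns (user, system,
    user+system, wall) followed by the pass name.
    """
    rows = []
    for line in report_text.splitlines():
        columns = list(COLUMN.finditer(line))
        if len(columns) != 4:
            continue  # header, separator, or total line
        # The pass name is whatever follows the last (wall time) column.
        name = line[columns[-1].end():].strip()
        rows.append((float(columns[-1].group(1)), name))
    return sorted(rows, reverse=True)


# Rows copied from the timing report in this bug.
SAMPLE = """\
1023.2560 ( 92.0%)  3.3200 ( 23.5%)  1026.5760 ( 91.1%)  1026.6591 ( 91.1%)  Virtual Register Rewriter
  32.5960 (  2.9%) 10.2840 ( 72.8%)    42.8800 (  3.8%)    43.1881 (  3.8%)  X86 Assembly Printer
  37.9080 (  3.4%)  0.2120 (  1.5%)    38.1200 (  3.4%)    38.1153 (  3.4%)  Greedy Register Allocator
"""

if __name__ == "__main__":
    for seconds, name in rank_passes(SAMPLE):
        print(f"{seconds:10.4f}s  {name}")
```

Rows whose pass name was wrapped onto the next line would need to be rejoined
before parsing.
</pre>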
</div>
</p>
</body>
</html>