<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - For some bitcodes it can take 12 hours to read and compile"
href="https://bugs.llvm.org/show_bug.cgi?id=47395">47395</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>For some bitcodes it can take 12 hours to read and compile
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows NT
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Bitcode Reader
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>scott.waye@hubse.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>I create bitcode using libLLVM for the corert compiler project
(<a href="https://github.com/dotnet/corert">https://github.com/dotnet/corert</a>). It uses the C# bindings over libLLVM from
<a href="https://github.com/Microsoft/LLVMSharp">https://github.com/Microsoft/LLVMSharp</a>.
I have two bitcode files generated from mostly the same source code, each
around 240 MB in size. One compiles in 3 minutes, the other in 12 hours. I
suspect the 12-hour compilation is either hitting a suboptimal path or doing
something wrong. I use Emscripten to compile, and this ultimately calls:
E:/GitHub/llvm-project/build/release/bin/clang++.exe -target
wasm32-unknown-emscripten -D__EMSCRIPTEN_major__=1 -D__EMSCRIPTEN_minor__=39
-D__EMSCRIPTEN_tiny__=19 -D_LIBCPP_ABI_VERSION=2 -Dunix -D__unix -D__unix__
-Werror=implicit-function-declaration -Xclang -nostdsysteminc -Xclang
-isystemE:\GitHub\emsdk\upstream\emscripten\system\include\libcxx -Xclang
-isystemE:\GitHub\emsdk\upstream\emscripten\system\lib\libcxxabi\include
-Xclang
-isystemE:\GitHub\emsdk\upstream\emscripten\system\lib\libunwind\include
-Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\compat
-Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include -Xclang
-isystemE:\GitHub\emsdk\upstream\emscripten\system\include\libc -Xclang
-isystemE:\GitHub\emsdk\upstream\emscripten\system\lib\libc\musl\arch\emscripten
-Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\local\include
-Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\SSE -Xclang
-isystemE:\GitHub\emsdk\upstream\emscripten\cache\wasm\include -DEMSCRIPTEN
-fignore-exceptions -c -g
E:\GitHub\UnoCoreRt\UnoCoreRt.Wasm\bin\Debug\netstandard2.0\UnoCoreRt.Wasm.bc
-Xclang -isystemE:\GitHub\emsdk\upstream\emscripten\system\include\SDL -c -o
E:\GitHub\UnoCoreRt\UnoCoreRt.Wasm\bin\Debug\netstandard2.0\UnoCoreRt-release.o
-mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj
-mllvm -disable-lsr -g
What I've noticed is that, unlike the "fast" 3-minute compile, the "slow"
compile makes around 1 million calls to
<a href="https://github.com/llvm/llvm-project/blob/a6eb70c052da767aef6b041d0db20bdf3a9e06b5/llvm/lib/Bitcode/Reader/ValueList.cpp#L89">https://github.com/llvm/llvm-project/blob/a6eb70c052da767aef6b041d0db20bdf3a9e06b5/llvm/lib/Bitcode/Reader/ValueList.cpp#L89</a>
and hence the ResolveConstants variable ends up with that many entries.
Resolving these constants is what seems to take most of the time. I think the
bitcode reader is identifying 1 million forward references, so the possible
causes of the slowness that come to mind are:
1. Incorrect identification of forward references
2. Incorrect writing from libLLVM that creates forward references
unnecessarily
3. Slow algorithm to resolve correctly identified and written forward
references.
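To make the three theories above concrete, here is a toy Python model of
forward-reference handling. It is only a sketch of my understanding, not
LLVM's code: Placeholder, read_constants and resolve are hypothetical names,
and the real reader may work differently. It shows that the same chain of
constants needs no placeholders when written in dependency order, but a
placeholder for almost every constant when written in reverse order, which is
the kind of difference theory 2 would cause.

```python
# Toy model only: all names here are hypothetical, not LLVM's actual code.

class Placeholder:
    """Stands in for a constant whose record has not been read yet."""

    def __init__(self, index):
        self.index = index


def read_constants(records):
    """Read records in order; records[i] lists operand indices of constant i.

    Returns (values, pending), where pending holds one (placeholder, index)
    entry for every slot that was referenced before it was defined.
    """
    values = [None] * len(records)
    pending = []
    for i, operands in enumerate(records):
        resolved = []
        for op in operands:
            if values[op] is None:
                # Operand not defined yet: hand out a placeholder.
                values[op] = Placeholder(op)
            resolved.append(values[op])
        if isinstance(values[i], Placeholder):
            # Slot i was referenced earlier, so it must be patched later.
            pending.append((values[i], i))
        values[i] = ("const", i, resolved)
    return values, pending


def resolve(values, pending):
    """Swap every placeholder operand for the real value; return the count."""
    real = {id(ph): values[idx] for ph, idx in pending}
    patched = 0
    for value in values:
        operands = value[2]
        for pos, op in enumerate(operands):
            if isinstance(op, Placeholder):
                operands[pos] = real[id(op)]
                patched += 1
    return patched
```

With records [[], [0], [1], [2]] (dependency order) the pending list is
empty. With [[1], [2], [3], []] (reverse order) it has 3 entries and resolve
patches 3 operands. In this toy the patching is a single pass, so it says
nothing about whether the real resolution step is slow (theory 3).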
A copy of the bitcode is at <a href="http://dev.hubse.com/UnoCoreRt.Wasm.bc.msi">http://dev.hubse.com/UnoCoreRt.Wasm.bc.msi</a> (it's
not really an msi; I just needed a binary extension that the web server would
serve). The file is actually a .7z compressed file, so it needs renaming from
.msi to .7z.
I privately messaged @tlively on Discord, and I believe he has confirmed that
it also takes a long time for him.
<a class="bz_bug_link
bz_status_NEW "
title="NEW - Crash in BitcodeReader.cpp under LTO"
href="show_bug.cgi?id=46750">https://bugs.llvm.org/show_bug.cgi?id=46750</a> looks to be the same area of code,
but not the same problem.
I did spend a bit of time with clang++ in the debugger, but I'm not at all
familiar with it, so I couldn't draw any conclusions about my 3 theories
above.</pre>
</div>
</p>
</body>
</html>