<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - AsmPrinter puts NamedRegionTimer around every instruction, causing big overhead in clang -ftime-report"
href="https://bugs.llvm.org/show_bug.cgi?id=40303">40303</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>AsmPrinter puts NamedRegionTimer around every instruction, causing big overhead in clang -ftime-report
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>All
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Common Code Generator Code
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>aras@nesnausk.org
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Clang -ftime-report seems to have quite a lot of overhead compared to when the
flag is not used. Curiously, in the reported times, passes like "X86 Assembly
Printer" take up a lot of time.
The overhead of having -ftime-report on is 30-60% in most cases, which sounds
like a lot. Some tests I did locally (the actual files don't matter; basically
any non-trivial .cpp file compilation will do):
                     regular  -ftime-report
catch.cpp              1.337          1.950
catch.cpp -O2          4.616          6.586
stl.cpp                0.882          1.269
unityformat.cpp        7.195          7.312
range-compr.cpp        5.352          6.129
shader.cpp             6.000          9.555
shader.cpp -O2        12.635         20.061
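For reference, the relative overhead per row is (report - regular) / regular.
A minimal sketch of that arithmetic follows; overheadPercent, Row and
printOverheads are hypothetical names used only for this illustration, and the
numbers are copied from the table above:

```cpp
#include <cstdio>

// Hypothetical helper: slowdown of the -ftime-report run relative to the
// regular run, expressed as a percentage of the regular time.
double overheadPercent(double Regular, double WithReport) {
  return (WithReport - Regular) / Regular * 100.0;
}

struct Row {
  const char *Name;
  double Regular;
  double WithReport;
};

// Timings copied from the table above.
const Row Rows[] = {
    {"catch.cpp", 1.337, 1.950},       {"catch.cpp -O2", 4.616, 6.586},
    {"stl.cpp", 0.882, 1.269},         {"unityformat.cpp", 7.195, 7.312},
    {"range-compr.cpp", 5.352, 6.129}, {"shader.cpp", 6.000, 9.555},
    {"shader.cpp -O2", 12.635, 20.061},
};

// Prints one "name  overhead%" line per row.
void printOverheads() {
  for (const Row &R : Rows)
    std::printf("%-18s %5.1f%%\n", R.Name,
                overheadPercent(R.Regular, R.WithReport));
}
```

For example, catch.cpp comes out at about 46%, while unityformat.cpp stays
under 2%.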
My guess is that's because lib/CodeGen/AsmPrinter/AsmPrinter.cpp basically has
two timer samples (NamedRegionTimer) for every instruction, and for every
"Handler" that it invokes.
EmitFunctionBody basically looks like:
for (auto &MBB : *MF) {
  for (auto &MI : MBB) {
    for (const HandlerInfo &HI : Handlers) {
      NamedRegionTimer T(...);
      beginInstruction();
    }
    // ...
    for (const HandlerInfo &HI : Handlers) {
      NamedRegionTimer T(...);
      endInstruction();
    }
  }
}
On top of that, every timer sample involves getting the elapsed time, process
times, and memory usage, and each region takes two such samples (one at its
beginning and one at its end).
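To make the sample count concrete, here is a minimal standalone sketch. It is
not LLVM code: CountingRegionTimer, samplesPerInstructionTimers and
samplesSingleTimer are hypothetical names, and the "timer" only counts how
often a clock would be read instead of reading one. The single-timer variant
is shown purely for comparison, not as a proposed fix:

```cpp
#include <cstddef>

// Hypothetical stand-in for a scoped region timer: it records one sample on
// entry and one on exit, instead of querying any real clock.
struct CountingRegionTimer {
  explicit CountingRegionTimer(std::size_t &Counter) : Samples(Counter) {
    ++Samples; // sample at the beginning of the region
  }
  ~CountingRegionTimer() { ++Samples; } // sample at the end of the region
  std::size_t &Samples;
};

// Same loop shape as above: one timer per handler, both before and after
// every instruction.
std::size_t samplesPerInstructionTimers(std::size_t NumInstrs,
                                        std::size_t NumHandlers) {
  std::size_t Samples = 0;
  for (std::size_t I = 0; I < NumInstrs; ++I) {
    for (std::size_t H = 0; H < NumHandlers; ++H) {
      CountingRegionTimer T(Samples); // beginInstruction() would run here
    }
    for (std::size_t H = 0; H < NumHandlers; ++H) {
      CountingRegionTimer T(Samples); // endInstruction() would run here
    }
  }
  return Samples;
}

// For comparison only: a single timer around the whole loop.
std::size_t samplesSingleTimer(std::size_t NumInstrs,
                               std::size_t NumHandlers) {
  std::size_t Samples = 0;
  std::size_t HandlerCalls = 0;
  {
    CountingRegionTimer T(Samples);
    for (std::size_t I = 0; I < NumInstrs; ++I)
      for (std::size_t H = 0; H < NumHandlers; ++H)
        HandlerCalls += 2; // begin + end handler calls, not timed one by one
  }
  (void)HandlerCalls; // only kept to mirror the work done in the loop
  return Samples;
}
```

With 1000 instructions and 2 handlers, the first function counts 8000 samples
and the second counts 2.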
I'm testing this with a couple days old clang trunk (8.0.0), but seemingly the
issue has been there for a while. Haven't tracked down how far back it exists.</pre>
</div>
</p>
</body>
</html>