<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - GeekBench4/Lua on Skylake/SSE2 regressed by 12% between r308321 and r308891"
href="https://bugs.llvm.org/show_bug.cgi?id=34032">34032</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>GeekBench4/Lua on Skylake/SSE2 regressed by 12% between r308321 and r308891
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows NT
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Common Code Generator Code
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>zvi.rackover@intel.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>We observed a 12% degradation in the benchmark score between the mentioned
commits. Due to PR33900 the test would time-out in the range of commits, so we
can only get measurements on the boundaries.
Just so we are all on the same page, here is a timeline:
r308321 - GeekBench4/Lua was measured with score X.
r308322 - GeekBench4/Lua started failing due to compile-time timeout (same as
PR33900)
r308321 - GeekBench4/Lua back to life, but with score X-12%.
r308891 intended to fix the timeout problem and according to the log:
"The cut off at 20 uses was chosen by looking at other cut-offs in LLVM for
user scanning. It's probably too high, but does the job and is very unlikely to
regress anything."
The combination of r308321+r308321 brings the benchmark to a lesser score than
in r308321.
I ran an experiment on top-of-trunk (r309733) where i applied the following
patch which is intended to identify cases where FindAllMemoryUses bails-out
early due to r308891:
--- a/lib/CodeGen/CodeGenPrepare.cpp
+++ b/lib/CodeGen/CodeGenPrepare.cpp
@@ -4059,8 +4059,10 @@ static bool FindAllMemoryUses(
for (Use &U : I->uses()) {
// Conservatively return true if we're seeing a large number or a deep
chain
// of users. This avoids excessive compilation times in pathological
cases.
- if (SeenInsts++ >= MaxMemoryUsesToScan)
+ if (SeenInsts++ >= MaxMemoryUsesToScan) {
+ llvm_unreachable("Early exit fired!");
return true;
+ }
Instruction *UserI = cast<Instruction>(U.getUser());
if (LoadInst *LI = dyn_cast<LoadInst>(UserI)) {
With the patch applied, compiled the benchmark and indeed the early-exit did
fire. So unfortunately, the assumption that r308891 does not affect real-world
code (as much as benchmarks can be considered as such) does not hold.
Next, i ran the LLVM Test-Suite with the same patch applied and saw 79
compilations that hit the UNREACHABLE. For reference, here is a list of the
suites (some of the MultiSource suites contained more than one UNREACHABLE
hit):
MultiSource/Applications/ClamAV/clamscan
MultiSource/Applications/JM/ldecod/ldecod
MultiSource/Applications/JM/lencod/lencod
MultiSource/Applications/SIBsim4/SIBsim4 MultiSource/Applications/SPASS/SPASS
MultiSource/Applications/d/make_dparser
MultiSource/Applications/hbd/hbd
MultiSource/Applications/lua/lua
MultiSource/Applications/minisat/minisat
MultiSource/Applications/oggenc/oggenc
MultiSource/Applications/sqlite3/sqlite3
MultiSource/Applications/treecc/treecc
MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000
MultiSource/Benchmarks/Bullet/bullet
MultiSource/Benchmarks/MallocBench/espresso/espresso
MultiSource/Benchmarks/McCat/05-eks/eks
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan
MultiSource/Benchmarks/MiBench/consumer-jpeg/consumer-jpeg
MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame
MultiSource/Benchmarks/MiBench/consumer-typeset/consumer-typeset
MultiSource/Benchmarks/Prolangs-C/cdecl/cdecl
MultiSource/Benchmarks/mafft/pairlocalalign
MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/cjpeg
MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode
MultiSource/Benchmarks/nbench/nbench
MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
SingleSource/Benchmarks/Adobe-C++/loop_unroll
SingleSource/Benchmarks/Misc-C++-EH/spirit
Just to clear, hitting the early-exit does not necessarily mean there was an
affect on the final generated code, but it does show that the added early exit
does not handle only a pathological case.
In light of the above, I think we should take a few steps back to r308322 and
find a solution that avoids the compilation-time explosion reported in PR33900
and in GeekBench, and does not have a negative performance impact.
If we don't have data that shows the positive affect of r308322, i would like
to please request that r308322 and r308321 be disabled until a more robust
solution is found. We should also ensure the 5.0 release branch is fixed
accordingly.
Thanks, Zvi.</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>