<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - GeekBench4/Lua on Skylake/SSE2 regressed by 12% between r308321 and r308891"
   href="https://bugs.llvm.org/show_bug.cgi?id=34032">34032</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>GeekBench4/Lua on Skylake/SSE2 regressed by 12% between r308321 and r308891
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Windows NT
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Common Code Generator Code
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>zvi.rackover@intel.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>We observed a 12% degradation in the benchmark score between the mentioned
commits. Due to PR33900 the test would time-out in the range of commits, so we
can only get measurements on the boundaries.

Just so we are all on the same page, here is a timeline:
r308321 - GeekBench4/Lua was measured with score X.
r308322 - GeekBench4/Lua started failing due to compile-time timeout (same as
PR33900)
r308321 - GeekBench4/Lua back to life, but with score X-12%.

r308891 intended to fix the timeout problem and according to the log:
 "The cut off at 20 uses was chosen by looking at other cut-offs in LLVM for
user scanning. It's probably too high, but does the job and is very unlikely to
regress anything."

The combination of r308321+r308321 brings the benchmark to a lesser score than
in r308321. 
I ran an experiment on top-of-trunk (r309733) where i applied the following
patch which is intended to identify cases where FindAllMemoryUses bails-out
early due to r308891:

--- a/lib/CodeGen/CodeGenPrepare.cpp
+++ b/lib/CodeGen/CodeGenPrepare.cpp
@@ -4059,8 +4059,10 @@ static bool FindAllMemoryUses(
   for (Use &U : I->uses()) {
     // Conservatively return true if we're seeing a large number or a deep
chain
     // of users. This avoids excessive compilation times in pathological
cases.
-    if (SeenInsts++ >= MaxMemoryUsesToScan)
+    if (SeenInsts++ >= MaxMemoryUsesToScan) {
+      llvm_unreachable("Early exit fired!");
       return true;
+    }

     Instruction *UserI = cast<Instruction>(U.getUser());
     if (LoadInst *LI = dyn_cast<LoadInst>(UserI)) {

With the patch applied, compiled the benchmark and indeed the early-exit did
fire. So unfortunately, the assumption that r308891 does not affect real-world
code (as much as benchmarks can be considered as such) does not hold.

Next, i ran the LLVM Test-Suite with the same patch applied and saw 79
compilations that hit the UNREACHABLE. For reference, here is a list of the
suites (some of the MultiSource suites contained more than one UNREACHABLE
hit):

MultiSource/Applications/ClamAV/clamscan
MultiSource/Applications/JM/ldecod/ldecod
MultiSource/Applications/JM/lencod/lencod
MultiSource/Applications/SIBsim4/SIBsim4 MultiSource/Applications/SPASS/SPASS
MultiSource/Applications/d/make_dparser
MultiSource/Applications/hbd/hbd
MultiSource/Applications/lua/lua
MultiSource/Applications/minisat/minisat
MultiSource/Applications/oggenc/oggenc
MultiSource/Applications/sqlite3/sqlite3
MultiSource/Applications/treecc/treecc
MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000
MultiSource/Benchmarks/Bullet/bullet
MultiSource/Benchmarks/MallocBench/espresso/espresso
MultiSource/Benchmarks/McCat/05-eks/eks
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan
MultiSource/Benchmarks/MiBench/consumer-jpeg/consumer-jpeg
MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame
MultiSource/Benchmarks/MiBench/consumer-typeset/consumer-typeset
MultiSource/Benchmarks/Prolangs-C/cdecl/cdecl
MultiSource/Benchmarks/mafft/pairlocalalign
MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/cjpeg
MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode
MultiSource/Benchmarks/nbench/nbench
MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
SingleSource/Benchmarks/Adobe-C++/loop_unroll
SingleSource/Benchmarks/Misc-C++-EH/spirit

Just to clear, hitting the early-exit does not necessarily mean there was an
affect on the final generated code, but it does show that the added early exit
does not handle only a pathological case.

In light of the above, I think we should take a few steps back to r308322 and
find a solution that avoids the compilation-time explosion reported in PR33900
and in GeekBench, and does not have a negative performance impact.

If we don't have data that shows the positive affect of r308322, i would like
to please request that r308322 and r308321 be disabled until a more robust
solution is found. We should also ensure the 5.0 release branch is fixed
accordingly.

Thanks, Zvi.</pre>
        </div>
      </p>


      <hr>
      <span>You are receiving this mail because:</span>

      <ul>
          <li>You are on the CC list for the bug.</li>
      </ul>
    </body>
</html>