<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - LIBGAV1 perf regression trunk vs. Clang 9"
   href="https://bugs.llvm.org/show_bug.cgi?id=44539">44539</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>LIBGAV1 perf regression trunk vs. Clang 9
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Windows NT
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>andrea.dibiagio@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>This is related to <a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Perf regressions on many benchmarks with AMD Threadripper"
   href="show_bug.cgi?id=44411">bug 44411</a>.

There is a significant performance regression in the LIBGAV1 benchmark.

---

Compiler flags: -O3 -march=znver1

Numbers are FPS (frames per second); higher is better.


```
single thread -- 2000 frames
========
                         |  GCC 7.4 |  CLANG 9.x  |  CLANG master
chimera_8b_1080p.ivf     |  22.77   |  21.86      |  18.71
chimera_10b_1080p.ivf    |  11.31   |  12.68      |  11.67   
summer_nature_1080p.ivf  |  21.10   |  21.02      |  18.29
summer_nature_4K.ivf     |   4.74   |   4.57      |   3.94


multi threaded (8) -- no frame limit
========
                         |  GCC 7.4 |  CLANG 9.x  |  CLANG master
chimera_8b_1080p.ivf     |  43.51   |  42.76      |  34.80
chimera_10b_1080p.ivf    |  16.18   |  18.98      |  17.12
summer_nature_1080p.ivf  |  64.22   |  63.66      |  53.89
summer_nature_4K.ivf     |  17.57   |  17.17      |  14.70


multi threaded (16) -- no frame limit
========
                         |  GCC 7.4 |  CLANG 9.x  |  CLANG master
chimera_8b_1080p.ivf     |  43.40   |  43.05      |  38.73
chimera_10b_1080p.ivf    |  16.54   |  19.68      |  18.67
summer_nature_1080p.ivf  |  62.72   |  62.20      |  54.96
summer_nature_4K.ivf     |  19.31   |  19.11      |  17.13
```


Single-threaded execution is up to ~14% slower on master than on Clang 9.x
(chimera_8b_1080p.ivf: 21.86 vs. 18.71 FPS).
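
As a cross-check on that figure, here is a minimal sketch that recomputes the
single-thread slowdown per clip. The FPS values are copied from the
single-thread table above; the function and variable names are illustrative
and not part of libgav1 or LLVM.

```python
# Single-thread FPS copied from the table above: (Clang 9.x, Clang master).
SINGLE_THREAD_FPS = {
    "chimera_8b_1080p.ivf": (21.86, 18.71),
    "chimera_10b_1080p.ivf": (12.68, 11.67),
    "summer_nature_1080p.ivf": (21.02, 18.29),
    "summer_nature_4K.ivf": (4.57, 3.94),
}


def slowdown_percent(baseline_fps, new_fps):
    """Return how much lower new_fps is than baseline_fps, in percent."""
    return (baseline_fps - new_fps) / baseline_fps * 100.0


def single_thread_slowdowns():
    """Map each clip to its Clang 9.x to master slowdown, rounded to 0.1%."""
    return {
        clip: round(slowdown_percent(clang9, master), 1)
        for clip, (clang9, master) in SINGLE_THREAD_FPS.items()
    }


if __name__ == "__main__":
    for clip, pct in single_thread_slowdowns().items():
        print(f"{clip:25s} {pct:5.1f}% slower")
```

With these inputs the slowdown works out to roughly 14.4%, 8.0%, 13.0% and
13.8%, so the ~14% figure corresponds to the worst case
(chimera_8b_1080p.ivf) rather than to the average over all four clips.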

I will post a full description of the underlying issue behind this perf
regression later.

tl;dr: the performance degradation in libgav1 is caused by poor decisions made
by the "x86 cmov converter" pass. In particular, a number of CMOVs in a hot
loop are now sub-optimally expanded into if-then blocks. Clang 9 did not expand
those CMOVs, which was the correct decision.
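
To illustrate why expanding a CMOV into a branch can be a loss, here is a toy
cost model. It is only a sketch under stated assumptions: it is not the
heuristic used by the "x86 cmov converter" pass, all cycle counts are invented
for illustration, and nothing in it is measured on libgav1.

```python
def branch_cycles(path_cycles, mispredict_rate, mispredict_penalty):
    """Expected cycles per iteration if the select is lowered to a branch."""
    return path_cycles + mispredict_rate * mispredict_penalty


def cmov_cycles(both_paths_cycles, cmov_latency):
    """Cycles per iteration if the select stays a CMOV (both inputs computed)."""
    return both_paths_cycles + cmov_latency


def break_even_mispredict_rate(path_cycles, both_paths_cycles,
                               cmov_latency, mispredict_penalty):
    """Misprediction rate at which the branch and CMOV forms cost the same."""
    return (both_paths_cycles + cmov_latency - path_cycles) / mispredict_penalty


if __name__ == "__main__":
    # All numbers below are made up for illustration, not measured.
    PATH, BOTH, CMOV_LAT, PENALTY = 2.0, 3.0, 1.0, 20.0
    for rate in (0.01, 0.30):
        b = branch_cycles(PATH, rate, PENALTY)
        c = cmov_cycles(BOTH, CMOV_LAT)
        print(f"mispredict rate {rate:.2f}: branch {b:.1f}, cmov {c:.1f} cycles")
    print("break-even rate:",
          break_even_mispredict_rate(PATH, BOTH, CMOV_LAT, PENALTY))
```

With these made-up inputs the branch form wins at a 1% misprediction rate
(2.2 vs. 4.0 cycles) and loses at 30% (8.0 vs. 4.0 cycles); the break-even
point is a 10% misprediction rate. In this simplified picture, a hot loop with
a poorly predictable condition is where keeping the CMOV pays off.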

Disabling that pass fully recovers the lost performance. For example, decoding
"chimera_8b_1080p.ivf" with a single thread then averages 22.14 FPS.

As I wrote, I plan to post all my findings in a follow-up comment.

NOTE: this is unlikely to be AMD-specific. For example, I can reproduce the
poor CMOV expansions when generating code for Skylake.</pre>
        </div>
      </p>


    </body>
</html>