<html>

    <head>

      <base href="https://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - bubble sort flakiness"

   href="https://llvm.org/bugs/show_bug.cgi?id=30404">30404</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>bubble sort flakiness

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>Test Suite

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>QMTest

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>evstupac@gmail.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>The test performance differs 2 times when test:

SingleSource/Benchmarks/Stanford/Bubblesort.c is compiled with

-O2 -march=core-avx2 -mllvm -unroll-runtime-epilog=true (bad case)

and

-O2 -march=core-avx2 -mllvm -unroll-runtime-epilog=false (good case)

Attached assemblies from current compiler:

bs_epil.s

bs_prol.s

and assembly from hottest loop:

bs_epil_loop.s

bs_prol_loop.s

The code looks very similar and with some assembly modifications I was able to

make hottest loops identical keeping the same performance gap (2 times).

Deeper analysis uncovered that hottest loop (99% of execution time) mostly

consist of memory and code stalls:

while ( i<top ) {    

    if ( sortlist[i] > sortlist[i+1] ) {

        j = sortlist[i];

        sortlist[i] = sortlist[i+1];

        sortlist[i+1] = j;

    }

    i=i+1;

}

sortlist is randomly filled array. That way comparison in the loop is

completely unpredictable. The same we could say about stores. The distance

between branches is very short (the same about memory accesses).

This makes the test very sensitive to code shifts and memory accesses order and

distance.

<a href="https://reviews.llvm.org/D24593">https://reviews.llvm.org/D24593</a>

Contains a patch to the test that makes it not code/memory sensitive (and

therefore more stable) without significant performance difference (from good

case).</pre>

        </div>

      </p>

      <hr>

      <span>You are receiving this mail because:</span>

      <ul>

          <li>You are on the CC list for the bug.</li>

      </ul>

    </body>

</html>