<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - bubble sort flakiness"
   href="https://llvm.org/bugs/show_bug.cgi?id=30404">30404</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>bubble sort flakiness
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>Test Suite
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>QMTest
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>evstupac@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>The test performance differs 2 times when test:
SingleSource/Benchmarks/Stanford/Bubblesort.c is compiled with

-O2 -march=core-avx2 -mllvm -unroll-runtime-epilog=true (bad case)
and
-O2 -march=core-avx2 -mllvm -unroll-runtime-epilog=false (good case)

Attached assemblies from current compiler:
bs_epil.s
bs_prol.s
and assembly from hottest loop:
bs_epil_loop.s
bs_prol_loop.s

The code looks very similar and with some assembly modifications I was able to
make hottest loops identical keeping the same performance gap (2 times).

Deeper analysis uncovered that hottest loop (99% of execution time) mostly
consist of memory and code stalls:

while ( i<top ) {    
    if ( sortlist[i] > sortlist[i+1] ) {
        j = sortlist[i];
        sortlist[i] = sortlist[i+1];
        sortlist[i+1] = j;
    }
    i=i+1;
}

sortlist is randomly filled array. That way comparison in the loop is
completely unpredictable. The same we could say about stores. The distance
between branches is very short (the same about memory accesses).
This makes the test very sensitive to code shifts and memory accesses order and
distance.

<a href="https://reviews.llvm.org/D24593">https://reviews.llvm.org/D24593</a>
Contains a patch to the test that makes it not code/memory sensitive (and
therefore more stable) without significant performance difference (from good
case).</pre>
        </div>
      </p>
      <hr>
      <span>You are receiving this mail because:</span>
      
      <ul>
          <li>You are on the CC list for the bug.</li>
      </ul>
    </body>
</html>