<html>

    <head>

      <base href="https://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - bubble sort test performance is 2 times worse with -unroll-runtime-epilog"

   href="https://llvm.org/bugs/show_bug.cgi?id=30939">30939</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>bubble sort test performance is 2 times worse with -unroll-runtime-epilog

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Loop Optimizer

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>evstupac@gmail.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Created <span class=""><a href="attachment.cgi?id=17562" name="attach_17562" title="test compiled with epilog">attachment 17562</a> <a href="attachment.cgi?id=17562&action=edit" title="test compiled with epilog">[details]</a></span>

test compiled with epilog

The cause of the performance difference between prologue and epilogue runtime

unrolling is unclear in this case.

Performance differs by a factor of 2 on x86 when

SingleSource/Benchmarks/Stanford/Bubblesort.c is compiled with

-O2 -march=core-avx2 -mllvm -unroll-runtime-epilog=true (bad case)

and

-O2 -march=core-avx2 -mllvm -unroll-runtime-epilog=false (good case)

Attached are the assemblies from the current compiler:

bs_epil.s

bs_prol.s

and the assembly of the hottest loop:

bs_epil_loop.s

bs_prol_loop.s

The code looks very similar, and with some assembly modifications I was able to

make the hottest loops identical while keeping the same performance gap (2 times).

Deeper analysis showed that the hottest loop (99% of execution time) mostly

consists of stalls from unpredictable branches:

while ( i < top ) {

    if ( sortlist[i] > sortlist[i+1] ) {

        j = sortlist[i];

        sortlist[i] = sortlist[i+1];

        sortlist[i+1] = j;

    }

    i=i+1;

}

sortlist is a randomly filled array, so the comparison in the loop is

completely unpredictable. The distance between branches is very short.

This makes the test very sensitive to code shifts and to the order of memory

accesses (as these influence branch prediction in the loop).
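
To make the branch behaviour concrete, here is a minimal sketch of one pass of
that loop that also counts how often the swap branch is taken. It is not taken
from Bubblesort.c: the names bubble_pass, fill_random and taken, the 0-based
indexing and the generator constants are assumptions for illustration only.

```c
/* One inner-loop pass over sortlist[0..top] (top + 1 elements).
 * Returns how many times the compare-and-swap branch was taken.
 * The counter and the 0-based indexing are additions for illustration. */
int bubble_pass(int *sortlist, int top)
{
    int taken = 0;
    int i = 0;
    while (i < top) {
        if (sortlist[i] > sortlist[i + 1]) {
            int j = sortlist[i];
            sortlist[i] = sortlist[i + 1];
            sortlist[i + 1] = j;
            taken = taken + 1;
        }
        i = i + 1;
    }
    return taken;
}

/* Fill a[0..n-1] with pseudo-random values in [0, 1000).
 * A small linear congruential generator with arbitrary constants;
 * this is a stand-in, not the benchmark's own generator. */
void fill_random(int *a, int n, unsigned int seed)
{
    int i;
    for (i = 0; i < n; i++) {
        seed = seed * 1103515245u + 12345u;
        a[i] = (int)((seed >> 16) % 1000u);
    }
}
```

Calling fill_random(a, n, seed) followed by bubble_pass(a, n - 1) gives the
taken count for one pass over random data. An already sorted array gives 0,
which is the fully predictable case.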

See related discussions:

<a href="https://reviews.llvm.org/D18158">https://reviews.llvm.org/D18158</a>

<a href="https://reviews.llvm.org/D24593">https://reviews.llvm.org/D24593</a></pre>

        </div>

      </p>


    </body>

</html>