<html>

    <head>

      <base href="http://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - Possible regression of vectorization of Linpack on ARM"

   href="http://llvm.org/bugs/show_bug.cgi?id=15247">15247</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>Possible regression of vectorization of Linpack on ARM

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Loop Optimizer

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>renato.golin@linaro.org

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvmbugs@cs.uiuc.edu

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Created <span class=""><a href="attachment.cgi?id=10004" name="attach_10004" title="result of the runs with clang and GCC">attachment 10004</a> <a href="attachment.cgi?id=10004&action=edit" title="result of the runs with clang and GCC">[details]</a></span>

result of the runs with clang and GCC

Running Linpack from (<a href="http://www.netlib.org/benchmark/linpackc.new">http://www.netlib.org/benchmark/linpackc.new</a>) with clang

with and without vectorization, I get a 40% performance boost on Intel but a 3%

slowdown on ARM.

Looking at the code, the intermixing of ARM and NEON code looks particularly

bad, so it seems some shuffles are not being lowered correctly or are being

generated in a pattern that the back-end instruction combiner doesn't recognize.

Nadav said he got a 40% boost on ARM too, so this could be a regression, but not

necessarily in the loop vectorizer, since the BB vectorizer uses the same cost

model and is known to overuse shuffles.

Attached is the CSV with the results of the runs with clang and GCC. The

compilation command lines are below:

GCC:

$ gcc -O3 linpack.c

Clang Vect:

$ clang -O3 linpack.c

Clang No-vect:

$ clang -O3 -fno-vectorize linpack.c</pre>
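<pre>For reference, the boost and slowdown figures quoted above can be read as the
relative change of the vectorized build against the non-vectorized one. The
sketch below is only an illustration: the function name is hypothetical, the
numbers are placeholders rather than values from the attached CSV, and it
assumes the benchmark reports a throughput figure where higher is better.

```python
def relative_change_percent(baseline, candidate):
    """Return the change from baseline to candidate as a percentage of baseline.

    Assumes both values are throughput figures (higher is better), so a
    positive result is a boost and a negative result is a slowdown.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) * 100.0 / baseline


# Placeholder throughput figures, NOT taken from the attachment.
no_vect = 100.0      # hypothetical result for: clang -O3 -fno-vectorize
vect_faster = 140.0  # hypothetical vectorized result that is faster
vect_slower = 97.0   # hypothetical vectorized result that is slower

print(relative_change_percent(no_vect, vect_faster))
print(relative_change_percent(no_vect, vect_slower))
```

If the benchmark output is a run time instead (lower is better), the sign of
the result flips, so the two conventions should not be mixed in one table.</pre>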

        </div>

      </p>


    </body>

</html>