<html>

    <head>

      <base href="https://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - Inefficient unrolling in vectorization pass when VF==1"

   href="https://llvm.org/bugs/show_bug.cgi?id=23217">23217</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>Inefficient unrolling in vectorization pass when VF==1

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>trunk

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Loop Optimizer

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>wmi@google.com

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvmbugs@cs.uiuc.edu

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Created <span class=""><a href="attachment.cgi?id=14199" name="attach_14199" title="bad.s">attachment 14199</a> <a href="attachment.cgi?id=14199&action=edit" title="bad.s">[details]</a></span>

bad.s

While analyzing an internal benchmark, we found that the unrolling done by the

loop vectorization pass when VF==1 is inefficient. The simple testcase 1.c

below shows the problem:

1.c:

int a[1000], N;

void foo() {

  long i;

  for (i = 0; i < N; i++) {

    a[i*7] = 3;

  }

}

~/workarea/llvm-r234389/build/bin/clang -O2 -S 1.c

In the loop vectorization pass, VF=1 and UF=2 are computed for the above loop.

Because VF==1, no vectorization is done, but the loop is still unrolled by a

factor of two, and a remainder loop is generated.
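
As a rough source-level illustration of the VF=1, UF=2 result described above,

the loop could be pictured as the sketch below. The function names

store_reference and store_uf2 and the dst/n parameters are hypothetical and

exist only so the sketch is self-contained. The real transformation is done on

LLVM IR and also emits runtime checks that are not shown here.

```c
/* Original loop from 1.c, parameterized for illustration (hypothetical helper). */
void store_reference(int *dst, long n) {
  long i;
  for (i = 0; i < n; i++)
    dst[i * 7] = 3;
}

/* Assumed shape after unrolling by two: a main loop that handles two
   iterations per trip, followed by a scalar remainder loop. */
void store_uf2(int *dst, long n) {
  long i = 0;
  /* Largest even trip count that does not exceed n (0 when n is not positive). */
  long main_trips = n > 0 ? n - (n % 2) : 0;
  for (; i < main_trips; i += 2) {
    dst[i * 7] = 3;
    dst[(i + 1) * 7] = 3;
  }
  /* Remainder loop: at most one leftover iteration. */
  for (; i < n; i++)
    dst[i * 7] = 3;
}
```

Both functions write the same elements; only the loop structure differs.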

In the loop unroll pass, the already unrolled loop body is unrolled again by a

factor of two, and the remainder loop is unrolled by a factor of four. Two

extra loop prologues and a number of other checks are generated. See the

attached bad.s.

If we disable the unrolling in the loop vectorization pass when VF==1, the loop

unroll pass unrolls the above loop by a factor of four all at once and

generates far less extra code, such as prologues and overflow checks. See the

attached good.s.
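
For comparison, a single unroll by a factor of four might look roughly like the

sketch below at the source level. The name store_unroll4 and its parameters are

hypothetical, and the prologue-style handling of leftover iterations is an

assumption based on the wording of this report, not taken from good.s.

```c
/* Assumed shape of a single unroll by four with one prologue loop. */
void store_unroll4(int *dst, long n) {
  long i = 0;
  /* Number of leftover iterations to peel off first (0 when n is not positive). */
  long extra = n > 0 ? n % 4 : 0;
  /* Prologue: run the n % 4 leftover iterations one at a time. */
  for (; i < extra; i++)
    dst[i * 7] = 3;
  /* Main loop: the remaining trip count is a multiple of four. */
  for (; i < n; i += 4) {
    dst[i * 7] = 3;
    dst[(i + 1) * 7] = 3;
    dst[(i + 2) * 7] = 3;
    dst[(i + 3) * 7] = 3;
  }
}
```

Only one extra loop is needed here, versus the two prologues reported for bad.s.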

We experimentally disabled the unrolling in the loop vectorization pass and saw

the internal benchmark improve by 5% on Sandy Bridge and 9% on Westmere.

Google ref b/19469562</pre>

        </div>

      </p>

    </body>

</html>