<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Loop vectorizer does not handle loop with a trip count equal to the vector size"
   href="http://llvm.org/bugs/show_bug.cgi?id=17830">17830</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Loop vectorizer does not handle loop with a trip count equal to the vector size
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>tools
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>opt
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>fwinter@jlab.org
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvmbugs@cs.uiuc.edu
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=11494" name="attach_11494" title="L3 has trip count 4">attachment 11494</a> <a href="attachment.cgi?id=11494&action=edit" title="L3 has trip count 4">[details]</a></span>
L3 has trip count 4

In a nested loop setup where the inner loop's trip count equals the vector
size (e.g. 4 in the case of SSE), the inner loop is not vectorized. Because the
vectorized loop would have a trip count of 1, the loop itself could also be
removed, leaving only the loop body.

I'll attach an example which computes something similar to

for (int i = start ; i < end ; ++i )
  for (int p = 0 ; p < 4 ; ++p )
    a[i*4+p] = b[i*4+p] + c[i*4+p];

(NB: when compiled from C this does get vectorized, possibly due to unrolling
followed by the SLP vectorizer. I would like to specifically address the loop
vectorizer here.)
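
To make the request concrete, here is a minimal C sketch of the two shapes,
assuming float elements and an inner trip count of 4. The function names and
the VF constant are hypothetical and do not appear in the attached IR; this
only illustrates the transformation, not how the loop vectorizer would
implement it.

```c
/* Hypothetical names; the attached IR is not reproduced here. */
enum { VF = 4 };  /* assumed vector width: 4 x float in one SSE register */

/* Original shape: the inner loop's trip count equals VF. */
void add_nested(float *a, const float *b, const float *c, int start, int end)
{
    for (int i = start; i < end; ++i)
        for (int p = 0; p < VF; ++p)
            a[i * VF + p] = b[i * VF + p] + c[i * VF + p];
}

/* Requested shape: the inner loop is gone and only its body remains.
   The four adds touch consecutive elements, so they correspond to a
   single 4-wide vector add. */
void add_flattened(float *a, const float *b, const float *c, int start, int end)
{
    for (int i = start; i < end; ++i) {
        int base = i * VF;
        a[base + 0] = b[base + 0] + c[base + 0];
        a[base + 1] = b[base + 1] + c[base + 1];
        a[base + 2] = b[base + 2] + c[base + 2];
        a[base + 3] = b[base + 3] + c[base + 3];
    }
}
```

Both functions write the same values for any range [start, end); the second
one simply has no inner loop left to execute.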

When I run the following command on the attached IR:

opt -loop-vectorize -debug-only=loop-vectorize loop4.ll -S

Then the loop vectorizer complains:

LV: Checking a loop in "main"
LV: Found a loop: L3
LV: Found a loop with a very small trip count. This loop is not worth
vectorizing.
LV: Not vectorizing.

This is not good. It would be great if the loop vectorizer could handle this
loop.</pre>
        </div>
      </p>
    </body>
</html>