[llvm-bugs] [Bug 34681] New: [LV/LAA] Vectorizer creates a dead vectorized loop body when specializing for stride=1

Tue Sep 19 23:22:44 PDT 2017

https://bugs.llvm.org/show_bug.cgi?id=34681

            Bug ID: 34681
           Summary: [LV/LAA] Vectorizer creates a dead vectorized loop
                    body when specializing for stride=1
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Loop Optimizer
          Assignee: unassignedbugs at nondot.org
          Reporter: dorit.nuzman at intel.com
                CC: llvm-bugs at lists.llvm.org

Created attachment 19177
  --> https://bugs.llvm.org/attachment.cgi?id=19177&action=edit
patch for a fix

Consider the following testcase:

void matmul(unsigned int N, int *C, int *A, int *B) {
  unsigned int i, j, k;
  for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++) {
      C[i*N+j] = 0;
      for (k = 0; k < N; k++) {
        C[i*N+j] += (int)A[i*N+k] * (int)B[k*N+j];
      }
    }
  }
}

Compiling the testcase above with -O2 -m32 results in specializing the loop
for the case where the unknown stride N == 1.
However, the vectorized loop is only executed if the iteration count N >= VF,
which implies N > 1.
The two conditions cannot both hold, so the vectorized loop body is dead
code. (Eventually this dead code is identified and removed.)
It would be better to avoid specializing for the stride when we know that the
stride==1 predicate is going to contradict the
loop-minimum-iteration-count guard.
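
To make the contradiction concrete, here is a minimal sketch in C that models
only the two run-time guards described above. It is not the vectorizer's
actual output or code: the helper names and the value VF = 4 are assumptions
chosen for illustration (the report only relies on VF > 1).

```c
#include <stdbool.h>

/* Hypothetical vectorization factor; any value greater than 1 behaves the same. */
#define VF 4u

/* Models the two guards in front of the vectorized loop body:
 *   - minimum-iteration-count guard: the trip count N must be >= VF
 *   - stride predicate:              the symbolic stride N must be == 1
 * The vectorized body runs only when both hold. */
static bool vector_body_reachable(unsigned int n) {
    bool enough_iterations = (n >= VF);
    bool stride_is_one = (n == 1u);
    return enough_iterations && stride_is_one;
}

/* Counts how many values of N in [0, limit) would reach the vectorized body. */
static unsigned int count_reachable(unsigned int limit) {
    unsigned int count = 0;
    for (unsigned int n = 0; n < limit; n++) {
        if (vector_body_reachable(n)) {
            count++;
        }
    }
    return count;
}
```

Because N plays both roles here (trip count and stride), no value of N passes
both checks, so count_reachable returns 0 for any limit whenever VF > 1.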

The attached patch does that, and works for the simple case at hand.
I'll upload it for review after further testing. Hopefully it's of interest,
although I haven't seen any performance gains with it so far.
