[llvm-bugs] [Bug 34681] New: [LV/LAA] Vectorizer creates a dead vectorized loop body when specializing for stride=1
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Sep 19 23:22:44 PDT 2017
https://bugs.llvm.org/show_bug.cgi?id=34681
Bug ID: 34681
Summary: [LV/LAA] Vectorizer creates a dead vectorized loop
body when specializing for stride=1
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Loop Optimizer
Assignee: unassignedbugs at nondot.org
Reporter: dorit.nuzman at intel.com
CC: llvm-bugs at lists.llvm.org
Created attachment 19177
--> https://bugs.llvm.org/attachment.cgi?id=19177&action=edit
patch for a fix
Consider the following testcase:
void matmul(unsigned int N, int *C, int *A, int *B) {
  unsigned int i, j, k;
  for (i = 0; i < N; i++) {
    for (j = 0; j < N; j++) {
      C[i*N+j] = 0;
      for (k = 0; k < N; k++) {
        C[i*N+j] += (int)A[i*N+k] * (int)B[k*N+j];
      }
    }
  }
}
Compiling this testcase with -O2 -m32 results in specializing the innermost
loop for the case where the unknown stride N (of the access B[k*N+j]) is 1.
However, the vectorized loop will only be executed if the iteration count
N >= VF, which implies N > 1.
The two conditions cannot co-exist, so the vectorized loop body becomes dead
code. (Eventually this dead code is identified and gets removed).
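To make the contradiction concrete, here is a minimal sketch in C of the two
conditions that guard the vectorized loop. It is only an illustration, not the
code the vectorizer emits: the function name vector_loop_reachable and the
value VF = 4 are assumptions made for the example (the real VF depends on the
target), and the actual checks are generated as IR, not C.

```c
/* Hypothetical vectorization factor; the real value is target-dependent. */
#define VF 4u

/*
 * Returns 1 if the vectorized loop could be entered for a given N.
 * In the matmul testcase, N is both the trip count of the k-loop and
 * the stride of the access B[k*N+j].
 */
static int vector_loop_reachable(unsigned int n) {
    int min_iters_ok  = (n >= VF); /* minimum-iteration-count guard */
    int stride_is_one = (n == 1u); /* stride==1 predicate from specialization */
    return min_iters_ok && stride_is_one;
}
```

For any VF >= 2 the two conditions can never hold at the same time, so the
function returns 0 for every N, which is why the vectorized body is dead.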
It would have been better to avoid specialization for the stride if we know
that the stride==1 predicate is going to contradict the
loop-minimum-iteration-count guard.
The attached patch does that, and works for the simple case at hand.
I'll upload for review after further testing. Hopefully it's of interest...
(haven't seen any performance gains with it so far).