[llvm-bugs] [Bug 36448] New: Vectorization improvement opportunity for loops with stride

Mon Feb 19 21:04:50 PST 2018

https://bugs.llvm.org/show_bug.cgi?id=36448

            Bug ID: 36448
           Summary: Vectorization improvement opportunity for loops with
                    stride
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: serguei.katkov at azul.com
                CC: llvm-bugs at lists.llvm.org

Consider the following loop (https://godbolt.org/g/W8z3dY):
void testStride(int a[], int b[], int N) {
  for (int i = 0; i < N; i+=2)
    a[i] = b[i];
}

If we specify AVX-512 support (-march=skylake-avx512), LLVM is able to
vectorize it using gather/scatter.

However, without AVX-512 support LLVM does not vectorize this loop: its cost
model decides that vectorization is unprofitable because the memory accesses
would have to be scalarized.

At the same time, the LLVM vectorizer supports masked load/store, but does not
use it for loops with strided access. It uses it only for loops with conditions.

Specifically, if I rewrite the loop as
void testCond(int a[], int b[], int N) {
  for (int i = 0; i < N; i++)
    if ((i % 2) == 0)
      a[i] = b[i];
}

LLVM vectorizes this loop and uses masked load/store. However, it fails to
recognize the simple stride pattern in the mask and recomputes the mask on
each iteration.

So I guess there are two opportunities here:
1) Support masked load/store for strided memory access.
2) Be smarter about detecting a loop-invariant mask and hoisting it out of the
   loop.
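
To make the two points concrete, here is a rough sketch in plain C of the shape
the vectorized code could take. It is a scalar emulation and not what LLVM
emits: the name testStrideMasked, the vector width VF = 8 (eight 32-bit lanes,
as in one 256-bit register) and the lane loop standing in for a single masked
load plus masked store are all assumptions made for illustration. The point is
only that, when VF is a multiple of the stride, the mask is the same for every
vector iteration and can be built once before the loop.

```c
#include <stddef.h>

#define VF 8 /* assumed vector width in 32-bit lanes; must be even here */

/* Hypothetical masked form of testStride (stride 2), emulated with scalars. */
void testStrideMasked(int a[], int b[], int N) {
  /* Each vector iteration starts at an even index, so lane l always has the
     parity of l. The mask {1,0,1,0,...} is therefore loop-invariant and is
     computed once, outside the main loop. */
  int mask[VF];
  for (int l = 0; l < VF; l++)
    mask[l] = (l % 2) == 0;

  int i = 0;
  for (; i + VF <= N; i += VF) {
    /* This lane loop stands in for one masked load of b[i..i+VF-1] followed
       by one masked store to a[i..i+VF-1]; inactive lanes are not touched. */
    for (int l = 0; l < VF; l++)
      if (mask[l])
        a[i + l] = b[i + l];
  }

  /* Scalar remainder: i is a multiple of VF, hence even, so the original
     stride-2 loop can simply continue from here. */
  for (; i < N; i += 2)
    a[i] = b[i];
}
```

Whether this form is actually profitable without AVX-512 is still a question
for the cost model. The sketch only shows that the mask does not need to be
recomputed per iteration.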
