[llvm-bugs] [Bug 47929] New: Missed vectorization for loop in which array elements with different offset are read after write

Wed Oct 21 01:13:04 PDT 2020

https://bugs.llvm.org/show_bug.cgi?id=47929

            Bug ID: 47929
           Summary: Missed vectorization for loop in which array elements
                    with different offset are read after write
           Product: clang
           Version: 11.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: LLVM Codegen
          Assignee: unassignedclangbugs at nondot.org
          Reporter: hujiangping at cn.fujitsu.com
                CC: llvm-bugs at lists.llvm.org, neeilans at live.com,
                    richard-llvm at metafoo.co.uk

For the following code, main.c cannot be vectorized, while main5.c can,
although the result is a little complicated.  At the C source level there is
no meaningful difference between them (only the order of the two statements in
the loop body differs), so why can't main.c be vectorized?

```main.c
#define LEN 100

float a[LEN], b[LEN], c[LEN], d[LEN];

int foo(void)
{
  int ntimes = LEN;

  for (int nl = 0; nl < ntimes; nl++) {
          for (int i = 0; i < LEN-1; i++) {
                  a[i] *= c[i];
                  b[i] += a[i + 1] * d[i];
          }
  }
}
```

```main5.c
#define LEN 100

float a[LEN], b[LEN], c[LEN], d[LEN];

int foo(void)
{
  int ntimes = LEN;

  for (int nl = 0; nl < ntimes; nl++) {
          for (int i = 0; i < LEN-1; i++) {
                  b[i] += a[i + 1] * d[i];
                  a[i] *= c[i];
          }
  }
}
```

```shell
# /home/build_llvm/LLVM1100rc1/llvm/build/bin/clang -Ofast -march=armv8.2-a -Rpass-analysis=loop-vectorize -S -c ../main.c
../main.c:16:1: warning: non-void function does not return a value [-Wreturn-type]
}
^
../main.c:11:24: remark: loop not vectorized: value that could not be identified as reduction is used outside the loop [-Rpass-analysis=loop-vectorize]
                  a[i] *= c[i];
                       ^
../main.c:10:11: remark: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop [-Rpass-analysis=loop-vectorize]
          for (int i = 0; i < LEN-1; i++) {
          ^
1 warning generated.

# /home/build_llvm/LLVM1100rc1/llvm/build/bin/clang -Ofast -march=armv8.2-a -Rpass-analysis=loop-vectorize -S -c ../main5.c
../main5.c:16:1: warning: non-void function does not return a value [-Wreturn-type]
}
^
../main5.c:10:11: remark: the cost-model indicates that interleaving is not beneficial [-Rpass-analysis=loop-vectorize]
          for (int i = 0; i < LEN-1; i++) {
          ^
1 warning generated.
```

Because the load of a[i+1] does not depend on the store a[i] *= c[i] in the
same iteration, I think it could be loaded into a separate vector register at
the beginning of the loop body.  Then main.c could be vectorized too, and
main5.c could be vectorized more efficiently.  Why can't we do that?
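
To make the suggestion concrete, here is a minimal sketch that emulates a
vector loop by hand.  It is not what LLVM does internally; the names
scalar_main, chunked, init_arrays and matches_scalar, the VF of 4, and the
input values are all assumptions made for illustration.  Plain arrays stand in
for vector registers.  When the "vector load" of a[i+1..i+VF] is placed before
the "vector store" to a[i..i+VF-1], the result matches the scalar loop from
main.c; when the store comes first, as in a naive widening of main.c, the load
sees elements that were already updated and the result differs.

```c
#include <string.h>

#define LEN 100
#define VF 4  /* hypothetical vector width, chosen only for illustration */

/* Scalar reference: the inner loop of main.c, statement order unchanged. */
static void scalar_main(float *a, float *b, const float *c, const float *d)
{
    for (int i = 0; i < LEN - 1; i++) {
        a[i] *= c[i];
        b[i] += a[i + 1] * d[i];
    }
}

/* Emulates a VF-wide vector loop; next[] stands in for a vector register.
 * load_first != 0: a[i+1..i+VF] is read before a[i..i+VF-1] is written.
 * load_first == 0: the write happens first (naive widening of main.c). */
static void chunked(float *a, float *b, const float *c, const float *d,
                    int load_first)
{
    int i = 0;
    for (; i + VF <= LEN - 1; i += VF) {
        float next[VF] = {0};
        if (load_first)
            for (int k = 0; k < VF; k++) next[k] = a[i + 1 + k];
        for (int k = 0; k < VF; k++) a[i + k] *= c[i + k];
        if (!load_first)
            for (int k = 0; k < VF; k++) next[k] = a[i + 1 + k];
        for (int k = 0; k < VF; k++) b[i + k] += next[k] * d[i + k];
    }
    for (; i < LEN - 1; i++) {  /* scalar remainder */
        a[i] *= c[i];
        b[i] += a[i + 1] * d[i];
    }
}

/* Small integer-valued inputs so every float operation is exact. */
static void init_arrays(float *a, float *b, float *c, float *d)
{
    for (int i = 0; i < LEN; i++) {
        a[i] = (float)(i % 7 + 1);
        b[i] = (float)i;
        c[i] = 2.0f;
        d[i] = 3.0f;
    }
}

/* Returns 1 if the chunked loop leaves a[] and b[] identical to the
 * scalar reference, 0 otherwise. */
int matches_scalar(int load_first)
{
    float a0[LEN], b0[LEN], a1[LEN], b1[LEN], c[LEN], d[LEN];
    init_arrays(a0, b0, c, d);
    init_arrays(a1, b1, c, d);
    scalar_main(a0, b0, c, d);
    chunked(a1, b1, c, d, load_first);
    return memcmp(a0, a1, sizeof a0) == 0 && memcmp(b0, b1, sizeof b0) == 0;
}
```

Calling matches_scalar(1) exercises the load-first order and matches_scalar(0)
the store-first order.  The difference between the two shows why the
vectorizer cannot simply widen main.c statement by statement, and why hoisting
the load (which is what main5.c does at the source level) makes the widening
safe.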
