<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Missed vectorization for loop in which array elements with different offset are read after write"
href="https://bugs.llvm.org/show_bug.cgi?id=47929">47929</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Missed vectorization for loop in which array elements with different offset are read after write
</td>
</tr>
<tr>
<th>Product</th>
<td>clang
</td>
</tr>
<tr>
<th>Version</th>
<td>11.0
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>LLVM Codegen
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedclangbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>hujiangping@cn.fujitsu.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org, neeilans@live.com, richard-llvm@metafoo.co.uk
</td>
</tr></table>
<p>
<div>
<pre>In the following code, the loop in main.c is not vectorized, while the loop
in main5.c is, although the result is a little complicated. At the C source
level the two differ only in the order of the two statements in the inner loop,
which is not a meaningful difference, so why can't main.c be vectorized?
```main.c
#define LEN 100
float a[LEN], b[LEN], c[LEN], d[LEN];
int foo(void)
{
    int ntimes = LEN;
    for (int nl = 0; nl < ntimes; nl++) {
        for (int i = 0; i < LEN-1; i++) {
            a[i] *= c[i];
            b[i] += a[i + 1] * d[i];
        }
    }
}
```
```main5.c
#define LEN 100
float a[LEN], b[LEN], c[LEN], d[LEN];
int foo(void)
{
    int ntimes = LEN;
    for (int nl = 0; nl < ntimes; nl++) {
        for (int i = 0; i < LEN-1; i++) {
            b[i] += a[i + 1] * d[i];
            a[i] *= c[i];
        }
    }
}
```
```shell
# /home/build_llvm/LLVM1100rc1/llvm/build/bin/clang -Ofast -march=armv8.2-a -Rpass-analysis=loop-vectorize -S -c ../main.c
../main.c:16:1: warning: non-void function does not return a value [-Wreturn-type]
}
^
../main.c:11:24: remark: loop not vectorized: value that could not be identified as reduction is used outside the loop [-Rpass-analysis=loop-vectorize]
a[i] *= c[i];
^
../main.c:10:11: remark: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop [-Rpass-analysis=loop-vectorize]
for (int i = 0; i < LEN-1; i++) {
^
1 warning generated.
# /home/build_llvm/LLVM1100rc1/llvm/build/bin/clang -Ofast -march=armv8.2-a -Rpass-analysis=loop-vectorize -S -c ../main5.c
../main5.c:16:1: warning: non-void function does not return a value [-Wreturn-type]
}
^
../main5.c:10:11: remark: the cost-model indicates that interleaving is not beneficial [-Rpass-analysis=loop-vectorize]
for (int i = 0; i < LEN-1; i++) {
^
1 warning generated.
```
Because a[i+1] does not depend on a[i] *= c[i], I think it could be loaded into
a separate vector register at the beginning. Then main.c would be vectorized
too, and main5.c would be vectorized more efficiently. Why can't we do that?</pre>
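<pre>The idea above can be sketched in plain C. This is only an illustration under
stated assumptions, not what LLVM generates or how the vectorizer works
internally. It runs a single pass of the inner loop, and the names N, BLOCK,
init, store_first, load_first, blocked_load_first and orders_agree are
hypothetical. The inputs are small integers so that every float operation is
exact and the three variants can be compared with ==.
```c
/* Sketch only: one pass of the inner loop from the report, in three forms.
   All names here are hypothetical; none come from the report or from LLVM. */
#define N 100
#define BLOCK 4

static float a0[N], b0[N], a1[N], b1[N], a2[N], b2[N], c[N], d[N];

/* Fill the inputs with small integers so every float operation is exact. */
static void init(float *a, float *b)
{
    for (int i = 0; i < N; i++) {
        a[i] = (float)(i + 1);
        b[i] = 1.0f;
        c[i] = 2.0f;
        d[i] = 3.0f;
    }
}

/* Statement order of main.c: store a[i], then load a[i + 1]. */
static void store_first(float *a, float *b)
{
    for (int i = 0; i < N - 1; i++) {
        a[i] *= c[i];
        b[i] += a[i + 1] * d[i];
    }
}

/* Statement order of main5.c: load a[i + 1], then store a[i]. */
static void load_first(float *a, float *b)
{
    for (int i = 0; i < N - 1; i++) {
        b[i] += a[i + 1] * d[i];
        a[i] *= c[i];
    }
}

/* Hand-blocked form of the suggestion: read a[i + 1 .. i + BLOCK] into a
   temporary before any element of a[i .. i + BLOCK - 1] is overwritten. */
static void blocked_load_first(float *a, float *b)
{
    int i = 0;
    for (; i + BLOCK <= N - 1; i += BLOCK) {
        float next[BLOCK];
        for (int j = 0; j < BLOCK; j++)
            next[j] = a[i + j + 1];
        for (int j = 0; j < BLOCK; j++)
            a[i + j] *= c[i + j];
        for (int j = 0; j < BLOCK; j++)
            b[i + j] += next[j] * d[i + j];
    }
    /* Scalar remainder for the iterations that do not fill a block. */
    for (; i < N - 1; i++) {
        float next = a[i + 1];
        a[i] *= c[i];
        b[i] += next * d[i];
    }
}

/* Returns 1 if all three variants produce identical a and b, else 0. */
int orders_agree(void)
{
    init(a0, b0);
    init(a1, b1);
    init(a2, b2);
    store_first(a0, b0);
    load_first(a1, b1);
    blocked_load_first(a2, b2);
    for (int i = 0; i < N; i++) {
        if (a0[i] != a1[i] || b0[i] != b1[i])
            return 0;
        if (a0[i] != a2[i] || b0[i] != b2[i])
            return 0;
    }
    return 1;
}
```
With these inputs orders_agree() returns 1. In the blocked form the early load
matters: storing a[i .. i + BLOCK - 1] first and loading a[i + 1 .. i + BLOCK]
afterwards would read elements that were already multiplied by c.</pre>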
</div>
</p>
</body>
</html>