<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Loop-idiom recognition for memset in the inner-loop of a nested-loop interferes with vectorization"
href="https://bugs.llvm.org/show_bug.cgi?id=32854">32854</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Loop-idiom recognition for memset in the inner-loop of a nested-loop interferes with vectorization
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Loop Optimizer
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>brycelelbach@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=18382" name="attach_18382" title="Reduced Test Case">attachment 18382</a> <a href="attachment.cgi?id=18382&action=edit" title="Reduced Test Case">[details]</a></span>
Reduced Test Case
Compilation options, build environment, etc. are documented in the attached file
and here:
<a href="https://wandbox.org/permlink/o06VeIxCKC1qIhUh">https://wandbox.org/permlink/o06VeIxCKC1qIhUh</a>
Summary: We have a nested loop like this (where A is a double* __restrict__):
  for (ptrdiff_t j = 0; j != N; ++j)
    for (ptrdiff_t i = 0; i != N; ++i)
      A[i + j * N] = 0.0F;
Loop-idiom recognition determines that it can replace the inner loop with
memset, turning the code into:
  for (ptrdiff_t j = 0; j != N; ++j)
    std::memset(A + j * N, 0, sizeof(double) * N); // e.g. @llvm.memset
Later, the vectorizer sees this code and decides to bail out because it cannot
vectorize the inserted call to @llvm.memset.
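For reference, a minimal sketch of the two forms as standalone functions,
assuming a row-major N x N array. The names zero_nested and zero_rowwise_memset
are hypothetical and are not taken from the attached test case. The memset form
is a hand-written approximation of the transformed code, and it relies on
all-zero bytes representing 0.0, which holds for IEEE 754 doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Original form: the inner loop stores 0.0 to each element of row j.
void zero_nested(double* __restrict__ A, std::ptrdiff_t N)
{
    for (std::ptrdiff_t j = 0; j != N; ++j)
        for (std::ptrdiff_t i = 0; i != N; ++i)
            A[i + j * N] = 0.0;
}

// Form after loop-idiom recognition: one memset call per row.
// Hand-written approximation of the transformed code, not compiler output.
void zero_rowwise_memset(double* __restrict__ A, std::ptrdiff_t N)
{
    for (std::ptrdiff_t j = 0; j != N; ++j)
        std::memset(A + j * N, 0, sizeof(double) * N);
}
```

Clang's -Rpass-missed=loop-vectorize and -Rpass-analysis=loop-vectorize options
can be used to print the vectorizer remarks for code like this.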
I have so many questions here :)
0.) The diagnostic that the vectorizer pass remarks give is not very helpful:
'call instruction cannot be vectorized', BUT the source location it points to
isn't a call - it's the user's original code. Many users may not divine the fact
that loop-idiom replacement occurred and end up fruitlessly trying to figure out
why assignment to double (the source location pointed to) is a call that cannot
be vectorized. At the very least, the pass remark (emitted from here:
<a href="https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Vectorize/LoopVectorize.cpp#L5422">https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Vectorize/LoopVectorize.cpp#L5422</a>)
could give the name of the function in the function call that could not be
vectorized (which I assume would be something like "memset" or "@llvm.memset"
in this case).
1.) Why is there not a vector version of @llvm.memset in addition to the scalar
version? Is this a problem with the underlying C library on my target (x86
Linux)?
2.) Why does the vectorizer give up when it encounters a scalar function call?
If the function is noexcept, it should be able to take something like this:
  // Assume A is a cache-line-aligned double* __restrict__
  // and N is divisible by some nice number, say 32.
  for (ptrdiff_t i = 0; i != N; ++i)
  {
    double tmp = scalar_noexcept_f(i);
    A[i] += B[i] * tmp;
  }
And turn it into something like this:
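  // Assume A is a cache-line-aligned double* __restrict__
  // and N is divisible by some nice number, say 32.
  // B is assumed to be cache-line-aligned as well (aligned loads below).
  for (ptrdiff_t i = 0; i != N; i += 8)
  {
    // Vectorize "around" the scalar call. _mm512_setr_pd takes its
    // arguments lowest element first, so lane k holds f(i + k).
    __m512d tmp = _mm512_setr_pd(
        scalar_noexcept_f(i)
      , scalar_noexcept_f(i+1)
      , scalar_noexcept_f(i+2)
      , scalar_noexcept_f(i+3)
      , scalar_noexcept_f(i+4)
      , scalar_noexcept_f(i+5)
      , scalar_noexcept_f(i+6)
      , scalar_noexcept_f(i+7)
    );
    // _mm512_fmadd_pd(a, b, c) computes a * b + c, so this stores
    // B[i] * tmp + A[i], matching the scalar loop.
    _mm512_store_pd(
        A + i
      , _mm512_fmadd_pd(
            _mm512_load_pd(B + i)
          , tmp
          , _mm512_load_pd(A + i)
        )
    );
  }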
3.) Why isn't loop-idiom recognition "nested loop aware"? In this case, my
nested loops could be turned into a single memset.</pre>
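<pre>A sketch of what a nested-loop-aware rewrite could produce for question 3,
assuming the N rows are contiguous (row stride equal to N) so that the whole
N x N region is one block of memory. The name zero_single_memset is
hypothetical; this is an illustration, not what LLVM currently emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Both loops collapsed into one call. This is valid only because row j
// starts exactly where row j - 1 ends, so the N * N doubles are contiguous.
void zero_single_memset(double* __restrict__ A, std::ptrdiff_t N)
{
    std::memset(A, 0, sizeof(double) * N * N);
}
```

If the rows were padded (stride larger than N), the per-row form would still be
needed to avoid overwriting the padding.</pre>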
</div>
</p>
</body>
</html>