<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Incorrect vectorisation of loop in StringRef::find when building 2 stage SVE"
href="https://bugs.llvm.org/show_bug.cgi?id=51182">51182</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Incorrect vectorisation of loop in StringRef::find when building 2 stage SVE
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Loop Optimizer
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>david.spickett@linaro.org
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Since <a href="https://reviews.llvm.org/D105817">https://reviews.llvm.org/D105817</a> our SVE 2 stage bot
<a href="https://lab.llvm.org/buildbot/#/builders/176/builds/88">https://lab.llvm.org/buildbot/#/builders/176/builds/88</a> has been failing to
build stage 2 with this error:
```
FAILED: tools/clang/include/clang/AST/AbstractBasicReader.inc
cd /home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/stage2 &&
/home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/stage2/bin/clang-tblgen
-gen-clang-basic-reader -I
/home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/llvm/clang/include/clang/AST
-I/home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/llvm/clang/include
-I/home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/stage2/tools/clang/include
-I/home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/stage2/include
-I/home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/llvm/llvm/include
/home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/llvm/clang/include/clang/AST/PropertiesBase.td
--write-if-changed -o tools/clang/include/clang/AST/AbstractBasicReader.inc -d
tools/clang/include/clang/AST/AbstractBasicReader.inc.d
/home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/llvm/clang/include/clang/AST/PropertiesBase.td:362:3:
error: creation code for Array doesn't refer to property "totalLength"
def : Creator<[{
^
```
This bot builds stage 1 with -mcpu=a64fx, then stage 2 with that plus -mllvm
-aarch64-sve-vector-bits-min=512, which enables vectorisation. Removing that
flag fixes the build.
I narrowed this down by enabling -Rpass=loop-vectorize and comparing the loops
reported before and after the patch. The only additional loop was:
```
/home/david.spickett/llvm-project/llvm/lib/Support/StringRef.cpp:157:3: remark:
vectorized loop (vectorization width: 8, interleaved count: 1)
[-Rpass=loop-vectorize]
```
To confirm, I disabled vectorisation of that loop:
```
diff --git a/llvm/lib/Support/StringRef.cpp b/llvm/lib/Support/StringRef.cpp
index c532a1abe906..84210a91088f 100644
--- a/llvm/lib/Support/StringRef.cpp
+++ b/llvm/lib/Support/StringRef.cpp
@@ -154,6 +154,7 @@ size_t StringRef::find(StringRef Str, size_t From) const {
// Build the bad char heuristic table, with uint8_t to reduce cache thrashing.
uint8_t BadCharSkip[256];
std::memset(BadCharSkip, N, 256);
+ #pragma clang loop vectorize(disable) interleave(disable)
for (unsigned i = 0; i != N-1; ++i)
BadCharSkip[(uint8_t)Str[i]] = N-1-i;
```
This makes some sense if you look at the "creation code" handler:
```
llvm::StringRef getCreationCode() const {
return get()->getValueAsString(CreateFieldName);
}
// Verify that the creation code refers to this property.
if (info.IsReader && creationCode.find(prop.getName()) == StringRef::npos)
PrintFatalError(nodeInfo.Creator.getLoc(),
"creation code for " + node.getName()
+ " doesn't refer to property \""
+ prop.getName() + "\"");
```
We're doing a find in a StringRef. I was surprised, though, because I expected
a more "complicated" loop to be the issue, but then again I know almost
nothing about the vectoriser.
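A note on why a loop this small can still be miscompiled: its stores are order
dependent. When a byte occurs more than once in the needle, the scalar loop
lets the last occurrence win, and a vectorised version has to preserve that.
The sketch below is a hypothetical stand-alone model, not the LLVM source and
not a diagnosis of this bug. All function names are made up for illustration,
and it assumes a needle length between 1 and 255. It shows that a table built
with the stores in the wrong order can make a search miss a match, which would
be consistent with the "doesn't refer to property" error above.

```cpp
// Hypothetical stand-alone model for illustration; not the LLVM source.
// Assumes the needle length N is between 1 and 255 so a skip fits in a byte.
const unsigned NotFound = ~0u;

// Scalar semantics of the loop in question: stores happen in index order, so
// for a byte that occurs more than once the LAST occurrence sets the skip.
void buildSkipTable(const char *Needle, unsigned N, unsigned char *Table) {
  for (unsigned C = 0; C != 256; ++C)
    Table[C] = (unsigned char)N;
  for (unsigned I = 0; I != N - 1; ++I)
    Table[(unsigned char)Needle[I]] = (unsigned char)(N - 1 - I);
}

// Deliberately wrong variant: the same stores in reverse order, so the FIRST
// occurrence wins and repeated bytes end up with a skip that is too large.
void buildSkipTableReversed(const char *Needle, unsigned N,
                            unsigned char *Table) {
  for (unsigned C = 0; C != 256; ++C)
    Table[C] = (unsigned char)N;
  for (unsigned I = N - 1; I != 0; --I)
    Table[(unsigned char)Needle[I - 1]] = (unsigned char)(N - I);
}

// Horspool-style search driven by a prebuilt skip table. Returns the offset
// of the first match found, or NotFound.
unsigned findWithTable(const char *Hay, unsigned Len, const char *Needle,
                       unsigned N, const unsigned char *Table) {
  unsigned Pos = 0;
  while (Len >= Pos + N) {
    unsigned I = 0;
    for (; I != N; ++I)
      if (Hay[Pos + I] != Needle[I])
        break;
    if (I == N)
      return Pos;
    // Skip according to the last byte of the current window.
    Pos += Table[(unsigned char)Hay[Pos + N - 1]];
  }
  return NotFound;
}
```

With the needle "totalLength" the in-order table gives 't' a skip of 1 and the
reversed table gives it a skip of 10. Searching "xtotalLength" then finds the
match at offset 1 with the former and reports NotFound with the latter.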
At this point I'm not sure whether the problem is in the vectorisation itself
or in how the SVE code is emitted.
I will attach disassembly of the function with and without vectorisation (a
sequence of 5 nops marks the points before and after the loop). Next I'll
check the IR to see whether the problem (whatever it is) is already present
before codegen happens.</pre>
</div>
</p>
</body>
</html>