[llvm-bugs] [Bug 51182] New: Incorrect vectorisation of loop in StringRef::find when building 2 stage SVE

via llvm-bugs llvm-bugs at lists.llvm.org
Fri Jul 23 08:15:20 PDT 2021


            Bug ID: 51182
           Summary: Incorrect vectorisation of loop in StringRef::find
                    when building 2 stage SVE
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Loop Optimizer
          Assignee: unassignedbugs at nondot.org
          Reporter: david.spickett at linaro.org
                CC: llvm-bugs at lists.llvm.org

Since https://reviews.llvm.org/D105817 landed, our SVE 2 stage bot
(https://lab.llvm.org/buildbot/#/builders/176/builds/88) has been failing to
build stage 2 with this error:
FAILED: tools/clang/include/clang/AST/AbstractBasicReader.inc 
cd /home/tcwg-buildslave/worker/clang-aarch64-sve-vls-2stage/stage2 &&
-gen-clang-basic-reader -I
--write-if-changed -o tools/clang/include/clang/AST/AbstractBasicReader.inc -d
error: creation code for Array doesn't refer to property "totalLength"
  def : Creator<[{

This bot builds stage 1 with -mcpu=a64fx, then stage 2 with that plus -mllvm
-aarch64-sve-vector-bits-min=512, which enables vectorisation. Removing that
flag fixes the build.

I narrowed this down by enabling -Rpass=loop-vectorize and looking at the loops
mentioned before and after the patch. The only additional loop was:
/home/david.spickett/llvm-project/llvm/lib/Support/StringRef.cpp:157:3: remark:
vectorized loop (vectorization width: 8, interleaved count: 1)

To confirm, I disabled vectorisation for that loop:
diff --git a/llvm/lib/Support/StringRef.cpp b/llvm/lib/Support/StringRef.cpp
index c532a1abe906..84210a91088f 100644
--- a/llvm/lib/Support/StringRef.cpp
+++ b/llvm/lib/Support/StringRef.cpp
@@ -154,6 +154,7 @@ size_t StringRef::find(StringRef Str, size_t From) const {
   // Build the bad char heuristic table, with uint8_t to reduce cache
   uint8_t BadCharSkip[256];
   std::memset(BadCharSkip, N, 256);
+  #pragma clang loop vectorize(disable) interleave(disable)
   for (unsigned i = 0; i != N-1; ++i)
     BadCharSkip[(uint8_t)Str[i]] = N-1-i;

This makes some sense if you look at this "creation code" handler:
llvm::StringRef getCreationCode() const {
    return get()->getValueAsString(CreateFieldName);

   // Verify that the creation code refers to this property.
    if (info.IsReader && creationCode.find(prop.getName()) == StringRef::npos)
                      "creation code for " + node.getName()
                        + " doesn't refer to property \""
                        + prop.getName() + "\"");

We're doing a find in a StringRef. I was surprised, though: I expected a more
"complicated" loop to be the culprit, but then again I know almost nothing
about the vectoriser.

At this point I'm not sure whether the problem is in the vectorisation itself
or in how the SVE code is emitted.

I will attach disassembly of the function with and without vectorisation (see
the sequence of 5 nops which marks the before/after of the loop). Next I'll
check what IR we get to see if the problem (whatever it is) is present before
codegen happens.
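
To check the loop in isolation, something like the following standalone
reproducer might help. It is only a sketch: the names buildSkipTable,
buildSkipTableReference and tablesMatch are made up for illustration, and
std::string stands in for StringRef. The first function copies the loop from
StringRef::find; the second computes the expected table a different way (for
each byte value, the last matching position in the needle wins), so it does
not rely on the same scatter-style store. Note that when the needle contains
repeated characters, several iterations store to the same table entry and the
last store must win. I am not claiming that is what goes wrong here, only
that it is a property the vectorised loop has to preserve.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Same loop shape as the bad char table setup in StringRef::find.
// Assumes 1 <= N <= 255 so that N fits in a uint8_t and N-1 does not wrap.
void buildSkipTable(const std::string &Str, uint8_t (&BadCharSkip)[256]) {
  const size_t N = Str.size();
  assert(N >= 1 && N <= 255);
  std::memset(BadCharSkip, static_cast<int>(N), 256);
  for (unsigned i = 0; i != N - 1; ++i)
    BadCharSkip[(uint8_t)Str[i]] = static_cast<uint8_t>(N - 1 - i);
}

// Independent reference: for every byte value, scan the needle and keep the
// skip for the last position (excluding the final character) that matches.
void buildSkipTableReference(const std::string &Str,
                             uint8_t (&Expected)[256]) {
  const size_t N = Str.size();
  assert(N >= 1 && N <= 255);
  for (unsigned C = 0; C != 256; ++C) {
    uint8_t Skip = static_cast<uint8_t>(N);
    for (size_t i = 0; i + 1 < N; ++i)
      if ((uint8_t)Str[i] == C)
        Skip = static_cast<uint8_t>(N - 1 - i); // later matches overwrite
    Expected[C] = Skip;
  }
}

// Returns true if the loop under test agrees with the reference.
bool tablesMatch(const std::string &Str) {
  uint8_t Actual[256];
  uint8_t Expected[256];
  buildSkipTable(Str, Actual);
  buildSkipTableReference(Str, Expected);
  return std::memcmp(Actual, Expected, 256) == 0;
}
```

Calling tablesMatch("totalLength") from a small main and building it with the
same flags as stage 2 (-mcpu=a64fx -mllvm -aarch64-sve-vector-bits-min=512,
plus -Rpass=loop-vectorize to confirm the loop is vectorised) should show
whether the loop misbehaves outside of the full build. Whether such a small
test case is vectorised the same way as in StringRef.cpp is an assumption
that would need checking.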
