[PATCH] D133660: [Support] Add fast path for StringRef::find with needle of length 2.

Tatsuyuki Ishi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Sep 11 00:59:27 PDT 2022


ishitatsuyuki created this revision.
Herald added a subscriber: hiraditya.
Herald added a project: All.
ishitatsuyuki requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

InclusionRewriter on Windows (CRLF line endings) will exercise
StringRef::find with a two-byte needle in a hot path. Calling memcmp at
every haystack position would be highly suboptimal for that use case, so
give needles of length 2 a specialized path.
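
For context, a search that calls memcmp repeatedly, as mentioned above,
looks roughly like the sketch below. It is an illustration written against
std::string_view, not the code in StringRef.cpp, and the name findNaive is
hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper, not part of LLVM: naive substring search that calls
// memcmp once per candidate position in the haystack.
std::size_t findNaive(std::string_view Haystack, std::string_view Needle) {
  const std::size_t N = Needle.size();
  if (N > Haystack.size())
    return std::string_view::npos;
  if (N == 0)
    return 0;
  const char *Start = Haystack.data();
  // One past the last position where the needle can still fit.
  const char *Stop = Start + (Haystack.size() - N + 1);
  for (; Start < Stop; ++Start)
    if (std::memcmp(Start, Needle.data(), N) == 0)
      return static_cast<std::size_t>(Start - Haystack.data());
  return std::string_view::npos;
}
```

For a two-byte needle, each loop iteration sets up a call (or an inlined
compare) that looks at only two bytes, which is the overhead the
specialized path aims to avoid.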


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D133660

Files:
  llvm/lib/Support/StringRef.cpp


Index: llvm/lib/Support/StringRef.cpp
===================================================================
--- llvm/lib/Support/StringRef.cpp
+++ llvm/lib/Support/StringRef.cpp
@@ -148,6 +148,23 @@
 
   const char *Stop = Start + (Size - N + 1);
 
+  if (N == 2) {
+    // Provide a fast path for newline finding (CRLF case) in InclusionRewriter.
+    // This is basically a naive search with a little bit-twiddling.
+
+    // In theory we could use uint16_t, but some architectures don't support
+    // 16-bit arithmetic and would need a mask (& 65535) after each shift.
+    // Using 32-bit arithmetic with a shift of 16 avoids the mask: the oldest
+    // byte simply falls off the top of the word.
+    uint32_t NeedleWord =
+        (uint32_t)(uint8_t)Needle[0] << 16 | (uint8_t)Needle[1];
+    uint32_t HaystackWord =
+        (uint32_t)(uint8_t)Start[0] << 16 | (uint8_t)Start[1];
+    for (Start++; Start < Stop && NeedleWord != HaystackWord;)
+      HaystackWord = HaystackWord << 16 | (uint8_t) * ++Start;
+    return NeedleWord == HaystackWord ? Start - Data - 1 : npos;
+  }
+
   // For short haystacks or unsupported needles fall back to the naive algorithm
   if (Size < 16 || N > 255) {
     do {
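
The fast path in the hunk above can be tried outside LLVM with a standalone
sketch like the following. It assumes std::string_view in place of
StringRef, and the name findPair is hypothetical; the loop mirrors the
patch but is not the patch itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper, not part of LLVM: find a two-byte needle by sliding a
// two-byte window through the haystack, one byte per iteration.
std::size_t findPair(std::string_view Haystack, char First, char Second) {
  if (Haystack.size() < 2)
    return std::string_view::npos;
  const char *Data = Haystack.data();
  const char *Last = Data + Haystack.size() - 1; // last valid byte
  // Each byte sits in its own 16-bit half of the 32-bit word.
  const std::uint32_t NeedleWord =
      static_cast<std::uint32_t>(static_cast<std::uint8_t>(First)) << 16 |
      static_cast<std::uint8_t>(Second);
  std::uint32_t Window =
      static_cast<std::uint32_t>(static_cast<std::uint8_t>(Data[0])) << 16 |
      static_cast<std::uint8_t>(Data[1]);
  const char *Cur = Data + 1; // points at the second byte of the window
  while (Cur < Last && Window != NeedleWord) {
    // Shifting by 16 pushes the oldest byte out of the word, so no mask is
    // needed before the comparison.
    Window = Window << 16 | static_cast<std::uint8_t>(*++Cur);
  }
  return Window == NeedleWord ? static_cast<std::size_t>(Cur - Data - 1)
                              : std::string_view::npos;
}
```

Because Cur always points at the second byte of the current window, the
match position is Cur - Data - 1, which corresponds to the
Start - Data - 1 in the patch.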



