[llvm-branch-commits] [llvm] [llvm][mustache] Optimize accessor splitting with a single pass (PR #159198)

Paul Kirth via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Tue Sep 30 17:04:57 PDT 2025


https://github.com/ilovepi updated https://github.com/llvm/llvm-project/pull/159198

From 66ea253c076cffb6aaf6569806c7b32c069e754d Mon Sep 17 00:00:00 2001
From: Paul Kirth <paulkirth at google.com>
Date: Tue, 16 Sep 2025 00:24:43 -0700
Subject: [PATCH] [llvm][mustache] Optimize accessor splitting with a single
 pass

The splitMustacheString function previously split the accessor
with a loop of StringRef::split and StringRef::trim calls. This
was inefficient because it scanned each segment of the accessor
string multiple times.

This change introduces a custom splitAndTrim function that finds
each '.' delimiter and trims the segment in the same loop
iteration, reducing redundant work. The measured improvement is
small and shows up most clearly in CPU cycles (-1.86%). Unlike
the old loop, splitAndTrim skips segments that are empty or
contain only whitespace.

  Metric         | Baseline | Optimized | Change
  -------------- | -------- | --------- | -------
  Time (ms)      | 35.57    | 35.36     | -0.59%
  Cycles         | 34.91M   | 34.26M    | -1.86%
  Instructions   | 85.54M   | 85.24M    | -0.35%
  Branch Misses  | 111.9K   | 112.2K    | +0.27%
  Cache Misses   | 242.1K   | 239.9K    | -0.91%
---
 llvm/lib/Support/Mustache.cpp | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/llvm/lib/Support/Mustache.cpp b/llvm/lib/Support/Mustache.cpp
index 4786242cdfba9..8eebeaec11925 100644
--- a/llvm/lib/Support/Mustache.cpp
+++ b/llvm/lib/Support/Mustache.cpp
@@ -34,6 +34,32 @@ static bool isContextFalsey(const json::Value *V) {
   return isFalsey(*V);
 }
 
+static void splitAndTrim(StringRef Str, SmallVectorImpl<StringRef> &Tokens) {
+  size_t CurrentPos = 0;
+  while (CurrentPos < Str.size()) {
+    // Find the next delimiter.
+    size_t DelimiterPos = Str.find('.', CurrentPos);
+
+    // If no delimiter is found, process the rest of the string.
+    if (DelimiterPos == StringRef::npos) {
+      DelimiterPos = Str.size();
+    }
+
+    // Get the current part, which may have whitespace.
+    StringRef Part = Str.slice(CurrentPos, DelimiterPos);
+
+    // Trim whitespace by slicing the StringRef; no characters are copied.
+    size_t Start = Part.find_first_not_of(" \t\r\n");
+    if (Start != StringRef::npos) {
+      size_t End = Part.find_last_not_of(" \t\r\n");
+      Tokens.push_back(Part.slice(Start, End + 1));
+    }
+
+    // Move past the delimiter for the next iteration.
+    CurrentPos = DelimiterPos + 1;
+  }
+}
+
 static Accessor splitMustacheString(StringRef Str, MustacheContext &Ctx) {
   // We split the mustache string into an accessor.
   // For example:
@@ -46,13 +72,7 @@ static Accessor splitMustacheString(StringRef Str, MustacheContext &Ctx) {
     // It's a literal, so it doesn't need to be saved.
     Tokens.push_back(".");
   } else {
-    while (!Str.empty()) {
-      StringRef Part;
-      std::tie(Part, Str) = Str.split('.');
-      // Each part of the accessor needs to be saved to the arena
-      // to ensure it has a stable address.
-      Tokens.push_back(Part.trim());
-    }
+    splitAndTrim(Str, Tokens);
   }
   // Now, allocate memory for the array of StringRefs in the arena.
   StringRef *ArenaTokens = Ctx.Allocator.Allocate<StringRef>(Tokens.size());

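For readers without an LLVM checkout, the split-and-trim loop in the
patch can be approximated with the standard library. This is only a
sketch under stated assumptions: std::string_view stands in for
llvm::StringRef, std::vector for SmallVectorImpl, and the name
splitAndTrimSketch is hypothetical and not part of the patch.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical standalone analogue of splitAndTrim from the patch.
// The returned views point into Str, so Str must outlive them.
std::vector<std::string_view> splitAndTrimSketch(std::string_view Str) {
  constexpr std::string_view Whitespace = " \t\r\n";
  std::vector<std::string_view> Tokens;
  std::size_t CurrentPos = 0;
  while (CurrentPos < Str.size()) {
    // Find the next '.'; treat end-of-string as the final delimiter.
    std::size_t DelimiterPos = Str.find('.', CurrentPos);
    if (DelimiterPos == std::string_view::npos)
      DelimiterPos = Str.size();

    // The raw segment between delimiters, possibly padded with whitespace.
    std::string_view Part = Str.substr(CurrentPos, DelimiterPos - CurrentPos);

    // Trim by narrowing the view; empty or all-whitespace parts are skipped.
    std::size_t Start = Part.find_first_not_of(Whitespace);
    if (Start != std::string_view::npos) {
      std::size_t End = Part.find_last_not_of(Whitespace);
      Tokens.push_back(Part.substr(Start, End - Start + 1));
    }

    // Step past the delimiter for the next iteration.
    CurrentPos = DelimiterPos + 1;
  }
  return Tokens;
}
```

With this sketch, the input " a . b..c " yields the three tokens "a",
"b" and "c": the empty segment between the two dots is dropped rather
than stored as an empty token.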

