[PATCH] D90386: [ADT] Add methods to SmallString for efficient concatenation

Nathan James via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 29 04:48:58 PDT 2020


njames93 created this revision.
njames93 added a reviewer: dblaikie.
Herald added subscribers: llvm-commits, dexonsmith.
Herald added a project: LLVM.
njames93 requested review of this revision.

A common pattern when using SmallString is to repeatedly call append to build a larger string.
The issue here is the optimizer can't see through this and often has to check there is enough space in the storage for each string you try to append.
This results in lots of conditional branches and potentially multiple calls to grow needing to be emitted if the buffer wasn't large enough.
By taking an initializer_list of StringRefs, SmallString can preallocate the storage it needs for all of the StringRefs which only need to grow one time at most, then use a fast path of copying all the strings into its storage knowing there is guaranteed to be enough capacity.
By using StringRefs, this also means you can append different string like types in one go as they will all be implicitly converted to a StringRef.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D90386

Files:
  llvm/include/llvm/ADT/SmallString.h
  llvm/unittests/ADT/SmallStringTest.cpp


Index: llvm/unittests/ADT/SmallStringTest.cpp
===================================================================
--- llvm/unittests/ADT/SmallStringTest.cpp
+++ llvm/unittests/ADT/SmallStringTest.cpp
@@ -71,6 +71,12 @@
   EXPECT_STREQ("abc", theString.c_str());
 }
 
+TEST_F(SmallStringTest, AssignStringRefs) {
+  theString.assign({"abc", "def", "ghi"});
+  EXPECT_EQ(9u, theString.size());
+  EXPECT_STREQ("abcdefghi", theString.c_str());
+}
+
 TEST_F(SmallStringTest, AppendIterPair) {
   StringRef abc = "abc";
   theString.append(abc.begin(), abc.end());
@@ -96,6 +102,12 @@
   EXPECT_STREQ("abcabc", theString.c_str());
 }
 
+TEST_F(SmallStringTest, AppendStringRefs) {
+  theString.append({"abc", "def", "ghi"});
+  EXPECT_EQ(9u, theString.size());
+  EXPECT_STREQ("abcdefghi", theString.c_str());
+}
+
 TEST_F(SmallStringTest, StringRefConversion) {
   StringRef abc = "abc";
   theString.assign(abc.begin(), abc.end());
Index: llvm/include/llvm/ADT/SmallString.h
===================================================================
--- llvm/include/llvm/ADT/SmallString.h
+++ llvm/include/llvm/ADT/SmallString.h
@@ -30,6 +30,12 @@
   /// Initialize from a StringRef.
   SmallString(StringRef S) : SmallVector<char, InternalLen>(S.begin(), S.end()) {}
 
+  /// Initialize by concatenating a list of StringRefs.
+  SmallString(std::initializer_list<StringRef> Refs)
+      : SmallVector<char, InternalLen>() {
+    this->append(Refs);
+  }
+
   /// Initialize with a range.
   template<typename ItTy>
   SmallString(ItTy S, ItTy E) : SmallVector<char, InternalLen>(S, E) {}
@@ -65,6 +71,12 @@
     SmallVectorImpl<char>::append(RHS.begin(), RHS.end());
   }
 
+  /// Assign from a list of StringRefs.
+  void assign(std::initializer_list<StringRef> Refs) {
+    this->clear();
+    append(Refs);
+  }
+
   /// @}
   /// @name String Concatenation
   /// @{
@@ -89,6 +101,20 @@
     SmallVectorImpl<char>::append(RHS.begin(), RHS.end());
   }
 
+  /// Append from a list of StringRefs.
+  void append(std::initializer_list<StringRef> Refs) {
+    size_t SizeNeeded = this->size();
+    for (const StringRef &Ref : Refs)
+      SizeNeeded += Ref.size();
+    this->reserve(SizeNeeded);
+    auto CurEnd = this->end();
+    for (const StringRef &Ref : Refs) {
+      this->uninitialized_copy(Ref.begin(), Ref.end(), CurEnd);
+      CurEnd += Ref.size();
+    }
+    this->set_size(SizeNeeded);
+  }
+
   /// @}
   /// @name String Comparison
   /// @{


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D90386.301576.patch
Type: text/x-patch
Size: 2455 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20201029/b8f73980/attachment.bin>


More information about the llvm-commits mailing list