[LLVMdev] SmallString + raw_svector_ostream combination should be more efficient

Yaron Keren yaron.keren at gmail.com
Thu Apr 30 20:02:53 PDT 2015


Yes, we should do without virtual flush. The problem was raw_char_ostream
should not be flushed - it's always current - so I overloaded flush to a
no-op. A cleaner solution is attached, adding a DirectBuffer mode to the
raw_ostream.

Also added a simple performance comparison project between raw_char_ostream
and raw_svector_ostream. On Window 7 x64 machine, raw_char_ostream was
three times faster than raw_svector_ostream when the provided buffer size
is large enough and two times as fast with one dynamic allocation resize.


2015-04-30 22:48 GMT+03:00 Rafael EspĂ­ndola <rafael.espindola at gmail.com>:

> I don't think we should make flush virtual. Why do you need to do it?
> Can't you set up the base class to write to use the tail of the memory
> region as the buffer?
>
>
>
> On 24 April 2015 at 06:46, Yaron Keren <yaron.keren at gmail.com> wrote:
> > Hi,
> >
> > Is this what you're thinking about?
> > The code is not tested yet, I'd like to know if the overall direction and
> > design are correct.
> >
> > Yaron
> >
> >
> > 2015-04-21 0:10 GMT+03:00 Yaron Keren <yaron.keren at gmail.com>:
> >>
> >> Sean, thanks for reminding this, Alp did commit a class derived from
> >> raw_svector_ostream conatining an internal SmallString he called
> >> small_string_ostream. The commit was reverted after a day due to a
> >> disagreement about the commit approval and apparently abandoned.
> >>
> >>
> >>
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223393.html
> >>
> >>
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223557.html
> >>
> >>
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140630/223986.html
> >>
> >> The class Alp comitted did solve the possible mismatch between the
> >> SmallString and the stream by making the SmallString private to the
> class.
> >> This however did not treat the root problem, the duplication of the
> >> information about the buffer between SmallString and the stream.
> >>
> >> I can make performance measurements but even if the performance
> difference
> >> is neglible (and looking at all the code paths and conditionals of
> >> non-inlined functions at raw_ostream.cpp that's hard to believe), the
> >> current SmallString-raw_svector_ostream is simply wrong.
> >>
> >> Personally, after the previous "Alp" discussion I found and fixed
> several
> >> such issues in my out-of-tree code only to make new similar new error
> >> earlier this year, which I caught only months after, when Rafael
> committed
> >> the pwrite which reminded me the issue. Due ot the information
> duplication
> >> Rafael commit also had a bug and Mehdi reports similar experience. Back
> then
> >> Alp reported similar problems he found in LLVM code which were hopefully
> >> fixed otherwise.
> >>
> >> In addition to the information duplication, the design is quite
> confusing
> >> to use
> >>
> >> - Should one use the stream.str() function or the SmallString itself?
> >> - flush or str?
> >> - How do you properly clear the combination for reuse (that was my
> >> mistake, I forgot to resync after OS.clear())?
> >>
> >> It's safe to say this design pattern is very easy to get wrong one way
> or
> >> another, we got burned by it multiple times and it should be replaced.
> >>
> >> Alp suggested a derived class containing its own SmallString. That's a
> >> simple and effective approach to avoid the bugs mentioned but does not
> fix
> >> the memory or runtime issues. Small as they may be, why have them at a
> >> fundemental data structure?
> >>
> >> I was thinking about going further avoiding all duplication with a
> >> templated class derived with its own internal buffer storage.
> >>
> >> Rafael suggested a more modular approach, a derived adpater class
> adapter
> >> to a *simple* buffer (or nullptr for fully-dynamic operation) which also
> >> won't duplicate any information the buffer is dumb and has no state.
> >>
> >> That solution sounds simple, efficient and safe to use. The
> implementation
> >> would be actually simpler then raw_svector_ostream since all the
> >> coordination logic is not required anymore.
> >>
> >>
> >> 2015-04-20 22:17 GMT+03:00 Sean Silva <chisophugis at gmail.com>:
> >>>
> >>>
> >>>
> >>> On Sun, Apr 19, 2015 at 7:40 AM, Yaron Keren <yaron.keren at gmail.com>
> >>> wrote:
> >>>>
> >>>> A very common code pattern in LLVM is
> >>>>
> >>>>  SmallString<128> S;
> >>>>  raw_svector_ostream OS(S);
> >>>>  OS<< ...
> >>>>  Use OS.str()
> >>>>
> >>>> While raw_svector_ostream is smart to share the text buffer itself,
> it's
> >>>> inefficient keeping two sets of pointers to the same buffer:
> >>>>
> >>>>  In SmallString: void *BeginX, *EndX, *CapacityX
> >>>>  In raw_ostream: char *OutBufStart, *OutBufEnd, *OutBufCur
> >>>>
> >>>> Moreover, at runtime the two sets of pointers need to be coordinated
> >>>> between the SmallString and raw_svector_ostream using
> >>>> raw_svector_ostream::init, raw_svector_ostream::pwrite,
> >>>> raw_svector_ostream::resync and raw_svector_ostream::write_impl.
> >>>> All these functions have non-inlined implementations in
> raw_ostream.cpp.
> >>>>
> >>>> Finally, this may cause subtle bugs if S is modified without calling
> >>>> OS::resync(). This is too easy to do by mistake.
> >>>>
> >>>> In this frequent case usage the client does not really care about S
> >>>> being a SmallString with its many useful string helper function. It's
> just
> >>>> boilerplate code for raw_svector_ostream. But it does cost three extra
> >>>> pointers, some runtime performance and possible bugs.
> >>>
> >>>
> >>> I agree the bugs are real (Alp proposed something a while back
> regarding
> >>> this?), but you will need to provide measurements to justify the cost
> in
> >>> runtime performance. One technique I have used in the past to measure
> these
> >>> sorts of things I call "stuffing": take the operation that you want to
> >>> measure, then essentially change the logic so that you pay the cost 2
> times,
> >>> 3 times, etc. You can then look at the trend in performance as N
> varies and
> >>> extrapolate back to the case where N = 0 (i.e. you don't pay the cost).
> >>>
> >>> For example, in one situation where I used this method it was to
> measure
> >>> the cost of stat'ing files (sys::fs::status) across a holistic build,
> using
> >>> only "time" on the command line (it was on Windows and I didn't have
> any
> >>> tools like DTrace available that can directly measure this). In order
> to do
> >>> this, I changed sys::fs::status to call stat N times instead of 1, and
> >>> measured with N=1 N=2 N=3 etc. The result was that the difference
> between
> >>> the N and N+1 versions was about 1-2% across N=1..10 (or whatever I
> >>> measured). In order to negate caching and other confounding effects,
> it is
> >>> important to try different distributions of stats; e.g. the extra
> stats are
> >>> on the same file as the "real" stat vs. the extra stats are on
> nonexistent
> >>> files in the same directory as the "real" file vs. parent directories
> of the
> >>> "real" file; if these match up fairly well (they did), then you have
> some
> >>> indication that the "stuffing" is measuring what you want to measure.
> >>>
> >>> So e.g. if you think the cost of 3 extra pointers is significant, then
> >>> "stuff" the struct with 3, 6, 9, ... extra pointers and measure the
> >>> difference in performance (e.g. measure the time building a real
> project).
> >>>
> >>> -- Sean Silva
> >>>
> >>>>
> >>>>
> >>>> To solve all three issues, would it make sense to have
> >>>> raw_ostream-derived container with a its own SmallString like
> templated-size
> >>>> built-in buffer?
> >>>>
> >>>
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150501/823c6eb2/attachment.html>
-------------- next part --------------
Index: include/llvm/Support/raw_ostream.h
===================================================================
--- include/llvm/Support/raw_ostream.h	(revision 236300)
+++ include/llvm/Support/raw_ostream.h	(working copy)
@@ -52,17 +52,19 @@
   ///               OutBufEnd - OutBufStart >= 1).
   ///
   /// If buffered, then the raw_ostream owns the buffer if (BufferMode ==
-  /// InternalBuffer); otherwise the buffer has been set via SetBuffer and is
-  /// managed by the subclass.
+  /// InternalBuffer or BufferMode == DirectBuffer); otherwise the buffer has
+  /// been set via SetBuffer and is managed by the subclass.
   ///
   /// If a subclass installs an external buffer using SetBuffer then it can wait
   /// for a \see write_impl() call to handle the data which has been put into
   /// this buffer.
+protected:
   char *OutBufStart, *OutBufEnd, *OutBufCur;
 
   enum BufferKind {
     Unbuffered = 0,
     InternalBuffer,
+    DirectBuffer,
     ExternalBuffer
   } BufferMode;
 
@@ -132,7 +134,7 @@
   //===--------------------------------------------------------------------===//
 
   void flush() {
-    if (OutBufCur != OutBufStart)
+    if (BufferMode != DirectBuffer && OutBufCur != OutBufStart)
       flush_nonempty();
   }
 
@@ -208,8 +210,8 @@
   /// satisfy std::isprint into an escape sequence.
   raw_ostream &write_escaped(StringRef Str, bool UseHexEscapes = false);
 
-  raw_ostream &write(unsigned char C);
-  raw_ostream &write(const char *Ptr, size_t Size);
+  virtual raw_ostream &write(unsigned char C);
+  virtual raw_ostream &write(const char *Ptr, size_t Size);
 
   // Formatted output, see the format() function in Support/Format.h.
   raw_ostream &operator<<(const format_object_base &Fmt);
@@ -297,13 +299,13 @@
   /// unbuffered.
   const char *getBufferStart() const { return OutBufStart; }
 
+  /// Install the given buffer and mode.
+  void SetBufferAndMode(char *BufferStart, size_t Size, BufferKind Mode);
+
   //===--------------------------------------------------------------------===//
   // Private Interface
   //===--------------------------------------------------------------------===//
 private:
-  /// Install the given buffer and mode.
-  void SetBufferAndMode(char *BufferStart, size_t Size, BufferKind Mode);
-
   /// Flush the current buffer, which is known to be non-empty. This outputs the
   /// currently buffered data and resets the buffer to empty.
   void flush_nonempty();
@@ -481,6 +483,51 @@
   }
 };
 
+/// A raw_ostream that writes to a char array. This is a simple adaptor class.
+/// This class does not encounter output errors.
+class raw_char_ostream : public raw_pwrite_stream {
+  void write_impl(const char *Ptr, size_t Size) override{};
+  void pwrite_impl(const char *Ptr, size_t Size, uint64_t Offset) override {
+    memcpy(OutBufStart + Offset, Ptr, Size);
+  }
+  uint64_t current_pos() const override { return 0; }
+  void grow(size_t Size) {
+    size_t CurBytesAvailable = OutBufEnd - OutBufCur;
+    if (Size > CurBytesAvailable) {
+      size_t CurPos = OutBufCur - OutBufStart;
+      OutBufCur = OutBufStart;
+      size_t NewSize = size_t(NextPowerOf2(CurPos + Size + 2));
+      SetBufferAndMode(new char[NewSize], NewSize, DirectBuffer);
+      OutBufCur = OutBufStart + CurPos;
+    }
+  }
+
+public:
+  explicit raw_char_ostream(char *Buffer, unsigned BufferSize) {
+    if (Buffer) {
+      SetBufferAndMode(Buffer, BufferSize, ExternalBuffer);
+    } else {
+      if (!BufferSize)
+        BufferSize = preferred_buffer_size();
+      SetBufferAndMode(new char[BufferSize], BufferSize, DirectBuffer);
+    }
+  }
+  ~raw_char_ostream() override { OutBufCur = OutBufStart; }
+  raw_ostream &write(unsigned char C) override {
+    grow(1);
+    *OutBufCur++ = C;
+    return *this;
+  }
+  raw_ostream &write(const char *Ptr, size_t Size) override {
+    grow(Size);
+    memcpy(OutBufCur, Ptr, Size);
+    OutBufCur += Size;
+    return *this;
+  }
+  /// Returns the buffer.
+  StringRef str() { return StringRef(OutBufStart, GetNumBytesInBuffer()); }
+};
+
 /// A raw_ostream that writes to an SmallVector or SmallString.  This is a
 /// simple adaptor class. This class does not encounter output errors.
 class raw_svector_ostream : public raw_pwrite_stream {
Index: lib/Support/raw_ostream.cpp
===================================================================
--- lib/Support/raw_ostream.cpp	(revision 236300)
+++ lib/Support/raw_ostream.cpp	(working copy)
@@ -65,7 +65,7 @@
   assert(OutBufCur == OutBufStart &&
          "raw_ostream destructor called with non-empty buffer!");
 
-  if (BufferMode == InternalBuffer)
+  if (BufferMode == InternalBuffer || BufferMode == DirectBuffer)
     delete [] OutBufStart;
 }
 
@@ -95,7 +95,7 @@
   // child buffer management logic will be in write_impl).
   assert(GetNumBytesInBuffer() == 0 && "Current buffer is non-empty!");
 
-  if (BufferMode == InternalBuffer)
+  if (BufferMode == InternalBuffer || BufferMode == DirectBuffer)
     delete [] OutBufStart;
   OutBufStart = BufferStart;
   OutBufEnd = OutBufStart+Size;
Index: projects/raw_char_ostream_speed/CMakeLists.txt
===================================================================
--- projects/raw_char_ostream_speed/CMakeLists.txt	(revision 0)
+++ projects/raw_char_ostream_speed/CMakeLists.txt	(working copy)
@@ -0,0 +1,7 @@
+set(LLVM_LINK_COMPONENTS
+  Support
+)
+
+add_llvm_tool(raw_char_ostream
+  raw_char_ostream_speed.cpp
+)
Index: projects/raw_char_ostream_speed/raw_char_ostream_speed.cpp
===================================================================
--- projects/raw_char_ostream_speed/raw_char_ostream_speed.cpp	(revision 0)
+++ projects/raw_char_ostream_speed/raw_char_ostream_speed.cpp	(working copy)
@@ -0,0 +1,43 @@
+#include "llvm/ADT/SmallString.h"
+#include "llvm/Support/raw_ostream.h"
+
+#include <ctime>
+
+using namespace llvm;
+
+namespace {
+const StringRef FuncName =
+    "thadddddddddsrttttttttttttttttttttttttjouhojkirjdjdsjoiuosiasddjisjsrof";
+
+void writeStream(raw_ostream &OS) {
+  OS << "_Z" << FuncName.size() << FuncName << 'v';
+}
+
+const unsigned BufferSize = 128;
+void test_raw_char_ostream() {
+  char Buffer[BufferSize];
+  raw_char_ostream OS(Buffer, sizeof(Buffer));
+  writeStream(OS);
+}
+
+void test_raw_svector_ostream() {
+  SmallString<BufferSize> Buffer;
+  raw_svector_ostream OS(Buffer);
+  writeStream(OS);
+}
+
+void testIt(void (*F)()) {
+  std::clock_t StartTime = std::clock();
+  for (unsigned i = 1e7; i; --i)
+    F();
+  std::clock_t EndTime = std::clock();
+  unsigned ms = unsigned(double(EndTime - StartTime) * 1000.0 / CLOCKS_PER_SEC);
+  outs() << ms << "ms\n";
+}
+}
+
+int main() {
+  testIt(test_raw_char_ostream);
+  testIt(test_raw_svector_ostream);
+  return 0;
+}


More information about the llvm-dev mailing list