[LLVMdev] SmallString + raw_svector_ostream combination should be more efficient

Sun Apr 19 07:40:36 PDT 2015

A very common code pattern in LLVM is

 SmallString<128> S;
 raw_svector_ostream OS(S);
 OS << ...;
 // use OS.str()

While raw_svector_ostream is smart enough to share the text buffer itself,
it inefficiently keeps two sets of pointers to the same buffer:

 In SmallString: void *BeginX, *EndX, *CapacityX
 In raw_ostream: char *OutBufStart, *OutBufEnd, *OutBufCur

Moreover, at runtime the two sets of pointers need to be kept coordinated
between the SmallString and the raw_svector_ostream, using
raw_svector_ostream::init, raw_svector_ostream::pwrite,
raw_svector_ostream::resync and raw_svector_ostream::write_impl.
All of these functions have non-inlined implementations in raw_ostream.cpp.

Finally, this may cause subtle bugs if S is modified without calling
OS.resync() afterwards, which is all too easy to do by mistake.
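
To illustrate the hazard, here is a simplified standalone model. It is not
LLVM's actual implementation: ToyString, ToyStream and the two helper
functions are hypothetical names invented for this sketch, and it does no
bounds checking. It only shows what can go wrong when the string and the
stream each keep their own notion of where the text ends.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Toy stand-in for SmallString: fixed storage plus its own end pointer.
// Hypothetical; no bounds checking, so keep the inputs short.
struct ToyString {
  char Data[64];
  char *End = Data;

  void append(const char *Str) {
    std::size_t Len = std::strlen(Str);
    std::memcpy(End, Str, Len);
    End += Len;
  }
  std::string str() const { return std::string(Data, End - Data); }
};

// Toy stand-in for raw_svector_ostream: caches its own cursor into the
// same storage, which is the second set of pointers.
struct ToyStream {
  ToyString &S;
  char *Cur;

  explicit ToyStream(ToyString &Str) : S(Str), Cur(Str.End) {}

  void write(const char *Str) {
    std::size_t Len = std::strlen(Str);
    std::memcpy(Cur, Str, Len);
    Cur += Len;
  }
  void flush() { S.End = Cur; }  // publish the stream's cursor to the string
  void resync() { Cur = S.End; } // re-read the string's end after outside edits
};

// S is modified behind the stream's back and resync() is forgotten.
std::string withoutResync() {
  ToyString S;
  ToyStream OS(S);
  OS.write("ab");
  OS.flush();
  S.append("XY"); // the stream's cursor is now stale
  OS.write("cd"); // overwrites "XY"
  OS.flush();
  return S.str();
}

// Same sequence, but the stream is resynchronized after the direct edit.
std::string withResync() {
  ToyString S;
  ToyStream OS(S);
  OS.write("ab");
  OS.flush();
  S.append("XY");
  OS.resync();
  OS.write("cd");
  OS.flush();
  return S.str();
}
```

In this model the text appended directly to the string is silently lost
unless resync() is called, and nothing in the types prevents that mistake.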

In this frequent use case the client does not really care about S being a
SmallString with its many useful string helper functions. It's just
boilerplate code for raw_svector_ostream. But it does cost three extra
pointers, some runtime performance and possible bugs.

To solve all three issues, would it make sense to have a raw_ostream-derived
container with its own SmallString-like, templated-size built-in buffer?
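
As a rough sketch of what I have in mind, assuming a class along the lines
below: SmallBufferStream is a hypothetical name, not an existing LLVM class,
and to stay self-contained it does not derive from raw_ostream, so it only
models the buffer management. The point is that one set of three pointers
serves both the inline storage and the stream cursor.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical sketch, not part of LLVM. A real version would derive from
// raw_ostream; this one only models the single-buffer idea.
template <std::size_t InlineSize>
class SmallBufferStream {
  static_assert(InlineSize > 0, "inline buffer must not be empty");

  char Inline[InlineSize]; // built-in storage, no heap use until it overflows
  char *Begin;             // start of the active buffer
  char *Cur;               // next byte to write
  char *End;               // one past the end of the active buffer

  bool isInline() const { return Begin == Inline; }

  // Grow to at least MinCapacity bytes, preserving the current contents.
  void grow(std::size_t MinCapacity) {
    std::size_t Used = size();
    std::size_t NewCap = 2 * capacity();
    if (NewCap < MinCapacity)
      NewCap = MinCapacity;
    char *NewBuf = static_cast<char *>(std::malloc(NewCap));
    if (!NewBuf)
      std::abort();
    std::memcpy(NewBuf, Begin, Used);
    if (!isInline())
      std::free(Begin);
    Begin = NewBuf;
    Cur = Begin + Used;
    End = Begin + NewCap;
  }

public:
  SmallBufferStream() : Begin(Inline), Cur(Inline), End(Inline + InlineSize) {}
  ~SmallBufferStream() {
    if (!isInline())
      std::free(Begin);
  }
  SmallBufferStream(const SmallBufferStream &) = delete;
  SmallBufferStream &operator=(const SmallBufferStream &) = delete;

  std::size_t size() const { return static_cast<std::size_t>(Cur - Begin); }
  std::size_t capacity() const { return static_cast<std::size_t>(End - Begin); }

  SmallBufferStream &write(const char *Ptr, std::size_t Len) {
    if (Len > static_cast<std::size_t>(End - Cur))
      grow(size() + Len);
    std::memcpy(Cur, Ptr, Len);
    Cur += Len;
    return *this;
  }

  SmallBufferStream &operator<<(const char *Str) {
    return write(Str, std::strlen(Str));
  }
  SmallBufferStream &operator<<(char C) { return write(&C, 1); }

  // Copy of the text written so far; there is nothing to resync.
  std::string str() const { return std::string(Begin, size()); }
};
```

Usage would then shrink to something like SmallBufferStream<128> OS;
OS << ...; followed by OS.str(), with no separate string object that could
get out of sync with the stream.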