[LLVMdev] SmallString + raw_svector_ostream combination should be more efficient

Mon Apr 20 08:19:22 PDT 2015

On Sun, Apr 19, 2015 at 7:40 AM, Yaron Keren <yaron.keren at gmail.com> wrote:
> A very common code pattern in LLVM is
>
>  SmallString<128> S;
>  raw_svector_ostream OS(S);
>  OS << ...;
>  // use OS.str()
>
> While raw_svector_ostream is smart enough to share the text buffer
> itself, it is inefficient to keep two sets of pointers to the same buffer:
>
>  In SmallString: void *BeginX, *EndX, *CapacityX
>  In raw_ostream: char *OutBufStart, *OutBufEnd, *OutBufCur

Is there any reason to believe this inefficiency is significant?
These objects are never stored in long-lived containers; they
generally live on the stack, so the three extra pointers seem
unlikely to cost much in terms of overall performance.

>
> Moreover, at runtime the two sets of pointers need to be coordinated between
> the SmallString and raw_svector_ostream using raw_svector_ostream::init,
> raw_svector_ostream::pwrite, raw_svector_ostream::resync and
> raw_svector_ostream::write_impl.
> All these functions have non-inlined implementations in raw_ostream.cpp.
>
> Finally, this may cause subtle bugs if S is modified without calling
> OS.resync(). This is too easy to do by mistake.
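
To make the failure mode concrete, here is a toy model using only the
standard library. It is not LLVM's implementation: CachedEndStream,
withoutResync and withResync are hypothetical names, and the real
symptoms depend on how raw_svector_ostream buffers into the vector.
The sketch only assumes that the stream keeps its own cached idea of
where the buffer ends.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for a stream that caches the end of a buffer it does not own.
class CachedEndStream {
  std::string &Buf;      // storage owned by the caller
  std::size_t CachedEnd; // the stream's private copy of "end of data"

public:
  explicit CachedEndStream(std::string &B) : Buf(B), CachedEnd(B.size()) {}

  void write(const std::string &Data) {
    assert(CachedEnd <= Buf.size() && "buffer shrank behind the stream");
    Buf.resize(CachedEnd); // trusts the cached end; drops anything past it
    Buf += Data;
    CachedEnd = Buf.size();
  }

  // Re-read the buffer state after the caller modified it directly.
  void resync() { CachedEnd = Buf.size(); }
};

inline std::string withoutResync() {
  std::string S;
  CachedEndStream OS(S);
  OS.write("abc");
  S += "123";      // modified behind the stream's back
  OS.write("def"); // the stale cached end makes "123" disappear
  return S;
}

inline std::string withResync() {
  std::string S;
  CachedEndStream OS(S);
  OS.write("abc");
  S += "123";
  OS.resync();     // the stream picks up the new end first
  OS.write("def");
  return S;
}
```

In this model, forgetting resync() silently loses the direct
modification, which is the kind of mistake the quoted paragraph warns
about.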
>
> In this frequent use case the client does not really care about S being a
> SmallString with its many useful string helper functions. It is just
> boilerplate code for raw_svector_ostream. But it does cost three extra
> pointers, some runtime performance and possible bugs.
>
> To solve all three issues, would it make sense to have a raw_ostream-derived
> container with its own SmallString-like, templated-size built-in buffer?
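
For concreteness, here is a rough, standard-library-only sketch of the
kind of class being proposed. The names InlineBufferStream and
formatGreeting and all members are hypothetical and not part of LLVM.
A real version would presumably derive from raw_ostream and implement
write_impl, which this sketch does not attempt. It only shows a stream
that owns an inline buffer and tracks it with a single set of pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stream with a built-in, templated-size buffer.
template <std::size_t N>
class InlineBufferStream {
  char Inline[N];       // built-in storage, in the spirit of SmallString<N>
  char *Begin = Inline; // the only set of buffer pointers
  char *Cur = Inline;
  char *End = Inline + N;

  // Move to a larger heap buffer once the inline storage is exhausted.
  void grow(std::size_t MinExtra) {
    std::size_t Used = size();
    std::size_t NewCap = 2 * (Used + MinExtra);
    char *NewBuf = new char[NewCap];
    std::memcpy(NewBuf, Begin, Used);
    if (Begin != Inline)
      delete[] Begin;
    Begin = NewBuf;
    Cur = NewBuf + Used;
    End = NewBuf + NewCap;
  }

public:
  InlineBufferStream() = default;
  InlineBufferStream(const InlineBufferStream &) = delete;
  InlineBufferStream &operator=(const InlineBufferStream &) = delete;
  ~InlineBufferStream() {
    if (Begin != Inline)
      delete[] Begin;
  }

  InlineBufferStream &write(const char *Ptr, std::size_t Len) {
    assert((Ptr || Len == 0) && "null data with non-zero length");
    if (static_cast<std::size_t>(End - Cur) < Len)
      grow(Len);
    if (Len != 0)
      std::memcpy(Cur, Ptr, Len);
    Cur += Len;
    return *this;
  }

  InlineBufferStream &operator<<(const char *Str) {
    return write(Str, std::strlen(Str));
  }

  std::size_t size() const { return static_cast<std::size_t>(Cur - Begin); }
  std::string str() const { return std::string(Begin, size()); }
  bool isInline() const { return Begin == Inline; }
};

// Usage: one object replaces the SmallString + raw_svector_ostream pair.
inline std::string formatGreeting(const char *Name) {
  InlineBufferStream<16> OS;
  OS << "Hello, " << Name << "!";
  return OS.str();
}
```

Because the buffer is private to the stream, nothing outside can modify
it, so no resync step is needed in this design. Whether that is worth
a new class is the open question in this thread.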