[LLVMdev] Correct use of StringRef and Twine

Tue Jul 19 10:24:46 PDT 2011

> And for arguments, generally always use Twine as the default, it allows construction of complex things, and is still efficient when passed the equiv of a StringRef (with the toStringRef method).  The only annoying thing about it is that the API to do this requires a temporary SmallVector to scribble in, which makes it more difficult to use.

Yes, I noticed this. It was one of my concerns about migrating lots
of stuff to Twine: efficient Twine-sinks would be tricky and/or
verbose, so I was wondering how to make them easier to write (hence
the ref return I mentioned below).
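
To make the scratch-buffer pattern concrete, here is a minimal standalone sketch. It does not use LLVM: `View` and `Concat` are hypothetical stand-ins for StringRef and Twine, and a std::string plays the role of the SmallVector that toStringRef scribbles in. It only shows the shape of such a sink, under those assumptions.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a non-owning string reference (not llvm::StringRef).
struct View {
    const char *data;
    std::size_t size;
};

// Hypothetical stand-in for a deferred two-piece concatenation (not llvm::Twine).
// An empty rhs means "just one piece".
struct Concat {
    View lhs;
    View rhs;

    // Returns a view of the whole text.
    // One piece: returns it directly, no copy, scratch is left untouched.
    // Two pieces: joins them into scratch and returns a view of scratch.
    View toView(std::string &scratch) const {
        if (rhs.size == 0)
            return lhs;
        scratch.clear();
        scratch.reserve(lhs.size + rhs.size);
        scratch.append(lhs.data, lhs.size);
        scratch.append(rhs.data, rhs.size);
        return View{scratch.data(), scratch.size()};
    }
};

// A sink: it has to declare the scratch buffer itself, and the returned view
// is only valid while both the buffer and the original pieces are alive.
std::size_t countChar(const Concat &name, char c) {
    std::string scratch; // assumption: LLVM code would use a SmallVector here
    View v = name.toView(scratch);
    std::size_t n = 0;
    for (std::size_t i = 0; i < v.size; ++i)
        if (v.data[i] == c)
            ++n;
    return n;
}
```

The extra `scratch` declaration in every sink is the verbosity being discussed; the single-piece path is where the "no copy" efficiency comes from.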

> > * Would it be OK to implement an implicit conversion from Twine to std::string rather than using the explicit str() function? When switching to StringRef, many expressions that were std::string concatenations became Twine concatenations. That isn't a drop-in replacement, because Twine has no implicit conversion to std::string, so I've had to wrap various concatenation expressions in (...).str(). (The switch should be a perf win, I'd think: it delays concatenation into a single operation instead of (((a + b) + c) + d).)
>
> I'd prefer not.  I'd rather convert the things that want an std::string to take a Twine.

Then it's a question of what sort of final destinations/sinks there
are & how many, given the slight complexity of writing an efficient
Twine sink.
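
As a rough illustration of "a single operation instead of (((a + b) + c) + d)", the sketch below builds the result in one pass with plain std::string. `concatOnce` is a hypothetical helper, not an LLVM or standard function; it just shows the effect a deferred concatenation is aiming for.

```cpp
#include <string>

// Hypothetical helper: size the destination once, then append each piece.
// Chained operator+ produces a temporary string per step and may reallocate
// as the result grows; here the one reserve() call covers the whole result.
std::string concatOnce(const std::string &a, const std::string &b,
                       const std::string &c, const std::string &d) {
    std::string out;
    out.reserve(a.size() + b.size() + c.size() + d.size());
    out += a;
    out += b;
    out += c;
    out += d;
    return out;
}
```

How much this wins over chained operator+ depends on the standard library and on the piece sizes, so treat it as a sketch of the idea rather than a measurement.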

> > * What would happen if Twine gave back a pointer/reference to some internal storage? In the common/correct use case (taking a Twine as a const ref, or using a Twine entirely in a temporary concatenation expression) wouldn't the Twine's internal storage live long enough to allow the caller to use that buffer within the life of the statement (as in, say, o << (aStringRef + "foo"))?
>
> This is really dangerous.

Is it much/any more dangerous than Twine's existing functionality,
though? Twine's already referring to temporaries & will break in fun
ways if used outside that cliche.

> I'd much rather extend raw_ostream to take twines.

Certainly that covers this case, but I'm not sure how many different
sinks there are that would need the nuanced efficient Twine handling
code. Perhaps it's not many - though some easy way to do "stdStr =
twine" would be great since quite often we want to take a string & put
it in a member, etc. Would it be worth having "assign(std::string&,
const Twine&)" or similar - or is going through Twine::str() &
expecting the compiler to optimize away the temporary std::string
acceptable? (I assume not, or Twine wouldn't have all those fancy
functions for getting a StringRef & providing a buffer, etc)

> The major win is actually when you have clients that don't *want* an std::string.

Ah, right, that too.

> std::string is really slow in most operations (it does atomic operations and COW with common implementations,

Hmm, which implementations still use COW, I wonder? I think there's
something in C++0x that makes it impossible to implement std::string
as COW.

> The toStringRef(x) method is what you want.  It is really fast and does no copy if the twine *just* contains a C string, std::string or StringRef, and in the concat cases it does no memory allocation in the common case where the SmallVector is big enough for the result.

Curiosity question: how much more efficient (vague question, I know)
is StringRef + SmallVector than a good (e.g. libc++) std::string
implementation? I know, for example, that Visual C++ 2010's
std::string performs the small string optimization, which I guess
is what SmallVector is doing.
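
One way to poke at the small string optimization on a given standard library is to check whether a string's characters live inside the std::string object itself. `storedInline` is a hypothetical probe written for this purpose, not part of any library, and the answer for short strings depends on the implementation.

```cpp
#include <functional>
#include <string>

// Hypothetical probe: true if s keeps its characters inside the std::string
// object itself (inline storage) rather than in a separate heap block.
bool storedInline(const std::string &s) {
    const char *objBegin = reinterpret_cast<const char *>(&s);
    const char *objEnd = objBegin + sizeof(std::string);
    const char *chars = s.data();
    // std::less gives a well-defined ordering even for unrelated pointers.
    std::less<const char *> before;
    return !before(chars, objBegin) && before(chars, objEnd);
}
```

Calling it on a short string such as `std::string("hi")` shows whether that library stores it inline. A string much longer than `sizeof(std::string)` cannot fit inline, so the probe reports false for it.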

- David