[llvm-commits] Twine/StringRef enhancements & usage

David Blaikie dblaikie at gmail.com
Fri Jul 29 14:08:48 PDT 2011


Here's a more useful attempt at this. I had two issues:

1) My changes to MemoryBuffer were quite wrong (changing the buffer
from a StringRef to a Twine made no sense - but changing the name from
StringRef to Twine is still good). I'll see about adding some unit
tests for MemoryBuffer as there are none currently.
2) I found at least one case of StringRef(const char*) being used to
construct a StringRef from an array larger than the null termination
(in Unix/Pathv2). This broke with my StringRef(const char(&)[N]) ctor,
since it consumed the whole array as the StringRef, instead of just
the null terminated subsequence. I still think this ctor is worthwhile
- and the sooner it goes in the less chance people will depend on the
old behavior.
All the other details of my changes below still apply.

(on the comment below about toNullTerminatedStringRef - I've seen some
other instances of similar behavior. Pushing 0 onto SimpleStrings then
immediately popping it to produce a null terminated string... makes me
twitch a bit)

Thanks for your time,
- David

> Here's some more Twine-ification of APIs. I realize this seems like a
> somewhat random assortment - I haven't, for example, fully changed
> some types to Twine (so there might be some functions in a type that
> take Twine, others that take StringRef still), but it does seem to be
> stable & I didn't want to just keep piling on more code.
>
> API changes:
> * Twine
>  * appendTo(std::string&)
>    Works like "toVector" but on std::strings. (I'd like to rename
> toVector to appendTo to make it clear it's appending, not overwriting
> - I added test cases to expose this existing deliberate design because
> when I 'fixed' it by clearing the vector in toVector I broke a bunch
> of stuff that was relying on it being appending)
>  * assignTo(std::string&)
>    clear() + appendTo
> * StringRef
>  * StringRef(const char (&)[N])
>    an array constructor so it doesn't have to use strlen for string
> literals. For some reason this doesn't work in GCC (try making the
> StringRef(const char*) ctor explicit & GCC starts complaining about
> how it can't convert the const char* in 'return "foo";' into a
> StringRef). This is a little tricky - it doesn't include the trailing
> null character if one is present. This is to maintain compatibility
> with the existing implicit conversion from string arrays via the
> (const char*) ctor.
>  * assignTo(std::string&)
>    the same idea as Twine::appendTo(std::string&). This is more
> efficient than using str = strRef; because it doesn't produce an extra
> temporary buffer to return through. ctor initialization (std::string
> str = strRef;) is still efficient due to RVO.
>  * booleanTest()/SafeBool<T>
>    I've made StringRef boolean testable so it can be used as a drop
> in replacement when null tests are required
> * TwineString - a type that makes retrieving/manipulating a Twine
> value a little simpler. Yes, it's a little dodgy that it derives from
> StringRef but it's 'good enough' to make Twines the default string
> arguments - any time you see a StringRef or std::string argument there
> should be a specific explanation.
>
> One current problem I have is that SmallString<T> is convertible to
> StringRef and StringRef to Twine, but that seems to be insufficient to
> implicitly convert from SmallString<T> to Twine - so I've added a few
> explicit Twine(smallstr) calls around. If there's a better way to
> approach this (ultimately we could add a SmallString option to Twine,
> but it'd be sort of nice not to have to do that either) I'm all ears.
>
> Aside: Twine::toNullTerminatedStringRef scares me. A lot. It relies on
> UB to leave the null character beyond the end of the allocated buffer.
> Could we fix this to return a StringRef that doesn't include the null
> character (though I'm not sure this bit should matter, really - code
> asking for the null terminated StringRef is probably doing it to pass
> to a C API that's not going to care about the StringRef's concept of
> length), but leave the character itself in the buffer? (& I was
> thinking we could check if the Twine was a non-empty StringRef
> already, test the last character for null, then return a StringRef
> over that rather than using a new buffer - though this wouldn't work
> for my current StringRef(const char(&)[N]) ctor, since the length
> won't include the null character).
>
> Thanks for your help,
>
> - David
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twine_clang.diff
Type: application/octet-stream
Size: 1564 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110729/23ceb142/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: twine.diff
Type: application/octet-stream
Size: 61796 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20110729/23ceb142/attachment-0001.obj>


More information about the llvm-commits mailing list