[llvm-commits] Twine/StringRef enhancements & usage

Jordy Rose jediknil at belkadan.com
Wed Aug 24 15:38:56 PDT 2011


A couple comments, though this certainly isn't an area of the code I'm that familiar with.

- Instead of appendTo(string&) and assignTo(string&), why not just add operator+=(string&, const Twine&) and operator=(string&, const Twine&) ? Seems more C++ to me.

- Re: toNullTerminatedStringRef: A StringRef created from a null-terminated C string drops the null terminator, so you can't just "test the last character" to see if it's a null. In fact, having the last character of a StringRef be null is probably a bug. (Of course, you can't test /past/ the last character either, because one byte past valid memory is guaranteed to be a valid address but not guaranteed to be dereferenceable.)

- You've got several copies of SafeBool.h in the file. I'm guessing this is the result of reverting and then reapplying patches. (I do this all the time too.)

- TwineString definitely seems evil, but I haven't really thought about it hard enough to give a good reason why. I don't get why you're using a SmallVectorImpl instead of a SmallVector or SmallString, though.

Jordy


On Aug 23, 2011, at 22:16, David Blaikie wrote:

> 
> Here's a smaller version/start of my twine changes - perhaps it'll be easier to bite off to review.
> 
> The description from my initial attempt:
> 
> Here's some more Twine-ification of APIs. I realize this seems like a
> somewhat random assortment - I haven't, for example, fully changed
> some types to Twine (so there might be some functions in a type that
> take Twine, others that take StringRef still), but it does seem to be
> stable & I didn't want to just keep piling on more code.
> 
> API changes:
> * Twine
>  * appendTo(std::string&)
>    Works like "toVector" but on std::strings. (I'd like to rename
> toVector to appendTo to make it clear it's appending, not overwriting
> - I added test cases to expose this existing deliberate design because
> when I 'fixed' it by clearing the vector in toVector I broke a bunch
> of stuff that was relying on it being appending)
>  * assignTo(std::string&)
>    clear() + appendTo
> * StringRef
>  * StringRef(const char (&)[N])
>    an array constructor so it doesn't have to use strlen for string
> literals. For some reason this doesn't work in GCC (try making the
> StringRef(const char*) ctor explicit & GCC starts complaining about
> how it can't convert the const char* in 'return "foo";' into a
> StringRef). This is a little tricky - it doesn't include the trailing
> null character if one is present. This is to maintain compatibility
> with the existing implicit conversion from string arrays via the
> (const char*) ctor.
>  * assignTo(std::string&)
>    the same idea as Twine::appendTo(std::string&). This is more
> efficient than using str = strRef; because it doesn't produce an extra
> temporary buffer to return through. ctor initialization (std::string
> str = strRef;) is still efficient due to RVO.
>  * booleanTest()/SafeBool<T>
>    I've made StringRef boolean testable so it can be used as a drop
> in replacement when null tests are required
> * TwineString - a type that makes retrieving/manipulating a Twine
> value a little simpler. Yes, it's a little dodgy that it derives from
> StringRef but it's 'good enough' to make Twines the default string
> arguments - any time you see a StringRef or std::string argument there
> should be a specific explanation.
> 
> One current problem I have is that SmallString<T> is convertible to
> StringRef and StringRef to Twine, but that seems to be insufficient to
> implicitly convert from SmallString<T> to Twine - so I've added a few
> explicit Twine(smallstr) calls around. If there's a better way to
> approach this (ultimately we could add a SmallString option to Twine,
> but it'd be sort of nice not to have to do that either) I'm all ears.
> 
> Aside: Twine::
> toNullTerminatedStringRef scares me. A lot. It relies on
> UB to leave the null character beyond the end of the allocated buffer.
> Could we fix this to return a StringRef that doesn't include the null
> character (though I'm not sure this bit should matter, really - code
> asking for the null terminated StringRef is probably doing it to pass
> to a C API that's not going to care about the StringRef's concept of
> length), but leave the character itself in the buffer? (& I was
> thinking we could check if the Twine was a non-empty StringRef
> already, test the last character for null, then return a StringRef
> over that rather than using a new buffer - though this wouldn't work
> for my current StringRef(const char(&)[N]) ctor, since the length
> won't include the null character).
> 
> <twine_partial.diff>_______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits





More information about the llvm-commits mailing list