[llvm-commits] PATCH: Adding an array constructor to StringRef

Marshall Clow mclow.lists at gmail.com
Mon Jun 18 09:48:39 PDT 2012


On Jun 18, 2012, at 8:47 AM, David Blaikie wrote:
> On Mon, Jun 18, 2012 at 8:36 AM, Marshall Clow <mclow.lists at gmail.com> wrote:
>> Modeled on the one in ArrayRef
>> 
>>    /// Construct an string ref from a C array.
>>    template <size_t N>
>>    /*implicit*/ StringRef(const char (&Arr)[N])
>>      : Data(Arr), Length(N) {}
>> 
> 
> One possible problem with this is that if you construct a StringRef
> from a string literal, you'll end up including the '\0' inside it.

Ok - I can see that.
> 
> (also, when I tried this myself a few months ago, there's at least one
> case where a StringRef is constructed from a large fixed-size buffer
> on the stack - when using this ctor, it obviously gets the size quite
> wrong)

Interesting. Do you remember where that was?

> That being said, I like the idea, I'm just not sure the right way to
> implement it - when I was experimenting with this I implemented it
> with a '\0' check for the last character (truncating the size by one
> when present, and leaving it at the full size when it wasn't present)

Ok.

> - and added an assert for strlen(Data) == Length to catch the buffer
> case (which means the ctor didn't provide performance improvements in
> debug builds but only in release builds).

I see three problems here:
1) Constructing from string literal gives one too large. This is easily fixed.
2) Your case above.
3) The general case, where you have an array that may contain a NULL somewhere in the middle.

(These last two might turn out to be the same case.)

The second case, I don't see as a problem, given that the ( data, len ) constructor does not enforce the invariant that StringRef::Length () == strlen ( StringRef.data()).
I know that StringRef.data() doesn't have to be NULL-terminated, but I don't see any care being taken to deal with embedded NULL characters, either.

Maybe the confusion here is due to the fact that this would create a second one-argument constructor, and it would not be clear which was being chosen:

	StringRef ( array )	- gives a reference to the entire array
	StringRef ( pointer )  - gives a reference to the (null-terminated) string

Ok - I'm not sure I see a good way to disambiguate those (or rather, for the user to know which is being called, given that arrays decay to pointers if you look at them funny)

Dang.  Any ideas?

-- Marshall

Marshall Clow     Idio Software   <mailto:mclow.lists at gmail.com>

A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait).
        -- Yu Suzuki





More information about the llvm-commits mailing list