[PATCH] A new ADT: StringRefNulTerminated
Dmitri Gribenko
gribozavr at gmail.com
Sat Feb 23 16:26:46 PST 2013
On Sun, Feb 24, 2013 at 2:11 AM, Chandler Carruth <chandlerc at google.com> wrote:
> Stepping back (a lot), I'm really unconvinced about the need for this class.
That's a valid concern. I tried to avoid doing this, because I
understand that a new ADT imposes a new concept onto all maintainers.
> On Wed, Feb 20, 2013 at 1:40 PM, Dmitri Gribenko <gribozavr at gmail.com>
> wrote:
>>
>> Hello,
>>
>> I want to propose a new ADT: StringRefNulTerminated.
>>
>> The ``StringRefNulTerminated`` class
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> ``StringRefNulTerminated`` is a subclass of ``StringRef`` that represents a
>> reference to a NUL-terminated string. It can be safely converted to
>> 'const char *' without copying the underlying data. Because it is a
>> subclass of ``StringRef``, slicing implicitly occurs whenever a
>> ``StringRefNulTerminated`` object is passed to an API that requires a
>> ``StringRef``. Since ``StringRefNulTerminated`` does not contain any extra
>> data, slicing preserves all the information (except for the fact that the
>> string was NUL-terminated).
>>
>> The motivation behind this class is to preserve the information about the
>> NUL terminator in the string. With ``StringRef``, this information is
>> lost. Also, one can not check if a ``StringRef`` contains a NUL-terminated
>> string, thus it is required to copy the data to convert from a
>> ``StringRef`` to ``const char *``.
>
>
> Everything you describe here is true of a 'const char*'. Why not just pass
> that through the API layers where it is important to have a C-string
> representation?
I tried doing that (to fix a libclang issue, a one-past-the-end read
from a StringRef, without a performance impact), but I found some
cases where dropping the length from the string might create a
performance issue. See below.
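
For reference, the class proposed in the quoted text above can be
sketched roughly as follows. This is only an illustration, not the
patch itself: std::string_view stands in for StringRef, and the names
StringRefLike, NulTerminatedRef, lengthOf and cLengthOf are made up
for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Stand-in for llvm::StringRef. Assumption: std::string_view is close
// enough for illustration; it is not the LLVM class.
using StringRefLike = std::string_view;

// Hypothetical sketch of the proposed class. It adds no data members, so
// slicing to the base type loses only the NUL-termination guarantee.
class NulTerminatedRef : public StringRefLike {
public:
  // Built only from C strings, so data()[size()] == '\0' holds on
  // construction. A real class would also have to hide inherited mutators
  // (for std::string_view, remove_suffix) that could break this.
  NulTerminatedRef(const char *Str) : StringRefLike(Str) {}

  // Safe without copying because of the invariant above.
  const char *c_str() const { return data(); }
};

// An API that takes the base type; a NulTerminatedRef argument is sliced.
inline std::size_t lengthOf(StringRefLike S) { return S.size(); }

// An API that needs a C string; no copy is made.
inline std::size_t cLengthOf(NulTerminatedRef S) {
  return std::strlen(S.c_str());
}

inline void checkSketch() {
  NulTerminatedRef Name("int");
  assert(lengthOf(Name) == 3);  // implicit conversion to the base type
  assert(cLengthOf(Name) == 3); // C-string view of the same bytes
}
```

The length stays attached to the pointer in both directions, which is
what a plain 'const char *' would lose.
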
> pro: It's a lot simpler and more clear about the fact that the code is
> dealing with a C-string.
> pro: It doesn't add a whole new class to ADT
> pro: It doesn't add complexity of understanding how this class and StringRef
> interact -- we already understand how to build a StringRef out of a
> C-string.
>
> con: It requires users who need a StringRef-like API to build a StringRef
> around it.
>
> This doesn't seem too bad... maybe there are other cons?
You have identified the downside correctly. I want to change some
existing C++ APIs in Clang that return a StringRef that is always
NUL-terminated. Among them, there are cases where we always return a
string literal, and the compiler can optimize out the strlen if we
construct a StringRef immediately from the string literal. If we
built the StringRef outside the function body, there would be a
strlen call. (See BuiltinType::getName for an example of this.) In
another case (StaticDiagCategoryRec::getName in
lib/Basic/DiagnosticIDs.cpp) we explicitly pass the length of the
string (I guess, to be 100% sure that strlen is not called).
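
To make the three cases concrete, here is a rough sketch. It is not the
Clang code: std::string_view stands in for StringRef, and Kind,
nameAsCStr, nameAsRef and NameRec are hypothetical names. Whether a
compiler actually folds the strlen in the first variant depends on
inlining and optimization level.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

enum class Kind { Int, Bool };

// Variant A: return a bare C string. A caller that needs the length has
// to compute it itself, which is a strlen unless the call gets inlined.
inline const char *nameAsCStr(Kind K) {
  switch (K) {
  case Kind::Int:  return "int";
  case Kind::Bool: return "bool";
  }
  return "";
}

// Variant B: wrap the literal inside the function body. The length of
// each literal is a compile-time constant here.
constexpr std::string_view nameAsRef(Kind K) {
  switch (K) {
  case Kind::Int:  return "int";
  case Kind::Bool: return "bool";
  }
  return "";
}

// Variant C: store the length next to the pointer, so no strlen is
// needed regardless of what the optimizer does.
struct NameRec {
  const char *Str;
  std::size_t Len;
  constexpr std::string_view getName() const {
    return std::string_view(Str, Len);
  }
};
constexpr NameRec IntRec = {"int", 3};

static_assert(nameAsRef(Kind::Bool).size() == 4, "length known statically");
static_assert(IntRec.getName().size() == 3, "length stored explicitly");

inline void checkNames() {
  assert(std::strlen(nameAsCStr(Kind::Int)) == 3); // caller pays for strlen
  assert(nameAsRef(Kind::Int) == "int");
  assert(IntRec.getName() == "int");
}
```

Variants B and C keep the length, but their return type no longer says
that the string is NUL-terminated, which is the gap the new class is
meant to fill.
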
Dmitri
--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/