[llvm-dev] Is it time to allow StringRef to be constructed from nullptr?

Zachary Turner via llvm-dev llvm-dev at lists.llvm.org
Sun Sep 25 16:11:26 PDT 2016

I thought about doing something like that, but most compilers will fold a
call to strlen on a string literal into a constant anyway, so in practice I
don't think it matters much.  I know Clang does, and I tested MSVC and it
does too.

D:\>type strlen.cpp
#include <string.h>
#include <stdio.h>

int main(int argc, char **argv) {
  int x = strlen("This is a test");
  printf("%d", x);
  return 0;

D:\>cl /O2 strlen.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24213.1 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.00.24213.1
Copyright (C) Microsoft Corporation.  All rights reserved.


D:\>dumpbin strlen.obj /disasm | grep -C 5 main
  00000018: FF 30              push        dword ptr [eax]
  0000001A: E8 00 00 00 00     call        ___stdio_common_vfprintf
  0000001F: 83 C4 18           add         esp,18h
  00000022: C3                 ret

  *00000000: 6A 0E              push        0Eh*
  00000002: 68 00 00 00 00     push        offset ??_C at _02DPKJAMEF@
  00000007: E8 00 00 00 00     call        _printf
  0000000C: 83 C4 08           add         esp,8
  0000000F: 33 C0              xor         eax,eax

Also, IANALL, but I don't believe you can overload on const char* vs. const
char (&T)[N].  If you have both overloads, a string literal and char array
will still select the const char* overload, at least in the tests I

On Sun, Sep 25, 2016 at 3:58 PM Pete Cooper <peter_cooper at apple.com> wrote:

> On Sep 25, 2016, at 1:49 PM, Mehdi Amini via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> On Sep 25, 2016, at 9:10 AM, Zachary Turner via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> While porting LLDB over to StringRef, I am continuously running into
> difficulties caused by the fact that StringRef cannot be constructed from
> nullptr.  So I wanted to see peoples' thoughts on removing this restriction
> from StringRef.  To be clear, I'm only using LLDB as a motivating example,
> but I'm not requesting that it be done because LLDB is some kind of special
> case.  If it is to be done it should be on its own merits.  That said, here
> is some context:
> LLDB has a lot of functions that look like this:
> void foo(const char *, Bar, const char *).
> I'm trying to port these to functions that look like this:
> void foo(StringRef, Bar, StringRef).
> Often times the parameters are string literals or char arrays, but equally
> often they are another const char* that got passed into the calling
> function, or a return value from a CRT function like strstr(), or many
> other possible sources.  This latter category presents a problem for
> porting code to StringRef, because if I simply change the function
> signature and fix up compile errors, I will probably have introduced a bug
> because hundreds of callers will now be implicitly converting from const
> char* to StringRef, leaving open the possibility that one of those was null.
> To work around this, I've started doing the following every time I port a
> function:
> void foo(const char *, Bar, const char*) = delete;
> This is pretty hackish, but it gets the job done.  At least the compiler
> warns me and forces me to go inspect every callsite where there's an
> implicit conversion.  Unfortunately it also makes for extremely verbose
> code.  Now instead of:
> foo("bar", baz, "buzz")
> I have to write
> foo(StringRef("bar"), baz, StringRef("buzz"))
> even for string literals and char arrays, which will obviously never be
> null!  If StringRef would handle a null argument gracefully, it would make
> my life much easier.
> With that out of the way, here are some reasons I can see to allow
> StringRef accept null to its constructor which are independent of LLDB and
> stand on their own merit.
> 1) std::string_view<> can be constructed with null.  I don't know when we
> will be able to use std::string_view<>, but there's a chance that at some
> point in the future we may wish to remove StringRef in favor of
> string_view.  That day isn't soon, but in any case, it will be easier if
> our assumptions are the same.
> 2) [nullptr, nullptr+0) is a valid range.  Why shouldn't we be able to
> construct a StringRef from an otherwise perfectly valid range?
> 3) StringRef() can *already* be constructed from nullptr (!)  Surprised?
> That's what happens when you invoke the default constructor.  It happily
> initializes the internal Data with null.  So why not allow the same
> behavior when invoking the const char * constructor?
> Thoughts?
> As a tangent: I don’t like the fact that StringRef is implicitly built out
> of “const char *”, this is calling strlen() and because it is implicit
> folks don’t realize when they go from string -> char * -> StringRef.
> I rather have this constructor explicit, and provide an implicit one for
> string literal.
> I wonder if we could change that call site to be deleted (or at least
> explicit), and add support for literal strings with a StringRef version of
> this:
>     /// Construct an ArrayRef from a C array.
>     template <size_t N>
>     /*implicit*/ LLVM_CONSTEXPR ArrayRef(const T (&Arr)[N])
>       : Data(Arr), Length(N) {}
> This way we’ll avoid the strlen on quoted strings which is the common case
> anyway, and then can see how many other cases we have from const char*
> remaining.
> Pete
> To come back to your point, I’m not sure if we should leave the internal
> pointer null or always set it to “”? This would provide the guarantee that
> dereferencing a StringRef is always valid without checking.
>> Mehdi
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160925/7a77a1c7/attachment.html>

More information about the llvm-dev mailing list