<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Sep 25, 2016, at 4:11 PM, Zachary Turner &lt;<a href="mailto:zturner@google.com" class="">zturner@google.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">I thought about doing something like that, but most compilers will fold a call to strlen on a string literal into a constant anyway, so in practice I don't think it matters much.  I know Clang does, and I tested MSVC and it does too.</div></div></blockquote><div><br class=""></div><div>To be clear: I’m not worried that this would add cost to the literal case. And the other change I’m suggesting is (relatively) orthogonal and isn’t a blocker for what you want to do.</div><div><br class=""></div><div>-- </div><div>Mehdi</div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><br class=""></div><div class=""><div class="gmail_msg" style="color:rgb(33,33,33);font-family:&quot;helvetica neue&quot;,helvetica,arial,sans-serif"><div class="gmail_msg">D:\&gt;type strlen.cpp</div><div class="gmail_msg">#include &lt;string.h&gt;</div><div class="gmail_msg">#include &lt;stdio.h&gt;</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">int main(int argc, char **argv) {</div><div class="gmail_msg">  int x = strlen("This is a test");</div><div class="gmail_msg">  printf("%d", x);</div><div class="gmail_msg">  return 0;</div><div class="gmail_msg">}</div></div><div class="gmail_msg" style="color:rgb(33,33,33);font-family:&quot;helvetica neue&quot;,helvetica,arial,sans-serif"><br class="gmail_msg"></div><div class="gmail_msg" style="color:rgb(33,33,33);font-family:&quot;helvetica neue&quot;,helvetica,arial,sans-serif"><div class="gmail_msg">D:\&gt;cl /O2 strlen.cpp</div><div 
class="gmail_msg">Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24213.1 for x86</div><div class="gmail_msg">Copyright (C) Microsoft Corporation.  All rights reserved.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">strlen.cpp</div><div class="gmail_msg">Microsoft (R) Incremental Linker Version 14.00.24213.1</div><div class="gmail_msg">Copyright (C) Microsoft Corporation.  All rights reserved.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">/out:strlen.exe</div><div class="gmail_msg">strlen.obj</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">D:\>dumpbin strlen.obj /disasm | grep -C 5 main</div><div class="gmail_msg">  00000018: FF 30              push        dword ptr [eax]<br class=""></div><div class="gmail_msg">  0000001A: E8 00 00 00 00     call        ___stdio_common_vfprintf</div><div class="gmail_msg">  0000001F: 83 C4 18           add         esp,18h</div><div class="gmail_msg">  00000022: C3                 ret</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">_main:</div><div class="gmail_msg"> <span class="inbox-inbox-Apple-converted-space"> </span><b class="gmail_msg">00000000: 6A 0E              push        0Eh</b></div><div class="gmail_msg">  00000002: 68 00 00 00 00     push        offset ??_C@_02DPKJAMEF@?$CFd?$AA@</div><div class="gmail_msg">  00000007: E8 00 00 00 00     call        _printf</div><div class="gmail_msg">  0000000C: 83 C4 08           add         esp,8</div><div class="gmail_msg">  0000000F: 33 C0              xor         eax,eax</div><div class="gmail_msg"><br class=""></div><div class="gmail_msg"><br class=""></div><div class="gmail_msg">Also, IANALL, but I don't believe you can overload on const char* vs. const char (&T)[N].  
If you have both overloads, a string literal and char array will still select the const char* overload, at least in the tests I attempted.</div></div></div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Sun, Sep 25, 2016 at 3:58 PM Pete Cooper &lt;<a href="mailto:peter_cooper@apple.com" class="">peter_cooper@apple.com</a>&gt; wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="gmail_msg"><div class="gmail_msg"><blockquote type="cite" class="gmail_msg"><div class="gmail_msg">On Sep 25, 2016, at 1:49 PM, Mehdi Amini via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" class="gmail_msg" target="_blank">llvm-dev@lists.llvm.org</a>&gt; wrote:</div><br class="m_-5167184091587602723Apple-interchange-newline gmail_msg"><div class="gmail_msg"><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg"><blockquote type="cite" class="gmail_msg"><div class="gmail_msg"><br class="m_-5167184091587602723Apple-interchange-newline gmail_msg">On Sep 25, 2016, at 9:10 AM, Zachary Turner via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" class="gmail_msg" target="_blank">llvm-dev@lists.llvm.org</a>&gt; wrote:</div><br class="m_-5167184091587602723Apple-interchange-newline gmail_msg"><div class="gmail_msg"><div dir="ltr" class="gmail_msg">While porting LLDB over to StringRef, I am continuously running into difficulties caused by the fact that StringRef cannot be constructed from nullptr.  So I wanted to see people's thoughts on removing this restriction from StringRef.  To be clear, I'm only using LLDB as a motivating example, but I'm not requesting that it be done because LLDB is some kind of special case.  If it is to be done it should be on its own merits.  
That said, here is some context:<div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">LLDB has a lot of functions that look like this:</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">void foo(const char *, Bar, const char *).</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">I'm trying to port these to functions that look like this:</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">void foo(StringRef, Bar, StringRef).</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Often times the parameters are string literals or char arrays, but equally often they are another const char* that got passed into the calling function, or a return value from a CRT function like strstr(), or many other possible sources.  This latter category presents a problem for porting code to StringRef, because if I simply change the function signature and fix up compile errors, I will probably have introduced a bug because hundreds of callers will now be implicitly converting from const char* to StringRef, leaving open the possibility that one of those was null.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">To work around this, I've started doing the following every time I port a function:</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">void foo(const char *, Bar, const char*) = delete;</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">This is pretty hackish, but it gets the job done.  At least the compiler warns me and forces me to go inspect every callsite where there's an implicit conversion.  Unfortunately it also makes for extremely verbose code.  
Now instead of:</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">foo("bar", baz, "buzz")</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">I have to write</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">foo(StringRef("bar"), baz, StringRef("buzz"))<br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">even for string literals and char arrays, which will obviously never be null!  If StringRef handled a null argument gracefully, it would make my life much easier.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">With that out of the way, here are some reasons I can see to let StringRef accept null in its constructor, reasons which are independent of LLDB and stand on their own merit.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">1) std::string_view&lt;&gt; can be constructed with null.  I don't know when we will be able to use std::string_view&lt;&gt;, but there's a chance that at some point in the future we may wish to remove StringRef in favor of string_view.  That day isn't soon, but in any case, it will be easier if our assumptions are the same.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">2) [nullptr, nullptr+0) is a valid range.  Why shouldn't we be able to construct a StringRef from an otherwise perfectly valid range?</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">3) StringRef() can<span class="gmail_msg m_-5167184091587602723Apple-converted-space"> </span><b class="gmail_msg">already</b> be constructed from nullptr (!)  Surprised?  That's what happens when you invoke the default constructor.  It happily initializes the internal Data with null.  
So why not allow the same behavior when invoking the const char * constructor?</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Thoughts?</div></div></div></blockquote><br class="gmail_msg"></div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg">As a tangent: I don’t like the fact that StringRef is implicitly built out of “const char *”, this is calling strlen() and because it is implicit folks don’t realize when they go from string -> char * -> StringRef. 
</div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg">I'd rather make this constructor explicit, and provide an implicit one for string literals.</div></div></blockquote></div></div><div style="word-wrap:break-word" class="gmail_msg"><div class="gmail_msg">I wonder if we could change that call site to be deleted (or at least explicit), and add support for literal strings with a StringRef version of this:</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">    /// Construct an ArrayRef from a C array.<br class="gmail_msg">    template &lt;size_t N&gt;<br class="gmail_msg">    /*implicit*/ LLVM_CONSTEXPR ArrayRef(const T (&amp;Arr)[N])<br class="gmail_msg">      : Data(Arr), Length(N) {}</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">This way we’ll avoid the strlen on quoted strings, which is the common case anyway, and can then see how many other const char* cases remain.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Pete<br class="gmail_msg"><blockquote type="cite" class="gmail_msg"><div class="gmail_msg"></div></blockquote></div></div><div style="word-wrap:break-word" class="gmail_msg"><div class="gmail_msg"><blockquote type="cite" class="gmail_msg"><div class="gmail_msg"><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg">To come 
back to your point, I’m not sure if we should leave the internal pointer null or always set it to “”? This would provide the guarantee that dereferencing a StringRef is always valid without checking.</div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg">— </div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg">Mehdi</div><div style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg"><br class="gmail_msg"></div></div></blockquote></div></div><div style="word-wrap:break-word" class="gmail_msg"><div class="gmail_msg"><blockquote type="cite" class="gmail_msg"><div class="gmail_msg"><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="gmail_msg">_______________________________________________</span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg"><span 
style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="gmail_msg">LLVM Developers mailing list</span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg"><a href="mailto:llvm-dev@lists.llvm.org" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg" target="_blank">llvm-dev@lists.llvm.org</a><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg"><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="gmail_msg" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a></div></blockquote></div><br class="gmail_msg"></div></blockquote></div>
</div></blockquote></div><br class=""></body></html>