[PATCH] D25639: Add ctor for string literal to StringRef, and make explicit the conversion from const char *

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 17 14:25:39 PDT 2016


> On Oct 17, 2016, at 1:51 PM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
> On Mon, Oct 17, 2016 at 12:38 PM Mehdi Amini <mehdi.amini at apple.com <mailto:mehdi.amini at apple.com>> wrote:
>> On Oct 17, 2016, at 10:12 AM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>> 
>> (for my money the readability loss of adding StringRef(...) all over the place doesn't seem quite worth it, but I'm not going to stand in the way of the change or anything.)
> 
> The balance is between the readability, and the “error prone” current behavior.
> 
> For example we introduced all these redundant .c_str() in the past: https://reviews.llvm.org/D25667 <https://reviews.llvm.org/D25667>
> Ideally we shouldn’t have much explicit calls to StringRef, if all the APIs are converted to work with StringRef most of the time. 
> 
> But for now we do have a fair few (judging by the size of this change).

Sure, at least it will be easy to spot which APIs / code can benefit from an upgrade.

>  
> 
>> 
>> Are these changes necessarily joined - or is the explicitness of the StringRef(...) ctor independent of the addition of the template?
> 
> It is not independent: the template is never selected if the other constructor is not made explicit.
> 
> Ah, because the const char* is a close enough match that the template isn't chosen.
> 
> Can workaround that with something like:
> 
>  template<typename T>
>   StringRef(T *Str, typename std::enable_if<std::is_same<T, const char>::value>::type = 0) = delete;
> 
> (now it's a template, so it's not a better match than the other one - but wouldn't trigger other implicit conversions - could generalize it to T Str, is_convertible_to<T, const char*>, etc)

I’m not sure what you’re looking to achieve here?
Assuming your template above replaces the one I made explicit, you’re forbidding creating a StringRef from a const char *?


> 
> 
>> Do we need to worry about the recursive strlen in the template ctor? Or is it clear that even in a non-constexpr context the compilers we care about produce reasonable performance?
> 
> Can you clarify what you mean by "non-constexpr context” here?
> The recursive template is not intended to be used with anything else than literal, I am not expecting this to generate any code ever (if it does, it is not intended).
> 
> Constexpr functions don't guarantee that they will be evaluated at compile time. They only provide that they /can/ be used in a place where compile time evaluation is required (& taht they will conform to the constexpr requirements (not writing mutating state, throwing exceptions, etc) in whatever places they're used that require constexpr (eg: a constexpr function can throw under some condition, so long as all the uses of the function in a constexpr context don't satisfy that condition))

Right, I was expecting that calling this function when instantiating using a `constexpr StringRef` constructor would be enough to trigger it, but not in all cases apparently.


> 
> For example, look at the IR generated from this code:
> 
> struct StringRef {
>   constexpr static unsigned getInitLength(char const *str, unsigned count = 0) {
>     return ('\0' == str[0]) ? count : getInitLength(str + 1, count + 1);
>   }
> 
>   template <int N>
>   constexpr StringRef(const char (&Str)[N]) 
>     : Length(getInitLength(Str)) {
>   }
> 
>   int Length;
> };
> 
> int main() {
>   StringRef S = "foo";
> }
> 
> $ grep getInitLength str.ll
> $_ZN9StringRef13getInitLengthEPKcj = comdat any
>   %call = call i32 @_ZN9StringRef13getInitLengthEPKcj(i8* %arraydecay, i32 0)
> define linkonce_odr i32 @_ZN9StringRef13getInitLengthEPKcj(i8* %str, i32 %count) #1 comdat align 2 {
>   %call = call i32 @_ZN9StringRef13getInitLengthEPKcj(i8* %add.ptr, i32 %add)
> 
> the getInitLength function exists and is recursive (the only way to force this not to happen is if the S variable itself is declared constexpr, then there's no mention of getInitLength in the generated IR).
> 
> Granted, if it's always a compile time constant string and everything is inlinable, etc, the optimizer will get rid of it - but LLVM's -O0 build might suffer greatly with all this recursive string length evaluation.

Right, we can use __builtin_strlen() instead of the recursive getInitLength. I didn’t go this route because I don’t know how to address it with MSVC  (and was expecting it be folded by the frontend…).

> 
> & of course any code like this wouldn't ever be optimized away & would still hit the recursive version (which might be optimized to a loop - /maybe/ even to strlen?):
> 
> char x[N];
> x[0] = 'a' + (rand() % 26); // or otherwise compute string contents at runtime
> StringRef S = x;
> 
> (essentially what Zach was asking about/pointing out)
> 
> Does all of that make sense? I may've done a poor job explaining it.

You did a good job here :)

— 
Mehdi

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161017/4a6c91a5/attachment.html>


More information about the llvm-commits mailing list