[PATCH] D25639: Add ctor for string literal to StringRef, and make explicit the conversion from const char *

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 17 14:30:19 PDT 2016


On Mon, Oct 17, 2016 at 2:25 PM Mehdi Amini <mehdi.amini at apple.com> wrote:

> On Oct 17, 2016, at 1:51 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Mon, Oct 17, 2016 at 12:38 PM Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>
> On Oct 17, 2016, at 10:12 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
> (for my money the readability loss of adding StringRef(...) all over the
> place doesn't seem quite worth it, but I'm not going to stand in the way of
> the change or anything.)
>
>
> The balance is between the readability, and the “error prone” current
> behavior.
>
> For example we introduced all these redundant .c_str() in the past:
> https://reviews.llvm.org/D25667
> Ideally we shouldn’t have much explicit calls to StringRef, if all the
> APIs are converted to work with StringRef most of the time.
>
>
> But for now we do have a fair few (judging by the size of this change).
>
>
> Sure, at least it will be easy to spot which APIs / code can benefit from
> an upgrade.
>
>
>
>
>
> Are these changes necessarily joined - or is the explicitness of the
> StringRef(...) ctor independent of the addition of the template?
>
>
> It is not independent: the template is never selected if the other
> constructor is not made explicit.
>
>
> Ah, because the const char* is a close enough match that the template
> isn't chosen.
>
> Can workaround that with something like:
>
>  template<typename T>
>   StringRef(T *Str, typename std::enable_if<std::is_same<T, const
> char>::value>::type = 0) = delete;
>
>
> (now it's a template, so it's not a better match than the other one - but
> wouldn't trigger other implicit conversions - could generalize it to T Str,
> is_convertible_to<T, const char*>, etc)
>
>
> I’m not sure what you’re looking to achieve here?
> Assuming your template above replaces the one I made explicit, you’re
> forbidding creating a StringRef from a const char *?
>

The template replaces the explicit const char* ctor with the intention of
providing essentially the same behavior (implicit construction) while still
deferring the array cases to the array ctor template.

By making the const char* ctor a template, it competes on more even ground
against the template case. So you don't have to make the const char* one
const to get the template benefit.

Alternatively - if it's just in particular global array initializers built
by tblgen, we could skip all of this and provide a factory function we only
use in those cases (that is constexpr, etc). Then there's no ctor issue.
(we could mark the const char*+length ctor constexpr so it can be used by
the factory).


>
>
>
>
> Do we need to worry about the recursive strlen in the template ctor? Or is
> it clear that even in a non-constexpr context the compilers we care about
> produce reasonable performance?
>
>
> Can you clarify what you mean by "non-constexpr context” here?
> The recursive template is not intended to be used with anything else than
> literal, I am not expecting this to generate any code ever (if it does, it
> is not intended).
>
>
> Constexpr functions don't guarantee that they will be evaluated at compile
> time. They only provide that they /can/ be used in a place where compile
> time evaluation is required (& taht they will conform to the constexpr
> requirements (not writing mutating state, throwing exceptions, etc) in
> whatever places they're used that require constexpr (eg: a constexpr
> function can throw under some condition, so long as all the uses of the
> function in a constexpr context don't satisfy that condition))
>
>
> Right, I was expecting that calling this function when instantiating using
> a `constexpr StringRef` constructor would be enough to trigger it, but not
> in all cases apparently.
>
>
>
> For example, look at the IR generated from this code:
>
> struct StringRef {
>   constexpr static unsigned getInitLength(char const *str, unsigned count
> = 0) {
>     return ('\0' == str[0]) ? count : getInitLength(str + 1, count + 1);
>   }
>
>   template <int N>
>   constexpr StringRef(const char (&Str)[N])
>     : Length(getInitLength(Str)) {
>   }
>
>   int Length;
> };
>
> int main() {
>   StringRef S = "foo";
> }
>
> $ grep getInitLength str.ll
> $_ZN9StringRef13getInitLengthEPKcj = comdat any
>   %call = call i32 @_ZN9StringRef13getInitLengthEPKcj(i8* %arraydecay, i32
> 0)
> define linkonce_odr i32 @_ZN9StringRef13getInitLengthEPKcj(i8* %str, i32
> %count) #1 comdat align 2 {
>   %call = call i32 @_ZN9StringRef13getInitLengthEPKcj(i8* %add.ptr, i32
> %add)
>
> the getInitLength function exists and is recursive (the only way to force
> this not to happen is if the S variable itself is declared constexpr, then
> there's no mention of getInitLength in the generated IR).
>
> Granted, if it's always a compile time constant string and everything is
> inlinable, etc, the optimizer will get rid of it - but LLVM's -O0 build
> might suffer greatly with all this recursive string length evaluation.
>
>
> Right, we can use __builtin_strlen() instead of the recursive
> getInitLength. I didn’t go this route because I don’t know how to address
> it with MSVC  (and was expecting it be folded by the frontend…).
>
>
> & of course any code like this wouldn't ever be optimized away & would
> still hit the recursive version (which might be optimized to a loop -
> /maybe/ even to strlen?):
>
> char x[N];
> x[0] = 'a' + (rand() % 26); // or otherwise compute string contents at
> runtime
> StringRef S = x;
>
> (essentially what Zach was asking about/pointing out)
>
> Does all of that make sense? I may've done a poor job explaining it.
>
>
> You did a good job here :)
>
>> Mehdi
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161017/b2789dc6/attachment.html>


More information about the llvm-commits mailing list