[PATCH] D87561: [Sema] List conversion validate character array

Mark de Wever via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sun Sep 27 09:47:53 PDT 2020


Mordante marked 2 inline comments as done.
Mordante added inline comments.


================
Comment at: clang/test/CXX/drs/dr14xx.cpp:411-414
+  void f(const char[4]);
+  void f(const wchar_t[4]);
+  void f(const char16_t[4]);
+  void f(const char32_t[4]);
----------------
rsmith wrote:
> rsmith wrote:
> > Mordante wrote:
> > > rsmith wrote:
> > > > These should presumably be references to arrays, rather than arrays, or the parameter type is as if you wrote (for example) `void f(const char *)`, which shouldn't get the special treatment here.
> > > > 
> > > > [over.ics.list]p4 mentions this in its footnote:
> > > > 
> > > > "Otherwise, if the parameter type is a character array [Footnote: Since there are no parameters of array type, this will only occur as the referenced type of a reference parameter.] and the initializer list has a single element that is an appropriately-typed string-literal (9.4.3), the implicit conversion sequence is the identity conversion."
> > > Ah I missed that footnote. I reread the standard and can you confirm some cases?
> > > ```
> > > namespace A { 
> > >   void f(const char(&)[4]);
> > >   void g() { f({"abc"}); }
> > > }
> > > namespace B { 
> > >   void f(const char(&&)[4]);
> > >   void g() { f({"abc"}); }
> > > } 
> > > ```
> > > both should work and with P0388 the array without bounds will also work.
> > > 
> > > I ask since this is my interpretation of the standard, but it seems there's a difference between implementations and `void f(const char(&&)[4]);` fails for "abc" but works for "ab".
> > > It seems ICC and consider "abc" an lvalue in this case and not when using "ab".
> > > 
> > > Here's a gotbolt link with the examples https://godbolt.org/z/r1TKfx
> > That's a really interesting example :)
> > 
> > The initializer is a list containing an lvalue of type `const char[4]`. Per [dcl.init.list]/3.9 and /3.10, the behavior depends on whether the referenced type is reference-related to `const char[4]` -- if so, then the reference can only bind directly and a `&&` reference will be invalid, because it's binding an rvalue reference to an lvalue, and if not, then we create an array temporary and the `&&` binding is fine.
> > 
> > Per [dcl.init.ref]/4, `const char[???]` is reference-related to `const char[4]` if they are similar types, and per [conv.qual]/2, the types are similar if `???` is omitted or `4`, and not similar otherwise.
> > 
> > So:
> > * `const char (&&r)[4] = {"abc"}` is ill-formed (types are the same, binds rvalue reference to lvalue)
> > * `const char (&&)[] = {"abc"}` is ill-formed (types are similar, binds rvalue reference to lvalue)
> > * `const char (&&r)[5] = {"abc"}` is OK (types are not similar, creates temporary)
> > 
> > That seems like a very surprising outcome to me. Perhaps we should probably ignore array bounds entirely when determining whether two types are reference-related. I'll take this to CWG for discussion.
> I think that's only an answer to half of your question. The other half is that [over.ics.list]p4 does not apply (directly) to either of your testcases, because the "parameter type" is a reference type, not an array type. For:
> 
> ```
> namespace A { 
>   void f(const char(&)[4]);
>   void g() { f({"abc"}); }
> }
> ```
> 
> ... we reach [over.ics.list]p9, which says to use the rules in [over.ics.ref]. Those rules say that if the reference binds directly to an argument expression (ignore the "expression" here; this is very old wording that predates braced initializers), then we form an identity conversion. So that's what happens in this case.
> 
> For
> 
> ```
> namespace B { 
>   void f(const char(&&)[4]);
>   void g() { f({"abc"}); }
> }
> ```
> 
> the same thing happens, but now [over.ics.ref]p3 says "an implicit conversion sequence cannot be formed if it requires binding an lvalue reference other than a reference to a non-volatile const type to an rvalue or binding an rvalue reference to an lvalue other than a function lvalue", so the candidate is not viable.
> 
> If the array bound were omitted or were not 4, then the reference would not bind directly, and we would instead consider initializing a temporary. [over.ics.ref]p2 says "When a parameter of reference type is not bound directly to an argument expression, the conversion sequence is the one required to convert the argument expression to the referenced type according to 12.4.4.2.", which takes us back around to [over.ics.list] with the "parameter type" now being the array type. That's how we can reach [over.ics.list]p4 and consider string literal -> array initialization.
> 
> So I think that, according to the current rules, for
> 
> ```
> void f(const char (&&)[4]); // #1
> void f(const char (&&)[5]); // #2
> ```
> 
> ... a call to `f({"abc"})` remarkably calls #2, because #1 is not viable (would bind rvalue reference to lvalue string literal), but #2 is, just like `const char (&&r)[4] = {"abc"};` is ill-formed but `const char (&&r)[5] = {"abc"};` is valid.
> That's a really interesting example :)

Thanks :-)
Also thanks for the detailed explanation of the rules applying to these cases.

> So:
> * `const char (&&r)[4] = {"abc"}` is ill-formed (types are the same, binds rvalue reference to lvalue)
> * `const char (&&)[] = {"abc"}` is ill-formed (types are similar, binds rvalue reference to lvalue)

Interesting, wasn't P0388 trying to solve the issues with array of unknown bound?
Am I correct to assume it's intended that this code will become valid in the future?

> * `const char (&&r)[5] = {"abc"}` is OK (types are not similar, creates temporary)
> 
> That seems like a very surprising outcome to me. Perhaps we should probably ignore array bounds entirely when determining whether two types are reference-related. I'll take this to CWG for discussion.

I agree this seems unexpected behaviour, thanks for discussing it with CWG.

> So I think that, according to the current rules, for

> void f(const char (&&)[4]); // #1
> void f(const char (&&)[5]); // #2

> ... a call to f({"abc"}) remarkably calls #2, because #1 is not viable (would bind rvalue reference to lvalue string literal), but #2 is, just like const char (&&r)[4] = {"abc"}; is ill-formed but const char (&&r)[5] = {"abc"}; is valid.

MSVC agrees with your observation, GCC picks #1 and fails to create the call, and ICC aborts with an ICE
https://godbolt.org/z/s7nTPP





Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87561/new/

https://reviews.llvm.org/D87561



More information about the cfe-commits mailing list