[cfe-dev] (Request for comments) Implicit cast kind for initialization of references.

Tue Oct 25 00:31:56 PDT 2011

On 25/10/2011 00:20, Abramo Bagnara wrote:
> On 24/10/2011 23:50, Johannes Schaub wrote:
>> Abramo Bagnara wrote:
>>
>>> On 24/10/2011 19:47, Ahmed Charles wrote:
>>>> a[i] becomes *(a+i), which dereferences the address, i.e. undefined
>>>> behavior, in your case.
>>>
>>> Not without an lvalue to rvalue conversion.
>>
>> That may be the ultimate goal of the C++ committee to word into the Standard 
>> some time in the future, but the current C++11 does not incorporate that 
>> understanding yet. It leaves undefined the effects of evaluating an lvalue 
>> referring to nothing. So evaluating *(a+i) is UB *if* there is no object at 
>> address a+i. Since whether there is an object in this case is unspecified, 
>> the Standard does not require anything from the implementation - in other 
>> words, the behavior is undefined. 
> 
> Note that from C99 times &*E is considered equivalent to E, so starting
> from the original
> 
> int a[5];
> int *p = &a[5];
> 
> we have the following equivalent transformations:
> 
> &a[5] => &*(a+5) => a + 5
> 
> IMHO following a different line of thought we'd soon come to nonsense.

The wording in C++0x 8.3.2 p5 says:

  A reference shall be initialized to refer to a valid object
  or function. [Note: in particular, a null reference cannot
  exist in a well-defined program, because the only way to create
  such a reference would be to bind it to the “object” obtained
  by dereferencing a null pointer, which causes undefined behavior. [...] ]

Now, lvalue references are initialized using lvalues. If there could be
no invalid lvalues at all, then why state the first sentence above?

It is also worth stressing the use of the words "in particular" in the
Note above: as far as I understand, it suggests that the real cause of UB
is that we are *binding* the invalid lvalue to a reference (i.e., it is
not the mere act of computing the invalid lvalue).

That is, after computing an lvalue, we necessarily have to do one of the
following actions (unless I am missing something):
  - read from it (UB if invalid);
  - write to it (UB if invalid);
  - bind it to a reference (UB if invalid);
  - take its address (sometimes this is well-defined even though the
lvalue is invalid, e.g., for invalid "off-by-one lvalues").
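
As a rough illustration of the four cases, here is a minimal C++ sketch.
The function names are hypothetical and only serve the example. The
actions on the invalid lvalue a[5] are left in comments, so the code that
actually runs is well-defined under either reading of the Standard; the
one-past-the-end address is computed with plain pointer arithmetic.

```cpp
#include <cstddef>

// Hypothetical helper: distance from the start of the array to its
// one-past-the-end address, computed without forming the lvalue a[5].
std::ptrdiff_t one_past_the_end_distance() {
    int a[5] = {0, 1, 2, 3, 4};
    int *end = a + 5;     // pointer arithmetic only: well-defined
    // int *q = &a[5];    // the debated form, i.e. &*(a + 5)
    // int  v = a[5];     // read from the invalid lvalue: UB
    // a[5] = 0;          // write to the invalid lvalue: UB
    // int &r = a[5];     // bind the invalid lvalue to a reference: UB
    return end - a;
}

// Hypothetical helper: the same four actions applied to a valid lvalue.
int valid_lvalue_actions() {
    int a[5] = {0, 1, 2, 3, 4};
    int &r = a[4];        // bind to a reference
    r = 40;               // write
    int *p = &a[4];       // take the address
    return *p + a[0];     // read
}
```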

To our eyes, this view would provide quite a clear picture of the
matter. Is anyone aware of, e.g., comments from the Standard Committee
arguing against such a point of view?

Enea.