[PATCH] More precise aliasing for char arrays

Fri Jun 27 11:14:22 PDT 2014

On Fri, Jun 27, 2014 at 9:20 AM, Richard Smith <richard at metafoo.co.uk> wrote:
>
>> >> >>     typedef struct {
>> >> >>       char a;
>> >> >>       char b[100];
>> >> >>       char c;
>> >> >>     } S;
>> >> >>
>
> Even in this case, you can in principle hoist out the load. The expression
> 'x.b' would have undefined behavior if there weren't an S object at that
> address, and if there's an S object, there's not an int. This is probably
> more than we want to optimize -- it'll break too much real world code --
> even with -fstruct-path-tbaa.

This is where I disagree. If there is an S at address 0x100, there is
definitely not an int at address 0x100… but there is nothing
preventing an int from being at address 0x104, in the middle of its
b array!

    S mys;
    void* pv = &mys.b[3];  // silly alignas tricks omitted for brevity
    int* pi = new (pv) int;
    mys.a = 42;
    *pi = 43;  // well-defined, as far as I know

At this point it would be undefined behavior to read from mys.b[3] as
a char, but it would be well-defined to access other parts of the byte
array (e.g. mys.b[40]), and it would be well-defined to write to
mys.b[3] as a char as long as you didn't try to read from *pi
afterward.
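
For concreteness, here is a fuller sketch of the snippet above with the
alignment trick spelled out. The alignas(int) on S, the kIntOffset
constant, and the function name placement_new_in_char_array are my own
illustrative choices, and the sketch assumes alignof(int) divides 4, as
it does on common targets. Whether every access in it is well-defined is
exactly the question under discussion, so read it as an illustration of
my interpretation rather than a settled answer.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // std::uintptr_t
#include <new>       // placement new

// Same layout as the S quoted above, but over-aligned so that &b[3]
// (offset 1 + 3 = 4) is suitably aligned for an int.
struct alignas(int) S {
  char a;
  char b[100];
  char c;
};

// Offset of the int we are going to create inside S (hypothetical name).
constexpr std::size_t kIntOffset = offsetof(S, b) + 3;
static_assert(kIntOffset % alignof(int) == 0,
              "assumption: b[3] is int-aligned on this target");
static_assert(kIntOffset + sizeof(int) <= offsetof(S, c),
              "the int must fit entirely inside b");

int placement_new_in_char_array() {
  S mys;
  void* pv = &mys.b[3];
  assert(reinterpret_cast<std::uintptr_t>(pv) % alignof(int) == 0);

  int* pi = new (pv) int;  // an int now lives at offset 4 of mys
  mys.a = 42;              // char store to a member outside the int
  *pi = 43;                // int store into the middle of mys.b
  mys.b[40] = 7;           // char store to bytes the int does not cover

  // If a store through *pi were assumed not to alias anything inside an
  // S, a compiler could reorder or drop these accesses.
  return mys.a + *pi + mys.b[40];
}
```

Under my reading, a call to placement_new_in_char_array() has to observe
all three stores, which is what stops the compiler from treating "there
is an S here" as proof that no int store can touch the same bytes.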

Maybe my interpretation is influenced too much by that "real-world
code" of which you speak… but I really do think that this is the
intent of the Standard in providing (A) placement new and (B) the
char-may-alias-anything rule.  If you break placement new, what
happens to all the "real-world code" using it?

–Arthur