[cfe-dev] [LLVMdev] For alias analysis, It's gcc too aggressive or LLVM need to improve?

Richard Smith richard at metafoo.co.uk
Mon Aug 11 19:31:09 PDT 2014


I'll take this from the C++ angle; the C rules are not the same, and I'm
not confident they give the same answer.

On Mon, Aug 11, 2014 at 2:09 PM, Daniel Berlin <dberlin at dberlin.org> wrote:

> The access path matters (in some sense), but this is, AFIAK, valid no
> matter how you look at it.
>
> Let's take a look line by line
>
> #include <stdio.h>
> struct heap {int index; int b;};
> struct heap **ptr;
> int aa;
>
> int main() {
>   struct heap element;
>   struct heap *array[2];
>   array[0] = (struct heap *)&aa; <- Okay so far.
>   array[1] = &element; <- Clearly okay
>   ptr = array; <- still okay so far
>   aa = 1; <- not pointer related.
>   int i; <- not pointer related
>   for (i =0; i< 2; i++) { <- not pointer related
>     printf("i is %d, aa is %d\n", i, aa); <- not pointer related
>     ptr[i]->index = 0; <- Here is where it gets wonky.
>
> <rest of codeis irrelevan>
>
> First, ptr[i] is an lvalue, of type struct heap *, and ptr[i]-> is an
> lvalue of type struct heap (in C++03, this is 5.2.5 paragraph 3, check
> footnote 59).
>

This is where we get the undefined behavior.

3.8/6: "[If you have a glvalue referring to storage but where there is no
corresponding object, the] program has undefined behavior if:
[...] the glvalue is used to access a non-static data member".

There is no object of type 'heap' denoted by *ptr[0] (by 1.8/1, we can only
create objects through definitions, new-expressions, and by creating
temporary objects). So the behavior is undefined when we evaluate
ptr[0]->index.

(I'm too lazy to parse the rules for whether E1.E2 is an lvalue,
> because it doesn't end up making a difference)
>
> Let's assume, for the sake of argument,  the actual access to aa
> occurs through an lvalue of type "struct heap" rather than "int"
> In C++03 and C++11, it says:
>
> An object shall have its stored value accessed only by an lvalue
> expression that has one of the following types:
>
> a type compatible with the effective type of the object,
> a qualified version of a type compatible with the effective type of the
> object,
> a type that is the signed or unsigned type corresponding to the
> effective type of the object,
> a type that is the signed or unsigned type corresponding to a
> qualified version of the effective type of the object,
> an aggregate or union type that includes one of the aforementioned
> types among its members (including, recursively, a member of a
> subaggregate or contained union), or
> a character type.
> (C++11 adds something about dynamic type here)
>
> struct heap is "an aggregate or union type that includes one of the
> aforementioned types among it's members".
>
> Thus, this is legal to access this int through an lvalue expression
> that has a type of struct heap.
> Whether the actual store is legal for other reasons, i don't know.
> There are all kinds of rules about object alignment and value
> representation that aren't my baliwick.  I leave it to another
> language lawyer to say whether it's okay to do a store to something
> that is essentially, a partial object.
>
> Note that GCC actually knows this is legal to alias, at least at the
> tree level. I debugged it there, and it definitely isn't eliminating
> it at a high level. It also completely understands the call to ->index
> = 0 affects "aa", and has a reload for aa before the printf call.
>
> I don't know what is eliminating this at the RTL level, but i can't
> see why it's illegal from *aliasing rules*. Maybe this is invalid for
> some other reason.
>
>
>
>
>
>
>
>   }
>   return 0;
> }
>
> ptr[i]->index = 0;
>
> On Mon, Aug 11, 2014 at 1:13 PM, Reid Kleckner <rnk at google.com> wrote:
> > +aliasing people
> >
> > I *think* this is valid, because the rules have always been described to
> me
> > in terms of underlying storage type, and not access path. These are all
> > ints, so all loads and stores can alias.
> >
> >
> > On Sat, Aug 9, 2014 at 3:07 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> >>
> >> ----- Original Message -----
> >> > From: "Tim Northover" <t.p.northover at gmail.com>
> >> > To: "Jonas Wagner" <jonas.wagner at epfl.ch>
> >> > Cc: "cfe-dev Developers" <cfe-dev at cs.uiuc.edu>, "LLVM Developers
> Mailing
> >> > List" <llvmdev at cs.uiuc.edu>
> >> > Sent: Friday, August 8, 2014 6:54:50 AM
> >> > Subject: Re: [cfe-dev] [LLVMdev] For alias analysis, It's gcc too
> >> > aggressive or LLVM need to improve?
> >> >
> >> > > your C program invokes undefined behavior when it dereferences
> >> > > pointers that
> >> > > have been converted to other types. See for example
> >> > >
> >> > >
> http://stackoverflow.com/questions/4810417/c-when-is-casting-between-pointer-types-not-undefined-behavior
> >> >
> >> > I don't think it's quite that simple.The type-based aliasing rules
> >> > come from 6.5p7 of C11, I think. That says:
> >> >
> >> > "An object shall have its stored value accessed only by an lvalue
> >> > expression that has one of
> >> > the following types:
> >> >   + a type compatible with the effective type of the object,
> >> >   [...]
> >> >   + an aggregate or union type that includes one of the
> >> >   aforementioned
> >> > types among its members [...]"
> >> >
> >> > That would seem to allow this usage: aa (effective type "int") is
> >> > being accessed via an lvalue "ptr[i]->index" of type "int".
> >> >
> >> > The second point would even seem to allow something like "ptr[i] =
> >> > ..." if aa was declared "int aa[2];", though that seems to be going
> >> > too far. It also seems to be very difficult to pin down a meaning
> >> > (from the standard) for "a->b" if a is not a pointer to an object
> >> > with
> >> > the correct effective type. So the entire area is probably one that's
> >> > open to interpretation.
> >> >
> >> > I've added cfe-dev to the list; they're the *professional* language
> >> > lawyers.
> >>
> >> Coincidentally, this also seems to be PR20585 (adding Jiangning Liu, the
> >> reporter of that bug, to this thread too).
> >>
> >>  -Hal
> >>
> >> >
> >> > Cheers.
> >> >
> >> > Tim.
> >> > _______________________________________________
> >> > cfe-dev mailing list
> >> > cfe-dev at cs.uiuc.edu
> >> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >> >
> >>
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
> >
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140811/59c6cad3/attachment.html>


More information about the cfe-dev mailing list