[LLVMdev] [cfe-dev] For alias analysis, It's gcc too aggressive or LLVM need to improve?

Xinliang David Li xinliangli at gmail.com
Wed Aug 20 09:43:24 PDT 2014


Danny explains well on address escape and function side effect etc.  In
particular, without the address escape, GCC fully unrolls the loop, and
array[0]->index gets converted into direct access to 'aa' after fre pass.

David


On Mon, Aug 11, 2014 at 11:44 PM, Kevin Qin <kevinqindev at gmail.com> wrote:

> Hi all,
>
> Thanks for you paying time to look at this issue. I'm not an expert for
> C/C++ language, so I can just post some experiment results from LLVM and
> GCC.
>
> If we make minor changes to the test, gcc may give different results.
>
> #include <stdio.h>
> struct heap {int index; int b;};
> struct heap **ptr;
> int aa;
>
> int main() {
>   struct heap element;
>   struct heap *array[2];
>   array[0] = (struct heap *)&aa;
>   array[1] = &element;
>   ptr = array;
>   aa = 1;
>   int i;
>   for (i =0; i< 2; i++) {
>     printf("i is %d, aa is %d\n", i, aa);
>     array[i]->index = 0;  // we replace ptr to array here. so no global
> lvalue is used.
>   }
>   return 0;
> }
>
> Result didn't get changed,
>
> $gcc test.c -O0
> $./a.out
> i is 0, aa is 1
> i is 1, aa is 0
>
> $gcc test.c -O2
> $./a.out
> i is 0, aa is 1
> i is 1, aa is 1
>
> But if we change a bit more, like
>
> #include <stdio.h>
> struct heap {int index; int b;};
> struct heap **ptr;
> int aa;
>
> int main() {
>   struct heap element;
>   struct heap *array[2];
>   array[0] = (struct heap *)&aa;
>   array[1] = &element;
>   //ptr = array; // remove this assignment as well.
>   aa = 1;
>   int i;
>   for (i =0; i< 2; i++) {
>     printf("i is %d, aa is %d\n", i, aa);
>     array[i]->index = 0;  // we replace ptr to array here. so no global
> lvalue is used.
>   }
>   return 0;
> }
>
> Result changed to be the same as LLVM.
>
> $gcc test.c -O0
> $./a.out
> i is 0, aa is 1
> i is 1, aa is 0
>
> $gcc test.c -O2
> $./a.out
> i is 0, aa is 1
> i is 1, aa is 0
>
> I don't know why a assignement statment to a unrelated global pointer will
> affect gcc's aliasing work, and I don't know from the language point of
> view, if we use a local pointer to replace the global pointer, then the
> result would be still undefined or not.
>
> Regards,
> Kevin
>
>
> 2014-08-12 12:02 GMT+08:00 Daniel Berlin <dberlin at dberlin.org>:
>
> So then there you go, a real language lawyer says it's invalid for
>> other reasons :)
>>
>>
>> On Mon, Aug 11, 2014 at 7:31 PM, Richard Smith <richard at metafoo.co.uk>
>> wrote:
>> > I'll take this from the C++ angle; the C rules are not the same, and
>> I'm not
>> > confident they give the same answer.
>> >
>> > On Mon, Aug 11, 2014 at 2:09 PM, Daniel Berlin <dberlin at dberlin.org>
>> wrote:
>> >>
>> >> The access path matters (in some sense), but this is, AFIAK, valid no
>> >> matter how you look at it.
>> >>
>> >> Let's take a look line by line
>> >>
>> >> #include <stdio.h>
>> >> struct heap {int index; int b;};
>> >> struct heap **ptr;
>> >> int aa;
>> >>
>> >> int main() {
>> >>   struct heap element;
>> >>   struct heap *array[2];
>> >>   array[0] = (struct heap *)&aa; <- Okay so far.
>> >>   array[1] = &element; <- Clearly okay
>> >>   ptr = array; <- still okay so far
>> >>   aa = 1; <- not pointer related.
>> >>   int i; <- not pointer related
>> >>   for (i =0; i< 2; i++) { <- not pointer related
>> >>     printf("i is %d, aa is %d\n", i, aa); <- not pointer related
>> >>     ptr[i]->index = 0; <- Here is where it gets wonky.
>> >>
>> >> <rest of codeis irrelevan>
>> >>
>> >> First, ptr[i] is an lvalue, of type struct heap *, and ptr[i]-> is an
>> >> lvalue of type struct heap (in C++03, this is 5.2.5 paragraph 3, check
>> >> footnote 59).
>> >
>> >
>> > This is where we get the undefined behavior.
>> >
>> > 3.8/6: "[If you have a glvalue referring to storage but where there is
>> no
>> > corresponding object, the] program has undefined behavior if:
>> > [...] the glvalue is used to access a non-static data member".
>> >
>> > There is no object of type 'heap' denoted by *ptr[0] (by 1.8/1, we can
>> only
>> > create objects through definitions, new-expressions, and by creating
>> > temporary objects). So the behavior is undefined when we evaluate
>> > ptr[0]->index.
>> >
>> >> (I'm too lazy to parse the rules for whether E1.E2 is an lvalue,
>> >> because it doesn't end up making a difference)
>> >>
>> >> Let's assume, for the sake of argument,  the actual access to aa
>> >> occurs through an lvalue of type "struct heap" rather than "int"
>> >> In C++03 and C++11, it says:
>> >>
>> >> An object shall have its stored value accessed only by an lvalue
>> >> expression that has one of the following types:
>> >>
>> >> a type compatible with the effective type of the object,
>> >> a qualified version of a type compatible with the effective type of the
>> >> object,
>> >> a type that is the signed or unsigned type corresponding to the
>> >> effective type of the object,
>> >> a type that is the signed or unsigned type corresponding to a
>> >> qualified version of the effective type of the object,
>> >> an aggregate or union type that includes one of the aforementioned
>> >> types among its members (including, recursively, a member of a
>> >> subaggregate or contained union), or
>> >> a character type.
>> >> (C++11 adds something about dynamic type here)
>> >>
>> >> struct heap is "an aggregate or union type that includes one of the
>> >> aforementioned types among it's members".
>> >>
>> >> Thus, this is legal to access this int through an lvalue expression
>> >> that has a type of struct heap.
>> >> Whether the actual store is legal for other reasons, i don't know.
>> >> There are all kinds of rules about object alignment and value
>> >> representation that aren't my baliwick.  I leave it to another
>> >> language lawyer to say whether it's okay to do a store to something
>> >> that is essentially, a partial object.
>> >>
>> >> Note that GCC actually knows this is legal to alias, at least at the
>> >> tree level. I debugged it there, and it definitely isn't eliminating
>> >> it at a high level. It also completely understands the call to ->index
>> >> = 0 affects "aa", and has a reload for aa before the printf call.
>> >>
>> >> I don't know what is eliminating this at the RTL level, but i can't
>> >> see why it's illegal from *aliasing rules*. Maybe this is invalid for
>> >> some other reason.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>   }
>> >>   return 0;
>> >> }
>> >>
>> >> ptr[i]->index = 0;
>> >>
>> >> On Mon, Aug 11, 2014 at 1:13 PM, Reid Kleckner <rnk at google.com> wrote:
>> >> > +aliasing people
>> >> >
>> >> > I *think* this is valid, because the rules have always been
>> described to
>> >> > me
>> >> > in terms of underlying storage type, and not access path. These are
>> all
>> >> > ints, so all loads and stores can alias.
>> >> >
>> >> >
>> >> > On Sat, Aug 9, 2014 at 3:07 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> >> >>
>> >> >> ----- Original Message -----
>> >> >> > From: "Tim Northover" <t.p.northover at gmail.com>
>> >> >> > To: "Jonas Wagner" <jonas.wagner at epfl.ch>
>> >> >> > Cc: "cfe-dev Developers" <cfe-dev at cs.uiuc.edu>, "LLVM Developers
>> >> >> > Mailing
>> >> >> > List" <llvmdev at cs.uiuc.edu>
>> >> >> > Sent: Friday, August 8, 2014 6:54:50 AM
>> >> >> > Subject: Re: [cfe-dev] [LLVMdev] For alias analysis, It's gcc too
>> >> >> > aggressive or LLVM need to improve?
>> >> >> >
>> >> >> > > your C program invokes undefined behavior when it dereferences
>> >> >> > > pointers that
>> >> >> > > have been converted to other types. See for example
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> http://stackoverflow.com/questions/4810417/c-when-is-casting-between-pointer-types-not-undefined-behavior
>> >> >> >
>> >> >> > I don't think it's quite that simple.The type-based aliasing rules
>> >> >> > come from 6.5p7 of C11, I think. That says:
>> >> >> >
>> >> >> > "An object shall have its stored value accessed only by an lvalue
>> >> >> > expression that has one of
>> >> >> > the following types:
>> >> >> >   + a type compatible with the effective type of the object,
>> >> >> >   [...]
>> >> >> >   + an aggregate or union type that includes one of the
>> >> >> >   aforementioned
>> >> >> > types among its members [...]"
>> >> >> >
>> >> >> > That would seem to allow this usage: aa (effective type "int") is
>> >> >> > being accessed via an lvalue "ptr[i]->index" of type "int".
>> >> >> >
>> >> >> > The second point would even seem to allow something like "ptr[i] =
>> >> >> > ..." if aa was declared "int aa[2];", though that seems to be
>> going
>> >> >> > too far. It also seems to be very difficult to pin down a meaning
>> >> >> > (from the standard) for "a->b" if a is not a pointer to an object
>> >> >> > with
>> >> >> > the correct effective type. So the entire area is probably one
>> that's
>> >> >> > open to interpretation.
>> >> >> >
>> >> >> > I've added cfe-dev to the list; they're the *professional*
>> language
>> >> >> > lawyers.
>> >> >>
>> >> >> Coincidentally, this also seems to be PR20585 (adding Jiangning Liu,
>> >> >> the
>> >> >> reporter of that bug, to this thread too).
>> >> >>
>> >> >>  -Hal
>> >> >>
>> >> >> >
>> >> >> > Cheers.
>> >> >> >
>> >> >> > Tim.
>> >> >> > _______________________________________________
>> >> >> > cfe-dev mailing list
>> >> >> > cfe-dev at cs.uiuc.edu
>> >> >> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>> >> >> >
>> >> >>
>> >> >> --
>> >> >> Hal Finkel
>> >> >> Assistant Computational Scientist
>> >> >> Leadership Computing Facility
>> >> >> Argonne National Laboratory
>> >> >> _______________________________________________
>> >> >> LLVM Developers mailing list
>> >> >> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> >> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >> >
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > LLVM Developers mailing list
>> >> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >> >
>> >
>> >
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>
>
>
> --
> Best Regards,
>
> Kevin Qin
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140820/bfcb5c57/attachment.html>


More information about the llvm-dev mailing list