<div dir="ltr">Hi all,<div><br></div><div>Thanks for taking the time to look at this issue. I'm not an expert in the C/C++ languages, so I can only post some experimental results from LLVM and GCC. </div><div><br></div><div>
If we make minor changes to the test, gcc may give different results.</div><div><br></div><div><div style="font-family:arial,sans-serif;font-size:14.166666030883789px"><div>#include <stdio.h></div><div>struct heap {int index; int b;};</div>
<div>struct heap **ptr;</div><div>int aa;</div><div><br></div><div>int main() {</div><div> struct heap element;<br></div><div> struct heap *array[2];</div><div> array[0] = (struct heap *)&aa;</div><div> array[1] = &element;</div>
<div> ptr = array;</div><div> aa = 1;</div><div> int i;</div><div> for (i =0; i< 2; i++) {</div><div> printf("i is %d, aa is %d\n", i, aa);</div><div> array[i]->index = 0; // ptr is replaced by array here, so no global lvalue is used.</div>
<div> }</div><div> return 0;</div><div>}</div></div><div style="font-family:arial,sans-serif;font-size:14.166666030883789px"><br></div><div style="font-family:arial,sans-serif;font-size:14.166666030883789px">The result did not change:</div>
<div style="font-family:arial,sans-serif;font-size:14.166666030883789px"><br></div><div style="font-family:arial,sans-serif;font-size:14.166666030883789px">$gcc test.c -O0<br></div><div style="font-family:arial,sans-serif;font-size:14.166666030883789px">
$./a.out</div><div style="font-family:arial,sans-serif;font-size:14.166666030883789px"><div>i is 0, aa is 1</div><div>i is 1, aa is 0</div></div><div style="font-family:arial,sans-serif;font-size:14.166666030883789px"><br>
</div><div style="font-family:arial,sans-serif;font-size:14.166666030883789px"><div>$gcc test.c -O2<br></div><div>$./a.out</div><div><div>i is 0, aa is 1</div><div>i is 1, aa is 1</div></div><div><br></div><div>But if we change the test a bit more, like this:</div>
<div><br></div><div><div>#include <stdio.h></div><div>struct heap {int index; int b;};</div><div>struct heap **ptr;</div><div>int aa;</div><div><br></div><div>int main() {</div><div> struct heap element;<br></div><div>
struct heap *array[2];</div><div> array[0] = (struct heap *)&aa;</div><div> array[1] = &element;</div><div> //ptr = array; // remove this assignment as well.</div><div> aa = 1;</div><div> int i;</div><div> for (i =0; i< 2; i++) {</div>
<div> printf("i is %d, aa is %d\n", i, aa);</div><div> array[i]->index = 0; // ptr is replaced by array here, so no global lvalue is used.</div><div> }</div><div> return 0;</div><div>}</div></div><div>
<br></div><div><div>Now the result is the same as LLVM's.</div><div><br></div><div>$gcc test.c -O0<br></div><div>$./a.out</div><div><div>i is 0, aa is 1</div><div>i is 1, aa is 0</div></div><div><br></div><div><div>$gcc test.c -O2<br>
</div><div>$./a.out</div><div><div>i is 0, aa is 1</div><div>i is 1, aa is 0</div></div></div></div><div><br></div><div>I don't know why an assignment to an unrelated global pointer affects gcc's alias analysis. I also don't know whether, from the language point of view, the behavior is still undefined if we use a local pointer in place of the global pointer.</div>
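<div><br></div><div>For comparison, here is a minimal sketch of a variant that sidesteps the question. It is only an illustration, not part of the original test: it assumes aa can be declared as a real struct heap object, and the function name run_variant and the seen array are hypothetical. Because every store goes to an object that really has type struct heap, there is no cast and no type mismatch, so I would expect the same output at every optimization level.</div>

```c
#include <assert.h>
#include <stdio.h>

struct heap { int index; int b; };

/* Assumption for this sketch: aa is a real struct heap, not a bare int. */
struct heap aa;

/* Runs the same loop as the test and records the value of aa.index
 * observed at the start of each iteration in seen[0] and seen[1]. */
void run_variant(int seen[2]) {
    struct heap element = { -1, 0 };
    struct heap *array[2];
    int i;

    array[0] = &aa;      /* no cast: the pointee really is a struct heap */
    array[1] = &element;
    aa.index = 1;

    for (i = 0; i < 2; i++) {
        seen[i] = aa.index;
        printf("i is %d, aa.index is %d\n", i, seen[i]);
        array[i]->index = 0; /* i == 0 writes aa.index, i == 1 writes element.index */
    }

    /* The second iteration must have stored to element, not to aa. */
    assert(element.index == 0);
}
```

<div>Calling run_variant from main should print "i is 0, aa.index is 1" and then "i is 1, aa.index is 0", since the store in the first iteration is a plain, well-typed write to aa.index that the compiler has to honor.</div>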
<div><br></div><div>Regards,</div><div>Kevin</div></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-08-12 12:02 GMT+08:00 Daniel Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">So then there you go, a real language lawyer says it's invalid for<br>
other reasons :)<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
On Mon, Aug 11, 2014 at 7:31 PM, Richard Smith <<a href="mailto:richard@metafoo.co.uk">richard@metafoo.co.uk</a>> wrote:<br>
> I'll take this from the C++ angle; the C rules are not the same, and I'm not<br>
> confident they give the same answer.<br>
><br>
> On Mon, Aug 11, 2014 at 2:09 PM, Daniel Berlin <<a href="mailto:dberlin@dberlin.org">dberlin@dberlin.org</a>> wrote:<br>
>><br>
>> The access path matters (in some sense), but this is, AFAIK, valid no<br>
>> matter how you look at it.<br>
>><br>
>> Let's take a look line by line<br>
>><br>
>> #include <stdio.h><br>
>> struct heap {int index; int b;};<br>
>> struct heap **ptr;<br>
>> int aa;<br>
>><br>
>> int main() {<br>
>> struct heap element;<br>
>> struct heap *array[2];<br>
>> array[0] = (struct heap *)&aa; <- Okay so far.<br>
>> array[1] = &element; <- Clearly okay<br>
>> ptr = array; <- still okay so far<br>
>> aa = 1; <- not pointer related.<br>
>> int i; <- not pointer related<br>
>> for (i =0; i< 2; i++) { <- not pointer related<br>
>> printf("i is %d, aa is %d\n", i, aa); <- not pointer related<br>
>> ptr[i]->index = 0; <- Here is where it gets wonky.<br>
>><br>
>> <rest of code is irrelevant><br>
>><br>
>> First, ptr[i] is an lvalue, of type struct heap *, and ptr[i]-> is an<br>
>> lvalue of type struct heap (in C++03, this is 5.2.5 paragraph 3, check<br>
>> footnote 59).<br>
><br>
><br>
> This is where we get the undefined behavior.<br>
><br>
> 3.8/6: "[If you have a glvalue referring to storage but where there is no<br>
> corresponding object, the] program has undefined behavior if:<br>
> [...] the glvalue is used to access a non-static data member".<br>
><br>
> There is no object of type 'heap' denoted by *ptr[0] (by 1.8/1, we can only<br>
> create objects through definitions, new-expressions, and by creating<br>
> temporary objects). So the behavior is undefined when we evaluate<br>
> ptr[0]->index.<br>
><br>
>> (I'm too lazy to parse the rules for whether E1.E2 is an lvalue,<br>
>> because it doesn't end up making a difference)<br>
>><br>
>> Let's assume, for the sake of argument, the actual access to aa<br>
>> occurs through an lvalue of type "struct heap" rather than "int"<br>
>> In C++03 and C++11, it says:<br>
>><br>
>> An object shall have its stored value accessed only by an lvalue<br>
>> expression that has one of the following types:<br>
>><br>
>> a type compatible with the effective type of the object,<br>
>> a qualified version of a type compatible with the effective type of the<br>
>> object,<br>
>> a type that is the signed or unsigned type corresponding to the<br>
>> effective type of the object,<br>
>> a type that is the signed or unsigned type corresponding to a<br>
>> qualified version of the effective type of the object,<br>
>> an aggregate or union type that includes one of the aforementioned<br>
>> types among its members (including, recursively, a member of a<br>
>> subaggregate or contained union), or<br>
>> a character type.<br>
>> (C++11 adds something about dynamic type here)<br>
>><br>
>> struct heap is "an aggregate or union type that includes one of the<br>
>> aforementioned types among its members".<br>
>><br>
>> Thus, this is legal to access this int through an lvalue expression<br>
>> that has a type of struct heap.<br>
>> Whether the actual store is legal for other reasons, i don't know.<br>
>> There are all kinds of rules about object alignment and value<br>
>> representation that aren't my bailiwick. I leave it to another<br>
>> language lawyer to say whether it's okay to do a store to something<br>
>> that is essentially, a partial object.<br>
>><br>
>> Note that GCC actually knows this is legal to alias, at least at the<br>
>> tree level. I debugged it there, and it definitely isn't eliminating<br>
>> it at a high level. It also completely understands the call to ->index<br>
>> = 0 affects "aa", and has a reload for aa before the printf call.<br>
>><br>
>> I don't know what is eliminating this at the RTL level, but i can't<br>
>> see why it's illegal from *aliasing rules*. Maybe this is invalid for<br>
>> some other reason.<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> }<br>
>> return 0;<br>
>> }<br>
>><br>
>> ptr[i]->index = 0;<br>
>><br>
>> On Mon, Aug 11, 2014 at 1:13 PM, Reid Kleckner <<a href="mailto:rnk@google.com">rnk@google.com</a>> wrote:<br>
>> > +aliasing people<br>
>> ><br>
>> > I *think* this is valid, because the rules have always been described to<br>
>> > me<br>
>> > in terms of underlying storage type, and not access path. These are all<br>
>> > ints, so all loads and stores can alias.<br>
>> ><br>
>> ><br>
>> > On Sat, Aug 9, 2014 at 3:07 PM, Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>> wrote:<br>
>> >><br>
>> >> ----- Original Message -----<br>
>> >> > From: "Tim Northover" <<a href="mailto:t.p.northover@gmail.com">t.p.northover@gmail.com</a>><br>
>> >> > To: "Jonas Wagner" <<a href="mailto:jonas.wagner@epfl.ch">jonas.wagner@epfl.ch</a>><br>
>> >> > Cc: "cfe-dev Developers" <<a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a>>, "LLVM Developers<br>
>> >> > Mailing<br>
>> >> > List" <<a href="mailto:llvmdev@cs.uiuc.edu">llvmdev@cs.uiuc.edu</a>><br>
>> >> > Sent: Friday, August 8, 2014 6:54:50 AM<br>
>> >> > Subject: Re: [cfe-dev] [LLVMdev] For alias analysis, It's gcc too<br>
>> >> > aggressive or LLVM need to improve?<br>
>> >> ><br>
>> >> > > your C program invokes undefined behavior when it dereferences<br>
>> >> > > pointers that<br>
>> >> > > have been converted to other types. See for example<br>
>> >> > ><br>
>> >> > ><br>
>> >> > > <a href="http://stackoverflow.com/questions/4810417/c-when-is-casting-between-pointer-types-not-undefined-behavior" target="_blank">http://stackoverflow.com/questions/4810417/c-when-is-casting-between-pointer-types-not-undefined-behavior</a><br>
>> >> ><br>
>> >> > I don't think it's quite that simple. The type-based aliasing rules<br>
>> >> > come from 6.5p7 of C11, I think. That says:<br>
>> >> ><br>
>> >> > "An object shall have its stored value accessed only by an lvalue<br>
>> >> > expression that has one of<br>
>> >> > the following types:<br>
>> >> > + a type compatible with the effective type of the object,<br>
>> >> > [...]<br>
>> >> > + an aggregate or union type that includes one of the<br>
>> >> > aforementioned<br>
>> >> > types among its members [...]"<br>
>> >> ><br>
>> >> > That would seem to allow this usage: aa (effective type "int") is<br>
>> >> > being accessed via an lvalue "ptr[i]->index" of type "int".<br>
>> >> ><br>
>> >> > The second point would even seem to allow something like "ptr[i] =<br>
>> >> > ..." if aa was declared "int aa[2];", though that seems to be going<br>
>> >> > too far. It also seems to be very difficult to pin down a meaning<br>
>> >> > (from the standard) for "a->b" if a is not a pointer to an object<br>
>> >> > with<br>
>> >> > the correct effective type. So the entire area is probably one that's<br>
>> >> > open to interpretation.<br>
>> >> ><br>
>> >> > I've added cfe-dev to the list; they're the *professional* language<br>
>> >> > lawyers.<br>
>> >><br>
>> >> Coincidentally, this also seems to be PR20585 (adding Jiangning Liu,<br>
>> >> the<br>
>> >> reporter of that bug, to this thread too).<br>
>> >><br>
>> >> -Hal<br>
>> >><br>
>> >> ><br>
>> >> > Cheers.<br>
>> >> ><br>
>> >> > Tim.<br>
>> >> > _______________________________________________<br>
>> >> > cfe-dev mailing list<br>
>> >> > <a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a><br>
>> >> > <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br>
>> >> ><br>
>> >><br>
>> >> --<br>
>> >> Hal Finkel<br>
>> >> Assistant Computational Scientist<br>
>> >> Leadership Computing Facility<br>
>> >> Argonne National Laboratory<br>
>> >> _______________________________________________<br>
>> >> LLVM Developers mailing list<br>
>> >> <a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
>> >> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>
>> ><br>
>> ><br>
>> ><br>
>> ><br>
><br>
><br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr">Best Regards,<div><br></div><div>Kevin Qin</div></div>
</div>