<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 18, 2015 at 9:04 AM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div style="font-family:arial,helvetica,sans-serif;font-size:10pt;color:#000000"><br><br><hr><blockquote style="border-left:2px solid rgb(16,16,255);margin-left:5px;padding-left:5px;color:rgb(0,0,0);font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt"><span class=""><b>From: </b>"Daniel Berlin" <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>><br><b>To: </b>"Jeroen Dobbelaere" <<a href="mailto:jeroen.dobbelaere@gmail.com" target="_blank">jeroen.dobbelaere@gmail.com</a>><br></span><b>Cc: </b>"Hal Finkel" <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>, <a href="mailto:reviews%2BD8056%2Bpublic%2B7b63cedb34d51bb1@reviews.llvm.org" target="_blank">reviews+D8056+public+7b63cedb34d51bb1@reviews.llvm.org</a>, <a href="mailto:cfe-commits@cs.uiuc.edu" target="_blank">cfe-commits@cs.uiuc.edu</a>, "<a href="mailto:cfe-dev@cs.uiuc.edu" target="_blank">cfe-dev@cs.uiuc.edu</a> Developers" <<a href="mailto:cfe-dev@cs.uiuc.edu" target="_blank">cfe-dev@cs.uiuc.edu</a>><br><b>Sent: </b>Wednesday, March 18, 2015 10:27:28 AM<span class=""><br><b>Subject: </b>Re: [PATCH] Fix for bug 21725: wrong results with union and strict-aliasing<br><br></span><div dir="ltr"><br><div class="gmail_extra"><br><span class=""><div class="gmail_quote">On Wed, Mar 18, 2015 at 2:22 AM, Jeroen Dobbelaere <span dir="ltr"><<a href="mailto:jeroen.dobbelaere@gmail.com" target="_blank">jeroen.dobbelaere@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span>On Tue, Mar 17, 2015 at 11:55 PM, Daniel Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span> wrote:<br></span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">[..]<div class="gmail_extra"><div class="gmail_quote"><span><div><br></div></span><span><div>In theory, the last time i remember, you weren't allow to set one member of a union and read another.</div><div>But uh, that's not real user code :)</div><div><br></div><div>(and IIRC, it does not say anything real)</div><span><div> <br></div></span></span></div></div></div></blockquote><div>This is indeed correct. The main issue is that llvm thinks it is allowed to reorder 'stores' (as it thinks they are not aliasing),<br>so that a legal read afterwards will result in a wrong answer (if an optimization or the backend reorders the stores based on the wrong aliasing information).<br></div></div></div></div></blockquote><div><br></div><div>I thought both LLVM and GCC guaranteed type punning through unions would work?</div></div></span></div></div></blockquote><br>Yes, LLVM does this (at least for "local" type punning, for some heuristic definition of local), because we always allow BasicAA to override TBAA when BasicAA is *sure* there is aliasing.<span class=""><br><br><blockquote style="border-left:2px solid rgb(16,16,255);margin-left:5px;padding-left:5px;color:rgb(0,0,0);font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><span><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>If you access a member (or nested member) of a union, starting from the union itself, then it depends if the other type is also accessible through the union.<br><br><br></div><div>So:<br><br></div><div>int foo(union foo* a, float* b, int* c) {<br></div><div> a->a=1;<br></div><div> *b=2;<br></div><div> // compiler must assume a->a and *b can alias<br></div><div> // compiler must not assume *b and *c alias (access not through union)<br></div><div>}<br><br></div><div>(Also see section 3.10 of the c++03 standard;</div></div></div></div></blockquote><div><br></div><div><br></div></span><div>This, IMHO, does not say what you seem to think it does :)</div><div><br></div><div>For C++03, 3.10 only includes the word "union" here: "If a program attempts to access the stored value of an object through an lvalue of other than one of the following
types the behavior is undefined: </div><div><br></div><div>— the dynamic type of the object, </div><div>— a cv-qualified version of the dynamic type of the object, </div><div>— a type that is the signed or unsigned type corresponding to the dynamic type of the object,</div><div> — a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of
the object, </div><div>— an aggregate or union type that includes one of the aforementioned types among its members (including,
recursively, a member of a subaggregate or contained union),</div><div> — a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,</div><div> — a char or unsigned char type."<br><div><br><br></div><div>C++ standard experts, at least on the GCC side, did not view this as saying "all accesses must have an explicit union access", but that "It must be part of a union type", but about whether you try to access it through a union that doesn't have the right actual types in it.</div></div><div><br></div><div>The type of those objects is right the type of the object. There is, IMHO, nothing illegal about those accesses.</div><div><br></div><div><br></div></div></div></div></blockquote></span><div><br>So, with that interpretation, the mere presence of a 'union { short s; int i; }' (in the same or a different compilation unit) should influence the (type based) aliasing of all short and int pointers ?<br></div></div></div></div></blockquote><div><br></div><div>It's actually worse than this interpretation, since someone pointed out on the gcc side that "ihateunions" could be in a different TU, and you'd never see the union at all.<br></div><div>While I think this is the right interpretation (as do others), that's not the point.</div><div>I'm actually on your side here :)</div><div><br></div><div>But remember that Hal said that if hte language has semantics, we should follow them.</div></div></div></div></blockquote><br></span>I did. It is also true that standards have defects ;) If a strict reading of the language semantics implies a 'spooky action at a distance' effect, then we'll need to draw a line in the sand somewhere to avoid that. I also think your interpretation that implies that is incorrect...<span class=""><br><br><blockquote style="border-left:2px solid rgb(16,16,255);margin-left:5px;padding-left:5px;color:rgb(0,0,0);font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><div><br></div><div>The whole point of this is essentially to point out that the language semantics weren't completely thought out here :)</div><div>Because of that, our options are either "make up a rule people can follow" or "disable TBAA".</div><div><br></div><div>GCC, and every other compiler i'm aware of, has chosen the former, at least for C++03 (I don't know what C++11 or 14 say here).</div></div></div></div></blockquote><br></span>Yes; fair enough. What I want is that if we make up a rule, then we make that rule well defined and documented (not just whatever the implementation happens to do today); and we should also write a paper for the C++ Standards Committee proposing that as an official solution (I'll be happy to present at the next meeting, and we have approximately 1 month before the next paper deadline, IIRC).<br><br>In the current draft, I relevant wording looks equivalent:<br><br>— an aggregate or union type that includes one of the aforementioned types among its elements<br> or non-static data members (including, recursively, an element or non-static data member of<span class=""><br> a subaggregate or contained union),<br><br><br></span>However, another point to consider is that, "<span>C++ standard experts, at least on the GCC side, did not
view this as saying "all accesses must have an explicit union access",
but that..." may not be a reasonable reading of the standard. The text says, "If a program attempts to access the stored value of an object through a glvalue of other than one<br>of the following types the behavior is undefined: ...", and so I read that as saying that the access must be through a glvalue of an aggregate or union type (with a relevant type as one of its members). So the union or aggregate type *must* be explicitly present in the access. I don't believe any other interpretations, especially those which imply non-local effects, are reasonable.</span><br><br>Exactly what rule did GCC implement?<br></div></div></blockquote><div><br></div><div>If you use an explicit union access, it assumes they alias, otherwise, all bets are off.</div><div><br></div><div>Note: As you all know, I have really no dog in this fight, nor am i a language lawyer. Among other things, i'm an aliasing guy. I just do the edge of what the language lawyers tell me is allowed :)</div><div><br></div></div></div></div>