<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div><br><br>Sent from my iPhone</div><div><br>On Feb 24, 2016, at 9:41 PM, Chandler Carruth via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br><br></div><blockquote type="cite"><div><div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Wed, Feb 24, 2016 at 9:35 PM Hal Finkel <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">----- Original Message -----<br>

<br>

> From: "Chandler Carruth via llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>

> To: "Philip Reames" <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>>, "Duncan P. N. Exon<br>

> Smith" <<a href="mailto:dexonsmith@apple.com" target="_blank">dexonsmith@apple.com</a>>, "Sanjoy Das"<br>

> <<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpointers.com</a>><br>

> Cc: "llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>

> Sent: Wednesday, February 24, 2016 10:29:23 PM<br>

> Subject: Re: [llvm-dev] Possible soundness issue with<br>

> available_externally (split from "RFC: Add guard intrinsics")<br>

<br>

> Yea, I'm pretty sad about all of this. I'm also not seeing a lot of<br>

> awesome paths forward.<br>

<br>

> Here is the least bad strategy I can come up with. Curious if folks<br>

> think this is sufficient:<br>

<br>

I may not completely understand the problem, but this seems like overkill. The underlying restriction is that, if the compiler makes a non-determinism-collapsing choice when optimizing a function, it must make the same choice for all definitions of that function (undefined behavior excluded).</blockquote><div><br></div><div>This isn't enough, because some definition in some other module may *not be optimized at all*, and yet may get selected at link time.</div><div><br></div><div>Put another way, it must *prove* that the same choice will *always* be made for all definitions. This is akin to proving that the optimizer is run over all translation units for C++ linkonce_odr functions, which you can't do.</div><div><br></div><div>The result would be failing to optimize the bodies of linkonce_odr functions in any way which was externally detectable such as this. I think that would be *much* worse than losing the ability to do function attribute deduction for such functions?</div><div><br></div></div></div></div></blockquote>If you are prior to optimising the function in any way (which I think is a reasonable thing to require as you said before) then any attributes added there will at least be conservatively correct. That will still allow a bunch of optimisation, I hope!<div><br></div><div>So for the following function:</div><div><br></div><div>int foo(int *i) {</div><div>  if (*i == 1)</div><div>    return 0;</div><div>  return 1;</div><div>}</div><div><br></div><div>You can legitimately add readonly and propagate to callers that the returned value has known zero bits > 1.</div><div><br></div><div>And for this similar function:</div><div><br></div><div><div><span style="background-color: rgba(255, 255, 255, 0);">int foo(int *i, bool b) {</span></div><div><span style="background-color: rgba(255, 255, 255, 0);">  if (b) {</span></div><div><span style="background-color: rgba(255, 255, 255, 0);">    *i = 1;</span></div><div><span style="background-color: rgba(255, 255, 255, 0);">    return 0;</span></div><div><span style="background-color: rgba(255, 255, 255, 0);">  }</span></div><div><span style="background-color: rgba(255, 255, 255, 0);">  return 1;</span></div><div><span style="background-color: rgba(255, 255, 255, 0);">}</span></div><div><span style="background-color: rgba(255, 255, 255, 0);"><br></span></div><div><span style="background-color: rgba(255, 255, 255, 0);">You could legitimately optimise a call 'foo(p, false)' so that the call site propagates in the constants and knows that this call returns 1 for sure. </span></div><div><span style="background-color: rgba(255, 255, 255, 0);"><br></span></div><div><span style="background-color: rgba(255, 255, 255, 0);">But could you constant propagate (without inlining) in to the call and add readonly to the call instruction here? I think you can because you know that passing false for b means that the function will not take the path that loads memory.</span></div><div><span style="background-color: rgba(255, 255, 255, 0);"><br></span></div><div><span style="background-color: rgba(255, 255, 255, 0);">Pete</span></div><blockquote type="cite"><div><div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Thus, with an externally_available function, the CSE in Sanjoy's original example should be forbidden. Richard's example again demonstrates this principle, although in this case the non-determinism is in the choice of a globally-visible implementation technique rather than non-determinism from memory-subsystem reordering.<br>

<br>

There is a complication, which you imply in your proposal, that such optimizations need to be forbidden not just in the externally_available functions themselves, but in any local function transitively called by one. This, however, we can take care of with an (easily-deduced) attribute.<br>

<br>

In short, it is not clear to me that the number of problematic optimizations is large (seems likely restricted to things involving atomics in practice), and while I understand the auditing difficulties here, we should just restrict these in appropriate contexts instead of trying to restrict all information flow into or out of comdats.<br>

<br>

 -Hal<br>

<br>

> 1) Stop deducing function attributes within comdats by examining the<br>

> bodies of the functions (so that we remain free to transform the<br>

> bodies of functions).<br>

> 2) Teach frontends to emit (even at O0!!!) trivially deduced function<br>

> attributes for comdats so that we continue to catch easy cases.<br>

> 3) Ensure and specify that we never hoist code *into* a comdat group<br>

> in which it would not have been executed previously. I don't know of<br>

> anything in LLVM that does this today, but it would become an<br>

> important invariant.<br>

> 4) Work a lot harder to do internalizing and removing of this<br>

> restriction.<br>

<br>

> Pretty horrible. But I think it is correct.<br>

<br>

> As a slight modification to #1 and #2, we could have a very carefully<br>

> crafted deduction rule where we only deduce function attributes for<br>

> functions prior to any modification of their function bodies. Such<br>

> attributes should be conservatively correct because we would never<br>

> lift new code into the function bodies. This would at least allow us<br>

> to do bottom-up deduction to catch interprocedural cases. But it<br>

> would become incredibly subtle that this is only valid prior to<br>

> *any* transformations of the comdat-containing functions.<br>

<br>

> I'm starting to think this subtle rule might be worth it. But I'm<br>

> frankly terrified by the implications.<br>

<br>

> On Wed, Feb 24, 2016 at 8:13 PM Philip Reames via llvm-dev <<br>

> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> > wrote:<br>

<br>

> > On 02/24/2016 08:10 PM, Duncan P. N. Exon Smith via llvm-dev wrote:<br>

><br>

> > >> On 2016-Feb-24, at 19:46, Sanjoy Das <<br>

> > >> <a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpointers.com</a> > wrote:<br>

><br>

> > >><br>

><br>

> > >> On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth <<br>

> > >> <a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a> > wrote:<br>

><br>

> > >>> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith<br>

><br>

> > >>> < <a href="mailto:dexonsmith@apple.com" target="_blank">dexonsmith@apple.com</a> > wrote:<br>

><br>

> > >>>><br>

><br>

> > >>>>> On 2016-Feb-24, at 19:17, Chandler Carruth <<br>

> > >>>>> <a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a> > wrote:<br>

><br>

> > >>>>><br>

><br>

> > >>>>> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev<br>

><br>

> > >>>>> < <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a> > wrote:<br>

><br>

> > >>>>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith<br>

><br>

> > >>>>> < <a href="mailto:dexonsmith@apple.com" target="_blank">dexonsmith@apple.com</a> > wrote:<br>

><br>

> > >>>>>>> If we do not inline @foo(), and instead re-link the call<br>

> > >>>>>>> site<br>

> > >>>>>>> in<br>

><br>

> > >>>>>>> @main<br>

><br>

> > >>>>>>> to some non-optimized copy (or differently optimized copy)<br>

> > >>>>>>> of<br>

> > >>>>>>> @foo,<br>

><br>

> > >>>>>>> then it is possible for the program to have the behavior<br>

> > >>>>>>> {print("Y");<br>

><br>

> > >>>>>>> print ("X")}, which was disallowed in the earlier program.<br>

><br>

> > >>>>>>><br>

><br>

> > >>>>>>> In other words, opt refined the semantics of @foo() (i.e.<br>

> > >>>>>>> reduced the<br>

><br>

> > >>>>>>> set of behaviors it may have) in ways that would make later<br>

><br>

> > >>>>>>> optimizations invalid if we de-refine the implementation of<br>

> > >>>>>>> @foo().<br>

><br>

> > >>>>>> I'm probably missing something obvious here. How could the<br>

> > >>>>>> result of<br>

><br>

> > >>>>>> `%t0 != %t1` be different at optimization time in one file<br>

> > >>>>>> than from<br>

><br>

> > >>>>>> runtime in the "real" implementation? Doesn't this make the<br>

> > >>>>>> CSE<br>

><br>

> > >>>>>> invalid?<br>

><br>

> > >>>>> `%t0` and `%t1` are "allowed" to "always be the same", i.e.<br>

> > >>>>> an<br>

><br>

> > >>>>> implementation of @foo that always feeds in the same<br>

><br>

> > >>>>> value for `%t0` and `%t1` is a valid implementation (which is<br>

> > >>>>> why the<br>

><br>

> > >>>>> CSE was valid); but it is not the *only* valid<br>

> > >>>>> implementation.<br>

> > >>>>> If I<br>

><br>

> > >>>>> don't CSE the two load instructions (also a valid thing to<br>

> > >>>>> do),<br>

> > >>>>> and<br>

><br>

> > >>>>> this is a second thread writing to `%par`, then the two<br>

> > >>>>> values<br>

> > >>>>> loaded<br>

><br>

> > >>>>> can be different, and you could end up printing `"X"` in<br>

> > >>>>> `@foo`.<br>

><br>

> > >>>>><br>

><br>

> > >>>>> Did that make sense?<br>

><br>

> > >>>> Yes. To be sure I understand the scope: this is only a problem<br>

> > >>>> for<br>

><br>

> > >>>> atomics, correct? (Because multi-threaded behaviour with other<br>

> > >>>> globals<br>

><br>

> > >>>> is UB?)<br>

><br>

> > >>>><br>

><br>

> > >>>>>> Does linkonce_odr linkage have the same problem?<br>

><br>

> > >>>>>> - If so, do you want to change it too?<br>

><br>

> > >>>>>> - Else, why not?<br>

><br>

> > >>>>> Going by the specification in the LangRef, I'd say it depends<br>

> > >>>>> on how<br>

><br>

> > >>>>> you define "definitive". If you're allowed to replace the<br>

> > >>>>> body<br>

> > >>>>> of a<br>

><br>

> > >>>>> function with a differently optimized body, then the above<br>

> > >>>>> problem<br>

><br>

> > >>>>> exists.<br>

><br>

> > >>>>><br>

><br>

> > >>>>> I believe that is the case, and I strongly believe the<br>

> > >>>>> problem<br>

> > >>>>> you<br>

><br>

> > >>>>> outline exists for linkonce_odr exactly as it does for<br>

> > >>>>> available_externally.<br>

><br>

> > >>>>><br>

><br>

> > >>>>> Which is what makes this scary: every C++ inline function<br>

> > >>>>> today<br>

> > >>>>> can<br>

><br>

> > >>>>> trigger this.<br>

><br>

> > >>>> Every C/C++ inline or template function. But only the ones<br>

> > >>>> that<br>

> > >>>> use<br>

><br>

> > >>>> atomics, right?<br>

><br>

> > >>><br>

><br>

> > >>> Well, with *this* example...<br>

><br>

> > >> Atomic are one source of non-determinism that compilers can<br>

> > >> reason<br>

><br>

> > >> about. I don't know if the following snippet is well defined or<br>

> > >> not,<br>

><br>

> > >> but you could have similar issues with<br>

><br>

> > >><br>

><br>

> > >><br>

><br>

> > >> void foo() {<br>

><br>

> > >> int *p = malloc(sizeof(int));<br>

><br>

> > >> if (*p < 10) print("X");<br>

><br>

> > >> }<br>

><br>

> > >><br>

><br>

> > >> or (again, I don't know if this is actually well defined)<br>

><br>

> > >><br>

><br>

> > >> void foo() {<br>

><br>

> > >> int t; // it is probably reasonable to fold compares with<br>

><br>

> > >> ptrtoint(alloca) to undef<br>

><br>

> > >> if ((intptr_t)(&t) < 10) print("X");<br>

><br>

> > >> }<br>

><br>

> > >><br>

><br>

> > > The first one at least is UB, but as Richard pointed out the<br>

> > > scope<br>

><br>

> > > is certainly broader than atomics (it's not even just<br>

> > > well-defined<br>

><br>

> > > non-deterministism).<br>

><br>

> > ><br>

><br>

> > > I'm kind of terrified by the implications.<br>

><br>

> > Me too. :(<br>

><br>

> > ><br>

><br>

> > >> -- Sanjoy<br>

><br>

> > >><br>

><br>

> > >>>><br>

><br>

> > >>>> Not that I'm sure that will end up being a helpful<br>

> > >>>> distinction.<br>

><br>

> > >>><br>

><br>

> > >>> Right. See Richard's comment. I think that sums up the real<br>

> > >>> issue<br>

> > >>> here. =/<br>

><br>

> > > _______________________________________________<br>

><br>

> > > LLVM Developers mailing list<br>

><br>

> > > <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

><br>

> > > <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

><br>

<br>

> > _______________________________________________<br>

><br>

> > LLVM Developers mailing list<br>

><br>

> > <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

><br>

> > <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

><br>

<br>

> _______________________________________________<br>

> LLVM Developers mailing list<br>

> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

<br>

--<br>

<br>

--<br>

Hal Finkel<br>

Assistant Computational Scientist<br>

Leadership Computing Facility<br>

Argonne National Laboratory<br>

</blockquote></div></div>

</div></blockquote><blockquote type="cite"><div><span>_______________________________________________</span><br><span>LLVM Developers mailing list</span><br><span><a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a></span><br><span><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a></span><br></div></blockquote></div></body></html>