[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

Wed Feb 24 20:50:24 PST 2016

A clarification pointed out by David Majnemer: what I"m really talking
about is "comdat or comdat-semantic-equivalents" which include linkonce_odr
and available_externally.

As a further strategy to recover optimizations:

On platforms with comdat support, we could also teach function attrs that
deduction is safe for functions which are only called from within their own
comdat. And then we could do some work in Clang (or other FEs) to try to
form larger comdats when safe to do so. For example with C++, inline method
definitions inside of class bodies could have a merged comdat I think
because the class body can't change between translation units. This would
in turn allow, for example, continuing to get full deduction for private
inline methods inside of class bodies whose callers were all also inline
and inside the class body.

We could probably also do some tricks where we actually expand comdats
themselves in order to enable deduction and subsequnet optimizations,
although that might need a cost model.

-Chandler

On Wed, Feb 24, 2016 at 8:29 PM Chandler Carruth <chandlerc at google.com>
wrote:

> Yea, I'm pretty sad about all of this. I'm also not seeing a lot of
> awesome paths forward.
>
> Here is the least bad strategy I can come up with. Curious if folks think
> this is sufficient:
>
> 1) Stop deducing function attributes within comdats by examining the
> bodies of the functions (so that we remain free to transform the bodies of
> functions).
> 2) Teach frontends to emit (even at O0!!!) trivially deduced function
> attributes for comdats so that we continue to catch easy cases.
> 3) Ensure and specify that we never hoist code *into* a comdat group in
> which it would not have been executed previously. I don't know of anything
> in LLVM that does this today, but it would become an important invariant.
> 4) Work a lot harder to do internalizing and removing of this restriction.
>
> Pretty horrible. But I think it is correct.
>
> As a slight modification to #1 and #2, we could have a very carefully
> crafted deduction rule where we only deduce function attributes for
> functions prior to any modification of their function bodies. Such
> attributes should be conservatively correct because we would never lift new
> code into the function bodies. This would at least allow us to do bottom-up
> deduction to catch interprocedural cases. But it would become incredibly
> subtle that this is only valid prior to *any* transformations of the
> comdat-containing functions.
>
> I'm starting to think this subtle rule might be worth it. But I'm frankly
> terrified by the implications.
>
> On Wed, Feb 24, 2016 at 8:13 PM Philip Reames via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>>
>> On 02/24/2016 08:10 PM, Duncan P. N. Exon Smith via llvm-dev wrote:
>> >> On 2016-Feb-24, at 19:46, Sanjoy Das <sanjoy at playingwithpointers.com>
>> wrote:
>> >>
>> >> On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth <
>> chandlerc at google.com> wrote:
>> >>> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith
>> >>> <dexonsmith at apple.com> wrote:
>> >>>>
>> >>>>> On 2016-Feb-24, at 19:17, Chandler Carruth <chandlerc at google.com>
>> wrote:
>> >>>>>
>> >>>>> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev
>> >>>>> <llvm-dev at lists.llvm.org> wrote:
>> >>>>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
>> >>>>> <dexonsmith at apple.com> wrote:
>> >>>>>>> If we do not inline @foo(), and instead re-link the call site in
>> >>>>>>> @main
>> >>>>>>> to some non-optimized copy (or differently optimized copy) of
>> @foo,
>> >>>>>>> then it is possible for the program to have the behavior
>> {print("Y");
>> >>>>>>> print ("X")}, which was disallowed in the earlier program.
>> >>>>>>>
>> >>>>>>> In other words, opt refined the semantics of @foo() (i.e. reduced
>> the
>> >>>>>>> set of behaviors it may have) in ways that would make later
>> >>>>>>> optimizations invalid if we de-refine the implementation of
>> @foo().
>> >>>>>> I'm probably missing something obvious here.  How could the result
>> of
>> >>>>>> `%t0 != %t1` be different at optimization time in one file than
>> from
>> >>>>>> runtime in the "real" implementation?  Doesn't this make the CSE
>> >>>>>> invalid?
>> >>>>> `%t0` and `%t1` are "allowed" to "always be the same", i.e. an
>> >>>>> implementation of @foo that always feeds in the same
>> >>>>> value for `%t0` and `%t1` is a valid implementation (which is why
>> the
>> >>>>> CSE was valid); but it is not the *only* valid implementation.  If I
>> >>>>> don't CSE the two load instructions (also a valid thing to do), and
>> >>>>> this is a second thread writing to `%par`, then the two values
>> loaded
>> >>>>> can be different, and you could end up printing `"X"` in `@foo`.
>> >>>>>
>> >>>>> Did that make sense?
>> >>>> Yes.  To be sure I understand the scope: this is only a problem for
>> >>>> atomics, correct?  (Because multi-threaded behaviour with other
>> globals
>> >>>> is UB?)
>> >>>>
>> >>>>>> Does linkonce_odr linkage have the same problem?
>> >>>>>> - If so, do you want to change it too?
>> >>>>>> - Else, why not?
>> >>>>> Going by the specification in the LangRef, I'd say it depends on how
>> >>>>> you define "definitive".  If you're allowed to replace the body of a
>> >>>>> function with a differently optimized body, then the above problem
>> >>>>> exists.
>> >>>>>
>> >>>>> I believe that is the case, and I strongly believe the problem you
>> >>>>> outline exists for linkonce_odr exactly as it does for
>> available_externally.
>> >>>>>
>> >>>>> Which is what makes this scary: every C++ inline function today can
>> >>>>> trigger this.
>> >>>> Every C/C++ inline or template function.  But only the ones that use
>> >>>> atomics, right?
>> >>>
>> >>> Well, with *this* example...
>> >> Atomic are one source of non-determinism that compilers can reason
>> >> about.  I don't know if the following snippet is well defined or not,
>> >> but you could have similar issues with
>> >>
>> >>
>> >>   void foo() {
>> >>     int *p = malloc(sizeof(int));
>> >>     if (*p < 10) print("X");
>> >>   }
>> >>
>> >> or (again, I don't know if this is actually well defined)
>> >>
>> >>   void foo() {
>> >>     int t;  // it is probably reasonable to fold compares with
>> >> ptrtoint(alloca) to undef
>> >>     if ((intptr_t)(&t) < 10) print("X");
>> >>   }
>> >>
>> > The first one at least is UB, but as Richard pointed out the scope
>> > is certainly broader than atomics (it's not even just well-defined
>> > non-deterministism).
>> >
>> > I'm kind of terrified by the implications.
>> Me too.  :(
>> >
>> >> -- Sanjoy
>> >>
>> >>>>
>> >>>> Not that I'm sure that will end up being a helpful distinction.
>> >>>
>> >>> Right. See Richard's comment. I think that sums up the real issue
>> here. =/
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > llvm-dev at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160225/aa8d0f6b/attachment.html>