[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

Wed Feb 24 19:09:57 PST 2016

On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
<dexonsmith at apple.com> wrote:
>> If we do not inline @foo(), and instead re-link the call site in @main
>> to some non-optimized copy (or differently optimized copy) of @foo,
>> then it is possible for the program to have the behavior {print("Y");
>> print ("X")}, which was disallowed in the earlier program.
>>
>> In other words, opt refined the semantics of @foo() (i.e. reduced the
>> set of behaviors it may have) in ways that would make later
>> optimizations invalid if we de-refine the implementation of @foo().
>
> I'm probably missing something obvious here.  How could the result of
> `%t0 != %t1` be different at optimization time in one file than from
> runtime in the "real" implementation?  Doesn't this make the CSE
> invalid?

`%t0` and `%t1` are "allowed" to "always be the same", i.e. an
implementation of @foo that always feeds in the same
value for `%t0` and `%t1` is a valid implementation (which is why the
CSE was valid); but it is not the *only* valid implementation.  If I
don't CSE the two load instructions (also a valid thing to do), and
this is a second thread writing to `%par`, then the two values loaded
can be different, and you could end up printing `"X"` in `@foo`.

Did that make sense?

> Does linkonce_odr linkage have the same problem?
> - If so, do you want to change it too?
> - Else, why not?

Going by the specification in the LangRef, I'd say it depends on how
you define "definitive".  If you're allowed to replace the body of a
function with a differently optimized body, then the above problem
exists.

>> The above example is clearly fabricated, but such cases can come up
>> even if everything is optimized to the same level.  E.g. one of the
>> atomic loads in the unrefined implementation of @foo() could have been
>> hidden behind a function call, whose body existed in only one module.
>> That module would then be able to refine @foo() to `ret void` but
>> other modules won't.
>>
>> The only solution I can think of is to redefine available_externally
>> to mean "the only kind of IPO/IPA you can do over a call to this
>> function is to inline it".  Redefining available_externally this way
>> will also let us soundly use it to represent calls to functions that
>> have guard intrinsics, since a failed guard intrinsic basically
>> replaces the function with a "very de-refined" implementation (the
>> interpreter).
>>
>> What do you think?  I don't think implementing the above above will be
>> very difficult, but needless to say, it will still be a fairly
>> non-trivial semantic change (hence I'm not directly jumping to
>> implementation).
>
> This linkage is used in three places (that I know of) by clang:
>
>   1. C-style `inline` functions.
>   2. Functions defined in C++ template classes with external explicit
>      instantiations, e.g. S::foo() in:
>
>          template <class T> struct S { void foo() {} };
>          void bar() { S<int>().foo(); }
>          extern template struct S<int>;
>
>   3. -flto=thin cross-module function importing.
>
> (No comment on (1); its exact semantics are a little fuzzy to me.)
> For (2) and (3), the current behaviour seems correct, and I'd be
> hesitant to lose optimizing power.  (2) is under the "ODR" rule, and
> I think we've been applying the same logic to (3).  Unless, are you
> saying ODR isn't enough?

By ODR, do you mean you only have one definition of the function in
the whole link (i.e. across all modules you'll link together)?
Then yes, ODR should be enough to avoid this.  But in any place where
the linker sees two differently optimized definitions for a function
and picks one as the definitive version all non-inlined calls link to,
we have this problem.

> Assuming you need this new definition (but under ODR, the semantics
> are correct), I would rather split the linkage than change it.  E.g.,
> use a name like available_externally_odr for (2) and (3).

If what I said above is correct (i.e. ODR == OD across everything
you're linking into your final executable) then splitting the linkage
/ adding a new one is probably the best alternative.

-- Sanjoy