[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

Duncan P. N. Exon Smith via llvm-dev llvm-dev at lists.llvm.org
Wed Feb 24 18:51:38 PST 2016


> On 2016-Feb-24, at 15:57, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Hi all,
> 
> This is something that came up in the "RFC: Add guard intrinsics to
> LLVM" thread; and while I'm not exactly blocked on this, figuring out
> a path forward here will be helpful in deciding if we can use the
> available_externally linkage type to expression certain semantic
> properties guard intrinsics will have.
> 
> Let's start with an example that shows that we have a problem (direct
> copy/paste from the guard intrinsics thread). Say we have:
> 
> ```
> void foo() available_externally {
>  %t0 = load atomic %ptr
>  %t1 = load atomic %ptr
>  if (%t0 != %t1) print("X");
> }
> void main() {
>  foo();
>  print("Y");
> }
> ```
> 
> The possible behaviors of the above program are {print("X"),
> print("Y")} or {print("Y")}.  But if we run opt then we have
> 
> ```
> void foo() available_externally readnone nounwind {
>  ;; After CSE'ing the two loads and folding the condition
> }
> void main() {
>  foo();
>  print("Y");
> }
> ```
> 
> and some generic reordering
> 
> ```
> void foo() available_externally readnone nounwind {
>  ;; After CSE'ing the two loads and folding the condition
> }
> void main() {
>  print("Y");
>  foo();  // legal since we're moving a readnone nounwind function that
>          // was guaranteed to execute (hence can't have UB)
> }
> ```
> 
> If we do not inline @foo(), and instead re-link the call site in @main
> to some non-optimized copy (or differently optimized copy) of @foo,
> then it is possible for the program to have the behavior {print("Y");
> print ("X")}, which was disallowed in the earlier program.
> 
> In other words, opt refined the semantics of @foo() (i.e. reduced the
> set of behaviors it may have) in ways that would make later
> optimizations invalid if we de-refine the implementation of @foo().

I'm probably missing something obvious here.  How could the result of
`%t0 != %t1` be different at optimization time in one file than from
runtime in the "real" implementation?  Doesn't this make the CSE
invalid?

Does linkonce_odr linkage have the same problem?
- If so, do you want to change it too?
- Else, why not?

> The above example is clearly fabricated, but such cases can come up
> even if everything is optimized to the same level.  E.g. one of the
> atomic loads in the unrefined implementation of @foo() could have been
> hidden behind a function call, whose body existed in only one module.
> That module would then be able to refine @foo() to `ret void` but
> other modules won't.
> 
> The only solution I can think of is to redefine available_externally
> to mean "the only kind of IPO/IPA you can do over a call to this
> function is to inline it".  Redefining available_externally this way
> will also let us soundly use it to represent calls to functions that
> have guard intrinsics, since a failed guard intrinsic basically
> replaces the function with a "very de-refined" implementation (the
> interpreter).
> 
> What do you think?  I don't think implementing the above above will be
> very difficult, but needless to say, it will still be a fairly
> non-trivial semantic change (hence I'm not directly jumping to
> implementation).

This linkage is used in three places (that I know of) by clang:

  1. C-style `inline` functions.
  2. Functions defined in C++ template classes with external explicit
     instantiations, e.g. S::foo() in:

         template <class T> struct S { void foo() {} };
         void bar() { S<int>().foo(); }
         extern template struct S<int>;

  3. -flto=thin cross-module function importing.

(No comment on (1); its exact semantics are a little fuzzy to me.)
For (2) and (3), the current behaviour seems correct, and I'd be
hesitant to lose optimizing power.  (2) is under the "ODR" rule, and
I think we've been applying the same logic to (3).  Unless, are you
saying ODR isn't enough?

Assuming you need this new definition (but under ODR, the semantics
are correct), I would rather split the linkage than change it.  E.g.,
use a name like available_externally_odr for (2) and (3).



More information about the llvm-dev mailing list