[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
Duncan P. N. Exon Smith via llvm-dev
llvm-dev at lists.llvm.org
Wed Feb 24 18:51:38 PST 2016
> On 2016-Feb-24, at 15:57, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi all,
>
> This is something that came up in the "RFC: Add guard intrinsics to
> LLVM" thread; and while I'm not exactly blocked on this, figuring out
> a path forward here will be helpful in deciding if we can use the
> available_externally linkage type to expression certain semantic
> properties guard intrinsics will have.
>
> Let's start with an example that shows that we have a problem (direct
> copy/paste from the guard intrinsics thread). Say we have:
>
> ```
> void foo() available_externally {
> %t0 = load atomic %ptr
> %t1 = load atomic %ptr
> if (%t0 != %t1) print("X");
> }
> void main() {
> foo();
> print("Y");
> }
> ```
>
> The possible behaviors of the above program are {print("X"),
> print("Y")} or {print("Y")}. But if we run opt then we have
>
> ```
> void foo() available_externally readnone nounwind {
> ;; After CSE'ing the two loads and folding the condition
> }
> void main() {
> foo();
> print("Y");
> }
> ```
>
> and some generic reordering
>
> ```
> void foo() available_externally readnone nounwind {
> ;; After CSE'ing the two loads and folding the condition
> }
> void main() {
> print("Y");
> foo(); // legal since we're moving a readnone nounwind function that
> // was guaranteed to execute (hence can't have UB)
> }
> ```
>
> If we do not inline @foo(), and instead re-link the call site in @main
> to some non-optimized copy (or differently optimized copy) of @foo,
> then it is possible for the program to have the behavior {print("Y");
> print ("X")}, which was disallowed in the earlier program.
>
> In other words, opt refined the semantics of @foo() (i.e. reduced the
> set of behaviors it may have) in ways that would make later
> optimizations invalid if we de-refine the implementation of @foo().
I'm probably missing something obvious here. How could the result of
`%t0 != %t1` be different at optimization time in one file than from
runtime in the "real" implementation? Doesn't this make the CSE
invalid?
Does linkonce_odr linkage have the same problem?
- If so, do you want to change it too?
- Else, why not?
> The above example is clearly fabricated, but such cases can come up
> even if everything is optimized to the same level. E.g. one of the
> atomic loads in the unrefined implementation of @foo() could have been
> hidden behind a function call, whose body existed in only one module.
> That module would then be able to refine @foo() to `ret void` but
> other modules won't.
>
> The only solution I can think of is to redefine available_externally
> to mean "the only kind of IPO/IPA you can do over a call to this
> function is to inline it". Redefining available_externally this way
> will also let us soundly use it to represent calls to functions that
> have guard intrinsics, since a failed guard intrinsic basically
> replaces the function with a "very de-refined" implementation (the
> interpreter).
>
> What do you think? I don't think implementing the above above will be
> very difficult, but needless to say, it will still be a fairly
> non-trivial semantic change (hence I'm not directly jumping to
> implementation).
This linkage is used in three places (that I know of) by clang:
1. C-style `inline` functions.
2. Functions defined in C++ template classes with external explicit
instantiations, e.g. S::foo() in:
template <class T> struct S { void foo() {} };
void bar() { S<int>().foo(); }
extern template struct S<int>;
3. -flto=thin cross-module function importing.
(No comment on (1); its exact semantics are a little fuzzy to me.)
For (2) and (3), the current behaviour seems correct, and I'd be
hesitant to lose optimizing power. (2) is under the "ODR" rule, and
I think we've been applying the same logic to (3). Unless, are you
saying ODR isn't enough?
Assuming you need this new definition (but under ODR, the semantics
are correct), I would rather split the linkage than change it. E.g.,
use a name like available_externally_odr for (2) and (3).
More information about the llvm-dev
mailing list