[llvm-dev] RFC: Add an "interposible" linkage type (and implement -fsemantic-interposition)

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 29 10:02:45 PST 2016


----- Original Message -----

> From: "Eric Christopher" <echristo at gmail.com>
> To: "Reid Kleckner" <rnk at google.com>, "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Tuesday, November 29, 2016 11:34:56 AM
> Subject: Re: [llvm-dev] RFC: Add an "interposible" linkage type (and
> implement -fsemantic-interposition)

> On Tue, Nov 29, 2016 at 9:15 AM Reid Kleckner via llvm-dev <
> llvm-dev at lists.llvm.org > wrote:

> > I think that all makes sense. You're just adding the missing
> > non-ODR
> > conterpart of 'external' linkage. I could imagine having "external
> > /
> > external_odr" linkage for example.
> 

> > That said, do you think we should take the opportunity to split out
> > a
> > bit for interposability so that we can kill off the *_odr linkage
> > variants? Today's non-ODR weak functions would look more like this:
> 
> > define weak interposable void @foo() { ret void }
> 

> > We could probably preserve bitcode compatibility by continuing to
> > use
> > the old combined linkage encoding. This is useful because we want
> > the old weak to decode as weak+interposable and the old weak_odr to
> > decode as weak.
> 

> > Some more prior discussion: https://reviews.llvm.org/D19995#423481
> 

> +1

So we're all on the same page, our current encodings are: 

External: 0 
WeakAny: 16 
Appending: 2 
Internal: 3 
LinkOnceAny: 18 
ExternalWeak: 7 
Common: 8 
Private: 9 
WeakODR: 17 
LinkOnceODR: 19 
AvailableExternally: 12 

And the new encodings would be: 

External: 0 
External + Interposible: 20 
Weak + Interposible: 16 
Appending: 2 [interposible N/A] 
Internal: 3 [never interposible] 
LinkOnce + Interposible: 18 
ExternalWeak: 7 [always interposible] 
Common: 8 [interposible N/A] 
Private: 9 [never interposible] 
Weak: 17 
LinkOnce: 19 
AvailableExternally: 12 [never interposible] 

With the mappings: 

[Old] -> [New] 
External -> External 
WeakAny -> Weak + Interposible 
LinkOnceAny -> LinkOnce + Interposible 
WeakODR -> Weak 
LinkOnceODR -> LinkOnce 

Is this what you had in mind? 

Thanks again, 
Hal 

> -eric

> > On Tue, Nov 29, 2016 at 8:01 AM, Hal Finkel via llvm-dev <
> > llvm-dev at lists.llvm.org > wrote:
> 

> > > Hi everyone,
> > 
> 

> > > Clang/LLVM's support for ELF interposition is in a confusing
> > > state,
> > > and I propose making a few (hopefully simple) adjustments in
> > > order
> > > to bring our model into a self-consistent state.
> > 
> 

> > > The problem: On ELF systems, global symbols can be interposed.
> > > This
> > > means, for example, that calls to global functions in some
> > > (shared)
> > > library defined in that same library might end up being
> > > redirected
> > > to an implementation in some other library (or in the main
> > > executable). The most common reason for this is the use of
> > > LD_PRELOAD, but there are plenty of other ways to trigger
> > > interposition as well. As a result, it is technically
> > > inconsistent
> > > to inline any global function or do inter-procedural analysis on
> > > them because the implementation might be replaced by code with
> > > completely different behavior at runtme (or link time). Clang has
> > > never supported this (i.e. we do treat these functions as being
> > > eligible for inlining and perform IPA on them). GCC, on the other
> > > hand, has traditionally respected the possibility of ELF
> > > interposition and refrained from doing these things (at least
> > > when
> > > compiling with -fPIC).
> > 
> 

> > > I believe that Clang/LLVM's current behavior is the most-useful
> > > behavior and we should keep the current behavior (at least as a
> > > default). I do understand, however, that there are valid use
> > > cases
> > > for ELF interposition and places where we should allow it (e.g.
> > > when
> > > compiling certain system libraries). GCC recently added a flag
> > > -fsemantic-interposition/-fno-semantic-interposition, where using
> > > -fno-semantic-interposition provides Clang/LLVM's behavior of
> > > assuming that ELF interposition will not be used.
> > 
> 

> > > It has been suggested that, to be self consistent, LLVM should
> > > emit
> > > global symbols with protected ELF visibility in cases where we've
> > > assumed that ELF interposition won't happen. ELF protected
> > > visibility does seem to have exactly that meaning: A protected
> > > global symbol is externally visible but cannot be interposed.
> > > Unfortunately, as I understand it, on some major platforms (e.g.
> > > x86), protected-visibility symbols have a major flaw:
> > > Non-uniqueness
> > > of function pointers (i.e. the function pointer obtained to a
> > > function outside of the defining library might be different from
> > > the
> > > pointer obtained within the defining library). As a result,
> > > making
> > > this change might be practically prohibited (even if it makes
> > > sense
> > > in theory).
> > 
> 

> > > Proposal:
> > 
> 

> > > 1. Add a new linkage type, interposible, which is like external
> > > except that isInterposableLinkage will return true (thus
> > > preventing
> > > inlining, IPA, etc.). This is similar to weak linkage, in a
> > > sense,
> > > except that such symbols are never discarded and are not marked
> > > as
> > > weak for linking, etc.
> > 
> 

> > > 2. Add -fsemantic-interposition/-fno-semantic-interposition to
> > > Clang.
> > > Default to -fno-semantic-interposition, but when
> > > -fsemantic-interposition is used, use interposible linkage for
> > > all
> > > functions where external linkage might otherwise have been used.
> > 
> 

> > > Thoughts?
> > 
> 

> > > Some useful links:
> > 
> 
> > > http://hubicka.blogspot.com/2015/04/GCC5-IPA-LTO-news.html (the
> > > section on the -fno-semantic-interposition flag)
> > 
> 
> > > https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01671.html
> > 
> 

> > > On some issues with ELF protected-visibility symbols:
> > 
> 
> > > http://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/
> > 
> 
> > > http://www.airs.com/blog/archives/307
> > 
> 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19520
> > 
> 

> > > Thanks again,
> > 
> 
> > > Hal
> > 
> 

> > > P.S. For some previous discussion on this, see below...
> > 
> 

> > > > From: "Hal Finkel via llvm-dev" < llvm-dev at lists.llvm.org >
> > > 
> > 
> 
> > > > To: "James Y Knight" < jyknight at google.com >
> > > 
> > 
> 
> > > > Cc: "llvm-dev" < llvm-dev at lists.llvm.org >
> > > 
> > 
> 
> > > > Sent: Monday, February 29, 2016 9:50:15 AM
> > > 
> > 
> 
> > > > Subject: Re: [llvm-dev] Possible soundness issue with
> > > > available_externally (split from "RFC: Add guard intrinsics")
> > > 
> > 
> 

> > > > > From: "James Y Knight" < jyknight at google.com >
> > > > 
> > > 
> > 
> 
> > > > > To: "Hal Finkel" < hfinkel at anl.gov >
> > > > 
> > > 
> > 
> 
> > > > > Cc: "Sanjoy Das" < sanjoy at playingwithpointers.com >,
> > > > > "llvm-dev"
> > > > > <
> > > > > llvm-dev at lists.llvm.org >
> > > > 
> > > 
> > 
> 
> > > > > Sent: Monday, February 29, 2016 9:31:24 AM
> > > > 
> > > 
> > 
> 
> > > > > Subject: Re: [llvm-dev] Possible soundness issue with
> > > > > available_externally (split from "RFC: Add guard intrinsics")
> > > > 
> > > 
> > 
> 

> > > > > On Feb 26, 2016 8:50 PM, "Hal Finkel" < hfinkel at anl.gov >
> > > > > wrote:
> > > > 
> > > 
> > 
> 

> > > > > > > From: "James Y Knight via llvm-dev" <
> > > > > > > llvm-dev at lists.llvm.org
> > > > > > > >
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > To: "Sanjoy Das" < sanjoy at playingwithpointers.com >
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Cc: "llvm-dev" < llvm-dev at lists.llvm.org >
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Sent: Thursday, February 25, 2016 1:41:43 PM
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Subject: Re: [llvm-dev] Possible soundness issue with
> > > > > > > available_externally (split from "RFC: Add guard
> > > > > > > intrinsics")
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > While we're talking about this, I'd just mention again
> > > > > > > that
> > > > > > > the
> > > > > > > same
> > > > > > > issue arises for *normal* functions too, when linked into
> > > > > > > a
> > > > > > > shared
> > > > > > > library:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > int foo() { return 1; }
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > int bar() { return foo(); }
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > Now, compare:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > clang -fPIC -O1 -S -o - test.c
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > gcc -fPIC -O1 -S -o - test.c
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > GCC will refuse to inline foo into bar, or use any
> > > > > > > information
> > > > > > > about
> > > > > > > foo in compiling bar, because foo is exported in the
> > > > > > > dynamic
> > > > > > > symbol
> > > > > > > table, and thus replaceable via symbol interposition.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > Clang assumes that you won't do that, or that you don't
> > > > > > > care
> > > > > > > what
> > > > > > > happens if you do. It will happily inline. And, in
> > > > > > > absense
> > > > > > > of
> > > > > > > inlining (e.g. if foo is too long to inline), clang will
> > > > > > > deduce
> > > > > > > function attributes about foo and rely on those in bar --
> > > > > > > despite
> > > > > > > that the call goes through the PLT and could in fact be
> > > > > > > an
> > > > > > > entirely
> > > > > > > different unrelated implementation (or, for that matter,
> > > > > > > a
> > > > > > > differently-optimized version of the same
> > > > > > > implementation).
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > > Is that *really* okay?
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 

> > > > > > I'm comfortable with saying that symbol interposition falls
> > > > > > outside
> > > > > > of the model we have for the targeted system (at least by
> > > > > > default),
> > > > > > and thus, this is okay. We also don't model the possibility
> > > > > > of
> > > > > > someone hex-editing the binary ;)
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > I'm not really okay with it; the current behavior feels
> > > > > unprincipled.
> > > > 
> > > 
> > 
> 
> > > > > We have a visibility attribute which can be used to control
> > > > > this:
> > > > > On
> > > > > ELF systems, "default" visibililty allows interposition
> > > > > (unlike
> > > > > on
> > > > > Darwin) -- that is, it explicitly ALLOWS for replacing the
> > > > > symbol's
> > > > > definition. The policy of "You can't replace the definition
> > > > > of
> > > > > the
> > > > > symbol, but it is globally visible" is exactly what the
> > > > > "protected"
> > > > > visibility mode is for.
> > > > 
> > > 
> > 
> 

> > > > > If we want to say that you can't interpose by default on ELF
> > > > > targets,
> > > > > that would be a choice. Then, we should make the default
> > > > > symbol
> > > > > visibility "protected" instead of "default". But, continuing
> > > > > to
> > > > > generate calls through the PLT -- which is only needed
> > > > > because
> > > > > the
> > > > > symbols might be replaced -- while simultaneously making
> > > > > optimizations that are broken if they actually ARE replaced,
> > > > > seems
> > > > > kinda bogus.
> > > > 
> > > 
> > 
> 
> > > > This makes sense, and I think you understand my concern here:
> > > > Most
> > > > programmers don't understand these issues, nor do they ever
> > > > expect
> > > > to use dynamic interposition. They do expect, however, that the
> > > > compiler has good IPA and will use the information it is
> > > > provided
> > > > effectively. I'd be happy to make the default visibility
> > > > protected,
> > > > allowing us to continue optimizing well, and provide a
> > > > principled
> > > > behavior otherwise. Given, as you point out, this is the
> > > > default
> > > > on
> > > > Darwin, is there experience from Darwin porting, or any other
> > > > factors, that would indicate this would be a hardship?
> > > 
> > 
> 

> > > > Thanks again,
> > > 
> > 
> 
> > > > Hal
> > > 
> > 
> 

> > > > --
> > > 
> > 
> 

> > > > Hal Finkel
> > > 
> > 
> 
> > > > Assistant Computational Scientist
> > > 
> > 
> 
> > > > Leadership Computing Facility
> > > 
> > 
> 
> > > > Argonne National Laboratory
> > > 
> > 
> 

> > > > _______________________________________________
> > > 
> > 
> 
> > > > LLVM Developers mailing list
> > > 
> > 
> 
> > > > llvm-dev at lists.llvm.org
> > > 
> > 
> 
> > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > 
> > 
> 

> > > --
> > 
> 

> > > Hal Finkel
> > 
> 
> > > Lead, Compiler Technology and Programming Languages
> > 
> 
> > > Leadership Computing Facility
> > 
> 
> > > Argonne National Laboratory
> > 
> 

> > > _______________________________________________
> > 
> 
> > > LLVM Developers mailing list
> > 
> 
> > > llvm-dev at lists.llvm.org
> > 
> 
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > 
> 

> > _______________________________________________
> 
> > LLVM Developers mailing list
> 
> > llvm-dev at lists.llvm.org
> 
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 

-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161129/3029d82a/attachment-0001.html>


More information about the llvm-dev mailing list