[cfe-dev] different gcc/clang behavior for Internal visibility

Andrii Nakryiko via cfe-dev cfe-dev at lists.llvm.org
Wed May 12 10:58:44 PDT 2021


On Tue, May 11, 2021 at 9:53 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>
> On Tue, May 11, 2021 at 9:41 PM Andrii Nakryiko
> <andrii.nakryiko at gmail.com> wrote:
> >
> > On Tue, May 11, 2021 at 3:37 PM Fāng-ruì Sòng <maskray at google.com> wrote:
> > >
> > > On Tue, May 11, 2021 at 2:59 PM Andrii Nakryiko
> > > <andrii.nakryiko at gmail.com> wrote:
> > > >
> > > > On Tue, May 11, 2021 at 11:05 AM Fāng-ruì Sòng <maskray at google.com> wrote:
> > > > >
> > > > > > rnk: IIRC, internal visibility is a guarantee that a function is not escaped and called indirectly from another DSO. This allows the compiler to avoid setting up a TOC or PIC base register in the prologue for architectures that have one. This was notably expensive for x86_32 because of the call/pop code sequence required to materialize EIP. There are other RISC-y architectures (PPC? 64? not sure) with PIC base registers that could benefit from internal visibility support, but I think most of them are considered legacy architectures at this point, so it's not top priority.
> > > > >
> > > > > I vaguely recall that I have seen something similar to "a function is
> > > > > not escaped", but I just searched generic-abi and could not find any
> > > > > proof...
> > > > >
> > > > > I just checked i386 and PowerPC64 ELFv2 ABIs - they don't define STV_INTERNAL.
> > > > >
> > > > > On Tue, May 11, 2021 at 10:45 AM Andrii Nakryiko
> > > > > <andrii.nakryiko at gmail.com> wrote:
> > > > > >
> > > > > > On Mon, May 10, 2021 at 12:52 PM Fangrui Song <maskray at google.com> wrote:
> > > > > > >
> > > > > > > On 2021-05-10, Y Song via cfe-dev wrote:
> > > > > > > >Hi,
> > > > > > > >
> > > > > > > >The bpf linker project tries to explore to use INTERNAL visibility as in
> > > > > > > >   https://lore.kernel.org/bpf/20210507054119.270888-1-andrii@kernel.org/
> > > > > > > >
> > > > > > > >But we found clang actually changed user "internal" visibility to "hidden".
> > > > > > > >For example, I have the following example,
> > > > > > > >
> > > > > > > >$ cat t.c
> > > > > > > >int __attribute__((visibility("internal"))) foo() { return 0; }
> > > > > > > >$ clang -c t.c && llvm-readelf -s t.o | grep foo
> > > > > > > >     3: 0000000000000000     8 FUNC    GLOBAL HIDDEN      2 foo
> > > > > > > >$ gcc -c t.c && llvm-readelf -s t.o | grep foo
> > > > > > > >     8: 0000000000000000    11 FUNC    GLOBAL INTERNAL    1 foo
> > > > > > > >$
> > > > > > > >
> > > > > > > >Looks like this is caused by clang Attr.td,
> > > > > > > >
> > > > > > > >diff --git a/clang/include/clang/Basic/Attr.td
> > > > > > > >b/clang/include/clang/Basic/Attr.td
> > > > > > > >index 5e04f32187cd..4559a1bcfe42 100644
> > > > > > > >--- a/clang/include/clang/Basic/Attr.td
> > > > > > > >+++ b/clang/include/clang/Basic/Attr.td
> > > > > > > >@@ -2776,7 +2776,7 @@ def Visibility : InheritableAttr {
> > > > > > > >   let Spellings = [GCC<"visibility">];
> > > > > > > >   let Args = [EnumArgument<"Visibility", "VisibilityType",
> > > > > > > >                            ["default", "hidden", "internal", "protected"],
> > > > > > > >-                           ["Default", "Hidden", "Hidden", "Protected"]>];
> > > > > > > >+                           ["Default", "Hidden", "Internal", "Protected"]>];
> > > > > > > >   let MeaningfulToClassTemplateDefinition = 1;
> > > > > > > >   let Documentation = [Undocumented];
> > > > > > > > }
> > > > > > > >@@ -2786,7 +2786,7 @@ def TypeVisibility : InheritableAttr {
> > > > > > > >   let Spellings = [Clang<"type_visibility">];
> > > > > > > >   let Args = [EnumArgument<"Visibility", "VisibilityType",
> > > > > > > >                            ["default", "hidden", "internal", "protected"],
> > > > > > > >-                           ["Default", "Hidden", "Hidden", "Protected"]>];
> > > > > > > >+                           ["Default", "Hidden", "Internal", "Protected"]>];
> > > > > > > > //  let Subjects = [Tag, ObjCInterface, Namespace];
> > > > > > > >   let Documentation = [Undocumented];
> > > > > > > > }
> > > > > > > >
> > > > > > > >One of early commits,
> > > > > > > >
> > > > > > > >commit 570024a8d9b4a4aa4a35f077a0a65003dc7b71fe
> > > > > > > >Author: Eli Friedman <eli.friedman at gmail.com>
> > > > > > > >Date:   Thu Aug 5 06:57:20 2010 +0000
> > > > > > > >
> > > > > > > >    Implement #pragma GCC visibility.
> > > > > > > >
> > > > > > > >    llvm-svn: 110315
> > > > > > > >
> > > > > > > >I see
> > > > > > > >
> > > > > > > >+    else if (VisType->isStr("internal"))
> > > > > > > >+      type = VisibilityAttr::HiddenVisibility; // FIXME
> > > > > > > >
> > > > > > > >Do we have any plan to support Internal visibility?
> > > > > > >
> > > > > > > I don't think there is value supporting STV_INTERNAL.
> > > > > > >
> > > > > > > STV_INTERNAL as we see today in the ELF specification was requested by SGI.
> > > > > > >
> > > > > > > > The meaning of this visibility attribute may be defined by processor
> > > > > > > > supplements to further constrain hidden symbols. A processor
> > > > > > > > supplement's definition should be such that generic tools can safely
> > > > > > > > treat internal symbols as hidden.  An internal symbol contained in a
> > > > > > > > relocatable object must be either removed or converted to STB_LOCAL
> > > > > > > > binding by the link-editor when the relocatable object is included in an
> > > > > > > > executable file or shared object.
> > > > > > >
> > > > > > > I recall from reading somewhere that it is used by its propritery
> > > > > > > compiler for some LTO like optimizations. STV_INTERNAL is identical to
> > > > > > > STV_HIDDEN in GNU/Solaris (and likely HP-UX).
> > > > > > >
> > > > > > > For the Linux kernel, my suggestion is to just use __attribute__((visibility("hidden"))).
> > > > > > > Then there will be no confusion that "internal" translates to STV_HIDDEN.
> > > > > >
> > > > > > We are considering using both STV_HIDDEN and STV_INTERNAL for BPF
> > > > > > target and we want to be able to distinguish between the two. Given
> > > > > > ELF defines all four (STV_DEFAULT, STV_PROTECTED, STV_HIDDEN,
> > > > > > STV_INTERNAL), it would be great to be able to actually express
> > > > > > STV_INTERNAL. Are there any specific problems with passing-through
> > > > > > STV_INTERNAL down to ELF symbol visibility?
> > > > >
> > > > > Since the ELF specification delegates the further definition of
> > > > > STV_INTERNAL to processor supplements, BPF as a processor has the
> > > > > right to further define STV_INTERNAL.
> > > > > I'd like to know more arguments favoring a distinguished
> > > > > STV_INTERNAL/STV_HIDDEN, since the two bits are ubiquitously
> > > > > equivalent in GNU ABI and Solaris (and likely HP-UX; only SGI does
> > > > > something different but it is difficult to find the evidence now).
> > > > >
> > > >
> > > > It's quite BPF specific, but you can check this patch with some
> > > > discussion around STV_HIDDEN vs STV_INTERNAL.\
> > > >
> > > >   [0] https://lore.kernel.org/bpf/20210507054119.270888-8-andrii@kernel.org/
> > >
> > > I am not subscribed to the bpf list so I'll reply here.
> > > As you are proposing new ELF features. and I'd like to make sure we
> > > are making good use of them.
> > >
> > > > STV_INTERNAL is stricter and prevents any other object files to reference such symbol (function, variable, map).
> > >
> > > I do not understand this. Do you mean that STV_INTERNAL, being a
> > > visibility value, acts like the binding STB_LOCAL?
> >
> > Yes, sort of. It prevents any extern resolution against such symbol,
> > just like for STB_LOCAL. But it still enforces no naming conflicts
> > (unlike STB_LOCAL) because it stays STB_GLOBAL.
>
> This is not allowed in ELF. STB_GLOBAL needs to satisfy "One file's
> definition of a global symbol will satisfy another file's undefined
> reference to the same global symbol."
> You can't redefine STV_INTERNAL STB_LOCAL.

I'm not redefining it as STB_LOCAL, I explicitly don't want to define
it as STB_LOCAL. I'm just defining a custom behavior of STB_GLOBAL +
STV_INTERNAL. Weren't you the one saying "BPF as a processor has the
right to further define STV_INTERNAL"?

>
> > >
> > > > The undesirable behavior is that after shared library is linked, application can define "conflicting" symbol with the same name and they will happily co-exists, because shared library's symbol is now STB_LOCAL.
> > >
> > > The specification says "Local symbols of the same name may exist in
> > > multiple files without interfering with each other." for STB_LOCAL, so
> > > this is really natural.
> > > I do not see how the proposed STV_INTERNAL behaves here.
> >
> > We keep STV_INTERNAL + STB_GLOBAL, instead of STB_LOCAL.
> >
> > >
> > > > __attribute__((visibility("internal"))) int user_space_var;
> > >
> > > The neigh paragraphs probably need more wording for folks we are not
> > > versed in the Linux kernel BPF stuff (like me).
> > > But from the wording I am worrying that you are defining STV_INTERNAL
> > > as something less constraining than STV_HIDDEN.
> > > That is disallowed by the specification:
> > >
> > > "the most constraining visibility attribute must be propagated to the
> > > resolving symbol in the linked object. The attributes, ordered from
> > > least to most constraining, are: STV_PROTECTED, STV_HIDDEN and
> > > STV_INTERNAL."
> > >
> > > STV_INTERNAL must be more constraining than STV_HIDDEN.
> >
> > And it is. STV_HIDDEN symbols are resolvable within BPF library which
> > is linked from multiple .o files. STV_INTERNAL is not resolvable
> > outside of single defining .o file. So STV_INTERNAL is stricter.
> >
> > The general idea is that for BPF world we want to keep using
> > STB_GLOBAL for variables that are shared with user-space, because that
> > guarantees name uniqueness, otherwise it gets hairy very fast (this is
> > unique BPF aspect where we have kernel-side code that can share pieces
> > of data with user-space code). So sticking to global symbols is good
> > for that purpose. But we also want to have more control over which
> > files can resolve externs against such globals. So that "outside" BPF
> > code can't access BPF library's internal global symbols that are
> > supposed to be shared only within BPF library .o files.
> >
> > For the latter case, you start out with STV_HIDDEN, statically-link
> > BPF library .o's into a BPF library's object file. But once that is
> > done, all those STV_HIDDEN symbols will become STV_INTERNAL and won't
> > be resolvable outside of BPF library, when BPF library is later linked
> > into user's BPF application. We don't have shared libraries, but we'd
> > like static BPF libraries to have the same boundary/API protection as
> > shared libraries have in user-space world.
> >
> > I hope this helps a little bit, but you are right that one needs to
> > know quite a bit of BPF specifics to understand these nuances.
>
> If you raise an example about how the symbol's visibility/binding
> changes during emission / linking / used by user applications, it
> could be clearer.
> From the description I don't see how Clang
> __attribute__((visibility("internal"))) is involved.
> (And from above I think your STV_INTERNAL redefinition is incorrect -
> so it does not justify modifying LLVM IR and updating the
> __attribute__((visibility("internal")) behavior).

STV_INTERNAL is defined separately from STV_HIDDEN in ELF, right?
Regardless how we define STV_INTERNAL for BPF, shouldn't Clang allow
to express STV_INTERNAL without imposing artificial conversions of
STV_INTERNAL -> STV_HIDDEN in ELF symbol visibility? Is there any
technical reason to mangle __attribute__((visibility("internal")))?
GCC allows to preserve STV_INTERNAL just fine without imposing
unnecessary restrictions, so would be great for Clang to allow the
same.

>
> > >
> > >
> > >
> > > > >
> > > > > --
> > > > > 宋方睿
> > >
> > >
> > >
> > > --
> > > 宋方睿
>
>
>
> --
> 宋方睿


More information about the cfe-dev mailing list