[cfe-dev] different gcc/clang behavior for Internal visibility

Tue May 11 21:41:46 PDT 2021

On Tue, May 11, 2021 at 3:37 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>
> On Tue, May 11, 2021 at 2:59 PM Andrii Nakryiko
> <andrii.nakryiko at gmail.com> wrote:
> >
> > On Tue, May 11, 2021 at 11:05 AM Fāng-ruì Sòng <maskray at google.com> wrote:
> > >
> > > > rnk: IIRC, internal visibility is a guarantee that a function is not escaped and called indirectly from another DSO. This allows the compiler to avoid setting up a TOC or PIC base register in the prologue for architectures that have one. This was notably expensive for x86_32 because of the call/pop code sequence required to materialize EIP. There are other RISC-y architectures (PPC? 64? not sure) with PIC base registers that could benefit from internal visibility support, but I think most of them are considered legacy architectures at this point, so it's not top priority.
> > >
> > > I vaguely recall that I have seen something similar to "a function is
> > > not escaped", but I just searched generic-abi and could not find any
> > > proof...
> > >
> > > I just checked i386 and PowerPC64 ELFv2 ABIs - they don't define STV_INTERNAL.
> > >
> > > On Tue, May 11, 2021 at 10:45 AM Andrii Nakryiko
> > > <andrii.nakryiko at gmail.com> wrote:
> > > >
> > > > On Mon, May 10, 2021 at 12:52 PM Fangrui Song <maskray at google.com> wrote:
> > > > >
> > > > > On 2021-05-10, Y Song via cfe-dev wrote:
> > > > > >Hi,
> > > > > >
> > > > > >The bpf linker project tries to explore to use INTERNAL visibility as in
> > > > > >   https://lore.kernel.org/bpf/20210507054119.270888-1-andrii@kernel.org/
> > > > > >
> > > > > >But we found clang actually changed user "internal" visibility to "hidden".
> > > > > >For example, I have the following example,
> > > > > >
> > > > > >$ cat t.c
> > > > > >int __attribute__((visibility("internal"))) foo() { return 0; }
> > > > > >$ clang -c t.c && llvm-readelf -s t.o | grep foo
> > > > > >     3: 0000000000000000     8 FUNC    GLOBAL HIDDEN      2 foo
> > > > > >$ gcc -c t.c && llvm-readelf -s t.o | grep foo
> > > > > >     8: 0000000000000000    11 FUNC    GLOBAL INTERNAL    1 foo
> > > > > >$
> > > > > >
> > > > > >Looks like this is caused by clang Attr.td,
> > > > > >
> > > > > >diff --git a/clang/include/clang/Basic/Attr.td
> > > > > >b/clang/include/clang/Basic/Attr.td
> > > > > >index 5e04f32187cd..4559a1bcfe42 100644
> > > > > >--- a/clang/include/clang/Basic/Attr.td
> > > > > >+++ b/clang/include/clang/Basic/Attr.td
> > > > > >@@ -2776,7 +2776,7 @@ def Visibility : InheritableAttr {
> > > > > >   let Spellings = [GCC<"visibility">];
> > > > > >   let Args = [EnumArgument<"Visibility", "VisibilityType",
> > > > > >                            ["default", "hidden", "internal", "protected"],
> > > > > >-                           ["Default", "Hidden", "Hidden", "Protected"]>];
> > > > > >+                           ["Default", "Hidden", "Internal", "Protected"]>];
> > > > > >   let MeaningfulToClassTemplateDefinition = 1;
> > > > > >   let Documentation = [Undocumented];
> > > > > > }
> > > > > >@@ -2786,7 +2786,7 @@ def TypeVisibility : InheritableAttr {
> > > > > >   let Spellings = [Clang<"type_visibility">];
> > > > > >   let Args = [EnumArgument<"Visibility", "VisibilityType",
> > > > > >                            ["default", "hidden", "internal", "protected"],
> > > > > >-                           ["Default", "Hidden", "Hidden", "Protected"]>];
> > > > > >+                           ["Default", "Hidden", "Internal", "Protected"]>];
> > > > > > //  let Subjects = [Tag, ObjCInterface, Namespace];
> > > > > >   let Documentation = [Undocumented];
> > > > > > }
> > > > > >
> > > > > >One of early commits,
> > > > > >
> > > > > >commit 570024a8d9b4a4aa4a35f077a0a65003dc7b71fe
> > > > > >Author: Eli Friedman <eli.friedman at gmail.com>
> > > > > >Date:   Thu Aug 5 06:57:20 2010 +0000
> > > > > >
> > > > > >    Implement #pragma GCC visibility.
> > > > > >
> > > > > >    llvm-svn: 110315
> > > > > >
> > > > > >I see
> > > > > >
> > > > > >+    else if (VisType->isStr("internal"))
> > > > > >+      type = VisibilityAttr::HiddenVisibility; // FIXME
> > > > > >
> > > > > >Do we have any plan to support Internal visibility?
> > > > >
> > > > > I don't think there is value supporting STV_INTERNAL.
> > > > >
> > > > > STV_INTERNAL as we see today in the ELF specification was requested by SGI.
> > > > >
> > > > > > The meaning of this visibility attribute may be defined by processor
> > > > > > supplements to further constrain hidden symbols. A processor
> > > > > > supplement's definition should be such that generic tools can safely
> > > > > > treat internal symbols as hidden.  An internal symbol contained in a
> > > > > > relocatable object must be either removed or converted to STB_LOCAL
> > > > > > binding by the link-editor when the relocatable object is included in an
> > > > > > executable file or shared object.
> > > > >
> > > > > I recall from reading somewhere that it is used by its propritery
> > > > > compiler for some LTO like optimizations. STV_INTERNAL is identical to
> > > > > STV_HIDDEN in GNU/Solaris (and likely HP-UX).
> > > > >
> > > > > For the Linux kernel, my suggestion is to just use __attribute__((visibility("hidden"))).
> > > > > Then there will be no confusion that "internal" translates to STV_HIDDEN.
> > > >
> > > > We are considering using both STV_HIDDEN and STV_INTERNAL for BPF
> > > > target and we want to be able to distinguish between the two. Given
> > > > ELF defines all four (STV_DEFAULT, STV_PROTECTED, STV_HIDDEN,
> > > > STV_INTERNAL), it would be great to be able to actually express
> > > > STV_INTERNAL. Are there any specific problems with passing-through
> > > > STV_INTERNAL down to ELF symbol visibility?
> > >
> > > Since the ELF specification delegates the further definition of
> > > STV_INTERNAL to processor supplements, BPF as a processor has the
> > > right to further define STV_INTERNAL.
> > > I'd like to know more arguments favoring a distinguished
> > > STV_INTERNAL/STV_HIDDEN, since the two bits are ubiquitously
> > > equivalent in GNU ABI and Solaris (and likely HP-UX; only SGI does
> > > something different but it is difficult to find the evidence now).
> > >
> >
> > It's quite BPF specific, but you can check this patch with some
> > discussion around STV_HIDDEN vs STV_INTERNAL.\
> >
> >   [0] https://lore.kernel.org/bpf/20210507054119.270888-8-andrii@kernel.org/
>
> I am not subscribed to the bpf list so I'll reply here.
> As you are proposing new ELF features. and I'd like to make sure we
> are making good use of them.
>
> > STV_INTERNAL is stricter and prevents any other object files to reference such symbol (function, variable, map).
>
> I do not understand this. Do you mean that STV_INTERNAL, being a
> visibility value, acts like the binding STB_LOCAL?

Yes, sort of. It prevents any extern resolution against such symbol,
just like for STB_LOCAL. But it still enforces no naming conflicts
(unlike STB_LOCAL) because it stays STB_GLOBAL.

>
> > The undesirable behavior is that after shared library is linked, application can define "conflicting" symbol with the same name and they will happily co-exists, because shared library's symbol is now STB_LOCAL.
>
> The specification says "Local symbols of the same name may exist in
> multiple files without interfering with each other." for STB_LOCAL, so
> this is really natural.
> I do not see how the proposed STV_INTERNAL behaves here.

We keep STV_INTERNAL + STB_GLOBAL, instead of STB_LOCAL.

>
> > __attribute__((visibility("internal"))) int user_space_var;
>
> The neigh paragraphs probably need more wording for folks we are not
> versed in the Linux kernel BPF stuff (like me).
> But from the wording I am worrying that you are defining STV_INTERNAL
> as something less constraining than STV_HIDDEN.
> That is disallowed by the specification:
>
> "the most constraining visibility attribute must be propagated to the
> resolving symbol in the linked object. The attributes, ordered from
> least to most constraining, are: STV_PROTECTED, STV_HIDDEN and
> STV_INTERNAL."
>
> STV_INTERNAL must be more constraining than STV_HIDDEN.

And it is. STV_HIDDEN symbols are resolvable within BPF library which
is linked from multiple .o files. STV_INTERNAL is not resolvable
outside of single defining .o file. So STV_INTERNAL is stricter.

The general idea is that for BPF world we want to keep using
STB_GLOBAL for variables that are shared with user-space, because that
guarantees name uniqueness, otherwise it gets hairy very fast (this is
unique BPF aspect where we have kernel-side code that can share pieces
of data with user-space code). So sticking to global symbols is good
for that purpose. But we also want to have more control over which
files can resolve externs against such globals. So that "outside" BPF
code can't access BPF library's internal global symbols that are
supposed to be shared only within BPF library .o files.

For the latter case, you start out with STV_HIDDEN, statically-link
BPF library .o's into a BPF library's object file. But once that is
done, all those STV_HIDDEN symbols will become STV_INTERNAL and won't
be resolvable outside of BPF library, when BPF library is later linked
into user's BPF application. We don't have shared libraries, but we'd
like static BPF libraries to have the same boundary/API protection as
shared libraries have in user-space world.

I hope this helps a little bit, but you are right that one needs to
know quite a bit of BPF specifics to understand these nuances.

>
>
>
> > >
> > > --
> > > 宋方睿
>
>
>
> --
> 宋方睿