[cfe-dev] different gcc/clang behavior for Internal visibility

Fāng-ruì Sòng via cfe-dev cfe-dev at lists.llvm.org
Wed May 12 11:40:30 PDT 2021


On 2021-05-12, Andrii Nakryiko wrote:
>On Tue, May 11, 2021 at 9:53 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>>
>> On Tue, May 11, 2021 at 9:41 PM Andrii Nakryiko
>> <andrii.nakryiko at gmail.com> wrote:
>> >
>> > On Tue, May 11, 2021 at 3:37 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>> > >
>> > > On Tue, May 11, 2021 at 2:59 PM Andrii Nakryiko
>> > > <andrii.nakryiko at gmail.com> wrote:
>> > > >
>> > > > On Tue, May 11, 2021 at 11:05 AM Fāng-ruì Sòng <maskray at google.com> wrote:
>> > > > >
>> > > > > > rnk: IIRC, internal visibility is a guarantee that a function is not escaped and called indirectly from another DSO. This allows the compiler to avoid setting up a TOC or PIC base register in the prologue for architectures that have one. This was notably expensive for x86_32 because of the call/pop code sequence required to materialize EIP. There are other RISC-y architectures (PPC? 64? not sure) with PIC base registers that could benefit from internal visibility support, but I think most of them are considered legacy architectures at this point, so it's not top priority.
>> > > > >
>> > > > > I vaguely recall that I have seen something similar to "a function is
>> > > > > not escaped", but I just searched generic-abi and could not find any
>> > > > > proof...
>> > > > >
>> > > > > I just checked i386 and PowerPC64 ELFv2 ABIs - they don't define STV_INTERNAL.
>> > > > >
>> > > > > On Tue, May 11, 2021 at 10:45 AM Andrii Nakryiko
>> > > > > <andrii.nakryiko at gmail.com> wrote:
>> > > > > >
>> > > > > > On Mon, May 10, 2021 at 12:52 PM Fangrui Song <maskray at google.com> wrote:
>> > > > > > >
>> > > > > > > On 2021-05-10, Y Song via cfe-dev wrote:
>> > > > > > > >Hi,
>> > > > > > > >
>> > > > > > > >The bpf linker project tries to explore to use INTERNAL visibility as in
>> > > > > > > >   https://lore.kernel.org/bpf/20210507054119.270888-1-andrii@kernel.org/
>> > > > > > > >
>> > > > > > > >But we found clang actually changed user "internal" visibility to "hidden".
>> > > > > > > >For example, I have the following example,
>> > > > > > > >
>> > > > > > > >$ cat t.c
>> > > > > > > >int __attribute__((visibility("internal"))) foo() { return 0; }
>> > > > > > > >$ clang -c t.c && llvm-readelf -s t.o | grep foo
>> > > > > > > >     3: 0000000000000000     8 FUNC    GLOBAL HIDDEN      2 foo
>> > > > > > > >$ gcc -c t.c && llvm-readelf -s t.o | grep foo
>> > > > > > > >     8: 0000000000000000    11 FUNC    GLOBAL INTERNAL    1 foo
>> > > > > > > >$
>> > > > > > > >
>> > > > > > > >Looks like this is caused by clang Attr.td,
>> > > > > > > >
>> > > > > > > >diff --git a/clang/include/clang/Basic/Attr.td
>> > > > > > > >b/clang/include/clang/Basic/Attr.td
>> > > > > > > >index 5e04f32187cd..4559a1bcfe42 100644
>> > > > > > > >--- a/clang/include/clang/Basic/Attr.td
>> > > > > > > >+++ b/clang/include/clang/Basic/Attr.td
>> > > > > > > >@@ -2776,7 +2776,7 @@ def Visibility : InheritableAttr {
>> > > > > > > >   let Spellings = [GCC<"visibility">];
>> > > > > > > >   let Args = [EnumArgument<"Visibility", "VisibilityType",
>> > > > > > > >                            ["default", "hidden", "internal", "protected"],
>> > > > > > > >-                           ["Default", "Hidden", "Hidden", "Protected"]>];
>> > > > > > > >+                           ["Default", "Hidden", "Internal", "Protected"]>];
>> > > > > > > >   let MeaningfulToClassTemplateDefinition = 1;
>> > > > > > > >   let Documentation = [Undocumented];
>> > > > > > > > }
>> > > > > > > >@@ -2786,7 +2786,7 @@ def TypeVisibility : InheritableAttr {
>> > > > > > > >   let Spellings = [Clang<"type_visibility">];
>> > > > > > > >   let Args = [EnumArgument<"Visibility", "VisibilityType",
>> > > > > > > >                            ["default", "hidden", "internal", "protected"],
>> > > > > > > >-                           ["Default", "Hidden", "Hidden", "Protected"]>];
>> > > > > > > >+                           ["Default", "Hidden", "Internal", "Protected"]>];
>> > > > > > > > //  let Subjects = [Tag, ObjCInterface, Namespace];
>> > > > > > > >   let Documentation = [Undocumented];
>> > > > > > > > }
>> > > > > > > >
>> > > > > > > >One of early commits,
>> > > > > > > >
>> > > > > > > >commit 570024a8d9b4a4aa4a35f077a0a65003dc7b71fe
>> > > > > > > >Author: Eli Friedman <eli.friedman at gmail.com>
>> > > > > > > >Date:   Thu Aug 5 06:57:20 2010 +0000
>> > > > > > > >
>> > > > > > > >    Implement #pragma GCC visibility.
>> > > > > > > >
>> > > > > > > >    llvm-svn: 110315
>> > > > > > > >
>> > > > > > > >I see
>> > > > > > > >
>> > > > > > > >+    else if (VisType->isStr("internal"))
>> > > > > > > >+      type = VisibilityAttr::HiddenVisibility; // FIXME
>> > > > > > > >
>> > > > > > > >Do we have any plan to support Internal visibility?
>> > > > > > >
>> > > > > > > I don't think there is value supporting STV_INTERNAL.
>> > > > > > >
>> > > > > > > STV_INTERNAL as we see today in the ELF specification was requested by SGI.
>> > > > > > >
>> > > > > > > > The meaning of this visibility attribute may be defined by processor
>> > > > > > > > supplements to further constrain hidden symbols. A processor
>> > > > > > > > supplement's definition should be such that generic tools can safely
>> > > > > > > > treat internal symbols as hidden.  An internal symbol contained in a
>> > > > > > > > relocatable object must be either removed or converted to STB_LOCAL
>> > > > > > > > binding by the link-editor when the relocatable object is included in an
>> > > > > > > > executable file or shared object.
>> > > > > > >
>> > > > > > > I recall from reading somewhere that it is used by its propritery
>> > > > > > > compiler for some LTO like optimizations. STV_INTERNAL is identical to
>> > > > > > > STV_HIDDEN in GNU/Solaris (and likely HP-UX).
>> > > > > > >
>> > > > > > > For the Linux kernel, my suggestion is to just use __attribute__((visibility("hidden"))).
>> > > > > > > Then there will be no confusion that "internal" translates to STV_HIDDEN.
>> > > > > >
>> > > > > > We are considering using both STV_HIDDEN and STV_INTERNAL for BPF
>> > > > > > target and we want to be able to distinguish between the two. Given
>> > > > > > ELF defines all four (STV_DEFAULT, STV_PROTECTED, STV_HIDDEN,
>> > > > > > STV_INTERNAL), it would be great to be able to actually express
>> > > > > > STV_INTERNAL. Are there any specific problems with passing-through
>> > > > > > STV_INTERNAL down to ELF symbol visibility?
>> > > > >
>> > > > > Since the ELF specification delegates the further definition of
>> > > > > STV_INTERNAL to processor supplements, BPF as a processor has the
>> > > > > right to further define STV_INTERNAL.
>> > > > > I'd like to know more arguments favoring a distinguished
>> > > > > STV_INTERNAL/STV_HIDDEN, since the two bits are ubiquitously
>> > > > > equivalent in GNU ABI and Solaris (and likely HP-UX; only SGI does
>> > > > > something different but it is difficult to find the evidence now).
>> > > > >
>> > > >
>> > > > It's quite BPF specific, but you can check this patch with some
>> > > > discussion around STV_HIDDEN vs STV_INTERNAL.\
>> > > >
>> > > >   [0] https://lore.kernel.org/bpf/20210507054119.270888-8-andrii@kernel.org/
>> > >
>> > > I am not subscribed to the bpf list so I'll reply here.
>> > > As you are proposing new ELF features. and I'd like to make sure we
>> > > are making good use of them.
>> > >
>> > > > STV_INTERNAL is stricter and prevents any other object files to reference such symbol (function, variable, map).
>> > >
>> > > I do not understand this. Do you mean that STV_INTERNAL, being a
>> > > visibility value, acts like the binding STB_LOCAL?
>> >
>> > Yes, sort of. It prevents any extern resolution against such symbol,
>> > just like for STB_LOCAL. But it still enforces no naming conflicts
>> > (unlike STB_LOCAL) because it stays STB_GLOBAL.
>>
>> This is not allowed in ELF. STB_GLOBAL needs to satisfy "One file's
>> definition of a global symbol will satisfy another file's undefined
>> reference to the same global symbol."
>> You can't redefine STV_INTERNAL STB_LOCAL.
>
>I'm not redefining it as STB_LOCAL, I explicitly don't want to define
>it as STB_LOCAL. I'm just defining a custom behavior of STB_GLOBAL +
>STV_INTERNAL. Weren't you the one saying "BPF as a processor has the
>right to further define STV_INTERNAL"?

A processor can provide further definitions which do not conflict the
existing framework. The things you propose cause a conflict now, so it
is not ok.

>> > >
>> > > > The undesirable behavior is that after shared library is linked, application can define "conflicting" symbol with the same name and they will happily co-exists, because shared library's symbol is now STB_LOCAL.
>> > >
>> > > The specification says "Local symbols of the same name may exist in
>> > > multiple files without interfering with each other." for STB_LOCAL, so
>> > > this is really natural.
>> > > I do not see how the proposed STV_INTERNAL behaves here.
>> >
>> > We keep STV_INTERNAL + STB_GLOBAL, instead of STB_LOCAL.
>> >
>> > >
>> > > > __attribute__((visibility("internal"))) int user_space_var;
>> > >
>> > > The neigh paragraphs probably need more wording for folks we are not
>> > > versed in the Linux kernel BPF stuff (like me).
>> > > But from the wording I am worrying that you are defining STV_INTERNAL
>> > > as something less constraining than STV_HIDDEN.
>> > > That is disallowed by the specification:
>> > >
>> > > "the most constraining visibility attribute must be propagated to the
>> > > resolving symbol in the linked object. The attributes, ordered from
>> > > least to most constraining, are: STV_PROTECTED, STV_HIDDEN and
>> > > STV_INTERNAL."
>> > >
>> > > STV_INTERNAL must be more constraining than STV_HIDDEN.
>> >
>> > And it is. STV_HIDDEN symbols are resolvable within BPF library which
>> > is linked from multiple .o files. STV_INTERNAL is not resolvable
>> > outside of single defining .o file. So STV_INTERNAL is stricter.
>> >
>> > The general idea is that for BPF world we want to keep using
>> > STB_GLOBAL for variables that are shared with user-space, because that
>> > guarantees name uniqueness, otherwise it gets hairy very fast (this is
>> > unique BPF aspect where we have kernel-side code that can share pieces
>> > of data with user-space code). So sticking to global symbols is good
>> > for that purpose. But we also want to have more control over which
>> > files can resolve externs against such globals. So that "outside" BPF
>> > code can't access BPF library's internal global symbols that are
>> > supposed to be shared only within BPF library .o files.
>> >
>> > For the latter case, you start out with STV_HIDDEN, statically-link
>> > BPF library .o's into a BPF library's object file. But once that is
>> > done, all those STV_HIDDEN symbols will become STV_INTERNAL and won't
>> > be resolvable outside of BPF library, when BPF library is later linked
>> > into user's BPF application. We don't have shared libraries, but we'd
>> > like static BPF libraries to have the same boundary/API protection as
>> > shared libraries have in user-space world.
>> >
>> > I hope this helps a little bit, but you are right that one needs to
>> > know quite a bit of BPF specifics to understand these nuances.
>>
>> If you raise an example about how the symbol's visibility/binding
>> changes during emission / linking / used by user applications, it
>> could be clearer.
>> From the description I don't see how Clang
>> __attribute__((visibility("internal"))) is involved.
>> (And from above I think your STV_INTERNAL redefinition is incorrect -
>> so it does not justify modifying LLVM IR and updating the
>> __attribute__((visibility("internal")) behavior).
>
>STV_INTERNAL is defined separately from STV_HIDDEN in ELF, right?

Yes.

>Regardless how we define STV_INTERNAL for BPF, shouldn't Clang allow
>to express STV_INTERNAL without imposing artificial conversions of
>STV_INTERNAL -> STV_HIDDEN in ELF symbol visibility? 

Only if it is useful.
There is nearly no __attribute__((visibility("internal"))) in the wild
because in most ELF implementations STV_INTERNAL isn't useful at all.

>Is there any
>technical reason to mangle __attribute__((visibility("internal")))?
>GCC allows to preserve STV_INTERNAL just fine without imposing
>unnecessary restrictions, so would be great for Clang to allow the
>same.

We will have to introduce a new LLVM IR visibility "internal":
https://llvm.org/docs/LangRef.html#visibility-styles
Since the BPF usage is potentially questionable, I don't consider this a
convincing argument that we should introduce the visibility value.

>>
>> > >
>> > >
>> > >
>> > > > >
>> > > > > --
>> > > > > 宋方睿
>> > >
>> > >
>> > >
>> > > --
>> > > 宋方睿
>>
>>
>>
>> --
>> 宋方睿


More information about the cfe-dev mailing list