[PATCH] D93003: [libunwind][ELF] Hide unw_getcontext with LIBUNWIND_HIDE_SYMBOLS

Ryan Prichard via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Feb 9 19:57:50 PST 2021


rprichard added inline comments.


================
Comment at: libunwind/src/assembly.h:82
   .globl SYMBOL_NAME(aliasname) SEPARATOR                                      \
-  WEAK_SYMBOL(aliasname) SEPARATOR                                             \
+  EXPORT_SYMBOL(SYMBOL_NAME(aliasname)) SEPARATOR                              \
   SYMBOL_NAME(aliasname) = SYMBOL_NAME(name)
----------------
rprichard wrote:
> steven_wu wrote:
> > rprichard wrote:
> > > compnerd wrote:
> > > > Does this not change the behaviour on MachO?  This symbol is now `private_extern` rather than a `weak_reference`.  A weak reference will be set to 0 by the loader if it is not found, and a `private_extern` is a strong internal reference.
> > > Is `.weak_reference` the right directive to use here, instead of `.weak_definition`? We're defining a symbol (`aliasname`) and setting its value to that of another symbol (`name`).
> > > 
> > > I think marking `unw_*` weak is intended to let some other strong definition override it. Its value won't ever be set to 0.
> > > 
> > > Currently on Mach-O, the hide-symbols flag hides almost everything (including `_Unwind_*`) but leaves all of the `unw_*` alias symbols as extern (and not private-extern) and not weak. With my change, they're still not weak, but they're private-extern.
> > > 
> > > libunwind's current assembly.h behavior for a weak alias:
> > > 
> > >     .globl aliasname
> > >     .weak_reference aliasname
> > >     aliasname = name
> > > 
> > > The LLVM Mach-O assembler ignores the `.weak_reference` directive. If I change it to `.weak_definition`, it is still ignored. AFAICT, the LLVM assembler uses the WeakDef/WeakRef attributes from the `name` symbol.
> > > 
> > > e.g.
> > > 
> > > ```
> > > $ cat test.S    
> > >     .text
> > >     .space 0x42
> > > 
> > >     // Define foo.
> > >     .globl foo
> > > foo:
> > >     ret
> > > 
> > >     // Define a weak alias, bar.
> > >     .globl bar
> > >     .weak_reference bar
> > >     bar = foo
> > > 
> > > $ ~/clang11/bin/clang test.S -c && ~/clang11/bin/llvm-readobj --syms test.o
> > > 
> > > File: test.o
> > > Format: Mach-O 64-bit x86-64
> > > Arch: x86_64
> > > AddressSize: 64bit
> > > Symbols [
> > >   Symbol {
> > >     Name: bar (1)
> > >     Extern
> > >     Type: Section (0xE)
> > >     Section: __text (0x1)
> > >     RefType: UndefinedNonLazy (0x0)
> > >     Flags [ (0x0)
> > >     ]
> > >     Value: 0x42
> > >   }
> > >   Symbol {
> > >     Name: foo (5)
> > >     Extern
> > >     Type: Section (0xE)
> > >     Section: __text (0x1)
> > >     RefType: UndefinedNonLazy (0x0)
> > >     Flags [ (0x0)
> > >     ]
> > >     Value: 0x42
> > >   }
> > > ]
> > > ```
> > > 
> > > The Flags are empty.
> > > 
> > > OTOH, if I flip things around:
> > > 
> > > ```
> > >     .text
> > >     .space 0x42
> > > 
> > >     // Define a weak function, foo.
> > >     .globl foo
> > >     .weak_reference foo
> > > foo:
> > >     ret
> > > 
> > >     // Define an alias, bar.
> > >     .globl bar
> > >     bar = foo
> > > ```
> > > 
> > > Now both symbols are WeakRef:
> > > 
> > > ```
> > > File: test.o
> > > Format: Mach-O 64-bit x86-64
> > > Arch: x86_64
> > > AddressSize: 64bit
> > > Symbols [
> > >   Symbol {
> > >     Name: bar (1)
> > >     Extern
> > >     Type: Section (0xE)
> > >     Section: __text (0x1)
> > >     RefType: UndefinedNonLazy (0x0)
> > >     Flags [ (0x40)
> > >       WeakRef (0x40)
> > >     ]
> > >     Value: 0x42
> > >   }
> > >   Symbol {
> > >     Name: foo (5)
> > >     Extern
> > >     Type: Section (0xE)
> > >     Section: __text (0x1)
> > >     RefType: UndefinedNonLazy (0x0)
> > >     Flags [ (0x40)
> > >       WeakRef (0x40)
> > >     ]
> > >     Value: 0x42
> > >   }
> > > ]
> > > ```
> > > 
> > > I'm wondering if this LLVM behavior is actually correct, but I'm also unfamiliar with Mach-O. (i.e. We want to copy the symbol's address, but should we be copying the WeakRef/WeakDef flags?) It looks like the behavior is controlled in `MachObjectWriter::writeNlist`. `writeNlist` finds the aliased symbol and uses its flags:
> > > ```
> > >   // The Mach-O streamer uses the lowest 16-bits of the flags for the 'desc'
> > >   // value.
> > >   bool EncodeAsAltEntry =
> > >     IsAlias && cast<MCSymbolMachO>(OrigSymbol).isAltEntry();
> > >   W.write<uint16_t>(cast<MCSymbolMachO>(Symbol)->getEncodedFlags(EncodeAsAltEntry));
> > > ```
> > > 
> > > The PrivateExtern attribute, on the other hand, isn't part of these encoded flags:
> > > ```
> > >   if (Data.isPrivateExtern())
> > >     Type |= MachO::N_PEXT;
> > > ```
> > > `Data` continues to refer to the alias symbol rather than the aliased symbol. However, apparently the author isn't completely sure about things?
> > > ```
> > >     // FIXME: Should this update Data as well?
> > > ```
> > > 
> > > In libunwind, there is one usage of assembly.h's WEAK_ALIAS. UnwindRegistersSave.S defines a hidden non-weak __unw_getcontext function, and also a weak alias unw_getcontext. My patch's goal is to make unw_getcontext hidden or not-hidden depending on a CMake config variable.
> > > 
> > > Currently, on Mach-O and on Windows, `WEAK_ALIAS` doesn't actually make the alias weak. I'm just making it a bit more explicit on Mach-O.
> > > 
> > > Alternatively:
> > >  - Instead of a weak alias, we could output a weak thunk -- a weak function with a branch to the internal hidden symbol. That's more of a functional change, though.
> > >  - Or, on Mach-O, both the `unw_*` and `__unw_*` functions could be WeakDef.
> > >  - Maybe the hide-symbols flag should only affect ELF?
> > > 
> > I guess the symbol is never really `weak` for Darwin. The `weak_import` attribute will turn all the reference to the symbol to `weak_reference` but since the alias is declared in `.cpp` file and never referenced, it will not create any weak linkage to the symbol. I am also not sure if a weak alias is possible on Darwin :) I think making everything `.private_extern` for Darwin should be fine?
> > 
> > @ldionne The change was added: https://reviews.llvm.org/D59921. I am not sure why the alias need to be weak?
> > I guess the symbol is never really `weak` for Darwin. The `weak_import` attribute will turn all the reference to the symbol to `weak_reference` but since the alias is declared in `.cpp` file and never referenced, it will not create any weak linkage to the symbol. I am also not sure if a weak alias is possible on Darwin :) I think making everything `.private_extern` for Darwin should be fine?
> >
> > @ldionne The change was added: https://reviews.llvm.org/D59921. I am not sure why the alias need to be weak?
> 
> I'm guessing the `unw_*` symbols are marked weak because they're not "reserved for the implementation" the way the underscore-uppercase symbols are (`_Unwind_*`), so if a program wants to define its own `unw_*` symbols, it should be able to without breaking unwinding. This situation could cause a multiply-defined symbol error at link-time, at least when libunwind is linked statically.
> 
Ok, I've restored the Mach-O part of this change. Please take a look.

I did confirm that weak aliases don't work on Mach-O: when a weak function has been overridden with a strong function, the linker can delete the weak function, leaving the aliases pointing at the deleted area (i.e. whatever function happens to come next). https://gist.github.com/rprichard/8775d7844228577e40d2fd0776397f47

I also removed the `WEAK_SYMBOL()` macro for the Apple case, because it's not used, Windows also doesn't have the macro, and it's not clear whether `.weak_definition` or `.weak_reference` would be wanted.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93003/new/

https://reviews.llvm.org/D93003



More information about the cfe-commits mailing list