[llvm-dev] [RFC] Proposal: llvm-tapi, adding YAML/stub generation for ELF linking support

Armando Montanez via llvm-dev llvm-dev at lists.llvm.org
Thu Jan 3 15:55:05 PST 2019


On the Mach-O side, Juergen attempted to land a patch to support
reading/writing TBD files (https://reviews.llvm.org/D53945), but he ran
into some trouble with the builders so the change was reverted. I'm
assuming it will pop back up when the builder errors are fixed, though I
don't know when that would be.

For the ELF side, the YAML reader/writer and the ELF-specific tool that
utilizes it have both landed (https://reviews.llvm.org/D53051 and
https://reviews.llvm.org/D55352). There's a stack of changes up for review
starting at D55352 that add support for reading binaries into the internal
representation (that can then be written to YAML).
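
As a rough illustration of that flow (binary, then internal representation, then YAML), here is a minimal Python sketch. It models a stub as plain data and serializes it to text shaped like the tapi-tbe-v1 example quoted from the original RFC further down. The names Symbol, Stub, and to_tbe_text are hypothetical and are not the API from the reviews above; the format that actually landed may differ in its details.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Symbol:
    name: str
    type: str                          # "function" or "object", as in the RFC example
    size: Optional[int] = None         # only meaningful for object symbols
    undefined: bool = False
    warning_text: Optional[str] = None


@dataclass
class Stub:
    soname: str
    architecture: str
    symbols: List[Symbol] = field(default_factory=list)


def to_tbe_text(stub: Stub) -> str:
    """Serialize a Stub to YAML-like text, one symbol attribute per line."""
    lines = [
        "--- !tapi-tbe-v1",
        f"soname: {stub.soname}",
        f"architecture: {stub.architecture}",
        "symbols:",
    ]
    # Sort by name so that diffs between two stubs stay small and stable.
    for sym in sorted(stub.symbols, key=lambda s: s.name):
        lines.append(f"- name: {sym.name}")
        lines.append(f"  type: {sym.type}")
        if sym.size is not None:
            lines.append(f"  size: {sym.size}")
        if sym.warning_text is not None:
            # Assumes the warning text contains no embedded double quotes.
            lines.append(f'  warning-text: "{sym.warning_text}"')
        if sym.undefined:
            lines.append("  undefined: true")
    lines.append("...")
    return "\n".join(lines) + "\n"
```

Calling to_tbe_text(Stub("someobj.so", "aarch64", [Symbol("fish", "object", size=48)])) yields a document with one "- name: fish" entry. A real reader would fill the Stub from an ELF dynamic symbol table instead of from hand-written values.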


On Sun, Dec 30, 2018 at 5:17 PM John Ericson <john.ericson at obsidian.systems>
wrote:

> Maybe LLD is lagging behind, but is the TAPI stuff that's landed close
> enough to the original such that Apple cctools could be modified to use it?
>
> Cheers,
>
> John
>
> On Dec 30, 2018 6:09 PM, Jake Ehrlich <jakehehrlich at google.com> wrote:
>
> Some patches have landed and several are up for review but there's nothing
> 100% working where you can use it today. It's an ongoing effort. I think
> linking with MachO is a bit far off at the moment, but Juergen would know
> more. It's also my understanding that the MachO linker is lagging a bit
> behind in terms of what it can actually manage to link compared to the ELF
> and COFF linkers, so it might not be the best option for you anyhow.
>
> On Sat, Dec 29, 2018, 11:54 PM Moritz Angermann <moritz.angermann at gmail.com>
> wrote:
>
> Hi everyone,
>
> Did anything happen in this regard over the last three months? Are there
> any open diffs for the ELF part or the Mach-O part, or has anything been
> merged already?
>
> What's the current state of, say, being able to read Apple's TBD files on
> Linux when linking?
>
> Cheers,
>  Moritz
>
> > On Sep 29, 2018, at 2:33 AM, Jake Ehrlich via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > Oof, I didn't think about Clang not being in the same place. Perhaps we
> could put this in clang-tools-extra to solve that?
> >
> > As for the unification of the code bases, I was assuming we didn't want
> > to just throw a ton of code over the wall anyway, so the merge was going to
> > need to be reviewed chunk by chunk anyhow. Support for the two formats
> > should be possible to add in parallel (although I assume that if anyone puts
> > in the time, the MachO support could race ahead). The hope would be that
> > people familiar with Apple's tapi would catch, during review, anything that
> > would make it difficult to merge code from Apple's tapi.
> >
> > On Fri, Sep 28, 2018 at 10:46 AM Chris Bieneman via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> > I think that the idea of adding an llvm-tapi tool that is ELF-only and
> ignoring the earlier efforts to upstream Apple's tapi would be unfortunate.
> Unifying the codebases after the fact could be challenging and complicated,
> especially since Apple's tapi relies on clang not just LLVM. That would
> mean a different source organization and layout from the start.
> >
> > -Chris
> >
> > > On Sep 27, 2018, at 2:42 PM, Armando Montanez via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> > >
> > > Since the goal is to start llvm-tapi more or less from scratch, I feel
> > > the best approach initially is to focus on the structure as a key
> > > point of feedback in initial reviews. Once the foundations are set,
> > > integrating Mach-O TAPI in parallel with the ELF implementation should
> > > be relatively straightforward. The features outside of stubbing aren't
> > > as appealing for ELF, so I probably won't be working on extending that
> > > functionality. With that being said, the overall design goal is
> > > generalization/abstraction where possible to welcome feature parity in
> > > case it is eventually desired. I'm sure we'll run into things that
> > > belong in the tool but end up being uniquely specialized, and it will
> > > probably be best to address them on a case-by-case basis.
> > >
> > > On Wed, Sep 26, 2018 at 2:42 PM Steven Wu <stevenwu at apple.com> wrote:
> > >>
> > >> Hi Armando
> > >>
> > >> Thanks for the detailed RFC and all the background research. I think
> the concept is good and I will be happy to work with you to integrate the
> ELF implementation with Apple's MachO implementation and contribute it
> upstream. Do you have any proposal on how to integrate with Apple's tapi
> and how should we collaborate?
> > >>
> > >> Also, Apple's tapi does more than just stubbing. Are you interested
> > >> in adding ELF support for other features as well? (I guess it should not
> > >> be too hard to do that.)
> > >>
> > >> Thanks
> > >>
> > >> Steven
> > >>
> > >>> On Sep 26, 2018, at 8:29 AM, Armando Montanez via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> > >>>
> > >>> Hello all,
> > >>>
> > >>> LLVM-TAPI seeks to decouple the necessary link-time information for a
> > >>> dynamic shared object from the implementation of the runtime object.
> > >>> This process will be referred to as dynamic shared object (DSO)
> > >>> stubbing throughout this proposal. A number of projects have
> > >>> implemented their own versions of shared object stubbing for a
> variety
> > >>> of reasons related to improving the overall linking experience. This
> > >>> functionality is absent from LLVM despite how close the practice is
> to
> > >>> LLVM’s domain. The goal of this project would be to produce a library
> > >>> for LLVM that not only provides a means for DSO stubbing, but also
> > >>> gives meaningful insight into the contents of these stubs and how
> they
> > >>> change. I’ve collected a few example instances of object stubbing as
> > >>> part of larger tools and the key benefits that resulted from them:
> > >>>
> > >>> - Apple’s TAPI [1]: Stubbing used to reduce SDK size and improve
> build times.
> > >>> - Oracle’s Solaris OS linker [2]: Stubbing used to improve build
> > >>> times, and improve robustness of build system (against dependency
> > >>> cycles and race conditions).
> > >>> - Google’s Bazel [3]: Stubbing used to improve build times.
> > >>> - Google’s Fuchsia [4] [5]: Stubbing used to improve build times.
> > >>> - Android NDK: Stubbing used to reduce size of native sdk, control
> > >>> exported symbols, and improve build times.
> > >>>
> > >>> Somewhat tangentially, a tool called libabigail [6] provides
> utilities
> > >>> for tracking changes relevant to ELF files in a meaningful way. One
> of
> > >>> libabigail’s tools provides very detailed textual XML representations
> > >>> of objects, which is especially useful in the absence of a
> preexisting
> > >>> textual representation of shared objects’ exposed interfaces. Glibc
> > >>> [7] and libc++ [8] have made an effort to address this in their own
> > >>> ways by using scripts to produce textual representations of object
> > >>> interfaces. This functionality makes it significantly easier to
> > >>> analyze and control symbol visibility, though the existing solutions
> > >>> are quite bespoke. Controlling these symbols can have an implicit
> > >>> benefit of reducing binary size by pruning visible symbols, but the
> > >>> more critical feature is being able to easily view and edit the
> > >>> exposed symbols in the first place. Using human-readable stubs
> > >>> addresses the issues of DSO analysis and control without requiring
> > >>> highly specialized tools. This does not strive to replace tools
> > >>> altogether; it just makes small tasks significantly more
> approachable.
> > >>>
> > >>> llvm-tapi would strive to be an intersection between a means to
> > >>> produce and link against stubs, and providing tools that offer more
> > >>> control and insight into the public interfaces of DSOs. More
> > >>> fundamentally, llvm-tapi would introduce a library to generate and
> > >>> ingest human-readable stubs from DSOs to address these issues directly
> > >>> in LLVM. Overall, this idea is closest in spirit to Apple’s TAPI, as
> > >>> the original TAPI also uses human-readable stubs.
> > >>>
> > >>> In general, llvm-tapi should:
> > >>>
> > >>> 1. Produce human-readable text files from dynamic shared objects that
> > >>> are concise, readable, and contain everything required for linking
> > >>> that can’t be implicitly derived.
> > >>> 2. Produce linkable files from said human readable text files.
> > >>> 3. Provide tools to track and control the exposed interfaces of
> object files.
> > >>> 4. Integrate well with LLVM’s existing tools.
> > >>> 5. Strive to enable integration of the original TAPI code for Mach-O
> support.
> > >>>
> > >>> There are a number of key benefits to using stubs and text-based
> > >>> application binary interfaces such as:
> > >>> - Reducing the size of dynamic shared objects used exclusively for
> linking.
> > >>> - The ability to avoid re-linking an object when its dependencies’
> > >>> exposed interfaces do not change but their implementation does (which
> > >>> happens frequently).
> > >>> - Simplicity of viewing a diff for a changed DSO interface.
> > >>> A large number of other use cases exist; this would open up the floor
> > >>> for a variety of other tools and future work as the concept is rather
> > >>> generic.
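
The re-linking point in the list above can be sketched as a fingerprint over link-relevant data only. This is a minimal Python illustration, not part of the proposal: the names interface_fingerprint and needs_relink are hypothetical, and which fields count as link-relevant (soname, symbol name, type, size) is an assumption made for the example.

```python
import hashlib
from typing import Iterable, Optional, Tuple

# (name, type, size) triples; size is None for functions.
SymbolInfo = Tuple[str, str, Optional[int]]


def interface_fingerprint(soname: str, symbols: Iterable[SymbolInfo]) -> str:
    """Hash only link-relevant data, so implementation changes do not alter it."""
    h = hashlib.sha256()
    h.update(soname.encode())
    # Sort by name so the fingerprint does not depend on symbol table order.
    for name, kind, size in sorted(symbols, key=lambda s: s[0]):
        h.update(f"\0{name}\0{kind}\0{size}".encode())
    return h.hexdigest()


def needs_relink(old_fingerprint: str, soname: str,
                 symbols: Iterable[SymbolInfo]) -> bool:
    """True if the dependency's exposed interface changed since the last link."""
    return interface_fingerprint(soname, symbols) != old_fingerprint
```

A build system could store the fingerprint next to each linked output and skip the link step when needs_relink returns False for every dependency, even if the dependency's code was rebuilt.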
> > >>>
> > >>> The proposed YAML format would be analogous to Apple’s .tbd format
> but
> > >>> differ in a few ways to support ELF object types. An example would be
> > >>> as follows:
> > >>>
> > >>> --- !tapi-tbe-v1
> > >>> soname: someobj.so
> > >>> architecture: aarch64
> > >>> symbols:
> > >>> - name: fish
> > >>>   type: object
> > >>>   size: 48
> > >>> - name: foobar
> > >>>   type: function
> > >>>   warning-text: "deprecated in SOMEOBJ_1.3"
> > >>> - name: printf
> > >>>   type: function
> > >>> - name: rndfunc
> > >>>   type: function
> > >>>   undefined: true
> > >>> ...
> > >>>
> > >>> (Note that this doesn’t account for version sets, but such
> > >>> functionality can be included in a later version.)
> > >>>
> > >>> Most of the fields are self-explanatory, with size not being relevant
> > >>> to function symbols, and warning text being purely optional. One
> > >>> reason this departs from .tbd format is to make diffs much easier:
> > >>> sorting symbols alphabetically on individual lines makes it much more
> > >>> obvious which symbols are added, removed, or modified. Despite the
> > >>> differences, the desire is for llvm-tapi to be structured such that
> > >>> integrating Apple’s Mach-O TAPI will be plausible and welcomed. Prior
> > >>> discussion [9] indicated interest in integrating Apple TAPI into
> LLVM,
> > >>> so I’d definitely like to leave that door open and encourage that in
> > >>> the future.
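
To illustrate the diff argument above, here is a small Python sketch that compares two stub texts line by line with the standard difflib module. The function name stub_diff and the file labels old.tbe and new.tbe are made up for the example; any line-based diff tool would show the same effect.

```python
import difflib
from typing import List


def stub_diff(old_text: str, new_text: str) -> List[str]:
    """Return only the added and removed lines between two stub texts."""
    diff = list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="old.tbe", tofile="new.tbe", lineterm="", n=0))
    # The first two entries are the "---"/"+++" file headers; hunk headers
    # start with "@@". Everything else is an added or removed line.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


OLD = """symbols:
- name: fish
  type: object
  size: 48
- name: printf
  type: function
"""

NEW = """symbols:
- name: fish
  type: object
  size: 64
- name: foobar
  type: function
- name: printf
  type: function
"""

if __name__ == "__main__":
    for line in stub_diff(OLD, NEW):
        print(line)
```

Because each symbol attribute sits on its own line and symbols are sorted by name, the output is limited to the changed size of fish and the new foobar entry.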
> > >>>
> > >>> I feel the best place to start this is as a library to best
> facilitate
> > >>> integration into other areas of LLVM, later wrapping it in a
> > >>> standalone tool and eventually considering direct integration into
> > >>> LLD. The tool will initially support basic generation of .tbe and
> stub
> > >>> files from .tbe or ELF. This should give enough functionality for
> > >>> manually checking shared object interface diffs, as well as having
> > >>> access to linkable stubs. The goal is for the tool to eventually
> > >>> provide additional functionality such as compatibility checking, but
> > >>> that’s a ways into the future.
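
One possible shape for such a compatibility check is sketched below in Python. This is purely an assumption about what "compatibility" could mean, since the proposal does not define it: here a new interface is flagged if it removes a symbol, changes a symbol's type, or changes an object's size. The name compatibility_issues is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Maps symbol name -> (type, size); size is None for functions.
Interface = Dict[str, Tuple[str, Optional[int]]]


def compatibility_issues(old: Interface, new: Interface) -> List[str]:
    """List reasons why binaries linked against `old` might break with `new`."""
    issues = []
    for name, (kind, size) in sorted(old.items()):
        if name not in new:
            issues.append(f"removed: {name}")
            continue
        new_kind, new_size = new[name]
        if new_kind != kind:
            issues.append(f"type changed: {name} ({kind} -> {new_kind})")
        elif kind == "object" and new_size != size:
            issues.append(f"size changed: {name} ({size} -> {new_size})")
    # Symbols that exist only in `new` are additions and are not reported.
    return issues
```

Under these assumed rules, adding a symbol is treated as compatible, while shrinking or growing an exported object is reported.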
> > >>>
> > >>> There are multiple options for integrating llvm-tapi with LLD:
> > >>> LLD could use llvm-tapi to produce and ingest .tbe files
> > >>> directly, or llvm-tapi could be used to produce stubs that LLD can be
> > >>> taught to use. From a technical standpoint, these are not mutually
> > >>> exclusive. This step is a ways down the road, but is definitely a
> > >>> high-priority goal.
> > >>>
> > >>> I’m interested to hear your thoughts and feedback on this.
> > >>>
> > >>> Best,
> > >>> Armando
> > >>>
> > >>>
> > >>> [1] https://github.com/ributzka/tapi
> > >>> [2] https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter2-22.html
> > >>> [3] https://docs.bazel.build/versions/master/user-manual.html#flag--interface_shared_objects
> > >>> [4] https://fuchsia.googlesource.com/zircon/+/master/scripts/shlib-symbols
> > >>> [5] https://fuchsia.googlesource.com/zircon/+/master/scripts/dso-abi.h
> > >>> [6] https://sourceware.org/libabigail/
> > >>> [7] https://sourceware.org/git/?p=glibc.git;a=blob;f=scripts/abilist.awk;h=bad7c3807e478e50e63c3834aa8969214bdd6f63;hb=HEAD
> > >>> [8] https://github.com/llvm-mirror/libcxx/blob/master/utils/sym_extract.py
> > >>> [9] http://lists.llvm.org/pipermail/cfe-dev/2018-April/thread.html#57576
> > >>> _______________________________________________
> > >>> LLVM Developers mailing list
> > >>> llvm-dev at lists.llvm.org
> > >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > >>
> >
>
>
>