[llvm-dev] [RFC] LLVM Busybox Proposal

Tue Jun 22 22:55:02 PDT 2021

On Tue, Jun 22, 2021 at 10:20 PM Petr Hosek <phosek at google.com> wrote:

> I guess this depends on a particular implementation of the distributed
> build system. In the case of Goma, we only supply the compiler binary which
> was invoked as the command (that binary links glibc as a shared library but
> we assume that one is supplied by the host system), all other files like
> headers are passed together with the compiler invocation as inputs. If we
> used dynamic linking, Goma would need to figure out what other shared
> libraries need to be sent to the server. It's certainly doable but it's
> extra complexity we would like to avoid.
>

Curious/fair enough - good to know!

>
> On Tue, Jun 22, 2021 at 10:09 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>> On Tue, Jun 22, 2021 at 10:00 PM Petr Hosek via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> From our perspective as a toolchain vendor, even if using shared
>>> libraries could get us closer to static linking in terms of performance,
>>> we'd still prefer static linking for the ease of distribution. Dealing with
>>> a single statically linked executable is much easier than dealing
>>> with multiple shared libraries. This is especially important in distributed
>>> compilation environments like Goma.
>>>
>>
>> What makes it especially complicated for distributed compilation
>> environments? (I'd expect a toolchain contains so many files that whether
>> it's one binary, or a binary and a handful of shared libraries wouldn't
>> change the general implementation complexity of a distributed build system?)
>>
>>
>>>
>>> When comparing performance between static and dynamic linking, I'd also
>>> recommend doing a comparison between binaries built with PGO+LTO. Plain -O3
>>> leaves a lot of performance on the table and as far as I'm aware, most
>>> toolchain vendors use PGO+LTO.
>>>
>>> On Tue, Jun 22, 2021 at 5:00 PM Fangrui Song via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> On 2021-06-22, Leonard Chan via llvm-dev wrote:
>>>> >Small update: I have a WIP prototype of the tool at
>>>> >https://reviews.llvm.org/D104686. The prototype only includes
>>>> >llvm-objcopy and llvm-objdump packed together, but we're seeing size
>>>> >benefits from busyboxing those two compared against having two separate
>>>> >tools. (More details in the prototype's description.) I don't plan on
>>>> >landing this as-is anytime soon and there are still some things I'd like
>>>> >to improve/change and get feedback on.
>>>> >
>>>> >To answer some replies:
>>>> >
>>>> >- Ideally, we could start off with an incremental approach and not
>>>> >package large tools like clang/lld off the bat. The llvm-* tools seem
>>>> >like a good place to start since they're generally a bunch of relatively
>>>> >small binaries that all share a subset of functions in libLLVM, but
>>>> >don't necessarily use all of libLLVM, so statically linking them
>>>> >together (with --gc-sections) can help dedup a lot of shared components
>>>> >vs having separate statically compiled tools. In my measurements, the
>>>> >busybox tool containing llvm-objcopy+objdump is negligibly larger than
>>>> >llvm-objdump on its own (a couple KB difference), indicating a lot of
>>>> >shared code between objdump and objcopy.
>>>> >
>>>> >- Will Dietz's multiplexing tool looks like a good place to start from.
>>>> >The only concern I can see though is mostly the amount of work needed to
>>>> >update it to LLVM 13.
>>>> >
>>>> >- We don't have plans for Windows support now, but it's not off the
>>>> >table. (Been mostly focusing on *nix for now.) Depending on overall
>>>> >traction for this idea, we could approach incrementally and add support
>>>> >for different platforms over time.
>>>>
>>>> -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on
>>>> -DLLVM_TARGETS_TO_BUILD=X86 (custom1)
>>>> vs
>>>> -DLLVM_TARGETS_TO_BUILD=X86 (custom2)
>>>>
>>>>
>>>> # This is the lower bound for any multiplexing approach. clang is the
>>>> # largest executable.
>>>> % stat -c %s /tmp/out/custom2/bin/clang-13
>>>> 102900408
>>>>
>>>> I have built clang, lld and a bunch of ELF binary utilities.
>>>>
>>>> % stat -c %s /tmp/out/custom1/lib/libLLVM-13git.so
>>>> /tmp/out/custom1/lib/libclang-cpp.so.13git
>>>> /tmp/out/custom1/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}}
>>>> | awk '{s+=$1}END{print s}'
>>>> 138896544
>>>>
>>>> % stat -c %s
>>>> /tmp/out/custom2/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}}
>>>> | awk '{s+=$1}END{print s}'
>>>> 209054440
>>>>
>>>>
>>>> The -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on build is
>>>> doing a really good job.
>>>>
>>>> A multiplexing approach can squeeze some bytes from 138896544 toward
>>>> 102900408,
>>>> but how much can it do?
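To put rough numbers on the question above, here is a small sketch that only subtracts the three sizes quoted in this message. The constant and function names are made up for illustration; the byte counts are the ones measured above, and the "headroom" figure assumes, as stated above, that clang-13 alone is the lower bound for a multiplexed binary.

```cpp
#include <cstdio>

// Byte counts copied from the measurements above; the names are illustrative.
constexpr long long StaticTotal = 209054440; // custom2: all tools linked statically
constexpr long long DylibTotal = 138896544;  // custom1: tools + libLLVM + libclang-cpp
constexpr long long ClangAlone = 102900408;  // custom2: clang-13 only (lower bound)

// Bytes the dylib build already saves over separate static tools.
constexpr long long dylibSavings() { return StaticTotal - DylibTotal; }

// Upper bound on what multiplexing could still save beyond the dylib build,
// assuming the multiplexed binary can never be smaller than clang-13 alone.
constexpr long long multiplexHeadroom() { return DylibTotal - ClangAlone; }

void printSizeSummary() {
  std::printf("dylib build saves %lld bytes; multiplexing could save at most "
              "%lld more\n",
              dylibSavings(), multiplexHeadroom());
}
```

So the dylib build already saves 70157896 bytes (roughly 70 MB), and a multiplexed binary could recover at most another 35996136 bytes (roughly 36 MB).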
>>>>
>>>>
>>>> >- I'm starting to think the `cl::opt` to `OptTable` issue might be
>>>> >orthogonal to the busybox implementation. The tool essentially
>>>> >dispatches to different "main" functions in different tools, but as long
>>>> >as we don't do anything within busybox after exiting that tool's main,
>>>> >then the global state issues we weren't sure of with `cl::opt` might not
>>>> >be of any concern now. It may be an issue down the line if, let's say,
>>>> >the tool flags moved from being "owned" by the tools themselves to
>>>> >instead being "owned" by busybox, and then we'd have to merge
>>>> >similarly-named flags together. In that case, migrating these tools to
>>>> >use `OptTable` may be necessary since (I think) `OptTable` should handle
>>>> >this. This may be a tedious task, but this is just to say that busybox
>>>> >won't need to be immediately blocked on it.
>>>>
>>>> Such improvement is useful even if we don't do multiplexing.
>>>> I switched llvm-symbolizer. thakis switched llvm-objdump.
>>>> I can look at some binary utilities.
>>>>
>>>> >- I haven't seen any issues with colliding symbols when linking
>>>> >(although I've only merged two tools for now). I suspect that with
>>>> >small-ish llvm-* tools, the bulk of their code is shared from libLLVM,
>>>> >and they have their own distinct logic built on top of it, which could
>>>> >mean a low chance of conflicting internal ABIs.
>>>> >
>>>> >On Mon, Jun 21, 2021 at 10:54 AM Leonard Chan <leonardchan at google.com>
>>>> >wrote:
>>>> >
>>>> >> Hello all,
>>>> >>
>>>> >> When building LLVM tools, including Clang and lld, it's currently
>>>> >> possible to use either static or shared linking for LLVM libraries.
>>>> >> The latter can significantly reduce the size of the toolchain since we
>>>> >> aren't duplicating the same code in every binary, but the dynamic
>>>> >> relocations can affect performance. The former doesn't affect
>>>> >> performance but significantly increases the size of our toolchain.
>>>> >>
>>>> >> We would like to implement support for a third approach which we
>>>> >> call, for lack of a better term, the "busybox" feature, where
>>>> >> everything is compiled into a single binary which then dispatches into
>>>> >> an appropriate tool depending on the first command. This approach can
>>>> >> significantly reduce the size by deduplicating all of the shared code
>>>> >> without affecting the performance.
>>>> >>
>>>> >> In terms of implementation, the build would produce a single binary
>>>> >> called `llvm` and the first command would identify the tool. For
>>>> >> example, instead of invoking `llvm-nm` you'd invoke `llvm nm`. Ideally
>>>> >> we would also support creation of an `llvm-nm` symlink which redirects
>>>> >> to `llvm` for backwards compatibility.
>>>> >> This functionality would ideally be implemented as an option in the
>>>> >> CMake build that toolchain vendors can opt into.
>>>> >>
>>>> >> The implementation would have to replace the `main` function of each
>>>> >> tool with a regular entrypoint function which is registered into a tool
>>>> >> registry. This could be wrapped in a macro for convenience. When the
>>>> >> "busybox" feature is disabled, the macro would expand to a `main`
>>>> >> function as before and redirect to the entrypoint function. When the
>>>> >> "busybox" feature is enabled, it would register the entrypoint function
>>>> >> into the registry, which would be responsible for the dispatching based
>>>> >> on the tool name. Ideally, toolchain maintainers would also be able to
>>>> >> control which tools they could add to the "busybox" binary via CMake
>>>> >> build options, so toolchains will only include the tools they use.
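For illustration, the macro-plus-registry scheme described above could look roughly like the sketch below. This is not the prototype's code: every name in it (`SKETCH_BUSYBOX`, `SKETCH_TOOL_ENTRY`, `ToolRegistrar`, `toolRegistry`, `busyboxMain`, and the two stand-in tool functions) is hypothetical, and the real tools would of course do actual work in their entrypoints.

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical build switch; in a real build this would come from CMake.
#define SKETCH_BUSYBOX 1

using ToolEntry = int (*)(int argc, char **argv);

// A function-local static sidesteps static-initialization-order problems
// between the registry and the registrar objects below.
inline std::map<std::string, ToolEntry> &toolRegistry() {
  static std::map<std::string, ToolEntry> Registry;
  return Registry;
}

// Constructing one of these at namespace scope registers a tool at startup.
struct ToolRegistrar {
  ToolRegistrar(const char *Name, ToolEntry Entry) {
    toolRegistry()[Name] = Entry;
  }
};

#if SKETCH_BUSYBOX
// Busybox build: register the entrypoint under the tool name.
#define SKETCH_TOOL_ENTRY(Name, Fn) static ToolRegistrar Fn##Registrar(Name, Fn);
#else
// Standalone build: expand to a plain `main` that forwards to the entrypoint.
#define SKETCH_TOOL_ENTRY(Name, Fn)                                            \
  int main(int argc, char **argv) { return Fn(argc, argv); }
#endif

// Stand-ins for two tools' former `main` functions.
static int nmMain(int argc, char **) {
  std::printf("nm: %d argument(s)\n", argc - 1);
  return 0;
}
static int objcopyMain(int argc, char **) {
  std::printf("objcopy: %d argument(s)\n", argc - 1);
  return 0;
}

SKETCH_TOOL_ENTRY("nm", nmMain)
SKETCH_TOOL_ENTRY("objcopy", objcopyMain)

// Dispatcher for `llvm <tool> [args...]`: the tool sees argv shifted by one,
// so its own argv[0] is the tool name.
int busyboxMain(int argc, char **argv) {
  if (argc < 2) {
    std::fprintf(stderr, "usage: llvm <tool> [args...]\n");
    return 1;
  }
  auto It = toolRegistry().find(argv[1]);
  if (It == toolRegistry().end()) {
    std::fprintf(stderr, "llvm: unknown tool '%s'\n", argv[1]);
    return 1;
  }
  return It->second(argc - 1, argv + 1);
}
```

In this sketch the busybox binary's real `main` would simply return `busyboxMain(argc, argv)`, so `llvm nm a.o` ends up in `nmMain` with `{"nm", "a.o"}`. Which tools get registered would then be a matter of which tool sources the build links in.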
>>>> >>
>>>> >> One implementation detail we think will be an issue is merging
>>>> >> arguments in individual tools that use `cl::opt`. `cl::opt` works by
>>>> >> maintaining a global state of flags, but we aren’t confident of what
>>>> >> the resulting behavior will be when merging them together in the
>>>> >> dispatching `main`. What we would like to avoid is having flags used by
>>>> >> one specific tool available on other tools. To address this issue, we
>>>> >> would like to migrate all tools to use `OptTable` which doesn't have
>>>> >> this issue and has been the general direction most tools have already
>>>> >> been moving in.
>>>> >>
>>>> >> A second issue would be resolving symlinks. For example, llvm-objcopy
>>>> >> will check argv[0] and behave as llvm-strip (i.e. use the right flags +
>>>> >> configuration) if it is called via a symlink that “looks like” a strip
>>>> >> tool, but for all other cases it will run under the default objcopy
>>>> >> mode. The “looks like” function is usually an `Is` function copied in
>>>> >> multiple tools that is essentially a substring check: so symlinks like
>>>> >> `llvm-strip`, `strip.exe`, and `gnu-llvm-strip-10` all result in using
>>>> >> the strip “mode” while all other names use the objcopy mode. To
>>>> >> replicate the same behavior, we will need to take great care in making
>>>> >> sure symlinks to the busybox tool dispatch correctly to the appropriate
>>>> >> llvm tool, which might mean exposing and merging these `Is` functions.
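As a rough illustration of the substring check described above, a shared helper might look like the sketch below. It is an approximation written with the standard library only, not the `Is` code from any LLVM tool; `looksLikeTool` and `selectObjcopyMode` are hypothetical names, and the rule that the match must not be followed by a letter or digit is an assumption made here so that a name such as `stripper` does not count as `strip`.

```cpp
#include <cctype>
#include <string>

// Returns true if the file name part of Argv0 "looks like" Tool.
// Tool is expected to be lowercase, e.g. "strip".
bool looksLikeTool(const std::string &Argv0, const std::string &Tool) {
  // Keep only the file name, dropping any directory part.
  std::string::size_type Slash = Argv0.find_last_of("/\\");
  std::string Stem =
      Slash == std::string::npos ? Argv0 : Argv0.substr(Slash + 1);

  // Compare case-insensitively by lowercasing the file name.
  for (char &C : Stem)
    C = static_cast<char>(std::tolower(static_cast<unsigned char>(C)));

  // Find the last occurrence of the tool name.
  std::string::size_type I = Stem.rfind(Tool);
  if (I == std::string::npos)
    return false;

  // Accept only if the match ends the name or is followed by a
  // non-alphanumeric character such as '-' or '.'.
  std::string::size_type End = I + Tool.size();
  return End == Stem.size() ||
         !std::isalnum(static_cast<unsigned char>(Stem[End]));
}

// Picks the mode for an objcopy-style tool from the name it was invoked as.
std::string selectObjcopyMode(const std::string &Argv0) {
  return looksLikeTool(Argv0, "strip") ? "strip" : "objcopy";
}
```

With this sketch, `llvm-strip`, `strip.exe`, and `gnu-llvm-strip-10` all select the strip mode, while `llvm-objcopy` falls through to the objcopy mode. A busybox dispatcher could run the same kind of check over its registered tool names when it is invoked through a symlink.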
>>>> >>
>>>> >> Some open questions:
>>>> >> - People's initial thoughts/opinions?
>>>> >> - Are there existing tools in LLVM that already do this?
>>>> >> - Other implementation details/global states that we would also need
>>>> >> to account for?
>>>> >>
>>>> >> - Leonard
>>>> >>
>>>>
>>>> >_______________________________________________
>>>> >LLVM Developers mailing list
>>>> >llvm-dev at lists.llvm.org
>>>> >https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>
>>