[llvm-dev] [RFC] LLVM Busybox Proposal

Tue Jun 22 22:09:42 PDT 2021

On Tue, Jun 22, 2021 at 10:00 PM Petr Hosek via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> From our perspective as a toolchain vendor, even if using shared libraries
> could get us closer to static linking in terms of performance, we'd still
> prefer static linking for the ease of distribution. Dealing with a single
> statically linked executable is much easier than dealing with multiple
> shared libraries. This is especially important in distributed compilation
> environments like Goma.
>

What makes it especially complicated for distributed compilation
environments? (I'd expect a toolchain to contain so many files already that
shipping one binary, versus a binary plus a handful of shared libraries,
wouldn't change the general implementation complexity of a distributed build
system.)

>
> When comparing performance between static and dynamic linking, I'd also
> recommend doing a comparison between binaries built with PGO+LTO. Plain -O3
> leaves a lot of performance on the table and as far as I'm aware, most
> toolchain vendors use PGO+LTO.
>
> On Tue, Jun 22, 2021 at 5:00 PM Fangrui Song via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On 2021-06-22, Leonard Chan via llvm-dev wrote:
>> >Small update: I have a WIP prototype of the tool at
>> >https://reviews.llvm.org/D104686. The prototype only includes
>> >llvm-objcopy and llvm-objdump packed together, but we're seeing size
>> >benefits from busyboxing those two compared against having two separate
>> >tools. (More details in the prototype's description.) I don't plan on
>> >landing this as-is anytime soon and there are still some things I'd like
>> >to improve/change and get feedback on.
>> >
>> >To answer some replies:
>> >
>> >- Ideally, we could start off with an incremental approach and not
>> >package large tools like clang/lld off the bat. The llvm-* tools seem
>> >like a good place to start since they're generally a bunch of relatively
>> >small binaries that all share a subset of functions in libLLVM, but
>> >don't necessarily use all of libLLVM, so statically linking them together
>> >(with --gc-sections) can help dedup a lot of shared components vs having
>> >separate statically compiled tools. In my measurements, the busybox tool
>> >containing llvm-objcopy+objdump is negligibly larger than llvm-objdump on
>> >its own (a couple KB difference) indicating a lot of shared code between
>> >objdump and objcopy.
>> >
>> >- Will Dietz's multiplexing tool looks like a good place to start from.
>> >The only concern I can see though is mostly the amount of work needed to
>> >update it to LLVM 13.
>> >
>> >- We don't have plans for Windows support now, but it's not off the
>> >table. (Been mostly focusing on *nix for now). Depending on overall
>> >traction for this idea, we could approach incrementally and add support
>> >for different platforms over time.
>>
>> -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on
>> -DLLVM_TARGETS_TO_BUILD=X86 (custom1)
>> vs
>> -DLLVM_TARGETS_TO_BUILD=X86 (custom2)
>>
>>
>> # This is the lower bound for any multiplexing approach. clang is the
>> # largest executable.
>> % stat -c %s /tmp/out/custom2/bin/clang-13
>> 102900408
>>
>> I have built clang, lld and a bunch of ELF binary utilities.
>>
>> % stat -c %s /tmp/out/custom1/lib/libLLVM-13git.so \
>>     /tmp/out/custom1/lib/libclang-cpp.so.13git \
>>     /tmp/out/custom1/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}} \
>>     | awk '{s+=$1}END{print s}'
>> 138896544
>>
>> % stat -c %s \
>>     /tmp/out/custom2/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}} \
>>     | awk '{s+=$1}END{print s}'
>> 209054440
>>
>>
>> The -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on build is doing
>> a really good job.
>>
>> A multiplexing approach can squeeze some bytes from 138896544 toward
>> 102900408, but how much can it do?
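To put the three totals above side by side, here is a small arithmetic sketch. The byte counts are copied from the quoted `stat` output; the constant and function names are only illustrative, and the "best case" assumes a multiplexed binary could be no smaller than static clang on its own, as stated in the quoted comment.

```cpp
#include <cstdint>

// Sizes in bytes, copied from the quoted stat output.
constexpr std::int64_t kClangStatic = 102900408; // custom2: clang-13 alone
constexpr std::int64_t kDylibTotal = 138896544;  // custom1: shared libs + tools
constexpr std::int64_t kStaticTotal = 209054440; // custom2: all tools, static

// Upper bound on what multiplexing could still save over the dylib build,
// assuming the combined binary cannot be smaller than static clang.
constexpr std::int64_t maxSavingOverDylib() {
  return kDylibTotal - kClangStatic;
}

// What the dylib build already saves over separate statically linked tools.
constexpr std::int64_t dylibSavingOverStatic() {
  return kStaticTotal - kDylibTotal;
}

static_assert(maxSavingOverDylib() == 35996136, "138896544 - 102900408");
static_assert(dylibSavingOverStatic() == 70157896, "209054440 - 138896544");
```

So the remaining headroom is at most about 36 MB, roughly 26% of the dylib build's total, while the dylib build has already removed about 70 MB relative to the fully static set.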
>>
>>
>> >- I'm starting to think the `cl::opt` to `OptTable` issue might be
>> >orthogonal to the busybox implementation. The tool essentially dispatches
>> >to different "main" functions in different tools, but as long as we don't
>> >do anything within busybox after exiting that tool's main, then the
>> >global state issues we weren't sure of with `cl::opt` might not be of any
>> >concern now. It may be an issue down the line if, let's say, the tool
>> >flags moved from being "owned" by the tools themselves to instead being
>> >"owned" by busybox, and then we'd have to merge similarly-named flags
>> >together. In that case, migrating these tools to use `OptTable` may be
>> >necessary since (I think) `OptTable` should handle this. This may be a
>> >tedious task, but this is just to say that busybox won't need to be
>> >immediately blocked on it.
>>
>> Such improvement is useful even if we don't do multiplexing.
>> I switched llvm-symbolizer. thakis switched llvm-objdump.
>> I can look at some binary utilities.
>>
>> >- I haven't seen any issues with colliding symbols when linking (although
>> >I've only merged two tools for now). I suspect that with small-ish llvm-*
>> >tools, the bulk of their code is shared from libLLVM, and they have their
>> >own distinct logic built on top of it, which could mean a low chance of
>> >conflicting internal ABIs.
>> >
>> >On Mon, Jun 21, 2021 at 10:54 AM Leonard Chan <leonardchan at google.com>
>> >wrote:
>> >
>> >> Hello all,
>> >>
>> >> When building LLVM tools, including Clang and lld, it's currently
>> >> possible to use either static or shared linking for LLVM libraries. The
>> >> latter can significantly reduce the size of the toolchain since we
>> >> aren't duplicating the same code in every binary, but the dynamic
>> >> relocations can affect performance. The former doesn't affect
>> >> performance but significantly increases the size of our toolchain.
>> >>
>> >> We would like to implement support for a third approach which we call,
>> >> for lack of a better term, the "busybox" feature, where everything is
>> >> compiled into a single binary which then dispatches into an appropriate
>> >> tool depending on the first command. This approach can significantly
>> >> reduce the size by deduplicating all of the shared code without
>> >> affecting the performance.
>> >>
>> >> In terms of implementation, the build would produce a single binary
>> >> called `llvm` and the first command would identify the tool. For
>> >> example, instead of invoking `llvm-nm` you'd invoke `llvm nm`. Ideally
>> >> we would also support creation of an `llvm-nm` symlink which redirects
>> >> to `llvm` for backwards compatibility.
>> >> This functionality would ideally be implemented as an option in the
>> >> CMake build that toolchain vendors can opt into.
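A minimal sketch of the name resolution described above, assuming the simplest possible rule: a symlink named `llvm-<tool>` selects `<tool>`, and a plain `llvm` takes the tool from the first argument. `resolveToolName` is a hypothetical helper, not something taken from the prototype, and it deliberately ignores the fuzzier "looks like" matching some tools do on argv[0].

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the tool name ("nm" for `llvm nm` or for an
// `llvm-nm` symlink), or an empty string if no tool can be identified.
std::string resolveToolName(const std::vector<std::string> &args) {
  if (args.empty())
    return "";

  // Drop any directory component from argv[0].
  std::string base = args[0];
  const auto slash = base.find_last_of("/\\");
  if (slash != std::string::npos)
    base = base.substr(slash + 1);

  // Symlink form: `llvm-nm` selects "nm".
  const std::string prefix = "llvm-";
  if (base.size() > prefix.size() &&
      base.compare(0, prefix.size(), prefix) == 0)
    return base.substr(prefix.size());

  // Multiplexed form: `llvm nm ...` takes the tool from the first argument.
  if (base == "llvm" && args.size() > 1)
    return args[1];

  return "";
}
```

With this rule, `llvm nm a.o` and `/usr/bin/llvm-nm a.o` both resolve to `nm`, while a bare `llvm` resolves to nothing and would presumably print usage.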
>> >>
>> >> The implementation would have to replace the `main` function of each
>> >> tool with a regular entrypoint function which is registered into a tool
>> >> registry. This could be wrapped in a macro for convenience. When the
>> >> "busybox" feature is disabled, the macro would expand to a `main`
>> >> function as before and redirect to the entrypoint function. When the
>> >> "busybox" feature is enabled, it would register the entrypoint function
>> >> into the registry, which would be responsible for the dispatching based
>> >> on the tool name. Ideally, toolchain maintainers would also be able to
>> >> control which tools they could add to the "busybox" binary via CMake
>> >> build options, so toolchains will only include the tools they use.
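One way the macro and registry described above might look, as a rough sketch only. `BUSYBOX_SKETCH`, `TOOL_ENTRYPOINT`, `toolRegistry`, and `dispatch` are all made-up names for illustration (in a real build the switch would presumably come from CMake), and the two demo tools just return distinct exit codes.

```cpp
#include <map>
#include <string>

#define BUSYBOX_SKETCH 1 // hypothetical stand-in for a build-system define

using ToolEntry = int (*)(int argc, char **argv);

// Function-local static so the map exists before any registration runs.
std::map<std::string, ToolEntry> &toolRegistry() {
  static std::map<std::string, ToolEntry> registry;
  return registry;
}

// Registers a tool as a side effect of static initialization.
struct ToolRegistration {
  ToolRegistration(const char *name, ToolEntry entry) {
    toolRegistry()[name] = entry;
  }
};

#if BUSYBOX_SKETCH
// Busybox mode: declare the entrypoint and register it under its name.
#define TOOL_ENTRYPOINT(name)                                                  \
  int name##_main(int, char **);                                               \
  static ToolRegistration name##_registration(#name, name##_main);            \
  int name##_main(int argc, char **argv)
#else
// Standalone mode: emit a real main that forwards to the entrypoint.
#define TOOL_ENTRYPOINT(name)                                                  \
  int name##_main(int, char **);                                               \
  int main(int argc, char **argv) { return name##_main(argc, argv); }         \
  int name##_main(int argc, char **argv)
#endif

// Two stand-in tools; real tools would hold their former main bodies here.
TOOL_ENTRYPOINT(nm) { (void)argc; (void)argv; return 11; }
TOOL_ENTRYPOINT(objcopy) { (void)argc; (void)argv; return 22; }

// Looks up the tool by name and runs it; 127 signals an unknown tool.
int dispatch(const std::string &tool, int argc, char **argv) {
  const auto it = toolRegistry().find(tool);
  if (it == toolRegistry().end())
    return 127;
  return it->second(argc, argv);
}
```

Because the dispatcher returns straight from the selected entrypoint, nothing runs in the multiplexer after the tool's own "main" exits, which is the property the follow-up message relies on.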
>> >>
>> >> One implementation detail we think will be an issue is merging
>> >> arguments in individual tools that use `cl::opt`. `cl::opt` works by
>> >> maintaining a global state of flags, but we aren’t confident of what the
>> >> resulting behavior will be when merging them together in the
>> >> dispatching `main`. What we would like to avoid is having flags used by
>> >> one specific tool available on other tools. To address this issue, we
>> >> would like to migrate all tools to use `OptTable`, which doesn't have
>> >> this issue and is the direction most tools have already been moving in.
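To make the concern concrete, here is a toy model of a process-wide flag registry. It is not LLVM's `cl::opt` implementation; it only imitates the general idea that flags register themselves in global state, so linking two tools into one binary makes every flag visible everywhere and lets same-named flags collide. All names and flag strings below are invented for the example.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a process-wide flag table (NOT LLVM's cl::opt).
std::set<std::string> &globalFlags() {
  static std::set<std::string> flags;
  return flags;
}

// Registers one tool's flags and returns how many names were already taken
// by a previously registered tool.
int registerToolFlags(const std::vector<std::string> &names) {
  int collisions = 0;
  for (const std::string &name : names)
    if (!globalFlags().insert(name).second)
      ++collisions;
  return collisions;
}

// True if any tool linked into the binary registered this flag.
bool flagVisible(const std::string &name) {
  return globalFlags().count(name) != 0;
}
```

Registering a first tool with `{"demangle", "format"}` and a second with `{"strip-all", "format"}` reports one collision on `format`, and `strip-all` becomes visible to the first tool as well. A per-tool option table avoids both effects because each tool parses against its own table.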
>> >>
>> >> A second issue would be resolving symlinks. For example, llvm-objcopy
>> >> will check argv[0] and behave as llvm-strip (i.e. use the right flags +
>> >> configuration) if it is called via a symlink that “looks like” a strip
>> >> tool, but for all other cases it will run under the default objcopy
>> >> mode. The “looks like” function is usually an `Is` function copied in
>> >> multiple tools that is essentially a substring check: so symlinks like
>> >> `llvm-strip`, `strip.exe`, and `gnu-llvm-strip-10` all result in using
>> >> the strip “mode” while all other names use the objcopy mode. To
>> >> replicate the same behavior, we will need to take great care in making
>> >> sure symlinks to the busybox tool dispatch correctly to the appropriate
>> >> llvm tool, which might mean exposing and merging these `Is` functions.
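For reference, a standalone approximation of that kind of "looks like" check, written against the standard library rather than LLVM's own string utilities. `looksLikeTool` is a hypothetical name and the exact matching rule (last occurrence of the tool name, followed by end of string or a non-alphanumeric character) is my assumption of what such a check would need to accept the three example names; the in-tree `Is` helpers may differ in detail.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical approximation of an `Is`-style check on argv[0].
// `tool` is expected in lowercase, e.g. "strip".
bool looksLikeTool(std::string argv0, const std::string &tool) {
  // Keep only the file name.
  const auto slash = argv0.find_last_of("/\\");
  if (slash != std::string::npos)
    argv0.erase(0, slash + 1);

  // Compare case-insensitively by lowercasing the file name.
  std::transform(argv0.begin(), argv0.end(), argv0.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });

  // Find the last occurrence of the tool name.
  const auto pos = argv0.rfind(tool);
  if (pos == std::string::npos)
    return false;

  // Accept "strip", "strip.exe", "strip-10", but not "stripper".
  const auto end = pos + tool.size();
  return end == argv0.size() ||
         !std::isalnum(static_cast<unsigned char>(argv0[end]));
}
```

A busybox dispatcher would need to run a check of this shape for every mode-switching tool before falling back to exact name lookup, which is where merging the per-tool copies would come in.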
>> >>
>> >> Some open questions:
>> >> - People's initial thoughts/opinions?
>> >> - Are there existing tools in LLVM that already do this?
>> >> - Other implementation details/global states that we would also need to
>> >> account for?
>> >>
>> >> - Leonard
>> >>
>>
>> >_______________________________________________
>> >LLVM Developers mailing list
>> >llvm-dev at lists.llvm.org
>> >https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>