[PATCH] D130975: --thinlto-full-index to index both native and bitcode objects

Wei Wang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 2 14:18:38 PDT 2022


weiwang added a comment.

Thanks for the quick feedback!

In D130975#3694180 <https://reviews.llvm.org/D130975#3694180>, @MaskRay wrote:

> I considered this approach. I went with D130229 <https://reviews.llvm.org/D130229> as I have thought of some minor issues:
>
> - The requires `clang -nostdlib ...`
> - An archive member cannot be serialized as `file->getName()`. Actually lld doesn't support syntax reading just an archive member instead of the whole archive. As an example, glibc `libc.so` reads `libc_nonshared.a` which contains `elf-init.oS` and the patch will incorrectly print `elf-init.oS`. This approach assuredly needs an option doing something similar to --remapping-file= in spirit, but more involved.
> - A build system needs to recognize all input files, and remove them for the final native link. This is nearly mostly feasible with Bazel/Buck style build systems, but not with others. Even in Bazel/Buck, I don't think `linkopts = ["file_added_as_an_option_instead_of_src_or_deps.o"]` can be recognized.
>
> The above is off the top of my head.

Your concerns are valid. I encountered them as well during local testing.

- Our internal build system (buck) applies `-nostdlib` throughout linking, so this requirement is satisfied.
- There are no direct support to extract specific members from archive. We are updating build system to make all regular archives into thin archive (`--[start|end]-lib`). With that done, archive members can be accessed directly. And you are right about `elf-init.oS`. During my local testing, I had to extract it manually, so some work needs to be done here. In a more general approach, formatting the index with ("archivename", "object") pair and using a separate flag to parse the index (like "--remapping-file=") is also doable.
- The index file contains line separated input file list, and can be used directly as response file in the final link (I should've just used `@thinlto.index.full` in the test case provided).

> And we lose opportunities to fix some problems for free:
>
> - If we are to support bitcode files in archives, we will need more options.
> - This still misses an opportunity to properly fix "libcall symbols defined in bitcode member of an archive".
>
> I wish that you can give a fair consideration of D130229 <https://reviews.llvm.org/D130229> and not simply dismiss it with "it differs from in-process ThinLTO, it's not right."

The concern we have is not about which linking order is right or wrong, it's rather making it consistent with its local in-process thinlto counterpart. We are kind of late in the game of distributed thinlto and we have large scale of services (20K inputs for linking) running with local thinlto for years. We can't guarantee that changing the resolution order of inputs from local to distributed thinlto won't make a difference in binary runtime behavior, and it would be a big headache to debug after everything goes production. So our current approach is to

1. Make sure that distributed thinlto thinlink uses the same inputs and order as local thinlto. This is more of the build system level change.
2. Make sure that distributed thinlto final link uses the same resolution order as thinlink with the help of this new option.
3. Make sure that binaries generated from both local and distributed thinlink matches at per symbol level.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130975/new/

https://reviews.llvm.org/D130975



More information about the llvm-commits mailing list