[all-commits] [llvm/llvm-project] 3d8542: [ELF] Parse archives as --start-lib object files

Fangrui Song via All-commits all-commits at lists.llvm.org
Tue Feb 15 09:38:14 PST 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 3d85424096ff1e20ca735cbe455870cea7ed098f
      https://github.com/llvm/llvm-project/commit/3d85424096ff1e20ca735cbe455870cea7ed098f
  Author: Fangrui Song <i at maskray.me>
  Date:   2022-02-15 (Tue, 15 Feb 2022)

  Changed paths:
    M lld/ELF/Driver.cpp
    M lld/ELF/Driver.h
    M lld/ELF/InputFiles.cpp
    M lld/ELF/InputFiles.h
    M lld/ELF/MapFile.cpp
    M lld/ELF/Symbols.cpp
    M lld/ELF/Symbols.h
    A lld/test/ELF/archive-as-start-lib.s
    M lld/test/ELF/archive-no-index.s
    M lld/test/ELF/archive-thin-missing-member.s
    M lld/test/ELF/incompatible-ar-first.s
    M lld/test/ELF/incompatible.s
    M lld/test/ELF/lto/comdat-mixed-archive.test
    M lld/test/ELF/lto/exclude-libs-libcall.ll
    M lld/test/ELF/no-obj.s
    M lld/test/ELF/trace-symbols.s

  Log Message:
  -----------
  [ELF] Parse archives as --start-lib object files

https://maskray.me/blog/2022-01-16-archives-and-start-lib

For every definition in an extracted archive member, we intern the symbol twice:
once for the archive index entry and once for the .o symbol table after
extraction. This is inefficient.

Symbols in a --start-lib ObjFile/BitcodeFile are only interned once because the
result is cached in symbols[i].
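
The difference in interning cost can be sketched with a toy model. This is not lld's code: `ToySymtab`, `lookupsViaArchiveIndex`, and `lookupsViaLazyObject` are hypothetical names made up for illustration, and the only thing modeled is the number of hash-table lookups per defined symbol.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a linker symbol table; not lld's SymbolTable.
struct ToySymtab {
  std::unordered_map<std::string, int> ids;
  std::size_t lookups = 0; // number of hash-table lookups performed

  // Return a stable id for `name`, inserting it on first sight.
  int intern(const std::string &name) {
    ++lookups;
    int next = static_cast<int>(ids.size());
    return ids.emplace(name, next).first->second;
  }
};

// Archive-index scheme: intern each name from the archive index, then intern
// it again from the member's own symbol table once the member is extracted.
std::size_t lookupsViaArchiveIndex(const std::vector<std::string> &defs) {
  ToySymtab symtab;
  for (const std::string &name : defs)
    symtab.intern(name); // archive index entries
  for (const std::string &name : defs)
    symtab.intern(name); // .o symbol table after extraction
  return symtab.lookups;
}

// --start-lib style: intern once while the file is still lazy and cache the
// result per symbol index; extraction reuses the cache.
std::size_t lookupsViaLazyObject(const std::vector<std::string> &defs) {
  ToySymtab symtab;
  std::vector<int> symbols(defs.size()); // plays the role of symbols[i]
  for (std::size_t i = 0; i < defs.size(); ++i)
    symbols[i] = symtab.intern(defs[i]);
  // "Extraction": walk the cached ids without touching the hash table.
  for (std::size_t i = 0; i < symbols.size(); ++i)
    assert(static_cast<std::size_t>(symbols[i]) < symtab.ids.size());
  return symtab.lookups;
}
```

For a member defining three symbols, the first function performs six lookups and the second performs three. In this model the saving only materializes for members that are actually extracted; an unextracted member costs one lookup per definition either way.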

Handle an archive using the --start-lib code path instead. We can therefore
remove ArchiveFile and LazyArchive. For many projects, the archive member
extraction ratio is high, so this is a net performance win. Linking a Release
build of clang is 1.01x as fast.

Note: --start-lib scans symbols in the same order that llvm-ar adds them to the
index, so in the common case the semantics should be identical. If the archive
symbol table was created in a different order, or is incomplete, this strategy
may have different semantics. Such cases are considered user error.
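
A minimal sketch of how the two strategies can diverge, assuming a simplified "first definition seen wins" rule. The types and functions here (`Member`, `IndexEntry`, `resolveViaIndex`, `resolveViaMemberScan`) are hypothetical and do not correspond to lld's data structures; an empty string stands for "no member found".

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of an archive member: its name and the symbols it defines.
struct Member {
  std::string name;
  std::vector<std::string> defs;
};

// Hypothetical archive index entry: (symbol name, defining member name).
typedef std::pair<std::string, std::string> IndexEntry;

// Index-driven lookup: the first index entry naming `sym` decides the member.
std::string resolveViaIndex(const std::vector<IndexEntry> &index,
                            const std::string &sym) {
  for (const IndexEntry &entry : index)
    if (entry.first == sym)
      return entry.second;
  return ""; // symbol absent from the index
}

// --start-lib style lookup: scan members in archive order, then each member's
// symbols in symbol-table order; the first definition decides the member.
std::string resolveViaMemberScan(const std::vector<Member> &members,
                                 const std::string &sym) {
  for (const Member &member : members)
    for (const std::string &def : member.defs)
      if (def == sym)
        return member.name;
  return ""; // no member defines the symbol
}
```

Take members `a.o` (defining `foo`) and `b.o` (defining `foo` and `bar`) with an index that lists only `foo -> b.o`. The index-driven lookup picks `b.o` for `foo` and finds nothing for `bar`, while the member scan picks `a.o` for `foo` and `b.o` for `bar`. When the index lists every definition in member order, both functions agree.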

The `is neither ET_REL nor LLVM bitcode` error is changed to a warning.
Previously, an archive could contain such members without any diagnostic, so
using a warning avoids breaking links that used to succeed.

* For some tests, the diagnostics are improved because they now include the
  archive member name, which was previously omitted: `b.a:` => `b.a(b.o):`.
* `no-obj.s`: the link is now allowed, matching GNU ld.
* `archive-no-index.s`: the `is neither ET_REL nor LLVM bitcode` diagnostic is
  demoted to a warning.
* `incompatible.s`: even when an archive member is not extracted, we may report
  an "incompatible with" error.

---

I recently decreased sizeof(SymbolUnion) by 8 and decreased memory usage quite a
bit, so retaining `symbols` for un-extracted archive members should not cause a
memory usage problem.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D119074



