[all-commits] [llvm/llvm-project] 3fcb0e: [lld-macho] Emit STABS symbols for debugging, and ...

Jez Ng via All-commits all-commits at lists.llvm.org
Tue Dec 1 15:05:44 PST 2020


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 3fcb0eeb152beb4320c7632bcfa2b1e7c2e5ca00
      https://github.com/llvm/llvm-project/commit/3fcb0eeb152beb4320c7632bcfa2b1e7c2e5ca00
  Author: Jez Ng <jezng at fb.com>
  Date:   2020-12-01 (Tue, 01 Dec 2020)

  Changed paths:
    M lld/MachO/CMakeLists.txt
    A lld/MachO/Dwarf.cpp
    A lld/MachO/Dwarf.h
    M lld/MachO/InputFiles.cpp
    M lld/MachO/InputFiles.h
    M lld/MachO/InputSection.h
    M lld/MachO/OutputSegment.h
    M lld/MachO/SyntheticSections.cpp
    M lld/MachO/SyntheticSections.h
    M lld/MachO/Writer.cpp
    A lld/test/MachO/stabs.s

  Log Message:
  -----------
  [lld-macho] Emit STABS symbols for debugging, and drop debug sections

Debug sections contain a large amount of data. In order not to bloat the size
of the final binary, we remove them and instead emit STABS symbols for
`dsymutil` and the debugger to locate their contents in the object files.

With this diff, `dsymutil` is able to locate the debug info. However, we need
a few more features before `lldb` is able to work well with our binaries --
e.g. having `LC_DYSYMTAB` accurately reflect the number of local symbols,
emitting `LC_UUID`, and more. Those will be handled in follow-up diffs.

Note also that the STABS we emit differ slightly from what ld64 does. First, we
emit the path to the source file as one `N_SO` symbol instead of two. (`ld64`
emits one `N_SO` for the dirname and one for the basename.) Second, we do not
emit `N_BNSYM` and `N_ENSYM` STABS to mark the start and end of functions,
because the `N_FUN` STABS already serve that purpose. @clayborg recommended
these changes based on his knowledge of what the debugging tools look for.
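
As a rough illustration of the layout described above, the sketch below builds
the STABS for a single function in a single object file. The `Stab` struct and
`stabsForFunction` are hypothetical simplifications, not lld's actual types;
the stab type values are the ones defined in `<mach-o/stab.h>`. The zeroed
`N_OSO` value and the closing empty `N_SO` are assumptions about the
conventional layout rather than details stated in this message.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stab type values as defined in <mach-o/stab.h>.
constexpr uint8_t N_FUN = 0x24;
constexpr uint8_t N_SO = 0x64;
constexpr uint8_t N_OSO = 0x66;

// Hypothetical, simplified stand-in for an nlist_64 entry. The name is kept
// as a string here instead of an offset into the string table.
struct Stab {
  uint8_t type;
  std::string name;
  uint64_t value;
};

// Builds the stabs for one function defined in one object file.
std::vector<Stab> stabsForFunction(const std::string &sourcePath,
                                   const std::string &objectPath,
                                   const std::string &funcName,
                                   uint64_t addr, uint64_t size) {
  return {
      {N_SO, sourcePath, 0},   // a single N_SO holding the full source path
      {N_OSO, objectPath, 0},  // object file that contains the debug info
      {N_FUN, funcName, addr}, // start of the function
      {N_FUN, "", size},       // end of the function: empty name, size as value
      {N_SO, "", 0},           // assumed terminator for the compilation unit
  };
}
```

Note that no `N_BNSYM` or `N_ENSYM` entries appear: the pair of `N_FUN` stabs
brackets the function on its own.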

Additionally, the current implementation doesn't accurately reflect the size
of function symbols. It uses the size of their containing sections as a proxy,
but that is only accurate if `.subsections_via_symbols` is set, and if there
isn't an `N_ALT_ENTRY` in that particular subsection. I think we have two
options to solve this:

1. We can split up subsections by symbol even if `.subsections_via_symbols`
   is not set, but include constraints to ensure those subsections retain
   their order in the final output. This is `ld64`'s approach.
2. We could just add a `size` field to our `Symbol` class. This seems simpler,
   and I'm more inclined toward it, but I'm not sure if there are use cases
   that it doesn't handle well. As such I'm punting on the decision for now.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D89257


  Commit: 51629abce0e2f9d1376eb0b5070532a2bbec6766
      https://github.com/llvm/llvm-project/commit/51629abce0e2f9d1376eb0b5070532a2bbec6766
  Author: Jez Ng <jezng at fb.com>
  Date:   2020-12-01 (Tue, 01 Dec 2020)

  Changed paths:
    M lld/MachO/SyntheticSections.cpp
    M lld/MachO/SyntheticSections.h
    M lld/MachO/Writer.cpp
    M lld/test/MachO/stabs.s
    M lld/test/MachO/subsections-symbol-relocs.s
    M lld/test/MachO/symtab.s
    M lld/test/MachO/tlv.s

  Log Message:
  -----------
  [lld-macho] Emit local symbols in symtab; record metadata in LC_DYSYMTAB

Symbols of the same type must be laid out contiguously: following ld64's
lead, we choose to emit all local symbols first, then external symbols,
and finally undefined symbols. For each symbol type, the LC_DYSYMTAB
load command will record the range (start index and total number) of
those symbols in the symbol table.
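
The ordering and range bookkeeping described above might be sketched as
follows. `Sym`, `SymKind`, `DysymtabRanges`, and `layoutSymbols` are
hypothetical names used only for illustration; the six range fields are named
after the corresponding members of `dysymtab_command` in `<mach-o/loader.h>`.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Enumerator order matches the emission order: local, external, undefined.
enum class SymKind { Local, External, Undefined };

struct Sym {
  std::string name;
  SymKind kind;
};

// Field names mirror those of dysymtab_command.
struct DysymtabRanges {
  uint32_t ilocalsym, nlocalsym;   // local symbols
  uint32_t iextdefsym, nextdefsym; // defined external symbols
  uint32_t iundefsym, nundefsym;   // undefined symbols
};

// Groups the symbols by kind (keeping their relative order within a kind)
// and returns the start index and count of each group.
DysymtabRanges layoutSymbols(std::vector<Sym> &syms) {
  std::stable_sort(syms.begin(), syms.end(),
                   [](const Sym &a, const Sym &b) { return a.kind < b.kind; });
  auto count = [&](SymKind k) {
    return static_cast<uint32_t>(std::count_if(
        syms.begin(), syms.end(), [k](const Sym &s) { return s.kind == k; }));
  };
  DysymtabRanges r{};
  r.ilocalsym = 0;
  r.nlocalsym = count(SymKind::Local);
  r.iextdefsym = r.nlocalsym;
  r.nextdefsym = count(SymKind::External);
  r.iundefsym = r.iextdefsym + r.nextdefsym;
  r.nundefsym = count(SymKind::Undefined);
  return r;
}
```

Since STABS entries are local symbols, they would fall in the first range, so
`nlocalsym` is nonzero whenever any are emitted.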

This work was motivated by the fact that LLDB won't search for debug
info if LC_DYSYMTAB says there are no local symbols (since STABS symbols
are all local symbols). With this change, LLDB is now able to display
the source lines at a given breakpoint when debugging our binaries.

Some tests had to be updated due to local symbol names now appearing in
`llvm-objdump`'s output.

Reviewed By: #lld-macho, smeenai, clayborg

Differential Revision: https://reviews.llvm.org/D89285


  Commit: d0c4be42e35d8cff069f91a45b76ea24187c233d
      https://github.com/llvm/llvm-project/commit/d0c4be42e35d8cff069f91a45b76ea24187c233d
  Author: Jez Ng <jezng at fb.com>
  Date:   2020-12-01 (Tue, 01 Dec 2020)

  Changed paths:
    M lld/MachO/SyntheticSections.h
    M lld/test/MachO/symtab.s

  Log Message:
  -----------
  [lld-macho] Emit empty string as first entry of string table

ld64 emits string tables which start with a space and a zero byte. We
match its behavior here since some tools depend on it.
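
A minimal sketch of a string table with this prefix is shown below. The
`StringTable` class is a hypothetical illustration, not lld's implementation.
Because the byte at offset 1 is a zero, a name that starts at that offset
reads as the empty string.

```cpp
#include <cstdint>
#include <string>

class StringTable {
public:
  // Appends a NUL-terminated string and returns the offset it starts at.
  uint32_t add(const std::string &s) {
    uint32_t offset = static_cast<uint32_t>(data.size());
    data += s;
    data.push_back('\0');
    return offset;
  }

  const std::string &bytes() const { return data; }

private:
  // The table starts with a space followed by a zero byte, so the first
  // real string lands at offset 2.
  std::string data = std::string(" \0", 2);
};
```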

The rationale is similar to that of {D89561}.

Reviewed By: #lld-macho, smeenai

Differential Revision: https://reviews.llvm.org/D89639


  Commit: b768d57b368781e6737c403e425bd835850f3a0a
      https://github.com/llvm/llvm-project/commit/b768d57b368781e6737c403e425bd835850f3a0a
  Author: Jez Ng <jezng at fb.com>
  Date:   2020-12-01 (Tue, 01 Dec 2020)

  Changed paths:
    M lld/MachO/Driver.cpp
    M lld/MachO/Driver.h
    M lld/MachO/DriverUtils.cpp
    M lld/MachO/InputFiles.cpp
    M lld/MachO/InputFiles.h
    M lld/MachO/LTO.cpp
    M lld/MachO/SyntheticSections.cpp
    M lld/test/MachO/stabs.s

  Log Message:
  -----------
  [lld-macho] Add archive name and file modtime to STABS output

We should also set the modtime when running LTO. That will be done in a
future diff, together with support for the `-object_path_lto` flag.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D91318


  Commit: 78f6498cdcdb5a7644b1c32615cfe2fdfd9c2545
      https://github.com/llvm/llvm-project/commit/78f6498cdcdb5a7644b1c32615cfe2fdfd9c2545
  Author: Jez Ng <jezng at fb.com>
  Date:   2020-12-01 (Tue, 01 Dec 2020)

  Changed paths:
    M lld/MachO/InputFiles.cpp
    M lld/MachO/InputFiles.h
    M lld/MachO/SyntheticSections.cpp
    M lld/MachO/SyntheticSections.h
    M lld/test/MachO/stabs.s

  Log Message:
  -----------
  [lld-macho] Flesh out STABS implementation

This addresses a lot of the comments in {D89257}. Ideally it'd have been
done in the same diff, but the commits in between make that difficult.

This diff:
* Implements N_GSYM and N_STSYM, the STABS for global and static symbols
* Makes the STABS reflect the section IDs of their referent symbols
* Ensures we don't fail when encountering absolute symbols or files with
  no debug info
* Sorts STABS symbols by file to minimize the number of N_OSO entries
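
To see why sorting by file reduces the number of N_OSO entries, consider the
sketch below. It assumes one N_OSO is emitted each time the defining object
file changes between consecutive symbols. `DebugSym`, `fileId`,
`countOsoEntries`, and `sortByFile` are hypothetical names for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct DebugSym {
  std::string name;
  int fileId; // hypothetical ID of the object file that defines the symbol
};

// Counts the N_OSO entries needed if one is emitted whenever the object
// file differs from that of the previous symbol.
std::size_t countOsoEntries(const std::vector<DebugSym> &syms) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < syms.size(); ++i)
    if (i == 0 || syms[i].fileId != syms[i - 1].fileId)
      ++n;
  return n;
}

// Groups symbols from the same file together, preserving their relative
// order within each file.
void sortByFile(std::vector<DebugSym> &syms) {
  std::stable_sort(
      syms.begin(), syms.end(),
      [](const DebugSym &a, const DebugSym &b) { return a.fileId < b.fileId; });
}
```

With symbols interleaved across two files as 1, 2, 1, 2, this scheme needs
four N_OSO entries before sorting and two afterwards.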

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D92366


  Commit: c7dbaec396ef98b8bc6acb7631d2919449986add
      https://github.com/llvm/llvm-project/commit/c7dbaec396ef98b8bc6acb7631d2919449986add
  Author: Jez Ng <jezng at fb.com>
  Date:   2020-12-01 (Tue, 01 Dec 2020)

  Changed paths:
    M lld/MachO/InputSection.cpp
    M lld/MachO/InputSection.h
    M lld/MachO/SyntheticSections.cpp
    M lld/test/MachO/stabs.s

  Log Message:
  -----------
  [lld-macho] Add isCodeSection()

This is the same logic that ld64 uses to determine which sections
contain functions. This was added so that we could determine which
STABS entries should be N_FUN.
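
A check of this kind might look roughly like the sketch below. It is not a
transcription of ld64's or lld's logic: the `Section` struct is hypothetical,
and the specific conditions (the accepted section types, the
pure-instructions attribute test, and the two special-cased `__TEXT` section
names) are assumptions. The flag constants are the values defined in
`<mach-o/loader.h>`.

```cpp
#include <cstdint>
#include <string>

// Values from <mach-o/loader.h>.
constexpr uint32_t SECTION_TYPE = 0x000000ff;
constexpr uint32_t S_REGULAR = 0x0;
constexpr uint32_t S_COALESCED = 0xb;
constexpr uint32_t S_ATTR_PURE_INSTRUCTIONS = 0x80000000;

// Hypothetical, simplified view of an input section.
struct Section {
  std::string segname;
  std::string name;
  uint32_t flags;
};

bool isCodeSection(const Section &sec) {
  // Assumption: only regular and coalesced sections can hold functions.
  uint32_t type = sec.flags & SECTION_TYPE;
  if (type != S_REGULAR && type != S_COALESCED)
    return false;

  // Sections marked as containing only machine instructions.
  if ((sec.flags & S_ATTR_PURE_INSTRUCTIONS) != 0)
    return true;

  // Assumption: a few __TEXT sections hold code without carrying the flag.
  if (sec.segname == "__TEXT")
    return sec.name == "__textcoal_nt" || sec.name == "__StaticInit";

  return false;
}
```

Symbols in sections that pass such a check would get N_FUN stabs, while other
defined symbols would get N_GSYM or N_STSYM.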

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D92430


Compare: https://github.com/llvm/llvm-project/compare/ba4e45a0aa65...c7dbaec396ef

