[all-commits] [llvm/llvm-project] ba668e: Re-apply [lldb] Do not use LC_FUNCTION_STARTS data...

Thu Nov 21 13:06:37 PST 2024

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: ba668eb99c5dc37d3c5cf2775079562460fd7619
      https://github.com/llvm/llvm-project/commit/ba668eb99c5dc37d3c5cf2775079562460fd7619
  Author: Alex Langford <alangford at apple.com>
  Date:   2024-11-21 (Thu, 21 Nov 2024)

  Changed paths:
    M lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp

  Log Message:
  -----------
  Re-apply [lldb] Do not use LC_FUNCTION_STARTS data to determine symbol size as symbols are created (#117079)

I backed this out due to a problem on one of the bots that myself and
others have problems reproducing locally. I'd like to try to land it
again, at least to gain more information.

Summary:
This improves the performance of ObjectFileMacho::ParseSymtab by
removing eager and expensive work in favor of doing it later in a
less-expensive fashion.

Experiment:
My goal was to understand LLDB's startup time.
First, I produced a Debug build of LLDB (no dSYM) and a
Release+NoAsserts build of LLDB. The Release build debugged the Debug
build as it debugged a small C++ program. I found that
ObjectFileMachO::ParseSymtab accounted for somewhere between 1.2 and 1.3
seconds consistently. After applying this change, I consistently
measured a reduction of approximately 100ms, putting the time closer to
1.1s and 1.2s on average.

Background:
ObjectFileMachO::ParseSymtab will incrementally create symbols by
parsing nlist entries from the symtab section of a MachO binary. As it
does this, it eagerly tries to determine the size of symbols (e.g. how
long a function is) using LC_FUNCTION_STARTS data (or eh_frame if
LC_FUNCTION_STARTS is unavailable). Concretely, this is done by
performing a binary search on the function starts array and calculating
the distance to the next function or the end of the section (whichever
is smaller).

However, this work is unnecessary for 2 reasons:
1. If you have debug symbol entries (i.e. STABs), the size of a function
is usually stored right after the function's entry. Performing this work
right before parsing the next entry is unnecessary work.
2. Calculating symbol sizes for symbols of size 0 is already performed
in `Symtab::InitAddressIndexes` after all the symbols are added to the
Symtab. It also does this more efficiently by walking over a list of
symbols sorted by address, so the work to calculate the size per symbol
is constant instead of O(log n).

To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications