[all-commits] [llvm/llvm-project] 329008: [lldb] Improve/fix base address selection in locat...

Pavel Labath via All-commits all-commits at lists.llvm.org
Mon Dec 9 04:48:10 PST 2019


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 329008fdf1889c0554f7afbb426f829f98327c78
      https://github.com/llvm/llvm-project/commit/329008fdf1889c0554f7afbb426f829f98327c78
  Author: Pavel Labath <pavel at labath.sk>
  Date:   2019-12-09 (Mon, 09 Dec 2019)

  Changed paths:
    M lldb/include/lldb/Expression/DWARFExpression.h
    M lldb/source/Expression/DWARFExpression.cpp
    M lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
    M lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
    A lldb/test/Shell/SymbolFile/DWARF/Inputs/debug_loc-aslr.yaml
    A lldb/test/Shell/SymbolFile/DWARF/debug_loc-aslr.s

  Log Message:
  -----------
  [lldb] Improve/fix base address selection in location lists

Summary:
Lldb support base address selection entries in location lists was broken
for a long time. This wasn't noticed until llvm started producing these
kinds of entries more frequently with r374600.

In r374769, I made a quick patch which added sufficient support for them
to get the test suite to pass. However, I did not fully understand how
this code operates, and so the fix was not complete. Specifically, what
was lacking was the ability to handle modules which were not loaded at
their preferred load address (for instance, due to ASLR).

Now that I better understand how this code works, I've come to the
conclusion that the current setup does not provide enough information
to correctly process these entries. In the current setup the location
lists were parameterized by two addresses:
- the distance of the function start from the start of the compile unit.
  The purpose of this was to make the location ranges relative to the
  start of the function.
- the actual address where the function was loaded at. With this the
  function-start-relative ranges can be translated to actual memory
  locations.

The reason for the two values, instead of just one (the load bias) is (I
think) MachO, where the debug info in the object files will appear to be
relative to the address zero, but the actual code it refers to
can be moved and reordered by the linker. This means that the location
lists need to be "linked" to reflect the locations in the actual linked
file.

These two bits of information were enough to correctly process location
lists which do not contain base address selection entries (and so all
entries are relative to the CU base). However, they don't work with
them because, in theory two base address can be completely unrelated (as
can happen for instace with hot/cold function splitting, where the
linker can reorder the two pars arbitrarily).

To fix that, I split the first parameter into two:
- the compile unit base address
- the function start address, as is known in the object file

The new algorithm becomes:
- the location lists are processed as they were meant to be processed.
  The CU base address is used as the initial base address value. Base
  address selection entries can set a new base.
- the difference between the "file" and "load" function start addresses
  is used to compute the load bias. This value is added to the final
  ranges to get the actual memory location.

This algorithm is correct for non-MachO debug info, as there the
location lists correctly describe the code in the final executable, and
the dynamic linker can just move the entire module, not pieces of it. It
will also be correct for MachO if the static linker preserves relative
positions of the various parts of the location lists -- I don't know
whether it actually does that, but judging by the lack of base address
selection support in dsymutil and lldb, this isn't something that has
come up in the past.

I add a test case which simulates the ASLR scenario and demonstrates
that base address selection entries now work correctly here.

Reviewers: JDevlieghere, aprantl, clayborg

Subscribers: dblaikie, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D70532


  Commit: 773b849c10a60171908ab1bd658a44b6b396f567
      https://github.com/llvm/llvm-project/commit/773b849c10a60171908ab1bd658a44b6b396f567
  Author: Pavel Labath <pavel at labath.sk>
  Date:   2019-12-09 (Mon, 09 Dec 2019)

  Changed paths:
    M lldb/include/lldb/Expression/DWARFExpression.h
    M lldb/source/Expression/DWARFExpression.cpp
    M lldb/test/Shell/SymbolFile/DWARF/debug_loc.s

  Log Message:
  -----------
  [lldb/DWARF] Switch to llvm location list parser

Summary:
This patch deletes the lldb location list parser and teaches the
DWARFExpression class to use the parser in llvm instead. I have
centralized all the places doing the parsing into a single
GetLocationExpression function.

In theory the the actual location list parsing should be covered by llvm
tests, and this glue code by our existing location list tests, but since
we don't have that many location list tests, I've tried to extend the
coverage a bit by adding some explicit dwarf5 loclist handling and a
test of the dumping code.

For DWARF4 location lists this should be NFC (modulo small differences
in error handling which should only show up on invalid inputs). In case
of DWARF5, this fixes various missing bits of functionality, most
notably, the lack of support for DW_LLE_offset_pair.

Reviewers: JDevlieghere, aprantl, clayborg

Subscribers: lldb-commits, dblaikie

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D71003


Compare: https://github.com/llvm/llvm-project/compare/dba420bc05ae...773b849c10a6


More information about the All-commits mailing list