[Lldb-commits] [PATCH] D108889: Use dSYM SymbolFile Sections file addr instead of ObjectFile Section file addr when they differ in ObjectFileMachO

Jason Molenda via Phabricator via lldb-commits lldb-commits at lists.llvm.org
Sun Aug 29 01:55:32 PDT 2021

jasonmolenda created this revision.
jasonmolenda added a reviewer: clayborg.
jasonmolenda added a project: LLDB.
Herald added a subscriber: JDevlieghere.
jasonmolenda requested review of this revision.

This is addressing a regression introduced by Greg's change in https://reviews.llvm.org/D86122 .  His change was fine, but the original code was written unclearly as to what it was doing.

When a binary is in the system's "shared cache", the file addresses will be different from the dSYM's file addresses.  The binary (both binary file & dSYM) usually start with a file address of 0, but when the binary is integrated into the "shared cache" of all common binaries, the segments are rearranged and now the TEXT segment has a non-0 file address.  This is pretty unusual normally -- when the UUID matches for a binary and its dSYMs, they should have identical file addresses for their segments, for example.

When we add a dSYM SymbolFile to an ObjectFile that is in the shared cache, we're going to use the symbol table and debug information from the dSYM (SymbolFile) -- we need the Sections to have file addresses that match the symbol table and debug info in the dSYM.  In other words, the dSYM's file addresses must win.

This is handled in ObjectFileMachO::ProcessSegmentCommand.  This was previously behind a check of "does this section have a valid base address" or something that didn't make sense.  Greg noticed this and changed it to "Is this ObjectFile a memory image".  This seems like the right thing to do but "IsInMemory()" doesn't apply to the way we create shared cache binaries when lldb and the inferior are using the same shared cache.  lldb thinks of these as the same as an on-disk ObjectFile that we've mmap'ed into our space, except it's contents are represented by a DataBufferUnowned.  With this "file addr fixup" code behind a "IsInMemory()" check, now we weren't fixing the file addresses and none of the debug info was ever found because the fileaddrs were contained by any of the Sections.

I shouldn't intermix changes, but I was also annoyed by how many places we check if a binary is in the shared cache by checking flags in ObjectFileMachO, and put it in a little method.

  rG LLVM Github Monorepo



Index: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
--- lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
+++ lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
@@ -84,6 +84,8 @@
   bool IsDynamicLoader() const;
+  bool IsSharedCacheBinary() const;
   uint32_t GetAddressByteSize() const override;
   lldb_private::AddressClass GetAddressClass(lldb::addr_t file_addr) override;
Index: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
--- lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
+++ lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
@@ -1127,6 +1127,10 @@
   return m_header.filetype == MH_DYLINKER;
+bool ObjectFileMachO::IsSharedCacheBinary() const {
+  return m_header.flags & MH_DYLIB_IN_CACHE;
 uint32_t ObjectFileMachO::GetAddressByteSize() const {
   return m_data.GetAddressByteSize();
@@ -1377,7 +1381,7 @@
   if (m_length == 0 || seg_cmd.filesize == 0)
-  if ((m_header.flags & MH_DYLIB_IN_CACHE) && !IsInMemory()) {
+  if (IsSharedCacheBinary() && !IsInMemory()) {
     // In shared cache images, the load commands are relative to the
     // shared cache file, and not the specific image we are
     // examining. Let's fix this up so that it looks like a normal
@@ -1675,14 +1679,15 @@
     if (add_to_unified)
   } else if (unified_section_sp) {
+    // If this is a dSYM and the file addresses in the dSYM differ from the
+    // file addresses in the ObjectFile, we must use the file base address for
+    // the Section from the dSYM for the DWARF to resolve correctly.  This
+    // only happens with binaries in the shared cache in practice; normally a
+    // mismatch like this would give a binary & dSYM that do not match UUIDs.
+    // When a binary is included in the shared cache, its segments are
+    // rearranged to optimize the shared cache, so its file addresses will
+    // differ from what the ObjectFile had originally, and what the dSYM has.
     if (is_dsym && unified_section_sp->GetFileAddress() != load_cmd.vmaddr) {
-      // Check to see if the module was read from memory?
-      if (module_sp->GetObjectFile()->IsInMemory()) {
-        // We have a module that is in memory and needs to have its file
-        // address adjusted. We need to do this because when we load a file
-        // from memory, its addresses will be slid already, yet the addresses
-        // in the new symbol file will still be unslid.  Since everything is
-        // stored as section offset, this shouldn't cause any problems.
         // Make sure we've parsed the symbol table from the ObjectFile before
         // we go around changing its Sections.
@@ -1698,7 +1703,6 @@
         // we're done so any file-address caches can be updated.
         context.FileAddressesChanged = true;
-    }
@@ -1726,7 +1730,7 @@
     if (m_data.GetU32(&offset, &sect64.offset, num_u32s) == nullptr)
-    if ((m_header.flags & MH_DYLIB_IN_CACHE) && !IsInMemory()) {
+    if (IsSharedCacheBinary() && !IsInMemory()) {
       sect64.offset = sect64.addr - m_text_address;
@@ -2352,7 +2356,7 @@
   Process *process = process_sp.get();
   uint32_t memory_module_load_level = eMemoryModuleLoadLevelComplete;
-  bool is_shared_cache_image = m_header.flags & MH_DYLIB_IN_CACHE;
+  bool is_shared_cache_image = IsSharedCacheBinary();
   bool is_local_shared_cache_image = is_shared_cache_image && !IsInMemory();
   SectionSP linkedit_section_sp(
@@ -2668,7 +2672,7 @@
   // to parse any DSC unmapped symbol information. If we find any, we set a
   // flag that tells the normal nlist parser to ignore all LOCAL symbols.
-  if (m_header.flags & MH_DYLIB_IN_CACHE) {
+  if (IsSharedCacheBinary()) {
     // Before we can start mapping the DSC, we need to make certain the
     // target process is actually using the cache we can find.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D108889.369312.patch
Type: text/x-patch
Size: 4185 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/lldb-commits/attachments/20210829/5135e29b/attachment.bin>

More information about the lldb-commits mailing list