[Lldb-commits] [lldb] f0697d7 - Don't create sections for SHN_ABS symbols in ELF files.

Greg Clayton via lldb-commits lldb-commits at lists.llvm.org
Mon Aug 22 14:46:40 PDT 2022


Author: Greg Clayton
Date: 2022-08-22T14:46:27-07:00
New Revision: f0697d7c3fb5296cfec1718206aceb77b7ca9ab8

URL: https://github.com/llvm/llvm-project/commit/f0697d7c3fb5296cfec1718206aceb77b7ca9ab8
DIFF: https://github.com/llvm/llvm-project/commit/f0697d7c3fb5296cfec1718206aceb77b7ca9ab8.diff

LOG: Don't create sections for SHN_ABS symbols in ELF files.

Symbols that have the section index of SHN_ABS were previously creating extra top level sections that contained the value of the symbol as if the symbol's value was an address. As far as I can tell, these symbol's values are not addresses, even if they do have a size. To make matters worse, adding these extra sections can stop address lookups from succeeding if the symbol's value + size overlaps with an existing section as these sections get mapped into memory when the image is loaded by the dynamic loader. This can cause stack frames to appear empty as the address lookup fails completely.

This patch:
- doesn't create a section for any SHN_ABS symbols
- makes symbols that are absolute have values that are not addresses
- add accessors to SBSymbol to get the value and size of a symbol as raw integers. Prevoiusly there was no way to access a symbol's value from a SBSymbol because the only accessors were:

  SBAddress SBSymbol::GetStartAddress();
  SBAddress SBSymbol::GetEndAddress();

  and these accessors would return an invalid SBAddress if the symbol's value wasn't an address
- Adds a test to ensure no ".absolute.<symbol-name>" sections are created
- Adds a test to test the new SBSymbol APIs

Differential Revision: https://reviews.llvm.org/D131705

Added: 
    lldb/test/API/python_api/absolute_symbol/TestAbsoluteSymbol.py
    lldb/test/API/python_api/absolute_symbol/absolute.yaml

Modified: 
    lldb/bindings/interface/SBSymbol.i
    lldb/include/lldb/API/SBSymbol.h
    lldb/include/lldb/Symbol/Symbol.h
    lldb/source/API/SBSymbol.cpp
    lldb/source/Commands/CommandObjectTarget.cpp
    lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
    lldb/source/Symbol/Symbol.cpp
    lldb/test/Shell/SymbolFile/absolute-symbol.test

Removed: 
    


################################################################################
diff  --git a/lldb/bindings/interface/SBSymbol.i b/lldb/bindings/interface/SBSymbol.i
index fa0b3e4e1378d..f5efee75e249d 100644
--- a/lldb/bindings/interface/SBSymbol.i
+++ b/lldb/bindings/interface/SBSymbol.i
@@ -49,6 +49,10 @@ public:
     SBAddress
     GetEndAddress ();
 
+    uint64_t GetValue();
+
+    uint64_t GetSize();
+
     uint32_t
     GetPrologueByteSize ();
 

diff  --git a/lldb/include/lldb/API/SBSymbol.h b/lldb/include/lldb/API/SBSymbol.h
index 5935ccfed0245..94521881f82f9 100644
--- a/lldb/include/lldb/API/SBSymbol.h
+++ b/lldb/include/lldb/API/SBSymbol.h
@@ -41,10 +41,46 @@ class LLDB_API SBSymbol {
   lldb::SBInstructionList GetInstructions(lldb::SBTarget target,
                                           const char *flavor_string);
 
+  /// Get the start address of this symbol
+  ///
+  /// \returns
+  ///   If the symbol's value is not an address, an invalid SBAddress object
+  ///   will be returned. If the symbol's value is an address, a valid SBAddress
+  ///   object will be returned.
   SBAddress GetStartAddress();
 
+  /// Get the end address of this symbol
+  ///
+  /// \returns
+  ///   If the symbol's value is not an address, an invalid SBAddress object
+  ///   will be returned. If the symbol's value is an address, a valid SBAddress
+  ///   object will be returned.
   SBAddress GetEndAddress();
 
+  /// Get the raw value of a symbol.
+  ///
+  /// This accessor allows direct access to the symbol's value from the symbol
+  /// table regardless of what the value is. The value can be a file address or
+  /// it can be an integer value that depends on what the symbol's type is. Some
+  /// symbol values are not addresses, but absolute values or integer values
+  /// that can be mean 
diff erent things. The GetStartAddress() accessor will
+  /// only return a valid SBAddress if the symbol's value is an address, so this
+  /// accessor provides a way to access the symbol's value when the value is
+  /// not an address.
+  ///
+  /// \returns
+  ///   Returns the raw integer value of a symbol from the symbol table.
+  uint64_t GetValue();
+
+  /// Get the size of the symbol.
+  ///
+  /// This accessor allows direct access to the symbol's size from the symbol
+  /// table regardless of what the value is (address or integer value).
+  ///
+  /// \returns
+  ///   Returns the size of a symbol from the symbol table.
+  uint64_t GetSize();
+
   uint32_t GetPrologueByteSize();
 
   SymbolType GetType();

diff  --git a/lldb/include/lldb/Symbol/Symbol.h b/lldb/include/lldb/Symbol/Symbol.h
index 23a68090ff4d6..164fafd437dcb 100644
--- a/lldb/include/lldb/Symbol/Symbol.h
+++ b/lldb/include/lldb/Symbol/Symbol.h
@@ -85,6 +85,16 @@ class Symbol : public SymbolContextScope {
       return Address();
   }
 
+  /// Get the raw value of the symbol from the symbol table.
+  ///
+  /// If the symbol's value is an address, return the file address, else return
+  /// the raw value that is stored in the m_addr_range. If the base address has
+  /// no section, then getting the file address will return the correct value
+  /// as it will return the offset in the base address which is the value.
+  uint64_t GetRawValue() const {
+    return m_addr_range.GetBaseAddress().GetFileAddress();
+  }
+
   // When a symbol's value isn't an address, we need to access the raw value.
   // This function will ensure this symbol's value isn't an address and return
   // the integer value if this checks out, otherwise it will return

diff  --git a/lldb/source/API/SBSymbol.cpp b/lldb/source/API/SBSymbol.cpp
index b671f987dc996..3f940538d1240 100644
--- a/lldb/source/API/SBSymbol.cpp
+++ b/lldb/source/API/SBSymbol.cpp
@@ -162,6 +162,20 @@ SBAddress SBSymbol::GetEndAddress() {
   return addr;
 }
 
+uint64_t SBSymbol::GetValue() {
+  LLDB_INSTRUMENT_VA(this);
+  if (m_opaque_ptr)
+    return m_opaque_ptr->GetRawValue();
+  return 0;
+}
+
+uint64_t SBSymbol::GetSize() {
+  LLDB_INSTRUMENT_VA(this);
+  if (m_opaque_ptr && m_opaque_ptr->GetByteSizeIsValid())
+    return m_opaque_ptr->GetByteSize();
+  return 0;
+}
+
 uint32_t SBSymbol::GetPrologueByteSize() {
   LLDB_INSTRUMENT_VA(this);
 

diff  --git a/lldb/source/Commands/CommandObjectTarget.cpp b/lldb/source/Commands/CommandObjectTarget.cpp
index f72d4c7715c6d..08bdb856d9161 100644
--- a/lldb/source/Commands/CommandObjectTarget.cpp
+++ b/lldb/source/Commands/CommandObjectTarget.cpp
@@ -1541,10 +1541,20 @@ static uint32_t LookupSymbolInModule(CommandInterpreter &interpreter,
     strm.IndentMore();
     for (uint32_t i = 0; i < num_matches; ++i) {
       Symbol *symbol = symtab->SymbolAtIndex(match_indexes[i]);
-      if (symbol && symbol->ValueIsAddress()) {
-        DumpAddress(
-            interpreter.GetExecutionContext().GetBestExecutionContextScope(),
-            symbol->GetAddressRef(), verbose, all_ranges, strm);
+      if (symbol) {
+        if (symbol->ValueIsAddress()) {
+          DumpAddress(
+              interpreter.GetExecutionContext().GetBestExecutionContextScope(),
+              symbol->GetAddressRef(), verbose, all_ranges, strm);
+        } else {
+          strm.IndentMore();
+          strm.Indent("    Value: ");
+          strm.Printf("0x%16.16" PRIx64 "\n", symbol->GetRawValue());
+          if (symbol->GetByteSizeIsValid()) {
+            strm.Indent("    Size: ");
+            strm.Printf("0x%16.16" PRIx64 "\n", symbol->GetByteSize());
+          }
+        }
       }
     }
     strm.IndentLess();

diff  --git a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
index 122298d87bf8d..bc488a77fd94f 100644
--- a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
+++ b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
@@ -2223,23 +2223,6 @@ unsigned ObjectFileELF::ParseSymbols(Symtab *symtab, user_id_t start_id,
     // symbols. See above for more details.
     uint64_t symbol_value = symbol.st_value + symbol_value_offset;
 
-    if (symbol_section_sp == nullptr && shndx == SHN_ABS &&
-        symbol.st_size != 0) {
-      // We don't have a section for a symbol with non-zero size. Create a new
-      // section for it so the address range covered by the symbol is also
-      // covered by the module (represented through the section list). It is
-      // needed so module lookup for the addresses covered by this symbol will
-      // be successfull. This case happens for absolute symbols.
-      ConstString fake_section_name(std::string(".absolute.") + symbol_name);
-      symbol_section_sp =
-          std::make_shared<Section>(module_sp, this, SHN_ABS, fake_section_name,
-                                    eSectionTypeAbsoluteAddress, symbol_value,
-                                    symbol.st_size, 0, 0, 0, SHF_ALLOC);
-
-      module_section_list->AddSection(symbol_section_sp);
-      section_list->AddSection(symbol_section_sp);
-    }
-
     if (symbol_section_sp &&
         CalculateType() != ObjectFile::Type::eTypeObjectFile)
       symbol_value -= symbol_section_sp->GetFileAddress();

diff  --git a/lldb/source/Symbol/Symbol.cpp b/lldb/source/Symbol/Symbol.cpp
index 09a5fc936cd0b..067b7b900d89d 100644
--- a/lldb/source/Symbol/Symbol.cpp
+++ b/lldb/source/Symbol/Symbol.cpp
@@ -115,8 +115,7 @@ void Symbol::Clear() {
 }
 
 bool Symbol::ValueIsAddress() const {
-  return m_addr_range.GetBaseAddress().GetSection().get() != nullptr ||
-         m_type == eSymbolTypeAbsolute;
+  return (bool)m_addr_range.GetBaseAddress().GetSection();
 }
 
 ConstString Symbol::GetDisplayName() const {

diff  --git a/lldb/test/API/python_api/absolute_symbol/TestAbsoluteSymbol.py b/lldb/test/API/python_api/absolute_symbol/TestAbsoluteSymbol.py
new file mode 100644
index 0000000000000..f51bd2d3a0374
--- /dev/null
+++ b/lldb/test/API/python_api/absolute_symbol/TestAbsoluteSymbol.py
@@ -0,0 +1,80 @@
+"""
+Test absolute symbols in ELF files to make sure they don't create sections and
+to verify that symbol values and size can still be accessed via SBSymbol APIs.
+"""
+
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+import os
+
+class TestAbsoluteSymbol(TestBase):
+
+    @no_debug_info_test
+    def test_absolute_symbol(self):
+        '''
+            Load an ELF file that contains two symbols:
+            - "absolute" which is a symbol with the section SHN_ABS
+            - "main" which is a code symbol in .text
+
+            Index   st_name    st_value           st_size            st_info                             st_other st_shndx Name
+            ======= ---------- ------------------ ------------------ ----------------------------------- -------- -------- ===========================
+            [    0] 0x00000000 0x0000000000000000 0x0000000000000000 0x00 (STB_LOCAL      STT_NOTYPE   ) 0x00            0
+            [    1] 0x00000001 0x0000000000001000 0x0000000000000004 0x12 (STB_GLOBAL     STT_FUNC     ) 0x00            1 main
+            [    2] 0x00000006 0xffffffff80000000 0x0000000000000009 0x10 (STB_GLOBAL     STT_NOTYPE   ) 0x00      SHN_ABS absolute
+
+            We used to create sections for symbols whose section ID was SHN_ABS
+            and this caused problems as the new sections could interfere with
+            with address resolution. Absolute symbols' values are not addresses
+            and should not be treated this way.
+
+            New APIs were added to SBSymbol to allow access to the raw integer
+            value and size of symbols so symbols whose value was not an address
+            could be accessed. Prior to this commit, you could only call:
+
+            SBAddress SBSymbol::GetStartAddress()
+            SBAddress SBSymbol::GetEndAddress()
+
+            If the symbol's value was not an address, you couldn't access the
+            raw value because the above accessors would return invalid SBAddress
+            objects if the value wasn't an address. New APIs were added for this:
+
+            uint64_t SBSymbol::GetValue()
+            uint64_t SBSymbol::GetSize();
+        '''
+        src_dir = self.getSourceDir()
+        yaml_path = os.path.join(src_dir, "absolute.yaml")
+        yaml_base, ext = os.path.splitext(yaml_path)
+        obj_path = self.getBuildArtifact("a.out")
+        self.yaml2obj(yaml_path, obj_path)
+
+        # Create a target with the object file we just created from YAML
+        target = self.dbg.CreateTarget(obj_path)
+        self.assertTrue(target, VALID_TARGET)
+
+        module = target.modules[0]
+
+        # Make sure the 'main' symbol is valid and has address values. Also make
+        # sure we can access the raw file address and size via the new APIS.
+        symbol = module.FindSymbol('main')
+        self.assertTrue(symbol.IsValid())
+        self.assertTrue(symbol.GetStartAddress().IsValid())
+        self.assertTrue(symbol.GetEndAddress().IsValid())
+        self.assertEqual(symbol.GetValue(), 0x1000)
+        self.assertEqual(symbol.GetSize(), 0x4)
+
+        # Make sure the 'absolute' symbol is valid and has no address values.
+        # Also make sure we can access the raw file address and size via the new
+        # APIS.
+        symbol = module.FindSymbol('absolute')
+        self.assertTrue(symbol.IsValid())
+        self.assertFalse(symbol.GetStartAddress().IsValid())
+        self.assertFalse(symbol.GetEndAddress().IsValid())
+        self.assertEqual(symbol.GetValue(), 0xffffffff80000000)
+        self.assertEqual(symbol.GetSize(), 9)
+
+        # Make sure no sections were created for the absolute symbol with a
+        # prefix of ".absolute." followed by the symbol name as they interfere
+        # with address lookups if they are treated like real sections.
+        for section in module.sections:
+            self.assertTrue(section.GetName() != ".absolute.absolute")

diff  --git a/lldb/test/API/python_api/absolute_symbol/absolute.yaml b/lldb/test/API/python_api/absolute_symbol/absolute.yaml
new file mode 100644
index 0000000000000..24b5c84cec1a9
--- /dev/null
+++ b/lldb/test/API/python_api/absolute_symbol/absolute.yaml
@@ -0,0 +1,36 @@
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  OSABI:           ELFOSABI_FREEBSD
+  Type:            ET_EXEC
+  Machine:         EM_X86_64
+  Entry:           0xFFFFFFFF8037C000
+Sections:
+  - Name:            .text
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x1000
+    AddressAlign:    0x4
+    Content:         "c3c3c3c3"
+ProgramHeaders:
+  - Type: PT_LOAD
+    Flags: [ PF_X, PF_R ]
+    VAddr: 0x1000
+    PAddr: 0x1000
+    Align: 0x4
+    FirstSec: .text
+    LastSec:  .text
+Symbols:
+  - Name:            main
+    Type:            STT_FUNC
+    Section:         .text
+    Binding:         STB_GLOBAL
+    Value:           0x1000
+    Size:            0x4
+  - Name:            absolute
+    Index:           SHN_ABS
+    Binding:         STB_GLOBAL
+    Value:           0xFFFFFFFF80000000
+    Size:            0x9
+

diff  --git a/lldb/test/Shell/SymbolFile/absolute-symbol.test b/lldb/test/Shell/SymbolFile/absolute-symbol.test
index 1d234cb55e059..11feb1f89c10a 100644
--- a/lldb/test/Shell/SymbolFile/absolute-symbol.test
+++ b/lldb/test/Shell/SymbolFile/absolute-symbol.test
@@ -1,8 +1,7 @@
 # RUN: yaml2obj %s -o %t.o
 # RUN: %lldb -b -o 'target modules lookup -s absolute_symbol' %t.o | FileCheck %s
 # CHECK: 1 symbols match 'absolute_symbol'
-# CHECK:   Address: 0x0000000012345678 (0x0000000012345678)
-# CHECK:   Summary: 0x0000000012345678
+# CHECK:   Value: 0x0000000012345678
 # Created from:
 #   .globl absolute_symbol
 #   absolute_symbol = 0x12345678
@@ -92,4 +91,3 @@ LinkEditData:
     - ltmp0
     - ''
 ...
-


        


More information about the lldb-commits mailing list