[llvm] e9aac2c - [llvm-objdump] Look in all viable sections for call/branch targets

James Henderson via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 22 04:48:45 PDT 2020


Author: James Henderson
Date: 2020-04-22T12:28:30+01:00
New Revision: e9aac2c3ef4ccbe26a05e9aaea90e0df66cec04d

URL: https://github.com/llvm/llvm-project/commit/e9aac2c3ef4ccbe26a05e9aaea90e0df66cec04d
DIFF: https://github.com/llvm/llvm-project/commit/e9aac2c3ef4ccbe26a05e9aaea90e0df66cec04d.diff

LOG: [llvm-objdump] Look in all viable sections for call/branch targets

Prior to this patch, when identifying the target symbol of a call or
branch operation, llvm-objdump would only look in the last section
(according to the section header table order) that matched the target
address. If multiple sections shared that address, because some of them
were empty, it did not look in the others, even if no symbol could be
found in the one section it searched.

This patch causes llvm-objdump to look in all sections for possible
candidate symbols. If there are multiple possible symbols, it picks one
from a non-empty section, if possible (as that is more likely to be the
"real" symbol since functions can't really be in empty sections),
before falling back to those in empty sections. If all else fails, it
falls back to absolute symbols as it did before.
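
The lookup order described above can be illustrated with a minimal,
self-contained sketch. It is not the llvm-objdump code: Symbol, Section,
lastSymbolAtOrBefore and findTargetSymbol are hypothetical stand-ins for
the tool's internal types, and plain std:: algorithms are used in place
of the llvm:: wrappers. It assumes each symbol list is already sorted by
address.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for llvm-objdump's internal types.
struct Symbol {
  uint64_t Addr;
  std::string Name;
};

struct Section {
  uint64_t Addr;
  uint64_t Size;
  std::vector<Symbol> Symbols; // Assumed sorted by Addr.
};

// Returns the last symbol with Addr <= Target, or nullptr if there is none.
static const Symbol *lastSymbolAtOrBefore(const std::vector<Symbol> &Syms,
                                          uint64_t Target) {
  assert(std::is_sorted(Syms.begin(), Syms.end(),
                        [](const Symbol &L, const Symbol &R) {
                          return L.Addr < R.Addr;
                        }));
  auto It = std::partition_point(
      Syms.begin(), Syms.end(),
      [=](const Symbol &S) { return S.Addr <= Target; });
  return It == Syms.begin() ? nullptr : &*(It - 1);
}

const Symbol *findTargetSymbol(const std::vector<Section> &Sections,
                               const std::vector<Symbol> &AbsoluteSymbols,
                               uint64_t Target) {
  // Sort by address, then by size, so that empty sections precede
  // same-addressed non-empty ones. Ties keep section header table order.
  std::vector<const Section *> Sorted;
  for (const Section &S : Sections)
    Sorted.push_back(&S);
  std::stable_sort(Sorted.begin(), Sorted.end(),
                   [](const Section *L, const Section *R) {
                     if (L->Addr != R->Addr)
                       return L->Addr < R->Addr;
                     return L->Size < R->Size;
                   });

  // Find the first section that starts after the target address.
  auto It = std::partition_point(
      Sorted.begin(), Sorted.end(),
      [=](const Section *S) { return S->Addr <= Target; });

  // Walk backwards over every section sharing the nearest preceding
  // address. Non-empty sections are visited before empty ones.
  std::vector<const std::vector<Symbol> *> Candidates;
  if (It != Sorted.begin()) {
    uint64_t SecAddr = (*(It - 1))->Addr;
    while (It != Sorted.begin() && (*(It - 1))->Addr == SecAddr) {
      --It;
      Candidates.push_back(&(*It)->Symbols);
    }
  }
  // Absolute symbols are the final fallback.
  Candidates.push_back(&AbsoluteSymbols);

  for (const std::vector<Symbol> *Syms : Candidates)
    if (const Symbol *Sym = lastSymbolAtOrBefore(*Syms, Target))
      return Sym;
  return nullptr;
}
```

In this sketch, two sections at address 0x5 that both hold a symbol at
0x5 resolve to the symbol in the non-empty one. If both are empty, the
section that appears later in the section header table is tried first.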

Differential Revision: https://reviews.llvm.org/D78549

Reviewed by: grimar, Higuoxing

Added: 
    

Modified: 
    llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
    llvm/tools/llvm-objdump/llvm-objdump.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test b/llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
index cbb77884f0c3..4b0da011c9d4 100644
--- a/llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
+++ b/llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
@@ -4,14 +4,34 @@
 ## the section. The test uses YAML for the input, as we need a fully linked ELF
 ## to reproduce the original failure.
 
-# RUN: yaml2obj %s -o %t1 -D SECTION=.second
-# RUN: llvm-objdump -d %t1 | FileCheck %s
-# RUN: yaml2obj %s -o %t2 -D SECTION=.first
-## FIXME: this case should print "<target>" too.
-# RUN: llvm-objdump -d %t2 | FileCheck %s --check-prefix=FAIL
+## Two empty sections, one with symbol in, one without.
+# RUN: yaml2obj %s --docnum=1 -o %t1 -D SIZE1=0 -D SIZE2=0 -D SECTION=.second -D INDEX=SHN_ABS
+# RUN: llvm-objdump -d %t1 | FileCheck %s --check-prefix=TARGET
+# RUN: yaml2obj %s --docnum=1 -o %t2 -D SIZE1=0 -D SIZE2=0 -D SECTION=.first -D INDEX=SHN_ABS
+# RUN: llvm-objdump -d %t2 | FileCheck %s --check-prefix=TARGET
 
-# CHECK: callq 0x5 <target>
-# FAIL:  callq 0x5{{$}}
+## Two sections, one empty with symbol, other non-empty, without symbol.
+# RUN: yaml2obj %s --docnum=1 -o %t3 -D SIZE1=1 -D SIZE2=0 -D SECTION=.second -D INDEX=SHN_ABS
+# RUN: llvm-objdump -d %t3 | FileCheck %s --check-prefix=TARGET
+# RUN: yaml2obj %s --docnum=1 -o %t4 -D SIZE1=0 -D SIZE2=1 -D SECTION=.first -D INDEX=SHN_ABS
+# RUN: llvm-objdump -d %t4 | FileCheck %s --check-prefix=TARGET
+
+## Fall back to absolute symbol if no symbol found in candidate sections.
+# RUN: yaml2obj %s --docnum=1 -o %t5 -D SIZE1=1 -D SIZE2=0 -D SECTION=.caller -D INDEX=SHN_ABS
+# RUN: llvm-objdump -d %t5 | FileCheck %s --check-prefix=ABSOLUTE
+
+## Show that other symbols with reserved st_shndx values are treated as absolute
+## symbols.
+# RUN: yaml2obj %s --docnum=1 -o %t6 -D SIZE1=1 -D SIZE2=0 -D SECTION=.caller -D INDEX=SHN_LOPROC
+# RUN: llvm-objdump -d %t6 | FileCheck %s --check-prefix=ABSOLUTE
+
+## Print no target if no symbol in section/absolute symbol found.
+# RUN: llvm-objcopy %t5 %t7 -N other
+# RUN: llvm-objdump -d %t7 | FileCheck %s --check-prefix=FAIL
+
+# TARGET:   callq 0x5 <target>
+# ABSOLUTE: callq 0x5 <other+0x5>
+# FAIL:     callq 0x5{{$}}
 
 --- !ELF
 FileHeader:
@@ -24,16 +44,71 @@ Sections:
     Type:    SHT_PROGBITS
     Flags:   [SHF_ALLOC, SHF_EXECINSTR]
     Address: 0x0
-    Content: e800000000 # Call instruction to next address.
+    Content: e800000000 ## Call instruction to next address.
   - Name:    .first
     Type:    SHT_PROGBITS
     Flags:   [SHF_ALLOC, SHF_EXECINSTR]
     Address: 0x5
+    Size:    [[SIZE1]]
   - Name:    .second
     Type:    SHT_PROGBITS
     Flags:   [SHF_ALLOC, SHF_EXECINSTR]
     Address: 0x5
+    Size:    [[SIZE2]]
 Symbols:
   - Name:    target
     Section: [[SECTION]]
     Value:   0x5
+  - Name:    other
+    Index:   [[INDEX]]
+    Value:   0x0
+
+## Two empty sections, both with symbols.
+# RUN: yaml2obj %s --docnum=2 -o %t7 -D SIZE1=0 -D SIZE2=0 -D SYMVAL1=0x5 -D SYMVAL2=0x5
+# RUN: llvm-objdump -d %t7 | FileCheck %s --check-prefix=SECOND
+
+## Two sections, both with symbols, one empty, the other not.
+# RUN: yaml2obj %s --docnum=2 -o %t8 -D SIZE1=1 -D SIZE2=0 -D SYMVAL1=0x5 -D SYMVAL2=0x5
+# RUN: llvm-objdump -d %t8 | FileCheck %s --check-prefix=FIRST
+# RUN: yaml2obj %s --docnum=2 -o %t9 -D SIZE1=0 -D SIZE2=1 -D SYMVAL1=0x5 -D SYMVAL2=0x5
+# RUN: llvm-objdump -d %t9 | FileCheck %s --check-prefix=SECOND
+
+## Two sections, both with symbols, one empty, other not, symbol in non-empty
+## section has value higher than target address.
+# RUN: yaml2obj %s --docnum=2 -o %t10 -D SIZE1=1 -D SIZE2=0 -D SYMVAL1=0x6 -D SYMVAL2=0x5
+# RUN: llvm-objdump -d %t10 | FileCheck %s --check-prefix=SECOND
+# RUN: yaml2obj %s --docnum=2 -o %t11 -D SIZE1=0 -D SIZE2=1 -D SYMVAL1=0x5 -D SYMVAL2=0x6
+# RUN: llvm-objdump -d %t11 | FileCheck %s --check-prefix=FIRST
+
+# FIRST:  callq 0x5 <first>
+# SECOND: callq 0x5 <second>
+
+--- !ELF
+FileHeader:
+  Class:   ELFCLASS64
+  Data:    ELFDATA2LSB
+  Type:    ET_EXEC
+  Machine: EM_X86_64
+Sections:
+  - Name:    .caller
+    Type:    SHT_PROGBITS
+    Flags:   [SHF_ALLOC, SHF_EXECINSTR]
+    Address: 0x0
+    Content: e800000000 ## Call instruction to next address.
+  - Name:    .first
+    Type:    SHT_PROGBITS
+    Flags:   [SHF_ALLOC, SHF_EXECINSTR]
+    Address: 0x5
+    Size:    [[SIZE1]]
+  - Name:    .second
+    Type:    SHT_PROGBITS
+    Flags:   [SHF_ALLOC, SHF_EXECINSTR]
+    Address: 0x5
+    Size:    [[SIZE2]]
+Symbols:
+  - Name:    first
+    Section: .first
+    Value:   [[SYMVAL1]]
+  - Name:    second
+    Section: .second
+    Value:   [[SYMVAL2]]

diff  --git a/llvm/tools/llvm-objdump/llvm-objdump.cpp b/llvm/tools/llvm-objdump/llvm-objdump.cpp
index 0eeb337c1b7b..4ce8f691acbc 100644
--- a/llvm/tools/llvm-objdump/llvm-objdump.cpp
+++ b/llvm/tools/llvm-objdump/llvm-objdump.cpp
@@ -1245,12 +1245,17 @@ static void disassembleObject(const Target *TheTarget, const ObjectFile *Obj,
   addPltEntries(Obj, AllSymbols, Saver);
 
   // Create a mapping from virtual address to section. An empty section can
-  // cause more than one section at the same address. Use a stable sort to
-  // stabilize the output.
+  // cause more than one section at the same address. Sort such sections to be
+  // before same-addressed non-empty sections so that symbol lookups prefer the
+  // non-empty section.
   std::vector<std::pair<uint64_t, SectionRef>> SectionAddresses;
   for (SectionRef Sec : Obj->sections())
     SectionAddresses.emplace_back(Sec.getAddress(), Sec);
-  llvm::stable_sort(SectionAddresses, llvm::less_first());
+  llvm::stable_sort(SectionAddresses, [](const auto &LHS, const auto &RHS) {
+    if (LHS.first != RHS.first)
+      return LHS.first < RHS.first;
+    return LHS.second.getSize() < RHS.second.getSize();
+  });
 
   // Linked executables (.exe and .dll files) typically don't include a real
   // symbol table but they might contain an export table.
@@ -1520,41 +1525,48 @@ static void disassembleObject(const Target *TheTarget, const ObjectFile *Obj,
             // through a relocation.
             //
             // In a non-relocatable object, the target may be in any section.
+            // In that case, locate the section(s) containing the target address
+            // and find the symbol in one of those, if possible.
             //
             // N.B. We don't walk the relocations in the relocatable case yet.
-            auto *TargetSectionSymbols = &Symbols;
+            std::vector<const SectionSymbolsTy *> TargetSectionSymbols;
             if (!Obj->isRelocatableObject()) {
-              auto It = partition_point(
+              auto It = llvm::partition_point(
                   SectionAddresses,
                   [=](const std::pair<uint64_t, SectionRef> &O) {
                     return O.first <= Target;
                   });
-              if (It != SectionAddresses.begin()) {
+              uint64_t TargetSecAddr = 0;
+              while (It != SectionAddresses.begin()) {
                 --It;
-                TargetSectionSymbols = &AllSymbols[It->second];
-              } else {
-                TargetSectionSymbols = &AbsoluteSymbols;
+                if (TargetSecAddr == 0)
+                  TargetSecAddr = It->first;
+                if (It->first != TargetSecAddr)
+                  break;
+                TargetSectionSymbols.push_back(&AllSymbols[It->second]);
               }
+            } else {
+              TargetSectionSymbols.push_back(&Symbols);
             }
-
-            // Find the last symbol in the section whose offset is less than
-            // or equal to the target. If there isn't a section that contains
-            // the target, find the nearest preceding absolute symbol.
-            auto TargetSym = partition_point(
-                *TargetSectionSymbols,
-                [=](const SymbolInfoTy &O) {
-                  return O.Addr <= Target;
-                });
-            if (TargetSym == TargetSectionSymbols->begin()) {
-              TargetSectionSymbols = &AbsoluteSymbols;
-              TargetSym = partition_point(
-                  AbsoluteSymbols,
-                  [=](const SymbolInfoTy &O) {
-                    return O.Addr <= Target;
-                  });
+            TargetSectionSymbols.push_back(&AbsoluteSymbols);
+
+            // Find the last symbol in the first candidate section whose offset
+            // is less than or equal to the target. If there are no such
+            // symbols, try in the next section and so on, before finally using
+            // the nearest preceding absolute symbol (if any), if there are no
+            // other valid symbols.
+            const SymbolInfoTy *TargetSym = nullptr;
+            for (const SectionSymbolsTy *TargetSymbols : TargetSectionSymbols) {
+              auto It = llvm::partition_point(
+                  *TargetSymbols,
+                  [=](const SymbolInfoTy &O) { return O.Addr <= Target; });
+              if (It != TargetSymbols->begin()) {
+                TargetSym = &*(It - 1);
+                break;
+              }
             }
-            if (TargetSym != TargetSectionSymbols->begin()) {
-              --TargetSym;
+
+            if (TargetSym != nullptr) {
               uint64_t TargetAddress = TargetSym->Addr;
               std::string TargetName = TargetSym->Name.str();
               if (Demangle)


        

