[lld] [llvm] Objdump symbol search (PR #145253)
Arjun Patel via llvm-commits
llvm-commits at lists.llvm.org
Sun Jun 22 17:49:39 PDT 2025
https://github.com/arjunUpatel created https://github.com/llvm/llvm-project/pull/145253
Relevant after #144620 is merged.
This PR changes how `llvm-objdump` picks the candidate sections whose symbols and labels are used for address resolution. Currently, `llvm-objdump` considers only the symbols and labels in the set of sections closest to the target address among those that start at or below it. The problem is that these sections may not contain any symbols. As a fallback, absolute symbols are added to the candidates, but there is no guarantee that any exist either. Hence, some disassemblies show no address resolution even though labels/symbols exist at addresses lower than the target address. The following test illustrates this:
```
.text
test:
la a0, gdata
.skip 0x100000
ldata:
.int 0
.data
gdata:
.int 0
```
Credits to @PiJoules for coming up with this test in #108469.
Compiled with:
`clang --target=fuchsia-elf-riscv64 -march=rv64g test.s -nostdlib -o test`
Disassembly with `llvm-objdump -d test` yields:
```
0000000000001000 <_start>:
1000: 00101517 auipc a0, 0x101
1004: 00853503 ld a0, 0x8(a0)
```
while disassembly with `riscv64-linux-gnu-objdump -d test` yields
```
0000000000001000 <_start>:
1000: 00101517 auipc a0,0x101
1004: 00853503 ld a0,8(a0) # 102008 <ldata+0x1000>
```
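The address in the binutils comment can be checked by hand. Below is a minimal sketch of the arithmetic, assuming the standard RISC-V `auipc` semantics (`rd = pc + (imm << 12)`). The helper name `auipcResult` is made up for illustration, and sign extension of the immediate is ignored because `0x101` is positive.

```cpp
#include <cstdint>

// Hypothetical helper: value written by `auipc rd, Imm20` executed at PC.
// Sign extension of the 20-bit immediate is deliberately omitted.
constexpr uint64_t auipcResult(uint64_t PC, uint64_t Imm20) {
  return PC + (Imm20 << 12);
}

// auipc a0, 0x101 at address 0x1000 leaves 0x102000 in a0.
constexpr uint64_t Base = auipcResult(0x1000, 0x101);
// ld a0, 8(a0) then loads from Base + 8.
constexpr uint64_t Target = Base + 0x8;
// The <ldata+0x1000> annotation implies where ldata itself lives.
constexpr uint64_t LData = Target - 0x1000;

static_assert(Base == 0x102000, "auipc result");
static_assert(Target == 0x102008, "load address");
static_assert(LData == 0x101008, "address implied for ldata");
```

So the load targets `0x102008`, and the nearest symbol at a lower address is `ldata`, which is the one binutils reports.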
Binutils prints an address resolution for the auipc+ld sequence, but `llvm-objdump` does not, which is exactly what this PR addresses.
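The intended search can be sketched in isolation as follows. This is a simplified model, not the code in the patch: `Section`, `SymbolList`, and `collectCandidates` are hypothetical stand-ins for the `SectionAddresses`/`AllSymbols` structures in `llvm-objdump.cpp`, and symbols are reduced to plain names.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for llvm-objdump's internal types.
using SymbolList = std::vector<std::string>;
struct Section {
  uint64_t Address;
  SymbolList Symbols;
};

// Sections must be sorted by Address. Returns the symbol lists of every
// section visited while walking down from Target, stopping once a whole
// group of same-address sections has contributed at least one symbol.
std::vector<const SymbolList *>
collectCandidates(const std::vector<Section> &Sections, uint64_t Target) {
  std::vector<const SymbolList *> Candidates;
  // First section that starts above Target; everything before it is a
  // potential candidate.
  auto It = std::partition_point(
      Sections.begin(), Sections.end(),
      [=](const Section &S) { return S.Address <= Target; });

  bool HaveGroup = false;    // have we entered any address group yet?
  uint64_t GroupAddr = 0;    // start address of the current group
  bool FoundSymbols = false; // did any visited section have symbols?
  while (It != Sections.begin()) {
    --It;
    if (!HaveGroup || It->Address != GroupAddr) {
      // Crossing into a lower address group: stop if the groups already
      // visited gave us something to resolve against.
      if (FoundSymbols)
        break;
      GroupAddr = It->Address;
      HaveGroup = true;
    }
    Candidates.push_back(&It->Symbols);
    if (!It->Symbols.empty())
      FoundSymbols = true;
  }
  return Candidates;
}
```

For example, with made-up sections at `0x1000` (holding `_start` and `ldata`) and `0x102000` (holding no symbols), a target of `0x102008` first visits the empty section and then continues down to the one at `0x1000`, instead of giving up after the empty one.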
>From 894d8d1ac82b2502af26062a9ff10329110c1314 Mon Sep 17 00:00:00 2001
From: Arjun Patel <arjunpatel151002 at gmail.com>
Date: Wed, 18 Jun 2025 03:33:47 -0400
Subject: [PATCH 1/3] Change how labels are searched for by llvm-objdump
Previously, the loop only added labels from the set of sections closest to Target. It is a set of sections because multiple sections can share the same address, in which case all of their symbols are added to the set of candidate symbols. With this change, the loop walks downward from the sections closest to Target and populates the set of candidate symbols from the first set of sections that actually contains symbols.
---
llvm/tools/llvm-objdump/llvm-objdump.cpp | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/llvm/tools/llvm-objdump/llvm-objdump.cpp b/llvm/tools/llvm-objdump/llvm-objdump.cpp
index 5ecb33375943f..dabb200d526ba 100644
--- a/llvm/tools/llvm-objdump/llvm-objdump.cpp
+++ b/llvm/tools/llvm-objdump/llvm-objdump.cpp
@@ -2392,14 +2392,19 @@ disassembleObject(ObjectFile &Obj, const ObjectFile &DbgObj,
[=](const std::pair<uint64_t, SectionRef> &O) {
return O.first <= Target;
});
- uint64_t TargetSecAddr = 0;
+ uint64_t TargetSecAddr = It == SectionAddresses.end() ? 0 : It->first;
+ bool FoundSymbols = false;
while (It != SectionAddresses.begin()) {
--It;
- if (TargetSecAddr == 0)
+ if (It->first != TargetSecAddr) {
+ if (FoundSymbols)
+ break;
TargetSecAddr = It->first;
- if (It->first != TargetSecAddr)
- break;
- TargetSectionSymbols.push_back(&AllSymbols[It->second]);
+ }
+ auto *SectionSymbols = &AllSymbols[It->second];
+ TargetSectionSymbols.push_back(SectionSymbols);
+ if (!SectionSymbols->empty())
+ FoundSymbols = true;
}
} else {
TargetSectionSymbols.push_back(&Symbols);
>From ec782afc54b23d96b54e1dfd90d622671ca129a0 Mon Sep 17 00:00:00 2001
From: Arjun Patel <arjunpatel151002 at gmail.com>
Date: Wed, 18 Jun 2025 16:55:33 -0400
Subject: [PATCH 2/3] Update tests affected by change in search pattern
---
lld/test/ELF/aarch64-feature-pauth.s | 4 +--
lld/test/ELF/aarch64-gnu-ifunc-plt.s | 4 +--
lld/test/ELF/i386-feature-cet.s | 2 +-
lld/test/ELF/loongarch-relax-call36.s | 4 +--
lld/test/ELF/x86-64-feature-cet.s | 4 +--
.../X86/disassemble-same-section-addr.test | 27 ++++++++-----------
6 files changed, 20 insertions(+), 25 deletions(-)
diff --git a/lld/test/ELF/aarch64-feature-pauth.s b/lld/test/ELF/aarch64-feature-pauth.s
index e8c900b9cb134..765a030736796 100644
--- a/lld/test/ELF/aarch64-feature-pauth.s
+++ b/lld/test/ELF/aarch64-feature-pauth.s
@@ -70,7 +70,7 @@
# PACPLT: Disassembly of section .plt:
# PACPLT: <.plt>:
# PACPLT-NEXT: stp x16, x30, [sp, #-0x10]!
-# PACPLT-NEXT: adrp x16, 0x30000 <func3+0x30000>
+# PACPLT-NEXT: adrp x16, 0x30000 <_DYNAMIC+0x{{[0-9a-fA-F]+}}>
# PACPLT-NEXT: ldr x17, [x16, #0x[[B]]]
# PACPLT-NEXT: add x16, x16, #0x[[B]]
# PACPLT-NEXT: br x17
@@ -78,7 +78,7 @@
# PACPLT-NEXT: nop
# PACPLT-NEXT: nop
# PACPLT: <func3@plt>:
-# PACPLT-NEXT: adrp x16, 0x30000 <func3+0x30000>
+# PACPLT-NEXT: adrp x16, 0x30000 <_DYNAMIC+0x{{[0-9a-fA-F]+}}>
# PACPLT-NEXT: ldr x17, [x16, #0x[[C]]]
# PACPLT-NEXT: add x16, x16, #0x[[C]]
# NOHINT-NEXT: braa x17, x16
diff --git a/lld/test/ELF/aarch64-gnu-ifunc-plt.s b/lld/test/ELF/aarch64-gnu-ifunc-plt.s
index 73ecf58ee76bf..09fd792c1b94b 100644
--- a/lld/test/ELF/aarch64-gnu-ifunc-plt.s
+++ b/lld/test/ELF/aarch64-gnu-ifunc-plt.s
@@ -55,8 +55,8 @@
// DISASM: <bar>:
// DISASM-NEXT: 2102dc: ret
// DISASM: <_start>:
-// DISASM-NEXT: 2102e0: bl 0x210330 <zed2+0x210330>
-// DISASM-NEXT: 2102e4: bl 0x210340 <zed2+0x210340>
+// DISASM-NEXT: 2102e0: bl 0x210330 <zed2@plt+0x10>
+// DISASM-NEXT: 2102e4: bl 0x210340 <zed2@plt+0x20>
// DISASM-NEXT: 2102e8: bl 0x210310 <bar2@plt>
// DISASM-NEXT: 2102ec: bl 0x210320 <zed2@plt>
// DISASM-EMPTY:
diff --git a/lld/test/ELF/i386-feature-cet.s b/lld/test/ELF/i386-feature-cet.s
index a7de05a1870dc..606c88d60894b 100644
--- a/lld/test/ELF/i386-feature-cet.s
+++ b/lld/test/ELF/i386-feature-cet.s
@@ -58,7 +58,7 @@
# DISASM: Disassembly of section .text:
# DISASM: 00401200 <func1>:
-# DISASM-NEXT: 401200: calll 0x401230 <func2+0x401230>
+# DISASM-NEXT: 401200: calll 0x401230 <func1+0x30>
# DISASM-NEXT: 401205: calll 0x401240 <ifunc>
# DISASM-NEXT: retl
diff --git a/lld/test/ELF/loongarch-relax-call36.s b/lld/test/ELF/loongarch-relax-call36.s
index fa6e79dfa5803..e2c81460e162f 100644
--- a/lld/test/ELF/loongarch-relax-call36.s
+++ b/lld/test/ELF/loongarch-relax-call36.s
@@ -25,8 +25,8 @@
# RELAX-NEXT: nop
# RELAX-NEXT: nop
## offset = .plt(0x10400)+32 - 0x10010 = 1040
-# RELAX-NEXT: 10010: bl 1040 <bar+0x10420>
-# RELAX-NEXT: b 1036 <bar+0x10420>
+# RELAX-NEXT: 10010: bl 1040 <_start_end+0x404>
+# RELAX-NEXT: b 1036 <_start_end+0x404>
# RELAX-EMPTY:
# RELAX-NEXT: <a>:
# RELAX-NEXT: 10018: ret
diff --git a/lld/test/ELF/x86-64-feature-cet.s b/lld/test/ELF/x86-64-feature-cet.s
index 6a88463ff8bfd..3f7b3180d6a44 100644
--- a/lld/test/ELF/x86-64-feature-cet.s
+++ b/lld/test/ELF/x86-64-feature-cet.s
@@ -65,8 +65,8 @@
# DISASM: Disassembly of section .text:
# DISASM: 0000000000201330 <func1>:
-# DISASM-NEXT: 201330: callq 0x201360 <func2+0x201360>
-# DISASM-NEXT: 201335: callq 0x201370 <func2+0x201370>
+# DISASM-NEXT: 201330: callq 0x201360 <ifunc+0x25>
+# DISASM-NEXT: 201335: callq 0x201370 <ifunc+0x35>
# DISASM-NEXT: retq
# DISASM: Disassembly of section .plt:
diff --git a/llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test b/llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
index 4b0da011c9d46..89f699c9fe4b1 100644
--- a/llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
+++ b/llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
@@ -5,33 +5,28 @@
## to reproduce the original failure.
## Two empty sections, one with symbol in, one without.
-# RUN: yaml2obj %s --docnum=1 -o %t1 -D SIZE1=0 -D SIZE2=0 -D SECTION=.second -D INDEX=SHN_ABS
+# RUN: yaml2obj %s --docnum=1 -o %t1 -D SIZE1=0 -D SIZE2=0 -D SECTION=.second -D INDEX=SHN_ABS -D VALUE=0x5
# RUN: llvm-objdump -d %t1 | FileCheck %s --check-prefix=TARGET
-# RUN: yaml2obj %s --docnum=1 -o %t2 -D SIZE1=0 -D SIZE2=0 -D SECTION=.first -D INDEX=SHN_ABS
+# RUN: yaml2obj %s --docnum=1 -o %t2 -D SIZE1=0 -D SIZE2=0 -D SECTION=.first -D INDEX=SHN_ABS -D VALUE=0x5
# RUN: llvm-objdump -d %t2 | FileCheck %s --check-prefix=TARGET
## Two sections, one empty with symbol, other non-empty, without symbol.
-# RUN: yaml2obj %s --docnum=1 -o %t3 -D SIZE1=1 -D SIZE2=0 -D SECTION=.second -D INDEX=SHN_ABS
+# RUN: yaml2obj %s --docnum=1 -o %t3 -D SIZE1=1 -D SIZE2=0 -D SECTION=.second -D INDEX=SHN_ABS -D VALUE=0x5
# RUN: llvm-objdump -d %t3 | FileCheck %s --check-prefix=TARGET
-# RUN: yaml2obj %s --docnum=1 -o %t4 -D SIZE1=0 -D SIZE2=1 -D SECTION=.first -D INDEX=SHN_ABS
+# RUN: yaml2obj %s --docnum=1 -o %t4 -D SIZE1=0 -D SIZE2=1 -D SECTION=.first -D INDEX=SHN_ABS -D VALUE=0x5
# RUN: llvm-objdump -d %t4 | FileCheck %s --check-prefix=TARGET
-## Fall back to absolute symbol if no symbol found in candidate sections.
-# RUN: yaml2obj %s --docnum=1 -o %t5 -D SIZE1=1 -D SIZE2=0 -D SECTION=.caller -D INDEX=SHN_ABS
+## Fall back to absolute symbols if no symbol found in candidate sections.
+# RUN: llvm-objcopy -N foo --add-symbol absol=0 %p/../ELF/Inputs/call-absolute-symbol.elf-x86_64 %t5
# RUN: llvm-objdump -d %t5 | FileCheck %s --check-prefix=ABSOLUTE
-## Show that other symbols with reserved st_shndx values are treated as absolute
-## symbols.
-# RUN: yaml2obj %s --docnum=1 -o %t6 -D SIZE1=1 -D SIZE2=0 -D SECTION=.caller -D INDEX=SHN_LOPROC
-# RUN: llvm-objdump -d %t6 | FileCheck %s --check-prefix=ABSOLUTE
-
## Print no target if no symbol in section/absolute symbol found.
-# RUN: llvm-objcopy %t5 %t7 -N other
-# RUN: llvm-objdump -d %t7 | FileCheck %s --check-prefix=FAIL
+# RUN: llvm-objcopy %p/../ELF/Inputs/call-absolute-symbol.elf-x86_64 %t6 -N foo
+# RUN: llvm-objdump -d %t6 | FileCheck %s --check-prefix=FAIL
# TARGET: callq 0x5 <target>
-# ABSOLUTE: callq 0x5 <other+0x5>
-# FAIL: callq 0x5{{$}}
+# ABSOLUTE: callq 0x100 <absol+0x100>
+# FAIL: callq 0x100{{$}}
--- !ELF
FileHeader:
@@ -58,7 +53,7 @@ Sections:
Symbols:
- Name: target
Section: [[SECTION]]
- Value: 0x5
+ Value: [[VALUE]]
- Name: other
Index: [[INDEX]]
Value: 0x0
>From bf0a48615f43f1d41d2780e7c9c1b4b80fb9c1cd Mon Sep 17 00:00:00 2001
From: Arjun Patel <arjunpatel151002 at gmail.com>
Date: Sun, 22 Jun 2025 20:42:23 -0400
Subject: [PATCH 3/3] Add tests
---
.../tools/llvm-objdump/RISCV/riscv-sym-search.s | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
create mode 100644 cross-project-tests/tools/llvm-objdump/RISCV/riscv-sym-search.s
diff --git a/cross-project-tests/tools/llvm-objdump/RISCV/riscv-sym-search.s b/cross-project-tests/tools/llvm-objdump/RISCV/riscv-sym-search.s
new file mode 100644
index 0000000000000..01fc4806116a4
--- /dev/null
+++ b/cross-project-tests/tools/llvm-objdump/RISCV/riscv-sym-search.s
@@ -0,0 +1,17 @@
+# RUN: %clang --target=fuchsia-elf-riscv64 -march=rv64g %s -nostdlib -o %t
+# RUN: llvm-objdump -d %t | FileCheck %s
+
+# CHECK: auipc a0, 0x101
+# CHECK: ld a0, 0x8(a0) <ldata+0x1000>
+.global _start
+.text
+_start:
+ la a0, gdata
+
+.skip 0x100000
+ldata:
+ .int 0
+
+.data
+gdata:
+ .int 0