<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/132736>132736</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[RISCV] llvm-objdump disassembly ends prematurely using --disassemble-symbols when generating object files directly from assembly
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
thebigclub
</td>
</tr>
</table>
<pre>
`llvm-objdump --disassemble-symbols` gives different output for an object file compiled directly from C than for one assembled from the compiler-generated assembly of the same source:
```bash
$ cat << EOF > test.c
int testfn(int n) {
for (int i = 0; i < 5; i++) {
n += i;
}
return n;
}
EOF
$
$ # First generate directly from C file
$
$ clang -c test.c
$ llvm-nm -S test.o
00000000 00000048 T testfn
$
$ llvm-objdump -t test.o | grep testfn
00000000 g F .text 00000048 testfn
$
$ llvm-readobj --elf-output-style=GNU -s test.o | grep testfn
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
$
$ llvm-objdump --disassemble-symbols=testfn test.o
test.o: file format elf32-littleriscv
Disassembly of section .text:
00000000 <testfn>:
0: 1141 addi sp, sp, -0x10
2: c606 sw ra, 0xc(sp)
4: c422 sw s0, 0x8(sp)
6: 0800 addi s0, sp, 0x10
8: fea42a23 sw a0, -0xc(s0)
c: 4501 li a0, 0x0
e: fea42823 sw a0, -0x10(s0)
12: a001 j 0x12 <testfn+0x12>
14: ff042583 lw a1, -0x10(s0)
18: 4511 li a0, 0x4
1a: 00b54063 blt a0, a1, 0x1a <testfn+0x1a>
1e: a001 j 0x1e <testfn+0x1e>
20: ff042583 lw a1, -0x10(s0)
24: ff442503 lw a0, -0xc(s0)
28: 952e add a0, a0, a1
2a: fea42a23 sw a0, -0xc(s0)
2e: a001 j 0x2e <testfn+0x2e>
30: ff042503 lw a0, -0x10(s0)
34: 0505 addi a0, a0, 0x1
36: fea42823 sw a0, -0x10(s0)
3a: a001 j 0x3a <testfn+0x3a>
3c: ff442503 lw a0, -0xc(s0)
40: 40b2 lw ra, 0xc(sp)
42: 4422 lw s0, 0x8(sp)
44: 0141 addi sp, sp, 0x10
46: 8082 ret
$
$ # Now generate from assembly file
$
$ clang -S test.c
$ clang -c test.s
$ llvm-nm -S test.o | grep testfn
00000000 00000048 T testfn
$
$ llvm-objdump -t test.o | grep testfn
00000000 g F .text 00000048 testfn
$
$ llvm-readobj --elf-output-style=GNU -s test.o | grep testfn
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
$
$ llvm-objdump --disassemble-symbols=testfn test.o
test.o: file format elf32-littleriscv
Disassembly of section .text:
00000000 <testfn>:
0: 1141 addi sp, sp, -0x10
2: c606 sw ra, 0xc(sp)
4: c422 sw s0, 0x8(sp)
6: 0800 addi s0, sp, 0x10
8: fea42a23 sw a0, -0xc(s0)
c: 4501 li a0, 0x0
e: fea42823 sw a0, -0x10(s0)
12: a001 j 0x12 <testfn+0x12>
```
You can see that the disassembly output ends prematurely, after the `j` at offset 0x12, when going from C => assembly => object file compared to C => object file. The size of the `testfn()` function is 72 (0x48) bytes in both cases. If I use the `-d` option instead of `--disassemble-symbols`, the entire file is disassembled properly for the assembly version.
<br>The local labels sit at the same addresses in both object files but are named differently (`.L0` for C => object file, `.LBB0_N` for C => assembly => object file):
```bash
# When compiled from C file
$ llvm-readobj --elf-output-style=GNU -s test.o
Symbol table '.symtab' contains 10 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FILE LOCAL DEFAULT ABS test.c
2: 00000000 0 NOTYPE LOCAL DEFAULT 2 $x
3: 00000014 0 NOTYPE LOCAL DEFAULT 2 .L0
4: 0000003c 0 NOTYPE LOCAL DEFAULT 2 .L0
5: 00000020 0 NOTYPE LOCAL DEFAULT 2 .L0
6: 00000030 0 NOTYPE LOCAL DEFAULT 2 .L0
7: 00000000 0 NOTYPE LOCAL DEFAULT 4 $d
8: 00000000 0 NOTYPE LOCAL DEFAULT 6 $d
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
$
# When compiled from assembly file
$ llvm-readobj --elf-output-style=GNU -s test.o
Symbol table '.symtab' contains 10 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FILE LOCAL DEFAULT ABS test.c
2: 00000000 0 NOTYPE LOCAL DEFAULT 2 $x
3: 00000014 0 NOTYPE LOCAL DEFAULT 2 .LBB0_1
4: 0000003c 0 NOTYPE LOCAL DEFAULT 2 .LBB0_4
5: 00000020 0 NOTYPE LOCAL DEFAULT 2 .LBB0_2
6: 00000030 0 NOTYPE LOCAL DEFAULT 2 .LBB0_3
7: 00000000 0 NOTYPE LOCAL DEFAULT 4 $d
8: 00000000 0 NOTYPE LOCAL DEFAULT 6 $d
9: 00000000 72 FUNC GLOBAL DEFAULT 2 testfn
```
<br>I noticed that if I edit the assembly file so that the jumps referencing `.LBB0_1` target a different local label instead, the disassembly output advances and stops at the next local label (`.LBB0_2`):
```bash
$ # Edit test.s
$ clang -c test.s
$ llvm-objdump --disassemble-symbols=testfn test.o
test.o: file format elf32-littleriscv
Disassembly of section .text:
00000000 <testfn>:
0: 1141 addi sp, sp, -0x10
2: c606 sw ra, 0xc(sp)
4: c422 sw s0, 0x8(sp)
6: 0800 addi s0, sp, 0x10
8: fea42a23 sw a0, -0xc(s0)
c: 4501 li a0, 0x0
e: fea42823 sw a0, -0x10(s0)
12: a001 j 0x12 <testfn+0x12>
14: ff042583 lw a1, -0x10(s0)
18: 4511 li a0, 0x4
1a: 00b54063 blt a0, a1, 0x1a <testfn+0x1a>
1e: a001 j 0x1e <testfn+0x1e>
```
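<br>To check that the cut-off offset lines up with a local label, the symbol table can be scanned for the lowest-addressed `.L*` symbol that falls strictly inside the function. This is only a rough sketch for correlating the two outputs: the helper name `first_label_inside` is made up for this report, it is not part of LLVM, and it makes no claim about how llvm-objdump actually decides where to stop.
```bash
# Hypothetical helper (not an LLVM tool): print the lowest-addressed local
# label (.L*) that lies strictly inside the given function.
# Reads `llvm-readobj --elf-output-style=GNU -s` output on stdin.
first_label_inside() {
  local func=$1 start=-1 size=0 best=-1 best_name=""
  local -a addrs=() names=()
  local num value sz type bind vis ndx name i
  while read -r num value sz type bind vis ndx name; do
    # Keep only symbol rows that have a name, e.g.
    # "3: 00000014 0 NOTYPE LOCAL DEFAULT 2 .LBB0_1".
    [[ $num =~ ^[0-9]+:$ && -n $name ]] || continue
    if [[ $name == "$func" ]]; then
      # Value column is hex, Size column is decimal.
      start=$((16#$value)); size=$sz
    elif [[ $name == .L* ]]; then
      addrs+=("$((16#$value))"); names+=("$name")
    fi
  done
  for i in "${!addrs[@]}"; do
    # Label must be after the function start and before its end.
    if (( addrs[i] > start && start + size > addrs[i] )); then
      if (( 0 > best || best > addrs[i] )); then
        best=${addrs[i]}; best_name=${names[i]}
      fi
    fi
  done
  if (( best >= 0 )); then printf '0x%x %s\n' "$best" "$best_name"; fi
}

# Usage:
#   llvm-readobj --elf-output-style=GNU -s test.o | first_label_inside testfn
```
For the symbol table of the assembly-built `test.o` shown above, this prints `0x14 .LBB0_1`, which is the offset immediately after the last disassembled instruction (the 2-byte `j` at 0x12).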
<br>Tool versions:
```bash
$ clang -v
clang version 21.0.0git (https://github.com/llvm/llvm-project.git 30ff508614c90311509adc0890e32e7f86ec4fb8)
Target: riscv32-unknown-unknown-elf
$
$ llvm-objdump -v
LLVM (http://llvm.org/):
LLVM version 21.0.0git
Optimized build.
Registered Targets:
riscv32 - 32-bit RISC-V
riscv64 - 64-bit RISC-V
```
</pre>