[llvm-bugs] [Bug 47994] New: When disassembling thumb an unknown instruction causes disassembly to restart on an unaligned boundary

via llvm-bugs llvm-bugs at lists.llvm.org
Wed Oct 28 03:21:33 PDT 2020


https://bugs.llvm.org/show_bug.cgi?id=47994

            Bug ID: 47994
           Summary: When disassembling thumb an unknown instruction causes
                    disassembly to restart on an unaligned boundary
           Product: tools
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: llvm-objdump
          Assignee: unassignedbugs at nondot.org
          Reporter: smithp352 at googlemail.com
                CC: llvm-bugs at lists.llvm.org

Consider the example below:        
        .text
        .thumb
        ldab r0, [r0]
        nop
The ldab instruction requires armv8, so the file is assembled with:
llvm-mc --triple=armv8 -filetype=obj -o file.o file.s

When we tell llvm-objdump that the file is armv8 then this disassembles
correctly:
llvm-objdump -d file.o --triple=armv8

ldab.o: file format elf32-littlearm

Disassembly of section .text:

00000000 <$t.0>:
       0: d0 e8 8f 0f   ldab    r0, [r0]
       4: 00 bf         nop

When we don't, both instructions are disassembled incorrectly:

llvm-objdump -d ldab.o

ldab.o: file format elf32-littlearm

Disassembly of section .text:

00000000 <$t.0>:
       0: d0            <unknown>
       1: e8 8f         ldrh    r0, [r5, #62]
       3: 0f 00         movs    r7, r1
       5: bf            <unknown>

Note that disassembly restarts at address 0x1, an unaligned address that can
never hold a Thumb instruction. As a result the nop, which is a known
instruction even without armv8, is not disassembled correctly either.

Ideally, when llvm-objdump encounters an instruction it doesn't understand, it
should follow the rule for determining how large a Thumb instruction is and
skip ahead that many bytes. That would prevent a single <unknown> instruction
from making the rest of the disassembly unusable.

"F3.1 T32 instruction set encoding
The T32 instruction stream is a sequence of halfword-aligned halfwords. Each
T32 instruction is either a single 16-bit halfword in that stream, or a 32-bit
instruction consisting of two consecutive halfwords in that stream.
If the value of bits[15:11] of the halfword being decoded is one of the
following, the halfword is the first halfword of a 32-bit instruction:
• 0b11101.
• 0b11110.
• 0b11111.
Otherwise, the halfword is a 16-bit instruction."
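
As a rough illustration of the rule quoted above, here is a minimal Python
sketch (not llvm-objdump code) that finds instruction boundaries using only
bits[15:11] of each first halfword. The names thumb_insn_size and
instruction_boundaries are made up for this example, and it assumes
little-endian code, as in the elf32-littlearm object from the report.

```python
import struct

# bits[15:11] values that mark the first halfword of a 32-bit T32
# instruction, taken from the F3.1 text quoted in the report.
WIDE_PREFIXES = {0b11101, 0b11110, 0b11111}


def thumb_insn_size(halfword):
    """Return the size in bytes (2 or 4) of the instruction that starts
    with this halfword. Hypothetical helper, not an LLVM API."""
    return 4 if ((halfword >> 11) & 0x1F) in WIDE_PREFIXES else 2


def instruction_boundaries(code):
    """Yield (offset, size) pairs for little-endian Thumb code.

    No decoding is attempted: the size comes from the first halfword
    alone, so an unrecognised instruction cannot knock the walk off
    halfword alignment. A trailing 32-bit instruction whose second
    halfword is missing is still reported with size 4.
    """
    offset = 0
    while offset + 2 <= len(code):
        (halfword,) = struct.unpack_from("<H", code, offset)
        size = thumb_insn_size(halfword)
        yield offset, size
        offset += size


# .text bytes from the report: ldab r0, [r0] followed by nop.
text = bytes.fromhex("d0e88f0f00bf")
boundaries = list(instruction_boundaries(text))
print(boundaries)  # [(0, 4), (4, 2)]
```

With this walk the first halfword 0xe8d0 has bits[15:11] = 0b11101, so four
bytes are skipped even if the instruction itself is unknown, and the next
instruction starts at offset 4, where the nop lives.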
