<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - When disassembling thumb an unknown instruction causes disassembly to restart on an unaligned boundary"
   href="https://bugs.llvm.org/show_bug.cgi?id=47994">47994</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>When disassembling thumb an unknown instruction causes disassembly to restart on an unaligned boundary
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>tools
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>llvm-objdump
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>smithp352@googlemail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Consider the example below:        
        .text
        .thumb
        ldab r0, [r0]
        nop
The ldab instruction requires armv8, so the file is assembled with:
llvm-mc --triple=armv8 -filetype=obj -o file.o file.s

When we tell llvm-objdump that the file is armv8, it disassembles
correctly:
llvm-objdump -d file.o --triple=armv8

ldab.o: file format elf32-littlearm

Disassembly of section .text:

00000000 <$t.0>:
       0: d0 e8 8f 0f   ldab    r0, [r0]
       4: 00 bf         nop

When we don't, both instructions are disassembled incorrectly:

llvm-objdump -d ldab.o

ldab.o: file format elf32-littlearm

Disassembly of section .text:

00000000 <$t.0>:
       0: d0            <unknown>
       1: e8 8f         ldrh    r0, [r5, #62]
       3: 0f 00         movs    r7, r1
       5: bf            <unknown>

Note that disassembly restarts at address 0x1, an unaligned address that can
never hold a Thumb instruction. As a result the nop, which is a known
instruction, is not disassembled correctly either.

Ideally, when llvm-objdump encounters an instruction it doesn't understand, it
should apply the rule for determining how large a Thumb instruction is and
skip ahead that many bytes. That would prevent a single <unknown> instruction
from making the rest of the disassembly unusable.
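
As a rough illustration of that skip-ahead behaviour, here is a minimal Python
sketch. It is not llvm-objdump code: the names thumb_insn_size and split_thumb
are hypothetical, and it assumes little-endian instruction encoding as in the
elf32-littlearm object above. It only applies the bits[15:11] size rule from
the F3.1 text quoted at the end of this report.

```python
# Values of bits[15:11] that mark the first halfword of a 32-bit T32
# instruction, taken from the F3.1 rule quoted in this report.
WIDE_PREFIXES = (0b11101, 0b11110, 0b11111)


def thumb_insn_size(halfword):
    """Return the instruction size in bytes (2 or 4) from its first halfword."""
    prefix = (halfword >> 11) & 0x1F
    return 4 if prefix in WIDE_PREFIXES else 2


def split_thumb(code):
    """Yield (offset, size) for each instruction slot in Thumb code bytes.

    Assumes little-endian halfwords. A trailing odd byte is ignored, and a
    32-bit instruction cut off by the end of the data is reported with only
    the bytes that remain.
    """
    offset = 0
    while len(code) - offset >= 2:
        halfword = int.from_bytes(code[offset:offset + 2], "little")
        size = min(thumb_insn_size(halfword), len(code) - offset)
        yield offset, size
        offset += size


# Bytes of the .text section from the example: ldab r0, [r0] then nop.
code = bytes.fromhex("d0e88f0f00bf")
print(list(split_thumb(code)))  # [(0, 4), (4, 2)]
```

With this stepping, a disassembler that cannot decode the 4 bytes at offset 0
would still resume at offset 4, where the nop is.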

"F3.1 T32 instruction set encoding
The T32 instruction stream is a sequence of halfword-aligned halfwords. Each
T32 instruction is either a single 16-bit halfword in that stream, or a 32-bit
instruction consisting of two consecutive halfwords in that stream.
If the value of bits[15:11] of the halfword being decoded is one of the
following, the halfword is the first halfword of a 32-bit instruction:
• 0b11101.
• 0b11110.
• 0b11111.
Otherwise, the halfword is a 16-bit instruction."</pre>
        </div>
      </p>


    </body>
</html>