[llvm-bugs] [Bug 49974] New: [arm disassembler] Incorrect number of operands in MCInst generated by disassembler
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Apr 15 10:49:03 PDT 2021
https://bugs.llvm.org/show_bug.cgi?id=49974
Bug ID: 49974
Summary: [arm disassembler] Incorrect number of operands in
MCInst generated by disassembler
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: ARM
Assignee: unassignedbugs at nondot.org
Reporter: minyihh at uci.edu
CC: llvm-bugs at lists.llvm.org, smithp352 at googlemail.com,
Ties.Stuij at arm.com
This ticket covers two bugs. They are similar enough that I combined them into
one report.
First, given the encoded instruction "0x26 0x00 0x00 0xeb", we can disassemble
it with the following command:
```
$ echo "0x26 0x00 0x00 0xeb" | llvm-mc --disassemble -triple=armv7 -o -
.text
bl #152
$
```
Although the output above looks normal, the disassembled `MCInst` tells a
different story. (Currently the debug output of `llvm-mc --disassemble` doesn't
print the disassembled `MCInst`, but you can observe it in other ways, such as
with gdb.) It looks like this:
```
<MCInst #703 BL <MCOperand Imm:152> <MCOperand Imm:14> <MCOperand Reg:0>>
```
According to the instruction definition of `BL`, it takes only 1 operand rather
than 3. The latter two are predicate operands (the second operand represents
`ARMCC::AL` and the third is the predicate register it depends on) inserted by
mistake.
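To show where the values 152 and 14 in that `MCInst` come from, here is a
minimal sketch that decodes the four bytes by hand. It assumes the A32 `BL`
(immediate) layout: condition code in bits 31:28, `0b1011` in bits 27:24, and a
signed 24-bit word offset in bits 23:0. The function name `decode_arm_bl` is
made up for this illustration and is not part of LLVM.

```python
import struct

def decode_arm_bl(raw: bytes):
    """Decode an A32 BL (immediate) word into (condition, byte_offset).

    Hypothetical helper for illustration only; not LLVM's decoder.
    """
    (word,) = struct.unpack("<I", raw)   # bytes are given in little-endian order
    if (word >> 24) & 0xF != 0xB:        # bits 27:24 must be 0b1011 for BL
        raise ValueError("not a BL (immediate) encoding")
    cond = word >> 28                    # 14 (0xE) is the "always" condition
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:                 # sign-extend the 24-bit field
        imm24 -= 1 << 24
    return cond, imm24 * 4               # offset is stored in words, not bytes

cond, offset = decode_arm_bl(bytes([0x26, 0x00, 0x00, 0xEB]))
print(cond, offset)  # 14 152
```

The condition field is already part of the encoding, which may be why the
disassembler materializes it as `Imm:14` even though the `BL` definition has no
predicate operands.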
Another input that triggers a similar bug is "0xad 0xf2 0x7c 0x4d":
```
$ echo "0xad 0xf2 0x7c 0x4d" | llvm-mc --disassemble -triple=thumbv7 -o -
.text
subw sp, sp, #1148
$
```
Again, the disassembled text is benign, but the disassembled `MCInst` looks
like this:
```
<MCInst #4193 t2SUBspImm12 \
<MCOperand Reg:15> <MCOperand Reg:15> \
<MCOperand Imm:1148> \
<MCOperand Imm:14> <MCOperand Reg:0> \
<MCOperand Reg:0> <MCOperand Reg:0>>
```
According to the instruction definition of `t2SUBspImm12`, there should be only
5 operands rather than 7. The last two operands are inserted by mistake.
These bugs affect users that directly consume the disassembled `MCInst`
object. For example, feeding the disassembled `MCInst` into LLVM MCA causes MCA
to choke, because MCA is more sensitive to the total number of operands in an
`MCInst`.
These two bugs were never caught because we never directly test the in-memory
`MCInst` object (or its textual format). The testing infrastructure we have
translates the `MCInst` into assembly code before checking it. But as you can
see above, this cannot detect surplus operands appended at the _end_.
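As a rough sketch of the kind of check that would catch this, the script below
counts the operands in the textual `MCInst` dump and compares the count with
the expected one. The expected counts (1 for `BL`, 5 for `t2SUBspImm12`) are
copied from this report, and the table and function names are hypothetical.
Inside LLVM the expected count would presumably come from the instruction
description (for instance `MCInstrDesc::getNumOperands()`) rather than a
hand-written table.

```python
import re

# Expected operand counts as stated in this report (assumption: a real test
# would read these from the instruction definitions instead).
EXPECTED_OPERANDS = {"BL": 1, "t2SUBspImm12": 5}

def surplus_operands(dump: str) -> int:
    """Return how many more operands the dump has than the opcode expects."""
    match = re.match(r"<MCInst #\d+ (\w+)", dump)
    if match is None:
        raise ValueError("not an MCInst dump")
    opcode = match.group(1)
    actual = dump.count("<MCOperand ")   # one occurrence per operand
    return actual - EXPECTED_OPERANDS[opcode]

bl_dump = "<MCInst #703 BL <MCOperand Imm:152> <MCOperand Imm:14> <MCOperand Reg:0>>"
sub_dump = (
    "<MCInst #4193 t2SUBspImm12 "
    "<MCOperand Reg:15> <MCOperand Reg:15> "
    "<MCOperand Imm:1148> "
    "<MCOperand Imm:14> <MCOperand Reg:0> "
    "<MCOperand Reg:0> <MCOperand Reg:0>>"
)
print(surplus_operands(bl_dump), surplus_operands(sub_dump))  # 2 2
```

A non-zero result for either dump flags the trailing operands that the
assembly-text comparison lets through.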