[LLVMdev] X86 disassembler 0x66 prefix
Sean Callanan
scallanan at apple.com
Wed Dec 22 18:17:05 PST 2010
Olivier,
thanks for identifying that issue! Your fix corrects the specific testcase you provide, but it could produce incorrect results in other cases.
Here's a bit of background on necessary_prefix_location.
necessary_prefix_location is supposed to keep track of where prefixes like 66, f3, and f2 – which affect the class of instruction being decoded fundamentally – should be expected to be. If you look at the "Instruction Format" section of the Intel manuals (specifically, Volume 2A of the Intel 64 and IA-32 Architectures Software Developer's Manual), it specifies that
–
Some instructions may use F2H,F3H as a mandatory prefix to express distinct functionality. A mandatory prefix generally should be placed after other optional prefixes (exception to this is discussed in Section 2.2.1, “REX Prefixes”).
–
Also,
–
Some SSE2/SSE3/SSSE3/SSE4 instructions and instructions using a three-byte sequence of primary opcode bytes may use 66H as a mandatory prefix to express distinct functionality. A mandatory prefix generally should be placed after other optional prefixes (exception to this is discussed in Section 2.2.1, “REX Prefixes”).
–
This is why the decoder maintains a special location – necessary_prefix_location – at which f2, f3, or 66 must reside. Perhaps I ought to call it "mandatory_prefix_location"...
In the 64-bit case (under "REX Prefixes"), the manual states:
–
Only one REX prefix is allowed per instruction. If used, the REX prefix byte must immediately precede the opcode byte or the escape opcode byte (0FH). When a REX prefix is used in conjunction with an instruction containing a mandatory prefix, the mandatory prefix must come before the REX so the REX prefix can be immediately preceding the opcode or the escape byte.
–
I took a quick look in the sources and saw that necessaryPrefixLocation was being set at two locations: line 382, and line 387. Those are both only enabled if insn->mode == MODE_64BIT, which in X86 (of course) it isn't. They are handling the specific 64-bit case; the 32-bit case is unhandled (as you spotted).
Your patch would set necessaryPrefixLocation to wherever the 66 prefix is found, which works most of the time but would break cases in which 66 is not used as a mandatory prefix, but merely as an optional prefix (the operand-size prefix). Although obscure, such a use is possible. I have attached a patch that solves the problem more generally.
Sean
On Dec 22, 2010, at 8:31 AM, Olivier Meurant wrote:
> There is a problem on X86 disassembler for instructions beginning with x86 prefix :
>
> $ echo "0x66 0x0f 0x6f 0x8f 0x00 0x00 0x00 0x00" | llvm-mc --disassemble
> movdqa (%edi), %xmm1
>
> $ echo "0x53 0x66 0x0f 0x6f 0x8f 0x00 0x00 0x00 0x00" | llvm-mc --disassemble
> pushl %ebx
> <stdin>:1:6: warning: invalid instruction encoding
> 0x53 0x66 0x0f 0x6f 0x8f 0x00 0x00 0x00 0x00
> ^
>
>
> I attach a patch, but I'm not really familiar with this code, so if someone can review/correct, it would be great.
>
> The problem is between readPrefixes and getId : prefixLocation depends on buffer position, but necessaryPrefixLocation is never updated to match buffer position.
>
> Cheers,
> Olivier.
>
>
> <x86_disassembler_66_prefix.patch>_______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20101222/692701b0/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm.necessaary-prefix-location.diff
Type: application/octet-stream
Size: 445 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20101222/692701b0/attachment.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20101222/692701b0/attachment-0001.html>
More information about the llvm-dev
mailing list