[LLVMdev] X86 disassembler is quite broken on handling REX

Jun Koi junkoi2004 at gmail.com
Tue Dec 23 22:17:38 PST 2014


Hi,

I think the current X86 disassembler is quite broken in how it handles
REX prefixes in x86_64 code.

Below are some examples:

$ echo "0x0f,0xeb,0xc3" | ./Release+Asserts/bin/llvm-mc -disassemble -triple=x86_64
    .text
    por    %mm3, %mm0

$ echo "0x40,0x0f,0xeb,0xc3" | ./Release+Asserts/bin/llvm-mc -disassemble -triple=x86_64
    .text
    por    %mm3, %mm0

$ echo "0x41,0x0f,0xeb,0xc3" | ./Release+Asserts/bin/llvm-mc -disassemble -triple=x86_64
    .text
<stdin>:1:1: warning: invalid instruction encoding
0x41,0x0f,0xeb,0xc3
^


The last example should also decode to "por %mm3, %mm0", but the
disassembler rejects it as an invalid encoding.

The cause is this line in X86DisassemblerDecoder.cpp:

    rm  |= bFromREX(insn->rexPrefix) << 3;

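This folds REX.B into the ModRM r/m register number. For "por" (0F EB)
the operands are MMX registers, so REX.B should be ignored. With prefix
0x41, REX.B is set and r/m becomes 3 | 8 = 11, and there is no %mm11.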
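
To make the arithmetic concrete, here is a minimal sketch of the r/m
computation, assuming the register-to-register form (ModRM.mod == 3). It is
not LLVM's code: RegClass, rexB, modrmRM, modrmReg, rmRegister and
checkPorExamples are hypothetical names used only for illustration, and the
rule that REX.B is ignored for MMX operands follows the reading of the
manual given in this post.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operand classes for illustration; not LLVM's actual types.
enum class RegClass { GPR64, MMX };

// A REX prefix has the form 0100WRXB, so REX.B is bit 0.
// A value of 0 is used here to mean "no REX prefix present".
inline unsigned rexB(uint8_t rex) { return rex & 0x1; }

// ModRM layout: mod (bits 7-6), reg (bits 5-3), r/m (bits 2-0).
inline unsigned modrmReg(uint8_t modrm) { return (modrm >> 3) & 0x7; }
inline unsigned modrmRM(uint8_t modrm) { return modrm & 0x7; }

// Register number of the r/m operand, assuming ModRM.mod == 3.
inline unsigned rmRegister(uint8_t rex, uint8_t modrm, RegClass rc) {
  unsigned rm = modrmRM(modrm);
  // Assumption from the post: there are only eight MMX registers, so
  // REX.B must not extend the register number for MMX operands.
  if (rc != RegClass::MMX)
    rm |= rexB(rex) << 3;
  return rm;
}

// The three byte sequences from the post all share ModRM 0xc3.
inline void checkPorExamples() {
  assert(modrmReg(0xc3) == 0);                          // destination: mm0
  assert(rmRegister(0x00, 0xc3, RegClass::MMX) == 3);   // no REX
  assert(rmRegister(0x40, 0xc3, RegClass::MMX) == 3);   // REX, B clear
  assert(rmRegister(0x41, 0xc3, RegClass::MMX) == 3);   // REX.B ignored
  assert(rmRegister(0x41, 0xc3, RegClass::GPR64) == 11); // REX.B applied
}
```

The unconditional form quoted above corresponds to always taking the
GPR64 path, which yields register 11 for the third example. The sketch
also shows why the decoder would need to know the operand class before
applying REX.B.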

Quite a lot of other instructions have REX applied to them in the same
way, although according to the manual it should be ignored.

I don't see a clean solution for this without significant changes to
the way we decode ModRM and without providing more information in the
.td files.

Any ideas?

Thanks,
Jun