[PATCH] Fixed disassembly of x86 prefixes.

Kevin Enderby enderby at apple.com
Fri Aug 30 13:55:49 PDT 2013


Looked at this and it seems better than what is there.  I agree about the whole thing of disassembling prefixes as separate instructions is kinda bogus but the llvm disassembler and MCInsts are just not set up to hang prefixes off them and get that style of disassembly.

Seems good to commit,
Kev

On Aug 30, 2013, at 1:24 PM, Eric Christopher <echristo at gmail.com> wrote:

> Adding Kevin and Jim to this since they've done the most work recently
> on the disassembler.
> 
> -eric
> 
> On Fri, Aug 30, 2013 at 12:02 PM, Richard Mitton
> <richard at codersnotes.com> wrote:
>> ping!
>> 
>> I should probably also mention this fixes an actual bug in lldb too.
>> 
>> Richard Mitton
>> richard at codersnotes.com
>> 
>> 
>> On 08/26/2013 01:30 PM, Richard Mitton wrote:
>>> 
>>> Fixed a bug where diassembling an instruction that had a prefix would
>>> cause LLVM to identify a 1-byte instruction, but then upon querying it for
>>> that 1-byte instruction would cause an undefined opcode.
>>> 
>>> Test case included.
>>> 
>>> 
>>> http://llvm-reviews.chandlerc.com/D1523
>>> 
>>> Files:
>>>   lib/Target/X86/Disassembler/X86DisassemblerDecoder.c
>>>   test/MC/Disassembler/X86/prefixes.txt
>>> 
>>> Index: lib/Target/X86/Disassembler/X86DisassemblerDecoder.c
>>> ===================================================================
>>> --- lib/Target/X86/Disassembler/X86DisassemblerDecoder.c
>>> +++ lib/Target/X86/Disassembler/X86DisassemblerDecoder.c
>>> @@ -314,20 +314,22 @@
>>>    while (isPrefix) {
>>>      prefixLocation = insn->readerCursor;
>>>  +    /* If we fail reading prefixes, just stop here and let the opcode
>>> reader deal with it */
>>>      if (consumeByte(insn, &byte))
>>> -      return -1;
>>> +      break;
>>>        /*
>>>       * If the byte is a LOCK/REP/REPNE prefix and not a part of the
>>> opcode, then
>>>       * break and let it be disassembled as a normal "instruction".
>>>       */
>>> +    if (insn->readerCursor - 1 == insn->startLocation && byte == 0xf0)
>>> +      break;
>>> +
>>> +    uint8_t nextByte;
>>>      if (insn->readerCursor - 1 == insn->startLocation
>>> -        && (byte == 0xf0 || byte == 0xf2 || byte == 0xf3)) {
>>> -      uint8_t nextByte;
>>> -      if (byte == 0xf0)
>>> -        break;
>>> -      if (lookAtByte(insn, &nextByte))
>>> -        return -1;
>>> +        && (byte == 0xf2 || byte == 0xf3)
>>> +        && !lookAtByte(insn, &nextByte))
>>> +    {
>>>        /*
>>>         * If the byte is 0xf2 or 0xf3, and any of the following
>>> conditions are
>>>         * met:
>>> Index: test/MC/Disassembler/X86/prefixes.txt
>>> ===================================================================
>>> --- /dev/null
>>> +++ test/MC/Disassembler/X86/prefixes.txt
>>> @@ -0,0 +1,59 @@
>>> +# RUN: llvm-mc --disassemble %s -triple=x86_64 | FileCheck %s
>>> +
>>> +# CHECK: lock
>>> +# CHECK-NEXT:  orl     $16, %fs:776
>>> +0xf0 0x64 0x83 0x0c 0x25 0x08 0x03 0x00 0x00 0x10
>>> +
>>> +# CHECK: movq  %fs:768, %rdi
>>> +0x64 0x48 0x8b 0x3c 0x25 0x00 0x03 0x00 0x00
>>> +
>>> +# CHECK: rep
>>> +# CHECK-NEXT:          stosq
>>> +0xf3 0x48 0xab
>>> +
>>> +# CHECK: rep
>>> +# CHECK-NEXT:          stosl
>>> +0xf3 0x67 0x48 0xab
>>> +
>>> +# CHECK: movl 32(%rbp), %eax
>>> +0x8b 0x45 0x20
>>> +
>>> +# CHECK: movl %es:32(%rbp), %eax
>>> +0x26 0x8b 0x45 0x20
>>> +
>>> +# CHECK: movl %es:32(%rbp), %eax
>>> +0x2e 0x26 0x8b 0x45 0x20
>>> +
>>> +# Test that multiple prefixes stack.
>>> +#    (todo- the correct disassembly is actually more like "es movl
>>> %cs:32(%rbp), %eax"
>>> +#    but we don't support that)
>>> +# CHECK: movl %cs:32(%rbp), %eax
>>> +0x26 0x2e 0x8b 0x45 0x20
>>> +
>>> +# Test that 0xf3 as part of the opcode works.
>>> +# CHECK: cvtdq2pd      (%rax), %xmm0
>>> +0xf3 0x0f 0xe6 0x00
>>> +
>>> +# CHECK: pause
>>> +0xf3 0x90
>>> +
>>> +# CHECK: nop
>>> +0x90
>>> +
>>> +# CHECK:               lock
>>> +# CHECK-NEXT:  nop
>>> +0xf0 0x90
>>> +
>>> +# Test that multiple redundant prefixes work (redundant, but valid x86).
>>> +# CHECK: rep
>>> +# CHECK-NEXT: rep
>>> +# CHECK-NEXT: stosq
>>> +0xf3 0xf3 0x48 0xab
>>> +
>>> +# Test that a prefix on it's own works. It's debatable as to if this is
>>> +# something that is considered valid, but however as LLVM's own
>>> disassembler
>>> +# has decided to disassemble prefixes as being separate opcodes, it
>>> therefore
>>> +# should be capable of re-consuming it's own input.
>>> +# CHECK: rep
>>> +0xf3
>>> +# ***IMPORTANT ^-- this must be at the end of the file to be a valid test
>>> ***
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




More information about the llvm-commits mailing list