[llvm-bugs] [Bug 26279] New: X86 Decoder doesn't decode properly REP STOSW

Sun Jan 24 03:47:43 PST 2016

https://llvm.org/bugs/show_bug.cgi?id=26279

            Bug ID: 26279
           Summary: X86 Decoder doesn't decode properly REP STOSW
           Product: new-bugs
           Version: 3.8
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: yaten at probud.homeip.net
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

In LLVM 3.7.0 and 3.7.1 (I couldn't download anything newer, but 3.7.1 is very
recent), passing REP STOSW (0x66 0xF3 0xAB) to llvm::MCDisassembler's
getInstruction() in X86 32-bit mode returns "stosd dword ptr es:[edi], eax"
(instruction opcode STOSL, op 0 - register EDI).

The correct outcome should be either a REP_PREFIX instruction followed by a
STOSW instruction with op 0 - register DI, or a single REP STOSW instruction
with op 0 - register DI.

The 0x66 0xF3 0xAB encoding of REP STOSW was generated by the Microsoft Visual
Studio 2010 compiler, so I expect this encoding to be widely used.

As far as I could ascertain from looking at the code, the problem stems from
the use of insn->necessaryPrefixLocation in X86DisassemblerDecoder.cpp. Both
prefixes of the instruction are detected correctly: the operand size override
(0x66) and the rep prefix (0xF3). However, readPrefixes() sets
necessaryPrefixLocation to the location of the rep prefix. Later, in getID(),
the line "if( insn->mode != MODE_16BIT && isPrefixAtLocation(insn, 0x66,
insn->necessaryPrefixLocation))" looks for the operand size override prefix at
the location of the rep prefix and therefore does not find it, so the
instruction is classified as STOSL rather than STOSW.

The assumption that an instruction has only one prefix, so that it is enough to
note its location and look for it only there, seems wrong. The Intel 64 and
IA-32 Architectures Software Developer's Manual, Volume 2, section 2.1.1
(Instruction Prefixes), states that an instruction may have more than one
prefix, from different prefix groups, and that prefixes from Groups 1 through 4
may be placed in any order relative to each other.

Possibly it would be better to treat necessaryPrefixLocation as only the
highest possible address where the prefix byte may be located, and to have
isPrefixAtLocation() search for the prefix byte in every byte between
insn->startLocation and the location given as argument, not just at that one
location?
