[llvm-bugs] [Bug 48991] New: Possible bugs in X86_64 assembler

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Feb 1 10:39:36 PST 2021


https://bugs.llvm.org/show_bug.cgi?id=48991

            Bug ID: 48991
           Summary: Possible bugs in X86_64 assembler
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: MC
          Assignee: unassignedbugs at nondot.org
          Reporter: kobalicek.petr at gmail.com
                CC: llvm-bugs at lists.llvm.org

I'm comparing the LLVM assembler with AsmJit and have found a few cases that I
would like to report. I'm using an LLVM build from the main branch and the
llvm-mc tool, which is invoked like this for each supported instruction:

"""
echo "--Instruction String--" | llvm-mc --arch=x86-64 -x86-asm-syntax=intel \
    -show-encoding -output-asm-variant=1
"""

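For reference, the per-instruction invocation above could be scripted roughly
as follows. This is a minimal sketch, not the harness actually used for the
comparison: it assumes llvm-mc is on PATH, the helper names (parse_encoding,
assemble) are made up for illustration, and it relies on -show-encoding
annotating each instruction with an "encoding: [0x..,0x..]" comment.

```python
import re
import subprocess

# Flags copied from the invocation quoted above.
LLVM_MC = ["llvm-mc", "--arch=x86-64", "-x86-asm-syntax=intel",
           "-show-encoding", "-output-asm-variant=1"]

ENCODING_RE = re.compile(r"encoding:\s*\[([^\]]*)\]")


def parse_encoding(output):
    """Return the first 'encoding: [...]' comment as upper-case hex, or None."""
    match = ENCODING_RE.search(output)
    if match is None:
        return None
    tokens = [tok.strip() for tok in match.group(1).split(",")]
    # Anything that is not a 0x.. literal (for example a relocation
    # placeholder) is outside the scope of this sketch.
    if not all(tok.startswith("0x") for tok in tokens):
        raise ValueError(f"unsupported encoding tokens: {tokens}")
    return "".join(f"{int(tok, 16):02X}" for tok in tokens)


def assemble(instruction):
    """Pipe one instruction string into llvm-mc (assumed to be installed)."""
    proc = subprocess.run(LLVM_MC, input=instruction + "\n",
                          capture_output=True, text=True)
    return parse_encoding(proc.stdout)
```

With such a helper, the hex string returned for each instruction can be
compared directly against the encoding produced by another assembler.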


Here is a list of instructions that I think LLVM encodes wrongly:

[1] "lcall tbyte ptr [rcx+rdx]" encodes as "FF1C11" (FWORD PTR) although it
should be "48FF1C11" (TBYTE PTR).

[2] "vextractps ecx, xmm2, 1" encodes as "62F37D0817D101" (AVX512) although it
should be "C4E37917D101" (AVX).
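
To make the differences in [1] and [2] concrete, the leading bytes of the
reported encodings can be picked apart by hand. The sketch below is
illustrative only: the function names are invented here, 64-bit mode is
assumed, and it looks only at the first byte and at the ModRM/SIB bit layout,
which is all these two cases need.

```python
def classify_prefix(encoding_hex):
    """Classify the first byte of an encoding (64-bit mode assumed)."""
    first = bytes.fromhex(encoding_hex)[0]
    if first == 0x62:
        return "EVEX"
    if first == 0xC4:
        return "VEX3"
    if first == 0xC5:
        return "VEX2"
    if 0x40 <= first <= 0x4F:
        # Bit 3 of a REX byte is the W (64-bit operand size) bit.
        return "REX.W" if first & 0x08 else "REX"
    # Legacy prefixes such as 66h are not distinguished by this sketch.
    return "none"


def split_modrm(byte):
    """Split a ModRM byte into (mod, reg, rm); SIB uses the same 2/3/3 layout."""
    return (byte >> 6, (byte >> 3) & 7, byte & 7)
```

Under these assumptions, ModRM 0x1C in [1] splits into mod=0, reg=3 (the /3
form of opcode FF) and rm=4 (a SIB byte follows), so the two encodings differ
only in the leading 0x48 REX.W byte. In [2] the first byte alone separates the
EVEX form (0x62) from the three-byte VEX form (0xC4).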



Here is a list of instructions for which I think LLVM expects the wrong memory
operand size:

[3] "clrssbsy qword ptr [m64]" - LLVM wants "clrssbsy dword ptr [m32]" although
the Intel manual specifies that it is an 8-byte operation and the operand is
m64.

[4] "rstorssp qword ptr [m64]" - LLVM wants "rstorssp dword ptr [m32]" although
the Intel manual specifies that it is an 8-byte operation and the operand is
m64.



Here is a list of instructions that LLVM supports, but not in all documented
variations:

[5] "wrssd r32, r32"
    "wrssq r64, r64"
    "wrussd r32, r32"
    "wrussq r64, r64"

For some reason LLVM only supports the "wr[u]ss[d|q] [mem], reg" form, but not
the "reg, reg" form. However, the Intel manual describes the following
encodings:

             0F 38 F6 WRSSD r/m32, r32
       REX.W 0F 38 F6 WRSSQ r/m64, r64
          66 0F 38 F5 WRUSSD r/m32, r32
    66 REX.W 0F 38 F5 WRUSSQ r/m64, r64
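
If the r/m forms in this table are taken at face value, a register-register
encoding would use a ModRM byte with mod=11. The following sketch builds such
byte sequences from the table. It is a hypothetical illustration: the function
name and the FORMS table are made up for the example, only registers 0-7 are
handled (so REX.R and REX.B never come into play), and it says nothing about
whether hardware accepts the register form.

```python
# (legacy prefix, needs REX.W, last opcode byte), per the table quoted above.
FORMS = {
    "wrssd":  (b"",     False, 0xF6),
    "wrssq":  (b"",     True,  0xF6),
    "wrussd": (b"\x66", False, 0xF5),
    "wrussq": (b"\x66", True,  0xF5),
}


def encode_reg_reg(mnemonic, rm, reg):
    """Build the bytes for '<mnemonic> rm, reg' with both operands in registers.

    rm and reg are register numbers 0-7 (0 = ax, 1 = cx, 2 = dx, ...).
    """
    if not (0 <= rm <= 7 and 0 <= reg <= 7):
        raise ValueError("this sketch only handles registers 0-7")
    prefix, rex_w, opcode = FORMS[mnemonic]
    rex = b"\x48" if rex_w else b""       # REX.W must directly precede 0F
    modrm = 0xC0 | (reg << 3) | rm        # mod=11 selects a register operand
    return (prefix + rex + bytes([0x0F, 0x38, opcode, modrm])).hex().upper()
```

For example, encode_reg_reg("wrssq", 1, 2) corresponds to "wrssq rcx, rdx"
under the operand order given in the table (r/m first, reg second).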



Miscellaneous variations refused by LLVM, but disassembled by disassemblers:

[6] "bswap r16" - bswap with the operand-size override prefix (66h).
[7] "movsxd r16, reg" - movsxd with the operand-size override prefix (66h).
[8] "movsxd r32, reg" - movsxd without the REX.W prefix.

You would rarely want to encode these; however, disassemblers decode them
properly and these forms are not forbidden (a CPU would execute these
instructions).
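
For concreteness, the byte sequences for [6-8] can be derived from the base
opcodes (0F C8+r for bswap, 63 /r for movsxd) combined with the 66h
operand-size prefix or REX.W. This is a minimal sketch with hypothetical helper
names, limited to registers 0-7. It only constructs bytes and makes no claim
about how any particular assembler or disassembler treats them.

```python
def _check(*regs):
    if not all(0 <= r <= 7 for r in regs):
        raise ValueError("this sketch only handles registers 0-7")


def encode_bswap(reg, operand_size_prefix=False):
    """bswap: 0F C8+r, optionally preceded by the 66h prefix (the r16 form)."""
    _check(reg)
    prefix = b"\x66" if operand_size_prefix else b""
    return (prefix + bytes([0x0F, 0xC8 + reg])).hex().upper()


def encode_movsxd(dst, src, operand_size_prefix=False, rex_w=False):
    """movsxd dst, src: 63 /r with dst in ModRM.reg and src in ModRM.rm."""
    _check(dst, src)
    prefix = b"\x66" if operand_size_prefix else b""
    rex = b"\x48" if rex_w else b""
    modrm = 0xC0 | (dst << 3) | src       # mod=11: register source operand
    return (prefix + rex + bytes([0x63, modrm])).hex().upper()
```

Here encode_bswap(1, operand_size_prefix=True) corresponds to [6] with cx,
encode_movsxd(1, 2, operand_size_prefix=True) to [7], and encode_movsxd(1, 2)
to [8]. Passing rex_w=True gives the usual 64-bit form.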



Conclusion:

I think that somebody should check [1-2] in particular, as the binary produced
by LLVM doesn't match the output I would expect. [3-4] are also interesting, as
LLVM seems to refuse the correct memory operand size. The rest is most likely
uninteresting.
