<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Possible bugs in X86_64 assembler"
   href="https://bugs.llvm.org/show_bug.cgi?id=48991">48991</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Possible bugs in X86_64 assembler
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>MC
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>kobalicek.petr@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
<pre>I'm comparing the LLVM assembler with AsmJit and have found a few cases
that I would like to report. I'm using an LLVM build from the main branch and
the llvm-mc tool, which is invoked like this for each supported instruction:

"""
echo "--Instruction String--" | llvm-mc --arch=x86-64 -x86-asm-syntax=intel \
    -show-encoding -output-asm-variant=1
"""



Here is a list of instructions that I think LLVM encodes incorrectly:

[1] "lcall tbyte ptr [rcx+rdx]" encodes as "FF1C11" (FWORD PTR) although it
should be "48FF1C11" (TBYTE PTR).

[2] "vextractps ecx, xmm2, 1" encodes as "62F37D0817D101" (AVX512) although it
should be "C4E37917D101" (AVX).
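
To make the difference in [1] and [2] easier to see, here is a minimal Python
sketch that classifies the leading prefix byte of each encoding. The helper
name classify_prefix is hypothetical (it is not part of LLVM or AsmJit), and
the sketch assumes 64-bit mode with no legacy prefix in front of the
REX/VEX/EVEX byte.

```python
def classify_prefix(encoding_hex):
    """Classify the first byte of a 64-bit-mode x86 encoding.

    Simplification: assumes no legacy prefix (such as 66) comes first.
    """
    first = bytes.fromhex(encoding_hex)[0]
    if first == 0x62:
        return "EVEX"
    if first == 0xC4:
        return "VEX3"
    if first == 0xC5:
        return "VEX2"
    if first // 16 == 4:  # 0x40..0x4F is a REX prefix
        # REX.W is bit 3 of the REX byte.
        return "REX.W" if (first // 8) % 2 == 1 else "REX"
    return "none"


# The four encodings quoted in [1] and [2].
for enc in ("FF1C11", "48FF1C11", "62F37D0817D101", "C4E37917D101"):
    print(enc, classify_prefix(enc))
```

Under these assumptions, the two encodings in [1] differ only by the REX.W
prefix, and the two in [2] differ by using an EVEX versus a 3-byte VEX prefix.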



Here is a list of instructions that I think have the wrong memory operand size:

[3] "clrssbsy qword ptr [m64]" - LLVM wants "clrssbsy dword ptr [m32]", although
the Intel manual specifies an 8-byte operation with an m64 operand.

[4] "rstorssp qword ptr [m64]" - LLVM wants "rstorssp dword ptr [m32]", although
the Intel manual specifies an 8-byte operation with an m64 operand.



Here is a list of instructions that LLVM supports, but not in all documented
variations:

[5] "wrssd r32, r32"
    "wrssq r64, r64"
    "wrussd r32, r32"
    "wrussq r64, r64"

For some reason LLVM only supports the "wr[u]ss[d|q] [mem], reg" form, but not
the "reg, reg" form. However, the Intel manual describes the following encodings:

             0F 38 F6 WRSSD r/m32, r32
       REX.W 0F 38 F6 WRSSQ r/m64, r64
          66 0F 38 F5 WRUSSD r/m32, r32
    66 REX.W 0F 38 F5 WRUSSQ r/m64, r64
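
As an illustration of what the "reg, reg" form would look like, here is a
minimal Python sketch that assembles the bytes implied by the table above. It
assumes the r/m operand may be a register (ModRM mod=11), which is exactly the
point in question, and it only handles registers 0-7 so no REX.R/REX.B bits
are needed. The names WRSS_FORMS and encode_wrss_reg_reg are hypothetical.

```python
# (legacy prefix bytes, needs REX.W, opcode bytes), copied from the table above.
WRSS_FORMS = {
    "wrssd": ([], False, [0x0F, 0x38, 0xF6]),
    "wrssq": ([], True, [0x0F, 0x38, 0xF6]),
    "wrussd": ([0x66], False, [0x0F, 0x38, 0xF5]),
    "wrussq": ([0x66], True, [0x0F, 0x38, 0xF5]),
}


def encode_wrss_reg_reg(mnemonic, rm, reg):
    """Build the bytes for 'mnemonic rm, reg' with a register r/m operand.

    Assumption: mod=11 is accepted for these opcodes; only registers 0-7.
    """
    if rm not in range(8) or reg not in range(8):
        raise ValueError("only registers 0-7 are handled in this sketch")
    legacy, rex_w, opcode = WRSS_FORMS[mnemonic]
    out = list(legacy)
    if rex_w:
        out.append(0x48)  # REX prefix with only the W bit set
    out += opcode
    out.append(0xC0 + reg * 8 + rm)  # ModRM: mod=11, reg, rm
    return bytes(out).hex().upper()


# Register numbers: rcx/ecx = 1, rdx/edx = 2.
for name in WRSS_FORMS:
    print(name, encode_wrss_reg_reg(name, 1, 2))
```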



Miscellaneous variations rejected by LLVM, but decoded by disassemblers:

[6] "bswap r16" - bswap with 16-bit override prefix.
[7] "movsxd r16, reg" - movsxd with 16-bit override prefix.
[8] "movsxd r32, reg" - movsxd without REX.W prefix

You would rarely want to encode these; however, disassemblers decode them
properly and these forms are not forbidden (the CPU would execute these
instructions).
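
For reference, here is a minimal Python sketch of the byte sequences I mean
in [6-8], using the 66 operand-size prefix, the BSWAP opcode 0F C8+r, and
the MOVSXD opcode 63 /r. The function names are hypothetical, and only
registers 0-7 with register-to-register operands are handled.

```python
def modrm_reg_reg(reg, rm):
    """ModRM byte with mod=11 (both operands are registers 0-7)."""
    if reg not in range(8) or rm not in range(8):
        raise ValueError("only registers 0-7 are handled in this sketch")
    return 0xC0 + reg * 8 + rm


def bswap_r16(reg):
    """[6] bswap with the 16-bit operand-size override prefix."""
    if reg not in range(8):
        raise ValueError("only registers 0-7 are handled in this sketch")
    return bytes([0x66, 0x0F, 0xC8 + reg])


def movsxd_r16(dst, src):
    """[7] movsxd with the 16-bit operand-size override prefix."""
    return bytes([0x66, 0x63, modrm_reg_reg(dst, src)])


def movsxd_r32(dst, src):
    """[8] movsxd without the REX.W prefix."""
    return bytes([0x63, modrm_reg_reg(dst, src)])


# Register numbers: ax/eax = 0, cx/ecx = 1.
print("bswap ax        ", bswap_r16(0).hex().upper())
print("movsxd ax, cx   ", movsxd_r16(0, 1).hex().upper())
print("movsxd eax, ecx ", movsxd_r32(0, 1).hex().upper())
```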



Conclusion:

I think somebody should check [1-2] in particular, as the binary produced by
LLVM doesn't match the output I would expect. [3-4] are also interesting, as
LLVM seems to refuse the correct memory operand size. The rest is most
likely uninteresting.</pre>
        </div>
      </p>


    </body>
</html>