<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Possible bugs in X86_64 assembler"
href="https://bugs.llvm.org/show_bug.cgi?id=48991">48991</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Possible bugs in X86_64 assembler
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>MC
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>kobalicek.petr@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>I'm comparing the LLVM assembler with AsmJit and have found a few cases
that I would like to report. I'm using an LLVM build from the main branch and
the llvm-mc tool, which is invoked like this for each supported instruction:
"""
echo "--Instruction String--" | llvm-mc --arch=x86-64 -x86-asm-syntax=intel -show-encoding -output-asm-variant=1
"""
Here is a list of instructions that I think LLVM encodes wrongly:
[1] "lcall tbyte ptr [rcx+rdx]" encodes as "FF1C11" (FWORD PTR) although it
should be "48FF1C11" (TBYTE PTR).
[2] "vextractps ecx, xmm2, 1" encodes as "62F37D0817D101" (AVX512) although it
should be "C4E37917D101" (AVX).
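
The expected encodings in [1] and [2] can be reproduced byte by byte. A minimal
Python sketch, assuming the standard x86 register numbers (rcx/ecx = 1,
rdx = 2, xmm2 = 2); the helper names are hypothetical and not part of LLVM or
AsmJit:

```python
def modrm(mod, reg, rm):
    """Pack the ModRM byte: mod in bits 7-6, reg in bits 5-3, rm in bits 2-0."""
    assert mod in range(4) and reg in range(8) and rm in range(8)
    return mod * 64 + reg * 8 + rm

def sib(scale_bits, index, base):
    """Pack the SIB byte: scale in bits 7-6, index in bits 5-3, base in bits 2-0."""
    assert scale_bits in range(4) and index in range(8) and base in range(8)
    return scale_bits * 64 + index * 8 + base

RCX, RDX, XMM2 = 1, 2, 2   # standard x86 register numbers
REX_W = 0x48

# [1] far call, opcode FF /3, memory operand [rcx+rdx] (rm=4 selects a SIB byte).
# The REX.W prefix is what the report expects for the TBYTE PTR form.
lcall = bytes([REX_W, 0xFF, modrm(0, 3, 4), sib(0, RDX, RCX)])

# [2] VEX-encoded vextractps ecx, xmm2, 1: three-byte VEX prefix, opcode 17 /r ib.
vex_byte1 = 0xE0 + 0x03    # inverted R, X, B all set (no extension), map 0F3A
vex_byte2 = 0x78 + 0x01    # W=0, inverted vvvv=1111 (unused), L=0, pp=01 (66)
vextractps = bytes([0xC4, vex_byte1, vex_byte2, 0x17, modrm(3, XMM2, RCX), 0x01])

print(lcall.hex().upper())       # 48FF1C11
print(vextractps.hex().upper())  # C4E37917D101
```

The output llvm-mc produces for [1] is the same byte sequence without the
leading 48 (REX.W) prefix.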
Here is a list of instructions that I think have the wrong memory operand size:
[3] "clrssbsy qword ptr [m64]" - LLVM wants "clrssbsy dword ptr [m32]" although
the Intel manual specifies an 8-byte operation with an m64 operand.
[4] "rstorssp qword ptr [m64]" - LLVM wants "rstorssp dword ptr [m32]" although
the Intel manual specifies an 8-byte operation with an m64 operand.
Here is a list of instructions that LLVM supports, but not in all documented
variations:
[5] "wrssd r32, r32"
"wrssq r64, r64"
"wrussd r32, r32"
"wrussq r64, r64"
For some reason LLVM only supports the "wr[u]ss[d|q] [mem], reg" form, but not
the "reg, reg" form. However, the Intel manual describes the following encodings:
0F 38 F6 WRSSD r/m32, r32
REX.W 0F 38 F6 WRSSQ r/m64, r64
66 0F 38 F5 WRUSSD r/m32, r32
66 REX.W 0F 38 F5 WRUSSQ r/m64, r64
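
For illustration, the "reg, reg" forms in [5] would encode as follows if the
r/m operand in the listing above is taken at face value and a register is
selected with mod=3. A minimal Python sketch; the table and function names are
hypothetical, only registers 0..7 are handled (no REX.R or REX.B), and it has
not been checked against hardware:

```python
def modrm(mod, reg, rm):
    """Pack the ModRM byte: mod in bits 7-6, reg in bits 5-3, rm in bits 2-0."""
    return mod * 64 + reg * 8 + rm

# (needs 66 prefix, opcode bytes, needs REX.W), copied from the listing above.
FORMS = {
    "wrssd":  (False, [0x0F, 0x38, 0xF6], False),
    "wrssq":  (False, [0x0F, 0x38, 0xF6], True),
    "wrussd": (True,  [0x0F, 0x38, 0xF5], False),
    "wrussq": (True,  [0x0F, 0x38, 0xF5], True),
}

def encode_reg_reg(mnemonic, dst, src):
    """Encode 'mnemonic dst, src' for register numbers 0..7."""
    assert dst in range(8) and src in range(8)
    has_66, opcode, rex_w = FORMS[mnemonic]
    prefixes = ([0x66] if has_66 else []) + ([0x48] if rex_w else [])
    # dst is the r/m operand (mod=3 selects a register); src goes in the reg field.
    return bytes(prefixes + opcode + [modrm(3, src, dst)])

ECX, EDX = 1, 2   # also the numbers of rcx and rdx
for name in FORMS:
    print(name, encode_reg_reg(name, ECX, EDX).hex().upper())
```

With ecx/rcx as the destination and edx/rdx as the source, the ModRM byte is D1
in all four cases; only the prefixes and the final opcode byte differ.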
Miscellaneous variations refused by LLVM, but disassembled by disassemblers:
[6] "bswap r16" - bswap with 16-bit override prefix.
[7] "movsxd r16, reg" - movsxd with 16-bit override prefix.
[8] "movsxd r32, reg" - movsxd without REX.W prefix
You would rarely want to encode these. However, disassemblers decode them
properly and these forms are not forbidden (the CPU would execute these
instructions).
Conclusion:
I think somebody should check [1-2] in particular, as the binary produced by
LLVM doesn't match the output I would expect. [3-4] are also interesting, as
LLVM seems to reject the correct memory operand size. The rest is most likely
minor.</pre>
</div>
</p>
</body>
</html>