<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - PMULLD should be avoided if possible on Silvermont"
href="https://llvm.org/bugs/show_bug.cgi?id=31202">31202</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>PMULLD should be avoided if possible on Silvermont
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows NT
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>zvi.rackover@intel.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>For the following case:
define <4 x i32> @foo(<4 x i8> %A) {
%z = zext <4 x i8> %A to <4 x i32>
%m = mul nuw nsw <4 x i32> %z, <i32 18778, i32 18778, i32 18778, i32 18778>
ret <4 x i32> %m
}
The following code is generated for Silvermont:
pand .LCPI1_0, %xmm0
pmulld .LCPI1_1, %xmm0
retl
On Silvermont:
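PMULLD has a throughput of 1/11 instructions per cycle (one every 11 cycles).
PMULHUW/PMULHW/PMULLW have a throughput of 1/2 instructions per cycle (one every 2 cycles).
Note that the multiplicands fit in 16 bits: the zero-extended i8 values are at
most 255, and the constant is 18778.
We would achieve higher throughput with the following sequence: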
pshufb
pmullw
pmulhw
punpcklwd
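A minimal scalar sketch of why this sequence yields the same result as PMULLD
for this case. It is a Python model of the lane arithmetic, not compiler output;
the helper names (to_signed16, mul_via_16bit) are hypothetical, and the pshufb
step is only modeled as placing each zero-extended byte in a 16-bit lane.
Avoiding PMULLD is valid here because both operands are non-negative and fit
in a signed 16-bit lane.

```python
# Scalar model of the proposed 16-bit multiply sequence (illustrative only).
K = 18778  # splat constant from the IR above; fits in a signed 16-bit lane

def to_signed16(v):
    # Reinterpret an unsigned 16-bit lane (0..0xFFFF) as a signed value.
    return v - 0x10000 * (v // 0x8000)

def pmullw(a, b):
    # Low 16 bits of each 16x16 product.
    return [(x * y) % 0x10000 for x, y in zip(a, b)]

def pmulhw(a, b):
    # High 16 bits of each signed 16x16 product
    # (floor division by 2**16 acts as an arithmetic right shift).
    return [(to_signed16(x) * to_signed16(y)) // 0x10000 % 0x10000
            for x, y in zip(a, b)]

def punpcklwd(lo, hi):
    # Interleave the low four words of lo and hi into four 32-bit lanes.
    return [lo[i] + hi[i] * 0x10000 for i in range(4)]

def mul_via_16bit(values):
    # Stand-in for the pshufb step: each i8 value, zero-extended, in a word lane.
    a = [v % 0x100 for v in values] + [0] * 4
    k = [K] * 8
    return punpcklwd(pmullw(a, k), pmulhw(a, k))
```

For example, mul_via_16bit([0, 1, 128, 255]) returns [0, 18778, 2403584, 4788390],
which matches the 32-bit products computed directly.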
This issue was root-caused by Farhana Aleen during analysis of internal
workloads that would regress if interleaving were enabled for Silvermont
in X86TTI (which is why commit 284779 did not enable interleaving for some
subtargets). It turns out that, with interleaving, the vectorized IR prior to
codegen is decent for the chosen vectorization width. The issue reported here
is one of the major reasons for the slowdown, but fixing it alone only reduces
the regression; it does not eliminate it.</pre>
</div>
</p>
</body>
</html>