<html>
<head>
<base href="http://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Instcombine transformation causes poor vector codegen [SSE4]"
href="http://llvm.org/bugs/show_bug.cgi?id=16776">16776</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Instcombine transformation causes poor vector codegen [SSE4]
</td>
</tr>
<tr>
<th>Product</th>
<td>new-bugs
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>new bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>matt@pharr.org
</td>
</tr>
<tr>
<th>CC</th>
<td>llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=10973" name="attach_10973" title="test case">attachment 10973</a></span>
test case
The attached test case does a vector compare of a &lt;16 x i8&gt; value with zero and
then a vector select based on the comparison to negate elements that are less
than zero (i.e. it computes the absolute value). If I run it through llc as is, a
single glorious PABSB instruction is generated:
pabsb %xmm0, %xmm0
However, if I run "opt -instcombine bug2.ll | llc -o -", I get a 13-instruction
sequence instead of the PABSB:
movdqa %xmm0, %xmm1
pxor %xmm2, %xmm2
movdqa %xmm1, %xmm3
psrlw $7, %xmm3
movdqa LCPI0_0(%rip), %xmm0
pand %xmm0, %xmm3
pand %xmm0, %xmm3
pcmpeqb %xmm2, %xmm3
psubb %xmm1, %xmm2
pcmpeqd %xmm0, %xmm0
pxor %xmm3, %xmm0
pblendvb %xmm2, %xmm1
movdqa %xmm1, %xmm0
What seems to be happening is that the "vector (x &lt;s 0) ? -1 : 0 -&gt; ashr x, 31
-&gt; all ones if signed." test at the end of InstCombiner::transformSExtICmp()
is kicking in. That leads to the "&lt; 0" comparison being rewritten as an lshr
by 7 followed by an 'and' of the low bit, which no longer matches the PABSB
pattern in the X86 code generator.
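As a sanity check on the reasoning above, here is a minimal scalar sketch (plain
Python, one i8 lane at a time; not LLVM code) suggesting that the two forms of
the sign test agree for every 8-bit value, which would mean the rewrite is
semantically fine and the regression is only in pattern matching. The helper
names to_i8, abs_via_compare and abs_via_shift_and are made up for this
illustration, and the shift-and-mask form is an assumed reading of the
generated assembly, not the exact IR produced by InstCombine.

```python
def to_i8(value):
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128


def abs_via_compare(x):
    """Original form: select on a signed compare against zero."""
    return to_i8(-x) if x < 0 else x


def abs_via_shift_and(x):
    """Rewritten form (assumed): logical shift right by 7, mask the low bit."""
    unsigned_byte = x % 256              # reinterpret the lane as unsigned
    sign_bit = (unsigned_byte >> 7) & 1  # 1 only when the lane is negative
    return to_i8(-x) if sign_bit != 0 else x


# Exhaustively compare both forms over every possible i8 lane value.
LANES = range(-128, 128)
mismatches = [x for x in LANES if abs_via_compare(x) != abs_via_shift_and(x)]
print("mismatching lanes:", len(mismatches))
```

Both forms select the same lanes for negation, so the difference between the
two llc outputs comes down to which shape of sign test the backend recognizes.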
Interestingly enough, the IR code triggering this transformation is dead code;
if I run "opt -dce -instcombine bug2.ll", then I again get the single PABSB.
As such, it looks like I can work around this by always running DCE before
InstCombine, but that seems like it may just be papering over the underlying
issue.</pre>
</div>
</p>
</body>
</html>