<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><span class="vcard"><a class="email" href="mailto:llvm-dev@redking.me.uk" title="Simon Pilgrim <llvm-dev@redking.me.uk>"> <span class="fn">Simon Pilgrim</span></a>
</span> changed
<a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - [X86] AVX2 should use an extract_subvector and phadd for the first step of a pairwise v8i32 addition reduction"
href="https://bugs.llvm.org/show_bug.cgi?id=39921">bug 39921</a>
<br>
<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>What</th>
<th>Removed</th>
<th>Added</th>
</tr>
<tr>
<td style="text-align:right;">Fixed By Commit(s)</td>
<td>r359491
</td>
<td>r359491,r362327
</td>
</tr>
<tr>
<td style="text-align:right;">Resolution</td>
<td>---
</td>
<td>FIXED
</td>
</tr>
<tr>
<td style="text-align:right;">Status</td>
<td>NEW
</td>
<td>RESOLVED
</td>
</tr></table>
<p>
<div>
<b><a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - [X86] AVX2 should use an extract_subvector and phadd for the first step of a pairwise v8i32 addition reduction"
href="https://bugs.llvm.org/show_bug.cgi?id=39921#c2">Comment # 2</a>
on <a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - [X86] AVX2 should use an extract_subvector and phadd for the first step of a pairwise v8i32 addition reduction"
href="https://bugs.llvm.org/show_bug.cgi?id=39921">bug 39921</a>
from <span class="vcard"><a class="email" href="mailto:llvm-dev@redking.me.uk" title="Simon Pilgrim <llvm-dev@redking.me.uk>"> <span class="fn">Simon Pilgrim</span></a>
</span></b>
<pre>(In reply to Simon Pilgrim from <a href="show_bug.cgi?id=39921#c1">comment #1</a>)
<span class="quote">> But Intel targets can get stuck as the 'fast shuffle' attribute gets in the
> way:
>
> pairwise_reduction8i32: # @pairwise_reduction8i32
>         vmovdqa .LCPI0_0(%rip), %ymm1 # ymm1 = [0,2,4,6,4,6,6,7]
>         vpermd %ymm0, %ymm1, %ymm1
>         vmovdqa .LCPI0_1(%rip), %ymm2 # ymm2 = [1,3,5,7,5,7,6,7]
>         vpermd %ymm0, %ymm2, %ymm0
>         vpaddd %xmm0, %xmm1, %xmm0
>         vpshufd $232, %xmm0, %xmm1 # xmm1 = xmm0[0,2,2,3]
>         vpshufd $237, %xmm0, %xmm0 # xmm0 = xmm0[1,3,2,3]
>         vpaddd %xmm0, %xmm1, %xmm0
>         vpshufd $229, %xmm0, %xmm1 # xmm1 = xmm0[1,1,2,3]
>         vpaddd %xmm1, %xmm0, %xmm0
>         vmovd %xmm0, %eax
>         vzeroupper
>         retq</span>
Resolving - the fast-variable-shuffle issue was fixed in rL362327, which now gives:
pairwise_reduction8i32: # @pairwise_reduction8i32
        vextracti128 $1, %ymm0, %xmm1
        vphaddd %xmm1, %xmm0, %xmm0
        vphaddd %xmm0, %xmm0, %xmm0
        vpshufd $229, %xmm0, %xmm1 # xmm1 = xmm0[1,1,2,3]
        vpaddd %xmm1, %xmm0, %xmm0
        vmovd %xmm0, %eax
        vzeroupper
        retq</pre>
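<pre>For anyone checking the lane arithmetic, here is a minimal scalar sketch in
Python of the fixed sequence above. It assumes the documented semantics of
vextracti128, vphaddd, vpshufd and vpaddd; the helper functions are
hypothetical models named after the instructions, not LLVM code.

```python
# Scalar model of the extract + phadd sequence; all helpers are illustrative.
LANE_MOD = 2**32  # 32-bit lanes wrap on overflow


def vphaddd(src2, src1):
    """Model of AT&T `vphaddd src2, src1, dst` on xmm registers.

    The low two result lanes are the adjacent-pair sums of src1,
    the high two are the adjacent-pair sums of src2.
    """
    sums = [src1[0] + src1[1], src1[2] + src1[3],
            src2[0] + src2[1], src2[2] + src2[3]]
    return [s % LANE_MOD for s in sums]


def vpshufd(imm, src):
    """Each 2-bit field of imm selects the source lane for one result lane."""
    return [src[(imm // 4**i) % 4] for i in range(4)]


def vpaddd(a, b):
    """Lane-wise 32-bit addition."""
    return [(x + y) % LANE_MOD for x, y in zip(a, b)]


def pairwise_reduction8i32(ymm0):
    xmm0 = ymm0[:4]              # low 128 bits of ymm0 alias xmm0
    xmm1 = ymm0[4:]              # vextracti128 $1, %ymm0, %xmm1
    xmm0 = vphaddd(xmm1, xmm0)   # vphaddd %xmm1, %xmm0, %xmm0
    xmm0 = vphaddd(xmm0, xmm0)   # vphaddd %xmm0, %xmm0, %xmm0
    xmm1 = vpshufd(229, xmm0)    # vpshufd $229, %xmm0, %xmm1
    xmm0 = vpaddd(xmm1, xmm0)    # vpaddd %xmm1, %xmm0, %xmm0
    return xmm0[0]               # vmovd %xmm0, %eax
```

After the first vphaddd the four lanes hold the sums of adjacent input pairs,
so the second vphaddd and the final vpshufd/vpaddd leave the full 8-lane sum
in lane 0.</pre>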
</div>
</p>
</body>
</html>