[llvm-bugs] [Bug 39921] [X86] AVX2 should use an extract_subvector and phadd for the first step of a pairwise v8i32 addition reduction
via llvm-bugs
llvm-bugs at lists.llvm.org
Sun Jun 2 09:02:30 PDT 2019
https://bugs.llvm.org/show_bug.cgi?id=39921
Simon Pilgrim <llvm-dev at redking.me.uk> changed:
              What|Removed                     |Added
----------------------------------------------------------------------------
Fixed By Commit(s)|r359491                     |r359491,r362327
        Resolution|---                         |FIXED
            Status|NEW                         |RESOLVED
--- Comment #2 from Simon Pilgrim <llvm-dev at redking.me.uk> ---
(In reply to Simon Pilgrim from comment #1)
> But Intel targets can get stuck as the 'fast shuffle' attribute gets in the
> way:
>
> pairwise_reduction8i32:                 # @pairwise_reduction8i32
>         vmovdqa .LCPI0_0(%rip), %ymm1   # ymm1 = [0,2,4,6,4,6,6,7]
>         vpermd  %ymm0, %ymm1, %ymm1
>         vmovdqa .LCPI0_1(%rip), %ymm2   # ymm2 = [1,3,5,7,5,7,6,7]
>         vpermd  %ymm0, %ymm2, %ymm0
>         vpaddd  %xmm0, %xmm1, %xmm0
>         vpshufd $232, %xmm0, %xmm1      # xmm1 = xmm0[0,2,2,3]
>         vpshufd $237, %xmm0, %xmm0      # xmm0 = xmm0[1,3,2,3]
>         vpaddd  %xmm0, %xmm1, %xmm0
>         vpshufd $229, %xmm0, %xmm1      # xmm1 = xmm0[1,1,2,3]
>         vpaddd  %xmm1, %xmm0, %xmm0
>         vmovd   %xmm0, %eax
>         vzeroupper
>         retq
Resolving - the fast-variable-shuffle issue was fixed in rL362327, so we now get:
pairwise_reduction8i32:                 # @pairwise_reduction8i32
        vextracti128 $1, %ymm0, %xmm1
        vphaddd %xmm1, %xmm0, %xmm0
        vphaddd %xmm0, %xmm0, %xmm0
        vpshufd $229, %xmm0, %xmm1      # xmm1 = xmm0[1,1,2,3]
        vpaddd  %xmm1, %xmm0, %xmm0
        vmovd   %xmm0, %eax
        vzeroupper
        retq
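
For anyone checking the lane arithmetic, the sequence above can be modelled one instruction at a time. The sketch below is a scalar Python model, not LLVM code and not taken from the patch. The helper names (extract_high128, phaddd, pshufd, paddd) are hypothetical stand-ins for the instructions they mimic, and the model assumes 32-bit lanes that wrap modulo 2**32, returning the value of %eax as an unsigned integer.

```python
MASK32 = 0xFFFFFFFF  # assumption: i32 lanes, additions wrap modulo 2**32


def extract_high128(ymm):
    """Model of `vextracti128 $1`: the upper four i32 lanes of a ymm value."""
    return list(ymm[4:8])


def phaddd(a, b):
    """Model of `vphaddd`: adjacent-pair sums of a, then adjacent-pair sums of b."""
    return [
        (a[0] + a[1]) & MASK32,
        (a[2] + a[3]) & MASK32,
        (b[0] + b[1]) & MASK32,
        (b[2] + b[3]) & MASK32,
    ]


def pshufd(a, imm8):
    """Model of `vpshufd`: each 2-bit field of imm8 selects one source lane."""
    return [a[(imm8 >> (2 * i)) & 3] for i in range(4)]


def paddd(a, b):
    """Model of `vpaddd`: lane-wise 32-bit addition."""
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def pairwise_reduction8i32(ymm0):
    """Follow the post-fix sequence step by step (hypothetical model)."""
    ymm0 = [x & MASK32 for x in ymm0]
    xmm1 = extract_high128(ymm0)   # vextracti128 $1, %ymm0, %xmm1
    xmm0 = phaddd(ymm0[:4], xmm1)  # vphaddd %xmm1, %xmm0, %xmm0
    xmm0 = phaddd(xmm0, xmm0)      # vphaddd %xmm0, %xmm0, %xmm0
    xmm1 = pshufd(xmm0, 229)       # vpshufd $229: xmm1 = xmm0[1,1,2,3]
    xmm0 = paddd(xmm1, xmm0)       # vpaddd %xmm1, %xmm0, %xmm0
    return xmm0[0]                 # vmovd %xmm0, %eax
```

In this model the first vphaddd already performs the first pairwise step across all eight input lanes, which is the point of the bug title; lane 0 of the final vector holds the full sum, and the remaining lanes are don't-care values.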