[PATCH] D59997: [x86] allow movmsk with 2-element reductions
Andrea Di Biagio via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 29 10:47:45 PDT 2019
andreadb added a comment.
llvm-mca numbers are quite accurate for btver2 (see below for the perf results):
vcmpltpd %xmm0, %xmm1, %xmm2
vmovmskpd %xmm0, %ecx
xorl %eax, %eax
cmpl $3, %ecx
sete %al
negq %rax
Gives us this:
cycles: 79314982 ( +- 0.36% )
instructions: 154000245 # 1.94 insn per cycle ( +- 0.00% )
micro-opcodes: 154030776 # 1.94 uOps per cycle ( +- 0.00% )
While this sequence:
vcmpltpd %xmm0, %xmm1, %xmm2
vpermilpd $1, %xmm2, %xmm1
vandpd %xmm1, %xmm2, %xmm2
vmovq %xmm2, %rax
Gives us this:
cycles: 114486380 ( +- 1.56% )
instructions: 102800331 # 0.90 insn per cycle ( +- 0.00% )
micro-opcodes: 102844837 # 0.90 uOps per cycle ( +- 0.00% )
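To make the comparison explicit, here is a small sketch that recomputes the insn-per-cycle figures and the relative cycle count from the counters quoted above. The labels "movmsk" and "shuffle_and" are just names picked for this illustration, not anything from the patch; only the counter values come from the perf output.

```python
# Counter values copied from the perf output quoted above.
# The keys "movmsk" and "shuffle_and" are illustrative labels only.
runs = {
    "movmsk": {"cycles": 79314982, "instructions": 154000245},
    "shuffle_and": {"cycles": 114486380, "instructions": 102800331},
}


def ipc(run):
    """Instructions retired per cycle for one measured run."""
    return run["instructions"] / run["cycles"]


def cycle_ratio(slow, fast):
    """How many times more cycles `slow` took compared to `fast`."""
    return slow["cycles"] / fast["cycles"]


for name, run in runs.items():
    print(f"{name}: {ipc(run):.2f} insn per cycle")

ratio = cycle_ratio(runs["shuffle_and"], runs["movmsk"])
print(f"shuffle_and / movmsk cycles: {ratio:.2f}x")
```

The movmsk sequence retires more instructions but finishes in fewer cycles, which is what the higher insn-per-cycle figure reflects.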
https://reviews.llvm.org/D59997