[PATCH] D59997: [x86] allow movmsk with 2-element reductions

Andrea Di Biagio via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 29 10:47:45 PDT 2019


andreadb added a comment.

llvm-mca numbers are quite accurate for btver2 (see below for the perf results):

  vcmpltpd %xmm0, %xmm1, %xmm2
  vmovmskpd %xmm0, %ecx
  xorl %eax, %eax
  cmpl $3, %ecx
  sete %al
  negq %rax

-->

  cycles:           79314982                                        ( +- 0.36% )
  instructions:     154000245        #   1.94 insn per cycle        ( +- 0.00% )
  micro-opcodes:    154030776        #   1.94 uOps per cycle        ( +- 0.00% )

While this sequence:

  vcmpltpd %xmm0, %xmm1, %xmm2
  vpermilpd $1, %xmm2, %xmm1
  vandpd %xmm1, %xmm2, %xmm2
  vmovq %xmm2, %rax

Gives us this:

  cycles:           114486380                                       ( +- 1.56% )
  instructions:     102800331        #   0.90 insn per cycle        ( +- 0.00% )
  micro-opcodes:    102844837        #   0.90 uOps per cycle        ( +- 0.00% )
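
To make the comparison explicit, here is a minimal sketch that recomputes the per-cycle rates and the cycle ratio from the counters quoted above. The dictionary and function names are hypothetical, chosen only for this illustration. It assumes both runs executed a comparable number of loop iterations, which the instruction counts suggest (about 25.7M iterations of 6 and 4 instructions respectively) but which is not stated in the comment. Because the two sequences differ in length, IPC alone is not a fair metric; the cycle count is the number to compare.

```python
# Counter values copied verbatim from the two perf runs quoted above.
# The names MOVMSK_RUN / SHUFFLE_RUN are illustrative, not from the review.
MOVMSK_RUN = {"cycles": 79_314_982, "instructions": 154_000_245}
SHUFFLE_RUN = {"cycles": 114_486_380, "instructions": 102_800_331}


def ipc(run):
    """Instructions retired per cycle for one perf run."""
    return run["instructions"] / run["cycles"]


def cycle_ratio(slower, faster):
    """How many times more cycles the slower run needed."""
    return slower["cycles"] / faster["cycles"]


if __name__ == "__main__":
    print(f"movmsk sequence IPC:  {ipc(MOVMSK_RUN):.2f}")
    print(f"shuffle sequence IPC: {ipc(SHUFFLE_RUN):.2f}")
    ratio = cycle_ratio(SHUFFLE_RUN, MOVMSK_RUN)
    print(f"shuffle/movmsk cycle ratio: {ratio:.2f}")
```

Under that assumption, the movmsk sequence retires more instructions but finishes in fewer cycles on btver2, which is the point of the measurement.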


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59997/new/

https://reviews.llvm.org/D59997




