[PATCH] D30137: [InstCombine] shrink truncated insertelement with constant operand

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 7 12:44:54 PST 2017


efriedma added a comment.

We can ignore deficiencies which are obviously bugs; I was thinking of something more like this:

  define <16 x i8> @trunc_inselt1(<16 x i16> %a, i16 %x) {
    %vec = insertelement <16 x i16> %a, i16 %x, i32 1
    %trunc = trunc <16 x i16> %vec to <16 x i8>
    ret <16 x i8> %trunc
  }
  
  define <16 x i8> @trunc_inselt2(<16 x i16> %a, i8 %x) {
    %trunc = trunc <16 x i16> %a to <16 x i8>
    %vec = insertelement <16 x i8> %trunc, i8 %x, i32 1
    ret <16 x i8> %vec
  }
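The two functions produce the same result whenever the i8 argument of trunc_inselt2 is the low byte of the i16 argument of trunc_inselt1. A minimal Python sketch of that equivalence follows. It is an illustrative model, not part of the patch: the list-of-integers representation of the vectors is an assumption, and the function names simply mirror the IR above.

```python
LANES = 16  # both functions operate on 16-lane vectors


def trunc_inselt1(a, x):
    """Insert the i16 scalar into lane 1, then truncate every lane to i8."""
    vec = list(a)
    vec[1] = x & 0xFFFF                    # insertelement <16 x i16>
    return [lane & 0xFF for lane in vec]   # trunc to <16 x i8>


def trunc_inselt2(a, x):
    """Truncate every lane to i8, then insert the i8 scalar into lane 1."""
    trunc = [lane & 0xFF for lane in a]    # trunc to <16 x i8>
    trunc[1] = x & 0xFF                    # insertelement <16 x i8>
    return trunc


# Hypothetical sample inputs, chosen only for illustration.
a = [0x1234 + i for i in range(LANES)]
x = 0xABCD

# Narrowing the scalar first gives the same vector as truncating afterwards.
assert trunc_inselt1(a, x) == trunc_inselt2(a, x & 0xFF)
```

The model only shows that the two forms are semantically interchangeable; it says nothing about which one a given backend lowers better.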

For trunc_inselt2, the x86 backend produces:

  trunc_inselt2:                          # @trunc_inselt2
          .cfi_startproc
  # BB#0:
          movdqa  .LCPI1_0(%rip), %xmm2   # xmm2 = [255,255,255,255,255,255,255,255]
          pand    %xmm2, %xmm1
          pand    %xmm2, %xmm0
          packuswb        %xmm1, %xmm0
          movdqa  .LCPI1_1(%rip), %xmm1   # xmm1 = [255,0,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
          pand    %xmm1, %xmm0
          movd    %edi, %xmm2
          psllw   $8, %xmm2
          pandn   %xmm2, %xmm1
          por     %xmm1, %xmm0
          retq

The pand+movd+psllw+pandn+por sequence that implements the i8 insert is clearly a lot worse than a single pinsrw.


https://reviews.llvm.org/D30137




