[PATCH] D79718: [x86][CGP] enable target hook to sink funnel shift intrinsic's splatted shift amount

Tue May 12 11:16:59 PDT 2020

spatel marked an inline comment as done.
spatel added inline comments.

================
Comment at: llvm/test/CodeGen/X86/vector-fshl-128.ll:2195
 ; AVX1-NEXT:    movq $-1024, %rax # imm = 0xFC00
-; AVX1-NEXT:    vpand {{.*}}(%rip), %xmm0, %xmm0
-; AVX1-NEXT:    vpslld $23, %xmm0, %xmm0
-; AVX1-NEXT:    vpaddd {{.*}}(%rip), %xmm0, %xmm0
-; AVX1-NEXT:    vcvttps2dq %xmm0, %xmm0
-; AVX1-NEXT:    vpshufd {{.*#+}} xmm1 = xmm0[1,1,3,3]
+; AVX1-NEXT:    vpshufd {{.*#+}} xmm0 = xmm0[0,0,0,0]
+; AVX1-NEXT:    vpand {{.*}}(%rip), %xmm0, %xmm1
----------------
craig.topper wrote:
> Why are we splatting the scalar here when only element 0 is used?
This is another limitation caused by SelectionDAG's block-level visibility: SDAG doesn't know that the splat comes from a scalar, because we are sinking only the shuffle instruction and not the insertelement:
      t12: v4i32,ch = CopyFromReg t0, Register:v4i32 %0
    t14: v4i32 = vector_shuffle<0,0,0,0> t12, undef:v4i32

The splat doesn't get hoisted back out of the loop until MachineLICM runs later, and apparently no demanded-elements analysis runs that late.
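
For illustration only, here is a minimal IR sketch of the shape being described, after CodeGenPrepare has sunk the splat shuffle into the loop. The function and value names are hypothetical and this is not the test at vector-fshl-128.ll:2195. The insertelement stays in the entry block, so the DAG built for the loop block sees its result only as a CopyFromReg of a v4i32 and cannot tell that only element 0 matters:

```llvm
; Hypothetical reduced example; names are assumptions, not taken from the patch.
define <4 x i32> @fshl_splat_in_loop(<4 x i32> %x, i32 %amt, i32 %n) {
entry:
  ; Scalar-to-vector insert stays outside the loop.
  %ins = insertelement <4 x i32> undef, i32 %amt, i32 0
  br label %loop

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  %acc = phi <4 x i32> [ %x, %entry ], [ %r, %loop ]
  ; Only the shuffle is sunk next to its user. In this block's DAG, %ins is
  ; an opaque CopyFromReg, so the splat looks like a shuffle of an unknown vector.
  %splat = shufflevector <4 x i32> %ins, <4 x i32> undef, <4 x i32> zeroinitializer
  %r = call <4 x i32> @llvm.fshl.v4i32(<4 x i32> %acc, <4 x i32> %acc, <4 x i32> %splat)
  %i.next = add nuw i32 %i, 1
  %done = icmp eq i32 %i.next, %n
  br i1 %done, label %exit, label %loop

exit:
  ret <4 x i32> %r
}

declare <4 x i32> @llvm.fshl.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)
```

If the insertelement were also in the loop block, the DAG would contain the scalar source directly, and the backend could in principle use element 0 without materializing the full splat.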

We could try sinking the insertelement along with the shuffle. That should probably be a separate patch, though.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79718/new/

https://reviews.llvm.org/D79718