[PATCH] D79718: [x86][CGP] enable target hook to sink funnel shift intrinsic's splatted shift amount

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 12 11:16:59 PDT 2020


spatel marked an inline comment as done.
spatel added inline comments.


================
Comment at: llvm/test/CodeGen/X86/vector-fshl-128.ll:2195
 ; AVX1-NEXT:    movq $-1024, %rax # imm = 0xFC00
-; AVX1-NEXT:    vpand {{.*}}(%rip), %xmm0, %xmm0
-; AVX1-NEXT:    vpslld $23, %xmm0, %xmm0
-; AVX1-NEXT:    vpaddd {{.*}}(%rip), %xmm0, %xmm0
-; AVX1-NEXT:    vcvttps2dq %xmm0, %xmm0
-; AVX1-NEXT:    vpshufd {{.*#+}} xmm1 = xmm0[1,1,3,3]
+; AVX1-NEXT:    vpshufd {{.*#+}} xmm0 = xmm0[0,0,0,0]
+; AVX1-NEXT:    vpand {{.*}}(%rip), %xmm0, %xmm1
----------------
craig.topper wrote:
> Why are we splatting the scalar here when only element 0 is used?
This is another limitation caused by the block-level visibility - SDAG doesn't know that the splat is from a scalar because we are only sinking the shuffle instruction, not the insertelement:
      t12: v4i32,ch = CopyFromReg t0, Register:v4i32 %0
    t14: v4i32 = vector_shuffle<0,0,0,0> t12, undef:v4i32

The splat doesn't get hoisted back out of the loop until later in MachineLICM, and there's apparently no really late analysis for demanded elements.

We could try to sink insertelement to shuffles. That should probably be another patch though.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79718/new/

https://reviews.llvm.org/D79718





More information about the llvm-commits mailing list