[PATCH] D79718: [x86][CGP] enable target hook to sink funnel shift intrinsic's splatted shift amount

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 12 11:49:26 PDT 2020


craig.topper added inline comments.


================
Comment at: llvm/test/CodeGen/X86/vector-fshl-128.ll:2195
 ; AVX1-NEXT:    movq $-1024, %rax # imm = 0xFC00
-; AVX1-NEXT:    vpand {{.*}}(%rip), %xmm0, %xmm0
-; AVX1-NEXT:    vpslld $23, %xmm0, %xmm0
-; AVX1-NEXT:    vpaddd {{.*}}(%rip), %xmm0, %xmm0
-; AVX1-NEXT:    vcvttps2dq %xmm0, %xmm0
-; AVX1-NEXT:    vpshufd {{.*#+}} xmm1 = xmm0[1,1,3,3]
+; AVX1-NEXT:    vpshufd {{.*#+}} xmm0 = xmm0[0,0,0,0]
+; AVX1-NEXT:    vpand {{.*}}(%rip), %xmm0, %xmm1
----------------
spatel wrote:
> craig.topper wrote:
> > Why are we splatting the scalar here when only element 0 is used?
> This is another limitation caused by the block-level visibility - SDAG doesn't know that the splat is from a scalar because we are only sinking the shuffle instruction, not the insertelement:
>       t12: v4i32,ch = CopyFromReg t0, Register:v4i32 %0
>     t14: v4i32 = vector_shuffle<0,0,0,0> t12, undef:v4i32
> 
> The splat doesn't get hoisted back out of the loop until later in MachineLICM, and there's apparently no really late analysis for demanded elements.
> 
> We could try to sink insertelement to shuffles. That should probably be another patch though.
I'm still confused. Shouldn't the demanded-elements analysis in SelectionDAG have determined that the splat shuffle is unnecessary, regardless of whether its source is an insertelement?
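
For illustration only, here is a toy Python model of the reasoning behind that question. It is not LLVM code and does not reproduce the SelectionDAG implementation; the function names are hypothetical. It assumes a single-source shuffle described by its mask, and a set of result lanes that later users actually read.

```python
def demanded_source_lanes(mask, demanded_out):
    """Map the demanded result lanes of a single-source shuffle back to source lanes."""
    return {mask[lane] for lane in demanded_out}


def shuffle_is_redundant(mask, demanded_out):
    """True if every demanded result lane already reads the same-numbered source lane.

    In that case the shuffle's source could stand in for its result,
    no matter how the source vector itself was produced.
    """
    return all(mask[lane] == lane for lane in demanded_out)


# The splat from the DAG dump: vector_shuffle<0,0,0,0> t12, undef
splat_mask = [0, 0, 0, 0]

# Only element 0 of the result is used.
print(shuffle_is_redundant(splat_mask, {0}))

# All four elements are used, so the splat is doing real work.
print(shuffle_is_redundant(splat_mask, {0, 1, 2, 3}))
```

In this model the check looks only at the mask and the demanded result lanes, never at what feeds the shuffle, which is why the origin of the source vector (insertelement or CopyFromReg) should not matter to it.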


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79718/new/

https://reviews.llvm.org/D79718




