[PATCH] D79718: [x86][CGP] enable target hook to sink funnel shift intrinsic's splatted shift amount
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 12 11:16:59 PDT 2020
spatel marked an inline comment as done.
spatel added inline comments.
================
Comment at: llvm/test/CodeGen/X86/vector-fshl-128.ll:2195
; AVX1-NEXT: movq $-1024, %rax # imm = 0xFC00
-; AVX1-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm0
-; AVX1-NEXT: vpslld $23, %xmm0, %xmm0
-; AVX1-NEXT: vpaddd {{.*}}(%rip), %xmm0, %xmm0
-; AVX1-NEXT: vcvttps2dq %xmm0, %xmm0
-; AVX1-NEXT: vpshufd {{.*#+}} xmm1 = xmm0[1,1,3,3]
+; AVX1-NEXT: vpshufd {{.*#+}} xmm0 = xmm0[0,0,0,0]
+; AVX1-NEXT: vpand {{.*}}(%rip), %xmm0, %xmm1
----------------
craig.topper wrote:
> Why are we splatting the scalar here when only element 0 is used?
This is another limitation caused by block-level visibility: SDAG doesn't know that the splat comes from a scalar because we sink only the shuffle instruction, not the insertelement that feeds it. Within the block, the DAG sees only:
t12: v4i32,ch = CopyFromReg t0, Register:v4i32 %0
t14: v4i32 = vector_shuffle<0,0,0,0> t12, undef:v4i32
The splat doesn't get hoisted back out of the loop until later, in MachineLICM, and there is apparently no demanded-elements analysis that runs that late.
We could try to sink the insertelement along with the shuffle, but that should probably be a separate patch.
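For illustration, here is a reduced IR sketch of the pattern in question. It is not taken from the test file: the function name, value names, and loop structure are hypothetical, and it assumes the typed-pointer IR syntax in use at the time of this patch.

```llvm
; Hypothetical reduced example; all names here are illustrative only.
declare <4 x i32> @llvm.fshl.v4i32(<4 x i32>, <4 x i32>, <4 x i32>)

define void @fshl_splat_amt(<4 x i32>* %arr, i32 %amt, i64 %n) {
entry:
  ; The scalar shift amount is inserted into lane 0 and then splatted.
  ; Both instructions start out in the preheader, outside the loop.
  %ins = insertelement <4 x i32> undef, i32 %amt, i32 0
  %splat = shufflevector <4 x i32> %ins, <4 x i32> undef, <4 x i32> zeroinitializer
  br label %loop

loop:
  %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
  %addr = getelementptr inbounds <4 x i32>, <4 x i32>* %arr, i64 %i
  %x = load <4 x i32>, <4 x i32>* %addr, align 16
  ; The funnel shift uses the splat as its shift amount. If only the
  ; shufflevector is sunk into this block, %ins is still defined in %entry,
  ; so the per-block DAG sees it as a plain vector CopyFromReg.
  %r = call <4 x i32> @llvm.fshl.v4i32(<4 x i32> %x, <4 x i32> %x, <4 x i32> %splat)
  store <4 x i32> %r, <4 x i32>* %addr, align 16
  %i.next = add nuw i64 %i, 1
  %done = icmp eq i64 %i.next, %n
  br i1 %done, label %exit, label %loop

exit:
  ret void
}
```

With only the shufflevector sunk next to the call, the insertelement stays in %entry, which is why the DAG for %loop cannot tell that a single scalar element is demanded.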
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D79718/new/
https://reviews.llvm.org/D79718