[PATCH] D109065: [X86] combineX86ShufflesRecursively(): call SimplifyMultipleUseDemandedVectorElts() on after finishing recursing
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 17 09:30:54 PDT 2021
RKSimon added inline comments.
================
Comment at: llvm/test/CodeGen/X86/insertelement-ones.ll:315
; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
+; SSE2-NEXT: pand %xmm1, %xmm0
; SSE2-NEXT: movl $255, %eax
----------------
We're going to have to improve INSERT_VECTOR_ELT handling of 0/-1 elements - just AND/OR if we don't have a legal PINSRB instruction (pre-SSE41).
================
Comment at: llvm/test/CodeGen/X86/oddshuffles.ll:2268
+; AVX1-NEXT: vblendps {{.*#+}} ymm0 = ymm1[0],ymm0[1],ymm1[2,3,4,5,6,7]
+; AVX1-NEXT: vmovd %eax, %xmm2
+; AVX1-NEXT: vpshufd {{.*#+}} xmm2 = xmm2[0,0,0,0]
----------------
Looks like we're missing a fold to share scalar_to_vector(x) and scalar_to_vector(trunc(x)) (maybe worth supporting scalar_to_vector(ext(x)) as well)?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D109065/new/
https://reviews.llvm.org/D109065
More information about the llvm-commits
mailing list