[PATCH] D109065: [X86] combineX86ShufflesRecursively(): call SimplifyMultipleUseDemandedVectorElts() on after finishing recursing

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 17 09:30:54 PDT 2021


RKSimon added inline comments.


================
Comment at: llvm/test/CodeGen/X86/insertelement-ones.ll:315
 ; SSE2-NEXT:    movdqa {{.*#+}} xmm1 = [0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
+; SSE2-NEXT:    pand %xmm1, %xmm0
 ; SSE2-NEXT:    movl $255, %eax
----------------
We're going to have to improve INSERT_VECTOR_ELT handling of 0/-1 elements - just AND/OR if we don't have a legal PINSRB instruction (pre-SSE41).


================
Comment at: llvm/test/CodeGen/X86/oddshuffles.ll:2268
+; AVX1-NEXT:    vblendps {{.*#+}} ymm0 = ymm1[0],ymm0[1],ymm1[2,3,4,5,6,7]
+; AVX1-NEXT:    vmovd %eax, %xmm2
+; AVX1-NEXT:    vpshufd {{.*#+}} xmm2 = xmm2[0,0,0,0]
----------------
Looks like we're missing a fold to share scalar_to_vector(x) and scalar_to_vector(trunc(x)) (maybe worth supporting scalar_to_vector(ext(x)) as well)?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109065/new/

https://reviews.llvm.org/D109065



More information about the llvm-commits mailing list