[PATCH] D109065: [X86] combineX86ShufflesRecursively(): call SimplifyMultipleUseDemandedVectorElts() on after finishing recursing

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 17 11:05:52 PDT 2021


lebedev.ri added inline comments.


================
Comment at: llvm/test/CodeGen/X86/insertelement-ones.ll:315
 ; SSE2-NEXT:    movdqa {{.*#+}} xmm1 = [0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
+; SSE2-NEXT:    pand %xmm1, %xmm0
 ; SSE2-NEXT:    movl $255, %eax
----------------
RKSimon wrote:
> RKSimon wrote:
> > We're going to have to improve INSERT_VECTOR_ELT handling of 0/-1 elements - just AND/OR if we don't have a legal PINSRB instruction (pre-SSE41).
> It looks like we might be able to do this more easily by extending lowerShuffleAsBitMask to handle the allones elements case as well as the zero elements case.
Note that `X86TargetLowering::LowerINSERT_VECTOR_ELT` isn't even called for this test,
since we expand, not legalize, in this case.
Marking it as legalize causes "don't know how to legalize" crashes;
I guess it doesn't retry legalization via the generic expansion.
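
For reference, the AND/OR idea from the quoted comments can be sketched in plain scalar C++.
This is only an illustration of the lane arithmetic, not LLVM code: `V16`, `insertAllOnes` and
`insertZero` are hypothetical names, and the loops stand in for a single vector `por` / `pand`
against a constant mask.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a v16i8 value.
using V16 = std::array<std::uint8_t, 16>;

// Insert an all-ones byte at lane Idx: OR with a mask that is 0xFF in that
// lane and 0 elsewhere. No preceding AND is needed, since x | 0xFF == 0xFF.
V16 insertAllOnes(const V16 &V, std::size_t Idx) {
  V16 Mask{};          // all lanes zero
  Mask[Idx] = 0xFF;
  V16 R{};
  for (std::size_t I = 0; I != R.size(); ++I)
    R[I] = V[I] | Mask[I];
  return R;
}

// Insert a zero byte at lane Idx: AND with a mask that is 0 in that lane
// and 0xFF elsewhere.
V16 insertZero(const V16 &V, std::size_t Idx) {
  V16 Mask;
  Mask.fill(0xFF);
  Mask[Idx] = 0x00;
  V16 R{};
  for (std::size_t I = 0; I != R.size(); ++I)
    R[I] = V[I] & Mask[I];
  return R;
}
```

In both cases only one bitwise op against a constant is needed, which is the property the
bitmask-based lowering would rely on.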


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109065/new/

https://reviews.llvm.org/D109065


