[PATCH] D109065: [X86] combineX86ShufflesRecursively(): call SimplifyMultipleUseDemandedVectorElts() on after finishing recursing
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 17 11:05:52 PDT 2021
lebedev.ri added inline comments.
================
Comment at: llvm/test/CodeGen/X86/insertelement-ones.ll:315
; SSE2-NEXT: movdqa {{.*#+}} xmm1 = [0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
+; SSE2-NEXT: pand %xmm1, %xmm0
; SSE2-NEXT: movl $255, %eax
----------------
RKSimon wrote:
> RKSimon wrote:
> > We're going to have to improve INSERT_VECTOR_ELT handling of 0/-1 elements - just AND/OR if we don't have a legal PINSRB instruction (pre-SSE41).
> It looks like we might be able to do this more easily by extending lowerShuffleAsBitMask to handle the allones elements case as well as the zero elements case.
Note that `X86TargetLowering::LowerINSERT_VECTOR_ELT` isn't even called for this test,
since we expand, not legalize, in this case.
Marking it as legalize causes "don't know how to legalize" crashes;
I guess legalization isn't retried via the generic expansion afterwards.
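For reference, the AND/OR idea from the quoted comments can be modelled on plain bytes. This is only a scalar sketch of the technique, not LLVM code: `V16i8`, `insertAllOnes` and `insertZero` are hypothetical names made up for illustration, and a real lowering would build the masks as constant vectors and emit a single POR/PAND.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a v16i8 vector; not an LLVM type.
using V16i8 = std::array<std::uint8_t, 16>;

// Insert an all-ones byte (0xFF) into lane Idx without PINSRB:
// OR with a mask that is 0xFF in that lane and 0x00 elsewhere.
V16i8 insertAllOnes(V16i8 V, std::size_t Idx) {
  assert(Idx < V.size() && "lane index out of range");
  V16i8 Mask{}; // value-initialized, so every lane starts at 0x00
  Mask[Idx] = 0xFF;
  for (std::size_t I = 0; I != V.size(); ++I)
    V[I] |= Mask[I]; // models a single POR
  return V;
}

// Insert a zero byte into lane Idx without PINSRB:
// AND with a mask that is 0x00 in that lane and 0xFF elsewhere.
V16i8 insertZero(V16i8 V, std::size_t Idx) {
  assert(Idx < V.size() && "lane index out of range");
  V16i8 Mask;
  Mask.fill(0xFF);
  Mask[Idx] = 0x00;
  for (std::size_t I = 0; I != V.size(); ++I)
    V[I] &= Mask[I]; // models a single PAND
  return V;
}
```

In this model each insert is one mask operation, whereas the SSE2 output above first clears lane 0 with a PAND mask and then goes on to materialize 255 in a GPR.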
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D109065/new/
https://reviews.llvm.org/D109065