[PATCH] D109065: [X86] combineX86ShufflesRecursively(): call SimplifyMultipleUseDemandedVectorElts() on after finishing recursing

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 17 10:53:41 PDT 2021


RKSimon added inline comments.


================
Comment at: llvm/test/CodeGen/X86/insertelement-ones.ll:315
 ; SSE2-NEXT:    movdqa {{.*#+}} xmm1 = [0,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255]
+; SSE2-NEXT:    pand %xmm1, %xmm0
 ; SSE2-NEXT:    movl $255, %eax
----------------
RKSimon wrote:
> We're going to have to improve INSERT_VECTOR_ELT handling of 0/-1 elements - just AND/OR if we don't have a legal PINSRB instruction (pre-SSE41).
It looks like we might be able to do this more easily by extending lowerShuffleAsBitMask to handle the allones elements case as well as the zero elements case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109065/new/

https://reviews.llvm.org/D109065



More information about the llvm-commits mailing list