[PATCH] D108382: [X86] lowerShuffleAsDecomposedShuffleMerge(): if both inputs are broadcastable/identities, canonicalize broadcasts as such

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 25 13:35:07 PDT 2021


lebedev.ri added inline comments.


================
Comment at: llvm/test/CodeGen/X86/oddshuffles.ll:2284
+; AVX2-FAST-NEXT:    vmovq {{.*#+}} xmm0 = mem[0],zero
+; AVX2-FAST-NEXT:    vpinsrd $2, 8(%rdi), %xmm0, %xmm1
 ; AVX2-FAST-NEXT:    vpxor %xmm0, %xmm0, %xmm0
----------------
RKSimon wrote:
> any luck with this?
I wrote a comment here, and phab just lost it :(

This looks like a DemandedElts failure.
In the LHS, we successfully dropped this load.
But when `SimplifyMultipleUseDemandedBits()` looks at this `insert_vector_elt`,
DemandedElts says that we demand all elements.
I think the problem is that we need to decode the target shuffle mask to notice that not all of them are actually demanded.
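
To illustrate what decoding the mask buys us, here is a standalone sketch (plain C++, not LLVM code) that maps the demanded result elements of a two-input shuffle back onto its inputs. `demandedInputElts` and the mask encoding (-1 for undef/zero, `[0, N)` for the first input, `[N, 2N)` for the second) are assumptions made for this example only:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper, not an LLVM API.
// Mask[i] names the source of result element i:
//   -1      -> undef/zero, demands nothing
//   [0, N)  -> element Mask[i] of the first input
//   [N, 2N) -> element Mask[i] - N of the second input
// Returns {demanded elements of input 0, demanded elements of input 1}
// as bitmasks, one bit per element.
std::pair<uint64_t, uint64_t>
demandedInputElts(const std::vector<int> &Mask, uint64_t DemandedResultElts) {
  const int N = static_cast<int>(Mask.size());
  assert(N <= 64 && "this sketch tracks at most 64 elements");

  uint64_t DemandedLHS = 0, DemandedRHS = 0;
  for (int i = 0; i < N; ++i) {
    // Skip result elements that nobody asks for.
    if (!((DemandedResultElts >> i) & 1))
      continue;
    const int M = Mask[i];
    if (M < 0)
      continue; // undef/zero lane
    assert(M < 2 * N && "mask index out of range");
    if (M < N)
      DemandedLHS |= uint64_t{1} << M;
    else
      DemandedRHS |= uint64_t{1} << (M - N);
  }
  return {DemandedLHS, DemandedRHS};
}
```

With a made-up 4-element mask `{0, 1, 4, -1}` and all four result elements demanded, this gives `0x3` for the first input and `0x1` for the second, so an element inserted at index 2 of the second input would not be demanded, even though every element of the shuffle result is.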

Wild guess: perhaps in `SimplifyMultipleUseDemandedBitsForTargetNode()`, after `getTargetShuffleInputs()`,
we can call `SimplifyMultipleUseDemandedBits()` on the inputs, and recreate the shuffle if that succeeds?
I'm not sure whether there is a better place to do that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108382/new/

https://reviews.llvm.org/D108382


