[PATCH] D104156: [DAGCombine][X86][ARM] EXTRACT_SUBVECTOR(VECTOR_SHUFFLE(?,?,Mask)) -> VECTOR_SHUFFLE(EXTRACT_SUBVECTOR(?, ?), EXTRACT_SUBVECTOR(?, ?), Mask')

Fri Jun 11 15:31:40 PDT 2021

lebedev.ri created this revision.
lebedev.ri added reviewers: RKSimon, spatel, craig.topper.
lebedev.ri added a project: LLVM.
Herald added subscribers: ecnelises, danielkiss, pengfei, hiraditya, kristof.beyls.
lebedev.ri requested review of this revision.

I haven't fully grokked the effect this has. There are some regressions,
but many cases trade a lane-crossing shuffle for a lane extraction plus a per-lane shuffle,
which is a win on Zen CPUs, but I guess a regression for targets that have fast cross-lane shuffles.
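
To make "lane-crossing" concrete, here is a minimal standalone sketch (plain C++, not code from the patch) of a check for whether a shuffle mask reads across lanes. The helper name crossesLanes and the eltsPerLane parameter are hypothetical, and the mask convention is assumed to match SelectionDAG's: values >= NumElts select from the second operand and negative values are undef.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, for illustration only.
// Returns true if any defined mask element reads from a source lane that
// differs from the lane it is written to.
// Assumed convention: mask[i] in [0, N) selects from operand 0,
// [N, 2N) selects from operand 1, and a negative value means undef.
bool crossesLanes(const std::vector<int> &mask, int eltsPerLane) {
  assert(eltsPerLane > 0 && "a lane must hold at least one element");
  const int numElts = static_cast<int>(mask.size());
  for (int i = 0; i < numElts; ++i) {
    const int m = mask[i];
    if (m < 0)
      continue; // undef elements never force a lane crossing
    const int srcLane = (m % numElts) / eltsPerLane; // lane inside either operand
    const int dstLane = i / eltsPerLane;
    if (srcLane != dstLane)
      return true;
  }
  return false;
}
```

With 8 elements and 4 elements per lane (think v8i32 on AVX2), the half-swap mask {4,5,6,7,0,1,2,3} crosses lanes, while the per-lane interleave {0,8,1,9,4,12,5,13} does not.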

I'm not familiar enough with this area to know the answer,
but the results are somewhat surprising to me.
Do we need a reverse fold, or am I forgetting some profitability check?

I think I got the logic right (at least, I have already caught the obvious bugs I had).
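
For readers who want the mask arithmetic spelled out, the following is a minimal standalone sketch of how Mask' could be derived. It is an illustration under stated assumptions, not the code in DAGCombiner.cpp: the names NarrowedShuffle and narrowShuffleOfExtract are hypothetical, and the sketch assumes the fold is only attempted when, within the extracted slice of Mask, each operand is read from a single subvector-aligned chunk.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type, for illustration only.
struct NarrowedShuffle {
  int lhsIdx = -1;       // start of the chunk extracted from operand 0, -1 if unused
  int rhsIdx = -1;       // start of the chunk extracted from operand 1, -1 if unused
  std::vector<int> mask; // Mask', indexing into 2 * numSubElts elements
};

// Sketch of EXTRACT_SUBVECTOR(VECTOR_SHUFFLE(A, B, Mask), extractIdx)
//   -> VECTOR_SHUFFLE(EXTRACT_SUBVECTOR(A, lhsIdx),
//                     EXTRACT_SUBVECTOR(B, rhsIdx), Mask').
// Returns false when an operand would need more than one aligned chunk.
bool narrowShuffleOfExtract(const std::vector<int> &mask, int extractIdx,
                            int numSubElts, NarrowedShuffle &out) {
  const int numElts = static_cast<int>(mask.size());
  assert(numSubElts > 0 && numElts % numSubElts == 0);
  assert(extractIdx >= 0 && extractIdx % numSubElts == 0);
  assert(extractIdx + numSubElts <= numElts);

  out = NarrowedShuffle();
  out.mask.assign(numSubElts, -1); // start with every element undef
  for (int i = 0; i < numSubElts; ++i) {
    const int m = mask[extractIdx + i];
    if (m < 0)
      continue; // undef stays undef
    const bool fromRhs = m >= numElts;
    const int pos = fromRhs ? m - numElts : m; // position inside its operand
    const int chunkStart = (pos / numSubElts) * numSubElts;
    int &opIdx = fromRhs ? out.rhsIdx : out.lhsIdx;
    if (opIdx < 0)
      opIdx = chunkStart; // first use of this operand fixes its chunk
    else if (opIdx != chunkStart)
      return false; // operand is read from two different chunks
    out.mask[i] = (fromRhs ? numSubElts : 0) + (pos - chunkStart);
  }
  return true;
}
```

For example, with the 8-element mask {0,8,1,9,6,14,7,15}, extracting the upper 4 elements yields lhsIdx = 4, rhsIdx = 4 and Mask' = {2,6,3,7}. The mask {0,4,1,5,2,6,3,7} with a lower-half extract is rejected, because operand 0 would be read from both of its halves. A profitability check would sit on top of this legality check; the sketch does not attempt one.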

The SVE tests are bad: they aren't autogenerated,
and I haven't checked whether I can grok them well enough to update them manually.
I don't like that trend. CC @bsmith

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D104156

Files:
  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  llvm/test/CodeGen/AArch64/arm64-neon-copy.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-fp-vselect.ll
  llvm/test/CodeGen/AArch64/sve-fixed-length-int-vselect.ll
  llvm/test/CodeGen/ARM/crash-on-pow2-shufflevector.ll
  llvm/test/CodeGen/ARM/fp16-insert-extract.ll
  llvm/test/CodeGen/ARM/vext.ll
  llvm/test/CodeGen/X86/avx2-conversions.ll
  llvm/test/CodeGen/X86/avx2-shift.ll
  llvm/test/CodeGen/X86/avx2-vector-shifts.ll
  llvm/test/CodeGen/X86/avx512-hadd-hsub.ll
  llvm/test/CodeGen/X86/avx512-shuffles/partial_permute.ll
  llvm/test/CodeGen/X86/cast-vsel.ll
  llvm/test/CodeGen/X86/combine-shl.ll
  llvm/test/CodeGen/X86/combine-sra.ll
  llvm/test/CodeGen/X86/combine-srl.ll
  llvm/test/CodeGen/X86/known-signbits-vector.ll
  llvm/test/CodeGen/X86/masked_store_trunc.ll
  llvm/test/CodeGen/X86/min-legal-vector-width.ll
  llvm/test/CodeGen/X86/psubus.ll
  llvm/test/CodeGen/X86/reduce-trunc-shl.ll
  llvm/test/CodeGen/X86/shuffle-vs-trunc-256.ll
  llvm/test/CodeGen/X86/trunc-subvector.ll
  llvm/test/CodeGen/X86/vector-fshl-128.ll
  llvm/test/CodeGen/X86/vector-fshl-256.ll
  llvm/test/CodeGen/X86/vector-fshl-rot-128.ll
  llvm/test/CodeGen/X86/vector-fshl-rot-256.ll
  llvm/test/CodeGen/X86/vector-fshr-128.ll
  llvm/test/CodeGen/X86/vector-fshr-rot-128.ll
  llvm/test/CodeGen/X86/vector-fshr-rot-256.ll
  llvm/test/CodeGen/X86/vector-narrow-binop.ll
  llvm/test/CodeGen/X86/vector-pack-256.ll
  llvm/test/CodeGen/X86/vector-rotate-128.ll
  llvm/test/CodeGen/X86/vector-rotate-256.ll
  llvm/test/CodeGen/X86/vector-shift-ashr-256.ll
  llvm/test/CodeGen/X86/vector-shift-shl-128.ll
  llvm/test/CodeGen/X86/vector-shift-shl-256.ll
  llvm/test/CodeGen/X86/vector-shift-shl-sub128.ll
  llvm/test/CodeGen/X86/vector-shuffle-256-v8.ll
  llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
  llvm/test/CodeGen/X86/vector-trunc-math.ll
  llvm/test/CodeGen/X86/vector-trunc-packus.ll
  llvm/test/CodeGen/X86/vector-trunc-ssat.ll
  llvm/test/CodeGen/X86/vector-trunc-usat.ll
  llvm/test/CodeGen/X86/vector-trunc.ll
  llvm/test/CodeGen/X86/x86-interleaved-access.ll
