[PATCH] D92553: [ARM][X86] Match dual lane vmovs from insert_vector_elt

Thu Dec 3 01:15:28 PST 2020

dmgreen created this revision.
dmgreen added reviewers: SjoerdMeijer, simon_tatham, pengfei, efriedma, RKSimon.
Herald added subscribers: ecnelises, danielkiss, steven.zhang, hiraditya, kristof.beyls.
Herald added a project: LLVM.
dmgreen requested review of this revision.

MVE has dual lane vector move instructions, each capable of moving two general purpose registers into two lanes of a vector register. They look like one of:

  vmov q0[2], q0[0], r2, r0
  vmov q0[3], q0[1], r3, r1

They only accept these lane indices though (and only insert i32 elements), moving either lanes 0 and 2, or lanes 1 and 3.
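As a rough illustration of the lane restriction, here is a toy C++ model of the instruction. It is only a sketch: `QReg`, `isDualLanePair` and `dualLaneVmov` are hypothetical names that do not appear in the patch, and the operand order (first register into the first named lane) is an assumption read off the two examples above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model of an MVE q register viewed as four i32 lanes (hypothetical type).
using QReg = std::array<std::uint32_t, 4>;

// True if (hi, lo) is one of the two lane pairs the examples show:
// lanes 2 and 0, or lanes 3 and 1.
constexpr bool isDualLanePair(unsigned hi, unsigned lo) {
  return (hi == 2 && lo == 0) || (hi == 3 && lo == 1);
}

// Models `vmov q[hi], q[lo], rHi, rLo`: writes two GPR values into two lanes
// and leaves the other two lanes untouched. The operand order is an
// assumption taken from the examples in the description.
inline QReg dualLaneVmov(QReg q, unsigned hi, unsigned lo,
                         std::uint32_t rHi, std::uint32_t rLo) {
  assert(isDualLanePair(hi, lo) && "only lanes (2,0) or (3,1) are accepted");
  q[hi] = rHi;
  q[lo] = rLo;
  return q;
}
```

Under these assumptions, `dualLaneVmov(q, 2, 0, r2, r0)` corresponds to the first form above and `dualLaneVmov(q, 3, 1, r3, r1)` to the second; a pair such as lanes 1 and 0 is rejected by the assertion.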

This patch adds some patterns for them, selecting from a pair of insert_vector_elt nodes. Because of the format of the instructions, there is also an added combine to transform the order of insert_vector_elts into one where the inserts into lanes 0 and 2 are adjacent.
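The effect of such a reordering can be sketched outside of SelectionDAG. The following is not the combine from the patch, just a standalone model under the assumption that every insert targets a distinct lane (so the inserts commute); `LaneInsert`, `groupForDualLane` and `countDualLanePairs` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One insert of a scalar value into a lane of a 4 x i32 vector (hypothetical).
struct LaneInsert {
  unsigned lane;
  std::uint32_t value;
};

// Reorders inserts so even lanes come first (0, 2), then odd lanes (1, 3).
// Assumes every insert targets a distinct lane, so the order of the inserts
// does not change the final vector.
inline std::vector<LaneInsert> groupForDualLane(std::vector<LaneInsert> inserts) {
  for (const LaneInsert &ins : inserts)
    assert(ins.lane < 4 && "model covers four i32 lanes only");
  std::stable_sort(inserts.begin(), inserts.end(),
                   [](const LaneInsert &a, const LaneInsert &b) {
                     return std::make_pair(a.lane % 2, a.lane) <
                            std::make_pair(b.lane % 2, b.lane);
                   });
  return inserts;
}

// Counts adjacent inserts that form a (0, 2) or (1, 3) lane pair, that is,
// neighbours that a single dual lane move could cover in this model.
inline unsigned countDualLanePairs(const std::vector<LaneInsert> &inserts) {
  unsigned pairs = 0;
  for (std::size_t i = 0; i + 1 < inserts.size();) {
    unsigned a = inserts[i].lane, b = inserts[i + 1].lane;
    if (a < 2 && b == a + 2) {
      ++pairs;
      i += 2; // both inserts are consumed by one pair
    } else {
      ++i;
    }
  }
  return pairs;
}
```

In this model, inserts in ascending lane order (0, 1, 2, 3) contain no adjacent pair, while the grouped order (0, 2, 1, 3) contains two, which is why the order of the inserts matters for matching.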

Unfortunately there is a "canonicalization" in DAGCombine that will transform insert_vector_elts back into ascending order, even if this is worse for the target. This does not seem to be needed in many places though, with only some X86 tests changing, so I have moved the canonicalization into the X86 backend.

https://reviews.llvm.org/D92553

Files:
  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
  llvm/lib/Target/ARM/ARMISelLowering.cpp
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/test/CodeGen/Thumb2/active_lane_mask.ll
  llvm/test/CodeGen/Thumb2/mve-abs.ll
  llvm/test/CodeGen/Thumb2/mve-div-expand.ll
  llvm/test/CodeGen/Thumb2/mve-gather-increment.ll
  llvm/test/CodeGen/Thumb2/mve-gather-ind32-unscaled.ll
  llvm/test/CodeGen/Thumb2/mve-gather-ind8-unscaled.ll
  llvm/test/CodeGen/Thumb2/mve-gather-ptrs.ll
  llvm/test/CodeGen/Thumb2/mve-gather-scatter-opt.ll
  llvm/test/CodeGen/Thumb2/mve-masked-ldst.ll
  llvm/test/CodeGen/Thumb2/mve-minmax.ll
  llvm/test/CodeGen/Thumb2/mve-neg.ll
  llvm/test/CodeGen/Thumb2/mve-phireg.ll
  llvm/test/CodeGen/Thumb2/mve-pred-and.ll
  llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
  llvm/test/CodeGen/Thumb2/mve-pred-constfold.ll
  llvm/test/CodeGen/Thumb2/mve-pred-ext.ll
  llvm/test/CodeGen/Thumb2/mve-pred-loadstore.ll
  llvm/test/CodeGen/Thumb2/mve-pred-not.ll
  llvm/test/CodeGen/Thumb2/mve-pred-or.ll
  llvm/test/CodeGen/Thumb2/mve-pred-shuffle.ll
  llvm/test/CodeGen/Thumb2/mve-pred-xor.ll
  llvm/test/CodeGen/Thumb2/mve-satmul-loops.ll
  llvm/test/CodeGen/Thumb2/mve-saturating-arith.ll
  llvm/test/CodeGen/Thumb2/mve-scatter-ind8-unscaled.ll
  llvm/test/CodeGen/Thumb2/mve-sext.ll
  llvm/test/CodeGen/Thumb2/mve-shifts.ll
  llvm/test/CodeGen/Thumb2/mve-simple-arith.ll
  llvm/test/CodeGen/Thumb2/mve-soft-float-abi.ll
  llvm/test/CodeGen/Thumb2/mve-vabdus.ll
  llvm/test/CodeGen/Thumb2/mve-vcmp.ll
  llvm/test/CodeGen/Thumb2/mve-vcmpr.ll
  llvm/test/CodeGen/Thumb2/mve-vcmpz.ll
  llvm/test/CodeGen/Thumb2/mve-vcreate.ll
  llvm/test/CodeGen/Thumb2/mve-vcvt.ll
  llvm/test/CodeGen/Thumb2/mve-vdup.ll
  llvm/test/CodeGen/Thumb2/mve-vecreduce-add.ll
  llvm/test/CodeGen/Thumb2/mve-vecreduce-addpred.ll
  llvm/test/CodeGen/Thumb2/mve-vecreduce-mla.ll
  llvm/test/CodeGen/Thumb2/mve-vecreduce-mlapred.ll
  llvm/test/CodeGen/Thumb2/mve-vhadd.ll
  llvm/test/CodeGen/Thumb2/mve-vld2-post.ll
  llvm/test/CodeGen/Thumb2/mve-vld2.ll
  llvm/test/CodeGen/Thumb2/mve-vld3.ll
  llvm/test/CodeGen/Thumb2/mve-vld4-post.ll
  llvm/test/CodeGen/Thumb2/mve-vld4.ll
  llvm/test/CodeGen/Thumb2/mve-vmulh.ll
  llvm/test/CodeGen/Thumb2/mve-vmull-loop.ll
  llvm/test/CodeGen/Thumb2/mve-vqdmulh.ll
  llvm/test/CodeGen/Thumb2/mve-vqmovn.ll
  llvm/test/CodeGen/Thumb2/mve-vqshrn.ll
  llvm/test/CodeGen/Thumb2/mve-vst2.ll
  llvm/test/CodeGen/Thumb2/mve-vst3.ll
  llvm/test/CodeGen/Thumb2/mve-vst4.ll
  llvm/test/CodeGen/Thumb2/mve-widen-narrow.ll