[all-commits] [llvm/llvm-project] 48cef1: [ARM] Create VMOVRRD from adjacent vector extracts

Tue Apr 20 07:16:20 PDT 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 48cef1fa8ee6448e35ffc34259da500d3b81c6b6
      https://github.com/llvm/llvm-project/commit/48cef1fa8ee6448e35ffc34259da500d3b81c6b6
  Author: David Green <david.green at arm.com>
  Date:   2021-04-20 (Tue, 20 Apr 2021)

  Changed paths:
    M llvm/lib/Target/ARM/ARMBaseRegisterInfo.cpp
    M llvm/lib/Target/ARM/ARMBaseRegisterInfo.h
    M llvm/lib/Target/ARM/ARMISelLowering.cpp
    M llvm/test/CodeGen/ARM/addsubo-legalization.ll
    M llvm/test/CodeGen/ARM/big-endian-neon-fp16-bitconv.ll
    M llvm/test/CodeGen/ARM/big-endian-vector-callee.ll
    M llvm/test/CodeGen/ARM/combine-vmovdrr.ll
    M llvm/test/CodeGen/ARM/vselect_imax.ll
    M llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-float-loops.ll
    M llvm/test/CodeGen/Thumb2/active_lane_mask.ll
    M llvm/test/CodeGen/Thumb2/mve-abs.ll
    M llvm/test/CodeGen/Thumb2/mve-ctlz.ll
    M llvm/test/CodeGen/Thumb2/mve-ctpop.ll
    M llvm/test/CodeGen/Thumb2/mve-cttz.ll
    M llvm/test/CodeGen/Thumb2/mve-div-expand.ll
    M llvm/test/CodeGen/Thumb2/mve-fmath.ll
    M llvm/test/CodeGen/Thumb2/mve-gather-increment.ll
    M llvm/test/CodeGen/Thumb2/mve-gather-ind16-scaled.ll
    M llvm/test/CodeGen/Thumb2/mve-gather-ind16-unscaled.ll
    M llvm/test/CodeGen/Thumb2/mve-gather-ind32-unscaled.ll
    M llvm/test/CodeGen/Thumb2/mve-gather-ind8-unscaled.ll
    M llvm/test/CodeGen/Thumb2/mve-gather-ptrs.ll
    M llvm/test/CodeGen/Thumb2/mve-gather-scatter-opt.ll
    M llvm/test/CodeGen/Thumb2/mve-laneinterleaving-cost.ll
    M llvm/test/CodeGen/Thumb2/mve-laneinterleaving.ll
    M llvm/test/CodeGen/Thumb2/mve-masked-load.ll
    M llvm/test/CodeGen/Thumb2/mve-masked-store.ll
    M llvm/test/CodeGen/Thumb2/mve-minmax.ll
    M llvm/test/CodeGen/Thumb2/mve-neg.ll
    M llvm/test/CodeGen/Thumb2/mve-nofloat.ll
    M llvm/test/CodeGen/Thumb2/mve-phireg.ll
    M llvm/test/CodeGen/Thumb2/mve-pred-and.ll
    M llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
    M llvm/test/CodeGen/Thumb2/mve-pred-ext.ll
    M llvm/test/CodeGen/Thumb2/mve-pred-loadstore.ll
    M llvm/test/CodeGen/Thumb2/mve-pred-not.ll
    M llvm/test/CodeGen/Thumb2/mve-pred-or.ll
    M llvm/test/CodeGen/Thumb2/mve-pred-shuffle.ll
    M llvm/test/CodeGen/Thumb2/mve-pred-vselect.ll
    M llvm/test/CodeGen/Thumb2/mve-pred-xor.ll
    M llvm/test/CodeGen/Thumb2/mve-satmul-loops.ll
    M llvm/test/CodeGen/Thumb2/mve-saturating-arith.ll
    M llvm/test/CodeGen/Thumb2/mve-scatter-increment.ll
    M llvm/test/CodeGen/Thumb2/mve-scatter-ind16-scaled.ll
    M llvm/test/CodeGen/Thumb2/mve-scatter-ind16-unscaled.ll
    M llvm/test/CodeGen/Thumb2/mve-scatter-ind32-scaled.ll
    M llvm/test/CodeGen/Thumb2/mve-scatter-ind32-unscaled.ll
    M llvm/test/CodeGen/Thumb2/mve-scatter-ind8-unscaled.ll
    M llvm/test/CodeGen/Thumb2/mve-scatter-ptrs.ll
    M llvm/test/CodeGen/Thumb2/mve-sext.ll
    M llvm/test/CodeGen/Thumb2/mve-shifts.ll
    M llvm/test/CodeGen/Thumb2/mve-shuffle.ll
    M llvm/test/CodeGen/Thumb2/mve-simple-arith.ll
    M llvm/test/CodeGen/Thumb2/mve-soft-float-abi.ll
    M llvm/test/CodeGen/Thumb2/mve-vabd.ll
    M llvm/test/CodeGen/Thumb2/mve-vabdus.ll
    M llvm/test/CodeGen/Thumb2/mve-vaddv.ll
    M llvm/test/CodeGen/Thumb2/mve-vcmp.ll
    M llvm/test/CodeGen/Thumb2/mve-vcmpr.ll
    M llvm/test/CodeGen/Thumb2/mve-vcmpz.ll
    M llvm/test/CodeGen/Thumb2/mve-vcvt.ll
    M llvm/test/CodeGen/Thumb2/mve-vecreduce-add.ll
    M llvm/test/CodeGen/Thumb2/mve-vecreduce-addpred.ll
    M llvm/test/CodeGen/Thumb2/mve-vecreduce-bit.ll
    M llvm/test/CodeGen/Thumb2/mve-vecreduce-loops.ll
    M llvm/test/CodeGen/Thumb2/mve-vecreduce-mla.ll
    M llvm/test/CodeGen/Thumb2/mve-vecreduce-mlapred.ll
    M llvm/test/CodeGen/Thumb2/mve-vecreduce-mul.ll
    M llvm/test/CodeGen/Thumb2/mve-vld2-post.ll
    M llvm/test/CodeGen/Thumb2/mve-vld2.ll
    M llvm/test/CodeGen/Thumb2/mve-vld3.ll
    M llvm/test/CodeGen/Thumb2/mve-vld4-post.ll
    M llvm/test/CodeGen/Thumb2/mve-vld4.ll
    M llvm/test/CodeGen/Thumb2/mve-vmaxv-vminv-scalar.ll
    M llvm/test/CodeGen/Thumb2/mve-vmovn.ll
    M llvm/test/CodeGen/Thumb2/mve-vmull-loop.ll
    M llvm/test/CodeGen/Thumb2/mve-vqdmulh.ll
    M llvm/test/CodeGen/Thumb2/mve-vqmovn.ll
    M llvm/test/CodeGen/Thumb2/mve-vqshrn.ll
    M llvm/test/CodeGen/Thumb2/mve-vst2.ll
    M llvm/test/CodeGen/Thumb2/mve-vst3.ll
    M llvm/test/CodeGen/Thumb2/mve-vst4.ll
    M llvm/test/CodeGen/Thumb2/mve-zext-masked-load.ll

  Log Message:
  -----------
  [ARM] Create VMOVRRD from adjacent vector extracts

This adds a combine for extract(x, n); extract(x, n+1)  ->
VMOVRRD(extract x, n/2). This allows two vector lanes to be moved at the
same time in a single instruction, and thanks to the other VMOVRRD folds
we have added recently can help reduce the amount of executed
instructions. Floating point types are very similar, but will include a
bitcast to an integer type.

This also adds a shouldRewriteCopySrc, to prevent copy propagation from
DPR to SPR, which can break as not all DPR regs can be extracted from
directly.  Otherwise the machine verifier is unhappy.

Differential Revision: https://reviews.llvm.org/D100244