[all-commits] [llvm/llvm-project] 8dfb96: [X86] Make v32i16/v64i8 legal types without avx512...

Wed Apr 15 12:18:27 PDT 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 8dfb9627b7be27e7b37ab4200c60f65f5af95256
      https://github.com/llvm/llvm-project/commit/8dfb9627b7be27e7b37ab4200c60f65f5af95256
  Author: Craig Topper <craig.topper at intel.com>
  Date:   2020-04-15 (Wed, 15 Apr 2020)

  Changed paths:
    M llvm/docs/ReleaseNotes.rst
    M llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
    M llvm/lib/Target/X86/X86ISelLowering.cpp
    M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
    M llvm/test/Analysis/CostModel/X86/arith-fix.ll
    M llvm/test/Analysis/CostModel/X86/arith-overflow.ll
    M llvm/test/Analysis/CostModel/X86/arith.ll
    M llvm/test/Analysis/CostModel/X86/fshl.ll
    M llvm/test/Analysis/CostModel/X86/fshr.ll
    M llvm/test/Analysis/CostModel/X86/icmp.ll
    M llvm/test/Analysis/CostModel/X86/masked-intrinsic-cost.ll
    M llvm/test/Analysis/CostModel/X86/reduce-add.ll
    M llvm/test/Analysis/CostModel/X86/reduce-and.ll
    M llvm/test/Analysis/CostModel/X86/reduce-mul.ll
    M llvm/test/Analysis/CostModel/X86/reduce-or.ll
    M llvm/test/Analysis/CostModel/X86/reduce-smax.ll
    M llvm/test/Analysis/CostModel/X86/reduce-smin.ll
    M llvm/test/Analysis/CostModel/X86/reduce-umax.ll
    M llvm/test/Analysis/CostModel/X86/reduce-umin.ll
    M llvm/test/Analysis/CostModel/X86/reduce-xor.ll
    M llvm/test/Analysis/CostModel/X86/rem.ll
    M llvm/test/Analysis/CostModel/X86/shuffle-extract_subvector.ll
    M llvm/test/Analysis/CostModel/X86/shuffle-reverse.ll
    M llvm/test/Analysis/CostModel/X86/shuffle-two-src.ll
    M llvm/test/Analysis/CostModel/X86/trunc.ll
    M llvm/test/Analysis/CostModel/X86/vector-extract.ll
    M llvm/test/Analysis/CostModel/X86/vector-insert.ll
    M llvm/test/CodeGen/X86/avg-mask.ll
    M llvm/test/CodeGen/X86/avg.ll
    M llvm/test/CodeGen/X86/avx512-calling-conv.ll
    M llvm/test/CodeGen/X86/avx512-ext.ll
    M llvm/test/CodeGen/X86/avx512-insert-extract.ll
    M llvm/test/CodeGen/X86/avx512-logic.ll
    M llvm/test/CodeGen/X86/avx512-mask-op.ll
    M llvm/test/CodeGen/X86/avx512-select.ll
    M llvm/test/CodeGen/X86/avx512-trunc.ll
    M llvm/test/CodeGen/X86/avx512-vbroadcasti128.ll
    M llvm/test/CodeGen/X86/avx512-vbroadcasti256.ll
    M llvm/test/CodeGen/X86/avx512-vec-cmp.ll
    M llvm/test/CodeGen/X86/avx512-vselect.ll
    M llvm/test/CodeGen/X86/avx512vl-vec-masked-cmp.ll
    M llvm/test/CodeGen/X86/bitcast-and-setcc-512.ll
    M llvm/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll
    M llvm/test/CodeGen/X86/bitcast-setcc-512.ll
    M llvm/test/CodeGen/X86/fast-isel-nontemporal.ll
    M llvm/test/CodeGen/X86/kshift.ll
    M llvm/test/CodeGen/X86/madd.ll
    M llvm/test/CodeGen/X86/masked_store_trunc.ll
    M llvm/test/CodeGen/X86/masked_store_trunc_usat.ll
    M llvm/test/CodeGen/X86/merge-consecutive-loads-512.ll
    M llvm/test/CodeGen/X86/midpoint-int-vec-512.ll
    M llvm/test/CodeGen/X86/movmsk-cmp.ll
    M llvm/test/CodeGen/X86/nontemporal-loads-2.ll
    M llvm/test/CodeGen/X86/nontemporal-loads.ll
    M llvm/test/CodeGen/X86/pmaddubsw.ll
    M llvm/test/CodeGen/X86/pmul.ll
    M llvm/test/CodeGen/X86/pmulh.ll
    M llvm/test/CodeGen/X86/pr45443.ll
    M llvm/test/CodeGen/X86/var-permute-512.ll
    M llvm/test/CodeGen/X86/vector-compare-results.ll
    M llvm/test/CodeGen/X86/vector-fshl-512.ll
    M llvm/test/CodeGen/X86/vector-fshl-rot-512.ll
    M llvm/test/CodeGen/X86/vector-fshr-512.ll
    M llvm/test/CodeGen/X86/vector-fshr-rot-512.ll
    M llvm/test/CodeGen/X86/vector-idiv-sdiv-512.ll
    M llvm/test/CodeGen/X86/vector-idiv-udiv-512.ll
    M llvm/test/CodeGen/X86/vector-popcnt-512.ll
    M llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
    M llvm/test/CodeGen/X86/vector-reduce-mul.ll
    M llvm/test/CodeGen/X86/vector-reduce-or-bool.ll
    M llvm/test/CodeGen/X86/vector-reduce-xor-bool.ll
    M llvm/test/CodeGen/X86/vector-rotate-512.ll
    M llvm/test/CodeGen/X86/vector-sext.ll
    M llvm/test/CodeGen/X86/vector-shift-ashr-512.ll
    M llvm/test/CodeGen/X86/vector-shift-lshr-512.ll
    M llvm/test/CodeGen/X86/vector-shift-shl-512.ll
    M llvm/test/CodeGen/X86/vector-shuffle-512-v32.ll
    M llvm/test/CodeGen/X86/vector-shuffle-512-v64.ll
    M llvm/test/CodeGen/X86/vector-shuffle-v1.ll
    M llvm/test/CodeGen/X86/vector-tzcnt-512.ll
    M llvm/test/CodeGen/X86/vector-zext.ll
    M llvm/test/CodeGen/X86/viabs.ll

  Log Message:
  -----------
  [X86] Make v32i16/v64i8 legal types without avx512bw. Use custom splitting instead.

This moves v32i16/v64i8 to a model consistent with how we
treat integer types with avx1.
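
As a rough illustration of what "custom splitting" means here, the sketch
below models a v32i16 add done as two 256-bit halves that are then
concatenated back into one 512-bit value. It is plain C++ written for this
note, not code from the patch: the names V16i16, V32i16, add256, extractHalf,
concatHalves and add512Split are hypothetical stand-ins, and the real lowering
in X86ISelLowering.cpp works on SelectionDAG nodes rather than arrays.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the vector types; these are not LLVM classes.
using V16i16 = std::array<std::int16_t, 16>;  // 256 bits, one ymm register
using V32i16 = std::array<std::int16_t, 32>;  // 512 bits, one zmm register

// Stand-in for a 256-bit i16 add, which is available without avx512bw.
// The add wraps on overflow, so it is done in unsigned arithmetic.
V16i16 add256(const V16i16 &a, const V16i16 &b) {
  V16i16 r{};
  for (std::size_t i = 0; i < r.size(); ++i) {
    const auto sum = static_cast<std::uint16_t>(
        static_cast<std::uint16_t>(a[i]) + static_cast<std::uint16_t>(b[i]));
    r[i] = static_cast<std::int16_t>(sum);
  }
  return r;
}

// Extract the low (half == 0) or high (half == 1) 256 bits of a 512-bit value.
V16i16 extractHalf(const V32i16 &v, std::size_t half) {
  V16i16 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = v[half * r.size() + i];
  return r;
}

// Concatenate two 256-bit halves back into one 512-bit value.
V32i16 concatHalves(const V16i16 &lo, const V16i16 &hi) {
  V32i16 r{};
  for (std::size_t i = 0; i < lo.size(); ++i) {
    r[i] = lo[i];
    r[lo.size() + i] = hi[i];
  }
  return r;
}

// The type stays 512 bits wide (legal), but the operation itself is split.
V32i16 add512Split(const V32i16 &a, const V32i16 &b) {
  const V16i16 lo = add256(extractHalf(a, 0), extractHalf(b, 0));
  const V16i16 hi = add256(extractHalf(a, 1), extractHalf(b, 1));
  return concatHalves(lo, hi);
}
```

The point of the sketch is that the value keeps its 512-bit type throughout,
and only the arithmetic is carried out on narrower pieces.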

This does change the ABI for vXi16/vXi8 vectors larger than
512 bits: they are now passed in multiple zmms instead of multiple
ymms. We'd already hacked some code to make v64i8/v32i16 pass in zmm.

The cost model is still a bit of a mess. In some places I tried to
match existing behavior, but we really need to account for
splitting and concatenation costs. The cost model for shuffles is
especially pessimistic.

Differential Revision: https://reviews.llvm.org/D76212