[all-commits] [llvm/llvm-project] e9f946: [X86] X86FixupInstTunings - add VPERMILPDri -> VSH...
Simon Pilgrim via All-commits
all-commits at lists.llvm.org
Sun Apr 23 03:49:16 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: e9f9467da063875bd684e46660e2ff36ba4f55e2
https://github.com/llvm/llvm-project/commit/e9f9467da063875bd684e46660e2ff36ba4f55e2
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2023-04-23 (Sun, 23 Apr 2023)
Changed paths:
M llvm/lib/Target/X86/X86FixupInstTuning.cpp
M llvm/test/CodeGen/X86/avx-intrinsics-fast-isel.ll
M llvm/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll
M llvm/test/CodeGen/X86/avx-intrinsics-x86.ll
M llvm/test/CodeGen/X86/avx-vbroadcast.ll
M llvm/test/CodeGen/X86/avx512-cvt.ll
M llvm/test/CodeGen/X86/avx512-hadd-hsub.ll
M llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
M llvm/test/CodeGen/X86/avx512-intrinsics-upgrade.ll
M llvm/test/CodeGen/X86/avx512-shuffles/in_lane_permute.ll
M llvm/test/CodeGen/X86/avx512fp16-mov.ll
M llvm/test/CodeGen/X86/avx512fp16-mscatter.ll
M llvm/test/CodeGen/X86/avx512vl-intrinsics-upgrade.ll
M llvm/test/CodeGen/X86/combine-and.ll
M llvm/test/CodeGen/X86/complex-fastmath.ll
M llvm/test/CodeGen/X86/copy-low-subvec-elt-to-high-subvec-elt.ll
M llvm/test/CodeGen/X86/extract-concat.ll
M llvm/test/CodeGen/X86/fmaddsub-combine.ll
M llvm/test/CodeGen/X86/fmf-reduction.ll
M llvm/test/CodeGen/X86/haddsub-2.ll
M llvm/test/CodeGen/X86/haddsub-3.ll
M llvm/test/CodeGen/X86/haddsub-broadcast.ll
M llvm/test/CodeGen/X86/haddsub-shuf.ll
M llvm/test/CodeGen/X86/haddsub-undef.ll
M llvm/test/CodeGen/X86/haddsub.ll
M llvm/test/CodeGen/X86/half.ll
M llvm/test/CodeGen/X86/horizontal-reduce-fadd.ll
M llvm/test/CodeGen/X86/horizontal-sum.ll
M llvm/test/CodeGen/X86/known-signbits-vector.ll
M llvm/test/CodeGen/X86/load-partial-dot-product.ll
M llvm/test/CodeGen/X86/matrix-multiply.ll
M llvm/test/CodeGen/X86/oddshuffles.ll
M llvm/test/CodeGen/X86/pr40730.ll
M llvm/test/CodeGen/X86/scalar-int-to-fp.ll
M llvm/test/CodeGen/X86/scalarize-fp.ll
M llvm/test/CodeGen/X86/shuffle-of-splat-multiuses.ll
M llvm/test/CodeGen/X86/sse-scalar-fp-arith.ll
M llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll
M llvm/test/CodeGen/X86/sse3-avx-addsub-2.ll
M llvm/test/CodeGen/X86/tuning-shuffle-permilpd-avx512.ll
M llvm/test/CodeGen/X86/tuning-shuffle-permilpd.ll
M llvm/test/CodeGen/X86/vec-strict-fptoint-128.ll
M llvm/test/CodeGen/X86/vec-strict-fptoint-256.ll
M llvm/test/CodeGen/X86/vec-strict-fptoint-512.ll
M llvm/test/CodeGen/X86/vec_fp_to_int.ll
M llvm/test/CodeGen/X86/vector-half-conversions.ll
M llvm/test/CodeGen/X86/vector-interleave.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-5.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-3.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-4.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-5.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-3.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-7.ll
M llvm/test/CodeGen/X86/vector-narrow-binop.ll
M llvm/test/CodeGen/X86/vector-reduce-fadd-fast.ll
M llvm/test/CodeGen/X86/vector-reduce-fadd.ll
M llvm/test/CodeGen/X86/vector-reduce-fmax-fmin-fast.ll
M llvm/test/CodeGen/X86/vector-reduce-fmax-nnan.ll
M llvm/test/CodeGen/X86/vector-reduce-fmax.ll
M llvm/test/CodeGen/X86/vector-reduce-fmin-nnan.ll
M llvm/test/CodeGen/X86/vector-reduce-fmin.ll
M llvm/test/CodeGen/X86/vector-reduce-fmul-fast.ll
M llvm/test/CodeGen/X86/vector-reduce-fmul.ll
M llvm/test/CodeGen/X86/vector-shuffle-128-v2.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v4.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v8.ll
M llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
M llvm/test/CodeGen/X86/vector-shuffle-512-v8.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining-avx.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining-xop.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining.ll
M llvm/test/CodeGen/X86/x86-interleaved-access.ll
Log Message:
-----------
[X86] X86FixupInstTunings - add VPERMILPDri -> VSHUFPDrri mapping
Similar to the original VPERMILPSri -> VSHUFPSrri mapping added in D143787, replacing VPERMILPDri with VSHUFPDrri should never be slower and saves one encoding byte.
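To illustrate why the register form can be swapped directly, here is a small scalar model of the two instructions. It is only a sketch and not code from X86FixupInstTuning.cpp: the function names (permilpd, shufpd, sameForAllImmediates, checkAllWidths) are made up for this example, and vectors are modelled as std::array<double, N> with N = 2, 4 or 8 for xmm, ymm and zmm. Under this model, VSHUFPD with both sources set to the same register and the same immediate gives the same result as VPERMILPD. My understanding is that the byte saving comes from the VEX-encoded VSHUFPD sitting in the 0F opcode map (so the 2-byte VEX prefix is usually available), while the immediate form of VPERMILPD sits in the 0F3A map and needs the 3-byte VEX prefix.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Model of VPERMILPD with an immediate: bit I of Imm picks the low (0) or
// high (1) double of the 128-bit lane that contains element I.
template <std::size_t N>
std::array<double, N> permilpd(const std::array<double, N> &Src,
                               std::uint8_t Imm) {
  static_assert(N == 2 || N == 4 || N == 8, "xmm, ymm or zmm only");
  std::array<double, N> Dst{};
  for (std::size_t I = 0; I != N; ++I) {
    std::size_t LaneBase = I & ~std::size_t{1};
    Dst[I] = Src[LaneBase + ((Imm >> I) & 1)];
  }
  return Dst;
}

// Model of VSHUFPD: even result elements are selected from Src1, odd result
// elements from Src2, again using bit I of Imm within each 128-bit lane.
template <std::size_t N>
std::array<double, N> shufpd(const std::array<double, N> &Src1,
                             const std::array<double, N> &Src2,
                             std::uint8_t Imm) {
  static_assert(N == 2 || N == 4 || N == 8, "xmm, ymm or zmm only");
  std::array<double, N> Dst{};
  for (std::size_t I = 0; I != N; ++I) {
    std::size_t LaneBase = I & ~std::size_t{1};
    const std::array<double, N> &Src = (I & 1) ? Src2 : Src1;
    Dst[I] = Src[LaneBase + ((Imm >> I) & 1)];
  }
  return Dst;
}

// Exhaustively compare permilpd(Src, Imm) with shufpd(Src, Src, Imm) for
// every immediate that is meaningful at this vector width.
template <std::size_t N>
bool sameForAllImmediates() {
  std::array<double, N> Src{};
  for (std::size_t I = 0; I != N; ++I)
    Src[I] = static_cast<double>(I) + 1.0;
  for (unsigned Imm = 0; Imm != (1u << N); ++Imm) {
    std::uint8_t Imm8 = static_cast<std::uint8_t>(Imm);
    if (permilpd(Src, Imm8) != shufpd(Src, Src, Imm8))
      return false;
  }
  return true;
}

// Convenience check over the three vector widths.
void checkAllWidths() {
  assert(sameForAllImmediates<2>());
  assert(sameForAllImmediates<4>());
  assert(sameForAllImmediates<8>());
}
```

Calling checkAllWidths() from a test harness exercises all 4, 16 and 256 immediates for the three widths.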
The sibling VPERMILPDmi -> VPSHUFDmi mapping is trickier: it needs the same shuffle mask in every lane, and the mask must be adjusted. I haven't attempted that yet, but we can investigate it in the future if there's interest.
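As a rough sketch of what "same mask in every lane" and "adjusted" could mean for the memory form: VPERMILPD uses one selector bit per double, and each 128-bit lane has its own pair of bits, whereas PSHUFD uses four 2-bit dword selectors that are applied identically to every lane. The helper below is hypothetical (permilpdToPshufdImm is not an LLVM function, and this commit does not implement the mapping); it only shows the immediate arithmetic under those assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not part of LLVM: convert a VPERMILPD immediate that
// covers NumElts doubles (2, 4 or 8) into a PSHUFD immediate. Returns
// std::nullopt when the 128-bit lanes do not all use the same 2-bit selector
// pair, because PSHUFD cannot express per-lane differences.
std::optional<std::uint8_t> permilpdToPshufdImm(std::uint8_t Imm,
                                                unsigned NumElts) {
  assert((NumElts == 2 || NumElts == 4 || NumElts == 8) &&
         "expected an xmm, ymm or zmm element count");

  // Selector bits of the first lane; every other lane must match them.
  unsigned Lane0 = Imm & 3u;
  for (unsigned I = 2; I < NumElts; I += 2)
    if (((Imm >> I) & 3u) != Lane0)
      return std::nullopt;

  // Each selected double becomes two consecutive dwords: 2*B and 2*B + 1.
  unsigned B0 = Lane0 & 1u;
  unsigned B1 = (Lane0 >> 1) & 1u;
  unsigned Out = (2 * B0) | ((2 * B0 + 1) << 2) | ((2 * B1) << 4) |
                 ((2 * B1 + 1) << 6);
  return static_cast<std::uint8_t>(Out);
}
```

For example, the swap immediate 0x1 on a 128-bit vector maps to 0x4E, the usual PSHUFD immediate for swapping the two 64-bit halves, while a 256-bit immediate of 0x1 (swap in lane 0 only) has no PSHUFD equivalent and yields std::nullopt.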
Fixes #61060
Differential Revision: https://reviews.llvm.org/D148999