[all-commits] [llvm/llvm-project] 980610: [X86] X86FixupVectorConstantsPass - attempt to rep...
Simon Pilgrim via All-commits
all-commits at lists.llvm.org
Mon May 29 08:31:06 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 98061013e01207444cfd3980cde17b5e75764fbe
https://github.com/llvm/llvm-project/commit/98061013e01207444cfd3980cde17b5e75764fbe
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2023-05-29 (Mon, 29 May 2023)
Changed paths:
M llvm/lib/Target/X86/X86FixupVectorConstants.cpp
M llvm/test/CodeGen/X86/avx-basic.ll
M llvm/test/CodeGen/X86/avx-vbroadcast.ll
M llvm/test/CodeGen/X86/avx2-conversions.ll
M llvm/test/CodeGen/X86/avx2-intrinsics-x86.ll
M llvm/test/CodeGen/X86/avx2-vbroadcast.ll
M llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
M llvm/test/CodeGen/X86/avx512-shuffles/partial_permute.ll
M llvm/test/CodeGen/X86/bitreverse.ll
M llvm/test/CodeGen/X86/broadcast-elm-cross-splat-vec.ll
M llvm/test/CodeGen/X86/cast-vsel.ll
M llvm/test/CodeGen/X86/combine-and.ll
M llvm/test/CodeGen/X86/combine-sdiv.ll
M llvm/test/CodeGen/X86/combine-udiv.ll
M llvm/test/CodeGen/X86/extractelement-load.ll
M llvm/test/CodeGen/X86/fma-fneg-combine-2.ll
M llvm/test/CodeGen/X86/fma-intrinsics-fast-isel.ll
M llvm/test/CodeGen/X86/fma_patterns.ll
M llvm/test/CodeGen/X86/fma_patterns_wide.ll
M llvm/test/CodeGen/X86/fminimum-fmaximum.ll
M llvm/test/CodeGen/X86/fold-vector-sext-zext.ll
M llvm/test/CodeGen/X86/fold-vector-trunc-sitofp.ll
M llvm/test/CodeGen/X86/fp-round.ll
M llvm/test/CodeGen/X86/insert-into-constant-vector.ll
M llvm/test/CodeGen/X86/known-bits-vector.ll
M llvm/test/CodeGen/X86/masked_store_trunc.ll
M llvm/test/CodeGen/X86/masked_store_trunc_usat.ll
M llvm/test/CodeGen/X86/memset-nonzero.ll
M llvm/test/CodeGen/X86/merge-store-constants.ll
M llvm/test/CodeGen/X86/oddshuffles.ll
M llvm/test/CodeGen/X86/paddus.ll
M llvm/test/CodeGen/X86/pr30290.ll
M llvm/test/CodeGen/X86/pr32368.ll
M llvm/test/CodeGen/X86/pr38639.ll
M llvm/test/CodeGen/X86/psubus.ll
M llvm/test/CodeGen/X86/recip-fastmath.ll
M llvm/test/CodeGen/X86/recip-fastmath2.ll
M llvm/test/CodeGen/X86/sadd_sat_vec.ll
M llvm/test/CodeGen/X86/sat-add.ll
M llvm/test/CodeGen/X86/shuffle-vs-trunc-256.ll
M llvm/test/CodeGen/X86/splat-const.ll
M llvm/test/CodeGen/X86/sqrt-fastmath-tune.ll
M llvm/test/CodeGen/X86/sqrt-fastmath.ll
M llvm/test/CodeGen/X86/srem-seteq-vec-splat.ll
M llvm/test/CodeGen/X86/sse2.ll
M llvm/test/CodeGen/X86/sshl_sat_vec.ll
M llvm/test/CodeGen/X86/ssub_sat_vec.ll
M llvm/test/CodeGen/X86/urem-seteq-vec-splat.ll
M llvm/test/CodeGen/X86/v8i1-masks.ll
M llvm/test/CodeGen/X86/vec-strict-fptoint-128.ll
M llvm/test/CodeGen/X86/vec-strict-fptoint-256.ll
M llvm/test/CodeGen/X86/vec_anyext.ll
M llvm/test/CodeGen/X86/vec_fabs.ll
M llvm/test/CodeGen/X86/vec_fp_to_int.ll
M llvm/test/CodeGen/X86/vec_int_to_fp.ll
M llvm/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
M llvm/test/CodeGen/X86/vector-fshl-256.ll
M llvm/test/CodeGen/X86/vector-fshr-256.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-3.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-4.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-6.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-7.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-3.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-5.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-6.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-7.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-8.ll
M llvm/test/CodeGen/X86/vector-reduce-add-mask.ll
M llvm/test/CodeGen/X86/vector-reduce-xor-bool.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v32.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v8.ll
M llvm/test/CodeGen/X86/vector-shuffle-avx512.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining-avx.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining.ll
M llvm/test/CodeGen/X86/vector-trunc-math.ll
M llvm/test/CodeGen/X86/vector-trunc-ssat.ll
M llvm/test/CodeGen/X86/vector-trunc-usat.ll
M llvm/test/CodeGen/X86/vector-trunc.ll
M llvm/test/CodeGen/X86/vselect-avx.ll
M llvm/test/CodeGen/X86/vselect-zero.ll
M llvm/test/CodeGen/X86/win_cst_pool.ll
Log Message:
-----------
[X86] X86FixupVectorConstantsPass - attempt to replace full width fp vector constant loads with broadcasts on AVX+ targets
lowerBuildVectorAsBroadcast does not broadcast splat constants in all cases. This leaves many situations where a full width vector load that has failed to fold is loading splat constant values; such a load could use a broadcast load instruction just as cheaply and save constant pool space.
NOTE: SSE3 targets can use MOVDDUP, but not all SSE-era CPUs can perform this as cheaply as a vector load; we will need to add scheduler model checks if we want to pursue this.
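
The idea described in the log message can be illustrated with a minimal, standalone sketch. It is not the code from X86FixupVectorConstants.cpp and does not use LLVM's Constant or MachineInstr APIs; the VectorConstant struct and both helper functions are hypothetical names introduced only for illustration. It shows the two checks the message implies: whether a full width constant is a splat, and how much constant pool space a broadcast of one element would save.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a constant pool entry: the raw bit pattern of
// each element plus the element width. This is NOT LLVM's Constant API.
struct VectorConstant {
  unsigned EltBits;            // element width in bits, e.g. 32 for float
  std::vector<uint64_t> Elts;  // raw bits of each element
};

// Returns the common bit pattern if every element is identical (a splat),
// otherwise std::nullopt.
std::optional<uint64_t> getSplatBits(const VectorConstant &C) {
  if (C.Elts.empty())
    return std::nullopt;
  for (uint64_t E : C.Elts)
    if (E != C.Elts.front())
      return std::nullopt;
  return C.Elts.front();
}

// Bytes of constant pool saved if the full width constant is replaced by a
// single element that is broadcast at load time. Returns 0 for non-splats.
std::size_t bytesSavedByBroadcast(const VectorConstant &C) {
  if (!getSplatBits(C))
    return 0;
  std::size_t EltBytes = C.EltBits / 8;
  return EltBytes * (C.Elts.size() - 1);
}
```

Under these assumptions, a 256-bit constant holding eight copies of the float 1.0 (bit pattern 0x3F800000) would need only 4 bytes of pool storage instead of 32 when loaded with a broadcast such as VBROADCASTSS. The actual pass must also confirm that the load was not folded into its user and that the target supports the broadcast, which this sketch does not model.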