[all-commits] [llvm/llvm-project] cecaf2: Adding tuning flags for int <-> fp domain switchin...
goldsteinn via All-commits
all-commits at lists.llvm.org
Mon Feb 27 16:54:00 PST 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: cecaf295898f6bb23b052892c1d06c27f2715b0d
https://github.com/llvm/llvm-project/commit/cecaf295898f6bb23b052892c1d06c27f2715b0d
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-27 (Mon, 27 Feb 2023)
Changed paths:
M llvm/lib/Target/X86/X86.td
M llvm/lib/Target/X86/X86Subtarget.h
M llvm/lib/Target/X86/X86TargetTransformInfo.h
Log Message:
-----------
Adding tuning flags for int <-> fp domain switching penalties; NFC
Atom
- No domain switching penalties
Nehalem+
- No penalty on moves
Haswell+
- No penalty on moves / shuffles
Skylake+
- No penalty on moves / shuffles / blends
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D143859
Commit: e56ddae849317a7f41f97a8a9c41cf63ce40e4f8
https://github.com/llvm/llvm-project/commit/e56ddae849317a7f41f97a8a9c41cf63ce40e4f8
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-27 (Mon, 27 Feb 2023)
Changed paths:
A llvm/test/CodeGen/X86/tuning-shuffle-permilps-avx512.ll
A llvm/test/CodeGen/X86/tuning-shuffle-permilps.ll
Log Message:
-----------
Add tests for replacing `{v}permilps` -> `{v}shufps/{v}pshufd`; NFC
Differential Revision: https://reviews.llvm.org/D144779
Commit: 69a322fed19b977d15be9500d8653496b73673e9
https://github.com/llvm/llvm-project/commit/69a322fed19b977d15be9500d8653496b73673e9
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-27 (Mon, 27 Feb 2023)
Changed paths:
M llvm/lib/Target/X86/CMakeLists.txt
M llvm/lib/Target/X86/X86.h
A llvm/lib/Target/X86/X86FixupInstTuning.cpp
M llvm/lib/Target/X86/X86TargetMachine.cpp
M llvm/test/CodeGen/X86/2012-01-12-extract-sv.ll
M llvm/test/CodeGen/X86/SwizzleShuff.ll
M llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast.ll
M llvm/test/CodeGen/X86/any_extend_vector_inreg_of_broadcast_from_memory.ll
M llvm/test/CodeGen/X86/avx-intrinsics-fast-isel.ll
M llvm/test/CodeGen/X86/avx-intrinsics-x86-upgrade.ll
M llvm/test/CodeGen/X86/avx-splat.ll
M llvm/test/CodeGen/X86/avx-vbroadcast.ll
M llvm/test/CodeGen/X86/avx-vinsertf128.ll
M llvm/test/CodeGen/X86/avx-vperm2x128.ll
M llvm/test/CodeGen/X86/avx2-intrinsics-fast-isel.ll
M llvm/test/CodeGen/X86/avx512-cvt.ll
M llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
M llvm/test/CodeGen/X86/avx512-intrinsics-upgrade.ll
M llvm/test/CodeGen/X86/avx512-shuffles/in_lane_permute.ll
M llvm/test/CodeGen/X86/avx512-shuffles/shuffle.ll
M llvm/test/CodeGen/X86/avx512-trunc.ll
M llvm/test/CodeGen/X86/avx512-vec-cmp.ll
M llvm/test/CodeGen/X86/avx512fp16-mov.ll
M llvm/test/CodeGen/X86/avx512fp16-mscatter.ll
M llvm/test/CodeGen/X86/avx512vl-intrinsics-upgrade.ll
M llvm/test/CodeGen/X86/bitcast-int-to-vector-bool-sext.ll
M llvm/test/CodeGen/X86/bitcast-int-to-vector-bool-zext.ll
M llvm/test/CodeGen/X86/bitcast-int-to-vector-bool.ll
M llvm/test/CodeGen/X86/buildvec-extract.ll
M llvm/test/CodeGen/X86/combine-and.ll
M llvm/test/CodeGen/X86/combine-concatvectors.ll
M llvm/test/CodeGen/X86/copy-low-subvec-elt-to-high-subvec-elt.ll
M llvm/test/CodeGen/X86/extract-concat.ll
M llvm/test/CodeGen/X86/extract-store.ll
M llvm/test/CodeGen/X86/fdiv-combine-vec.ll
M llvm/test/CodeGen/X86/fmaddsub-combine.ll
M llvm/test/CodeGen/X86/haddsub-2.ll
M llvm/test/CodeGen/X86/haddsub-4.ll
M llvm/test/CodeGen/X86/haddsub-undef.ll
M llvm/test/CodeGen/X86/haddsub.ll
M llvm/test/CodeGen/X86/horizontal-reduce-smax.ll
M llvm/test/CodeGen/X86/horizontal-reduce-smin.ll
M llvm/test/CodeGen/X86/horizontal-reduce-umax.ll
M llvm/test/CodeGen/X86/horizontal-reduce-umin.ll
M llvm/test/CodeGen/X86/horizontal-shuffle-2.ll
M llvm/test/CodeGen/X86/horizontal-shuffle-3.ll
M llvm/test/CodeGen/X86/horizontal-shuffle-4.ll
M llvm/test/CodeGen/X86/horizontal-sum.ll
M llvm/test/CodeGen/X86/i64-to-float.ll
M llvm/test/CodeGen/X86/insertelement-var-index.ll
M llvm/test/CodeGen/X86/known-bits-vector.ll
M llvm/test/CodeGen/X86/known-signbits-vector.ll
M llvm/test/CodeGen/X86/masked_store.ll
M llvm/test/CodeGen/X86/masked_store_trunc.ll
M llvm/test/CodeGen/X86/masked_store_trunc_ssat.ll
M llvm/test/CodeGen/X86/masked_store_trunc_usat.ll
M llvm/test/CodeGen/X86/matrix-multiply.ll
M llvm/test/CodeGen/X86/oddshuffles.ll
M llvm/test/CodeGen/X86/opt-pipeline.ll
M llvm/test/CodeGen/X86/packss.ll
M llvm/test/CodeGen/X86/palignr.ll
M llvm/test/CodeGen/X86/pr31956.ll
M llvm/test/CodeGen/X86/pr40730.ll
M llvm/test/CodeGen/X86/pr40811.ll
M llvm/test/CodeGen/X86/pr50609.ll
M llvm/test/CodeGen/X86/rotate_vec.ll
M llvm/test/CodeGen/X86/scalarize-fp.ll
M llvm/test/CodeGen/X86/shuffle-of-shift.ll
M llvm/test/CodeGen/X86/shuffle-of-splat-multiuses.ll
M llvm/test/CodeGen/X86/sse-fsignum.ll
M llvm/test/CodeGen/X86/sse-intrinsics-fast-isel.ll
M llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll
M llvm/test/CodeGen/X86/sse2-intrinsics-x86-upgrade.ll
M llvm/test/CodeGen/X86/sse2.ll
M llvm/test/CodeGen/X86/sse3-avx-addsub-2.ll
M llvm/test/CodeGen/X86/sse41.ll
M llvm/test/CodeGen/X86/swizzle-avx2.ll
M llvm/test/CodeGen/X86/tuning-shuffle-permilps-avx512.ll
M llvm/test/CodeGen/X86/tuning-shuffle-permilps.ll
M llvm/test/CodeGen/X86/vec-strict-fptoint-256.ll
M llvm/test/CodeGen/X86/vec-strict-fptoint-512.ll
M llvm/test/CodeGen/X86/vec-strict-inttofp-128.ll
M llvm/test/CodeGen/X86/vec-strict-inttofp-256.ll
M llvm/test/CodeGen/X86/vec-strict-inttofp-512.ll
M llvm/test/CodeGen/X86/vec_fp_to_int.ll
M llvm/test/CodeGen/X86/vec_int_to_fp.ll
M llvm/test/CodeGen/X86/vec_umulo.ll
M llvm/test/CodeGen/X86/vector-fshr-256.ll
M llvm/test/CodeGen/X86/vector-half-conversions.ll
M llvm/test/CodeGen/X86/vector-interleave.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-5.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-7.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-8.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-2.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-3.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-5.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-6.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-7.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i32-stride-8.ll
M llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-3.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-8.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-2.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-3.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-4.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-5.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-6.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-7.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-8.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-3.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-5.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-7.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-6.ll
M llvm/test/CodeGen/X86/vector-interleaved-store-i8-stride-8.ll
M llvm/test/CodeGen/X86/vector-reduce-add-mask.ll
M llvm/test/CodeGen/X86/vector-reduce-and-cmp.ll
M llvm/test/CodeGen/X86/vector-reduce-and.ll
M llvm/test/CodeGen/X86/vector-reduce-fadd.ll
M llvm/test/CodeGen/X86/vector-reduce-fmax.ll
M llvm/test/CodeGen/X86/vector-reduce-fmin.ll
M llvm/test/CodeGen/X86/vector-reduce-fmul.ll
M llvm/test/CodeGen/X86/vector-reduce-or.ll
M llvm/test/CodeGen/X86/vector-reduce-smax.ll
M llvm/test/CodeGen/X86/vector-reduce-smin.ll
M llvm/test/CodeGen/X86/vector-reduce-umax.ll
M llvm/test/CodeGen/X86/vector-reduce-umin.ll
M llvm/test/CodeGen/X86/vector-reduce-xor.ll
M llvm/test/CodeGen/X86/vector-sext.ll
M llvm/test/CodeGen/X86/vector-shift-lshr-128.ll
M llvm/test/CodeGen/X86/vector-shift-lshr-256.ll
M llvm/test/CodeGen/X86/vector-shift-shl-256.ll
M llvm/test/CodeGen/X86/vector-shuffle-128-v2.ll
M llvm/test/CodeGen/X86/vector-shuffle-128-v4.ll
M llvm/test/CodeGen/X86/vector-shuffle-128-v8.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v16.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v32.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v4.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v8.ll
M llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
M llvm/test/CodeGen/X86/vector-shuffle-512-v8.ll
M llvm/test/CodeGen/X86/vector-shuffle-avx512.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining-avx.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining-avx2.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining-avx512f.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining-ssse3.ll
M llvm/test/CodeGen/X86/vector-shuffle-combining.ll
M llvm/test/CodeGen/X86/vector-shuffle-concatenation.ll
M llvm/test/CodeGen/X86/vector-trunc-ssat.ll
M llvm/test/CodeGen/X86/vector-trunc-usat.ll
M llvm/test/CodeGen/X86/vselect-avx.ll
M llvm/test/CodeGen/X86/x86-interleaved-access.ll
M llvm/test/CodeGen/X86/zero_extend_vector_inreg_of_broadcast.ll
M llvm/test/CodeGen/X86/zero_extend_vector_inreg_of_broadcast_from_memory.ll
M llvm/utils/gn/secondary/llvm/lib/Target/X86/BUILD.gn
Log Message:
-----------
Add new pass `X86FixupInstTuning` for fixing up machine-instruction selection.
There are a variety of cases where we want more control over the exact
instruction emitted. This commit creates a new pass to fixup
instructions after the DAG has been lowered. The pass is only meant to
replace instructions that are guaranteed to be interchangeable, not to
do analysis for special cases.
Handling these instruction changes in X86ISelLowering or
X86ISelDAGToDAG isn't ideal, as it's liable to either break existing
patterns that expected a certain instruction or generate infinite
loops.
As well, operating at the MachineInstruction level allows us to access
scheduling/code size information for making the decisions.
Currently only implements `{v}permilps` -> `{v}shufps/{v}pshufd` but
more transforms can be added.
Differential Revision: https://reviews.llvm.org/D143787
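To illustrate why `{v}permilps` can be treated as interchangeable with
`{v}shufps/{v}pshufd`, here is a small Python model of one 128-bit lane.
It is only a sketch of the instruction semantics for the immediate,
register-operand forms; it is not code from the pass, and the function
names are made up for this example. The pass itself works on machine
instructions and is not shown here.

```python
# Illustrative model of one 128-bit lane holding four 32-bit elements.
# These helpers are hypothetical names for this sketch, not LLVM APIs.

def permilps(src, imm):
    """Immediate-form permilps: each 2-bit field of imm selects an element of src."""
    return [src[(imm >> (2 * i)) & 3] for i in range(4)]


def shufps(a, b, imm):
    """shufps: the two low results are selected from a, the two high results from b."""
    return [
        a[imm & 3],
        a[(imm >> 2) & 3],
        b[(imm >> 4) & 3],
        b[(imm >> 6) & 3],
    ]


def pshufd(src, imm):
    """pshufd: same element selection as permilps, but an integer-domain instruction."""
    return [src[(imm >> (2 * i)) & 3] for i in range(4)]


def all_immediates_agree():
    """Check every 8-bit immediate: permilps(x, imm) == shufps(x, x, imm) == pshufd(x, imm)."""
    lane = [10.0, 11.0, 12.0, 13.0]
    return all(
        permilps(lane, imm) == shufps(lane, lane, imm) == pshufd(lane, imm)
        for imm in range(256)
    )


if __name__ == "__main__":
    print(all_immediates_agree())
```

In this model the three forms agree for every immediate when `shufps`
is given the same register for both sources, so choosing between them
is purely a tuning question (execution domain, scheduling, code size).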
Commit: 6957a8cc6c92db5c1b6e9307b29bb78eb1beb718
https://github.com/llvm/llvm-project/commit/6957a8cc6c92db5c1b6e9307b29bb78eb1beb718
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-02-27 (Mon, 27 Feb 2023)
Changed paths:
A llvm/test/CodeGen/X86/tuning-shuffle-unpckpd-avx512.ll
A llvm/test/CodeGen/X86/tuning-shuffle-unpckpd.ll
Log Message:
-----------
Add tests for replacing `{v}unpck{l|h}pd` -> `{v}shufps`; NFC
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D144442
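The `{v}unpck{l|h}pd` -> `{v}shufps` replacement can be sanity-checked
with a similar Python model of one 128-bit lane. This is a sketch of
the instruction semantics only: the helper names are invented for the
example, and the two immediates below are derived from this model
rather than quoted from the commit.

```python
# Illustrative model: a 128-bit register is a list of four 32-bit elements,
# so each 64-bit (pd) element is a pair of adjacent 32-bit elements.
# All names here are hypothetical and exist only for this sketch.

def unpcklpd(a, b):
    """unpcklpd: low 64 bits of a followed by low 64 bits of b."""
    return a[0:2] + b[0:2]


def unpckhpd(a, b):
    """unpckhpd: high 64 bits of a followed by high 64 bits of b."""
    return a[2:4] + b[2:4]


def shufps(a, b, imm):
    """shufps: the two low results are selected from a, the two high results from b."""
    return [
        a[imm & 3],
        a[(imm >> 2) & 3],
        b[(imm >> 4) & 3],
        b[(imm >> 6) & 3],
    ]


# Immediates derived from the model above (an assumption of this sketch):
# 0x44 selects elements 0,1 from each source; 0xEE selects elements 2,3.
UNPCKLPD_AS_SHUFPS = 0x44
UNPCKHPD_AS_SHUFPS = 0xEE

if __name__ == "__main__":
    a = [0, 1, 2, 3]
    b = [4, 5, 6, 7]
    print(unpcklpd(a, b) == shufps(a, b, UNPCKLPD_AS_SHUFPS))
    print(unpckhpd(a, b) == shufps(a, b, UNPCKHPD_AS_SHUFPS))
```

Because the 64-bit unpack only moves whole pairs of 32-bit elements,
it can be expressed as a 32-bit shuffle with a fixed immediate.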
Compare: https://github.com/llvm/llvm-project/compare/7198c87f42f6...6957a8cc6c92