[all-commits] [llvm/llvm-project] e2f652: [X86] Improve inst tuning tests for X86FixupInstTu...
goldsteinn via All-commits
all-commits at lists.llvm.org
Sun Apr 9 22:17:40 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: e2f65276908e6e3ca0129df08a64c973e27bcc46
https://github.com/llvm/llvm-project/commit/e2f65276908e6e3ca0129df08a64c973e27bcc46
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-04-10 (Mon, 10 Apr 2023)
Changed paths:
M llvm/test/CodeGen/X86/tuning-shuffle-permilps-avx512.ll
M llvm/test/CodeGen/X86/tuning-shuffle-permilps.ll
M llvm/test/CodeGen/X86/tuning-shuffle-unpckpd-avx512.ll
M llvm/test/CodeGen/X86/tuning-shuffle-unpckpd.ll
A llvm/test/CodeGen/X86/tuning-shuffle-unpckps-avx512.ll
A llvm/test/CodeGen/X86/tuning-shuffle-unpckps.ll
Log Message:
-----------
[X86] Improve inst tuning tests for X86FixupInstTuning Pass; NFC
1) Add tests for `unpckps`.
2) Add explicit test for fast shuffles (ICX+) but WITH bypass delay.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D147726
Commit: 2ce1698a343c599910bceed399ca7020816b230e
https://github.com/llvm/llvm-project/commit/2ce1698a343c599910bceed399ca7020816b230e
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-04-10 (Mon, 10 Apr 2023)
Changed paths:
M llvm/lib/Target/X86/X86FixupInstTuning.cpp
M llvm/test/CodeGen/X86/tuning-shuffle-permilps-avx512.ll
M llvm/test/CodeGen/X86/tuning-shuffle-permilps.ll
Log Message:
-----------
[X86] Fix perf bug in `permilps` -> `shufd` in X86FixupInstTuning.
We shouldn't do the transformation if we either have bypass delay OR
the new opcode has worse performance. Previous code was incorrectly
using AND.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D147727
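
The AND-versus-OR distinction in the log message above can be illustrated with a small standalone sketch. This is not the code from `X86FixupInstTuning.cpp`: `TuningFacts`, `skipTransformOld`, and `skipTransformNew` are hypothetical names, and the two booleans only model the two conditions the message names.

```cpp
// Sketch only: TuningFacts and both function names are hypothetical and do
// not appear in X86FixupInstTuning.cpp.
struct TuningFacts {
  bool HasBypassDelay; // crossing FP/integer domains costs extra latency
  bool NewOpcIsWorse;  // the replacement opcode performs worse than the original
};

// Pre-fix behaviour as described in the log message: the transform was
// skipped only when BOTH conditions held.
constexpr bool skipTransformOld(TuningFacts F) {
  return F.HasBypassDelay && F.NewOpcIsWorse;
}

// Post-fix behaviour: either condition on its own blocks the transform.
constexpr bool skipTransformNew(TuningFacts F) {
  return F.HasBypassDelay || F.NewOpcIsWorse;
}

// The two predicates disagree exactly when only one condition holds.
static_assert(!skipTransformOld({true, false}), "old code let this case through");
static_assert(skipTransformNew({true, false}), "bypass delay alone blocks");
static_assert(skipTransformNew({false, true}), "a worse opcode alone blocks");
static_assert(!skipTransformNew({false, false}), "no blocker: transform allowed");
```

Under this model, the old predicate allowed the `permilps` -> `shufd` rewrite on a target with bypass delay whenever the new opcode was not also slower, which is the perf bug the commit describes.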
Commit: c3f01f13b10d708b9b7ff45a6ccc2f0c3462b3af
https://github.com/llvm/llvm-project/commit/c3f01f13b10d708b9b7ff45a6ccc2f0c3462b3af
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-04-10 (Mon, 10 Apr 2023)
Changed paths:
M llvm/lib/Target/X86/X86FixupInstTuning.cpp
M llvm/test/CodeGen/X86/tuning-shuffle-unpckpd-avx512.ll
M llvm/test/CodeGen/X86/tuning-shuffle-unpckpd.ll
Log Message:
-----------
[X86] Add inst fixup for `unpckpd` -> `unpckqdq`.
`unpckqdq` seems to be treated as a shuffle from a bypass-delay
perspective (which makes sense, as it appears to share shuffle units
on all microarchitectures).
`unpckqdq` is slightly preferable to `shufpd` as it saves 1 byte of
code size and can be used to replace the micro-fused `rm` version. So,
if the target has no bypass delay, we should do `unpckpd` ->
`unpckqdq` instead of `shufpd`.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D147728
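
A rough sketch of the selection described in this log message follows. The enum, the function, and the `ReplacementIsProfitable` flag are all hypothetical; the flag stands in for whatever profitability check the pass performs, which the message does not spell out.

```cpp
// Sketch only: these names are illustrative and are not taken from the pass.
enum class UnpckpdChoice { KeepUnpckpd, UseUnpckqdq, UseShufpd };

// Assumption: a replacement is considered only when it is otherwise
// profitable. Given that, the log message says targets without bypass delay
// should get `unpckqdq` rather than `shufpd`.
constexpr UnpckpdChoice chooseForUnpckpd(bool HasBypassDelay,
                                         bool ReplacementIsProfitable) {
  return !ReplacementIsProfitable
             ? UnpckpdChoice::KeepUnpckpd
             : (HasBypassDelay ? UnpckpdChoice::UseShufpd
                               : UnpckpdChoice::UseUnpckqdq);
}

static_assert(chooseForUnpckpd(false, true) == UnpckpdChoice::UseUnpckqdq,
              "no bypass delay: prefer the integer-domain form");
static_assert(chooseForUnpckpd(true, true) == UnpckpdChoice::UseShufpd,
              "bypass delay: stay in the FP domain");
static_assert(chooseForUnpckpd(true, false) == UnpckpdChoice::KeepUnpckpd,
              "nothing to gain: leave the instruction alone");
```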
Commit: d65720652dd644483530ecb547365a2239a97979
https://github.com/llvm/llvm-project/commit/d65720652dd644483530ecb547365a2239a97979
Author: Noah Goldstein <goldstein.w.n at gmail.com>
Date: 2023-04-10 (Mon, 10 Apr 2023)
Changed paths:
M llvm/lib/Target/X86/X86FixupInstTuning.cpp
M llvm/test/CodeGen/X86/tuning-shuffle-unpckps-avx512.ll
M llvm/test/CodeGen/X86/tuning-shuffle-unpckps.ll
Log Message:
-----------
[X86] Add inst fixup for `unpckps` -> `unpckdq`.
`unpckps` has the same performance as `unpckpd` (port 5 only), whereas
`unpckdq` can run on p15 on some newer architectures.
`unpckdq` is in the integer domain, so only do the transform if the
target has no bypass delay on shuffles (SKL+).
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D147729
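
The port argument in this log message can be modelled with execution-port bitmasks. This is a sketch under stated assumptions: the masks simply encode "port 5 only" and "p15" from the message, every name is hypothetical, and the real pass obtains its performance data in its own way rather than through masks like these.

```cpp
// Sketch only: all names here are hypothetical.
constexpr unsigned port(unsigned N) { return 1u << N; }

// Count the execution ports set in a mask (recursive so it stays a valid
// C++11 constexpr function).
constexpr unsigned countPorts(unsigned Mask) {
  return Mask == 0 ? 0u : (Mask & 1u) + countPorts(Mask >> 1);
}

// Port sets as stated in the log message.
constexpr unsigned UnpckpsPorts = port(5);                // port 5 only
constexpr unsigned UnpckdqPortsNewer = port(1) | port(5); // p15

// Switch to the integer-domain opcode only when there is no bypass delay on
// shuffles and the new opcode can issue on more ports than the old one.
constexpr bool shouldUseUnpckdq(bool HasBypassDelay, unsigned OldPorts,
                                unsigned NewPorts) {
  return !HasBypassDelay && countPorts(NewPorts) > countPorts(OldPorts);
}

static_assert(shouldUseUnpckdq(false, UnpckpsPorts, UnpckdqPortsNewer),
              "no bypass delay and more ports: transform");
static_assert(!shouldUseUnpckdq(true, UnpckpsPorts, UnpckdqPortsNewer),
              "bypass delay blocks the integer-domain form");
static_assert(!shouldUseUnpckdq(false, UnpckpsPorts, UnpckpsPorts),
              "same ports: no gain, so no transform");
```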
Compare: https://github.com/llvm/llvm-project/compare/bc257ff07b4d...d65720652dd6