[PATCH] D143786: [X86] Add `TuningPreferShiftShuffle` for when Shifts are preferable to shuffles.
Noah Goldstein via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 12 14:22:00 PST 2023
goldstein.w.n marked an inline comment as done.
goldstein.w.n added a comment.
In D143786#4120206 <https://reviews.llvm.org/D143786#4120206>, @RKSimon wrote:
> Without AVX512 we can't load fold arg0 for bit-shift ops - isn't that likely to be a problem?
I'm not sure what you mean; can you elaborate? In any case, the tuning is only enabled for SKX, which has AVX512.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:18288
- if (V2.isUndef()) {
- // When the shuffle is mirrored between the 128-bit lanes of the unit, we
- // can use lower latency instructions that will operate on both lanes.
- SmallVector<int, 2> RepeatedMask;
- if (is128BitLaneRepeatedShuffleMask(MVT::v4i64, Mask, RepeatedMask)) {
- SmallVector<int, 4> PSHUFDMask;
- narrowShuffleMaskElts(2, RepeatedMask, PSHUFDMask);
- return DAG.getBitcast(
- MVT::v4i64,
- DAG.getNode(X86ISD::PSHUFD, DL, MVT::v8i32,
- DAG.getBitcast(MVT::v8i32, V1),
- getV4X86ShuffleImm8ForMask(PSHUFDMask, DL, DAG)));
- }
+ for (unsigned Order = 0; Order < 2; ++Order) {
+ if (Subtarget.hasFasterShiftThanShuffle() ? (Order == 1) : (Order == 0)) {
----------------
goldstein.w.n wrote:
> RKSimon wrote:
> > This approach isn't particularly easy to grok - why not just add an additional lowerShuffleAsShift check, guarded behind a hasFasterShiftThanShuffle check?
> That was to avoid duplicating ~30 lines of code, but will do for v2.
Refactored everything as you suggested, except the matchunaryshufflepermute helper, where it would cause too much duplication IMO.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D143786/new/
https://reviews.llvm.org/D143786