[PATCH] D143786: [X86] Add `TuningPreferShiftShuffle` for when Shifts are preferable to shuffles.
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Feb 11 03:24:22 PST 2023
RKSimon added a comment.
Without AVX512 we can't load-fold arg0 for bit-shift ops - isn't that likely to be a problem?
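For example (illustrative operands, using an immediate shift such as VPSRLQ):

  # AVX/AVX2 - the shifted source must be a register, so the load stays separate:
  vmovdqa (%rdi), %xmm0
  vpsrlq  $16, %xmm0, %xmm0
  # AVX512VL - the EVEX form can fold the load into the shift:
  vpsrlq  $16, (%rdi), %xmm0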
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:18288
- if (V2.isUndef()) {
- // When the shuffle is mirrored between the 128-bit lanes of the unit, we
- // can use lower latency instructions that will operate on both lanes.
- SmallVector<int, 2> RepeatedMask;
- if (is128BitLaneRepeatedShuffleMask(MVT::v4i64, Mask, RepeatedMask)) {
- SmallVector<int, 4> PSHUFDMask;
- narrowShuffleMaskElts(2, RepeatedMask, PSHUFDMask);
- return DAG.getBitcast(
- MVT::v4i64,
- DAG.getNode(X86ISD::PSHUFD, DL, MVT::v8i32,
- DAG.getBitcast(MVT::v8i32, V1),
- getV4X86ShuffleImm8ForMask(PSHUFDMask, DL, DAG)));
- }
+ for (unsigned Order = 0; Order < 2; ++Order) {
+ if (Subtarget.hasFasterShiftThanShuffle() ? (Order == 1) : (Order == 0)) {
----------------
This approach isn't particularly easy to grok - why not just add an additional lowerShuffleAsShift check up front, behind a hasFasterShiftThanShuffle check?
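i.e. something like this (a rough sketch, assuming the existing lowerShuffleAsShift helper - the exact parameter list at this call site may differ):

  // Try the shift lowering first when the target shifts faster than it shuffles.
  if (Subtarget.hasFasterShiftThanShuffle())
    if (SDValue Shift = lowerShuffleAsShift(DL, MVT::v4i64, V1, V2, Mask,
                                            Zeroable, Subtarget, DAG))
      return Shift;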
================
Comment at: llvm/test/CodeGen/X86/pr57340.ll:272
; CHECK-NEXT: kandw %k1, %k0, %k0
-; CHECK-NEXT: vpshufd {{.*#+}} xmm2 = xmm1[3,3,3,3]
+; CHECK-NEXT: vpsrldq {{.*#+}} xmm2 = xmm1[12,13,14,15],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
; CHECK-NEXT: vpextrw $0, %xmm2, %eax
----------------
Are byte shifts faster? I thought they were still Port5 bound.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D143786/new/
https://reviews.llvm.org/D143786