[PATCH] D143786: [X86] Add `TuningPreferShiftShuffle` for when Shifts are preferable to shuffles.
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 12 03:53:11 PST 2023
RKSimon added inline comments.
================
Comment at: llvm/test/CodeGen/X86/pr57340.ll:272
; CHECK-NEXT: kandw %k1, %k0, %k0
-; CHECK-NEXT: vpshufd {{.*#+}} xmm2 = xmm1[3,3,3,3]
+; CHECK-NEXT: vpsrldq {{.*#+}} xmm2 = xmm1[12,13,14,15],zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero,zero
; CHECK-NEXT: vpextrw $0, %xmm2, %eax
----------------
goldstein.w.n wrote:
> RKSimon wrote:
> > Are byte shifts faster? I thought they were still Port5 bound.
> Same perf/code size for byte-shift vs shuffle, so I figure it's all the same. It could have a drawback, however, because it's harder to switch domains for a shift than for a shuffle, so I can update the logic to only do bit-shifts.
>
> Also note that this particular case actually reflects a missed optimization in `combineExtractVectorElt`, because it should just be using `vpextrw`, but I still haven't figured out exactly what's missing.
The `combineExtractVectorElt` peek-through-shuffle code has slowly evolved as we encountered individual regressions, so I'm not surprised it still misses many cases.
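
For readers following the diff context above, here is a minimal sketch of why the two CHECK lines are interchangeable for this test, written as a toy Python model of a 128-bit register (16 bytes, index 0 least significant). The helper names `pshufd`, `psrldq` and `pextrw` are hypothetical stand-ins for the instructions, not LLVM or intrinsic APIs. The final comparison against word index 6 is an assumption about what "just using `vpextrw`" would mean here; the thread does not spell out the index.

```python
# Toy model: an XMM register is a list of 16 byte values, index 0 = lowest byte.

def pshufd(src, lanes):
    """Model vpshufd: result dword i is a copy of source dword lanes[i]."""
    out = []
    for lane in lanes:
        out.extend(src[4 * lane:4 * lane + 4])
    return out

def psrldq(src, nbytes):
    """Model vpsrldq: shift the whole register right by nbytes, zero-filling."""
    nbytes = min(nbytes, 16)  # shifting by 16 or more clears the register
    return src[nbytes:] + [0] * nbytes

def pextrw(src, idx):
    """Model vpextrw: return 16-bit word idx, zero-extended."""
    return src[2 * idx] | (src[2 * idx + 1] << 8)

# Arbitrary distinct byte values so every lane is recognisable.
xmm1 = list(range(0x10, 0x20))

shuffled = pshufd(xmm1, [3, 3, 3, 3])  # old CHECK line: xmm1[3,3,3,3]
shifted = psrldq(xmm1, 12)             # new CHECK line: xmm1[12..15],zero,...

# The upper bytes differ (broadcast vs. zeros) ...
print(shuffled != shifted)
# ... but word 0, the only part the following vpextrw $0 reads, is identical,
# and it matches word 6 (bytes 12-13) of the original register.
print(hex(pextrw(shuffled, 0)), hex(pextrw(shifted, 0)), hex(pextrw(xmm1, 6)))
```

Since only the low word is consumed, the shuffle and the byte shift are equivalent in this spot, and under the assumption above both could be folded into a single extract from `xmm1`.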
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D143786/new/
https://reviews.llvm.org/D143786