[PATCH] D86429: [X86] Make lowerShuffleAsLanePermuteAndPermute use sublanes on AVX2
TellowKrinkle via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 26 17:08:04 PDT 2020
TellowKrinkle added inline comments.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:15612
+ // First attempt a solution with 64-bit sublanes (vpermq)
+ if (!getSublanePermute(/*NumSublanes=*/NumLanes * 2, NumLanes, NumElts,
+ Mask, CrossLaneMask, InLaneMask)) {
----------------
RKSimon wrote:
> What happens if you try the default (AVX1/binary shuffle) variant first and then the AVX2/AVX2-FAST unary shuffles afterward?
>
Do you mean check 128-bit sublanes on both sides of the `if`?
That undoes the cases where a vpermq `ymm0 = ymm0[2,3,2,3]` gets turned into `ymm0 = ymm0[3,3,3,3]`, or where `ymm0 = ymm0[0,1,0,1]` gets turned into a `vpbroadcastq`, but that's about it.
AFAIK the 64-bit sublane version handles everything the 128-bit sublane version does. Likewise, the 32-bit sublane version handles everything the 64-bit sublane version does; it's just that vpermd is potentially more expensive than vpermq and needs an extra register for the mask.
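To illustrate why a finer sublane size subsumes a coarser one, here is a minimal standalone sketch. `refineSublaneMask` is a hypothetical helper written for this comment, not something in the patch or in X86ISelLowering.cpp. It assumes a sublane permute is described by one source index per destination sublane, with -1 meaning undef.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of the patch: rewrite a permutation over
// N sublanes as the equivalent permutation over 2*N half-width sublanes.
// An index of -1 marks an undefined (don't-care) sublane.
std::vector<int> refineSublaneMask(const std::vector<int> &CoarseMask) {
  std::vector<int> FineMask;
  FineMask.reserve(CoarseMask.size() * 2);
  for (int Src : CoarseMask) {
    assert(Src >= -1 && "sublane index must be -1 (undef) or non-negative");
    if (Src < 0) {
      // An undef coarse sublane stays undef in both halves.
      FineMask.push_back(-1);
      FineMask.push_back(-1);
      continue;
    }
    // Coarse sublane Src covers fine sublanes 2*Src and 2*Src + 1.
    FineMask.push_back(2 * Src);
    FineMask.push_back(2 * Src + 1);
  }
  return FineMask;
}
```

For example, a 128-bit lane swap `{1, 0}` refines to the 64-bit sublane mask `{2, 3, 0, 1}` (the shape a vpermq immediate encodes), and refining that again gives the 32-bit mask `{4, 5, 6, 7, 0, 1, 2, 3}` (the shape a vpermd index vector holds). The reverse direction does not always exist, which is why the finer sizes are strictly more general.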
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D86429/new/
https://reviews.llvm.org/D86429