[PATCH] D86429: [X86] Make lowerShuffleAsLanePermuteAndPermute use sublanes on AVX2

TellowKrinkle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 26 17:08:04 PDT 2020


TellowKrinkle added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:15612
+    // First attempt a solution with 64-bit sublanes (vpermq)
+    if (!getSublanePermute(/*NumSublanes=*/NumLanes * 2, NumLanes, NumElts,
+                           Mask, CrossLaneMask, InLaneMask)) {
----------------
RKSimon wrote:
> What happens if you try the default (AVX1/binary shuffle) variant first and then the AVX2/AVX2-FAST unary shuffles afterward?
> 
Do you mean check 128-bit sublanes on both sides of the `if`?

That undoes the cases where a vpermq `ymm0 = ymm0[2,3,2,3]` gets turned into a `ymm0 = ymm0[3,3,3,3]`, or where a `ymm0 = ymm0[0,1,0,1]` gets turned into a `vpbroadcastq`, but that's about it.

AFAIK the 64-bit sublane version handles everything the 128-bit sublane version does. Likewise, the 32-bit sublane version handles everything the 64-bit sublane version does; it's just that vpermd is potentially more expensive than vpermq and needs an extra register for the mask.
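
To illustrate why the finer granularity subsumes the coarser one, here is a minimal standalone sketch. It is not code from the patch: `widenLaneMaskToSublanes` and `narrowSublaneMaskToLanes` are hypothetical helpers, and undef mask elements are ignored for simplicity. Any permute over 128-bit lanes can be rewritten as a permute over 64-bit sublanes, while the reverse only works when each pair of sublanes forms an aligned lane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of D86429): rewrite a permute over 128-bit
// lanes as the equivalent permute over 64-bit sublanes. This always succeeds,
// because lane L is exactly the pair of sublanes (2*L, 2*L+1).
std::vector<int> widenLaneMaskToSublanes(const std::vector<int> &LaneMask) {
  std::vector<int> SublaneMask;
  for (int Lane : LaneMask) {
    assert(Lane >= 0 && "undef lanes are not handled in this sketch");
    SublaneMask.push_back(2 * Lane);
    SublaneMask.push_back(2 * Lane + 1);
  }
  return SublaneMask;
}

// Hypothetical helper: try to express a 64-bit sublane permute as a 128-bit
// lane permute. Returns false when some pair of sublanes is not an aligned,
// consecutive (even, even+1) pair, i.e. when the coarser form cannot
// represent the mask.
bool narrowSublaneMaskToLanes(const std::vector<int> &SublaneMask,
                              std::vector<int> &LaneMask) {
  LaneMask.clear();
  if (SublaneMask.size() % 2 != 0)
    return false;
  for (std::size_t I = 0; I < SublaneMask.size(); I += 2) {
    int Lo = SublaneMask[I];
    int Hi = SublaneMask[I + 1];
    if (Lo < 0 || Lo % 2 != 0 || Hi != Lo + 1)
      return false;
    LaneMask.push_back(Lo / 2);
  }
  return true;
}
```

With these helpers, the lane mask `{1, 1}` widens to the sublane mask `{2, 3, 2, 3}`, and that mask narrows back to `{1, 1}`. A mask such as `{3, 3, 3, 3}` has no 128-bit lane equivalent, so only the 64-bit sublane search can produce it.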


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86429/new/

https://reviews.llvm.org/D86429


