[PATCH] D86429: [X86] Make lowerShuffleAsLanePermuteAndPermute use sublanes on AVX2

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 26 02:21:35 PDT 2020


RKSimon added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:15540
+/// If unsuccessful, returns false and may overwrite CrossLaneMask or InLaneMask
+static bool getSublanePermute(int NumSublanes, int NumLanes, int NumElts,
+                              ArrayRef<int> Mask,
----------------
Maybe pull getSublanePermute inside lowerShuffleAsLanePermuteAndPermute as a lambda?
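
For illustration, a rough standalone sketch of the lambda shape being suggested. Everything here is an assumption made for the example: lowerSketch, canUseSublanes, the std::vector mask, and the placeholder feasibility check are hypothetical and are not the patch's actual logic. The point is only that a local lambda can capture Mask, NumElts and NumLanes by reference, so they no longer have to be passed as parameters on each attempt.

```cpp
#include <vector>

// Hypothetical outer function standing in for the lowering routine.
// Returns true if the (unary) mask passes the placeholder sublane check.
bool lowerSketch(const std::vector<int> &Mask, int NumLanes) {
  int NumElts = static_cast<int>(Mask.size());

  // Local lambda: captures Mask and NumElts by reference instead of
  // taking them as arguments. The check below is a placeholder: every
  // destination sublane must read from at most one source sublane.
  auto canUseSublanes = [&](int NumSublanes) -> bool {
    if (NumSublanes <= 0 || NumElts % NumSublanes != 0)
      return false;
    int EltsPerSublane = NumElts / NumSublanes;
    for (int Dst = 0; Dst != NumSublanes; ++Dst) {
      int Src = -1;
      for (int I = 0; I != EltsPerSublane; ++I) {
        int M = Mask[Dst * EltsPerSublane + I];
        if (M < 0)
          continue; // undef element, matches anything
        int S = M / EltsPerSublane;
        if (Src >= 0 && S != Src)
          return false;
        Src = S;
      }
    }
    return true;
  };

  // Try the finer 64-bit-style split first, then whole lanes
  // (the fallback order is an assumption for this sketch).
  return canUseSublanes(NumLanes * 2) || canUseSublanes(NumLanes);
}
```

Keeping the helper local also makes it obvious that it has no other callers.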


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:15580
+  // Fill CrossLaneMask using CrossLaneMaskLarge
+  CrossLaneMask.assign(NumElts, SM_SentinelUndef);
+  for (int Sublane = 0; Sublane != NumSublanes; ++Sublane) {
----------------
Use the existing narrowShuffleMask helper to do this?
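
A minimal sketch of what such a narrowing step computes, written against std::vector so it stands alone. The names narrowMaskSketch and kUndef are illustrative assumptions, not the LLVM helper or its signature. Each wide mask element M expands to Scale consecutive narrow elements Scale*M + I, and negative sentinel values are replicated unchanged.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an undef sentinel; illustration only.
constexpr int kUndef = -1;

// Expand a mask over wide elements (e.g. sublanes) into a mask over
// elements that are Scale times narrower.
std::vector<int> narrowMaskSketch(int Scale, const std::vector<int> &WideMask) {
  assert(Scale > 0 && "Scale must be positive");
  std::vector<int> Narrow;
  Narrow.reserve(WideMask.size() * Scale);
  for (int M : WideMask)
    for (int I = 0; I != Scale; ++I)
      Narrow.push_back(M < 0 ? M : M * Scale + I); // keep sentinels as-is
  return Narrow;
}
```

For example, a 4-entry sublane mask {2, kUndef, 0, 1} narrowed with Scale = 2 becomes the 8-entry element mask {4, 5, -1, -1, 0, 1, 2, 3}.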


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:15612
+    // First attempt a solution with 64-bit sublanes (vpermq)
+    if (!getSublanePermute(/*NumSublanes=*/NumLanes * 2, NumLanes, NumElts,
+                           Mask, CrossLaneMask, InLaneMask)) {
----------------
What happens if you try the default (AVX1/binary shuffle) variant first and then the AVX2/AVX2-FAST unary shuffles afterward?



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86429/new/

https://reviews.llvm.org/D86429


