[PATCH] D86429: [X86] Make lowerShuffleAsLanePermuteAndPermute use sublanes on AVX2
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 26 02:21:35 PDT 2020
RKSimon added inline comments.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:15540
+/// If unsuccessful, returns false and may overwrite CrossLaneMask or InLaneMask
+static bool getSublanePermute(int NumSublanes, int NumLanes, int NumElts,
+ ArrayRef<int> Mask,
----------------
Maybe pull getSublanePermute inside lowerShuffleAsLanePermuteAndPermute as a lambda?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:15580
+ // Fill CrossLaneMask using CrossLaneMaskLarge
+ CrossLaneMask.assign(NumElts, SM_SentinelUndef);
+ for (int Sublane = 0; Sublane != NumSublanes; ++Sublane) {
----------------
Use the existing narrowShuffleMask helper to do this?
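For illustration, narrowing a wide-element shuffle mask to a narrower element width can be sketched roughly as below. This is a standalone approximation and not LLVM code: the name narrowMask and its signature are hypothetical, and the sentinel handling (negative values copied through unchanged) is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a mask-narrowing helper; not LLVM's implementation.
// Each wide element M expands to Scale narrow elements:
//   Scale*M + 0, Scale*M + 1, ..., Scale*M + (Scale - 1).
// Negative sentinel values (e.g. undef = -1) are assumed to be replicated as-is.
std::vector<int> narrowMask(int Scale, const std::vector<int> &Mask) {
  assert(Scale > 0 && "Scale must be positive");
  std::vector<int> Narrow;
  Narrow.reserve(Mask.size() * Scale);
  for (int M : Mask)
    for (int I = 0; I != Scale; ++I)
      Narrow.push_back(M < 0 ? M : M * Scale + I);
  return Narrow;
}
```

With Scale = 2, a sublane-level mask {1, -1, 0, 3} would expand to the element-level mask {2, 3, -1, -1, 0, 1, 6, 7}.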
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:15612
+ // First attempt a solution with 64-bit sublanes (vpermq)
+ if (!getSublanePermute(/*NumSublanes=*/NumLanes * 2, NumLanes, NumElts,
+ Mask, CrossLaneMask, InLaneMask)) {
----------------
What happens if you try the default (AVX1/binary shuffle) variant first and then the AVX2/AVX2-FAST unary shuffles afterward?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D86429/new/
https://reviews.llvm.org/D86429