[PATCH] D41794: [X86] Improve AVX1 shuffle lowering for v8f32 shuffles where the low half comes from V1 and the high half comes from V2 and the halves do the same operation

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jan 14 16:10:02 PST 2018


craig.topper added a comment.

Yeah, there may be some crossover with lowerVectorShuffleByMerging128BitLanes. I'll see if I can generalize lowerVectorShuffleByMerging128BitLanes more.



================
Comment at: lib/Target/X86/X86ISelLowering.cpp:10761
+  int Size = Mask.size();
+  SmallVector<int, 8> RepeatMask(Size, -1);
+
----------------
RKSimon wrote:
> Shouldn't RepeatMask be just LaneSize wide?
Yes it should.


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:10791
+    PermuteMask[i] = M;
+    if (PermuteMask[i] < 0)
+      continue;
----------------
RKSimon wrote:
> Do we gain anything by relaxing this and keeping PermuteMask[i] as UNDEF if the original Mask[i] was UNDEF?
Not sure. I was trying to create a repeated lane shuffle, so it's based on both lanes. If it's undef in both lanes, it will be undef here.
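
To make the undef behaviour concrete, here is a minimal standalone sketch of folding a full shuffle mask into a single LaneSize-wide repeated mask. It is not the code from this patch: buildRepeatedLaneMask is a hypothetical name, it uses std::vector instead of SmallVector, and it only compares the in-lane offset of each element (M % LaneSize), ignoring which input or source lane the element comes from.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper for illustration only; not the function in D41794.
// Mask uses -1 for undef. On success, RepeatMask is LaneSize wide and holds
// the in-lane offset that every lane agrees on, or -1 where all lanes are undef.
// Simplification: only the offset within a lane is compared, not the source
// input or source lane of each element.
bool buildRepeatedLaneMask(const std::vector<int> &Mask, int LaneSize,
                           std::vector<int> &RepeatMask) {
  int Size = static_cast<int>(Mask.size());
  RepeatMask.assign(LaneSize, -1);
  for (int i = 0; i < Size; ++i) {
    int M = Mask[i];
    if (M < 0)
      continue; // undef in this lane places no constraint on the slot
    int Local = M % LaneSize;             // offset within the source lane
    int &Slot = RepeatMask[i % LaneSize]; // slot shared by all lanes
    if (Slot < 0)
      Slot = Local; // first defined lane decides the slot
    else if (Slot != Local)
      return false; // lanes disagree, so the mask is not repeated
  }
  return true;
}
```

With LaneSize = 4, a slot that is undef in only one lane takes its value from the other lane, and a slot stays undef only when it is undef in every lane.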


Repository:
  rL LLVM

https://reviews.llvm.org/D41794




