[llvm] [RISCV] Match deinterleave(4,8) shuffles to SHL/TRUNC when legal (PR #118509)

Craig Topper via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 3 15:25:26 PST 2024


================
@@ -5332,11 +5301,31 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
   if (ShuffleVectorInst::isReverseMask(Mask, NumElts) && V2.isUndef())
     return DAG.getNode(ISD::VECTOR_REVERSE, DL, VT, V1);
 
-  // If this is a deinterleave and we can widen the vector, then we can use
-  // vnsrl to deinterleave.
-  if (SDValue Src =
-          isDeinterleaveShuffle(VT, ContainerVT, V1, V2, Mask, Subtarget))
-    return getDeinterleaveViaVNSRL(DL, VT, Src, Mask[0] == 0, DAG);
+  // If this is a deinterleave(2,4,8) and we can widen the vector, then we can
+  // use shift and truncate to perform the shuffle.
+  // TODO: For Factor=6, we can perform the first step of the deinterleave via
+  // shift-and-trunc reducing total cost for everything except an mf8 result.
+  // TODO: For Factor=4,8, we can do the same when the ratio isn't high enough
+  // to do the entire operation.
+  if (VT.getScalarSizeInBits() < Subtarget.getELen()) {
+    const unsigned MaxFactor = Subtarget.getELen() / VT.getScalarSizeInBits();
+    assert(MaxFactor == 2 || MaxFactor == 4 || MaxFactor == 8);
+    for (unsigned Factor = 2; Factor <= MaxFactor; Factor <<= 1) {
+      unsigned Index = 0;
+      if (ShuffleVectorInst::isDeInterleaveMaskOfFactor(Mask, Factor, Index) &&
+          1 < count_if(Mask, [](int Idx) { return Idx != -1; })) {
+        if (SDValue Src = getSingleShuffleSrc(VT, ContainerVT, V1, V2)) {
+          if (Src.getValueType() == VT) {
+            EVT WideVT = VT.getDoubleNumVectorElementsVT();
----------------
topperc wrote:

> Are we bothered by the toggles? I can probably eliminate them for < m1 by adding back the removed code, but conditional on LMUL. For the > m2 cases, this is probably an improvement. We could potentially treat this as a generic improvement to add to e.g. VLOptimizer too.

I agree it's probably an improvement for the wide LMUL. I don't think I'm bothered by the toggles. Some of the tests are producing undefined elements in the upper half of the shuffle result and then using them in a later instruction. Is that likely in real code?
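
For readers following the hunk above, here is a minimal scalar sketch of the shift-and-truncate idea that the new comment describes. It is not the LLVM lowering itself: `deinterleaveViaShiftTrunc` is a hypothetical helper, the element type is fixed to 8 bits for illustration, and it assumes element 0 of each group occupies the least significant bits of the widened element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only, not code from the patch.
// Selects elements Index, Index + Factor, Index + 2 * Factor, ... from Src by
// treating each group of Factor 8-bit elements as one wide integer, shifting
// the wanted element down to the low bits, and truncating back to 8 bits.
std::vector<uint8_t> deinterleaveViaShiftTrunc(const std::vector<uint8_t> &Src,
                                               unsigned Factor,
                                               unsigned Index) {
  // The wide element must fit in 64 bits, mirroring the ELEN-based limit.
  assert(Factor >= 1 && Factor <= 8 && Index < Factor);
  std::vector<uint8_t> Result;
  for (size_t I = 0; I + Factor <= Src.size(); I += Factor) {
    // Assumption: element 0 of the group lands in the lowest bits.
    uint64_t Wide = 0;
    for (unsigned J = 0; J < Factor; ++J)
      Wide |= static_cast<uint64_t>(Src[I + J]) << (8 * J);
    // Shift right by Index elements, then truncate to the narrow type.
    Result.push_back(static_cast<uint8_t>(Wide >> (8 * Index)));
  }
  return Result;
}
```

With the input `{0, 1, 2, 3, 4, 5, 6, 7}`, a factor of 4 and an index of 1 selects elements 1 and 5. The factor is capped at 8 here because eight 8-bit elements fill a 64-bit integer, which parallels the `MaxFactor = ELEN / element size` bound in the patch.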

https://github.com/llvm/llvm-project/pull/118509

