[llvm] [SLP] Compute a shuffle mask for SK_InsertSubvector (PR #85408)

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 15 07:59:18 PDT 2024


https://github.com/preames created https://github.com/llvm/llvm-project/pull/85408

This is the third of a series of small patches to compute shuffle masks for the couple of cases where we call getShuffleCost without one. My goal is to add an invariant that all calls to getShuffleCost for fixed length vectors have a mask.

After this change, there is one SK_InsertSubvector case left.  I excluded it from this patch just because I thought it worthy of individual attention and review.

>From 26e04558ead1a6ee5290b83e51b3e1cca1641020 Mon Sep 17 00:00:00 2001
From: Philip Reames <preames at rivosinc.com>
Date: Fri, 15 Mar 2024 07:39:51 -0700
Subject: [PATCH] [SLP] Compute a shuffle mask for SK_InsertSubvector

This is the third of a series of small patches to compute shuffle masks for the couple of cases where we call getShuffleCost without one. My goal is to add an invariant that all calls to getShuffleCost for fixed length vectors have a mask.

After this change, there is one SK_InsertSubvector case left.  I excluded it from this patch just because I thought it worthy of individual attention and review.
---
 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index b4cce680e2876f..b6868fb3f3ca3f 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -4328,9 +4328,12 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
               llvm_unreachable(
                   "Expected only consecutive, strided or masked gather loads.");
             }
+            SmallVector<int> ShuffleMask(VL.size());
+            for (int i = 0; i < VL.size(); i++)
+              ShuffleMask[i] = i / VF == I ? VL.size() + i % VF : i;
             VecLdCost +=
                 TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy,
-                                   std::nullopt, CostKind, I * VF, SubVecTy);
+                                   ShuffleMask, CostKind, I * VF, SubVecTy);
           }
           // If masked gather cost is higher - better to vectorize, so
           // consider it as a gather node. It will be better estimated
@@ -7454,7 +7457,7 @@ getShuffleCost(const TargetTransformInfo &TTI, TTI::ShuffleKind Kind,
         Index + NumSrcElts <= static_cast<int>(Mask.size()))
       return TTI.getShuffleCost(
           TTI::SK_InsertSubvector,
-          FixedVectorType::get(Tp->getElementType(), Mask.size()), std::nullopt,
+          FixedVectorType::get(Tp->getElementType(), Mask.size()), Mask,
           TTI::TCK_RecipThroughput, Index, Tp);
   }
   return TTI.getShuffleCost(Kind, Tp, Mask, CostKind, Index, SubTp, Args);
@@ -7727,9 +7730,13 @@ class BoUpSLP::ShuffleCostEstimator : public BaseShuffleAnalysis {
         }
         if (NeedInsertSubvectorAnalysis) {
           // Add the cost for the subvectors insert.
-          for (int I = VF, E = VL.size(); I < E; I += VF)
+          SmallVector<int> ShuffleMask(VL.size());
+          for (int I = VF, E = VL.size(); I < E; I += VF) {
+            for (int i = 0; i < E; i++)
+              ShuffleMask[i] = i / VF == I ? E + i % VF : i;
             GatherCost += TTI.getShuffleCost(TTI::SK_InsertSubvector, VecTy,
-                                             std::nullopt, CostKind, I, LoadTy);
+                                             ShuffleMask, CostKind, I, LoadTy);
+          }
         }
         GatherCost -= ScalarsCost;
       }



More information about the llvm-commits mailing list