[llvm] [CostModel][X86] Add initial costs for non-lane-crossing one/two input shuffles (PR #114680)

Alexey Bataev via llvm-commits llvm-commits at lists.llvm.org
Sun Nov 3 06:23:33 PST 2024


================
@@ -1559,6 +1559,23 @@ InstructionCost X86TTIImpl::getShuffleCost(
       return TTI::TCC_Free;
   }
 
+  // Attempt to detect a cheaper inlane shuffle, avoiding 128-bit subvector
+  // permutation.
+  bool IsInLaneShuffle = false;
+  if (BaseTp->getPrimitiveSizeInBits() > 0 &&
+      (BaseTp->getPrimitiveSizeInBits() % 128) == 0 &&
+      Mask.size() == BaseTp->getElementCount().getKnownMinValue()) {
+    unsigned NumLanes = BaseTp->getPrimitiveSizeInBits() / 128;
+    unsigned NumEltsPerLane = Mask.size() / NumLanes;
+    if ((Mask.size() % NumLanes) == 0) {
+      IsInLaneShuffle = true;
+      for (auto [I, M] : enumerate(Mask))
+        if (0 <= M)
+          IsInLaneShuffle &=
+              ((M % Mask.size()) / NumEltsPerLane) == (I / NumEltsPerLane);
----------------
alexey-bataev wrote:

`IsInLaneShuffle = all_of(enumerate(Mask), [&](const auto &P) { return P.value() == PoisonMaskElem || ((P.value() % Mask.size()) / NumEltsPerLane) == (P.index() / NumEltsPerLane); });`
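
For readers without an LLVM checkout, the check being discussed can be sketched as a standalone function. This is only an illustration under stated assumptions, not the code in the PR: `isInLaneShuffle`, `VectorBits`, and the `std::vector<int>` mask are hypothetical stand-ins for `BaseTp->getPrimitiveSizeInBits()` and the `ArrayRef<int>` mask. A plain loop with early exit replaces `all_of(enumerate(Mask), ...)`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical standalone version of the in-lane test from the hunk above.
// VectorBits stands in for BaseTp->getPrimitiveSizeInBits(); Mask holds one
// entry per result element, with negative entries meaning "poison/undef".
bool isInLaneShuffle(const std::vector<int> &Mask, unsigned VectorBits) {
  // Only vectors made of whole 128-bit lanes qualify.
  if (VectorBits == 0 || VectorBits % 128 != 0)
    return false;

  const std::size_t NumElts = Mask.size();
  const std::size_t NumLanes = VectorBits / 128;
  if (NumElts % NumLanes != 0)
    return false;
  const std::size_t NumEltsPerLane = NumElts / NumLanes;

  for (std::size_t I = 0; I < NumElts; ++I) {
    const int M = Mask[I];
    if (M < 0)
      continue; // Poison element: places no constraint on the lane.
    // Indices >= NumElts select from the second input; the modulo folds
    // them back so both inputs are compared by lane position.
    const std::size_t SrcLane =
        (static_cast<std::size_t>(M) % NumElts) / NumEltsPerLane;
    if (SrcLane != I / NumEltsPerLane)
      return false; // Element crosses a 128-bit lane boundary.
  }
  return true;
}
```

With a 256-bit vector of eight elements, a mask such as `{1, 0, 3, 2, 5, 4, 7, 6}` stays within its lanes, while `{4, 5, 6, 7, 0, 1, 2, 3}` swaps the two 128-bit halves and is rejected. Note that the loop in the diff skips every negative mask value (`0 <= M`), whereas the suggested `all_of` form skips only `PoisonMaskElem`; the sketch follows the diff's behaviour.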

https://github.com/llvm/llvm-project/pull/114680

