[llvm] [SelectionDAG] Use Karatsuba decomposition to expand vector CLMUL via narrower legal types (PR #184468)

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 4 03:52:51 PST 2026


================
@@ -8456,22 +8456,158 @@ SDValue TargetLowering::expandROT(SDNode *Node, bool AllowVectorOps,
   return DAG.getNode(ISD::OR, DL, VT, ShVal, HsVal);
 }
 
+/// Check if CLMUL on VT can eventually reach a type with legal CLMUL through
+/// a chain of Karatsuba decompositions (halving element width) and/or vector
+/// widening (doubling element count). This guides expansion strategy selection:
+/// if true, the Karatsuba/widening path produces better code than bit-by-bit.
+///
+/// KaratsubaDepth tracks halving steps only (each creates ~4x more operations).
+/// Widening steps are cheap (O(1) pad/extract) and don't count.
+/// Limiting halvings to 2 prevents exponential blowup:
+///   1 halving: ~4 sub-CLMULs (good, e.g. v8i16 -> v8i8)
+///   2 halvings: ~16 sub-CLMULs (acceptable, e.g. v4i32 -> v4i16 -> v8i8)
+///   3 halvings: ~64 sub-CLMULs (worse than bit-by-bit expansion)
+static bool canNarrowCLMULToLegal(const TargetLowering &TLI, LLVMContext &Ctx,
+                                  EVT VT, unsigned KaratsubaDepth = 0,
+                                  unsigned TotalDepth = 0) {
+  if (KaratsubaDepth > 2 || TotalDepth > 8 || !VT.isVector() ||
+      VT.isScalableVector())
+    return false;
+  if (TLI.isOperationLegalOrCustom(ISD::CLMUL, VT))
+    return true;
+  if (!TLI.isTypeLegal(VT))
+    return false;
+
+  unsigned BW = VT.getScalarSizeInBits();
+
+  // Karatsuba: halve element width, same element count.
+  // This is the expensive step — each halving creates ~4x more operations.
+  if (BW >= 16) {
+    EVT HalfEltVT = EVT::getIntegerVT(Ctx, BW / 2);
+    EVT HalfVT = EVT::getVectorVT(Ctx, HalfEltVT, VT.getVectorElementCount());
+    if (TLI.isTypeLegal(HalfVT) &&
+        canNarrowCLMULToLegal(TLI, Ctx, HalfVT, KaratsubaDepth + 1,
+                              TotalDepth + 1))
+      return true;
+  }
+
+  // Widen: double element count (fixed-width vectors only).
+  // This is cheap — just INSERT_SUBVECTOR + EXTRACT_SUBVECTOR.
+  if (auto EC = VT.getVectorElementCount(); EC.isFixed()) {
+    EVT WideVT = EVT::getVectorVT(Ctx, VT.getVectorElementType(), EC * 2);
----------------
RKSimon wrote:

```suggestion
    EVT WideVT = VT.getDoubleNumVectorElementsVT(Ctx);
```
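
For readers unfamiliar with the decomposition named in the doc comment above, here is a minimal scalar sketch of Karatsuba applied to carry-less multiplication. It is not the PR's SelectionDAG code: `clmul8`, `clmul16_ref`, `clmul16_karatsuba` and `karatsubaMatchesReference` are hypothetical helper names, and the sketch assumes a widening narrow multiply (8x8 -> 16 bits) as a stand-in for a legal narrow CLMUL. The actual patch operates on vector types and may structure the sub-products differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a "legal" narrow CLMUL: bit-by-bit carry-less
// multiply of two 8-bit values, producing the full 16-bit product.
static uint16_t clmul8(uint8_t a, uint8_t b) {
  uint16_t r = 0;
  for (int i = 0; i < 8; ++i)
    if ((b >> i) & 1)
      r ^= static_cast<uint16_t>(a) << i;
  return r;
}

// Reference 16x16 -> 32 carry-less multiply, done bit by bit.
static uint32_t clmul16_ref(uint16_t a, uint16_t b) {
  uint32_t r = 0;
  for (int i = 0; i < 16; ++i)
    if ((b >> i) & 1)
      r ^= static_cast<uint32_t>(a) << i;
  return r;
}

// Karatsuba over GF(2)[x]: split a = a1*x^8 + a0 and b = b1*x^8 + b0, then
//   a*b = hi*x^16 + (cross ^ hi ^ lo)*x^8 + lo
// with lo = a0*b0, hi = a1*b1, cross = (a0^a1)*(b0^b1).
// Addition and subtraction are both XOR, so no carries or borrows appear.
static uint32_t clmul16_karatsuba(uint16_t a, uint16_t b) {
  uint8_t a0 = a & 0xFF, a1 = a >> 8;
  uint8_t b0 = b & 0xFF, b1 = b >> 8;
  uint32_t lo = clmul8(a0, b0);
  uint32_t hi = clmul8(a1, b1);
  uint32_t mid = clmul8(a0 ^ a1, b0 ^ b1) ^ lo ^ hi;
  return (hi << 16) ^ (mid << 8) ^ lo;
}

// Compare the Karatsuba result against the reference on a sampled grid of
// operands (the strides are arbitrary; an exhaustive check is also feasible).
static bool karatsubaMatchesReference() {
  for (uint32_t a = 0; a <= 0xFFFF; a += 257)
    for (uint32_t b = 0; b <= 0xFFFF; b += 263)
      if (clmul16_karatsuba(static_cast<uint16_t>(a),
                            static_cast<uint16_t>(b)) !=
          clmul16_ref(static_cast<uint16_t>(a), static_cast<uint16_t>(b)))
        return false;
  return true;
}
```

In this scalar form each halving step issues three narrow multiplies plus a handful of XORs and shifts. The per-halving operation counts quoted in the patch's comment describe the DAG expansion and are the author's own estimates.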

https://github.com/llvm/llvm-project/pull/184468

