[llvm] [SelectionDAG] Use Karatsuba decomposition to expand vector CLMUL via narrower legal types (PR #184468)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 4 03:52:52 PST 2026
================
@@ -8456,22 +8456,158 @@ SDValue TargetLowering::expandROT(SDNode *Node, bool AllowVectorOps,
return DAG.getNode(ISD::OR, DL, VT, ShVal, HsVal);
}
+/// Check if CLMUL on VT can eventually reach a type with legal CLMUL through
+/// a chain of Karatsuba decompositions (halving element width) and/or vector
+/// widening (doubling element count). This guides expansion strategy selection:
+/// if true, the Karatsuba/widening path produces better code than bit-by-bit.
+///
+/// KaratsubaDepth tracks halving steps only (each creates ~4x more operations).
+/// Widening steps are cheap (O(1) pad/extract) and don't count.
+/// Limiting halvings to 2 prevents exponential blowup:
+/// 1 halving: ~4 sub-CLMULs (good, e.g. v8i16 -> v8i8)
+/// 2 halvings: ~16 sub-CLMULs (acceptable, e.g. v4i32 -> v4i16 -> v8i8)
+/// 3 halvings: ~64 sub-CLMULs (worse than bit-by-bit expansion)
+static bool canNarrowCLMULToLegal(const TargetLowering &TLI, LLVMContext &Ctx,
+                                  EVT VT, unsigned KaratsubaDepth = 0,
+                                  unsigned TotalDepth = 0) {
+  if (KaratsubaDepth > 2 || TotalDepth > 8 || !VT.isVector() ||
+      VT.isScalableVector())
+    return false;
+  if (TLI.isOperationLegalOrCustom(ISD::CLMUL, VT))
+    return true;
+  if (!TLI.isTypeLegal(VT))
+    return false;
+
+  unsigned BW = VT.getScalarSizeInBits();
+
+  // Karatsuba: halve element width, same element count.
+  // This is the expensive step; each halving creates ~4x more operations.
+  if (BW >= 16) {
+    EVT HalfEltVT = EVT::getIntegerVT(Ctx, BW / 2);
+    EVT HalfVT = EVT::getVectorVT(Ctx, HalfEltVT, VT.getVectorElementCount());
+    if (TLI.isTypeLegal(HalfVT) &&
+        canNarrowCLMULToLegal(TLI, Ctx, HalfVT, KaratsubaDepth + 1,
+                              TotalDepth + 1))
+      return true;
+  }
+
+  // Widen: double element count (fixed-width vectors only).
+  // This is cheap; just INSERT_SUBVECTOR + EXTRACT_SUBVECTOR.
+  if (auto EC = VT.getVectorElementCount(); EC.isFixed()) {
----------------
RKSimon wrote:
Can this `EC.isFixed()` check ever fail? You already early-out above if VT is scalable or not a vector.
https://github.com/llvm/llvm-project/pull/184468
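
For readers unfamiliar with the halving step named in the quoted doc comment, here is a minimal scalar sketch of textbook Karatsuba for carry-less multiplication. It is not the patch's SelectionDAG code: the helper names (`clmul8`, `clmulBitwise16`, `clmulKaratsuba16`) are hypothetical, it computes a full 16x16 -> 32-bit product rather than modelling the semantics of the `ISD::CLMUL` node, and it does not try to reproduce the per-halving operation counts given in the patch comment.

```cpp
#include <cstdint>

// Reference: bit-by-bit carry-less multiply, 16x16 -> 32 bits.
static uint32_t clmulBitwise16(uint16_t A, uint16_t B) {
  uint32_t R = 0;
  for (unsigned I = 0; I < 16; ++I)
    if ((B >> I) & 1)
      R ^= static_cast<uint32_t>(A) << I;
  return R;
}

// Stand-in for a "legal" narrow carry-less multiply, 8x8 -> 16 bits.
static uint16_t clmul8(uint8_t A, uint8_t B) {
  uint16_t R = 0;
  for (unsigned I = 0; I < 8; ++I)
    if ((B >> I) & 1)
      R ^= static_cast<uint16_t>(static_cast<uint16_t>(A) << I);
  return R;
}

// Karatsuba over GF(2)[x]: split each operand into 8-bit halves and
// combine three half-width products. Addition and subtraction are both XOR.
static uint32_t clmulKaratsuba16(uint16_t A, uint16_t B) {
  uint8_t A0 = static_cast<uint8_t>(A & 0xFF);
  uint8_t A1 = static_cast<uint8_t>(A >> 8);
  uint8_t B0 = static_cast<uint8_t>(B & 0xFF);
  uint8_t B1 = static_cast<uint8_t>(B >> 8);

  uint32_t Lo = clmul8(A0, B0);                        // A0 * B0
  uint32_t Hi = clmul8(A1, B1);                        // A1 * B1
  uint32_t Cross = clmul8(static_cast<uint8_t>(A0 ^ A1),
                          static_cast<uint8_t>(B0 ^ B1));
  uint32_t Mid = Cross ^ Lo ^ Hi;                      // A0*B1 ^ A1*B0

  return (Hi << 16) ^ (Mid << 8) ^ Lo;
}
```

Comparing `clmulKaratsuba16` against `clmulBitwise16` over a handful of inputs is a quick way to sanity-check the recombination shifts.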