[llvm] [AArch64] Optimized rdsvl followed by constant mul (PR #162853)

Tue Oct 21 03:48:09 PDT 2025

================
@@ -19579,6 +19579,47 @@ static SDValue performMulCombine(SDNode *N, SelectionDAG &DAG,
        if (ConstValue.sge(1) && ConstValue.sle(16))
          return SDValue();
 
+  // Multiplying an RDSVL value by a constant can sometimes be done cheaper by
+  // folding a power-of-two factor of the constant into the RDSVL immediate and
+  // compensating with an extra shift.
+  //
+  // We rewrite:
+  //   (mul (srl (rdsvl 1), 3), x)
+  // to one of:
+  //   (shl (rdsvl y),  z)   if z > 0
+  //   (srl (rdsvl y), abs(z))   if z < 0
+  // where integers y, z satisfy   x = y * 2^(3 + z)   and   y ∈ [-32, 31].
+  if ((N0->getOpcode() == ISD::SRL) &&
+      (N0->getOperand(0).getOpcode() == AArch64ISD::RDSVL)) {
+    unsigned AbsConstValue = ConstValue.abs().getZExtValue();
+
+    // z ≤ ctz(|x|) - 3  (largest extra shift we can take while keeping y
+    // integral)
+    int UpperBound = llvm::countr_zero(AbsConstValue) - 3;
----------------
kmclaughlin-arm wrote:

I think the 3 here is coming from the assumption that the RDSVL & shift have come from expanding a `cntsd` intrinsic. If that is the case, should we make sure the shift value is 3 as expected?

https://github.com/llvm/llvm-project/pull/162853