[llvm] [RISCV][POC] Recursive search for mul expansion (PR #96327)

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 21 09:44:11 PDT 2024


https://github.com/preames created https://github.com/llvm/llvm-project/pull/96327

(This is very much a POC - code quality is poor, and some bugs are unfixed.)

I want to gather opinions on what we should do next (if anything) for multiply strength reduction.  As a strawman, this patch implements a recursive search strategy which is applied during ISEL.  This is exponential in search depth, and has a branching factor of ~17 if I counted correctly.
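
(For reference, the ~17 comes from counting the candidate factorings tried at each node: 1 for stripping trailing zeros, 3 for the 3/5/9 divisors, 1 for the 2^(1,2,3)*W + 1 shNadd form, 3 for the +2/4/8 offsets, 2 for +/- 1, 6 for +/- 3/5/9, and 1 for isolating the low set bit.)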

For the range [0,199] inclusive:
On ToT, we cover all but 123 values.
At depth=3, we cover all but 43 values.
At depth=4, we cover all values in that range.
An older gcc covers all but 94.

For the range [-1000,999] inclusive:
On ToT, we miss 1853 values.      (0.9s)
At depth=3, we miss 1667 values.  (1.14s)
At depth=4, we miss 1130 values.  (2.5s)
At depth=5, we miss 741 values.   (12.25s)
An older gcc misses 1545 values.  (1.3s - but for full compile, not just llc)

The times on the right-hand side are the llc run times for the given search depth.

There are things we could do to improve compile time here.  The biggest one is likely memoizing results - in particular, remembering that a result can't be found within the budget would likely help a lot.
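
As a rough sketch, the negative-result memoization could be a thin wrapper
around the existing search.  The wrapper and cache names here are
hypothetical, not something in this patch:

  // Remember, per constant, the largest remaining budget for which no
  // expansion was found, and skip re-searching with no more budget.
  static DenseMap<uint64_t, unsigned> FailedWithBudget;

  static bool findMulExpansionMemo(uint64_t MulAmt, bool HasShlAdd,
                                   SmallVector<Step> &Path) {
    if (Path.size() > MulExpansionDepth)
      return false;
    unsigned Budget = MulExpansionDepth - Path.size();
    auto It = FailedWithBudget.find(MulAmt);
    if (It != FailedWithBudget.end() && It->second >= Budget)
      return false;
    if (findMulExpansionRecursive(MulAmt, HasShlAdd, Path))
      return true;
    // Record the largest budget for which this constant was unreachable.
    unsigned &Seen = FailedWithBudget[MulAmt];
    Seen = std::max(Seen, Budget);
    return false;
  }

The cache would also need to be keyed on HasShlAdd (or split per
configuration), and a mutable file-scope map isn't thread-safe; this is just
the shape of the idea.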

Another approach might be to pre-generate lookup tables for "common" numbers. The search strategy here could be run offline, and the results cached in a data table.  The size of the table of course depends on the range chosen, but for e.g. [-128,127] we could probably get away with 3 bytes per entry, or 768 bytes of static storage.  One slight complication is that the tables differ by ISA - at the moment, we'd only need 2 (zba vs no-zba), but we could reasonably want to exploit other extensions in the future.
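
To make the size concrete, here is one hypothetical encoding; the names and
packing are illustrative, not part of this patch:

  // One 3-byte recipe per constant in [-128,127]; each byte encodes one
  // step (opcode in the high bits, shift amount or operand index in the
  // low bits), with 0xFF as a sentinel for "no expansion found".
  struct MulRecipe { uint8_t Steps[3]; };
  // Table contents would be generated offline by the search above.
  static const MulRecipe MulTableZba[256] = {};   // zba available
  static const MulRecipe MulTableNoZba[256] = {}; // base ISA only

  static const MulRecipe *lookupMulRecipe(int64_t C, bool HasZba) {
    if (C < -128 || C > 127)
      return nullptr;
    const MulRecipe &R = (HasZba ? MulTableZba : MulTableNoZba)[C + 128];
    return R.Steps[0] == 0xFF ? nullptr : &R;
  }

Each such table is 768 bytes, and additional extension configurations scale
linearly.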

We could of course also choose to do nothing.

Thoughts, ideas?

>From 448713b431c4bf999fd04a564469965ff3b6f6a6 Mon Sep 17 00:00:00 2001
From: Philip Reames <preames at rivosinc.com>
Date: Thu, 20 Jun 2024 12:02:45 -0700
Subject: [PATCH] [RISCV][POC] Recursive search for mul expansion

(This is very much a POC - code quality is poor, and some bugs are unfixed.)

I want to gather opinions on what we should do next (if anything) for
multiply strength reduction.  As a strawman, this patch implements a
recursive search strategy which is applied during ISEL.  This is
exponential in search depth, and has a branching factor of ~17 if I
counted correctly.

For the range [0,199] inclusive:
On ToT, we cover all but 123 values.
At depth=3, we cover all but 43 values.
At depth=4, we cover all values in that range.
An older gcc covers all but 94.

For the range [-1000,999] inclusive:
On ToT, we miss 1853 values.      (0.9s)
At depth=3, we miss 1667 values.  (1.14s)
At depth=4, we miss 1130 values.  (2.5s)
At depth=5, we miss 741 values.   (12.25s)
An older gcc misses 1545 values.  (1.3s - but for full compile, not just llc)

The times on the right-hand side are the llc run times for the given search depth.

There are things we could do to improve compile time here.  The biggest one
is likely memoizing results - in particular, remembering that a result can't
be found within the budget would likely help a lot.

Another approach might be to pre-generate lookup tables for "common" numbers.
The search strategy here could be run offline, and the results cached in a
data table.  The size of the table of course depends on the range chosen, but
for e.g. [-128,127] we could probably get away with 3 bytes per entry, or
768 bytes of static storage.  One slight complication is that the tables
differ by ISA - at the moment, we'd only need 2 (zba vs no-zba), but we could
reasonably want to exploit other extensions in the future.

We could of course also choose to do nothing.

Thoughts, ideas?
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp   | 344 ++++++++++++------
 llvm/test/CodeGen/RISCV/addimm-mulimm.ll      | 158 ++++----
 llvm/test/CodeGen/RISCV/div-by-constant.ll    |  56 ++-
 llvm/test/CodeGen/RISCV/mul.ll                | 184 +++++-----
 llvm/test/CodeGen/RISCV/rv32xtheadba.ll       |  62 ++--
 llvm/test/CodeGen/RISCV/rv32zba.ll            |  81 +++--
 .../CodeGen/RISCV/rv64-legal-i32/rv64zba.ll   | 135 ++++---
 llvm/test/CodeGen/RISCV/rv64xtheadba.ll       |  70 ++--
 llvm/test/CodeGen/RISCV/rv64zba.ll            | 329 ++++++++++-------
 .../CodeGen/RISCV/rvv/calling-conv-fastcc.ll  | 102 +++---
 .../CodeGen/RISCV/rvv/extract-subvector.ll    |  14 +-
 .../fixed-vectors-strided-load-store-asm.ll   |  46 ++-
 .../CodeGen/RISCV/rvv/mscatter-combine.ll     |   8 +-
 llvm/test/CodeGen/RISCV/rvv/setcc-fp-vp.ll    |  14 +-
 llvm/test/CodeGen/RISCV/rvv/stepvector.ll     |  10 +-
 .../CodeGen/RISCV/srem-seteq-illegal-types.ll |  30 +-
 llvm/test/CodeGen/RISCV/urem-vector-lkk.ll    |  12 +-
 17 files changed, 955 insertions(+), 700 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 57817832c9b42..b3ad67423b550 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -80,6 +80,12 @@ static cl::opt<bool>
     RV64LegalI32("riscv-experimental-rv64-legal-i32", cl::ReallyHidden,
                  cl::desc("Make i32 a legal type for SelectionDAG on RV64."));
 
+static cl::opt<unsigned> MulExpansionDepth(
+    "riscv-mul-expansion-depth", cl::Hidden,
+    cl::desc("Maximum depth to search when expanding a mul by constant"),
+    cl::init(3));
+
+
 RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
                                          const RISCVSubtarget &STI)
     : TargetLowering(TM), Subtarget(STI) {
@@ -13689,6 +13695,221 @@ static SDValue performXORCombine(SDNode *N, SelectionDAG &DAG,
   return combineSelectAndUseCommutative(N, DAG, /*AllOnes*/ false, Subtarget);
 }
 
+struct Step {
+  unsigned Opcode;
+  uint64_t A;
+  uint64_t B;
+  unsigned Shift = 0;
+};
+
+static bool findMulExpansionRecursive(uint64_t MulAmt, bool HasShlAdd,
+                                      SmallVector<Step> &Path) {
+  // Bound the search at MulExpansionDepth steps.
+  // In [0,199] inclusive, with max depth = X, we miss:
+  // 3: 43 unique values
+  // 4: none, all covered
+
+  if (Path.size() > MulExpansionDepth)
+    return false;
+
+  // Base case
+  if (MulAmt == 1)
+    return true;
+
+  // TODO: Rework the recursion structure here.
+  SmallVector<Step> TmpPath = Path;
+  std::optional<SmallVector<Step>> BestPath;
+
+  auto recurseAndScore = [&](uint64_t MulAmt2, Step S) {
+    unsigned Len = TmpPath.size();
+    TmpPath.push_back(S);
+    if (findMulExpansionRecursive(MulAmt2, HasShlAdd, TmpPath))
+      if (!BestPath || BestPath->size() > TmpPath.size())
+        BestPath = TmpPath;
+    TmpPath.resize(Len);
+  };
+
+  // We only recurse on the first operand, so the second step must
+  // be complete without further search.
+  auto recurseAndScore2 = [&](uint64_t MulAmt2, Step S1, Step S2) {
+    unsigned Len = TmpPath.size();
+    TmpPath.push_back(S1);
+    TmpPath.push_back(S2);
+    if (findMulExpansionRecursive(MulAmt2, HasShlAdd, TmpPath))
+      if (!BestPath || BestPath->size() > TmpPath.size())
+        BestPath = TmpPath;
+    TmpPath.resize(Len);
+  };
+
+
+  // Only the base case (MulAmt is a power of two) is required, but we prefer
+  // to expand the shl last, so add it to our solution set eagerly so that an
+  // equal-cost inverse factoring is excluded.
+  if (unsigned TZ = llvm::countr_zero(MulAmt); TZ) {
+    uint64_t MulAmt2 = MulAmt >> TZ;
+    recurseAndScore(MulAmt2, {ISD::SHL, MulAmt2, TZ});
+  }
+
+  // TODO: Should we factor out the MUL node entirely in favor of
+  // the RISCVISD::SHL_ADD one?
+  if (HasShlAdd) {
+    // {3,5,9}*W -> shNadd W, W
+    for (uint64_t Divisor : {3, 5, 9}) {
+      if (MulAmt % Divisor != 0)
+        continue;
+      uint64_t MulAmt2 = MulAmt / Divisor;
+      recurseAndScore(MulAmt2, {ISD::MUL, MulAmt2, Divisor});
+    }
+
+    // 2^(1,2,3) * W + 1 -> shNadd(W, x)
+    unsigned TZ = llvm::countr_zero(MulAmt - 1);
+    if (TZ == 1 || TZ == 2 || TZ == 3) {
+      uint64_t MulAmt2 = (MulAmt - 1) >> TZ;
+      recurseAndScore(MulAmt2, {RISCVISD::SHL_ADD, MulAmt2, 1, TZ});
+    }
+
+    // W + [2,4,8] -> shNadd x, W
+    for (uint64_t Offset : {2, 4, 8}) {
+      uint64_t MulAmt2 = MulAmt - Offset;
+      recurseAndScore(MulAmt2, {ISD::ADD, MulAmt2, Offset});
+    }
+  }
+
+  {
+    uint64_t MulAmt2 = MulAmt - 1;
+    recurseAndScore(MulAmt2, {ISD::ADD, MulAmt2, 1});
+  }
+
+  {
+    uint64_t MulAmt2 = MulAmt + 1;
+    recurseAndScore(MulAmt2, {ISD::SUB, MulAmt2, 1});
+  }
+
+
+  // Add the +/- 3,5,9 cases; these need two instructions each, even
+  // using shNadd.
+  if (HasShlAdd) {
+    for (uint64_t Offset : {3, 5, 9}) {
+      uint64_t MulAmt2 = (MulAmt - Offset);
+      recurseAndScore2(MulAmt2,
+                   {ISD::ADD, MulAmt2, Offset},
+                   {ISD::MUL, 1, Offset});
+    }
+
+    for (uint64_t Offset : {3, 5, 9}) {
+      uint64_t MulAmt2 = (MulAmt + Offset);
+      recurseAndScore2(MulAmt2,
+                       {ISD::SUB, MulAmt2, Offset},
+                       {ISD::MUL, 1, Offset});
+    }
+  }
+
+  // Isolate the low set bit, and recurse on the remainder
+  {
+    uint64_t MulAmtLowBit = MulAmt & (-MulAmt);
+    uint64_t MulAmt2 = MulAmt - MulAmtLowBit;
+    recurseAndScore2(MulAmt2,
+                     {ISD::ADD, MulAmt2, MulAmtLowBit},
+                     {ISD::SHL, 1, Log2_64(MulAmtLowBit)});
+  }
+
+  // TODO: Add subtracting the last zero bit.
+  
+
+  if (!BestPath)
+    return false;
+
+  Path = *BestPath;
+  return true;
+}
+
+static SDValue expandMulPath(SelectionDAG &DAG, SDNode *N,
+                             const bool HasShlAdd,
+                             SmallVector<Step> &Path, SDValue X) {
+  assert(!Path.empty());
+  EVT VT = N->getValueType(0);
+  SDLoc DL(N);
+
+  uint64_t MulAmt = cast<ConstantSDNode>(N->getOperand(1))->getZExtValue();
+  //dbgs() << "Found path for " << MulAmt << "\n";
+  
+  DenseMap<uint64_t, SDValue> Expansions;
+  Expansions[1] = X;
+  for (Step &S : llvm::reverse(Path)) {
+    switch (S.Opcode) {
+    default:
+      llvm_unreachable("Unexpected opcode in mul expansion path");
+    case ISD::SHL: {
+      //dbgs() << "Expanding " << S.A << " << " << S.B << "\n";
+      assert(Expansions.contains(S.A));
+      SDValue A = Expansions[S.A];
+      SDValue Res = DAG.getNode(ISD::SHL, DL, VT, A,
+                                DAG.getConstant(S.B, DL, VT));
+      Expansions[S.A << S.B] = Res;
+      break;
+    }
+    case ISD::MUL: {
+      //dbgs() << "Expanding " << S.A << " * " << S.B << "\n";
+      assert(Expansions.contains(S.A));
+      SDValue A = Expansions[S.A];
+      assert(S.B == 3 || S.B == 5 || S.B == 9);
+      SDValue Res = DAG.getNode(RISCVISD::SHL_ADD, DL, VT, A,
+                                DAG.getConstant(Log2_64(S.B - 1), DL, VT),
+                                A);
+      Expansions[S.A * S.B] = Res;
+      break;
+    }
+    case ISD::ADD: {
+      //dbgs() << "Expanding " << S.A << " + " << S.B << "\n";
+      assert(Expansions.contains(S.A));
+      SDValue A = Expansions[S.A];
+      SDValue Res;
+      if (HasShlAdd && (S.B == 2 || S.B == 4 || S.B == 8))
+        // TODO: Refactor this out
+        Res = DAG.getNode(RISCVISD::SHL_ADD, DL, VT, X,
+                          DAG.getConstant(Log2_64(S.B), DL, VT),
+                          A);
+      else {
+        assert(Expansions.contains(S.B));
+        SDValue B = Expansions[S.B];
+        Res = DAG.getNode(ISD::ADD, DL, VT, A, B);
+      }
+      assert(Res);
+      Expansions[S.A + S.B] = Res;
+      break;
+    }
+    case ISD::SUB: {
+      //dbgs() << "Expanding " << S.A << " - " << S.B << "\n";
+      assert(Expansions.contains(S.A));
+      SDValue A = Expansions[S.A];
+      assert(Expansions.contains(S.B));
+      SDValue B = Expansions[S.B];
+      SDValue Res = DAG.getNode(ISD::SUB, DL, VT, A, B);
+      Expansions[S.A - S.B] = Res;
+      break;
+    }
+    case RISCVISD::SHL_ADD: {
+      //dbgs() << "Expanding " << S.A << " << " << S.Shift << " + " << S.B << "\n";
+      assert(HasShlAdd);
+      assert(Expansions.contains(S.A));
+      assert(Expansions.contains(S.B));
+      SDValue A = Expansions[S.A];
+      SDValue B = Expansions[S.B];
+      SDValue Res = DAG.getNode(RISCVISD::SHL_ADD, DL, VT, A,
+                                DAG.getConstant(S.Shift, DL, VT),
+                                B);
+      Expansions[(S.A << S.Shift) + S.B] = Res;
+      break;
+    }
+    };
+  }
+
+
+  assert(Expansions.contains(MulAmt));
+  return Expansions[MulAmt];
+}
+  
+
 // Try to expand a scalar multiply to a faster sequence.
 static SDValue expandMul(SDNode *N, SelectionDAG &DAG,
                          TargetLowering::DAGCombinerInfo &DCI,
@@ -13720,124 +13941,11 @@ static SDValue expandMul(SDNode *N, SelectionDAG &DAG,
   // other target properly freezes X in these cases either.
   SDValue X = N->getOperand(0);
 
-  if (HasShlAdd) {
-    for (uint64_t Divisor : {3, 5, 9}) {
-      if (MulAmt % Divisor != 0)
-        continue;
-      uint64_t MulAmt2 = MulAmt / Divisor;
-      // 3/5/9 * 2^N ->  shl (shXadd X, X), N
-      if (isPowerOf2_64(MulAmt2)) {
-        SDLoc DL(N);
-        SDValue X = N->getOperand(0);
-        // Put the shift first if we can fold a zext into the
-        // shift forming a slli.uw.
-        if (X.getOpcode() == ISD::AND && isa<ConstantSDNode>(X.getOperand(1)) &&
-            X.getConstantOperandVal(1) == UINT64_C(0xffffffff)) {
-          SDValue Shl = DAG.getNode(ISD::SHL, DL, VT, X,
-                                    DAG.getConstant(Log2_64(MulAmt2), DL, VT));
-          return DAG.getNode(RISCVISD::SHL_ADD, DL, VT, Shl,
-                             DAG.getConstant(Log2_64(Divisor - 1), DL, VT),
-                             Shl);
-        }
-        // Otherwise, put rhe shl second so that it can fold with following
-        // instructions (e.g. sext or add).
-        SDValue Mul359 =
-            DAG.getNode(RISCVISD::SHL_ADD, DL, VT, X,
-                        DAG.getConstant(Log2_64(Divisor - 1), DL, VT), X);
-        return DAG.getNode(ISD::SHL, DL, VT, Mul359,
-                           DAG.getConstant(Log2_64(MulAmt2), DL, VT));
-      }
-
-      // 3/5/9 * 3/5/9 -> shXadd (shYadd X, X), (shYadd X, X)
-      if (MulAmt2 == 3 || MulAmt2 == 5 || MulAmt2 == 9) {
-        SDLoc DL(N);
-        SDValue Mul359 =
-            DAG.getNode(RISCVISD::SHL_ADD, DL, VT, X,
-                        DAG.getConstant(Log2_64(Divisor - 1), DL, VT), X);
-        return DAG.getNode(RISCVISD::SHL_ADD, DL, VT, Mul359,
-                           DAG.getConstant(Log2_64(MulAmt2 - 1), DL, VT),
-                           Mul359);
-      }
-    }
-
-    // If this is a power 2 + 2/4/8, we can use a shift followed by a single
-    // shXadd. First check if this a sum of two power of 2s because that's
-    // easy. Then count how many zeros are up to the first bit.
-    if (isPowerOf2_64(MulAmt & (MulAmt - 1))) {
-      unsigned ScaleShift = llvm::countr_zero(MulAmt);
-      if (ScaleShift >= 1 && ScaleShift < 4) {
-        unsigned ShiftAmt = Log2_64((MulAmt & (MulAmt - 1)));
-        SDLoc DL(N);
-        SDValue Shift1 =
-            DAG.getNode(ISD::SHL, DL, VT, X, DAG.getConstant(ShiftAmt, DL, VT));
-        return DAG.getNode(RISCVISD::SHL_ADD, DL, VT, X,
-                           DAG.getConstant(ScaleShift, DL, VT), Shift1);
-      }
-    }
-
-    // 2^(1,2,3) * 3,5,9 + 1 -> (shXadd (shYadd x, x), x)
-    // This is the two instruction form, there are also three instruction
-    // variants we could implement.  e.g.
-    //   (2^(1,2,3) * 3,5,9 + 1) << C2
-    //   2^(C1>3) * 3,5,9 +/- 1
-    for (uint64_t Divisor : {3, 5, 9}) {
-      uint64_t C = MulAmt - 1;
-      if (C <= Divisor)
-        continue;
-      unsigned TZ = llvm::countr_zero(C);
-      if ((C >> TZ) == Divisor && (TZ == 1 || TZ == 2 || TZ == 3)) {
-        SDLoc DL(N);
-        SDValue Mul359 =
-            DAG.getNode(RISCVISD::SHL_ADD, DL, VT, X,
-                        DAG.getConstant(Log2_64(Divisor - 1), DL, VT), X);
-        return DAG.getNode(RISCVISD::SHL_ADD, DL, VT, Mul359,
-                           DAG.getConstant(TZ, DL, VT), X);
-      }
-    }
-
-    // 2^n + 2/4/8 + 1 -> (add (shl X, C1), (shXadd X, X))
-    if (MulAmt > 2 && isPowerOf2_64((MulAmt - 1) & (MulAmt - 2))) {
-      unsigned ScaleShift = llvm::countr_zero(MulAmt - 1);
-      if (ScaleShift >= 1 && ScaleShift < 4) {
-        unsigned ShiftAmt = Log2_64(((MulAmt - 1) & (MulAmt - 2)));
-        SDLoc DL(N);
-        SDValue Shift1 =
-            DAG.getNode(ISD::SHL, DL, VT, X, DAG.getConstant(ShiftAmt, DL, VT));
-        return DAG.getNode(ISD::ADD, DL, VT, Shift1,
-                           DAG.getNode(RISCVISD::SHL_ADD, DL, VT, X,
-                                       DAG.getConstant(ScaleShift, DL, VT), X));
-      }
-    }
-
-    // 2^N - 3/5/9 --> (sub (shl X, C1), (shXadd X, x))
-    for (uint64_t Offset : {3, 5, 9}) {
-      if (isPowerOf2_64(MulAmt + Offset)) {
-        SDLoc DL(N);
-        SDValue Shift1 =
-            DAG.getNode(ISD::SHL, DL, VT, X,
-                        DAG.getConstant(Log2_64(MulAmt + Offset), DL, VT));
-        SDValue Mul359 =
-            DAG.getNode(RISCVISD::SHL_ADD, DL, VT, X,
-                        DAG.getConstant(Log2_64(Offset - 1), DL, VT), X);
-        return DAG.getNode(ISD::SUB, DL, VT, Shift1, Mul359);
-      }
-    }
-  }
-
-  // 2^N - 2^M -> (sub (shl X, C1), (shl X, C2))
-  uint64_t MulAmtLowBit = MulAmt & (-MulAmt);
-  if (isPowerOf2_64(MulAmt + MulAmtLowBit)) {
-    uint64_t ShiftAmt1 = MulAmt + MulAmtLowBit;
-    SDLoc DL(N);
-    SDValue Shift1 = DAG.getNode(ISD::SHL, DL, VT, N->getOperand(0),
-                                 DAG.getConstant(Log2_64(ShiftAmt1), DL, VT));
-    SDValue Shift2 =
-        DAG.getNode(ISD::SHL, DL, VT, N->getOperand(0),
-                    DAG.getConstant(Log2_64(MulAmtLowBit), DL, VT));
-    return DAG.getNode(ISD::SUB, DL, VT, Shift1, Shift2);
-  }
-
-  return SDValue();
+  SmallVector<Step> Path;
+  if (!findMulExpansionRecursive(MulAmt, HasShlAdd, Path))
+    return SDValue();
+  assert(!Path.empty());
+  return expandMulPath(DAG, N, HasShlAdd, Path, X);
 }
 
 // Combine vXi32 (mul (and (lshr X, 15), 0x10001), 0xffff) ->
diff --git a/llvm/test/CodeGen/RISCV/addimm-mulimm.ll b/llvm/test/CodeGen/RISCV/addimm-mulimm.ll
index a18526718461e..570712be40b4b 100644
--- a/llvm/test/CodeGen/RISCV/addimm-mulimm.ll
+++ b/llvm/test/CodeGen/RISCV/addimm-mulimm.ll
@@ -11,16 +11,16 @@ define i32 @add_mul_combine_accept_a1(i32 %x) {
 ; RV32IMB-LABEL: add_mul_combine_accept_a1:
 ; RV32IMB:       # %bb.0:
 ; RV32IMB-NEXT:    sh1add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh2add a0, a1, a0
 ; RV32IMB-NEXT:    addi a0, a0, 1073
 ; RV32IMB-NEXT:    ret
 ;
 ; RV64IMB-LABEL: add_mul_combine_accept_a1:
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
 ; RV64IMB-NEXT:    addiw a0, a0, 1073
 ; RV64IMB-NEXT:    ret
   %tmp0 = add i32 %x, 37
@@ -32,16 +32,16 @@ define signext i32 @add_mul_combine_accept_a2(i32 signext %x) {
 ; RV32IMB-LABEL: add_mul_combine_accept_a2:
 ; RV32IMB:       # %bb.0:
 ; RV32IMB-NEXT:    sh1add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh2add a0, a1, a0
 ; RV32IMB-NEXT:    addi a0, a0, 1073
 ; RV32IMB-NEXT:    ret
 ;
 ; RV64IMB-LABEL: add_mul_combine_accept_a2:
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
 ; RV64IMB-NEXT:    addiw a0, a0, 1073
 ; RV64IMB-NEXT:    ret
   %tmp0 = add i32 %x, 37
@@ -55,12 +55,12 @@ define i64 @add_mul_combine_accept_a3(i64 %x) {
 ; RV32IMB-NEXT:    li a2, 29
 ; RV32IMB-NEXT:    mulhu a2, a0, a2
 ; RV32IMB-NEXT:    sh1add a3, a1, a1
-; RV32IMB-NEXT:    slli a1, a1, 5
-; RV32IMB-NEXT:    sub a1, a1, a3
+; RV32IMB-NEXT:    sh1add a3, a3, a1
+; RV32IMB-NEXT:    sh2add a1, a3, a1
 ; RV32IMB-NEXT:    add a1, a2, a1
 ; RV32IMB-NEXT:    sh1add a2, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a2, a0, a2
+; RV32IMB-NEXT:    sh1add a2, a2, a0
+; RV32IMB-NEXT:    sh2add a2, a2, a0
 ; RV32IMB-NEXT:    addi a0, a2, 1073
 ; RV32IMB-NEXT:    sltu a2, a0, a2
 ; RV32IMB-NEXT:    add a1, a1, a2
@@ -69,8 +69,8 @@ define i64 @add_mul_combine_accept_a3(i64 %x) {
 ; RV64IMB-LABEL: add_mul_combine_accept_a3:
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    sub a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
 ; RV64IMB-NEXT:    addi a0, a0, 1073
 ; RV64IMB-NEXT:    ret
   %tmp0 = add i64 %x, 37
@@ -81,9 +81,9 @@ define i64 @add_mul_combine_accept_a3(i64 %x) {
 define i32 @add_mul_combine_accept_b1(i32 %x) {
 ; RV32IMB-LABEL: add_mul_combine_accept_b1:
 ; RV32IMB:       # %bb.0:
-; RV32IMB-NEXT:    sh3add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh2add a1, a0, a0
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh1add a0, a1, a0
 ; RV32IMB-NEXT:    lui a1, 50
 ; RV32IMB-NEXT:    addi a1, a1, 1119
 ; RV32IMB-NEXT:    add a0, a0, a1
@@ -91,9 +91,9 @@ define i32 @add_mul_combine_accept_b1(i32 %x) {
 ;
 ; RV64IMB-LABEL: add_mul_combine_accept_b1:
 ; RV64IMB:       # %bb.0:
-; RV64IMB-NEXT:    sh3add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh2add a1, a0, a0
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh1add a0, a1, a0
 ; RV64IMB-NEXT:    lui a1, 50
 ; RV64IMB-NEXT:    addi a1, a1, 1119
 ; RV64IMB-NEXT:    addw a0, a0, a1
@@ -106,9 +106,9 @@ define i32 @add_mul_combine_accept_b1(i32 %x) {
 define signext i32 @add_mul_combine_accept_b2(i32 signext %x) {
 ; RV32IMB-LABEL: add_mul_combine_accept_b2:
 ; RV32IMB:       # %bb.0:
-; RV32IMB-NEXT:    sh3add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh2add a1, a0, a0
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh1add a0, a1, a0
 ; RV32IMB-NEXT:    lui a1, 50
 ; RV32IMB-NEXT:    addi a1, a1, 1119
 ; RV32IMB-NEXT:    add a0, a0, a1
@@ -116,9 +116,9 @@ define signext i32 @add_mul_combine_accept_b2(i32 signext %x) {
 ;
 ; RV64IMB-LABEL: add_mul_combine_accept_b2:
 ; RV64IMB:       # %bb.0:
-; RV64IMB-NEXT:    sh3add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh2add a1, a0, a0
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh1add a0, a1, a0
 ; RV64IMB-NEXT:    lui a1, 50
 ; RV64IMB-NEXT:    addi a1, a1, 1119
 ; RV64IMB-NEXT:    addw a0, a0, a1
@@ -133,13 +133,13 @@ define i64 @add_mul_combine_accept_b3(i64 %x) {
 ; RV32IMB:       # %bb.0:
 ; RV32IMB-NEXT:    li a2, 23
 ; RV32IMB-NEXT:    mulhu a2, a0, a2
-; RV32IMB-NEXT:    sh3add a3, a1, a1
-; RV32IMB-NEXT:    slli a1, a1, 5
-; RV32IMB-NEXT:    sub a1, a1, a3
+; RV32IMB-NEXT:    sh2add a3, a1, a1
+; RV32IMB-NEXT:    sh1add a3, a3, a1
+; RV32IMB-NEXT:    sh1add a1, a3, a1
 ; RV32IMB-NEXT:    add a1, a2, a1
-; RV32IMB-NEXT:    sh3add a2, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a2, a0, a2
+; RV32IMB-NEXT:    sh2add a2, a0, a0
+; RV32IMB-NEXT:    sh1add a2, a2, a0
+; RV32IMB-NEXT:    sh1add a2, a2, a0
 ; RV32IMB-NEXT:    lui a0, 50
 ; RV32IMB-NEXT:    addi a0, a0, 1119
 ; RV32IMB-NEXT:    add a0, a2, a0
@@ -149,9 +149,9 @@ define i64 @add_mul_combine_accept_b3(i64 %x) {
 ;
 ; RV64IMB-LABEL: add_mul_combine_accept_b3:
 ; RV64IMB:       # %bb.0:
-; RV64IMB-NEXT:    sh3add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    sub a0, a0, a1
+; RV64IMB-NEXT:    sh2add a1, a0, a0
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh1add a0, a1, a0
 ; RV64IMB-NEXT:    lui a1, 50
 ; RV64IMB-NEXT:    addiw a1, a1, 1119
 ; RV64IMB-NEXT:    add a0, a0, a1
@@ -166,16 +166,17 @@ define i32 @add_mul_combine_reject_a1(i32 %x) {
 ; RV32IMB:       # %bb.0:
 ; RV32IMB-NEXT:    addi a0, a0, 1971
 ; RV32IMB-NEXT:    sh1add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh2add a0, a1, a0
 ; RV32IMB-NEXT:    ret
 ;
 ; RV64IMB-LABEL: add_mul_combine_reject_a1:
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    addi a0, a0, 1971
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
+; RV64IMB-NEXT:    sext.w a0, a0
 ; RV64IMB-NEXT:    ret
   %tmp0 = add i32 %x, 1971
   %tmp1 = mul i32 %tmp0, 29
@@ -187,16 +188,17 @@ define signext i32 @add_mul_combine_reject_a2(i32 signext %x) {
 ; RV32IMB:       # %bb.0:
 ; RV32IMB-NEXT:    addi a0, a0, 1971
 ; RV32IMB-NEXT:    sh1add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh2add a0, a1, a0
 ; RV32IMB-NEXT:    ret
 ;
 ; RV64IMB-LABEL: add_mul_combine_reject_a2:
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    addi a0, a0, 1971
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
+; RV64IMB-NEXT:    sext.w a0, a0
 ; RV64IMB-NEXT:    ret
   %tmp0 = add i32 %x, 1971
   %tmp1 = mul i32 %tmp0, 29
@@ -209,12 +211,12 @@ define i64 @add_mul_combine_reject_a3(i64 %x) {
 ; RV32IMB-NEXT:    li a2, 29
 ; RV32IMB-NEXT:    mulhu a2, a0, a2
 ; RV32IMB-NEXT:    sh1add a3, a1, a1
-; RV32IMB-NEXT:    slli a1, a1, 5
-; RV32IMB-NEXT:    sub a1, a1, a3
+; RV32IMB-NEXT:    sh1add a3, a3, a1
+; RV32IMB-NEXT:    sh2add a1, a3, a1
 ; RV32IMB-NEXT:    add a1, a2, a1
 ; RV32IMB-NEXT:    sh1add a2, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a2, a0, a2
+; RV32IMB-NEXT:    sh1add a2, a2, a0
+; RV32IMB-NEXT:    sh2add a2, a2, a0
 ; RV32IMB-NEXT:    lui a0, 14
 ; RV32IMB-NEXT:    addi a0, a0, -185
 ; RV32IMB-NEXT:    add a0, a2, a0
@@ -226,8 +228,8 @@ define i64 @add_mul_combine_reject_a3(i64 %x) {
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    addi a0, a0, 1971
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    sub a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
 ; RV64IMB-NEXT:    ret
   %tmp0 = add i64 %x, 1971
   %tmp1 = mul i64 %tmp0, 29
@@ -373,16 +375,17 @@ define i32 @add_mul_combine_reject_e1(i32 %x) {
 ; RV32IMB:       # %bb.0:
 ; RV32IMB-NEXT:    addi a0, a0, 1971
 ; RV32IMB-NEXT:    sh1add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh2add a0, a1, a0
 ; RV32IMB-NEXT:    ret
 ;
 ; RV64IMB-LABEL: add_mul_combine_reject_e1:
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    addi a0, a0, 1971
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
+; RV64IMB-NEXT:    sext.w a0, a0
 ; RV64IMB-NEXT:    ret
   %tmp0 = mul i32 %x, 29
   %tmp1 = add i32 %tmp0, 57159
@@ -394,16 +397,17 @@ define signext i32 @add_mul_combine_reject_e2(i32 signext %x) {
 ; RV32IMB:       # %bb.0:
 ; RV32IMB-NEXT:    addi a0, a0, 1971
 ; RV32IMB-NEXT:    sh1add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh2add a0, a1, a0
 ; RV32IMB-NEXT:    ret
 ;
 ; RV64IMB-LABEL: add_mul_combine_reject_e2:
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    addi a0, a0, 1971
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
+; RV64IMB-NEXT:    sext.w a0, a0
 ; RV64IMB-NEXT:    ret
   %tmp0 = mul i32 %x, 29
   %tmp1 = add i32 %tmp0, 57159
@@ -416,12 +420,12 @@ define i64 @add_mul_combine_reject_e3(i64 %x) {
 ; RV32IMB-NEXT:    li a2, 29
 ; RV32IMB-NEXT:    mulhu a2, a0, a2
 ; RV32IMB-NEXT:    sh1add a3, a1, a1
-; RV32IMB-NEXT:    slli a1, a1, 5
-; RV32IMB-NEXT:    sub a1, a1, a3
+; RV32IMB-NEXT:    sh1add a3, a3, a1
+; RV32IMB-NEXT:    sh2add a1, a3, a1
 ; RV32IMB-NEXT:    add a1, a2, a1
 ; RV32IMB-NEXT:    sh1add a2, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a2, a0, a2
+; RV32IMB-NEXT:    sh1add a2, a2, a0
+; RV32IMB-NEXT:    sh2add a2, a2, a0
 ; RV32IMB-NEXT:    lui a0, 14
 ; RV32IMB-NEXT:    addi a0, a0, -185
 ; RV32IMB-NEXT:    add a0, a2, a0
@@ -433,8 +437,8 @@ define i64 @add_mul_combine_reject_e3(i64 %x) {
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    addi a0, a0, 1971
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    sub a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
 ; RV64IMB-NEXT:    ret
   %tmp0 = mul i64 %x, 29
   %tmp1 = add i64 %tmp0, 57159
@@ -446,8 +450,8 @@ define i32 @add_mul_combine_reject_f1(i32 %x) {
 ; RV32IMB:       # %bb.0:
 ; RV32IMB-NEXT:    addi a0, a0, 1972
 ; RV32IMB-NEXT:    sh1add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh2add a0, a1, a0
 ; RV32IMB-NEXT:    addi a0, a0, 11
 ; RV32IMB-NEXT:    ret
 ;
@@ -455,8 +459,8 @@ define i32 @add_mul_combine_reject_f1(i32 %x) {
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    addi a0, a0, 1972
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
 ; RV64IMB-NEXT:    addiw a0, a0, 11
 ; RV64IMB-NEXT:    ret
   %tmp0 = mul i32 %x, 29
@@ -469,8 +473,8 @@ define signext i32 @add_mul_combine_reject_f2(i32 signext %x) {
 ; RV32IMB:       # %bb.0:
 ; RV32IMB-NEXT:    addi a0, a0, 1972
 ; RV32IMB-NEXT:    sh1add a1, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a0, a0, a1
+; RV32IMB-NEXT:    sh1add a1, a1, a0
+; RV32IMB-NEXT:    sh2add a0, a1, a0
 ; RV32IMB-NEXT:    addi a0, a0, 11
 ; RV32IMB-NEXT:    ret
 ;
@@ -478,8 +482,8 @@ define signext i32 @add_mul_combine_reject_f2(i32 signext %x) {
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    addi a0, a0, 1972
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    subw a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
 ; RV64IMB-NEXT:    addiw a0, a0, 11
 ; RV64IMB-NEXT:    ret
   %tmp0 = mul i32 %x, 29
@@ -493,12 +497,12 @@ define i64 @add_mul_combine_reject_f3(i64 %x) {
 ; RV32IMB-NEXT:    li a2, 29
 ; RV32IMB-NEXT:    mulhu a2, a0, a2
 ; RV32IMB-NEXT:    sh1add a3, a1, a1
-; RV32IMB-NEXT:    slli a1, a1, 5
-; RV32IMB-NEXT:    sub a1, a1, a3
+; RV32IMB-NEXT:    sh1add a3, a3, a1
+; RV32IMB-NEXT:    sh2add a1, a3, a1
 ; RV32IMB-NEXT:    add a1, a2, a1
 ; RV32IMB-NEXT:    sh1add a2, a0, a0
-; RV32IMB-NEXT:    slli a0, a0, 5
-; RV32IMB-NEXT:    sub a2, a0, a2
+; RV32IMB-NEXT:    sh1add a2, a2, a0
+; RV32IMB-NEXT:    sh2add a2, a2, a0
 ; RV32IMB-NEXT:    lui a0, 14
 ; RV32IMB-NEXT:    addi a0, a0, -145
 ; RV32IMB-NEXT:    add a0, a2, a0
@@ -510,8 +514,8 @@ define i64 @add_mul_combine_reject_f3(i64 %x) {
 ; RV64IMB:       # %bb.0:
 ; RV64IMB-NEXT:    addi a0, a0, 1972
 ; RV64IMB-NEXT:    sh1add a1, a0, a0
-; RV64IMB-NEXT:    slli a0, a0, 5
-; RV64IMB-NEXT:    sub a0, a0, a1
+; RV64IMB-NEXT:    sh1add a1, a1, a0
+; RV64IMB-NEXT:    sh2add a0, a1, a0
 ; RV64IMB-NEXT:    addi a0, a0, 11
 ; RV64IMB-NEXT:    ret
   %tmp0 = mul i64 %x, 29
diff --git a/llvm/test/CodeGen/RISCV/div-by-constant.ll b/llvm/test/CodeGen/RISCV/div-by-constant.ll
index 91ac7c5ddae3f..bc6d12c559f58 100644
--- a/llvm/test/CodeGen/RISCV/div-by-constant.ll
+++ b/llvm/test/CodeGen/RISCV/div-by-constant.ll
@@ -141,21 +141,39 @@ define i64 @udiv64_constant_add(i64 %a) nounwind {
 }
 
 define i8 @udiv8_constant_no_add(i8 %a) nounwind {
-; RV32-LABEL: udiv8_constant_no_add:
-; RV32:       # %bb.0:
-; RV32-NEXT:    andi a0, a0, 255
-; RV32-NEXT:    li a1, 205
-; RV32-NEXT:    mul a0, a0, a1
-; RV32-NEXT:    srli a0, a0, 10
-; RV32-NEXT:    ret
+; RV32IM-LABEL: udiv8_constant_no_add:
+; RV32IM:       # %bb.0:
+; RV32IM-NEXT:    andi a0, a0, 255
+; RV32IM-NEXT:    li a1, 205
+; RV32IM-NEXT:    mul a0, a0, a1
+; RV32IM-NEXT:    srli a0, a0, 10
+; RV32IM-NEXT:    ret
 ;
-; RV64-LABEL: udiv8_constant_no_add:
-; RV64:       # %bb.0:
-; RV64-NEXT:    andi a0, a0, 255
-; RV64-NEXT:    li a1, 205
-; RV64-NEXT:    mul a0, a0, a1
-; RV64-NEXT:    srli a0, a0, 10
-; RV64-NEXT:    ret
+; RV32IMZB-LABEL: udiv8_constant_no_add:
+; RV32IMZB:       # %bb.0:
+; RV32IMZB-NEXT:    andi a0, a0, 255
+; RV32IMZB-NEXT:    sh2add a1, a0, a0
+; RV32IMZB-NEXT:    sh3add a0, a1, a0
+; RV32IMZB-NEXT:    sh2add a0, a0, a0
+; RV32IMZB-NEXT:    srli a0, a0, 10
+; RV32IMZB-NEXT:    ret
+;
+; RV64IM-LABEL: udiv8_constant_no_add:
+; RV64IM:       # %bb.0:
+; RV64IM-NEXT:    andi a0, a0, 255
+; RV64IM-NEXT:    li a1, 205
+; RV64IM-NEXT:    mul a0, a0, a1
+; RV64IM-NEXT:    srli a0, a0, 10
+; RV64IM-NEXT:    ret
+;
+; RV64IMZB-LABEL: udiv8_constant_no_add:
+; RV64IMZB:       # %bb.0:
+; RV64IMZB-NEXT:    andi a0, a0, 255
+; RV64IMZB-NEXT:    sh2add a1, a0, a0
+; RV64IMZB-NEXT:    sh3add a0, a1, a0
+; RV64IMZB-NEXT:    sh2add a0, a0, a0
+; RV64IMZB-NEXT:    srli a0, a0, 10
+; RV64IMZB-NEXT:    ret
   %1 = udiv i8 %a, 5
   ret i8 %1
 }
@@ -657,8 +675,9 @@ define i8 @sdiv8_constant_sub_srai(i8 %a) nounwind {
 ; RV32IMZB-LABEL: sdiv8_constant_sub_srai:
 ; RV32IMZB:       # %bb.0:
 ; RV32IMZB-NEXT:    sext.b a1, a0
-; RV32IMZB-NEXT:    li a2, 109
-; RV32IMZB-NEXT:    mul a1, a1, a2
+; RV32IMZB-NEXT:    sh3add a2, a1, a1
+; RV32IMZB-NEXT:    sh1add a2, a2, a2
+; RV32IMZB-NEXT:    sh2add a1, a2, a1
 ; RV32IMZB-NEXT:    srli a1, a1, 8
 ; RV32IMZB-NEXT:    sub a1, a1, a0
 ; RV32IMZB-NEXT:    slli a1, a1, 24
@@ -684,8 +703,9 @@ define i8 @sdiv8_constant_sub_srai(i8 %a) nounwind {
 ; RV64IMZB-LABEL: sdiv8_constant_sub_srai:
 ; RV64IMZB:       # %bb.0:
 ; RV64IMZB-NEXT:    sext.b a1, a0
-; RV64IMZB-NEXT:    li a2, 109
-; RV64IMZB-NEXT:    mul a1, a1, a2
+; RV64IMZB-NEXT:    sh3add a2, a1, a1
+; RV64IMZB-NEXT:    sh1add a2, a2, a2
+; RV64IMZB-NEXT:    sh2add a1, a2, a1
 ; RV64IMZB-NEXT:    srli a1, a1, 8
 ; RV64IMZB-NEXT:    subw a1, a1, a0
 ; RV64IMZB-NEXT:    slli a1, a1, 56
diff --git a/llvm/test/CodeGen/RISCV/mul.ll b/llvm/test/CodeGen/RISCV/mul.ll
index 14f2777fdd06d..67d3eeae74962 100644
--- a/llvm/test/CodeGen/RISCV/mul.ll
+++ b/llvm/test/CodeGen/RISCV/mul.ll
@@ -473,23 +473,23 @@ define i32 @muli32_p14(i32 %a) nounwind {
 ;
 ; RV32IM-LABEL: muli32_p14:
 ; RV32IM:       # %bb.0:
-; RV32IM-NEXT:    slli a1, a0, 1
-; RV32IM-NEXT:    slli a0, a0, 4
-; RV32IM-NEXT:    sub a0, a0, a1
+; RV32IM-NEXT:    slli a1, a0, 3
+; RV32IM-NEXT:    sub a0, a1, a0
+; RV32IM-NEXT:    slli a0, a0, 1
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli32_p14:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 1
-; RV64I-NEXT:    slli a0, a0, 4
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 3
+; RV64I-NEXT:    sub a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    ret
 ;
 ; RV64IM-LABEL: muli32_p14:
 ; RV64IM:       # %bb.0:
-; RV64IM-NEXT:    slli a1, a0, 1
-; RV64IM-NEXT:    slli a0, a0, 4
-; RV64IM-NEXT:    subw a0, a0, a1
+; RV64IM-NEXT:    slli a1, a0, 3
+; RV64IM-NEXT:    subw a1, a1, a0
+; RV64IM-NEXT:    slliw a0, a1, 1
 ; RV64IM-NEXT:    ret
   %1 = mul i32 %a, 14
   ret i32 %1
@@ -503,23 +503,23 @@ define i32 @muli32_p28(i32 %a) nounwind {
 ;
 ; RV32IM-LABEL: muli32_p28:
 ; RV32IM:       # %bb.0:
-; RV32IM-NEXT:    slli a1, a0, 2
-; RV32IM-NEXT:    slli a0, a0, 5
-; RV32IM-NEXT:    sub a0, a0, a1
+; RV32IM-NEXT:    slli a1, a0, 3
+; RV32IM-NEXT:    sub a0, a1, a0
+; RV32IM-NEXT:    slli a0, a0, 2
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli32_p28:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 2
-; RV64I-NEXT:    slli a0, a0, 5
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 3
+; RV64I-NEXT:    sub a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    ret
 ;
 ; RV64IM-LABEL: muli32_p28:
 ; RV64IM:       # %bb.0:
-; RV64IM-NEXT:    slli a1, a0, 2
-; RV64IM-NEXT:    slli a0, a0, 5
-; RV64IM-NEXT:    subw a0, a0, a1
+; RV64IM-NEXT:    slli a1, a0, 3
+; RV64IM-NEXT:    subw a1, a1, a0
+; RV64IM-NEXT:    slliw a0, a1, 2
 ; RV64IM-NEXT:    ret
   %1 = mul i32 %a, 28
   ret i32 %1
@@ -533,23 +533,23 @@ define i32 @muli32_p30(i32 %a) nounwind {
 ;
 ; RV32IM-LABEL: muli32_p30:
 ; RV32IM:       # %bb.0:
-; RV32IM-NEXT:    slli a1, a0, 1
-; RV32IM-NEXT:    slli a0, a0, 5
-; RV32IM-NEXT:    sub a0, a0, a1
+; RV32IM-NEXT:    slli a1, a0, 4
+; RV32IM-NEXT:    sub a0, a1, a0
+; RV32IM-NEXT:    slli a0, a0, 1
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli32_p30:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 1
-; RV64I-NEXT:    slli a0, a0, 5
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 4
+; RV64I-NEXT:    sub a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    ret
 ;
 ; RV64IM-LABEL: muli32_p30:
 ; RV64IM:       # %bb.0:
-; RV64IM-NEXT:    slli a1, a0, 1
-; RV64IM-NEXT:    slli a0, a0, 5
-; RV64IM-NEXT:    subw a0, a0, a1
+; RV64IM-NEXT:    slli a1, a0, 4
+; RV64IM-NEXT:    subw a1, a1, a0
+; RV64IM-NEXT:    slliw a0, a1, 1
 ; RV64IM-NEXT:    ret
   %1 = mul i32 %a, 30
   ret i32 %1
@@ -564,22 +564,22 @@ define i32 @muli32_p56(i32 %a) nounwind {
 ; RV32IM-LABEL: muli32_p56:
 ; RV32IM:       # %bb.0:
 ; RV32IM-NEXT:    slli a1, a0, 3
-; RV32IM-NEXT:    slli a0, a0, 6
-; RV32IM-NEXT:    sub a0, a0, a1
+; RV32IM-NEXT:    sub a0, a1, a0
+; RV32IM-NEXT:    slli a0, a0, 3
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli32_p56:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    slli a1, a0, 3
-; RV64I-NEXT:    slli a0, a0, 6
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    sub a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    ret
 ;
 ; RV64IM-LABEL: muli32_p56:
 ; RV64IM:       # %bb.0:
 ; RV64IM-NEXT:    slli a1, a0, 3
-; RV64IM-NEXT:    slli a0, a0, 6
-; RV64IM-NEXT:    subw a0, a0, a1
+; RV64IM-NEXT:    subw a1, a1, a0
+; RV64IM-NEXT:    slliw a0, a1, 3
 ; RV64IM-NEXT:    ret
   %1 = mul i32 %a, 56
   ret i32 %1
@@ -593,23 +593,23 @@ define i32 @muli32_p60(i32 %a) nounwind {
 ;
 ; RV32IM-LABEL: muli32_p60:
 ; RV32IM:       # %bb.0:
-; RV32IM-NEXT:    slli a1, a0, 2
-; RV32IM-NEXT:    slli a0, a0, 6
-; RV32IM-NEXT:    sub a0, a0, a1
+; RV32IM-NEXT:    slli a1, a0, 4
+; RV32IM-NEXT:    sub a0, a1, a0
+; RV32IM-NEXT:    slli a0, a0, 2
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli32_p60:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 2
-; RV64I-NEXT:    slli a0, a0, 6
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 4
+; RV64I-NEXT:    sub a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    ret
 ;
 ; RV64IM-LABEL: muli32_p60:
 ; RV64IM:       # %bb.0:
-; RV64IM-NEXT:    slli a1, a0, 2
-; RV64IM-NEXT:    slli a0, a0, 6
-; RV64IM-NEXT:    subw a0, a0, a1
+; RV64IM-NEXT:    slli a1, a0, 4
+; RV64IM-NEXT:    subw a1, a1, a0
+; RV64IM-NEXT:    slliw a0, a1, 2
 ; RV64IM-NEXT:    ret
   %1 = mul i32 %a, 60
   ret i32 %1
@@ -623,23 +623,23 @@ define i32 @muli32_p62(i32 %a) nounwind {
 ;
 ; RV32IM-LABEL: muli32_p62:
 ; RV32IM:       # %bb.0:
-; RV32IM-NEXT:    slli a1, a0, 1
-; RV32IM-NEXT:    slli a0, a0, 6
-; RV32IM-NEXT:    sub a0, a0, a1
+; RV32IM-NEXT:    slli a1, a0, 5
+; RV32IM-NEXT:    sub a0, a1, a0
+; RV32IM-NEXT:    slli a0, a0, 1
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli32_p62:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 1
-; RV64I-NEXT:    slli a0, a0, 6
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 5
+; RV64I-NEXT:    sub a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    ret
 ;
 ; RV64IM-LABEL: muli32_p62:
 ; RV64IM:       # %bb.0:
-; RV64IM-NEXT:    slli a1, a0, 1
-; RV64IM-NEXT:    slli a0, a0, 6
-; RV64IM-NEXT:    subw a0, a0, a1
+; RV64IM-NEXT:    slli a1, a0, 5
+; RV64IM-NEXT:    subw a1, a1, a0
+; RV64IM-NEXT:    slliw a0, a1, 1
 ; RV64IM-NEXT:    ret
   %1 = mul i32 %a, 62
   ret i32 %1
@@ -937,23 +937,23 @@ define i32 @muli32_p384(i32 %a) nounwind {
 ;
 ; RV32IM-LABEL: muli32_p384:
 ; RV32IM:       # %bb.0:
-; RV32IM-NEXT:    slli a1, a0, 7
-; RV32IM-NEXT:    slli a0, a0, 9
-; RV32IM-NEXT:    sub a0, a0, a1
+; RV32IM-NEXT:    slli a1, a0, 1
+; RV32IM-NEXT:    add a0, a1, a0
+; RV32IM-NEXT:    slli a0, a0, 7
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli32_p384:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 7
-; RV64I-NEXT:    slli a0, a0, 9
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 1
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 7
 ; RV64I-NEXT:    ret
 ;
 ; RV64IM-LABEL: muli32_p384:
 ; RV64IM:       # %bb.0:
-; RV64IM-NEXT:    slli a1, a0, 7
-; RV64IM-NEXT:    slli a0, a0, 9
-; RV64IM-NEXT:    subw a0, a0, a1
+; RV64IM-NEXT:    slli a1, a0, 1
+; RV64IM-NEXT:    add a0, a1, a0
+; RV64IM-NEXT:    slliw a0, a0, 7
 ; RV64IM-NEXT:    ret
   %1 = mul i32 %a, 384
   ret i32 %1
@@ -967,23 +967,23 @@ define i32 @muli32_p12288(i32 %a) nounwind {
 ;
 ; RV32IM-LABEL: muli32_p12288:
 ; RV32IM:       # %bb.0:
-; RV32IM-NEXT:    slli a1, a0, 12
-; RV32IM-NEXT:    slli a0, a0, 14
-; RV32IM-NEXT:    sub a0, a0, a1
+; RV32IM-NEXT:    slli a1, a0, 1
+; RV32IM-NEXT:    add a0, a1, a0
+; RV32IM-NEXT:    slli a0, a0, 12
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli32_p12288:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 12
-; RV64I-NEXT:    slli a0, a0, 14
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 1
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 12
 ; RV64I-NEXT:    ret
 ;
 ; RV64IM-LABEL: muli32_p12288:
 ; RV64IM:       # %bb.0:
-; RV64IM-NEXT:    slli a1, a0, 12
-; RV64IM-NEXT:    slli a0, a0, 14
-; RV64IM-NEXT:    subw a0, a0, a1
+; RV64IM-NEXT:    slli a1, a0, 1
+; RV64IM-NEXT:    add a0, a1, a0
+; RV64IM-NEXT:    slliw a0, a0, 12
 ; RV64IM-NEXT:    ret
   %1 = mul i32 %a, 12288
   ret i32 %1
@@ -1141,10 +1141,14 @@ define i64 @muli64_p4352(i64 %a) nounwind {
 ; RV32IM:       # %bb.0:
 ; RV32IM-NEXT:    li a2, 17
 ; RV32IM-NEXT:    slli a2, a2, 8
-; RV32IM-NEXT:    mul a1, a1, a2
-; RV32IM-NEXT:    mulhu a3, a0, a2
+; RV32IM-NEXT:    mulhu a2, a0, a2
+; RV32IM-NEXT:    slli a3, a1, 4
 ; RV32IM-NEXT:    add a1, a3, a1
-; RV32IM-NEXT:    mul a0, a0, a2
+; RV32IM-NEXT:    slli a1, a1, 8
+; RV32IM-NEXT:    add a1, a2, a1
+; RV32IM-NEXT:    slli a2, a0, 4
+; RV32IM-NEXT:    add a0, a2, a0
+; RV32IM-NEXT:    slli a0, a0, 8
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli64_p4352:
@@ -1183,16 +1187,16 @@ define i64 @muli64_p3840(i64 %a) nounwind {
 ;
 ; RV32IM-LABEL: muli64_p3840:
 ; RV32IM:       # %bb.0:
-; RV32IM-NEXT:    slli a2, a1, 8
-; RV32IM-NEXT:    slli a1, a1, 12
-; RV32IM-NEXT:    sub a1, a1, a2
 ; RV32IM-NEXT:    li a2, 15
 ; RV32IM-NEXT:    slli a2, a2, 8
 ; RV32IM-NEXT:    mulhu a2, a0, a2
+; RV32IM-NEXT:    slli a3, a1, 4
+; RV32IM-NEXT:    sub a3, a3, a1
+; RV32IM-NEXT:    slli a1, a3, 8
 ; RV32IM-NEXT:    add a1, a2, a1
-; RV32IM-NEXT:    slli a2, a0, 8
-; RV32IM-NEXT:    slli a0, a0, 12
-; RV32IM-NEXT:    sub a0, a0, a2
+; RV32IM-NEXT:    slli a2, a0, 4
+; RV32IM-NEXT:    sub a0, a2, a0
+; RV32IM-NEXT:    slli a0, a0, 8
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muli64_p3840:
@@ -1858,15 +1862,15 @@ define i64 @muland_demand(i64 %x) nounwind {
 ; RV32IM-LABEL: muland_demand:
 ; RV32IM:       # %bb.0:
 ; RV32IM-NEXT:    andi a0, a0, -8
-; RV32IM-NEXT:    slli a2, a1, 2
-; RV32IM-NEXT:    slli a1, a1, 4
-; RV32IM-NEXT:    sub a1, a1, a2
 ; RV32IM-NEXT:    li a2, 12
 ; RV32IM-NEXT:    mulhu a2, a0, a2
+; RV32IM-NEXT:    slli a3, a1, 1
+; RV32IM-NEXT:    add a1, a3, a1
+; RV32IM-NEXT:    slli a1, a1, 2
 ; RV32IM-NEXT:    add a1, a2, a1
-; RV32IM-NEXT:    slli a2, a0, 2
-; RV32IM-NEXT:    slli a0, a0, 4
-; RV32IM-NEXT:    sub a0, a0, a2
+; RV32IM-NEXT:    slli a2, a0, 1
+; RV32IM-NEXT:    add a0, a2, a0
+; RV32IM-NEXT:    slli a0, a0, 2
 ; RV32IM-NEXT:    ret
 ;
 ; RV64I-LABEL: muland_demand:
@@ -1880,9 +1884,9 @@ define i64 @muland_demand(i64 %x) nounwind {
 ; RV64IM-LABEL: muland_demand:
 ; RV64IM:       # %bb.0:
 ; RV64IM-NEXT:    andi a0, a0, -8
-; RV64IM-NEXT:    slli a1, a0, 2
-; RV64IM-NEXT:    slli a0, a0, 4
-; RV64IM-NEXT:    sub a0, a0, a1
+; RV64IM-NEXT:    slli a1, a0, 1
+; RV64IM-NEXT:    add a0, a1, a0
+; RV64IM-NEXT:    slli a0, a0, 2
 ; RV64IM-NEXT:    ret
   %and = and i64 %x, 4611686018427387896
   %mul = mul i64 %and, 12
@@ -1916,9 +1920,9 @@ define i64 @mulzext_demand(i32 signext %x) nounwind {
 ;
 ; RV64IM-LABEL: mulzext_demand:
 ; RV64IM:       # %bb.0:
-; RV64IM-NEXT:    slli a1, a0, 32
-; RV64IM-NEXT:    slli a0, a0, 34
-; RV64IM-NEXT:    sub a0, a0, a1
+; RV64IM-NEXT:    slli a1, a0, 1
+; RV64IM-NEXT:    add a0, a1, a0
+; RV64IM-NEXT:    slli a0, a0, 32
 ; RV64IM-NEXT:    ret
   %ext = zext i32 %x to i64
   %mul = mul i64 %ext, 12884901888
diff --git a/llvm/test/CodeGen/RISCV/rv32xtheadba.ll b/llvm/test/CodeGen/RISCV/rv32xtheadba.ll
index 332e49771bedf..e0ba4bf91e71f 100644
--- a/llvm/test/CodeGen/RISCV/rv32xtheadba.ll
+++ b/llvm/test/CodeGen/RISCV/rv32xtheadba.ll
@@ -98,8 +98,8 @@ define i32 @addmul6(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul6:
 ; RV32I:       # %bb.0:
 ; RV32I-NEXT:    slli a2, a0, 1
-; RV32I-NEXT:    slli a0, a0, 3
-; RV32I-NEXT:    sub a0, a0, a2
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 1
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -116,8 +116,9 @@ define i32 @addmul6(i32 %a, i32 %b) {
 define i32 @addmul10(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul10:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 10
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 2
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 1
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -134,9 +135,9 @@ define i32 @addmul10(i32 %a, i32 %b) {
 define i32 @addmul12(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul12:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    slli a2, a0, 2
-; RV32I-NEXT:    slli a0, a0, 4
-; RV32I-NEXT:    sub a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 1
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 2
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -153,8 +154,9 @@ define i32 @addmul12(i32 %a, i32 %b) {
 define i32 @addmul18(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul18:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 18
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 3
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 1
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -171,8 +173,9 @@ define i32 @addmul18(i32 %a, i32 %b) {
 define i32 @addmul20(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul20:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 20
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 2
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 2
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -189,9 +192,9 @@ define i32 @addmul20(i32 %a, i32 %b) {
 define i32 @addmul24(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul24:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    slli a2, a0, 3
-; RV32I-NEXT:    slli a0, a0, 5
-; RV32I-NEXT:    sub a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 1
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 3
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -208,8 +211,9 @@ define i32 @addmul24(i32 %a, i32 %b) {
 define i32 @addmul36(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul36:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 36
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 3
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 2
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -226,8 +230,9 @@ define i32 @addmul36(i32 %a, i32 %b) {
 define i32 @addmul40(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul40:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 40
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 2
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 3
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -244,8 +249,9 @@ define i32 @addmul40(i32 %a, i32 %b) {
 define i32 @addmul72(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul72:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 72
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 3
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 3
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -262,9 +268,9 @@ define i32 @addmul72(i32 %a, i32 %b) {
 define i32 @mul96(i32 %a) {
 ; RV32I-LABEL: mul96:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    slli a1, a0, 5
-; RV32I-NEXT:    slli a0, a0, 7
-; RV32I-NEXT:    sub a0, a0, a1
+; RV32I-NEXT:    slli a1, a0, 1
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    slli a0, a0, 5
 ; RV32I-NEXT:    ret
 ;
 ; RV32XTHEADBA-LABEL: mul96:
@@ -279,8 +285,9 @@ define i32 @mul96(i32 %a) {
 define i32 @mul160(i32 %a) {
 ; RV32I-LABEL: mul160:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a1, 160
-; RV32I-NEXT:    mul a0, a0, a1
+; RV32I-NEXT:    slli a1, a0, 2
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    slli a0, a0, 5
 ; RV32I-NEXT:    ret
 ;
 ; RV32XTHEADBA-LABEL: mul160:
@@ -312,8 +319,9 @@ define i32 @mul200(i32 %a) {
 define i32 @mul288(i32 %a) {
 ; RV32I-LABEL: mul288:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a1, 288
-; RV32I-NEXT:    mul a0, a0, a1
+; RV32I-NEXT:    slli a1, a0, 3
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    slli a0, a0, 5
 ; RV32I-NEXT:    ret
 ;
 ; RV32XTHEADBA-LABEL: mul288:
diff --git a/llvm/test/CodeGen/RISCV/rv32zba.ll b/llvm/test/CodeGen/RISCV/rv32zba.ll
index 89273ef0e50b5..57e408c081003 100644
--- a/llvm/test/CodeGen/RISCV/rv32zba.ll
+++ b/llvm/test/CodeGen/RISCV/rv32zba.ll
@@ -64,8 +64,8 @@ define i32 @addmul6(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul6:
 ; RV32I:       # %bb.0:
 ; RV32I-NEXT:    slli a2, a0, 1
-; RV32I-NEXT:    slli a0, a0, 3
-; RV32I-NEXT:    sub a0, a0, a2
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 1
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -82,8 +82,9 @@ define i32 @addmul6(i32 %a, i32 %b) {
 define i32 @addmul10(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul10:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 10
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 2
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 1
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -100,9 +101,9 @@ define i32 @addmul10(i32 %a, i32 %b) {
 define i32 @addmul12(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul12:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    slli a2, a0, 2
-; RV32I-NEXT:    slli a0, a0, 4
-; RV32I-NEXT:    sub a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 1
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 2
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -119,8 +120,9 @@ define i32 @addmul12(i32 %a, i32 %b) {
 define i32 @addmul18(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul18:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 18
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 3
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 1
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -137,8 +139,9 @@ define i32 @addmul18(i32 %a, i32 %b) {
 define i32 @addmul20(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul20:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 20
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 2
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 2
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -155,9 +158,9 @@ define i32 @addmul20(i32 %a, i32 %b) {
 define i32 @addmul24(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul24:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    slli a2, a0, 3
-; RV32I-NEXT:    slli a0, a0, 5
-; RV32I-NEXT:    sub a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 1
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 3
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -174,8 +177,9 @@ define i32 @addmul24(i32 %a, i32 %b) {
 define i32 @addmul36(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul36:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 36
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 3
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 2
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -192,8 +196,9 @@ define i32 @addmul36(i32 %a, i32 %b) {
 define i32 @addmul40(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul40:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 40
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 2
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 3
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -210,8 +215,9 @@ define i32 @addmul40(i32 %a, i32 %b) {
 define i32 @addmul72(i32 %a, i32 %b) {
 ; RV32I-LABEL: addmul72:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a2, 72
-; RV32I-NEXT:    mul a0, a0, a2
+; RV32I-NEXT:    slli a2, a0, 3
+; RV32I-NEXT:    add a0, a2, a0
+; RV32I-NEXT:    slli a0, a0, 3
 ; RV32I-NEXT:    add a0, a0, a1
 ; RV32I-NEXT:    ret
 ;
@@ -228,9 +234,9 @@ define i32 @addmul72(i32 %a, i32 %b) {
 define i32 @mul96(i32 %a) {
 ; RV32I-LABEL: mul96:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    slli a1, a0, 5
-; RV32I-NEXT:    slli a0, a0, 7
-; RV32I-NEXT:    sub a0, a0, a1
+; RV32I-NEXT:    slli a1, a0, 1
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    slli a0, a0, 5
 ; RV32I-NEXT:    ret
 ;
 ; RV32ZBA-LABEL: mul96:
@@ -245,8 +251,9 @@ define i32 @mul96(i32 %a) {
 define i32 @mul160(i32 %a) {
 ; RV32I-LABEL: mul160:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a1, 160
-; RV32I-NEXT:    mul a0, a0, a1
+; RV32I-NEXT:    slli a1, a0, 2
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    slli a0, a0, 5
 ; RV32I-NEXT:    ret
 ;
 ; RV32ZBA-LABEL: mul160:
@@ -261,8 +268,9 @@ define i32 @mul160(i32 %a) {
 define i32 @mul288(i32 %a) {
 ; RV32I-LABEL: mul288:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a1, 288
-; RV32I-NEXT:    mul a0, a0, a1
+; RV32I-NEXT:    slli a1, a0, 3
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    slli a0, a0, 5
 ; RV32I-NEXT:    ret
 ;
 ; RV32ZBA-LABEL: mul288:
@@ -277,8 +285,9 @@ define i32 @mul288(i32 %a) {
 define i32 @mul258(i32 %a) {
 ; RV32I-LABEL: mul258:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a1, 258
-; RV32I-NEXT:    mul a0, a0, a1
+; RV32I-NEXT:    slli a1, a0, 7
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    slli a0, a0, 1
 ; RV32I-NEXT:    ret
 ;
 ; RV32ZBA-LABEL: mul258:
@@ -293,8 +302,9 @@ define i32 @mul258(i32 %a) {
 define i32 @mul260(i32 %a) {
 ; RV32I-LABEL: mul260:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a1, 260
-; RV32I-NEXT:    mul a0, a0, a1
+; RV32I-NEXT:    slli a1, a0, 6
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    slli a0, a0, 2
 ; RV32I-NEXT:    ret
 ;
 ; RV32ZBA-LABEL: mul260:
@@ -309,8 +319,9 @@ define i32 @mul260(i32 %a) {
 define i32 @mul264(i32 %a) {
 ; RV32I-LABEL: mul264:
 ; RV32I:       # %bb.0:
-; RV32I-NEXT:    li a1, 264
-; RV32I-NEXT:    mul a0, a0, a1
+; RV32I-NEXT:    slli a1, a0, 5
+; RV32I-NEXT:    add a0, a1, a0
+; RV32I-NEXT:    slli a0, a0, 3
 ; RV32I-NEXT:    ret
 ;
 ; RV32ZBA-LABEL: mul264:
@@ -459,8 +470,8 @@ define i32 @mul27(i32 %a) {
 ;
 ; RV32ZBA-LABEL: mul27:
 ; RV32ZBA:       # %bb.0:
-; RV32ZBA-NEXT:    sh1add a0, a0, a0
 ; RV32ZBA-NEXT:    sh3add a0, a0, a0
+; RV32ZBA-NEXT:    sh1add a0, a0, a0
 ; RV32ZBA-NEXT:    ret
   %c = mul i32 %a, 27
   ret i32 %c
@@ -475,8 +486,8 @@ define i32 @mul45(i32 %a) {
 ;
 ; RV32ZBA-LABEL: mul45:
 ; RV32ZBA:       # %bb.0:
-; RV32ZBA-NEXT:    sh2add a0, a0, a0
 ; RV32ZBA-NEXT:    sh3add a0, a0, a0
+; RV32ZBA-NEXT:    sh2add a0, a0, a0
 ; RV32ZBA-NEXT:    ret
   %c = mul i32 %a, 45
   ret i32 %c
diff --git a/llvm/test/CodeGen/RISCV/rv64-legal-i32/rv64zba.ll b/llvm/test/CodeGen/RISCV/rv64-legal-i32/rv64zba.ll
index 7e2e57d317681..1d71eacefd46c 100644
--- a/llvm/test/CodeGen/RISCV/rv64-legal-i32/rv64zba.ll
+++ b/llvm/test/CodeGen/RISCV/rv64-legal-i32/rv64zba.ll
@@ -370,8 +370,8 @@ define i64 @addmul6(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul6:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    slli a2, a0, 1
-; RV64I-NEXT:    slli a0, a0, 3
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -388,8 +388,9 @@ define i64 @addmul6(i64 %a, i64 %b) {
 define i64 @addmul10(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul10:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 10
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -406,9 +407,9 @@ define i64 @addmul10(i64 %a, i64 %b) {
 define i64 @addmul12(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul12:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a2, a0, 2
-; RV64I-NEXT:    slli a0, a0, 4
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 1
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -425,8 +426,9 @@ define i64 @addmul12(i64 %a, i64 %b) {
 define i64 @addmul18(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul18:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 18
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 3
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -443,8 +445,9 @@ define i64 @addmul18(i64 %a, i64 %b) {
 define i64 @addmul20(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul20:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 20
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -461,9 +464,9 @@ define i64 @addmul20(i64 %a, i64 %b) {
 define i64 @addmul24(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul24:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a2, a0, 3
-; RV64I-NEXT:    slli a0, a0, 5
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 1
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -480,8 +483,9 @@ define i64 @addmul24(i64 %a, i64 %b) {
 define i64 @addmul36(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul36:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 36
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 3
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -498,8 +502,9 @@ define i64 @addmul36(i64 %a, i64 %b) {
 define i64 @addmul40(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul40:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 40
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -516,8 +521,9 @@ define i64 @addmul40(i64 %a, i64 %b) {
 define i64 @addmul72(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul72:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 72
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 3
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -534,9 +540,9 @@ define i64 @addmul72(i64 %a, i64 %b) {
 define i64 @mul96(i64 %a) {
 ; RV64I-LABEL: mul96:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 5
-; RV64I-NEXT:    slli a0, a0, 7
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 1
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul96:
@@ -551,8 +557,9 @@ define i64 @mul96(i64 %a) {
 define i64 @mul160(i64 %a) {
 ; RV64I-LABEL: mul160:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 160
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 2
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul160:
@@ -567,8 +574,9 @@ define i64 @mul160(i64 %a) {
 define i64 @mul288(i64 %a) {
 ; RV64I-LABEL: mul288:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 288
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 3
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul288:
@@ -584,15 +592,17 @@ define i64 @zext_mul96(i32 signext %a) {
 ; RV64I-LABEL: zext_mul96:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    slli a0, a0, 32
-; RV64I-NEXT:    srli a1, a0, 27
-; RV64I-NEXT:    srli a0, a0, 25
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    srli a1, a0, 32
+; RV64I-NEXT:    srli a0, a0, 31
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul96:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    slli.uw a0, a0, 5
-; RV64ZBA-NEXT:    sh1add a0, a0, a0
+; RV64ZBA-NEXT:    zext.w a1, a0
+; RV64ZBA-NEXT:    sh1add.uw a0, a0, a1
+; RV64ZBA-NEXT:    slli a0, a0, 5
 ; RV64ZBA-NEXT:    ret
   %b = zext i32 %a to i64
   %c = mul i64 %b, 96
@@ -602,16 +612,18 @@ define i64 @zext_mul96(i32 signext %a) {
 define i64 @zext_mul160(i32 signext %a) {
 ; RV64I-LABEL: zext_mul160:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 5
-; RV64I-NEXT:    slli a1, a1, 37
 ; RV64I-NEXT:    slli a0, a0, 32
-; RV64I-NEXT:    mulhu a0, a0, a1
+; RV64I-NEXT:    srli a1, a0, 32
+; RV64I-NEXT:    srli a0, a0, 30
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul160:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    slli.uw a0, a0, 5
-; RV64ZBA-NEXT:    sh2add a0, a0, a0
+; RV64ZBA-NEXT:    zext.w a1, a0
+; RV64ZBA-NEXT:    sh2add.uw a0, a0, a1
+; RV64ZBA-NEXT:    slli a0, a0, 5
 ; RV64ZBA-NEXT:    ret
   %b = zext i32 %a to i64
   %c = mul i64 %b, 160
@@ -621,16 +633,18 @@ define i64 @zext_mul160(i32 signext %a) {
 define i64 @zext_mul288(i32 signext %a) {
 ; RV64I-LABEL: zext_mul288:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 9
-; RV64I-NEXT:    slli a1, a1, 37
 ; RV64I-NEXT:    slli a0, a0, 32
-; RV64I-NEXT:    mulhu a0, a0, a1
+; RV64I-NEXT:    srli a1, a0, 32
+; RV64I-NEXT:    srli a0, a0, 29
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul288:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    slli.uw a0, a0, 5
-; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    zext.w a1, a0
+; RV64ZBA-NEXT:    sh3add.uw a0, a0, a1
+; RV64ZBA-NEXT:    slli a0, a0, 5
 ; RV64ZBA-NEXT:    ret
   %b = zext i32 %a to i64
   %c = mul i64 %b, 288
@@ -641,9 +655,9 @@ define i64 @zext_mul288(i32 signext %a) {
 define i64 @zext_mul12884901888(i32 signext %a) {
 ; RV64I-LABEL: zext_mul12884901888:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 32
-; RV64I-NEXT:    slli a0, a0, 34
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 1
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 32
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul12884901888:
@@ -660,9 +674,9 @@ define i64 @zext_mul12884901888(i32 signext %a) {
 define i64 @zext_mul21474836480(i32 signext %a) {
 ; RV64I-LABEL: zext_mul21474836480:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 5
-; RV64I-NEXT:    slli a1, a1, 32
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 2
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 32
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul21474836480:
@@ -679,9 +693,9 @@ define i64 @zext_mul21474836480(i32 signext %a) {
 define i64 @zext_mul38654705664(i32 signext %a) {
 ; RV64I-LABEL: zext_mul38654705664:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 9
-; RV64I-NEXT:    slli a1, a1, 32
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 3
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 32
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul38654705664:
@@ -805,8 +819,9 @@ define i64 @adduw_imm(i32 signext %0) nounwind {
 define i64 @mul258(i64 %a) {
 ; RV64I-LABEL: mul258:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 258
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 7
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul258:
@@ -821,8 +836,9 @@ define i64 @mul258(i64 %a) {
 define i64 @mul260(i64 %a) {
 ; RV64I-LABEL: mul260:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 260
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 6
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul260:
@@ -837,8 +853,9 @@ define i64 @mul260(i64 %a) {
 define i64 @mul264(i64 %a) {
 ; RV64I-LABEL: mul264:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 264
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 5
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul264:
@@ -1003,8 +1020,8 @@ define i64 @mul27(i64 %a) {
 ;
 ; RV64ZBA-LABEL: mul27:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    sh1add a0, a0, a0
 ; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    sh1add a0, a0, a0
 ; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 27
   ret i64 %c
@@ -1019,8 +1036,8 @@ define i64 @mul45(i64 %a) {
 ;
 ; RV64ZBA-LABEL: mul45:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    sh2add a0, a0, a0
 ; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    sh2add a0, a0, a0
 ; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 45
   ret i64 %c
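
(Aside, not part of the patch: the zext_mul cases reuse the slli a0, a0, 32 that materializes the zero extend. Shifting that value right by 32 and by 31 yields zext(a) and 2*zext(a), so an add plus a final slli replaces both the old srli/srli/sub form (96) and the mulhu-by-(C << 37) form (160, 288). Sketch of the zext_mul96 sequence in C, illustrative only:)

  #include <assert.h>
  #include <stdint.h>
  int main(void) {
    uint32_t a = 0xdeadbeefu;                     /* arbitrary input */
    uint64_t t  = (uint64_t)a << 32;
    uint64_t z  = t >> 32;                        /* srli 32: zext(a)   */
    uint64_t z2 = t >> 31;                        /* srli 31: 2*zext(a) */
    assert(((z2 + z) << 5) == (uint64_t)a * 96);  /* 3*zext(a) << 5     */
    return 0;
  }
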
diff --git a/llvm/test/CodeGen/RISCV/rv64xtheadba.ll b/llvm/test/CodeGen/RISCV/rv64xtheadba.ll
index 2d44ffbf63749..b19a65c7683ee 100644
--- a/llvm/test/CodeGen/RISCV/rv64xtheadba.ll
+++ b/llvm/test/CodeGen/RISCV/rv64xtheadba.ll
@@ -94,8 +94,8 @@ define i64 @addmul6(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul6:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    slli a2, a0, 1
-; RV64I-NEXT:    slli a0, a0, 3
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -112,8 +112,9 @@ define i64 @addmul6(i64 %a, i64 %b) {
 define i64 @addmul10(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul10:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 10
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -130,9 +131,9 @@ define i64 @addmul10(i64 %a, i64 %b) {
 define i64 @addmul12(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul12:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a2, a0, 2
-; RV64I-NEXT:    slli a0, a0, 4
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 1
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -149,8 +150,9 @@ define i64 @addmul12(i64 %a, i64 %b) {
 define i64 @addmul18(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul18:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 18
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 3
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -167,8 +169,9 @@ define i64 @addmul18(i64 %a, i64 %b) {
 define i64 @addmul20(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul20:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 20
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -185,9 +188,9 @@ define i64 @addmul20(i64 %a, i64 %b) {
 define i64 @addmul24(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul24:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a2, a0, 3
-; RV64I-NEXT:    slli a0, a0, 5
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 1
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -204,8 +207,9 @@ define i64 @addmul24(i64 %a, i64 %b) {
 define i64 @addmul36(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul36:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 36
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 3
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -222,8 +226,9 @@ define i64 @addmul36(i64 %a, i64 %b) {
 define i64 @addmul40(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul40:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 40
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -240,8 +245,9 @@ define i64 @addmul40(i64 %a, i64 %b) {
 define i64 @addmul72(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul72:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 72
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 3
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -392,8 +398,8 @@ define i64 @mul27(i64 %a) {
 ;
 ; RV64XTHEADBA-LABEL: mul27:
 ; RV64XTHEADBA:       # %bb.0:
-; RV64XTHEADBA-NEXT:    th.addsl a0, a0, a0, 1
 ; RV64XTHEADBA-NEXT:    th.addsl a0, a0, a0, 3
+; RV64XTHEADBA-NEXT:    th.addsl a0, a0, a0, 1
 ; RV64XTHEADBA-NEXT:    ret
   %c = mul i64 %a, 27
   ret i64 %c
@@ -408,8 +414,8 @@ define i64 @mul45(i64 %a) {
 ;
 ; RV64XTHEADBA-LABEL: mul45:
 ; RV64XTHEADBA:       # %bb.0:
-; RV64XTHEADBA-NEXT:    th.addsl a0, a0, a0, 2
 ; RV64XTHEADBA-NEXT:    th.addsl a0, a0, a0, 3
+; RV64XTHEADBA-NEXT:    th.addsl a0, a0, a0, 2
 ; RV64XTHEADBA-NEXT:    ret
   %c = mul i64 %a, 45
   ret i64 %c
@@ -435,9 +441,9 @@ define i64 @mul81(i64 %a) {
 define i64 @mul96(i64 %a) {
 ; RV64I-LABEL: mul96:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 5
-; RV64I-NEXT:    slli a0, a0, 7
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 1
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64XTHEADBA-LABEL: mul96:
@@ -459,8 +465,8 @@ define i64 @mul137(i64 %a) {
 ; RV64XTHEADBA-LABEL: mul137:
 ; RV64XTHEADBA:       # %bb.0:
 ; RV64XTHEADBA-NEXT:    th.addsl a1, a0, a0, 3
-; RV64XTHEADBA-NEXT:    slli a0, a0, 7
-; RV64XTHEADBA-NEXT:    add a0, a0, a1
+; RV64XTHEADBA-NEXT:    th.addsl a1, a1, a0, 3
+; RV64XTHEADBA-NEXT:    th.addsl a0, a0, a1, 3
 ; RV64XTHEADBA-NEXT:    ret
   %c = mul i64 %a, 137
   ret i64 %c
@@ -469,8 +475,9 @@ define i64 @mul137(i64 %a) {
 define i64 @mul160(i64 %a) {
 ; RV64I-LABEL: mul160:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 160
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 2
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64XTHEADBA-LABEL: mul160:
@@ -502,8 +509,9 @@ define i64 @mul200(i64 %a) {
 define i64 @mul288(i64 %a) {
 ; RV64I-LABEL: mul288:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 288
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 3
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64XTHEADBA-LABEL: mul288:
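
(Aside, not part of the patch: th.addsl rd, rs1, rs2, n computes rs1 + (rs2 << n). mul137 is the nice one here: 137 = (17 << 3) + 1, found as 9x, then 17x, then x + (17x << 3), three add-shifts and no plain slli. Sketch in C, illustrative only:)

  #include <assert.h>
  #include <stdint.h>
  /* th.addsl rd, rs1, rs2, n  ==  rs1 + (rs2 << n) */
  static uint64_t addsl(uint64_t rs1, uint64_t rs2, unsigned n) {
    return rs1 + (rs2 << n);
  }
  int main(void) {
    uint64_t x = 12345;
    uint64_t t = addsl(x, x, 3);        /* 9x       */
    t = addsl(t, x, 3);                 /* 9x + 8x  */
    assert(addsl(x, t, 3) == 137 * x);  /* x + 136x */
    return 0;
  }
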
diff --git a/llvm/test/CodeGen/RISCV/rv64zba.ll b/llvm/test/CodeGen/RISCV/rv64zba.ll
index 7cb2452e1a148..70cbf16aef209 100644
--- a/llvm/test/CodeGen/RISCV/rv64zba.ll
+++ b/llvm/test/CodeGen/RISCV/rv64zba.ll
@@ -377,8 +377,8 @@ define i64 @addmul6(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul6:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    slli a2, a0, 1
-; RV64I-NEXT:    slli a0, a0, 3
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -396,8 +396,8 @@ define i64 @disjointormul6(i64 %a, i64 %b) {
 ; RV64I-LABEL: disjointormul6:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    slli a2, a0, 1
-; RV64I-NEXT:    slli a0, a0, 3
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    or a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -414,8 +414,9 @@ define i64 @disjointormul6(i64 %a, i64 %b) {
 define i64 @addmul10(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul10:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 10
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -432,9 +433,9 @@ define i64 @addmul10(i64 %a, i64 %b) {
 define i64 @addmul12(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul12:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a2, a0, 2
-; RV64I-NEXT:    slli a0, a0, 4
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 1
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -451,8 +452,9 @@ define i64 @addmul12(i64 %a, i64 %b) {
 define i64 @addmul18(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul18:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 18
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 3
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -469,8 +471,9 @@ define i64 @addmul18(i64 %a, i64 %b) {
 define i64 @addmul20(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul20:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 20
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -485,12 +488,19 @@ define i64 @addmul20(i64 %a, i64 %b) {
 }
 
 define i64 @addmul22(i64 %a, i64 %b) {
-; CHECK-LABEL: addmul22:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a2, 22
-; CHECK-NEXT:    mul a0, a0, a2
-; CHECK-NEXT:    add a0, a0, a1
-; CHECK-NEXT:    ret
+; RV64I-LABEL: addmul22:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    li a2, 22
+; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    ret
+;
+; RV64ZBA-LABEL: addmul22:
+; RV64ZBA:       # %bb.0:
+; RV64ZBA-NEXT:    sh2add a2, a0, a0
+; RV64ZBA-NEXT:    sh1add a0, a2, a0
+; RV64ZBA-NEXT:    sh1add a0, a0, a1
+; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 22
   %d = add i64 %c, %b
   ret i64 %d
@@ -499,9 +509,9 @@ define i64 @addmul22(i64 %a, i64 %b) {
 define i64 @addmul24(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul24:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a2, a0, 3
-; RV64I-NEXT:    slli a0, a0, 5
-; RV64I-NEXT:    sub a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 1
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -518,8 +528,9 @@ define i64 @addmul24(i64 %a, i64 %b) {
 define i64 @addmul36(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul36:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 36
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 3
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -536,8 +547,9 @@ define i64 @addmul36(i64 %a, i64 %b) {
 define i64 @addmul40(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul40:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 40
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 2
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -554,8 +566,9 @@ define i64 @addmul40(i64 %a, i64 %b) {
 define i64 @addmul72(i64 %a, i64 %b) {
 ; RV64I-LABEL: addmul72:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a2, 72
-; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    slli a2, a0, 3
+; RV64I-NEXT:    add a0, a2, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
@@ -570,36 +583,58 @@ define i64 @addmul72(i64 %a, i64 %b) {
 }
 
 define i64 @addmul162(i64 %a, i64 %b) {
-; CHECK-LABEL: addmul162:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a2, 162
-; CHECK-NEXT:    mul a0, a0, a2
-; CHECK-NEXT:    add a0, a0, a1
-; CHECK-NEXT:    ret
+; RV64I-LABEL: addmul162:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    li a2, 162
+; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    ret
+;
+; RV64ZBA-LABEL: addmul162:
+; RV64ZBA:       # %bb.0:
+; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    sh1add a0, a0, a1
+; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 162
   %d = add i64 %c, %b
   ret i64 %d
 }
 
 define i64 @addmul180(i64 %a, i64 %b) {
-; CHECK-LABEL: addmul180:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a2, 180
-; CHECK-NEXT:    mul a0, a0, a2
-; CHECK-NEXT:    add a0, a0, a1
-; CHECK-NEXT:    ret
+; RV64I-LABEL: addmul180:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    li a2, 180
+; RV64I-NEXT:    mul a0, a0, a2
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    ret
+;
+; RV64ZBA-LABEL: addmul180:
+; RV64ZBA:       # %bb.0:
+; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    sh2add a0, a0, a0
+; RV64ZBA-NEXT:    sh2add a0, a0, a1
+; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 180
   %d = add i64 %c, %b
   ret i64 %d
 }
 
 define i64 @add255mul180(i64 %a) {
-; CHECK-LABEL: add255mul180:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a1, 180
-; CHECK-NEXT:    mul a0, a0, a1
-; CHECK-NEXT:    addi a0, a0, 255
-; CHECK-NEXT:    ret
+; RV64I-LABEL: add255mul180:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    li a1, 180
+; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    addi a0, a0, 255
+; RV64I-NEXT:    ret
+;
+; RV64ZBA-LABEL: add255mul180:
+; RV64ZBA:       # %bb.0:
+; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    sh2add a0, a0, a0
+; RV64ZBA-NEXT:    slli a0, a0, 2
+; RV64ZBA-NEXT:    addi a0, a0, 255
+; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 180
   %d = add i64 %c, 255
   ret i64 %d
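
(Aside, not part of the patch: addmul162 and addmul180 show the accumulator folding. With shNadd rd, rs1, rs2 = (rs1 << N) + rs2, 162 = 2*9*9 lowers as 9a, then 81a, and the final sh1add both doubles and adds b; 180 = 4*5*9 follows the same shape. Sketch in C, illustrative only:)

  #include <assert.h>
  #include <stdint.h>
  /* shNadd rd, rs1, rs2  ==  (rs1 << N) + rs2 */
  static uint64_t shadd(uint64_t rs1, unsigned n, uint64_t rs2) {
    return (rs1 << n) + rs2;
  }
  int main(void) {
    uint64_t a = 7, b = 1000;
    uint64_t t = shadd(a, 3, a);            /* 9a        */
    t = shadd(t, 3, t);                     /* 81a       */
    assert(shadd(t, 1, b) == 162 * a + b);  /* addmul162 */
    return 0;
  }
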
@@ -633,9 +668,9 @@ define i64 @addmul4230(i64 %a, i64 %b) {
 define i64 @mul96(i64 %a) {
 ; RV64I-LABEL: mul96:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 5
-; RV64I-NEXT:    slli a0, a0, 7
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 1
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul96:
@@ -674,8 +709,8 @@ define i64 @mul123(i64 %a) {
 ; RV64ZBA-LABEL: mul123:
 ; RV64ZBA:       # %bb.0:
 ; RV64ZBA-NEXT:    sh2add a1, a0, a0
-; RV64ZBA-NEXT:    slli a0, a0, 7
-; RV64ZBA-NEXT:    sub a0, a0, a1
+; RV64ZBA-NEXT:    sh3add a0, a1, a0
+; RV64ZBA-NEXT:    sh1add a0, a0, a0
 ; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 123
   ret i64 %c
@@ -690,9 +725,9 @@ define i64 @mul125(i64 %a) {
 ;
 ; RV64ZBA-LABEL: mul125:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    sh1add a1, a0, a0
-; RV64ZBA-NEXT:    slli a0, a0, 7
-; RV64ZBA-NEXT:    sub a0, a0, a1
+; RV64ZBA-NEXT:    sh2add a0, a0, a0
+; RV64ZBA-NEXT:    sh2add a0, a0, a0
+; RV64ZBA-NEXT:    sh2add a0, a0, a0
 ; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 125
   ret i64 %c
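
(Aside, not part of the patch: mul125 is the cleanest odd case. 125 = 5^3, so three identical sh2add instructions do it in-place, where the old shift-and-subtract form needed a scratch register. Sketch in C, illustrative only:)

  #include <assert.h>
  #include <stdint.h>
  int main(void) {
    uint64_t x = 9999, t = x;
    t = (t << 2) + t;   /* sh2add: 5x */
    t = (t << 2) + t;   /* 25x        */
    t = (t << 2) + t;   /* 125x       */
    assert(t == 125 * x);
    return 0;
  }
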
@@ -707,9 +742,9 @@ define i64 @mul131(i64 %a) {
 ;
 ; RV64ZBA-LABEL: mul131:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    sh1add a1, a0, a0
-; RV64ZBA-NEXT:    slli a0, a0, 7
-; RV64ZBA-NEXT:    add a0, a0, a1
+; RV64ZBA-NEXT:    slli a1, a0, 6
+; RV64ZBA-NEXT:    add a1, a1, a0
+; RV64ZBA-NEXT:    sh1add a0, a1, a0
 ; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 131
   ret i64 %c
@@ -724,9 +759,9 @@ define i64 @mul133(i64 %a) {
 ;
 ; RV64ZBA-LABEL: mul133:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    sh2add a1, a0, a0
-; RV64ZBA-NEXT:    slli a0, a0, 7
-; RV64ZBA-NEXT:    add a0, a0, a1
+; RV64ZBA-NEXT:    slli a1, a0, 5
+; RV64ZBA-NEXT:    add a1, a1, a0
+; RV64ZBA-NEXT:    sh2add a0, a1, a0
 ; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 133
   ret i64 %c
@@ -742,8 +777,8 @@ define i64 @mul137(i64 %a) {
 ; RV64ZBA-LABEL: mul137:
 ; RV64ZBA:       # %bb.0:
 ; RV64ZBA-NEXT:    sh3add a1, a0, a0
-; RV64ZBA-NEXT:    slli a0, a0, 7
-; RV64ZBA-NEXT:    add a0, a0, a1
+; RV64ZBA-NEXT:    sh3add a1, a0, a1
+; RV64ZBA-NEXT:    sh3add a0, a1, a0
 ; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 137
   ret i64 %c
@@ -752,8 +787,9 @@ define i64 @mul137(i64 %a) {
 define i64 @mul160(i64 %a) {
 ; RV64I-LABEL: mul160:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 160
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 2
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul160:
@@ -768,8 +804,9 @@ define i64 @mul160(i64 %a) {
 define i64 @mul288(i64 %a) {
 ; RV64I-LABEL: mul288:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 288
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 3
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul288:
@@ -784,10 +821,11 @@ define i64 @mul288(i64 %a) {
 define i64 @zext_mul68(i32 signext %a) {
 ; RV64I-LABEL: zext_mul68:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 17
-; RV64I-NEXT:    slli a1, a1, 34
 ; RV64I-NEXT:    slli a0, a0, 32
-; RV64I-NEXT:    mulhu a0, a0, a1
+; RV64I-NEXT:    srli a1, a0, 32
+; RV64I-NEXT:    srli a0, a0, 28
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul68:
@@ -804,15 +842,17 @@ define i64 @zext_mul96(i32 signext %a) {
 ; RV64I-LABEL: zext_mul96:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    slli a0, a0, 32
-; RV64I-NEXT:    srli a1, a0, 27
-; RV64I-NEXT:    srli a0, a0, 25
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    srli a1, a0, 32
+; RV64I-NEXT:    srli a0, a0, 31
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul96:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    slli.uw a0, a0, 5
-; RV64ZBA-NEXT:    sh1add a0, a0, a0
+; RV64ZBA-NEXT:    zext.w a1, a0
+; RV64ZBA-NEXT:    sh1add.uw a0, a0, a1
+; RV64ZBA-NEXT:    slli a0, a0, 5
 ; RV64ZBA-NEXT:    ret
   %b = zext i32 %a to i64
   %c = mul i64 %b, 96
@@ -822,16 +862,18 @@ define i64 @zext_mul96(i32 signext %a) {
 define i64 @zext_mul160(i32 signext %a) {
 ; RV64I-LABEL: zext_mul160:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 5
-; RV64I-NEXT:    slli a1, a1, 37
 ; RV64I-NEXT:    slli a0, a0, 32
-; RV64I-NEXT:    mulhu a0, a0, a1
+; RV64I-NEXT:    srli a1, a0, 32
+; RV64I-NEXT:    srli a0, a0, 30
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul160:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    slli.uw a0, a0, 5
-; RV64ZBA-NEXT:    sh2add a0, a0, a0
+; RV64ZBA-NEXT:    zext.w a1, a0
+; RV64ZBA-NEXT:    sh2add.uw a0, a0, a1
+; RV64ZBA-NEXT:    slli a0, a0, 5
 ; RV64ZBA-NEXT:    ret
   %b = zext i32 %a to i64
   %c = mul i64 %b, 160
@@ -841,16 +883,18 @@ define i64 @zext_mul160(i32 signext %a) {
 define i64 @zext_mul288(i32 signext %a) {
 ; RV64I-LABEL: zext_mul288:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 9
-; RV64I-NEXT:    slli a1, a1, 37
 ; RV64I-NEXT:    slli a0, a0, 32
-; RV64I-NEXT:    mulhu a0, a0, a1
+; RV64I-NEXT:    srli a1, a0, 32
+; RV64I-NEXT:    srli a0, a0, 29
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    slli a0, a0, 5
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul288:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    slli.uw a0, a0, 5
-; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    zext.w a1, a0
+; RV64ZBA-NEXT:    sh3add.uw a0, a0, a1
+; RV64ZBA-NEXT:    slli a0, a0, 5
 ; RV64ZBA-NEXT:    ret
   %b = zext i32 %a to i64
   %c = mul i64 %b, 288
@@ -861,9 +905,9 @@ define i64 @zext_mul288(i32 signext %a) {
 define i64 @zext_mul12884901888(i32 signext %a) {
 ; RV64I-LABEL: zext_mul12884901888:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 32
-; RV64I-NEXT:    slli a0, a0, 34
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 1
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 32
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul12884901888:
@@ -880,9 +924,9 @@ define i64 @zext_mul12884901888(i32 signext %a) {
 define i64 @zext_mul21474836480(i32 signext %a) {
 ; RV64I-LABEL: zext_mul21474836480:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 5
-; RV64I-NEXT:    slli a1, a1, 32
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 2
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 32
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul21474836480:
@@ -899,9 +943,9 @@ define i64 @zext_mul21474836480(i32 signext %a) {
 define i64 @zext_mul38654705664(i32 signext %a) {
 ; RV64I-LABEL: zext_mul38654705664:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 9
-; RV64I-NEXT:    slli a1, a1, 32
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 3
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 32
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: zext_mul38654705664:
@@ -1025,8 +1069,9 @@ define i64 @adduw_imm(i32 signext %0) nounwind {
 define i64 @mul258(i64 %a) {
 ; RV64I-LABEL: mul258:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 258
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 7
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 1
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul258:
@@ -1041,8 +1086,9 @@ define i64 @mul258(i64 %a) {
 define i64 @mul260(i64 %a) {
 ; RV64I-LABEL: mul260:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 260
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 6
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 2
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul260:
@@ -1057,8 +1103,9 @@ define i64 @mul260(i64 %a) {
 define i64 @mul264(i64 %a) {
 ; RV64I-LABEL: mul264:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 264
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 5
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mul264:
@@ -1223,8 +1270,8 @@ define i64 @mul27(i64 %a) {
 ;
 ; RV64ZBA-LABEL: mul27:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    sh1add a0, a0, a0
 ; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    sh1add a0, a0, a0
 ; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 27
   ret i64 %c
@@ -1239,8 +1286,8 @@ define i64 @mul45(i64 %a) {
 ;
 ; RV64ZBA-LABEL: mul45:
 ; RV64ZBA:       # %bb.0:
-; RV64ZBA-NEXT:    sh2add a0, a0, a0
 ; RV64ZBA-NEXT:    sh3add a0, a0, a0
+; RV64ZBA-NEXT:    sh2add a0, a0, a0
 ; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, 45
   ret i64 %c
@@ -1316,9 +1363,9 @@ define i64 @mul4104(i64 %a) {
 define signext i32 @mulw192(i32 signext %a) {
 ; RV64I-LABEL: mulw192:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    slli a1, a0, 6
-; RV64I-NEXT:    slli a0, a0, 8
-; RV64I-NEXT:    subw a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 1
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slliw a0, a0, 6
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mulw192:
@@ -1333,8 +1380,9 @@ define signext i32 @mulw192(i32 signext %a) {
 define signext i32 @mulw320(i32 signext %a) {
 ; RV64I-LABEL: mulw320:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 320
-; RV64I-NEXT:    mulw a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 2
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slliw a0, a0, 6
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mulw320:
@@ -1349,8 +1397,9 @@ define signext i32 @mulw320(i32 signext %a) {
 define signext i32 @mulw576(i32 signext %a) {
 ; RV64I-LABEL: mulw576:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    li a1, 576
-; RV64I-NEXT:    mulw a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 3
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    slliw a0, a0, 6
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: mulw576:
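
(Aside, not part of the patch: in the mulw cases only the last shift needs the W form. The low 32 bits of the slli+add prefix do not depend on the upper bits of the inputs, so a single slliw at the end re-establishes the sign-extended i32 result. Modeling the low 32 bits of mulw320 in C, illustrative only:)

  #include <assert.h>
  #include <stdint.h>
  int main(void) {
    int32_t x = -12345;
    uint32_t ux = (uint32_t)x;      /* only bits 31:0 feed slliw */
    uint32_t t = (ux << 2) + ux;    /* 5x mod 2^32               */
    int32_t r = (int32_t)(t << 6);  /* slliw: sext32(320x)       */
    assert(r == x * 320);           /* mulw320                   */
    return 0;
  }
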
@@ -2626,16 +2675,18 @@ define i64 @regression(i32 signext %x, i32 signext %y) {
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    slli a0, a0, 32
-; RV64I-NEXT:    srli a1, a0, 29
-; RV64I-NEXT:    srli a0, a0, 27
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    srli a1, a0, 32
+; RV64I-NEXT:    srli a0, a0, 31
+; RV64I-NEXT:    add a0, a0, a1
+; RV64I-NEXT:    slli a0, a0, 3
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: regression:
 ; RV64ZBA:       # %bb.0:
 ; RV64ZBA-NEXT:    subw a0, a0, a1
-; RV64ZBA-NEXT:    slli.uw a0, a0, 3
-; RV64ZBA-NEXT:    sh1add a0, a0, a0
+; RV64ZBA-NEXT:    zext.w a1, a0
+; RV64ZBA-NEXT:    sh1add.uw a0, a0, a1
+; RV64ZBA-NEXT:    slli a0, a0, 3
 ; RV64ZBA-NEXT:    ret
   %sub = sub i32 %x, %y
   %ext = zext i32 %sub to i64
@@ -2707,11 +2758,18 @@ define i64 @mul_neg5(i64 %a) {
 }
 
 define i64 @mul_neg6(i64 %a) {
-; CHECK-LABEL: mul_neg6:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a1, -6
-; CHECK-NEXT:    mul a0, a0, a1
-; CHECK-NEXT:    ret
+; RV64I-LABEL: mul_neg6:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    li a1, -6
+; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    ret
+;
+; RV64ZBA-LABEL: mul_neg6:
+; RV64ZBA:       # %bb.0:
+; RV64ZBA-NEXT:    sh3add a1, a0, a0
+; RV64ZBA-NEXT:    sub a1, a0, a1
+; RV64ZBA-NEXT:    sh1add a0, a0, a1
+; RV64ZBA-NEXT:    ret
   %c = mul i64 %a, -6
   ret i64 %c
 }
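
(Aside, not part of the patch: negative constants fall out of the same search. For mul_neg6, sh3add builds 9x, sub flips it to x - 9x = -8x, and the final sh1add contributes the remaining 2x. Sketch in C, illustrative only:)

  #include <assert.h>
  #include <stdint.h>
  int main(void) {
    int64_t x = 42;
    int64_t t = (x << 3) + x;        /* sh3add: 9x  */
    t = x - t;                       /* sub:   -8x  */
    assert((x << 1) + t == -6 * x);  /* sh1add: -6x */
    return 0;
  }
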
@@ -2742,8 +2800,8 @@ define i64 @bext_mul12(i32 %1, i32 %2) {
 ; RV64I-NEXT:    srlw a0, a0, a1
 ; RV64I-NEXT:    andi a0, a0, 1
 ; RV64I-NEXT:    slli a1, a0, 2
-; RV64I-NEXT:    slli a0, a0, 4
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    slli a0, a0, 3
+; RV64I-NEXT:    or a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBANOZBB-LABEL: bext_mul12:
@@ -2789,23 +2847,23 @@ define i64 @bext_mul45(i32 %1, i32 %2) {
 ; RV64ZBANOZBB:       # %bb.0: # %entry
 ; RV64ZBANOZBB-NEXT:    srlw a0, a0, a1
 ; RV64ZBANOZBB-NEXT:    andi a0, a0, 1
-; RV64ZBANOZBB-NEXT:    sh2add a0, a0, a0
 ; RV64ZBANOZBB-NEXT:    sh3add a0, a0, a0
+; RV64ZBANOZBB-NEXT:    sh2add a0, a0, a0
 ; RV64ZBANOZBB-NEXT:    ret
 ;
 ; RV64ZBAZBBNOZBS-LABEL: bext_mul45:
 ; RV64ZBAZBBNOZBS:       # %bb.0: # %entry
 ; RV64ZBAZBBNOZBS-NEXT:    srlw a0, a0, a1
 ; RV64ZBAZBBNOZBS-NEXT:    andi a0, a0, 1
-; RV64ZBAZBBNOZBS-NEXT:    sh2add a0, a0, a0
 ; RV64ZBAZBBNOZBS-NEXT:    sh3add a0, a0, a0
+; RV64ZBAZBBNOZBS-NEXT:    sh2add a0, a0, a0
 ; RV64ZBAZBBNOZBS-NEXT:    ret
 ;
 ; RV64ZBAZBBZBS-LABEL: bext_mul45:
 ; RV64ZBAZBBZBS:       # %bb.0: # %entry
 ; RV64ZBAZBBZBS-NEXT:    bext a0, a0, a1
-; RV64ZBAZBBZBS-NEXT:    sh2add a0, a0, a0
 ; RV64ZBAZBBZBS-NEXT:    sh3add a0, a0, a0
+; RV64ZBAZBBZBS-NEXT:    sh2add a0, a0, a0
 ; RV64ZBAZBBZBS-NEXT:    ret
 entry:
   %3 = lshr i32 %1, %2
@@ -2820,8 +2878,9 @@ define i64 @bext_mul132(i32 %1, i32 %2) {
 ; RV64I:       # %bb.0: # %entry
 ; RV64I-NEXT:    srlw a0, a0, a1
 ; RV64I-NEXT:    andi a0, a0, 1
-; RV64I-NEXT:    li a1, 132
-; RV64I-NEXT:    mul a0, a0, a1
+; RV64I-NEXT:    slli a1, a0, 2
+; RV64I-NEXT:    slli a0, a0, 7
+; RV64I-NEXT:    or a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBANOZBB-LABEL: bext_mul132:
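
(Aside, not part of the patch: the bext_mul12 and bext_mul132 changes lean on the operand being a single bit after andi a0, a0, 1, so the two shifted copies have disjoint bit positions and or substitutes for add or sub. Exhaustive check over the two possible inputs in C:)

  #include <assert.h>
  #include <stdint.h>
  int main(void) {
    for (uint64_t x = 0; x <= 1; ++x) {          /* andi a0, a0, 1 result */
      assert(((x << 3) | (x << 2)) == 12 * x);   /* bext_mul12            */
      assert(((x << 7) | (x << 2)) == 132 * x);  /* bext_mul132           */
    }
    return 0;
  }
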
@@ -2857,19 +2916,23 @@ entry:
 define ptr @gep_lshr_i32(ptr %0, i64 %1) {
 ; RV64I-LABEL: gep_lshr_i32:
 ; RV64I:       # %bb.0: # %entry
-; RV64I-NEXT:    srli a1, a1, 2
-; RV64I-NEXT:    li a2, 5
-; RV64I-NEXT:    slli a2, a2, 36
-; RV64I-NEXT:    slli a1, a1, 32
-; RV64I-NEXT:    mulhu a1, a1, a2
+; RV64I-NEXT:    srli a2, a1, 2
+; RV64I-NEXT:    slli a1, a1, 30
+; RV64I-NEXT:    srli a1, a1, 32
+; RV64I-NEXT:    slli a2, a2, 32
+; RV64I-NEXT:    srli a2, a2, 30
+; RV64I-NEXT:    add a1, a2, a1
+; RV64I-NEXT:    slli a1, a1, 4
 ; RV64I-NEXT:    add a0, a0, a1
 ; RV64I-NEXT:    ret
 ;
 ; RV64ZBA-LABEL: gep_lshr_i32:
 ; RV64ZBA:       # %bb.0: # %entry
-; RV64ZBA-NEXT:    srli a1, a1, 2
-; RV64ZBA-NEXT:    slli.uw a1, a1, 4
-; RV64ZBA-NEXT:    sh2add a1, a1, a1
+; RV64ZBA-NEXT:    srli a2, a1, 2
+; RV64ZBA-NEXT:    slli a1, a1, 30
+; RV64ZBA-NEXT:    srli a1, a1, 32
+; RV64ZBA-NEXT:    sh2add.uw a1, a2, a1
+; RV64ZBA-NEXT:    slli a1, a1, 4
 ; RV64ZBA-NEXT:    add a0, a0, a1
 ; RV64ZBA-NEXT:    ret
 entry:
diff --git a/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll b/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll
index ee9f96a45d23e..8b1a7fad8c054 100644
--- a/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/calling-conv-fastcc.ll
@@ -70,12 +70,13 @@ define fastcc <vscale x 64 x i32> @ret_split_nxv64i32(ptr %x) {
 ; CHECK-LABEL: ret_split_nxv64i32:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    csrr a2, vlenb
-; CHECK-NEXT:    slli a3, a2, 3
-; CHECK-NEXT:    slli a4, a2, 5
-; CHECK-NEXT:    sub a4, a4, a3
+; CHECK-NEXT:    slli a3, a2, 1
+; CHECK-NEXT:    add a3, a3, a2
+; CHECK-NEXT:    slli a3, a3, 3
+; CHECK-NEXT:    add a4, a1, a3
+; CHECK-NEXT:    vl8re32.v v8, (a4)
+; CHECK-NEXT:    slli a4, a2, 3
 ; CHECK-NEXT:    add a5, a1, a4
-; CHECK-NEXT:    vl8re32.v v8, (a5)
-; CHECK-NEXT:    add a5, a1, a3
 ; CHECK-NEXT:    slli a2, a2, 4
 ; CHECK-NEXT:    vl8re32.v v16, (a1)
 ; CHECK-NEXT:    add a1, a1, a2
@@ -84,9 +85,9 @@ define fastcc <vscale x 64 x i32> @ret_split_nxv64i32(ptr %x) {
 ; CHECK-NEXT:    vs8r.v v16, (a0)
 ; CHECK-NEXT:    add a2, a0, a2
 ; CHECK-NEXT:    vs8r.v v24, (a2)
-; CHECK-NEXT:    add a3, a0, a3
-; CHECK-NEXT:    vs8r.v v0, (a3)
-; CHECK-NEXT:    add a0, a0, a4
+; CHECK-NEXT:    add a4, a0, a4
+; CHECK-NEXT:    vs8r.v v0, (a4)
+; CHECK-NEXT:    add a0, a0, a3
 ; CHECK-NEXT:    vs8r.v v8, (a0)
 ; CHECK-NEXT:    ret
   %v = load <vscale x 64 x i32>, ptr %x
@@ -104,74 +105,77 @@ define fastcc <vscale x 128 x i32> @ret_split_nxv128i32(ptr %x) {
 ; CHECK-NEXT:    sub sp, sp, a2
 ; CHECK-NEXT:    .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x20, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 32 * vlenb
 ; CHECK-NEXT:    csrr a2, vlenb
-; CHECK-NEXT:    slli a3, a2, 3
-; CHECK-NEXT:    slli a4, a2, 5
-; CHECK-NEXT:    sub a5, a4, a3
+; CHECK-NEXT:    slli a3, a2, 1
+; CHECK-NEXT:    add a3, a3, a2
+; CHECK-NEXT:    slli a4, a3, 3
+; CHECK-NEXT:    add a5, a1, a4
+; CHECK-NEXT:    vl8re32.v v8, (a5)
+; CHECK-NEXT:    csrr a5, vlenb
+; CHECK-NEXT:    li a6, 24
+; CHECK-NEXT:    mul a5, a5, a6
+; CHECK-NEXT:    add a5, sp, a5
+; CHECK-NEXT:    addi a5, a5, 16
+; CHECK-NEXT:    vs8r.v v8, (a5) # Unknown-size Folded Spill
+; CHECK-NEXT:    slli a5, a2, 2
+; CHECK-NEXT:    add a5, a5, a2
+; CHECK-NEXT:    slli a5, a5, 3
 ; CHECK-NEXT:    add a6, a1, a5
 ; CHECK-NEXT:    vl8re32.v v8, (a6)
 ; CHECK-NEXT:    csrr a6, vlenb
-; CHECK-NEXT:    li a7, 24
-; CHECK-NEXT:    mul a6, a6, a7
+; CHECK-NEXT:    slli a6, a6, 4
+; CHECK-NEXT:    add a6, sp, a6
+; CHECK-NEXT:    addi a6, a6, 16
+; CHECK-NEXT:    vs8r.v v8, (a6) # Unknown-size Folded Spill
+; CHECK-NEXT:    slli a3, a3, 4
+; CHECK-NEXT:    add a6, a1, a3
+; CHECK-NEXT:    vl8re32.v v8, (a6)
+; CHECK-NEXT:    csrr a6, vlenb
+; CHECK-NEXT:    slli a6, a6, 3
 ; CHECK-NEXT:    add a6, sp, a6
 ; CHECK-NEXT:    addi a6, a6, 16
 ; CHECK-NEXT:    vs8r.v v8, (a6) # Unknown-size Folded Spill
-; CHECK-NEXT:    slli a6, a2, 4
-; CHECK-NEXT:    slli a7, a2, 6
-; CHECK-NEXT:    sub t0, a7, a6
-; CHECK-NEXT:    add t1, a1, t0
-; CHECK-NEXT:    vl8re32.v v8, (t1)
-; CHECK-NEXT:    csrr t1, vlenb
-; CHECK-NEXT:    slli t1, t1, 4
-; CHECK-NEXT:    add t1, sp, t1
-; CHECK-NEXT:    addi t1, t1, 16
-; CHECK-NEXT:    vs8r.v v8, (t1) # Unknown-size Folded Spill
-; CHECK-NEXT:    sub a7, a7, a3
-; CHECK-NEXT:    add t1, a1, a7
-; CHECK-NEXT:    vl8re32.v v8, (t1)
-; CHECK-NEXT:    csrr t1, vlenb
-; CHECK-NEXT:    slli t1, t1, 3
-; CHECK-NEXT:    add t1, sp, t1
-; CHECK-NEXT:    addi t1, t1, 16
-; CHECK-NEXT:    vs8r.v v8, (t1) # Unknown-size Folded Spill
-; CHECK-NEXT:    add t1, a1, a3
-; CHECK-NEXT:    vl8re32.v v8, (t1)
-; CHECK-NEXT:    addi t1, sp, 16
-; CHECK-NEXT:    vs8r.v v8, (t1) # Unknown-size Folded Spill
-; CHECK-NEXT:    add t1, a1, a6
-; CHECK-NEXT:    add t2, a1, a4
-; CHECK-NEXT:    li t3, 40
-; CHECK-NEXT:    mul a2, a2, t3
+; CHECK-NEXT:    slli a6, a2, 3
+; CHECK-NEXT:    sub a7, a6, a2
+; CHECK-NEXT:    slli a7, a7, 3
+; CHECK-NEXT:    add t0, a1, a7
+; CHECK-NEXT:    vl8re32.v v8, (t0)
+; CHECK-NEXT:    addi t0, sp, 16
+; CHECK-NEXT:    vs8r.v v8, (t0) # Unknown-size Folded Spill
+; CHECK-NEXT:    add t0, a1, a6
+; CHECK-NEXT:    slli t1, a2, 4
+; CHECK-NEXT:    add t2, a1, t1
+; CHECK-NEXT:    slli a2, a2, 5
 ; CHECK-NEXT:    add t3, a1, a2
 ; CHECK-NEXT:    vl8re32.v v8, (a1)
-; CHECK-NEXT:    vl8re32.v v0, (t1)
+; CHECK-NEXT:    vl8re32.v v0, (t0)
 ; CHECK-NEXT:    vl8re32.v v16, (t3)
 ; CHECK-NEXT:    vl8re32.v v24, (t2)
 ; CHECK-NEXT:    vs8r.v v8, (a0)
 ; CHECK-NEXT:    add a2, a0, a2
 ; CHECK-NEXT:    vs8r.v v16, (a2)
-; CHECK-NEXT:    add a4, a0, a4
-; CHECK-NEXT:    vs8r.v v24, (a4)
+; CHECK-NEXT:    add t1, a0, t1
+; CHECK-NEXT:    vs8r.v v24, (t1)
 ; CHECK-NEXT:    add a6, a0, a6
 ; CHECK-NEXT:    vs8r.v v0, (a6)
-; CHECK-NEXT:    add a3, a0, a3
+; CHECK-NEXT:    add a7, a0, a7
 ; CHECK-NEXT:    addi a1, sp, 16
 ; CHECK-NEXT:    vl8r.v v8, (a1) # Unknown-size Folded Reload
-; CHECK-NEXT:    vs8r.v v8, (a3)
-; CHECK-NEXT:    add a7, a0, a7
+; CHECK-NEXT:    vs8r.v v8, (a7)
+; CHECK-NEXT:    add a3, a0, a3
 ; CHECK-NEXT:    csrr a1, vlenb
 ; CHECK-NEXT:    slli a1, a1, 3
 ; CHECK-NEXT:    add a1, sp, a1
 ; CHECK-NEXT:    addi a1, a1, 16
 ; CHECK-NEXT:    vl8r.v v8, (a1) # Unknown-size Folded Reload
-; CHECK-NEXT:    vs8r.v v8, (a7)
-; CHECK-NEXT:    add t0, a0, t0
+; CHECK-NEXT:    vs8r.v v8, (a3)
+; CHECK-NEXT:    add a5, a0, a5
 ; CHECK-NEXT:    csrr a1, vlenb
 ; CHECK-NEXT:    slli a1, a1, 4
 ; CHECK-NEXT:    add a1, sp, a1
 ; CHECK-NEXT:    addi a1, a1, 16
 ; CHECK-NEXT:    vl8r.v v8, (a1) # Unknown-size Folded Reload
-; CHECK-NEXT:    vs8r.v v8, (t0)
-; CHECK-NEXT:    add a0, a0, a5
+; CHECK-NEXT:    vs8r.v v8, (a5)
+; CHECK-NEXT:    add a0, a0, a4
 ; CHECK-NEXT:    csrr a1, vlenb
 ; CHECK-NEXT:    li a2, 24
 ; CHECK-NEXT:    mul a1, a1, a2
diff --git a/llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll b/llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll
index 2a0c9ce8b6299..85eaf84b89cde 100644
--- a/llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll
@@ -274,9 +274,10 @@ define <vscale x 2 x i8> @extract_nxv32i8_nxv2i8_6(<vscale x 32 x i8> %vec) {
 ; CHECK-LABEL: extract_nxv32i8_nxv2i8_6:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    csrr a0, vlenb
-; CHECK-NEXT:    srli a1, a0, 3
-; CHECK-NEXT:    slli a1, a1, 1
-; CHECK-NEXT:    sub a0, a0, a1
+; CHECK-NEXT:    srli a0, a0, 3
+; CHECK-NEXT:    slli a1, a0, 1
+; CHECK-NEXT:    add a0, a1, a0
+; CHECK-NEXT:    slli a0, a0, 1
 ; CHECK-NEXT:    vsetvli a1, zero, e8, m1, ta, ma
 ; CHECK-NEXT:    vslidedown.vx v8, v8, a0
 ; CHECK-NEXT:    ret
@@ -297,9 +298,10 @@ define <vscale x 2 x i8> @extract_nxv32i8_nxv2i8_22(<vscale x 32 x i8> %vec) {
 ; CHECK-LABEL: extract_nxv32i8_nxv2i8_22:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    csrr a0, vlenb
-; CHECK-NEXT:    srli a1, a0, 3
-; CHECK-NEXT:    slli a1, a1, 1
-; CHECK-NEXT:    sub a0, a0, a1
+; CHECK-NEXT:    srli a0, a0, 3
+; CHECK-NEXT:    slli a1, a0, 1
+; CHECK-NEXT:    add a0, a1, a0
+; CHECK-NEXT:    slli a0, a0, 1
 ; CHECK-NEXT:    vsetvli a1, zero, e8, m1, ta, ma
 ; CHECK-NEXT:    vslidedown.vx v8, v10, a0
 ; CHECK-NEXT:    ret
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll
index 64ad86db04959..11aac32292891 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll
@@ -617,21 +617,24 @@ define void @gather_of_pointers(ptr noalias nocapture %arg, ptr noalias nocaptur
 ; ZVE32F-NEXT:    lui a3, 2
 ; ZVE32F-NEXT:    add a3, a0, a3
 ; ZVE32F-NEXT:    li a4, 1
-; ZVE32F-NEXT:    li a5, 40
 ; ZVE32F-NEXT:  .LBB11_1: # %bb2
 ; ZVE32F-NEXT:    # =>This Inner Loop Header: Depth=1
-; ZVE32F-NEXT:    mul a6, a4, a5
+; ZVE32F-NEXT:    slli a5, a4, 2
+; ZVE32F-NEXT:    add a5, a5, a4
+; ZVE32F-NEXT:    slli a5, a5, 3
+; ZVE32F-NEXT:    add a5, a1, a5
+; ZVE32F-NEXT:    slli a6, a2, 2
+; ZVE32F-NEXT:    add a6, a6, a2
+; ZVE32F-NEXT:    slli a6, a6, 3
 ; ZVE32F-NEXT:    add a6, a1, a6
-; ZVE32F-NEXT:    mul a7, a2, a5
-; ZVE32F-NEXT:    add a7, a1, a7
+; ZVE32F-NEXT:    ld a7, 0(a5)
 ; ZVE32F-NEXT:    ld t0, 0(a6)
-; ZVE32F-NEXT:    ld t1, 0(a7)
+; ZVE32F-NEXT:    ld a5, 80(a5)
 ; ZVE32F-NEXT:    ld a6, 80(a6)
-; ZVE32F-NEXT:    ld a7, 80(a7)
-; ZVE32F-NEXT:    sd t0, 8(a0)
-; ZVE32F-NEXT:    sd t1, 0(a0)
-; ZVE32F-NEXT:    sd a6, 24(a0)
-; ZVE32F-NEXT:    sd a7, 16(a0)
+; ZVE32F-NEXT:    sd a7, 8(a0)
+; ZVE32F-NEXT:    sd t0, 0(a0)
+; ZVE32F-NEXT:    sd a5, 24(a0)
+; ZVE32F-NEXT:    sd a6, 16(a0)
 ; ZVE32F-NEXT:    addi a2, a2, 4
 ; ZVE32F-NEXT:    addi a0, a0, 32
 ; ZVE32F-NEXT:    addi a4, a4, 4
@@ -694,21 +697,24 @@ define void @scatter_of_pointers(ptr noalias nocapture %arg, ptr noalias nocaptu
 ; ZVE32F-NEXT:    lui a3, 2
 ; ZVE32F-NEXT:    add a3, a1, a3
 ; ZVE32F-NEXT:    li a4, 1
-; ZVE32F-NEXT:    li a5, 40
 ; ZVE32F-NEXT:  .LBB12_1: # %bb2
 ; ZVE32F-NEXT:    # =>This Inner Loop Header: Depth=1
-; ZVE32F-NEXT:    ld a6, 8(a1)
-; ZVE32F-NEXT:    ld a7, 0(a1)
-; ZVE32F-NEXT:    ld t0, 24(a1)
-; ZVE32F-NEXT:    ld t1, 16(a1)
-; ZVE32F-NEXT:    mul t2, a4, a5
+; ZVE32F-NEXT:    ld a5, 8(a1)
+; ZVE32F-NEXT:    ld a6, 0(a1)
+; ZVE32F-NEXT:    ld a7, 24(a1)
+; ZVE32F-NEXT:    ld t0, 16(a1)
+; ZVE32F-NEXT:    slli t1, a4, 2
+; ZVE32F-NEXT:    add t1, t1, a4
+; ZVE32F-NEXT:    slli t1, t1, 3
+; ZVE32F-NEXT:    add t1, a0, t1
+; ZVE32F-NEXT:    slli t2, a2, 2
+; ZVE32F-NEXT:    add t2, t2, a2
+; ZVE32F-NEXT:    slli t2, t2, 3
 ; ZVE32F-NEXT:    add t2, a0, t2
-; ZVE32F-NEXT:    mul t3, a2, a5
-; ZVE32F-NEXT:    add t3, a0, t3
-; ZVE32F-NEXT:    sd a7, 0(t3)
 ; ZVE32F-NEXT:    sd a6, 0(t2)
-; ZVE32F-NEXT:    sd t1, 80(t3)
+; ZVE32F-NEXT:    sd a5, 0(t1)
 ; ZVE32F-NEXT:    sd t0, 80(t2)
+; ZVE32F-NEXT:    sd a7, 80(t1)
 ; ZVE32F-NEXT:    addi a2, a2, 4
 ; ZVE32F-NEXT:    addi a1, a1, 32
 ; ZVE32F-NEXT:    addi a4, a4, 4
diff --git a/llvm/test/CodeGen/RISCV/rvv/mscatter-combine.ll b/llvm/test/CodeGen/RISCV/rvv/mscatter-combine.ll
index 1c3b429202adf..718b7941c9139 100644
--- a/llvm/test/CodeGen/RISCV/rvv/mscatter-combine.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/mscatter-combine.ll
@@ -81,10 +81,10 @@ define void @strided_store_offset_start(i64 %n, ptr %p) {
 ; RV64-LABEL: strided_store_offset_start:
 ; RV64:       # %bb.0:
 ; RV64-NEXT:    slli a2, a0, 3
-; RV64-NEXT:    slli a0, a0, 6
-; RV64-NEXT:    sub a0, a0, a2
-; RV64-NEXT:    add a0, a1, a0
-; RV64-NEXT:    addi a0, a0, 36
+; RV64-NEXT:    sub a2, a2, a0
+; RV64-NEXT:    slli a2, a2, 3
+; RV64-NEXT:    add a1, a1, a2
+; RV64-NEXT:    addi a0, a1, 36
 ; RV64-NEXT:    vsetvli a1, zero, e64, m1, ta, ma
 ; RV64-NEXT:    vmv.v.i v8, 0
 ; RV64-NEXT:    li a1, 56
diff --git a/llvm/test/CodeGen/RISCV/rvv/setcc-fp-vp.ll b/llvm/test/CodeGen/RISCV/rvv/setcc-fp-vp.ll
index 12604711be191..e256ce27601c6 100644
--- a/llvm/test/CodeGen/RISCV/rvv/setcc-fp-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/setcc-fp-vp.ll
@@ -3512,16 +3512,16 @@ define <vscale x 32 x i1> @fcmp_oeq_vv_nxv32f64(<vscale x 32 x double> %va, <vsc
 ; CHECK-NEXT:    vs8r.v v8, (a1) # Unknown-size Folded Spill
 ; CHECK-NEXT:    csrr a4, vlenb
 ; CHECK-NEXT:    slli t0, a4, 3
-; CHECK-NEXT:    slli a1, a4, 5
-; CHECK-NEXT:    sub t1, a1, t0
 ; CHECK-NEXT:    srli a1, a4, 2
 ; CHECK-NEXT:    vsetvli a3, zero, e8, mf2, ta, ma
 ; CHECK-NEXT:    vslidedown.vx v7, v0, a1
 ; CHECK-NEXT:    srli a3, a4, 3
-; CHECK-NEXT:    add a5, a2, t0
-; CHECK-NEXT:    vl8re64.v v8, (a5)
-; CHECK-NEXT:    slli t3, a4, 4
 ; CHECK-NEXT:    slli a5, a4, 1
+; CHECK-NEXT:    add t1, a5, a4
+; CHECK-NEXT:    add a7, a2, t0
+; CHECK-NEXT:    vl8re64.v v8, (a7)
+; CHECK-NEXT:    slli t1, t1, 3
+; CHECK-NEXT:    slli t3, a4, 4
 ; CHECK-NEXT:    vsetvli a7, zero, e8, mf4, ta, ma
 ; CHECK-NEXT:    vslidedown.vx v0, v0, a3
 ; CHECK-NEXT:    mv a7, a6
@@ -3559,6 +3559,8 @@ define <vscale x 32 x i1> @fcmp_oeq_vv_nxv32f64(<vscale x 32 x double> %va, <vsc
 ; CHECK-NEXT:    add a2, sp, a2
 ; CHECK-NEXT:    addi a2, a2, 16
 ; CHECK-NEXT:    vs8r.v v8, (a2) # Unknown-size Folded Spill
+; CHECK-NEXT:    vsetvli a2, zero, e8, mf4, ta, ma
+; CHECK-NEXT:    vslidedown.vx v18, v7, a3
 ; CHECK-NEXT:    vl8re64.v v8, (t1)
 ; CHECK-NEXT:    csrr a2, vlenb
 ; CHECK-NEXT:    li t1, 24
@@ -3566,8 +3568,6 @@ define <vscale x 32 x i1> @fcmp_oeq_vv_nxv32f64(<vscale x 32 x double> %va, <vsc
 ; CHECK-NEXT:    add a2, sp, a2
 ; CHECK-NEXT:    addi a2, a2, 16
 ; CHECK-NEXT:    vs8r.v v8, (a2) # Unknown-size Folded Spill
-; CHECK-NEXT:    vsetvli a2, zero, e8, mf4, ta, ma
-; CHECK-NEXT:    vslidedown.vx v18, v7, a3
 ; CHECK-NEXT:    vl8re64.v v8, (t0)
 ; CHECK-NEXT:    csrr a2, vlenb
 ; CHECK-NEXT:    slli a2, a2, 3
diff --git a/llvm/test/CodeGen/RISCV/rvv/stepvector.ll b/llvm/test/CodeGen/RISCV/rvv/stepvector.ll
index 064ea816593ac..18934661330bb 100644
--- a/llvm/test/CodeGen/RISCV/rvv/stepvector.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/stepvector.ll
@@ -643,11 +643,11 @@ define <vscale x 16 x i64> @mul_bigimm_stepvector_nxv16i64() {
 ; RV32-NEXT:    lui a1, 92455
 ; RV32-NEXT:    addi a1, a1, -1368
 ; RV32-NEXT:    mulhu a1, a0, a1
-; RV32-NEXT:    slli a2, a0, 1
-; RV32-NEXT:    slli a0, a0, 6
-; RV32-NEXT:    sub a0, a0, a2
-; RV32-NEXT:    add a0, a1, a0
-; RV32-NEXT:    sw a0, 4(sp)
+; RV32-NEXT:    slli a2, a0, 5
+; RV32-NEXT:    sub a2, a2, a0
+; RV32-NEXT:    slli a2, a2, 1
+; RV32-NEXT:    add a1, a1, a2
+; RV32-NEXT:    sw a1, 4(sp)
 ; RV32-NEXT:    addi a0, sp, 8
 ; RV32-NEXT:    vsetvli a1, zero, e64, m8, ta, ma
 ; RV32-NEXT:    vlse64.v v8, (a0), zero
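
(Aside, not part of the patch: the stepvector change is the (2^a - 1) << b flavor of the rewrite, 62 = (32 - 1) * 2, so the high-word term becomes ((v << 5) - v) << 1. Check in C, illustrative only:)

  #include <assert.h>
  #include <stdint.h>
  int main(void) {
    uint32_t v = 16;                          /* stands in for vlenb */
    assert((((v << 5) - v) << 1) == 62 * v);  /* 62 = 31 * 2         */
    return 0;
  }
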
diff --git a/llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll b/llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll
index 457d0380ca8a8..19651f27e11f6 100644
--- a/llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll
+++ b/llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll
@@ -148,10 +148,10 @@ define i1 @test_srem_even(i4 %X) nounwind {
 ; RV32M-NEXT:    slli a1, a1, 24
 ; RV32M-NEXT:    srli a1, a1, 31
 ; RV32M-NEXT:    add a1, a2, a1
-; RV32M-NEXT:    slli a2, a1, 3
+; RV32M-NEXT:    slli a2, a1, 1
+; RV32M-NEXT:    add a1, a2, a1
 ; RV32M-NEXT:    slli a1, a1, 1
-; RV32M-NEXT:    sub a1, a1, a2
-; RV32M-NEXT:    add a0, a0, a1
+; RV32M-NEXT:    sub a0, a0, a1
 ; RV32M-NEXT:    andi a0, a0, 15
 ; RV32M-NEXT:    addi a0, a0, -1
 ; RV32M-NEXT:    seqz a0, a0
@@ -167,10 +167,10 @@ define i1 @test_srem_even(i4 %X) nounwind {
 ; RV64M-NEXT:    slli a1, a1, 56
 ; RV64M-NEXT:    srli a1, a1, 63
 ; RV64M-NEXT:    add a1, a2, a1
-; RV64M-NEXT:    slli a2, a1, 3
+; RV64M-NEXT:    slli a2, a1, 1
+; RV64M-NEXT:    add a1, a2, a1
 ; RV64M-NEXT:    slli a1, a1, 1
-; RV64M-NEXT:    subw a1, a1, a2
-; RV64M-NEXT:    add a0, a0, a1
+; RV64M-NEXT:    subw a0, a0, a1
 ; RV64M-NEXT:    andi a0, a0, 15
 ; RV64M-NEXT:    addi a0, a0, -1
 ; RV64M-NEXT:    seqz a0, a0
@@ -186,10 +186,10 @@ define i1 @test_srem_even(i4 %X) nounwind {
 ; RV32MV-NEXT:    slli a1, a1, 24
 ; RV32MV-NEXT:    srli a1, a1, 31
 ; RV32MV-NEXT:    add a1, a2, a1
-; RV32MV-NEXT:    slli a2, a1, 3
+; RV32MV-NEXT:    slli a2, a1, 1
+; RV32MV-NEXT:    add a1, a2, a1
 ; RV32MV-NEXT:    slli a1, a1, 1
-; RV32MV-NEXT:    sub a1, a1, a2
-; RV32MV-NEXT:    add a0, a0, a1
+; RV32MV-NEXT:    sub a0, a0, a1
 ; RV32MV-NEXT:    andi a0, a0, 15
 ; RV32MV-NEXT:    addi a0, a0, -1
 ; RV32MV-NEXT:    seqz a0, a0
@@ -205,10 +205,10 @@ define i1 @test_srem_even(i4 %X) nounwind {
 ; RV64MV-NEXT:    slli a1, a1, 56
 ; RV64MV-NEXT:    srli a1, a1, 63
 ; RV64MV-NEXT:    add a1, a2, a1
-; RV64MV-NEXT:    slli a2, a1, 3
+; RV64MV-NEXT:    slli a2, a1, 1
+; RV64MV-NEXT:    add a1, a2, a1
 ; RV64MV-NEXT:    slli a1, a1, 1
-; RV64MV-NEXT:    subw a1, a1, a2
-; RV64MV-NEXT:    add a0, a0, a1
+; RV64MV-NEXT:    subw a0, a0, a1
 ; RV64MV-NEXT:    andi a0, a0, 15
 ; RV64MV-NEXT:    addi a0, a0, -1
 ; RV64MV-NEXT:    seqz a0, a0
@@ -756,12 +756,12 @@ define void @test_srem_vec(ptr %X) nounwind {
 ; RV64MV-NEXT:    mulh a4, a3, a5
 ; RV64MV-NEXT:    srli a5, a4, 63
 ; RV64MV-NEXT:    add a4, a4, a5
-; RV64MV-NEXT:    slli a5, a4, 3
+; RV64MV-NEXT:    slli a5, a4, 1
+; RV64MV-NEXT:    add a4, a5, a4
 ; RV64MV-NEXT:    slli a4, a4, 1
-; RV64MV-NEXT:    sub a4, a4, a5
 ; RV64MV-NEXT:    lui a5, %hi(.LCPI3_2)
 ; RV64MV-NEXT:    ld a5, %lo(.LCPI3_2)(a5)
-; RV64MV-NEXT:    add a3, a3, a4
+; RV64MV-NEXT:    sub a3, a3, a4
 ; RV64MV-NEXT:    vsetivli zero, 4, e64, m2, ta, ma
 ; RV64MV-NEXT:    vmv.v.x v8, a3
 ; RV64MV-NEXT:    vslide1down.vx v8, v8, a2
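
(Aside, not part of the patch: in the srem tests the expansion also flips the sign of the correction term. The old code built -6x as 2x - 8x and added it; the new code builds +6x as (2x + x) << 1 and subtracts. Same value, checked in C:)

  #include <assert.h>
  #include <stdint.h>
  int main(void) {
    int32_t a = 13, x = 3;
    int32_t before = a + ((x << 1) - (x << 3));  /* a + (2x - 8x) */
    int32_t after  = a - (((x << 1) + x) << 1);  /* a - 6x        */
    assert(before == after && after == a - 6 * x);
    return 0;
  }
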
diff --git a/llvm/test/CodeGen/RISCV/urem-vector-lkk.ll b/llvm/test/CodeGen/RISCV/urem-vector-lkk.ll
index c057c656e0fb7..b82bf5c7d3d9a 100644
--- a/llvm/test/CodeGen/RISCV/urem-vector-lkk.ll
+++ b/llvm/test/CodeGen/RISCV/urem-vector-lkk.ll
@@ -61,10 +61,10 @@ define <4 x i16> @fold_urem_vec_1(<4 x i16> %x) nounwind {
 ; RV32IM-NEXT:    lui a5, 8456
 ; RV32IM-NEXT:    addi a5, a5, 1058
 ; RV32IM-NEXT:    mulhu a5, a4, a5
-; RV32IM-NEXT:    slli a6, a5, 7
+; RV32IM-NEXT:    slli a6, a5, 5
+; RV32IM-NEXT:    sub a5, a6, a5
 ; RV32IM-NEXT:    slli a5, a5, 2
-; RV32IM-NEXT:    sub a5, a5, a6
-; RV32IM-NEXT:    add a4, a4, a5
+; RV32IM-NEXT:    sub a4, a4, a5
 ; RV32IM-NEXT:    lui a5, 11038
 ; RV32IM-NEXT:    addi a5, a5, -1465
 ; RV32IM-NEXT:    mulhu a5, a1, a5
@@ -140,12 +140,12 @@ define <4 x i16> @fold_urem_vec_1(<4 x i16> %x) nounwind {
 ; RV64IM-NEXT:    lhu a5, 16(a1)
 ; RV64IM-NEXT:    lhu a1, 0(a1)
 ; RV64IM-NEXT:    mulhu a3, a2, a3
-; RV64IM-NEXT:    slli a6, a3, 7
+; RV64IM-NEXT:    slli a6, a3, 5
 ; RV64IM-NEXT:    lui a7, %hi(.LCPI0_1)
 ; RV64IM-NEXT:    ld a7, %lo(.LCPI0_1)(a7)
+; RV64IM-NEXT:    subw a3, a6, a3
 ; RV64IM-NEXT:    slli a3, a3, 2
-; RV64IM-NEXT:    subw a3, a3, a6
-; RV64IM-NEXT:    add a2, a2, a3
+; RV64IM-NEXT:    subw a2, a2, a3
 ; RV64IM-NEXT:    mulhu a3, a1, a7
 ; RV64IM-NEXT:    lui a6, %hi(.LCPI0_2)
 ; RV64IM-NEXT:    ld a6, %lo(.LCPI0_2)(a6)


