[llvm] [LoongArch] Introduce `32s` target feature for LA32S ISA extensions (PR #139695)

via llvm-commits llvm-commits at lists.llvm.org
Wed May 14 23:40:03 PDT 2025


https://github.com/heiher updated https://github.com/llvm/llvm-project/pull/139695

From 3ce7e72fb455338b856429f30b8b4258516c4089 Mon Sep 17 00:00:00 2001
From: WANG Rui <wangrui at loongson.cn>
Date: Tue, 13 May 2025 17:05:35 +0800
Subject: [PATCH 1/2] [LoongArch] Introduce `32s` target feature for LA32S ISA
 extensions

According to the official LoongArch reference manual, the 32-bit
LoongArch is divided into two variants: the Reduced version (LA32R)
and the Standard version (LA32S). LA32S extends LA32R with additional
instructions, and the 64-bit version (LA64) fully includes the LA32S
instruction set.

This patch introduces a new target feature `32s` for the LoongArch
backend, enabling support for instructions specific to the LA32S
variant.
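
As a usage sketch (mirroring the RUN lines of the updated tests in
this patch), the variant is selected via -mattr:

  llc --mtriple=loongarch32 -mattr=-32s,+d input.ll   # LA32R
  llc --mtriple=loongarch32 -mattr=+32s,+d input.ll   # LA32S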

The LA32S extension includes the following additional instructions:

- ALSL.W
- {AND,OR}N
- B{EQ,NE}Z
- BITREV.{4B,W}
- BSTR{INS,PICK}.W
- BYTEPICK.W
- CL{O,Z}.W
- CPUCFG
- CT{O,Z}.W
- EXT.W.{B,H}
- F{LD,ST}X.{D,S}
- MASK{EQ,NE}Z
- PC{ADDI,ALAU12I}
- REVB.2H
- ROTR{I}.W
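
For example (taken from the updated alloca.ll test below), clearing
the low bits of a register uses BSTRINS.W on LA32S, while LA32R falls
back to an ADDI.W/AND pair:

  # LA32S
  bstrins.w $a0, $zero, 3, 0
  # LA32R
  addi.w    $a1, $zero, -16
  and       $a0, $a0, $a1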

Additionally, LA32R defines three new instruction aliases:

- RDCNTID.W RJ => RDTIMEL.W ZERO, RJ
- RDCNTVH.W RD => RDTIMEH.W RD, ZERO
- RDCNTVL.W RD => RDTIMEL.W RD, ZERO
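
In assembler syntax these expand as follows (a minimal sketch using
$a0; see the InstAlias definitions under [Not32S] below):

  rdcntid.w $a0    # emitted as: rdtimel.w $zero, $a0
  rdcntvh.w $a0    # emitted as: rdtimeh.w $a0, $zero
  rdcntvl.w $a0    # emitted as: rdtimel.w $a0, $zero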
---
 llvm/lib/Target/LoongArch/LoongArch.td        |   12 +
 .../LoongArchExpandAtomicPseudoInsts.cpp      |   18 +-
 .../Target/LoongArch/LoongArchISelDAGToDAG.h  |   22 +
 .../LoongArch/LoongArchISelLowering.cpp       |  517 +-
 .../Target/LoongArch/LoongArchISelLowering.h  |    4 +
 .../Target/LoongArch/LoongArchInstrInfo.td    |  175 +-
 llvm/test/CodeGen/LoongArch/alloca.ll         |  219 +-
 llvm/test/CodeGen/LoongArch/alsl.ll           |  391 +-
 .../CodeGen/LoongArch/annotate-tablejump.ll   |    2 +-
 .../LoongArch/atomicrmw-cond-sub-clamp.ll     |   16 +-
 .../LoongArch/atomicrmw-uinc-udec-wrap.ll     |   16 +-
 llvm/test/CodeGen/LoongArch/bitreverse.ll     |  645 +-
 llvm/test/CodeGen/LoongArch/bnez-beqz.ll      |   97 +-
 .../CodeGen/LoongArch/branch-relaxation.ll    |  154 +-
 llvm/test/CodeGen/LoongArch/bstrins_w.ll      |    2 +-
 llvm/test/CodeGen/LoongArch/bstrpick_w.ll     |    2 +-
 .../CodeGen/LoongArch/bswap-bitreverse.ll     |  264 +-
 llvm/test/CodeGen/LoongArch/bswap.ll          |  294 +-
 llvm/test/CodeGen/LoongArch/bytepick.ll       |    2 +-
 .../test/CodeGen/LoongArch/ctlz-cttz-ctpop.ll | 1003 ++-
 llvm/test/CodeGen/LoongArch/ctpop-with-lsx.ll |  117 +-
 .../LoongArch/exception-pointer-register.ll   |    2 +-
 llvm/test/CodeGen/LoongArch/fabs.ll           |    4 +-
 llvm/test/CodeGen/LoongArch/fcopysign.ll      |    4 +-
 llvm/test/CodeGen/LoongArch/feature-32bit.ll  |    1 +
 .../LoongArch/intrinsic-csr-side-effects.ll   |    2 +-
 .../CodeGen/LoongArch/ir-instruction/and.ll   |  439 +-
 .../CodeGen/LoongArch/ir-instruction/ashr.ll  |   42 +-
 .../ir-instruction/atomic-cmpxchg-128.ll      |   20 +-
 .../ir-instruction/atomic-cmpxchg.ll          |   88 +-
 .../LoongArch/ir-instruction/atomicrmw-fp.ll  |   80 +-
 .../ir-instruction/atomicrmw-lam-bh.ll        | 1148 ++-
 .../ir-instruction/atomicrmw-lamcas.ll        |  180 +-
 .../ir-instruction/atomicrmw-minmax.ll        |   80 +-
 .../LoongArch/ir-instruction/atomicrmw.ll     | 7018 +++++++++++------
 .../CodeGen/LoongArch/ir-instruction/br.ll    |  291 +-
 .../ir-instruction/double-convert.ll          |   17 +-
 .../LoongArch/ir-instruction/fcmp-dbl.ll      |    2 +-
 .../LoongArch/ir-instruction/fcmp-flt.ll      |    2 +-
 .../LoongArch/ir-instruction/float-convert.ll |   34 +-
 .../LoongArch/ir-instruction/load-store-fp.ll |    4 +-
 .../LoongArch/ir-instruction/load-store.ll    | 1208 ++-
 .../CodeGen/LoongArch/ir-instruction/lshr.ll  |  189 +-
 .../CodeGen/LoongArch/ir-instruction/mul.ll   | 1870 +++--
 .../ir-instruction/sdiv-udiv-srem-urem.ll     | 1531 ++--
 .../ir-instruction/select-bare-int.ll         |   61 +-
 .../ir-instruction/select-fpcc-int.ll         |  196 +-
 .../ir-instruction/select-icc-int.ll          |   96 +-
 .../ir-instruction/sext-zext-trunc.ll         |  402 +-
 .../CodeGen/LoongArch/ir-instruction/shl.ll   |   24 +-
 llvm/test/CodeGen/LoongArch/jump-table.ll     |    4 +-
 llvm/test/CodeGen/LoongArch/rotl-rotr.ll      | 1031 ++-
 .../CodeGen/LoongArch/select-to-shiftand.ll   |    8 +-
 .../CodeGen/LoongArch/shift-masked-shamt.ll   |   80 +-
 .../CodeGen/LoongArch/smul-with-overflow.ll   |  272 +-
 ...realignment-with-variable-sized-objects.ll |    2 +-
 .../LoongArch/typepromotion-overflow.ll       |  690 +-
 llvm/test/MC/LoongArch/Basic/Integer/atomic.s |   16 +-
 ...arch_generated_funcs.ll.generated.expected |    6 +-
 ...ch_generated_funcs.ll.nogenerated.expected |    6 +-
 60 files changed, 14426 insertions(+), 6696 deletions(-)

diff --git a/llvm/lib/Target/LoongArch/LoongArch.td b/llvm/lib/Target/LoongArch/LoongArch.td
index 5fd52babfc6ec..a811ff1ef7f90 100644
--- a/llvm/lib/Target/LoongArch/LoongArch.td
+++ b/llvm/lib/Target/LoongArch/LoongArch.td
@@ -32,6 +32,14 @@ def IsLA32
 defvar LA32 = DefaultMode;
 def LA64 : HwMode<"+64bit", [IsLA64]>;
 
+// LoongArch 32-bit is divided into two variants, the reduced 32-bit variant
+// (LA32R) and the standard 32-bit variant (LA32S).
+def Feature32S
+    : SubtargetFeature<"32s", "Has32S", "true",
+                       "LA32 Standard Basic Instruction Extension">;
+def Has32S : Predicate<"Subtarget->has32S()">;
+def Not32S : Predicate<"!Subtarget->has32S()">;
+
 // Single Precision floating point
 def FeatureBasicF
     : SubtargetFeature<"f", "HasBasicF", "true",
@@ -159,11 +167,13 @@ include "LoongArchInstrInfo.td"
 
 def : ProcessorModel<"generic-la32", NoSchedModel, [Feature32Bit]>;
 def : ProcessorModel<"generic-la64", NoSchedModel, [Feature64Bit,
+                                                    Feature32S,
                                                     FeatureUAL,
                                                     FeatureExtLSX]>;
 
 // Generic 64-bit processor with double-precision floating-point support.
 def : ProcessorModel<"loongarch64", NoSchedModel, [Feature64Bit,
+                                                   Feature32S,
                                                    FeatureUAL,
                                                    FeatureBasicD]>;
 
@@ -172,12 +182,14 @@ def : ProcessorModel<"loongarch64", NoSchedModel, [Feature64Bit,
 def : ProcessorModel<"generic", NoSchedModel, []>;
 
 def : ProcessorModel<"la464", NoSchedModel, [Feature64Bit,
+                                             Feature32S,
                                              FeatureUAL,
                                              FeatureExtLASX,
                                              FeatureExtLVZ,
                                              FeatureExtLBT]>;
 
 def : ProcessorModel<"la664", NoSchedModel, [Feature64Bit,
+                                             Feature32S,
                                              FeatureUAL,
                                              FeatureExtLASX,
                                              FeatureExtLVZ,
diff --git a/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp b/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
index 27d20390eb6ae..3be012feb2385 100644
--- a/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
@@ -214,8 +214,9 @@ static void doAtomicBinOpExpansion(const LoongArchInstrInfo *TII,
       .addReg(ScratchReg)
       .addReg(AddrReg)
       .addImm(0);
-  BuildMI(LoopMBB, DL, TII->get(LoongArch::BEQZ))
+  BuildMI(LoopMBB, DL, TII->get(LoongArch::BEQ))
       .addReg(ScratchReg)
+      .addReg(LoongArch::R0)
       .addMBB(LoopMBB);
 }
 
@@ -296,8 +297,9 @@ static void doMaskedAtomicBinOpExpansion(
       .addReg(ScratchReg)
       .addReg(AddrReg)
       .addImm(0);
-  BuildMI(LoopMBB, DL, TII->get(LoongArch::BEQZ))
+  BuildMI(LoopMBB, DL, TII->get(LoongArch::BEQ))
       .addReg(ScratchReg)
+      .addReg(LoongArch::R0)
       .addMBB(LoopMBB);
 }
 
@@ -454,8 +456,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicMinMaxOp(
       .addReg(Scratch1Reg)
       .addReg(AddrReg)
       .addImm(0);
-  BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQZ))
+  BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQ))
       .addReg(Scratch1Reg)
+      .addReg(LoongArch::R0)
       .addMBB(LoopHeadMBB);
 
   NextMBBI = MBB.end();
@@ -529,8 +532,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicCmpXchg(
         .addReg(ScratchReg)
         .addReg(AddrReg)
         .addImm(0);
-    BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQZ))
+    BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQ))
         .addReg(ScratchReg)
+        .addReg(LoongArch::R0)
         .addMBB(LoopHeadMBB);
     BuildMI(LoopTailMBB, DL, TII->get(LoongArch::B)).addMBB(DoneMBB);
   } else {
@@ -569,8 +573,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicCmpXchg(
         .addReg(ScratchReg)
         .addReg(AddrReg)
         .addImm(0);
-    BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQZ))
+    BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQ))
         .addReg(ScratchReg)
+        .addReg(LoongArch::R0)
         .addMBB(LoopHeadMBB);
     BuildMI(LoopTailMBB, DL, TII->get(LoongArch::B)).addMBB(DoneMBB);
   }
@@ -677,8 +682,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicCmpXchg128(
       .addReg(ScratchReg)
       .addReg(NewValHiReg)
       .addReg(AddrReg);
-  BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQZ))
+  BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQ))
       .addReg(ScratchReg)
+      .addReg(LoongArch::R0)
       .addMBB(LoopHeadMBB);
   BuildMI(LoopTailMBB, DL, TII->get(LoongArch::B)).addMBB(DoneMBB);
   int hint;
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.h b/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.h
index 8a7eba418d804..e94f249c14be2 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.h
+++ b/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.h
@@ -64,6 +64,28 @@ class LoongArchDAGToDAGISel : public SelectionDAGISel {
   bool selectVSplatUimmInvPow2(SDValue N, SDValue &SplatImm) const;
   bool selectVSplatUimmPow2(SDValue N, SDValue &SplatImm) const;
 
+  // Return the LoongArch branch opcode that matches the given DAG integer
+  // condition code. The CondCode must be one of those supported by the
+  // LoongArch ISA (see translateSetCCForBranch).
+  static unsigned getBranchOpcForIntCC(ISD::CondCode CC) {
+    switch (CC) {
+    default:
+      llvm_unreachable("Unsupported CondCode");
+    case ISD::SETEQ:
+      return LoongArch::BEQ;
+    case ISD::SETNE:
+      return LoongArch::BNE;
+    case ISD::SETLT:
+      return LoongArch::BLT;
+    case ISD::SETGE:
+      return LoongArch::BGE;
+    case ISD::SETULT:
+      return LoongArch::BLTU;
+    case ISD::SETUGE:
+      return LoongArch::BGEU;
+    }
+  }
+
 // Include the pieces autogenerated from the target description.
 #include "LoongArchGenDAGISel.inc"
 };
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
index b729b4ea6f9b4..6e3e1396e6aeb 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
@@ -18,6 +18,7 @@
 #include "LoongArchSubtarget.h"
 #include "MCTargetDesc/LoongArchBaseInfo.h"
 #include "MCTargetDesc/LoongArchMCTargetDesc.h"
+#include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/CodeGen/ISDOpcodes.h"
@@ -102,15 +103,26 @@ LoongArchTargetLowering::LoongArchTargetLowering(const TargetMachine &TM,
 
   setOperationAction(ISD::PREFETCH, MVT::Other, Custom);
 
-  // Expand bitreverse.i16 with native-width bitrev and shift for now, before
-  // we get to know which of sll and revb.2h is faster.
-  setOperationAction(ISD::BITREVERSE, MVT::i8, Custom);
-  setOperationAction(ISD::BITREVERSE, GRLenVT, Legal);
-
-  // LA32 does not have REVB.2W and REVB.D due to the 64-bit operands, and
-  // the narrower REVB.W does not exist. But LA32 does have REVB.2H, so i16
-  // and i32 could still be byte-swapped relatively cheaply.
-  setOperationAction(ISD::BSWAP, MVT::i16, Custom);
+  // BITREV/REVB requires the 32S feature.
+  if (STI.has32S()) {
+    // Expand bitreverse.i16 with native-width bitrev and shift for now, before
+    // we get to know which of sll and revb.2h is faster.
+    setOperationAction(ISD::BITREVERSE, MVT::i8, Custom);
+    setOperationAction(ISD::BITREVERSE, GRLenVT, Legal);
+
+    // LA32 does not have REVB.2W and REVB.D due to the 64-bit operands, and
+    // the narrower REVB.W does not exist. But LA32 does have REVB.2H, so i16
+    // and i32 could still be byte-swapped relatively cheaply.
+    setOperationAction(ISD::BSWAP, MVT::i16, Custom);
+  } else {
+    setOperationAction(ISD::BSWAP, GRLenVT, Expand);
+    setOperationAction(ISD::CTTZ, GRLenVT, Expand);
+    setOperationAction(ISD::CTLZ, GRLenVT, Expand);
+    setOperationAction(ISD::ROTR, GRLenVT, Expand);
+    setOperationAction(ISD::SELECT, GRLenVT, Custom);
+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand);
+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand);
+  }
 
   setOperationAction(ISD::BR_JT, MVT::Other, Expand);
   setOperationAction(ISD::BR_CC, GRLenVT, Expand);
@@ -476,6 +488,8 @@ SDValue LoongArchTargetLowering::LowerOperation(SDValue Op,
     return lowerSCALAR_TO_VECTOR(Op, DAG);
   case ISD::PREFETCH:
     return lowerPREFETCH(Op, DAG);
+  case ISD::SELECT:
+    return lowerSELECT(Op, DAG);
   }
   return SDValue();
 }
@@ -492,6 +506,327 @@ SDValue LoongArchTargetLowering::lowerPREFETCH(SDValue Op,
   return Op;
 }
 
+// Return true if Val is equal to (setcc LHS, RHS, CC).
+// Return false if Val is the inverse of (setcc LHS, RHS, CC).
+// Otherwise, return std::nullopt.
+static std::optional<bool> matchSetCC(SDValue LHS, SDValue RHS,
+                                      ISD::CondCode CC, SDValue Val) {
+  assert(Val->getOpcode() == ISD::SETCC);
+  SDValue LHS2 = Val.getOperand(0);
+  SDValue RHS2 = Val.getOperand(1);
+  ISD::CondCode CC2 = cast<CondCodeSDNode>(Val.getOperand(2))->get();
+
+  if (LHS == LHS2 && RHS == RHS2) {
+    if (CC == CC2)
+      return true;
+    if (CC == ISD::getSetCCInverse(CC2, LHS2.getValueType()))
+      return false;
+  } else if (LHS == RHS2 && RHS == LHS2) {
+    CC2 = ISD::getSetCCSwappedOperands(CC2);
+    if (CC == CC2)
+      return true;
+    if (CC == ISD::getSetCCInverse(CC2, LHS2.getValueType()))
+      return false;
+  }
+
+  return std::nullopt;
+}
+
+static SDValue combineSelectToBinOp(SDNode *N, SelectionDAG &DAG,
+                                    const LoongArchSubtarget &Subtarget) {
+  SDValue CondV = N->getOperand(0);
+  SDValue TrueV = N->getOperand(1);
+  SDValue FalseV = N->getOperand(2);
+  MVT VT = N->getSimpleValueType(0);
+  SDLoc DL(N);
+
+  // (select c, -1, y) -> -c | y
+  if (isAllOnesConstant(TrueV)) {
+    SDValue Neg = DAG.getNegative(CondV, DL, VT);
+    return DAG.getNode(ISD::OR, DL, VT, Neg, DAG.getFreeze(FalseV));
+  }
+  // (select c, y, -1) -> (c-1) | y
+  if (isAllOnesConstant(FalseV)) {
+    SDValue Neg =
+        DAG.getNode(ISD::ADD, DL, VT, CondV, DAG.getAllOnesConstant(DL, VT));
+    return DAG.getNode(ISD::OR, DL, VT, Neg, DAG.getFreeze(TrueV));
+  }
+
+  // (select c, 0, y) -> (c-1) & y
+  if (isNullConstant(TrueV)) {
+    SDValue Neg =
+        DAG.getNode(ISD::ADD, DL, VT, CondV, DAG.getAllOnesConstant(DL, VT));
+    return DAG.getNode(ISD::AND, DL, VT, Neg, DAG.getFreeze(FalseV));
+  }
+  // (select c, y, 0) -> -c & y
+  if (isNullConstant(FalseV)) {
+    SDValue Neg = DAG.getNegative(CondV, DL, VT);
+    return DAG.getNode(ISD::AND, DL, VT, Neg, DAG.getFreeze(TrueV));
+  }
+
+  // select c, ~x, x --> xor -c, x
+  if (isa<ConstantSDNode>(TrueV) && isa<ConstantSDNode>(FalseV)) {
+    const APInt &TrueVal = TrueV->getAsAPIntVal();
+    const APInt &FalseVal = FalseV->getAsAPIntVal();
+    if (~TrueVal == FalseVal) {
+      SDValue Neg = DAG.getNegative(CondV, DL, VT);
+      return DAG.getNode(ISD::XOR, DL, VT, Neg, FalseV);
+    }
+  }
+
+  // Try to fold (select (setcc lhs, rhs, cc), truev, falsev) into bitwise ops
+  // when both truev and falsev are also setcc.
+  if (CondV.getOpcode() == ISD::SETCC && TrueV.getOpcode() == ISD::SETCC &&
+      FalseV.getOpcode() == ISD::SETCC) {
+    SDValue LHS = CondV.getOperand(0);
+    SDValue RHS = CondV.getOperand(1);
+    ISD::CondCode CC = cast<CondCodeSDNode>(CondV.getOperand(2))->get();
+
+    // (select x, x, y) -> x | y
+    // (select !x, x, y) -> x & y
+    if (std::optional<bool> MatchResult = matchSetCC(LHS, RHS, CC, TrueV)) {
+      return DAG.getNode(*MatchResult ? ISD::OR : ISD::AND, DL, VT, TrueV,
+                         DAG.getFreeze(FalseV));
+    }
+    // (select x, y, x) -> x & y
+    // (select !x, y, x) -> x | y
+    if (std::optional<bool> MatchResult = matchSetCC(LHS, RHS, CC, FalseV)) {
+      return DAG.getNode(*MatchResult ? ISD::AND : ISD::OR, DL, VT,
+                         DAG.getFreeze(TrueV), FalseV);
+    }
+  }
+
+  return SDValue();
+}
+
+// Transform `binOp (select cond, x, c0), c1` where `c0` and `c1` are constants
+// into `select cond, binOp(x, c1), binOp(c0, c1)` if profitable.
+// For now we only consider transformation profitable if `binOp(c0, c1)` ends up
+// being `0` or `-1`. In such cases we can replace `select` with `and`.
+// TODO: Should we also do this if `binOp(c0, c1)` is cheaper to materialize
+// than `c0`?
+static SDValue
+foldBinOpIntoSelectIfProfitable(SDNode *BO, SelectionDAG &DAG,
+                                const LoongArchSubtarget &Subtarget) {
+  unsigned SelOpNo = 0;
+  SDValue Sel = BO->getOperand(0);
+  if (Sel.getOpcode() != ISD::SELECT || !Sel.hasOneUse()) {
+    SelOpNo = 1;
+    Sel = BO->getOperand(1);
+  }
+
+  if (Sel.getOpcode() != ISD::SELECT || !Sel.hasOneUse())
+    return SDValue();
+
+  unsigned ConstSelOpNo = 1;
+  unsigned OtherSelOpNo = 2;
+  if (!isa<ConstantSDNode>(Sel->getOperand(ConstSelOpNo))) {
+    ConstSelOpNo = 2;
+    OtherSelOpNo = 1;
+  }
+  SDValue ConstSelOp = Sel->getOperand(ConstSelOpNo);
+  ConstantSDNode *ConstSelOpNode = dyn_cast<ConstantSDNode>(ConstSelOp);
+  if (!ConstSelOpNode || ConstSelOpNode->isOpaque())
+    return SDValue();
+
+  SDValue ConstBinOp = BO->getOperand(SelOpNo ^ 1);
+  ConstantSDNode *ConstBinOpNode = dyn_cast<ConstantSDNode>(ConstBinOp);
+  if (!ConstBinOpNode || ConstBinOpNode->isOpaque())
+    return SDValue();
+
+  SDLoc DL(Sel);
+  EVT VT = BO->getValueType(0);
+
+  SDValue NewConstOps[2] = {ConstSelOp, ConstBinOp};
+  if (SelOpNo == 1)
+    std::swap(NewConstOps[0], NewConstOps[1]);
+
+  SDValue NewConstOp =
+      DAG.FoldConstantArithmetic(BO->getOpcode(), DL, VT, NewConstOps);
+  if (!NewConstOp)
+    return SDValue();
+
+  const APInt &NewConstAPInt = NewConstOp->getAsAPIntVal();
+  if (!NewConstAPInt.isZero() && !NewConstAPInt.isAllOnes())
+    return SDValue();
+
+  SDValue OtherSelOp = Sel->getOperand(OtherSelOpNo);
+  SDValue NewNonConstOps[2] = {OtherSelOp, ConstBinOp};
+  if (SelOpNo == 1)
+    std::swap(NewNonConstOps[0], NewNonConstOps[1]);
+  SDValue NewNonConstOp = DAG.getNode(BO->getOpcode(), DL, VT, NewNonConstOps);
+
+  SDValue NewT = (ConstSelOpNo == 1) ? NewConstOp : NewNonConstOp;
+  SDValue NewF = (ConstSelOpNo == 1) ? NewNonConstOp : NewConstOp;
+  return DAG.getSelect(DL, VT, Sel.getOperand(0), NewT, NewF);
+}
+
+// Changes the condition code and swaps operands if necessary, so the SetCC
+// operation matches one of the comparisons supported directly by branches
+// in the LoongArch ISA. May adjust compares to favor compare with 0 over
+// compare with 1/-1.
+static void translateSetCCForBranch(const SDLoc &DL, SDValue &LHS, SDValue &RHS,
+                                    ISD::CondCode &CC, SelectionDAG &DAG) {
+  // If this is a single bit test that can't be handled by ANDI, shift the
+  // bit to be tested to the MSB and perform a signed compare with 0.
+  if (isIntEqualitySetCC(CC) && isNullConstant(RHS) &&
+      LHS.getOpcode() == ISD::AND && LHS.hasOneUse() &&
+      isa<ConstantSDNode>(LHS.getOperand(1))) {
+    uint64_t Mask = LHS.getConstantOperandVal(1);
+    if ((isPowerOf2_64(Mask) || isMask_64(Mask)) && !isInt<12>(Mask)) {
+      unsigned ShAmt = 0;
+      if (isPowerOf2_64(Mask)) {
+        CC = CC == ISD::SETEQ ? ISD::SETGE : ISD::SETLT;
+        ShAmt = LHS.getValueSizeInBits() - 1 - Log2_64(Mask);
+      } else {
+        ShAmt = LHS.getValueSizeInBits() - llvm::bit_width(Mask);
+      }
+
+      LHS = LHS.getOperand(0);
+      if (ShAmt != 0)
+        LHS = DAG.getNode(ISD::SHL, DL, LHS.getValueType(), LHS,
+                          DAG.getConstant(ShAmt, DL, LHS.getValueType()));
+      return;
+    }
+  }
+
+  if (auto *RHSC = dyn_cast<ConstantSDNode>(RHS)) {
+    int64_t C = RHSC->getSExtValue();
+    switch (CC) {
+    default:
+      break;
+    case ISD::SETGT:
+      // Convert X > -1 to X >= 0.
+      if (C == -1) {
+        RHS = DAG.getConstant(0, DL, RHS.getValueType());
+        CC = ISD::SETGE;
+        return;
+      }
+      break;
+    case ISD::SETLT:
+      // Convert X < 1 to 0 >= X.
+      if (C == 1) {
+        RHS = LHS;
+        LHS = DAG.getConstant(0, DL, RHS.getValueType());
+        CC = ISD::SETGE;
+        return;
+      }
+      break;
+    }
+  }
+
+  switch (CC) {
+  default:
+    break;
+  case ISD::SETGT:
+  case ISD::SETLE:
+  case ISD::SETUGT:
+  case ISD::SETULE:
+    CC = ISD::getSetCCSwappedOperands(CC);
+    std::swap(LHS, RHS);
+    break;
+  }
+}
+
+SDValue LoongArchTargetLowering::lowerSELECT(SDValue Op,
+                                             SelectionDAG &DAG) const {
+  SDValue CondV = Op.getOperand(0);
+  SDValue TrueV = Op.getOperand(1);
+  SDValue FalseV = Op.getOperand(2);
+  SDLoc DL(Op);
+  MVT VT = Op.getSimpleValueType();
+  MVT GRLenVT = Subtarget.getGRLenVT();
+
+  if (SDValue V = combineSelectToBinOp(Op.getNode(), DAG, Subtarget))
+    return V;
+
+  if (Op.hasOneUse()) {
+    unsigned UseOpc = Op->user_begin()->getOpcode();
+    if (isBinOp(UseOpc) && DAG.isSafeToSpeculativelyExecute(UseOpc)) {
+      SDNode *BinOp = *Op->user_begin();
+      if (SDValue NewSel = foldBinOpIntoSelectIfProfitable(*Op->user_begin(),
+                                                           DAG, Subtarget)) {
+        DAG.ReplaceAllUsesWith(BinOp, &NewSel);
+        // Opcode check is necessary because foldBinOpIntoSelectIfProfitable
+        // may return a constant node and cause crash in lowerSELECT.
+        if (NewSel.getOpcode() == ISD::SELECT)
+          return lowerSELECT(NewSel, DAG);
+        return NewSel;
+      }
+    }
+  }
+
+  // If the condition is not an integer SETCC which operates on GRLenVT, we need
+  // to emit a LoongArchISD::SELECT_CC comparing the condition to zero. i.e.:
+  // (select condv, truev, falsev)
+  // -> (loongarchisd::select_cc condv, zero, setne, truev, falsev)
+  if (CondV.getOpcode() != ISD::SETCC ||
+      CondV.getOperand(0).getSimpleValueType() != GRLenVT) {
+    SDValue Zero = DAG.getConstant(0, DL, GRLenVT);
+    SDValue SetNE = DAG.getCondCode(ISD::SETNE);
+
+    SDValue Ops[] = {CondV, Zero, SetNE, TrueV, FalseV};
+
+    return DAG.getNode(LoongArchISD::SELECT_CC, DL, VT, Ops);
+  }
+
+  // If the CondV is the output of a SETCC node which operates on GRLenVT
+  // inputs, then merge the SETCC node into the lowered LoongArchISD::SELECT_CC
+  // to take advantage of the integer compare+branch instructions. i.e.: (select
+  // (setcc lhs, rhs, cc), truev, falsev)
+  // -> (loongarchisd::select_cc lhs, rhs, cc, truev, falsev)
+  SDValue LHS = CondV.getOperand(0);
+  SDValue RHS = CondV.getOperand(1);
+  ISD::CondCode CCVal = cast<CondCodeSDNode>(CondV.getOperand(2))->get();
+
+  // Special case for a select of 2 constants that have a difference of 1.
+  // Normally this is done by DAGCombine, but if the select is introduced by
+  // type legalization or op legalization, we miss it. Restricting to SETLT
+  // case for now because that is what signed saturating add/sub need.
+  // FIXME: We don't need the condition to be SETLT or even a SETCC,
+  // but we would probably want to swap the true/false values if the condition
+  // is SETGE/SETLE to avoid an XORI.
+  if (isa<ConstantSDNode>(TrueV) && isa<ConstantSDNode>(FalseV) &&
+      CCVal == ISD::SETLT) {
+    const APInt &TrueVal = TrueV->getAsAPIntVal();
+    const APInt &FalseVal = FalseV->getAsAPIntVal();
+    if (TrueVal - 1 == FalseVal)
+      return DAG.getNode(ISD::ADD, DL, VT, CondV, FalseV);
+    if (TrueVal + 1 == FalseVal)
+      return DAG.getNode(ISD::SUB, DL, VT, FalseV, CondV);
+  }
+
+  translateSetCCForBranch(DL, LHS, RHS, CCVal, DAG);
+  // 1 < x ? x : 1 -> 0 < x ? x : 1
+  if (isOneConstant(LHS) && (CCVal == ISD::SETLT || CCVal == ISD::SETULT) &&
+      RHS == TrueV && LHS == FalseV) {
+    LHS = DAG.getConstant(0, DL, VT);
+    // 0 <u x is the same as x != 0.
+    if (CCVal == ISD::SETULT) {
+      std::swap(LHS, RHS);
+      CCVal = ISD::SETNE;
+    }
+  }
+
+  // x <s -1 ? x : -1 -> x <s 0 ? x : -1
+  if (isAllOnesConstant(RHS) && CCVal == ISD::SETLT && LHS == TrueV &&
+      RHS == FalseV) {
+    RHS = DAG.getConstant(0, DL, VT);
+  }
+
+  SDValue TargetCC = DAG.getCondCode(CCVal);
+
+  if (isa<ConstantSDNode>(TrueV) && !isa<ConstantSDNode>(FalseV)) {
+    // (select (setcc lhs, rhs, CC), constant, falsev)
+    // -> (select (setcc lhs, rhs, InverseCC), falsev, constant)
+    std::swap(TrueV, FalseV);
+    TargetCC = DAG.getCondCode(ISD::getSetCCInverse(CCVal, LHS.getValueType()));
+  }
+
+  SDValue Ops[] = {LHS, RHS, TargetCC, TrueV, FalseV};
+  return DAG.getNode(LoongArchISD::SELECT_CC, DL, VT, Ops);
+}
+
 SDValue
 LoongArchTargetLowering::lowerSCALAR_TO_VECTOR(SDValue Op,
                                                SelectionDAG &DAG) const {
@@ -3835,6 +4170,10 @@ static SDValue performANDCombine(SDNode *N, SelectionDAG &DAG,
   SDValue NewOperand;
   MVT GRLenVT = Subtarget.getGRLenVT();
 
+  // BSTRPICK requires the 32S feature.
+  if (!Subtarget.has32S())
+    return SDValue();
+
   // Op's second operand must be a shifted mask.
   if (!(CN = dyn_cast<ConstantSDNode>(SecondOperand)) ||
       !isShiftedMask_64(CN->getZExtValue(), SMIdx, SMLen))
@@ -3907,6 +4246,10 @@ static SDValue performANDCombine(SDNode *N, SelectionDAG &DAG,
 static SDValue performSRLCombine(SDNode *N, SelectionDAG &DAG,
                                  TargetLowering::DAGCombinerInfo &DCI,
                                  const LoongArchSubtarget &Subtarget) {
+  // BSTRPICK requires the 32S feature.
+  if (!Subtarget.has32S())
+    return SDValue();
+
   if (DCI.isBeforeLegalizeOps())
     return SDValue();
 
@@ -3958,6 +4301,10 @@ static SDValue performORCombine(SDNode *N, SelectionDAG &DAG,
   unsigned Shamt;
   bool SwapAndRetried = false;
 
+  // BSTRPICK requires the 32S feature.
+  if (!Subtarget.has32S())
+    return SDValue();
+
   if (DCI.isBeforeLegalizeOps())
     return SDValue();
 
@@ -4974,7 +5321,7 @@ static MachineBasicBlock *insertDivByZeroTrap(MachineInstr &MI,
   // Build instructions:
   // MBB:
   //   div(or mod)   $dst, $dividend, $divisor
-  //   bnez          $divisor, SinkMBB
+  //   bne           $divisor, $zero, SinkMBB
   // BreakMBB:
   //   break         7 // BRK_DIVZERO
   // SinkMBB:
@@ -4997,8 +5344,9 @@ static MachineBasicBlock *insertDivByZeroTrap(MachineInstr &MI,
   Register DivisorReg = Divisor.getReg();
 
   // MBB:
-  BuildMI(MBB, DL, TII.get(LoongArch::BNEZ))
+  BuildMI(MBB, DL, TII.get(LoongArch::BNE))
       .addReg(DivisorReg, getKillRegState(Divisor.isKill()))
+      .addReg(LoongArch::R0)
       .addMBB(SinkMBB);
   MBB->addSuccessor(BreakMBB);
   MBB->addSuccessor(SinkMBB);
@@ -5243,6 +5591,150 @@ static MachineBasicBlock *emitPseudoCTPOP(MachineInstr &MI,
   return BB;
 }
 
+static bool isSelectPseudo(MachineInstr &MI) {
+  switch (MI.getOpcode()) {
+  default:
+    return false;
+  case LoongArch::Select_GPR_Using_CC_GPR:
+    return true;
+  }
+}
+
+static MachineBasicBlock *
+emitSelectPseudo(MachineInstr &MI, MachineBasicBlock *BB,
+                 const LoongArchSubtarget &Subtarget) {
+  // To "insert" Select_* instructions, we actually have to insert the triangle
+  // control-flow pattern.  The incoming instructions know the destination vreg
+  // to set, the condition code register to branch on, the true/false values to
+  // select between, and the condcode to use to select the appropriate branch.
+  //
+  // We produce the following control flow:
+  //     HeadMBB
+  //     |  \
+  //     |  IfFalseMBB
+  //     | /
+  //    TailMBB
+  //
+  // When we find a sequence of selects we attempt to optimize their emission
+  // by sharing the control flow. Currently we only handle cases where we have
+  // multiple selects with the exact same condition (same LHS, RHS and CC).
+  // The selects may be interleaved with other instructions if the other
+  // instructions meet some requirements we deem safe:
+  // - They are not pseudo instructions.
+  // - They are debug instructions. Otherwise,
+  // - They do not have side-effects, do not access memory and their inputs do
+  //   not depend on the results of the select pseudo-instructions.
+  // The TrueV/FalseV operands of the selects cannot depend on the result of
+  // previous selects in the sequence.
+  // These conditions could be further relaxed. See the X86 target for a
+  // related approach and more information.
+
+  Register LHS = MI.getOperand(1).getReg();
+  Register RHS;
+  if (MI.getOperand(2).isReg())
+    RHS = MI.getOperand(2).getReg();
+  auto CC = static_cast<unsigned>(MI.getOperand(3).getImm());
+
+  SmallVector<MachineInstr *, 4> SelectDebugValues;
+  SmallSet<Register, 4> SelectDests;
+  SelectDests.insert(MI.getOperand(0).getReg());
+
+  MachineInstr *LastSelectPseudo = &MI;
+  for (auto E = BB->end(), SequenceMBBI = MachineBasicBlock::iterator(MI);
+       SequenceMBBI != E; ++SequenceMBBI) {
+    if (SequenceMBBI->isDebugInstr())
+      continue;
+    if (isSelectPseudo(*SequenceMBBI)) {
+      if (SequenceMBBI->getOperand(1).getReg() != LHS ||
+          !SequenceMBBI->getOperand(2).isReg() ||
+          SequenceMBBI->getOperand(2).getReg() != RHS ||
+          SequenceMBBI->getOperand(3).getImm() != CC ||
+          SelectDests.count(SequenceMBBI->getOperand(4).getReg()) ||
+          SelectDests.count(SequenceMBBI->getOperand(5).getReg()))
+        break;
+      LastSelectPseudo = &*SequenceMBBI;
+      SequenceMBBI->collectDebugValues(SelectDebugValues);
+      SelectDests.insert(SequenceMBBI->getOperand(0).getReg());
+      continue;
+    }
+    if (SequenceMBBI->hasUnmodeledSideEffects() ||
+        SequenceMBBI->mayLoadOrStore() ||
+        SequenceMBBI->usesCustomInsertionHook())
+      break;
+    if (llvm::any_of(SequenceMBBI->operands(), [&](MachineOperand &MO) {
+          return MO.isReg() && MO.isUse() && SelectDests.count(MO.getReg());
+        }))
+      break;
+  }
+
+  const LoongArchInstrInfo &TII = *Subtarget.getInstrInfo();
+  const BasicBlock *LLVM_BB = BB->getBasicBlock();
+  DebugLoc DL = MI.getDebugLoc();
+  MachineFunction::iterator I = ++BB->getIterator();
+
+  MachineBasicBlock *HeadMBB = BB;
+  MachineFunction *F = BB->getParent();
+  MachineBasicBlock *TailMBB = F->CreateMachineBasicBlock(LLVM_BB);
+  MachineBasicBlock *IfFalseMBB = F->CreateMachineBasicBlock(LLVM_BB);
+
+  F->insert(I, IfFalseMBB);
+  F->insert(I, TailMBB);
+
+  // Set the call frame size on entry to the new basic blocks.
+  unsigned CallFrameSize = TII.getCallFrameSizeAt(*LastSelectPseudo);
+  IfFalseMBB->setCallFrameSize(CallFrameSize);
+  TailMBB->setCallFrameSize(CallFrameSize);
+
+  // Transfer debug instructions associated with the selects to TailMBB.
+  for (MachineInstr *DebugInstr : SelectDebugValues) {
+    TailMBB->push_back(DebugInstr->removeFromParent());
+  }
+
+  // Move all instructions after the sequence to TailMBB.
+  TailMBB->splice(TailMBB->end(), HeadMBB,
+                  std::next(LastSelectPseudo->getIterator()), HeadMBB->end());
+  // Update machine-CFG edges by transferring all successors of the current
+  // block to the new block which will contain the Phi nodes for the selects.
+  TailMBB->transferSuccessorsAndUpdatePHIs(HeadMBB);
+  // Set the successors for HeadMBB.
+  HeadMBB->addSuccessor(IfFalseMBB);
+  HeadMBB->addSuccessor(TailMBB);
+
+  // Insert appropriate branch.
+  if (MI.getOperand(2).isImm())
+    BuildMI(HeadMBB, DL, TII.get(CC))
+        .addReg(LHS)
+        .addImm(MI.getOperand(2).getImm())
+        .addMBB(TailMBB);
+  else
+    BuildMI(HeadMBB, DL, TII.get(CC)).addReg(LHS).addReg(RHS).addMBB(TailMBB);
+
+  // IfFalseMBB just falls through to TailMBB.
+  IfFalseMBB->addSuccessor(TailMBB);
+
+  // Create PHIs for all of the select pseudo-instructions.
+  auto SelectMBBI = MI.getIterator();
+  auto SelectEnd = std::next(LastSelectPseudo->getIterator());
+  auto InsertionPoint = TailMBB->begin();
+  while (SelectMBBI != SelectEnd) {
+    auto Next = std::next(SelectMBBI);
+    if (isSelectPseudo(*SelectMBBI)) {
+      // %Result = phi [ %TrueValue, HeadMBB ], [ %FalseValue, IfFalseMBB ]
+      BuildMI(*TailMBB, InsertionPoint, SelectMBBI->getDebugLoc(),
+              TII.get(LoongArch::PHI), SelectMBBI->getOperand(0).getReg())
+          .addReg(SelectMBBI->getOperand(4).getReg())
+          .addMBB(HeadMBB)
+          .addReg(SelectMBBI->getOperand(5).getReg())
+          .addMBB(IfFalseMBB);
+      SelectMBBI->eraseFromParent();
+    }
+    SelectMBBI = Next;
+  }
+
+  F->getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+  return TailMBB;
+}
+
 MachineBasicBlock *LoongArchTargetLowering::EmitInstrWithCustomInserter(
     MachineInstr &MI, MachineBasicBlock *BB) const {
   const TargetInstrInfo *TII = Subtarget.getInstrInfo();
@@ -5277,6 +5769,8 @@ MachineBasicBlock *LoongArchTargetLowering::EmitInstrWithCustomInserter(
     MI.eraseFromParent();
     return BB;
   }
+  case LoongArch::Select_GPR_Using_CC_GPR:
+    return emitSelectPseudo(MI, BB, Subtarget);
   case LoongArch::PseudoVBZ:
   case LoongArch::PseudoVBZ_B:
   case LoongArch::PseudoVBZ_H:
@@ -5349,6 +5843,7 @@ const char *LoongArchTargetLowering::getTargetNodeName(unsigned Opcode) const {
     NODE_NAME_CASE(TAIL)
     NODE_NAME_CASE(TAIL_MEDIUM)
     NODE_NAME_CASE(TAIL_LARGE)
+    NODE_NAME_CASE(SELECT_CC)
     NODE_NAME_CASE(SLL_W)
     NODE_NAME_CASE(SRA_W)
     NODE_NAME_CASE(SRL_W)
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelLowering.h b/llvm/lib/Target/LoongArch/LoongArchISelLowering.h
index 40a36b653b1b3..6bf295984dfc5 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelLowering.h
+++ b/llvm/lib/Target/LoongArch/LoongArchISelLowering.h
@@ -34,6 +34,9 @@ enum NodeType : unsigned {
   TAIL_MEDIUM,
   TAIL_LARGE,
 
+  // Select
+  SELECT_CC,
+
   // 32-bit shifts, directly matching the semantics of the named LoongArch
   // instructions.
   SLL_W,
@@ -357,6 +360,7 @@ class LoongArchTargetLowering : public TargetLowering {
   SDValue lowerBITREVERSE(SDValue Op, SelectionDAG &DAG) const;
   SDValue lowerSCALAR_TO_VECTOR(SDValue Op, SelectionDAG &DAG) const;
   SDValue lowerPREFETCH(SDValue Op, SelectionDAG &DAG) const;
+  SDValue lowerSELECT(SDValue Op, SelectionDAG &DAG) const;
 
   bool isFPImmLegal(const APFloat &Imm, EVT VT,
                     bool ForCodeSize) const override;
diff --git a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
index 69d6266a4bf54..f3c132cdb42e6 100644
--- a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
+++ b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
@@ -26,6 +26,11 @@ def SDT_LoongArchIntBinOpW : SDTypeProfile<1, 2, [
   SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisVT<0, i64>
 ]>;
 
+def SDT_LoongArchSelectCC : SDTypeProfile<1, 5, [SDTCisSameAs<1, 2>,
+                                                 SDTCisVT<3, OtherVT>,
+                                                 SDTCisSameAs<0, 4>,
+                                                 SDTCisSameAs<4, 5>]>;
+
 def SDT_LoongArchBStrIns: SDTypeProfile<1, 4, [
   SDTCisInt<0>, SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisInt<3>,
   SDTCisSameAs<3, 4>
@@ -81,6 +86,7 @@ def loongarch_call_large : SDNode<"LoongArchISD::CALL_LARGE", SDT_LoongArchCall,
 def loongarch_tail_large : SDNode<"LoongArchISD::TAIL_LARGE", SDT_LoongArchCall,
                                   [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
                                    SDNPVariadic]>;
+def loongarch_selectcc : SDNode<"LoongArchISD::SELECT_CC", SDT_LoongArchSelectCC>;
 def loongarch_sll_w : SDNode<"LoongArchISD::SLL_W", SDT_LoongArchIntBinOpW>;
 def loongarch_sra_w : SDNode<"LoongArchISD::SRA_W", SDT_LoongArchIntBinOpW>;
 def loongarch_srl_w : SDNode<"LoongArchISD::SRL_W", SDT_LoongArchIntBinOpW>;
@@ -769,7 +775,6 @@ class IOCSRWR<bits<32> op>
 def ADD_W : ALU_3R<0x00100000>;
 def SUB_W : ALU_3R<0x00110000>;
 def ADDI_W : ALU_2RI12<0x02800000, simm12_addlike>;
-def ALSL_W : ALU_3RI2<0x00040000, uimm2_plus1>;
 let isReMaterializable = 1 in {
 def LU12I_W : ALU_1RI20<0x14000000, simm20_lu12iw>;
 }
@@ -777,15 +782,11 @@ def SLT  : ALU_3R<0x00120000>;
 def SLTU : ALU_3R<0x00128000>;
 def SLTI  : ALU_2RI12<0x02000000, simm12>;
 def SLTUI : ALU_2RI12<0x02400000, simm12>;
-def PCADDI    : ALU_1RI20<0x18000000, simm20_pcaddi>;
 def PCADDU12I : ALU_1RI20<0x1c000000, simm20>;
-def PCALAU12I : ALU_1RI20<0x1a000000, simm20_pcalau12i>;
 def AND  : ALU_3R<0x00148000>;
 def OR   : ALU_3R<0x00150000>;
 def NOR  : ALU_3R<0x00140000>;
 def XOR  : ALU_3R<0x00158000>;
-def ANDN : ALU_3R<0x00168000>;
-def ORN  : ALU_3R<0x00160000>;
 def ANDI : ALU_2RI12<0x03400000, uimm12>;
 // See LoongArchInstrInfo::isAsCheapAsAMove for more details.
 let isReMaterializable = 1, isAsCheapAsAMove = 1 in {
@@ -806,34 +807,10 @@ def MOD_WU  : ALU_3R<0x00218000>;
 def SLL_W  : ALU_3R<0x00170000>;
 def SRL_W  : ALU_3R<0x00178000>;
 def SRA_W  : ALU_3R<0x00180000>;
-def ROTR_W : ALU_3R<0x001b0000>;
 
 def SLLI_W  : ALU_2RI5<0x00408000, uimm5>;
 def SRLI_W  : ALU_2RI5<0x00448000, uimm5>;
 def SRAI_W  : ALU_2RI5<0x00488000, uimm5>;
-def ROTRI_W : ALU_2RI5<0x004c8000, uimm5>;
-
-// Bit-manipulation Instructions
-def EXT_W_B : ALU_2R<0x00005c00>;
-def EXT_W_H : ALU_2R<0x00005800>;
-def CLO_W   : ALU_2R<0x00001000>;
-def CLZ_W   : ALU_2R<0x00001400>;
-def CTO_W   : ALU_2R<0x00001800>;
-def CTZ_W   : ALU_2R<0x00001c00>;
-def BYTEPICK_W : ALU_3RI2<0x00080000, uimm2>;
-def REVB_2H   : ALU_2R<0x00003000>;
-def BITREV_4B : ALU_2R<0x00004800>;
-def BITREV_W  : ALU_2R<0x00005000>;
-let Constraints = "$rd = $dst" in {
-def BSTRINS_W  : FmtBSTR_W<0x00600000, (outs GPR:$dst),
-                           (ins GPR:$rd, GPR:$rj, uimm5:$msbw, uimm5:$lsbw),
-                           "$rd, $rj, $msbw, $lsbw">;
-}
-def BSTRPICK_W : FmtBSTR_W<0x00608000, (outs GPR:$rd),
-                           (ins GPR:$rj, uimm5:$msbw, uimm5:$lsbw),
-                           "$rd, $rj, $msbw, $lsbw">;
-def MASKEQZ : ALU_3R<0x00130000>;
-def MASKNEZ : ALU_3R<0x00138000>;
 
 // Branch Instructions
 def BEQ  : BrCC_2RI16<0x58000000>;
@@ -842,8 +819,6 @@ def BLT  : BrCC_2RI16<0x60000000>;
 def BGE  : BrCC_2RI16<0x64000000>;
 def BLTU : BrCC_2RI16<0x68000000>;
 def BGEU : BrCC_2RI16<0x6c000000>;
-def BEQZ : BrCCZ_1RI21<0x40000000>;
-def BNEZ : BrCCZ_1RI21<0x44000000>;
 def B : Br_I26<0x50000000>;
 
 let hasSideEffects = 0, mayLoad = 0, mayStore = 0, isCall = 1, Defs=[R1] in
@@ -868,8 +843,6 @@ def PRELD : FmtPRELD<(outs), (ins uimm5:$imm5, GPR:$rj, simm12:$imm12),
 // Atomic Memory Access Instructions
 def LL_W : LLBase<0x20000000>;
 def SC_W : SCBase<0x21000000>;
-def LLACQ_W : LLBase_ACQ<0x38578000>;
-def SCREL_W : SCBase_REL<0x38578400>;
 
 // Barrier Instructions
 def DBAR : MISC_I15<0x38720000>;
@@ -880,12 +853,53 @@ def SYSCALL : MISC_I15<0x002b0000>;
 def BREAK   : MISC_I15<0x002a0000>;
 def RDTIMEL_W : RDTIME_2R<0x00006000>;
 def RDTIMEH_W : RDTIME_2R<0x00006400>;
-def CPUCFG : ALU_2R<0x00006c00>;
 
 // Cache Maintenance Instructions
 def CACOP : FmtCACOP<(outs), (ins uimm5:$op, GPR:$rj, simm12:$imm12),
                      "$op, $rj, $imm12">;
 
+let Predicates = [Has32S] in {
+// Arithmetic Operation Instructions
+def ALSL_W : ALU_3RI2<0x00040000, uimm2_plus1>;
+def ANDN : ALU_3R<0x00168000>;
+def ORN  : ALU_3R<0x00160000>;
+def PCADDI    : ALU_1RI20<0x18000000, simm20_pcaddi>;
+def PCALAU12I : ALU_1RI20<0x1a000000, simm20_pcalau12i>;
+
+// Bit-shift Instructions
+def ROTR_W : ALU_3R<0x001b0000>;
+def ROTRI_W : ALU_2RI5<0x004c8000, uimm5>;
+
+// Bit-manipulation Instructions
+def EXT_W_B : ALU_2R<0x00005c00>;
+def EXT_W_H : ALU_2R<0x00005800>;
+def CLO_W   : ALU_2R<0x00001000>;
+def CLZ_W   : ALU_2R<0x00001400>;
+def CTO_W   : ALU_2R<0x00001800>;
+def CTZ_W   : ALU_2R<0x00001c00>;
+def BYTEPICK_W : ALU_3RI2<0x00080000, uimm2>;
+def REVB_2H   : ALU_2R<0x00003000>;
+def BITREV_4B : ALU_2R<0x00004800>;
+def BITREV_W  : ALU_2R<0x00005000>;
+let Constraints = "$rd = $dst" in {
+def BSTRINS_W  : FmtBSTR_W<0x00600000, (outs GPR:$dst),
+                           (ins GPR:$rd, GPR:$rj, uimm5:$msbw, uimm5:$lsbw),
+                           "$rd, $rj, $msbw, $lsbw">;
+}
+def BSTRPICK_W : FmtBSTR_W<0x00608000, (outs GPR:$rd),
+                           (ins GPR:$rj, uimm5:$msbw, uimm5:$lsbw),
+                           "$rd, $rj, $msbw, $lsbw">;
+def MASKEQZ : ALU_3R<0x00130000>;
+def MASKNEZ : ALU_3R<0x00138000>;
+
+// Branch Instructions
+def BEQZ : BrCCZ_1RI21<0x40000000>;
+def BNEZ : BrCCZ_1RI21<0x44000000>;
+
+// Other Miscellaneous Instructions
+def CPUCFG : ALU_2R<0x00006c00>;
+} // Predicates = [Has32S]
+
 /// LA64 instructions
 
 let Predicates = [IsLA64] in {
@@ -1054,6 +1068,8 @@ def AMCAS__DB_D  : AMCAS_3R<0x385b8000>;
 def LL_D : LLBase<0x22000000>;
 def SC_D : SCBase<0x23000000>;
 def SC_Q : SCBase_128<0x38570000>;
+def LLACQ_W : LLBase_ACQ<0x38578000>;
+def SCREL_W : SCBase_REL<0x38578400>;
 def LLACQ_D : LLBase_ACQ<0x38578800>;
 def SCREL_D : SCBase_REL<0x38578C00>;
 
@@ -1145,6 +1161,9 @@ def : PatGprGpr<urem, MOD_WU>;
 def : PatGprGpr<mul, MUL_W>;
 def : PatGprGpr<mulhs, MULH_W>;
 def : PatGprGpr<mulhu, MULH_WU>;
+} // Predicates = [IsLA32]
+
+let Predicates = [IsLA32, Has32S] in {
 def : PatGprGpr<shiftop<rotr>, ROTR_W>;
 def : PatGprImm<rotr, ROTRI_W, uimm5>;
 
@@ -1154,7 +1173,7 @@ foreach Idx = 1...3 in {
   def : Pat<(or (shl GPR:$rk, (i32 ShamtA)), (srl GPR:$rj, (i32 ShamtB))),
             (BYTEPICK_W GPR:$rj, GPR:$rk, Idx)>;
 }
-} // Predicates = [IsLA32]
+} // Predicates = [IsLA32, Has32S]
 
 let Predicates = [IsLA64] in {
 def : PatGprGpr<add, ADD_D>;
@@ -1214,7 +1233,7 @@ def : Pat<(sext_inreg (add GPR:$rj, (AddiPair:$im)), i32),
                   (AddiPairImmSmall AddiPair:$im))>;
 } // Predicates = [IsLA64]
 
-let Predicates = [IsLA32] in {
+let Predicates = [IsLA32, Has32S] in {
 foreach Idx0 = 1...4 in {
   foreach Idx1 = 1...4 in {
     defvar CImm = !add(1, !shl(!add(1, !shl(1, Idx0)), Idx1));
@@ -1232,7 +1251,7 @@ foreach Idx0 = 1...4 in {
                       (ALSL_W GPR:$r, GPR:$r, (i32 Idx0)), (i32 Idx1))>;
   }
 }
-} // Predicates = [IsLA32]
+} // Predicates = [IsLA32, Has32S]
 
 let Predicates = [IsLA64] in {
 foreach Idx0 = 1...4 in {
@@ -1260,11 +1279,11 @@ foreach Idx0 = 1...4 in {
 }
 } // Predicates = [IsLA64]
 
-let Predicates = [IsLA32] in {
+let Predicates = [IsLA32, Has32S] in {
 def : Pat<(mul GPR:$rj, (AlslSlliImm:$im)),
           (SLLI_W (ALSL_W GPR:$rj, GPR:$rj, (AlslSlliImmI0 AlslSlliImm:$im)),
                   (AlslSlliImmI1 AlslSlliImm:$im))>;
-} // Predicates = [IsLA32]
+} // Predicates = [IsLA32, Has32S]
 
 let Predicates = [IsLA64] in {
 def : Pat<(sext_inreg (mul GPR:$rj, (AlslSlliImm:$im)), i32),
@@ -1305,11 +1324,11 @@ def : Pat<(not (or GPR:$rj, GPR:$rk)), (NOR GPR:$rj, GPR:$rk)>;
 def : Pat<(or GPR:$rj, (not GPR:$rk)), (ORN GPR:$rj, GPR:$rk)>;
 def : Pat<(and GPR:$rj, (not GPR:$rk)), (ANDN GPR:$rj, GPR:$rk)>;
 
-let Predicates = [IsLA32] in {
+let Predicates = [IsLA32, Has32S] in {
 def : Pat<(and GPR:$rj, BstrinsImm:$imm),
           (BSTRINS_W GPR:$rj, R0, (BstrinsMsb BstrinsImm:$imm),
                      (BstrinsLsb BstrinsImm:$imm))>;
-} // Predicates = [IsLA32]
+} // Predicates = [IsLA32, Has32S]
 
 let Predicates = [IsLA64] in {
 def : Pat<(and GPR:$rj, BstrinsImm:$imm),
@@ -1345,12 +1364,12 @@ def : Pat<(loongarch_clzw (not GPR:$rj)), (CLO_W GPR:$rj)>;
 def : Pat<(loongarch_ctzw (not GPR:$rj)), (CTO_W GPR:$rj)>;
 } // Predicates = [IsLA64]
 
-let Predicates = [IsLA32] in {
+let Predicates = [IsLA32, Has32S] in {
 def : PatGpr<ctlz, CLZ_W>;
 def : PatGpr<cttz, CTZ_W>;
 def : Pat<(ctlz (not GPR:$rj)), (CLO_W GPR:$rj)>;
 def : Pat<(cttz (not GPR:$rj)), (CTO_W GPR:$rj)>;
-} // Predicates = [IsLA32]
+} // Predicates = [IsLA32, Has32S]
 
 /// FrameIndex calculations
 let Predicates = [IsLA32] in {
@@ -1363,10 +1382,10 @@ def : Pat<(AddLike (i64 BaseAddr:$rj), simm12:$imm12),
 } // Predicates = [IsLA64]
 
 /// Shifted addition
-let Predicates = [IsLA32] in {
+let Predicates = [IsLA32, Has32S] in {
 def : Pat<(add GPR:$rk, (shl GPR:$rj, uimm2_plus1:$imm2)),
           (ALSL_W GPR:$rj, GPR:$rk, uimm2_plus1:$imm2)>;
-} // Predicates = [IsLA32]
+} // Predicates = [IsLA32, Has32S]
 let Predicates = [IsLA64] in {
 def : Pat<(add GPR:$rk, (shl GPR:$rj, uimm2_plus1:$imm2)),
           (ALSL_D GPR:$rj, GPR:$rk, uimm2_plus1:$imm2)>;
@@ -1402,8 +1421,10 @@ def : PatGprImm<srl, SRLI_D, uimm6>;
 
 /// sext and zext
 
+let Predicates = [Has32S] in {
 def : Pat<(sext_inreg GPR:$rj, i8), (EXT_W_B GPR:$rj)>;
 def : Pat<(sext_inreg GPR:$rj, i16), (EXT_W_H GPR:$rj)>;
+} // Predicates = [Has32S]
 
 let Predicates = [IsLA64] in {
 def : Pat<(sext_inreg GPR:$rj, i32), (ADDI_W GPR:$rj, 0)>;
@@ -1447,10 +1468,49 @@ def : Pat<(setle GPR:$rj, GPR:$rk), (XORI (SLT GPR:$rk, GPR:$rj), 1)>;
 
 /// Select
 
+def IntCCtoBranchOpc : SDNodeXForm<loongarch_selectcc, [{
+  ISD::CondCode CC = cast<CondCodeSDNode>(N->getOperand(2))->get();
+  unsigned BrCC = getBranchOpcForIntCC(CC);
+  return CurDAG->getTargetConstant(BrCC, SDLoc(N), Subtarget->getGRLenVT());
+}]>;
+
+def loongarch_selectcc_frag : PatFrag<(ops node:$lhs, node:$rhs, node:$cc,
+                                           node:$truev, node:$falsev),
+                                      (loongarch_selectcc node:$lhs, node:$rhs,
+                                                          node:$cc, node:$truev,
+                                                          node:$falsev), [{}],
+                                      IntCCtoBranchOpc>;
+
+multiclass SelectCC_GPR_rrirr<DAGOperand valty, ValueType vt> {
+  let usesCustomInserter = 1 in
+  def _Using_CC_GPR : Pseudo<(outs valty:$dst),
+                             (ins GPR:$lhs, GPR:$rhs, GPR:$cc,
+                              valty:$truev, valty:$falsev),
+                             [(set valty:$dst,
+                               (loongarch_selectcc_frag:$cc (GRLenVT GPR:$lhs), GPR:$rhs, cond,
+                                                            (vt valty:$truev), valty:$falsev))]>;
+  // Explicitly select 0 in the condition to R0. The register coalescer doesn't
+  // always do it.
+  def : Pat<(loongarch_selectcc_frag:$cc (GRLenVT GPR:$lhs), 0, cond, (vt valty:$truev),
+                                         valty:$falsev),
+            (!cast<Instruction>(NAME#"_Using_CC_GPR") GPR:$lhs, (GRLenVT R0),
+             (IntCCtoBranchOpc $cc), valty:$truev, valty:$falsev)>;
+}
+
+defm Select_GPR : SelectCC_GPR_rrirr<GPR, GRLenVT>;
+
+class SelectCompressOpt<CondCode Cond>
+    : Pat<(loongarch_selectcc_frag:$select (GRLenVT GPR:$lhs), simm12:$Constant, Cond,
+                                           (GRLenVT GPR:$truev), GPR:$falsev),
+    (Select_GPR_Using_CC_GPR (GRLenVT (ADDI_W GPR:$lhs, (NegImm simm12:$Constant))), (GRLenVT R0),
+                          (IntCCtoBranchOpc $select), GPR:$truev, GPR:$falsev)>;
+
+let Predicates = [Has32S] in {
 def : Pat<(select GPR:$cond, GPR:$t, 0), (MASKEQZ GPR:$t, GPR:$cond)>;
 def : Pat<(select GPR:$cond, 0, GPR:$f), (MASKNEZ GPR:$f, GPR:$cond)>;
 def : Pat<(select GPR:$cond, GPR:$t, GPR:$f),
           (OR (MASKEQZ GPR:$t, GPR:$cond), (MASKNEZ GPR:$f, GPR:$cond))>;
+} // Predicates = [Has32S]
 
 /// Branches and jumps
 
@@ -1476,6 +1536,7 @@ def : BccSwapPat<setle, BGE>;
 def : BccSwapPat<setugt, BLTU>;
 def : BccSwapPat<setule, BGEU>;
 
+let Predicates = [Has32S] in {
 // An extra pattern is needed for a brcond without a setcc (i.e. where the
 // condition was calculated elsewhere).
 def : Pat<(brcond GPR:$rj, bb:$imm21), (BNEZ GPR:$rj, bb:$imm21)>;
@@ -1484,6 +1545,16 @@ def : Pat<(brcond (GRLenVT (seteq GPR:$rj, 0)), bb:$imm21),
           (BEQZ GPR:$rj, bb:$imm21)>;
 def : Pat<(brcond (GRLenVT (setne GPR:$rj, 0)), bb:$imm21),
           (BNEZ GPR:$rj, bb:$imm21)>;
+} // Predicates = [Has32S]
+
+// An extra pattern is needed for a brcond without a setcc (i.e. where the
+// condition was calculated elsewhere).
+def : Pat<(brcond GPR:$rj, bb:$imm16), (BNE GPR:$rj, R0, bb:$imm16)>;
+
+def : Pat<(brcond (GRLenVT (seteq GPR:$rj, 0)), bb:$imm16),
+          (BEQ GPR:$rj, R0, bb:$imm16)>;
+def : Pat<(brcond (GRLenVT (setne GPR:$rj, 0)), bb:$imm16),
+          (BNE GPR:$rj, R0, bb:$imm16)>;
 
 let isBarrier = 1, isBranch = 1, isTerminator = 1 in
 def PseudoBR : Pseudo<(outs), (ins simm26_b:$imm26), [(br bb:$imm26)]>,
@@ -1741,12 +1812,12 @@ def : InstAlias<"la.local $dst, $tmp, $src",
 
 /// BSTRINS and BSTRPICK
 
-let Predicates = [IsLA32] in {
+let Predicates = [IsLA32, Has32S] in {
 def : Pat<(loongarch_bstrins GPR:$rd, GPR:$rj, uimm5:$msbd, uimm5:$lsbd),
           (BSTRINS_W GPR:$rd, GPR:$rj, uimm5:$msbd, uimm5:$lsbd)>;
 def : Pat<(loongarch_bstrpick GPR:$rj, uimm5:$msbd, uimm5:$lsbd),
           (BSTRPICK_W GPR:$rj, uimm5:$msbd, uimm5:$lsbd)>;
-} // Predicates = [IsLA32]
+} // Predicates = [IsLA32, Has32S]
 
 let Predicates = [IsLA64] in {
 def : Pat<(loongarch_bstrins GPR:$rd, GPR:$rj, uimm6:$msbd, uimm6:$lsbd),
@@ -1760,12 +1831,12 @@ def : Pat<(loongarch_bstrpick GPR:$rj, uimm6:$msbd, uimm6:$lsbd),
 def : Pat<(loongarch_revb_2h GPR:$rj), (REVB_2H GPR:$rj)>;
 def : Pat<(loongarch_bitrev_4b GPR:$rj), (BITREV_4B GPR:$rj)>;
 
-let Predicates = [IsLA32] in {
+let Predicates = [IsLA32, Has32S] in {
 def : Pat<(bswap GPR:$rj), (ROTRI_W (REVB_2H GPR:$rj), 16)>;
 def : Pat<(bitreverse GPR:$rj), (BITREV_W GPR:$rj)>;
 def : Pat<(bswap (bitreverse GPR:$rj)), (BITREV_4B GPR:$rj)>;
 def : Pat<(bitreverse (bswap GPR:$rj)), (BITREV_4B GPR:$rj)>;
-} // Predicates = [IsLA32]
+} // Predicates = [IsLA32, Has32S]
 
 let Predicates = [IsLA64] in {
 def : Pat<(loongarch_revb_2w GPR:$rj), (REVB_2W GPR:$rj)>;
@@ -2298,6 +2369,12 @@ def : InstAlias<"move $dst, $src", (OR GPR:$dst, GPR:$src, R0)>;
 def : InstAlias<"ret", (JIRL R0, R1, 0)>;
 def : InstAlias<"jr $rj", (JIRL R0, GPR:$rj, 0)>;
 
+let Predicates = [Not32S] in {
+def : InstAlias<"rdcntid.w $rj", (RDTIMEL_W R0, GPR:$rj)>;
+def : InstAlias<"rdcntvh.w $rd", (RDTIMEH_W GPR:$rd, R0)>;
+def : InstAlias<"rdcntvl.w $rd", (RDTIMEL_W GPR:$rd, R0)>;
+} // Predicates = [Not32S]
+
 // Branches implemented with aliases.
 // Disassemble branch instructions not having a $zero operand to the
 // canonical mnemonics respectively, but disassemble BLT/BGE with a $zero
diff --git a/llvm/test/CodeGen/LoongArch/alloca.ll b/llvm/test/CodeGen/LoongArch/alloca.ll
index 8a3b2aefaee6a..90e39e0b1d3a4 100644
--- a/llvm/test/CodeGen/LoongArch/alloca.ll
+++ b/llvm/test/CodeGen/LoongArch/alloca.ll
@@ -1,6 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d --verify-machineinstrs < %s \
-; RUN:   | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d --verify-machineinstrs < %s \
+; RUN:   | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --verify-machineinstrs < %s \
+; RUN:   | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA64
 
@@ -10,22 +12,40 @@ declare void @notdead(ptr)
 ;; pointer
 
 define void @simple_alloca(i32 %n) nounwind {
-; LA32-LABEL: simple_alloca:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    st.w $fp, $sp, 8 # 4-byte Folded Spill
-; LA32-NEXT:    addi.w $fp, $sp, 16
-; LA32-NEXT:    addi.w $a0, $a0, 15
-; LA32-NEXT:    bstrins.w $a0, $zero, 3, 0
-; LA32-NEXT:    sub.w $a0, $sp, $a0
-; LA32-NEXT:    move $sp, $a0
-; LA32-NEXT:    bl notdead
-; LA32-NEXT:    addi.w $sp, $fp, -16
-; LA32-NEXT:    ld.w $fp, $sp, 8 # 4-byte Folded Reload
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: simple_alloca:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    st.w $fp, $sp, 8 # 4-byte Folded Spill
+; LA32R-NEXT:    addi.w $fp, $sp, 16
+; LA32R-NEXT:    addi.w $a0, $a0, 15
+; LA32R-NEXT:    addi.w $a1, $zero, -16
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    sub.w $a0, $sp, $a0
+; LA32R-NEXT:    move $sp, $a0
+; LA32R-NEXT:    bl notdead
+; LA32R-NEXT:    addi.w $sp, $fp, -16
+; LA32R-NEXT:    ld.w $fp, $sp, 8 # 4-byte Folded Reload
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: simple_alloca:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    st.w $fp, $sp, 8 # 4-byte Folded Spill
+; LA32S-NEXT:    addi.w $fp, $sp, 16
+; LA32S-NEXT:    addi.w $a0, $a0, 15
+; LA32S-NEXT:    bstrins.w $a0, $zero, 3, 0
+; LA32S-NEXT:    sub.w $a0, $sp, $a0
+; LA32S-NEXT:    move $sp, $a0
+; LA32S-NEXT:    bl notdead
+; LA32S-NEXT:    addi.w $sp, $fp, -16
+; LA32S-NEXT:    ld.w $fp, $sp, 8 # 4-byte Folded Reload
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: simple_alloca:
 ; LA64:       # %bb.0:
@@ -55,26 +75,48 @@ declare ptr @llvm.stacksave()
 declare void @llvm.stackrestore(ptr)
 
 define void @scoped_alloca(i32 %n) nounwind {
-; LA32-LABEL: scoped_alloca:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    st.w $fp, $sp, 8 # 4-byte Folded Spill
-; LA32-NEXT:    st.w $s0, $sp, 4 # 4-byte Folded Spill
-; LA32-NEXT:    addi.w $fp, $sp, 16
-; LA32-NEXT:    move $s0, $sp
-; LA32-NEXT:    addi.w $a0, $a0, 15
-; LA32-NEXT:    bstrins.w $a0, $zero, 3, 0
-; LA32-NEXT:    sub.w $a0, $sp, $a0
-; LA32-NEXT:    move $sp, $a0
-; LA32-NEXT:    bl notdead
-; LA32-NEXT:    move $sp, $s0
-; LA32-NEXT:    addi.w $sp, $fp, -16
-; LA32-NEXT:    ld.w $s0, $sp, 4 # 4-byte Folded Reload
-; LA32-NEXT:    ld.w $fp, $sp, 8 # 4-byte Folded Reload
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: scoped_alloca:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    st.w $fp, $sp, 8 # 4-byte Folded Spill
+; LA32R-NEXT:    st.w $s0, $sp, 4 # 4-byte Folded Spill
+; LA32R-NEXT:    addi.w $fp, $sp, 16
+; LA32R-NEXT:    move $s0, $sp
+; LA32R-NEXT:    addi.w $a0, $a0, 15
+; LA32R-NEXT:    addi.w $a1, $zero, -16
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    sub.w $a0, $sp, $a0
+; LA32R-NEXT:    move $sp, $a0
+; LA32R-NEXT:    bl notdead
+; LA32R-NEXT:    move $sp, $s0
+; LA32R-NEXT:    addi.w $sp, $fp, -16
+; LA32R-NEXT:    ld.w $s0, $sp, 4 # 4-byte Folded Reload
+; LA32R-NEXT:    ld.w $fp, $sp, 8 # 4-byte Folded Reload
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: scoped_alloca:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    st.w $fp, $sp, 8 # 4-byte Folded Spill
+; LA32S-NEXT:    st.w $s0, $sp, 4 # 4-byte Folded Spill
+; LA32S-NEXT:    addi.w $fp, $sp, 16
+; LA32S-NEXT:    move $s0, $sp
+; LA32S-NEXT:    addi.w $a0, $a0, 15
+; LA32S-NEXT:    bstrins.w $a0, $zero, 3, 0
+; LA32S-NEXT:    sub.w $a0, $sp, $a0
+; LA32S-NEXT:    move $sp, $a0
+; LA32S-NEXT:    bl notdead
+; LA32S-NEXT:    move $sp, $s0
+; LA32S-NEXT:    addi.w $sp, $fp, -16
+; LA32S-NEXT:    ld.w $s0, $sp, 4 # 4-byte Folded Reload
+; LA32S-NEXT:    ld.w $fp, $sp, 8 # 4-byte Folded Reload
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: scoped_alloca:
 ; LA64:       # %bb.0:
@@ -111,39 +153,74 @@ declare void @func(ptr, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32)
 ;; Check that outgoing arguments passed on the stack do not corrupt a
 ;; variable-sized stack object.
 define void @alloca_callframe(i32 %n) nounwind {
-; LA32-LABEL: alloca_callframe:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    st.w $fp, $sp, 8 # 4-byte Folded Spill
-; LA32-NEXT:    addi.w $fp, $sp, 16
-; LA32-NEXT:    addi.w $a0, $a0, 15
-; LA32-NEXT:    bstrins.w $a0, $zero, 3, 0
-; LA32-NEXT:    sub.w $a0, $sp, $a0
-; LA32-NEXT:    move $sp, $a0
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    ori $a1, $zero, 12
-; LA32-NEXT:    st.w $a1, $sp, 12
-; LA32-NEXT:    ori $a1, $zero, 11
-; LA32-NEXT:    st.w $a1, $sp, 8
-; LA32-NEXT:    ori $a1, $zero, 10
-; LA32-NEXT:    st.w $a1, $sp, 4
-; LA32-NEXT:    ori $t0, $zero, 9
-; LA32-NEXT:    ori $a1, $zero, 2
-; LA32-NEXT:    ori $a2, $zero, 3
-; LA32-NEXT:    ori $a3, $zero, 4
-; LA32-NEXT:    ori $a4, $zero, 5
-; LA32-NEXT:    ori $a5, $zero, 6
-; LA32-NEXT:    ori $a6, $zero, 7
-; LA32-NEXT:    ori $a7, $zero, 8
-; LA32-NEXT:    st.w $t0, $sp, 0
-; LA32-NEXT:    bl func
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    addi.w $sp, $fp, -16
-; LA32-NEXT:    ld.w $fp, $sp, 8 # 4-byte Folded Reload
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: alloca_callframe:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    st.w $fp, $sp, 8 # 4-byte Folded Spill
+; LA32R-NEXT:    addi.w $fp, $sp, 16
+; LA32R-NEXT:    addi.w $a0, $a0, 15
+; LA32R-NEXT:    addi.w $a1, $zero, -16
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    sub.w $a0, $sp, $a0
+; LA32R-NEXT:    move $sp, $a0
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    ori $a1, $zero, 12
+; LA32R-NEXT:    st.w $a1, $sp, 12
+; LA32R-NEXT:    ori $a1, $zero, 11
+; LA32R-NEXT:    st.w $a1, $sp, 8
+; LA32R-NEXT:    ori $a1, $zero, 10
+; LA32R-NEXT:    st.w $a1, $sp, 4
+; LA32R-NEXT:    ori $t0, $zero, 9
+; LA32R-NEXT:    ori $a1, $zero, 2
+; LA32R-NEXT:    ori $a2, $zero, 3
+; LA32R-NEXT:    ori $a3, $zero, 4
+; LA32R-NEXT:    ori $a4, $zero, 5
+; LA32R-NEXT:    ori $a5, $zero, 6
+; LA32R-NEXT:    ori $a6, $zero, 7
+; LA32R-NEXT:    ori $a7, $zero, 8
+; LA32R-NEXT:    st.w $t0, $sp, 0
+; LA32R-NEXT:    bl func
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    addi.w $sp, $fp, -16
+; LA32R-NEXT:    ld.w $fp, $sp, 8 # 4-byte Folded Reload
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alloca_callframe:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    st.w $fp, $sp, 8 # 4-byte Folded Spill
+; LA32S-NEXT:    addi.w $fp, $sp, 16
+; LA32S-NEXT:    addi.w $a0, $a0, 15
+; LA32S-NEXT:    bstrins.w $a0, $zero, 3, 0
+; LA32S-NEXT:    sub.w $a0, $sp, $a0
+; LA32S-NEXT:    move $sp, $a0
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    ori $a1, $zero, 12
+; LA32S-NEXT:    st.w $a1, $sp, 12
+; LA32S-NEXT:    ori $a1, $zero, 11
+; LA32S-NEXT:    st.w $a1, $sp, 8
+; LA32S-NEXT:    ori $a1, $zero, 10
+; LA32S-NEXT:    st.w $a1, $sp, 4
+; LA32S-NEXT:    ori $t0, $zero, 9
+; LA32S-NEXT:    ori $a1, $zero, 2
+; LA32S-NEXT:    ori $a2, $zero, 3
+; LA32S-NEXT:    ori $a3, $zero, 4
+; LA32S-NEXT:    ori $a4, $zero, 5
+; LA32S-NEXT:    ori $a5, $zero, 6
+; LA32S-NEXT:    ori $a6, $zero, 7
+; LA32S-NEXT:    ori $a7, $zero, 8
+; LA32S-NEXT:    st.w $t0, $sp, 0
+; LA32S-NEXT:    bl func
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    addi.w $sp, $fp, -16
+; LA32S-NEXT:    ld.w $fp, $sp, 8 # 4-byte Folded Reload
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alloca_callframe:
 ; LA64:       # %bb.0:
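
Note on the alloca.ll hunks above: both 32-bit lowerings round the requested size up and clear its low four bits, i.e. (n + 15) & ~15, to keep $sp 16-byte aligned. LA32S does the masking with a single bstrins.w, while LA32R, which lacks BSTRINS.W, materializes the -16 mask with addi.w and applies an and. A minimal sketch of the input, assuming a test body along these lines (the IR itself sits outside the hunks, so this is a hypothetical reconstruction):

declare void @notdead(ptr)

; Hypothetical reconstruction of the simple_alloca test body; only the
; CHECK lines are visible in the hunks above.
define void @simple_alloca(i32 %n) nounwind {
  %buf = alloca i8, i32 %n          ; variable-sized stack object
  call void @notdead(ptr %buf)
  ret void
}
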
diff --git a/llvm/test/CodeGen/LoongArch/alsl.ll b/llvm/test/CodeGen/LoongArch/alsl.ll
index 34baccc60d547..da3460953c5c9 100644
--- a/llvm/test/CodeGen/LoongArch/alsl.ll
+++ b/llvm/test/CodeGen/LoongArch/alsl.ll
@@ -1,12 +1,19 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 define i8 @alsl_i8(i8 signext %a, i8 signext %b) nounwind {
-; LA32-LABEL: alsl_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_i8:
 ; LA64:       # %bb.0: # %entry
@@ -19,10 +26,16 @@ entry:
 }
 
 define i16 @alsl_i16(i16 signext %a, i16 signext %b) nounwind {
-; LA32-LABEL: alsl_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 2
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_i16:
 ; LA64:       # %bb.0: # %entry
@@ -35,10 +48,16 @@ entry:
 }
 
 define i32 @alsl_i32(i32 signext %a, i32 signext %b) nounwind {
-; LA32-LABEL: alsl_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_i32:
 ; LA64:       # %bb.0: # %entry
@@ -51,16 +70,28 @@ entry:
 }
 
 define i64 @alsl_i64(i64 signext %a, i64 signext %b) nounwind {
-; LA32-LABEL: alsl_i64:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    srli.w $a4, $a0, 28
-; LA32-NEXT:    slli.w $a1, $a1, 4
-; LA32-NEXT:    or $a1, $a1, $a4
-; LA32-NEXT:    alsl.w $a0, $a0, $a2, 4
-; LA32-NEXT:    sltu $a2, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    add.w $a1, $a1, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_i64:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    srli.w $a4, $a0, 28
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    or $a1, $a1, $a4
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    sltu $a2, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    add.w $a1, $a1, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_i64:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    srli.w $a4, $a0, 28
+; LA32S-NEXT:    slli.w $a1, $a1, 4
+; LA32S-NEXT:    or $a1, $a1, $a4
+; LA32S-NEXT:    alsl.w $a0, $a0, $a2, 4
+; LA32S-NEXT:    sltu $a2, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    add.w $a1, $a1, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_i64:
 ; LA64:       # %bb.0: # %entry
@@ -73,11 +104,18 @@ entry:
 }
 
 define i32 @alsl_zext_i8(i8 signext %a, i8 signext %b) nounwind {
-; LA32-LABEL: alsl_zext_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 1
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_zext_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    andi $a0, $a0, 255
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_zext_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 1
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_zext_i8:
 ; LA64:       # %bb.0: # %entry
@@ -92,11 +130,20 @@ entry:
 }
 
 define i32 @alsl_zext_i16(i16 signext %a, i16 signext %b) nounwind {
-; LA32-LABEL: alsl_zext_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 2
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_zext_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4095
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_zext_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 2
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_zext_i16:
 ; LA64:       # %bb.0: # %entry
@@ -111,11 +158,18 @@ entry:
 }
 
 define i64 @alsl_zext_i32(i32 signext %a, i32 signext %b) nounwind {
-; LA32-LABEL: alsl_zext_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 3
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_zext_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_zext_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 3
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_zext_i32:
 ; LA64:       # %bb.0: # %entry
@@ -129,11 +183,18 @@ entry:
 }
 
 define i8 @mul_add_i8(i8 signext %a, i8 signext %b) nounwind {
-; LA32-LABEL: mul_add_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 1
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_add_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a2, $a0, 1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_add_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 1
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_add_i8:
 ; LA64:       # %bb.0: # %entry
@@ -147,12 +208,20 @@ entry:
 }
 
 define i16 @mul_add_i16(i16 signext %a, i16 signext %b) nounwind {
-; LA32-LABEL: mul_add_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    alsl.w $a0, $a0, $a2, 1
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_add_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a2, $a0, 1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    add.w $a0, $a0, $a2
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_add_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    alsl.w $a0, $a0, $a2, 1
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_add_i16:
 ; LA64:       # %bb.0: # %entry
@@ -167,12 +236,20 @@ entry:
 }
 
 define i32 @mul_add_i32(i32 signext %a, i32 signext %b) nounwind {
-; LA32-LABEL: mul_add_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    alsl.w $a0, $a0, $a2, 2
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_add_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a2, $a0, 2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    add.w $a0, $a0, $a2
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_add_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    alsl.w $a0, $a0, $a2, 2
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_add_i32:
 ; LA64:       # %bb.0: # %entry
@@ -187,20 +264,35 @@ entry:
 }
 
 define i64 @mul_add_i64(i64 signext %a, i64 signext %b) nounwind {
-; LA32-LABEL: mul_add_i64:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ori $a4, $zero, 15
-; LA32-NEXT:    mulh.wu $a4, $a0, $a4
-; LA32-NEXT:    slli.w $a5, $a1, 4
-; LA32-NEXT:    sub.w $a1, $a5, $a1
-; LA32-NEXT:    add.w $a1, $a4, $a1
-; LA32-NEXT:    slli.w $a4, $a0, 4
-; LA32-NEXT:    sub.w $a0, $a4, $a0
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    add.w $a0, $a2, $a0
-; LA32-NEXT:    sltu $a2, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a1, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_add_i64:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    ori $a4, $zero, 15
+; LA32R-NEXT:    mulh.wu $a4, $a0, $a4
+; LA32R-NEXT:    slli.w $a5, $a1, 4
+; LA32R-NEXT:    sub.w $a1, $a5, $a1
+; LA32R-NEXT:    add.w $a1, $a4, $a1
+; LA32R-NEXT:    slli.w $a4, $a0, 4
+; LA32R-NEXT:    sub.w $a0, $a4, $a0
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    sltu $a2, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a1, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_add_i64:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ori $a4, $zero, 15
+; LA32S-NEXT:    mulh.wu $a4, $a0, $a4
+; LA32S-NEXT:    slli.w $a5, $a1, 4
+; LA32S-NEXT:    sub.w $a1, $a5, $a1
+; LA32S-NEXT:    add.w $a1, $a4, $a1
+; LA32S-NEXT:    slli.w $a4, $a0, 4
+; LA32S-NEXT:    sub.w $a0, $a4, $a0
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    add.w $a0, $a2, $a0
+; LA32S-NEXT:    sltu $a2, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a1, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_add_i64:
 ; LA64:       # %bb.0: # %entry
@@ -215,12 +307,20 @@ entry:
 }
 
 define i32 @mul_add_zext_i8(i8 signext %a, i8 signext %b) nounwind {
-; LA32-LABEL: mul_add_zext_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 2
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_add_zext_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a2, $a0, 2
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    andi $a0, $a0, 255
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_add_zext_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 2
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_add_zext_i8:
 ; LA64:       # %bb.0: # %entry
@@ -236,13 +336,23 @@ entry:
 }
 
 define i32 @mul_add_zext_i16(i16 signext %a, i16 signext %b) nounwind {
-; LA32-LABEL: mul_add_zext_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    slli.w $a2, $a0, 4
-; LA32-NEXT:    sub.w $a0, $a2, $a0
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_add_zext_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a2, $a0, 4
+; LA32R-NEXT:    sub.w $a0, $a2, $a0
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4095
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_add_zext_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    slli.w $a2, $a0, 4
+; LA32S-NEXT:    sub.w $a0, $a2, $a0
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_add_zext_i16:
 ; LA64:       # %bb.0: # %entry
@@ -259,12 +369,20 @@ entry:
 }
 
 define i64 @mul_add_zext_i32(i32 signext %a, i32 signext %b) nounwind {
-; LA32-LABEL: mul_add_zext_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 2
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_add_zext_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a2, $a0, 2
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_add_zext_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 2
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_add_zext_i32:
 ; LA64:       # %bb.0: # %entry
@@ -280,11 +398,18 @@ entry:
 }
 
 define i8 @alsl_neg_i8(i8 signext %a, i8 signext %b) nounwind {
-; LA32-LABEL: alsl_neg_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 1
-; LA32-NEXT:    sub.w $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_neg_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a2, $a0, 1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    sub.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_neg_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 1
+; LA32S-NEXT:    sub.w $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_neg_i8:
 ; LA64:       # %bb.0: # %entry
@@ -298,11 +423,18 @@ entry:
 }
 
 define i16 @alsl_neg_i16(i16 signext %a, i16 signext %b) nounwind {
-; LA32-LABEL: alsl_neg_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 2
-; LA32-NEXT:    sub.w $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_neg_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a2, $a0, 2
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    sub.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_neg_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 2
+; LA32S-NEXT:    sub.w $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_neg_i16:
 ; LA64:       # %bb.0: # %entry
@@ -316,11 +448,18 @@ entry:
 }
 
 define i32 @alsl_neg_i32(i32 signext %a, i32 signext %b) nounwind {
-; LA32-LABEL: alsl_neg_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 3
-; LA32-NEXT:    sub.w $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: alsl_neg_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a2, $a0, 3
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    sub.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: alsl_neg_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 3
+; LA32S-NEXT:    sub.w $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: alsl_neg_i32:
 ; LA64:       # %bb.0: # %entry
@@ -334,21 +473,37 @@ entry:
 }
 
 define i64 @mul_add_neg_i64(i64 signext %a, i64 signext %b) nounwind {
-; LA32-LABEL: mul_add_neg_i64:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    slli.w $a4, $a1, 4
-; LA32-NEXT:    sub.w $a1, $a1, $a4
-; LA32-NEXT:    addi.w $a4, $zero, -15
-; LA32-NEXT:    mulh.wu $a4, $a0, $a4
-; LA32-NEXT:    sub.w $a4, $a4, $a0
-; LA32-NEXT:    add.w $a1, $a4, $a1
-; LA32-NEXT:    slli.w $a4, $a0, 4
-; LA32-NEXT:    sub.w $a0, $a0, $a4
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    add.w $a0, $a2, $a0
-; LA32-NEXT:    sltu $a2, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a1, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_add_neg_i64:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a4, $a1, 4
+; LA32R-NEXT:    sub.w $a1, $a1, $a4
+; LA32R-NEXT:    addi.w $a4, $zero, -15
+; LA32R-NEXT:    mulh.wu $a4, $a0, $a4
+; LA32R-NEXT:    sub.w $a4, $a4, $a0
+; LA32R-NEXT:    add.w $a1, $a4, $a1
+; LA32R-NEXT:    slli.w $a4, $a0, 4
+; LA32R-NEXT:    sub.w $a0, $a0, $a4
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    sltu $a2, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a1, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_add_neg_i64:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    slli.w $a4, $a1, 4
+; LA32S-NEXT:    sub.w $a1, $a1, $a4
+; LA32S-NEXT:    addi.w $a4, $zero, -15
+; LA32S-NEXT:    mulh.wu $a4, $a0, $a4
+; LA32S-NEXT:    sub.w $a4, $a4, $a0
+; LA32S-NEXT:    add.w $a1, $a4, $a1
+; LA32S-NEXT:    slli.w $a4, $a0, 4
+; LA32S-NEXT:    sub.w $a0, $a0, $a4
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    add.w $a0, $a2, $a0
+; LA32S-NEXT:    sltu $a2, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a1, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_add_neg_i64:
 ; LA64:       # %bb.0: # %entry
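
The alsl.ll updates above all follow one pattern: the shift-plus-add fold into ALSL.W is now conditional on the 32s feature, so LA32R keeps the generic two-instruction expansion (slli.w then add.w) that the new LA32R check lines spell out. A minimal sketch of the pattern the alsl_i32 checks exercise (shift amount 3, matching the checks):

define i32 @shl_add(i32 signext %a, i32 signext %b) nounwind {
entry:
  ; With +32s this folds to a single `alsl.w $a0, $a0, $a1, 3`;
  ; with -32s it lowers to `slli.w` + `add.w`, as checked above.
  %shl = shl i32 %a, 3
  %add = add i32 %b, %shl
  ret i32 %add
}

Feeding this through the two RUN lines at the top of the file reproduces the LA32R and LA32S check prefixes.
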
diff --git a/llvm/test/CodeGen/LoongArch/annotate-tablejump.ll b/llvm/test/CodeGen/LoongArch/annotate-tablejump.ll
index 2bba7f1dc0e96..2af80789086e6 100644
--- a/llvm/test/CodeGen/LoongArch/annotate-tablejump.ll
+++ b/llvm/test/CodeGen/LoongArch/annotate-tablejump.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; RUN: llc --mtriple=loongarch32 -mattr=+d \
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d \
 ; RUN:   --min-jump-table-entries=4 < %s \
 ; RUN:   --loongarch-annotate-tablejump \
 ; RUN:   | FileCheck %s --check-prefix=LA32-JT
diff --git a/llvm/test/CodeGen/LoongArch/atomicrmw-cond-sub-clamp.ll b/llvm/test/CodeGen/LoongArch/atomicrmw-cond-sub-clamp.ll
index 95bb25c41dabc..ab09cc9ed50a0 100644
--- a/llvm/test/CodeGen/LoongArch/atomicrmw-cond-sub-clamp.ll
+++ b/llvm/test/CodeGen/LoongArch/atomicrmw-cond-sub-clamp.ll
@@ -38,7 +38,7 @@ define i8 @atomicrmw_usub_cond_i8(ptr %ptr, i8 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB0_3 Depth=2
 ; LA64-NEXT:    move $t0, $a7
 ; LA64-NEXT:    sc.w $t0, $a0, 0
-; LA64-NEXT:    beqz $t0, .LBB0_3
+; LA64-NEXT:    beq $t0, $zero, .LBB0_3
 ; LA64-NEXT:    b .LBB0_6
 ; LA64-NEXT:  .LBB0_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB0_1 Depth=1
@@ -91,7 +91,7 @@ define i16 @atomicrmw_usub_cond_i16(ptr %ptr, i16 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB1_3 Depth=2
 ; LA64-NEXT:    move $t0, $a7
 ; LA64-NEXT:    sc.w $t0, $a0, 0
-; LA64-NEXT:    beqz $t0, .LBB1_3
+; LA64-NEXT:    beq $t0, $zero, .LBB1_3
 ; LA64-NEXT:    b .LBB1_6
 ; LA64-NEXT:  .LBB1_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB1_1 Depth=1
@@ -131,7 +131,7 @@ define i32 @atomicrmw_usub_cond_i32(ptr %ptr, i32 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB2_3 Depth=2
 ; LA64-NEXT:    move $a6, $a5
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB2_3
+; LA64-NEXT:    beq $a6, $zero, .LBB2_3
 ; LA64-NEXT:    b .LBB2_6
 ; LA64-NEXT:  .LBB2_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB2_1 Depth=1
@@ -170,7 +170,7 @@ define i64 @atomicrmw_usub_cond_i64(ptr %ptr, i64 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB3_3 Depth=2
 ; LA64-NEXT:    move $a5, $a4
 ; LA64-NEXT:    sc.d $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB3_3
+; LA64-NEXT:    beq $a5, $zero, .LBB3_3
 ; LA64-NEXT:    b .LBB3_6
 ; LA64-NEXT:  .LBB3_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB3_1 Depth=1
@@ -218,7 +218,7 @@ define i8 @atomicrmw_usub_sat_i8(ptr %ptr, i8 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB4_3 Depth=2
 ; LA64-NEXT:    move $a7, $a6
 ; LA64-NEXT:    sc.w $a7, $a0, 0
-; LA64-NEXT:    beqz $a7, .LBB4_3
+; LA64-NEXT:    beq $a7, $zero, .LBB4_3
 ; LA64-NEXT:    b .LBB4_6
 ; LA64-NEXT:  .LBB4_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB4_1 Depth=1
@@ -267,7 +267,7 @@ define i16 @atomicrmw_usub_sat_i16(ptr %ptr, i16 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB5_3 Depth=2
 ; LA64-NEXT:    move $a7, $a6
 ; LA64-NEXT:    sc.w $a7, $a0, 0
-; LA64-NEXT:    beqz $a7, .LBB5_3
+; LA64-NEXT:    beq $a7, $zero, .LBB5_3
 ; LA64-NEXT:    b .LBB5_6
 ; LA64-NEXT:  .LBB5_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB5_1 Depth=1
@@ -304,7 +304,7 @@ define i32 @atomicrmw_usub_sat_i32(ptr %ptr, i32 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB6_3 Depth=2
 ; LA64-NEXT:    move $a5, $a4
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB6_3
+; LA64-NEXT:    beq $a5, $zero, .LBB6_3
 ; LA64-NEXT:    b .LBB6_6
 ; LA64-NEXT:  .LBB6_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB6_1 Depth=1
@@ -340,7 +340,7 @@ define i64 @atomicrmw_usub_sat_i64(ptr %ptr, i64 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB7_3 Depth=2
 ; LA64-NEXT:    move $a5, $a4
 ; LA64-NEXT:    sc.d $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB7_3
+; LA64-NEXT:    beq $a5, $zero, .LBB7_3
 ; LA64-NEXT:    b .LBB7_6
 ; LA64-NEXT:  .LBB7_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB7_1 Depth=1
diff --git a/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll b/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
index 854518ed1fc97..c9e4a05e838f4 100644
--- a/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
+++ b/llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll
@@ -36,7 +36,7 @@ define i8 @atomicrmw_uinc_wrap_i8(ptr %ptr, i8 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB0_3 Depth=2
 ; LA64-NEXT:    move $a7, $a6
 ; LA64-NEXT:    sc.w $a7, $a0, 0
-; LA64-NEXT:    beqz $a7, .LBB0_3
+; LA64-NEXT:    beq $a7, $zero, .LBB0_3
 ; LA64-NEXT:    b .LBB0_6
 ; LA64-NEXT:  .LBB0_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB0_1 Depth=1
@@ -87,7 +87,7 @@ define i16 @atomicrmw_uinc_wrap_i16(ptr %ptr, i16 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB1_3 Depth=2
 ; LA64-NEXT:    move $a7, $a6
 ; LA64-NEXT:    sc.w $a7, $a0, 0
-; LA64-NEXT:    beqz $a7, .LBB1_3
+; LA64-NEXT:    beq $a7, $zero, .LBB1_3
 ; LA64-NEXT:    b .LBB1_6
 ; LA64-NEXT:  .LBB1_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB1_1 Depth=1
@@ -125,7 +125,7 @@ define i32 @atomicrmw_uinc_wrap_i32(ptr %ptr, i32 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB2_3 Depth=2
 ; LA64-NEXT:    move $a5, $a4
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB2_3
+; LA64-NEXT:    beq $a5, $zero, .LBB2_3
 ; LA64-NEXT:    b .LBB2_6
 ; LA64-NEXT:  .LBB2_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB2_1 Depth=1
@@ -162,7 +162,7 @@ define i64 @atomicrmw_uinc_wrap_i64(ptr %ptr, i64 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB3_3 Depth=2
 ; LA64-NEXT:    move $a5, $a4
 ; LA64-NEXT:    sc.d $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB3_3
+; LA64-NEXT:    beq $a5, $zero, .LBB3_3
 ; LA64-NEXT:    b .LBB3_6
 ; LA64-NEXT:  .LBB3_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB3_1 Depth=1
@@ -217,7 +217,7 @@ define i8 @atomicrmw_udec_wrap_i8(ptr %ptr, i8 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB4_3 Depth=2
 ; LA64-NEXT:    move $t0, $a7
 ; LA64-NEXT:    sc.w $t0, $a0, 0
-; LA64-NEXT:    beqz $t0, .LBB4_3
+; LA64-NEXT:    beq $t0, $zero, .LBB4_3
 ; LA64-NEXT:    b .LBB4_6
 ; LA64-NEXT:  .LBB4_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB4_1 Depth=1
@@ -273,7 +273,7 @@ define i16 @atomicrmw_udec_wrap_i16(ptr %ptr, i16 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB5_3 Depth=2
 ; LA64-NEXT:    move $t0, $a7
 ; LA64-NEXT:    sc.w $t0, $a0, 0
-; LA64-NEXT:    beqz $t0, .LBB5_3
+; LA64-NEXT:    beq $t0, $zero, .LBB5_3
 ; LA64-NEXT:    b .LBB5_6
 ; LA64-NEXT:  .LBB5_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB5_1 Depth=1
@@ -316,7 +316,7 @@ define i32 @atomicrmw_udec_wrap_i32(ptr %ptr, i32 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB6_3 Depth=2
 ; LA64-NEXT:    move $a6, $a5
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB6_3
+; LA64-NEXT:    beq $a6, $zero, .LBB6_3
 ; LA64-NEXT:    b .LBB6_6
 ; LA64-NEXT:  .LBB6_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB6_1 Depth=1
@@ -358,7 +358,7 @@ define i64 @atomicrmw_udec_wrap_i64(ptr %ptr, i64 %val) {
 ; LA64-NEXT:    # in Loop: Header=BB7_3 Depth=2
 ; LA64-NEXT:    move $a5, $a4
 ; LA64-NEXT:    sc.d $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB7_3
+; LA64-NEXT:    beq $a5, $zero, .LBB7_3
 ; LA64-NEXT:    b .LBB7_6
 ; LA64-NEXT:  .LBB7_5: # %atomicrmw.start
 ; LA64-NEXT:    # in Loop: Header=BB7_1 Depth=1
diff --git a/llvm/test/CodeGen/LoongArch/bitreverse.ll b/llvm/test/CodeGen/LoongArch/bitreverse.ll
index 78d5c7e4a7977..159bfeb1407ee 100644
--- a/llvm/test/CodeGen/LoongArch/bitreverse.ll
+++ b/llvm/test/CodeGen/LoongArch/bitreverse.ll
@@ -1,6 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d --verify-machineinstrs < %s \
-; RUN:   | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d --verify-machineinstrs < %s \
+; RUN:   | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --verify-machineinstrs < %s \
+; RUN:   | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA64
 
@@ -15,10 +17,29 @@ declare i77 @llvm.bitreverse.i77(i77)
 declare i128 @llvm.bitreverse.i128(i128)
 
 define i8 @test_bitreverse_i8(i8 %a) nounwind {
-; LA32-LABEL: test_bitreverse_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.4b $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a1, $a0, 240
+; LA32R-NEXT:    andi $a0, $a0, 15
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    srli.w $a1, $a1, 4
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    andi $a1, $a0, 51
+; LA32R-NEXT:    slli.w $a1, $a1, 2
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    andi $a0, $a0, 51
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 85
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    srli.w $a0, $a0, 1
+; LA32R-NEXT:    andi $a0, $a0, 85
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.4b $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_i8:
 ; LA64:       # %bb.0:
@@ -29,11 +50,40 @@ define i8 @test_bitreverse_i8(i8 %a) nounwind {
 }
 
 define i16 @test_bitreverse_i16(i16 %a) nounwind {
-; LA32-LABEL: test_bitreverse_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.w $a0, $a0
-; LA32-NEXT:    srli.w $a0, $a0, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 3840
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a1, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 8
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 3855
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    srli.w $a0, $a0, 4
+; LA32R-NEXT:    andi $a0, $a0, 3855
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    lu12i.w $a2, 3
+; LA32R-NEXT:    ori $a2, $a2, 819
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 5
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.w $a0, $a0
+; LA32S-NEXT:    srli.w $a0, $a0, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_i16:
 ; LA64:       # %bb.0:
@@ -45,10 +95,46 @@ define i16 @test_bitreverse_i16(i16 %a) nounwind {
 }
 
 define i32 @test_bitreverse_i32(i32 %a) nounwind {
-; LA32-LABEL: test_bitreverse_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 3840
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    srli.w $a3, $a0, 24
+; LA32R-NEXT:    or $a1, $a1, $a3
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a2, $a2, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    lu12i.w $a2, 61680
+; LA32R-NEXT:    ori $a2, $a2, 3855
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    lu12i.w $a2, 209715
+; LA32R-NEXT:    ori $a2, $a2, 819
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 349525
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_i32:
 ; LA64:       # %bb.0:
@@ -59,12 +145,73 @@ define i32 @test_bitreverse_i32(i32 %a) nounwind {
 }
 
 define i64 @test_bitreverse_i64(i64 %a) nounwind {
-; LA32-LABEL: test_bitreverse_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.w $a2, $a1
-; LA32-NEXT:    bitrev.w $a1, $a0
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a1, 8
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 3840
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    srli.w $a4, $a1, 24
+; LA32R-NEXT:    or $a2, $a2, $a4
+; LA32R-NEXT:    and $a4, $a1, $a3
+; LA32R-NEXT:    slli.w $a4, $a4, 8
+; LA32R-NEXT:    slli.w $a1, $a1, 24
+; LA32R-NEXT:    or $a1, $a1, $a4
+; LA32R-NEXT:    or $a1, $a1, $a2
+; LA32R-NEXT:    srli.w $a2, $a1, 4
+; LA32R-NEXT:    lu12i.w $a4, 61680
+; LA32R-NEXT:    ori $a4, $a4, 3855
+; LA32R-NEXT:    and $a2, $a2, $a4
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 2
+; LA32R-NEXT:    lu12i.w $a5, 209715
+; LA32R-NEXT:    ori $a5, $a5, 819
+; LA32R-NEXT:    and $a2, $a2, $a5
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    slli.w $a1, $a1, 2
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 1
+; LA32R-NEXT:    lu12i.w $a6, 349525
+; LA32R-NEXT:    ori $a6, $a6, 1365
+; LA32R-NEXT:    and $a2, $a2, $a6
+; LA32R-NEXT:    and $a1, $a1, $a6
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    or $a2, $a2, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    srli.w $a7, $a0, 24
+; LA32R-NEXT:    or $a1, $a1, $a7
+; LA32R-NEXT:    and $a3, $a0, $a3
+; LA32R-NEXT:    slli.w $a3, $a3, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    or $a0, $a0, $a3
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    and $a0, $a0, $a4
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    and $a0, $a0, $a5
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    and $a1, $a1, $a6
+; LA32R-NEXT:    and $a0, $a0, $a6
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a1, $a1, $a0
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.w $a2, $a1
+; LA32S-NEXT:    bitrev.w $a1, $a0
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_i64:
 ; LA64:       # %bb.0:
@@ -77,11 +224,48 @@ define i64 @test_bitreverse_i64(i64 %a) nounwind {
 ;; Bitreverse on non-native integer widths.
 
 define i7 @test_bitreverse_i7(i7 %a) nounwind {
-; LA32-LABEL: test_bitreverse_i7:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.w $a0, $a0
-; LA32-NEXT:    srli.w $a0, $a0, 25
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_i7:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 3840
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    srli.w $a3, $a0, 24
+; LA32R-NEXT:    or $a1, $a1, $a3
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a2, $a2, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    lu12i.w $a2, 61680
+; LA32R-NEXT:    ori $a2, $a2, 3855
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    lu12i.w $a2, 209715
+; LA32R-NEXT:    ori $a2, $a2, 819
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 344064
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    lu12i.w $a2, 348160
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a0, $a0, 25
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_i7:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.w $a0, $a0
+; LA32S-NEXT:    srli.w $a0, $a0, 25
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_i7:
 ; LA64:       # %bb.0:
@@ -93,11 +277,48 @@ define i7 @test_bitreverse_i7(i7 %a) nounwind {
 }
 
 define i24 @test_bitreverse_i24(i24 %a) nounwind {
-; LA32-LABEL: test_bitreverse_i24:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.w $a0, $a0
-; LA32-NEXT:    srli.w $a0, $a0, 8
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_i24:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 3840
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    srli.w $a3, $a0, 24
+; LA32R-NEXT:    or $a1, $a1, $a3
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a2, $a2, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    lu12i.w $a2, 61680
+; LA32R-NEXT:    ori $a2, $a2, 3855
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    lu12i.w $a2, 209715
+; LA32R-NEXT:    ori $a2, $a2, 819
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 349525
+; LA32R-NEXT:    ori $a2, $a2, 1280
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a0, $a0, 8
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_i24:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.w $a0, $a0
+; LA32S-NEXT:    srli.w $a0, $a0, 8
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_i24:
 ; LA64:       # %bb.0:
@@ -109,13 +330,78 @@ define i24 @test_bitreverse_i24(i24 %a) nounwind {
 }
 
 define i48 @test_bitreverse_i48(i48 %a) nounwind {
-; LA32-LABEL: test_bitreverse_i48:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.w $a2, $a0
-; LA32-NEXT:    bitrev.w $a0, $a1
-; LA32-NEXT:    bytepick.w $a0, $a0, $a2, 2
-; LA32-NEXT:    srli.w $a1, $a2, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_i48:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a0, 8
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 3840
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    srli.w $a4, $a0, 24
+; LA32R-NEXT:    or $a2, $a2, $a4
+; LA32R-NEXT:    and $a4, $a0, $a3
+; LA32R-NEXT:    slli.w $a4, $a4, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    or $a0, $a0, $a4
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    srli.w $a2, $a0, 4
+; LA32R-NEXT:    lu12i.w $a4, 61680
+; LA32R-NEXT:    ori $a4, $a4, 3855
+; LA32R-NEXT:    and $a2, $a2, $a4
+; LA32R-NEXT:    and $a0, $a0, $a4
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a2, $a0, 2
+; LA32R-NEXT:    lu12i.w $a5, 209715
+; LA32R-NEXT:    ori $a5, $a5, 819
+; LA32R-NEXT:    and $a2, $a2, $a5
+; LA32R-NEXT:    and $a0, $a0, $a5
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a2, $a0, 1
+; LA32R-NEXT:    lu12i.w $a6, 349525
+; LA32R-NEXT:    ori $a6, $a6, 1365
+; LA32R-NEXT:    and $a2, $a2, $a6
+; LA32R-NEXT:    and $a0, $a0, $a6
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a2, $a2, $a0
+; LA32R-NEXT:    slli.w $a0, $a2, 16
+; LA32R-NEXT:    srli.w $a6, $a1, 8
+; LA32R-NEXT:    and $a6, $a6, $a3
+; LA32R-NEXT:    srli.w $a7, $a1, 24
+; LA32R-NEXT:    or $a6, $a6, $a7
+; LA32R-NEXT:    and $a3, $a1, $a3
+; LA32R-NEXT:    slli.w $a3, $a3, 8
+; LA32R-NEXT:    slli.w $a1, $a1, 24
+; LA32R-NEXT:    or $a1, $a1, $a3
+; LA32R-NEXT:    or $a1, $a1, $a6
+; LA32R-NEXT:    srli.w $a3, $a1, 4
+; LA32R-NEXT:    and $a3, $a3, $a4
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    or $a1, $a3, $a1
+; LA32R-NEXT:    srli.w $a3, $a1, 2
+; LA32R-NEXT:    and $a3, $a3, $a5
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    slli.w $a1, $a1, 2
+; LA32R-NEXT:    or $a1, $a3, $a1
+; LA32R-NEXT:    srli.w $a3, $a1, 1
+; LA32R-NEXT:    lu12i.w $a4, 349520
+; LA32R-NEXT:    and $a3, $a3, $a4
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    or $a1, $a3, $a1
+; LA32R-NEXT:    srli.w $a1, $a1, 16
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a2, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_i48:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.w $a2, $a0
+; LA32S-NEXT:    bitrev.w $a0, $a1
+; LA32S-NEXT:    bytepick.w $a0, $a0, $a2, 2
+; LA32S-NEXT:    srli.w $a1, $a2, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_i48:
 ; LA64:       # %bb.0:
@@ -127,25 +413,124 @@ define i48 @test_bitreverse_i48(i48 %a) nounwind {
 }
 
 define i77 @test_bitreverse_i77(i77 %a) nounwind {
-; LA32-LABEL: test_bitreverse_i77:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 4
-; LA32-NEXT:    ld.w $a3, $a1, 8
-; LA32-NEXT:    ld.w $a1, $a1, 0
-; LA32-NEXT:    bitrev.w $a2, $a2
-; LA32-NEXT:    slli.w $a4, $a2, 13
-; LA32-NEXT:    bitrev.w $a3, $a3
-; LA32-NEXT:    srli.w $a3, $a3, 19
-; LA32-NEXT:    or $a3, $a3, $a4
-; LA32-NEXT:    srli.w $a2, $a2, 19
-; LA32-NEXT:    bitrev.w $a1, $a1
-; LA32-NEXT:    slli.w $a4, $a1, 13
-; LA32-NEXT:    or $a2, $a4, $a2
-; LA32-NEXT:    srli.w $a1, $a1, 19
-; LA32-NEXT:    st.h $a1, $a0, 8
-; LA32-NEXT:    st.w $a2, $a0, 4
-; LA32-NEXT:    st.w $a3, $a0, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_i77:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a3, $a1, 4
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    ld.w $a5, $a1, 8
+; LA32R-NEXT:    srli.w $a4, $a3, 8
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 3840
+; LA32R-NEXT:    and $a4, $a4, $a1
+; LA32R-NEXT:    srli.w $a6, $a3, 24
+; LA32R-NEXT:    or $a4, $a4, $a6
+; LA32R-NEXT:    and $a6, $a3, $a1
+; LA32R-NEXT:    slli.w $a6, $a6, 8
+; LA32R-NEXT:    slli.w $a3, $a3, 24
+; LA32R-NEXT:    or $a3, $a3, $a6
+; LA32R-NEXT:    or $a4, $a3, $a4
+; LA32R-NEXT:    srli.w $a6, $a4, 4
+; LA32R-NEXT:    lu12i.w $a3, 61680
+; LA32R-NEXT:    ori $a3, $a3, 3855
+; LA32R-NEXT:    and $a6, $a6, $a3
+; LA32R-NEXT:    and $a4, $a4, $a3
+; LA32R-NEXT:    slli.w $a4, $a4, 4
+; LA32R-NEXT:    or $a6, $a6, $a4
+; LA32R-NEXT:    srli.w $a7, $a6, 2
+; LA32R-NEXT:    lu12i.w $a4, 209715
+; LA32R-NEXT:    ori $a4, $a4, 819
+; LA32R-NEXT:    and $a7, $a7, $a4
+; LA32R-NEXT:    and $a6, $a6, $a4
+; LA32R-NEXT:    slli.w $a6, $a6, 2
+; LA32R-NEXT:    or $a6, $a7, $a6
+; LA32R-NEXT:    srli.w $a7, $a6, 1
+; LA32R-NEXT:    lu12i.w $t0, 349525
+; LA32R-NEXT:    ori $t0, $t0, 1365
+; LA32R-NEXT:    and $a7, $a7, $t0
+; LA32R-NEXT:    and $a6, $a6, $t0
+; LA32R-NEXT:    slli.w $a6, $a6, 1
+; LA32R-NEXT:    or $a6, $a7, $a6
+; LA32R-NEXT:    slli.w $a7, $a6, 13
+; LA32R-NEXT:    srli.w $t1, $a5, 8
+; LA32R-NEXT:    and $t1, $t1, $a1
+; LA32R-NEXT:    srli.w $t2, $a5, 24
+; LA32R-NEXT:    or $t1, $t1, $t2
+; LA32R-NEXT:    and $t2, $a5, $a1
+; LA32R-NEXT:    slli.w $t2, $t2, 8
+; LA32R-NEXT:    slli.w $a5, $a5, 24
+; LA32R-NEXT:    or $a5, $a5, $t2
+; LA32R-NEXT:    or $a5, $a5, $t1
+; LA32R-NEXT:    srli.w $t1, $a5, 4
+; LA32R-NEXT:    and $t1, $t1, $a3
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    slli.w $a5, $a5, 4
+; LA32R-NEXT:    or $a5, $t1, $a5
+; LA32R-NEXT:    srli.w $t1, $a5, 2
+; LA32R-NEXT:    and $t1, $t1, $a4
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    slli.w $a5, $a5, 2
+; LA32R-NEXT:    or $a5, $t1, $a5
+; LA32R-NEXT:    srli.w $t1, $a5, 1
+; LA32R-NEXT:    lu12i.w $t2, 349440
+; LA32R-NEXT:    and $t1, $t1, $t2
+; LA32R-NEXT:    lu12i.w $t2, 349504
+; LA32R-NEXT:    and $a5, $a5, $t2
+; LA32R-NEXT:    slli.w $a5, $a5, 1
+; LA32R-NEXT:    or $a5, $t1, $a5
+; LA32R-NEXT:    srli.w $a5, $a5, 19
+; LA32R-NEXT:    or $a5, $a5, $a7
+; LA32R-NEXT:    srli.w $a6, $a6, 19
+; LA32R-NEXT:    srli.w $a7, $a2, 8
+; LA32R-NEXT:    and $a7, $a7, $a1
+; LA32R-NEXT:    srli.w $t1, $a2, 24
+; LA32R-NEXT:    or $a7, $a7, $t1
+; LA32R-NEXT:    and $a1, $a2, $a1
+; LA32R-NEXT:    slli.w $a1, $a1, 8
+; LA32R-NEXT:    slli.w $a2, $a2, 24
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    or $a1, $a1, $a7
+; LA32R-NEXT:    srli.w $a2, $a1, 4
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 2
+; LA32R-NEXT:    and $a2, $a2, $a4
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    slli.w $a1, $a1, 2
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 1
+; LA32R-NEXT:    and $a2, $a2, $t0
+; LA32R-NEXT:    and $a1, $a1, $t0
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    slli.w $a2, $a1, 13
+; LA32R-NEXT:    or $a2, $a2, $a6
+; LA32R-NEXT:    srli.w $a1, $a1, 19
+; LA32R-NEXT:    st.h $a1, $a0, 8
+; LA32R-NEXT:    st.w $a2, $a0, 4
+; LA32R-NEXT:    st.w $a5, $a0, 0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_i77:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 4
+; LA32S-NEXT:    ld.w $a3, $a1, 8
+; LA32S-NEXT:    ld.w $a1, $a1, 0
+; LA32S-NEXT:    bitrev.w $a2, $a2
+; LA32S-NEXT:    slli.w $a4, $a2, 13
+; LA32S-NEXT:    bitrev.w $a3, $a3
+; LA32S-NEXT:    srli.w $a3, $a3, 19
+; LA32S-NEXT:    or $a3, $a3, $a4
+; LA32S-NEXT:    srli.w $a2, $a2, 19
+; LA32S-NEXT:    bitrev.w $a1, $a1
+; LA32S-NEXT:    slli.w $a4, $a1, 13
+; LA32S-NEXT:    or $a2, $a4, $a2
+; LA32S-NEXT:    srli.w $a1, $a1, 19
+; LA32S-NEXT:    st.h $a1, $a0, 8
+; LA32S-NEXT:    st.w $a2, $a0, 4
+; LA32S-NEXT:    st.w $a3, $a0, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_i77:
 ; LA64:       # %bb.0:
@@ -161,21 +546,137 @@ define i77 @test_bitreverse_i77(i77 %a) nounwind {
 }
 
 define i128 @test_bitreverse_i128(i128 %a) nounwind {
-; LA32-LABEL: test_bitreverse_i128:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 12
-; LA32-NEXT:    ld.w $a3, $a1, 8
-; LA32-NEXT:    ld.w $a4, $a1, 4
-; LA32-NEXT:    ld.w $a1, $a1, 0
-; LA32-NEXT:    bitrev.w $a2, $a2
-; LA32-NEXT:    bitrev.w $a3, $a3
-; LA32-NEXT:    bitrev.w $a4, $a4
-; LA32-NEXT:    bitrev.w $a1, $a1
-; LA32-NEXT:    st.w $a1, $a0, 12
-; LA32-NEXT:    st.w $a4, $a0, 8
-; LA32-NEXT:    st.w $a3, $a0, 4
-; LA32-NEXT:    st.w $a2, $a0, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_i128:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a3, $a1, 12
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    ld.w $a7, $a1, 4
+; LA32R-NEXT:    ld.w $t0, $a1, 8
+; LA32R-NEXT:    srli.w $a4, $a3, 8
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 3840
+; LA32R-NEXT:    and $a4, $a4, $a1
+; LA32R-NEXT:    srli.w $a5, $a3, 24
+; LA32R-NEXT:    or $a4, $a4, $a5
+; LA32R-NEXT:    and $a5, $a3, $a1
+; LA32R-NEXT:    slli.w $a5, $a5, 8
+; LA32R-NEXT:    slli.w $a3, $a3, 24
+; LA32R-NEXT:    or $a3, $a3, $a5
+; LA32R-NEXT:    or $a4, $a3, $a4
+; LA32R-NEXT:    srli.w $a5, $a4, 4
+; LA32R-NEXT:    lu12i.w $a3, 61680
+; LA32R-NEXT:    ori $a3, $a3, 3855
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    and $a4, $a4, $a3
+; LA32R-NEXT:    slli.w $a4, $a4, 4
+; LA32R-NEXT:    or $a5, $a5, $a4
+; LA32R-NEXT:    srli.w $a6, $a5, 2
+; LA32R-NEXT:    lu12i.w $a4, 209715
+; LA32R-NEXT:    ori $a4, $a4, 819
+; LA32R-NEXT:    and $a6, $a6, $a4
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    slli.w $a5, $a5, 2
+; LA32R-NEXT:    or $a5, $a6, $a5
+; LA32R-NEXT:    srli.w $t1, $a5, 1
+; LA32R-NEXT:    lu12i.w $a6, 349525
+; LA32R-NEXT:    ori $a6, $a6, 1365
+; LA32R-NEXT:    and $t1, $t1, $a6
+; LA32R-NEXT:    and $a5, $a5, $a6
+; LA32R-NEXT:    slli.w $a5, $a5, 1
+; LA32R-NEXT:    or $a5, $t1, $a5
+; LA32R-NEXT:    srli.w $t1, $t0, 8
+; LA32R-NEXT:    and $t1, $t1, $a1
+; LA32R-NEXT:    srli.w $t2, $t0, 24
+; LA32R-NEXT:    or $t1, $t1, $t2
+; LA32R-NEXT:    and $t2, $t0, $a1
+; LA32R-NEXT:    slli.w $t2, $t2, 8
+; LA32R-NEXT:    slli.w $t0, $t0, 24
+; LA32R-NEXT:    or $t0, $t0, $t2
+; LA32R-NEXT:    or $t0, $t0, $t1
+; LA32R-NEXT:    srli.w $t1, $t0, 4
+; LA32R-NEXT:    and $t1, $t1, $a3
+; LA32R-NEXT:    and $t0, $t0, $a3
+; LA32R-NEXT:    slli.w $t0, $t0, 4
+; LA32R-NEXT:    or $t0, $t1, $t0
+; LA32R-NEXT:    srli.w $t1, $t0, 2
+; LA32R-NEXT:    and $t1, $t1, $a4
+; LA32R-NEXT:    and $t0, $t0, $a4
+; LA32R-NEXT:    slli.w $t0, $t0, 2
+; LA32R-NEXT:    or $t0, $t1, $t0
+; LA32R-NEXT:    srli.w $t1, $t0, 1
+; LA32R-NEXT:    and $t1, $t1, $a6
+; LA32R-NEXT:    and $t0, $t0, $a6
+; LA32R-NEXT:    slli.w $t0, $t0, 1
+; LA32R-NEXT:    or $t0, $t1, $t0
+; LA32R-NEXT:    srli.w $t1, $a7, 8
+; LA32R-NEXT:    and $t1, $t1, $a1
+; LA32R-NEXT:    srli.w $t2, $a7, 24
+; LA32R-NEXT:    or $t1, $t1, $t2
+; LA32R-NEXT:    and $t2, $a7, $a1
+; LA32R-NEXT:    slli.w $t2, $t2, 8
+; LA32R-NEXT:    slli.w $a7, $a7, 24
+; LA32R-NEXT:    or $a7, $a7, $t2
+; LA32R-NEXT:    or $a7, $a7, $t1
+; LA32R-NEXT:    srli.w $t1, $a7, 4
+; LA32R-NEXT:    and $t1, $t1, $a3
+; LA32R-NEXT:    and $a7, $a7, $a3
+; LA32R-NEXT:    slli.w $a7, $a7, 4
+; LA32R-NEXT:    or $a7, $t1, $a7
+; LA32R-NEXT:    srli.w $t1, $a7, 2
+; LA32R-NEXT:    and $t1, $t1, $a4
+; LA32R-NEXT:    and $a7, $a7, $a4
+; LA32R-NEXT:    slli.w $a7, $a7, 2
+; LA32R-NEXT:    or $a7, $t1, $a7
+; LA32R-NEXT:    srli.w $t1, $a7, 1
+; LA32R-NEXT:    and $t1, $t1, $a6
+; LA32R-NEXT:    and $a7, $a7, $a6
+; LA32R-NEXT:    slli.w $a7, $a7, 1
+; LA32R-NEXT:    or $a7, $t1, $a7
+; LA32R-NEXT:    srli.w $t1, $a2, 8
+; LA32R-NEXT:    and $t1, $t1, $a1
+; LA32R-NEXT:    srli.w $t2, $a2, 24
+; LA32R-NEXT:    or $t1, $t1, $t2
+; LA32R-NEXT:    and $a1, $a2, $a1
+; LA32R-NEXT:    slli.w $a1, $a1, 8
+; LA32R-NEXT:    slli.w $a2, $a2, 24
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    or $a1, $a1, $t1
+; LA32R-NEXT:    srli.w $a2, $a1, 4
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 2
+; LA32R-NEXT:    and $a2, $a2, $a4
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    slli.w $a1, $a1, 2
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 1
+; LA32R-NEXT:    and $a2, $a2, $a6
+; LA32R-NEXT:    and $a1, $a1, $a6
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    st.w $a1, $a0, 12
+; LA32R-NEXT:    st.w $a7, $a0, 8
+; LA32R-NEXT:    st.w $t0, $a0, 4
+; LA32R-NEXT:    st.w $a5, $a0, 0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_i128:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 12
+; LA32S-NEXT:    ld.w $a3, $a1, 8
+; LA32S-NEXT:    ld.w $a4, $a1, 4
+; LA32S-NEXT:    ld.w $a1, $a1, 0
+; LA32S-NEXT:    bitrev.w $a2, $a2
+; LA32S-NEXT:    bitrev.w $a3, $a3
+; LA32S-NEXT:    bitrev.w $a4, $a4
+; LA32S-NEXT:    bitrev.w $a1, $a1
+; LA32S-NEXT:    st.w $a1, $a0, 12
+; LA32S-NEXT:    st.w $a4, $a0, 8
+; LA32S-NEXT:    st.w $a3, $a0, 4
+; LA32S-NEXT:    st.w $a2, $a0, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_i128:
 ; LA64:       # %bb.0:
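
Without BITREV.{4B,W}, the LA32R check lines above are the textbook mask-and-shift expansion of llvm.bitreverse: a byte swap followed by 4-, 2- and 1-bit field swaps using the 0x0f0f0f0f, 0x33333333 and 0x55555555 constants built with lu12i.w/ori. A minimal input that exercises the i32 case:

declare i32 @llvm.bitreverse.i32(i32)

define i32 @bitrev32(i32 %a) nounwind {
  ; LA32S selects one `bitrev.w`; LA32R expands to the masked swaps above.
  %r = call i32 @llvm.bitreverse.i32(i32 %a)
  ret i32 %r
}
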
diff --git a/llvm/test/CodeGen/LoongArch/bnez-beqz.ll b/llvm/test/CodeGen/LoongArch/bnez-beqz.ll
index 3b1dabaf6ea14..44cdfe9670e8d 100644
--- a/llvm/test/CodeGen/LoongArch/bnez-beqz.ll
+++ b/llvm/test/CodeGen/LoongArch/bnez-beqz.ll
@@ -1,17 +1,26 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 declare void @bar()
 
 define void @bnez_i32(i32 signext %0) nounwind {
-; LA32-LABEL: bnez_i32:
-; LA32:       # %bb.0: # %start
-; LA32-NEXT:    beqz $a0, .LBB0_2
-; LA32-NEXT:  # %bb.1: # %f
-; LA32-NEXT:    ret
-; LA32-NEXT:  .LBB0_2: # %t
-; LA32-NEXT:    b bar
+; LA32R-LABEL: bnez_i32:
+; LA32R:       # %bb.0: # %start
+; LA32R-NEXT:    beq $a0, $zero, .LBB0_2
+; LA32R-NEXT:  # %bb.1: # %f
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB0_2: # %t
+; LA32R-NEXT:    b bar
+;
+; LA32S-LABEL: bnez_i32:
+; LA32S:       # %bb.0: # %start
+; LA32S-NEXT:    beqz $a0, .LBB0_2
+; LA32S-NEXT:  # %bb.1: # %f
+; LA32S-NEXT:    ret
+; LA32S-NEXT:  .LBB0_2: # %t
+; LA32S-NEXT:    b bar
 ;
 ; LA64-LABEL: bnez_i32:
 ; LA64:       # %bb.0: # %start
@@ -34,13 +43,21 @@ f:
 }
 
 define void @beqz_i32(i32 signext %0) nounwind {
-; LA32-LABEL: beqz_i32:
-; LA32:       # %bb.0: # %start
-; LA32-NEXT:    beqz $a0, .LBB1_2
-; LA32-NEXT:  # %bb.1: # %t
-; LA32-NEXT:    b bar
-; LA32-NEXT:  .LBB1_2: # %f
-; LA32-NEXT:    ret
+; LA32R-LABEL: beqz_i32:
+; LA32R:       # %bb.0: # %start
+; LA32R-NEXT:    beq $a0, $zero, .LBB1_2
+; LA32R-NEXT:  # %bb.1: # %t
+; LA32R-NEXT:    b bar
+; LA32R-NEXT:  .LBB1_2: # %f
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: beqz_i32:
+; LA32S:       # %bb.0: # %start
+; LA32S-NEXT:    beqz $a0, .LBB1_2
+; LA32S-NEXT:  # %bb.1: # %t
+; LA32S-NEXT:    b bar
+; LA32S-NEXT:  .LBB1_2: # %f
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: beqz_i32:
 ; LA64:       # %bb.0: # %start
@@ -63,14 +80,23 @@ f:
 }
 
 define void @bnez_i64(i64 %0) nounwind {
-; LA32-LABEL: bnez_i64:
-; LA32:       # %bb.0: # %start
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    beqz $a0, .LBB2_2
-; LA32-NEXT:  # %bb.1: # %f
-; LA32-NEXT:    ret
-; LA32-NEXT:  .LBB2_2: # %t
-; LA32-NEXT:    b bar
+; LA32R-LABEL: bnez_i64:
+; LA32R:       # %bb.0: # %start
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    beq $a0, $zero, .LBB2_2
+; LA32R-NEXT:  # %bb.1: # %f
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB2_2: # %t
+; LA32R-NEXT:    b bar
+;
+; LA32S-LABEL: bnez_i64:
+; LA32S:       # %bb.0: # %start
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    beqz $a0, .LBB2_2
+; LA32S-NEXT:  # %bb.1: # %f
+; LA32S-NEXT:    ret
+; LA32S-NEXT:  .LBB2_2: # %t
+; LA32S-NEXT:    b bar
 ;
 ; LA64-LABEL: bnez_i64:
 ; LA64:       # %bb.0: # %start
@@ -93,14 +119,23 @@ f:
 }
 
 define void @beqz_i64(i64 %0) nounwind {
-; LA32-LABEL: beqz_i64:
-; LA32:       # %bb.0: # %start
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    beqz $a0, .LBB3_2
-; LA32-NEXT:  # %bb.1: # %t
-; LA32-NEXT:    b bar
-; LA32-NEXT:  .LBB3_2: # %f
-; LA32-NEXT:    ret
+; LA32R-LABEL: beqz_i64:
+; LA32R:       # %bb.0: # %start
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    beq $a0, $zero, .LBB3_2
+; LA32R-NEXT:  # %bb.1: # %t
+; LA32R-NEXT:    b bar
+; LA32R-NEXT:  .LBB3_2: # %f
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: beqz_i64:
+; LA32S:       # %bb.0: # %start
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    beqz $a0, .LBB3_2
+; LA32S-NEXT:  # %bb.1: # %t
+; LA32S-NEXT:    b bar
+; LA32S-NEXT:  .LBB3_2: # %f
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: beqz_i64:
 ; LA64:       # %bb.0: # %start
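
The bnez-beqz.ll changes capture the branch side of the feature: BEQZ/BNEZ are LA32S-only, so LA32R compares against $zero with the three-operand beq/bne instead. A sketch of the pattern behind bnez_i32 (a hypothetical reconstruction; the IR body sits outside the hunks):

declare void @bar()

; Hypothetical reconstruction of the bnez_i32 test body.
define void @branch_on_zero(i32 signext %x) nounwind {
start:
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %t, label %f
t:
  tail call void @bar()               ; codegens as the `b bar` tail call
  ret void
f:
  ret void
}
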
diff --git a/llvm/test/CodeGen/LoongArch/branch-relaxation.ll b/llvm/test/CodeGen/LoongArch/branch-relaxation.ll
index 6037ef9337dc5..f88603b6c79db 100644
--- a/llvm/test/CodeGen/LoongArch/branch-relaxation.ll
+++ b/llvm/test/CodeGen/LoongArch/branch-relaxation.ll
@@ -1,25 +1,36 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d --filetype=obj --verify-machineinstrs < %s \
-; RUN:   -o /dev/null 2>&1
-; RUN: llc --mtriple=loongarch64 -mattr=+d --filetype=obj --verify-machineinstrs < %s \
-; RUN:   -o /dev/null 2>&1
-; RUN: llc --mtriple=loongarch32 -mattr=+d --verify-machineinstrs < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d --verify-machineinstrs < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --verify-machineinstrs < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --verify-machineinstrs < %s | FileCheck %s --check-prefix=LA64
 
 define i32 @relax_b18(i32 signext %a, i32 signext %b) {
-; LA32-LABEL: relax_b18:
-; LA32:       # %bb.0:
-; LA32-NEXT:    beq $a0, $a1, .LBB0_1
-; LA32-NEXT:    b .LBB0_2
-; LA32-NEXT:  .LBB0_1: # %iftrue
-; LA32-NEXT:    ori $a0, $zero, 1
-; LA32-NEXT:    #APP
-; LA32-NEXT:    .space 1048576
-; LA32-NEXT:    #NO_APP
-; LA32-NEXT:    ret
-; LA32-NEXT:  .LBB0_2: # %iffalse
-; LA32-NEXT:    move $a0, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: relax_b18:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    beq $a0, $a1, .LBB0_1
+; LA32R-NEXT:    b .LBB0_2
+; LA32R-NEXT:  .LBB0_1: # %iftrue
+; LA32R-NEXT:    ori $a0, $zero, 1
+; LA32R-NEXT:    #APP
+; LA32R-NEXT:    .space 1048576
+; LA32R-NEXT:    #NO_APP
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB0_2: # %iffalse
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: relax_b18:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    beq $a0, $a1, .LBB0_1
+; LA32S-NEXT:    b .LBB0_2
+; LA32S-NEXT:  .LBB0_1: # %iftrue
+; LA32S-NEXT:    ori $a0, $zero, 1
+; LA32S-NEXT:    #APP
+; LA32S-NEXT:    .space 1048576
+; LA32S-NEXT:    #NO_APP
+; LA32S-NEXT:    ret
+; LA32S-NEXT:  .LBB0_2: # %iffalse
+; LA32S-NEXT:    move $a0, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: relax_b18:
 ; LA64:       # %bb.0:
@@ -46,20 +57,35 @@ iffalse:
 }
 
 define i32 @relax_b23(i1 %a) {
-; LA32-LABEL: relax_b23:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    bnez $a0, .LBB1_1
-; LA32-NEXT:    b .LBB1_2
-; LA32-NEXT:  .LBB1_1: # %iftrue
-; LA32-NEXT:    ori $a0, $zero, 1
-; LA32-NEXT:    #APP
-; LA32-NEXT:    .space 16777216
-; LA32-NEXT:    #NO_APP
-; LA32-NEXT:    ret
-; LA32-NEXT:  .LBB1_2: # %iffalse
-; LA32-NEXT:    move $a0, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: relax_b23:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    bne $a0, $zero, .LBB1_1
+; LA32R-NEXT:    b .LBB1_2
+; LA32R-NEXT:  .LBB1_1: # %iftrue
+; LA32R-NEXT:    ori $a0, $zero, 1
+; LA32R-NEXT:    #APP
+; LA32R-NEXT:    .space 16777216
+; LA32R-NEXT:    #NO_APP
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB1_2: # %iffalse
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: relax_b23:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    bnez $a0, .LBB1_1
+; LA32S-NEXT:    b .LBB1_2
+; LA32S-NEXT:  .LBB1_1: # %iftrue
+; LA32S-NEXT:    ori $a0, $zero, 1
+; LA32S-NEXT:    #APP
+; LA32S-NEXT:    .space 16777216
+; LA32S-NEXT:    #NO_APP
+; LA32S-NEXT:    ret
+; LA32S-NEXT:  .LBB1_2: # %iffalse
+; LA32S-NEXT:    move $a0, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: relax_b23:
 ; LA64:       # %bb.0:
@@ -86,27 +112,49 @@ iffalse:
 }
 
 define i32 @relax_b28(i1 %a) {
-; LA32-LABEL: relax_b28:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    .cfi_def_cfa_offset 16
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    bnez $a0, .LBB2_1
-; LA32-NEXT:  # %bb.3:
-; LA32-NEXT:    pcalau12i $a0, %pc_hi20(.LBB2_2)
-; LA32-NEXT:    addi.w $a0, $a0, %pc_lo12(.LBB2_2)
-; LA32-NEXT:    jr $a0
-; LA32-NEXT:  .LBB2_1: # %iftrue
-; LA32-NEXT:    ori $a0, $zero, 1
-; LA32-NEXT:    #APP
-; LA32-NEXT:    .space 536870912
-; LA32-NEXT:    #NO_APP
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
-; LA32-NEXT:  .LBB2_2: # %iffalse
-; LA32-NEXT:    move $a0, $zero
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: relax_b28:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    bne $a0, $zero, .LBB2_1
+; LA32R-NEXT:  # %bb.3:
+; LA32R-NEXT:    pcalau12i $a0, %pc_hi20(.LBB2_2)
+; LA32R-NEXT:    addi.w $a0, $a0, %pc_lo12(.LBB2_2)
+; LA32R-NEXT:    jr $a0
+; LA32R-NEXT:  .LBB2_1: # %iftrue
+; LA32R-NEXT:    ori $a0, $zero, 1
+; LA32R-NEXT:    #APP
+; LA32R-NEXT:    .space 536870912
+; LA32R-NEXT:    #NO_APP
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB2_2: # %iffalse
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: relax_b28:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    bnez $a0, .LBB2_1
+; LA32S-NEXT:  # %bb.3:
+; LA32S-NEXT:    pcalau12i $a0, %pc_hi20(.LBB2_2)
+; LA32S-NEXT:    addi.w $a0, $a0, %pc_lo12(.LBB2_2)
+; LA32S-NEXT:    jr $a0
+; LA32S-NEXT:  .LBB2_1: # %iftrue
+; LA32S-NEXT:    ori $a0, $zero, 1
+; LA32S-NEXT:    #APP
+; LA32S-NEXT:    .space 536870912
+; LA32S-NEXT:    #NO_APP
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
+; LA32S-NEXT:  .LBB2_2: # %iffalse
+; LA32S-NEXT:    move $a0, $zero
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: relax_b28:
 ; LA64:       # %bb.0:
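
The relax_bNN names encode the branch reach being exercised: beq/bne carry a 16-bit offset (about +/-128 KiB), beqz/bnez a 21-bit offset (about +/-4 MiB), and b a 26-bit offset (about +/-128 MiB), each scaled by 4. A rough C shape of relax_b23, reconstructed from the check lines; the .space padding is what pushes the target out of range:

    int relax_b23(int a) {
        if (a & 1) {
            int r = 1;
            /* 16 MiB of padding exceeds the +/-4 MiB reach of bnez,
               so the branch must be relaxed. */
            __asm__ volatile(".space 16777216");
            return r;
        }
        return 0;
    }
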
diff --git a/llvm/test/CodeGen/LoongArch/bstrins_w.ll b/llvm/test/CodeGen/LoongArch/bstrins_w.ll
index 92b98df215644..c59f15bd64112 100644
--- a/llvm/test/CodeGen/LoongArch/bstrins_w.ll
+++ b/llvm/test/CodeGen/LoongArch/bstrins_w.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s
 
 ;; Test generation of the bstrins.w instruction.
 ;; There are 8 patterns that can be matched to bstrins.w. See performORCombine
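
One of those OR patterns, in minimal C form (the mask and bit range here are illustrative, not taken from the test file):

    /* Insert the low 12 bits of v into bits 23:12 of r.  With +32s
       this AND/shift/OR shape can fold into a single bstrins.w. */
    unsigned insert_field(unsigned r, unsigned v) {
        return (r & ~0x00fff000u) | ((v << 12) & 0x00fff000u);
    }
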
diff --git a/llvm/test/CodeGen/LoongArch/bstrpick_w.ll b/llvm/test/CodeGen/LoongArch/bstrpick_w.ll
index 8d5c9d0df44c0..ca41429182249 100644
--- a/llvm/test/CodeGen/LoongArch/bstrpick_w.ll
+++ b/llvm/test/CodeGen/LoongArch/bstrpick_w.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s
 
 define i32 @lshr10_and255(i32 %a) {
 ; CHECK-LABEL: lshr10_and255:
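
The test name spells out the pattern; its C equivalent (assumed from the name):

    /* (a >> 10) & 255 extracts bits 17:10, a single bstrpick.w
       when the 32s feature is present. */
    unsigned lshr10_and255(unsigned a) {
        return (a >> 10) & 255;
    }
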
diff --git a/llvm/test/CodeGen/LoongArch/bswap-bitreverse.ll b/llvm/test/CodeGen/LoongArch/bswap-bitreverse.ll
index c8f9596b9b0c1..50f35b34ff9fe 100644
--- a/llvm/test/CodeGen/LoongArch/bswap-bitreverse.ll
+++ b/llvm/test/CodeGen/LoongArch/bswap-bitreverse.ll
@@ -1,6 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d --verify-machineinstrs < %s \
-; RUN:   | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d --verify-machineinstrs < %s \
+; RUN:   | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --verify-machineinstrs < %s \
+; RUN:   | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA64
 
@@ -12,12 +14,35 @@ declare i32 @llvm.bswap.i32(i32)
 declare i64 @llvm.bswap.i64(i64)
 
 define i16 @test_bswap_bitreverse_i16(i16 %a) nounwind {
-; LA32-LABEL: test_bswap_bitreverse_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    revb.2h $a0, $a0
-; LA32-NEXT:    bitrev.w $a0, $a0
-; LA32-NEXT:    srli.w $a0, $a0, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bswap_bitreverse_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a1, $a0, 3855
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    srli.w $a0, $a0, 4
+; LA32R-NEXT:    andi $a0, $a0, 3855
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    lu12i.w $a2, 3
+; LA32R-NEXT:    ori $a2, $a2, 819
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 5
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bswap_bitreverse_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    revb.2h $a0, $a0
+; LA32S-NEXT:    bitrev.w $a0, $a0
+; LA32S-NEXT:    srli.w $a0, $a0, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bswap_bitreverse_i16:
 ; LA64:       # %bb.0:
@@ -31,10 +56,35 @@ define i16 @test_bswap_bitreverse_i16(i16 %a) nounwind {
 }
 
 define i32 @test_bswap_bitreverse_i32(i32 %a) nounwind {
-; LA32-LABEL: test_bswap_bitreverse_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.4b $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bswap_bitreverse_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    lu12i.w $a2, 61680
+; LA32R-NEXT:    ori $a2, $a2, 3855
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    lu12i.w $a2, 209715
+; LA32R-NEXT:    ori $a2, $a2, 819
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 349525
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bswap_bitreverse_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.4b $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bswap_bitreverse_i32:
 ; LA64:       # %bb.0:
@@ -46,11 +96,51 @@ define i32 @test_bswap_bitreverse_i32(i32 %a) nounwind {
 }
 
 define i64 @test_bswap_bitreverse_i64(i64 %a) nounwind {
-; LA32-LABEL: test_bswap_bitreverse_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.4b $a0, $a0
-; LA32-NEXT:    bitrev.4b $a1, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bswap_bitreverse_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a0, 4
+; LA32R-NEXT:    lu12i.w $a3, 61680
+; LA32R-NEXT:    ori $a3, $a3, 3855
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    and $a0, $a0, $a3
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a2, $a0, 2
+; LA32R-NEXT:    lu12i.w $a4, 209715
+; LA32R-NEXT:    ori $a4, $a4, 819
+; LA32R-NEXT:    and $a2, $a2, $a4
+; LA32R-NEXT:    and $a0, $a0, $a4
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a2, $a0, 1
+; LA32R-NEXT:    lu12i.w $a5, 349525
+; LA32R-NEXT:    ori $a5, $a5, 1365
+; LA32R-NEXT:    and $a2, $a2, $a5
+; LA32R-NEXT:    and $a0, $a0, $a5
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a2, $a1, 4
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 2
+; LA32R-NEXT:    and $a2, $a2, $a4
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    slli.w $a1, $a1, 2
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 1
+; LA32R-NEXT:    and $a2, $a2, $a5
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bswap_bitreverse_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.4b $a0, $a0
+; LA32S-NEXT:    bitrev.4b $a1, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bswap_bitreverse_i64:
 ; LA64:       # %bb.0:
@@ -62,12 +152,35 @@ define i64 @test_bswap_bitreverse_i64(i64 %a) nounwind {
 }
 
 define i16 @test_bitreverse_bswap_i16(i16 %a) nounwind {
-; LA32-LABEL: test_bitreverse_bswap_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    revb.2h $a0, $a0
-; LA32-NEXT:    bitrev.w $a0, $a0
-; LA32-NEXT:    srli.w $a0, $a0, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_bswap_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a1, $a0, 3855
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    srli.w $a0, $a0, 4
+; LA32R-NEXT:    andi $a0, $a0, 3855
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    lu12i.w $a2, 3
+; LA32R-NEXT:    ori $a2, $a2, 819
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 5
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_bswap_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    revb.2h $a0, $a0
+; LA32S-NEXT:    bitrev.w $a0, $a0
+; LA32S-NEXT:    srli.w $a0, $a0, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_bswap_i16:
 ; LA64:       # %bb.0:
@@ -81,10 +194,35 @@ define i16 @test_bitreverse_bswap_i16(i16 %a) nounwind {
 }
 
 define i32 @test_bitreverse_bswap_i32(i32 %a) nounwind {
-; LA32-LABEL: test_bitreverse_bswap_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.4b $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_bswap_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    lu12i.w $a2, 61680
+; LA32R-NEXT:    ori $a2, $a2, 3855
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    lu12i.w $a2, 209715
+; LA32R-NEXT:    ori $a2, $a2, 819
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 349525
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_bswap_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.4b $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_bswap_i32:
 ; LA64:       # %bb.0:
@@ -96,11 +234,51 @@ define i32 @test_bitreverse_bswap_i32(i32 %a) nounwind {
 }
 
 define i64 @test_bitreverse_bswap_i64(i64 %a) nounwind {
-; LA32-LABEL: test_bitreverse_bswap_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bitrev.4b $a0, $a0
-; LA32-NEXT:    bitrev.4b $a1, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bitreverse_bswap_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a0, 4
+; LA32R-NEXT:    lu12i.w $a3, 61680
+; LA32R-NEXT:    ori $a3, $a3, 3855
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    and $a0, $a0, $a3
+; LA32R-NEXT:    slli.w $a0, $a0, 4
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a2, $a0, 2
+; LA32R-NEXT:    lu12i.w $a4, 209715
+; LA32R-NEXT:    ori $a4, $a4, 819
+; LA32R-NEXT:    and $a2, $a2, $a4
+; LA32R-NEXT:    and $a0, $a0, $a4
+; LA32R-NEXT:    slli.w $a0, $a0, 2
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a2, $a0, 1
+; LA32R-NEXT:    lu12i.w $a5, 349525
+; LA32R-NEXT:    ori $a5, $a5, 1365
+; LA32R-NEXT:    and $a2, $a2, $a5
+; LA32R-NEXT:    and $a0, $a0, $a5
+; LA32R-NEXT:    slli.w $a0, $a0, 1
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a2, $a1, 4
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    slli.w $a1, $a1, 4
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 2
+; LA32R-NEXT:    and $a2, $a2, $a4
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    slli.w $a1, $a1, 2
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    srli.w $a2, $a1, 1
+; LA32R-NEXT:    and $a2, $a2, $a5
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bitreverse_bswap_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bitrev.4b $a0, $a0
+; LA32S-NEXT:    bitrev.4b $a1, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bitreverse_bswap_i64:
 ; LA64:       # %bb.0:
@@ -112,13 +290,21 @@ define i64 @test_bitreverse_bswap_i64(i64 %a) nounwind {
 }
 
 define i32 @pr55484(i32 %0) {
-; LA32-LABEL: pr55484:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srli.w $a1, $a0, 8
-; LA32-NEXT:    slli.w $a0, $a0, 8
-; LA32-NEXT:    or $a0, $a1, $a0
-; LA32-NEXT:    ext.w.h $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: pr55484:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srai.w $a0, $a0, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: pr55484:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srli.w $a1, $a0, 8
+; LA32S-NEXT:    slli.w $a0, $a0, 8
+; LA32S-NEXT:    or $a0, $a1, $a0
+; LA32S-NEXT:    ext.w.h $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: pr55484:
 ; LA64:       # %bb.0:
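
The recurring LA32R sequence in this file is a software bit reversal within each byte, which LA32S collapses to one bitrev.4b. A minimal C sketch whose masks match the lu12i.w/ori pairs in the check lines (61680/3855 = 0x0f0f0f0f, 209715/819 = 0x33333333, 349525/1365 = 0x55555555):

    unsigned bitrev4b(unsigned x) {
        x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4); /* nibbles   */
        x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2); /* bit pairs */
        x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1); /* bits      */
        return x;
    }
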
diff --git a/llvm/test/CodeGen/LoongArch/bswap.ll b/llvm/test/CodeGen/LoongArch/bswap.ll
index 122dab7fb4963..ff7473c38f951 100644
--- a/llvm/test/CodeGen/LoongArch/bswap.ll
+++ b/llvm/test/CodeGen/LoongArch/bswap.ll
@@ -1,6 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d --verify-machineinstrs < %s \
-; RUN:   | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d --verify-machineinstrs < %s \
+; RUN:   | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --verify-machineinstrs < %s \
+; RUN:   | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA64
 
@@ -12,10 +14,20 @@ declare i80 @llvm.bswap.i80(i80)
 declare i128 @llvm.bswap.i128(i128)
 
 define i16 @test_bswap_i16(i16 %a) nounwind {
-; LA32-LABEL: test_bswap_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    revb.2h $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bswap_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 3840
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a1, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 8
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bswap_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    revb.2h $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bswap_i16:
 ; LA64:       # %bb.0:
@@ -26,11 +38,26 @@ define i16 @test_bswap_i16(i16 %a) nounwind {
 }
 
 define i32 @test_bswap_i32(i32 %a) nounwind {
-; LA32-LABEL: test_bswap_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    revb.2h $a0, $a0
-; LA32-NEXT:    rotri.w $a0, $a0, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bswap_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 3840
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    srli.w $a3, $a0, 24
+; LA32R-NEXT:    or $a1, $a1, $a3
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a2, $a2, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bswap_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    revb.2h $a0, $a0
+; LA32S-NEXT:    rotri.w $a0, $a0, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bswap_i32:
 ; LA64:       # %bb.0:
@@ -41,14 +68,39 @@ define i32 @test_bswap_i32(i32 %a) nounwind {
 }
 
 define i64 @test_bswap_i64(i64 %a) nounwind {
-; LA32-LABEL: test_bswap_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    revb.2h $a1, $a1
-; LA32-NEXT:    rotri.w $a2, $a1, 16
-; LA32-NEXT:    revb.2h $a0, $a0
-; LA32-NEXT:    rotri.w $a1, $a0, 16
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bswap_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a1, 8
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 3840
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    srli.w $a4, $a1, 24
+; LA32R-NEXT:    or $a2, $a2, $a4
+; LA32R-NEXT:    and $a4, $a1, $a3
+; LA32R-NEXT:    slli.w $a4, $a4, 8
+; LA32R-NEXT:    slli.w $a1, $a1, 24
+; LA32R-NEXT:    or $a1, $a1, $a4
+; LA32R-NEXT:    or $a2, $a1, $a2
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    srli.w $a4, $a0, 24
+; LA32R-NEXT:    or $a1, $a1, $a4
+; LA32R-NEXT:    and $a3, $a0, $a3
+; LA32R-NEXT:    slli.w $a3, $a3, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    or $a0, $a0, $a3
+; LA32R-NEXT:    or $a1, $a0, $a1
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bswap_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    revb.2h $a1, $a1
+; LA32S-NEXT:    rotri.w $a2, $a1, 16
+; LA32S-NEXT:    revb.2h $a0, $a0
+; LA32S-NEXT:    rotri.w $a1, $a0, 16
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bswap_i64:
 ; LA64:       # %bb.0:
@@ -61,15 +113,38 @@ define i64 @test_bswap_i64(i64 %a) nounwind {
 ;; Bswap on non-native integer widths.
 
 define i48 @test_bswap_i48(i48 %a) nounwind {
-; LA32-LABEL: test_bswap_i48:
-; LA32:       # %bb.0:
-; LA32-NEXT:    revb.2h $a0, $a0
-; LA32-NEXT:    rotri.w $a2, $a0, 16
-; LA32-NEXT:    revb.2h $a0, $a1
-; LA32-NEXT:    rotri.w $a0, $a0, 16
-; LA32-NEXT:    bytepick.w $a0, $a0, $a2, 2
-; LA32-NEXT:    srli.w $a1, $a2, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bswap_i48:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a0, 8
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 3840
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    srli.w $a4, $a0, 24
+; LA32R-NEXT:    or $a2, $a2, $a4
+; LA32R-NEXT:    slli.w $a2, $a2, 16
+; LA32R-NEXT:    and $a4, $a1, $a3
+; LA32R-NEXT:    slli.w $a4, $a4, 8
+; LA32R-NEXT:    slli.w $a1, $a1, 24
+; LA32R-NEXT:    or $a1, $a1, $a4
+; LA32R-NEXT:    srli.w $a1, $a1, 16
+; LA32R-NEXT:    or $a2, $a1, $a2
+; LA32R-NEXT:    and $a1, $a0, $a3
+; LA32R-NEXT:    slli.w $a1, $a1, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 16
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bswap_i48:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    revb.2h $a0, $a0
+; LA32S-NEXT:    rotri.w $a2, $a0, 16
+; LA32S-NEXT:    revb.2h $a0, $a1
+; LA32S-NEXT:    rotri.w $a0, $a0, 16
+; LA32S-NEXT:    bytepick.w $a0, $a0, $a2, 2
+; LA32S-NEXT:    srli.w $a1, $a2, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bswap_i48:
 ; LA64:       # %bb.0:
@@ -81,24 +156,63 @@ define i48 @test_bswap_i48(i48 %a) nounwind {
 }
 
 define i80 @test_bswap_i80(i80 %a) nounwind {
-; LA32-LABEL: test_bswap_i80:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 4
-; LA32-NEXT:    ld.w $a3, $a1, 8
-; LA32-NEXT:    ld.w $a1, $a1, 0
-; LA32-NEXT:    revb.2h $a2, $a2
-; LA32-NEXT:    rotri.w $a2, $a2, 16
-; LA32-NEXT:    revb.2h $a3, $a3
-; LA32-NEXT:    rotri.w $a3, $a3, 16
-; LA32-NEXT:    bytepick.w $a3, $a3, $a2, 2
-; LA32-NEXT:    revb.2h $a1, $a1
-; LA32-NEXT:    rotri.w $a1, $a1, 16
-; LA32-NEXT:    bytepick.w $a2, $a2, $a1, 2
-; LA32-NEXT:    srli.w $a1, $a1, 16
-; LA32-NEXT:    st.w $a2, $a0, 4
-; LA32-NEXT:    st.w $a3, $a0, 0
-; LA32-NEXT:    st.h $a1, $a0, 8
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bswap_i80:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 4
+; LA32R-NEXT:    ld.w $a3, $a1, 0
+; LA32R-NEXT:    ld.w $a1, $a1, 8
+; LA32R-NEXT:    srli.w $a4, $a2, 8
+; LA32R-NEXT:    lu12i.w $a5, 15
+; LA32R-NEXT:    ori $a5, $a5, 3840
+; LA32R-NEXT:    and $a4, $a4, $a5
+; LA32R-NEXT:    srli.w $a6, $a2, 24
+; LA32R-NEXT:    or $a4, $a4, $a6
+; LA32R-NEXT:    slli.w $a4, $a4, 16
+; LA32R-NEXT:    and $a6, $a1, $a5
+; LA32R-NEXT:    slli.w $a6, $a6, 8
+; LA32R-NEXT:    slli.w $a1, $a1, 24
+; LA32R-NEXT:    or $a1, $a1, $a6
+; LA32R-NEXT:    srli.w $a1, $a1, 16
+; LA32R-NEXT:    or $a1, $a1, $a4
+; LA32R-NEXT:    and $a4, $a2, $a5
+; LA32R-NEXT:    slli.w $a4, $a4, 8
+; LA32R-NEXT:    slli.w $a2, $a2, 24
+; LA32R-NEXT:    or $a2, $a2, $a4
+; LA32R-NEXT:    srli.w $a2, $a2, 16
+; LA32R-NEXT:    srli.w $a4, $a3, 8
+; LA32R-NEXT:    and $a4, $a4, $a5
+; LA32R-NEXT:    srli.w $a6, $a3, 24
+; LA32R-NEXT:    or $a4, $a4, $a6
+; LA32R-NEXT:    slli.w $a4, $a4, 16
+; LA32R-NEXT:    or $a2, $a4, $a2
+; LA32R-NEXT:    and $a4, $a3, $a5
+; LA32R-NEXT:    slli.w $a4, $a4, 8
+; LA32R-NEXT:    slli.w $a3, $a3, 24
+; LA32R-NEXT:    or $a3, $a3, $a4
+; LA32R-NEXT:    srli.w $a3, $a3, 16
+; LA32R-NEXT:    st.h $a3, $a0, 8
+; LA32R-NEXT:    st.w $a2, $a0, 4
+; LA32R-NEXT:    st.w $a1, $a0, 0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bswap_i80:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 4
+; LA32S-NEXT:    ld.w $a3, $a1, 8
+; LA32S-NEXT:    ld.w $a1, $a1, 0
+; LA32S-NEXT:    revb.2h $a2, $a2
+; LA32S-NEXT:    rotri.w $a2, $a2, 16
+; LA32S-NEXT:    revb.2h $a3, $a3
+; LA32S-NEXT:    rotri.w $a3, $a3, 16
+; LA32S-NEXT:    bytepick.w $a3, $a3, $a2, 2
+; LA32S-NEXT:    revb.2h $a1, $a1
+; LA32S-NEXT:    rotri.w $a1, $a1, 16
+; LA32S-NEXT:    bytepick.w $a2, $a2, $a1, 2
+; LA32S-NEXT:    srli.w $a1, $a1, 16
+; LA32S-NEXT:    st.w $a2, $a0, 4
+; LA32S-NEXT:    st.w $a3, $a0, 0
+; LA32S-NEXT:    st.h $a1, $a0, 8
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bswap_i80:
 ; LA64:       # %bb.0:
@@ -112,25 +226,75 @@ define i80 @test_bswap_i80(i80 %a) nounwind {
 }
 
 define i128 @test_bswap_i128(i128 %a) nounwind {
-; LA32-LABEL: test_bswap_i128:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 12
-; LA32-NEXT:    ld.w $a3, $a1, 0
-; LA32-NEXT:    ld.w $a4, $a1, 8
-; LA32-NEXT:    ld.w $a1, $a1, 4
-; LA32-NEXT:    revb.2h $a2, $a2
-; LA32-NEXT:    rotri.w $a2, $a2, 16
-; LA32-NEXT:    revb.2h $a4, $a4
-; LA32-NEXT:    rotri.w $a4, $a4, 16
-; LA32-NEXT:    revb.2h $a1, $a1
-; LA32-NEXT:    rotri.w $a1, $a1, 16
-; LA32-NEXT:    revb.2h $a3, $a3
-; LA32-NEXT:    rotri.w $a3, $a3, 16
-; LA32-NEXT:    st.w $a3, $a0, 12
-; LA32-NEXT:    st.w $a1, $a0, 8
-; LA32-NEXT:    st.w $a4, $a0, 4
-; LA32-NEXT:    st.w $a2, $a0, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_bswap_i128:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 12
+; LA32R-NEXT:    ld.w $a3, $a1, 0
+; LA32R-NEXT:    ld.w $a4, $a1, 4
+; LA32R-NEXT:    ld.w $a1, $a1, 8
+; LA32R-NEXT:    srli.w $a5, $a2, 8
+; LA32R-NEXT:    lu12i.w $a6, 15
+; LA32R-NEXT:    ori $a6, $a6, 3840
+; LA32R-NEXT:    and $a5, $a5, $a6
+; LA32R-NEXT:    srli.w $a7, $a2, 24
+; LA32R-NEXT:    or $a5, $a5, $a7
+; LA32R-NEXT:    and $a7, $a2, $a6
+; LA32R-NEXT:    slli.w $a7, $a7, 8
+; LA32R-NEXT:    slli.w $a2, $a2, 24
+; LA32R-NEXT:    or $a2, $a2, $a7
+; LA32R-NEXT:    or $a2, $a2, $a5
+; LA32R-NEXT:    srli.w $a5, $a1, 8
+; LA32R-NEXT:    and $a5, $a5, $a6
+; LA32R-NEXT:    srli.w $a7, $a1, 24
+; LA32R-NEXT:    or $a5, $a5, $a7
+; LA32R-NEXT:    and $a7, $a1, $a6
+; LA32R-NEXT:    slli.w $a7, $a7, 8
+; LA32R-NEXT:    slli.w $a1, $a1, 24
+; LA32R-NEXT:    or $a1, $a1, $a7
+; LA32R-NEXT:    or $a1, $a1, $a5
+; LA32R-NEXT:    srli.w $a5, $a4, 8
+; LA32R-NEXT:    and $a5, $a5, $a6
+; LA32R-NEXT:    srli.w $a7, $a4, 24
+; LA32R-NEXT:    or $a5, $a5, $a7
+; LA32R-NEXT:    and $a7, $a4, $a6
+; LA32R-NEXT:    slli.w $a7, $a7, 8
+; LA32R-NEXT:    slli.w $a4, $a4, 24
+; LA32R-NEXT:    or $a4, $a4, $a7
+; LA32R-NEXT:    or $a4, $a4, $a5
+; LA32R-NEXT:    srli.w $a5, $a3, 8
+; LA32R-NEXT:    and $a5, $a5, $a6
+; LA32R-NEXT:    srli.w $a7, $a3, 24
+; LA32R-NEXT:    or $a5, $a5, $a7
+; LA32R-NEXT:    and $a6, $a3, $a6
+; LA32R-NEXT:    slli.w $a6, $a6, 8
+; LA32R-NEXT:    slli.w $a3, $a3, 24
+; LA32R-NEXT:    or $a3, $a3, $a6
+; LA32R-NEXT:    or $a3, $a3, $a5
+; LA32R-NEXT:    st.w $a3, $a0, 12
+; LA32R-NEXT:    st.w $a4, $a0, 8
+; LA32R-NEXT:    st.w $a1, $a0, 4
+; LA32R-NEXT:    st.w $a2, $a0, 0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_bswap_i128:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 12
+; LA32S-NEXT:    ld.w $a3, $a1, 0
+; LA32S-NEXT:    ld.w $a4, $a1, 8
+; LA32S-NEXT:    ld.w $a1, $a1, 4
+; LA32S-NEXT:    revb.2h $a2, $a2
+; LA32S-NEXT:    rotri.w $a2, $a2, 16
+; LA32S-NEXT:    revb.2h $a4, $a4
+; LA32S-NEXT:    rotri.w $a4, $a4, 16
+; LA32S-NEXT:    revb.2h $a1, $a1
+; LA32S-NEXT:    rotri.w $a1, $a1, 16
+; LA32S-NEXT:    revb.2h $a3, $a3
+; LA32S-NEXT:    rotri.w $a3, $a3, 16
+; LA32S-NEXT:    st.w $a3, $a0, 12
+; LA32S-NEXT:    st.w $a1, $a0, 8
+; LA32S-NEXT:    st.w $a4, $a0, 4
+; LA32S-NEXT:    st.w $a2, $a0, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_bswap_i128:
 ; LA64:       # %bb.0:
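
Without revb.2h/rotri.w, LA32R swaps bytes with shift/mask/or steps; a minimal C sketch matching the LA32R lines in test_bswap_i32 above (mask 0xff00 is the lu12i.w 15 + ori 3840 pair):

    unsigned bswap32(unsigned x) {
        return (x << 24) | ((x & 0xff00u) << 8) |
               ((x >> 8) & 0xff00u) | (x >> 24);
    }
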
diff --git a/llvm/test/CodeGen/LoongArch/bytepick.ll b/llvm/test/CodeGen/LoongArch/bytepick.ll
index 22a78bcd56119..6169cbce383ce 100644
--- a/llvm/test/CodeGen/LoongArch/bytepick.ll
+++ b/llvm/test/CodeGen/LoongArch/bytepick.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d --verify-machineinstrs < %s \
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA32
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA64
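
The ctlz-cttz-ctpop.ll update below is the largest in this batch: without clz.w/clo.w/ctz.w, LA32R falls back to classic bit tricks. A minimal C sketch of the three expansions visible in the new LA32R check lines; the constants match the lu12i.w/ori pairs there (for example 349525/1365 = 0x55555555 and 30667/1329 = 0x077cb531):

    /* SWAR population count, as in test_ctpop_i32. */
    unsigned popcount32(unsigned x) {
        x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums  */
        x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums  */
        x = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums  */
        return (x * 0x01010101u) >> 24;                   /* horizontal add */
    }

    /* Leading zeros: smear the top set bit rightward, then count the
       zero prefix as the set bits of the complement (the trailing
       nor in test_ctlz_i32's check lines). */
    unsigned clz32(unsigned x) {
        x |= x >> 1; x |= x >> 2; x |= x >> 4;
        x |= x >> 8; x |= x >> 16;
        return popcount32(~x);
    }

    /* Trailing zeros via a de Bruijn multiply plus a 32-entry table
       (the .LCPI14_0 constant-pool load in test_cttz_i32). */
    static const unsigned char debruijn32[32] = {
        0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
        31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9};
    unsigned cttz32(unsigned x) {
        if (x == 0)
            return 32;
        return debruijn32[((x & -x) * 0x077cb531u) >> 27];
    }
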
diff --git a/llvm/test/CodeGen/LoongArch/ctlz-cttz-ctpop.ll b/llvm/test/CodeGen/LoongArch/ctlz-cttz-ctpop.ll
index 161ed573c81f0..e564d7bddea6f 100644
--- a/llvm/test/CodeGen/LoongArch/ctlz-cttz-ctpop.ll
+++ b/llvm/test/CodeGen/LoongArch/ctlz-cttz-ctpop.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 declare i8 @llvm.ctlz.i8(i8, i1)
@@ -16,12 +17,35 @@ declare i32 @llvm.cttz.i32(i32, i1)
 declare i64 @llvm.cttz.i64(i64, i1)
 
 define i8 @test_ctlz_i8(i8 %a) nounwind {
-; LA32-LABEL: test_ctlz_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    clz.w $a0, $a0
-; LA32-NEXT:    addi.w $a0, $a0, -24
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctlz_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a1, $a0, 254
+; LA32R-NEXT:    srli.w $a1, $a1, 1
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 252
+; LA32R-NEXT:    srli.w $a1, $a1, 2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 240
+; LA32R-NEXT:    srli.w $a1, $a1, 4
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    andi $a1, $a1, 85
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 51
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    andi $a0, $a0, 51
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a0, $a0, 15
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctlz_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    clz.w $a0, $a0
+; LA32S-NEXT:    addi.w $a0, $a0, -24
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctlz_i8:
 ; LA64:       # %bb.0:
@@ -34,12 +58,50 @@ define i8 @test_ctlz_i8(i8 %a) nounwind {
 }
 
 define i16 @test_ctlz_i16(i16 %a) nounwind {
-; LA32-LABEL: test_ctlz_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    clz.w $a0, $a0
-; LA32-NEXT:    addi.w $a0, $a0, -16
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctlz_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a2, $a1, 4094
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    srli.w $a2, $a2, 1
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    ori $a2, $a1, 4092
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    srli.w $a2, $a2, 2
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    ori $a2, $a1, 4080
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    srli.w $a2, $a2, 4
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    ori $a1, $a1, 3840
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a1, 8
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 5
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 3
+; LA32R-NEXT:    ori $a1, $a1, 819
+; LA32R-NEXT:    and $a2, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 15
+; LA32R-NEXT:    andi $a0, $a0, 3840
+; LA32R-NEXT:    srli.w $a0, $a0, 8
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctlz_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    clz.w $a0, $a0
+; LA32S-NEXT:    addi.w $a0, $a0, -16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctlz_i16:
 ; LA64:       # %bb.0:
@@ -52,10 +114,44 @@ define i16 @test_ctlz_i16(i16 %a) nounwind {
 }
 
 define i32 @test_ctlz_i32(i32 %a) nounwind {
-; LA32-LABEL: test_ctlz_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    clz.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctlz_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 16
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 349525
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 209715
+; LA32R-NEXT:    ori $a1, $a1, 819
+; LA32R-NEXT:    and $a2, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 61680
+; LA32R-NEXT:    ori $a1, $a1, 3855
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 4112
+; LA32R-NEXT:    ori $a1, $a1, 257
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 24
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctlz_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    clz.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctlz_i32:
 ; LA64:       # %bb.0:
@@ -66,17 +162,80 @@ define i32 @test_ctlz_i32(i32 %a) nounwind {
 }
 
 define i64 @test_ctlz_i64(i64 %a) nounwind {
-; LA32-LABEL: test_ctlz_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sltu $a2, $zero, $a1
-; LA32-NEXT:    clz.w $a1, $a1
-; LA32-NEXT:    maskeqz $a1, $a1, $a2
-; LA32-NEXT:    clz.w $a0, $a0
-; LA32-NEXT:    addi.w $a0, $a0, 32
-; LA32-NEXT:    masknez $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a1, $a0
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctlz_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 349525
+; LA32R-NEXT:    ori $a5, $a2, 1365
+; LA32R-NEXT:    lu12i.w $a2, 209715
+; LA32R-NEXT:    ori $a4, $a2, 819
+; LA32R-NEXT:    lu12i.w $a2, 61680
+; LA32R-NEXT:    ori $a2, $a2, 3855
+; LA32R-NEXT:    lu12i.w $a3, 4112
+; LA32R-NEXT:    ori $a3, $a3, 257
+; LA32R-NEXT:    bne $a1, $zero, .LBB3_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 16
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    and $a1, $a0, $a4
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a4
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    mul.w $a0, $a0, $a3
+; LA32R-NEXT:    srli.w $a0, $a0, 24
+; LA32R-NEXT:    addi.w $a0, $a0, 32
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB3_2:
+; LA32R-NEXT:    srli.w $a0, $a1, 1
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 16
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    and $a1, $a0, $a4
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a4
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    mul.w $a0, $a0, $a3
+; LA32R-NEXT:    srli.w $a0, $a0, 24
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctlz_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sltu $a2, $zero, $a1
+; LA32S-NEXT:    clz.w $a1, $a1
+; LA32S-NEXT:    maskeqz $a1, $a1, $a2
+; LA32S-NEXT:    clz.w $a0, $a0
+; LA32S-NEXT:    addi.w $a0, $a0, 32
+; LA32S-NEXT:    masknez $a0, $a0, $a2
+; LA32S-NEXT:    or $a0, $a1, $a0
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctlz_i64:
 ; LA64:       # %bb.0:
@@ -87,11 +246,35 @@ define i64 @test_ctlz_i64(i64 %a) nounwind {
 }
 
 define i8 @test_not_ctlz_i8(i8 %a) nounwind {
-; LA32-LABEL: test_not_ctlz_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a0, $a0, 24
-; LA32-NEXT:    clo.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_not_ctlz_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 254
+; LA32R-NEXT:    andn $a1, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a1, 1
+; LA32R-NEXT:    orn $a0, $a1, $a0
+; LA32R-NEXT:    andi $a1, $a0, 252
+; LA32R-NEXT:    srli.w $a1, $a1, 2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 240
+; LA32R-NEXT:    srli.w $a1, $a1, 4
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    andi $a1, $a1, 85
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 51
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    andi $a0, $a0, 51
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a0, $a0, 15
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_not_ctlz_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a0, $a0, 24
+; LA32S-NEXT:    clo.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_not_ctlz_i8:
 ; LA64:       # %bb.0:
@@ -104,11 +287,49 @@ define i8 @test_not_ctlz_i8(i8 %a) nounwind {
 }
 
 define i16 @test_not_ctlz_i16(i16 %a) nounwind {
-; LA32-LABEL: test_not_ctlz_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a0, $a0, 16
-; LA32-NEXT:    clo.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_not_ctlz_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a2, $a1, 4094
+; LA32R-NEXT:    andn $a2, $a2, $a0
+; LA32R-NEXT:    srli.w $a2, $a2, 1
+; LA32R-NEXT:    orn $a0, $a2, $a0
+; LA32R-NEXT:    ori $a2, $a1, 4092
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    srli.w $a2, $a2, 2
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    ori $a2, $a1, 4080
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    srli.w $a2, $a2, 4
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    ori $a1, $a1, 3840
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a1, 8
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 5
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 3
+; LA32R-NEXT:    ori $a1, $a1, 819
+; LA32R-NEXT:    and $a2, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 15
+; LA32R-NEXT:    andi $a0, $a0, 3840
+; LA32R-NEXT:    srli.w $a0, $a0, 8
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_not_ctlz_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a0, $a0, 16
+; LA32S-NEXT:    clo.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_not_ctlz_i16:
 ; LA64:       # %bb.0:
@@ -121,10 +342,45 @@ define i16 @test_not_ctlz_i16(i16 %a) nounwind {
 }
 
 define i32 @test_not_ctlz_i32(i32 %a) nounwind {
-; LA32-LABEL: test_not_ctlz_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    clo.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_not_ctlz_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    nor $a1, $a0, $zero
+; LA32R-NEXT:    srli.w $a1, $a1, 1
+; LA32R-NEXT:    orn $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 16
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 349525
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 209715
+; LA32R-NEXT:    ori $a1, $a1, 819
+; LA32R-NEXT:    and $a2, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 61680
+; LA32R-NEXT:    ori $a1, $a1, 3855
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 4112
+; LA32R-NEXT:    ori $a1, $a1, 257
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 24
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_not_ctlz_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    clo.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_not_ctlz_i32:
 ; LA64:       # %bb.0:
@@ -136,18 +392,83 @@ define i32 @test_not_ctlz_i32(i32 %a) nounwind {
 }
 
 define i64 @test_not_ctlz_i64(i64 %a) nounwind {
-; LA32-LABEL: test_not_ctlz_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    nor $a2, $a1, $zero
-; LA32-NEXT:    sltu $a2, $zero, $a2
-; LA32-NEXT:    clo.w $a0, $a0
-; LA32-NEXT:    addi.w $a0, $a0, 32
-; LA32-NEXT:    masknez $a0, $a0, $a2
-; LA32-NEXT:    clo.w $a1, $a1
-; LA32-NEXT:    maskeqz $a1, $a1, $a2
-; LA32-NEXT:    or $a0, $a1, $a0
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_not_ctlz_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    nor $a6, $a1, $zero
+; LA32R-NEXT:    lu12i.w $a2, 349525
+; LA32R-NEXT:    ori $a5, $a2, 1365
+; LA32R-NEXT:    lu12i.w $a2, 209715
+; LA32R-NEXT:    ori $a4, $a2, 819
+; LA32R-NEXT:    lu12i.w $a2, 61680
+; LA32R-NEXT:    ori $a2, $a2, 3855
+; LA32R-NEXT:    lu12i.w $a3, 4112
+; LA32R-NEXT:    ori $a3, $a3, 257
+; LA32R-NEXT:    bne $a6, $zero, .LBB7_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    nor $a1, $a0, $zero
+; LA32R-NEXT:    srli.w $a1, $a1, 1
+; LA32R-NEXT:    orn $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 16
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    and $a1, $a0, $a4
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a4
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    mul.w $a0, $a0, $a3
+; LA32R-NEXT:    srli.w $a0, $a0, 24
+; LA32R-NEXT:    addi.w $a0, $a0, 32
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB7_2:
+; LA32R-NEXT:    srli.w $a0, $a6, 1
+; LA32R-NEXT:    orn $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 8
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 16
+; LA32R-NEXT:    nor $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    and $a1, $a0, $a4
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a4
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    mul.w $a0, $a0, $a3
+; LA32R-NEXT:    srli.w $a0, $a0, 24
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_not_ctlz_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    nor $a2, $a1, $zero
+; LA32S-NEXT:    sltu $a2, $zero, $a2
+; LA32S-NEXT:    clo.w $a0, $a0
+; LA32S-NEXT:    addi.w $a0, $a0, 32
+; LA32S-NEXT:    masknez $a0, $a0, $a2
+; LA32S-NEXT:    clo.w $a1, $a1
+; LA32S-NEXT:    maskeqz $a1, $a1, $a2
+; LA32S-NEXT:    or $a0, $a1, $a0
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_not_ctlz_i64:
 ; LA64:       # %bb.0:
@@ -159,19 +480,33 @@ define i64 @test_not_ctlz_i64(i64 %a) nounwind {
 }
 
 define i8 @test_ctpop_i8(i8 %a) nounwind {
-; LA32-LABEL: test_ctpop_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srli.w $a1, $a0, 1
-; LA32-NEXT:    andi $a1, $a1, 85
-; LA32-NEXT:    sub.w $a0, $a0, $a1
-; LA32-NEXT:    andi $a1, $a0, 51
-; LA32-NEXT:    srli.w $a0, $a0, 2
-; LA32-NEXT:    andi $a0, $a0, 51
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    srli.w $a1, $a0, 4
-; LA32-NEXT:    add.w $a0, $a0, $a1
-; LA32-NEXT:    andi $a0, $a0, 15
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctpop_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    andi $a1, $a1, 85
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 51
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    andi $a0, $a0, 51
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a0, $a0, 15
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctpop_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srli.w $a1, $a0, 1
+; LA32S-NEXT:    andi $a1, $a1, 85
+; LA32S-NEXT:    sub.w $a0, $a0, $a1
+; LA32S-NEXT:    andi $a1, $a0, 51
+; LA32S-NEXT:    srli.w $a0, $a0, 2
+; LA32S-NEXT:    andi $a0, $a0, 51
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    srli.w $a1, $a0, 4
+; LA32S-NEXT:    add.w $a0, $a0, $a1
+; LA32S-NEXT:    andi $a0, $a0, 15
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctpop_i8:
 ; LA64:       # %bb.0:
@@ -186,25 +521,46 @@ define i8 @test_ctpop_i8(i8 %a) nounwind {
 }
 
 define i16 @test_ctpop_i16(i16 %a) nounwind {
-; LA32-LABEL: test_ctpop_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srli.w $a1, $a0, 1
-; LA32-NEXT:    lu12i.w $a2, 5
-; LA32-NEXT:    ori $a2, $a2, 1365
-; LA32-NEXT:    and $a1, $a1, $a2
-; LA32-NEXT:    sub.w $a0, $a0, $a1
-; LA32-NEXT:    lu12i.w $a1, 3
-; LA32-NEXT:    ori $a1, $a1, 819
-; LA32-NEXT:    and $a2, $a0, $a1
-; LA32-NEXT:    srli.w $a0, $a0, 2
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    add.w $a0, $a2, $a0
-; LA32-NEXT:    srli.w $a1, $a0, 4
-; LA32-NEXT:    add.w $a0, $a0, $a1
-; LA32-NEXT:    bstrpick.w $a1, $a0, 11, 8
-; LA32-NEXT:    andi $a0, $a0, 15
-; LA32-NEXT:    add.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctpop_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 5
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 3
+; LA32R-NEXT:    ori $a1, $a1, 819
+; LA32R-NEXT:    and $a2, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 15
+; LA32R-NEXT:    andi $a0, $a0, 3840
+; LA32R-NEXT:    srli.w $a0, $a0, 8
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctpop_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srli.w $a1, $a0, 1
+; LA32S-NEXT:    lu12i.w $a2, 5
+; LA32S-NEXT:    ori $a2, $a2, 1365
+; LA32S-NEXT:    and $a1, $a1, $a2
+; LA32S-NEXT:    sub.w $a0, $a0, $a1
+; LA32S-NEXT:    lu12i.w $a1, 3
+; LA32S-NEXT:    ori $a1, $a1, 819
+; LA32S-NEXT:    and $a2, $a0, $a1
+; LA32S-NEXT:    srli.w $a0, $a0, 2
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    add.w $a0, $a2, $a0
+; LA32S-NEXT:    srli.w $a1, $a0, 4
+; LA32S-NEXT:    add.w $a0, $a0, $a1
+; LA32S-NEXT:    bstrpick.w $a1, $a0, 11, 8
+; LA32S-NEXT:    andi $a0, $a0, 15
+; LA32S-NEXT:    add.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctpop_i16:
 ; LA64:       # %bb.0:
@@ -219,29 +575,53 @@ define i16 @test_ctpop_i16(i16 %a) nounwind {
 }
 
 define i32 @test_ctpop_i32(i32 %a) nounwind {
-; LA32-LABEL: test_ctpop_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srli.w $a1, $a0, 1
-; LA32-NEXT:    lu12i.w $a2, 349525
-; LA32-NEXT:    ori $a2, $a2, 1365
-; LA32-NEXT:    and $a1, $a1, $a2
-; LA32-NEXT:    sub.w $a0, $a0, $a1
-; LA32-NEXT:    lu12i.w $a1, 209715
-; LA32-NEXT:    ori $a1, $a1, 819
-; LA32-NEXT:    and $a2, $a0, $a1
-; LA32-NEXT:    srli.w $a0, $a0, 2
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    add.w $a0, $a2, $a0
-; LA32-NEXT:    srli.w $a1, $a0, 4
-; LA32-NEXT:    add.w $a0, $a0, $a1
-; LA32-NEXT:    lu12i.w $a1, 61680
-; LA32-NEXT:    ori $a1, $a1, 3855
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    lu12i.w $a1, 4112
-; LA32-NEXT:    ori $a1, $a1, 257
-; LA32-NEXT:    mul.w $a0, $a0, $a1
-; LA32-NEXT:    srli.w $a0, $a0, 24
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctpop_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 349525
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 209715
+; LA32R-NEXT:    ori $a1, $a1, 819
+; LA32R-NEXT:    and $a2, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 61680
+; LA32R-NEXT:    ori $a1, $a1, 3855
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 4112
+; LA32R-NEXT:    ori $a1, $a1, 257
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 24
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctpop_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srli.w $a1, $a0, 1
+; LA32S-NEXT:    lu12i.w $a2, 349525
+; LA32S-NEXT:    ori $a2, $a2, 1365
+; LA32S-NEXT:    and $a1, $a1, $a2
+; LA32S-NEXT:    sub.w $a0, $a0, $a1
+; LA32S-NEXT:    lu12i.w $a1, 209715
+; LA32S-NEXT:    ori $a1, $a1, 819
+; LA32S-NEXT:    and $a2, $a0, $a1
+; LA32S-NEXT:    srli.w $a0, $a0, 2
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    add.w $a0, $a2, $a0
+; LA32S-NEXT:    srli.w $a1, $a0, 4
+; LA32S-NEXT:    add.w $a0, $a0, $a1
+; LA32S-NEXT:    lu12i.w $a1, 61680
+; LA32S-NEXT:    ori $a1, $a1, 3855
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    lu12i.w $a1, 4112
+; LA32S-NEXT:    ori $a1, $a1, 257
+; LA32S-NEXT:    mul.w $a0, $a0, $a1
+; LA32S-NEXT:    srli.w $a0, $a0, 24
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctpop_i32:
 ; LA64:       # %bb.0:
@@ -256,43 +636,81 @@ define i32 @test_ctpop_i32(i32 %a) nounwind {
 }
 
 define i64 @test_ctpop_i64(i64 %a) nounwind {
-; LA32-LABEL: test_ctpop_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srli.w $a2, $a1, 1
-; LA32-NEXT:    lu12i.w $a3, 349525
-; LA32-NEXT:    ori $a3, $a3, 1365
-; LA32-NEXT:    and $a2, $a2, $a3
-; LA32-NEXT:    sub.w $a1, $a1, $a2
-; LA32-NEXT:    lu12i.w $a2, 209715
-; LA32-NEXT:    ori $a2, $a2, 819
-; LA32-NEXT:    and $a4, $a1, $a2
-; LA32-NEXT:    srli.w $a1, $a1, 2
-; LA32-NEXT:    and $a1, $a1, $a2
-; LA32-NEXT:    add.w $a1, $a4, $a1
-; LA32-NEXT:    srli.w $a4, $a1, 4
-; LA32-NEXT:    add.w $a1, $a1, $a4
-; LA32-NEXT:    lu12i.w $a4, 61680
-; LA32-NEXT:    ori $a4, $a4, 3855
-; LA32-NEXT:    and $a1, $a1, $a4
-; LA32-NEXT:    lu12i.w $a5, 4112
-; LA32-NEXT:    ori $a5, $a5, 257
-; LA32-NEXT:    mul.w $a1, $a1, $a5
-; LA32-NEXT:    srli.w $a1, $a1, 24
-; LA32-NEXT:    srli.w $a6, $a0, 1
-; LA32-NEXT:    and $a3, $a6, $a3
-; LA32-NEXT:    sub.w $a0, $a0, $a3
-; LA32-NEXT:    and $a3, $a0, $a2
-; LA32-NEXT:    srli.w $a0, $a0, 2
-; LA32-NEXT:    and $a0, $a0, $a2
-; LA32-NEXT:    add.w $a0, $a3, $a0
-; LA32-NEXT:    srli.w $a2, $a0, 4
-; LA32-NEXT:    add.w $a0, $a0, $a2
-; LA32-NEXT:    and $a0, $a0, $a4
-; LA32-NEXT:    mul.w $a0, $a0, $a5
-; LA32-NEXT:    srli.w $a0, $a0, 24
-; LA32-NEXT:    add.w $a0, $a0, $a1
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctpop_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a1, 1
+; LA32R-NEXT:    lu12i.w $a3, 349525
+; LA32R-NEXT:    ori $a3, $a3, 1365
+; LA32R-NEXT:    and $a2, $a2, $a3
+; LA32R-NEXT:    sub.w $a1, $a1, $a2
+; LA32R-NEXT:    lu12i.w $a2, 209715
+; LA32R-NEXT:    ori $a2, $a2, 819
+; LA32R-NEXT:    and $a4, $a1, $a2
+; LA32R-NEXT:    srli.w $a1, $a1, 2
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    add.w $a1, $a4, $a1
+; LA32R-NEXT:    srli.w $a4, $a1, 4
+; LA32R-NEXT:    add.w $a1, $a1, $a4
+; LA32R-NEXT:    lu12i.w $a4, 61680
+; LA32R-NEXT:    ori $a4, $a4, 3855
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    lu12i.w $a5, 4112
+; LA32R-NEXT:    ori $a5, $a5, 257
+; LA32R-NEXT:    mul.w $a1, $a1, $a5
+; LA32R-NEXT:    srli.w $a1, $a1, 24
+; LA32R-NEXT:    srli.w $a6, $a0, 1
+; LA32R-NEXT:    and $a3, $a6, $a3
+; LA32R-NEXT:    sub.w $a0, $a0, $a3
+; LA32R-NEXT:    and $a3, $a0, $a2
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    add.w $a0, $a3, $a0
+; LA32R-NEXT:    srli.w $a2, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a2
+; LA32R-NEXT:    and $a0, $a0, $a4
+; LA32R-NEXT:    mul.w $a0, $a0, $a5
+; LA32R-NEXT:    srli.w $a0, $a0, 24
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctpop_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srli.w $a2, $a1, 1
+; LA32S-NEXT:    lu12i.w $a3, 349525
+; LA32S-NEXT:    ori $a3, $a3, 1365
+; LA32S-NEXT:    and $a2, $a2, $a3
+; LA32S-NEXT:    sub.w $a1, $a1, $a2
+; LA32S-NEXT:    lu12i.w $a2, 209715
+; LA32S-NEXT:    ori $a2, $a2, 819
+; LA32S-NEXT:    and $a4, $a1, $a2
+; LA32S-NEXT:    srli.w $a1, $a1, 2
+; LA32S-NEXT:    and $a1, $a1, $a2
+; LA32S-NEXT:    add.w $a1, $a4, $a1
+; LA32S-NEXT:    srli.w $a4, $a1, 4
+; LA32S-NEXT:    add.w $a1, $a1, $a4
+; LA32S-NEXT:    lu12i.w $a4, 61680
+; LA32S-NEXT:    ori $a4, $a4, 3855
+; LA32S-NEXT:    and $a1, $a1, $a4
+; LA32S-NEXT:    lu12i.w $a5, 4112
+; LA32S-NEXT:    ori $a5, $a5, 257
+; LA32S-NEXT:    mul.w $a1, $a1, $a5
+; LA32S-NEXT:    srli.w $a1, $a1, 24
+; LA32S-NEXT:    srli.w $a6, $a0, 1
+; LA32S-NEXT:    and $a3, $a6, $a3
+; LA32S-NEXT:    sub.w $a0, $a0, $a3
+; LA32S-NEXT:    and $a3, $a0, $a2
+; LA32S-NEXT:    srli.w $a0, $a0, 2
+; LA32S-NEXT:    and $a0, $a0, $a2
+; LA32S-NEXT:    add.w $a0, $a3, $a0
+; LA32S-NEXT:    srli.w $a2, $a0, 4
+; LA32S-NEXT:    add.w $a0, $a0, $a2
+; LA32S-NEXT:    and $a0, $a0, $a4
+; LA32S-NEXT:    mul.w $a0, $a0, $a5
+; LA32S-NEXT:    srli.w $a0, $a0, 24
+; LA32S-NEXT:    add.w $a0, $a0, $a1
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctpop_i64:
 ; LA64:       # %bb.0:
@@ -306,11 +724,27 @@ define i64 @test_ctpop_i64(i64 %a) nounwind {
 }
 
 define i8 @test_cttz_i8(i8 %a) nounwind {
-; LA32-LABEL: test_cttz_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a0, $a0, 256
-; LA32-NEXT:    ctz.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_cttz_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $a0, -1
+; LA32R-NEXT:    andn $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    andi $a1, $a1, 85
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 51
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    andi $a0, $a0, 51
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a0, $a0, 15
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_cttz_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a0, $a0, 256
+; LA32S-NEXT:    ctz.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_cttz_i8:
 ; LA64:       # %bb.0:
@@ -322,12 +756,35 @@ define i8 @test_cttz_i8(i8 %a) nounwind {
 }
 
 define i16 @test_cttz_i16(i16 %a) nounwind {
-; LA32-LABEL: test_cttz_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a1, 16
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ctz.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_cttz_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $a0, -1
+; LA32R-NEXT:    andn $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 5
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 3
+; LA32R-NEXT:    ori $a1, $a1, 819
+; LA32R-NEXT:    and $a2, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 15
+; LA32R-NEXT:    andi $a0, $a0, 3840
+; LA32R-NEXT:    srli.w $a0, $a0, 8
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_cttz_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a1, 16
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ctz.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_cttz_i16:
 ; LA64:       # %bb.0:
@@ -340,10 +797,29 @@ define i16 @test_cttz_i16(i16 %a) nounwind {
 }
 
 define i32 @test_cttz_i32(i32 %a) nounwind {
-; LA32-LABEL: test_cttz_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ctz.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_cttz_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    bne $a0, $zero, .LBB14_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 32
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB14_2:
+; LA32R-NEXT:    sub.w $a1, $zero, $a0
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 30667
+; LA32R-NEXT:    ori $a1, $a1, 1329
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 27
+; LA32R-NEXT:    pcalau12i $a1, %pc_hi20(.LCPI14_0)
+; LA32R-NEXT:    addi.w $a1, $a1, %pc_lo12(.LCPI14_0)
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ld.bu $a0, $a0, 0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_cttz_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ctz.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_cttz_i32:
 ; LA64:       # %bb.0:
@@ -354,17 +830,49 @@ define i32 @test_cttz_i32(i32 %a) nounwind {
 }
 
 define i64 @test_cttz_i64(i64 %a) nounwind {
-; LA32-LABEL: test_cttz_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sltu $a2, $zero, $a0
-; LA32-NEXT:    ctz.w $a0, $a0
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    ctz.w $a1, $a1
-; LA32-NEXT:    addi.w $a1, $a1, 32
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_cttz_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 30667
+; LA32R-NEXT:    ori $a2, $a2, 1329
+; LA32R-NEXT:    pcalau12i $a3, %pc_hi20(.LCPI15_0)
+; LA32R-NEXT:    addi.w $a3, $a3, %pc_lo12(.LCPI15_0)
+; LA32R-NEXT:    bne $a1, $zero, .LBB15_3
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a1, $zero, 32
+; LA32R-NEXT:    beq $a0, $zero, .LBB15_4
+; LA32R-NEXT:  .LBB15_2:
+; LA32R-NEXT:    sub.w $a1, $zero, $a0
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    srli.w $a0, $a0, 27
+; LA32R-NEXT:    add.w $a0, $a3, $a0
+; LA32R-NEXT:    ld.bu $a0, $a0, 0
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB15_3:
+; LA32R-NEXT:    sub.w $a4, $zero, $a1
+; LA32R-NEXT:    and $a1, $a1, $a4
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    srli.w $a1, $a1, 27
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    ld.bu $a1, $a1, 0
+; LA32R-NEXT:    bne $a0, $zero, .LBB15_2
+; LA32R-NEXT:  .LBB15_4:
+; LA32R-NEXT:    addi.w $a0, $a1, 32
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_cttz_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sltu $a2, $zero, $a0
+; LA32S-NEXT:    ctz.w $a0, $a0
+; LA32S-NEXT:    maskeqz $a0, $a0, $a2
+; LA32S-NEXT:    ctz.w $a1, $a1
+; LA32S-NEXT:    addi.w $a1, $a1, 32
+; LA32S-NEXT:    masknez $a1, $a1, $a2
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_cttz_i64:
 ; LA64:       # %bb.0:
@@ -375,12 +883,29 @@ define i64 @test_cttz_i64(i64 %a) nounwind {
 }
 
 define i8 @test_not_cttz_i8(i8 %a) nounwind {
-; LA32-LABEL: test_not_cttz_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a1, $zero, 256
-; LA32-NEXT:    orn $a0, $a1, $a0
-; LA32-NEXT:    ctz.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_not_cttz_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    nor $a1, $a0, $zero
+; LA32R-NEXT:    addi.w $a1, $a1, -1
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    andi $a1, $a1, 85
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 51
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    andi $a0, $a0, 51
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a0, $a0, 15
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_not_cttz_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a1, $zero, 256
+; LA32S-NEXT:    orn $a0, $a1, $a0
+; LA32S-NEXT:    ctz.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_not_cttz_i8:
 ; LA64:       # %bb.0:
@@ -394,12 +919,36 @@ define i8 @test_not_cttz_i8(i8 %a) nounwind {
 }
 
 define i16 @test_not_cttz_i16(i16 %a) nounwind {
-; LA32-LABEL: test_not_cttz_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a1, 16
-; LA32-NEXT:    orn $a0, $a1, $a0
-; LA32-NEXT:    ctz.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_not_cttz_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    nor $a1, $a0, $zero
+; LA32R-NEXT:    addi.w $a1, $a1, -1
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a1, $a0, 1
+; LA32R-NEXT:    lu12i.w $a2, 5
+; LA32R-NEXT:    ori $a2, $a2, 1365
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    lu12i.w $a1, 3
+; LA32R-NEXT:    ori $a1, $a1, 819
+; LA32R-NEXT:    and $a2, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 2
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    srli.w $a1, $a0, 4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    andi $a1, $a0, 15
+; LA32R-NEXT:    andi $a0, $a0, 3840
+; LA32R-NEXT:    srli.w $a0, $a0, 8
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_not_cttz_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a1, 16
+; LA32S-NEXT:    orn $a0, $a1, $a0
+; LA32S-NEXT:    ctz.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_not_cttz_i16:
 ; LA64:       # %bb.0:
@@ -413,10 +962,30 @@ define i16 @test_not_cttz_i16(i16 %a) nounwind {
 }
 
 define i32 @test_not_cttz_i32(i32 %a) nounwind {
-; LA32-LABEL: test_not_cttz_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    cto.w $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_not_cttz_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    nor $a1, $a0, $zero
+; LA32R-NEXT:    bne $a1, $zero, .LBB18_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 32
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB18_2:
+; LA32R-NEXT:    sub.w $a1, $zero, $a1
+; LA32R-NEXT:    andn $a0, $a1, $a0
+; LA32R-NEXT:    lu12i.w $a1, 30667
+; LA32R-NEXT:    ori $a1, $a1, 1329
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 27
+; LA32R-NEXT:    pcalau12i $a1, %pc_hi20(.LCPI18_0)
+; LA32R-NEXT:    addi.w $a1, $a1, %pc_lo12(.LCPI18_0)
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ld.bu $a0, $a0, 0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_not_cttz_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    cto.w $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_not_cttz_i32:
 ; LA64:       # %bb.0:
@@ -428,18 +997,52 @@ define i32 @test_not_cttz_i32(i32 %a) nounwind {
 }
 
 define i64 @test_not_cttz_i64(i64 %a) nounwind {
-; LA32-LABEL: test_not_cttz_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    nor $a2, $a0, $zero
-; LA32-NEXT:    sltu $a2, $zero, $a2
-; LA32-NEXT:    cto.w $a1, $a1
-; LA32-NEXT:    addi.w $a1, $a1, 32
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    cto.w $a0, $a0
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_not_cttz_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    nor $a5, $a1, $zero
+; LA32R-NEXT:    nor $a2, $a0, $zero
+; LA32R-NEXT:    lu12i.w $a3, 30667
+; LA32R-NEXT:    ori $a3, $a3, 1329
+; LA32R-NEXT:    pcalau12i $a4, %pc_hi20(.LCPI19_0)
+; LA32R-NEXT:    addi.w $a4, $a4, %pc_lo12(.LCPI19_0)
+; LA32R-NEXT:    bne $a5, $zero, .LBB19_3
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a1, $zero, 32
+; LA32R-NEXT:    beq $a2, $zero, .LBB19_4
+; LA32R-NEXT:  .LBB19_2:
+; LA32R-NEXT:    sub.w $a1, $zero, $a2
+; LA32R-NEXT:    andn $a0, $a1, $a0
+; LA32R-NEXT:    mul.w $a0, $a0, $a3
+; LA32R-NEXT:    srli.w $a0, $a0, 27
+; LA32R-NEXT:    add.w $a0, $a4, $a0
+; LA32R-NEXT:    ld.bu $a0, $a0, 0
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB19_3:
+; LA32R-NEXT:    sub.w $a5, $zero, $a5
+; LA32R-NEXT:    andn $a1, $a5, $a1
+; LA32R-NEXT:    mul.w $a1, $a1, $a3
+; LA32R-NEXT:    srli.w $a1, $a1, 27
+; LA32R-NEXT:    add.w $a1, $a4, $a1
+; LA32R-NEXT:    ld.bu $a1, $a1, 0
+; LA32R-NEXT:    bne $a2, $zero, .LBB19_2
+; LA32R-NEXT:  .LBB19_4:
+; LA32R-NEXT:    addi.w $a0, $a1, 32
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_not_cttz_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    nor $a2, $a0, $zero
+; LA32S-NEXT:    sltu $a2, $zero, $a2
+; LA32S-NEXT:    cto.w $a1, $a1
+; LA32S-NEXT:    addi.w $a1, $a1, 32
+; LA32S-NEXT:    masknez $a1, $a1, $a2
+; LA32S-NEXT:    cto.w $a0, $a0
+; LA32S-NEXT:    maskeqz $a0, $a0, $a2
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_not_cttz_i64:
 ; LA64:       # %bb.0:
diff --git a/llvm/test/CodeGen/LoongArch/ctpop-with-lsx.ll b/llvm/test/CodeGen/LoongArch/ctpop-with-lsx.ll
index c01f3cdb40568..150a6f16804d8 100644
--- a/llvm/test/CodeGen/LoongArch/ctpop-with-lsx.ll
+++ b/llvm/test/CodeGen/LoongArch/ctpop-with-lsx.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+lsx < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+lsx < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+lsx < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+lsx < %s | FileCheck %s --check-prefix=LA64
 
 declare i8 @llvm.ctpop.i8(i8)
@@ -8,14 +9,23 @@ declare i32 @llvm.ctpop.i32(i32)
 declare i64 @llvm.ctpop.i64(i64)
 
 define i8 @test_ctpop_i8(i8 %a) nounwind {
-; LA32-LABEL: test_ctpop_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    vldi $vr0, 0
-; LA32-NEXT:    vinsgr2vr.w $vr0, $a0, 0
-; LA32-NEXT:    vpcnt.w $vr0, $vr0
-; LA32-NEXT:    vpickve2gr.w $a0, $vr0, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctpop_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 255
+; LA32R-NEXT:    vldi $vr0, 0
+; LA32R-NEXT:    vinsgr2vr.w $vr0, $a0, 0
+; LA32R-NEXT:    vpcnt.w $vr0, $vr0
+; LA32R-NEXT:    vpickve2gr.w $a0, $vr0, 0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctpop_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    vldi $vr0, 0
+; LA32S-NEXT:    vinsgr2vr.w $vr0, $a0, 0
+; LA32S-NEXT:    vpcnt.w $vr0, $vr0
+; LA32S-NEXT:    vpickve2gr.w $a0, $vr0, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctpop_i8:
 ; LA64:       # %bb.0:
@@ -30,14 +40,25 @@ define i8 @test_ctpop_i8(i8 %a) nounwind {
 }
 
 define i16 @test_ctpop_i16(i16 %a) nounwind {
-; LA32-LABEL: test_ctpop_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    vldi $vr0, 0
-; LA32-NEXT:    vinsgr2vr.w $vr0, $a0, 0
-; LA32-NEXT:    vpcnt.w $vr0, $vr0
-; LA32-NEXT:    vpickve2gr.w $a0, $vr0, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctpop_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4095
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    vldi $vr0, 0
+; LA32R-NEXT:    vinsgr2vr.w $vr0, $a0, 0
+; LA32R-NEXT:    vpcnt.w $vr0, $vr0
+; LA32R-NEXT:    vpickve2gr.w $a0, $vr0, 0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctpop_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    vldi $vr0, 0
+; LA32S-NEXT:    vinsgr2vr.w $vr0, $a0, 0
+; LA32S-NEXT:    vpcnt.w $vr0, $vr0
+; LA32S-NEXT:    vpickve2gr.w $a0, $vr0, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctpop_i16:
 ; LA64:       # %bb.0:
@@ -52,13 +73,21 @@ define i16 @test_ctpop_i16(i16 %a) nounwind {
 }
 
 define i32 @test_ctpop_i32(i32 %a) nounwind {
-; LA32-LABEL: test_ctpop_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    vldi $vr0, 0
-; LA32-NEXT:    vinsgr2vr.w $vr0, $a0, 0
-; LA32-NEXT:    vpcnt.w $vr0, $vr0
-; LA32-NEXT:    vpickve2gr.w $a0, $vr0, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctpop_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    vldi $vr0, 0
+; LA32R-NEXT:    vinsgr2vr.w $vr0, $a0, 0
+; LA32R-NEXT:    vpcnt.w $vr0, $vr0
+; LA32R-NEXT:    vpickve2gr.w $a0, $vr0, 0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctpop_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    vldi $vr0, 0
+; LA32S-NEXT:    vinsgr2vr.w $vr0, $a0, 0
+; LA32S-NEXT:    vpcnt.w $vr0, $vr0
+; LA32S-NEXT:    vpickve2gr.w $a0, $vr0, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctpop_i32:
 ; LA64:       # %bb.0:
@@ -73,19 +102,33 @@ define i32 @test_ctpop_i32(i32 %a) nounwind {
 }
 
 define i64 @test_ctpop_i64(i64 %a) nounwind {
-; LA32-LABEL: test_ctpop_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    vldi $vr0, 0
-; LA32-NEXT:    vldi $vr1, 0
-; LA32-NEXT:    vinsgr2vr.w $vr1, $a1, 0
-; LA32-NEXT:    vpcnt.w $vr1, $vr1
-; LA32-NEXT:    vpickve2gr.w $a1, $vr1, 0
-; LA32-NEXT:    vinsgr2vr.w $vr0, $a0, 0
-; LA32-NEXT:    vpcnt.w $vr0, $vr0
-; LA32-NEXT:    vpickve2gr.w $a0, $vr0, 0
-; LA32-NEXT:    add.w $a0, $a0, $a1
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: test_ctpop_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    vldi $vr0, 0
+; LA32R-NEXT:    vldi $vr1, 0
+; LA32R-NEXT:    vinsgr2vr.w $vr1, $a1, 0
+; LA32R-NEXT:    vpcnt.w $vr1, $vr1
+; LA32R-NEXT:    vpickve2gr.w $a1, $vr1, 0
+; LA32R-NEXT:    vinsgr2vr.w $vr0, $a0, 0
+; LA32R-NEXT:    vpcnt.w $vr0, $vr0
+; LA32R-NEXT:    vpickve2gr.w $a0, $vr0, 0
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: test_ctpop_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    vldi $vr0, 0
+; LA32S-NEXT:    vldi $vr1, 0
+; LA32S-NEXT:    vinsgr2vr.w $vr1, $a1, 0
+; LA32S-NEXT:    vpcnt.w $vr1, $vr1
+; LA32S-NEXT:    vpickve2gr.w $a1, $vr1, 0
+; LA32S-NEXT:    vinsgr2vr.w $vr0, $a0, 0
+; LA32S-NEXT:    vpcnt.w $vr0, $vr0
+; LA32S-NEXT:    vpickve2gr.w $a0, $vr0, 0
+; LA32S-NEXT:    add.w $a0, $a0, $a1
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: test_ctpop_i64:
 ; LA64:       # %bb.0:
diff --git a/llvm/test/CodeGen/LoongArch/exception-pointer-register.ll b/llvm/test/CodeGen/LoongArch/exception-pointer-register.ll
index 11cd573641071..0c8ed9e0b5a13 100644
--- a/llvm/test/CodeGen/LoongArch/exception-pointer-register.ll
+++ b/llvm/test/CodeGen/LoongArch/exception-pointer-register.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d --verify-machineinstrs < %s \
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --verify-machineinstrs < %s \
 ; RUN:    | FileCheck %s --check-prefix=LA32
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --verify-machineinstrs < %s \
 ; RUN:    | FileCheck %s --check-prefix=LA64
diff --git a/llvm/test/CodeGen/LoongArch/fabs.ll b/llvm/test/CodeGen/LoongArch/fabs.ll
index 3f3dacd9b71bc..fe06f33f23e3d 100644
--- a/llvm/test/CodeGen/LoongArch/fabs.ll
+++ b/llvm/test/CodeGen/LoongArch/fabs.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 --mattr=+f,-d < %s | FileCheck %s --check-prefix=LA32F
-; RUN: llc --mtriple=loongarch32 --mattr=+d < %s | FileCheck %s --check-prefix=LA32D
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+f,-d < %s | FileCheck %s --check-prefix=LA32F
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32D
 ; RUN: llc --mtriple=loongarch64 --mattr=+f,-d < %s | FileCheck %s --check-prefix=LA64F
 ; RUN: llc --mtriple=loongarch64 --mattr=+d < %s | FileCheck %s --check-prefix=LA64D
 
diff --git a/llvm/test/CodeGen/LoongArch/fcopysign.ll b/llvm/test/CodeGen/LoongArch/fcopysign.ll
index 49e8fbca3e12e..676e7c9060974 100644
--- a/llvm/test/CodeGen/LoongArch/fcopysign.ll
+++ b/llvm/test/CodeGen/LoongArch/fcopysign.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 --mattr=+f,-d < %s | FileCheck %s --check-prefix=LA32F
-; RUN: llc --mtriple=loongarch32 --mattr=+d < %s | FileCheck %s --check-prefix=LA32D
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+f,-d < %s | FileCheck %s --check-prefix=LA32F
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32D
 ; RUN: llc --mtriple=loongarch64 --mattr=+f,-d < %s | FileCheck %s --check-prefix=LA64F
 ; RUN: llc --mtriple=loongarch64 --mattr=+d < %s | FileCheck %s --check-prefix=LA64D
 
diff --git a/llvm/test/CodeGen/LoongArch/feature-32bit.ll b/llvm/test/CodeGen/LoongArch/feature-32bit.ll
index ef3cbd9895037..4fede2baed89d 100644
--- a/llvm/test/CodeGen/LoongArch/feature-32bit.ll
+++ b/llvm/test/CodeGen/LoongArch/feature-32bit.ll
@@ -3,3 +3,4 @@
 
 ; CHECK: Available features for this target:
 ; CHECK: 32bit - LA32 Basic Integer and Privilege Instruction Set.
+; CHECK: 32s - LA32 Standard Basic Instruction Extension.
diff --git a/llvm/test/CodeGen/LoongArch/intrinsic-csr-side-effects.ll b/llvm/test/CodeGen/LoongArch/intrinsic-csr-side-effects.ll
index 68d2c10d0ce1f..828b0662f8b2e 100644
--- a/llvm/test/CodeGen/LoongArch/intrinsic-csr-side-effects.ll
+++ b/llvm/test/CodeGen/LoongArch/intrinsic-csr-side-effects.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 declare i32 @llvm.loongarch.csrrd.w(i32 immarg) nounwind
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/and.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/and.ll
index 9f534439b4f50..c88e15d93806c 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/and.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/and.ll
@@ -1,14 +1,20 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 ;; Exercise the 'and' LLVM IR: https://llvm.org/docs/LangRef.html#and-instruction
 
 define i1 @and_i1(i1 %a, i1 %b) {
-; LA32-LABEL: and_i1:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i1:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i1:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i1:
 ; LA64:       # %bb.0: # %entry
@@ -20,10 +26,15 @@ entry:
 }
 
 define i8 @and_i8(i8 %a, i8 %b) {
-; LA32-LABEL: and_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i8:
 ; LA64:       # %bb.0: # %entry
@@ -35,10 +46,15 @@ entry:
 }
 
 define i16 @and_i16(i16 %a, i16 %b) {
-; LA32-LABEL: and_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i16:
 ; LA64:       # %bb.0: # %entry
@@ -50,10 +66,15 @@ entry:
 }
 
 define i32 @and_i32(i32 %a, i32 %b) {
-; LA32-LABEL: and_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i32:
 ; LA64:       # %bb.0: # %entry
@@ -65,11 +86,17 @@ entry:
 }
 
 define i64 @and_i64(i64 %a, i64 %b) {
-; LA32-LABEL: and_i64:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    and $a0, $a0, $a2
-; LA32-NEXT:    and $a1, $a1, $a3
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i64:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i64:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    and $a0, $a0, $a2
+; LA32S-NEXT:    and $a1, $a1, $a3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i64:
 ; LA64:       # %bb.0: # %entry
@@ -81,10 +108,15 @@ entry:
 }
 
 define i1 @and_i1_0(i1 %b) {
-; LA32-LABEL: and_i1_0:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    move $a0, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i1_0:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i1_0:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    move $a0, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i1_0:
 ; LA64:       # %bb.0: # %entry
@@ -96,9 +128,13 @@ entry:
 }
 
 define i1 @and_i1_5(i1 %b) {
-; LA32-LABEL: and_i1_5:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i1_5:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i1_5:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i1_5:
 ; LA64:       # %bb.0: # %entry
@@ -109,10 +145,15 @@ entry:
 }
 
 define i8 @and_i8_5(i8 %b) {
-; LA32-LABEL: and_i8_5:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    andi $a0, $a0, 5
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i8_5:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    andi $a0, $a0, 5
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i8_5:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    andi $a0, $a0, 5
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i8_5:
 ; LA64:       # %bb.0: # %entry
@@ -124,10 +165,15 @@ entry:
 }
 
 define i8 @and_i8_257(i8 %b) {
-; LA32-LABEL: and_i8_257:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i8_257:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i8_257:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i8_257:
 ; LA64:       # %bb.0: # %entry
@@ -139,10 +185,15 @@ entry:
 }
 
 define i16 @and_i16_5(i16 %b) {
-; LA32-LABEL: and_i16_5:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    andi $a0, $a0, 5
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i16_5:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    andi $a0, $a0, 5
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i16_5:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    andi $a0, $a0, 5
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i16_5:
 ; LA64:       # %bb.0: # %entry
@@ -154,11 +205,17 @@ entry:
 }
 
 define i16 @and_i16_0x1000(i16 %b) {
-; LA32-LABEL: and_i16_0x1000:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    lu12i.w $a1, 1
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i16_0x1000:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    lu12i.w $a1, 1
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i16_0x1000:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    lu12i.w $a1, 1
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i16_0x1000:
 ; LA64:       # %bb.0: # %entry
@@ -171,10 +228,15 @@ entry:
 }
 
 define i16 @and_i16_0x10001(i16 %b) {
-; LA32-LABEL: and_i16_0x10001:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i16_0x10001:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i16_0x10001:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i16_0x10001:
 ; LA64:       # %bb.0: # %entry
@@ -186,10 +248,15 @@ entry:
 }
 
 define i32 @and_i32_5(i32 %b) {
-; LA32-LABEL: and_i32_5:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    andi $a0, $a0, 5
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i32_5:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    andi $a0, $a0, 5
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i32_5:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    andi $a0, $a0, 5
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i32_5:
 ; LA64:       # %bb.0: # %entry
@@ -201,11 +268,17 @@ entry:
 }
 
 define i32 @and_i32_0x1000(i32 %b) {
-; LA32-LABEL: and_i32_0x1000:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    lu12i.w $a1, 1
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i32_0x1000:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    lu12i.w $a1, 1
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i32_0x1000:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    lu12i.w $a1, 1
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i32_0x1000:
 ; LA64:       # %bb.0: # %entry
@@ -218,10 +291,15 @@ entry:
 }
 
 define i32 @and_i32_0x100000001(i32 %b) {
-; LA32-LABEL: and_i32_0x100000001:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i32_0x100000001:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i32_0x100000001:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i32_0x100000001:
 ; LA64:       # %bb.0: # %entry
@@ -233,11 +311,17 @@ entry:
 }
 
 define i64 @and_i64_5(i64 %b) {
-; LA32-LABEL: and_i64_5:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    andi $a0, $a0, 5
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i64_5:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    andi $a0, $a0, 5
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i64_5:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    andi $a0, $a0, 5
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i64_5:
 ; LA64:       # %bb.0: # %entry
@@ -249,12 +333,19 @@ entry:
 }
 
 define i64 @and_i64_0x1000(i64 %b) {
-; LA32-LABEL: and_i64_0x1000:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    lu12i.w $a1, 1
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i64_0x1000:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    lu12i.w $a1, 1
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i64_0x1000:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    lu12i.w $a1, 1
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i64_0x1000:
 ; LA64:       # %bb.0: # %entry
@@ -267,11 +358,18 @@ entry:
 }
 
 define signext i32 @and_i32_0xfff0(i32 %a) {
-; LA32-LABEL: and_i32_0xfff0:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 4
-; LA32-NEXT:    slli.w $a0, $a0, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i32_0xfff0:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4080
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i32_0xfff0:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 4
+; LA32S-NEXT:    slli.w $a0, $a0, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i32_0xfff0:
 ; LA64:       # %bb.0:
@@ -283,14 +381,23 @@ define signext i32 @and_i32_0xfff0(i32 %a) {
 }
 
 define signext i32 @and_i32_0xfff0_twice(i32 %a, i32 %b) {
-; LA32-LABEL: and_i32_0xfff0_twice:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 4
-; LA32-NEXT:    slli.w $a0, $a0, 4
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 4
-; LA32-NEXT:    slli.w $a1, $a1, 4
-; LA32-NEXT:    sub.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i32_0xfff0_twice:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4080
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i32_0xfff0_twice:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 4
+; LA32S-NEXT:    slli.w $a0, $a0, 4
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 4
+; LA32S-NEXT:    slli.w $a1, $a1, 4
+; LA32S-NEXT:    sub.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i32_0xfff0_twice:
 ; LA64:       # %bb.0:
@@ -307,12 +414,20 @@ define signext i32 @and_i32_0xfff0_twice(i32 %a, i32 %b) {
 }
 
 define i64 @and_i64_0xfff0(i64 %a) {
-; LA32-LABEL: and_i64_0xfff0:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 4
-; LA32-NEXT:    slli.w $a0, $a0, 4
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i64_0xfff0:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4080
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i64_0xfff0:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 4
+; LA32S-NEXT:    slli.w $a0, $a0, 4
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i64_0xfff0:
 ; LA64:       # %bb.0:
@@ -324,16 +439,27 @@ define i64 @and_i64_0xfff0(i64 %a) {
 }
 
 define i64 @and_i64_0xfff0_twice(i64 %a, i64 %b) {
-; LA32-LABEL: and_i64_0xfff0_twice:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 4
-; LA32-NEXT:    slli.w $a1, $a0, 4
-; LA32-NEXT:    bstrpick.w $a0, $a2, 15, 4
-; LA32-NEXT:    slli.w $a2, $a0, 4
-; LA32-NEXT:    sub.w $a0, $a1, $a2
-; LA32-NEXT:    sltu $a1, $a1, $a2
-; LA32-NEXT:    sub.w $a1, $zero, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i64_0xfff0_twice:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4080
+; LA32R-NEXT:    and $a3, $a0, $a1
+; LA32R-NEXT:    and $a1, $a2, $a1
+; LA32R-NEXT:    sub.w $a0, $a3, $a1
+; LA32R-NEXT:    sltu $a1, $a3, $a1
+; LA32R-NEXT:    sub.w $a1, $zero, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i64_0xfff0_twice:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 4
+; LA32S-NEXT:    slli.w $a1, $a0, 4
+; LA32S-NEXT:    bstrpick.w $a0, $a2, 15, 4
+; LA32S-NEXT:    slli.w $a2, $a0, 4
+; LA32S-NEXT:    sub.w $a0, $a1, $a2
+; LA32S-NEXT:    sltu $a1, $a1, $a2
+; LA32S-NEXT:    sub.w $a1, $zero, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i64_0xfff0_twice:
 ; LA64:       # %bb.0:
@@ -353,12 +479,19 @@ define i64 @and_i64_0xfff0_twice(i64 %a, i64 %b) {
 ;; since the immediate 1044480 can be composed via
 ;; a single `lu12i.w $rx, 255`.
 define i64 @and_i64_0xff000(i64 %a) {
-; LA32-LABEL: and_i64_0xff000:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a1, 255
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i64_0xff000:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 255
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i64_0xff000:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a1, 255
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i64_0xff000:
 ; LA64:       # %bb.0:
@@ -370,10 +503,16 @@ define i64 @and_i64_0xff000(i64 %a) {
 }
 
 define i64 @and_i64_minus_2048(i64 %a) {
-; LA32-LABEL: and_i64_minus_2048:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrins.w $a0, $zero, 10, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i64_minus_2048:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -2048
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i64_minus_2048:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrins.w $a0, $zero, 10, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i64_minus_2048:
 ; LA64:       # %bb.0:
@@ -386,19 +525,33 @@ define i64 @and_i64_minus_2048(i64 %a) {
 ;; This case is not optimized to `bstrpick + slli`,
 ;; since the immediate 0xfff0 has more than 2 uses.
 define i64 @and_i64_0xfff0_multiple_times(i64 %a, i64 %b, i64 %c) {
-; LA32-LABEL: and_i64_0xfff0_multiple_times:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a1, 15
-; LA32-NEXT:    ori $a1, $a1, 4080
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    and $a2, $a2, $a1
-; LA32-NEXT:    and $a3, $a4, $a1
-; LA32-NEXT:    sltu $a1, $a0, $a2
-; LA32-NEXT:    sub.w $a1, $zero, $a1
-; LA32-NEXT:    sub.w $a0, $a0, $a2
-; LA32-NEXT:    mul.w $a2, $a2, $a3
-; LA32-NEXT:    xor $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i64_0xfff0_multiple_times:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4080
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    and $a2, $a2, $a1
+; LA32R-NEXT:    and $a3, $a4, $a1
+; LA32R-NEXT:    sltu $a1, $a0, $a2
+; LA32R-NEXT:    sub.w $a1, $zero, $a1
+; LA32R-NEXT:    sub.w $a0, $a0, $a2
+; LA32R-NEXT:    mul.w $a2, $a2, $a3
+; LA32R-NEXT:    xor $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i64_0xfff0_multiple_times:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a1, 15
+; LA32S-NEXT:    ori $a1, $a1, 4080
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    and $a2, $a2, $a1
+; LA32S-NEXT:    and $a3, $a4, $a1
+; LA32S-NEXT:    sltu $a1, $a0, $a2
+; LA32S-NEXT:    sub.w $a1, $zero, $a1
+; LA32S-NEXT:    sub.w $a0, $a0, $a2
+; LA32S-NEXT:    mul.w $a2, $a2, $a3
+; LA32S-NEXT:    xor $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i64_0xfff0_multiple_times:
 ; LA64:       # %bb.0:
@@ -421,10 +574,17 @@ define i64 @and_i64_0xfff0_multiple_times(i64 %a, i64 %b, i64 %c) {
 }
 
 define i64 @and_i64_0xffffffffff00ffff(i64 %a) {
-; LA32-LABEL: and_i64_0xffffffffff00ffff:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrins.w $a0, $zero, 23, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_i64_0xffffffffff00ffff:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, -4081
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_i64_0xffffffffff00ffff:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrins.w $a0, $zero, 23, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_i64_0xffffffffff00ffff:
 ; LA64:       # %bb.0:
@@ -435,12 +595,19 @@ define i64 @and_i64_0xffffffffff00ffff(i64 %a) {
 }
 
 define i32 @and_add_lsr(i32 %x, i32 %y) {
-; LA32-LABEL: and_add_lsr:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $a0, $a0, -1
-; LA32-NEXT:    srli.w $a1, $a1, 20
-; LA32-NEXT:    and $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: and_add_lsr:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a0, $a0, -1
+; LA32R-NEXT:    srli.w $a1, $a1, 20
+; LA32R-NEXT:    and $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: and_add_lsr:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $a0, $a0, -1
+; LA32S-NEXT:    srli.w $a1, $a1, 20
+; LA32S-NEXT:    and $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: and_add_lsr:
 ; LA64:       # %bb.0:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/ashr.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/ashr.ll
index 6d04372d9222f..2031011e61844 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/ashr.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/ashr.ll
@@ -19,7 +19,8 @@ define i1 @ashr_i1(i1 %x, i1 %y) {
 define i8 @ashr_i8(i8 %x, i8 %y) {
 ; LA32-LABEL: ashr_i8:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.b $a0, $a0
+; LA32-NEXT:    slli.w $a0, $a0, 24
+; LA32-NEXT:    srai.w $a0, $a0, 24
 ; LA32-NEXT:    sra.w $a0, $a0, $a1
 ; LA32-NEXT:    ret
 ;
@@ -35,7 +36,8 @@ define i8 @ashr_i8(i8 %x, i8 %y) {
 define i16 @ashr_i16(i16 %x, i16 %y) {
 ; LA32-LABEL: ashr_i16:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.h $a0, $a0
+; LA32-NEXT:    slli.w $a0, $a0, 16
+; LA32-NEXT:    srai.w $a0, $a0, 16
 ; LA32-NEXT:    sra.w $a0, $a0, $a1
 ; LA32-NEXT:    ret
 ;
@@ -65,23 +67,19 @@ define i32 @ashr_i32(i32 %x, i32 %y) {
 define i64 @ashr_i64(i64 %x, i64 %y) {
 ; LA32-LABEL: ashr_i64:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    srai.w $a3, $a1, 31
-; LA32-NEXT:    addi.w $a4, $a2, -32
-; LA32-NEXT:    slti $a5, $a4, 0
-; LA32-NEXT:    masknez $a3, $a3, $a5
-; LA32-NEXT:    sra.w $a6, $a1, $a2
-; LA32-NEXT:    maskeqz $a6, $a6, $a5
-; LA32-NEXT:    or $a3, $a6, $a3
+; LA32-NEXT:    addi.w $a3, $a2, -32
+; LA32-NEXT:    bltz $a3, .LBB4_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    sra.w $a0, $a1, $a3
+; LA32-NEXT:    srai.w $a1, $a1, 31
+; LA32-NEXT:    ret
+; LA32-NEXT:  .LBB4_2:
 ; LA32-NEXT:    srl.w $a0, $a0, $a2
-; LA32-NEXT:    xori $a2, $a2, 31
-; LA32-NEXT:    slli.w $a6, $a1, 1
-; LA32-NEXT:    sll.w $a2, $a6, $a2
-; LA32-NEXT:    or $a0, $a0, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a5
-; LA32-NEXT:    sra.w $a1, $a1, $a4
-; LA32-NEXT:    masknez $a1, $a1, $a5
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    move $a1, $a3
+; LA32-NEXT:    xori $a3, $a2, 31
+; LA32-NEXT:    slli.w $a4, $a1, 1
+; LA32-NEXT:    sll.w $a3, $a4, $a3
+; LA32-NEXT:    or $a0, $a0, $a3
+; LA32-NEXT:    sra.w $a1, $a1, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: ashr_i64:
@@ -107,8 +105,8 @@ define i1 @ashr_i1_3(i1 %x) {
 define i8 @ashr_i8_3(i8 %x) {
 ; LA32-LABEL: ashr_i8_3:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.b $a0, $a0
-; LA32-NEXT:    srai.w $a0, $a0, 3
+; LA32-NEXT:    slli.w $a0, $a0, 24
+; LA32-NEXT:    srai.w $a0, $a0, 27
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: ashr_i8_3:
@@ -123,8 +121,8 @@ define i8 @ashr_i8_3(i8 %x) {
 define i16 @ashr_i16_3(i16 %x) {
 ; LA32-LABEL: ashr_i16_3:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.h $a0, $a0
-; LA32-NEXT:    srai.w $a0, $a0, 3
+; LA32-NEXT:    slli.w $a0, $a0, 16
+; LA32-NEXT:    srai.w $a0, $a0, 19
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: ashr_i16_3:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg-128.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg-128.ll
index e8be3d80716de..4594cf241a187 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg-128.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg-128.ll
@@ -33,7 +33,7 @@ define void @cmpxchg_i128_acquire_acquire(ptr %ptr, i128 %cmp, i128 %val) nounwi
 ; LA64-SCQ-NEXT:  # %bb.2: # in Loop: Header=BB0_1 Depth=1
 ; LA64-SCQ-NEXT:    move $a7, $a3
 ; LA64-SCQ-NEXT:    sc.q $a7, $a4, $a0
-; LA64-SCQ-NEXT:    beqz $a7, .LBB0_1
+; LA64-SCQ-NEXT:    beq $a7, $zero, .LBB0_1
 ; LA64-SCQ-NEXT:    b .LBB0_4
 ; LA64-SCQ-NEXT:  .LBB0_3:
 ; LA64-SCQ-NEXT:    dbar 20
@@ -73,7 +73,7 @@ define void @cmpxchg_i128_acquire_monotonic(ptr %ptr, i128 %cmp, i128 %val) noun
 ; LA64-SCQ-NEXT:  # %bb.2: # in Loop: Header=BB1_1 Depth=1
 ; LA64-SCQ-NEXT:    move $a7, $a3
 ; LA64-SCQ-NEXT:    sc.q $a7, $a4, $a0
-; LA64-SCQ-NEXT:    beqz $a7, .LBB1_1
+; LA64-SCQ-NEXT:    beq $a7, $zero, .LBB1_1
 ; LA64-SCQ-NEXT:    b .LBB1_4
 ; LA64-SCQ-NEXT:  .LBB1_3:
 ; LA64-SCQ-NEXT:    dbar 20
@@ -115,7 +115,7 @@ define i128 @cmpxchg_i128_acquire_acquire_reti128(ptr %ptr, i128 %cmp, i128 %val
 ; LA64-SCQ-NEXT:  # %bb.2: # in Loop: Header=BB2_1 Depth=1
 ; LA64-SCQ-NEXT:    move $a7, $a3
 ; LA64-SCQ-NEXT:    sc.q $a7, $a4, $a0
-; LA64-SCQ-NEXT:    beqz $a7, .LBB2_1
+; LA64-SCQ-NEXT:    beq $a7, $zero, .LBB2_1
 ; LA64-SCQ-NEXT:    b .LBB2_4
 ; LA64-SCQ-NEXT:  .LBB2_3:
 ; LA64-SCQ-NEXT:    dbar 20
@@ -158,7 +158,7 @@ define i1 @cmpxchg_i128_acquire_acquire_reti1(ptr %ptr, i128 %cmp, i128 %val) no
 ; LA64-SCQ-NEXT:  # %bb.2: # in Loop: Header=BB3_1 Depth=1
 ; LA64-SCQ-NEXT:    move $a7, $a3
 ; LA64-SCQ-NEXT:    sc.q $a7, $a4, $a0
-; LA64-SCQ-NEXT:    beqz $a7, .LBB3_1
+; LA64-SCQ-NEXT:    beq $a7, $zero, .LBB3_1
 ; LA64-SCQ-NEXT:    b .LBB3_4
 ; LA64-SCQ-NEXT:  .LBB3_3:
 ; LA64-SCQ-NEXT:    dbar 20
@@ -203,7 +203,7 @@ define void @cmpxchg_i128_monotonic_monotonic(ptr %ptr, i128 %cmp, i128 %val) no
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB4_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a7, $a3
 ; NO-LD-SEQ-SA-NEXT:    sc.q $a7, $a4, $a0
-; NO-LD-SEQ-SA-NEXT:    beqz $a7, .LBB4_1
+; NO-LD-SEQ-SA-NEXT:    beq $a7, $zero, .LBB4_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB4_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB4_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -221,7 +221,7 @@ define void @cmpxchg_i128_monotonic_monotonic(ptr %ptr, i128 %cmp, i128 %val) no
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB4_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a7, $a3
 ; LD-SEQ-SA-NEXT:    sc.q $a7, $a4, $a0
-; LD-SEQ-SA-NEXT:    beqz $a7, .LBB4_1
+; LD-SEQ-SA-NEXT:    beq $a7, $zero, .LBB4_1
 ; LD-SEQ-SA-NEXT:    b .LBB4_4
 ; LD-SEQ-SA-NEXT:  .LBB4_3:
 ; LD-SEQ-SA-NEXT:  .LBB4_4:
@@ -261,7 +261,7 @@ define i128 @cmpxchg_i128_monotonic_monotonic_reti128(ptr %ptr, i128 %cmp, i128
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB5_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a7, $a3
 ; NO-LD-SEQ-SA-NEXT:    sc.q $a7, $a4, $a0
-; NO-LD-SEQ-SA-NEXT:    beqz $a7, .LBB5_1
+; NO-LD-SEQ-SA-NEXT:    beq $a7, $zero, .LBB5_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB5_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB5_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -281,7 +281,7 @@ define i128 @cmpxchg_i128_monotonic_monotonic_reti128(ptr %ptr, i128 %cmp, i128
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB5_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a7, $a3
 ; LD-SEQ-SA-NEXT:    sc.q $a7, $a4, $a0
-; LD-SEQ-SA-NEXT:    beqz $a7, .LBB5_1
+; LD-SEQ-SA-NEXT:    beq $a7, $zero, .LBB5_1
 ; LD-SEQ-SA-NEXT:    b .LBB5_4
 ; LD-SEQ-SA-NEXT:  .LBB5_3:
 ; LD-SEQ-SA-NEXT:  .LBB5_4:
@@ -322,7 +322,7 @@ define i1 @cmpxchg_i128_monotonic_monotonic_reti1(ptr %ptr, i128 %cmp, i128 %val
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB6_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a7, $a3
 ; NO-LD-SEQ-SA-NEXT:    sc.q $a7, $a4, $a0
-; NO-LD-SEQ-SA-NEXT:    beqz $a7, .LBB6_1
+; NO-LD-SEQ-SA-NEXT:    beq $a7, $zero, .LBB6_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB6_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB6_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -344,7 +344,7 @@ define i1 @cmpxchg_i128_monotonic_monotonic_reti1(ptr %ptr, i128 %cmp, i128 %val
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB6_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a7, $a3
 ; LD-SEQ-SA-NEXT:    sc.q $a7, $a4, $a0
-; LD-SEQ-SA-NEXT:    beqz $a7, .LBB6_1
+; LD-SEQ-SA-NEXT:    beq $a7, $zero, .LBB6_1
 ; LD-SEQ-SA-NEXT:    b .LBB6_4
 ; LD-SEQ-SA-NEXT:  .LBB6_3:
 ; LD-SEQ-SA-NEXT:  .LBB6_4:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
index 159ffa8c7238a..8a75dcdb19492 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll
@@ -22,7 +22,7 @@ define void @cmpxchg_i8_acquire_acquire(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; LA64-NEXT:    andn $a5, $a3, $a4
 ; LA64-NEXT:    or $a5, $a5, $a2
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB0_1
+; LA64-NEXT:    beq $a5, $zero, .LBB0_1
 ; LA64-NEXT:    b .LBB0_4
 ; LA64-NEXT:  .LBB0_3:
 ; LA64-NEXT:    dbar 20
@@ -57,7 +57,7 @@ define void @cmpxchg_i16_acquire_acquire(ptr %ptr, i16 %cmp, i16 %val) nounwind
 ; LA64-NEXT:    andn $a5, $a3, $a4
 ; LA64-NEXT:    or $a5, $a5, $a2
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB1_1
+; LA64-NEXT:    beq $a5, $zero, .LBB1_1
 ; LA64-NEXT:    b .LBB1_4
 ; LA64-NEXT:  .LBB1_3:
 ; LA64-NEXT:    dbar 20
@@ -82,7 +82,7 @@ define void @cmpxchg_i32_acquire_acquire(ptr %ptr, i32 %cmp, i32 %val) nounwind
 ; LA64-NEXT:  # %bb.2: # in Loop: Header=BB2_1 Depth=1
 ; LA64-NEXT:    move $a4, $a2
 ; LA64-NEXT:    sc.w $a4, $a0, 0
-; LA64-NEXT:    beqz $a4, .LBB2_1
+; LA64-NEXT:    beq $a4, $zero, .LBB2_1
 ; LA64-NEXT:    b .LBB2_4
 ; LA64-NEXT:  .LBB2_3:
 ; LA64-NEXT:    dbar 20
@@ -106,7 +106,7 @@ define void @cmpxchg_i64_acquire_acquire(ptr %ptr, i64 %cmp, i64 %val) nounwind
 ; LA64-NEXT:  # %bb.2: # in Loop: Header=BB3_1 Depth=1
 ; LA64-NEXT:    move $a4, $a2
 ; LA64-NEXT:    sc.d $a4, $a0, 0
-; LA64-NEXT:    beqz $a4, .LBB3_1
+; LA64-NEXT:    beq $a4, $zero, .LBB3_1
 ; LA64-NEXT:    b .LBB3_4
 ; LA64-NEXT:  .LBB3_3:
 ; LA64-NEXT:    dbar 20
@@ -140,7 +140,7 @@ define void @cmpxchg_i8_acquire_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; NO-LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; NO-LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a5, .LBB4_1
+; NO-LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB4_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB4_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB4_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -165,7 +165,7 @@ define void @cmpxchg_i8_acquire_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind {
 ; LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a5, .LBB4_1
+; LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB4_1
 ; LD-SEQ-SA-NEXT:    b .LBB4_4
 ; LD-SEQ-SA-NEXT:  .LBB4_3:
 ; LD-SEQ-SA-NEXT:  .LBB4_4:
@@ -199,7 +199,7 @@ define void @cmpxchg_i16_acquire_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounwin
 ; NO-LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; NO-LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a5, .LBB5_1
+; NO-LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB5_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB5_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB5_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -225,7 +225,7 @@ define void @cmpxchg_i16_acquire_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounwin
 ; LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a5, .LBB5_1
+; LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB5_1
 ; LD-SEQ-SA-NEXT:    b .LBB5_4
 ; LD-SEQ-SA-NEXT:  .LBB5_3:
 ; LD-SEQ-SA-NEXT:  .LBB5_4:
@@ -249,7 +249,7 @@ define void @cmpxchg_i32_acquire_monotonic(ptr %ptr, i32 %cmp, i32 %val) nounwin
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB6_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a4, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a4, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a4, .LBB6_1
+; NO-LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB6_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB6_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB6_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -265,7 +265,7 @@ define void @cmpxchg_i32_acquire_monotonic(ptr %ptr, i32 %cmp, i32 %val) nounwin
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB6_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a4, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a4, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a4, .LBB6_1
+; LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB6_1
 ; LD-SEQ-SA-NEXT:    b .LBB6_4
 ; LD-SEQ-SA-NEXT:  .LBB6_3:
 ; LD-SEQ-SA-NEXT:  .LBB6_4:
@@ -288,7 +288,7 @@ define void @cmpxchg_i64_acquire_monotonic(ptr %ptr, i64 %cmp, i64 %val) nounwin
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB7_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a4, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.d $a4, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a4, .LBB7_1
+; NO-LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB7_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB7_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB7_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -303,7 +303,7 @@ define void @cmpxchg_i64_acquire_monotonic(ptr %ptr, i64 %cmp, i64 %val) nounwin
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB7_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a4, $a2
 ; LD-SEQ-SA-NEXT:    sc.d $a4, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a4, .LBB7_1
+; LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB7_1
 ; LD-SEQ-SA-NEXT:    b .LBB7_4
 ; LD-SEQ-SA-NEXT:  .LBB7_3:
 ; LD-SEQ-SA-NEXT:  .LBB7_4:
@@ -336,7 +336,7 @@ define i8 @cmpxchg_i8_acquire_acquire_reti8(ptr %ptr, i8 %cmp, i8 %val) nounwind
 ; LA64-NEXT:    andn $a6, $a5, $a4
 ; LA64-NEXT:    or $a6, $a6, $a2
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB8_1
+; LA64-NEXT:    beq $a6, $zero, .LBB8_1
 ; LA64-NEXT:    b .LBB8_4
 ; LA64-NEXT:  .LBB8_3:
 ; LA64-NEXT:    dbar 20
@@ -374,7 +374,7 @@ define i16 @cmpxchg_i16_acquire_acquire_reti16(ptr %ptr, i16 %cmp, i16 %val) nou
 ; LA64-NEXT:    andn $a6, $a5, $a4
 ; LA64-NEXT:    or $a6, $a6, $a2
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB9_1
+; LA64-NEXT:    beq $a6, $zero, .LBB9_1
 ; LA64-NEXT:    b .LBB9_4
 ; LA64-NEXT:  .LBB9_3:
 ; LA64-NEXT:    dbar 20
@@ -402,7 +402,7 @@ define i32 @cmpxchg_i32_acquire_acquire_reti32(ptr %ptr, i32 %cmp, i32 %val) nou
 ; LA64-NEXT:  # %bb.2: # in Loop: Header=BB10_1 Depth=1
 ; LA64-NEXT:    move $a4, $a2
 ; LA64-NEXT:    sc.w $a4, $a0, 0
-; LA64-NEXT:    beqz $a4, .LBB10_1
+; LA64-NEXT:    beq $a4, $zero, .LBB10_1
 ; LA64-NEXT:    b .LBB10_4
 ; LA64-NEXT:  .LBB10_3:
 ; LA64-NEXT:    dbar 20
@@ -429,7 +429,7 @@ define i64 @cmpxchg_i64_acquire_acquire_reti64(ptr %ptr, i64 %cmp, i64 %val) nou
 ; LA64-NEXT:  # %bb.2: # in Loop: Header=BB11_1 Depth=1
 ; LA64-NEXT:    move $a4, $a2
 ; LA64-NEXT:    sc.d $a4, $a0, 0
-; LA64-NEXT:    beqz $a4, .LBB11_1
+; LA64-NEXT:    beq $a4, $zero, .LBB11_1
 ; LA64-NEXT:    b .LBB11_4
 ; LA64-NEXT:  .LBB11_3:
 ; LA64-NEXT:    dbar 20
@@ -466,7 +466,7 @@ define i1 @cmpxchg_i8_acquire_acquire_reti1(ptr %ptr, i8 %cmp, i8 %val) nounwind
 ; LA64-NEXT:    andn $a5, $a3, $a4
 ; LA64-NEXT:    or $a5, $a5, $a2
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB12_1
+; LA64-NEXT:    beq $a5, $zero, .LBB12_1
 ; LA64-NEXT:    b .LBB12_4
 ; LA64-NEXT:  .LBB12_3:
 ; LA64-NEXT:    dbar 20
@@ -508,7 +508,7 @@ define i1 @cmpxchg_i16_acquire_acquire_reti1(ptr %ptr, i16 %cmp, i16 %val) nounw
 ; LA64-NEXT:    andn $a5, $a3, $a4
 ; LA64-NEXT:    or $a5, $a5, $a2
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB13_1
+; LA64-NEXT:    beq $a5, $zero, .LBB13_1
 ; LA64-NEXT:    b .LBB13_4
 ; LA64-NEXT:  .LBB13_3:
 ; LA64-NEXT:    dbar 20
@@ -540,7 +540,7 @@ define i1 @cmpxchg_i32_acquire_acquire_reti1(ptr %ptr, i32 %cmp, i32 %val) nounw
 ; LA64-NEXT:  # %bb.2: # in Loop: Header=BB14_1 Depth=1
 ; LA64-NEXT:    move $a4, $a2
 ; LA64-NEXT:    sc.w $a4, $a0, 0
-; LA64-NEXT:    beqz $a4, .LBB14_1
+; LA64-NEXT:    beq $a4, $zero, .LBB14_1
 ; LA64-NEXT:    b .LBB14_4
 ; LA64-NEXT:  .LBB14_3:
 ; LA64-NEXT:    dbar 20
@@ -570,7 +570,7 @@ define i1 @cmpxchg_i64_acquire_acquire_reti1(ptr %ptr, i64 %cmp, i64 %val) nounw
 ; LA64-NEXT:  # %bb.2: # in Loop: Header=BB15_1 Depth=1
 ; LA64-NEXT:    move $a4, $a2
 ; LA64-NEXT:    sc.d $a4, $a0, 0
-; LA64-NEXT:    beqz $a4, .LBB15_1
+; LA64-NEXT:    beq $a4, $zero, .LBB15_1
 ; LA64-NEXT:    b .LBB15_4
 ; LA64-NEXT:  .LBB15_3:
 ; LA64-NEXT:    dbar 20
@@ -610,7 +610,7 @@ define void @cmpxchg_i8_monotonic_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind
 ; NO-LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; NO-LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a5, .LBB16_1
+; NO-LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB16_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB16_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB16_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -635,7 +635,7 @@ define void @cmpxchg_i8_monotonic_monotonic(ptr %ptr, i8 %cmp, i8 %val) nounwind
 ; LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a5, .LBB16_1
+; LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB16_1
 ; LD-SEQ-SA-NEXT:    b .LBB16_4
 ; LD-SEQ-SA-NEXT:  .LBB16_3:
 ; LD-SEQ-SA-NEXT:  .LBB16_4:
@@ -669,7 +669,7 @@ define void @cmpxchg_i16_monotonic_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounw
 ; NO-LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; NO-LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a5, .LBB17_1
+; NO-LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB17_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB17_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB17_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -695,7 +695,7 @@ define void @cmpxchg_i16_monotonic_monotonic(ptr %ptr, i16 %cmp, i16 %val) nounw
 ; LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a5, .LBB17_1
+; LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB17_1
 ; LD-SEQ-SA-NEXT:    b .LBB17_4
 ; LD-SEQ-SA-NEXT:  .LBB17_3:
 ; LD-SEQ-SA-NEXT:  .LBB17_4:
@@ -719,7 +719,7 @@ define void @cmpxchg_i32_monotonic_monotonic(ptr %ptr, i32 %cmp, i32 %val) nounw
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB18_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a4, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a4, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a4, .LBB18_1
+; NO-LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB18_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB18_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB18_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -735,7 +735,7 @@ define void @cmpxchg_i32_monotonic_monotonic(ptr %ptr, i32 %cmp, i32 %val) nounw
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB18_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a4, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a4, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a4, .LBB18_1
+; LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB18_1
 ; LD-SEQ-SA-NEXT:    b .LBB18_4
 ; LD-SEQ-SA-NEXT:  .LBB18_3:
 ; LD-SEQ-SA-NEXT:  .LBB18_4:
@@ -758,7 +758,7 @@ define void @cmpxchg_i64_monotonic_monotonic(ptr %ptr, i64 %cmp, i64 %val) nounw
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB19_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a4, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.d $a4, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a4, .LBB19_1
+; NO-LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB19_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB19_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB19_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -773,7 +773,7 @@ define void @cmpxchg_i64_monotonic_monotonic(ptr %ptr, i64 %cmp, i64 %val) nounw
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB19_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a4, $a2
 ; LD-SEQ-SA-NEXT:    sc.d $a4, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a4, .LBB19_1
+; LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB19_1
 ; LD-SEQ-SA-NEXT:    b .LBB19_4
 ; LD-SEQ-SA-NEXT:  .LBB19_3:
 ; LD-SEQ-SA-NEXT:  .LBB19_4:
@@ -806,7 +806,7 @@ define i8 @cmpxchg_i8_monotonic_monotonic_reti8(ptr %ptr, i8 %cmp, i8 %val) noun
 ; NO-LD-SEQ-SA-NEXT:    andn $a6, $a5, $a4
 ; NO-LD-SEQ-SA-NEXT:    or $a6, $a6, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a6, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a6, .LBB20_1
+; NO-LD-SEQ-SA-NEXT:    beq $a6, $zero, .LBB20_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB20_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB20_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -832,7 +832,7 @@ define i8 @cmpxchg_i8_monotonic_monotonic_reti8(ptr %ptr, i8 %cmp, i8 %val) noun
 ; LD-SEQ-SA-NEXT:    andn $a6, $a5, $a4
 ; LD-SEQ-SA-NEXT:    or $a6, $a6, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a6, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a6, .LBB20_1
+; LD-SEQ-SA-NEXT:    beq $a6, $zero, .LBB20_1
 ; LD-SEQ-SA-NEXT:    b .LBB20_4
 ; LD-SEQ-SA-NEXT:  .LBB20_3:
 ; LD-SEQ-SA-NEXT:  .LBB20_4:
@@ -869,7 +869,7 @@ define i16 @cmpxchg_i16_monotonic_monotonic_reti16(ptr %ptr, i16 %cmp, i16 %val)
 ; NO-LD-SEQ-SA-NEXT:    andn $a6, $a5, $a4
 ; NO-LD-SEQ-SA-NEXT:    or $a6, $a6, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a6, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a6, .LBB21_1
+; NO-LD-SEQ-SA-NEXT:    beq $a6, $zero, .LBB21_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB21_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB21_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -896,7 +896,7 @@ define i16 @cmpxchg_i16_monotonic_monotonic_reti16(ptr %ptr, i16 %cmp, i16 %val)
 ; LD-SEQ-SA-NEXT:    andn $a6, $a5, $a4
 ; LD-SEQ-SA-NEXT:    or $a6, $a6, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a6, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a6, .LBB21_1
+; LD-SEQ-SA-NEXT:    beq $a6, $zero, .LBB21_1
 ; LD-SEQ-SA-NEXT:    b .LBB21_4
 ; LD-SEQ-SA-NEXT:  .LBB21_3:
 ; LD-SEQ-SA-NEXT:  .LBB21_4:
@@ -923,7 +923,7 @@ define i32 @cmpxchg_i32_monotonic_monotonic_reti32(ptr %ptr, i32 %cmp, i32 %val)
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB22_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a4, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a4, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a4, .LBB22_1
+; NO-LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB22_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB22_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB22_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -940,7 +940,7 @@ define i32 @cmpxchg_i32_monotonic_monotonic_reti32(ptr %ptr, i32 %cmp, i32 %val)
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB22_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a4, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a4, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a4, .LBB22_1
+; LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB22_1
 ; LD-SEQ-SA-NEXT:    b .LBB22_4
 ; LD-SEQ-SA-NEXT:  .LBB22_3:
 ; LD-SEQ-SA-NEXT:  .LBB22_4:
@@ -966,7 +966,7 @@ define i64 @cmpxchg_i64_monotonic_monotonic_reti64(ptr %ptr, i64 %cmp, i64 %val)
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB23_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a4, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.d $a4, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a4, .LBB23_1
+; NO-LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB23_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB23_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB23_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -982,7 +982,7 @@ define i64 @cmpxchg_i64_monotonic_monotonic_reti64(ptr %ptr, i64 %cmp, i64 %val)
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB23_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a4, $a2
 ; LD-SEQ-SA-NEXT:    sc.d $a4, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a4, .LBB23_1
+; LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB23_1
 ; LD-SEQ-SA-NEXT:    b .LBB23_4
 ; LD-SEQ-SA-NEXT:  .LBB23_3:
 ; LD-SEQ-SA-NEXT:  .LBB23_4:
@@ -1018,7 +1018,7 @@ define i1 @cmpxchg_i8_monotonic_monotonic_reti1(ptr %ptr, i8 %cmp, i8 %val) noun
 ; NO-LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; NO-LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a5, .LBB24_1
+; NO-LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB24_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB24_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB24_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -1046,7 +1046,7 @@ define i1 @cmpxchg_i8_monotonic_monotonic_reti1(ptr %ptr, i8 %cmp, i8 %val) noun
 ; LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a5, .LBB24_1
+; LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB24_1
 ; LD-SEQ-SA-NEXT:    b .LBB24_4
 ; LD-SEQ-SA-NEXT:  .LBB24_3:
 ; LD-SEQ-SA-NEXT:  .LBB24_4:
@@ -1087,7 +1087,7 @@ define i1 @cmpxchg_i16_monotonic_monotonic_reti1(ptr %ptr, i16 %cmp, i16 %val) n
 ; NO-LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; NO-LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a5, .LBB25_1
+; NO-LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB25_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB25_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB25_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -1116,7 +1116,7 @@ define i1 @cmpxchg_i16_monotonic_monotonic_reti1(ptr %ptr, i16 %cmp, i16 %val) n
 ; LD-SEQ-SA-NEXT:    andn $a5, $a3, $a4
 ; LD-SEQ-SA-NEXT:    or $a5, $a5, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a5, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a5, .LBB25_1
+; LD-SEQ-SA-NEXT:    beq $a5, $zero, .LBB25_1
 ; LD-SEQ-SA-NEXT:    b .LBB25_4
 ; LD-SEQ-SA-NEXT:  .LBB25_3:
 ; LD-SEQ-SA-NEXT:  .LBB25_4:
@@ -1147,7 +1147,7 @@ define i1 @cmpxchg_i32_monotonic_monotonic_reti1(ptr %ptr, i32 %cmp, i32 %val) n
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB26_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a4, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.w $a4, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a4, .LBB26_1
+; NO-LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB26_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB26_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB26_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -1165,7 +1165,7 @@ define i1 @cmpxchg_i32_monotonic_monotonic_reti1(ptr %ptr, i32 %cmp, i32 %val) n
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB26_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a4, $a2
 ; LD-SEQ-SA-NEXT:    sc.w $a4, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a4, .LBB26_1
+; LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB26_1
 ; LD-SEQ-SA-NEXT:    b .LBB26_4
 ; LD-SEQ-SA-NEXT:  .LBB26_3:
 ; LD-SEQ-SA-NEXT:  .LBB26_4:
@@ -1194,7 +1194,7 @@ define i1 @cmpxchg_i64_monotonic_monotonic_reti1(ptr %ptr, i64 %cmp, i64 %val) n
 ; NO-LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB27_1 Depth=1
 ; NO-LD-SEQ-SA-NEXT:    move $a4, $a2
 ; NO-LD-SEQ-SA-NEXT:    sc.d $a4, $a0, 0
-; NO-LD-SEQ-SA-NEXT:    beqz $a4, .LBB27_1
+; NO-LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB27_1
 ; NO-LD-SEQ-SA-NEXT:    b .LBB27_4
 ; NO-LD-SEQ-SA-NEXT:  .LBB27_3:
 ; NO-LD-SEQ-SA-NEXT:    dbar 1792
@@ -1211,7 +1211,7 @@ define i1 @cmpxchg_i64_monotonic_monotonic_reti1(ptr %ptr, i64 %cmp, i64 %val) n
 ; LD-SEQ-SA-NEXT:  # %bb.2: # in Loop: Header=BB27_1 Depth=1
 ; LD-SEQ-SA-NEXT:    move $a4, $a2
 ; LD-SEQ-SA-NEXT:    sc.d $a4, $a0, 0
-; LD-SEQ-SA-NEXT:    beqz $a4, .LBB27_1
+; LD-SEQ-SA-NEXT:    beq $a4, $zero, .LBB27_1
 ; LD-SEQ-SA-NEXT:    b .LBB27_4
 ; LD-SEQ-SA-NEXT:  .LBB27_3:
 ; LD-SEQ-SA-NEXT:  .LBB27_4:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-fp.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-fp.ll
index f47f79d2759c2..4990e7002562d 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-fp.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-fp.ll
@@ -25,7 +25,7 @@ define float @float_fadd_acquire(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB0_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB0_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB0_3
 ; LA64F-NEXT:    b .LBB0_6
 ; LA64F-NEXT:  .LBB0_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB0_1 Depth=1
@@ -57,7 +57,7 @@ define float @float_fadd_acquire(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB0_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB0_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB0_3
 ; LA64D-NEXT:    b .LBB0_6
 ; LA64D-NEXT:  .LBB0_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB0_1 Depth=1
@@ -94,7 +94,7 @@ define float @float_fsub_acquire(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB1_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB1_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB1_3
 ; LA64F-NEXT:    b .LBB1_6
 ; LA64F-NEXT:  .LBB1_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB1_1 Depth=1
@@ -126,7 +126,7 @@ define float @float_fsub_acquire(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB1_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB1_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB1_3
 ; LA64D-NEXT:    b .LBB1_6
 ; LA64D-NEXT:  .LBB1_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB1_1 Depth=1
@@ -164,7 +164,7 @@ define float @float_fmin_acquire(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB2_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB2_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB2_3
 ; LA64F-NEXT:    b .LBB2_6
 ; LA64F-NEXT:  .LBB2_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB2_1 Depth=1
@@ -196,7 +196,7 @@ define float @float_fmin_acquire(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB2_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB2_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB2_3
 ; LA64D-NEXT:    b .LBB2_6
 ; LA64D-NEXT:  .LBB2_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB2_1 Depth=1
@@ -234,7 +234,7 @@ define float @float_fmax_acquire(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB3_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB3_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB3_3
 ; LA64F-NEXT:    b .LBB3_6
 ; LA64F-NEXT:  .LBB3_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB3_1 Depth=1
@@ -266,7 +266,7 @@ define float @float_fmax_acquire(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB3_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB3_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB3_3
 ; LA64D-NEXT:    b .LBB3_6
 ; LA64D-NEXT:  .LBB3_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB3_1 Depth=1
@@ -592,7 +592,7 @@ define float @float_fadd_release(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB8_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB8_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB8_3
 ; LA64F-NEXT:    b .LBB8_6
 ; LA64F-NEXT:  .LBB8_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB8_1 Depth=1
@@ -624,7 +624,7 @@ define float @float_fadd_release(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB8_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB8_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB8_3
 ; LA64D-NEXT:    b .LBB8_6
 ; LA64D-NEXT:  .LBB8_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB8_1 Depth=1
@@ -661,7 +661,7 @@ define float @float_fsub_release(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB9_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB9_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB9_3
 ; LA64F-NEXT:    b .LBB9_6
 ; LA64F-NEXT:  .LBB9_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB9_1 Depth=1
@@ -693,7 +693,7 @@ define float @float_fsub_release(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB9_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB9_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB9_3
 ; LA64D-NEXT:    b .LBB9_6
 ; LA64D-NEXT:  .LBB9_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB9_1 Depth=1
@@ -731,7 +731,7 @@ define float @float_fmin_release(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB10_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB10_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB10_3
 ; LA64F-NEXT:    b .LBB10_6
 ; LA64F-NEXT:  .LBB10_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB10_1 Depth=1
@@ -763,7 +763,7 @@ define float @float_fmin_release(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB10_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB10_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB10_3
 ; LA64D-NEXT:    b .LBB10_6
 ; LA64D-NEXT:  .LBB10_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB10_1 Depth=1
@@ -801,7 +801,7 @@ define float @float_fmax_release(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB11_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB11_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB11_3
 ; LA64F-NEXT:    b .LBB11_6
 ; LA64F-NEXT:  .LBB11_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB11_1 Depth=1
@@ -833,7 +833,7 @@ define float @float_fmax_release(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB11_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB11_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB11_3
 ; LA64D-NEXT:    b .LBB11_6
 ; LA64D-NEXT:  .LBB11_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB11_1 Depth=1
@@ -1159,7 +1159,7 @@ define float @float_fadd_acq_rel(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB16_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB16_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB16_3
 ; LA64F-NEXT:    b .LBB16_6
 ; LA64F-NEXT:  .LBB16_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB16_1 Depth=1
@@ -1191,7 +1191,7 @@ define float @float_fadd_acq_rel(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB16_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB16_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB16_3
 ; LA64D-NEXT:    b .LBB16_6
 ; LA64D-NEXT:  .LBB16_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB16_1 Depth=1
@@ -1228,7 +1228,7 @@ define float @float_fsub_acq_rel(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB17_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB17_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB17_3
 ; LA64F-NEXT:    b .LBB17_6
 ; LA64F-NEXT:  .LBB17_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB17_1 Depth=1
@@ -1260,7 +1260,7 @@ define float @float_fsub_acq_rel(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB17_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB17_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB17_3
 ; LA64D-NEXT:    b .LBB17_6
 ; LA64D-NEXT:  .LBB17_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB17_1 Depth=1
@@ -1298,7 +1298,7 @@ define float @float_fmin_acq_rel(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB18_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB18_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB18_3
 ; LA64F-NEXT:    b .LBB18_6
 ; LA64F-NEXT:  .LBB18_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB18_1 Depth=1
@@ -1330,7 +1330,7 @@ define float @float_fmin_acq_rel(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB18_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB18_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB18_3
 ; LA64D-NEXT:    b .LBB18_6
 ; LA64D-NEXT:  .LBB18_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB18_1 Depth=1
@@ -1368,7 +1368,7 @@ define float @float_fmax_acq_rel(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB19_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB19_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB19_3
 ; LA64F-NEXT:    b .LBB19_6
 ; LA64F-NEXT:  .LBB19_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB19_1 Depth=1
@@ -1400,7 +1400,7 @@ define float @float_fmax_acq_rel(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB19_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB19_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB19_3
 ; LA64D-NEXT:    b .LBB19_6
 ; LA64D-NEXT:  .LBB19_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB19_1 Depth=1
@@ -1726,7 +1726,7 @@ define float @float_fadd_seq_cst(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB24_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB24_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB24_3
 ; LA64F-NEXT:    b .LBB24_6
 ; LA64F-NEXT:  .LBB24_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB24_1 Depth=1
@@ -1758,7 +1758,7 @@ define float @float_fadd_seq_cst(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB24_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB24_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB24_3
 ; LA64D-NEXT:    b .LBB24_6
 ; LA64D-NEXT:  .LBB24_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB24_1 Depth=1
@@ -1795,7 +1795,7 @@ define float @float_fsub_seq_cst(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB25_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB25_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB25_3
 ; LA64F-NEXT:    b .LBB25_6
 ; LA64F-NEXT:  .LBB25_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB25_1 Depth=1
@@ -1827,7 +1827,7 @@ define float @float_fsub_seq_cst(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB25_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB25_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB25_3
 ; LA64D-NEXT:    b .LBB25_6
 ; LA64D-NEXT:  .LBB25_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB25_1 Depth=1
@@ -1865,7 +1865,7 @@ define float @float_fmin_seq_cst(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB26_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB26_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB26_3
 ; LA64F-NEXT:    b .LBB26_6
 ; LA64F-NEXT:  .LBB26_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB26_1 Depth=1
@@ -1897,7 +1897,7 @@ define float @float_fmin_seq_cst(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB26_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB26_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB26_3
 ; LA64D-NEXT:    b .LBB26_6
 ; LA64D-NEXT:  .LBB26_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB26_1 Depth=1
@@ -1935,7 +1935,7 @@ define float @float_fmax_seq_cst(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB27_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB27_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB27_3
 ; LA64F-NEXT:    b .LBB27_6
 ; LA64F-NEXT:  .LBB27_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB27_1 Depth=1
@@ -1967,7 +1967,7 @@ define float @float_fmax_seq_cst(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB27_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB27_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB27_3
 ; LA64D-NEXT:    b .LBB27_6
 ; LA64D-NEXT:  .LBB27_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB27_1 Depth=1
@@ -2293,7 +2293,7 @@ define float @float_fadd_monotonic(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB32_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB32_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB32_3
 ; LA64F-NEXT:    b .LBB32_6
 ; LA64F-NEXT:  .LBB32_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB32_1 Depth=1
@@ -2325,7 +2325,7 @@ define float @float_fadd_monotonic(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB32_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB32_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB32_3
 ; LA64D-NEXT:    b .LBB32_6
 ; LA64D-NEXT:  .LBB32_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB32_1 Depth=1
@@ -2362,7 +2362,7 @@ define float @float_fsub_monotonic(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB33_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB33_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB33_3
 ; LA64F-NEXT:    b .LBB33_6
 ; LA64F-NEXT:  .LBB33_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB33_1 Depth=1
@@ -2394,7 +2394,7 @@ define float @float_fsub_monotonic(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB33_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB33_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB33_3
 ; LA64D-NEXT:    b .LBB33_6
 ; LA64D-NEXT:  .LBB33_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB33_1 Depth=1
@@ -2432,7 +2432,7 @@ define float @float_fmin_monotonic(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB34_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB34_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB34_3
 ; LA64F-NEXT:    b .LBB34_6
 ; LA64F-NEXT:  .LBB34_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB34_1 Depth=1
@@ -2464,7 +2464,7 @@ define float @float_fmin_monotonic(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB34_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB34_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB34_3
 ; LA64D-NEXT:    b .LBB34_6
 ; LA64D-NEXT:  .LBB34_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB34_1 Depth=1
@@ -2502,7 +2502,7 @@ define float @float_fmax_monotonic(ptr %p) nounwind {
 ; LA64F-NEXT:    # in Loop: Header=BB35_3 Depth=2
 ; LA64F-NEXT:    move $a4, $a1
 ; LA64F-NEXT:    sc.w $a4, $a0, 0
-; LA64F-NEXT:    beqz $a4, .LBB35_3
+; LA64F-NEXT:    beq $a4, $zero, .LBB35_3
 ; LA64F-NEXT:    b .LBB35_6
 ; LA64F-NEXT:  .LBB35_5: # %atomicrmw.start
 ; LA64F-NEXT:    # in Loop: Header=BB35_1 Depth=1
@@ -2534,7 +2534,7 @@ define float @float_fmax_monotonic(ptr %p) nounwind {
 ; LA64D-NEXT:    # in Loop: Header=BB35_3 Depth=2
 ; LA64D-NEXT:    move $a4, $a1
 ; LA64D-NEXT:    sc.w $a4, $a0, 0
-; LA64D-NEXT:    beqz $a4, .LBB35_3
+; LA64D-NEXT:    beq $a4, $zero, .LBB35_3
 ; LA64D-NEXT:    b .LBB35_6
 ; LA64D-NEXT:  .LBB35_5: # %atomicrmw.start
 ; LA64D-NEXT:    # in Loop: Header=BB35_1 Depth=1
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lam-bh.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lam-bh.ll
index 646ccbd762685..3d80d6984f3bd 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lam-bh.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lam-bh.ll
@@ -1,37 +1,36 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 --mattr=+d,+lam-bh < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+d,+lam-bh < %s | FileCheck %s --check-prefix=LA32
 ; RUN: llc --mtriple=loongarch64 --mattr=+d,+lam-bh < %s | FileCheck %s --check-prefix=LA64
 
-;; We need to ensure that even if lam-bh is enabled 
+;; We need to ensure that even if lam-bh is enabled
 ;; it will not generate the am*.b/h instruction on loongarch32.
 
 define i8 @atomicrmw_xchg_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i8_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB0_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB0_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap_db.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i8 %b acquire
   ret i8 %1
 }
@@ -39,26 +38,25 @@ define i8 @atomicrmw_xchg_i8_acquire(ptr %a, i8 %b) nounwind {
 define i8 @atomicrmw_xchg_0_i8_acquire(ptr %a) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_0_i8_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a1, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a2, $zero, 255
-; LA32-NEXT:    sll.w	$a2, $a2, $a1
-; LA32-NEXT:    nor	$a2, $a2, $zero
+; LA32-NEXT:    slli.w $a1, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a2, $zero, 255
+; LA32-NEXT:    sll.w $a2, $a2, $a1
+; LA32-NEXT:    nor $a2, $a2, $zero
 ; LA32-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a3, $a0, 0
-; LA32-NEXT:    and	$a4, $a3, $a2
-; LA32-NEXT:    sc.w	$a4, $a0, 0
-; LA32-NEXT:    beqz	$a4, .LBB1_1
+; LA32-NEXT:    ll.w $a3, $a0, 0
+; LA32-NEXT:    and $a4, $a3, $a2
+; LA32-NEXT:    sc.w $a4, $a0, 0
+; LA32-NEXT:    beq $a4, $zero, .LBB1_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a3, $a1
+; LA32-NEXT:    srl.w $a0, $a3, $a1
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i8_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.b	$a1, $zero, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    amswap_db.b $a1, $zero, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i8 0 acquire
   ret i8 %1
 }
@@ -66,26 +64,25 @@ define i8 @atomicrmw_xchg_0_i8_acquire(ptr %a) nounwind {
 define i8 @atomicrmw_xchg_minus_1_i8_acquire(ptr %a) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_minus_1_i8_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a1, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a2, $zero, 255
-; LA32-NEXT:    sll.w	$a2, $a2, $a1
+; LA32-NEXT:    slli.w $a1, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a2, $zero, 255
+; LA32-NEXT:    sll.w $a2, $a2, $a1
 ; LA32-NEXT:  .LBB2_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a3, $a0, 0
-; LA32-NEXT:    or	$a4, $a3, $a2
-; LA32-NEXT:    sc.w	$a4, $a0, 0
-; LA32-NEXT:    beqz	$a4, .LBB2_1
+; LA32-NEXT:    ll.w $a3, $a0, 0
+; LA32-NEXT:    or $a4, $a3, $a2
+; LA32-NEXT:    sc.w $a4, $a0, 0
+; LA32-NEXT:    beq $a4, $zero, .LBB2_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a3, $a1
+; LA32-NEXT:    srl.w $a0, $a3, $a1
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i8_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    addi.w	$a2, $zero, -1
-; LA64-NEXT:    amswap_db.b	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    addi.w $a2, $zero, -1
+; LA64-NEXT:    amswap_db.b $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i8 -1 acquire
   ret i8 %1
 }
@@ -93,31 +90,30 @@ define i8 @atomicrmw_xchg_minus_1_i8_acquire(ptr %a) nounwind {
 define i16 @atomicrmw_xchg_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i16_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB3_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB3_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB3_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap_db.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i16 %b acquire
   ret i16 %1
 }
@@ -125,27 +121,26 @@ define i16 @atomicrmw_xchg_i16_acquire(ptr %a, i16 %b) nounwind {
 define i16 @atomicrmw_xchg_0_i16_acquire(ptr %a) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_0_i16_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a1, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a2, 15
-; LA32-NEXT:    ori	$a2, $a2, 4095
-; LA32-NEXT:    sll.w	$a2, $a2, $a1
-; LA32-NEXT:    nor	$a2, $a2, $zero
+; LA32-NEXT:    slli.w $a1, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a2, 15
+; LA32-NEXT:    ori $a2, $a2, 4095
+; LA32-NEXT:    sll.w $a2, $a2, $a1
+; LA32-NEXT:    nor $a2, $a2, $zero
 ; LA32-NEXT:  .LBB4_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a3, $a0, 0
-; LA32-NEXT:    and	$a4, $a3, $a2
-; LA32-NEXT:    sc.w	$a4, $a0, 0
-; LA32-NEXT:    beqz	$a4, .LBB4_1
+; LA32-NEXT:    ll.w $a3, $a0, 0
+; LA32-NEXT:    and $a4, $a3, $a2
+; LA32-NEXT:    sc.w $a4, $a0, 0
+; LA32-NEXT:    beq $a4, $zero, .LBB4_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a3, $a1
+; LA32-NEXT:    srl.w $a0, $a3, $a1
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i16_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.h	$a1, $zero, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    amswap_db.h $a1, $zero, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i16 0 acquire
   ret i16 %1
 }
@@ -153,27 +148,26 @@ define i16 @atomicrmw_xchg_0_i16_acquire(ptr %a) nounwind {
 define i16 @atomicrmw_xchg_minus_1_i16_acquire(ptr %a) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_minus_1_i16_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a1, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a2, 15
-; LA32-NEXT:    ori	$a2, $a2, 4095
-; LA32-NEXT:    sll.w	$a2, $a2, $a1
+; LA32-NEXT:    slli.w $a1, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a2, 15
+; LA32-NEXT:    ori $a2, $a2, 4095
+; LA32-NEXT:    sll.w $a2, $a2, $a1
 ; LA32-NEXT:  .LBB5_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a3, $a0, 0
-; LA32-NEXT:    or	$a4, $a3, $a2
-; LA32-NEXT:    sc.w	$a4, $a0, 0
-; LA32-NEXT:    beqz	$a4, .LBB5_1
+; LA32-NEXT:    ll.w $a3, $a0, 0
+; LA32-NEXT:    or $a4, $a3, $a2
+; LA32-NEXT:    sc.w $a4, $a0, 0
+; LA32-NEXT:    beq $a4, $zero, .LBB5_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a3, $a1
+; LA32-NEXT:    srl.w $a0, $a3, $a1
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i16_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    addi.w	$a2, $zero, -1
-; LA64-NEXT:    amswap_db.h	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    addi.w $a2, $zero, -1
+; LA64-NEXT:    amswap_db.h $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i16 -1 acquire
   ret i16 %1
 }
@@ -181,30 +175,29 @@ define i16 @atomicrmw_xchg_minus_1_i16_acquire(ptr %a) nounwind {
 define i8 @atomicrmw_xchg_i8_release(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i8_release:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB6_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB6_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB6_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_release:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap_db.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i8 %b release
   ret i8 %1
 }
@@ -212,31 +205,30 @@ define i8 @atomicrmw_xchg_i8_release(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_xchg_i16_release(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i16_release:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB7_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB7_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB7_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_release:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap_db.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i16 %b release
   ret i16 %1
 }
@@ -244,30 +236,29 @@ define i16 @atomicrmw_xchg_i16_release(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_xchg_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i8_acq_rel:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB8_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB8_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB8_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_acq_rel:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap_db.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i8 %b acq_rel
   ret i8 %1
 }
@@ -275,31 +266,30 @@ define i8 @atomicrmw_xchg_i8_acq_rel(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_xchg_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i16_acq_rel:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB9_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB9_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB9_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_acq_rel:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap_db.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i16 %b acq_rel
   ret i16 %1
 }
@@ -307,30 +297,29 @@ define i16 @atomicrmw_xchg_i16_acq_rel(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_xchg_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i8_seq_cst:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB10_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB10_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB10_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_seq_cst:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap_db.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i8 %b seq_cst
   ret i8 %1
 }
@@ -338,31 +327,30 @@ define i8 @atomicrmw_xchg_i8_seq_cst(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_xchg_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i16_seq_cst:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB11_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB11_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB11_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_seq_cst:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap_db.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap_db.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i16 %b seq_cst
   ret i16 %1
 }
@@ -370,30 +358,29 @@ define i16 @atomicrmw_xchg_i16_seq_cst(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_xchg_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i8_monotonic:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB12_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB12_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB12_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_monotonic:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i8 %b monotonic
   ret i8 %1
 }
@@ -401,31 +388,30 @@ define i8 @atomicrmw_xchg_i8_monotonic(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_xchg_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_xchg_i16_monotonic:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB13_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    addi.w	$a5, $a1, 0
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB13_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    addi.w $a5, $a1, 0
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB13_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_monotonic:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amswap.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amswap.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw xchg ptr %a, i16 %b monotonic
   ret i16 %1
 }
@@ -433,30 +419,29 @@ define i16 @atomicrmw_xchg_i16_monotonic(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_add_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i8_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB14_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB14_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB14_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd_db.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd_db.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i8 %b acquire
   ret i8 %1
 }
@@ -464,31 +449,30 @@ define i8 @atomicrmw_add_i8_acquire(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_add_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i16_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB15_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB15_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB15_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd_db.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd_db.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i16 %b acquire
   ret i16 %1
 }
@@ -496,30 +480,29 @@ define i16 @atomicrmw_add_i16_acquire(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_add_i8_release(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i8_release:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB16_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB16_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB16_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_release:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd_db.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd_db.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i8 %b release
   ret i8 %1
 }
@@ -527,31 +510,30 @@ define i8 @atomicrmw_add_i8_release(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_add_i16_release(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i16_release:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB17_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB17_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB17_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_release:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd_db.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd_db.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i16 %b release
   ret i16 %1
 }
@@ -559,30 +541,29 @@ define i16 @atomicrmw_add_i16_release(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_add_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i8_acq_rel:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB18_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB18_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB18_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_acq_rel:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd_db.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd_db.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i8 %b acq_rel
   ret i8 %1
 }
@@ -590,31 +571,30 @@ define i8 @atomicrmw_add_i8_acq_rel(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_add_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i16_acq_rel:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB19_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB19_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB19_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_acq_rel:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd_db.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd_db.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i16 %b acq_rel
   ret i16 %1
 }
@@ -622,30 +602,29 @@ define i16 @atomicrmw_add_i16_acq_rel(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_add_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i8_seq_cst:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB20_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB20_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB20_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_seq_cst:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd_db.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd_db.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i8 %b seq_cst
   ret i8 %1
 }
@@ -653,31 +632,30 @@ define i8 @atomicrmw_add_i8_seq_cst(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_add_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i16_seq_cst:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB21_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB21_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB21_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_seq_cst:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd_db.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd_db.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i16 %b seq_cst
   ret i16 %1
 }
@@ -685,30 +663,29 @@ define i16 @atomicrmw_add_i16_seq_cst(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_add_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i8_monotonic:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB22_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB22_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB22_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_monotonic:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd.b	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd.b $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i8 %b monotonic
   ret i8 %1
 }
@@ -716,31 +693,30 @@ define i8 @atomicrmw_add_i8_monotonic(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_add_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_add_i16_monotonic:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB23_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    add.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB23_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    add.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB23_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_monotonic:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    amadd.h	$a2, $a1, $a0
-; LA64-NEXT:    move	$a0, $a2
+; LA64-NEXT:    amadd.h $a2, $a1, $a0
+; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw add ptr %a, i16 %b monotonic
   ret i16 %1
 }
@@ -748,31 +724,30 @@ define i16 @atomicrmw_add_i16_monotonic(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_sub_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i8_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB24_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB24_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB24_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd_db.b	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd_db.b $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i8 %b acquire
   ret i8 %1
 }
@@ -780,32 +755,31 @@ define i8 @atomicrmw_sub_i8_acquire(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_sub_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i16_acquire:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB25_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB25_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB25_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_acquire:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd_db.h	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd_db.h $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i16 %b acquire
   ret i16 %1
 }
@@ -813,31 +787,30 @@ define i16 @atomicrmw_sub_i16_acquire(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_sub_i8_release(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i8_release:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB26_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB26_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB26_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_release:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd_db.b	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd_db.b $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i8 %b release
   ret i8 %1
 }
@@ -845,32 +818,31 @@ define i8 @atomicrmw_sub_i8_release(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_sub_i16_release(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i16_release:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB27_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB27_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB27_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_release:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd_db.h	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd_db.h $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i16 %b release
   ret i16 %1
 }
@@ -878,31 +850,30 @@ define i16 @atomicrmw_sub_i16_release(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_sub_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i8_acq_rel:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB28_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB28_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB28_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_acq_rel:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd_db.b	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd_db.b $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i8 %b acq_rel
   ret i8 %1
 }
@@ -910,32 +881,31 @@ define i8 @atomicrmw_sub_i8_acq_rel(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_sub_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i16_acq_rel:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB29_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB29_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB29_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_acq_rel:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd_db.h	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd_db.h $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i16 %b acq_rel
   ret i16 %1
 }
@@ -943,31 +913,30 @@ define i16 @atomicrmw_sub_i16_acq_rel(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_sub_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i8_seq_cst:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB30_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB30_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB30_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_seq_cst:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd_db.b	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd_db.b $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i8 %b seq_cst
   ret i8 %1
 }
@@ -975,32 +944,31 @@ define i8 @atomicrmw_sub_i8_seq_cst(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_sub_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i16_seq_cst:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB31_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB31_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB31_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_seq_cst:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd_db.h	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd_db.h $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i16 %b seq_cst
   ret i16 %1
 }
@@ -1008,31 +976,30 @@ define i16 @atomicrmw_sub_i16_seq_cst(ptr %a, i16 %b) nounwind {
 define i8 @atomicrmw_sub_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i8_monotonic:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    ori	$a3, $zero, 255
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    andi	$a1, $a1, 255
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    ori $a3, $zero, 255
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    andi $a1, $a1, 255
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB32_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB32_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB32_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_monotonic:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd.b	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd.b $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i8 %b monotonic
   ret i8 %1
 }
@@ -1040,32 +1007,31 @@ define i8 @atomicrmw_sub_i8_monotonic(ptr %a, i8 %b) nounwind {
 define i16 @atomicrmw_sub_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA32-LABEL: atomicrmw_sub_i16_monotonic:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w	$a2, $a0, 3
-; LA32-NEXT:    bstrins.w	$a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w	$a3, 15
-; LA32-NEXT:    ori	$a3, $a3, 4095
-; LA32-NEXT:    sll.w	$a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w	$a1, $a1, 15, 0
-; LA32-NEXT:    sll.w	$a1, $a1, $a2
+; LA32-NEXT:    slli.w $a2, $a0, 3
+; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32-NEXT:    lu12i.w $a3, 15
+; LA32-NEXT:    ori $a3, $a3, 4095
+; LA32-NEXT:    sll.w $a3, $a3, $a2
+; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32-NEXT:    sll.w $a1, $a1, $a2
 ; LA32-NEXT:  .LBB33_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w	$a4, $a0, 0
-; LA32-NEXT:    sub.w	$a5, $a4, $a1
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    and	$a5, $a5, $a3
-; LA32-NEXT:    xor	$a5, $a4, $a5
-; LA32-NEXT:    sc.w	$a5, $a0, 0
-; LA32-NEXT:    beqz	$a5, .LBB33_1
+; LA32-NEXT:    ll.w $a4, $a0, 0
+; LA32-NEXT:    sub.w $a5, $a4, $a1
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    and $a5, $a5, $a3
+; LA32-NEXT:    xor $a5, $a4, $a5
+; LA32-NEXT:    sc.w $a5, $a0, 0
+; LA32-NEXT:    beq $a5, $zero, .LBB33_1
 ; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w	$a0, $a4, $a2
+; LA32-NEXT:    srl.w $a0, $a4, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_monotonic:
 ; LA64:       # %bb.0:
-; LA64-NEXT:    sub.w	$a2, $zero, $a1
-; LA64-NEXT:    amadd.h	$a1, $a2, $a0
-; LA64-NEXT:    move	$a0, $a1
+; LA64-NEXT:    sub.w $a2, $zero, $a1
+; LA64-NEXT:    amadd.h $a1, $a2, $a0
+; LA64-NEXT:    move $a0, $a1
 ; LA64-NEXT:    ret
-;
   %1 = atomicrmw sub ptr %a, i16 %b monotonic
   ret i16 %1
 }
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lamcas.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lamcas.ll
index 728dd778bdf24..1903fedaeaa40 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lamcas.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lamcas.ll
@@ -22,7 +22,7 @@ define i8 @atomicrmw_xchg_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB0_1
+; LA64-NEXT:    beq $a5, $zero, .LBB0_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -135,7 +135,7 @@ define i16 @atomicrmw_xchg_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB3_1
+; LA64-NEXT:    beq $a5, $zero, .LBB3_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -250,7 +250,7 @@ define i8 @atomicrmw_add_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB6_1
+; LA64-NEXT:    beq $a5, $zero, .LBB6_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -295,7 +295,7 @@ define i16 @atomicrmw_add_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB7_1
+; LA64-NEXT:    beq $a5, $zero, .LBB7_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -340,7 +340,7 @@ define i8 @atomicrmw_sub_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB8_1
+; LA64-NEXT:    beq $a5, $zero, .LBB8_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -386,7 +386,7 @@ define i16 @atomicrmw_sub_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB9_1
+; LA64-NEXT:    beq $a5, $zero, .LBB9_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -436,7 +436,7 @@ define i8 @atomicrmw_umax_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB10_3: # in Loop: Header=BB10_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB10_1
+; LA64-NEXT:    beq $a5, $zero, .LBB10_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -503,7 +503,7 @@ define i16 @atomicrmw_umax_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB11_3: # in Loop: Header=BB11_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB11_1
+; LA64-NEXT:    beq $a5, $zero, .LBB11_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -569,7 +569,7 @@ define i8 @atomicrmw_umin_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB12_3: # in Loop: Header=BB12_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB12_1
+; LA64-NEXT:    beq $a5, $zero, .LBB12_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -638,7 +638,7 @@ define i16 @atomicrmw_umin_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB13_3: # in Loop: Header=BB13_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB13_1
+; LA64-NEXT:    beq $a5, $zero, .LBB13_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -710,7 +710,7 @@ define i8 @atomicrmw_max_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB14_3: # in Loop: Header=BB14_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB14_1
+; LA64-NEXT:    beq $a6, $zero, .LBB14_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -780,7 +780,7 @@ define i16 @atomicrmw_max_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB15_3: # in Loop: Header=BB15_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB15_1
+; LA64-NEXT:    beq $a6, $zero, .LBB15_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -848,7 +848,7 @@ define i8 @atomicrmw_min_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB16_3: # in Loop: Header=BB16_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB16_1
+; LA64-NEXT:    beq $a6, $zero, .LBB16_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -920,7 +920,7 @@ define i16 @atomicrmw_min_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB17_3: # in Loop: Header=BB17_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB17_1
+; LA64-NEXT:    beq $a6, $zero, .LBB17_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -983,7 +983,7 @@ define i8 @atomicrmw_nand_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB18_1
+; LA64-NEXT:    beq $a5, $zero, .LBB18_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1039,7 +1039,7 @@ define i16 @atomicrmw_nand_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB19_1
+; LA64-NEXT:    beq $a5, $zero, .LBB19_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1086,7 +1086,7 @@ define i32 @atomicrmw_nand_i32_acquire(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB20_1
+; LA64-NEXT:    beq $a3, $zero, .LBB20_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -1132,7 +1132,7 @@ define i64 @atomicrmw_nand_i64_acquire(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB21_1
+; LA64-NEXT:    beq $a3, $zero, .LBB21_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -1414,7 +1414,7 @@ define i8 @atomicrmw_xchg_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB28_1
+; LA64-NEXT:    beq $a5, $zero, .LBB28_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1527,7 +1527,7 @@ define i16 @atomicrmw_xchg_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB31_1
+; LA64-NEXT:    beq $a5, $zero, .LBB31_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1642,7 +1642,7 @@ define i8 @atomicrmw_add_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB34_1
+; LA64-NEXT:    beq $a5, $zero, .LBB34_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1687,7 +1687,7 @@ define i16 @atomicrmw_add_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB35_1
+; LA64-NEXT:    beq $a5, $zero, .LBB35_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1732,7 +1732,7 @@ define i8 @atomicrmw_sub_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB36_1
+; LA64-NEXT:    beq $a5, $zero, .LBB36_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1778,7 +1778,7 @@ define i16 @atomicrmw_sub_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB37_1
+; LA64-NEXT:    beq $a5, $zero, .LBB37_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1828,7 +1828,7 @@ define i8 @atomicrmw_umax_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB38_3: # in Loop: Header=BB38_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB38_1
+; LA64-NEXT:    beq $a5, $zero, .LBB38_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1895,7 +1895,7 @@ define i16 @atomicrmw_umax_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB39_3: # in Loop: Header=BB39_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB39_1
+; LA64-NEXT:    beq $a5, $zero, .LBB39_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1961,7 +1961,7 @@ define i8 @atomicrmw_umin_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB40_3: # in Loop: Header=BB40_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB40_1
+; LA64-NEXT:    beq $a5, $zero, .LBB40_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2030,7 +2030,7 @@ define i16 @atomicrmw_umin_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB41_3: # in Loop: Header=BB41_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB41_1
+; LA64-NEXT:    beq $a5, $zero, .LBB41_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2102,7 +2102,7 @@ define i8 @atomicrmw_max_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB42_3: # in Loop: Header=BB42_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB42_1
+; LA64-NEXT:    beq $a6, $zero, .LBB42_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -2172,7 +2172,7 @@ define i16 @atomicrmw_max_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB43_3: # in Loop: Header=BB43_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB43_1
+; LA64-NEXT:    beq $a6, $zero, .LBB43_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -2240,7 +2240,7 @@ define i8 @atomicrmw_min_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB44_3: # in Loop: Header=BB44_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB44_1
+; LA64-NEXT:    beq $a6, $zero, .LBB44_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -2312,7 +2312,7 @@ define i16 @atomicrmw_min_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB45_3: # in Loop: Header=BB45_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB45_1
+; LA64-NEXT:    beq $a6, $zero, .LBB45_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -2375,7 +2375,7 @@ define i8 @atomicrmw_nand_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB46_1
+; LA64-NEXT:    beq $a5, $zero, .LBB46_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2431,7 +2431,7 @@ define i16 @atomicrmw_nand_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB47_1
+; LA64-NEXT:    beq $a5, $zero, .LBB47_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2478,7 +2478,7 @@ define i32 @atomicrmw_nand_i32_release(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB48_1
+; LA64-NEXT:    beq $a3, $zero, .LBB48_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -2524,7 +2524,7 @@ define i64 @atomicrmw_nand_i64_release(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB49_1
+; LA64-NEXT:    beq $a3, $zero, .LBB49_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -2806,7 +2806,7 @@ define i8 @atomicrmw_xchg_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB56_1
+; LA64-NEXT:    beq $a5, $zero, .LBB56_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2919,7 +2919,7 @@ define i16 @atomicrmw_xchg_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB59_1
+; LA64-NEXT:    beq $a5, $zero, .LBB59_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3034,7 +3034,7 @@ define i8 @atomicrmw_add_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB62_1
+; LA64-NEXT:    beq $a5, $zero, .LBB62_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3079,7 +3079,7 @@ define i16 @atomicrmw_add_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB63_1
+; LA64-NEXT:    beq $a5, $zero, .LBB63_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3124,7 +3124,7 @@ define i8 @atomicrmw_sub_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB64_1
+; LA64-NEXT:    beq $a5, $zero, .LBB64_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3170,7 +3170,7 @@ define i16 @atomicrmw_sub_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB65_1
+; LA64-NEXT:    beq $a5, $zero, .LBB65_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3220,7 +3220,7 @@ define i8 @atomicrmw_umax_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB66_3: # in Loop: Header=BB66_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB66_1
+; LA64-NEXT:    beq $a5, $zero, .LBB66_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3287,7 +3287,7 @@ define i16 @atomicrmw_umax_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB67_3: # in Loop: Header=BB67_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB67_1
+; LA64-NEXT:    beq $a5, $zero, .LBB67_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3353,7 +3353,7 @@ define i8 @atomicrmw_umin_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB68_3: # in Loop: Header=BB68_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB68_1
+; LA64-NEXT:    beq $a5, $zero, .LBB68_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3422,7 +3422,7 @@ define i16 @atomicrmw_umin_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB69_3: # in Loop: Header=BB69_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB69_1
+; LA64-NEXT:    beq $a5, $zero, .LBB69_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3494,7 +3494,7 @@ define i8 @atomicrmw_max_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB70_3: # in Loop: Header=BB70_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB70_1
+; LA64-NEXT:    beq $a6, $zero, .LBB70_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -3564,7 +3564,7 @@ define i16 @atomicrmw_max_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB71_3: # in Loop: Header=BB71_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB71_1
+; LA64-NEXT:    beq $a6, $zero, .LBB71_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -3632,7 +3632,7 @@ define i8 @atomicrmw_min_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB72_3: # in Loop: Header=BB72_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB72_1
+; LA64-NEXT:    beq $a6, $zero, .LBB72_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -3704,7 +3704,7 @@ define i16 @atomicrmw_min_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB73_3: # in Loop: Header=BB73_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB73_1
+; LA64-NEXT:    beq $a6, $zero, .LBB73_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -3767,7 +3767,7 @@ define i8 @atomicrmw_nand_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB74_1
+; LA64-NEXT:    beq $a5, $zero, .LBB74_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3823,7 +3823,7 @@ define i16 @atomicrmw_nand_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB75_1
+; LA64-NEXT:    beq $a5, $zero, .LBB75_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3870,7 +3870,7 @@ define i32 @atomicrmw_nand_i32_acq_rel(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB76_1
+; LA64-NEXT:    beq $a3, $zero, .LBB76_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -3916,7 +3916,7 @@ define i64 @atomicrmw_nand_i64_acq_rel(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB77_1
+; LA64-NEXT:    beq $a3, $zero, .LBB77_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -4200,7 +4200,7 @@ define i8 @atomicrmw_xchg_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB84_1
+; LA64-NEXT:    beq $a5, $zero, .LBB84_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4313,7 +4313,7 @@ define i16 @atomicrmw_xchg_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB87_1
+; LA64-NEXT:    beq $a5, $zero, .LBB87_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4428,7 +4428,7 @@ define i8 @atomicrmw_add_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB90_1
+; LA64-NEXT:    beq $a5, $zero, .LBB90_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4473,7 +4473,7 @@ define i16 @atomicrmw_add_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB91_1
+; LA64-NEXT:    beq $a5, $zero, .LBB91_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4518,7 +4518,7 @@ define i8 @atomicrmw_sub_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB92_1
+; LA64-NEXT:    beq $a5, $zero, .LBB92_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4564,7 +4564,7 @@ define i16 @atomicrmw_sub_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB93_1
+; LA64-NEXT:    beq $a5, $zero, .LBB93_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4614,7 +4614,7 @@ define i8 @atomicrmw_umax_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB94_3: # in Loop: Header=BB94_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB94_1
+; LA64-NEXT:    beq $a5, $zero, .LBB94_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4681,7 +4681,7 @@ define i16 @atomicrmw_umax_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB95_3: # in Loop: Header=BB95_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB95_1
+; LA64-NEXT:    beq $a5, $zero, .LBB95_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4747,7 +4747,7 @@ define i8 @atomicrmw_umin_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB96_3: # in Loop: Header=BB96_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB96_1
+; LA64-NEXT:    beq $a5, $zero, .LBB96_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4816,7 +4816,7 @@ define i16 @atomicrmw_umin_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB97_3: # in Loop: Header=BB97_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB97_1
+; LA64-NEXT:    beq $a5, $zero, .LBB97_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4888,7 +4888,7 @@ define i8 @atomicrmw_max_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB98_3: # in Loop: Header=BB98_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB98_1
+; LA64-NEXT:    beq $a6, $zero, .LBB98_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -4958,7 +4958,7 @@ define i16 @atomicrmw_max_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB99_3: # in Loop: Header=BB99_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB99_1
+; LA64-NEXT:    beq $a6, $zero, .LBB99_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -5026,7 +5026,7 @@ define i8 @atomicrmw_min_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB100_3: # in Loop: Header=BB100_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB100_1
+; LA64-NEXT:    beq $a6, $zero, .LBB100_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -5098,7 +5098,7 @@ define i16 @atomicrmw_min_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB101_3: # in Loop: Header=BB101_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB101_1
+; LA64-NEXT:    beq $a6, $zero, .LBB101_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -5161,7 +5161,7 @@ define i8 @atomicrmw_nand_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB102_1
+; LA64-NEXT:    beq $a5, $zero, .LBB102_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -5217,7 +5217,7 @@ define i16 @atomicrmw_nand_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB103_1
+; LA64-NEXT:    beq $a5, $zero, .LBB103_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -5264,7 +5264,7 @@ define i32 @atomicrmw_nand_i32_seq_cst(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB104_1
+; LA64-NEXT:    beq $a3, $zero, .LBB104_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -5310,7 +5310,7 @@ define i64 @atomicrmw_nand_i64_seq_cst(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB105_1
+; LA64-NEXT:    beq $a3, $zero, .LBB105_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -5594,7 +5594,7 @@ define i8 @atomicrmw_xchg_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB112_1
+; LA64-NEXT:    beq $a5, $zero, .LBB112_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -5707,7 +5707,7 @@ define i16 @atomicrmw_xchg_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB115_1
+; LA64-NEXT:    beq $a5, $zero, .LBB115_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -5822,7 +5822,7 @@ define i8 @atomicrmw_add_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB118_1
+; LA64-NEXT:    beq $a5, $zero, .LBB118_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -5867,7 +5867,7 @@ define i16 @atomicrmw_add_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB119_1
+; LA64-NEXT:    beq $a5, $zero, .LBB119_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -5912,7 +5912,7 @@ define i8 @atomicrmw_sub_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB120_1
+; LA64-NEXT:    beq $a5, $zero, .LBB120_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -5958,7 +5958,7 @@ define i16 @atomicrmw_sub_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB121_1
+; LA64-NEXT:    beq $a5, $zero, .LBB121_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -6008,7 +6008,7 @@ define i8 @atomicrmw_umax_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB122_3: # in Loop: Header=BB122_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB122_1
+; LA64-NEXT:    beq $a5, $zero, .LBB122_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -6075,7 +6075,7 @@ define i16 @atomicrmw_umax_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB123_3: # in Loop: Header=BB123_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB123_1
+; LA64-NEXT:    beq $a5, $zero, .LBB123_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -6141,7 +6141,7 @@ define i8 @atomicrmw_umin_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB124_3: # in Loop: Header=BB124_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB124_1
+; LA64-NEXT:    beq $a5, $zero, .LBB124_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -6210,7 +6210,7 @@ define i16 @atomicrmw_umin_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB125_3: # in Loop: Header=BB125_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB125_1
+; LA64-NEXT:    beq $a5, $zero, .LBB125_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -6282,7 +6282,7 @@ define i8 @atomicrmw_max_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB126_3: # in Loop: Header=BB126_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB126_1
+; LA64-NEXT:    beq $a6, $zero, .LBB126_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -6352,7 +6352,7 @@ define i16 @atomicrmw_max_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB127_3: # in Loop: Header=BB127_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB127_1
+; LA64-NEXT:    beq $a6, $zero, .LBB127_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -6420,7 +6420,7 @@ define i8 @atomicrmw_min_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB128_3: # in Loop: Header=BB128_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB128_1
+; LA64-NEXT:    beq $a6, $zero, .LBB128_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -6492,7 +6492,7 @@ define i16 @atomicrmw_min_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB129_3: # in Loop: Header=BB129_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB129_1
+; LA64-NEXT:    beq $a6, $zero, .LBB129_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -6557,7 +6557,7 @@ define i8 @atomicrmw_nand_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB130_1
+; LA64-NEXT:    beq $a5, $zero, .LBB130_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -6613,7 +6613,7 @@ define i16 @atomicrmw_nand_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB131_1
+; LA64-NEXT:    beq $a5, $zero, .LBB131_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -6660,7 +6660,7 @@ define i32 @atomicrmw_nand_i32_monotonic(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB132_1
+; LA64-NEXT:    beq $a3, $zero, .LBB132_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -6706,7 +6706,7 @@ define i64 @atomicrmw_nand_i64_monotonic(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB133_1
+; LA64-NEXT:    beq $a3, $zero, .LBB133_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-minmax.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-minmax.ll
index 03386514a72c8..096c2242661c0 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-minmax.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-minmax.ll
@@ -24,7 +24,7 @@ define i8 @atomicrmw_umax_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB0_3: # in Loop: Header=BB0_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB0_1
+; LA64-NEXT:    beq $a5, $zero, .LBB0_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -53,7 +53,7 @@ define i16 @atomicrmw_umax_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB1_3: # in Loop: Header=BB1_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB1_1
+; LA64-NEXT:    beq $a5, $zero, .LBB1_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -101,7 +101,7 @@ define i8 @atomicrmw_umin_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB4_3: # in Loop: Header=BB4_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB4_1
+; LA64-NEXT:    beq $a5, $zero, .LBB4_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -130,7 +130,7 @@ define i16 @atomicrmw_umin_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB5_3: # in Loop: Header=BB5_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB5_1
+; LA64-NEXT:    beq $a5, $zero, .LBB5_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -182,7 +182,7 @@ define i8 @atomicrmw_max_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB8_3: # in Loop: Header=BB8_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB8_1
+; LA64-NEXT:    beq $a6, $zero, .LBB8_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -216,7 +216,7 @@ define i16 @atomicrmw_max_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB9_3: # in Loop: Header=BB9_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB9_1
+; LA64-NEXT:    beq $a6, $zero, .LBB9_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -268,7 +268,7 @@ define i8 @atomicrmw_min_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB12_3: # in Loop: Header=BB12_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB12_1
+; LA64-NEXT:    beq $a6, $zero, .LBB12_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -302,7 +302,7 @@ define i16 @atomicrmw_min_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB13_3: # in Loop: Header=BB13_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB13_1
+; LA64-NEXT:    beq $a6, $zero, .LBB13_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -350,7 +350,7 @@ define i8 @atomicrmw_umax_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB16_3: # in Loop: Header=BB16_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB16_1
+; LA64-NEXT:    beq $a5, $zero, .LBB16_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -379,7 +379,7 @@ define i16 @atomicrmw_umax_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB17_3: # in Loop: Header=BB17_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB17_1
+; LA64-NEXT:    beq $a5, $zero, .LBB17_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -427,7 +427,7 @@ define i8 @atomicrmw_umin_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB20_3: # in Loop: Header=BB20_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB20_1
+; LA64-NEXT:    beq $a5, $zero, .LBB20_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -456,7 +456,7 @@ define i16 @atomicrmw_umin_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB21_3: # in Loop: Header=BB21_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB21_1
+; LA64-NEXT:    beq $a5, $zero, .LBB21_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -508,7 +508,7 @@ define i8 @atomicrmw_max_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB24_3: # in Loop: Header=BB24_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB24_1
+; LA64-NEXT:    beq $a6, $zero, .LBB24_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -542,7 +542,7 @@ define i16 @atomicrmw_max_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB25_3: # in Loop: Header=BB25_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB25_1
+; LA64-NEXT:    beq $a6, $zero, .LBB25_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -594,7 +594,7 @@ define i8 @atomicrmw_min_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB28_3: # in Loop: Header=BB28_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB28_1
+; LA64-NEXT:    beq $a6, $zero, .LBB28_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -628,7 +628,7 @@ define i16 @atomicrmw_min_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB29_3: # in Loop: Header=BB29_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB29_1
+; LA64-NEXT:    beq $a6, $zero, .LBB29_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -676,7 +676,7 @@ define i8 @atomicrmw_umax_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB32_3: # in Loop: Header=BB32_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB32_1
+; LA64-NEXT:    beq $a5, $zero, .LBB32_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -705,7 +705,7 @@ define i16 @atomicrmw_umax_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB33_3: # in Loop: Header=BB33_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB33_1
+; LA64-NEXT:    beq $a5, $zero, .LBB33_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -753,7 +753,7 @@ define i8 @atomicrmw_umin_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB36_3: # in Loop: Header=BB36_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB36_1
+; LA64-NEXT:    beq $a5, $zero, .LBB36_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -782,7 +782,7 @@ define i16 @atomicrmw_umin_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB37_3: # in Loop: Header=BB37_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB37_1
+; LA64-NEXT:    beq $a5, $zero, .LBB37_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -834,7 +834,7 @@ define i8 @atomicrmw_max_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB40_3: # in Loop: Header=BB40_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB40_1
+; LA64-NEXT:    beq $a6, $zero, .LBB40_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -868,7 +868,7 @@ define i16 @atomicrmw_max_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB41_3: # in Loop: Header=BB41_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB41_1
+; LA64-NEXT:    beq $a6, $zero, .LBB41_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -920,7 +920,7 @@ define i8 @atomicrmw_min_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB44_3: # in Loop: Header=BB44_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB44_1
+; LA64-NEXT:    beq $a6, $zero, .LBB44_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -954,7 +954,7 @@ define i16 @atomicrmw_min_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB45_3: # in Loop: Header=BB45_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB45_1
+; LA64-NEXT:    beq $a6, $zero, .LBB45_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -1002,7 +1002,7 @@ define i8 @atomicrmw_umax_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB48_3: # in Loop: Header=BB48_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB48_1
+; LA64-NEXT:    beq $a5, $zero, .LBB48_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1031,7 +1031,7 @@ define i16 @atomicrmw_umax_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB49_3: # in Loop: Header=BB49_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB49_1
+; LA64-NEXT:    beq $a5, $zero, .LBB49_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1079,7 +1079,7 @@ define i8 @atomicrmw_umin_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB52_3: # in Loop: Header=BB52_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB52_1
+; LA64-NEXT:    beq $a5, $zero, .LBB52_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1108,7 +1108,7 @@ define i16 @atomicrmw_umin_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB53_3: # in Loop: Header=BB53_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB53_1
+; LA64-NEXT:    beq $a5, $zero, .LBB53_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1160,7 +1160,7 @@ define i8 @atomicrmw_max_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB56_3: # in Loop: Header=BB56_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB56_1
+; LA64-NEXT:    beq $a6, $zero, .LBB56_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -1194,7 +1194,7 @@ define i16 @atomicrmw_max_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB57_3: # in Loop: Header=BB57_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB57_1
+; LA64-NEXT:    beq $a6, $zero, .LBB57_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -1246,7 +1246,7 @@ define i8 @atomicrmw_min_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB60_3: # in Loop: Header=BB60_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB60_1
+; LA64-NEXT:    beq $a6, $zero, .LBB60_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -1280,7 +1280,7 @@ define i16 @atomicrmw_min_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB61_3: # in Loop: Header=BB61_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB61_1
+; LA64-NEXT:    beq $a6, $zero, .LBB61_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -1328,7 +1328,7 @@ define i8 @atomicrmw_umax_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB64_3: # in Loop: Header=BB64_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB64_1
+; LA64-NEXT:    beq $a5, $zero, .LBB64_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1357,7 +1357,7 @@ define i16 @atomicrmw_umax_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB65_3: # in Loop: Header=BB65_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB65_1
+; LA64-NEXT:    beq $a5, $zero, .LBB65_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1405,7 +1405,7 @@ define i8 @atomicrmw_umin_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB68_3: # in Loop: Header=BB68_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB68_1
+; LA64-NEXT:    beq $a5, $zero, .LBB68_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1434,7 +1434,7 @@ define i16 @atomicrmw_umin_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:  .LBB69_3: # in Loop: Header=BB69_1 Depth=1
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB69_1
+; LA64-NEXT:    beq $a5, $zero, .LBB69_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1486,7 +1486,7 @@ define i8 @atomicrmw_max_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB72_3: # in Loop: Header=BB72_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB72_1
+; LA64-NEXT:    beq $a6, $zero, .LBB72_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -1520,7 +1520,7 @@ define i16 @atomicrmw_max_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB73_3: # in Loop: Header=BB73_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB73_1
+; LA64-NEXT:    beq $a6, $zero, .LBB73_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -1572,7 +1572,7 @@ define i8 @atomicrmw_min_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB76_3: # in Loop: Header=BB76_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB76_1
+; LA64-NEXT:    beq $a6, $zero, .LBB76_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
@@ -1606,7 +1606,7 @@ define i16 @atomicrmw_min_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    xor $a6, $a5, $a6
 ; LA64-NEXT:  .LBB77_3: # in Loop: Header=BB77_1 Depth=1
 ; LA64-NEXT:    sc.w $a6, $a0, 0
-; LA64-NEXT:    beqz $a6, .LBB77_1
+; LA64-NEXT:    beq $a6, $zero, .LBB77_1
 ; LA64-NEXT:  # %bb.4:
 ; LA64-NEXT:    srl.w $a0, $a5, $a2
 ; LA64-NEXT:    ret
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw.ll
index fc393e133dd4a..607c7f83a4338 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw.ll
@@ -1,27 +1,49 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 define i8 @atomicrmw_xchg_i8_acquire(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i8_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB0_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i8_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB0_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i8_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB0_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_acquire:
 ; LA64:       # %bb.0:
@@ -38,7 +60,7 @@ define i8 @atomicrmw_xchg_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB0_1
+; LA64-NEXT:    beq $a5, $zero, .LBB0_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -47,21 +69,38 @@ define i8 @atomicrmw_xchg_i8_acquire(ptr %a, i8 %b) nounwind {
 }
 
 define i8 @atomicrmw_xchg_0_i8_acquire(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i8_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB1_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i8_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB1_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i8_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB1_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB1_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i8_acquire:
 ; LA64:       # %bb.0:
@@ -78,20 +117,36 @@ define i8 @atomicrmw_xchg_0_i8_acquire(ptr %a) nounwind {
 }
 
 define i8 @atomicrmw_xchg_minus_1_i8_acquire(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i8_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB2_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB2_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i8_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB2_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB2_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i8_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB2_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB2_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i8_acquire:
 ; LA64:       # %bb.0:
@@ -107,26 +162,48 @@ define i8 @atomicrmw_xchg_minus_1_i8_acquire(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_i16_acquire(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i16_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB3_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB3_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i16_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB3_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB3_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i16_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB3_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB3_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_acquire:
 ; LA64:       # %bb.0:
@@ -144,7 +221,7 @@ define i16 @atomicrmw_xchg_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB3_1
+; LA64-NEXT:    beq $a5, $zero, .LBB3_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -153,22 +230,40 @@ define i16 @atomicrmw_xchg_i16_acquire(ptr %a, i16 %b) nounwind {
 }
 
 define i16 @atomicrmw_xchg_0_i16_acquire(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i16_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB4_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB4_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i16_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB4_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB4_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i16_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB4_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB4_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i16_acquire:
 ; LA64:       # %bb.0:
@@ -186,21 +281,38 @@ define i16 @atomicrmw_xchg_0_i16_acquire(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_minus_1_i16_acquire(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i16_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB5_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB5_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i16_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB5_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB5_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i16_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB5_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB5_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i16_acquire:
 ; LA64:       # %bb.0:
@@ -217,16 +329,27 @@ define i16 @atomicrmw_xchg_minus_1_i16_acquire(ptr %a) nounwind {
 }
 
 define i32 @atomicrmw_xchg_i32_acquire(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i32_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB6_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    move $a3, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB6_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i32_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB6_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    move $a3, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB6_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i32_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB6_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    move $a3, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB6_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i32_acquire:
 ; LA64:       # %bb.0:
@@ -238,15 +361,25 @@ define i32 @atomicrmw_xchg_i32_acquire(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xchg_i64_acquire(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i64_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 2
-; LA32-NEXT:    bl __atomic_exchange_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i64_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 2
+; LA32R-NEXT:    bl __atomic_exchange_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i64_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 2
+; LA32S-NEXT:    bl __atomic_exchange_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i64_acquire:
 ; LA64:       # %bb.0:
@@ -258,25 +391,46 @@ define i64 @atomicrmw_xchg_i64_acquire(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_add_i8_acquire(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i8_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB8_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB8_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i8_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB8_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB8_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i8_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB8_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB8_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_acquire:
 ; LA64:       # %bb.0:
@@ -293,7 +447,7 @@ define i8 @atomicrmw_add_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB8_1
+; LA64-NEXT:    beq $a5, $zero, .LBB8_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -302,26 +456,48 @@ define i8 @atomicrmw_add_i8_acquire(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_add_i16_acquire(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i16_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB9_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB9_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i16_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB9_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB9_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i16_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB9_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB9_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_acquire:
 ; LA64:       # %bb.0:
@@ -339,7 +515,7 @@ define i16 @atomicrmw_add_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB9_1
+; LA64-NEXT:    beq $a5, $zero, .LBB9_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -348,16 +524,27 @@ define i16 @atomicrmw_add_i16_acquire(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_add_i32_acquire(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i32_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB10_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    add.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB10_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i32_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB10_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    add.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB10_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i32_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB10_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    add.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB10_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i32_acquire:
 ; LA64:       # %bb.0:
@@ -369,15 +556,25 @@ define i32 @atomicrmw_add_i32_acquire(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_add_i64_acquire(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i64_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 2
-; LA32-NEXT:    bl __atomic_fetch_add_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i64_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 2
+; LA32R-NEXT:    bl __atomic_fetch_add_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i64_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 2
+; LA32S-NEXT:    bl __atomic_fetch_add_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i64_acquire:
 ; LA64:       # %bb.0:
@@ -389,25 +586,46 @@ define i64 @atomicrmw_add_i64_acquire(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_sub_i8_acquire(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i8_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB12_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB12_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i8_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB12_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB12_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i8_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB12_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB12_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_acquire:
 ; LA64:       # %bb.0:
@@ -424,7 +642,7 @@ define i8 @atomicrmw_sub_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB12_1
+; LA64-NEXT:    beq $a5, $zero, .LBB12_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -433,26 +651,48 @@ define i8 @atomicrmw_sub_i8_acquire(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_sub_i16_acquire(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i16_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB13_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB13_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i16_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB13_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB13_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i16_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB13_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB13_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_acquire:
 ; LA64:       # %bb.0:
@@ -470,7 +710,7 @@ define i16 @atomicrmw_sub_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB13_1
+; LA64-NEXT:    beq $a5, $zero, .LBB13_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -479,16 +719,27 @@ define i16 @atomicrmw_sub_i16_acquire(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_sub_i32_acquire(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i32_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB14_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    sub.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB14_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i32_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB14_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    sub.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB14_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i32_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB14_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    sub.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB14_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i32_acquire:
 ; LA64:       # %bb.0:
@@ -501,15 +752,25 @@ define i32 @atomicrmw_sub_i32_acquire(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_sub_i64_acquire(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i64_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 2
-; LA32-NEXT:    bl __atomic_fetch_sub_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i64_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 2
+; LA32R-NEXT:    bl __atomic_fetch_sub_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i64_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 2
+; LA32S-NEXT:    bl __atomic_fetch_sub_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i64_acquire:
 ; LA64:       # %bb.0:
@@ -522,26 +783,48 @@ define i64 @atomicrmw_sub_i64_acquire(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_nand_i8_acquire(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i8_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB16_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB16_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i8_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB16_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    and $a5, $a4, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB16_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i8_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB16_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB16_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i8_acquire:
 ; LA64:       # %bb.0:
@@ -559,7 +842,7 @@ define i8 @atomicrmw_nand_i8_acquire(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB16_1
+; LA64-NEXT:    beq $a5, $zero, .LBB16_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -568,27 +851,50 @@ define i8 @atomicrmw_nand_i8_acquire(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_nand_i16_acquire(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i16_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB17_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB17_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i16_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB17_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a5, $a3, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB17_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i16_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB17_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB17_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i16_acquire:
 ; LA64:       # %bb.0:
@@ -607,7 +913,7 @@ define i16 @atomicrmw_nand_i16_acquire(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB17_1
+; LA64-NEXT:    beq $a5, $zero, .LBB17_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -616,17 +922,29 @@ define i16 @atomicrmw_nand_i16_acquire(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_nand_i32_acquire(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i32_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB18_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    nor $a3, $a3, $zero
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB18_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i32_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB18_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    nor $a3, $a3, $zero
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB18_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i32_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB18_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    nor $a3, $a3, $zero
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB18_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i32_acquire:
 ; LA64:       # %bb.0:
@@ -635,7 +953,7 @@ define i32 @atomicrmw_nand_i32_acquire(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB18_1
+; LA64-NEXT:    beq $a3, $zero, .LBB18_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -644,15 +962,25 @@ define i32 @atomicrmw_nand_i32_acquire(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_nand_i64_acquire(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i64_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 2
-; LA32-NEXT:    bl __atomic_fetch_nand_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i64_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 2
+; LA32R-NEXT:    bl __atomic_fetch_nand_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i64_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 2
+; LA32S-NEXT:    bl __atomic_fetch_nand_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i64_acquire:
 ; LA64:       # %bb.0:
@@ -661,7 +989,7 @@ define i64 @atomicrmw_nand_i64_acquire(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB19_1
+; LA64-NEXT:    beq $a3, $zero, .LBB19_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -670,23 +998,42 @@ define i64 @atomicrmw_nand_i64_acquire(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_and_i8_acquire(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i8_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB20_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB20_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i8_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a3
+; LA32R-NEXT:  .LBB20_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB20_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i8_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB20_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB20_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i8_acquire:
 ; LA64:       # %bb.0:
@@ -705,24 +1052,44 @@ define i8 @atomicrmw_and_i8_acquire(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_and_i16_acquire(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i16_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB21_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB21_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i16_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a4
+; LA32R-NEXT:  .LBB21_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB21_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i16_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB21_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB21_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i16_acquire:
 ; LA64:       # %bb.0:
@@ -742,16 +1109,27 @@ define i16 @atomicrmw_and_i16_acquire(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_and_i32_acquire(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i32_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB22_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB22_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i32_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB22_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB22_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i32_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB22_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB22_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i32_acquire:
 ; LA64:       # %bb.0:
@@ -763,15 +1141,25 @@ define i32 @atomicrmw_and_i32_acquire(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_and_i64_acquire(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i64_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 2
-; LA32-NEXT:    bl __atomic_fetch_and_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i64_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 2
+; LA32R-NEXT:    bl __atomic_fetch_and_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i64_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 2
+; LA32S-NEXT:    bl __atomic_fetch_and_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i64_acquire:
 ; LA64:       # %bb.0:
@@ -783,20 +1171,36 @@ define i64 @atomicrmw_and_i64_acquire(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_or_i8_acquire(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i8_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB24_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB24_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i8_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB24_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB24_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i8_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB24_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB24_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i8_acquire:
 ; LA64:       # %bb.0:
@@ -812,20 +1216,38 @@ define i8 @atomicrmw_or_i8_acquire(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_or_i16_acquire(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i16_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB25_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB25_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i16_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB25_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB25_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i16_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB25_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB25_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i16_acquire:
 ; LA64:       # %bb.0:
@@ -841,16 +1263,27 @@ define i16 @atomicrmw_or_i16_acquire(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_or_i32_acquire(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i32_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB26_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    or $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB26_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i32_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB26_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    or $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB26_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i32_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB26_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    or $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB26_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i32_acquire:
 ; LA64:       # %bb.0:
@@ -862,15 +1295,25 @@ define i32 @atomicrmw_or_i32_acquire(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_or_i64_acquire(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i64_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 2
-; LA32-NEXT:    bl __atomic_fetch_or_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i64_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 2
+; LA32R-NEXT:    bl __atomic_fetch_or_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i64_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 2
+; LA32S-NEXT:    bl __atomic_fetch_or_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i64_acquire:
 ; LA64:       # %bb.0:
@@ -882,20 +1325,36 @@ define i64 @atomicrmw_or_i64_acquire(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_xor_i8_acquire(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i8_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB28_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB28_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i8_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB28_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB28_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i8_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB28_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB28_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i8_acquire:
 ; LA64:       # %bb.0:
@@ -911,20 +1370,38 @@ define i8 @atomicrmw_xor_i8_acquire(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_xor_i16_acquire(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i16_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB29_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB29_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i16_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB29_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB29_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i16_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB29_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB29_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i16_acquire:
 ; LA64:       # %bb.0:
@@ -940,16 +1417,27 @@ define i16 @atomicrmw_xor_i16_acquire(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_xor_i32_acquire(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i32_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB30_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    xor $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB30_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i32_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB30_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    xor $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB30_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i32_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB30_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    xor $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB30_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i32_acquire:
 ; LA64:       # %bb.0:
@@ -961,15 +1449,25 @@ define i32 @atomicrmw_xor_i32_acquire(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xor_i64_acquire(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i64_acquire:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 2
-; LA32-NEXT:    bl __atomic_fetch_xor_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i64_acquire:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 2
+; LA32R-NEXT:    bl __atomic_fetch_xor_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i64_acquire:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 2
+; LA32S-NEXT:    bl __atomic_fetch_xor_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i64_acquire:
 ; LA64:       # %bb.0:
@@ -981,25 +1479,46 @@ define i64 @atomicrmw_xor_i64_acquire(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_xchg_i8_release(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i8_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB32_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB32_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i8_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB32_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB32_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i8_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB32_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB32_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_release:
 ; LA64:       # %bb.0:
@@ -1016,7 +1535,7 @@ define i8 @atomicrmw_xchg_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB32_1
+; LA64-NEXT:    beq $a5, $zero, .LBB32_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1025,21 +1544,38 @@ define i8 @atomicrmw_xchg_i8_release(ptr %a, i8 %b) nounwind {
 }
 
 define i8 @atomicrmw_xchg_0_i8_release(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i8_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB33_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB33_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i8_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB33_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB33_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i8_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB33_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB33_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i8_release:
 ; LA64:       # %bb.0:
@@ -1056,20 +1592,36 @@ define i8 @atomicrmw_xchg_0_i8_release(ptr %a) nounwind {
 }
 
 define i8 @atomicrmw_xchg_minus_1_i8_release(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i8_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB34_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB34_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i8_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB34_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB34_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i8_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB34_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB34_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i8_release:
 ; LA64:       # %bb.0:
@@ -1085,26 +1637,48 @@ define i8 @atomicrmw_xchg_minus_1_i8_release(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_i16_release(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i16_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB35_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB35_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i16_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB35_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB35_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i16_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB35_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB35_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_release:
 ; LA64:       # %bb.0:
@@ -1122,7 +1696,7 @@ define i16 @atomicrmw_xchg_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB35_1
+; LA64-NEXT:    beq $a5, $zero, .LBB35_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1131,22 +1705,40 @@ define i16 @atomicrmw_xchg_i16_release(ptr %a, i16 %b) nounwind {
 }
 
 define i16 @atomicrmw_xchg_0_i16_release(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i16_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB36_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB36_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i16_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB36_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB36_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i16_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB36_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB36_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i16_release:
 ; LA64:       # %bb.0:
@@ -1164,21 +1756,38 @@ define i16 @atomicrmw_xchg_0_i16_release(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_minus_1_i16_release(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i16_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB37_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB37_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i16_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB37_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB37_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i16_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB37_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB37_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i16_release:
 ; LA64:       # %bb.0:
@@ -1195,16 +1804,27 @@ define i16 @atomicrmw_xchg_minus_1_i16_release(ptr %a) nounwind {
 }
 
 define i32 @atomicrmw_xchg_i32_release(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i32_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB38_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    move $a3, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB38_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i32_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB38_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    move $a3, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB38_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i32_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB38_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    move $a3, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB38_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i32_release:
 ; LA64:       # %bb.0:
@@ -1216,15 +1836,25 @@ define i32 @atomicrmw_xchg_i32_release(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xchg_i64_release(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i64_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 3
-; LA32-NEXT:    bl __atomic_exchange_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i64_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 3
+; LA32R-NEXT:    bl __atomic_exchange_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i64_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 3
+; LA32S-NEXT:    bl __atomic_exchange_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i64_release:
 ; LA64:       # %bb.0:
@@ -1236,25 +1866,46 @@ define i64 @atomicrmw_xchg_i64_release(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_add_i8_release(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i8_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB40_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB40_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i8_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB40_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB40_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i8_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB40_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB40_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_release:
 ; LA64:       # %bb.0:
@@ -1271,7 +1922,7 @@ define i8 @atomicrmw_add_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB40_1
+; LA64-NEXT:    beq $a5, $zero, .LBB40_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1280,26 +1931,48 @@ define i8 @atomicrmw_add_i8_release(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_add_i16_release(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i16_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB41_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB41_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i16_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB41_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB41_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i16_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB41_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB41_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_release:
 ; LA64:       # %bb.0:
@@ -1317,7 +1990,7 @@ define i16 @atomicrmw_add_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB41_1
+; LA64-NEXT:    beq $a5, $zero, .LBB41_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1326,16 +1999,27 @@ define i16 @atomicrmw_add_i16_release(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_add_i32_release(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i32_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB42_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    add.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB42_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i32_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB42_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    add.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB42_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i32_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB42_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    add.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB42_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i32_release:
 ; LA64:       # %bb.0:
@@ -1347,15 +2031,25 @@ define i32 @atomicrmw_add_i32_release(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_add_i64_release(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i64_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 3
-; LA32-NEXT:    bl __atomic_fetch_add_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i64_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 3
+; LA32R-NEXT:    bl __atomic_fetch_add_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i64_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 3
+; LA32S-NEXT:    bl __atomic_fetch_add_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i64_release:
 ; LA64:       # %bb.0:
@@ -1367,25 +2061,46 @@ define i64 @atomicrmw_add_i64_release(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_sub_i8_release(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i8_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB44_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB44_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i8_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB44_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB44_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i8_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB44_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB44_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_release:
 ; LA64:       # %bb.0:
@@ -1402,7 +2117,7 @@ define i8 @atomicrmw_sub_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB44_1
+; LA64-NEXT:    beq $a5, $zero, .LBB44_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1411,26 +2126,48 @@ define i8 @atomicrmw_sub_i8_release(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_sub_i16_release(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i16_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB45_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB45_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i16_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB45_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB45_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i16_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB45_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB45_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_release:
 ; LA64:       # %bb.0:
@@ -1448,7 +2185,7 @@ define i16 @atomicrmw_sub_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB45_1
+; LA64-NEXT:    beq $a5, $zero, .LBB45_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1457,16 +2194,27 @@ define i16 @atomicrmw_sub_i16_release(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_sub_i32_release(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i32_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB46_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    sub.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB46_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i32_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB46_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    sub.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB46_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i32_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB46_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    sub.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB46_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i32_release:
 ; LA64:       # %bb.0:
@@ -1479,15 +2227,25 @@ define i32 @atomicrmw_sub_i32_release(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_sub_i64_release(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i64_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 3
-; LA32-NEXT:    bl __atomic_fetch_sub_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i64_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 3
+; LA32R-NEXT:    bl __atomic_fetch_sub_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i64_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 3
+; LA32S-NEXT:    bl __atomic_fetch_sub_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i64_release:
 ; LA64:       # %bb.0:
@@ -1500,26 +2258,48 @@ define i64 @atomicrmw_sub_i64_release(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_nand_i8_release(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i8_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB48_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB48_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i8_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB48_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    and $a5, $a4, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB48_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i8_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB48_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB48_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i8_release:
 ; LA64:       # %bb.0:
@@ -1537,7 +2317,7 @@ define i8 @atomicrmw_nand_i8_release(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB48_1
+; LA64-NEXT:    beq $a5, $zero, .LBB48_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1546,27 +2326,50 @@ define i8 @atomicrmw_nand_i8_release(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_nand_i16_release(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i16_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB49_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB49_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i16_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB49_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a5, $a3, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB49_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i16_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB49_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB49_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i16_release:
 ; LA64:       # %bb.0:
@@ -1585,7 +2388,7 @@ define i16 @atomicrmw_nand_i16_release(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB49_1
+; LA64-NEXT:    beq $a5, $zero, .LBB49_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -1594,17 +2397,29 @@ define i16 @atomicrmw_nand_i16_release(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_nand_i32_release(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i32_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB50_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    nor $a3, $a3, $zero
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB50_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i32_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB50_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    nor $a3, $a3, $zero
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB50_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i32_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB50_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    nor $a3, $a3, $zero
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB50_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i32_release:
 ; LA64:       # %bb.0:
@@ -1613,7 +2428,7 @@ define i32 @atomicrmw_nand_i32_release(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB50_1
+; LA64-NEXT:    beq $a3, $zero, .LBB50_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -1622,15 +2437,25 @@ define i32 @atomicrmw_nand_i32_release(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_nand_i64_release(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i64_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 3
-; LA32-NEXT:    bl __atomic_fetch_nand_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i64_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 3
+; LA32R-NEXT:    bl __atomic_fetch_nand_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i64_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 3
+; LA32S-NEXT:    bl __atomic_fetch_nand_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i64_release:
 ; LA64:       # %bb.0:
@@ -1639,7 +2464,7 @@ define i64 @atomicrmw_nand_i64_release(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB51_1
+; LA64-NEXT:    beq $a3, $zero, .LBB51_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -1648,23 +2473,42 @@ define i64 @atomicrmw_nand_i64_release(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_and_i8_release(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i8_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB52_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB52_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i8_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a3
+; LA32R-NEXT:  .LBB52_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB52_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i8_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB52_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB52_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i8_release:
 ; LA64:       # %bb.0:
@@ -1683,24 +2527,44 @@ define i8 @atomicrmw_and_i8_release(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_and_i16_release(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i16_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB53_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB53_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i16_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a4
+; LA32R-NEXT:  .LBB53_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB53_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i16_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB53_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB53_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i16_release:
 ; LA64:       # %bb.0:
@@ -1720,16 +2584,27 @@ define i16 @atomicrmw_and_i16_release(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_and_i32_release(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i32_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB54_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB54_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i32_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB54_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB54_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i32_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB54_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB54_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i32_release:
 ; LA64:       # %bb.0:
@@ -1741,15 +2616,25 @@ define i32 @atomicrmw_and_i32_release(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_and_i64_release(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i64_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 3
-; LA32-NEXT:    bl __atomic_fetch_and_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i64_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 3
+; LA32R-NEXT:    bl __atomic_fetch_and_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i64_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 3
+; LA32S-NEXT:    bl __atomic_fetch_and_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i64_release:
 ; LA64:       # %bb.0:
@@ -1761,20 +2646,36 @@ define i64 @atomicrmw_and_i64_release(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_or_i8_release(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i8_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB56_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB56_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i8_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB56_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB56_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i8_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB56_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB56_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i8_release:
 ; LA64:       # %bb.0:
@@ -1790,20 +2691,38 @@ define i8 @atomicrmw_or_i8_release(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_or_i16_release(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i16_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB57_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB57_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i16_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB57_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB57_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i16_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB57_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB57_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i16_release:
 ; LA64:       # %bb.0:
@@ -1819,16 +2738,27 @@ define i16 @atomicrmw_or_i16_release(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_or_i32_release(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i32_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB58_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    or $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB58_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i32_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB58_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    or $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB58_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i32_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB58_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    or $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB58_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i32_release:
 ; LA64:       # %bb.0:
@@ -1840,15 +2770,25 @@ define i32 @atomicrmw_or_i32_release(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_or_i64_release(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i64_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 3
-; LA32-NEXT:    bl __atomic_fetch_or_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i64_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 3
+; LA32R-NEXT:    bl __atomic_fetch_or_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i64_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 3
+; LA32S-NEXT:    bl __atomic_fetch_or_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i64_release:
 ; LA64:       # %bb.0:
@@ -1860,20 +2800,36 @@ define i64 @atomicrmw_or_i64_release(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_xor_i8_release(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i8_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB60_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB60_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i8_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB60_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB60_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i8_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB60_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB60_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i8_release:
 ; LA64:       # %bb.0:
@@ -1889,20 +2845,38 @@ define i8 @atomicrmw_xor_i8_release(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_xor_i16_release(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i16_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB61_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB61_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i16_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB61_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB61_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i16_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB61_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB61_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i16_release:
 ; LA64:       # %bb.0:
@@ -1918,16 +2892,27 @@ define i16 @atomicrmw_xor_i16_release(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_xor_i32_release(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i32_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB62_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    xor $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB62_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i32_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB62_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    xor $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB62_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i32_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB62_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    xor $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB62_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i32_release:
 ; LA64:       # %bb.0:
@@ -1939,15 +2924,25 @@ define i32 @atomicrmw_xor_i32_release(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xor_i64_release(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i64_release:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 3
-; LA32-NEXT:    bl __atomic_fetch_xor_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i64_release:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 3
+; LA32R-NEXT:    bl __atomic_fetch_xor_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i64_release:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 3
+; LA32S-NEXT:    bl __atomic_fetch_xor_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i64_release:
 ; LA64:       # %bb.0:
@@ -1959,25 +2954,46 @@ define i64 @atomicrmw_xor_i64_release(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_xchg_i8_acq_rel(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i8_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB64_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB64_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i8_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB64_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB64_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i8_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB64_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB64_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_acq_rel:
 ; LA64:       # %bb.0:
@@ -1994,7 +3010,7 @@ define i8 @atomicrmw_xchg_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB64_1
+; LA64-NEXT:    beq $a5, $zero, .LBB64_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2003,21 +3019,38 @@ define i8 @atomicrmw_xchg_i8_acq_rel(ptr %a, i8 %b) nounwind {
 }
 
 define i8 @atomicrmw_xchg_0_i8_acq_rel(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i8_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB65_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB65_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i8_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB65_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB65_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i8_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB65_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB65_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i8_acq_rel:
 ; LA64:       # %bb.0:
@@ -2034,20 +3067,36 @@ define i8 @atomicrmw_xchg_0_i8_acq_rel(ptr %a) nounwind {
 }
 
 define i8 @atomicrmw_xchg_minus_1_i8_acq_rel(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i8_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB66_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB66_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i8_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB66_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB66_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i8_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB66_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB66_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i8_acq_rel:
 ; LA64:       # %bb.0:
@@ -2063,26 +3112,48 @@ define i8 @atomicrmw_xchg_minus_1_i8_acq_rel(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_i16_acq_rel(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i16_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB67_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB67_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i16_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB67_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB67_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i16_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB67_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB67_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_acq_rel:
 ; LA64:       # %bb.0:
@@ -2100,7 +3171,7 @@ define i16 @atomicrmw_xchg_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB67_1
+; LA64-NEXT:    beq $a5, $zero, .LBB67_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2109,22 +3180,40 @@ define i16 @atomicrmw_xchg_i16_acq_rel(ptr %a, i16 %b) nounwind {
 }
 
 define i16 @atomicrmw_xchg_0_i16_acq_rel(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i16_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB68_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB68_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i16_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB68_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB68_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i16_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB68_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB68_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i16_acq_rel:
 ; LA64:       # %bb.0:
@@ -2142,21 +3231,38 @@ define i16 @atomicrmw_xchg_0_i16_acq_rel(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_minus_1_i16_acq_rel(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i16_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB69_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB69_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i16_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB69_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB69_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i16_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB69_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB69_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i16_acq_rel:
 ; LA64:       # %bb.0:
@@ -2173,16 +3279,27 @@ define i16 @atomicrmw_xchg_minus_1_i16_acq_rel(ptr %a) nounwind {
 }
 
 define i32 @atomicrmw_xchg_i32_acq_rel(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i32_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB70_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    move $a3, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB70_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i32_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB70_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    move $a3, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB70_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i32_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB70_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    move $a3, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB70_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i32_acq_rel:
 ; LA64:       # %bb.0:
@@ -2194,15 +3311,25 @@ define i32 @atomicrmw_xchg_i32_acq_rel(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xchg_i64_acq_rel(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i64_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 4
-; LA32-NEXT:    bl __atomic_exchange_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i64_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 4
+; LA32R-NEXT:    bl __atomic_exchange_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i64_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 4
+; LA32S-NEXT:    bl __atomic_exchange_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i64_acq_rel:
 ; LA64:       # %bb.0:
@@ -2214,25 +3341,46 @@ define i64 @atomicrmw_xchg_i64_acq_rel(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_add_i8_acq_rel(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i8_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB72_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB72_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i8_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB72_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB72_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i8_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB72_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB72_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_acq_rel:
 ; LA64:       # %bb.0:
@@ -2249,7 +3397,7 @@ define i8 @atomicrmw_add_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB72_1
+; LA64-NEXT:    beq $a5, $zero, .LBB72_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2258,26 +3406,48 @@ define i8 @atomicrmw_add_i8_acq_rel(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_add_i16_acq_rel(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i16_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB73_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB73_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i16_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB73_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB73_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i16_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB73_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB73_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_acq_rel:
 ; LA64:       # %bb.0:
@@ -2295,7 +3465,7 @@ define i16 @atomicrmw_add_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB73_1
+; LA64-NEXT:    beq $a5, $zero, .LBB73_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2304,16 +3474,27 @@ define i16 @atomicrmw_add_i16_acq_rel(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_add_i32_acq_rel(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i32_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB74_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    add.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB74_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i32_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB74_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    add.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB74_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i32_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB74_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    add.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB74_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i32_acq_rel:
 ; LA64:       # %bb.0:
@@ -2325,15 +3506,25 @@ define i32 @atomicrmw_add_i32_acq_rel(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_add_i64_acq_rel(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i64_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 4
-; LA32-NEXT:    bl __atomic_fetch_add_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i64_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 4
+; LA32R-NEXT:    bl __atomic_fetch_add_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i64_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 4
+; LA32S-NEXT:    bl __atomic_fetch_add_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i64_acq_rel:
 ; LA64:       # %bb.0:
@@ -2345,25 +3536,46 @@ define i64 @atomicrmw_add_i64_acq_rel(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_sub_i8_acq_rel(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i8_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB76_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB76_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i8_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB76_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB76_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i8_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB76_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB76_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_acq_rel:
 ; LA64:       # %bb.0:
@@ -2380,7 +3592,7 @@ define i8 @atomicrmw_sub_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB76_1
+; LA64-NEXT:    beq $a5, $zero, .LBB76_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2389,26 +3601,48 @@ define i8 @atomicrmw_sub_i8_acq_rel(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_sub_i16_acq_rel(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i16_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB77_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB77_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i16_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB77_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB77_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i16_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB77_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB77_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_acq_rel:
 ; LA64:       # %bb.0:
@@ -2426,7 +3660,7 @@ define i16 @atomicrmw_sub_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB77_1
+; LA64-NEXT:    beq $a5, $zero, .LBB77_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2435,16 +3669,27 @@ define i16 @atomicrmw_sub_i16_acq_rel(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_sub_i32_acq_rel(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i32_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB78_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    sub.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB78_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i32_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB78_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    sub.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB78_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i32_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB78_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    sub.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB78_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i32_acq_rel:
 ; LA64:       # %bb.0:
@@ -2457,15 +3702,25 @@ define i32 @atomicrmw_sub_i32_acq_rel(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_sub_i64_acq_rel(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i64_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 4
-; LA32-NEXT:    bl __atomic_fetch_sub_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i64_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 4
+; LA32R-NEXT:    bl __atomic_fetch_sub_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i64_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 4
+; LA32S-NEXT:    bl __atomic_fetch_sub_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i64_acq_rel:
 ; LA64:       # %bb.0:
@@ -2478,26 +3733,48 @@ define i64 @atomicrmw_sub_i64_acq_rel(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_nand_i8_acq_rel(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i8_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB80_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB80_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i8_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB80_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    and $a5, $a4, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB80_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i8_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB80_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB80_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i8_acq_rel:
 ; LA64:       # %bb.0:
@@ -2515,7 +3792,7 @@ define i8 @atomicrmw_nand_i8_acq_rel(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB80_1
+; LA64-NEXT:    beq $a5, $zero, .LBB80_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2524,27 +3801,50 @@ define i8 @atomicrmw_nand_i8_acq_rel(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_nand_i16_acq_rel(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i16_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB81_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB81_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i16_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB81_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a5, $a3, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB81_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i16_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB81_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB81_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i16_acq_rel:
 ; LA64:       # %bb.0:
@@ -2563,7 +3863,7 @@ define i16 @atomicrmw_nand_i16_acq_rel(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB81_1
+; LA64-NEXT:    beq $a5, $zero, .LBB81_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2572,17 +3872,29 @@ define i16 @atomicrmw_nand_i16_acq_rel(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_nand_i32_acq_rel(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i32_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB82_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    nor $a3, $a3, $zero
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB82_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i32_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB82_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    nor $a3, $a3, $zero
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB82_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i32_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB82_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    nor $a3, $a3, $zero
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB82_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i32_acq_rel:
 ; LA64:       # %bb.0:
@@ -2591,7 +3903,7 @@ define i32 @atomicrmw_nand_i32_acq_rel(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB82_1
+; LA64-NEXT:    beq $a3, $zero, .LBB82_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -2600,15 +3912,25 @@ define i32 @atomicrmw_nand_i32_acq_rel(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_nand_i64_acq_rel(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i64_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 4
-; LA32-NEXT:    bl __atomic_fetch_nand_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i64_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 4
+; LA32R-NEXT:    bl __atomic_fetch_nand_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i64_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 4
+; LA32S-NEXT:    bl __atomic_fetch_nand_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i64_acq_rel:
 ; LA64:       # %bb.0:
@@ -2617,7 +3939,7 @@ define i64 @atomicrmw_nand_i64_acq_rel(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB83_1
+; LA64-NEXT:    beq $a3, $zero, .LBB83_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -2626,23 +3948,42 @@ define i64 @atomicrmw_nand_i64_acq_rel(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_and_i8_acq_rel(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i8_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB84_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB84_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i8_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a3
+; LA32R-NEXT:  .LBB84_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB84_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i8_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB84_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB84_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i8_acq_rel:
 ; LA64:       # %bb.0:
@@ -2661,24 +4002,44 @@ define i8 @atomicrmw_and_i8_acq_rel(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_and_i16_acq_rel(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i16_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB85_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB85_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i16_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a4
+; LA32R-NEXT:  .LBB85_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB85_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i16_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB85_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB85_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i16_acq_rel:
 ; LA64:       # %bb.0:
@@ -2698,16 +4059,27 @@ define i16 @atomicrmw_and_i16_acq_rel(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_and_i32_acq_rel(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i32_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB86_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB86_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i32_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB86_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB86_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i32_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB86_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB86_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i32_acq_rel:
 ; LA64:       # %bb.0:
@@ -2719,15 +4091,25 @@ define i32 @atomicrmw_and_i32_acq_rel(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_and_i64_acq_rel(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i64_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 4
-; LA32-NEXT:    bl __atomic_fetch_and_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i64_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 4
+; LA32R-NEXT:    bl __atomic_fetch_and_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i64_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 4
+; LA32S-NEXT:    bl __atomic_fetch_and_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i64_acq_rel:
 ; LA64:       # %bb.0:
@@ -2739,20 +4121,36 @@ define i64 @atomicrmw_and_i64_acq_rel(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_or_i8_acq_rel(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i8_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB88_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB88_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i8_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB88_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB88_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i8_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB88_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB88_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i8_acq_rel:
 ; LA64:       # %bb.0:
@@ -2768,20 +4166,38 @@ define i8 @atomicrmw_or_i8_acq_rel(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_or_i16_acq_rel(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i16_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB89_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB89_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i16_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB89_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB89_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i16_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB89_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB89_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i16_acq_rel:
 ; LA64:       # %bb.0:
@@ -2797,16 +4213,27 @@ define i16 @atomicrmw_or_i16_acq_rel(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_or_i32_acq_rel(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i32_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB90_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    or $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB90_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i32_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB90_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    or $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB90_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i32_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB90_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    or $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB90_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i32_acq_rel:
 ; LA64:       # %bb.0:
@@ -2818,15 +4245,25 @@ define i32 @atomicrmw_or_i32_acq_rel(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_or_i64_acq_rel(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i64_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 4
-; LA32-NEXT:    bl __atomic_fetch_or_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i64_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 4
+; LA32R-NEXT:    bl __atomic_fetch_or_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i64_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 4
+; LA32S-NEXT:    bl __atomic_fetch_or_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i64_acq_rel:
 ; LA64:       # %bb.0:
@@ -2838,20 +4275,36 @@ define i64 @atomicrmw_or_i64_acq_rel(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_xor_i8_acq_rel(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i8_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB92_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB92_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i8_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB92_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB92_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i8_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB92_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB92_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i8_acq_rel:
 ; LA64:       # %bb.0:
@@ -2867,20 +4320,38 @@ define i8 @atomicrmw_xor_i8_acq_rel(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_xor_i16_acq_rel(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i16_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB93_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB93_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i16_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB93_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB93_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i16_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB93_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB93_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i16_acq_rel:
 ; LA64:       # %bb.0:
@@ -2896,16 +4367,27 @@ define i16 @atomicrmw_xor_i16_acq_rel(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_xor_i32_acq_rel(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i32_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB94_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    xor $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB94_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i32_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB94_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    xor $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB94_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i32_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB94_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    xor $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB94_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i32_acq_rel:
 ; LA64:       # %bb.0:
@@ -2917,15 +4399,25 @@ define i32 @atomicrmw_xor_i32_acq_rel(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xor_i64_acq_rel(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i64_acq_rel:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 4
-; LA32-NEXT:    bl __atomic_fetch_xor_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i64_acq_rel:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 4
+; LA32R-NEXT:    bl __atomic_fetch_xor_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i64_acq_rel:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 4
+; LA32S-NEXT:    bl __atomic_fetch_xor_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i64_acq_rel:
 ; LA64:       # %bb.0:
@@ -2937,25 +4429,46 @@ define i64 @atomicrmw_xor_i64_acq_rel(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_xchg_i8_seq_cst(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i8_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB96_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB96_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i8_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB96_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB96_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i8_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB96_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB96_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_seq_cst:
 ; LA64:       # %bb.0:
@@ -2972,7 +4485,7 @@ define i8 @atomicrmw_xchg_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB96_1
+; LA64-NEXT:    beq $a5, $zero, .LBB96_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -2981,21 +4494,38 @@ define i8 @atomicrmw_xchg_i8_seq_cst(ptr %a, i8 %b) nounwind {
 }
 
 define i8 @atomicrmw_xchg_0_i8_seq_cst(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i8_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB97_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB97_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i8_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB97_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB97_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i8_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB97_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB97_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i8_seq_cst:
 ; LA64:       # %bb.0:
@@ -3012,20 +4542,36 @@ define i8 @atomicrmw_xchg_0_i8_seq_cst(ptr %a) nounwind {
 }
 
 define i8 @atomicrmw_xchg_minus_1_i8_seq_cst(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i8_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB98_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB98_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i8_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB98_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB98_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i8_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB98_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB98_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i8_seq_cst:
 ; LA64:       # %bb.0:
@@ -3041,26 +4587,48 @@ define i8 @atomicrmw_xchg_minus_1_i8_seq_cst(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_i16_seq_cst(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i16_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB99_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB99_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i16_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB99_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB99_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i16_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB99_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB99_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_seq_cst:
 ; LA64:       # %bb.0:
@@ -3078,7 +4646,7 @@ define i16 @atomicrmw_xchg_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB99_1
+; LA64-NEXT:    beq $a5, $zero, .LBB99_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3087,22 +4655,40 @@ define i16 @atomicrmw_xchg_i16_seq_cst(ptr %a, i16 %b) nounwind {
 }
 
 define i16 @atomicrmw_xchg_0_i16_seq_cst(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i16_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB100_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB100_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i16_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB100_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB100_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i16_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB100_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB100_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i16_seq_cst:
 ; LA64:       # %bb.0:
@@ -3120,21 +4706,38 @@ define i16 @atomicrmw_xchg_0_i16_seq_cst(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_minus_1_i16_seq_cst(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i16_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB101_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB101_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i16_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB101_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB101_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i16_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB101_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB101_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i16_seq_cst:
 ; LA64:       # %bb.0:
@@ -3151,16 +4754,27 @@ define i16 @atomicrmw_xchg_minus_1_i16_seq_cst(ptr %a) nounwind {
 }
 
 define i32 @atomicrmw_xchg_i32_seq_cst(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i32_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB102_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    move $a3, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB102_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i32_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB102_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    move $a3, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB102_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i32_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB102_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    move $a3, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB102_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i32_seq_cst:
 ; LA64:       # %bb.0:
@@ -3172,15 +4786,25 @@ define i32 @atomicrmw_xchg_i32_seq_cst(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xchg_i64_seq_cst(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i64_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 5
-; LA32-NEXT:    bl __atomic_exchange_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i64_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 5
+; LA32R-NEXT:    bl __atomic_exchange_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i64_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 5
+; LA32S-NEXT:    bl __atomic_exchange_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i64_seq_cst:
 ; LA64:       # %bb.0:
@@ -3192,25 +4816,46 @@ define i64 @atomicrmw_xchg_i64_seq_cst(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_add_i8_seq_cst(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i8_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB104_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB104_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i8_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB104_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB104_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i8_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB104_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB104_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_seq_cst:
 ; LA64:       # %bb.0:
@@ -3227,7 +4872,7 @@ define i8 @atomicrmw_add_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB104_1
+; LA64-NEXT:    beq $a5, $zero, .LBB104_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3236,26 +4881,48 @@ define i8 @atomicrmw_add_i8_seq_cst(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_add_i16_seq_cst(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i16_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB105_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB105_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i16_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB105_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB105_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i16_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB105_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB105_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_seq_cst:
 ; LA64:       # %bb.0:
@@ -3273,7 +4940,7 @@ define i16 @atomicrmw_add_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB105_1
+; LA64-NEXT:    beq $a5, $zero, .LBB105_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3282,16 +4949,27 @@ define i16 @atomicrmw_add_i16_seq_cst(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_add_i32_seq_cst(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i32_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB106_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    add.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB106_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i32_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB106_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    add.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB106_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i32_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB106_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    add.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB106_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i32_seq_cst:
 ; LA64:       # %bb.0:
@@ -3303,15 +4981,25 @@ define i32 @atomicrmw_add_i32_seq_cst(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_add_i64_seq_cst(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i64_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 5
-; LA32-NEXT:    bl __atomic_fetch_add_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i64_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 5
+; LA32R-NEXT:    bl __atomic_fetch_add_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i64_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 5
+; LA32S-NEXT:    bl __atomic_fetch_add_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i64_seq_cst:
 ; LA64:       # %bb.0:
@@ -3323,25 +5011,46 @@ define i64 @atomicrmw_add_i64_seq_cst(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_sub_i8_seq_cst(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i8_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB108_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB108_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i8_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB108_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB108_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i8_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB108_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB108_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_seq_cst:
 ; LA64:       # %bb.0:
@@ -3358,7 +5067,7 @@ define i8 @atomicrmw_sub_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB108_1
+; LA64-NEXT:    beq $a5, $zero, .LBB108_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3367,26 +5076,48 @@ define i8 @atomicrmw_sub_i8_seq_cst(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_sub_i16_seq_cst(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i16_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB109_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB109_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i16_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB109_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB109_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i16_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB109_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB109_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_seq_cst:
 ; LA64:       # %bb.0:
@@ -3404,7 +5135,7 @@ define i16 @atomicrmw_sub_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB109_1
+; LA64-NEXT:    beq $a5, $zero, .LBB109_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3413,16 +5144,27 @@ define i16 @atomicrmw_sub_i16_seq_cst(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_sub_i32_seq_cst(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i32_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB110_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    sub.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB110_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i32_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB110_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    sub.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB110_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i32_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB110_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    sub.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB110_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i32_seq_cst:
 ; LA64:       # %bb.0:
@@ -3435,15 +5177,25 @@ define i32 @atomicrmw_sub_i32_seq_cst(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_sub_i64_seq_cst(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i64_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 5
-; LA32-NEXT:    bl __atomic_fetch_sub_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i64_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 5
+; LA32R-NEXT:    bl __atomic_fetch_sub_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i64_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 5
+; LA32S-NEXT:    bl __atomic_fetch_sub_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i64_seq_cst:
 ; LA64:       # %bb.0:
@@ -3456,26 +5208,48 @@ define i64 @atomicrmw_sub_i64_seq_cst(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_nand_i8_seq_cst(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i8_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB112_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB112_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i8_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB112_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    and $a5, $a4, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB112_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i8_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB112_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB112_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i8_seq_cst:
 ; LA64:       # %bb.0:
@@ -3493,7 +5267,7 @@ define i8 @atomicrmw_nand_i8_seq_cst(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB112_1
+; LA64-NEXT:    beq $a5, $zero, .LBB112_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3502,27 +5276,50 @@ define i8 @atomicrmw_nand_i8_seq_cst(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_nand_i16_seq_cst(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i16_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB113_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB113_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i16_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB113_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a5, $a3, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB113_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i16_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB113_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB113_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i16_seq_cst:
 ; LA64:       # %bb.0:
@@ -3541,7 +5338,7 @@ define i16 @atomicrmw_nand_i16_seq_cst(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB113_1
+; LA64-NEXT:    beq $a5, $zero, .LBB113_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3550,17 +5347,29 @@ define i16 @atomicrmw_nand_i16_seq_cst(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_nand_i32_seq_cst(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i32_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB114_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    nor $a3, $a3, $zero
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB114_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i32_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB114_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    nor $a3, $a3, $zero
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB114_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i32_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB114_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    nor $a3, $a3, $zero
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB114_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i32_seq_cst:
 ; LA64:       # %bb.0:
@@ -3569,7 +5378,7 @@ define i32 @atomicrmw_nand_i32_seq_cst(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB114_1
+; LA64-NEXT:    beq $a3, $zero, .LBB114_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -3578,15 +5387,25 @@ define i32 @atomicrmw_nand_i32_seq_cst(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_nand_i64_seq_cst(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i64_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 5
-; LA32-NEXT:    bl __atomic_fetch_nand_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i64_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 5
+; LA32R-NEXT:    bl __atomic_fetch_nand_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i64_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 5
+; LA32S-NEXT:    bl __atomic_fetch_nand_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i64_seq_cst:
 ; LA64:       # %bb.0:
@@ -3595,7 +5414,7 @@ define i64 @atomicrmw_nand_i64_seq_cst(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB115_1
+; LA64-NEXT:    beq $a3, $zero, .LBB115_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -3604,23 +5423,42 @@ define i64 @atomicrmw_nand_i64_seq_cst(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_and_i8_seq_cst(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i8_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB116_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB116_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i8_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a3
+; LA32R-NEXT:  .LBB116_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB116_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i8_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB116_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB116_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i8_seq_cst:
 ; LA64:       # %bb.0:
@@ -3639,24 +5477,44 @@ define i8 @atomicrmw_and_i8_seq_cst(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_and_i16_seq_cst(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i16_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB117_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB117_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i16_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a4
+; LA32R-NEXT:  .LBB117_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB117_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i16_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB117_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB117_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i16_seq_cst:
 ; LA64:       # %bb.0:
@@ -3676,16 +5534,27 @@ define i16 @atomicrmw_and_i16_seq_cst(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_and_i32_seq_cst(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i32_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB118_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB118_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i32_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB118_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB118_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i32_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB118_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB118_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i32_seq_cst:
 ; LA64:       # %bb.0:
@@ -3697,15 +5566,25 @@ define i32 @atomicrmw_and_i32_seq_cst(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_and_i64_seq_cst(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i64_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 5
-; LA32-NEXT:    bl __atomic_fetch_and_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i64_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 5
+; LA32R-NEXT:    bl __atomic_fetch_and_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i64_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 5
+; LA32S-NEXT:    bl __atomic_fetch_and_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i64_seq_cst:
 ; LA64:       # %bb.0:
@@ -3717,20 +5596,36 @@ define i64 @atomicrmw_and_i64_seq_cst(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_or_i8_seq_cst(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i8_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB120_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB120_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i8_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB120_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB120_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i8_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB120_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB120_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i8_seq_cst:
 ; LA64:       # %bb.0:
@@ -3746,20 +5641,38 @@ define i8 @atomicrmw_or_i8_seq_cst(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_or_i16_seq_cst(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i16_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB121_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB121_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i16_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB121_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB121_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i16_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB121_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB121_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i16_seq_cst:
 ; LA64:       # %bb.0:
@@ -3775,16 +5688,27 @@ define i16 @atomicrmw_or_i16_seq_cst(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_or_i32_seq_cst(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i32_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB122_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    or $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB122_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i32_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB122_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    or $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB122_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i32_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB122_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    or $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB122_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i32_seq_cst:
 ; LA64:       # %bb.0:
@@ -3796,15 +5720,25 @@ define i32 @atomicrmw_or_i32_seq_cst(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_or_i64_seq_cst(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i64_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 5
-; LA32-NEXT:    bl __atomic_fetch_or_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i64_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 5
+; LA32R-NEXT:    bl __atomic_fetch_or_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i64_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 5
+; LA32S-NEXT:    bl __atomic_fetch_or_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i64_seq_cst:
 ; LA64:       # %bb.0:
@@ -3816,20 +5750,36 @@ define i64 @atomicrmw_or_i64_seq_cst(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_xor_i8_seq_cst(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i8_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB124_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB124_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i8_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB124_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB124_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i8_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB124_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB124_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i8_seq_cst:
 ; LA64:       # %bb.0:
@@ -3845,20 +5795,38 @@ define i8 @atomicrmw_xor_i8_seq_cst(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_xor_i16_seq_cst(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i16_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB125_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB125_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i16_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB125_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB125_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i16_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB125_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB125_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i16_seq_cst:
 ; LA64:       # %bb.0:
@@ -3874,16 +5842,27 @@ define i16 @atomicrmw_xor_i16_seq_cst(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_xor_i32_seq_cst(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i32_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB126_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    xor $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB126_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i32_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB126_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    xor $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB126_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i32_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB126_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    xor $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB126_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i32_seq_cst:
 ; LA64:       # %bb.0:
@@ -3895,15 +5874,25 @@ define i32 @atomicrmw_xor_i32_seq_cst(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xor_i64_seq_cst(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i64_seq_cst:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    ori $a3, $zero, 5
-; LA32-NEXT:    bl __atomic_fetch_xor_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i64_seq_cst:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    ori $a3, $zero, 5
+; LA32R-NEXT:    bl __atomic_fetch_xor_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i64_seq_cst:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    ori $a3, $zero, 5
+; LA32S-NEXT:    bl __atomic_fetch_xor_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i64_seq_cst:
 ; LA64:       # %bb.0:
@@ -3915,25 +5904,46 @@ define i64 @atomicrmw_xor_i64_seq_cst(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_xchg_i8_monotonic(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i8_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB128_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB128_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i8_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB128_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB128_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i8_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB128_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB128_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i8_monotonic:
 ; LA64:       # %bb.0:
@@ -3950,7 +5960,7 @@ define i8 @atomicrmw_xchg_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB128_1
+; LA64-NEXT:    beq $a5, $zero, .LBB128_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -3959,21 +5969,38 @@ define i8 @atomicrmw_xchg_i8_monotonic(ptr %a, i8 %b) nounwind {
 }
 
 define i8 @atomicrmw_xchg_0_i8_monotonic(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i8_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB129_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB129_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i8_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB129_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB129_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i8_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB129_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB129_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i8_monotonic:
 ; LA64:       # %bb.0:
@@ -3990,20 +6017,36 @@ define i8 @atomicrmw_xchg_0_i8_monotonic(ptr %a) nounwind {
 }
 
 define i8 @atomicrmw_xchg_minus_1_i8_monotonic(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i8_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a2, $zero, 255
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB130_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB130_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i8_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a2, $zero, 255
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB130_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB130_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i8_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a2, $zero, 255
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB130_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB130_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i8_monotonic:
 ; LA64:       # %bb.0:
@@ -4019,26 +6062,48 @@ define i8 @atomicrmw_xchg_minus_1_i8_monotonic(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_i16_monotonic(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i16_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB131_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    addi.w $a5, $a1, 0
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB131_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i16_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB131_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    addi.w $a5, $a1, 0
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB131_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i16_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB131_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    addi.w $a5, $a1, 0
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB131_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i16_monotonic:
 ; LA64:       # %bb.0:
@@ -4056,7 +6121,7 @@ define i16 @atomicrmw_xchg_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB131_1
+; LA64-NEXT:    beq $a5, $zero, .LBB131_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4065,22 +6130,40 @@ define i16 @atomicrmw_xchg_i16_monotonic(ptr %a, i16 %b) nounwind {
 }
 
 define i16 @atomicrmw_xchg_0_i16_monotonic(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_0_i16_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:    nor $a2, $a2, $zero
-; LA32-NEXT:  .LBB132_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB132_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_0_i16_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:    nor $a2, $a2, $zero
+; LA32R-NEXT:  .LBB132_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    and $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB132_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_0_i16_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:    nor $a2, $a2, $zero
+; LA32S-NEXT:  .LBB132_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB132_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_0_i16_monotonic:
 ; LA64:       # %bb.0:
@@ -4098,21 +6181,38 @@ define i16 @atomicrmw_xchg_0_i16_monotonic(ptr %a) nounwind {
 }
 
 define i16 @atomicrmw_xchg_minus_1_i16_monotonic(ptr %a) nounwind {
-; LA32-LABEL: atomicrmw_xchg_minus_1_i16_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 4095
-; LA32-NEXT:    sll.w $a2, $a2, $a1
-; LA32-NEXT:  .LBB133_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a2
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB133_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_minus_1_i16_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    sll.w $a2, $a2, $a0
+; LA32R-NEXT:  .LBB133_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a1, 0
+; LA32R-NEXT:    or $a4, $a3, $a2
+; LA32R-NEXT:    sc.w $a4, $a1, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB133_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_minus_1_i16_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 4095
+; LA32S-NEXT:    sll.w $a2, $a2, $a1
+; LA32S-NEXT:  .LBB133_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a2
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB133_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_minus_1_i16_monotonic:
 ; LA64:       # %bb.0:
@@ -4129,16 +6229,27 @@ define i16 @atomicrmw_xchg_minus_1_i16_monotonic(ptr %a) nounwind {
 }
 
 define i32 @atomicrmw_xchg_i32_monotonic(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i32_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB134_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    move $a3, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB134_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i32_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB134_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    move $a3, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB134_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i32_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB134_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    move $a3, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB134_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i32_monotonic:
 ; LA64:       # %bb.0:
@@ -4150,15 +6261,25 @@ define i32 @atomicrmw_xchg_i32_monotonic(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xchg_i64_monotonic(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xchg_i64_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    move $a3, $zero
-; LA32-NEXT:    bl __atomic_exchange_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xchg_i64_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    move $a3, $zero
+; LA32R-NEXT:    bl __atomic_exchange_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xchg_i64_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    move $a3, $zero
+; LA32S-NEXT:    bl __atomic_exchange_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xchg_i64_monotonic:
 ; LA64:       # %bb.0:
@@ -4170,25 +6291,46 @@ define i64 @atomicrmw_xchg_i64_monotonic(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_add_i8_monotonic(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i8_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB136_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB136_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i8_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB136_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB136_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i8_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB136_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB136_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i8_monotonic:
 ; LA64:       # %bb.0:
@@ -4205,7 +6347,7 @@ define i8 @atomicrmw_add_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB136_1
+; LA64-NEXT:    beq $a5, $zero, .LBB136_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4214,26 +6356,48 @@ define i8 @atomicrmw_add_i8_monotonic(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_add_i16_monotonic(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i16_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB137_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    add.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB137_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i16_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB137_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    add.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB137_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i16_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB137_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    add.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB137_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i16_monotonic:
 ; LA64:       # %bb.0:
@@ -4251,7 +6415,7 @@ define i16 @atomicrmw_add_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB137_1
+; LA64-NEXT:    beq $a5, $zero, .LBB137_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4260,16 +6424,27 @@ define i16 @atomicrmw_add_i16_monotonic(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_add_i32_monotonic(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i32_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB138_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    add.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB138_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i32_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB138_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    add.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB138_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i32_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB138_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    add.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB138_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i32_monotonic:
 ; LA64:       # %bb.0:
@@ -4281,15 +6456,25 @@ define i32 @atomicrmw_add_i32_monotonic(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_add_i64_monotonic(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_add_i64_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    move $a3, $zero
-; LA32-NEXT:    bl __atomic_fetch_add_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_add_i64_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    move $a3, $zero
+; LA32R-NEXT:    bl __atomic_fetch_add_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_add_i64_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    move $a3, $zero
+; LA32S-NEXT:    bl __atomic_fetch_add_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_add_i64_monotonic:
 ; LA64:       # %bb.0:
@@ -4301,25 +6486,46 @@ define i64 @atomicrmw_add_i64_monotonic(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_sub_i8_monotonic(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i8_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB140_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB140_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i8_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB140_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a4, $a1
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB140_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i8_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB140_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB140_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i8_monotonic:
 ; LA64:       # %bb.0:
@@ -4336,7 +6542,7 @@ define i8 @atomicrmw_sub_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB140_1
+; LA64-NEXT:    beq $a5, $zero, .LBB140_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
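
For readers tracking the LA32R differences above: without BSTRINS.W, the pointer is word-aligned via `addi.w $a2, $zero, -4` plus `and`, the byte lane is selected by a shifted 255 mask, and the ll.w/sc.w loop merges the new lane back into the word with the xor/and/xor pattern (the nand cases below reuse the same merge). A minimal C sketch of that expansion, with a weak CAS loop standing in for ll.w/sc.w and hypothetical names throughout:

#include <stdint.h>

/* Sketch only: atomic_sub_u8 is a hypothetical name; the CAS loop models
 * the ll.w/sc.w retry loop in the checks above (little-endian lanes). */
static uint8_t atomic_sub_u8(uint8_t *p, uint8_t b) {
  uint32_t *word = (uint32_t *)((uintptr_t)p & ~(uintptr_t)3); /* addi.w/and (LA32R), bstrins.w (LA32S) */
  unsigned shift = ((uintptr_t)p & 3) * 8;       /* slli.w ..., 3 */
  uint32_t mask  = (uint32_t)0xff << shift;      /* ori 255 + sll.w */
  uint32_t val   = (uint32_t)b << shift;         /* andi 255 + sll.w */
  uint32_t old   = __atomic_load_n(word, __ATOMIC_RELAXED);
  for (;;) {
    uint32_t sub  = old - val;                   /* sub.w */
    uint32_t next = old ^ ((old ^ sub) & mask);  /* xor/and/xor lane merge */
    if (__atomic_compare_exchange_n(word, &old, next, 1,
                                    __ATOMIC_RELAXED, __ATOMIC_RELAXED))
      return (uint8_t)(old >> shift);            /* srl.w */
  }
}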
@@ -4345,26 +6551,48 @@ define i8 @atomicrmw_sub_i8_monotonic(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_sub_i16_monotonic(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i16_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB141_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    sub.w $a5, $a4, $a1
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB141_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i16_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB141_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    sub.w $a5, $a3, $a1
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB141_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i16_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB141_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    sub.w $a5, $a4, $a1
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB141_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i16_monotonic:
 ; LA64:       # %bb.0:
@@ -4382,7 +6610,7 @@ define i16 @atomicrmw_sub_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB141_1
+; LA64-NEXT:    beq $a5, $zero, .LBB141_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4391,16 +6619,27 @@ define i16 @atomicrmw_sub_i16_monotonic(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_sub_i32_monotonic(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i32_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB142_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    sub.w $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB142_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i32_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB142_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    sub.w $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB142_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i32_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB142_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    sub.w $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB142_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i32_monotonic:
 ; LA64:       # %bb.0:
@@ -4413,15 +6652,25 @@ define i32 @atomicrmw_sub_i32_monotonic(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_sub_i64_monotonic(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_sub_i64_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    move $a3, $zero
-; LA32-NEXT:    bl __atomic_fetch_sub_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_sub_i64_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    move $a3, $zero
+; LA32R-NEXT:    bl __atomic_fetch_sub_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_sub_i64_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    move $a3, $zero
+; LA32S-NEXT:    bl __atomic_fetch_sub_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_sub_i64_monotonic:
 ; LA64:       # %bb.0:
@@ -4434,26 +6683,48 @@ define i64 @atomicrmw_sub_i64_monotonic(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_nand_i8_monotonic(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i8_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB144_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB144_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i8_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB144_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a4, $a2, 0
+; LA32R-NEXT:    and $a5, $a4, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    and $a5, $a5, $a3
+; LA32R-NEXT:    xor $a5, $a4, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB144_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i8_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB144_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB144_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i8_monotonic:
 ; LA64:       # %bb.0:
@@ -4471,7 +6742,7 @@ define i8 @atomicrmw_nand_i8_monotonic(ptr %a, i8 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB144_1
+; LA64-NEXT:    beq $a5, $zero, .LBB144_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4480,27 +6751,50 @@ define i8 @atomicrmw_nand_i8_monotonic(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_nand_i16_monotonic(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i16_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB145_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a4, $a0, 0
-; LA32-NEXT:    and $a5, $a4, $a1
-; LA32-NEXT:    nor $a5, $a5, $zero
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    and $a5, $a5, $a3
-; LA32-NEXT:    xor $a5, $a4, $a5
-; LA32-NEXT:    sc.w $a5, $a0, 0
-; LA32-NEXT:    beqz $a5, .LBB145_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a4, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i16_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB145_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a5, $a3, $a1
+; LA32R-NEXT:    nor $a5, $a5, $zero
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    and $a5, $a5, $a4
+; LA32R-NEXT:    xor $a5, $a3, $a5
+; LA32R-NEXT:    sc.w $a5, $a2, 0
+; LA32R-NEXT:    beq $a5, $zero, .LBB145_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i16_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB145_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a4, $a0, 0
+; LA32S-NEXT:    and $a5, $a4, $a1
+; LA32S-NEXT:    nor $a5, $a5, $zero
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    and $a5, $a5, $a3
+; LA32S-NEXT:    xor $a5, $a4, $a5
+; LA32S-NEXT:    sc.w $a5, $a0, 0
+; LA32S-NEXT:    beq $a5, $zero, .LBB145_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a4, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i16_monotonic:
 ; LA64:       # %bb.0:
@@ -4519,7 +6813,7 @@ define i16 @atomicrmw_nand_i16_monotonic(ptr %a, i16 %b) nounwind {
 ; LA64-NEXT:    and $a5, $a5, $a3
 ; LA64-NEXT:    xor $a5, $a4, $a5
 ; LA64-NEXT:    sc.w $a5, $a0, 0
-; LA64-NEXT:    beqz $a5, .LBB145_1
+; LA64-NEXT:    beq $a5, $zero, .LBB145_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    srl.w $a0, $a4, $a2
 ; LA64-NEXT:    ret
@@ -4528,17 +6822,29 @@ define i16 @atomicrmw_nand_i16_monotonic(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_nand_i32_monotonic(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i32_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB146_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    nor $a3, $a3, $zero
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB146_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i32_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB146_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    nor $a3, $a3, $zero
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB146_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i32_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB146_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    nor $a3, $a3, $zero
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB146_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i32_monotonic:
 ; LA64:       # %bb.0:
@@ -4547,7 +6853,7 @@ define i32 @atomicrmw_nand_i32_monotonic(ptr %a, i32 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.w $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB146_1
+; LA64-NEXT:    beq $a3, $zero, .LBB146_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -4556,15 +6862,25 @@ define i32 @atomicrmw_nand_i32_monotonic(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_nand_i64_monotonic(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_nand_i64_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    move $a3, $zero
-; LA32-NEXT:    bl __atomic_fetch_nand_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_nand_i64_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    move $a3, $zero
+; LA32R-NEXT:    bl __atomic_fetch_nand_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_nand_i64_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    move $a3, $zero
+; LA32S-NEXT:    bl __atomic_fetch_nand_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_nand_i64_monotonic:
 ; LA64:       # %bb.0:
@@ -4573,7 +6889,7 @@ define i64 @atomicrmw_nand_i64_monotonic(ptr %a, i64 %b) nounwind {
 ; LA64-NEXT:    and $a3, $a2, $a1
 ; LA64-NEXT:    nor $a3, $a3, $zero
 ; LA64-NEXT:    sc.d $a3, $a0, 0
-; LA64-NEXT:    beqz $a3, .LBB147_1
+; LA64-NEXT:    beq $a3, $zero, .LBB147_1
 ; LA64-NEXT:  # %bb.2:
 ; LA64-NEXT:    move $a0, $a2
 ; LA64-NEXT:    ret
@@ -4582,23 +6898,42 @@ define i64 @atomicrmw_nand_i64_monotonic(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_and_i8_monotonic(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i8_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB148_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB148_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i8_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ori $a3, $zero, 255
+; LA32R-NEXT:    sll.w $a3, $a3, $a0
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a3
+; LA32R-NEXT:  .LBB148_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB148_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i8_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB148_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB148_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i8_monotonic:
 ; LA64:       # %bb.0:
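
The and cases skip the three-instruction merge entirely: the inverted lane mask is folded into the operand up front with `orn`, so the loop body is a single `and`. A hedged sketch in the same style (hypothetical atomic_and_u8, CAS again standing in for ll.w/sc.w):

#include <stdint.h>

/* Bits outside the lane are pre-set in the operand (orn), so old & operand
 * leaves the neighbouring bytes untouched. Sketch only. */
static uint8_t atomic_and_u8(uint8_t *p, uint8_t b) {
  uint32_t *word = (uint32_t *)((uintptr_t)p & ~(uintptr_t)3);
  unsigned shift = ((uintptr_t)p & 3) * 8;
  uint32_t mask = (uint32_t)0xff << shift;
  uint32_t operand = ((uint32_t)b << shift) | ~mask;  /* orn $a1, $a1, $a3 */
  uint32_t old = __atomic_load_n(word, __ATOMIC_RELAXED);
  while (!__atomic_compare_exchange_n(word, &old, old & operand, 1,
                                      __ATOMIC_RELAXED, __ATOMIC_RELAXED))
    ;
  return (uint8_t)(old >> shift);
}

The or and xor cases further down need no mask at all, since or/xor against the zero bits outside the lane is already a no-op.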
@@ -4617,24 +6952,44 @@ define i8 @atomicrmw_and_i8_monotonic(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_and_i16_monotonic(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i16_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    lu12i.w $a3, 15
-; LA32-NEXT:    ori $a3, $a3, 4095
-; LA32-NEXT:    sll.w $a3, $a3, $a2
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    orn $a1, $a1, $a3
-; LA32-NEXT:  .LBB149_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    and $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB149_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i16_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    sll.w $a4, $a3, $a0
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:    orn $a1, $a1, $a4
+; LA32R-NEXT:  .LBB149_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    and $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB149_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i16_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    lu12i.w $a3, 15
+; LA32S-NEXT:    ori $a3, $a3, 4095
+; LA32S-NEXT:    sll.w $a3, $a3, $a2
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    orn $a1, $a1, $a3
+; LA32S-NEXT:  .LBB149_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    and $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB149_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i16_monotonic:
 ; LA64:       # %bb.0:
@@ -4654,16 +7009,27 @@ define i16 @atomicrmw_and_i16_monotonic(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_and_i32_monotonic(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i32_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB150_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    and $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB150_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i32_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB150_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    and $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB150_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i32_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB150_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    and $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB150_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i32_monotonic:
 ; LA64:       # %bb.0:
@@ -4675,15 +7041,25 @@ define i32 @atomicrmw_and_i32_monotonic(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_and_i64_monotonic(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_and_i64_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    move $a3, $zero
-; LA32-NEXT:    bl __atomic_fetch_and_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_and_i64_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    move $a3, $zero
+; LA32R-NEXT:    bl __atomic_fetch_and_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_and_i64_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    move $a3, $zero
+; LA32S-NEXT:    bl __atomic_fetch_and_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_and_i64_monotonic:
 ; LA64:       # %bb.0:
@@ -4695,20 +7071,36 @@ define i64 @atomicrmw_and_i64_monotonic(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_or_i8_monotonic(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i8_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB152_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB152_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i8_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB152_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB152_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i8_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB152_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB152_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i8_monotonic:
 ; LA64:       # %bb.0:
@@ -4724,20 +7116,38 @@ define i8 @atomicrmw_or_i8_monotonic(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_or_i16_monotonic(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i16_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB153_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    or $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB153_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i16_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB153_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    or $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB153_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i16_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB153_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    or $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB153_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i16_monotonic:
 ; LA64:       # %bb.0:
@@ -4753,16 +7163,27 @@ define i16 @atomicrmw_or_i16_monotonic(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_or_i32_monotonic(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i32_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB154_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    or $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB154_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i32_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB154_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    or $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB154_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i32_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB154_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    or $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB154_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i32_monotonic:
 ; LA64:       # %bb.0:
@@ -4774,15 +7195,25 @@ define i32 @atomicrmw_or_i32_monotonic(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_or_i64_monotonic(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_or_i64_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    move $a3, $zero
-; LA32-NEXT:    bl __atomic_fetch_or_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_or_i64_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    move $a3, $zero
+; LA32R-NEXT:    bl __atomic_fetch_or_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_or_i64_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    move $a3, $zero
+; LA32S-NEXT:    bl __atomic_fetch_or_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_or_i64_monotonic:
 ; LA64:       # %bb.0:
@@ -4794,20 +7225,36 @@ define i64 @atomicrmw_or_i64_monotonic(ptr %a, i64 %b) nounwind {
 }
 
 define i8 @atomicrmw_xor_i8_monotonic(ptr %a, i8 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i8_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB156_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB156_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i8_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB156_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB156_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i8_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB156_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB156_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i8_monotonic:
 ; LA64:       # %bb.0:
@@ -4823,20 +7270,38 @@ define i8 @atomicrmw_xor_i8_monotonic(ptr %a, i8 %b) nounwind {
 }
 
 define i16 @atomicrmw_xor_i16_monotonic(ptr %a, i16 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i16_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a0, 3
-; LA32-NEXT:    bstrins.w $a0, $zero, 1, 0
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:  .LBB157_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a3, $a0, 0
-; LA32-NEXT:    xor $a4, $a3, $a1
-; LA32-NEXT:    sc.w $a4, $a0, 0
-; LA32-NEXT:    beqz $a4, .LBB157_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    srl.w $a0, $a3, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i16_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -4
+; LA32R-NEXT:    and $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    lu12i.w $a3, 15
+; LA32R-NEXT:    ori $a3, $a3, 4095
+; LA32R-NEXT:    and $a1, $a1, $a3
+; LA32R-NEXT:    sll.w $a1, $a1, $a0
+; LA32R-NEXT:  .LBB157_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a3, $a2, 0
+; LA32R-NEXT:    xor $a4, $a3, $a1
+; LA32R-NEXT:    sc.w $a4, $a2, 0
+; LA32R-NEXT:    beq $a4, $zero, .LBB157_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    srl.w $a0, $a3, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i16_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a0, 3
+; LA32S-NEXT:    bstrins.w $a0, $zero, 1, 0
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:  .LBB157_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a3, $a0, 0
+; LA32S-NEXT:    xor $a4, $a3, $a1
+; LA32S-NEXT:    sc.w $a4, $a0, 0
+; LA32S-NEXT:    beq $a4, $zero, .LBB157_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    srl.w $a0, $a3, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i16_monotonic:
 ; LA64:       # %bb.0:
@@ -4852,16 +7317,27 @@ define i16 @atomicrmw_xor_i16_monotonic(ptr %a, i16 %b) nounwind {
 }
 
 define i32 @atomicrmw_xor_i32_monotonic(ptr %a, i32 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i32_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:  .LBB158_1: # =>This Inner Loop Header: Depth=1
-; LA32-NEXT:    ll.w $a2, $a0, 0
-; LA32-NEXT:    xor $a3, $a2, $a1
-; LA32-NEXT:    sc.w $a3, $a0, 0
-; LA32-NEXT:    beqz $a3, .LBB158_1
-; LA32-NEXT:  # %bb.2:
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i32_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:  .LBB158_1: # =>This Inner Loop Header: Depth=1
+; LA32R-NEXT:    ll.w $a2, $a0, 0
+; LA32R-NEXT:    xor $a3, $a2, $a1
+; LA32R-NEXT:    sc.w $a3, $a0, 0
+; LA32R-NEXT:    beq $a3, $zero, .LBB158_1
+; LA32R-NEXT:  # %bb.2:
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i32_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:  .LBB158_1: # =>This Inner Loop Header: Depth=1
+; LA32S-NEXT:    ll.w $a2, $a0, 0
+; LA32S-NEXT:    xor $a3, $a2, $a1
+; LA32S-NEXT:    sc.w $a3, $a0, 0
+; LA32S-NEXT:    beq $a3, $zero, .LBB158_1
+; LA32S-NEXT:  # %bb.2:
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i32_monotonic:
 ; LA64:       # %bb.0:
@@ -4873,15 +7349,25 @@ define i32 @atomicrmw_xor_i32_monotonic(ptr %a, i32 %b) nounwind {
 }
 
 define i64 @atomicrmw_xor_i64_monotonic(ptr %a, i64 %b) nounwind {
-; LA32-LABEL: atomicrmw_xor_i64_monotonic:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    move $a3, $zero
-; LA32-NEXT:    bl __atomic_fetch_xor_8
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: atomicrmw_xor_i64_monotonic:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    move $a3, $zero
+; LA32R-NEXT:    bl __atomic_fetch_xor_8
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: atomicrmw_xor_i64_monotonic:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    move $a3, $zero
+; LA32S-NEXT:    bl __atomic_fetch_xor_8
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: atomicrmw_xor_i64_monotonic:
 ; LA64:       # %bb.0:
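
None of the i64 forms change shape on the 32-bit targets: LA32R and LA32S both lower to a libatomic call, with $a3 zeroed to pass the monotonic (relaxed) memory order. Equivalent C, assuming libatomic is linked:

#include <stdint.h>

/* The relaxed order is what becomes `move $a3, $zero` before the
 * `bl __atomic_fetch_xor_8` in the checks above. */
uint64_t fetch_xor_u64(uint64_t *p, uint64_t v) {
  return __atomic_fetch_xor(p, v, __ATOMIC_RELAXED);
}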
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/br.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/br.ll
index 8b909e3314d64..b4b865a032eec 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/br.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/br.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefixes=ALL,LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefixes=ALL,LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefixes=ALL,LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefixes=ALL,LA64
 
 define void @foo() noreturn nounwind {
@@ -16,14 +17,23 @@ loop:
 }
 
 define void @foo_br_eq(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_eq:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    beq $a2, $a0, .LBB1_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB1_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_eq:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    beq $a2, $a0, .LBB1_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB1_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_eq:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    beq $a2, $a0, .LBB1_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB1_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_eq:
 ; LA64:       # %bb.0:
@@ -46,14 +56,23 @@ end:
 }
 
 define void @foo_br_ne(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_ne:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    bne $a2, $a0, .LBB2_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB2_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_ne:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    bne $a2, $a0, .LBB2_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB2_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_ne:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    bne $a2, $a0, .LBB2_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB2_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_ne:
 ; LA64:       # %bb.0:
@@ -76,14 +95,23 @@ end:
 }
 
 define void @foo_br_slt(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_slt:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    blt $a2, $a0, .LBB3_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB3_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_slt:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    blt $a2, $a0, .LBB3_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB3_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_slt:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    blt $a2, $a0, .LBB3_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB3_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_slt:
 ; LA64:       # %bb.0:
@@ -106,14 +134,23 @@ end:
 }
 
 define void @foo_br_sge(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_sge:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    bge $a2, $a0, .LBB4_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB4_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_sge:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    bge $a2, $a0, .LBB4_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB4_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_sge:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    bge $a2, $a0, .LBB4_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB4_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_sge:
 ; LA64:       # %bb.0:
@@ -136,14 +173,23 @@ end:
 }
 
 define void @foo_br_ult(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_ult:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    bltu $a2, $a0, .LBB5_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB5_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_ult:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    bltu $a2, $a0, .LBB5_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB5_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_ult:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    bltu $a2, $a0, .LBB5_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB5_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_ult:
 ; LA64:       # %bb.0:
@@ -166,14 +212,23 @@ end:
 }
 
 define void @foo_br_uge(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_uge:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    bgeu $a2, $a0, .LBB6_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB6_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_uge:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    bgeu $a2, $a0, .LBB6_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB6_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_uge:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    bgeu $a2, $a0, .LBB6_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB6_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_uge:
 ; LA64:       # %bb.0:
@@ -197,14 +252,23 @@ end:
 
 ;; Check for condition codes that don't have a matching instruction.
 define void @foo_br_sgt(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_sgt:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    blt $a0, $a2, .LBB7_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB7_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_sgt:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    blt $a0, $a2, .LBB7_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB7_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_sgt:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    blt $a0, $a2, .LBB7_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB7_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_sgt:
 ; LA64:       # %bb.0:
@@ -227,14 +291,23 @@ end:
 }
 
 define void @foo_br_sle(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_sle:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    bge $a0, $a2, .LBB8_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB8_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_sle:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    bge $a0, $a2, .LBB8_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB8_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_sle:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    bge $a0, $a2, .LBB8_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB8_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_sle:
 ; LA64:       # %bb.0:
@@ -257,14 +330,23 @@ end:
 }
 
 define void @foo_br_ugt(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_ugt:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    bltu $a0, $a2, .LBB9_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB9_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_ugt:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    bltu $a0, $a2, .LBB9_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB9_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_ugt:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    bltu $a0, $a2, .LBB9_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB9_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_ugt:
 ; LA64:       # %bb.0:
@@ -287,14 +369,23 @@ end:
 }
 
 define void @foo_br_ule(i32 %a, ptr %b) nounwind {
-; LA32-LABEL: foo_br_ule:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ld.w $a2, $a1, 0
-; LA32-NEXT:    bgeu $a0, $a2, .LBB10_2
-; LA32-NEXT:  # %bb.1: # %test
-; LA32-NEXT:    ld.w $zero, $a1, 0
-; LA32-NEXT:  .LBB10_2: # %end
-; LA32-NEXT:    ret
+; LA32R-LABEL: foo_br_ule:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $a2, $a1, 0
+; LA32R-NEXT:    bgeu $a0, $a2, .LBB10_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a1, 0
+; LA32R-NEXT:  .LBB10_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_ule:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $a2, $a1, 0
+; LA32S-NEXT:    bgeu $a0, $a2, .LBB10_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a1, 0
+; LA32S-NEXT:  .LBB10_2: # %end
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: foo_br_ule:
 ; LA64:       # %bb.0:
@@ -319,15 +410,35 @@ end:
 ;; Check the case of a branch where the condition was generated in another
 ;; function.
 define void @foo_br_cc(ptr %a, i1 %cc) nounwind {
-; ALL-LABEL: foo_br_cc:
-; ALL:       # %bb.0:
-; ALL-NEXT:    ld.w $zero, $a0, 0
-; ALL-NEXT:    andi $a1, $a1, 1
-; ALL-NEXT:    bnez $a1, .LBB11_2
-; ALL-NEXT:  # %bb.1: # %test
-; ALL-NEXT:    ld.w $zero, $a0, 0
-; ALL-NEXT:  .LBB11_2: # %end
-; ALL-NEXT:    ret
+; LA32R-LABEL: foo_br_cc:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ld.w $zero, $a0, 0
+; LA32R-NEXT:    andi $a1, $a1, 1
+; LA32R-NEXT:    bne $a1, $zero, .LBB11_2
+; LA32R-NEXT:  # %bb.1: # %test
+; LA32R-NEXT:    ld.w $zero, $a0, 0
+; LA32R-NEXT:  .LBB11_2: # %end
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: foo_br_cc:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ld.w $zero, $a0, 0
+; LA32S-NEXT:    andi $a1, $a1, 1
+; LA32S-NEXT:    bnez $a1, .LBB11_2
+; LA32S-NEXT:  # %bb.1: # %test
+; LA32S-NEXT:    ld.w $zero, $a0, 0
+; LA32S-NEXT:  .LBB11_2: # %end
+; LA32S-NEXT:    ret
+;
+; LA64-LABEL: foo_br_cc:
+; LA64:       # %bb.0:
+; LA64-NEXT:    ld.w $zero, $a0, 0
+; LA64-NEXT:    andi $a1, $a1, 1
+; LA64-NEXT:    bnez $a1, .LBB11_2
+; LA64-NEXT:  # %bb.1: # %test
+; LA64-NEXT:    ld.w $zero, $a0, 0
+; LA64-NEXT:  .LBB11_2: # %end
+; LA64-NEXT:    ret
   %val = load volatile i32, ptr %a
   br i1 %cc, label %end, label %test
 test:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/double-convert.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/double-convert.ll
index 14682d62d7d31..2a51fd97feb62 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/double-convert.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/double-convert.ll
@@ -119,17 +119,18 @@ define i32 @convert_double_to_u32(double %a) nounwind {
 ; LA32-NEXT:    pcalau12i $a0, %pc_hi20(.LCPI7_0)
 ; LA32-NEXT:    fld.d $fa1, $a0, %pc_lo12(.LCPI7_0)
 ; LA32-NEXT:    fcmp.clt.d $fcc0, $fa0, $fa1
-; LA32-NEXT:    fsub.d $fa1, $fa0, $fa1
-; LA32-NEXT:    ftintrz.w.d $fa1, $fa1
-; LA32-NEXT:    movfr2gr.s $a0, $fa1
+; LA32-NEXT:    movcf2gr $a0, $fcc0
+; LA32-NEXT:    bne $a0, $zero, .LBB7_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    fsub.d $fa0, $fa0, $fa1
+; LA32-NEXT:    ftintrz.w.d $fa0, $fa0
+; LA32-NEXT:    movfr2gr.s $a0, $fa0
 ; LA32-NEXT:    lu12i.w $a1, -524288
 ; LA32-NEXT:    xor $a0, $a0, $a1
-; LA32-NEXT:    movcf2gr $a1, $fcc0
-; LA32-NEXT:    masknez $a0, $a0, $a1
+; LA32-NEXT:    ret
+; LA32-NEXT:  .LBB7_2:
 ; LA32-NEXT:    ftintrz.w.d $fa0, $fa0
-; LA32-NEXT:    movfr2gr.s $a2, $fa0
-; LA32-NEXT:    maskeqz $a1, $a2, $a1
-; LA32-NEXT:    or $a0, $a1, $a0
+; LA32-NEXT:    movfr2gr.s $a0, $fa0
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: convert_double_to_u32:
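
The new LA32 output for convert_double_to_u32 (and the matching float-convert.ll hunks below) branches on the fcmp result instead of computing both conversions and selecting with masknez/maskeqz. The underlying split is unchanged; roughly, in C (hypothetical helper name; the 2^31 constant comes from the .LCPI7_0 pool entry):

#include <stdint.h>

/* Values below 2^31 convert directly; larger ones are biased down first
 * and the sign bit is restored with an xor. Sketch only. */
static uint32_t double_to_u32(double a) {
  const double two31 = 2147483648.0;      /* .LCPI7_0 */
  if (a < two31)                          /* fcmp.clt.d + movcf2gr + bne */
    return (uint32_t)(int32_t)a;          /* ftintrz.w.d + movfr2gr.s */
  int32_t biased = (int32_t)(a - two31);  /* fsub.d + ftintrz.w.d */
  return (uint32_t)biased ^ 0x80000000u;  /* lu12i.w -524288 + xor */
}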
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-dbl.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-dbl.ll
index 3db98d20fbf11..cff3484934214 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-dbl.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-dbl.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 --mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32
 ; RUN: llc --mtriple=loongarch64 --mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 ;; Test the 'fcmp' LLVM IR: https://llvm.org/docs/LangRef.html#fcmp-instruction
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-flt.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-flt.ll
index d0f8d5342280d..8b682ecac50f5 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-flt.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-flt.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 --mattr=+f,-d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+f,-d < %s | FileCheck %s --check-prefix=LA32
 ; RUN: llc --mtriple=loongarch64 --mattr=+f,-d < %s | FileCheck %s --check-prefix=LA64
 
 ;; Test the 'fcmp' LLVM IR: https://llvm.org/docs/LangRef.html#fcmp-instruction
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/float-convert.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/float-convert.ll
index ac1e1df5395c6..413702b006b1b 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/float-convert.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/float-convert.ll
@@ -184,17 +184,18 @@ define i32 @convert_float_to_u32(float %a) nounwind {
 ; LA32F-NEXT:    pcalau12i $a0, %pc_hi20(.LCPI6_0)
 ; LA32F-NEXT:    fld.s $fa1, $a0, %pc_lo12(.LCPI6_0)
 ; LA32F-NEXT:    fcmp.clt.s $fcc0, $fa0, $fa1
-; LA32F-NEXT:    fsub.s $fa1, $fa0, $fa1
-; LA32F-NEXT:    ftintrz.w.s $fa1, $fa1
-; LA32F-NEXT:    movfr2gr.s $a0, $fa1
+; LA32F-NEXT:    movcf2gr $a0, $fcc0
+; LA32F-NEXT:    bne $a0, $zero, .LBB6_2
+; LA32F-NEXT:  # %bb.1:
+; LA32F-NEXT:    fsub.s $fa0, $fa0, $fa1
+; LA32F-NEXT:    ftintrz.w.s $fa0, $fa0
+; LA32F-NEXT:    movfr2gr.s $a0, $fa0
 ; LA32F-NEXT:    lu12i.w $a1, -524288
 ; LA32F-NEXT:    xor $a0, $a0, $a1
-; LA32F-NEXT:    movcf2gr $a1, $fcc0
-; LA32F-NEXT:    masknez $a0, $a0, $a1
+; LA32F-NEXT:    ret
+; LA32F-NEXT:  .LBB6_2:
 ; LA32F-NEXT:    ftintrz.w.s $fa0, $fa0
-; LA32F-NEXT:    movfr2gr.s $a2, $fa0
-; LA32F-NEXT:    maskeqz $a1, $a2, $a1
-; LA32F-NEXT:    or $a0, $a1, $a0
+; LA32F-NEXT:    movfr2gr.s $a0, $fa0
 ; LA32F-NEXT:    ret
 ;
 ; LA32D-LABEL: convert_float_to_u32:
@@ -202,17 +203,18 @@ define i32 @convert_float_to_u32(float %a) nounwind {
 ; LA32D-NEXT:    pcalau12i $a0, %pc_hi20(.LCPI6_0)
 ; LA32D-NEXT:    fld.s $fa1, $a0, %pc_lo12(.LCPI6_0)
 ; LA32D-NEXT:    fcmp.clt.s $fcc0, $fa0, $fa1
-; LA32D-NEXT:    fsub.s $fa1, $fa0, $fa1
-; LA32D-NEXT:    ftintrz.w.s $fa1, $fa1
-; LA32D-NEXT:    movfr2gr.s $a0, $fa1
+; LA32D-NEXT:    movcf2gr $a0, $fcc0
+; LA32D-NEXT:    bne $a0, $zero, .LBB6_2
+; LA32D-NEXT:  # %bb.1:
+; LA32D-NEXT:    fsub.s $fa0, $fa0, $fa1
+; LA32D-NEXT:    ftintrz.w.s $fa0, $fa0
+; LA32D-NEXT:    movfr2gr.s $a0, $fa0
 ; LA32D-NEXT:    lu12i.w $a1, -524288
 ; LA32D-NEXT:    xor $a0, $a0, $a1
-; LA32D-NEXT:    movcf2gr $a1, $fcc0
-; LA32D-NEXT:    masknez $a0, $a0, $a1
+; LA32D-NEXT:    ret
+; LA32D-NEXT:  .LBB6_2:
 ; LA32D-NEXT:    ftintrz.w.s $fa0, $fa0
-; LA32D-NEXT:    movfr2gr.s $a2, $fa0
-; LA32D-NEXT:    maskeqz $a1, $a2, $a1
-; LA32D-NEXT:    or $a0, $a1, $a0
+; LA32D-NEXT:    movfr2gr.s $a0, $fa0
 ; LA32D-NEXT:    ret
 ;
 ; LA64F-LABEL: convert_float_to_u32:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/load-store-fp.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/load-store-fp.ll
index 3f392c5e3f9c6..df82e2e042bf6 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/load-store-fp.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/load-store-fp.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 --mattr=+f,-d < %s | FileCheck %s --check-prefix=LA32F
-; RUN: llc --mtriple=loongarch32 --mattr=+d < %s | FileCheck %s --check-prefix=LA32D
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+f,-d < %s | FileCheck %s --check-prefix=LA32F
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32D
 ; RUN: llc --mtriple=loongarch64 --mattr=+f,-d < %s | FileCheck %s --check-prefix=LA64F
 ; RUN: llc --mtriple=loongarch64 --mattr=+d < %s | FileCheck %s --check-prefix=LA64D
 
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/load-store.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/load-store.ll
index 9654542f87745..18ba4810cfb72 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/load-store.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/load-store.ll
@@ -1,6 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
-; RUN: llc --mtriple=loongarch32 --mattr=+d --relocation-model=static < %s | FileCheck %s --check-prefixes=ALL,LA32NOPIC
-; RUN: llc --mtriple=loongarch32 --mattr=+d --relocation-model=pic < %s | FileCheck %s --check-prefixes=ALL,LA32PIC
+; RUN: llc --mtriple=loongarch32 --mattr=-32s,+d --relocation-model=static < %s | FileCheck %s --check-prefixes=ALL,LA32RNOPIC
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+d --relocation-model=static < %s | FileCheck %s --check-prefixes=ALL,LA32SNOPIC
+; RUN: llc --mtriple=loongarch32 --mattr=-32s,+d --relocation-model=pic < %s | FileCheck %s --check-prefixes=ALL,LA32RPIC
+; RUN: llc --mtriple=loongarch32 --mattr=+32s,+d --relocation-model=pic < %s | FileCheck %s --check-prefixes=ALL,LA32SPIC
 ; RUN: llc --mtriple=loongarch64 --mattr=+d --relocation-model=static < %s | FileCheck %s --check-prefixes=ALL,LA64NOPIC
 ; RUN: llc --mtriple=loongarch64 --mattr=+d --relocation-model=pic < %s | FileCheck %s --check-prefixes=ALL,LA64PIC
 
@@ -9,21 +11,37 @@
 @arr = dso_local global [10 x i32] zeroinitializer, align 4
 
 define i32 @load_store_global() nounwind {
-; LA32NOPIC-LABEL: load_store_global:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    pcalau12i $a1, %pc_hi20(G)
-; LA32NOPIC-NEXT:    ld.w $a0, $a1, %pc_lo12(G)
-; LA32NOPIC-NEXT:    addi.w $a0, $a0, 1
-; LA32NOPIC-NEXT:    st.w $a0, $a1, %pc_lo12(G)
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: load_store_global:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    pcalau12i $a1, %pc_hi20(.LG$local)
-; LA32PIC-NEXT:    ld.w $a0, $a1, %pc_lo12(.LG$local)
-; LA32PIC-NEXT:    addi.w $a0, $a0, 1
-; LA32PIC-NEXT:    st.w $a0, $a1, %pc_lo12(.LG$local)
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: load_store_global:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    pcalau12i $a1, %pc_hi20(G)
+; LA32RNOPIC-NEXT:    ld.w $a0, $a1, %pc_lo12(G)
+; LA32RNOPIC-NEXT:    addi.w $a0, $a0, 1
+; LA32RNOPIC-NEXT:    st.w $a0, $a1, %pc_lo12(G)
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: load_store_global:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    pcalau12i $a1, %pc_hi20(G)
+; LA32SNOPIC-NEXT:    ld.w $a0, $a1, %pc_lo12(G)
+; LA32SNOPIC-NEXT:    addi.w $a0, $a0, 1
+; LA32SNOPIC-NEXT:    st.w $a0, $a1, %pc_lo12(G)
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: load_store_global:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    pcalau12i $a1, %pc_hi20(.LG$local)
+; LA32RPIC-NEXT:    ld.w $a0, $a1, %pc_lo12(.LG$local)
+; LA32RPIC-NEXT:    addi.w $a0, $a0, 1
+; LA32RPIC-NEXT:    st.w $a0, $a1, %pc_lo12(.LG$local)
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: load_store_global:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    pcalau12i $a1, %pc_hi20(.LG$local)
+; LA32SPIC-NEXT:    ld.w $a0, $a1, %pc_lo12(.LG$local)
+; LA32SPIC-NEXT:    addi.w $a0, $a0, 1
+; LA32SPIC-NEXT:    st.w $a0, $a1, %pc_lo12(.LG$local)
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: load_store_global:
 ; LA64NOPIC:       # %bb.0:
@@ -47,27 +65,49 @@ define i32 @load_store_global() nounwind {
 }
 
 define i32 @load_store_global_array(i32 %a) nounwind {
-; LA32NOPIC-LABEL: load_store_global_array:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    pcalau12i $a1, %pc_hi20(arr)
-; LA32NOPIC-NEXT:    addi.w $a2, $a1, %pc_lo12(arr)
-; LA32NOPIC-NEXT:    ld.w $a1, $a2, 0
-; LA32NOPIC-NEXT:    st.w $a0, $a2, 0
-; LA32NOPIC-NEXT:    ld.w $zero, $a2, 36
-; LA32NOPIC-NEXT:    st.w $a0, $a2, 36
-; LA32NOPIC-NEXT:    move $a0, $a1
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: load_store_global_array:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    pcalau12i $a1, %pc_hi20(.Larr$local)
-; LA32PIC-NEXT:    addi.w $a2, $a1, %pc_lo12(.Larr$local)
-; LA32PIC-NEXT:    ld.w $a1, $a2, 0
-; LA32PIC-NEXT:    st.w $a0, $a2, 0
-; LA32PIC-NEXT:    ld.w $zero, $a2, 36
-; LA32PIC-NEXT:    st.w $a0, $a2, 36
-; LA32PIC-NEXT:    move $a0, $a1
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: load_store_global_array:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    pcalau12i $a1, %pc_hi20(arr)
+; LA32RNOPIC-NEXT:    addi.w $a2, $a1, %pc_lo12(arr)
+; LA32RNOPIC-NEXT:    ld.w $a1, $a2, 0
+; LA32RNOPIC-NEXT:    st.w $a0, $a2, 0
+; LA32RNOPIC-NEXT:    ld.w $zero, $a2, 36
+; LA32RNOPIC-NEXT:    st.w $a0, $a2, 36
+; LA32RNOPIC-NEXT:    move $a0, $a1
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: load_store_global_array:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    pcalau12i $a1, %pc_hi20(arr)
+; LA32SNOPIC-NEXT:    addi.w $a2, $a1, %pc_lo12(arr)
+; LA32SNOPIC-NEXT:    ld.w $a1, $a2, 0
+; LA32SNOPIC-NEXT:    st.w $a0, $a2, 0
+; LA32SNOPIC-NEXT:    ld.w $zero, $a2, 36
+; LA32SNOPIC-NEXT:    st.w $a0, $a2, 36
+; LA32SNOPIC-NEXT:    move $a0, $a1
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: load_store_global_array:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    pcalau12i $a1, %pc_hi20(.Larr$local)
+; LA32RPIC-NEXT:    addi.w $a2, $a1, %pc_lo12(.Larr$local)
+; LA32RPIC-NEXT:    ld.w $a1, $a2, 0
+; LA32RPIC-NEXT:    st.w $a0, $a2, 0
+; LA32RPIC-NEXT:    ld.w $zero, $a2, 36
+; LA32RPIC-NEXT:    st.w $a0, $a2, 36
+; LA32RPIC-NEXT:    move $a0, $a1
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: load_store_global_array:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    pcalau12i $a1, %pc_hi20(.Larr$local)
+; LA32SPIC-NEXT:    addi.w $a2, $a1, %pc_lo12(.Larr$local)
+; LA32SPIC-NEXT:    ld.w $a1, $a2, 0
+; LA32SPIC-NEXT:    st.w $a0, $a2, 0
+; LA32SPIC-NEXT:    ld.w $zero, $a2, 36
+; LA32SPIC-NEXT:    st.w $a0, $a2, 36
+; LA32SPIC-NEXT:    move $a0, $a1
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: load_store_global_array:
 ; LA64NOPIC:       # %bb.0:
@@ -101,21 +141,37 @@ define i32 @load_store_global_array(i32 %a) nounwind {
 ;; Check indexed and unindexed, sext, zext and anyext loads.
 
 define i64 @ld_b(ptr %a) nounwind {
-; LA32NOPIC-LABEL: ld_b:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    ld.b $a2, $a0, 1
-; LA32NOPIC-NEXT:    ld.b $zero, $a0, 0
-; LA32NOPIC-NEXT:    srai.w $a1, $a2, 31
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ld_b:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    ld.b $a2, $a0, 1
-; LA32PIC-NEXT:    ld.b $zero, $a0, 0
-; LA32PIC-NEXT:    srai.w $a1, $a2, 31
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ld_b:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    ld.b $a2, $a0, 1
+; LA32RNOPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32RNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ld_b:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    ld.b $a2, $a0, 1
+; LA32SNOPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32SNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ld_b:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    ld.b $a2, $a0, 1
+; LA32RPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32RPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ld_b:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    ld.b $a2, $a0, 1
+; LA32SPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32SPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ld_b:
 ; LA64NOPIC:       # %bb.0:
@@ -138,21 +194,37 @@ define i64 @ld_b(ptr %a) nounwind {
 }
 
 define i64 @ld_h(ptr %a) nounwind {
-; LA32NOPIC-LABEL: ld_h:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    ld.h $a2, $a0, 4
-; LA32NOPIC-NEXT:    ld.h $zero, $a0, 0
-; LA32NOPIC-NEXT:    srai.w $a1, $a2, 31
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ld_h:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    ld.h $a2, $a0, 4
-; LA32PIC-NEXT:    ld.h $zero, $a0, 0
-; LA32PIC-NEXT:    srai.w $a1, $a2, 31
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ld_h:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    ld.h $a2, $a0, 4
+; LA32RNOPIC-NEXT:    ld.h $zero, $a0, 0
+; LA32RNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ld_h:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    ld.h $a2, $a0, 4
+; LA32SNOPIC-NEXT:    ld.h $zero, $a0, 0
+; LA32SNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ld_h:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    ld.h $a2, $a0, 4
+; LA32RPIC-NEXT:    ld.h $zero, $a0, 0
+; LA32RPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ld_h:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    ld.h $a2, $a0, 4
+; LA32SPIC-NEXT:    ld.h $zero, $a0, 0
+; LA32SPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ld_h:
 ; LA64NOPIC:       # %bb.0:
@@ -175,21 +247,37 @@ define i64 @ld_h(ptr %a) nounwind {
 }
 
 define i64 @ld_w(ptr %a) nounwind {
-; LA32NOPIC-LABEL: ld_w:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    ld.w $a2, $a0, 12
-; LA32NOPIC-NEXT:    ld.w $zero, $a0, 0
-; LA32NOPIC-NEXT:    srai.w $a1, $a2, 31
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ld_w:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    ld.w $a2, $a0, 12
-; LA32PIC-NEXT:    ld.w $zero, $a0, 0
-; LA32PIC-NEXT:    srai.w $a1, $a2, 31
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ld_w:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    ld.w $a2, $a0, 12
+; LA32RNOPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32RNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ld_w:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    ld.w $a2, $a0, 12
+; LA32SNOPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32SNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ld_w:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    ld.w $a2, $a0, 12
+; LA32RPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32RPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ld_w:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    ld.w $a2, $a0, 12
+; LA32SPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32SPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ld_w:
 ; LA64NOPIC:       # %bb.0:
@@ -212,23 +300,41 @@ define i64 @ld_w(ptr %a) nounwind {
 }
 
 define i64 @ld_d(ptr %a) nounwind {
-; LA32NOPIC-LABEL: ld_d:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    ld.w $a1, $a0, 28
-; LA32NOPIC-NEXT:    ld.w $a2, $a0, 24
-; LA32NOPIC-NEXT:    ld.w $zero, $a0, 4
-; LA32NOPIC-NEXT:    ld.w $zero, $a0, 0
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ld_d:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    ld.w $a1, $a0, 28
-; LA32PIC-NEXT:    ld.w $a2, $a0, 24
-; LA32PIC-NEXT:    ld.w $zero, $a0, 4
-; LA32PIC-NEXT:    ld.w $zero, $a0, 0
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ld_d:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    ld.w $a1, $a0, 28
+; LA32RNOPIC-NEXT:    ld.w $a2, $a0, 24
+; LA32RNOPIC-NEXT:    ld.w $zero, $a0, 4
+; LA32RNOPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ld_d:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    ld.w $a1, $a0, 28
+; LA32SNOPIC-NEXT:    ld.w $a2, $a0, 24
+; LA32SNOPIC-NEXT:    ld.w $zero, $a0, 4
+; LA32SNOPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ld_d:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    ld.w $a1, $a0, 28
+; LA32RPIC-NEXT:    ld.w $a2, $a0, 24
+; LA32RPIC-NEXT:    ld.w $zero, $a0, 4
+; LA32RPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ld_d:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    ld.w $a1, $a0, 28
+; LA32SPIC-NEXT:    ld.w $a2, $a0, 24
+; LA32SPIC-NEXT:    ld.w $zero, $a0, 4
+; LA32SPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ld_d:
 ; LA64NOPIC:       # %bb.0:
@@ -250,21 +356,37 @@ define i64 @ld_d(ptr %a) nounwind {
 }
 
 define i64 @ld_bu(ptr %a) nounwind {
-; LA32NOPIC-LABEL: ld_bu:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    ld.bu $a1, $a0, 4
-; LA32NOPIC-NEXT:    ld.bu $a0, $a0, 0
-; LA32NOPIC-NEXT:    add.w $a0, $a1, $a0
-; LA32NOPIC-NEXT:    sltu $a1, $a0, $a1
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ld_bu:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    ld.bu $a1, $a0, 4
-; LA32PIC-NEXT:    ld.bu $a0, $a0, 0
-; LA32PIC-NEXT:    add.w $a0, $a1, $a0
-; LA32PIC-NEXT:    sltu $a1, $a0, $a1
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ld_bu:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    ld.bu $a1, $a0, 4
+; LA32RNOPIC-NEXT:    ld.bu $a0, $a0, 0
+; LA32RNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ld_bu:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    ld.bu $a1, $a0, 4
+; LA32SNOPIC-NEXT:    ld.bu $a0, $a0, 0
+; LA32SNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ld_bu:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    ld.bu $a1, $a0, 4
+; LA32RPIC-NEXT:    ld.bu $a0, $a0, 0
+; LA32RPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ld_bu:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    ld.bu $a1, $a0, 4
+; LA32SPIC-NEXT:    ld.bu $a0, $a0, 0
+; LA32SPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ld_bu:
 ; LA64NOPIC:       # %bb.0:
@@ -289,21 +411,37 @@ define i64 @ld_bu(ptr %a) nounwind {
 }
 
 define i64 @ld_hu(ptr %a) nounwind {
-; LA32NOPIC-LABEL: ld_hu:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    ld.hu $a1, $a0, 10
-; LA32NOPIC-NEXT:    ld.hu $a0, $a0, 0
-; LA32NOPIC-NEXT:    add.w $a0, $a1, $a0
-; LA32NOPIC-NEXT:    sltu $a1, $a0, $a1
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ld_hu:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    ld.hu $a1, $a0, 10
-; LA32PIC-NEXT:    ld.hu $a0, $a0, 0
-; LA32PIC-NEXT:    add.w $a0, $a1, $a0
-; LA32PIC-NEXT:    sltu $a1, $a0, $a1
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ld_hu:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    ld.hu $a1, $a0, 10
+; LA32RNOPIC-NEXT:    ld.hu $a0, $a0, 0
+; LA32RNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ld_hu:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    ld.hu $a1, $a0, 10
+; LA32SNOPIC-NEXT:    ld.hu $a0, $a0, 0
+; LA32SNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ld_hu:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    ld.hu $a1, $a0, 10
+; LA32RPIC-NEXT:    ld.hu $a0, $a0, 0
+; LA32RPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ld_hu:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    ld.hu $a1, $a0, 10
+; LA32SPIC-NEXT:    ld.hu $a0, $a0, 0
+; LA32SPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ld_hu:
 ; LA64NOPIC:       # %bb.0:
@@ -328,21 +466,37 @@ define i64 @ld_hu(ptr %a) nounwind {
 }
 
 define i64 @ld_wu(ptr %a) nounwind {
-; LA32NOPIC-LABEL: ld_wu:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    ld.w $a1, $a0, 20
-; LA32NOPIC-NEXT:    ld.w $a0, $a0, 0
-; LA32NOPIC-NEXT:    add.w $a0, $a1, $a0
-; LA32NOPIC-NEXT:    sltu $a1, $a0, $a1
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ld_wu:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    ld.w $a1, $a0, 20
-; LA32PIC-NEXT:    ld.w $a0, $a0, 0
-; LA32PIC-NEXT:    add.w $a0, $a1, $a0
-; LA32PIC-NEXT:    sltu $a1, $a0, $a1
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ld_wu:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    ld.w $a1, $a0, 20
+; LA32RNOPIC-NEXT:    ld.w $a0, $a0, 0
+; LA32RNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ld_wu:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    ld.w $a1, $a0, 20
+; LA32SNOPIC-NEXT:    ld.w $a0, $a0, 0
+; LA32SNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ld_wu:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    ld.w $a1, $a0, 20
+; LA32RPIC-NEXT:    ld.w $a0, $a0, 0
+; LA32RPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ld_wu:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    ld.w $a1, $a0, 20
+; LA32SPIC-NEXT:    ld.w $a0, $a0, 0
+; LA32SPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ld_wu:
 ; LA64NOPIC:       # %bb.0:
@@ -367,23 +521,41 @@ define i64 @ld_wu(ptr %a) nounwind {
 }
 
 define i64 @ldx_b(ptr %a, i64 %idx) nounwind {
-; LA32NOPIC-LABEL: ldx_b:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    add.w $a1, $a0, $a1
-; LA32NOPIC-NEXT:    ld.b $a2, $a1, 0
-; LA32NOPIC-NEXT:    ld.b $zero, $a0, 0
-; LA32NOPIC-NEXT:    srai.w $a1, $a2, 31
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ldx_b:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    add.w $a1, $a0, $a1
-; LA32PIC-NEXT:    ld.b $a2, $a1, 0
-; LA32PIC-NEXT:    ld.b $zero, $a0, 0
-; LA32PIC-NEXT:    srai.w $a1, $a2, 31
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ldx_b:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ld.b $a2, $a1, 0
+; LA32RNOPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32RNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ldx_b:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32SNOPIC-NEXT:    ld.b $a2, $a1, 0
+; LA32SNOPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32SNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ldx_b:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RPIC-NEXT:    ld.b $a2, $a1, 0
+; LA32RPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32RPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ldx_b:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32SPIC-NEXT:    ld.b $a2, $a1, 0
+; LA32SPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32SPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ldx_b:
 ; LA64NOPIC:       # %bb.0:
@@ -406,23 +578,43 @@ define i64 @ldx_b(ptr %a, i64 %idx) nounwind {
 }
 
 define i64 @ldx_h(ptr %a, i64 %idx) nounwind {
-; LA32NOPIC-LABEL: ldx_h:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    alsl.w $a1, $a1, $a0, 1
-; LA32NOPIC-NEXT:    ld.h $a2, $a1, 0
-; LA32NOPIC-NEXT:    ld.h $zero, $a0, 0
-; LA32NOPIC-NEXT:    srai.w $a1, $a2, 31
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ldx_h:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    alsl.w $a1, $a1, $a0, 1
-; LA32PIC-NEXT:    ld.h $a2, $a1, 0
-; LA32PIC-NEXT:    ld.h $zero, $a0, 0
-; LA32PIC-NEXT:    srai.w $a1, $a2, 31
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ldx_h:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    slli.w $a1, $a1, 1
+; LA32RNOPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ld.h $a2, $a1, 0
+; LA32RNOPIC-NEXT:    ld.h $zero, $a0, 0
+; LA32RNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ldx_h:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    alsl.w $a1, $a1, $a0, 1
+; LA32SNOPIC-NEXT:    ld.h $a2, $a1, 0
+; LA32SNOPIC-NEXT:    ld.h $zero, $a0, 0
+; LA32SNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ldx_h:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    slli.w $a1, $a1, 1
+; LA32RPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RPIC-NEXT:    ld.h $a2, $a1, 0
+; LA32RPIC-NEXT:    ld.h $zero, $a0, 0
+; LA32RPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ldx_h:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    alsl.w $a1, $a1, $a0, 1
+; LA32SPIC-NEXT:    ld.h $a2, $a1, 0
+; LA32SPIC-NEXT:    ld.h $zero, $a0, 0
+; LA32SPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ldx_h:
 ; LA64NOPIC:       # %bb.0:
@@ -447,23 +639,43 @@ define i64 @ldx_h(ptr %a, i64 %idx) nounwind {
 }
 
 define i64 @ldx_w(ptr %a, i64 %idx) nounwind {
-; LA32NOPIC-LABEL: ldx_w:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    alsl.w $a1, $a1, $a0, 2
-; LA32NOPIC-NEXT:    ld.w $a2, $a1, 0
-; LA32NOPIC-NEXT:    ld.w $zero, $a0, 0
-; LA32NOPIC-NEXT:    srai.w $a1, $a2, 31
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ldx_w:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    alsl.w $a1, $a1, $a0, 2
-; LA32PIC-NEXT:    ld.w $a2, $a1, 0
-; LA32PIC-NEXT:    ld.w $zero, $a0, 0
-; LA32PIC-NEXT:    srai.w $a1, $a2, 31
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ldx_w:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    slli.w $a1, $a1, 2
+; LA32RNOPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ld.w $a2, $a1, 0
+; LA32RNOPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32RNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ldx_w:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    alsl.w $a1, $a1, $a0, 2
+; LA32SNOPIC-NEXT:    ld.w $a2, $a1, 0
+; LA32SNOPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32SNOPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ldx_w:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    slli.w $a1, $a1, 2
+; LA32RPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RPIC-NEXT:    ld.w $a2, $a1, 0
+; LA32RPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32RPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ldx_w:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    alsl.w $a1, $a1, $a0, 2
+; LA32SPIC-NEXT:    ld.w $a2, $a1, 0
+; LA32SPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32SPIC-NEXT:    srai.w $a1, $a2, 31
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ldx_w:
 ; LA64NOPIC:       # %bb.0:
@@ -488,25 +700,47 @@ define i64 @ldx_w(ptr %a, i64 %idx) nounwind {
 }
 
 define i64 @ldx_d(ptr %a, i64 %idx) nounwind {
-; LA32NOPIC-LABEL: ldx_d:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    alsl.w $a1, $a1, $a0, 3
-; LA32NOPIC-NEXT:    ld.w $a2, $a1, 0
-; LA32NOPIC-NEXT:    ld.w $a1, $a1, 4
-; LA32NOPIC-NEXT:    ld.w $zero, $a0, 0
-; LA32NOPIC-NEXT:    ld.w $zero, $a0, 4
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ldx_d:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    alsl.w $a1, $a1, $a0, 3
-; LA32PIC-NEXT:    ld.w $a2, $a1, 0
-; LA32PIC-NEXT:    ld.w $a1, $a1, 4
-; LA32PIC-NEXT:    ld.w $zero, $a0, 0
-; LA32PIC-NEXT:    ld.w $zero, $a0, 4
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ldx_d:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    slli.w $a1, $a1, 3
+; LA32RNOPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ld.w $a2, $a1, 0
+; LA32RNOPIC-NEXT:    ld.w $a1, $a1, 4
+; LA32RNOPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32RNOPIC-NEXT:    ld.w $zero, $a0, 4
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ldx_d:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    alsl.w $a1, $a1, $a0, 3
+; LA32SNOPIC-NEXT:    ld.w $a2, $a1, 0
+; LA32SNOPIC-NEXT:    ld.w $a1, $a1, 4
+; LA32SNOPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32SNOPIC-NEXT:    ld.w $zero, $a0, 4
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ldx_d:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    slli.w $a1, $a1, 3
+; LA32RPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RPIC-NEXT:    ld.w $a2, $a1, 0
+; LA32RPIC-NEXT:    ld.w $a1, $a1, 4
+; LA32RPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32RPIC-NEXT:    ld.w $zero, $a0, 4
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ldx_d:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    alsl.w $a1, $a1, $a0, 3
+; LA32SPIC-NEXT:    ld.w $a2, $a1, 0
+; LA32SPIC-NEXT:    ld.w $a1, $a1, 4
+; LA32SPIC-NEXT:    ld.w $zero, $a0, 0
+; LA32SPIC-NEXT:    ld.w $zero, $a0, 4
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ldx_d:
 ; LA64NOPIC:       # %bb.0:
@@ -530,23 +764,41 @@ define i64 @ldx_d(ptr %a, i64 %idx) nounwind {
 }
 
 define i64 @ldx_bu(ptr %a, i64 %idx) nounwind {
-; LA32NOPIC-LABEL: ldx_bu:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    add.w $a1, $a0, $a1
-; LA32NOPIC-NEXT:    ld.bu $a1, $a1, 0
-; LA32NOPIC-NEXT:    ld.bu $a0, $a0, 0
-; LA32NOPIC-NEXT:    add.w $a0, $a1, $a0
-; LA32NOPIC-NEXT:    sltu $a1, $a0, $a1
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ldx_bu:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    add.w $a1, $a0, $a1
-; LA32PIC-NEXT:    ld.bu $a1, $a1, 0
-; LA32PIC-NEXT:    ld.bu $a0, $a0, 0
-; LA32PIC-NEXT:    add.w $a0, $a1, $a0
-; LA32PIC-NEXT:    sltu $a1, $a0, $a1
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ldx_bu:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ld.bu $a1, $a1, 0
+; LA32RNOPIC-NEXT:    ld.bu $a0, $a0, 0
+; LA32RNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ldx_bu:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32SNOPIC-NEXT:    ld.bu $a1, $a1, 0
+; LA32SNOPIC-NEXT:    ld.bu $a0, $a0, 0
+; LA32SNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ldx_bu:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RPIC-NEXT:    ld.bu $a1, $a1, 0
+; LA32RPIC-NEXT:    ld.bu $a0, $a0, 0
+; LA32RPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ldx_bu:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32SPIC-NEXT:    ld.bu $a1, $a1, 0
+; LA32SPIC-NEXT:    ld.bu $a0, $a0, 0
+; LA32SPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ldx_bu:
 ; LA64NOPIC:       # %bb.0:
@@ -571,23 +823,43 @@ define i64 @ldx_bu(ptr %a, i64 %idx) nounwind {
 }
 
 define i64 @ldx_hu(ptr %a, i64 %idx) nounwind {
-; LA32NOPIC-LABEL: ldx_hu:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    alsl.w $a1, $a1, $a0, 1
-; LA32NOPIC-NEXT:    ld.hu $a1, $a1, 0
-; LA32NOPIC-NEXT:    ld.hu $a0, $a0, 0
-; LA32NOPIC-NEXT:    add.w $a0, $a1, $a0
-; LA32NOPIC-NEXT:    sltu $a1, $a0, $a1
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ldx_hu:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    alsl.w $a1, $a1, $a0, 1
-; LA32PIC-NEXT:    ld.hu $a1, $a1, 0
-; LA32PIC-NEXT:    ld.hu $a0, $a0, 0
-; LA32PIC-NEXT:    add.w $a0, $a1, $a0
-; LA32PIC-NEXT:    sltu $a1, $a0, $a1
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ldx_hu:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    slli.w $a1, $a1, 1
+; LA32RNOPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ld.hu $a1, $a1, 0
+; LA32RNOPIC-NEXT:    ld.hu $a0, $a0, 0
+; LA32RNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ldx_hu:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    alsl.w $a1, $a1, $a0, 1
+; LA32SNOPIC-NEXT:    ld.hu $a1, $a1, 0
+; LA32SNOPIC-NEXT:    ld.hu $a0, $a0, 0
+; LA32SNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ldx_hu:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    slli.w $a1, $a1, 1
+; LA32RPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RPIC-NEXT:    ld.hu $a1, $a1, 0
+; LA32RPIC-NEXT:    ld.hu $a0, $a0, 0
+; LA32RPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ldx_hu:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    alsl.w $a1, $a1, $a0, 1
+; LA32SPIC-NEXT:    ld.hu $a1, $a1, 0
+; LA32SPIC-NEXT:    ld.hu $a0, $a0, 0
+; LA32SPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ldx_hu:
 ; LA64NOPIC:       # %bb.0:
@@ -614,23 +886,43 @@ define i64 @ldx_hu(ptr %a, i64 %idx) nounwind {
 }
 
 define i64 @ldx_wu(ptr %a, i64 %idx) nounwind {
-; LA32NOPIC-LABEL: ldx_wu:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    alsl.w $a1, $a1, $a0, 2
-; LA32NOPIC-NEXT:    ld.w $a1, $a1, 0
-; LA32NOPIC-NEXT:    ld.w $a0, $a0, 0
-; LA32NOPIC-NEXT:    add.w $a0, $a1, $a0
-; LA32NOPIC-NEXT:    sltu $a1, $a0, $a1
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ldx_wu:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    alsl.w $a1, $a1, $a0, 2
-; LA32PIC-NEXT:    ld.w $a1, $a1, 0
-; LA32PIC-NEXT:    ld.w $a0, $a0, 0
-; LA32PIC-NEXT:    add.w $a0, $a1, $a0
-; LA32PIC-NEXT:    sltu $a1, $a0, $a1
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ldx_wu:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    slli.w $a1, $a1, 2
+; LA32RNOPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ld.w $a1, $a1, 0
+; LA32RNOPIC-NEXT:    ld.w $a0, $a0, 0
+; LA32RNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ldx_wu:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    alsl.w $a1, $a1, $a0, 2
+; LA32SNOPIC-NEXT:    ld.w $a1, $a1, 0
+; LA32SNOPIC-NEXT:    ld.w $a0, $a0, 0
+; LA32SNOPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SNOPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ldx_wu:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    slli.w $a1, $a1, 2
+; LA32RPIC-NEXT:    add.w $a1, $a0, $a1
+; LA32RPIC-NEXT:    ld.w $a1, $a1, 0
+; LA32RPIC-NEXT:    ld.w $a0, $a0, 0
+; LA32RPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32RPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ldx_wu:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    alsl.w $a1, $a1, $a0, 2
+; LA32SPIC-NEXT:    ld.w $a1, $a1, 0
+; LA32SPIC-NEXT:    ld.w $a0, $a0, 0
+; LA32SPIC-NEXT:    add.w $a0, $a1, $a0
+; LA32SPIC-NEXT:    sltu $a1, $a0, $a1
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ldx_wu:
 ; LA64NOPIC:       # %bb.0:
@@ -695,21 +987,37 @@ define void @st_w(ptr %a, i32 %b) nounwind {
 }
 
 define void @st_d(ptr %a, i64 %b) nounwind {
-; LA32NOPIC-LABEL: st_d:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    st.w $a2, $a0, 4
-; LA32NOPIC-NEXT:    st.w $a1, $a0, 0
-; LA32NOPIC-NEXT:    st.w $a2, $a0, 68
-; LA32NOPIC-NEXT:    st.w $a1, $a0, 64
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: st_d:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    st.w $a2, $a0, 4
-; LA32PIC-NEXT:    st.w $a1, $a0, 0
-; LA32PIC-NEXT:    st.w $a2, $a0, 68
-; LA32PIC-NEXT:    st.w $a1, $a0, 64
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: st_d:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    st.w $a2, $a0, 4
+; LA32RNOPIC-NEXT:    st.w $a1, $a0, 0
+; LA32RNOPIC-NEXT:    st.w $a2, $a0, 68
+; LA32RNOPIC-NEXT:    st.w $a1, $a0, 64
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: st_d:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    st.w $a2, $a0, 4
+; LA32SNOPIC-NEXT:    st.w $a1, $a0, 0
+; LA32SNOPIC-NEXT:    st.w $a2, $a0, 68
+; LA32SNOPIC-NEXT:    st.w $a1, $a0, 64
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: st_d:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    st.w $a2, $a0, 4
+; LA32RPIC-NEXT:    st.w $a1, $a0, 0
+; LA32RPIC-NEXT:    st.w $a2, $a0, 68
+; LA32RPIC-NEXT:    st.w $a1, $a0, 64
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: st_d:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    st.w $a2, $a0, 4
+; LA32SPIC-NEXT:    st.w $a1, $a0, 0
+; LA32SPIC-NEXT:    st.w $a2, $a0, 68
+; LA32SPIC-NEXT:    st.w $a1, $a0, 64
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: st_d:
 ; LA64NOPIC:       # %bb.0:
@@ -729,17 +1037,29 @@ define void @st_d(ptr %a, i64 %b) nounwind {
 }
 
 define void @stx_b(ptr %dst, i64 %idx, i8 %val) nounwind {
-; LA32NOPIC-LABEL: stx_b:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    add.w $a0, $a0, $a1
-; LA32NOPIC-NEXT:    st.b $a3, $a0, 0
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: stx_b:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    add.w $a0, $a0, $a1
-; LA32PIC-NEXT:    st.b $a3, $a0, 0
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: stx_b:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32RNOPIC-NEXT:    st.b $a3, $a0, 0
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: stx_b:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32SNOPIC-NEXT:    st.b $a3, $a0, 0
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: stx_b:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32RPIC-NEXT:    st.b $a3, $a0, 0
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: stx_b:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32SPIC-NEXT:    st.b $a3, $a0, 0
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: stx_b:
 ; LA64NOPIC:       # %bb.0:
@@ -756,17 +1076,31 @@ define void @stx_b(ptr %dst, i64 %idx, i8 %val) nounwind {
 }
 
 define void @stx_h(ptr %dst, i64 %idx, i16 %val) nounwind {
-; LA32NOPIC-LABEL: stx_h:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    alsl.w $a0, $a1, $a0, 1
-; LA32NOPIC-NEXT:    st.h $a3, $a0, 0
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: stx_h:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    alsl.w $a0, $a1, $a0, 1
-; LA32PIC-NEXT:    st.h $a3, $a0, 0
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: stx_h:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    slli.w $a1, $a1, 1
+; LA32RNOPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32RNOPIC-NEXT:    st.h $a3, $a0, 0
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: stx_h:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    alsl.w $a0, $a1, $a0, 1
+; LA32SNOPIC-NEXT:    st.h $a3, $a0, 0
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: stx_h:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    slli.w $a1, $a1, 1
+; LA32RPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32RPIC-NEXT:    st.h $a3, $a0, 0
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: stx_h:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    alsl.w $a0, $a1, $a0, 1
+; LA32SPIC-NEXT:    st.h $a3, $a0, 0
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: stx_h:
 ; LA64NOPIC:       # %bb.0:
@@ -785,17 +1119,31 @@ define void @stx_h(ptr %dst, i64 %idx, i16 %val) nounwind {
 }
 
 define void @stx_w(ptr %dst, i64 %idx, i32 %val) nounwind {
-; LA32NOPIC-LABEL: stx_w:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    alsl.w $a0, $a1, $a0, 2
-; LA32NOPIC-NEXT:    st.w $a3, $a0, 0
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: stx_w:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    alsl.w $a0, $a1, $a0, 2
-; LA32PIC-NEXT:    st.w $a3, $a0, 0
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: stx_w:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    slli.w $a1, $a1, 2
+; LA32RNOPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32RNOPIC-NEXT:    st.w $a3, $a0, 0
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: stx_w:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    alsl.w $a0, $a1, $a0, 2
+; LA32SNOPIC-NEXT:    st.w $a3, $a0, 0
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: stx_w:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    slli.w $a1, $a1, 2
+; LA32RPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32RPIC-NEXT:    st.w $a3, $a0, 0
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: stx_w:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    alsl.w $a0, $a1, $a0, 2
+; LA32SPIC-NEXT:    st.w $a3, $a0, 0
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: stx_w:
 ; LA64NOPIC:       # %bb.0:
@@ -814,19 +1162,35 @@ define void @stx_w(ptr %dst, i64 %idx, i32 %val) nounwind {
 }
 
 define void @stx_d(ptr %dst, i64 %idx, i64 %val) nounwind {
-; LA32NOPIC-LABEL: stx_d:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    alsl.w $a0, $a1, $a0, 3
-; LA32NOPIC-NEXT:    st.w $a4, $a0, 4
-; LA32NOPIC-NEXT:    st.w $a3, $a0, 0
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: stx_d:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    alsl.w $a0, $a1, $a0, 3
-; LA32PIC-NEXT:    st.w $a4, $a0, 4
-; LA32PIC-NEXT:    st.w $a3, $a0, 0
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: stx_d:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    slli.w $a1, $a1, 3
+; LA32RNOPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32RNOPIC-NEXT:    st.w $a4, $a0, 4
+; LA32RNOPIC-NEXT:    st.w $a3, $a0, 0
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: stx_d:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    alsl.w $a0, $a1, $a0, 3
+; LA32SNOPIC-NEXT:    st.w $a4, $a0, 4
+; LA32SNOPIC-NEXT:    st.w $a3, $a0, 0
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: stx_d:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    slli.w $a1, $a1, 3
+; LA32RPIC-NEXT:    add.w $a0, $a0, $a1
+; LA32RPIC-NEXT:    st.w $a4, $a0, 4
+; LA32RPIC-NEXT:    st.w $a3, $a0, 0
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: stx_d:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    alsl.w $a0, $a1, $a0, 3
+; LA32SPIC-NEXT:    st.w $a4, $a0, 4
+; LA32SPIC-NEXT:    st.w $a3, $a0, 0
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: stx_d:
 ; LA64NOPIC:       # %bb.0:
@@ -846,27 +1210,50 @@ define void @stx_d(ptr %dst, i64 %idx, i64 %val) nounwind {
 
 ;; Check load from and store to an i1 location.
 define i64 @load_sext_zext_anyext_i1(ptr %a) nounwind {
-; LA32NOPIC-LABEL: load_sext_zext_anyext_i1:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    ld.bu $a1, $a0, 1
-; LA32NOPIC-NEXT:    ld.bu $a3, $a0, 2
-; LA32NOPIC-NEXT:    sub.w $a2, $a3, $a1
-; LA32NOPIC-NEXT:    ld.b $zero, $a0, 0
-; LA32NOPIC-NEXT:    sltu $a0, $a3, $a1
-; LA32NOPIC-NEXT:    sub.w $a1, $zero, $a0
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: load_sext_zext_anyext_i1:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    ld.bu $a1, $a0, 1
-; LA32PIC-NEXT:    ld.bu $a3, $a0, 2
-; LA32PIC-NEXT:    sub.w $a2, $a3, $a1
-; LA32PIC-NEXT:    ld.b $zero, $a0, 0
-; LA32PIC-NEXT:    sltu $a0, $a3, $a1
-; LA32PIC-NEXT:    sub.w $a1, $zero, $a0
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    ret
+  ;; sextload i1
+; LA32RNOPIC-LABEL: load_sext_zext_anyext_i1:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    ld.bu $a1, $a0, 1
+; LA32RNOPIC-NEXT:    ld.bu $a3, $a0, 2
+; LA32RNOPIC-NEXT:    sub.w $a2, $a3, $a1
+; LA32RNOPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32RNOPIC-NEXT:    sltu $a0, $a3, $a1
+; LA32RNOPIC-NEXT:    sub.w $a1, $zero, $a0
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: load_sext_zext_anyext_i1:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    ld.bu $a1, $a0, 1
+; LA32SNOPIC-NEXT:    ld.bu $a3, $a0, 2
+; LA32SNOPIC-NEXT:    sub.w $a2, $a3, $a1
+; LA32SNOPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32SNOPIC-NEXT:    sltu $a0, $a3, $a1
+; LA32SNOPIC-NEXT:    sub.w $a1, $zero, $a0
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: load_sext_zext_anyext_i1:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    ld.bu $a1, $a0, 1
+; LA32RPIC-NEXT:    ld.bu $a3, $a0, 2
+; LA32RPIC-NEXT:    sub.w $a2, $a3, $a1
+; LA32RPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32RPIC-NEXT:    sltu $a0, $a3, $a1
+; LA32RPIC-NEXT:    sub.w $a1, $zero, $a0
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: load_sext_zext_anyext_i1:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    ld.bu $a1, $a0, 1
+; LA32SPIC-NEXT:    ld.bu $a3, $a0, 2
+; LA32SPIC-NEXT:    sub.w $a2, $a3, $a1
+; LA32SPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32SPIC-NEXT:    sltu $a0, $a3, $a1
+; LA32SPIC-NEXT:    sub.w $a1, $zero, $a0
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: load_sext_zext_anyext_i1:
 ; LA64NOPIC:       # %bb.0:
@@ -883,7 +1270,6 @@ define i64 @load_sext_zext_anyext_i1(ptr %a) nounwind {
 ; LA64PIC-NEXT:    ld.b $zero, $a0, 0
 ; LA64PIC-NEXT:    sub.d $a0, $a2, $a1
 ; LA64PIC-NEXT:    ret
-  ;; sextload i1
   %1 = getelementptr i1, ptr %a, i64 1
   %2 = load i1, ptr %1
   %3 = sext i1 %2 to i64
@@ -898,21 +1284,38 @@ define i64 @load_sext_zext_anyext_i1(ptr %a) nounwind {
 }
 
 define i16 @load_sext_zext_anyext_i1_i16(ptr %a) nounwind {
-; LA32NOPIC-LABEL: load_sext_zext_anyext_i1_i16:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    ld.bu $a1, $a0, 1
-; LA32NOPIC-NEXT:    ld.bu $a2, $a0, 2
-; LA32NOPIC-NEXT:    ld.b $zero, $a0, 0
-; LA32NOPIC-NEXT:    sub.w $a0, $a2, $a1
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: load_sext_zext_anyext_i1_i16:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    ld.bu $a1, $a0, 1
-; LA32PIC-NEXT:    ld.bu $a2, $a0, 2
-; LA32PIC-NEXT:    ld.b $zero, $a0, 0
-; LA32PIC-NEXT:    sub.w $a0, $a2, $a1
-; LA32PIC-NEXT:    ret
+  ;; sextload i1
+; LA32RNOPIC-LABEL: load_sext_zext_anyext_i1_i16:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    ld.bu $a1, $a0, 1
+; LA32RNOPIC-NEXT:    ld.bu $a2, $a0, 2
+; LA32RNOPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32RNOPIC-NEXT:    sub.w $a0, $a2, $a1
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: load_sext_zext_anyext_i1_i16:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    ld.bu $a1, $a0, 1
+; LA32SNOPIC-NEXT:    ld.bu $a2, $a0, 2
+; LA32SNOPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32SNOPIC-NEXT:    sub.w $a0, $a2, $a1
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: load_sext_zext_anyext_i1_i16:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    ld.bu $a1, $a0, 1
+; LA32RPIC-NEXT:    ld.bu $a2, $a0, 2
+; LA32RPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32RPIC-NEXT:    sub.w $a0, $a2, $a1
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: load_sext_zext_anyext_i1_i16:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    ld.bu $a1, $a0, 1
+; LA32SPIC-NEXT:    ld.bu $a2, $a0, 2
+; LA32SPIC-NEXT:    ld.b $zero, $a0, 0
+; LA32SPIC-NEXT:    sub.w $a0, $a2, $a1
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: load_sext_zext_anyext_i1_i16:
 ; LA64NOPIC:       # %bb.0:
@@ -929,7 +1332,6 @@ define i16 @load_sext_zext_anyext_i1_i16(ptr %a) nounwind {
 ; LA64PIC-NEXT:    ld.b $zero, $a0, 0
 ; LA64PIC-NEXT:    sub.d $a0, $a2, $a1
 ; LA64PIC-NEXT:    ret
-  ;; sextload i1
   %1 = getelementptr i1, ptr %a, i64 1
   %2 = load i1, ptr %1
   %3 = sext i1 %2 to i16
@@ -944,31 +1346,57 @@ define i16 @load_sext_zext_anyext_i1_i16(ptr %a) nounwind {
 }
 
 define i64 @ld_sd_constant(i64 %a) nounwind {
-; LA32NOPIC-LABEL: ld_sd_constant:
-; LA32NOPIC:       # %bb.0:
-; LA32NOPIC-NEXT:    lu12i.w $a3, -136485
-; LA32NOPIC-NEXT:    ori $a4, $a3, 3823
-; LA32NOPIC-NEXT:    ld.w $a2, $a4, 0
-; LA32NOPIC-NEXT:    ori $a5, $a3, 3827
-; LA32NOPIC-NEXT:    ld.w $a3, $a5, 0
-; LA32NOPIC-NEXT:    st.w $a0, $a4, 0
-; LA32NOPIC-NEXT:    st.w $a1, $a5, 0
-; LA32NOPIC-NEXT:    move $a0, $a2
-; LA32NOPIC-NEXT:    move $a1, $a3
-; LA32NOPIC-NEXT:    ret
-;
-; LA32PIC-LABEL: ld_sd_constant:
-; LA32PIC:       # %bb.0:
-; LA32PIC-NEXT:    lu12i.w $a3, -136485
-; LA32PIC-NEXT:    ori $a4, $a3, 3823
-; LA32PIC-NEXT:    ld.w $a2, $a4, 0
-; LA32PIC-NEXT:    ori $a5, $a3, 3827
-; LA32PIC-NEXT:    ld.w $a3, $a5, 0
-; LA32PIC-NEXT:    st.w $a0, $a4, 0
-; LA32PIC-NEXT:    st.w $a1, $a5, 0
-; LA32PIC-NEXT:    move $a0, $a2
-; LA32PIC-NEXT:    move $a1, $a3
-; LA32PIC-NEXT:    ret
+; LA32RNOPIC-LABEL: ld_sd_constant:
+; LA32RNOPIC:       # %bb.0:
+; LA32RNOPIC-NEXT:    lu12i.w $a3, -136485
+; LA32RNOPIC-NEXT:    ori $a4, $a3, 3823
+; LA32RNOPIC-NEXT:    ld.w $a2, $a4, 0
+; LA32RNOPIC-NEXT:    ori $a5, $a3, 3827
+; LA32RNOPIC-NEXT:    ld.w $a3, $a5, 0
+; LA32RNOPIC-NEXT:    st.w $a0, $a4, 0
+; LA32RNOPIC-NEXT:    st.w $a1, $a5, 0
+; LA32RNOPIC-NEXT:    move $a0, $a2
+; LA32RNOPIC-NEXT:    move $a1, $a3
+; LA32RNOPIC-NEXT:    ret
+;
+; LA32SNOPIC-LABEL: ld_sd_constant:
+; LA32SNOPIC:       # %bb.0:
+; LA32SNOPIC-NEXT:    lu12i.w $a3, -136485
+; LA32SNOPIC-NEXT:    ori $a4, $a3, 3823
+; LA32SNOPIC-NEXT:    ld.w $a2, $a4, 0
+; LA32SNOPIC-NEXT:    ori $a5, $a3, 3827
+; LA32SNOPIC-NEXT:    ld.w $a3, $a5, 0
+; LA32SNOPIC-NEXT:    st.w $a0, $a4, 0
+; LA32SNOPIC-NEXT:    st.w $a1, $a5, 0
+; LA32SNOPIC-NEXT:    move $a0, $a2
+; LA32SNOPIC-NEXT:    move $a1, $a3
+; LA32SNOPIC-NEXT:    ret
+;
+; LA32RPIC-LABEL: ld_sd_constant:
+; LA32RPIC:       # %bb.0:
+; LA32RPIC-NEXT:    lu12i.w $a3, -136485
+; LA32RPIC-NEXT:    ori $a4, $a3, 3823
+; LA32RPIC-NEXT:    ld.w $a2, $a4, 0
+; LA32RPIC-NEXT:    ori $a5, $a3, 3827
+; LA32RPIC-NEXT:    ld.w $a3, $a5, 0
+; LA32RPIC-NEXT:    st.w $a0, $a4, 0
+; LA32RPIC-NEXT:    st.w $a1, $a5, 0
+; LA32RPIC-NEXT:    move $a0, $a2
+; LA32RPIC-NEXT:    move $a1, $a3
+; LA32RPIC-NEXT:    ret
+;
+; LA32SPIC-LABEL: ld_sd_constant:
+; LA32SPIC:       # %bb.0:
+; LA32SPIC-NEXT:    lu12i.w $a3, -136485
+; LA32SPIC-NEXT:    ori $a4, $a3, 3823
+; LA32SPIC-NEXT:    ld.w $a2, $a4, 0
+; LA32SPIC-NEXT:    ori $a5, $a3, 3827
+; LA32SPIC-NEXT:    ld.w $a3, $a5, 0
+; LA32SPIC-NEXT:    st.w $a0, $a4, 0
+; LA32SPIC-NEXT:    st.w $a1, $a5, 0
+; LA32SPIC-NEXT:    move $a0, $a2
+; LA32SPIC-NEXT:    move $a1, $a3
+; LA32SPIC-NEXT:    ret
 ;
 ; LA64NOPIC-LABEL: ld_sd_constant:
 ; LA64NOPIC:       # %bb.0:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/lshr.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/lshr.ll
index ce7d1ec93d9d7..95c341890b714 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/lshr.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/lshr.ll
@@ -1,13 +1,18 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 ;; Exercise the 'lshr' LLVM IR: https://llvm.org/docs/LangRef.html#lshr-instruction
 
 define i1 @lshr_i1(i1 %x, i1 %y) {
-; LA32-LABEL: lshr_i1:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i1:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i1:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i1:
 ; LA64:       # %bb.0:
@@ -17,11 +22,17 @@ define i1 @lshr_i1(i1 %x, i1 %y) {
 }
 
 define i8 @lshr_i8(i8 %x, i8 %y) {
-; LA32-LABEL: lshr_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    srl.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 255
+; LA32R-NEXT:    srl.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    srl.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i8:
 ; LA64:       # %bb.0:
@@ -33,11 +44,19 @@ define i8 @lshr_i8(i8 %x, i8 %y) {
 }
 
 define i16 @lshr_i16(i16 %x, i16 %y) {
-; LA32-LABEL: lshr_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    srl.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    srl.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    srl.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i16:
 ; LA64:       # %bb.0:
@@ -49,10 +68,15 @@ define i16 @lshr_i16(i16 %x, i16 %y) {
 }
 
 define i32 @lshr_i32(i32 %x, i32 %y) {
-; LA32-LABEL: lshr_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srl.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srl.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srl.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i32:
 ; LA64:       # %bb.0:
@@ -63,23 +87,43 @@ define i32 @lshr_i32(i32 %x, i32 %y) {
 }
 
 define i64 @lshr_i64(i64 %x, i64 %y) {
-; LA32-LABEL: lshr_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srl.w $a0, $a0, $a2
-; LA32-NEXT:    xori $a3, $a2, 31
-; LA32-NEXT:    slli.w $a4, $a1, 1
-; LA32-NEXT:    sll.w $a3, $a4, $a3
-; LA32-NEXT:    or $a0, $a0, $a3
-; LA32-NEXT:    addi.w $a3, $a2, -32
-; LA32-NEXT:    slti $a4, $a3, 0
-; LA32-NEXT:    maskeqz $a0, $a0, $a4
-; LA32-NEXT:    srl.w $a5, $a1, $a3
-; LA32-NEXT:    masknez $a4, $a5, $a4
-; LA32-NEXT:    or $a0, $a0, $a4
-; LA32-NEXT:    srl.w $a1, $a1, $a2
-; LA32-NEXT:    srai.w $a2, $a3, 31
-; LA32-NEXT:    and $a1, $a2, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a3, $a2, -32
+; LA32R-NEXT:    bltz $a3, .LBB4_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    srl.w $a0, $a1, $a3
+; LA32R-NEXT:    b .LBB4_3
+; LA32R-NEXT:  .LBB4_2:
+; LA32R-NEXT:    srl.w $a0, $a0, $a2
+; LA32R-NEXT:    xori $a4, $a2, 31
+; LA32R-NEXT:    slli.w $a5, $a1, 1
+; LA32R-NEXT:    sll.w $a4, $a5, $a4
+; LA32R-NEXT:    or $a0, $a0, $a4
+; LA32R-NEXT:  .LBB4_3:
+; LA32R-NEXT:    slti $a3, $a3, 0
+; LA32R-NEXT:    sub.w $a3, $zero, $a3
+; LA32R-NEXT:    srl.w $a1, $a1, $a2
+; LA32R-NEXT:    and $a1, $a3, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srl.w $a0, $a0, $a2
+; LA32S-NEXT:    xori $a3, $a2, 31
+; LA32S-NEXT:    slli.w $a4, $a1, 1
+; LA32S-NEXT:    sll.w $a3, $a4, $a3
+; LA32S-NEXT:    or $a0, $a0, $a3
+; LA32S-NEXT:    addi.w $a3, $a2, -32
+; LA32S-NEXT:    slti $a4, $a3, 0
+; LA32S-NEXT:    maskeqz $a0, $a0, $a4
+; LA32S-NEXT:    srl.w $a5, $a1, $a3
+; LA32S-NEXT:    masknez $a4, $a5, $a4
+; LA32S-NEXT:    or $a0, $a0, $a4
+; LA32S-NEXT:    srl.w $a1, $a1, $a2
+; LA32S-NEXT:    srai.w $a2, $a3, 31
+; LA32S-NEXT:    and $a1, $a2, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i64:
 ; LA64:       # %bb.0:
@@ -90,9 +134,13 @@ define i64 @lshr_i64(i64 %x, i64 %y) {
 }
 
 define i1 @lshr_i1_3(i1 %x) {
-; LA32-LABEL: lshr_i1_3:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i1_3:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i1_3:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i1_3:
 ; LA64:       # %bb.0:
@@ -102,10 +150,16 @@ define i1 @lshr_i1_3(i1 %x) {
 }
 
 define i8 @lshr_i8_3(i8 %x) {
-; LA32-LABEL: lshr_i8_3:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 7, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i8_3:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 248
+; LA32R-NEXT:    srli.w $a0, $a0, 3
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i8_3:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 7, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i8_3:
 ; LA64:       # %bb.0:
@@ -116,10 +170,18 @@ define i8 @lshr_i8_3(i8 %x) {
 }
 
 define i16 @lshr_i16_3(i16 %x) {
-; LA32-LABEL: lshr_i16_3:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i16_3:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4088
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    srli.w $a0, $a0, 3
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i16_3:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i16_3:
 ; LA64:       # %bb.0:
@@ -130,10 +192,15 @@ define i16 @lshr_i16_3(i16 %x) {
 }
 
 define i32 @lshr_i32_3(i32 %x) {
-; LA32-LABEL: lshr_i32_3:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srli.w $a0, $a0, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i32_3:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a0, $a0, 3
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i32_3:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srli.w $a0, $a0, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i32_3:
 ; LA64:       # %bb.0:
@@ -144,13 +211,21 @@ define i32 @lshr_i32_3(i32 %x) {
 }
 
 define i64 @lshr_i64_3(i64 %x) {
-; LA32-LABEL: lshr_i64_3:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a2, $a1, 29
-; LA32-NEXT:    srli.w $a0, $a0, 3
-; LA32-NEXT:    or $a0, $a0, $a2
-; LA32-NEXT:    srli.w $a1, $a1, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: lshr_i64_3:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a2, $a1, 29
+; LA32R-NEXT:    srli.w $a0, $a0, 3
+; LA32R-NEXT:    or $a0, $a0, $a2
+; LA32R-NEXT:    srli.w $a1, $a1, 3
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: lshr_i64_3:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a2, $a1, 29
+; LA32S-NEXT:    srli.w $a0, $a0, 3
+; LA32S-NEXT:    or $a0, $a0, $a2
+; LA32S-NEXT:    srli.w $a1, $a1, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: lshr_i64_3:
 ; LA64:       # %bb.0:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/mul.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/mul.ll
index 3a0cfd00940c5..6d048aeca766c 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/mul.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/mul.ll
@@ -1,14 +1,20 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 ;; Exercise the 'mul' LLVM IR: https://llvm.org/docs/LangRef.html#mul-instruction
 
 define i1 @mul_i1(i1 %a, i1 %b) {
-; LA32-LABEL: mul_i1:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    and $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i1:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i1:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    and $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i1:
 ; LA64:       # %bb.0: # %entry
@@ -20,10 +26,15 @@ entry:
 }
 
 define i8 @mul_i8(i8 %a, i8 %b) {
-; LA32-LABEL: mul_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mul.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mul.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i8:
 ; LA64:       # %bb.0: # %entry
@@ -35,10 +46,15 @@ entry:
 }
 
 define i16 @mul_i16(i16 %a, i16 %b) {
-; LA32-LABEL: mul_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mul.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mul.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i16:
 ; LA64:       # %bb.0: # %entry
@@ -50,10 +66,15 @@ entry:
 }
 
 define i32 @mul_i32(i32 %a, i32 %b) {
-; LA32-LABEL: mul_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mul.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mul.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32:
 ; LA64:       # %bb.0: # %entry
@@ -65,15 +86,25 @@ entry:
 }
 
 define i64 @mul_i64(i64 %a, i64 %b) {
-; LA32-LABEL: mul_i64:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mul.w $a3, $a0, $a3
-; LA32-NEXT:    mulh.wu $a4, $a0, $a2
-; LA32-NEXT:    add.w $a3, $a4, $a3
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mul.w $a3, $a0, $a3
+; LA32R-NEXT:    mulh.wu $a4, $a0, $a2
+; LA32R-NEXT:    add.w $a3, $a4, $a3
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mul.w $a3, $a0, $a3
+; LA32S-NEXT:    mulh.wu $a4, $a0, $a2
+; LA32S-NEXT:    add.w $a3, $a4, $a3
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64:
 ; LA64:       # %bb.0: # %entry
@@ -85,13 +116,21 @@ entry:
 }
 
 define i64 @mul_pow2(i64 %a) {
-; LA32-LABEL: mul_pow2:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srli.w $a2, $a0, 29
-; LA32-NEXT:    slli.w $a1, $a1, 3
-; LA32-NEXT:    or $a1, $a1, $a2
-; LA32-NEXT:    slli.w $a0, $a0, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_pow2:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a0, 29
+; LA32R-NEXT:    slli.w $a1, $a1, 3
+; LA32R-NEXT:    or $a1, $a1, $a2
+; LA32R-NEXT:    slli.w $a0, $a0, 3
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_pow2:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srli.w $a2, $a0, 29
+; LA32S-NEXT:    slli.w $a1, $a1, 3
+; LA32S-NEXT:    or $a1, $a1, $a2
+; LA32S-NEXT:    slli.w $a0, $a0, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_pow2:
 ; LA64:       # %bb.0:
@@ -102,14 +141,25 @@ define i64 @mul_pow2(i64 %a) {
 }
 
 define i64 @mul_p5(i64 %a) {
-; LA32-LABEL: mul_p5:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 5
-; LA32-NEXT:    mulh.wu $a2, $a0, $a2
-; LA32-NEXT:    alsl.w $a1, $a1, $a1, 2
-; LA32-NEXT:    add.w $a1, $a2, $a1
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_p5:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 5
+; LA32R-NEXT:    mulh.wu $a2, $a0, $a2
+; LA32R-NEXT:    slli.w $a3, $a1, 2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    add.w $a1, $a2, $a1
+; LA32R-NEXT:    slli.w $a2, $a0, 2
+; LA32R-NEXT:    add.w $a0, $a2, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_p5:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 5
+; LA32S-NEXT:    mulh.wu $a2, $a0, $a2
+; LA32S-NEXT:    alsl.w $a1, $a1, $a1, 2
+; LA32S-NEXT:    add.w $a1, $a2, $a1
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_p5:
 ; LA64:       # %bb.0:
@@ -120,10 +170,15 @@ define i64 @mul_p5(i64 %a) {
 }
 
 define i32 @mulh_w(i32 %a, i32 %b) {
-; LA32-LABEL: mulh_w:
-; LA32:       # %bb.0:
-; LA32-NEXT:    mulh.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mulh_w:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    mulh.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mulh_w:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    mulh.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mulh_w:
 ; LA64:       # %bb.0:
@@ -139,10 +194,15 @@ define i32 @mulh_w(i32 %a, i32 %b) {
 }
 
 define i32 @mulh_wu(i32 %a, i32 %b) {
-; LA32-LABEL: mulh_wu:
-; LA32:       # %bb.0:
-; LA32-NEXT:    mulh.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mulh_wu:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    mulh.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mulh_wu:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    mulh.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mulh_wu:
 ; LA64:       # %bb.0:
@@ -158,49 +218,93 @@ define i32 @mulh_wu(i32 %a, i32 %b) {
 }
 
 define i64 @mulh_d(i64 %a, i64 %b) {
-; LA32-LABEL: mulh_d:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srai.w $a5, $a1, 31
-; LA32-NEXT:    srai.w $a6, $a3, 31
-; LA32-NEXT:    mulh.wu $a4, $a0, $a2
-; LA32-NEXT:    mul.w $a7, $a1, $a2
-; LA32-NEXT:    add.w $a4, $a7, $a4
-; LA32-NEXT:    sltu $a7, $a4, $a7
-; LA32-NEXT:    mulh.wu $t0, $a1, $a2
-; LA32-NEXT:    add.w $a7, $t0, $a7
-; LA32-NEXT:    mul.w $t0, $a0, $a3
-; LA32-NEXT:    add.w $a4, $t0, $a4
-; LA32-NEXT:    sltu $a4, $a4, $t0
-; LA32-NEXT:    mulh.wu $t0, $a0, $a3
-; LA32-NEXT:    add.w $a4, $t0, $a4
-; LA32-NEXT:    add.w $t0, $a7, $a4
-; LA32-NEXT:    mul.w $t1, $a1, $a3
-; LA32-NEXT:    add.w $t2, $t1, $t0
-; LA32-NEXT:    mul.w $t3, $a2, $a5
-; LA32-NEXT:    mul.w $t4, $a6, $a0
-; LA32-NEXT:    add.w $t5, $t4, $t3
-; LA32-NEXT:    add.w $a4, $t2, $t5
-; LA32-NEXT:    sltu $t6, $a4, $t2
-; LA32-NEXT:    sltu $t1, $t2, $t1
-; LA32-NEXT:    sltu $a7, $t0, $a7
-; LA32-NEXT:    mulh.wu $t0, $a1, $a3
-; LA32-NEXT:    add.w $a7, $t0, $a7
-; LA32-NEXT:    add.w $a7, $a7, $t1
-; LA32-NEXT:    mulh.wu $a2, $a2, $a5
-; LA32-NEXT:    add.w $a2, $a2, $t3
-; LA32-NEXT:    mul.w $a3, $a3, $a5
-; LA32-NEXT:    add.w $a2, $a2, $a3
-; LA32-NEXT:    mul.w $a1, $a6, $a1
-; LA32-NEXT:    mulh.wu $a0, $a6, $a0
-; LA32-NEXT:    add.w $a0, $a0, $a1
-; LA32-NEXT:    add.w $a0, $a0, $t4
-; LA32-NEXT:    add.w $a0, $a0, $a2
-; LA32-NEXT:    sltu $a1, $t5, $t4
-; LA32-NEXT:    add.w $a0, $a0, $a1
-; LA32-NEXT:    add.w $a0, $a7, $a0
-; LA32-NEXT:    add.w $a1, $a0, $t6
-; LA32-NEXT:    move $a0, $a4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mulh_d:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srai.w $a5, $a1, 31
+; LA32R-NEXT:    srai.w $a6, $a3, 31
+; LA32R-NEXT:    mulh.wu $a4, $a0, $a2
+; LA32R-NEXT:    mul.w $a7, $a1, $a2
+; LA32R-NEXT:    add.w $a4, $a7, $a4
+; LA32R-NEXT:    sltu $a7, $a4, $a7
+; LA32R-NEXT:    mulh.wu $t0, $a1, $a2
+; LA32R-NEXT:    add.w $a7, $t0, $a7
+; LA32R-NEXT:    mul.w $t0, $a0, $a3
+; LA32R-NEXT:    add.w $a4, $t0, $a4
+; LA32R-NEXT:    sltu $a4, $a4, $t0
+; LA32R-NEXT:    mulh.wu $t0, $a0, $a3
+; LA32R-NEXT:    add.w $a4, $t0, $a4
+; LA32R-NEXT:    add.w $t0, $a7, $a4
+; LA32R-NEXT:    mul.w $t1, $a1, $a3
+; LA32R-NEXT:    add.w $t2, $t1, $t0
+; LA32R-NEXT:    mul.w $t3, $a2, $a5
+; LA32R-NEXT:    mul.w $t4, $a6, $a0
+; LA32R-NEXT:    add.w $t5, $t4, $t3
+; LA32R-NEXT:    add.w $a4, $t2, $t5
+; LA32R-NEXT:    sltu $t6, $a4, $t2
+; LA32R-NEXT:    sltu $t1, $t2, $t1
+; LA32R-NEXT:    sltu $a7, $t0, $a7
+; LA32R-NEXT:    mulh.wu $t0, $a1, $a3
+; LA32R-NEXT:    add.w $a7, $t0, $a7
+; LA32R-NEXT:    add.w $a7, $a7, $t1
+; LA32R-NEXT:    mulh.wu $a2, $a2, $a5
+; LA32R-NEXT:    add.w $a2, $a2, $t3
+; LA32R-NEXT:    mul.w $a3, $a3, $a5
+; LA32R-NEXT:    add.w $a2, $a2, $a3
+; LA32R-NEXT:    mul.w $a1, $a6, $a1
+; LA32R-NEXT:    mulh.wu $a0, $a6, $a0
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a0, $t4
+; LA32R-NEXT:    add.w $a0, $a0, $a2
+; LA32R-NEXT:    sltu $a1, $t5, $t4
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    add.w $a0, $a7, $a0
+; LA32R-NEXT:    add.w $a1, $a0, $t6
+; LA32R-NEXT:    move $a0, $a4
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mulh_d:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srai.w $a5, $a1, 31
+; LA32S-NEXT:    srai.w $a6, $a3, 31
+; LA32S-NEXT:    mulh.wu $a4, $a0, $a2
+; LA32S-NEXT:    mul.w $a7, $a1, $a2
+; LA32S-NEXT:    add.w $a4, $a7, $a4
+; LA32S-NEXT:    sltu $a7, $a4, $a7
+; LA32S-NEXT:    mulh.wu $t0, $a1, $a2
+; LA32S-NEXT:    add.w $a7, $t0, $a7
+; LA32S-NEXT:    mul.w $t0, $a0, $a3
+; LA32S-NEXT:    add.w $a4, $t0, $a4
+; LA32S-NEXT:    sltu $a4, $a4, $t0
+; LA32S-NEXT:    mulh.wu $t0, $a0, $a3
+; LA32S-NEXT:    add.w $a4, $t0, $a4
+; LA32S-NEXT:    add.w $t0, $a7, $a4
+; LA32S-NEXT:    mul.w $t1, $a1, $a3
+; LA32S-NEXT:    add.w $t2, $t1, $t0
+; LA32S-NEXT:    mul.w $t3, $a2, $a5
+; LA32S-NEXT:    mul.w $t4, $a6, $a0
+; LA32S-NEXT:    add.w $t5, $t4, $t3
+; LA32S-NEXT:    add.w $a4, $t2, $t5
+; LA32S-NEXT:    sltu $t6, $a4, $t2
+; LA32S-NEXT:    sltu $t1, $t2, $t1
+; LA32S-NEXT:    sltu $a7, $t0, $a7
+; LA32S-NEXT:    mulh.wu $t0, $a1, $a3
+; LA32S-NEXT:    add.w $a7, $t0, $a7
+; LA32S-NEXT:    add.w $a7, $a7, $t1
+; LA32S-NEXT:    mulh.wu $a2, $a2, $a5
+; LA32S-NEXT:    add.w $a2, $a2, $t3
+; LA32S-NEXT:    mul.w $a3, $a3, $a5
+; LA32S-NEXT:    add.w $a2, $a2, $a3
+; LA32S-NEXT:    mul.w $a1, $a6, $a1
+; LA32S-NEXT:    mulh.wu $a0, $a6, $a0
+; LA32S-NEXT:    add.w $a0, $a0, $a1
+; LA32S-NEXT:    add.w $a0, $a0, $t4
+; LA32S-NEXT:    add.w $a0, $a0, $a2
+; LA32S-NEXT:    sltu $a1, $t5, $t4
+; LA32S-NEXT:    add.w $a0, $a0, $a1
+; LA32S-NEXT:    add.w $a0, $a7, $a0
+; LA32S-NEXT:    add.w $a1, $a0, $t6
+; LA32S-NEXT:    move $a0, $a4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mulh_d:
 ; LA64:       # %bb.0:
@@ -215,28 +319,51 @@ define i64 @mulh_d(i64 %a, i64 %b) {
 }
 
 define i64 @mulh_du(i64 %a, i64 %b) {
-; LA32-LABEL: mulh_du:
-; LA32:       # %bb.0:
-; LA32-NEXT:    mulh.wu $a4, $a0, $a2
-; LA32-NEXT:    mul.w $a5, $a1, $a2
-; LA32-NEXT:    add.w $a4, $a5, $a4
-; LA32-NEXT:    sltu $a5, $a4, $a5
-; LA32-NEXT:    mulh.wu $a2, $a1, $a2
-; LA32-NEXT:    add.w $a2, $a2, $a5
-; LA32-NEXT:    mul.w $a5, $a0, $a3
-; LA32-NEXT:    add.w $a4, $a5, $a4
-; LA32-NEXT:    sltu $a4, $a4, $a5
-; LA32-NEXT:    mulh.wu $a0, $a0, $a3
-; LA32-NEXT:    add.w $a0, $a0, $a4
-; LA32-NEXT:    add.w $a4, $a2, $a0
-; LA32-NEXT:    mul.w $a5, $a1, $a3
-; LA32-NEXT:    add.w $a0, $a5, $a4
-; LA32-NEXT:    sltu $a5, $a0, $a5
-; LA32-NEXT:    sltu $a2, $a4, $a2
-; LA32-NEXT:    mulh.wu $a1, $a1, $a3
-; LA32-NEXT:    add.w $a1, $a1, $a2
-; LA32-NEXT:    add.w $a1, $a1, $a5
-; LA32-NEXT:    ret
+; LA32R-LABEL: mulh_du:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    mulh.wu $a4, $a0, $a2
+; LA32R-NEXT:    mul.w $a5, $a1, $a2
+; LA32R-NEXT:    add.w $a4, $a5, $a4
+; LA32R-NEXT:    sltu $a5, $a4, $a5
+; LA32R-NEXT:    mulh.wu $a2, $a1, $a2
+; LA32R-NEXT:    add.w $a2, $a2, $a5
+; LA32R-NEXT:    mul.w $a5, $a0, $a3
+; LA32R-NEXT:    add.w $a4, $a5, $a4
+; LA32R-NEXT:    sltu $a4, $a4, $a5
+; LA32R-NEXT:    mulh.wu $a0, $a0, $a3
+; LA32R-NEXT:    add.w $a0, $a0, $a4
+; LA32R-NEXT:    add.w $a4, $a2, $a0
+; LA32R-NEXT:    mul.w $a5, $a1, $a3
+; LA32R-NEXT:    add.w $a0, $a5, $a4
+; LA32R-NEXT:    sltu $a5, $a0, $a5
+; LA32R-NEXT:    sltu $a2, $a4, $a2
+; LA32R-NEXT:    mulh.wu $a1, $a1, $a3
+; LA32R-NEXT:    add.w $a1, $a1, $a2
+; LA32R-NEXT:    add.w $a1, $a1, $a5
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mulh_du:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    mulh.wu $a4, $a0, $a2
+; LA32S-NEXT:    mul.w $a5, $a1, $a2
+; LA32S-NEXT:    add.w $a4, $a5, $a4
+; LA32S-NEXT:    sltu $a5, $a4, $a5
+; LA32S-NEXT:    mulh.wu $a2, $a1, $a2
+; LA32S-NEXT:    add.w $a2, $a2, $a5
+; LA32S-NEXT:    mul.w $a5, $a0, $a3
+; LA32S-NEXT:    add.w $a4, $a5, $a4
+; LA32S-NEXT:    sltu $a4, $a4, $a5
+; LA32S-NEXT:    mulh.wu $a0, $a0, $a3
+; LA32S-NEXT:    add.w $a0, $a0, $a4
+; LA32S-NEXT:    add.w $a4, $a2, $a0
+; LA32S-NEXT:    mul.w $a5, $a1, $a3
+; LA32S-NEXT:    add.w $a0, $a5, $a4
+; LA32S-NEXT:    sltu $a5, $a0, $a5
+; LA32S-NEXT:    sltu $a2, $a4, $a2
+; LA32S-NEXT:    mulh.wu $a1, $a1, $a3
+; LA32S-NEXT:    add.w $a1, $a1, $a2
+; LA32S-NEXT:    add.w $a1, $a1, $a5
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mulh_du:
 ; LA64:       # %bb.0:
@@ -251,12 +378,19 @@ define i64 @mulh_du(i64 %a, i64 %b) {
 }
 
 define i64 @mulw_d_w(i32 %a, i32 %b) {
-; LA32-LABEL: mulw_d_w:
-; LA32:       # %bb.0:
-; LA32-NEXT:    mul.w $a2, $a0, $a1
-; LA32-NEXT:    mulh.w $a1, $a0, $a1
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mulw_d_w:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    mul.w $a2, $a0, $a1
+; LA32R-NEXT:    mulh.w $a1, $a0, $a1
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mulw_d_w:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    mul.w $a2, $a0, $a1
+; LA32S-NEXT:    mulh.w $a1, $a0, $a1
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mulw_d_w:
 ; LA64:       # %bb.0:
@@ -269,12 +403,19 @@ define i64 @mulw_d_w(i32 %a, i32 %b) {
 }
 
 define i64 @mulw_d_wu(i32 %a, i32 %b) {
-; LA32-LABEL: mulw_d_wu:
-; LA32:       # %bb.0:
-; LA32-NEXT:    mul.w $a2, $a0, $a1
-; LA32-NEXT:    mulh.wu $a1, $a0, $a1
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mulw_d_wu:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    mul.w $a2, $a0, $a1
+; LA32R-NEXT:    mulh.wu $a1, $a0, $a1
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mulw_d_wu:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    mul.w $a2, $a0, $a1
+; LA32S-NEXT:    mulh.wu $a1, $a0, $a1
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mulw_d_wu:
 ; LA64:       # %bb.0:
@@ -287,11 +428,17 @@ define i64 @mulw_d_wu(i32 %a, i32 %b) {
 }
 
 define signext i32 @mul_i32_11(i32 %a) {
-; LA32-LABEL: mul_i32_11:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 2
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_11:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 11
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_11:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 2
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_11:
 ; LA64:       # %bb.0:
@@ -303,11 +450,17 @@ define signext i32 @mul_i32_11(i32 %a) {
 }
 
 define signext i32 @mul_i32_13(i32 %a) {
-; LA32-LABEL: mul_i32_13:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 1
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_13:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 13
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_13:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 1
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_13:
 ; LA64:       # %bb.0:
@@ -319,11 +472,17 @@ define signext i32 @mul_i32_13(i32 %a) {
 }
 
 define signext i32 @mul_i32_19(i32 %a) {
-; LA32-LABEL: mul_i32_19:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 3
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_19:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 19
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_19:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 3
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_19:
 ; LA64:       # %bb.0:
@@ -335,11 +494,17 @@ define signext i32 @mul_i32_19(i32 %a) {
 }
 
 define signext i32 @mul_i32_21(i32 %a) {
-; LA32-LABEL: mul_i32_21:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 2
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_21:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 21
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_21:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 2
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_21:
 ; LA64:       # %bb.0:
@@ -351,11 +516,17 @@ define signext i32 @mul_i32_21(i32 %a) {
 }
 
 define signext i32 @mul_i32_25(i32 %a) {
-; LA32-LABEL: mul_i32_25:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 1
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_25:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 25
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_25:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 1
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_25:
 ; LA64:       # %bb.0:
@@ -367,11 +538,17 @@ define signext i32 @mul_i32_25(i32 %a) {
 }
 
 define signext i32 @mul_i32_27(i32 %a) {
-; LA32-LABEL: mul_i32_27:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 1
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_27:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 27
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_27:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 1
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_27:
 ; LA64:       # %bb.0:
@@ -383,11 +560,17 @@ define signext i32 @mul_i32_27(i32 %a) {
 }
 
 define signext i32 @mul_i32_35(i32 %a) {
-; LA32-LABEL: mul_i32_35:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 4
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_35:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 35
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_35:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 4
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_35:
 ; LA64:       # %bb.0:
@@ -399,11 +582,17 @@ define signext i32 @mul_i32_35(i32 %a) {
 }
 
 define signext i32 @mul_i32_37(i32 %a) {
-; LA32-LABEL: mul_i32_37:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 3
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_37:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 37
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_37:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 3
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_37:
 ; LA64:       # %bb.0:
@@ -415,11 +604,17 @@ define signext i32 @mul_i32_37(i32 %a) {
 }
 
 define signext i32 @mul_i32_41(i32 %a) {
-; LA32-LABEL: mul_i32_41:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 2
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_41:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 41
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_41:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 2
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_41:
 ; LA64:       # %bb.0:
@@ -431,11 +626,17 @@ define signext i32 @mul_i32_41(i32 %a) {
 }
 
 define signext i32 @mul_i32_45(i32 %a) {
-; LA32-LABEL: mul_i32_45:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 2
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_45:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 45
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_45:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 2
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_45:
 ; LA64:       # %bb.0:
@@ -447,11 +648,17 @@ define signext i32 @mul_i32_45(i32 %a) {
 }
 
 define signext i32 @mul_i32_49(i32 %a) {
-; LA32-LABEL: mul_i32_49:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 1
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_49:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 49
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_49:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 1
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_49:
 ; LA64:       # %bb.0:
@@ -463,11 +670,17 @@ define signext i32 @mul_i32_49(i32 %a) {
 }
 
 define signext i32 @mul_i32_51(i32 %a) {
-; LA32-LABEL: mul_i32_51:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 1
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_51:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 51
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_51:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 1
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_51:
 ; LA64:       # %bb.0:
@@ -479,11 +692,17 @@ define signext i32 @mul_i32_51(i32 %a) {
 }
 
 define signext i32 @mul_i32_69(i32 %a) {
-; LA32-LABEL: mul_i32_69:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 4
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_69:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 69
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_69:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 4
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_69:
 ; LA64:       # %bb.0:
@@ -495,11 +714,17 @@ define signext i32 @mul_i32_69(i32 %a) {
 }
 
 define signext i32 @mul_i32_73(i32 %a) {
-; LA32-LABEL: mul_i32_73:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 3
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_73:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 73
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_73:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 3
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_73:
 ; LA64:       # %bb.0:
@@ -511,11 +736,17 @@ define signext i32 @mul_i32_73(i32 %a) {
 }
 
 define signext i32 @mul_i32_81(i32 %a) {
-; LA32-LABEL: mul_i32_81:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 2
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_81:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 81
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_81:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 2
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_81:
 ; LA64:       # %bb.0:
@@ -527,11 +758,17 @@ define signext i32 @mul_i32_81(i32 %a) {
 }
 
 define signext i32 @mul_i32_85(i32 %a) {
-; LA32-LABEL: mul_i32_85:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 2
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_85:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 85
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_85:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 2
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_85:
 ; LA64:       # %bb.0:
@@ -543,11 +780,17 @@ define signext i32 @mul_i32_85(i32 %a) {
 }
 
 define signext i32 @mul_i32_137(i32 %a) {
-; LA32-LABEL: mul_i32_137:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 4
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_137:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 137
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_137:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 4
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_137:
 ; LA64:       # %bb.0:
@@ -559,11 +802,17 @@ define signext i32 @mul_i32_137(i32 %a) {
 }
 
 define signext i32 @mul_i32_145(i32 %a) {
-; LA32-LABEL: mul_i32_145:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 3
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_145:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 145
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_145:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 3
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_145:
 ; LA64:       # %bb.0:
@@ -575,11 +824,17 @@ define signext i32 @mul_i32_145(i32 %a) {
 }
 
 define signext i32 @mul_i32_153(i32 %a) {
-; LA32-LABEL: mul_i32_153:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 3
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_153:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 153
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_153:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 3
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_153:
 ; LA64:       # %bb.0:
@@ -591,11 +846,17 @@ define signext i32 @mul_i32_153(i32 %a) {
 }
 
 define signext i32 @mul_i32_273(i32 %a) {
-; LA32-LABEL: mul_i32_273:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a1, $a0, $a0, 4
-; LA32-NEXT:    alsl.w $a0, $a1, $a0, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_273:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 273
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_273:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a1, $a0, $a0, 4
+; LA32S-NEXT:    alsl.w $a0, $a1, $a0, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_273:
 ; LA64:       # %bb.0:
@@ -607,11 +868,17 @@ define signext i32 @mul_i32_273(i32 %a) {
 }
 
 define signext i32 @mul_i32_289(i32 %a) {
-; LA32-LABEL: mul_i32_289:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 4
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_289:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 289
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_289:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 4
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_289:
 ; LA64:       # %bb.0:
@@ -623,14 +890,23 @@ define signext i32 @mul_i32_289(i32 %a) {
 }
 
 define i64 @mul_i64_11(i64 %a) {
-; LA32-LABEL: mul_i64_11:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 11
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_11:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 11
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_11:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 11
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_11:
 ; LA64:       # %bb.0:
@@ -642,14 +918,23 @@ define i64 @mul_i64_11(i64 %a) {
 }
 
 define i64 @mul_i64_13(i64 %a) {
-; LA32-LABEL: mul_i64_13:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 13
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_13:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 13
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_13:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 13
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_13:
 ; LA64:       # %bb.0:
@@ -661,14 +946,23 @@ define i64 @mul_i64_13(i64 %a) {
 }
 
 define i64 @mul_i64_19(i64 %a) {
-; LA32-LABEL: mul_i64_19:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 19
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_19:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 19
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_19:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 19
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_19:
 ; LA64:       # %bb.0:
@@ -680,14 +974,23 @@ define i64 @mul_i64_19(i64 %a) {
 }
 
 define i64 @mul_i64_21(i64 %a) {
-; LA32-LABEL: mul_i64_21:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 21
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_21:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 21
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_21:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 21
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_21:
 ; LA64:       # %bb.0:
@@ -699,14 +1002,23 @@ define i64 @mul_i64_21(i64 %a) {
 }
 
 define i64 @mul_i64_25(i64 %a) {
-; LA32-LABEL: mul_i64_25:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 25
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_25:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 25
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_25:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 25
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_25:
 ; LA64:       # %bb.0:
@@ -718,14 +1030,23 @@ define i64 @mul_i64_25(i64 %a) {
 }
 
 define i64 @mul_i64_27(i64 %a) {
-; LA32-LABEL: mul_i64_27:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 27
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_27:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 27
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_27:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 27
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_27:
 ; LA64:       # %bb.0:
@@ -737,14 +1058,23 @@ define i64 @mul_i64_27(i64 %a) {
 }
 
 define i64 @mul_i64_35(i64 %a) {
-; LA32-LABEL: mul_i64_35:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 35
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_35:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 35
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_35:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 35
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_35:
 ; LA64:       # %bb.0:
@@ -756,14 +1086,23 @@ define i64 @mul_i64_35(i64 %a) {
 }
 
 define i64 @mul_i64_37(i64 %a) {
-; LA32-LABEL: mul_i64_37:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 37
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_37:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 37
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_37:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 37
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_37:
 ; LA64:       # %bb.0:
@@ -775,14 +1114,23 @@ define i64 @mul_i64_37(i64 %a) {
 }
 
 define i64 @mul_i64_41(i64 %a) {
-; LA32-LABEL: mul_i64_41:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 41
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_41:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 41
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_41:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 41
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_41:
 ; LA64:       # %bb.0:
@@ -794,14 +1142,23 @@ define i64 @mul_i64_41(i64 %a) {
 }
 
 define i64 @mul_i64_45(i64 %a) {
-; LA32-LABEL: mul_i64_45:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 45
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_45:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 45
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_45:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 45
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_45:
 ; LA64:       # %bb.0:
@@ -813,14 +1170,23 @@ define i64 @mul_i64_45(i64 %a) {
 }
 
 define i64 @mul_i64_49(i64 %a) {
-; LA32-LABEL: mul_i64_49:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 49
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_49:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 49
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_49:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 49
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_49:
 ; LA64:       # %bb.0:
@@ -832,14 +1198,23 @@ define i64 @mul_i64_49(i64 %a) {
 }
 
 define i64 @mul_i64_51(i64 %a) {
-; LA32-LABEL: mul_i64_51:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 51
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_51:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 51
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_51:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 51
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_51:
 ; LA64:       # %bb.0:
@@ -851,14 +1226,23 @@ define i64 @mul_i64_51(i64 %a) {
 }
 
 define i64 @mul_i64_69(i64 %a) {
-; LA32-LABEL: mul_i64_69:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 69
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_69:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 69
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_69:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 69
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_69:
 ; LA64:       # %bb.0:
@@ -870,14 +1254,23 @@ define i64 @mul_i64_69(i64 %a) {
 }
 
 define i64 @mul_i64_73(i64 %a) {
-; LA32-LABEL: mul_i64_73:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 73
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_73:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 73
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_73:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 73
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_73:
 ; LA64:       # %bb.0:
@@ -889,14 +1282,23 @@ define i64 @mul_i64_73(i64 %a) {
 }
 
 define i64 @mul_i64_81(i64 %a) {
-; LA32-LABEL: mul_i64_81:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 81
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_81:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 81
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_81:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 81
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_81:
 ; LA64:       # %bb.0:
@@ -908,14 +1310,23 @@ define i64 @mul_i64_81(i64 %a) {
 }
 
 define i64 @mul_i64_85(i64 %a) {
-; LA32-LABEL: mul_i64_85:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 85
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_85:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 85
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_85:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 85
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_85:
 ; LA64:       # %bb.0:
@@ -927,14 +1338,23 @@ define i64 @mul_i64_85(i64 %a) {
 }
 
 define i64 @mul_i64_137(i64 %a) {
-; LA32-LABEL: mul_i64_137:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 137
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_137:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 137
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_137:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 137
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_137:
 ; LA64:       # %bb.0:
@@ -946,14 +1366,23 @@ define i64 @mul_i64_137(i64 %a) {
 }
 
 define i64 @mul_i64_145(i64 %a) {
-; LA32-LABEL: mul_i64_145:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 145
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_145:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 145
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_145:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 145
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_145:
 ; LA64:       # %bb.0:
@@ -965,14 +1394,23 @@ define i64 @mul_i64_145(i64 %a) {
 }
 
 define i64 @mul_i64_153(i64 %a) {
-; LA32-LABEL: mul_i64_153:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 153
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_153:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 153
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_153:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 153
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_153:
 ; LA64:       # %bb.0:
@@ -984,14 +1422,23 @@ define i64 @mul_i64_153(i64 %a) {
 }
 
 define i64 @mul_i64_273(i64 %a) {
-; LA32-LABEL: mul_i64_273:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 273
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_273:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 273
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_273:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 273
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_273:
 ; LA64:       # %bb.0:
@@ -1003,14 +1450,23 @@ define i64 @mul_i64_273(i64 %a) {
 }
 
 define i64 @mul_i64_289(i64 %a) {
-; LA32-LABEL: mul_i64_289:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 289
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_289:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 289
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_289:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 289
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_289:
 ; LA64:       # %bb.0:
@@ -1022,11 +1478,18 @@ define i64 @mul_i64_289(i64 %a) {
 }
 
 define signext i32 @mul_i32_4098(i32 %a) {
-; LA32-LABEL: mul_i32_4098:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 12
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_4098:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 1
+; LA32R-NEXT:    slli.w $a0, $a0, 12
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_4098:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 12
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_4098:
 ; LA64:       # %bb.0:
@@ -1038,11 +1501,18 @@ define signext i32 @mul_i32_4098(i32 %a) {
 }
 
 define signext i32 @mul_i32_4100(i32 %a) {
-; LA32-LABEL: mul_i32_4100:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 12
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_4100:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 2
+; LA32R-NEXT:    slli.w $a0, $a0, 12
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_4100:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 12
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_4100:
 ; LA64:       # %bb.0:
@@ -1054,11 +1524,18 @@ define signext i32 @mul_i32_4100(i32 %a) {
 }
 
 define signext i32 @mul_i32_4104(i32 %a) {
-; LA32-LABEL: mul_i32_4104:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 12
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 3
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_4104:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 3
+; LA32R-NEXT:    slli.w $a0, $a0, 12
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_4104:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 12
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 3
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_4104:
 ; LA64:       # %bb.0:
@@ -1070,11 +1547,18 @@ define signext i32 @mul_i32_4104(i32 %a) {
 }
 
 define signext i32 @mul_i32_4112(i32 %a) {
-; LA32-LABEL: mul_i32_4112:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 12
-; LA32-NEXT:    alsl.w $a0, $a0, $a1, 4
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_4112:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 4
+; LA32R-NEXT:    slli.w $a0, $a0, 12
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_4112:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 12
+; LA32S-NEXT:    alsl.w $a0, $a0, $a1, 4
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_4112:
 ; LA64:       # %bb.0:
@@ -1086,15 +1570,25 @@ define signext i32 @mul_i32_4112(i32 %a) {
 }
 
 define i64 @mul_i64_4098(i64 %a) {
-; LA32-LABEL: mul_i64_4098:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a2, 1
-; LA32-NEXT:    ori $a2, $a2, 2
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_4098:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 1
+; LA32R-NEXT:    ori $a2, $a2, 2
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_4098:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a2, 1
+; LA32S-NEXT:    ori $a2, $a2, 2
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_4098:
 ; LA64:       # %bb.0:
@@ -1106,15 +1600,25 @@ define i64 @mul_i64_4098(i64 %a) {
 }
 
 define i64 @mul_i64_4100(i64 %a) {
-; LA32-LABEL: mul_i64_4100:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a2, 1
-; LA32-NEXT:    ori $a2, $a2, 4
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_4100:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 1
+; LA32R-NEXT:    ori $a2, $a2, 4
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_4100:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a2, 1
+; LA32S-NEXT:    ori $a2, $a2, 4
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_4100:
 ; LA64:       # %bb.0:
@@ -1126,15 +1630,25 @@ define i64 @mul_i64_4100(i64 %a) {
 }
 
 define i64 @mul_i64_4104(i64 %a) {
-; LA32-LABEL: mul_i64_4104:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a2, 1
-; LA32-NEXT:    ori $a2, $a2, 8
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_4104:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 1
+; LA32R-NEXT:    ori $a2, $a2, 8
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_4104:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a2, 1
+; LA32S-NEXT:    ori $a2, $a2, 8
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_4104:
 ; LA64:       # %bb.0:
@@ -1146,15 +1660,25 @@ define i64 @mul_i64_4104(i64 %a) {
 }
 
 define i64 @mul_i64_4112(i64 %a) {
-; LA32-LABEL: mul_i64_4112:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a2, 1
-; LA32-NEXT:    ori $a2, $a2, 16
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_4112:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 1
+; LA32R-NEXT:    ori $a2, $a2, 16
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_4112:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a2, 1
+; LA32S-NEXT:    ori $a2, $a2, 16
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_4112:
 ; LA64:       # %bb.0:
@@ -1166,11 +1690,17 @@ define i64 @mul_i64_4112(i64 %a) {
 }
 
 define signext i32 @mul_i32_768(i32 %a) {
-; LA32-LABEL: mul_i32_768:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 1
-; LA32-NEXT:    slli.w $a0, $a0, 8
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_768:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 768
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_768:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 1
+; LA32S-NEXT:    slli.w $a0, $a0, 8
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_768:
 ; LA64:       # %bb.0:
@@ -1182,11 +1712,17 @@ define signext i32 @mul_i32_768(i32 %a) {
 }
 
 define signext i32 @mul_i32_1280(i32 %a) {
-; LA32-LABEL: mul_i32_1280:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 2
-; LA32-NEXT:    slli.w $a0, $a0, 8
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_1280:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 1280
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_1280:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 2
+; LA32S-NEXT:    slli.w $a0, $a0, 8
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_1280:
 ; LA64:       # %bb.0:
@@ -1198,11 +1734,17 @@ define signext i32 @mul_i32_1280(i32 %a) {
 }
 
 define signext i32 @mul_i32_2304(i32 %a) {
-; LA32-LABEL: mul_i32_2304:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 3
-; LA32-NEXT:    slli.w $a0, $a0, 8
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_2304:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $zero, 2304
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_2304:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 3
+; LA32S-NEXT:    slli.w $a0, $a0, 8
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_2304:
 ; LA64:       # %bb.0:
@@ -1214,11 +1756,18 @@ define signext i32 @mul_i32_2304(i32 %a) {
 }
 
 define signext i32 @mul_i32_4352(i32 %a) {
-; LA32-LABEL: mul_i32_4352:
-; LA32:       # %bb.0:
-; LA32-NEXT:    alsl.w $a0, $a0, $a0, 4
-; LA32-NEXT:    slli.w $a0, $a0, 8
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_4352:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 1
+; LA32R-NEXT:    ori $a1, $a1, 256
+; LA32R-NEXT:    mul.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_4352:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    alsl.w $a0, $a0, $a0, 4
+; LA32S-NEXT:    slli.w $a0, $a0, 8
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_4352:
 ; LA64:       # %bb.0:
@@ -1230,14 +1779,23 @@ define signext i32 @mul_i32_4352(i32 %a) {
 }
 
 define i64 @mul_i64_768(i64 %a) {
-; LA32-LABEL: mul_i64_768:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 768
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_768:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 768
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_768:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 768
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_768:
 ; LA64:       # %bb.0:
@@ -1249,14 +1807,23 @@ define i64 @mul_i64_768(i64 %a) {
 }
 
 define i64 @mul_i64_1280(i64 %a) {
-; LA32-LABEL: mul_i64_1280:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 1280
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_1280:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 1280
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_1280:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 1280
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_1280:
 ; LA64:       # %bb.0:
@@ -1268,14 +1835,23 @@ define i64 @mul_i64_1280(i64 %a) {
 }
 
 define i64 @mul_i64_2304(i64 %a) {
-; LA32-LABEL: mul_i64_2304:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 2304
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_2304:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 2304
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_2304:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 2304
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_2304:
 ; LA64:       # %bb.0:
@@ -1287,15 +1863,25 @@ define i64 @mul_i64_2304(i64 %a) {
 }
 
 define i64 @mul_i64_4352(i64 %a) {
-; LA32-LABEL: mul_i64_4352:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a2, 1
-; LA32-NEXT:    ori $a2, $a2, 256
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_4352:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 1
+; LA32R-NEXT:    ori $a2, $a2, 256
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_4352:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a2, 1
+; LA32S-NEXT:    ori $a2, $a2, 256
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_4352:
 ; LA64:       # %bb.0:
@@ -1307,12 +1893,19 @@ define i64 @mul_i64_4352(i64 %a) {
 }
 
 define signext i32 @mul_i32_65792(i32 %a) {
-; LA32-LABEL: mul_i32_65792:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 8
-; LA32-NEXT:    slli.w $a0, $a0, 16
-; LA32-NEXT:    add.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_65792:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 16
+; LA32R-NEXT:    add.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_65792:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 8
+; LA32S-NEXT:    slli.w $a0, $a0, 16
+; LA32S-NEXT:    add.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_65792:
 ; LA64:       # %bb.0:
@@ -1325,12 +1918,19 @@ define signext i32 @mul_i32_65792(i32 %a) {
 }
 
 define signext i32 @mul_i32_65280(i32 %a) {
-; LA32-LABEL: mul_i32_65280:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 8
-; LA32-NEXT:    slli.w $a0, $a0, 16
-; LA32-NEXT:    sub.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_65280:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 8
+; LA32R-NEXT:    slli.w $a0, $a0, 16
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_65280:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 8
+; LA32S-NEXT:    slli.w $a0, $a0, 16
+; LA32S-NEXT:    sub.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_65280:
 ; LA64:       # %bb.0:
@@ -1343,12 +1943,19 @@ define signext i32 @mul_i32_65280(i32 %a) {
 }
 
 define signext i32 @mul_i32_minus_65280(i32 %a) {
-; LA32-LABEL: mul_i32_minus_65280:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slli.w $a1, $a0, 16
-; LA32-NEXT:    slli.w $a0, $a0, 8
-; LA32-NEXT:    sub.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i32_minus_65280:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 16
+; LA32R-NEXT:    slli.w $a0, $a0, 8
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i32_minus_65280:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slli.w $a1, $a0, 16
+; LA32S-NEXT:    slli.w $a0, $a0, 8
+; LA32S-NEXT:    sub.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i32_minus_65280:
 ; LA64:       # %bb.0:
@@ -1361,15 +1968,25 @@ define signext i32 @mul_i32_minus_65280(i32 %a) {
 }
 
 define i64 @mul_i64_65792(i64 %a) {
-; LA32-LABEL: mul_i64_65792:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a2, 16
-; LA32-NEXT:    ori $a2, $a2, 256
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_65792:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 16
+; LA32R-NEXT:    ori $a2, $a2, 256
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_65792:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a2, 16
+; LA32S-NEXT:    ori $a2, $a2, 256
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_65792:
 ; LA64:       # %bb.0:
@@ -1382,15 +1999,25 @@ define i64 @mul_i64_65792(i64 %a) {
 }
 
 define i64 @mul_i64_65280(i64 %a) {
-; LA32-LABEL: mul_i64_65280:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a2, 15
-; LA32-NEXT:    ori $a2, $a2, 3840
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_65280:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 3840
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_65280:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a2, 15
+; LA32S-NEXT:    ori $a2, $a2, 3840
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_65280:
 ; LA64:       # %bb.0:
@@ -1403,16 +2030,27 @@ define i64 @mul_i64_65280(i64 %a) {
 }
 
 define i64 @mul_i64_minus_65280(i64 %a) {
-; LA32-LABEL: mul_i64_minus_65280:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a2, -16
-; LA32-NEXT:    ori $a2, $a2, 256
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    sub.w $a3, $a3, $a0
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_minus_65280:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, -16
+; LA32R-NEXT:    ori $a2, $a2, 256
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    sub.w $a3, $a3, $a0
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_minus_65280:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a2, -16
+; LA32S-NEXT:    ori $a2, $a2, 256
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    sub.w $a3, $a3, $a0
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_minus_65280:
 ; LA64:       # %bb.0:
@@ -1427,14 +2065,23 @@ define i64 @mul_i64_minus_65280(i64 %a) {
 ;; This multiplication is not transformed, because
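 ;; (The same single-instruction-immediate reasoning applies to the next
 ;; three tests; a sketch of the lowering they check follows this file's
 ;; diff.)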
 ;; 1088 can be composed via a single ORI.
 define i64 @mul_i64_1088(i64 %a) {
-; LA32-LABEL: mul_i64_1088:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a2, $zero, 1088
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_1088:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a2, $zero, 1088
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_1088:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a2, $zero, 1088
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_1088:
 ; LA64:       # %bb.0:
@@ -1448,15 +2095,25 @@ define i64 @mul_i64_1088(i64 %a) {
 ;; This multiplication is not transformed, because
 ;; -992 can be composed via a single ADDI.
 define i64 @mul_i64_minus_992(i64 %a) {
-; LA32-LABEL: mul_i64_minus_992:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $a2, $zero, -992
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    sub.w $a3, $a3, $a0
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_minus_992:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a2, $zero, -992
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    sub.w $a3, $a3, $a0
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_minus_992:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $a2, $zero, -992
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    sub.w $a3, $a3, $a0
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_minus_992:
 ; LA64:       # %bb.0:
@@ -1470,14 +2127,23 @@ define i64 @mul_i64_minus_992(i64 %a) {
 ;; This multiplication is not transformed, because
 ;; 4456448 can be composed via a single LUI.
 define i64 @mul_i64_4456448(i64 %a) {
-; LA32-LABEL: mul_i64_4456448:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a2, 1088
-; LA32-NEXT:    mul.w $a1, $a1, $a2
-; LA32-NEXT:    mulh.wu $a3, $a0, $a2
-; LA32-NEXT:    add.w $a1, $a3, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_4456448:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a2, 1088
+; LA32R-NEXT:    mul.w $a1, $a1, $a2
+; LA32R-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32R-NEXT:    add.w $a1, $a3, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_4456448:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a2, 1088
+; LA32S-NEXT:    mul.w $a1, $a1, $a2
+; LA32S-NEXT:    mulh.wu $a3, $a0, $a2
+; LA32S-NEXT:    add.w $a1, $a3, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_4456448:
 ; LA64:       # %bb.0:
@@ -1491,21 +2157,37 @@ define i64 @mul_i64_4456448(i64 %a) {
 ;; This multiplication is not transformed, because
 ;; 65280 is used multiple times.
 define i64 @mul_i64_65280_twice(i64 %a, i64 %b) {
-; LA32-LABEL: mul_i64_65280_twice:
-; LA32:       # %bb.0:
-; LA32-NEXT:    lu12i.w $a4, 15
-; LA32-NEXT:    ori $a4, $a4, 3840
-; LA32-NEXT:    mul.w $a1, $a1, $a4
-; LA32-NEXT:    mulh.wu $a5, $a0, $a4
-; LA32-NEXT:    add.w $a1, $a5, $a1
-; LA32-NEXT:    mul.w $a0, $a0, $a4
-; LA32-NEXT:    mul.w $a3, $a3, $a4
-; LA32-NEXT:    mulh.wu $a5, $a2, $a4
-; LA32-NEXT:    add.w $a3, $a5, $a3
-; LA32-NEXT:    mul.w $a2, $a2, $a4
-; LA32-NEXT:    xor $a1, $a1, $a3
-; LA32-NEXT:    xor $a0, $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: mul_i64_65280_twice:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a4, 15
+; LA32R-NEXT:    ori $a4, $a4, 3840
+; LA32R-NEXT:    mul.w $a1, $a1, $a4
+; LA32R-NEXT:    mulh.wu $a5, $a0, $a4
+; LA32R-NEXT:    add.w $a1, $a5, $a1
+; LA32R-NEXT:    mul.w $a0, $a0, $a4
+; LA32R-NEXT:    mul.w $a3, $a3, $a4
+; LA32R-NEXT:    mulh.wu $a5, $a2, $a4
+; LA32R-NEXT:    add.w $a3, $a5, $a3
+; LA32R-NEXT:    mul.w $a2, $a2, $a4
+; LA32R-NEXT:    xor $a1, $a1, $a3
+; LA32R-NEXT:    xor $a0, $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: mul_i64_65280_twice:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    lu12i.w $a4, 15
+; LA32S-NEXT:    ori $a4, $a4, 3840
+; LA32S-NEXT:    mul.w $a1, $a1, $a4
+; LA32S-NEXT:    mulh.wu $a5, $a0, $a4
+; LA32S-NEXT:    add.w $a1, $a5, $a1
+; LA32S-NEXT:    mul.w $a0, $a0, $a4
+; LA32S-NEXT:    mul.w $a3, $a3, $a4
+; LA32S-NEXT:    mulh.wu $a5, $a2, $a4
+; LA32S-NEXT:    add.w $a3, $a5, $a3
+; LA32S-NEXT:    mul.w $a2, $a2, $a4
+; LA32S-NEXT:    xor $a1, $a1, $a3
+; LA32S-NEXT:    xor $a0, $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: mul_i64_65280_twice:
 ; LA64:       # %bb.0:
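
The mul checks above all exercise the same LA32 lowering: a 64-bit multiply
by a 32-bit constant c is split into word operations (low word = mul.w(lo, c);
high word = mulh.wu(lo, c) + mul.w(hi, c)), while an i32 multiply by a
constant with two set bits, such as 65792 = 2^16 + 2^8, becomes two shifts
and an add. A minimal IR sketch of the 64-bit case (function name
illustrative, not taken from the patch):

define i64 @mul_i64_sketch(i64 %a) {
  ; Expected LA32R/LA32S lowering, per the checks above: materialize 65792
  ; with lu12i.w+ori, then mul.w for hi*c, mulh.wu for the carry out of
  ; lo*c, add.w to form the high word, and mul.w for the low word.
  %r = mul i64 %a, 65792
  ret i64 %r
}
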
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/sdiv-udiv-srem-urem.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/sdiv-udiv-srem-urem.ll
index d88f2951cb8a9..ff77cb3119124 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/sdiv-udiv-srem-urem.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/sdiv-udiv-srem-urem.ll
@@ -1,25 +1,36 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
-; RUN: llc --mtriple=loongarch32 -mattr=+d -loongarch-check-zero-division < %s \
-; RUN:     | FileCheck %s --check-prefix=LA32-TRAP
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d -loongarch-check-zero-division < %s \
+; RUN:     | FileCheck %s --check-prefix=LA32R-TRAP
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d -loongarch-check-zero-division < %s \
+; RUN:     | FileCheck %s --check-prefix=LA32S-TRAP
 ; RUN: llc --mtriple=loongarch64 -mattr=+d -loongarch-check-zero-division < %s \
 ; RUN:     | FileCheck %s --check-prefix=LA64-TRAP
 
 ;; Test the sdiv/udiv/srem/urem LLVM IR.
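 ;; (Sketch note, not from the patch: LA32R lacks the ext.w.{b,h} and
 ;; bstrpick.w instructions, so the LA32R checks below sign-extend i8/i16
 ;; operands with slli.w+srai.w pairs and zero-extend i16 by and-ing with a
 ;; materialized 0xffff, where LA32S uses the single dedicated instructions.)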
 
 define i1 @sdiv_i1(i1 %a, i1 %b) {
-; LA32-LABEL: sdiv_i1:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ret
+; LA32R-LABEL: sdiv_i1:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sdiv_i1:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sdiv_i1:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: sdiv_i1:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: sdiv_i1:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: sdiv_i1:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: sdiv_i1:
 ; LA64-TRAP:       # %bb.0: # %entry
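 ;; (Note: for i1, the only divisor value that avoids undefined behavior is
 ;; true, so i1 division folds to the dividend -- a bare ret here -- and i1
 ;; remainder folds to zero in the srem/urem tests further down.)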
@@ -30,12 +41,21 @@ entry:
 }
 
 define i8 @sdiv_i8(i8 %a, i8 %b) {
-; LA32-LABEL: sdiv_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ext.w.b $a1, $a1
-; LA32-NEXT:    ext.w.b $a0, $a0
-; LA32-NEXT:    div.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: sdiv_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a1, $a1, 24
+; LA32R-NEXT:    srai.w $a1, $a1, 24
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    srai.w $a0, $a0, 24
+; LA32R-NEXT:    div.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sdiv_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ext.w.b $a1, $a1
+; LA32S-NEXT:    ext.w.b $a0, $a0
+; LA32S-NEXT:    div.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sdiv_i8:
 ; LA64:       # %bb.0: # %entry
@@ -44,23 +64,36 @@ define i8 @sdiv_i8(i8 %a, i8 %b) {
 ; LA64-NEXT:    div.d $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: sdiv_i8:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    ext.w.b $a1, $a1
-; LA32-TRAP-NEXT:    ext.w.b $a0, $a0
-; LA32-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB1_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB1_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: sdiv_i8:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    slli.w $a1, $a1, 24
+; LA32R-TRAP-NEXT:    srai.w $a1, $a1, 24
+; LA32R-TRAP-NEXT:    slli.w $a0, $a0, 24
+; LA32R-TRAP-NEXT:    srai.w $a0, $a0, 24
+; LA32R-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB1_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB1_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: sdiv_i8:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    ext.w.b $a1, $a1
+; LA32S-TRAP-NEXT:    ext.w.b $a0, $a0
+; LA32S-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB1_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB1_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: sdiv_i8:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    ext.w.b $a1, $a1
 ; LA64-TRAP-NEXT:    ext.w.b $a0, $a0
 ; LA64-TRAP-NEXT:    div.d $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB1_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB1_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB1_2: # %entry
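 ;; (Note, an assumption not stated in the patch: `break 7` is the
 ;; divide-by-zero break code, so -loongarch-check-zero-division guards each
 ;; division with a bne-over-break sequence, as checked above.)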
@@ -71,12 +104,21 @@ entry:
 }
 
 define i16 @sdiv_i16(i16 %a, i16 %b) {
-; LA32-LABEL: sdiv_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ext.w.h $a1, $a1
-; LA32-NEXT:    ext.w.h $a0, $a0
-; LA32-NEXT:    div.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: sdiv_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a1, $a1, 16
+; LA32R-NEXT:    srai.w $a1, $a1, 16
+; LA32R-NEXT:    slli.w $a0, $a0, 16
+; LA32R-NEXT:    srai.w $a0, $a0, 16
+; LA32R-NEXT:    div.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sdiv_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ext.w.h $a1, $a1
+; LA32S-NEXT:    ext.w.h $a0, $a0
+; LA32S-NEXT:    div.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sdiv_i16:
 ; LA64:       # %bb.0: # %entry
@@ -85,23 +127,36 @@ define i16 @sdiv_i16(i16 %a, i16 %b) {
 ; LA64-NEXT:    div.d $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: sdiv_i16:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    ext.w.h $a1, $a1
-; LA32-TRAP-NEXT:    ext.w.h $a0, $a0
-; LA32-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB2_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB2_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: sdiv_i16:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    slli.w $a1, $a1, 16
+; LA32R-TRAP-NEXT:    srai.w $a1, $a1, 16
+; LA32R-TRAP-NEXT:    slli.w $a0, $a0, 16
+; LA32R-TRAP-NEXT:    srai.w $a0, $a0, 16
+; LA32R-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB2_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB2_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: sdiv_i16:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    ext.w.h $a1, $a1
+; LA32S-TRAP-NEXT:    ext.w.h $a0, $a0
+; LA32S-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB2_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB2_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: sdiv_i16:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    ext.w.h $a1, $a1
 ; LA64-TRAP-NEXT:    ext.w.h $a0, $a0
 ; LA64-TRAP-NEXT:    div.d $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB2_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB2_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB2_2: # %entry
@@ -112,10 +167,15 @@ entry:
 }
 
 define i32 @sdiv_i32(i32 %a, i32 %b) {
-; LA32-LABEL: sdiv_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    div.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: sdiv_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    div.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sdiv_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    div.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sdiv_i32:
 ; LA64:       # %bb.0: # %entry
@@ -124,21 +184,30 @@ define i32 @sdiv_i32(i32 %a, i32 %b) {
 ; LA64-NEXT:    div.w $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: sdiv_i32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB3_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB3_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: sdiv_i32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB3_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB3_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: sdiv_i32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB3_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB3_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: sdiv_i32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    addi.w $a1, $a1, 0
 ; LA64-TRAP-NEXT:    addi.w $a0, $a0, 0
 ; LA64-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB3_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB3_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB3_2: # %entry
@@ -149,29 +218,43 @@ entry:
 }
 
 define i32 @sdiv_ui32_si32_si32(i32 signext %a, i32 signext %b) {
-; LA32-LABEL: sdiv_ui32_si32_si32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    div.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: sdiv_ui32_si32_si32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    div.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sdiv_ui32_si32_si32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    div.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sdiv_ui32_si32_si32:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    div.w $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: sdiv_ui32_si32_si32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB4_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB4_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: sdiv_ui32_si32_si32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB4_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB4_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: sdiv_ui32_si32_si32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB4_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB4_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: sdiv_ui32_si32_si32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB4_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB4_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB4_2: # %entry
@@ -182,10 +265,15 @@ entry:
 }
 
 define signext i32 @sdiv_si32_ui32_ui32(i32 %a, i32 %b) {
-; LA32-LABEL: sdiv_si32_ui32_ui32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    div.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: sdiv_si32_ui32_ui32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    div.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sdiv_si32_ui32_ui32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    div.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sdiv_si32_ui32_ui32:
 ; LA64:       # %bb.0: # %entry
@@ -194,21 +282,30 @@ define signext i32 @sdiv_si32_ui32_ui32(i32 %a, i32 %b) {
 ; LA64-NEXT:    div.w $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: sdiv_si32_ui32_ui32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB5_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB5_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: sdiv_si32_ui32_ui32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB5_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB5_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: sdiv_si32_ui32_ui32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB5_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB5_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: sdiv_si32_ui32_ui32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    addi.w $a1, $a1, 0
 ; LA64-TRAP-NEXT:    addi.w $a0, $a0, 0
 ; LA64-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB5_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB5_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB5_2: # %entry
@@ -219,29 +316,43 @@ entry:
 }
 
 define signext i32 @sdiv_si32_si32_si32(i32 signext %a, i32 signext %b) {
-; LA32-LABEL: sdiv_si32_si32_si32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    div.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: sdiv_si32_si32_si32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    div.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sdiv_si32_si32_si32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    div.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sdiv_si32_si32_si32:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    div.w $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: sdiv_si32_si32_si32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB6_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB6_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: sdiv_si32_si32_si32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB6_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB6_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: sdiv_si32_si32_si32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    div.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB6_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB6_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: sdiv_si32_si32_si32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    div.w $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB6_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB6_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB6_2: # %entry
@@ -252,37 +363,59 @@ entry:
 }
 
 define i64 @sdiv_i64(i64 %a, i64 %b) {
-; LA32-LABEL: sdiv_i64:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    .cfi_def_cfa_offset 16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    .cfi_offset 1, -4
-; LA32-NEXT:    bl __divdi3
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: sdiv_i64:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    .cfi_offset 1, -4
+; LA32R-NEXT:    bl __divdi3
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sdiv_i64:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    .cfi_offset 1, -4
+; LA32S-NEXT:    bl __divdi3
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sdiv_i64:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    div.d $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: sdiv_i64:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, -16
-; LA32-TRAP-NEXT:    .cfi_def_cfa_offset 16
-; LA32-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-TRAP-NEXT:    .cfi_offset 1, -4
-; LA32-TRAP-NEXT:    bl __divdi3
-; LA32-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, 16
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: sdiv_i64:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32R-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32R-TRAP-NEXT:    bl __divdi3
+; LA32R-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: sdiv_i64:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32S-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32S-TRAP-NEXT:    bl __divdi3
+; LA32S-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: sdiv_i64:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    div.d $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB7_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB7_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB7_2: # %entry
@@ -293,17 +426,25 @@ entry:
 }
 
 define i1 @udiv_i1(i1 %a, i1 %b) {
-; LA32-LABEL: udiv_i1:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ret
+; LA32R-LABEL: udiv_i1:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: udiv_i1:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: udiv_i1:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: udiv_i1:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: udiv_i1:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: udiv_i1:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: udiv_i1:
 ; LA64-TRAP:       # %bb.0: # %entry
@@ -314,12 +455,19 @@ entry:
 }
 
 define i8 @udiv_i8(i8 %a, i8 %b) {
-; LA32-LABEL: udiv_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    div.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: udiv_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    andi $a0, $a0, 255
+; LA32R-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: udiv_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: udiv_i8:
 ; LA64:       # %bb.0: # %entry
@@ -328,23 +476,34 @@ define i8 @udiv_i8(i8 %a, i8 %b) {
 ; LA64-NEXT:    div.du $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: udiv_i8:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    andi $a1, $a1, 255
-; LA32-TRAP-NEXT:    andi $a0, $a0, 255
-; LA32-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB9_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB9_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: udiv_i8:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    andi $a1, $a1, 255
+; LA32R-TRAP-NEXT:    andi $a0, $a0, 255
+; LA32R-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB9_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB9_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: udiv_i8:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    andi $a1, $a1, 255
+; LA32S-TRAP-NEXT:    andi $a0, $a0, 255
+; LA32S-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB9_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB9_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: udiv_i8:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    andi $a1, $a1, 255
 ; LA64-TRAP-NEXT:    andi $a0, $a0, 255
 ; LA64-TRAP-NEXT:    div.du $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB9_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB9_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB9_2: # %entry
@@ -355,12 +514,21 @@ entry:
 }
 
 define i16 @udiv_i16(i16 %a, i16 %b) {
-; LA32-LABEL: udiv_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    div.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: udiv_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: udiv_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: udiv_i16:
 ; LA64:       # %bb.0: # %entry
@@ -369,23 +537,36 @@ define i16 @udiv_i16(i16 %a, i16 %b) {
 ; LA64-NEXT:    div.du $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: udiv_i16:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-TRAP-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB10_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB10_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: udiv_i16:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    lu12i.w $a2, 15
+; LA32R-TRAP-NEXT:    ori $a2, $a2, 4095
+; LA32R-TRAP-NEXT:    and $a1, $a1, $a2
+; LA32R-TRAP-NEXT:    and $a0, $a0, $a2
+; LA32R-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB10_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB10_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: udiv_i16:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-TRAP-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB10_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB10_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: udiv_i16:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    bstrpick.d $a1, $a1, 15, 0
 ; LA64-TRAP-NEXT:    bstrpick.d $a0, $a0, 15, 0
 ; LA64-TRAP-NEXT:    div.du $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB10_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB10_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB10_2: # %entry
@@ -396,10 +577,15 @@ entry:
 }
 
 define i32 @udiv_i32(i32 %a, i32 %b) {
-; LA32-LABEL: udiv_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    div.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: udiv_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: udiv_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: udiv_i32:
 ; LA64:       # %bb.0: # %entry
@@ -408,21 +594,30 @@ define i32 @udiv_i32(i32 %a, i32 %b) {
 ; LA64-NEXT:    div.wu $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: udiv_i32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB11_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB11_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: udiv_i32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB11_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB11_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: udiv_i32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB11_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB11_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: udiv_i32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    addi.w $a1, $a1, 0
 ; LA64-TRAP-NEXT:    addi.w $a0, $a0, 0
 ; LA64-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB11_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB11_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB11_2: # %entry
@@ -433,29 +628,43 @@ entry:
 }
 
 define i32 @udiv_ui32_si32_si32(i32 signext %a, i32 signext %b) {
-; LA32-LABEL: udiv_ui32_si32_si32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    div.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: udiv_ui32_si32_si32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: udiv_ui32_si32_si32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: udiv_ui32_si32_si32:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    div.wu $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: udiv_ui32_si32_si32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB12_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB12_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: udiv_ui32_si32_si32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB12_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB12_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: udiv_ui32_si32_si32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB12_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB12_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: udiv_ui32_si32_si32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB12_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB12_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB12_2: # %entry
@@ -466,10 +675,15 @@ entry:
 }
 
 define signext i32 @udiv_si32_ui32_ui32(i32 %a, i32 %b) {
-; LA32-LABEL: udiv_si32_ui32_ui32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    div.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: udiv_si32_ui32_ui32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: udiv_si32_ui32_ui32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: udiv_si32_ui32_ui32:
 ; LA64:       # %bb.0: # %entry
@@ -478,21 +692,30 @@ define signext i32 @udiv_si32_ui32_ui32(i32 %a, i32 %b) {
 ; LA64-NEXT:    div.wu $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: udiv_si32_ui32_ui32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB13_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB13_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: udiv_si32_ui32_ui32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB13_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB13_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: udiv_si32_ui32_ui32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB13_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB13_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: udiv_si32_ui32_ui32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    addi.w $a1, $a1, 0
 ; LA64-TRAP-NEXT:    addi.w $a0, $a0, 0
 ; LA64-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB13_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB13_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB13_2: # %entry
@@ -503,29 +726,43 @@ entry:
 }
 
 define signext i32 @udiv_si32_si32_si32(i32 signext %a, i32 signext %b) {
-; LA32-LABEL: udiv_si32_si32_si32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    div.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: udiv_si32_si32_si32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: udiv_si32_si32_si32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: udiv_si32_si32_si32:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    div.wu $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: udiv_si32_si32_si32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB14_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB14_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: udiv_si32_si32_si32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB14_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB14_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: udiv_si32_si32_si32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    div.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB14_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB14_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: udiv_si32_si32_si32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    div.wu $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB14_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB14_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB14_2: # %entry
@@ -536,37 +773,59 @@ entry:
 }
 
 define i64 @udiv_i64(i64 %a, i64 %b) {
-; LA32-LABEL: udiv_i64:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    .cfi_def_cfa_offset 16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    .cfi_offset 1, -4
-; LA32-NEXT:    bl __udivdi3
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: udiv_i64:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    .cfi_offset 1, -4
+; LA32R-NEXT:    bl __udivdi3
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: udiv_i64:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    .cfi_offset 1, -4
+; LA32S-NEXT:    bl __udivdi3
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: udiv_i64:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    div.du $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: udiv_i64:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, -16
-; LA32-TRAP-NEXT:    .cfi_def_cfa_offset 16
-; LA32-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-TRAP-NEXT:    .cfi_offset 1, -4
-; LA32-TRAP-NEXT:    bl __udivdi3
-; LA32-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, 16
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: udiv_i64:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32R-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32R-TRAP-NEXT:    bl __udivdi3
+; LA32R-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: udiv_i64:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32S-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32S-TRAP-NEXT:    bl __udivdi3
+; LA32S-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: udiv_i64:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    div.du $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB15_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB15_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB15_2: # %entry
@@ -577,20 +836,30 @@ entry:
 }
 
 define i1 @srem_i1(i1 %a, i1 %b) {
-; LA32-LABEL: srem_i1:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    move $a0, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: srem_i1:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: srem_i1:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    move $a0, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: srem_i1:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    move $a0, $zero
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: srem_i1:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    move $a0, $zero
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: srem_i1:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    move $a0, $zero
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: srem_i1:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    move $a0, $zero
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: srem_i1:
 ; LA64-TRAP:       # %bb.0: # %entry
@@ -602,12 +871,21 @@ entry:
 }
 
 define i8 @srem_i8(i8 %a, i8 %b) {
-; LA32-LABEL: srem_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ext.w.b $a1, $a1
-; LA32-NEXT:    ext.w.b $a0, $a0
-; LA32-NEXT:    mod.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: srem_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a1, $a1, 24
+; LA32R-NEXT:    srai.w $a1, $a1, 24
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    srai.w $a0, $a0, 24
+; LA32R-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: srem_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ext.w.b $a1, $a1
+; LA32S-NEXT:    ext.w.b $a0, $a0
+; LA32S-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: srem_i8:
 ; LA64:       # %bb.0: # %entry
@@ -616,23 +894,36 @@ define i8 @srem_i8(i8 %a, i8 %b) {
 ; LA64-NEXT:    mod.d $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: srem_i8:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    ext.w.b $a1, $a1
-; LA32-TRAP-NEXT:    ext.w.b $a0, $a0
-; LA32-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB17_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB17_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: srem_i8:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    slli.w $a1, $a1, 24
+; LA32R-TRAP-NEXT:    srai.w $a1, $a1, 24
+; LA32R-TRAP-NEXT:    slli.w $a0, $a0, 24
+; LA32R-TRAP-NEXT:    srai.w $a0, $a0, 24
+; LA32R-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB17_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB17_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: srem_i8:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    ext.w.b $a1, $a1
+; LA32S-TRAP-NEXT:    ext.w.b $a0, $a0
+; LA32S-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB17_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB17_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: srem_i8:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    ext.w.b $a1, $a1
 ; LA64-TRAP-NEXT:    ext.w.b $a0, $a0
 ; LA64-TRAP-NEXT:    mod.d $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB17_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB17_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB17_2: # %entry
@@ -643,12 +934,21 @@ entry:
 }
 
 define i16 @srem_i16(i16 %a, i16 %b) {
-; LA32-LABEL: srem_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ext.w.h $a1, $a1
-; LA32-NEXT:    ext.w.h $a0, $a0
-; LA32-NEXT:    mod.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: srem_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    slli.w $a1, $a1, 16
+; LA32R-NEXT:    srai.w $a1, $a1, 16
+; LA32R-NEXT:    slli.w $a0, $a0, 16
+; LA32R-NEXT:    srai.w $a0, $a0, 16
+; LA32R-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: srem_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ext.w.h $a1, $a1
+; LA32S-NEXT:    ext.w.h $a0, $a0
+; LA32S-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: srem_i16:
 ; LA64:       # %bb.0: # %entry
@@ -657,23 +957,36 @@ define i16 @srem_i16(i16 %a, i16 %b) {
 ; LA64-NEXT:    mod.d $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: srem_i16:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    ext.w.h $a1, $a1
-; LA32-TRAP-NEXT:    ext.w.h $a0, $a0
-; LA32-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB18_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB18_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: srem_i16:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    slli.w $a1, $a1, 16
+; LA32R-TRAP-NEXT:    srai.w $a1, $a1, 16
+; LA32R-TRAP-NEXT:    slli.w $a0, $a0, 16
+; LA32R-TRAP-NEXT:    srai.w $a0, $a0, 16
+; LA32R-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB18_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB18_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: srem_i16:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    ext.w.h $a1, $a1
+; LA32S-TRAP-NEXT:    ext.w.h $a0, $a0
+; LA32S-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB18_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB18_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: srem_i16:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    ext.w.h $a1, $a1
 ; LA64-TRAP-NEXT:    ext.w.h $a0, $a0
 ; LA64-TRAP-NEXT:    mod.d $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB18_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB18_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB18_2: # %entry
@@ -684,10 +997,15 @@ entry:
 }
 
 define i32 @srem_i32(i32 %a, i32 %b) {
-; LA32-LABEL: srem_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mod.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: srem_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: srem_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: srem_i32:
 ; LA64:       # %bb.0: # %entry
@@ -696,21 +1014,30 @@ define i32 @srem_i32(i32 %a, i32 %b) {
 ; LA64-NEXT:    mod.w $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: srem_i32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB19_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB19_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: srem_i32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB19_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB19_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: srem_i32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB19_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB19_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: srem_i32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    addi.w $a1, $a1, 0
 ; LA64-TRAP-NEXT:    addi.w $a0, $a0, 0
 ; LA64-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB19_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB19_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB19_2: # %entry
@@ -721,29 +1048,43 @@ entry:
 }
 
 define i32 @srem_ui32_si32_si32(i32 signext %a, i32 signext %b) {
-; LA32-LABEL: srem_ui32_si32_si32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mod.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: srem_ui32_si32_si32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: srem_ui32_si32_si32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: srem_ui32_si32_si32:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    mod.w $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: srem_ui32_si32_si32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB20_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB20_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: srem_ui32_si32_si32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB20_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB20_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: srem_ui32_si32_si32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB20_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB20_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: srem_ui32_si32_si32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB20_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB20_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB20_2: # %entry
@@ -754,10 +1095,15 @@ entry:
 }
 
 define signext i32 @srem_si32_ui32_ui32(i32 %a, i32 %b) {
-; LA32-LABEL: srem_si32_ui32_ui32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mod.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: srem_si32_ui32_ui32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: srem_si32_ui32_ui32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: srem_si32_ui32_ui32:
 ; LA64:       # %bb.0: # %entry
@@ -766,21 +1112,30 @@ define signext i32 @srem_si32_ui32_ui32(i32 %a, i32 %b) {
 ; LA64-NEXT:    mod.w $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: srem_si32_ui32_ui32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB21_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB21_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: srem_si32_ui32_ui32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB21_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB21_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: srem_si32_ui32_ui32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB21_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB21_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: srem_si32_ui32_ui32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    addi.w $a1, $a1, 0
 ; LA64-TRAP-NEXT:    addi.w $a0, $a0, 0
 ; LA64-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB21_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB21_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB21_2: # %entry
@@ -791,29 +1146,43 @@ entry:
 }
 
 define signext i32 @srem_si32_si32_si32(i32 signext %a, i32 signext %b) {
-; LA32-LABEL: srem_si32_si32_si32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mod.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: srem_si32_si32_si32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: srem_si32_si32_si32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: srem_si32_si32_si32:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    mod.w $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: srem_si32_si32_si32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB22_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB22_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: srem_si32_si32_si32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB22_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB22_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: srem_si32_si32_si32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    mod.w $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB22_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB22_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: srem_si32_si32_si32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    mod.w $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB22_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB22_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB22_2: # %entry
@@ -824,37 +1193,59 @@ entry:
 }
 
 define i64 @srem_i64(i64 %a, i64 %b) {
-; LA32-LABEL: srem_i64:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    .cfi_def_cfa_offset 16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    .cfi_offset 1, -4
-; LA32-NEXT:    bl __moddi3
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: srem_i64:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    .cfi_offset 1, -4
+; LA32R-NEXT:    bl __moddi3
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: srem_i64:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    .cfi_offset 1, -4
+; LA32S-NEXT:    bl __moddi3
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: srem_i64:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    mod.d $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: srem_i64:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, -16
-; LA32-TRAP-NEXT:    .cfi_def_cfa_offset 16
-; LA32-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-TRAP-NEXT:    .cfi_offset 1, -4
-; LA32-TRAP-NEXT:    bl __moddi3
-; LA32-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, 16
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: srem_i64:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32R-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32R-TRAP-NEXT:    bl __moddi3
+; LA32R-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: srem_i64:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32S-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32S-TRAP-NEXT:    bl __moddi3
+; LA32S-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: srem_i64:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    mod.d $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB23_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB23_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB23_2: # %entry
@@ -865,20 +1256,30 @@ entry:
 }
 
 define i1 @urem_i1(i1 %a, i1 %b) {
-; LA32-LABEL: urem_i1:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    move $a0, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: urem_i1:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: urem_i1:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    move $a0, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: urem_i1:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    move $a0, $zero
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: urem_i1:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    move $a0, $zero
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: urem_i1:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    move $a0, $zero
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: urem_i1:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    move $a0, $zero
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: urem_i1:
 ; LA64-TRAP:       # %bb.0: # %entry
@@ -890,12 +1291,19 @@ entry:
 }
 
 define i8 @urem_i8(i8 %a, i8 %b) {
-; LA32-LABEL: urem_i8:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: urem_i8:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    andi $a0, $a0, 255
+; LA32R-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: urem_i8:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: urem_i8:
 ; LA64:       # %bb.0: # %entry
@@ -904,23 +1312,34 @@ define i8 @urem_i8(i8 %a, i8 %b) {
 ; LA64-NEXT:    mod.du $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: urem_i8:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    andi $a1, $a1, 255
-; LA32-TRAP-NEXT:    andi $a0, $a0, 255
-; LA32-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB25_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB25_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: urem_i8:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    andi $a1, $a1, 255
+; LA32R-TRAP-NEXT:    andi $a0, $a0, 255
+; LA32R-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB25_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB25_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: urem_i8:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    andi $a1, $a1, 255
+; LA32S-TRAP-NEXT:    andi $a0, $a0, 255
+; LA32S-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB25_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB25_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: urem_i8:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    andi $a1, $a1, 255
 ; LA64-TRAP-NEXT:    andi $a0, $a0, 255
 ; LA64-TRAP-NEXT:    mod.du $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB25_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB25_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB25_2: # %entry
@@ -931,12 +1350,21 @@ entry:
 }
 
 define i16 @urem_i16(i16 %a, i16 %b) {
-; LA32-LABEL: urem_i16:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: urem_i16:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    lu12i.w $a2, 15
+; LA32R-NEXT:    ori $a2, $a2, 4095
+; LA32R-NEXT:    and $a1, $a1, $a2
+; LA32R-NEXT:    and $a0, $a0, $a2
+; LA32R-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: urem_i16:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: urem_i16:
 ; LA64:       # %bb.0: # %entry
@@ -945,23 +1373,36 @@ define i16 @urem_i16(i16 %a, i16 %b) {
 ; LA64-NEXT:    mod.du $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: urem_i16:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    bstrpick.w $a1, $a1, 15, 0
-; LA32-TRAP-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB26_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB26_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: urem_i16:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    lu12i.w $a2, 15
+; LA32R-TRAP-NEXT:    ori $a2, $a2, 4095
+; LA32R-TRAP-NEXT:    and $a1, $a1, $a2
+; LA32R-TRAP-NEXT:    and $a0, $a0, $a2
+; LA32R-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB26_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB26_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: urem_i16:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    bstrpick.w $a1, $a1, 15, 0
+; LA32S-TRAP-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB26_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB26_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: urem_i16:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    bstrpick.d $a1, $a1, 15, 0
 ; LA64-TRAP-NEXT:    bstrpick.d $a0, $a0, 15, 0
 ; LA64-TRAP-NEXT:    mod.du $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB26_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB26_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB26_2: # %entry
@@ -972,10 +1413,15 @@ entry:
 }
 
 define i32 @urem_i32(i32 %a, i32 %b) {
-; LA32-LABEL: urem_i32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: urem_i32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: urem_i32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: urem_i32:
 ; LA64:       # %bb.0: # %entry
@@ -984,21 +1430,30 @@ define i32 @urem_i32(i32 %a, i32 %b) {
 ; LA64-NEXT:    mod.wu $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: urem_i32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB27_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB27_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: urem_i32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB27_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB27_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: urem_i32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB27_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB27_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: urem_i32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    addi.w $a1, $a1, 0
 ; LA64-TRAP-NEXT:    addi.w $a0, $a0, 0
 ; LA64-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB27_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB27_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB27_2: # %entry
@@ -1009,29 +1464,43 @@ entry:
 }
 
 define i32 @urem_ui32_si32_si32(i32 signext %a, i32 signext %b) {
-; LA32-LABEL: urem_ui32_si32_si32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: urem_ui32_si32_si32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: urem_ui32_si32_si32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: urem_ui32_si32_si32:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    mod.wu $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: urem_ui32_si32_si32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB28_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB28_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: urem_ui32_si32_si32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB28_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB28_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: urem_ui32_si32_si32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB28_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB28_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: urem_ui32_si32_si32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB28_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB28_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB28_2: # %entry
@@ -1042,10 +1511,15 @@ entry:
 }
 
 define signext i32 @urem_si32_ui32_ui32(i32 %a, i32 %b) {
-; LA32-LABEL: urem_si32_ui32_ui32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: urem_si32_ui32_ui32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: urem_si32_ui32_ui32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: urem_si32_ui32_ui32:
 ; LA64:       # %bb.0: # %entry
@@ -1054,21 +1528,30 @@ define signext i32 @urem_si32_ui32_ui32(i32 %a, i32 %b) {
 ; LA64-NEXT:    mod.wu $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: urem_si32_ui32_ui32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB29_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB29_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: urem_si32_ui32_ui32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB29_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB29_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: urem_si32_ui32_ui32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB29_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB29_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: urem_si32_ui32_ui32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    addi.w $a1, $a1, 0
 ; LA64-TRAP-NEXT:    addi.w $a0, $a0, 0
 ; LA64-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB29_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB29_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB29_2: # %entry
@@ -1079,29 +1562,43 @@ entry:
 }
 
 define signext i32 @urem_si32_si32_si32(i32 signext %a, i32 signext %b) {
-; LA32-LABEL: urem_si32_si32_si32:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: urem_si32_si32_si32:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: urem_si32_si32_si32:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: urem_si32_si32_si32:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    mod.wu $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: urem_si32_si32_si32:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA32-TRAP-NEXT:    bnez $a1, .LBB30_2
-; LA32-TRAP-NEXT:  # %bb.1: # %entry
-; LA32-TRAP-NEXT:    break 7
-; LA32-TRAP-NEXT:  .LBB30_2: # %entry
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: urem_si32_si32_si32:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32R-TRAP-NEXT:    bne $a1, $zero, .LBB30_2
+; LA32R-TRAP-NEXT:  # %bb.1: # %entry
+; LA32R-TRAP-NEXT:    break 7
+; LA32R-TRAP-NEXT:  .LBB30_2: # %entry
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: urem_si32_si32_si32:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    mod.wu $a0, $a0, $a1
+; LA32S-TRAP-NEXT:    bne $a1, $zero, .LBB30_2
+; LA32S-TRAP-NEXT:  # %bb.1: # %entry
+; LA32S-TRAP-NEXT:    break 7
+; LA32S-TRAP-NEXT:  .LBB30_2: # %entry
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: urem_si32_si32_si32:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    mod.wu $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB30_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB30_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB30_2: # %entry
@@ -1112,37 +1609,59 @@ entry:
 }
 
 define i64 @urem_i64(i64 %a, i64 %b) {
-; LA32-LABEL: urem_i64:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    .cfi_def_cfa_offset 16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    .cfi_offset 1, -4
-; LA32-NEXT:    bl __umoddi3
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: urem_i64:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    .cfi_offset 1, -4
+; LA32R-NEXT:    bl __umoddi3
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: urem_i64:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    .cfi_offset 1, -4
+; LA32S-NEXT:    bl __umoddi3
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: urem_i64:
 ; LA64:       # %bb.0: # %entry
 ; LA64-NEXT:    mod.du $a0, $a0, $a1
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: urem_i64:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, -16
-; LA32-TRAP-NEXT:    .cfi_def_cfa_offset 16
-; LA32-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-TRAP-NEXT:    .cfi_offset 1, -4
-; LA32-TRAP-NEXT:    bl __umoddi3
-; LA32-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, 16
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: urem_i64:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32R-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32R-TRAP-NEXT:    bl __umoddi3
+; LA32R-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: urem_i64:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32S-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32S-TRAP-NEXT:    bl __umoddi3
+; LA32S-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: urem_i64:
 ; LA64-TRAP:       # %bb.0: # %entry
 ; LA64-TRAP-NEXT:    mod.du $a0, $a0, $a1
-; LA64-TRAP-NEXT:    bnez $a1, .LBB31_2
+; LA64-TRAP-NEXT:    bne $a1, $zero, .LBB31_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB31_2: # %entry
@@ -1153,21 +1672,37 @@ entry:
 }
 
 define signext i32 @pr107414(i32 signext %x) {
-; LA32-LABEL: pr107414:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    addi.w $sp, $sp, -16
-; LA32-NEXT:    .cfi_def_cfa_offset 16
-; LA32-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-NEXT:    .cfi_offset 1, -4
-; LA32-NEXT:    move $a2, $a0
-; LA32-NEXT:    srai.w $a3, $a0, 31
-; LA32-NEXT:    lu12i.w $a0, -266831
-; LA32-NEXT:    ori $a0, $a0, 3337
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    bl __divdi3
-; LA32-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-NEXT:    addi.w $sp, $sp, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: pr107414:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    addi.w $sp, $sp, -16
+; LA32R-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-NEXT:    .cfi_offset 1, -4
+; LA32R-NEXT:    move $a2, $a0
+; LA32R-NEXT:    srai.w $a3, $a0, 31
+; LA32R-NEXT:    lu12i.w $a0, -266831
+; LA32R-NEXT:    ori $a0, $a0, 3337
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    bl __divdi3
+; LA32R-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-NEXT:    addi.w $sp, $sp, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: pr107414:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    addi.w $sp, $sp, -16
+; LA32S-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-NEXT:    .cfi_offset 1, -4
+; LA32S-NEXT:    move $a2, $a0
+; LA32S-NEXT:    srai.w $a3, $a0, 31
+; LA32S-NEXT:    lu12i.w $a0, -266831
+; LA32S-NEXT:    ori $a0, $a0, 3337
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    bl __divdi3
+; LA32S-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-NEXT:    addi.w $sp, $sp, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: pr107414:
 ; LA64:       # %bb.0: # %entry
@@ -1178,21 +1713,37 @@ define signext i32 @pr107414(i32 signext %x) {
 ; LA64-NEXT:    addi.w $a0, $a0, 0
 ; LA64-NEXT:    ret
 ;
-; LA32-TRAP-LABEL: pr107414:
-; LA32-TRAP:       # %bb.0: # %entry
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, -16
-; LA32-TRAP-NEXT:    .cfi_def_cfa_offset 16
-; LA32-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
-; LA32-TRAP-NEXT:    .cfi_offset 1, -4
-; LA32-TRAP-NEXT:    move $a2, $a0
-; LA32-TRAP-NEXT:    srai.w $a3, $a0, 31
-; LA32-TRAP-NEXT:    lu12i.w $a0, -266831
-; LA32-TRAP-NEXT:    ori $a0, $a0, 3337
-; LA32-TRAP-NEXT:    move $a1, $zero
-; LA32-TRAP-NEXT:    bl __divdi3
-; LA32-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
-; LA32-TRAP-NEXT:    addi.w $sp, $sp, 16
-; LA32-TRAP-NEXT:    ret
+; LA32R-TRAP-LABEL: pr107414:
+; LA32R-TRAP:       # %bb.0: # %entry
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32R-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32R-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32R-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32R-TRAP-NEXT:    move $a2, $a0
+; LA32R-TRAP-NEXT:    srai.w $a3, $a0, 31
+; LA32R-TRAP-NEXT:    lu12i.w $a0, -266831
+; LA32R-TRAP-NEXT:    ori $a0, $a0, 3337
+; LA32R-TRAP-NEXT:    move $a1, $zero
+; LA32R-TRAP-NEXT:    bl __divdi3
+; LA32R-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32R-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32R-TRAP-NEXT:    ret
+;
+; LA32S-TRAP-LABEL: pr107414:
+; LA32S-TRAP:       # %bb.0: # %entry
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, -16
+; LA32S-TRAP-NEXT:    .cfi_def_cfa_offset 16
+; LA32S-TRAP-NEXT:    st.w $ra, $sp, 12 # 4-byte Folded Spill
+; LA32S-TRAP-NEXT:    .cfi_offset 1, -4
+; LA32S-TRAP-NEXT:    move $a2, $a0
+; LA32S-TRAP-NEXT:    srai.w $a3, $a0, 31
+; LA32S-TRAP-NEXT:    lu12i.w $a0, -266831
+; LA32S-TRAP-NEXT:    ori $a0, $a0, 3337
+; LA32S-TRAP-NEXT:    move $a1, $zero
+; LA32S-TRAP-NEXT:    bl __divdi3
+; LA32S-TRAP-NEXT:    ld.w $ra, $sp, 12 # 4-byte Folded Reload
+; LA32S-TRAP-NEXT:    addi.w $sp, $sp, 16
+; LA32S-TRAP-NEXT:    ret
 ;
 ; LA64-TRAP-LABEL: pr107414:
 ; LA64-TRAP:       # %bb.0: # %entry
@@ -1200,7 +1751,7 @@ define signext i32 @pr107414(i32 signext %x) {
 ; LA64-TRAP-NEXT:    ori $a1, $a1, 3337
 ; LA64-TRAP-NEXT:    lu32i.d $a1, 0
 ; LA64-TRAP-NEXT:    div.d $a1, $a1, $a0
-; LA64-TRAP-NEXT:    bnez $a0, .LBB32_2
+; LA64-TRAP-NEXT:    bne $a0, $zero, .LBB32_2
 ; LA64-TRAP-NEXT:  # %bb.1: # %entry
 ; LA64-TRAP-NEXT:    break 7
 ; LA64-TRAP-NEXT:  .LBB32_2: # %entry
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/select-bare-int.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/select-bare-int.ll
index 7239e27d99445..c13bc1edc308e 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/select-bare-int.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/select-bare-int.ll
@@ -7,10 +7,12 @@
 define i1 @bare_select_i1(i1 %a, i1 %b, i1 %c) {
 ; LA32-LABEL: bare_select_i1:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    masknez $a2, $a2, $a0
-; LA32-NEXT:    maskeqz $a0, $a1, $a0
-; LA32-NEXT:    or $a0, $a0, $a2
+; LA32-NEXT:    andi $a3, $a0, 1
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:    bne $a3, $zero, .LBB0_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a2
+; LA32-NEXT:  .LBB0_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: bare_select_i1:
@@ -27,10 +29,12 @@ define i1 @bare_select_i1(i1 %a, i1 %b, i1 %c) {
 define i8 @bare_select_i8(i1 %a, i8 %b, i8 %c) {
 ; LA32-LABEL: bare_select_i8:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    masknez $a2, $a2, $a0
-; LA32-NEXT:    maskeqz $a0, $a1, $a0
-; LA32-NEXT:    or $a0, $a0, $a2
+; LA32-NEXT:    andi $a3, $a0, 1
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:    bne $a3, $zero, .LBB1_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a2
+; LA32-NEXT:  .LBB1_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: bare_select_i8:
@@ -47,10 +51,12 @@ define i8 @bare_select_i8(i1 %a, i8 %b, i8 %c) {
 define i16 @bare_select_i16(i1 %a, i16 %b, i16 %c) {
 ; LA32-LABEL: bare_select_i16:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    masknez $a2, $a2, $a0
-; LA32-NEXT:    maskeqz $a0, $a1, $a0
-; LA32-NEXT:    or $a0, $a0, $a2
+; LA32-NEXT:    andi $a3, $a0, 1
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:    bne $a3, $zero, .LBB2_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a2
+; LA32-NEXT:  .LBB2_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: bare_select_i16:
@@ -67,10 +73,12 @@ define i16 @bare_select_i16(i1 %a, i16 %b, i16 %c) {
 define i32 @bare_select_i32(i1 %a, i32 %b, i32 %c) {
 ; LA32-LABEL: bare_select_i32:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    masknez $a2, $a2, $a0
-; LA32-NEXT:    maskeqz $a0, $a1, $a0
-; LA32-NEXT:    or $a0, $a0, $a2
+; LA32-NEXT:    andi $a3, $a0, 1
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:    bne $a3, $zero, .LBB3_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a2
+; LA32-NEXT:  .LBB3_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: bare_select_i32:
@@ -88,12 +96,13 @@ define i64 @bare_select_i64(i1 %a, i64 %b, i64 %c) {
 ; LA32-LABEL: bare_select_i64:
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    andi $a5, $a0, 1
-; LA32-NEXT:    masknez $a0, $a3, $a5
-; LA32-NEXT:    maskeqz $a1, $a1, $a5
-; LA32-NEXT:    or $a0, $a1, $a0
-; LA32-NEXT:    masknez $a1, $a4, $a5
-; LA32-NEXT:    maskeqz $a2, $a2, $a5
-; LA32-NEXT:    or $a1, $a2, $a1
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:    bne $a5, $zero, .LBB4_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a3
+; LA32-NEXT:    move $a2, $a4
+; LA32-NEXT:  .LBB4_2:
+; LA32-NEXT:    move $a1, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: bare_select_i64:
@@ -111,13 +120,14 @@ define i16 @bare_select_zero_i16(i1 %a, i16 %b) {
 ; LA32-LABEL: bare_select_zero_i16:
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    masknez	$a0, $a1, $a0
+; LA32-NEXT:    addi.w $a0, $a0, -1
+; LA32-NEXT:    and $a0, $a0, $a1
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: bare_select_zero_i16:
 ; LA64:       # %bb.0:
 ; LA64-NEXT:    andi $a0, $a0, 1
-; LA64-NEXT:    masknez	$a0, $a1, $a0
+; LA64-NEXT:    masknez $a0, $a1, $a0
 ; LA64-NEXT:    ret
   %res = select i1 %a, i16 0, i16 %b
   ret i16 %res
@@ -127,7 +137,8 @@ define i32 @bare_select_zero_i32(i1 %a, i32 %b) {
 ; LA32-LABEL: bare_select_zero_i32:
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    maskeqz $a0, $a1, $a0
+; LA32-NEXT:    sub.w $a0, $zero, $a0
+; LA32-NEXT:    and $a0, $a0, $a1
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: bare_select_zero_i32:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/select-fpcc-int.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/select-fpcc-int.ll
index 3e88181a11fe8..ee6bb6e0498af 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/select-fpcc-int.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/select-fpcc-int.ll
@@ -24,9 +24,10 @@ define i32 @f32_fcmp_oeq(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.ceq.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB1_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB1_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_oeq:
@@ -47,9 +48,10 @@ define i32 @f32_fcmp_ogt(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.clt.s $fcc0, $fa1, $fa0
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB2_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB2_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_ogt:
@@ -70,9 +72,10 @@ define i32 @f32_fcmp_oge(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cle.s $fcc0, $fa1, $fa0
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB3_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB3_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_oge:
@@ -93,9 +96,10 @@ define i32 @f32_fcmp_olt(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.clt.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB4_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB4_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_olt:
@@ -116,9 +120,10 @@ define i32 @f32_fcmp_ole(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cle.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB5_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB5_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_ole:
@@ -139,9 +144,10 @@ define i32 @f32_fcmp_one(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cne.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB6_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB6_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_one:
@@ -162,9 +168,10 @@ define i32 @f32_fcmp_ord(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cor.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB7_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB7_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_ord:
@@ -185,9 +192,10 @@ define i32 @f32_fcmp_ueq(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cueq.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB8_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB8_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_ueq:
@@ -208,9 +216,10 @@ define i32 @f32_fcmp_ugt(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cult.s $fcc0, $fa1, $fa0
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB9_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB9_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_ugt:
@@ -231,9 +240,10 @@ define i32 @f32_fcmp_uge(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cule.s $fcc0, $fa1, $fa0
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB10_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB10_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_uge:
@@ -254,9 +264,10 @@ define i32 @f32_fcmp_ult(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cult.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB11_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB11_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_ult:
@@ -277,9 +288,10 @@ define i32 @f32_fcmp_ule(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cule.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB12_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB12_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_ule:
@@ -300,9 +312,10 @@ define i32 @f32_fcmp_une(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cune.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB13_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB13_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_une:
@@ -323,9 +336,10 @@ define i32 @f32_fcmp_uno(float %a, float %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cun.s $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB14_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB14_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f32_fcmp_uno:
@@ -374,9 +388,10 @@ define i32 @f64_fcmp_oeq(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.ceq.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB17_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB17_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_oeq:
@@ -397,9 +412,10 @@ define i32 @f64_fcmp_ogt(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.clt.d $fcc0, $fa1, $fa0
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB18_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB18_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_ogt:
@@ -420,9 +436,10 @@ define i32 @f64_fcmp_oge(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cle.d $fcc0, $fa1, $fa0
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB19_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB19_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_oge:
@@ -443,9 +460,10 @@ define i32 @f64_fcmp_olt(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.clt.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB20_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB20_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_olt:
@@ -466,9 +484,10 @@ define i32 @f64_fcmp_ole(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cle.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB21_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB21_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_ole:
@@ -489,9 +508,10 @@ define i32 @f64_fcmp_one(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cne.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB22_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB22_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_one:
@@ -512,9 +532,10 @@ define i32 @f64_fcmp_ord(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cor.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB23_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB23_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_ord:
@@ -535,9 +556,10 @@ define i32 @f64_fcmp_ueq(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cueq.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB24_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB24_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_ueq:
@@ -558,9 +580,10 @@ define i32 @f64_fcmp_ugt(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cult.d $fcc0, $fa1, $fa0
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB25_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB25_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_ugt:
@@ -581,9 +604,10 @@ define i32 @f64_fcmp_uge(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cule.d $fcc0, $fa1, $fa0
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB26_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB26_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_uge:
@@ -604,9 +628,10 @@ define i32 @f64_fcmp_ult(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cult.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB27_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB27_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_ult:
@@ -627,9 +652,10 @@ define i32 @f64_fcmp_ule(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cule.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB28_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB28_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_ule:
@@ -650,9 +676,10 @@ define i32 @f64_fcmp_une(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cune.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB29_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB29_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_une:
@@ -673,9 +700,10 @@ define i32 @f64_fcmp_uno(double %a, double %b, i32 %x, i32 %y) {
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    fcmp.cun.d $fcc0, $fa0, $fa1
 ; LA32-NEXT:    movcf2gr $a2, $fcc0
-; LA32-NEXT:    masknez $a1, $a1, $a2
-; LA32-NEXT:    maskeqz $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a2, $zero, .LBB30_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a0, $a1
+; LA32-NEXT:  .LBB30_2:
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: f64_fcmp_uno:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/select-icc-int.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/select-icc-int.ll
index 6a2e1f6972ad9..17c4b11af50ce 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/select-icc-int.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/select-icc-int.ll
@@ -7,11 +7,11 @@
 define i32 @select_eq(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_eq:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    xor $a0, $a0, $a1
-; LA32-NEXT:    sltui $a0, $a0, 1
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    beq $a0, $a1, .LBB0_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB0_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_eq:
@@ -30,11 +30,11 @@ define i32 @select_eq(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 define i32 @select_ne(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_ne:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    xor $a0, $a0, $a1
-; LA32-NEXT:    sltu $a0, $zero, $a0
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bne $a0, $a1, .LBB1_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB1_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_ne:
@@ -53,10 +53,11 @@ define i32 @select_ne(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 define i32 @select_ugt(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_ugt:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bltu $a1, $a0, .LBB2_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB2_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_ugt:
@@ -74,11 +75,11 @@ define i32 @select_ugt(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 define i32 @select_uge(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_uge:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    sltu $a0, $a0, $a1
-; LA32-NEXT:    xori $a0, $a0, 1
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bgeu $a0, $a1, .LBB3_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB3_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_uge:
@@ -97,10 +98,11 @@ define i32 @select_uge(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 define i32 @select_ult(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_ult:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    sltu $a0, $a0, $a1
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bltu $a0, $a1, .LBB4_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB4_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_ult:
@@ -118,11 +120,11 @@ define i32 @select_ult(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 define i32 @select_ule(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_ule:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    xori $a0, $a0, 1
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bgeu $a1, $a0, .LBB5_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB5_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_ule:
@@ -141,10 +143,11 @@ define i32 @select_ule(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 define i32 @select_sgt(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_sgt:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slt $a0, $a1, $a0
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    blt $a1, $a0, .LBB6_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB6_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_sgt:
@@ -162,11 +165,11 @@ define i32 @select_sgt(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 define i32 @select_sge(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_sge:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slt $a0, $a0, $a1
-; LA32-NEXT:    xori $a0, $a0, 1
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bge $a0, $a1, .LBB7_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB7_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_sge:
@@ -185,10 +188,11 @@ define i32 @select_sge(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 define i32 @select_slt(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_slt:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slt $a0, $a0, $a1
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    blt $a0, $a1, .LBB8_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB8_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_slt:
@@ -206,11 +210,11 @@ define i32 @select_slt(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 define i32 @select_sle(i32 signext %a, i32 signext %b, i32 %x, i32 %y) {
 ; LA32-LABEL: select_sle:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    slt $a0, $a1, $a0
-; LA32-NEXT:    xori $a0, $a0, 1
-; LA32-NEXT:    masknez $a1, $a3, $a0
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
+; LA32-NEXT:    bge $a1, $a0, .LBB9_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    move $a2, $a3
+; LA32-NEXT:  .LBB9_2:
+; LA32-NEXT:    move $a0, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: select_sle:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/sext-zext-trunc.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/sext-zext-trunc.ll
index 255bb79ad1580..dbe28bae4c4ef 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/sext-zext-trunc.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/sext-zext-trunc.ll
@@ -1,15 +1,22 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 ;; Test sext/zext/trunc
 
 define i8 @sext_i1_to_i8(i1 %a) {
-; LA32-LABEL: sext_i1_to_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    sub.w $a0, $zero, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i1_to_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    sub.w $a0, $zero, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i1_to_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    sub.w $a0, $zero, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i1_to_i8:
 ; LA64:       # %bb.0:
@@ -21,11 +28,17 @@ define i8 @sext_i1_to_i8(i1 %a) {
 }
 
 define i16 @sext_i1_to_i16(i1 %a) {
-; LA32-LABEL: sext_i1_to_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    sub.w $a0, $zero, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i1_to_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    sub.w $a0, $zero, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i1_to_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    sub.w $a0, $zero, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i1_to_i16:
 ; LA64:       # %bb.0:
@@ -37,11 +50,17 @@ define i16 @sext_i1_to_i16(i1 %a) {
 }
 
 define i32 @sext_i1_to_i32(i1 %a) {
-; LA32-LABEL: sext_i1_to_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    sub.w $a0, $zero, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i1_to_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    sub.w $a0, $zero, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i1_to_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    sub.w $a0, $zero, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i1_to_i32:
 ; LA64:       # %bb.0:
@@ -53,12 +72,19 @@ define i32 @sext_i1_to_i32(i1 %a) {
 }
 
 define i64 @sext_i1_to_i64(i1 %a) {
-; LA32-LABEL: sext_i1_to_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    sub.w $a0, $zero, $a0
-; LA32-NEXT:    move $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i1_to_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    sub.w $a0, $zero, $a0
+; LA32R-NEXT:    move $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i1_to_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    sub.w $a0, $zero, $a0
+; LA32S-NEXT:    move $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i1_to_i64:
 ; LA64:       # %bb.0:
@@ -70,10 +96,16 @@ define i64 @sext_i1_to_i64(i1 %a) {
 }
 
 define i16 @sext_i8_to_i16(i8 %a) {
-; LA32-LABEL: sext_i8_to_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.b $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i8_to_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    srai.w $a0, $a0, 24
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i8_to_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ext.w.b $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i8_to_i16:
 ; LA64:       # %bb.0:
@@ -84,10 +116,16 @@ define i16 @sext_i8_to_i16(i8 %a) {
 }
 
 define i32 @sext_i8_to_i32(i8 %a) {
-; LA32-LABEL: sext_i8_to_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.b $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i8_to_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    srai.w $a0, $a0, 24
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i8_to_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ext.w.b $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i8_to_i32:
 ; LA64:       # %bb.0:
@@ -98,11 +136,18 @@ define i32 @sext_i8_to_i32(i8 %a) {
 }
 
 define i64 @sext_i8_to_i64(i8 %a) {
-; LA32-LABEL: sext_i8_to_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.b $a0, $a0
-; LA32-NEXT:    srai.w $a1, $a0, 31
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i8_to_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 24
+; LA32R-NEXT:    srai.w $a0, $a1, 24
+; LA32R-NEXT:    srai.w $a1, $a1, 31
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i8_to_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ext.w.b $a0, $a0
+; LA32S-NEXT:    srai.w $a1, $a0, 31
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i8_to_i64:
 ; LA64:       # %bb.0:
@@ -113,10 +158,16 @@ define i64 @sext_i8_to_i64(i8 %a) {
 }
 
 define i32 @sext_i16_to_i32(i16 %a) {
-; LA32-LABEL: sext_i16_to_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.h $a0, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i16_to_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a0, $a0, 16
+; LA32R-NEXT:    srai.w $a0, $a0, 16
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i16_to_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ext.w.h $a0, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i16_to_i32:
 ; LA64:       # %bb.0:
@@ -127,11 +178,18 @@ define i32 @sext_i16_to_i32(i16 %a) {
 }
 
 define i64 @sext_i16_to_i64(i16 %a) {
-; LA32-LABEL: sext_i16_to_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.h $a0, $a0
-; LA32-NEXT:    srai.w $a1, $a0, 31
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i16_to_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 16
+; LA32R-NEXT:    srai.w $a0, $a1, 16
+; LA32R-NEXT:    srai.w $a1, $a1, 31
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i16_to_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ext.w.h $a0, $a0
+; LA32S-NEXT:    srai.w $a1, $a0, 31
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i16_to_i64:
 ; LA64:       # %bb.0:
@@ -142,10 +200,15 @@ define i64 @sext_i16_to_i64(i16 %a) {
 }
 
 define i64 @sext_i32_to_i64(i32 %a) {
-; LA32-LABEL: sext_i32_to_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srai.w $a1, $a0, 31
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_i32_to_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srai.w $a1, $a0, 31
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_i32_to_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srai.w $a1, $a0, 31
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_i32_to_i64:
 ; LA64:       # %bb.0:
@@ -156,10 +219,15 @@ define i64 @sext_i32_to_i64(i32 %a) {
 }
 
 define i8 @zext_i1_to_i8(i1 %a) {
-; LA32-LABEL: zext_i1_to_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i1_to_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i1_to_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i1_to_i8:
 ; LA64:       # %bb.0:
@@ -170,10 +238,15 @@ define i8 @zext_i1_to_i8(i1 %a) {
 }
 
 define i16 @zext_i1_to_i16(i1 %a) {
-; LA32-LABEL: zext_i1_to_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i1_to_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i1_to_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i1_to_i16:
 ; LA64:       # %bb.0:
@@ -184,10 +257,15 @@ define i16 @zext_i1_to_i16(i1 %a) {
 }
 
 define i32 @zext_i1_to_i32(i1 %a) {
-; LA32-LABEL: zext_i1_to_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i1_to_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i1_to_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i1_to_i32:
 ; LA64:       # %bb.0:
@@ -198,11 +276,17 @@ define i32 @zext_i1_to_i32(i1 %a) {
 }
 
 define i64 @zext_i1_to_i64(i1 %a) {
-; LA32-LABEL: zext_i1_to_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 1
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i1_to_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 1
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i1_to_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 1
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i1_to_i64:
 ; LA64:       # %bb.0:
@@ -213,10 +297,15 @@ define i64 @zext_i1_to_i64(i1 %a) {
 }
 
 define i16 @zext_i8_to_i16(i8 %a) {
-; LA32-LABEL: zext_i8_to_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i8_to_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 255
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i8_to_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i8_to_i16:
 ; LA64:       # %bb.0:
@@ -227,10 +316,15 @@ define i16 @zext_i8_to_i16(i8 %a) {
 }
 
 define i32 @zext_i8_to_i32(i8 %a) {
-; LA32-LABEL: zext_i8_to_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i8_to_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 255
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i8_to_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i8_to_i32:
 ; LA64:       # %bb.0:
@@ -241,11 +335,17 @@ define i32 @zext_i8_to_i32(i8 %a) {
 }
 
 define i64 @zext_i8_to_i64(i8 %a) {
-; LA32-LABEL: zext_i8_to_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i8_to_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a0, $a0, 255
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i8_to_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i8_to_i64:
 ; LA64:       # %bb.0:
@@ -256,10 +356,17 @@ define i64 @zext_i8_to_i64(i8 %a) {
 }
 
 define i32 @zext_i16_to_i32(i16 %a) {
-; LA32-LABEL: zext_i16_to_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i16_to_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4095
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i16_to_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i16_to_i32:
 ; LA64:       # %bb.0:
@@ -270,11 +377,19 @@ define i32 @zext_i16_to_i32(i16 %a) {
 }
 
 define i64 @zext_i16_to_i64(i16 %a) {
-; LA32-LABEL: zext_i16_to_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i16_to_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4095
+; LA32R-NEXT:    and $a0, $a0, $a1
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i16_to_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i16_to_i64:
 ; LA64:       # %bb.0:
@@ -285,10 +400,15 @@ define i64 @zext_i16_to_i64(i16 %a) {
 }
 
 define i64 @zext_i32_to_i64(i32 %a) {
-; LA32-LABEL: zext_i32_to_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: zext_i32_to_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: zext_i32_to_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: zext_i32_to_i64:
 ; LA64:       # %bb.0:
@@ -299,9 +419,13 @@ define i64 @zext_i32_to_i64(i32 %a) {
 }
 
 define i1 @trunc_i8_to_i1(i8 %a) {
-; LA32-LABEL: trunc_i8_to_i1:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i8_to_i1:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i8_to_i1:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i8_to_i1:
 ; LA64:       # %bb.0:
@@ -311,9 +435,13 @@ define i1 @trunc_i8_to_i1(i8 %a) {
 }
 
 define i1 @trunc_i16_to_i1(i16 %a) {
-; LA32-LABEL: trunc_i16_to_i1:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i16_to_i1:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i16_to_i1:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i16_to_i1:
 ; LA64:       # %bb.0:
@@ -323,9 +451,13 @@ define i1 @trunc_i16_to_i1(i16 %a) {
 }
 
 define i1 @trunc_i32_to_i1(i32 %a) {
-; LA32-LABEL: trunc_i32_to_i1:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i32_to_i1:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i32_to_i1:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i32_to_i1:
 ; LA64:       # %bb.0:
@@ -335,9 +467,13 @@ define i1 @trunc_i32_to_i1(i32 %a) {
 }
 
 define i1 @trunc_i64_to_i1(i64 %a) {
-; LA32-LABEL: trunc_i64_to_i1:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i64_to_i1:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i64_to_i1:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i64_to_i1:
 ; LA64:       # %bb.0:
@@ -347,9 +483,13 @@ define i1 @trunc_i64_to_i1(i64 %a) {
 }
 
 define i8 @trunc_i16_to_i8(i16 %a) {
-; LA32-LABEL: trunc_i16_to_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i16_to_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i16_to_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i16_to_i8:
 ; LA64:       # %bb.0:
@@ -359,9 +499,13 @@ define i8 @trunc_i16_to_i8(i16 %a) {
 }
 
 define i8 @trunc_i32_to_i8(i32 %a) {
-; LA32-LABEL: trunc_i32_to_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i32_to_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i32_to_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i32_to_i8:
 ; LA64:       # %bb.0:
@@ -371,9 +515,13 @@ define i8 @trunc_i32_to_i8(i32 %a) {
 }
 
 define i8 @trunc_i64_to_i8(i64 %a) {
-; LA32-LABEL: trunc_i64_to_i8:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i64_to_i8:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i64_to_i8:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i64_to_i8:
 ; LA64:       # %bb.0:
@@ -383,9 +531,13 @@ define i8 @trunc_i64_to_i8(i64 %a) {
 }
 
 define i16 @trunc_i32_to_i16(i32 %a) {
-; LA32-LABEL: trunc_i32_to_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i32_to_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i32_to_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i32_to_i16:
 ; LA64:       # %bb.0:
@@ -395,9 +547,13 @@ define i16 @trunc_i32_to_i16(i32 %a) {
 }
 
 define i16 @trunc_i64_to_i16(i64 %a) {
-; LA32-LABEL: trunc_i64_to_i16:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i64_to_i16:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i64_to_i16:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i64_to_i16:
 ; LA64:       # %bb.0:
@@ -407,9 +563,13 @@ define i16 @trunc_i64_to_i16(i64 %a) {
 }
 
 define i32 @trunc_i64_to_i32(i64 %a) {
-; LA32-LABEL: trunc_i64_to_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ret
+; LA32R-LABEL: trunc_i64_to_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: trunc_i64_to_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: trunc_i64_to_i32:
 ; LA64:       # %bb.0:
diff --git a/llvm/test/CodeGen/LoongArch/ir-instruction/shl.ll b/llvm/test/CodeGen/LoongArch/ir-instruction/shl.ll
index 2be777cc8db01..97af45edd35f9 100644
--- a/llvm/test/CodeGen/LoongArch/ir-instruction/shl.ll
+++ b/llvm/test/CodeGen/LoongArch/ir-instruction/shl.ll
@@ -61,20 +61,22 @@ define i32 @shl_i32(i32 %x, i32 %y) {
 define i64 @shl_i64(i64 %x, i64 %y) {
 ; LA32-LABEL: shl_i64:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    xori $a3, $a2, 31
-; LA32-NEXT:    srli.w $a4, $a0, 1
-; LA32-NEXT:    srl.w $a3, $a4, $a3
-; LA32-NEXT:    or $a1, $a1, $a3
 ; LA32-NEXT:    addi.w $a3, $a2, -32
-; LA32-NEXT:    slti $a4, $a3, 0
-; LA32-NEXT:    maskeqz $a1, $a1, $a4
-; LA32-NEXT:    sll.w $a5, $a0, $a3
-; LA32-NEXT:    masknez $a4, $a5, $a4
+; LA32-NEXT:    bltz $a3, .LBB4_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    sll.w $a1, $a0, $a3
+; LA32-NEXT:    b .LBB4_3
+; LA32-NEXT:  .LBB4_2:
+; LA32-NEXT:    sll.w $a1, $a1, $a2
+; LA32-NEXT:    xori $a4, $a2, 31
+; LA32-NEXT:    srli.w $a5, $a0, 1
+; LA32-NEXT:    srl.w $a4, $a5, $a4
 ; LA32-NEXT:    or $a1, $a1, $a4
+; LA32-NEXT:  .LBB4_3:
+; LA32-NEXT:    slti $a3, $a3, 0
+; LA32-NEXT:    sub.w $a3, $zero, $a3
 ; LA32-NEXT:    sll.w $a0, $a0, $a2
-; LA32-NEXT:    srai.w $a2, $a3, 31
-; LA32-NEXT:    and $a0, $a2, $a0
+; LA32-NEXT:    and $a0, $a3, $a0
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: shl_i64:
diff --git a/llvm/test/CodeGen/LoongArch/jump-table.ll b/llvm/test/CodeGen/LoongArch/jump-table.ll
index e80915c1e8e0f..a59c71a85412c 100644
--- a/llvm/test/CodeGen/LoongArch/jump-table.ll
+++ b/llvm/test/CodeGen/LoongArch/jump-table.ll
@@ -1,9 +1,9 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d --min-jump-table-entries=5 < %s \
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --min-jump-table-entries=5 < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA32
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --min-jump-table-entries=5 < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA64
-; RUN: llc --mtriple=loongarch32 -mattr=+d --min-jump-table-entries=4 < %s \
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --min-jump-table-entries=4 < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA32-JT
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --min-jump-table-entries=4 < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA64-JT
diff --git a/llvm/test/CodeGen/LoongArch/rotl-rotr.ll b/llvm/test/CodeGen/LoongArch/rotl-rotr.ll
index 774cf614f6099..9e98b2de79898 100644
--- a/llvm/test/CodeGen/LoongArch/rotl-rotr.ll
+++ b/llvm/test/CodeGen/LoongArch/rotl-rotr.ll
@@ -1,13 +1,22 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d < %s | FileCheck %s --check-prefix=LA32
+; RUN: llc --mtriple=loongarch32 -mattr=-32s,+d < %s | FileCheck %s --check-prefix=LA32R
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d < %s | FileCheck %s --check-prefix=LA32S
 ; RUN: llc --mtriple=loongarch64 -mattr=+d < %s | FileCheck %s --check-prefix=LA64
 
 define signext i32 @rotl_32(i32 signext %x, i32 signext %y) nounwind {
-; LA32-LABEL: rotl_32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sub.w $a1, $zero, $a1
-; LA32-NEXT:    rotr.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    sub.w $a2, $zero, $a1
+; LA32R-NEXT:    sll.w $a1, $a0, $a1
+; LA32R-NEXT:    srl.w $a0, $a0, $a2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sub.w $a1, $zero, $a1
+; LA32S-NEXT:    rotr.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_32:
 ; LA64:       # %bb.0:
@@ -22,10 +31,18 @@ define signext i32 @rotl_32(i32 signext %x, i32 signext %y) nounwind {
 }
 
 define signext i32 @rotr_32(i32 signext %x, i32 signext %y) nounwind {
-; LA32-LABEL: rotr_32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    rotr.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    sub.w $a2, $zero, $a1
+; LA32R-NEXT:    srl.w $a1, $a0, $a1
+; LA32R-NEXT:    sll.w $a0, $a0, $a2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    rotr.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_32:
 ; LA64:       # %bb.0:
@@ -39,42 +56,84 @@ define signext i32 @rotr_32(i32 signext %x, i32 signext %y) nounwind {
 }
 
 define i64 @rotl_64(i64 %x, i64 %y) nounwind {
-; LA32-LABEL: rotl_64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sll.w $a3, $a1, $a2
-; LA32-NEXT:    xori $a4, $a2, 31
-; LA32-NEXT:    srli.w $a5, $a0, 1
-; LA32-NEXT:    srl.w $a4, $a5, $a4
-; LA32-NEXT:    or $a3, $a3, $a4
-; LA32-NEXT:    addi.w $a4, $a2, -32
-; LA32-NEXT:    slti $a5, $a4, 0
-; LA32-NEXT:    maskeqz $a3, $a3, $a5
-; LA32-NEXT:    sll.w $a6, $a0, $a4
-; LA32-NEXT:    masknez $a5, $a6, $a5
-; LA32-NEXT:    or $a3, $a3, $a5
-; LA32-NEXT:    sll.w $a5, $a0, $a2
-; LA32-NEXT:    srai.w $a4, $a4, 31
-; LA32-NEXT:    and $a4, $a4, $a5
-; LA32-NEXT:    sub.w $a5, $zero, $a2
-; LA32-NEXT:    srl.w $a6, $a1, $a5
-; LA32-NEXT:    ori $a7, $zero, 32
-; LA32-NEXT:    sub.w $a7, $a7, $a2
-; LA32-NEXT:    slti $t0, $a7, 0
-; LA32-NEXT:    masknez $t1, $a6, $t0
-; LA32-NEXT:    srl.w $a0, $a0, $a5
-; LA32-NEXT:    ori $a5, $zero, 64
-; LA32-NEXT:    sub.w $a2, $a5, $a2
-; LA32-NEXT:    xori $a2, $a2, 31
-; LA32-NEXT:    slli.w $a1, $a1, 1
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    maskeqz $a0, $a0, $t0
-; LA32-NEXT:    or $a0, $a0, $t1
-; LA32-NEXT:    srai.w $a1, $a7, 31
-; LA32-NEXT:    and $a1, $a1, $a6
-; LA32-NEXT:    or $a1, $a3, $a1
-; LA32-NEXT:    or $a0, $a4, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a3, $a2, -32
+; LA32R-NEXT:    slti $a4, $a3, 0
+; LA32R-NEXT:    sub.w $a4, $zero, $a4
+; LA32R-NEXT:    sll.w $a5, $a0, $a2
+; LA32R-NEXT:    bltz $a3, .LBB2_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    sll.w $a3, $a0, $a3
+; LA32R-NEXT:    b .LBB2_3
+; LA32R-NEXT:  .LBB2_2:
+; LA32R-NEXT:    sll.w $a3, $a1, $a2
+; LA32R-NEXT:    xori $a6, $a2, 31
+; LA32R-NEXT:    srli.w $a7, $a0, 1
+; LA32R-NEXT:    srl.w $a6, $a7, $a6
+; LA32R-NEXT:    or $a3, $a3, $a6
+; LA32R-NEXT:  .LBB2_3:
+; LA32R-NEXT:    and $a4, $a4, $a5
+; LA32R-NEXT:    sub.w $a7, $zero, $a2
+; LA32R-NEXT:    ori $a5, $zero, 32
+; LA32R-NEXT:    sub.w $a6, $a5, $a2
+; LA32R-NEXT:    srl.w $a5, $a1, $a7
+; LA32R-NEXT:    bltz $a6, .LBB2_5
+; LA32R-NEXT:  # %bb.4:
+; LA32R-NEXT:    move $a0, $a5
+; LA32R-NEXT:    b .LBB2_6
+; LA32R-NEXT:  .LBB2_5:
+; LA32R-NEXT:    srl.w $a0, $a0, $a7
+; LA32R-NEXT:    ori $a7, $zero, 64
+; LA32R-NEXT:    sub.w $a2, $a7, $a2
+; LA32R-NEXT:    xori $a2, $a2, 31
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    sll.w $a1, $a1, $a2
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:  .LBB2_6:
+; LA32R-NEXT:    slti $a1, $a6, 0
+; LA32R-NEXT:    sub.w $a1, $zero, $a1
+; LA32R-NEXT:    and $a1, $a1, $a5
+; LA32R-NEXT:    or $a1, $a3, $a1
+; LA32R-NEXT:    or $a0, $a4, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sll.w $a3, $a1, $a2
+; LA32S-NEXT:    xori $a4, $a2, 31
+; LA32S-NEXT:    srli.w $a5, $a0, 1
+; LA32S-NEXT:    srl.w $a4, $a5, $a4
+; LA32S-NEXT:    or $a3, $a3, $a4
+; LA32S-NEXT:    addi.w $a4, $a2, -32
+; LA32S-NEXT:    slti $a5, $a4, 0
+; LA32S-NEXT:    maskeqz $a3, $a3, $a5
+; LA32S-NEXT:    sll.w $a6, $a0, $a4
+; LA32S-NEXT:    masknez $a5, $a6, $a5
+; LA32S-NEXT:    or $a3, $a3, $a5
+; LA32S-NEXT:    sll.w $a5, $a0, $a2
+; LA32S-NEXT:    srai.w $a4, $a4, 31
+; LA32S-NEXT:    and $a4, $a4, $a5
+; LA32S-NEXT:    sub.w $a5, $zero, $a2
+; LA32S-NEXT:    srl.w $a6, $a1, $a5
+; LA32S-NEXT:    ori $a7, $zero, 32
+; LA32S-NEXT:    sub.w $a7, $a7, $a2
+; LA32S-NEXT:    slti $t0, $a7, 0
+; LA32S-NEXT:    masknez $t1, $a6, $t0
+; LA32S-NEXT:    srl.w $a0, $a0, $a5
+; LA32S-NEXT:    ori $a5, $zero, 64
+; LA32S-NEXT:    sub.w $a2, $a5, $a2
+; LA32S-NEXT:    xori $a2, $a2, 31
+; LA32S-NEXT:    slli.w $a1, $a1, 1
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    maskeqz $a0, $a0, $t0
+; LA32S-NEXT:    or $a0, $a0, $t1
+; LA32S-NEXT:    srai.w $a1, $a7, 31
+; LA32S-NEXT:    and $a1, $a1, $a6
+; LA32S-NEXT:    or $a1, $a3, $a1
+; LA32S-NEXT:    or $a0, $a4, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_64:
 ; LA64:       # %bb.0:
@@ -89,42 +148,84 @@ define i64 @rotl_64(i64 %x, i64 %y) nounwind {
 }
 
 define i64 @rotr_64(i64 %x, i64 %y) nounwind {
-; LA32-LABEL: rotr_64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srl.w $a3, $a0, $a2
-; LA32-NEXT:    xori $a4, $a2, 31
-; LA32-NEXT:    slli.w $a5, $a1, 1
-; LA32-NEXT:    sll.w $a4, $a5, $a4
-; LA32-NEXT:    or $a3, $a3, $a4
-; LA32-NEXT:    addi.w $a4, $a2, -32
-; LA32-NEXT:    slti $a5, $a4, 0
-; LA32-NEXT:    maskeqz $a3, $a3, $a5
-; LA32-NEXT:    srl.w $a6, $a1, $a4
-; LA32-NEXT:    masknez $a5, $a6, $a5
-; LA32-NEXT:    or $a3, $a3, $a5
-; LA32-NEXT:    srl.w $a5, $a1, $a2
-; LA32-NEXT:    srai.w $a4, $a4, 31
-; LA32-NEXT:    and $a4, $a4, $a5
-; LA32-NEXT:    sub.w $a5, $zero, $a2
-; LA32-NEXT:    sll.w $a6, $a0, $a5
-; LA32-NEXT:    ori $a7, $zero, 32
-; LA32-NEXT:    sub.w $a7, $a7, $a2
-; LA32-NEXT:    slti $t0, $a7, 0
-; LA32-NEXT:    masknez $t1, $a6, $t0
-; LA32-NEXT:    sll.w $a1, $a1, $a5
-; LA32-NEXT:    ori $a5, $zero, 64
-; LA32-NEXT:    sub.w $a2, $a5, $a2
-; LA32-NEXT:    xori $a2, $a2, 31
-; LA32-NEXT:    srli.w $a0, $a0, 1
-; LA32-NEXT:    srl.w $a0, $a0, $a2
-; LA32-NEXT:    or $a0, $a1, $a0
-; LA32-NEXT:    maskeqz $a0, $a0, $t0
-; LA32-NEXT:    or $a1, $a0, $t1
-; LA32-NEXT:    srai.w $a0, $a7, 31
-; LA32-NEXT:    and $a0, $a0, $a6
-; LA32-NEXT:    or $a0, $a3, $a0
-; LA32-NEXT:    or $a1, $a4, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a3, $a2, -32
+; LA32R-NEXT:    slti $a4, $a3, 0
+; LA32R-NEXT:    sub.w $a4, $zero, $a4
+; LA32R-NEXT:    srl.w $a5, $a1, $a2
+; LA32R-NEXT:    bltz $a3, .LBB3_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    srl.w $a3, $a1, $a3
+; LA32R-NEXT:    b .LBB3_3
+; LA32R-NEXT:  .LBB3_2:
+; LA32R-NEXT:    srl.w $a3, $a0, $a2
+; LA32R-NEXT:    xori $a6, $a2, 31
+; LA32R-NEXT:    slli.w $a7, $a1, 1
+; LA32R-NEXT:    sll.w $a6, $a7, $a6
+; LA32R-NEXT:    or $a3, $a3, $a6
+; LA32R-NEXT:  .LBB3_3:
+; LA32R-NEXT:    and $a4, $a4, $a5
+; LA32R-NEXT:    sub.w $a7, $zero, $a2
+; LA32R-NEXT:    ori $a5, $zero, 32
+; LA32R-NEXT:    sub.w $a6, $a5, $a2
+; LA32R-NEXT:    sll.w $a5, $a0, $a7
+; LA32R-NEXT:    bltz $a6, .LBB3_5
+; LA32R-NEXT:  # %bb.4:
+; LA32R-NEXT:    move $a1, $a5
+; LA32R-NEXT:    b .LBB3_6
+; LA32R-NEXT:  .LBB3_5:
+; LA32R-NEXT:    sll.w $a1, $a1, $a7
+; LA32R-NEXT:    ori $a7, $zero, 64
+; LA32R-NEXT:    sub.w $a2, $a7, $a2
+; LA32R-NEXT:    xori $a2, $a2, 31
+; LA32R-NEXT:    srli.w $a0, $a0, 1
+; LA32R-NEXT:    srl.w $a0, $a0, $a2
+; LA32R-NEXT:    or $a1, $a1, $a0
+; LA32R-NEXT:  .LBB3_6:
+; LA32R-NEXT:    slti $a0, $a6, 0
+; LA32R-NEXT:    sub.w $a0, $zero, $a0
+; LA32R-NEXT:    and $a0, $a0, $a5
+; LA32R-NEXT:    or $a0, $a3, $a0
+; LA32R-NEXT:    or $a1, $a4, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srl.w $a3, $a0, $a2
+; LA32S-NEXT:    xori $a4, $a2, 31
+; LA32S-NEXT:    slli.w $a5, $a1, 1
+; LA32S-NEXT:    sll.w $a4, $a5, $a4
+; LA32S-NEXT:    or $a3, $a3, $a4
+; LA32S-NEXT:    addi.w $a4, $a2, -32
+; LA32S-NEXT:    slti $a5, $a4, 0
+; LA32S-NEXT:    maskeqz $a3, $a3, $a5
+; LA32S-NEXT:    srl.w $a6, $a1, $a4
+; LA32S-NEXT:    masknez $a5, $a6, $a5
+; LA32S-NEXT:    or $a3, $a3, $a5
+; LA32S-NEXT:    srl.w $a5, $a1, $a2
+; LA32S-NEXT:    srai.w $a4, $a4, 31
+; LA32S-NEXT:    and $a4, $a4, $a5
+; LA32S-NEXT:    sub.w $a5, $zero, $a2
+; LA32S-NEXT:    sll.w $a6, $a0, $a5
+; LA32S-NEXT:    ori $a7, $zero, 32
+; LA32S-NEXT:    sub.w $a7, $a7, $a2
+; LA32S-NEXT:    slti $t0, $a7, 0
+; LA32S-NEXT:    masknez $t1, $a6, $t0
+; LA32S-NEXT:    sll.w $a1, $a1, $a5
+; LA32S-NEXT:    ori $a5, $zero, 64
+; LA32S-NEXT:    sub.w $a2, $a5, $a2
+; LA32S-NEXT:    xori $a2, $a2, 31
+; LA32S-NEXT:    srli.w $a0, $a0, 1
+; LA32S-NEXT:    srl.w $a0, $a0, $a2
+; LA32S-NEXT:    or $a0, $a1, $a0
+; LA32S-NEXT:    maskeqz $a0, $a0, $t0
+; LA32S-NEXT:    or $a1, $a0, $t1
+; LA32S-NEXT:    srai.w $a0, $a7, 31
+; LA32S-NEXT:    and $a0, $a0, $a6
+; LA32S-NEXT:    or $a0, $a3, $a0
+; LA32S-NEXT:    or $a1, $a4, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_64:
 ; LA64:       # %bb.0:
@@ -138,11 +239,19 @@ define i64 @rotr_64(i64 %x, i64 %y) nounwind {
 }
 
 define signext i32 @rotl_32_mask(i32 signext %x, i32 signext %y) nounwind {
-; LA32-LABEL: rotl_32_mask:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sub.w $a1, $zero, $a1
-; LA32-NEXT:    rotr.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_32_mask:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    sub.w $a2, $zero, $a1
+; LA32R-NEXT:    sll.w $a1, $a0, $a1
+; LA32R-NEXT:    srl.w $a0, $a0, $a2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_32_mask:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sub.w $a1, $zero, $a1
+; LA32S-NEXT:    rotr.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_32_mask:
 ; LA64:       # %bb.0:
@@ -158,11 +267,19 @@ define signext i32 @rotl_32_mask(i32 signext %x, i32 signext %y) nounwind {
 }
 
 define signext i32 @rotl_32_mask_and_63_and_31(i32 signext %x, i32 signext %y) nounwind {
-; LA32-LABEL: rotl_32_mask_and_63_and_31:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sub.w $a1, $zero, $a1
-; LA32-NEXT:    rotr.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_32_mask_and_63_and_31:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    sll.w $a2, $a0, $a1
+; LA32R-NEXT:    sub.w $a1, $zero, $a1
+; LA32R-NEXT:    srl.w $a0, $a0, $a1
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_32_mask_and_63_and_31:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sub.w $a1, $zero, $a1
+; LA32S-NEXT:    rotr.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_32_mask_and_63_and_31:
 ; LA64:       # %bb.0:
@@ -179,11 +296,16 @@ define signext i32 @rotl_32_mask_and_63_and_31(i32 signext %x, i32 signext %y) n
 }
 
 define signext i32 @rotl_32_mask_or_64_or_32(i32 signext %x, i32 signext %y) nounwind {
-; LA32-LABEL: rotl_32_mask_or_64_or_32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sub.w $a1, $zero, $a1
-; LA32-NEXT:    rotr.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_32_mask_or_64_or_32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_32_mask_or_64_or_32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sub.w $a1, $zero, $a1
+; LA32S-NEXT:    rotr.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_32_mask_or_64_or_32:
 ; LA64:       # %bb.0:
@@ -200,10 +322,18 @@ define signext i32 @rotl_32_mask_or_64_or_32(i32 signext %x, i32 signext %y) nou
 }
 
 define signext i32 @rotr_32_mask(i32 signext %x, i32 signext %y) nounwind {
-; LA32-LABEL: rotr_32_mask:
-; LA32:       # %bb.0:
-; LA32-NEXT:    rotr.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_32_mask:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    sub.w $a2, $zero, $a1
+; LA32R-NEXT:    srl.w $a1, $a0, $a1
+; LA32R-NEXT:    sll.w $a0, $a0, $a2
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_32_mask:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    rotr.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_32_mask:
 ; LA64:       # %bb.0:
@@ -218,10 +348,18 @@ define signext i32 @rotr_32_mask(i32 signext %x, i32 signext %y) nounwind {
 }
 
 define signext i32 @rotr_32_mask_and_63_and_31(i32 signext %x, i32 signext %y) nounwind {
-; LA32-LABEL: rotr_32_mask_and_63_and_31:
-; LA32:       # %bb.0:
-; LA32-NEXT:    rotr.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_32_mask_and_63_and_31:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srl.w $a2, $a0, $a1
+; LA32R-NEXT:    sub.w $a1, $zero, $a1
+; LA32R-NEXT:    sll.w $a0, $a0, $a1
+; LA32R-NEXT:    or $a0, $a2, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_32_mask_and_63_and_31:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    rotr.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_32_mask_and_63_and_31:
 ; LA64:       # %bb.0:
@@ -237,10 +375,15 @@ define signext i32 @rotr_32_mask_and_63_and_31(i32 signext %x, i32 signext %y) n
 }
 
 define signext i32 @rotr_32_mask_or_64_or_32(i32 signext %x, i32 signext %y) nounwind {
-; LA32-LABEL: rotr_32_mask_or_64_or_32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    rotr.w $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_32_mask_or_64_or_32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_32_mask_or_64_or_32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    rotr.w $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_32_mask_or_64_or_32:
 ; LA64:       # %bb.0:
@@ -256,41 +399,81 @@ define signext i32 @rotr_32_mask_or_64_or_32(i32 signext %x, i32 signext %y) nou
 }
 
 define i64 @rotl_64_mask(i64 %x, i64 %y) nounwind {
-; LA32-LABEL: rotl_64_mask:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sll.w $a3, $a1, $a2
-; LA32-NEXT:    xori $a4, $a2, 31
-; LA32-NEXT:    srli.w $a5, $a0, 1
-; LA32-NEXT:    srl.w $a4, $a5, $a4
-; LA32-NEXT:    or $a3, $a3, $a4
-; LA32-NEXT:    addi.w $a4, $a2, -32
-; LA32-NEXT:    slti $a5, $a4, 0
-; LA32-NEXT:    maskeqz $a3, $a3, $a5
-; LA32-NEXT:    sll.w $a6, $a0, $a4
-; LA32-NEXT:    masknez $a5, $a6, $a5
-; LA32-NEXT:    or $a3, $a3, $a5
-; LA32-NEXT:    sll.w $a5, $a0, $a2
-; LA32-NEXT:    srai.w $a4, $a4, 31
-; LA32-NEXT:    and $a4, $a4, $a5
-; LA32-NEXT:    sub.w $a2, $zero, $a2
-; LA32-NEXT:    andi $a5, $a2, 63
-; LA32-NEXT:    addi.w $a6, $a5, -32
-; LA32-NEXT:    srl.w $a7, $a1, $a6
-; LA32-NEXT:    slti $t0, $a6, 0
-; LA32-NEXT:    masknez $a7, $a7, $t0
-; LA32-NEXT:    srl.w $a0, $a0, $a2
-; LA32-NEXT:    xori $a5, $a5, 31
-; LA32-NEXT:    slli.w $t1, $a1, 1
-; LA32-NEXT:    sll.w $a5, $t1, $a5
-; LA32-NEXT:    or $a0, $a0, $a5
-; LA32-NEXT:    maskeqz $a0, $a0, $t0
-; LA32-NEXT:    or $a0, $a0, $a7
-; LA32-NEXT:    srl.w $a1, $a1, $a2
-; LA32-NEXT:    srai.w $a2, $a6, 31
-; LA32-NEXT:    and $a1, $a2, $a1
-; LA32-NEXT:    or $a1, $a3, $a1
-; LA32-NEXT:    or $a0, $a4, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_64_mask:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a4, $a2, -32
+; LA32R-NEXT:    bltz $a4, .LBB10_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    sll.w $a3, $a0, $a4
+; LA32R-NEXT:    b .LBB10_3
+; LA32R-NEXT:  .LBB10_2:
+; LA32R-NEXT:    sll.w $a3, $a1, $a2
+; LA32R-NEXT:    xori $a5, $a2, 31
+; LA32R-NEXT:    srli.w $a6, $a0, 1
+; LA32R-NEXT:    srl.w $a5, $a6, $a5
+; LA32R-NEXT:    or $a3, $a3, $a5
+; LA32R-NEXT:  .LBB10_3:
+; LA32R-NEXT:    slti $a4, $a4, 0
+; LA32R-NEXT:    sub.w $a4, $zero, $a4
+; LA32R-NEXT:    sll.w $a5, $a0, $a2
+; LA32R-NEXT:    and $a4, $a4, $a5
+; LA32R-NEXT:    sub.w $a5, $zero, $a2
+; LA32R-NEXT:    andi $a6, $a5, 63
+; LA32R-NEXT:    addi.w $a7, $a6, -32
+; LA32R-NEXT:    slti $a2, $a7, 0
+; LA32R-NEXT:    sub.w $a2, $zero, $a2
+; LA32R-NEXT:    srl.w $t0, $a1, $a5
+; LA32R-NEXT:    and $a2, $a2, $t0
+; LA32R-NEXT:    bltz $a7, .LBB10_5
+; LA32R-NEXT:  # %bb.4:
+; LA32R-NEXT:    srl.w $a0, $a1, $a7
+; LA32R-NEXT:    b .LBB10_6
+; LA32R-NEXT:  .LBB10_5:
+; LA32R-NEXT:    srl.w $a0, $a0, $a5
+; LA32R-NEXT:    xori $a5, $a6, 31
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    sll.w $a1, $a1, $a5
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:  .LBB10_6:
+; LA32R-NEXT:    or $a0, $a4, $a0
+; LA32R-NEXT:    or $a1, $a3, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_64_mask:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sll.w $a3, $a1, $a2
+; LA32S-NEXT:    xori $a4, $a2, 31
+; LA32S-NEXT:    srli.w $a5, $a0, 1
+; LA32S-NEXT:    srl.w $a4, $a5, $a4
+; LA32S-NEXT:    or $a3, $a3, $a4
+; LA32S-NEXT:    addi.w $a4, $a2, -32
+; LA32S-NEXT:    slti $a5, $a4, 0
+; LA32S-NEXT:    maskeqz $a3, $a3, $a5
+; LA32S-NEXT:    sll.w $a6, $a0, $a4
+; LA32S-NEXT:    masknez $a5, $a6, $a5
+; LA32S-NEXT:    or $a3, $a3, $a5
+; LA32S-NEXT:    sll.w $a5, $a0, $a2
+; LA32S-NEXT:    srai.w $a4, $a4, 31
+; LA32S-NEXT:    and $a4, $a4, $a5
+; LA32S-NEXT:    sub.w $a2, $zero, $a2
+; LA32S-NEXT:    andi $a5, $a2, 63
+; LA32S-NEXT:    addi.w $a6, $a5, -32
+; LA32S-NEXT:    srl.w $a7, $a1, $a6
+; LA32S-NEXT:    slti $t0, $a6, 0
+; LA32S-NEXT:    masknez $a7, $a7, $t0
+; LA32S-NEXT:    srl.w $a0, $a0, $a2
+; LA32S-NEXT:    xori $a5, $a5, 31
+; LA32S-NEXT:    slli.w $t1, $a1, 1
+; LA32S-NEXT:    sll.w $a5, $t1, $a5
+; LA32S-NEXT:    or $a0, $a0, $a5
+; LA32S-NEXT:    maskeqz $a0, $a0, $t0
+; LA32S-NEXT:    or $a0, $a0, $a7
+; LA32S-NEXT:    srl.w $a1, $a1, $a2
+; LA32S-NEXT:    srai.w $a2, $a6, 31
+; LA32S-NEXT:    and $a1, $a2, $a1
+; LA32S-NEXT:    or $a1, $a3, $a1
+; LA32S-NEXT:    or $a0, $a4, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_64_mask:
 ; LA64:       # %bb.0:
@@ -306,42 +489,83 @@ define i64 @rotl_64_mask(i64 %x, i64 %y) nounwind {
 }
 
 define i64 @rotl_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
-; LA32-LABEL: rotl_64_mask_and_127_and_63:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sll.w $a3, $a1, $a2
-; LA32-NEXT:    srli.w $a4, $a0, 1
-; LA32-NEXT:    andi $a5, $a2, 127
-; LA32-NEXT:    xori $a6, $a5, 31
-; LA32-NEXT:    srl.w $a4, $a4, $a6
-; LA32-NEXT:    or $a3, $a3, $a4
-; LA32-NEXT:    addi.w $a4, $a5, -32
-; LA32-NEXT:    slti $a5, $a4, 0
-; LA32-NEXT:    maskeqz $a3, $a3, $a5
-; LA32-NEXT:    sll.w $a6, $a0, $a4
-; LA32-NEXT:    masknez $a5, $a6, $a5
-; LA32-NEXT:    or $a3, $a3, $a5
-; LA32-NEXT:    sll.w $a5, $a0, $a2
-; LA32-NEXT:    srai.w $a4, $a4, 31
-; LA32-NEXT:    and $a4, $a4, $a5
-; LA32-NEXT:    sub.w $a2, $zero, $a2
-; LA32-NEXT:    andi $a5, $a2, 63
-; LA32-NEXT:    addi.w $a6, $a5, -32
-; LA32-NEXT:    srl.w $a7, $a1, $a6
-; LA32-NEXT:    slti $t0, $a6, 0
-; LA32-NEXT:    masknez $a7, $a7, $t0
-; LA32-NEXT:    srl.w $a0, $a0, $a2
-; LA32-NEXT:    xori $a5, $a5, 31
-; LA32-NEXT:    slli.w $t1, $a1, 1
-; LA32-NEXT:    sll.w $a5, $t1, $a5
-; LA32-NEXT:    or $a0, $a0, $a5
-; LA32-NEXT:    maskeqz $a0, $a0, $t0
-; LA32-NEXT:    or $a0, $a0, $a7
-; LA32-NEXT:    srl.w $a1, $a1, $a2
-; LA32-NEXT:    srai.w $a2, $a6, 31
-; LA32-NEXT:    and $a1, $a2, $a1
-; LA32-NEXT:    or $a1, $a3, $a1
-; LA32-NEXT:    or $a0, $a4, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_64_mask_and_127_and_63:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a3, $a2, 127
+; LA32R-NEXT:    addi.w $a4, $a3, -32
+; LA32R-NEXT:    bltz $a4, .LBB11_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    sll.w $a3, $a0, $a4
+; LA32R-NEXT:    b .LBB11_3
+; LA32R-NEXT:  .LBB11_2:
+; LA32R-NEXT:    sll.w $a5, $a1, $a2
+; LA32R-NEXT:    srli.w $a6, $a0, 1
+; LA32R-NEXT:    xori $a3, $a3, 31
+; LA32R-NEXT:    srl.w $a3, $a6, $a3
+; LA32R-NEXT:    or $a3, $a5, $a3
+; LA32R-NEXT:  .LBB11_3:
+; LA32R-NEXT:    slti $a4, $a4, 0
+; LA32R-NEXT:    sub.w $a4, $zero, $a4
+; LA32R-NEXT:    sll.w $a5, $a0, $a2
+; LA32R-NEXT:    and $a4, $a4, $a5
+; LA32R-NEXT:    sub.w $a5, $zero, $a2
+; LA32R-NEXT:    andi $a6, $a5, 63
+; LA32R-NEXT:    addi.w $a7, $a6, -32
+; LA32R-NEXT:    slti $a2, $a7, 0
+; LA32R-NEXT:    sub.w $a2, $zero, $a2
+; LA32R-NEXT:    srl.w $t0, $a1, $a5
+; LA32R-NEXT:    and $a2, $a2, $t0
+; LA32R-NEXT:    bltz $a7, .LBB11_5
+; LA32R-NEXT:  # %bb.4:
+; LA32R-NEXT:    srl.w $a0, $a1, $a7
+; LA32R-NEXT:    b .LBB11_6
+; LA32R-NEXT:  .LBB11_5:
+; LA32R-NEXT:    srl.w $a0, $a0, $a5
+; LA32R-NEXT:    xori $a5, $a6, 31
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    sll.w $a1, $a1, $a5
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:  .LBB11_6:
+; LA32R-NEXT:    or $a0, $a4, $a0
+; LA32R-NEXT:    or $a1, $a3, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_64_mask_and_127_and_63:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sll.w $a3, $a1, $a2
+; LA32S-NEXT:    srli.w $a4, $a0, 1
+; LA32S-NEXT:    andi $a5, $a2, 127
+; LA32S-NEXT:    xori $a6, $a5, 31
+; LA32S-NEXT:    srl.w $a4, $a4, $a6
+; LA32S-NEXT:    or $a3, $a3, $a4
+; LA32S-NEXT:    addi.w $a4, $a5, -32
+; LA32S-NEXT:    slti $a5, $a4, 0
+; LA32S-NEXT:    maskeqz $a3, $a3, $a5
+; LA32S-NEXT:    sll.w $a6, $a0, $a4
+; LA32S-NEXT:    masknez $a5, $a6, $a5
+; LA32S-NEXT:    or $a3, $a3, $a5
+; LA32S-NEXT:    sll.w $a5, $a0, $a2
+; LA32S-NEXT:    srai.w $a4, $a4, 31
+; LA32S-NEXT:    and $a4, $a4, $a5
+; LA32S-NEXT:    sub.w $a2, $zero, $a2
+; LA32S-NEXT:    andi $a5, $a2, 63
+; LA32S-NEXT:    addi.w $a6, $a5, -32
+; LA32S-NEXT:    srl.w $a7, $a1, $a6
+; LA32S-NEXT:    slti $t0, $a6, 0
+; LA32S-NEXT:    masknez $a7, $a7, $t0
+; LA32S-NEXT:    srl.w $a0, $a0, $a2
+; LA32S-NEXT:    xori $a5, $a5, 31
+; LA32S-NEXT:    slli.w $t1, $a1, 1
+; LA32S-NEXT:    sll.w $a5, $t1, $a5
+; LA32S-NEXT:    or $a0, $a0, $a5
+; LA32S-NEXT:    maskeqz $a0, $a0, $t0
+; LA32S-NEXT:    or $a0, $a0, $a7
+; LA32S-NEXT:    srl.w $a1, $a1, $a2
+; LA32S-NEXT:    srai.w $a2, $a6, 31
+; LA32S-NEXT:    and $a1, $a2, $a1
+; LA32S-NEXT:    or $a1, $a3, $a1
+; LA32S-NEXT:    or $a0, $a4, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_64_mask_and_127_and_63:
 ; LA64:       # %bb.0:
@@ -358,11 +582,17 @@ define i64 @rotl_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
 }
 
 define i64 @rotl_64_mask_or_128_or_64(i64 %x, i64 %y) nounwind {
-; LA32-LABEL: rotl_64_mask_or_128_or_64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    move $a0, $zero
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_64_mask_or_128_or_64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_64_mask_or_128_or_64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    move $a0, $zero
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_64_mask_or_128_or_64:
 ; LA64:       # %bb.0:
@@ -379,41 +609,81 @@ define i64 @rotl_64_mask_or_128_or_64(i64 %x, i64 %y) nounwind {
 }
 
 define i64 @rotr_64_mask(i64 %x, i64 %y) nounwind {
-; LA32-LABEL: rotr_64_mask:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srl.w $a3, $a0, $a2
-; LA32-NEXT:    xori $a4, $a2, 31
-; LA32-NEXT:    slli.w $a5, $a1, 1
-; LA32-NEXT:    sll.w $a4, $a5, $a4
-; LA32-NEXT:    or $a3, $a3, $a4
-; LA32-NEXT:    addi.w $a4, $a2, -32
-; LA32-NEXT:    slti $a5, $a4, 0
-; LA32-NEXT:    maskeqz $a3, $a3, $a5
-; LA32-NEXT:    srl.w $a6, $a1, $a4
-; LA32-NEXT:    masknez $a5, $a6, $a5
-; LA32-NEXT:    or $a3, $a3, $a5
-; LA32-NEXT:    srl.w $a5, $a1, $a2
-; LA32-NEXT:    srai.w $a4, $a4, 31
-; LA32-NEXT:    and $a4, $a4, $a5
-; LA32-NEXT:    sub.w $a2, $zero, $a2
-; LA32-NEXT:    andi $a5, $a2, 63
-; LA32-NEXT:    addi.w $a6, $a5, -32
-; LA32-NEXT:    sll.w $a7, $a0, $a6
-; LA32-NEXT:    slti $t0, $a6, 0
-; LA32-NEXT:    masknez $a7, $a7, $t0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    xori $a5, $a5, 31
-; LA32-NEXT:    srli.w $t1, $a0, 1
-; LA32-NEXT:    srl.w $a5, $t1, $a5
-; LA32-NEXT:    or $a1, $a1, $a5
-; LA32-NEXT:    maskeqz $a1, $a1, $t0
-; LA32-NEXT:    or $a1, $a1, $a7
-; LA32-NEXT:    sll.w $a0, $a0, $a2
-; LA32-NEXT:    srai.w $a2, $a6, 31
-; LA32-NEXT:    and $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a3, $a0
-; LA32-NEXT:    or $a1, $a4, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_64_mask:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a3, $a2, -32
+; LA32R-NEXT:    slti $a4, $a3, 0
+; LA32R-NEXT:    sub.w $a4, $zero, $a4
+; LA32R-NEXT:    srl.w $a6, $a1, $a2
+; LA32R-NEXT:    bltz $a3, .LBB13_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    srl.w $a3, $a1, $a3
+; LA32R-NEXT:    b .LBB13_3
+; LA32R-NEXT:  .LBB13_2:
+; LA32R-NEXT:    srl.w $a3, $a0, $a2
+; LA32R-NEXT:    xori $a5, $a2, 31
+; LA32R-NEXT:    slli.w $a7, $a1, 1
+; LA32R-NEXT:    sll.w $a5, $a7, $a5
+; LA32R-NEXT:    or $a3, $a3, $a5
+; LA32R-NEXT:  .LBB13_3:
+; LA32R-NEXT:    sub.w $a5, $zero, $a2
+; LA32R-NEXT:    andi $t0, $a5, 63
+; LA32R-NEXT:    addi.w $a7, $t0, -32
+; LA32R-NEXT:    and $a2, $a4, $a6
+; LA32R-NEXT:    bltz $a7, .LBB13_5
+; LA32R-NEXT:  # %bb.4:
+; LA32R-NEXT:    sll.w $a1, $a0, $a7
+; LA32R-NEXT:    b .LBB13_6
+; LA32R-NEXT:  .LBB13_5:
+; LA32R-NEXT:    sll.w $a1, $a1, $a5
+; LA32R-NEXT:    xori $a4, $t0, 31
+; LA32R-NEXT:    srli.w $a6, $a0, 1
+; LA32R-NEXT:    srl.w $a4, $a6, $a4
+; LA32R-NEXT:    or $a1, $a1, $a4
+; LA32R-NEXT:  .LBB13_6:
+; LA32R-NEXT:    slti $a4, $a7, 0
+; LA32R-NEXT:    sub.w $a4, $zero, $a4
+; LA32R-NEXT:    sll.w $a0, $a0, $a5
+; LA32R-NEXT:    and $a0, $a4, $a0
+; LA32R-NEXT:    or $a0, $a3, $a0
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_64_mask:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srl.w $a3, $a0, $a2
+; LA32S-NEXT:    xori $a4, $a2, 31
+; LA32S-NEXT:    slli.w $a5, $a1, 1
+; LA32S-NEXT:    sll.w $a4, $a5, $a4
+; LA32S-NEXT:    or $a3, $a3, $a4
+; LA32S-NEXT:    addi.w $a4, $a2, -32
+; LA32S-NEXT:    slti $a5, $a4, 0
+; LA32S-NEXT:    maskeqz $a3, $a3, $a5
+; LA32S-NEXT:    srl.w $a6, $a1, $a4
+; LA32S-NEXT:    masknez $a5, $a6, $a5
+; LA32S-NEXT:    or $a3, $a3, $a5
+; LA32S-NEXT:    srl.w $a5, $a1, $a2
+; LA32S-NEXT:    srai.w $a4, $a4, 31
+; LA32S-NEXT:    and $a4, $a4, $a5
+; LA32S-NEXT:    sub.w $a2, $zero, $a2
+; LA32S-NEXT:    andi $a5, $a2, 63
+; LA32S-NEXT:    addi.w $a6, $a5, -32
+; LA32S-NEXT:    sll.w $a7, $a0, $a6
+; LA32S-NEXT:    slti $t0, $a6, 0
+; LA32S-NEXT:    masknez $a7, $a7, $t0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    xori $a5, $a5, 31
+; LA32S-NEXT:    srli.w $t1, $a0, 1
+; LA32S-NEXT:    srl.w $a5, $t1, $a5
+; LA32S-NEXT:    or $a1, $a1, $a5
+; LA32S-NEXT:    maskeqz $a1, $a1, $t0
+; LA32S-NEXT:    or $a1, $a1, $a7
+; LA32S-NEXT:    sll.w $a0, $a0, $a2
+; LA32S-NEXT:    srai.w $a2, $a6, 31
+; LA32S-NEXT:    and $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a3, $a0
+; LA32S-NEXT:    or $a1, $a4, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_64_mask:
 ; LA64:       # %bb.0:
@@ -428,42 +698,83 @@ define i64 @rotr_64_mask(i64 %x, i64 %y) nounwind {
 }
 
 define i64 @rotr_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
-; LA32-LABEL: rotr_64_mask_and_127_and_63:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srl.w $a3, $a0, $a2
-; LA32-NEXT:    slli.w $a4, $a1, 1
-; LA32-NEXT:    andi $a5, $a2, 127
-; LA32-NEXT:    xori $a6, $a5, 31
-; LA32-NEXT:    sll.w $a4, $a4, $a6
-; LA32-NEXT:    or $a3, $a3, $a4
-; LA32-NEXT:    addi.w $a4, $a5, -32
-; LA32-NEXT:    slti $a5, $a4, 0
-; LA32-NEXT:    maskeqz $a3, $a3, $a5
-; LA32-NEXT:    srl.w $a6, $a1, $a4
-; LA32-NEXT:    masknez $a5, $a6, $a5
-; LA32-NEXT:    or $a3, $a3, $a5
-; LA32-NEXT:    srl.w $a5, $a1, $a2
-; LA32-NEXT:    srai.w $a4, $a4, 31
-; LA32-NEXT:    and $a4, $a4, $a5
-; LA32-NEXT:    sub.w $a2, $zero, $a2
-; LA32-NEXT:    andi $a5, $a2, 63
-; LA32-NEXT:    addi.w $a6, $a5, -32
-; LA32-NEXT:    sll.w $a7, $a0, $a6
-; LA32-NEXT:    slti $t0, $a6, 0
-; LA32-NEXT:    masknez $a7, $a7, $t0
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    xori $a5, $a5, 31
-; LA32-NEXT:    srli.w $t1, $a0, 1
-; LA32-NEXT:    srl.w $a5, $t1, $a5
-; LA32-NEXT:    or $a1, $a1, $a5
-; LA32-NEXT:    maskeqz $a1, $a1, $t0
-; LA32-NEXT:    or $a1, $a1, $a7
-; LA32-NEXT:    sll.w $a0, $a0, $a2
-; LA32-NEXT:    srai.w $a2, $a6, 31
-; LA32-NEXT:    and $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a3, $a0
-; LA32-NEXT:    or $a1, $a4, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_64_mask_and_127_and_63:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a3, $a2, 127
+; LA32R-NEXT:    addi.w $a5, $a3, -32
+; LA32R-NEXT:    slti $a4, $a5, 0
+; LA32R-NEXT:    sub.w $a4, $zero, $a4
+; LA32R-NEXT:    srl.w $a6, $a1, $a2
+; LA32R-NEXT:    bltz $a5, .LBB14_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    srl.w $a3, $a1, $a5
+; LA32R-NEXT:    b .LBB14_3
+; LA32R-NEXT:  .LBB14_2:
+; LA32R-NEXT:    srl.w $a5, $a0, $a2
+; LA32R-NEXT:    slli.w $a7, $a1, 1
+; LA32R-NEXT:    xori $a3, $a3, 31
+; LA32R-NEXT:    sll.w $a3, $a7, $a3
+; LA32R-NEXT:    or $a3, $a5, $a3
+; LA32R-NEXT:  .LBB14_3:
+; LA32R-NEXT:    sub.w $a5, $zero, $a2
+; LA32R-NEXT:    andi $t0, $a5, 63
+; LA32R-NEXT:    addi.w $a7, $t0, -32
+; LA32R-NEXT:    and $a2, $a4, $a6
+; LA32R-NEXT:    bltz $a7, .LBB14_5
+; LA32R-NEXT:  # %bb.4:
+; LA32R-NEXT:    sll.w $a1, $a0, $a7
+; LA32R-NEXT:    b .LBB14_6
+; LA32R-NEXT:  .LBB14_5:
+; LA32R-NEXT:    sll.w $a1, $a1, $a5
+; LA32R-NEXT:    xori $a4, $t0, 31
+; LA32R-NEXT:    srli.w $a6, $a0, 1
+; LA32R-NEXT:    srl.w $a4, $a6, $a4
+; LA32R-NEXT:    or $a1, $a1, $a4
+; LA32R-NEXT:  .LBB14_6:
+; LA32R-NEXT:    slti $a4, $a7, 0
+; LA32R-NEXT:    sub.w $a4, $zero, $a4
+; LA32R-NEXT:    sll.w $a0, $a0, $a5
+; LA32R-NEXT:    and $a0, $a4, $a0
+; LA32R-NEXT:    or $a0, $a3, $a0
+; LA32R-NEXT:    or $a1, $a2, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_64_mask_and_127_and_63:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srl.w $a3, $a0, $a2
+; LA32S-NEXT:    slli.w $a4, $a1, 1
+; LA32S-NEXT:    andi $a5, $a2, 127
+; LA32S-NEXT:    xori $a6, $a5, 31
+; LA32S-NEXT:    sll.w $a4, $a4, $a6
+; LA32S-NEXT:    or $a3, $a3, $a4
+; LA32S-NEXT:    addi.w $a4, $a5, -32
+; LA32S-NEXT:    slti $a5, $a4, 0
+; LA32S-NEXT:    maskeqz $a3, $a3, $a5
+; LA32S-NEXT:    srl.w $a6, $a1, $a4
+; LA32S-NEXT:    masknez $a5, $a6, $a5
+; LA32S-NEXT:    or $a3, $a3, $a5
+; LA32S-NEXT:    srl.w $a5, $a1, $a2
+; LA32S-NEXT:    srai.w $a4, $a4, 31
+; LA32S-NEXT:    and $a4, $a4, $a5
+; LA32S-NEXT:    sub.w $a2, $zero, $a2
+; LA32S-NEXT:    andi $a5, $a2, 63
+; LA32S-NEXT:    addi.w $a6, $a5, -32
+; LA32S-NEXT:    sll.w $a7, $a0, $a6
+; LA32S-NEXT:    slti $t0, $a6, 0
+; LA32S-NEXT:    masknez $a7, $a7, $t0
+; LA32S-NEXT:    sll.w $a1, $a1, $a2
+; LA32S-NEXT:    xori $a5, $a5, 31
+; LA32S-NEXT:    srli.w $t1, $a0, 1
+; LA32S-NEXT:    srl.w $a5, $t1, $a5
+; LA32S-NEXT:    or $a1, $a1, $a5
+; LA32S-NEXT:    maskeqz $a1, $a1, $t0
+; LA32S-NEXT:    or $a1, $a1, $a7
+; LA32S-NEXT:    sll.w $a0, $a0, $a2
+; LA32S-NEXT:    srai.w $a2, $a6, 31
+; LA32S-NEXT:    and $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a3, $a0
+; LA32S-NEXT:    or $a1, $a4, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_64_mask_and_127_and_63:
 ; LA64:       # %bb.0:
@@ -479,11 +790,17 @@ define i64 @rotr_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
 }
 
 define i64 @rotr_64_mask_or_128_or_64(i64 %x, i64 %y) nounwind {
-; LA32-LABEL: rotr_64_mask_or_128_or_64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    move $a0, $zero
-; LA32-NEXT:    move $a1, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_64_mask_or_128_or_64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    move $a1, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_64_mask_or_128_or_64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    move $a0, $zero
+; LA32S-NEXT:    move $a1, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_64_mask_or_128_or_64:
 ; LA64:       # %bb.0:
@@ -499,27 +816,51 @@ define i64 @rotr_64_mask_or_128_or_64(i64 %x, i64 %y) nounwind {
 }
 
 define signext i32 @rotr_64_trunc_32(i64 %x, i64 %y) nounwind {
-; LA32-LABEL: rotr_64_trunc_32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srl.w $a3, $a0, $a2
-; LA32-NEXT:    xori $a4, $a2, 31
-; LA32-NEXT:    slli.w $a5, $a1, 1
-; LA32-NEXT:    sll.w $a4, $a5, $a4
-; LA32-NEXT:    or $a3, $a3, $a4
-; LA32-NEXT:    addi.w $a4, $a2, -32
-; LA32-NEXT:    slti $a5, $a4, 0
-; LA32-NEXT:    maskeqz $a3, $a3, $a5
-; LA32-NEXT:    srl.w $a1, $a1, $a4
-; LA32-NEXT:    masknez $a1, $a1, $a5
-; LA32-NEXT:    or $a1, $a3, $a1
-; LA32-NEXT:    sub.w $a3, $zero, $a2
-; LA32-NEXT:    sll.w $a0, $a0, $a3
-; LA32-NEXT:    ori $a3, $zero, 32
-; LA32-NEXT:    sub.w $a2, $a3, $a2
-; LA32-NEXT:    srai.w $a2, $a2, 31
-; LA32-NEXT:    and $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_64_trunc_32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a3, $a2, -32
+; LA32R-NEXT:    bltz $a3, .LBB16_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    srl.w $a1, $a1, $a3
+; LA32R-NEXT:    b .LBB16_3
+; LA32R-NEXT:  .LBB16_2:
+; LA32R-NEXT:    srl.w $a3, $a0, $a2
+; LA32R-NEXT:    xori $a4, $a2, 31
+; LA32R-NEXT:    slli.w $a1, $a1, 1
+; LA32R-NEXT:    sll.w $a1, $a1, $a4
+; LA32R-NEXT:    or $a1, $a3, $a1
+; LA32R-NEXT:  .LBB16_3:
+; LA32R-NEXT:    ori $a3, $zero, 32
+; LA32R-NEXT:    sub.w $a3, $a3, $a2
+; LA32R-NEXT:    slti $a3, $a3, 0
+; LA32R-NEXT:    sub.w $a3, $zero, $a3
+; LA32R-NEXT:    sub.w $a2, $zero, $a2
+; LA32R-NEXT:    sll.w $a0, $a0, $a2
+; LA32R-NEXT:    and $a0, $a3, $a0
+; LA32R-NEXT:    or $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_64_trunc_32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srl.w $a3, $a0, $a2
+; LA32S-NEXT:    xori $a4, $a2, 31
+; LA32S-NEXT:    slli.w $a5, $a1, 1
+; LA32S-NEXT:    sll.w $a4, $a5, $a4
+; LA32S-NEXT:    or $a3, $a3, $a4
+; LA32S-NEXT:    addi.w $a4, $a2, -32
+; LA32S-NEXT:    slti $a5, $a4, 0
+; LA32S-NEXT:    maskeqz $a3, $a3, $a5
+; LA32S-NEXT:    srl.w $a1, $a1, $a4
+; LA32S-NEXT:    masknez $a1, $a1, $a5
+; LA32S-NEXT:    or $a1, $a3, $a1
+; LA32S-NEXT:    sub.w $a3, $zero, $a2
+; LA32S-NEXT:    sll.w $a0, $a0, $a3
+; LA32S-NEXT:    ori $a3, $zero, 32
+; LA32S-NEXT:    sub.w $a2, $a3, $a2
+; LA32S-NEXT:    srai.w $a2, $a2, 31
+; LA32S-NEXT:    and $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_64_trunc_32:
 ; LA64:       # %bb.0:
@@ -535,10 +876,17 @@ define signext i32 @rotr_64_trunc_32(i64 %x, i64 %y) nounwind {
 }
 
 define signext i32 @rotri_i32(i32 signext %a) nounwind {
-; LA32-LABEL: rotri_i32:
-; LA32:       # %bb.0:
-; LA32-NEXT:    rotri.w $a0, $a0, 16
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotri_i32:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 16
+; LA32R-NEXT:    slli.w $a0, $a0, 16
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotri_i32:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    rotri.w $a0, $a0, 16
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotri_i32:
 ; LA64:       # %bb.0:
@@ -551,12 +899,19 @@ define signext i32 @rotri_i32(i32 signext %a) nounwind {
 }
 
 define i64 @rotri_i64(i64 %a) nounwind {
-; LA32-LABEL: rotri_i64:
-; LA32:       # %bb.0:
-; LA32-NEXT:    move $a2, $a0
-; LA32-NEXT:    move $a0, $a1
-; LA32-NEXT:    move $a1, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotri_i64:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    move $a2, $a0
+; LA32R-NEXT:    move $a0, $a1
+; LA32R-NEXT:    move $a1, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotri_i64:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    move $a2, $a0
+; LA32S-NEXT:    move $a0, $a1
+; LA32S-NEXT:    move $a1, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotri_i64:
 ; LA64:       # %bb.0:
@@ -574,10 +929,17 @@ declare i32 @llvm.fshr.i32(i32, i32, i32)
 declare i64 @llvm.fshr.i64(i64, i64, i64)
 
 define signext i32 @rotl_i32_fshl(i32 signext %a) nounwind {
-; LA32-LABEL: rotl_i32_fshl:
-; LA32:       # %bb.0:
-; LA32-NEXT:    rotri.w $a0, $a0, 20
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_i32_fshl:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a1, $a0, 20
+; LA32R-NEXT:    slli.w $a0, $a0, 12
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_i32_fshl:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    rotri.w $a0, $a0, 20
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_i32_fshl:
 ; LA64:       # %bb.0:
@@ -588,16 +950,27 @@ define signext i32 @rotl_i32_fshl(i32 signext %a) nounwind {
 }
 
 define i64 @rotl_i64_fshl(i64 %a) nounwind {
-; LA32-LABEL: rotl_i64_fshl:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srli.w $a2, $a1, 20
-; LA32-NEXT:    slli.w $a3, $a0, 12
-; LA32-NEXT:    or $a2, $a3, $a2
-; LA32-NEXT:    srli.w $a0, $a0, 20
-; LA32-NEXT:    slli.w $a1, $a1, 12
-; LA32-NEXT:    or $a1, $a1, $a0
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotl_i64_fshl:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a1, 20
+; LA32R-NEXT:    slli.w $a3, $a0, 12
+; LA32R-NEXT:    or $a2, $a3, $a2
+; LA32R-NEXT:    srli.w $a0, $a0, 20
+; LA32R-NEXT:    slli.w $a1, $a1, 12
+; LA32R-NEXT:    or $a1, $a1, $a0
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotl_i64_fshl:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srli.w $a2, $a1, 20
+; LA32S-NEXT:    slli.w $a3, $a0, 12
+; LA32S-NEXT:    or $a2, $a3, $a2
+; LA32S-NEXT:    srli.w $a0, $a0, 20
+; LA32S-NEXT:    slli.w $a1, $a1, 12
+; LA32S-NEXT:    or $a1, $a1, $a0
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotl_i64_fshl:
 ; LA64:       # %bb.0:
@@ -608,10 +981,17 @@ define i64 @rotl_i64_fshl(i64 %a) nounwind {
 }
 
 define signext i32 @rotr_i32_fshr(i32 signext %a) nounwind {
-; LA32-LABEL: rotr_i32_fshr:
-; LA32:       # %bb.0:
-; LA32-NEXT:    rotri.w $a0, $a0, 12
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_i32_fshr:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a1, $a0, 20
+; LA32R-NEXT:    srli.w $a0, $a0, 12
+; LA32R-NEXT:    or $a0, $a0, $a1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_i32_fshr:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    rotri.w $a0, $a0, 12
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_i32_fshr:
 ; LA64:       # %bb.0:
@@ -622,16 +1002,27 @@ define signext i32 @rotr_i32_fshr(i32 signext %a) nounwind {
 }
 
 define i64 @rotr_i64_fshr(i64 %a) nounwind {
-; LA32-LABEL: rotr_i64_fshr:
-; LA32:       # %bb.0:
-; LA32-NEXT:    srli.w $a2, $a0, 12
-; LA32-NEXT:    slli.w $a3, $a1, 20
-; LA32-NEXT:    or $a2, $a3, $a2
-; LA32-NEXT:    srli.w $a1, $a1, 12
-; LA32-NEXT:    slli.w $a0, $a0, 20
-; LA32-NEXT:    or $a1, $a0, $a1
-; LA32-NEXT:    move $a0, $a2
-; LA32-NEXT:    ret
+; LA32R-LABEL: rotr_i64_fshr:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    srli.w $a2, $a0, 12
+; LA32R-NEXT:    slli.w $a3, $a1, 20
+; LA32R-NEXT:    or $a2, $a3, $a2
+; LA32R-NEXT:    srli.w $a1, $a1, 12
+; LA32R-NEXT:    slli.w $a0, $a0, 20
+; LA32R-NEXT:    or $a1, $a0, $a1
+; LA32R-NEXT:    move $a0, $a2
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: rotr_i64_fshr:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    srli.w $a2, $a0, 12
+; LA32S-NEXT:    slli.w $a3, $a1, 20
+; LA32S-NEXT:    or $a2, $a3, $a2
+; LA32S-NEXT:    srli.w $a1, $a1, 12
+; LA32S-NEXT:    slli.w $a0, $a0, 20
+; LA32S-NEXT:    or $a1, $a0, $a1
+; LA32S-NEXT:    move $a0, $a2
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: rotr_i64_fshr:
 ; LA64:       # %bb.0:
diff --git a/llvm/test/CodeGen/LoongArch/select-to-shiftand.ll b/llvm/test/CodeGen/LoongArch/select-to-shiftand.ll
index f95f1fb7df794..a6e7de83f3c3a 100644
--- a/llvm/test/CodeGen/LoongArch/select-to-shiftand.ll
+++ b/llvm/test/CodeGen/LoongArch/select-to-shiftand.ll
@@ -148,8 +148,8 @@ define i8 @sub_clamp_zero_i8(i8 signext %x, i8 signext %y) {
 ; LA32-LABEL: sub_clamp_zero_i8:
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    sub.w $a0, $a0, $a1
-; LA32-NEXT:    ext.w.b $a1, $a0
-; LA32-NEXT:    srai.w $a1, $a1, 7
+; LA32-NEXT:    slli.w $a1, $a0, 24
+; LA32-NEXT:    srai.w $a1, $a1, 31
 ; LA32-NEXT:    andn $a0, $a0, $a1
 ; LA32-NEXT:    ret
 ;
@@ -170,8 +170,8 @@ define i16 @sub_clamp_zero_i16(i16 signext %x, i16 signext %y) {
 ; LA32-LABEL: sub_clamp_zero_i16:
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    sub.w $a0, $a0, $a1
-; LA32-NEXT:    ext.w.h $a1, $a0
-; LA32-NEXT:    srai.w $a1, $a1, 15
+; LA32-NEXT:    slli.w $a1, $a0, 16
+; LA32-NEXT:    srai.w $a1, $a1, 31
 ; LA32-NEXT:    andn $a0, $a0, $a1
 ; LA32-NEXT:    ret
 ;
diff --git a/llvm/test/CodeGen/LoongArch/shift-masked-shamt.ll b/llvm/test/CodeGen/LoongArch/shift-masked-shamt.ll
index 4909a5f7098da..3ccbb42c7dd55 100644
--- a/llvm/test/CodeGen/LoongArch/shift-masked-shamt.ll
+++ b/llvm/test/CodeGen/LoongArch/shift-masked-shamt.ll
@@ -160,21 +160,23 @@ define i64 @sll_redundant_mask_zeros_i64(i64 %a, i64 %b) {
 ; LA32-LABEL: sll_redundant_mask_zeros_i64:
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    slli.w $a2, $a2, 2
-; LA32-NEXT:    sll.w $a1, $a1, $a2
-; LA32-NEXT:    srli.w $a3, $a0, 1
 ; LA32-NEXT:    andi $a4, $a2, 60
-; LA32-NEXT:    xori $a5, $a4, 31
-; LA32-NEXT:    srl.w $a3, $a3, $a5
-; LA32-NEXT:    or $a1, $a1, $a3
 ; LA32-NEXT:    addi.w $a3, $a4, -32
-; LA32-NEXT:    slti $a4, $a3, 0
-; LA32-NEXT:    maskeqz $a1, $a1, $a4
-; LA32-NEXT:    sll.w $a5, $a0, $a3
-; LA32-NEXT:    masknez $a4, $a5, $a4
+; LA32-NEXT:    bltz $a3, .LBB9_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    sll.w $a1, $a0, $a3
+; LA32-NEXT:    b .LBB9_3
+; LA32-NEXT:  .LBB9_2:
+; LA32-NEXT:    sll.w $a1, $a1, $a2
+; LA32-NEXT:    srli.w $a5, $a0, 1
+; LA32-NEXT:    xori $a4, $a4, 31
+; LA32-NEXT:    srl.w $a4, $a5, $a4
 ; LA32-NEXT:    or $a1, $a1, $a4
+; LA32-NEXT:  .LBB9_3:
+; LA32-NEXT:    slti $a3, $a3, 0
+; LA32-NEXT:    sub.w $a3, $zero, $a3
 ; LA32-NEXT:    sll.w $a0, $a0, $a2
-; LA32-NEXT:    srai.w $a2, $a3, 31
-; LA32-NEXT:    and $a0, $a2, $a0
+; LA32-NEXT:    and $a0, $a3, $a0
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: sll_redundant_mask_zeros_i64:
@@ -192,21 +194,23 @@ define i64 @srl_redundant_mask_zeros_i64(i64 %a, i64 %b) {
 ; LA32-LABEL: srl_redundant_mask_zeros_i64:
 ; LA32:       # %bb.0:
 ; LA32-NEXT:    slli.w $a2, $a2, 3
-; LA32-NEXT:    srl.w $a0, $a0, $a2
-; LA32-NEXT:    slli.w $a3, $a1, 1
 ; LA32-NEXT:    andi $a4, $a2, 56
-; LA32-NEXT:    xori $a5, $a4, 31
-; LA32-NEXT:    sll.w $a3, $a3, $a5
-; LA32-NEXT:    or $a0, $a0, $a3
 ; LA32-NEXT:    addi.w $a3, $a4, -32
-; LA32-NEXT:    slti $a4, $a3, 0
-; LA32-NEXT:    maskeqz $a0, $a0, $a4
-; LA32-NEXT:    srl.w $a5, $a1, $a3
-; LA32-NEXT:    masknez $a4, $a5, $a4
+; LA32-NEXT:    bltz $a3, .LBB10_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    srl.w $a0, $a1, $a3
+; LA32-NEXT:    b .LBB10_3
+; LA32-NEXT:  .LBB10_2:
+; LA32-NEXT:    srl.w $a0, $a0, $a2
+; LA32-NEXT:    slli.w $a5, $a1, 1
+; LA32-NEXT:    xori $a4, $a4, 31
+; LA32-NEXT:    sll.w $a4, $a5, $a4
 ; LA32-NEXT:    or $a0, $a0, $a4
+; LA32-NEXT:  .LBB10_3:
+; LA32-NEXT:    slti $a3, $a3, 0
+; LA32-NEXT:    sub.w $a3, $zero, $a3
 ; LA32-NEXT:    srl.w $a1, $a1, $a2
-; LA32-NEXT:    srai.w $a2, $a3, 31
-; LA32-NEXT:    and $a1, $a2, $a1
+; LA32-NEXT:    and $a1, $a3, $a1
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: srl_redundant_mask_zeros_i64:
@@ -223,25 +227,21 @@ define i64 @srl_redundant_mask_zeros_i64(i64 %a, i64 %b) {
 define i64 @sra_redundant_mask_zeros_i64(i64 %a, i64 %b) {
 ; LA32-LABEL: sra_redundant_mask_zeros_i64:
 ; LA32:       # %bb.0:
-; LA32-NEXT:    srai.w $a3, $a1, 31
-; LA32-NEXT:    slli.w $a4, $a2, 4
-; LA32-NEXT:    andi $a5, $a4, 48
-; LA32-NEXT:    addi.w $a6, $a5, -32
-; LA32-NEXT:    slti $a7, $a6, 0
-; LA32-NEXT:    masknez $a2, $a3, $a7
-; LA32-NEXT:    sra.w $a3, $a1, $a4
-; LA32-NEXT:    maskeqz $a3, $a3, $a7
-; LA32-NEXT:    or $a2, $a3, $a2
-; LA32-NEXT:    srl.w $a0, $a0, $a4
-; LA32-NEXT:    slli.w $a3, $a1, 1
-; LA32-NEXT:    xori $a4, $a5, 31
-; LA32-NEXT:    sll.w $a3, $a3, $a4
+; LA32-NEXT:    slli.w $a2, $a2, 4
+; LA32-NEXT:    andi $a3, $a2, 48
+; LA32-NEXT:    addi.w $a4, $a3, -32
+; LA32-NEXT:    bltz $a4, .LBB11_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    sra.w $a0, $a1, $a4
+; LA32-NEXT:    srai.w $a1, $a1, 31
+; LA32-NEXT:    ret
+; LA32-NEXT:  .LBB11_2:
+; LA32-NEXT:    srl.w $a0, $a0, $a2
+; LA32-NEXT:    slli.w $a4, $a1, 1
+; LA32-NEXT:    xori $a3, $a3, 31
+; LA32-NEXT:    sll.w $a3, $a4, $a3
 ; LA32-NEXT:    or $a0, $a0, $a3
-; LA32-NEXT:    maskeqz $a0, $a0, $a7
-; LA32-NEXT:    sra.w $a1, $a1, $a6
-; LA32-NEXT:    masknez $a1, $a1, $a7
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    move $a1, $a2
+; LA32-NEXT:    sra.w $a1, $a1, $a2
 ; LA32-NEXT:    ret
 ;
 ; LA64-LABEL: sra_redundant_mask_zeros_i64:
diff --git a/llvm/test/CodeGen/LoongArch/smul-with-overflow.ll b/llvm/test/CodeGen/LoongArch/smul-with-overflow.ll
index 43d56e5d5eb2f..968c06136225d 100644
--- a/llvm/test/CodeGen/LoongArch/smul-with-overflow.ll
+++ b/llvm/test/CodeGen/LoongArch/smul-with-overflow.ll
@@ -95,170 +95,162 @@ define zeroext i1 @smuloi128(i128 %v1, i128 %v2, ptr %res) {
 ; LA32-NEXT:    ld.w $a7, $a0, 8
 ; LA32-NEXT:    ld.w $t0, $a0, 12
 ; LA32-NEXT:    ld.w $a4, $a0, 0
-; LA32-NEXT:    ld.w $t2, $a1, 4
+; LA32-NEXT:    ld.w $t4, $a1, 4
 ; LA32-NEXT:    mulh.wu $a0, $a7, $a3
 ; LA32-NEXT:    mul.w $a1, $t0, $a3
 ; LA32-NEXT:    add.w $a0, $a1, $a0
 ; LA32-NEXT:    sltu $a1, $a0, $a1
-; LA32-NEXT:    mulh.wu $t3, $t0, $a3
-; LA32-NEXT:    add.w $a1, $t3, $a1
-; LA32-NEXT:    mul.w $t3, $a7, $t2
-; LA32-NEXT:    add.w $t4, $t3, $a0
-; LA32-NEXT:    sltu $a0, $t4, $t3
-; LA32-NEXT:    mulh.wu $t3, $a7, $t2
+; LA32-NEXT:    mulh.wu $t2, $t0, $a3
+; LA32-NEXT:    add.w $a1, $t2, $a1
+; LA32-NEXT:    mul.w $t3, $a7, $t4
+; LA32-NEXT:    add.w $t2, $t3, $a0
+; LA32-NEXT:    sltu $a0, $t2, $t3
+; LA32-NEXT:    mulh.wu $t3, $a7, $t4
 ; LA32-NEXT:    add.w $a0, $t3, $a0
-; LA32-NEXT:    add.w $t3, $a1, $a0
-; LA32-NEXT:    mul.w $t5, $t0, $t2
-; LA32-NEXT:    add.w $t6, $t5, $t3
+; LA32-NEXT:    add.w $t5, $a1, $a0
+; LA32-NEXT:    mul.w $t6, $t0, $t4
+; LA32-NEXT:    add.w $t7, $t6, $t5
 ; LA32-NEXT:    srai.w $a0, $t0, 31
-; LA32-NEXT:    mul.w $t7, $a3, $a0
-; LA32-NEXT:    add.w $t8, $t6, $t7
-; LA32-NEXT:    sltu $fp, $t8, $t6
-; LA32-NEXT:    sltu $t5, $t6, $t5
-; LA32-NEXT:    sltu $a1, $t3, $a1
-; LA32-NEXT:    mulh.wu $t3, $t0, $t2
-; LA32-NEXT:    add.w $a1, $t3, $a1
-; LA32-NEXT:    add.w $a1, $a1, $t5
-; LA32-NEXT:    mulh.wu $t3, $a3, $a0
-; LA32-NEXT:    add.w $t3, $t3, $t7
-; LA32-NEXT:    mul.w $t5, $t2, $a0
-; LA32-NEXT:    add.w $t3, $t3, $t5
-; LA32-NEXT:    add.w $a1, $a1, $t3
-; LA32-NEXT:    add.w $t3, $a1, $fp
+; LA32-NEXT:    mul.w $t8, $a3, $a0
+; LA32-NEXT:    add.w $t3, $t7, $t8
+; LA32-NEXT:    sltu $fp, $t3, $t7
+; LA32-NEXT:    sltu $t6, $t7, $t6
+; LA32-NEXT:    sltu $a1, $t5, $a1
+; LA32-NEXT:    mulh.wu $t5, $t0, $t4
+; LA32-NEXT:    add.w $a1, $t5, $a1
+; LA32-NEXT:    add.w $a1, $a1, $t6
+; LA32-NEXT:    mulh.wu $t5, $a3, $a0
+; LA32-NEXT:    add.w $t5, $t5, $t8
+; LA32-NEXT:    mul.w $t6, $t4, $a0
+; LA32-NEXT:    add.w $t5, $t5, $t6
+; LA32-NEXT:    add.w $t8, $a1, $t5
 ; LA32-NEXT:    mulh.wu $a1, $a4, $a3
 ; LA32-NEXT:    mul.w $t5, $t1, $a3
 ; LA32-NEXT:    add.w $a1, $t5, $a1
 ; LA32-NEXT:    sltu $t5, $a1, $t5
 ; LA32-NEXT:    mulh.wu $t6, $t1, $a3
 ; LA32-NEXT:    add.w $t5, $t6, $t5
-; LA32-NEXT:    mul.w $t6, $a4, $t2
+; LA32-NEXT:    mul.w $t6, $a4, $t4
 ; LA32-NEXT:    add.w $a1, $t6, $a1
 ; LA32-NEXT:    sltu $t6, $a1, $t6
-; LA32-NEXT:    mulh.wu $t7, $a4, $t2
+; LA32-NEXT:    mulh.wu $t7, $a4, $t4
 ; LA32-NEXT:    add.w $t6, $t7, $t6
 ; LA32-NEXT:    add.w $t6, $t5, $t6
-; LA32-NEXT:    mul.w $t7, $t1, $t2
-; LA32-NEXT:    add.w $fp, $t7, $t6
-; LA32-NEXT:    sltu $t7, $fp, $t7
+; LA32-NEXT:    mul.w $t7, $t1, $t4
 ; LA32-NEXT:    sltu $t5, $t6, $t5
-; LA32-NEXT:    mulh.wu $t2, $t1, $t2
-; LA32-NEXT:    add.w $t2, $t2, $t5
-; LA32-NEXT:    add.w $t2, $t2, $t7
-; LA32-NEXT:    add.w $t2, $t4, $t2
+; LA32-NEXT:    add.w $t6, $t7, $t6
+; LA32-NEXT:    sltu $t7, $t6, $t7
+; LA32-NEXT:    mulh.wu $t4, $t1, $t4
+; LA32-NEXT:    add.w $t4, $t4, $t5
+; LA32-NEXT:    add.w $t4, $t4, $t7
+; LA32-NEXT:    add.w $t4, $t2, $t4
 ; LA32-NEXT:    mul.w $t5, $a7, $a3
-; LA32-NEXT:    add.w $t6, $t5, $fp
+; LA32-NEXT:    add.w $t6, $t5, $t6
 ; LA32-NEXT:    sltu $t5, $t6, $t5
-; LA32-NEXT:    add.w $t2, $t2, $t5
-; LA32-NEXT:    sltu $t7, $t2, $t4
-; LA32-NEXT:    xor $t4, $t2, $t4
-; LA32-NEXT:    sltui $t4, $t4, 1
-; LA32-NEXT:    masknez $t7, $t7, $t4
-; LA32-NEXT:    maskeqz $t4, $t5, $t4
-; LA32-NEXT:    or $t4, $t4, $t7
-; LA32-NEXT:    add.w $t5, $t8, $t4
-; LA32-NEXT:    sltu $t4, $t5, $t8
-; LA32-NEXT:    add.w $t4, $t3, $t4
-; LA32-NEXT:    mulh.wu $t3, $a4, $a6
-; LA32-NEXT:    mul.w $t7, $t1, $a6
-; LA32-NEXT:    add.w $t3, $t7, $t3
-; LA32-NEXT:    sltu $t7, $t3, $t7
+; LA32-NEXT:    add.w $t7, $t4, $t5
+; LA32-NEXT:    add.w $t4, $t8, $fp
+; LA32-NEXT:    beq $t7, $t2, .LBB1_2
+; LA32-NEXT:  # %bb.1:
+; LA32-NEXT:    sltu $t5, $t7, $t2
+; LA32-NEXT:  .LBB1_2:
+; LA32-NEXT:    add.w $t5, $t3, $t5
+; LA32-NEXT:    sltu $t2, $t5, $t3
+; LA32-NEXT:    add.w $t4, $t4, $t2
+; LA32-NEXT:    mulh.wu $t2, $a4, $a6
+; LA32-NEXT:    mul.w $t3, $t1, $a6
+; LA32-NEXT:    add.w $t2, $t3, $t2
+; LA32-NEXT:    sltu $t3, $t2, $t3
 ; LA32-NEXT:    mulh.wu $t8, $t1, $a6
-; LA32-NEXT:    add.w $t7, $t8, $t7
-; LA32-NEXT:    mul.w $t8, $a4, $a5
-; LA32-NEXT:    add.w $fp, $t8, $t3
-; LA32-NEXT:    sltu $t3, $fp, $t8
-; LA32-NEXT:    mulh.wu $t8, $a4, $a5
-; LA32-NEXT:    add.w $t3, $t8, $t3
-; LA32-NEXT:    add.w $t8, $t7, $t3
-; LA32-NEXT:    mul.w $s0, $t1, $a5
-; LA32-NEXT:    add.w $s1, $s0, $t8
+; LA32-NEXT:    add.w $s0, $t8, $t3
+; LA32-NEXT:    mul.w $t3, $a4, $a5
+; LA32-NEXT:    add.w $t8, $t3, $t2
+; LA32-NEXT:    sltu $t2, $t8, $t3
+; LA32-NEXT:    mulh.wu $t3, $a4, $a5
+; LA32-NEXT:    add.w $t2, $t3, $t2
+; LA32-NEXT:    add.w $t2, $s0, $t2
+; LA32-NEXT:    mul.w $s1, $t1, $a5
+; LA32-NEXT:    add.w $s2, $s1, $t2
 ; LA32-NEXT:    srai.w $t3, $a5, 31
-; LA32-NEXT:    mul.w $s2, $t3, $a4
-; LA32-NEXT:    add.w $s3, $s1, $s2
-; LA32-NEXT:    sltu $s4, $s3, $s1
-; LA32-NEXT:    sltu $s0, $s1, $s0
-; LA32-NEXT:    sltu $t7, $t8, $t7
-; LA32-NEXT:    mulh.wu $t8, $t1, $a5
-; LA32-NEXT:    add.w $t7, $t8, $t7
-; LA32-NEXT:    add.w $t7, $t7, $s0
+; LA32-NEXT:    mul.w $s3, $t3, $a4
+; LA32-NEXT:    add.w $fp, $s2, $s3
+; LA32-NEXT:    sltu $s4, $fp, $s2
+; LA32-NEXT:    sltu $s1, $s2, $s1
+; LA32-NEXT:    sltu $t2, $t2, $s0
+; LA32-NEXT:    mulh.wu $s0, $t1, $a5
+; LA32-NEXT:    add.w $t2, $s0, $t2
+; LA32-NEXT:    add.w $t2, $t2, $s1
 ; LA32-NEXT:    mul.w $t1, $t3, $t1
-; LA32-NEXT:    mulh.wu $t8, $t3, $a4
-; LA32-NEXT:    add.w $t1, $t8, $t1
-; LA32-NEXT:    add.w $t1, $t1, $s2
-; LA32-NEXT:    add.w $t1, $t7, $t1
-; LA32-NEXT:    add.w $t7, $t1, $s4
-; LA32-NEXT:    add.w $t2, $fp, $t2
-; LA32-NEXT:    mul.w $t8, $a4, $a6
-; LA32-NEXT:    add.w $t1, $t8, $t6
-; LA32-NEXT:    sltu $t6, $t1, $t8
-; LA32-NEXT:    add.w $t2, $t2, $t6
-; LA32-NEXT:    sltu $t8, $t2, $fp
-; LA32-NEXT:    xor $fp, $t2, $fp
-; LA32-NEXT:    sltui $fp, $fp, 1
-; LA32-NEXT:    masknez $t8, $t8, $fp
-; LA32-NEXT:    maskeqz $t6, $t6, $fp
-; LA32-NEXT:    or $t6, $t6, $t8
-; LA32-NEXT:    add.w $t6, $s3, $t6
-; LA32-NEXT:    sltu $t8, $t6, $s3
-; LA32-NEXT:    add.w $t7, $t7, $t8
-; LA32-NEXT:    add.w $t8, $t4, $t7
-; LA32-NEXT:    add.w $t6, $t5, $t6
-; LA32-NEXT:    sltu $fp, $t6, $t5
-; LA32-NEXT:    add.w $t8, $t8, $fp
+; LA32-NEXT:    mulh.wu $s0, $t3, $a4
+; LA32-NEXT:    add.w $t1, $s0, $t1
+; LA32-NEXT:    add.w $t1, $t1, $s3
+; LA32-NEXT:    add.w $s0, $t2, $t1
+; LA32-NEXT:    add.w $t2, $t8, $t7
+; LA32-NEXT:    mul.w $t7, $a4, $a6
+; LA32-NEXT:    add.w $t1, $t7, $t6
+; LA32-NEXT:    sltu $t7, $t1, $t7
+; LA32-NEXT:    add.w $t2, $t2, $t7
+; LA32-NEXT:    add.w $t6, $s0, $s4
+; LA32-NEXT:    beq $t2, $t8, .LBB1_4
+; LA32-NEXT:  # %bb.3:
+; LA32-NEXT:    sltu $t7, $t2, $t8
+; LA32-NEXT:  .LBB1_4:
+; LA32-NEXT:    add.w $t7, $fp, $t7
+; LA32-NEXT:    sltu $t8, $t7, $fp
+; LA32-NEXT:    add.w $t8, $t6, $t8
+; LA32-NEXT:    add.w $t6, $t4, $t8
+; LA32-NEXT:    add.w $t7, $t5, $t7
+; LA32-NEXT:    sltu $s0, $t7, $t5
+; LA32-NEXT:    add.w $s4, $t6, $s0
 ; LA32-NEXT:    mulh.wu $t5, $a7, $a6
-; LA32-NEXT:    mul.w $s0, $t0, $a6
-; LA32-NEXT:    add.w $s1, $s0, $t5
-; LA32-NEXT:    mul.w $s2, $a7, $a5
-; LA32-NEXT:    add.w $s3, $s2, $s1
-; LA32-NEXT:    add.w $s4, $s3, $t8
+; LA32-NEXT:    mul.w $s1, $t0, $a6
+; LA32-NEXT:    add.w $s3, $s1, $t5
+; LA32-NEXT:    mul.w $fp, $a7, $a5
+; LA32-NEXT:    add.w $s2, $fp, $s3
+; LA32-NEXT:    add.w $t6, $s2, $s4
 ; LA32-NEXT:    mul.w $s5, $a7, $a6
-; LA32-NEXT:    add.w $t5, $s5, $t6
-; LA32-NEXT:    sltu $s5, $t5, $s5
-; LA32-NEXT:    add.w $t6, $s4, $s5
-; LA32-NEXT:    sltu $s4, $t6, $s3
-; LA32-NEXT:    xor $s6, $t6, $s3
-; LA32-NEXT:    sltui $s6, $s6, 1
-; LA32-NEXT:    masknez $s4, $s4, $s6
-; LA32-NEXT:    maskeqz $s5, $s5, $s6
-; LA32-NEXT:    or $s4, $s5, $s4
-; LA32-NEXT:    sltu $s5, $t8, $t4
-; LA32-NEXT:    xor $t8, $t8, $t4
-; LA32-NEXT:    sltui $t8, $t8, 1
-; LA32-NEXT:    masknez $s5, $s5, $t8
-; LA32-NEXT:    maskeqz $t8, $fp, $t8
-; LA32-NEXT:    or $t8, $t8, $s5
+; LA32-NEXT:    add.w $t5, $s5, $t7
+; LA32-NEXT:    sltu $t7, $t5, $s5
+; LA32-NEXT:    add.w $t6, $t6, $t7
+; LA32-NEXT:    beq $t6, $s2, .LBB1_6
+; LA32-NEXT:  # %bb.5:
+; LA32-NEXT:    sltu $t7, $t6, $s2
+; LA32-NEXT:  .LBB1_6:
+; LA32-NEXT:    beq $s4, $t4, .LBB1_8
+; LA32-NEXT:  # %bb.7:
+; LA32-NEXT:    sltu $s0, $s4, $t4
+; LA32-NEXT:  .LBB1_8:
 ; LA32-NEXT:    srai.w $t4, $t4, 31
-; LA32-NEXT:    srai.w $t7, $t7, 31
-; LA32-NEXT:    add.w $t7, $t4, $t7
-; LA32-NEXT:    add.w $t8, $t7, $t8
-; LA32-NEXT:    sltu $fp, $s1, $s0
-; LA32-NEXT:    mulh.wu $s0, $t0, $a6
-; LA32-NEXT:    add.w $fp, $s0, $fp
-; LA32-NEXT:    sltu $s0, $s3, $s2
-; LA32-NEXT:    mulh.wu $s1, $a7, $a5
-; LA32-NEXT:    add.w $s0, $s1, $s0
-; LA32-NEXT:    add.w $s0, $fp, $s0
-; LA32-NEXT:    mul.w $s1, $t0, $a5
-; LA32-NEXT:    add.w $s2, $s1, $s0
-; LA32-NEXT:    mul.w $s3, $a6, $a0
+; LA32-NEXT:    srai.w $t8, $t8, 31
+; LA32-NEXT:    add.w $t8, $t4, $t8
+; LA32-NEXT:    add.w $s0, $t8, $s0
+; LA32-NEXT:    sltu $s1, $s3, $s1
+; LA32-NEXT:    mulh.wu $s3, $t0, $a6
+; LA32-NEXT:    add.w $s1, $s3, $s1
+; LA32-NEXT:    sltu $fp, $s2, $fp
+; LA32-NEXT:    mulh.wu $s2, $a7, $a5
+; LA32-NEXT:    add.w $fp, $s2, $fp
+; LA32-NEXT:    add.w $fp, $s1, $fp
+; LA32-NEXT:    mul.w $s2, $t0, $a5
+; LA32-NEXT:    add.w $s3, $s2, $fp
+; LA32-NEXT:    mul.w $s4, $a6, $a0
 ; LA32-NEXT:    mul.w $s5, $t3, $a7
-; LA32-NEXT:    add.w $s6, $s5, $s3
-; LA32-NEXT:    add.w $s7, $s2, $s6
-; LA32-NEXT:    add.w $s8, $s7, $t8
-; LA32-NEXT:    add.w $s4, $s8, $s4
-; LA32-NEXT:    sltu $ra, $s4, $s8
-; LA32-NEXT:    sltu $t4, $t7, $t4
-; LA32-NEXT:    add.w $t4, $t7, $t4
-; LA32-NEXT:    sltu $t7, $t8, $t7
-; LA32-NEXT:    add.w $t4, $t4, $t7
-; LA32-NEXT:    sltu $t7, $s7, $s2
-; LA32-NEXT:    sltu $t8, $s2, $s1
-; LA32-NEXT:    sltu $fp, $s0, $fp
-; LA32-NEXT:    mulh.wu $s0, $t0, $a5
-; LA32-NEXT:    add.w $fp, $s0, $fp
-; LA32-NEXT:    add.w $t8, $fp, $t8
+; LA32-NEXT:    add.w $s6, $s5, $s4
+; LA32-NEXT:    add.w $s7, $s3, $s6
+; LA32-NEXT:    add.w $s8, $s7, $s0
+; LA32-NEXT:    add.w $t7, $s8, $t7
+; LA32-NEXT:    sltu $ra, $t7, $s8
+; LA32-NEXT:    sltu $t4, $t8, $t4
+; LA32-NEXT:    add.w $t4, $t8, $t4
+; LA32-NEXT:    sltu $t8, $s0, $t8
+; LA32-NEXT:    add.w $t4, $t4, $t8
+; LA32-NEXT:    sltu $t8, $s7, $s3
+; LA32-NEXT:    sltu $s0, $s3, $s2
+; LA32-NEXT:    sltu $fp, $fp, $s1
+; LA32-NEXT:    mulh.wu $s1, $t0, $a5
+; LA32-NEXT:    add.w $fp, $s1, $fp
+; LA32-NEXT:    add.w $fp, $fp, $s0
 ; LA32-NEXT:    mulh.wu $a6, $a6, $a0
-; LA32-NEXT:    add.w $a6, $a6, $s3
+; LA32-NEXT:    add.w $a6, $a6, $s4
 ; LA32-NEXT:    mul.w $a0, $a5, $a0
 ; LA32-NEXT:    add.w $a0, $a6, $a0
 ; LA32-NEXT:    mul.w $a5, $t3, $t0
@@ -268,8 +260,8 @@ define zeroext i1 @smuloi128(i128 %v1, i128 %v2, ptr %res) {
 ; LA32-NEXT:    add.w $a0, $a5, $a0
 ; LA32-NEXT:    sltu $a5, $s6, $s5
 ; LA32-NEXT:    add.w $a0, $a0, $a5
-; LA32-NEXT:    add.w $a0, $t8, $a0
-; LA32-NEXT:    add.w $a0, $a0, $t7
+; LA32-NEXT:    add.w $a0, $fp, $a0
+; LA32-NEXT:    add.w $a0, $a0, $t8
 ; LA32-NEXT:    add.w $a0, $a0, $t4
 ; LA32-NEXT:    sltu $a5, $s8, $s7
 ; LA32-NEXT:    add.w $a0, $a0, $a5
@@ -278,7 +270,7 @@ define zeroext i1 @smuloi128(i128 %v1, i128 %v2, ptr %res) {
 ; LA32-NEXT:    xor $a0, $a0, $a5
 ; LA32-NEXT:    xor $a6, $t6, $a5
 ; LA32-NEXT:    or $a0, $a6, $a0
-; LA32-NEXT:    xor $a6, $s4, $a5
+; LA32-NEXT:    xor $a6, $t7, $a5
 ; LA32-NEXT:    xor $a5, $t5, $a5
 ; LA32-NEXT:    or $a5, $a5, $a6
 ; LA32-NEXT:    or $a0, $a5, $a0
diff --git a/llvm/test/CodeGen/LoongArch/stack-realignment-with-variable-sized-objects.ll b/llvm/test/CodeGen/LoongArch/stack-realignment-with-variable-sized-objects.ll
index fe7a8f8f2d6bd..9f15604fcca6b 100644
--- a/llvm/test/CodeGen/LoongArch/stack-realignment-with-variable-sized-objects.ll
+++ b/llvm/test/CodeGen/LoongArch/stack-realignment-with-variable-sized-objects.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc --mtriple=loongarch32 -mattr=+d --verify-machineinstrs < %s \
+; RUN: llc --mtriple=loongarch32 -mattr=+32s,+d --verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA32
 ; RUN: llc --mtriple=loongarch64 -mattr=+d --verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s --check-prefix=LA64
diff --git a/llvm/test/CodeGen/LoongArch/typepromotion-overflow.ll b/llvm/test/CodeGen/LoongArch/typepromotion-overflow.ll
index 3f51e3b840097..48de174359632 100644
--- a/llvm/test/CodeGen/LoongArch/typepromotion-overflow.ll
+++ b/llvm/test/CodeGen/LoongArch/typepromotion-overflow.ll
@@ -1,21 +1,37 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=loongarch32 %s -o - | FileCheck %s --check-prefix=LA32
+; RUN: llc -mtriple=loongarch32 -mattr=-32s %s -o - | FileCheck %s --check-prefix=LA32R
+; RUN: llc -mtriple=loongarch32 -mattr=+32s %s -o - | FileCheck %s --check-prefix=LA32S
 ; RUN: llc -mtriple=loongarch64 %s -o - | FileCheck %s --check-prefix=LA64
 
 define zeroext i16 @overflow_add(i16 zeroext %a, i16 zeroext %b) {
-; LA32-LABEL: overflow_add:
-; LA32:       # %bb.0:
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    ori $a0, $a0, 1
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    ori $a1, $zero, 1024
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    ori $a1, $zero, 5
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 2
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: overflow_add:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    ori $a0, $a0, 1
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4095
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    ori $a2, $zero, 1024
+; LA32R-NEXT:    ori $a0, $zero, 2
+; LA32R-NEXT:    bltu $a2, $a1, .LBB0_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 5
+; LA32R-NEXT:  .LBB0_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: overflow_add:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    ori $a0, $a0, 1
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    ori $a1, $zero, 1024
+; LA32S-NEXT:    sltu $a0, $a1, $a0
+; LA32S-NEXT:    ori $a1, $zero, 5
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 2
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: overflow_add:
 ; LA64:       # %bb.0:
@@ -38,19 +54,34 @@ define zeroext i16 @overflow_add(i16 zeroext %a, i16 zeroext %b) {
 }
 
 define zeroext i16 @overflow_sub(i16 zeroext %a, i16 zeroext %b) {
-; LA32-LABEL: overflow_sub:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sub.w $a0, $a0, $a1
-; LA32-NEXT:    ori $a0, $a0, 1
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    ori $a1, $zero, 1024
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    ori $a1, $zero, 5
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 2
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: overflow_sub:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    sub.w $a0, $a0, $a1
+; LA32R-NEXT:    ori $a0, $a0, 1
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4095
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    ori $a2, $zero, 1024
+; LA32R-NEXT:    ori $a0, $zero, 2
+; LA32R-NEXT:    bltu $a2, $a1, .LBB1_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 5
+; LA32R-NEXT:  .LBB1_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: overflow_sub:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sub.w $a0, $a0, $a1
+; LA32S-NEXT:    ori $a0, $a0, 1
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    ori $a1, $zero, 1024
+; LA32S-NEXT:    sltu $a0, $a1, $a0
+; LA32S-NEXT:    ori $a1, $zero, 5
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 2
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: overflow_sub:
 ; LA64:       # %bb.0:
@@ -73,19 +104,34 @@ define zeroext i16 @overflow_sub(i16 zeroext %a, i16 zeroext %b) {
 }
 
 define zeroext i16 @overflow_mul(i16 zeroext %a, i16 zeroext %b) {
-; LA32-LABEL: overflow_mul:
-; LA32:       # %bb.0:
-; LA32-NEXT:    mul.w $a0, $a1, $a0
-; LA32-NEXT:    ori $a0, $a0, 1
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    ori $a1, $zero, 1024
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    ori $a1, $zero, 5
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 2
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: overflow_mul:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    mul.w $a0, $a1, $a0
+; LA32R-NEXT:    ori $a0, $a0, 1
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4095
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    ori $a2, $zero, 1024
+; LA32R-NEXT:    ori $a0, $zero, 2
+; LA32R-NEXT:    bltu $a2, $a1, .LBB2_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 5
+; LA32R-NEXT:  .LBB2_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: overflow_mul:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    mul.w $a0, $a1, $a0
+; LA32S-NEXT:    ori $a0, $a0, 1
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    ori $a1, $zero, 1024
+; LA32S-NEXT:    sltu $a0, $a1, $a0
+; LA32S-NEXT:    ori $a1, $zero, 5
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 2
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: overflow_mul:
 ; LA64:       # %bb.0:
@@ -108,19 +154,34 @@ define zeroext i16 @overflow_mul(i16 zeroext %a, i16 zeroext %b) {
 }
 
 define zeroext i16 @overflow_shl(i16 zeroext %a, i16 zeroext %b) {
-; LA32-LABEL: overflow_shl:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sll.w $a0, $a0, $a1
-; LA32-NEXT:    ori $a0, $a0, 1
-; LA32-NEXT:    bstrpick.w $a0, $a0, 15, 0
-; LA32-NEXT:    ori $a1, $zero, 1024
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    ori $a1, $zero, 5
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 2
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: overflow_shl:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    sll.w $a0, $a0, $a1
+; LA32R-NEXT:    ori $a0, $a0, 1
+; LA32R-NEXT:    lu12i.w $a1, 15
+; LA32R-NEXT:    ori $a1, $a1, 4095
+; LA32R-NEXT:    and $a1, $a0, $a1
+; LA32R-NEXT:    ori $a2, $zero, 1024
+; LA32R-NEXT:    ori $a0, $zero, 2
+; LA32R-NEXT:    bltu $a2, $a1, .LBB3_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 5
+; LA32R-NEXT:  .LBB3_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: overflow_shl:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sll.w $a0, $a0, $a1
+; LA32S-NEXT:    ori $a0, $a0, 1
+; LA32S-NEXT:    bstrpick.w $a0, $a0, 15, 0
+; LA32S-NEXT:    ori $a1, $zero, 1024
+; LA32S-NEXT:    sltu $a0, $a1, $a0
+; LA32S-NEXT:    ori $a1, $zero, 5
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 2
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: overflow_shl:
 ; LA64:       # %bb.0:
@@ -143,17 +204,28 @@ define zeroext i16 @overflow_shl(i16 zeroext %a, i16 zeroext %b) {
 }
 
 define i32 @overflow_add_no_consts(i8 zeroext %a, i8 zeroext %b, i8 zeroext %limit) {
-; LA32-LABEL: overflow_add_no_consts:
-; LA32:       # %bb.0:
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    sltu $a0, $a2, $a0
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: overflow_add_no_consts:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    andi $a1, $a0, 255
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    bltu $a2, $a1, .LBB4_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB4_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: overflow_add_no_consts:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    sltu $a0, $a2, $a0
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: overflow_add_no_consts:
 ; LA64:       # %bb.0:
@@ -173,18 +245,30 @@ define i32 @overflow_add_no_consts(i8 zeroext %a, i8 zeroext %b, i8 zeroext %lim
 }
 
 define i32 @overflow_add_const_limit(i8 zeroext %a, i8 zeroext %b) {
-; LA32-LABEL: overflow_add_const_limit:
-; LA32:       # %bb.0:
-; LA32-NEXT:    add.w $a0, $a1, $a0
-; LA32-NEXT:    andi $a0, $a0, 255
-; LA32-NEXT:    ori $a1, $zero, 128
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: overflow_add_const_limit:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    add.w $a0, $a1, $a0
+; LA32R-NEXT:    andi $a1, $a0, 255
+; LA32R-NEXT:    ori $a2, $zero, 128
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    bltu $a2, $a1, .LBB5_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB5_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: overflow_add_const_limit:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    add.w $a0, $a1, $a0
+; LA32S-NEXT:    andi $a0, $a0, 255
+; LA32S-NEXT:    ori $a1, $zero, 128
+; LA32S-NEXT:    sltu $a0, $a1, $a0
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: overflow_add_const_limit:
 ; LA64:       # %bb.0:
@@ -205,16 +289,28 @@ define i32 @overflow_add_const_limit(i8 zeroext %a, i8 zeroext %b) {
 }
 
 define i32 @overflow_add_positive_const_limit(i8 zeroext %a) {
-; LA32-LABEL: overflow_add_positive_const_limit:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.b $a0, $a0
-; LA32-NEXT:    slti $a0, $a0, -1
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: overflow_add_positive_const_limit:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    srai.w $a1, $a0, 24
+; LA32R-NEXT:    addi.w $a2, $zero, -1
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    blt $a1, $a2, .LBB6_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB6_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: overflow_add_positive_const_limit:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ext.w.b $a0, $a0
+; LA32S-NEXT:    slti $a0, $a0, -1
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: overflow_add_positive_const_limit:
 ; LA64:       # %bb.0:
@@ -232,16 +328,27 @@ define i32 @overflow_add_positive_const_limit(i8 zeroext %a) {
 }
 
 define i32 @unsafe_add_underflow(i8 zeroext %a) {
-; LA32-LABEL: unsafe_add_underflow:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $a0, $a0, -1
-; LA32-NEXT:    sltui $a0, $a0, 1
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: unsafe_add_underflow:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    move $a1, $a0
+; LA32R-NEXT:    ori $a2, $zero, 1
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    beq $a1, $a2, .LBB7_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB7_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: unsafe_add_underflow:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $a0, $a0, -1
+; LA32S-NEXT:    sltui $a0, $a0, 1
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: unsafe_add_underflow:
 ; LA64:       # %bb.0:
@@ -259,15 +366,25 @@ define i32 @unsafe_add_underflow(i8 zeroext %a) {
 }
 
 define i32 @safe_add_underflow(i8 zeroext %a) {
-; LA32-LABEL: safe_add_underflow:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sltui $a0, $a0, 1
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: safe_add_underflow:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    move $a1, $a0
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    beq $a1, $zero, .LBB8_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB8_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: safe_add_underflow:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sltui $a0, $a0, 1
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: safe_add_underflow:
 ; LA64:       # %bb.0:
@@ -284,16 +401,27 @@ define i32 @safe_add_underflow(i8 zeroext %a) {
 }
 
 define i32 @safe_add_underflow_neg(i8 zeroext %a) {
-; LA32-LABEL: safe_add_underflow_neg:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $a0, $a0, -2
-; LA32-NEXT:    sltui $a0, $a0, 251
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: safe_add_underflow_neg:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $a0, -2
+; LA32R-NEXT:    ori $a2, $zero, 251
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    bltu $a1, $a2, .LBB9_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB9_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: safe_add_underflow_neg:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $a0, $a0, -2
+; LA32S-NEXT:    sltui $a0, $a0, 251
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: safe_add_underflow_neg:
 ; LA64:       # %bb.0:
@@ -312,16 +440,28 @@ define i32 @safe_add_underflow_neg(i8 zeroext %a) {
 }
 
 define i32 @overflow_sub_negative_const_limit(i8 zeroext %a) {
-; LA32-LABEL: overflow_sub_negative_const_limit:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ext.w.b $a0, $a0
-; LA32-NEXT:    slti $a0, $a0, -1
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: overflow_sub_negative_const_limit:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slli.w $a0, $a0, 24
+; LA32R-NEXT:    srai.w $a1, $a0, 24
+; LA32R-NEXT:    addi.w $a2, $zero, -1
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    blt $a1, $a2, .LBB10_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB10_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: overflow_sub_negative_const_limit:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ext.w.b $a0, $a0
+; LA32S-NEXT:    slti $a0, $a0, -1
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: overflow_sub_negative_const_limit:
 ; LA64:       # %bb.0:
@@ -339,17 +479,28 @@ define i32 @overflow_sub_negative_const_limit(i8 zeroext %a) {
 }
 
 define i32 @sext_sub_underflow(i8 zeroext %a) {
-; LA32-LABEL: sext_sub_underflow:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $a0, $a0, -6
-; LA32-NEXT:    addi.w $a1, $zero, -6
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_sub_underflow:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $a0, -6
+; LA32R-NEXT:    addi.w $a2, $zero, -6
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    bltu $a2, $a1, .LBB11_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB11_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_sub_underflow:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $a0, $a0, -6
+; LA32S-NEXT:    addi.w $a1, $zero, -6
+; LA32S-NEXT:    sltu $a0, $a1, $a0
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_sub_underflow:
 ; LA64:       # %bb.0:
@@ -369,15 +520,25 @@ define i32 @sext_sub_underflow(i8 zeroext %a) {
 }
 
 define i32 @safe_sub_underflow(i8 zeroext %a) {
-; LA32-LABEL: safe_sub_underflow:
-; LA32:       # %bb.0:
-; LA32-NEXT:    sltui $a0, $a0, 1
-; LA32-NEXT:    ori $a1, $zero, 8
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 16
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: safe_sub_underflow:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    move $a1, $a0
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:    beq $a1, $zero, .LBB12_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:  .LBB12_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: safe_sub_underflow:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    sltui $a0, $a0, 1
+; LA32S-NEXT:    ori $a1, $zero, 8
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 16
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: safe_sub_underflow:
 ; LA64:       # %bb.0:
@@ -394,17 +555,28 @@ define i32 @safe_sub_underflow(i8 zeroext %a) {
 }
 
 define i32 @safe_sub_underflow_neg(i8 zeroext %a) {
-; LA32-LABEL: safe_sub_underflow_neg:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $a0, $a0, -4
-; LA32-NEXT:    ori $a1, $zero, 250
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: safe_sub_underflow_neg:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $a0, -4
+; LA32R-NEXT:    ori $a2, $zero, 250
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    bltu $a2, $a1, .LBB13_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB13_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: safe_sub_underflow_neg:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $a0, $a0, -4
+; LA32S-NEXT:    ori $a1, $zero, 250
+; LA32S-NEXT:    sltu $a0, $a1, $a0
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: safe_sub_underflow_neg:
 ; LA64:       # %bb.0:
@@ -424,16 +596,27 @@ define i32 @safe_sub_underflow_neg(i8 zeroext %a) {
 }
 
 define i32 @sext_sub_underflow_neg(i8 zeroext %a) {
-; LA32-LABEL: sext_sub_underflow_neg:
-; LA32:       # %bb.0:
-; LA32-NEXT:    addi.w $a0, $a0, -4
-; LA32-NEXT:    sltui $a0, $a0, -3
-; LA32-NEXT:    ori $a1, $zero, 16
-; LA32-NEXT:    masknez $a1, $a1, $a0
-; LA32-NEXT:    ori $a2, $zero, 8
-; LA32-NEXT:    maskeqz $a0, $a2, $a0
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: sext_sub_underflow_neg:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    addi.w $a1, $a0, -4
+; LA32R-NEXT:    addi.w $a2, $zero, -3
+; LA32R-NEXT:    ori $a0, $zero, 8
+; LA32R-NEXT:    bltu $a1, $a2, .LBB14_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 16
+; LA32R-NEXT:  .LBB14_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: sext_sub_underflow_neg:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    addi.w $a0, $a0, -4
+; LA32S-NEXT:    sltui $a0, $a0, -3
+; LA32S-NEXT:    ori $a1, $zero, 16
+; LA32S-NEXT:    masknez $a1, $a1, $a0
+; LA32S-NEXT:    ori $a2, $zero, 8
+; LA32S-NEXT:    maskeqz $a0, $a2, $a0
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: sext_sub_underflow_neg:
 ; LA64:       # %bb.0:
@@ -452,10 +635,15 @@ define i32 @sext_sub_underflow_neg(i8 zeroext %a) {
 }
 
 define i32 @safe_sub_imm_var(ptr nocapture readonly %b) local_unnamed_addr #1 {
-; LA32-LABEL: safe_sub_imm_var:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    move $a0, $zero
-; LA32-NEXT:    ret
+; LA32R-LABEL: safe_sub_imm_var:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    move $a0, $zero
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: safe_sub_imm_var:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    move $a0, $zero
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: safe_sub_imm_var:
 ; LA64:       # %bb.0: # %entry
@@ -466,13 +654,21 @@ entry:
 }
 
 define i32 @safe_sub_var_imm(ptr nocapture readonly %b) local_unnamed_addr #1 {
-; LA32-LABEL: safe_sub_var_imm:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ld.bu $a0, $a0, 0
-; LA32-NEXT:    addi.w $a0, $a0, -248
-; LA32-NEXT:    addi.w $a1, $zero, -4
-; LA32-NEXT:    sltu $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: safe_sub_var_imm:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    ld.bu $a0, $a0, 0
+; LA32R-NEXT:    addi.w $a0, $a0, -248
+; LA32R-NEXT:    addi.w $a1, $zero, -4
+; LA32R-NEXT:    sltu $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: safe_sub_var_imm:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ld.bu $a0, $a0, 0
+; LA32S-NEXT:    addi.w $a0, $a0, -248
+; LA32S-NEXT:    addi.w $a1, $zero, -4
+; LA32S-NEXT:    sltu $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: safe_sub_var_imm:
 ; LA64:       # %bb.0: # %entry
@@ -490,10 +686,15 @@ entry:
 }
 
 define i32 @safe_add_imm_var(ptr nocapture readnone %b) {
-; LA32-LABEL: safe_add_imm_var:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ori $a0, $zero, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: safe_add_imm_var:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    ori $a0, $zero, 1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: safe_add_imm_var:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ori $a0, $zero, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: safe_add_imm_var:
 ; LA64:       # %bb.0: # %entry
@@ -504,10 +705,15 @@ entry:
 }
 
 define i32 @safe_add_var_imm(ptr nocapture readnone %b) {
-; LA32-LABEL: safe_add_var_imm:
-; LA32:       # %bb.0: # %entry
-; LA32-NEXT:    ori $a0, $zero, 1
-; LA32-NEXT:    ret
+; LA32R-LABEL: safe_add_var_imm:
+; LA32R:       # %bb.0: # %entry
+; LA32R-NEXT:    ori $a0, $zero, 1
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: safe_add_var_imm:
+; LA32S:       # %bb.0: # %entry
+; LA32S-NEXT:    ori $a0, $zero, 1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: safe_add_var_imm:
 ; LA64:       # %bb.0: # %entry
@@ -518,20 +724,37 @@ entry:
 }
 
 define i8 @convert_add_order(i8 zeroext %arg) {
-; LA32-LABEL: convert_add_order:
-; LA32:       # %bb.0:
-; LA32-NEXT:    ori $a1, $a0, 1
-; LA32-NEXT:    sltui $a2, $a1, 50
-; LA32-NEXT:    addi.w $a1, $a1, -40
-; LA32-NEXT:    sltui $a1, $a1, 20
-; LA32-NEXT:    ori $a3, $zero, 2
-; LA32-NEXT:    sub.w $a1, $a3, $a1
-; LA32-NEXT:    ori $a3, $zero, 255
-; LA32-NEXT:    masknez $a3, $a3, $a2
-; LA32-NEXT:    maskeqz $a1, $a1, $a2
-; LA32-NEXT:    or $a1, $a1, $a3
-; LA32-NEXT:    and $a0, $a1, $a0
-; LA32-NEXT:    ret
+; LA32R-LABEL: convert_add_order:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    ori $a1, $a0, 1
+; LA32R-NEXT:    ori $a2, $zero, 50
+; LA32R-NEXT:    bltu $a1, $a2, .LBB19_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a1, $zero, 255
+; LA32R-NEXT:    and $a0, $a1, $a0
+; LA32R-NEXT:    ret
+; LA32R-NEXT:  .LBB19_2:
+; LA32R-NEXT:    addi.w $a1, $a1, -40
+; LA32R-NEXT:    sltui $a1, $a1, 20
+; LA32R-NEXT:    ori $a2, $zero, 2
+; LA32R-NEXT:    sub.w $a1, $a2, $a1
+; LA32R-NEXT:    and $a0, $a1, $a0
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: convert_add_order:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    ori $a1, $a0, 1
+; LA32S-NEXT:    sltui $a2, $a1, 50
+; LA32S-NEXT:    addi.w $a1, $a1, -40
+; LA32S-NEXT:    sltui $a1, $a1, 20
+; LA32S-NEXT:    ori $a3, $zero, 2
+; LA32S-NEXT:    sub.w $a1, $a3, $a1
+; LA32S-NEXT:    ori $a3, $zero, 255
+; LA32S-NEXT:    masknez $a3, $a3, $a2
+; LA32S-NEXT:    maskeqz $a1, $a1, $a2
+; LA32S-NEXT:    or $a1, $a1, $a3
+; LA32S-NEXT:    and $a0, $a1, $a0
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: convert_add_order:
 ; LA64:       # %bb.0:
@@ -558,17 +781,28 @@ define i8 @convert_add_order(i8 zeroext %arg) {
 }
 
 define i8 @underflow_if_sub(i32 %arg, i8 zeroext %arg1) {
-; LA32-LABEL: underflow_if_sub:
-; LA32:       # %bb.0:
-; LA32-NEXT:    slt $a2, $zero, $a0
-; LA32-NEXT:    and $a0, $a2, $a0
-; LA32-NEXT:    addi.w $a0, $a0, 245
-; LA32-NEXT:    sltu $a1, $a0, $a1
-; LA32-NEXT:    maskeqz $a0, $a0, $a1
-; LA32-NEXT:    ori $a2, $zero, 100
-; LA32-NEXT:    masknez $a1, $a2, $a1
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: underflow_if_sub:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    slt $a2, $zero, $a0
+; LA32R-NEXT:    and $a0, $a2, $a0
+; LA32R-NEXT:    addi.w $a0, $a0, 245
+; LA32R-NEXT:    bltu $a0, $a1, .LBB20_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 100
+; LA32R-NEXT:  .LBB20_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: underflow_if_sub:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    slt $a2, $zero, $a0
+; LA32S-NEXT:    and $a0, $a2, $a0
+; LA32S-NEXT:    addi.w $a0, $a0, 245
+; LA32S-NEXT:    sltu $a1, $a0, $a1
+; LA32S-NEXT:    maskeqz $a0, $a0, $a1
+; LA32S-NEXT:    ori $a2, $zero, 100
+; LA32S-NEXT:    masknez $a1, $a2, $a1
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: underflow_if_sub:
 ; LA64:       # %bb.0:
@@ -593,18 +827,30 @@ define i8 @underflow_if_sub(i32 %arg, i8 zeroext %arg1) {
 }
 
 define i8 @underflow_if_sub_signext(i32 %arg, i8 signext %arg1) {
-; LA32-LABEL: underflow_if_sub_signext:
-; LA32:       # %bb.0:
-; LA32-NEXT:    andi $a1, $a1, 255
-; LA32-NEXT:    slt $a2, $zero, $a0
-; LA32-NEXT:    and $a0, $a2, $a0
-; LA32-NEXT:    addi.w $a0, $a0, 245
-; LA32-NEXT:    sltu $a1, $a0, $a1
-; LA32-NEXT:    maskeqz $a0, $a0, $a1
-; LA32-NEXT:    ori $a2, $zero, 100
-; LA32-NEXT:    masknez $a1, $a2, $a1
-; LA32-NEXT:    or $a0, $a0, $a1
-; LA32-NEXT:    ret
+; LA32R-LABEL: underflow_if_sub_signext:
+; LA32R:       # %bb.0:
+; LA32R-NEXT:    andi $a1, $a1, 255
+; LA32R-NEXT:    slt $a2, $zero, $a0
+; LA32R-NEXT:    and $a0, $a2, $a0
+; LA32R-NEXT:    addi.w $a0, $a0, 245
+; LA32R-NEXT:    bltu $a0, $a1, .LBB21_2
+; LA32R-NEXT:  # %bb.1:
+; LA32R-NEXT:    ori $a0, $zero, 100
+; LA32R-NEXT:  .LBB21_2:
+; LA32R-NEXT:    ret
+;
+; LA32S-LABEL: underflow_if_sub_signext:
+; LA32S:       # %bb.0:
+; LA32S-NEXT:    andi $a1, $a1, 255
+; LA32S-NEXT:    slt $a2, $zero, $a0
+; LA32S-NEXT:    and $a0, $a2, $a0
+; LA32S-NEXT:    addi.w $a0, $a0, 245
+; LA32S-NEXT:    sltu $a1, $a0, $a1
+; LA32S-NEXT:    maskeqz $a0, $a0, $a1
+; LA32S-NEXT:    ori $a2, $zero, 100
+; LA32S-NEXT:    masknez $a1, $a2, $a1
+; LA32S-NEXT:    or $a0, $a0, $a1
+; LA32S-NEXT:    ret
 ;
 ; LA64-LABEL: underflow_if_sub_signext:
 ; LA64:       # %bb.0:
diff --git a/llvm/test/MC/LoongArch/Basic/Integer/atomic.s b/llvm/test/MC/LoongArch/Basic/Integer/atomic.s
index 69acdeef935cc..e0aaa15ee426e 100644
--- a/llvm/test/MC/LoongArch/Basic/Integer/atomic.s
+++ b/llvm/test/MC/LoongArch/Basic/Integer/atomic.s
@@ -21,14 +21,6 @@ ll.w $tp, $s4, 220
 # CHECK-ASM: encoding: [0xd3,0x39,0x00,0x21]
 sc.w $t7, $t2, 56
 
-# CHECK-ASM-AND-OBJ: llacq.w $t1, $t2
-# CHECK-ASM: encoding: [0xcd,0x81,0x57,0x38]
-llacq.w $t1, $t2
-
-# CHECK-ASM-AND-OBJ: screl.w $t1, $t2
-# CHECK-ASM: encoding: [0xcd,0x85,0x57,0x38]
-screl.w $t1, $t2
-
 
 
 #############################################################
@@ -277,6 +269,14 @@ sc.d $t5, $t5, 244
 # CHECK64-ASM: encoding: [0x33,0x3a,0x57,0x38]
 sc.q $t7, $t2, $t5
 
+# CHECK64-ASM-AND-OBJ: llacq.w $t1, $t2
+# CHECK64-ASM: encoding: [0xcd,0x81,0x57,0x38]
+llacq.w $t1, $t2
+
+# CHECK64-ASM-AND-OBJ: screl.w $t1, $t2
+# CHECK64-ASM: encoding: [0xcd,0x85,0x57,0x38]
+screl.w $t1, $t2
+
 # CHECK64-ASM-AND-OBJ: llacq.d $t1, $t2
 # CHECK64-ASM: encoding: [0xcd,0x89,0x57,0x38]
 llacq.d $t1, $t2
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.generated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.generated.expected
index 7a7115b393b1d..eda7e771c128b 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.generated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.generated.expected
@@ -75,11 +75,11 @@ attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
 ; CHECK-NEXT:    st.w $zero, $fp, -12
 ; CHECK-NEXT:    st.w $zero, $fp, -16
 ; CHECK-NEXT:    ori $a0, $zero, 1
-; CHECK-NEXT:    beqz $zero, .LBB0_3
+; CHECK-NEXT:    beq $zero, $zero, .LBB0_3
 ; CHECK-NEXT:  # %bb.1:
 ; CHECK-NEXT:    st.w $a0, $fp, -24
 ; CHECK-NEXT:    ld.w $a0, $fp, -16
-; CHECK-NEXT:    beqz $a0, .LBB0_4
+; CHECK-NEXT:    beq $a0, $zero, .LBB0_4
 ; CHECK-NEXT:  .LBB0_2:
 ; CHECK-NEXT:    ori $a0, $zero, 1
 ; CHECK-NEXT:    st.w $a0, $fp, -24
@@ -93,7 +93,7 @@ attributes #0 = { noredzone nounwind ssp uwtable "frame-pointer"="all" }
 ; CHECK-NEXT:    ori $a0, $zero, 4
 ; CHECK-NEXT:    st.w $a0, $fp, -28
 ; CHECK-NEXT:    ld.w $a0, $fp, -16
-; CHECK-NEXT:    bnez $a0, .LBB0_2
+; CHECK-NEXT:    bne $a0, $zero, .LBB0_2
 ; CHECK-NEXT:  .LBB0_4:
 ; CHECK-NEXT:    ori $a0, $zero, 1
 ; CHECK-NEXT:    st.w $a0, $fp, -16
diff --git a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.nogenerated.expected b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.nogenerated.expected
index d99eb3749826f..aab63fa7176c1 100644
--- a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.nogenerated.expected
+++ b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.nogenerated.expected
@@ -16,11 +16,11 @@ define dso_local i32 @check_boundaries() #0 {
 ; CHECK-NEXT:    st.w $zero, $fp, -12
 ; CHECK-NEXT:    st.w $zero, $fp, -16
 ; CHECK-NEXT:    ori $a0, $zero, 1
-; CHECK-NEXT:    beqz $zero, .LBB0_3
+; CHECK-NEXT:    beq $zero, $zero, .LBB0_3
 ; CHECK-NEXT:  # %bb.1:
 ; CHECK-NEXT:    st.w $a0, $fp, -24
 ; CHECK-NEXT:    ld.w $a0, $fp, -16
-; CHECK-NEXT:    beqz $a0, .LBB0_4
+; CHECK-NEXT:    beq $a0, $zero, .LBB0_4
 ; CHECK-NEXT:  .LBB0_2:
 ; CHECK-NEXT:    ori $a0, $zero, 1
 ; CHECK-NEXT:    st.w $a0, $fp, -24
@@ -34,7 +34,7 @@ define dso_local i32 @check_boundaries() #0 {
 ; CHECK-NEXT:    ori $a0, $zero, 4
 ; CHECK-NEXT:    st.w $a0, $fp, -28
 ; CHECK-NEXT:    ld.w $a0, $fp, -16
-; CHECK-NEXT:    bnez $a0, .LBB0_2
+; CHECK-NEXT:    bne $a0, $zero, .LBB0_2
 ; CHECK-NEXT:  .LBB0_4:
 ; CHECK-NEXT:    ori $a0, $zero, 1
 ; CHECK-NEXT:    st.w $a0, $fp, -16

From 7940d1e884df721a46393339d776cc5d283cf432 Mon Sep 17 00:00:00 2001
From: WANG Rui <wangrui at loongson.cn>
Date: Wed, 14 May 2025 13:46:50 +0800
Subject: [PATCH 2/2] Support CPUCFG instruction for LA32R

---
 llvm/lib/Target/LoongArch/LoongArchInstrInfo.td | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
index f3c132cdb42e6..d2e677062b4ad 100644
--- a/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
+++ b/llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
@@ -854,6 +854,12 @@ def BREAK   : MISC_I15<0x002a0000>;
 def RDTIMEL_W : RDTIME_2R<0x00006000>;
 def RDTIMEH_W : RDTIME_2R<0x00006400>;
 
+// The CPUCFG instruction offers a reliable way to probe CPU features.
+// Although it's undefined on LA32R, adding compiler support enables the
+// Linux kernel to emulate it, making the functionality available to
+// user-space applications.
+def CPUCFG : ALU_2R<0x00006c00>;
+
 // Cache Maintenance Instructions
 def CACOP : FmtCACOP<(outs), (ins uimm5:$op, GPR:$rj, simm12:$imm12),
                      "$op, $rj, $imm12">;
@@ -895,9 +901,6 @@ def MASKNEZ : ALU_3R<0x00138000>;
 // Branch Instructions
 def BEQZ : BrCCZ_1RI21<0x40000000>;
 def BNEZ : BrCCZ_1RI21<0x44000000>;
-
-// Other Miscellaneous Instructions
-def CPUCFG : ALU_2R<0x00006c00>;
 } // Predicates = [Has32S]
 
 /// LA64 instructions
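
To illustrate the user-space side of the CPUCFG comment above: once the
kernel traps and emulates the instruction on LA32R, a program can probe
configuration words directly. A minimal sketch in C, assuming GCC-style
inline assembly and a kernel that performs the emulation; the helper name
is illustrative, and the use of word 0 (which holds the PRID per the
LoongArch reference manual) is just one example:

    #include <stdint.h>

    /* Sketch: read one CPUCFG configuration word. On LA32R the
       instruction is undefined in hardware; this assumes the Linux
       kernel traps and emulates it for user space. */
    static inline uint32_t loongarch_cpucfg(uint32_t word)
    {
        uint32_t val;
        __asm__("cpucfg %0, %1" : "=r"(val) : "r"(word));
        return val;
    }

    int main(void)
    {
        /* Word 0 holds the processor identity (PRID). */
        uint32_t prid = loongarch_cpucfg(0);
        return (int)(prid & 0xff);
    }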


