[llvm] [LoongArch] Introduce `32s` target feature for LA32S ISA extensions (PR #139695)
via llvm-commits
llvm-commits at lists.llvm.org
Tue May 13 02:17:33 PDT 2025
llvmbot wrote:
@llvm/pr-subscribers-backend-loongarch
Author: hev (heiher)
<details>
<summary>Changes</summary>
According to the official LoongArch reference manual, the 32-bit LoongArch is divided into two variants: the Reduced version (LA32R) and the Standard version (LA32S). LA32S extends LA32R with additional instructions, and the 64-bit version (LA64) fully includes the LA32S instruction set.
This patch introduces a new target feature `32s` for the LoongArch backend, enabling support for instructions specific to the LA32S variant.
The LA32S extension includes the following additional instructions:
- ALSL.W
- {AND,OR}N
- B{EQ,NE}Z
- BITREV.{4B,W}
- BSTR{INS,PICK}.W
- BYTEPICK.W
- CL{O,Z}.W
- CPUCFG
- CT{O,Z}.W
- EXT.W.{B,H}
- F{LD,ST}X.{D,S}
- MASK{EQ,NE}Z
- PC{ADDI,ALAU12I}
- REVB.2H
- ROTR{I}.W
Additionally, LA32R defines three new instruction aliases:
- RDCNTID.W RJ => RDTIMEL.W ZERO, RJ
- RDCNTVH.W RD => RDTIMEH.W RD, ZERO
- RDCNTVL.W RD => RDTIMEL.W RD, ZERO
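For illustration, a short sketch of how these aliases assemble, based on the expansions listed above (register choices here are arbitrary examples, not taken from the patch):

```asm
rdcntvl.w $a0    # alias for rdtimel.w $a0, $zero  - low word of the stable counter
rdcntvh.w $a1    # alias for rdtimeh.w $a1, $zero  - high word of the stable counter
rdcntid.w $a2    # alias for rdtimel.w $zero, $a2  - counter ID written into $a2
```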
---
Patch is 979.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139695.diff
60 Files Affected:
- (modified) llvm/lib/Target/LoongArch/LoongArch.td (+12)
- (modified) llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp (+12-6)
- (modified) llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.h (+22)
- (modified) llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp (+506-11)
- (modified) llvm/lib/Target/LoongArch/LoongArchISelLowering.h (+4)
- (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.td (+126-49)
- (modified) llvm/test/CodeGen/LoongArch/alloca.ll (+148-71)
- (modified) llvm/test/CodeGen/LoongArch/alsl.ll (+273-118)
- (modified) llvm/test/CodeGen/LoongArch/annotate-tablejump.ll (+1-1)
- (modified) llvm/test/CodeGen/LoongArch/atomicrmw-cond-sub-clamp.ll (+8-8)
- (modified) llvm/test/CodeGen/LoongArch/atomicrmw-uinc-udec-wrap.ll (+8-8)
- (modified) llvm/test/CodeGen/LoongArch/bitreverse.ll (+573-72)
- (modified) llvm/test/CodeGen/LoongArch/bnez-beqz.ll (+66-31)
- (modified) llvm/test/CodeGen/LoongArch/branch-relaxation.ll (+101-53)
- (modified) llvm/test/CodeGen/LoongArch/bstrins_w.ll (+1-1)
- (modified) llvm/test/CodeGen/LoongArch/bstrpick_w.ll (+1-1)
- (modified) llvm/test/CodeGen/LoongArch/bswap-bitreverse.ll (+225-39)
- (modified) llvm/test/CodeGen/LoongArch/bswap.ll (+229-65)
- (modified) llvm/test/CodeGen/LoongArch/bytepick.ll (+1-1)
- (modified) llvm/test/CodeGen/LoongArch/ctlz-cttz-ctpop.ll (+803-200)
- (modified) llvm/test/CodeGen/LoongArch/ctpop-with-lsx.ll (+80-37)
- (modified) llvm/test/CodeGen/LoongArch/exception-pointer-register.ll (+1-1)
- (modified) llvm/test/CodeGen/LoongArch/fabs.ll (+2-2)
- (modified) llvm/test/CodeGen/LoongArch/fcopysign.ll (+2-2)
- (modified) llvm/test/CodeGen/LoongArch/feature-32bit.ll (+1)
- (modified) llvm/test/CodeGen/LoongArch/intrinsic-csr-side-effects.ll (+1-1)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/and.ll (+303-136)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/ashr.ll (+20-22)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg-128.ll (+10-10)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll (+44-44)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-fp.ll (+40-40)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lam-bh.ll (+557-591)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-lamcas.ll (+90-90)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw-minmax.ll (+40-40)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/atomicrmw.ll (+4752-2266)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/br.ll (+201-90)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/double-convert.ll (+9-8)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-dbl.ll (+1-1)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/fcmp-flt.ll (+1-1)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/float-convert.ll (+18-16)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/load-store-fp.ll (+2-2)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/load-store.ll (+818-390)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/lshr.ll (+132-57)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/mul.ll (+1276-594)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/sdiv-udiv-srem-urem.ll (+1041-490)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/select-bare-int.ll (+36-25)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/select-fpcc-int.ll (+112-84)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/select-icc-int.ll (+50-46)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/sext-zext-trunc.ll (+281-121)
- (modified) llvm/test/CodeGen/LoongArch/ir-instruction/shl.ll (+13-11)
- (modified) llvm/test/CodeGen/LoongArch/jump-table.ll (+2-2)
- (modified) llvm/test/CodeGen/LoongArch/rotl-rotr.ll (+711-320)
- (modified) llvm/test/CodeGen/LoongArch/select-to-shiftand.ll (+4-4)
- (modified) llvm/test/CodeGen/LoongArch/shift-masked-shamt.ll (+40-40)
- (modified) llvm/test/CodeGen/LoongArch/smul-with-overflow.ll (+132-140)
- (modified) llvm/test/CodeGen/LoongArch/stack-realignment-with-variable-sized-objects.ll (+1-1)
- (modified) llvm/test/CodeGen/LoongArch/typepromotion-overflow.ll (+468-222)
- (modified) llvm/test/MC/LoongArch/Basic/Integer/atomic.s (+8-8)
- (modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.generated.expected (+3-3)
- (modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/loongarch_generated_funcs.ll.nogenerated.expected (+3-3)
``````````diff
diff --git a/llvm/lib/Target/LoongArch/LoongArch.td b/llvm/lib/Target/LoongArch/LoongArch.td
index 5fd52babfc6ec..707d2de23cdfe 100644
--- a/llvm/lib/Target/LoongArch/LoongArch.td
+++ b/llvm/lib/Target/LoongArch/LoongArch.td
@@ -32,6 +32,14 @@ def IsLA32
defvar LA32 = DefaultMode;
def LA64 : HwMode<"+64bit", [IsLA64]>;
+// LoongArch 32-bit is divided into variants, the reduced 32-bit variant (LA32R)
+// and the standard 32-bit variant (LA32S).
+def Feature32S
+ : SubtargetFeature<"32s", "Has32S", "true",
+ "LA32 Standard Basic Instruction Extension">;
+def Has32S : Predicate<"Subtarget->has32S()">;
+def Not32S : Predicate<"!Subtarget->has32S()">;
+
// Single Precision floating point
def FeatureBasicF
: SubtargetFeature<"f", "HasBasicF", "true",
@@ -159,11 +167,13 @@ include "LoongArchInstrInfo.td"
def : ProcessorModel<"generic-la32", NoSchedModel, [Feature32Bit]>;
def : ProcessorModel<"generic-la64", NoSchedModel, [Feature64Bit,
+ Feature32S,
FeatureUAL,
FeatureExtLSX]>;
// Generic 64-bit processor with double-precision floating-point support.
def : ProcessorModel<"loongarch64", NoSchedModel, [Feature64Bit,
+ Feature32S,
FeatureUAL,
FeatureBasicD]>;
@@ -172,12 +182,14 @@ def : ProcessorModel<"loongarch64", NoSchedModel, [Feature64Bit,
def : ProcessorModel<"generic", NoSchedModel, []>;
def : ProcessorModel<"la464", NoSchedModel, [Feature64Bit,
+ Feature32S,
FeatureUAL,
FeatureExtLASX,
FeatureExtLVZ,
FeatureExtLBT]>;
def : ProcessorModel<"la664", NoSchedModel, [Feature64Bit,
+ Feature32S,
FeatureUAL,
FeatureExtLASX,
FeatureExtLVZ,
diff --git a/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp b/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
index 27d20390eb6ae..3be012feb2385 100644
--- a/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchExpandAtomicPseudoInsts.cpp
@@ -214,8 +214,9 @@ static void doAtomicBinOpExpansion(const LoongArchInstrInfo *TII,
.addReg(ScratchReg)
.addReg(AddrReg)
.addImm(0);
- BuildMI(LoopMBB, DL, TII->get(LoongArch::BEQZ))
+ BuildMI(LoopMBB, DL, TII->get(LoongArch::BEQ))
.addReg(ScratchReg)
+ .addReg(LoongArch::R0)
.addMBB(LoopMBB);
}
@@ -296,8 +297,9 @@ static void doMaskedAtomicBinOpExpansion(
.addReg(ScratchReg)
.addReg(AddrReg)
.addImm(0);
- BuildMI(LoopMBB, DL, TII->get(LoongArch::BEQZ))
+ BuildMI(LoopMBB, DL, TII->get(LoongArch::BEQ))
.addReg(ScratchReg)
+ .addReg(LoongArch::R0)
.addMBB(LoopMBB);
}
@@ -454,8 +456,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicMinMaxOp(
.addReg(Scratch1Reg)
.addReg(AddrReg)
.addImm(0);
- BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQZ))
+ BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQ))
.addReg(Scratch1Reg)
+ .addReg(LoongArch::R0)
.addMBB(LoopHeadMBB);
NextMBBI = MBB.end();
@@ -529,8 +532,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicCmpXchg(
.addReg(ScratchReg)
.addReg(AddrReg)
.addImm(0);
- BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQZ))
+ BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQ))
.addReg(ScratchReg)
+ .addReg(LoongArch::R0)
.addMBB(LoopHeadMBB);
BuildMI(LoopTailMBB, DL, TII->get(LoongArch::B)).addMBB(DoneMBB);
} else {
@@ -569,8 +573,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicCmpXchg(
.addReg(ScratchReg)
.addReg(AddrReg)
.addImm(0);
- BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQZ))
+ BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQ))
.addReg(ScratchReg)
+ .addReg(LoongArch::R0)
.addMBB(LoopHeadMBB);
BuildMI(LoopTailMBB, DL, TII->get(LoongArch::B)).addMBB(DoneMBB);
}
@@ -677,8 +682,9 @@ bool LoongArchExpandAtomicPseudo::expandAtomicCmpXchg128(
.addReg(ScratchReg)
.addReg(NewValHiReg)
.addReg(AddrReg);
- BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQZ))
+ BuildMI(LoopTailMBB, DL, TII->get(LoongArch::BEQ))
.addReg(ScratchReg)
+ .addReg(LoongArch::R0)
.addMBB(LoopHeadMBB);
BuildMI(LoopTailMBB, DL, TII->get(LoongArch::B)).addMBB(DoneMBB);
int hint;
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.h b/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.h
index 8a7eba418d804..e94f249c14be2 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.h
+++ b/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.h
@@ -64,6 +64,28 @@ class LoongArchDAGToDAGISel : public SelectionDAGISel {
bool selectVSplatUimmInvPow2(SDValue N, SDValue &SplatImm) const;
bool selectVSplatUimmPow2(SDValue N, SDValue &SplatImm) const;
+ // Return the LoongArch branch opcode that matches the given DAG integer
+ // condition code. The CondCode must be one of those supported by the
+ // LoongArch ISA (see translateSetCCForBranch).
+ static unsigned getBranchOpcForIntCC(ISD::CondCode CC) {
+ switch (CC) {
+ default:
+ llvm_unreachable("Unsupported CondCode");
+ case ISD::SETEQ:
+ return LoongArch::BEQ;
+ case ISD::SETNE:
+ return LoongArch::BNE;
+ case ISD::SETLT:
+ return LoongArch::BLT;
+ case ISD::SETGE:
+ return LoongArch::BGE;
+ case ISD::SETULT:
+ return LoongArch::BLTU;
+ case ISD::SETUGE:
+ return LoongArch::BGEU;
+ }
+ }
+
// Include the pieces autogenerated from the target description.
#include "LoongArchGenDAGISel.inc"
};
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
index b729b4ea6f9b4..6e3e1396e6aeb 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
@@ -18,6 +18,7 @@
#include "LoongArchSubtarget.h"
#include "MCTargetDesc/LoongArchBaseInfo.h"
#include "MCTargetDesc/LoongArchMCTargetDesc.h"
+#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/CodeGen/ISDOpcodes.h"
@@ -102,15 +103,26 @@ LoongArchTargetLowering::LoongArchTargetLowering(const TargetMachine &TM,
setOperationAction(ISD::PREFETCH, MVT::Other, Custom);
- // Expand bitreverse.i16 with native-width bitrev and shift for now, before
- // we get to know which of sll and revb.2h is faster.
- setOperationAction(ISD::BITREVERSE, MVT::i8, Custom);
- setOperationAction(ISD::BITREVERSE, GRLenVT, Legal);
-
- // LA32 does not have REVB.2W and REVB.D due to the 64-bit operands, and
- // the narrower REVB.W does not exist. But LA32 does have REVB.2H, so i16
- // and i32 could still be byte-swapped relatively cheaply.
- setOperationAction(ISD::BSWAP, MVT::i16, Custom);
+ // BITREV/REVB requires the 32S feature.
+ if (STI.has32S()) {
+ // Expand bitreverse.i16 with native-width bitrev and shift for now, before
+ // we get to know which of sll and revb.2h is faster.
+ setOperationAction(ISD::BITREVERSE, MVT::i8, Custom);
+ setOperationAction(ISD::BITREVERSE, GRLenVT, Legal);
+
+ // LA32 does not have REVB.2W and REVB.D due to the 64-bit operands, and
+ // the narrower REVB.W does not exist. But LA32 does have REVB.2H, so i16
+ // and i32 could still be byte-swapped relatively cheaply.
+ setOperationAction(ISD::BSWAP, MVT::i16, Custom);
+ } else {
+ setOperationAction(ISD::BSWAP, GRLenVT, Expand);
+ setOperationAction(ISD::CTTZ, GRLenVT, Expand);
+ setOperationAction(ISD::CTLZ, GRLenVT, Expand);
+ setOperationAction(ISD::ROTR, GRLenVT, Expand);
+ setOperationAction(ISD::SELECT, GRLenVT, Custom);
+ setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Expand);
+ setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Expand);
+ }
setOperationAction(ISD::BR_JT, MVT::Other, Expand);
setOperationAction(ISD::BR_CC, GRLenVT, Expand);
@@ -476,6 +488,8 @@ SDValue LoongArchTargetLowering::LowerOperation(SDValue Op,
return lowerSCALAR_TO_VECTOR(Op, DAG);
case ISD::PREFETCH:
return lowerPREFETCH(Op, DAG);
+ case ISD::SELECT:
+ return lowerSELECT(Op, DAG);
}
return SDValue();
}
@@ -492,6 +506,327 @@ SDValue LoongArchTargetLowering::lowerPREFETCH(SDValue Op,
return Op;
}
+// Return true if Val is equal to (setcc LHS, RHS, CC).
+// Return false if Val is the inverse of (setcc LHS, RHS, CC).
+// Otherwise, return std::nullopt.
+static std::optional<bool> matchSetCC(SDValue LHS, SDValue RHS,
+ ISD::CondCode CC, SDValue Val) {
+ assert(Val->getOpcode() == ISD::SETCC);
+ SDValue LHS2 = Val.getOperand(0);
+ SDValue RHS2 = Val.getOperand(1);
+ ISD::CondCode CC2 = cast<CondCodeSDNode>(Val.getOperand(2))->get();
+
+ if (LHS == LHS2 && RHS == RHS2) {
+ if (CC == CC2)
+ return true;
+ if (CC == ISD::getSetCCInverse(CC2, LHS2.getValueType()))
+ return false;
+ } else if (LHS == RHS2 && RHS == LHS2) {
+ CC2 = ISD::getSetCCSwappedOperands(CC2);
+ if (CC == CC2)
+ return true;
+ if (CC == ISD::getSetCCInverse(CC2, LHS2.getValueType()))
+ return false;
+ }
+
+ return std::nullopt;
+}
+
+static SDValue combineSelectToBinOp(SDNode *N, SelectionDAG &DAG,
+ const LoongArchSubtarget &Subtarget) {
+ SDValue CondV = N->getOperand(0);
+ SDValue TrueV = N->getOperand(1);
+ SDValue FalseV = N->getOperand(2);
+ MVT VT = N->getSimpleValueType(0);
+ SDLoc DL(N);
+
+ // (select c, -1, y) -> -c | y
+ if (isAllOnesConstant(TrueV)) {
+ SDValue Neg = DAG.getNegative(CondV, DL, VT);
+ return DAG.getNode(ISD::OR, DL, VT, Neg, DAG.getFreeze(FalseV));
+ }
+ // (select c, y, -1) -> (c-1) | y
+ if (isAllOnesConstant(FalseV)) {
+ SDValue Neg =
+ DAG.getNode(ISD::ADD, DL, VT, CondV, DAG.getAllOnesConstant(DL, VT));
+ return DAG.getNode(ISD::OR, DL, VT, Neg, DAG.getFreeze(TrueV));
+ }
+
+ // (select c, 0, y) -> (c-1) & y
+ if (isNullConstant(TrueV)) {
+ SDValue Neg =
+ DAG.getNode(ISD::ADD, DL, VT, CondV, DAG.getAllOnesConstant(DL, VT));
+ return DAG.getNode(ISD::AND, DL, VT, Neg, DAG.getFreeze(FalseV));
+ }
+ // (select c, y, 0) -> -c & y
+ if (isNullConstant(FalseV)) {
+ SDValue Neg = DAG.getNegative(CondV, DL, VT);
+ return DAG.getNode(ISD::AND, DL, VT, Neg, DAG.getFreeze(TrueV));
+ }
+
+ // select c, ~x, x --> xor -c, x
+ if (isa<ConstantSDNode>(TrueV) && isa<ConstantSDNode>(FalseV)) {
+ const APInt &TrueVal = TrueV->getAsAPIntVal();
+ const APInt &FalseVal = FalseV->getAsAPIntVal();
+ if (~TrueVal == FalseVal) {
+ SDValue Neg = DAG.getNegative(CondV, DL, VT);
+ return DAG.getNode(ISD::XOR, DL, VT, Neg, FalseV);
+ }
+ }
+
+ // Try to fold (select (setcc lhs, rhs, cc), truev, falsev) into bitwise ops
+ // when both truev and falsev are also setcc.
+ if (CondV.getOpcode() == ISD::SETCC && TrueV.getOpcode() == ISD::SETCC &&
+ FalseV.getOpcode() == ISD::SETCC) {
+ SDValue LHS = CondV.getOperand(0);
+ SDValue RHS = CondV.getOperand(1);
+ ISD::CondCode CC = cast<CondCodeSDNode>(CondV.getOperand(2))->get();
+
+ // (select x, x, y) -> x | y
+ // (select !x, x, y) -> x & y
+ if (std::optional<bool> MatchResult = matchSetCC(LHS, RHS, CC, TrueV)) {
+ return DAG.getNode(*MatchResult ? ISD::OR : ISD::AND, DL, VT, TrueV,
+ DAG.getFreeze(FalseV));
+ }
+ // (select x, y, x) -> x & y
+ // (select !x, y, x) -> x | y
+ if (std::optional<bool> MatchResult = matchSetCC(LHS, RHS, CC, FalseV)) {
+ return DAG.getNode(*MatchResult ? ISD::AND : ISD::OR, DL, VT,
+ DAG.getFreeze(TrueV), FalseV);
+ }
+ }
+
+ return SDValue();
+}
+
+// Transform `binOp (select cond, x, c0), c1` where `c0` and `c1` are constants
+// into `select cond, binOp(x, c1), binOp(c0, c1)` if profitable.
+// For now we only consider transformation profitable if `binOp(c0, c1)` ends up
+// being `0` or `-1`. In such cases we can replace `select` with `and`.
+// TODO: Should we also do this if `binOp(c0, c1)` is cheaper to materialize
+// than `c0`?
+static SDValue
+foldBinOpIntoSelectIfProfitable(SDNode *BO, SelectionDAG &DAG,
+ const LoongArchSubtarget &Subtarget) {
+ unsigned SelOpNo = 0;
+ SDValue Sel = BO->getOperand(0);
+ if (Sel.getOpcode() != ISD::SELECT || !Sel.hasOneUse()) {
+ SelOpNo = 1;
+ Sel = BO->getOperand(1);
+ }
+
+ if (Sel.getOpcode() != ISD::SELECT || !Sel.hasOneUse())
+ return SDValue();
+
+ unsigned ConstSelOpNo = 1;
+ unsigned OtherSelOpNo = 2;
+ if (!isa<ConstantSDNode>(Sel->getOperand(ConstSelOpNo))) {
+ ConstSelOpNo = 2;
+ OtherSelOpNo = 1;
+ }
+ SDValue ConstSelOp = Sel->getOperand(ConstSelOpNo);
+ ConstantSDNode *ConstSelOpNode = dyn_cast<ConstantSDNode>(ConstSelOp);
+ if (!ConstSelOpNode || ConstSelOpNode->isOpaque())
+ return SDValue();
+
+ SDValue ConstBinOp = BO->getOperand(SelOpNo ^ 1);
+ ConstantSDNode *ConstBinOpNode = dyn_cast<ConstantSDNode>(ConstBinOp);
+ if (!ConstBinOpNode || ConstBinOpNode->isOpaque())
+ return SDValue();
+
+ SDLoc DL(Sel);
+ EVT VT = BO->getValueType(0);
+
+ SDValue NewConstOps[2] = {ConstSelOp, ConstBinOp};
+ if (SelOpNo == 1)
+ std::swap(NewConstOps[0], NewConstOps[1]);
+
+ SDValue NewConstOp =
+ DAG.FoldConstantArithmetic(BO->getOpcode(), DL, VT, NewConstOps);
+ if (!NewConstOp)
+ return SDValue();
+
+ const APInt &NewConstAPInt = NewConstOp->getAsAPIntVal();
+ if (!NewConstAPInt.isZero() && !NewConstAPInt.isAllOnes())
+ return SDValue();
+
+ SDValue OtherSelOp = Sel->getOperand(OtherSelOpNo);
+ SDValue NewNonConstOps[2] = {OtherSelOp, ConstBinOp};
+ if (SelOpNo == 1)
+ std::swap(NewNonConstOps[0], NewNonConstOps[1]);
+ SDValue NewNonConstOp = DAG.getNode(BO->getOpcode(), DL, VT, NewNonConstOps);
+
+ SDValue NewT = (ConstSelOpNo == 1) ? NewConstOp : NewNonConstOp;
+ SDValue NewF = (ConstSelOpNo == 1) ? NewNonConstOp : NewConstOp;
+ return DAG.getSelect(DL, VT, Sel.getOperand(0), NewT, NewF);
+}
+
+// Changes the condition code and swaps operands if necessary, so the SetCC
+// operation matches one of the comparisons supported directly by branches
+// in the LoongArch ISA. May adjust compares to favor compare with 0 over
+// compare with 1/-1.
+static void translateSetCCForBranch(const SDLoc &DL, SDValue &LHS, SDValue &RHS,
+ ISD::CondCode &CC, SelectionDAG &DAG) {
+ // If this is a single bit test that can't be handled by ANDI, shift the
+ // bit to be tested to the MSB and perform a signed compare with 0.
+ if (isIntEqualitySetCC(CC) && isNullConstant(RHS) &&
+ LHS.getOpcode() == ISD::AND && LHS.hasOneUse() &&
+ isa<ConstantSDNode>(LHS.getOperand(1))) {
+ uint64_t Mask = LHS.getConstantOperandVal(1);
+ if ((isPowerOf2_64(Mask) || isMask_64(Mask)) && !isInt<12>(Mask)) {
+ unsigned ShAmt = 0;
+ if (isPowerOf2_64(Mask)) {
+ CC = CC == ISD::SETEQ ? ISD::SETGE : ISD::SETLT;
+ ShAmt = LHS.getValueSizeInBits() - 1 - Log2_64(Mask);
+ } else {
+ ShAmt = LHS.getValueSizeInBits() - llvm::bit_width(Mask);
+ }
+
+ LHS = LHS.getOperand(0);
+ if (ShAmt != 0)
+ LHS = DAG.getNode(ISD::SHL, DL, LHS.getValueType(), LHS,
+ DAG.getConstant(ShAmt, DL, LHS.getValueType()));
+ return;
+ }
+ }
+
+ if (auto *RHSC = dyn_cast<ConstantSDNode>(RHS)) {
+ int64_t C = RHSC->getSExtValue();
+ switch (CC) {
+ default:
+ break;
+ case ISD::SETGT:
+ // Convert X > -1 to X >= 0.
+ if (C == -1) {
+ RHS = DAG.getConstant(0, DL, RHS.getValueType());
+ CC = ISD::SETGE;
+ return;
+ }
+ break;
+ case ISD::SETLT:
+ // Convert X < 1 to 0 >= X.
+ if (C == 1) {
+ RHS = LHS;
+ LHS = DAG.getConstant(0, DL, RHS.getValueType());
+ CC = ISD::SETGE;
+ return;
+ }
+ break;
+ }
+ }
+
+ switch (CC) {
+ default:
+ break;
+ case ISD::SETGT:
+ case ISD::SETLE:
+ case ISD::SETUGT:
+ case ISD::SETULE:
+ CC = ISD::getSetCCSwappedOperands(CC);
+ std::swap(LHS, RHS);
+ break;
+ }
+}
+
+SDValue LoongArchTargetLowering::lowerSELECT(SDValue Op,
+ SelectionDAG &DAG) const {
+ SDValue CondV = Op.getOperand(0);
+ SDValue TrueV = Op.getOperand(1);
+ SDValue FalseV = Op.getOperand(2);
+ SDLoc DL(Op);
+ MVT VT = Op.getSimpleValueType();
+ MVT GRLenVT = Subtarget.getGRLenVT();
+
+ if (SDValue V = combineSelectToBinOp(Op.getNode(), DAG, Subtarget))
+ return V;
+
+ if (Op.hasOneUse()) {
+ unsigned UseOpc = Op->user_begin()->getOpcode();
+ if (isBinOp(UseOpc) && DAG.isSafeToSpeculativelyExecute(UseOpc)) {
+ SDNode *BinOp = *Op->user_begin();
+ if (SDValue NewSel = foldBinOpIntoSelectIfProfitable(*Op->user_begin(),
+ DAG, Subtarget)) {
+ DAG.ReplaceAllUsesWith(BinOp, &NewSel);
+ // Opcode check is necessary because foldBinOpIntoSelectIfProfitable
+ // may return a constant node and cause crash in lowerSELECT.
+ if (NewSel.getOpcode() == ISD::SELECT)
+ return lowerSELECT(NewSel, DAG);
+ return NewSel;
+ }
+ }
+ }
+
+ // If the condition is not an integer SETCC which operates on GRLenVT, we need
+ // to emit a LoongArchISD::SELECT_CC comparing the condition to zero. i.e.:
+ // (select condv, truev, falsev)
+ // -> (loongarchisd::select_cc condv, zero, setne, truev, falsev)
+ if (CondV.getOpcode() != ISD::SETCC ||
+ CondV.getOperand(0).getSimpleValueType() != GRLenVT) {
+ SDValue Zero = DAG.getConstant(0, DL, GRLenVT);
+ SDValue SetNE = DAG.getCondCode(ISD::SETNE);
+
+ SDValue Ops[] = {CondV, Zero, SetNE, TrueV, FalseV};
+
+ return DAG.getNode(LoongArchISD::SELECT_CC, DL, VT, Ops);
+ }
+
+ // If the CondV is the output of a SETCC node which operates on GRLenVT
+ // inputs, then merge the SETCC node into the lowered LoongArchISD::SELECT_CC
+ // to take advantage of the integer compare+branch instructions. i.e.: (select
+ // (setcc lhs, rhs, cc), truev, falsev)
+ // -> (loongarchisd::select_cc lhs, rhs, cc, truev, falsev)
+ SDValue LHS = CondV.getOperand(0);
+ SDValue RHS = CondV.getOperand(1);
+ ISD::CondCode CCVal = cast<CondCodeSDNode>(CondV.getOperand(2))->get();
+
+ // Special case for a select of 2 constants that have a difference of 1.
+ // Normally this is done by DAGCombine, but if the select is introduced by
+ // type legalization or op legalization, we miss it. Restricting to SETLT
+ // case for now because that is what signed saturating add/sub need.
+ // FIXME: We don't need the condition to be SETLT or even a SETCC,
+ // but we would probably want to swap the true/false values if the condition
+ // is SETGE/SETLE to avoid an XORI.
+ if (isa<ConstantSDNode>(TrueV) && isa<ConstantSDNode>(FalseV) &&
+ CCVal == ISD::SETLT) {
+ const APInt &TrueVal = TrueV->getAsAPIntVal();
+ const APInt &FalseVal = FalseV->getAsAPIntVal();
+ if (TrueVal - 1 == FalseVal)
+ return DAG.getNode(ISD::ADD, DL, VT, CondV, FalseV);
+ if (TrueVal + 1 == FalseVal)
+ return DAG.getNode(ISD::SUB, DL, VT, FalseV, CondV);
+ }
+
+ translateSetCCForBranch(DL, LHS, RHS, CCVal, DAG);
+ // 1 < x ? x : 1 -> 0 < x ? x : 1
+ if (isOneConstant(LHS) && (CCVal == ISD::SETLT || CCVal == ISD::SETULT) &&
+ RHS == TrueV && LHS == FalseV) {
+ LHS = DAG.getConstant(0, DL, VT);
+ // 0 <u x is the same as x != 0.
+ if (CCVal == ISD::SETULT) {
+ std::swap(LHS, RHS);
+ CCVal = ISD::SETNE;
+ }
+ }
+
+ // x <s -1 ? x : -1 -> x <s 0 ? x : -1
+ if (isAllOnesConstant(RH...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/139695
More information about the llvm-commits mailing list