[llvm] [LoongArch] Custom legalizing ConstantFP to avoid float loads (PR #158050)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 15 00:33:38 PDT 2025
================
@@ -549,10 +575,66 @@ SDValue LoongArchTargetLowering::LowerOperation(SDValue Op,
case ISD::VECREDUCE_UMAX:
case ISD::VECREDUCE_UMIN:
return lowerVECREDUCE(Op, DAG);
+ case ISD::ConstantFP:
+ return lowerConstantFP(Op, DAG);
}
return SDValue();
}
+SDValue LoongArchTargetLowering::lowerConstantFP(SDValue Op,
+ SelectionDAG &DAG) const {
+ EVT VT = Op.getValueType();
+ ConstantFPSDNode *CFP = cast<ConstantFPSDNode>(Op);
+ const APFloat &FPVal = CFP->getValueAPF();
+ SDLoc DL(CFP);
+
+ assert((VT == MVT::f32 && Subtarget.hasBasicF()) ||
+ (VT == MVT::f64 && Subtarget.hasBasicD()));
+
+ // If value is 0.0 or -0.0, just ignore it.
+ if (FPVal.isZero())
+ return SDValue();
+
+ // If lsx enabled, use cheaper 'vldi' instruction if possible.
+ if (isFPImmVLDILegal(FPVal, VT))
+ return SDValue();
+
+ // Construct as integer, and move to float register.
+ APInt INTVal = FPVal.bitcastToAPInt();
+ switch (VT.getSimpleVT().SimpleTy) {
+ default:
+ llvm_unreachable("Unexpected floating point type!");
+ break;
+ case MVT::f32: {
+ SDValue NewVal = DAG.getConstant(INTVal, DL, MVT::i32);
+ if (Subtarget.is64Bit())
+ NewVal = DAG.getNode(ISD::ZERO_EXTEND, DL, MVT::i64, NewVal);
+ return DAG.getNode(Subtarget.is64Bit() ? LoongArchISD::MOVGR2FR_W_LA64
+ : LoongArchISD::MOVGR2FR_W,
+ DL, VT, NewVal);
+ }
+ case MVT::f64: {
+ // If more than MaterializeFPImmInsNum instructions will be used to
+ // generate the INTVal, fallback to use floating point load from the
+ // constant pool.
+ auto Seq = LoongArchMatInt::generateInstSeq(INTVal.getSExtValue());
+ if (Seq.size() > MaterializeFPImmInsNum && !FPVal.isExactlyValue(+1.0))
----------------
zhaoqi5 wrote:
Thanks for your suggestions!
Taking `movgr2fr[h]` into account is reasonable. What do you think about setting `MaterializeFPImmInsNum` to `3` by default for both LA32 and LA64? That would mean:
- For `f32` on both LA32 and LA64: `2 insts + movgr2fr.w` (covers all `f32` values);
- For `f64` on LA64: `2 insts + movgr2fr.d`;
- For `f64` on LA32: `1 inst + movgr2fr.w + movgr2frh.w` (same instruction latency as using the constant pool).
The valid range of `MaterializeFPImmInsNum` would be `0,2-6` (`6` behaves the same as `5` on LA64).
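
To make the proposed budget concrete, here is a minimal standalone sketch of the accounting described above. It assumes the threshold counts the integer materialization instructions plus the GPR-to-FPR moves. The names `FPType`, `numMoveInsts`, and `fitsBudget` are hypothetical and are not part of the patch or of LLVM.

```cpp
// Hypothetical illustration only; none of these names exist in LLVM.
enum class FPType { F32, F64 };

// Number of GPR-to-FPR move instructions needed after the integer
// bit pattern has been built. Assumption: f64 on LA32 needs
// movgr2fr.w (low half) plus movgr2frh.w (high half); every other
// case needs a single movgr2fr.w or movgr2fr.d.
constexpr unsigned numMoveInsts(FPType Ty, bool Is64Bit) {
  return (Ty == FPType::F64 && !Is64Bit) ? 2 : 1;
}

// Returns true if materializing through integer registers stays within
// the budget. IntInsts is the length of the integer materialization
// sequence; Budget plays the role of MaterializeFPImmInsNum (3 here,
// matching the proposed default).
constexpr bool fitsBudget(unsigned IntInsts, FPType Ty, bool Is64Bit,
                          unsigned Budget = 3) {
  return IntInsts + numMoveInsts(Ty, Is64Bit) <= Budget;
}
```

Under these assumptions, a budget of `3` admits up to two integer instructions for `f32` and for `f64` on LA64, but only one for `f64` on LA32, which matches the three cases listed above.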
https://github.com/llvm/llvm-project/pull/158050