[llvm] [AArch64][SVE] Use SVE for scalar FP converts in streaming[-compatible] functions (PR #112564)
Benjamin Maxwell via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 24 06:15:20 PDT 2024
================
@@ -18929,13 +18929,67 @@ static SDValue performVectorCompareAndMaskUnaryOpCombine(SDNode *N,
return SDValue();
}
+static bool
+shouldUseSVEForScalarFPConversion(SDNode *N,
+                                  const AArch64Subtarget *Subtarget) {
+  auto isSupportedType = [](EVT VT) {
+    if (!VT.isSimple())
+      return false;
+    // There are SVE instructions that can convert to/from all pairs of these
+    // int and float types. Note: We don't bother with i8 or i16 as those are
+    // illegal types for scalars.
+    return is_contained({MVT::i32, MVT::i64, MVT::f16, MVT::f32, MVT::f64},
+                        VT.getSimpleVT().SimpleTy);
+  };
+  // If we are in a streaming[-compatible] function, use SVE for scalar FP <->
+  // INT conversions as this can help avoid movs between GPRs and FPRs, which
+  // could be quite expensive.
+  return !N->isStrictFPOpcode() && Subtarget->isSVEorStreamingSVEAvailable() &&
+         (Subtarget->isStreaming() || Subtarget->isStreamingCompatible()) &&
+         isSupportedType(N->getValueType(0)) &&
+         isSupportedType(N->getOperand(0).getValueType());
+}
+
+/// Replaces a scalar FP <-> INT conversion with an SVE (scalable) one, wrapped
+/// with an insert and extract.
+static SDValue replaceScalarFPConversionWithSVE(SDNode *N, SelectionDAG &DAG) {
+  assert(!N->isStrictFPOpcode() && "strict fp ops not supported");
+  SDValue SrcVal = N->getOperand(0);
+  EVT SrcTy = SrcVal.getValueType();
+  EVT DestTy = N->getValueType(0);
+  EVT SrcVecTy;
+  EVT DestVecTy;
+  // Use a packed vector for the larger type.
+  // Note: For conversions such as FCVTZS_ZPmZ_DtoS, and UCVTF_ZPmZ_StoD that
+  // notionally take or return a nxv2i32 type we must instead use a nxv4i32, as
+  // (unlike floats) nxv2i32 is an illegal unpacked type.
+  if (DestTy.bitsGT(SrcTy)) {
+    DestVecTy = getPackedSVEVectorVT(DestTy);
+    SrcVecTy = SrcTy == MVT::i32 ? getPackedSVEVectorVT(SrcTy)
+                                 : DestVecTy.changeVectorElementType(SrcTy);
+  } else {
+    SrcVecTy = getPackedSVEVectorVT(SrcTy);
+    DestVecTy = DestTy == MVT::i32 ? getPackedSVEVectorVT(DestTy)
+                                   : SrcVecTy.changeVectorElementType(DestTy);
+  }
----------------
MacDue wrote:
From the LangRef (https://llvm.org/docs/LangRef.html#fptoui-to-instruction), I take "fit" here to refer to the value, not the type. So a value of type `f64` can still fit within an `i32`; the result is only poison if the value is outside the range of the integer type.
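To make that reading concrete, here is a minimal standalone sketch of how I interpret the `fptoui` rule for an `f64` to `i32` conversion. `modelFPToUI32` is a hypothetical helper written for illustration, not an LLVM API, and it uses `std::nullopt` as a stand-in for a poison result.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <optional>

// Hypothetical helper (not part of LLVM): models `fptoui double ... to i32`
// under the reading that "fit" refers to the value, not the type.
// std::nullopt stands in for a poison result.
std::optional<uint32_t> modelFPToUI32(double Val) {
  // NaN has no integer value, so it cannot fit.
  if (std::isnan(Val))
    return std::nullopt;
  // fptoui rounds towards zero before the range check.
  double Truncated = std::trunc(Val);
  // Out of range for an unsigned 32-bit integer: poison.
  if (Truncated < 0.0 || Truncated > 4294967295.0)
    return std::nullopt;
  // In range: the wide source type is irrelevant, the value fits.
  return static_cast<uint32_t>(Truncated);
}
```

Under this model `modelFPToUI32(42.9)` yields a value even though the source type is 64 bits wide, while `modelFPToUI32(4294967296.0)` yields no value.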
I had to add a hack in `LowerToPredicatedOp`. I didn't add a hack in the `SVEInstrInfo`; I just allowed lowering to the instruction. The same issue is present for `[S|U]CVTF_ZPmZ_StoD`, which can currently be lowered from ISD nodes (but has inconsistent types). `[S|U]CVTF` does not need a hack in `LowerToPredicatedOp`, as it's the operand that has the wrong type, not the result.
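For anyone following the type mismatch, the vector type selection in the quoted hunk can be modelled outside LLVM roughly as below. This is only a sketch: `ScalarTy`, `VecTy`, `packed`, `withElt`, `pickVectorTypes` and `name` are hypothetical stand-ins for `EVT`, `getPackedSVEVectorVT` and `changeVectorElementType`, and the model assumes a packed SVE vector fills a 128-bit granule.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a scalar EVT: 'i' or 'f' plus a bit width.
struct ScalarTy { char Kind; unsigned Bits; };
// Hypothetical stand-in for a scalable vector type, i.e. nxv<MinElts><Elt>.
struct VecTy { unsigned MinElts; ScalarTy Elt; };
struct ConvTypes { VecTy Src, Dest; };

constexpr ScalarTy I32{'i', 32}, I64{'i', 64};
constexpr ScalarTy F16{'f', 16}, F32{'f', 32}, F64{'f', 64};

// Models getPackedSVEVectorVT: elements fill a 128-bit granule.
VecTy packed(ScalarTy T) { return {128 / T.Bits, T}; }
// Models changeVectorElementType: same element count, new element type.
VecTy withElt(VecTy V, ScalarTy T) { return {V.MinElts, T}; }
bool isI32(ScalarTy T) { return T.Kind == 'i' && T.Bits == 32; }

// Mirrors the hunk: the larger type gets a packed vector, and i32 is always
// packed (nxv4i32) because nxv2i32 is an illegal unpacked type.
ConvTypes pickVectorTypes(ScalarTy Src, ScalarTy Dest) {
  if (Dest.Bits > Src.Bits) {
    VecTy DestVec = packed(Dest);
    VecTy SrcVec = isI32(Src) ? packed(Src) : withElt(DestVec, Src);
    return {SrcVec, DestVec};
  }
  VecTy SrcVec = packed(Src);
  VecTy DestVec = isI32(Dest) ? packed(Dest) : withElt(SrcVec, Dest);
  return {SrcVec, DestVec};
}

// Formats a VecTy the way LLVM prints scalable vector types.
std::string name(VecTy V) {
  return "nxv" + std::to_string(V.MinElts) + V.Elt.Kind +
         std::to_string(V.Elt.Bits);
}
```

In this model an `f64` to `i32` conversion pairs `nxv2f64` with `nxv4i32`, and an `i32` to `f64` conversion pairs `nxv4i32` with `nxv2f64`, which is the element-count inconsistency mentioned above for the `DtoS` and `StoD` forms.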
https://github.com/llvm/llvm-project/pull/112564