[llvm] [LLVM][CodeGen][SVE] Use BFDOT for fadd reductions. (PR #147981)

Tue Jul 15 06:21:25 PDT 2025

================
@@ -16063,6 +16066,22 @@ SDValue AArch64TargetLowering::LowerVECREDUCE(SDValue Op,
     if (SrcVT.getVectorElementType() == MVT::i1)
       return LowerPredReductionToSVE(Op, DAG);
 
+    if (SrcVT == MVT::nxv8bf16 && Op.getOpcode() == ISD::VECREDUCE_FADD) {
+      assert(Subtarget->hasBF16() &&
+             "VECREDUCE custom lowering expected +bf16!");
+      SDLoc DL(Op);
+      SDValue ID =
+          DAG.getTargetConstant(Intrinsic::aarch64_sve_bfdot, DL, MVT::i64);
+      SDValue Zero = DAG.getConstantFP(0.0, DL, MVT::nxv4f32);
+      SDValue One = DAG.getConstantFP(1.0, DL, MVT::nxv4f32);
+      // Use BFDOT's implicit promotion to float with partial reduction.
+      SDValue BFDOT = DAG.getNode(ISD::INTRINSIC_WO_CHAIN, DL, MVT::nxv4f32, ID,
+                                  Zero, Src, One);
+      SDValue FADDV = DAG.getNode(ISD::VECREDUCE_FADD, DL, MVT::f32, BFDOT);
----------------
paulwalker-arm wrote:

Yes, I think the transformation is likely only valid when FEAT_EBF16 is available. From what I can see, there is no provision to specify this feature flag, so I'm going to abandon this PR for now.
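
For readers following the thread, here is a rough scalar sketch of the lane arithmetic the hunk above relies on: a BFDOT against a vector of ones folds each adjacent bf16 pair into one f32 lane, and the f32 lanes are then summed. All helper names (`bf16ToFloat`, `floatToBF16Trunc`, `modelBFDot`, `modelReduceFAdd`) are hypothetical, and the model assumes vscale == 1 (eight bf16 elements) and that the multiplier operand is a vector of bf16 ones. It uses ordinary host float arithmetic, so it does not model BFDOT's actual rounding or denormal handling and says nothing about the FEAT_EBF16 concern.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// bf16 occupies the upper 16 bits of an IEEE-754 binary32 value.
static float bf16ToFloat(uint16_t Bits) {
  uint32_t Wide = static_cast<uint32_t>(Bits) << 16;
  float F;
  std::memcpy(&F, &Wide, sizeof(F));
  return F;
}

// Truncating conversion; good enough for values exactly representable in bf16.
static uint16_t floatToBF16Trunc(float F) {
  uint32_t Wide;
  std::memcpy(&Wide, &F, sizeof(Wide));
  return static_cast<uint16_t>(Wide >> 16);
}

// Hypothetical model of bfdot(Acc, A, B) for vscale == 1: f32 lane I
// accumulates the products of bf16 elements 2*I and 2*I+1.
// Assumption: plain host float arithmetic, not the instruction's rounding.
static std::array<float, 4> modelBFDot(std::array<float, 4> Acc,
                                       const std::array<uint16_t, 8> &A,
                                       const std::array<uint16_t, 8> &B) {
  for (std::size_t I = 0; I < Acc.size(); ++I)
    Acc[I] += bf16ToFloat(A[2 * I]) * bf16ToFloat(B[2 * I]) +
              bf16ToFloat(A[2 * I + 1]) * bf16ToFloat(B[2 * I + 1]);
  return Acc;
}

// Model of the lowering: a zero accumulator, the source vector, and a vector
// of ones, followed by an ordered sum of the four f32 partial results.
static float modelReduceFAdd(const std::array<uint16_t, 8> &Src) {
  std::array<float, 4> Zero{};
  std::array<uint16_t, 8> Ones;
  Ones.fill(floatToBF16Trunc(1.0f));
  std::array<float, 4> Partial = modelBFDot(Zero, Src, Ones);
  float Sum = 0.0f;
  for (float Lane : Partial)
    Sum += Lane;
  return Sum;
}
```

The sketch only shows why multiplying by ones turns the dot product into a pairwise add with promotion to f32; whether the real instruction's numeric behaviour matches a plain fadd reduction is exactly the open question in this thread.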

https://github.com/llvm/llvm-project/pull/147981