[llvm] [NVPTX] Lower bfloat16 add/mul/sub as fma on SM80 (PR #121065)

Thu Jan 9 00:27:19 PST 2025

================
@@ -9071,6 +9071,60 @@ SDValue TargetLowering::expandIS_FPCLASS(EVT ResultVT, SDValue Op,
   return Res;
 }
 
+SDValue TargetLowering::expandFADD(SDNode *Node, SelectionDAG &DAG) const {
+  auto VT = Node->getValueType(0);
+  if (!isOperationLegalOrCustom(ISD::FMA, VT)) {
+    return {};
+  }
+
+  // FADD(a, b) -> FMA(a, 1.0, b)
+  SDLoc DL(Node);
+  auto One = DAG.getConstantFP(1.0, DL, VT);
+  SmallVector<SDValue, 3> Operands{Node->getOperand(0), One,
----------------
arsenm wrote:

You don't need the temporary vector; you can pass the operands directly to the 3-operand form of getNode.
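
For context, here is a minimal standalone sketch (not part of the PR) of the identity the hunk relies on, FADD(a, b) -> FMA(a, 1.0, b). It uses `float` and `std::fma` as stand-ins for bfloat16 and the DAG node, and assumes default IEEE-754 evaluation with no fast-math flags. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <initializer_list>

// FADD(a, b) rewritten as FMA(a, 1.0, b), mirroring the comment in the hunk.
// Hypothetical helper: float stands in for bfloat16 here.
float fadd_as_fma(float a, float b) { return std::fma(a, 1.0f, b); }

// Returns true when the FMA form produces the same value as a plain add.
// a * 1.0 is exact, so both forms round the sum a + b exactly once.
bool matches_plain_add(float a, float b) { return fadd_as_fma(a, b) == a + b; }

// Checks a few hand-picked inputs, including sums that are not exactly
// representable in float and therefore must be rounded.
void check_samples() {
  for (float a : {0.0f, 1.5f, -3.25f, 16777216.0f})
    for (float b : {0.0f, 1.0f, -1.5f, 0.1f})
      assert(matches_plain_add(a, b));
}
```

Calling `check_samples()` from a test driver should pass under those assumptions. It says nothing about how the NVPTX backend handles bfloat16 specifically.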

https://github.com/llvm/llvm-project/pull/121065