[llvm] [NVPTX] Lower bfloat16 add/mul/sub as fma on SM80 (PR #121065)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 27 01:19:26 PST 2024


================
@@ -853,6 +853,16 @@ NVPTXTargetLowering::NVPTXTargetLowering(const NVPTXTargetMachine &TM,
       AddPromotedToType(Op, MVT::bf16, MVT::f32);
   }
 
+  // Lower bf16 add/mul/sub as fma when it avoids promotion
+  for (const auto &Op : {ISD::FADD, ISD::FMUL, ISD::FSUB}) {
+    for (const auto &VT : {MVT::bf16, MVT::v2bf16}) {
+      if (getOperationAction(Op, VT) != Legal &&
+          getOperationAction(ISD::FMA, VT) == Legal) {
----------------
arsenm wrote:

> I'm not sure this makes sense. The FTZ logic is target specific and we also want to fall back to promotion, not a libcall here.

The custom expansion can defer to the default expansion depending on the function state. Yes, the default Expand action can conditionally use the FMA path instead of the default libcall expansion. The core transform code can be in the generic legalizer.
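
For readers following along, the arithmetic behind an FMA-based expansion can be sketched as below. This is an illustration only, not code from the patch: the helper names are hypothetical, it uses `float` and `std::fma` rather than bf16 or SelectionDAG nodes, and it ignores the FTZ question raised above. It assumes the usual identities `a + b = fma(a, 1.0, b)`, `a - b = fma(b, -1.0, a)`, and `a * b = fma(a, b, -0.0)`.

```cpp
#include <cmath>

// Hypothetical helpers (not part of LLVM) showing how add/sub/mul can each be
// written as a single fused multiply-add.

// a + b: multiplying by 1.0 is exact, so the only rounding is the final add.
inline float addViaFma(float a, float b) { return std::fma(a, 1.0f, b); }

// a - b: multiplying b by -1.0 is exact, giving (-b) + a.
inline float subViaFma(float a, float b) { return std::fma(b, -1.0f, a); }

// a * b: the addend is -0.0 rather than +0.0 so that a product of -0.0 keeps
// its sign (under round-to-nearest, -0.0 + -0.0 is -0.0, but -0.0 + +0.0
// would be +0.0).
inline float mulViaFma(float a, float b) { return std::fma(a, b, -0.0f); }
```

Whether such a rewrite belongs in a target's custom lowering or in the generic legalizer's Expand path is the design question under discussion here; the identities themselves are the same either way.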

https://github.com/llvm/llvm-project/pull/121065

