[Mlir-commits] [mlir] [MLIR][NVVM] Add nvvm.fadd and nvvm.fsub Ops (PR #179162)

Mon Feb 2 01:29:06 PST 2026

================
@@ -6247,6 +6155,77 @@ def NVVM_Tcgen05MMAWsSparseOp : NVVM_Op<"tcgen05.mma.ws.sp",
   }];
 }
 
+def NVVM_FloatAdditionOp : 
+  NVVM_SingleResultIntrinsicOp<"fadd", [Pure, Commutative]> {
+  let summary = [{
+    Performs floating point addition operation with support for mixed precision 
+    operands
+  }];
+  let description = [{
+    The `nvvm.fadd` operation performs floating point addition of two operands.
+
+    The rounding mode to be used is specified by the `rnd` attribute, 
+    saturation mode by the `sat` attribute, and FTZ by the `ftz` unit attribute.
+    
+    The result type must be at least as wide as the operands. The operands are 
+    converted to the result type before addition if it is wider.
+    
+    For more information, see PTX ISA - [floating point addition](https://docs.nvidia.com/cuda/parallel-thread-execution/#floating-point-instructions-add), 
+    [half-precision floating point addition](https://docs.nvidia.com/cuda/parallel-thread-execution/#half-precision-floating-point-instructions-add), 
+    [mixed precision floating point addition](https://docs.nvidia.com/cuda/parallel-thread-execution/#mixed-precision-floating-point-instructions-add).
+  }];
+  let arguments = (ins
+    AnyTypeOf<[F16, BF16, F32, F64, VectorOfLengthAndType<[2], [F16, BF16]>]>:$lhs,
+    AnyTypeOf<[F16, BF16, F32, F64, VectorOfLengthAndType<[2], [F16, BF16]>]>:$rhs,
+    DefaultValuedAttr<FPRoundingModeAttr, "FPRoundingMode::NONE">:$rnd,
+    DefaultValuedAttr<SaturationModeAttr, "SaturationMode::NONE">:$sat,
+    UnitAttr:$ftz
+  );
+  let results = (outs AnyTypeOf<[F16, BF16, F32, F64, VectorOfLengthAndType<[2], [F16, BF16]>]>:$res);
+  let assemblyFormat = "$lhs `,` $rhs attr-dict `:` type(operands) `->` type($res)";
+  let hasVerifier = 1;
----------------
Wolfram70 wrote:

I did look into this while writing the Ops. As far as I can tell, splitting this into multiple Ops has the same problem as the conversion Ops: there is no clean way to split it, and we would still have all of the ugliness and invalid cases, just spread across multiple Ops. That is especially true here, since the ISA also supports mixed-precision arithmetic. Most of the invalid result/operand type combinations come from a self-imposed restriction in the Op: we freely perform any necessary extensions, but disallow truncation of the operands, since that would result in a loss of precision.
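
As a rough illustration of that restriction, here is a minimal standalone sketch. It is not the verifier from this PR: `FloatKind`, `bitWidth`, and `needsNoTruncation` are hypothetical names, and it only compares scalar bit widths. It does not model vector types or the remaining invalid combinations that have no matching PTX instruction.

```cpp
// Hypothetical sketch of the "extend freely, never truncate" rule.
// None of these names come from the PR; they are for illustration only.

// Scalar element types accepted by the op's operands and result.
enum class FloatKind { F16, BF16, F32, F64 };

// Bit width of each element type (F16 and BF16 are both 16 bits wide).
constexpr unsigned bitWidth(FloatKind k) {
  return k == FloatKind::F64 ? 64u : k == FloatKind::F32 ? 32u : 16u;
}

// True when neither operand would have to be truncated to reach the
// result type, i.e. the result is at least as wide as both operands.
constexpr bool needsNoTruncation(FloatKind lhs, FloatKind rhs, FloatKind res) {
  return bitWidth(res) >= bitWidth(lhs) && bitWidth(res) >= bitWidth(rhs);
}

// f16 + f16 -> f32: operands are extended, so this passes the width check.
static_assert(needsNoTruncation(FloatKind::F16, FloatKind::F16, FloatKind::F32),
              "extension is allowed");
// f32 + f16 -> f16: lhs would need truncation, so this is rejected.
static_assert(!needsNoTruncation(FloatKind::F32, FloatKind::F16, FloatKind::F16),
              "truncation is disallowed");
```

Whether a combination that passes this width check is actually accepted still depends on the Op's verifier and on what the ISA supports.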

https://github.com/llvm/llvm-project/pull/179162