[llvm] [NVPTX] Add float to tf32 conversion intrinsics (PR #121507)

Thu Jan 9 12:56:12 PST 2025

================
@@ -725,6 +728,12 @@ let hasSideEffects = false in {
 
   def CVT_f16x2_e4m3x2 : CVT_f16x2_fp8<"e4m3">;
   def CVT_f16x2_e5m2x2 : CVT_f16x2_fp8<"e5m2">;
+
+  // Float to TF32 conversions.
+  def CVT_tf32_f32 : NVPTXInst<(outs Int32Regs:$dst),
+                     (ins Float32Regs:$src, CvtMode:$mode),
+                     !strconcat("cvt${mode:base}${mode:relu}${mode:satfinite}.",
----------------
Artem-B wrote:

To clarify: right now you define a single record that generates all `cvt.tf32.f32` instruction variants, with the modifiers selected via a `mode` operand passed in by a custom pattern matcher.

I'm proposing defining a per-intrinsic record, parameterized by the PTX instruction modifier string, which emits one specific PTX instruction and takes the intrinsic-matching pattern as an `NVPTXInst` parameter.
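
A rough sketch of what such a per-intrinsic record could look like is given here. It is illustrative only: the class name `CVT_f32_to_tf32`, the record names, and the `int_nvvm_f2tf32_rna_satfinite` intrinsic name are assumptions made for the example rather than names taken from the patch, and any `Requires<...>` predicates are left out.

```tablegen
// Hypothetical helper class (name is an assumption, not from the patch).
// Each instantiation hard-codes its PTX modifiers in the asm string and
// carries its own selection pattern, so no custom matcher or CvtMode
// operand is needed.
class CVT_f32_to_tf32<string mods, Intrinsic Intr> :
  NVPTXInst<(outs Int32Regs:$dst), (ins Float32Regs:$src),
            !strconcat("cvt", mods, ".tf32.f32 \t$dst, $src;"),
            [(set Int32Regs:$dst, (Intr Float32Regs:$src))]>;

// One record per intrinsic; the modifier string selects the PTX variant.
def CVT_tf32_f32_rna : CVT_f32_to_tf32<".rna", int_nvvm_f2tf32_rna>;

// Assumed intrinsic name, shown only to illustrate a second variant.
def CVT_tf32_f32_rna_satf :
  CVT_f32_to_tf32<".rna.satfinite", int_nvvm_f2tf32_rna_satfinite>;
```

With this shape, adding a new variant is one more `def` line, and the mapping from intrinsic to emitted PTX can be read directly off the record.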

https://github.com/llvm/llvm-project/pull/121507