[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)
Artem Belevich via cfe-commits
cfe-commits at lists.llvm.org
Tue Aug 13 12:27:54 PDT 2024
================
@@ -722,6 +722,37 @@ let hasSideEffects = false in {
defm CVT_f16x2 : CVT_FROM_FLOAT_V2_SM80<"f16x2", Int32Regs>;
defm CVT_bf16x2 : CVT_FROM_FLOAT_V2_SM80<"bf16x2", Int32Regs>;
+
+ // FP8 conversions.
+ multiclass CVT_TO_F8X2<string F8Name> {
+ def _f32 :
+ NVPTXInst<(outs Int16Regs:$dst),
+ (ins Float32Regs:$src1, Float32Regs:$src2, CvtMode:$mode),
+ !strconcat("cvt${mode:base}.satfinite${mode:relu}.",
+ F8Name, "x2.f32 \t$dst, $src1, $src2;"), []>,
+ Requires<[hasPTX<81>, hasSM<89>]>;
+ def _f16x2 :
+ NVPTXInst<(outs Int16Regs:$dst),
+ (ins Int32Regs:$src, CvtMode:$mode),
+ !strconcat("cvt${mode:base}.satfinite${mode:relu}.",
+ F8Name, "x2.f16x2 \t$dst, $src;"), []>,
+ Requires<[hasPTX<81>, hasSM<89>]>;
+ }
+
+ defm CVT_e4m3x2 : CVT_TO_F8X2<"e4m3">;
+ defm CVT_e5m2x2 : CVT_TO_F8X2<"e5m2">;
+
+ multiclass CVT_FROM_F8X2<string F8Name> {
+ def x2 :
+ NVPTXInst<(outs Int32Regs:$dst),
----------------
Artem-B wrote:
It would be useful to add a comment noting that the i32 output is actually `<2 x f16>`, since `x2` alone does not give enough of a hint. Alternatively, rename the def to `_f16x2`.
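
For readers unfamiliar with the convention, here is a minimal sketch of what "an i32 that is actually `<2 x f16>`" means at the bit level. It is not code from the PR: `HalfPair`, `packF16x2`, and `unpackF16x2` are hypothetical names, and the sketch assumes the usual little-endian vector layout, with element 0 in the low 16 bits and element 1 in the high 16 bits.

```cpp
#include <cstdint>

// Hypothetical model (not LLVM code) of a 32-bit register that carries two
// packed half-precision values. Assumption: little-endian vector layout,
// i.e. element 0 lives in bits [15:0] and element 1 in bits [31:16].
struct HalfPair {
  std::uint16_t elt0; // raw IEEE binary16 bits of element 0
  std::uint16_t elt1; // raw IEEE binary16 bits of element 1
};

// Pack two raw binary16 bit patterns into one 32-bit register value.
constexpr std::uint32_t packF16x2(HalfPair p) {
  return static_cast<std::uint32_t>(p.elt0) |
         (static_cast<std::uint32_t>(p.elt1) << 16);
}

// Split a 32-bit register value back into its two binary16 bit patterns.
constexpr HalfPair unpackF16x2(std::uint32_t reg) {
  return HalfPair{static_cast<std::uint16_t>(reg & 0xFFFFu),
                  static_cast<std::uint16_t>(reg >> 16)};
}

// 0x3C00 encodes 1.0 and 0xC000 encodes -2.0 in binary16.
static_assert(packF16x2({0x3C00, 0xC000}) == 0xC0003C00u,
              "element 0 occupies the low half, element 1 the high half");
static_assert(unpackF16x2(0xC0003C00u).elt0 == 0x3C00, "low half is element 0");
static_assert(unpackF16x2(0xC0003C00u).elt1 == 0xC000, "high half is element 1");
```

Nothing in the register class `Int32Regs` says the value is two halves rather than one integer, which is why a comment or a `_f16x2` suffix on the def helps.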
https://github.com/llvm/llvm-project/pull/102969