[clang] [llvm] [clang][NVPTX] Add intrinsics and builtins for mixed-precision FP arithmetic (PR #168359)
Artem Belevich via cfe-commits
cfe-commits at lists.llvm.org
Wed Nov 19 11:20:38 PST 2025
================
@@ -460,6 +478,52 @@ def __nvvm_add_rz_d : NVPTXBuiltin<"double(double, double)">;
def __nvvm_add_rm_d : NVPTXBuiltin<"double(double, double)">;
def __nvvm_add_rp_d : NVPTXBuiltin<"double(double, double)">;
+def __nvvm_add_mixed_f16_f32 : NVPTXBuiltinSMAndPTX<"float(__fp16, float)", SM_100, PTX86>;
+def __nvvm_add_mixed_rn_f16_f32 : NVPTXBuiltinSMAndPTX<"float(__fp16, float)", SM_100, PTX86>;
----------------
Artem-B wrote:
This set of intrinsics looks regular enough that generating them with TableGen loops may be worth considering.
I'm not sure it would end up being an improvement, but if it reduces the boilerplate, it may be worth a try.
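
For illustration only, a minimal sketch of what such a loop could look like for the two definitions quoted in the hunk above. It assumes it sits in the same .td file, where `NVPTXBuiltinSMAndPTX`, `SM_100` and `PTX86` are already defined; the loop variable `rnd` is a hypothetical name, and the suffix list covers only the two variants visible in this hunk, not the full set added by the PR.

```tablegen
// Sketch, not the PR's actual code: generate one builtin per rounding
// suffix instead of writing each def by hand.
// "" yields __nvvm_add_mixed_f16_f32, "_rn" yields __nvvm_add_mixed_rn_f16_f32.
foreach rnd = ["", "_rn"] in {
  def "__nvvm_add_mixed" # rnd # "_f16_f32"
      : NVPTXBuiltinSMAndPTX<"float(__fp16, float)", SM_100, PTX86>;
}
```

Further variants would be covered by extending the list or nesting a second `foreach` over operand types. Whether that reads better than the explicit defs depends on how many combinations there are.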
https://github.com/llvm/llvm-project/pull/168359