[PATCH] D116673: [Clang][NVPTX]Add NVPTX intrinsics and builtins for CUDA PTX cvt sm80 instructions

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 5 11:45:01 PST 2022


tra added a comment.

LGTM overall.



================
Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:405
 
+TARGET_BUILTIN(__nvvm_ff2v2bf_rn, "ZUiff", "", AND(SM_80,PTX70))
+TARGET_BUILTIN(__nvvm_ff2v2bf_rn_relu, "ZUiff", "", AND(SM_80,PTX70))
----------------
Nit: `ff2v2bf` is a bit hard to parse. I initially tried to read it as "convert ff2v to bf" and was confused about what exactly the `2v` part means, since we already have `ff` to denote two floats.

Perhaps `ff2bf16x2` would be a bit easier to read and understand. The same scheme would also work consistently for the `f16` and `tf32` variants below.
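
For illustration, the two entries quoted above might then read as follows. This is only a sketch of the naming: the `ff2bf16x2` spelling is the proposal from this comment, not an existing builtin, and the type string and feature requirements are copied unchanged from the patch.

```cpp
// Hypothetical renaming: "ff" (two floats) to "bf16x2" (a packed pair of bf16).
// Only the identifier changes; signature and SM/PTX requirements stay as in the patch.
TARGET_BUILTIN(__nvvm_ff2bf16x2_rn, "ZUiff", "", AND(SM_80,PTX70))
TARGET_BUILTIN(__nvvm_ff2bf16x2_rn_relu, "ZUiff", "", AND(SM_80,PTX70))
```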


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116673/new/

https://reviews.llvm.org/D116673


