[PATCH] D117887: [NVPTX] Expose float tys min, max, abs, neg as builtins

Fri Jan 21 11:13:48 PST 2022

tra added a comment.

Looks good overall. 
Please do check that the generated PTX does get assembled by ptxas.

There are few newer variants of these instructions that appear to be missing. E.g. `{min/max}.xorsign.abs`. 
If you only intended to add instructions available in PTX-7.0, which, based on the constraints used in the patch, appears to be the case, I'd mention that in the commit log.

================
Comment at: clang/test/CodeGen/builtins-nvptx.c:822
+  // CHECK_PTX70_SM80: call i16 @llvm.nvvm.fmin.bf16
+  __nvvm_fmin_bf16(0x1234, 0x7FBF);
+  // CHECK_PTX70_SM80: call i16 @llvm.nvvm.fmin.nan.bf16
----------------
I'd `#define` the magic values to give them sensible names.

================
Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:580

-  def int_nvvm_fmin_d : GCCBuiltin<"__nvvm_fmin_d">,
-      DefaultAttrsIntrinsic<[llvm_double_ty], [llvm_double_ty, llvm_double_ty],
-        [IntrNoMem, IntrSpeculatable, Commutative]>;
-  def int_nvvm_fmax_d : GCCBuiltin<"__nvvm_fmax_d">,
-      DefaultAttrsIntrinsic<[llvm_double_ty], [llvm_double_ty, llvm_double_ty],
-        [IntrNoMem, IntrSpeculatable, Commutative]>;
+    foreach capability = ["_f16", "_ftz_f16", "_nan_f16", "_ftz_nan_f16"] in {
+      def int_nvvm_f # operation # capability :
----------------
Nit: `variant` might work better here and below.

================
Comment at: llvm/lib/Target/NVPTX/NVPTXIntrinsics.td:620
+class MIN_MAX_TUPLE<string C, Intrinsic I, NVPTXRegClass RC> {
+  string Capacity = C;
+  Intrinsic Intr = I;
----------------
Ditto. Capacity->Variant.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117887/new/

https://reviews.llvm.org/D117887