[PATCH] D104847: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX 6.5 and 7.0 WMMA and MMA instructions

Steffen Larsen via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jun 25 07:13:31 PDT 2021


steffenlarsen marked 4 inline comments as done.
steffenlarsen added a comment.

@tra Thank you for the quick response! I agree with your comments and have made changes accordingly.



================
Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:762
 
+// Builtins to support double and alternate float WMMA instructions on sm_80
+TARGET_BUILTIN(__dmma_m8n8k4_ld_a, "vd*dC*UiIi", "", AND(SM_80,PTX70))
----------------
tra wrote:
> Is this a sufficient set of builtins to compile mma.h in CUDA-11.x?
mma.h was my frame-of-reference for the builtins and I have CUDA 11.3 (465.19.01) installed, so I would expect it to be.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:16411-16430
 #define MMA_VARIANTS(geom, type) {{                                 \
+      Intrinsic::nvvm_wmma_##geom##_mma_row_row_##type,             \
+      0,                                                            \
+      Intrinsic::nvvm_wmma_##geom##_mma_row_col_##type,             \
+      0,                                                            \
+      Intrinsic::nvvm_wmma_##geom##_mma_col_row_##type,             \
+      0,                                                            \
----------------
tra wrote:
> Nit: satf variants are in the minority. We could  move them to the end of the variant list and rely on default initialization to 0. E.g something like this:
> 
> ```
> unsigned getMMAIntrinsic(int Layout, bool Satf) {
>     unsigned Index = Layout + 4*Satf;
>     if (Index >= Variants.size())
>       return 0;
>     return Variants[Index];
>   }
> #define MMA_VARIANTS(geom, type) 
>       Intrinsic::nvvm_wmma_##geom##_mma_row_row_##type, \
>       Intrinsic::nvvm_wmma_##geom##_mma_row_col_##type, \
>       Intrinsic::nvvm_wmma_##geom##_mma_col_row_##type, \
>       Intrinsic::nvvm_wmma_##geom##_mma_col_col_##type 
> 
> #define MMA_SATF_VARIANTS(geom, type)
>       MMA_VARIANTS(geom, type), \
>       Intrinsic::nvvm_wmma_##geom##_mma_row_row_##type##_satfinite, \
>       Intrinsic::nvvm_wmma_##geom##_mma_row_col_##type##_satfinite, \
>       Intrinsic::nvvm_wmma_##geom##_mma_col_row_##type##_satfinite, \
>       Intrinsic::nvvm_wmma_##geom##_mma_col_col_##type##_satfinite 
> ...
> case NVPTX::BI__hmma_m16n16k16_mma_f16f16:
>   return {8, 8, 4, 4, {{ MMA_SATF_VARIANTS(m16n16k16, f16_f16) }}};
> ...
>     case NVPTX::BI__dmma_m8n8k4_mma_f64:
>       return {1, 1, 2, 2, {{MMA_VARIANTS(m8n8k4, f64)}}};
> 
> ```
> 
> Up to you.
I agree, I like this better. In case other options will use the satf parameter (e.g. rnd which isn't indicated as being in use from mma.h) this solution is also easier to extend in the future.


================
Comment at: clang/test/CodeGen/builtins-nvptx-mma.py:142-146
+  elif frag.geom == "m16n16k8":
+    if frag.frag == "d":
+      prefix = "__mma"
+    else:
+      prefix = "__mma_tf32"
----------------
tra wrote:
> It's not obvious why  frag `d` is `__mma` and not `__mma_tf32` 
> Can we use frag.ptx_type to make that decision?
We absolutely can. I don't know why that wasn't my first solution.


================
Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:4474
+    foreach satf = [0, 1] in {
+      foreach rnd = ["-", "rn", "rz", "rm", "rp"] in {
+        foreach op = NVVM_MMA_OPS.all_wmma_ops in {
----------------
tra wrote:
> We're often using an empty string to represent a `none`. Comparisons with `-` where we check `rnd` look like we're doing something special there.
> I'd use an empty string for `rnd`, too. 
> 
Empty string works for me. I think there are/were some places that used "-" as a default parameter meaning `none`, but I agree with your assessment.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104847/new/

https://reviews.llvm.org/D104847



More information about the cfe-commits mailing list