[PATCH] D38742: [CUDA] Added __hmma_m16n16k16_* builtins to support mma instructions in sm_70

Justin Lebar via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 11 10:47:28 PDT 2017


jlebar added inline comments.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:9733
+      return nullptr;
+    bool isColMajor = isColMajorArg.getZExtValue();
+    unsigned IID;
----------------
tra wrote:
> jlebar wrote:
> > Urg, this isn't a bool?  Do we want it to be?
> There are no explicit declarations for these builtins in the CUDA headers. Callers of these builtins pass 0/1, and the corresponding intrinsic described in the [[ http://docs.nvidia.com/cuda/nvvm-ir-spec/index.html#nvvm-intrin-warp-level-matrix-ld | NVVM-IR spec ]] shows the argument type as i32, so I've made the type an integer in clang.
> 
> 
sgtm
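
For reference, a minimal standalone sketch of what the integer layout argument ends up doing. This is not the code from the patch: `LoadIntrinsic` and `selectLoadIntrinsic` are hypothetical names, and a plain `std::uint64_t` stands in for the value that `getZExtValue()` returns. It only illustrates that any non-zero constant is treated as column-major once it is converted to `bool`, as in the quoted line.

```cpp
#include <cstdint>

// Hypothetical stand-in for the intrinsic IDs selected in CGBuiltin.cpp.
enum class LoadIntrinsic { RowMajor, ColMajor };

// Maps the builtin's integer layout argument to an intrinsic variant.
// Assumption: callers pass 0 (row-major) or 1 (column-major); any other
// non-zero value also becomes column-major because of the bool conversion.
LoadIntrinsic selectLoadIntrinsic(std::uint64_t layoutArg) {
  bool isColMajor = layoutArg != 0;
  return isColMajor ? LoadIntrinsic::ColMajor : LoadIntrinsic::RowMajor;
}
```

If values other than 0/1 should be rejected instead, a range check before the conversion would be the place to diagnose them.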


https://reviews.llvm.org/D38742




