[PATCH] D38645: [NVPTX] Implemented wmma intrinsics and instructions.

Yuan Lin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 11 11:27:56 PDT 2017


YuanLin added a comment.

Artem, thanks a lot for working on this!

I notice that you are taking a different approach to defining the LLVM wmma intrinsics than the one we (NVIDIA) use:

  http://docs.nvidia.com/cuda/nvvm-ir-spec/index.html#nvvm-intrin-warp-level-matrix

Specifically, yours embeds the layout and memory space in the intrinsic name, while ours treats them as constant arguments. We did this to reduce the number of intrinsic functions the optimizer and codegen have to deal with. We have plans for more wmma features in the next few CUDA releases, so it would be better to unify the syntax and naming of the wmma intrinsics. That would also make cross-support much easier.
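As a rough illustration of the trade-off above, the sketch below counts how many intrinsic names each approach produces. The layout and memory-space values and the base name `hypothetical.wmma.load` are placeholders chosen for illustration; they are not taken from the patch or from the NVVM IR spec.

```python
from itertools import product

# Placeholder attribute values, assumed for illustration only.
LAYOUTS = ["row", "col"]
SPACES = ["generic", "shared", "global"]


def names_with_embedded_attributes(base):
    """Approach 1: one intrinsic name per (layout, space) combination."""
    return [f"{base}.{layout}.{space}" for layout, space in product(LAYOUTS, SPACES)]


def names_with_constant_arguments(base):
    """Approach 2: a single intrinsic; layout and space become constant operands."""
    return [base]


# "hypothetical.wmma.load" is an invented base name, not a real intrinsic.
embedded = names_with_embedded_attributes("hypothetical.wmma.load")
constant = names_with_constant_arguments("hypothetical.wmma.load")

print(f"embedded in name: {len(embedded)} intrinsics")
print(f"constant arguments: {len(constant)} intrinsic")
```

With the embedded approach, the number of names grows multiplicatively with each new attribute, whereas the constant-argument approach keeps one name and moves the variation into operands.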

Would you be able to revise the patch? It would be highly appreciated.

Thanks.

Yuan


https://reviews.llvm.org/D38645




