[PATCH] D38645: [NVPTX] Implemented wmma intrinsics and instructions.
Yuan Lin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 11 11:27:56 PDT 2017
YuanLin added a comment.
Artem, thanks a lot for working on this!
I notice that you are taking a different approach to defining the LLVM wmma intrinsics than the one we (NVIDIA) use:
http://docs.nvidia.com/cuda/nvvm-ir-spec/index.html#nvvm-intrin-warp-level-matrix
Specifically, yours embeds the layout and memory space in the intrinsic name, while ours treats them as constant arguments. We did this to reduce the number of intrinsic functions the optimizer and codegen have to deal with. Since we plan to add more wmma features in the next few CUDA releases, it would be better to unify the syntax and naming of the wmma intrinsics now. That would also make cross-support much easier.
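As a rough illustration of the difference in intrinsic count, here is a minimal sketch. The option lists and the `wmma.load...` name pattern are hypothetical, chosen only to show the combinatorics; they are not the actual names or option sets used by the patch or by the NVVM IR spec.

```python
from itertools import product

# Hypothetical option sets, for illustration only.
LAYOUTS = ["row", "col"]
SPACES = ["generic", "global", "shared"]


def name_encoded_variants():
    """Options are part of the name: one intrinsic per combination."""
    return [
        f"wmma.load.{layout}.{space}"
        for layout, space in product(LAYOUTS, SPACES)
    ]


def argument_encoded_variants():
    """Options are constant arguments: a single intrinsic covers all cases."""
    return ["wmma.load(ptr, layout, space)"]


if __name__ == "__main__":
    print(len(name_encoded_variants()))      # 6
    print(len(argument_encoded_variants()))  # 1
```

Each additional option dimension multiplies the first count and leaves the second unchanged, which is the growth the constant-argument form is meant to avoid.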
Would you be able to revise the patch? It would be highly appreciated.
Thanks.
Yuan
https://reviews.llvm.org/D38645