[PATCH] D107046: [NVPTX] Add NVPTX intrinsics for CUDA PTX 6.5 ldmatrix instructions
Steffen Larsen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 29 01:45:42 PDT 2021
steffenlarsen created this revision.
steffenlarsen added a reviewer: tra.
Herald added subscribers: hiraditya, yaxunl, jholewinski.
steffenlarsen requested review of this revision.
Herald added subscribers: llvm-commits, jdoerfert.
Herald added a project: LLVM.
Adds NVPTX intrinsics for the CUDA PTX `ldmatrix.sync.aligned` instructions introduced in PTX 6.5.
PTX ISA description of `ldmatrix.sync.aligned`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-ldmatrix
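As a rough illustration of the instruction family involved, the sketch below enumerates the `ldmatrix.sync.aligned` mnemonics from the qualifier sets in the PTX ISA syntax (`.shape.num{.trans}{.ss}.type`). The helper name `ldmatrix_mnemonics` and the constant names are hypothetical and are not taken from this patch or from `wmma.py`. The patch may cover a different subset or use a different naming scheme for the intrinsics.

```python
from itertools import product

# Qualifier sets as listed in the PTX ISA syntax for ldmatrix:
#   ldmatrix.sync.aligned.shape.num{.trans}{.ss}.type
# The names below are illustrative only, not identifiers from the patch.
SHAPES = [".m8n8"]
NUMS = [".x1", ".x2", ".x4"]   # number of 8x8 matrices loaded
TRANS = ["", ".trans"]         # optional transposed load
SPACES = ["", ".shared"]       # optional explicit state space
TYPES = [".b16"]


def ldmatrix_mnemonics():
    """Return every qualifier combination as a full PTX mnemonic string."""
    return [
        "ldmatrix.sync.aligned" + "".join(parts)
        for parts in product(SHAPES, NUMS, TRANS, SPACES, TYPES)
    ]


for mnemonic in ldmatrix_mnemonics():
    print(mnemonic)
```

Enumerating the cross product this way is only meant to show the size of the variant space that a set of intrinsics and their tests would have to cover.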
Authored-by: Steffen Larsen <steffen.larsen at codeplay.com>
Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D107046

Files:
  llvm/include/llvm/IR/IntrinsicsNVVM.td
  llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
  llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
  llvm/test/CodeGen/NVPTX/wmma.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D107046.362673.patch
Type: text/x-patch
Size: 18837 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210729/2790a3b9/attachment.bin>