[PATCH] D107046: [NVPTX] Add NVPTX intrinsics for CUDA PTX 6.5 ldmatrix instructions

Steffen Larsen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 29 01:45:42 PDT 2021


steffenlarsen created this revision.
steffenlarsen added a reviewer: tra.
Herald added subscribers: hiraditya, yaxunl, jholewinski.
steffenlarsen requested review of this revision.
Herald added subscribers: llvm-commits, jdoerfert.
Herald added a project: LLVM.

Adds NVPTX intrinsics for the PTX `ldmatrix.sync.aligned` instructions introduced in PTX ISA 6.5.

PTX ISA description of `ldmatrix.sync.aligned`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-ldmatrix
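
For orientation, here is a minimal sketch of the instruction variants this family covers. It assumes the qualifier sets the PTX ISA lists for `ldmatrix` in version 6.5 (shape `.m8n8`, counts `.x1`/`.x2`/`.x4`, optional `.trans`, state space `.shared`, type `.b16`). The helper name `ldmatrix_variants` is hypothetical and this is not the code from `wmma.py` in the patch; it only enumerates PTX mnemonics, not the LLVM intrinsic names.

```python
from itertools import product

# Qualifier sets assumed from the PTX ISA 6.5 description of ldmatrix.
SHAPES = [".m8n8"]
NUMS = [".x1", ".x2", ".x4"]      # number of 8x8 matrices loaded
TRANS = ["", ".trans"]            # optional transposed load
SPACES = [".shared"]              # source state space
TYPES = [".b16"]                  # element type


def ldmatrix_variants():
    """Return every ldmatrix.sync.aligned mnemonic built from the sets above.

    Hypothetical helper for illustration only.
    """
    return [
        f"ldmatrix.sync.aligned{shape}{num}{trans}{space}{ty}"
        for shape, num, trans, space, ty in product(
            SHAPES, NUMS, TRANS, SPACES, TYPES
        )
    ]


if __name__ == "__main__":
    for name in ldmatrix_variants():
        print(name)
```

Each `.xN` count comes with and without `.trans`, so the sketch yields six mnemonics in total.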

Authored-by: Steffen Larsen <steffen.larsen at codeplay.com>


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D107046

Files:
  llvm/include/llvm/IR/IntrinsicsNVVM.td
  llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
  llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
  llvm/test/CodeGen/NVPTX/wmma.py

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D107046.362673.patch
Type: text/x-patch
Size: 18837 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210729/2790a3b9/attachment.bin>

