[llvm] [LLVM][NVPTX] Add movmatrix intrinsic and PTX instruction support (PR #190109)
Alex MacLean via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 2 10:55:06 PDT 2026
================
@@ -3936,6 +3936,34 @@ an event.
For more information on the pmevent instructions, refer to the `PTX ISA
<https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#miscellaneous-instructions-pmevent>`__.
+movmatrix Intrinsics
+--------------------
+
+'``llvm.nvvm.movmatrix.sync.aligned.m8n8.trans.b16``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+ declare i32 @llvm.nvvm.movmatrix.sync.aligned.m8n8.trans.b16(i32 %src)
+
+Overview:
+"""""""""
+
+The '``@llvm.nvvm.movmatrix.sync.aligned.m8n8.trans.b16``' intrinsic generates
+the ``movmatrix.sync.aligned.m8n8.trans.b16`` PTX instruction, which performs
----------------
AlexMaclean wrote:
I think the intrinsic documentation should not be so focused on the PTX lowering. Instead, we should just define what the semantics are and perhaps reference the PTX instruction briefly at the end.
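
For illustration only, a semantics-first description could be backed by a small reference model like the one below. It is a rough sketch, not part of the patch: the helper names (`pack`, `unpack`, `distribute`, `gather`, `movmatrix_trans`) are hypothetical, and the fragment layout is an assumption based on my reading of the PTX ISA (thread `t` holds two adjacent b16 elements of row `t // 4`, starting at column `2 * (t % 4)`, with the lower-indexed element in the low 16 bits). Under those assumptions, each thread's result is simply its fragment of the transposed 8x8 matrix.

```python
# Hypothetical reference model of a warp-wide 8x8 b16 transpose.
# None of these names are LLVM or CUDA APIs; the layout is an assumption.
WARP_SIZE = 32
DIM = 8


def pack(lo, hi):
    """Pack two 16-bit elements into one 32-bit register value (lo in low bits)."""
    return (lo & 0xFFFF) | ((hi & 0xFFFF) << 16)


def unpack(reg):
    """Split a 32-bit register value into its (low, high) 16-bit elements."""
    return reg & 0xFFFF, (reg >> 16) & 0xFFFF


def distribute(matrix):
    """Split a row-major 8x8 matrix into one 32-bit register per thread.

    Assumed layout: thread t holds row t // 4, columns 2*(t % 4) and 2*(t % 4) + 1.
    """
    regs = []
    for t in range(WARP_SIZE):
        row, col = t // 4, 2 * (t % 4)
        regs.append(pack(matrix[row][col], matrix[row][col + 1]))
    return regs


def gather(regs):
    """Inverse of distribute: rebuild the 8x8 matrix from per-thread registers."""
    matrix = [[0] * DIM for _ in range(DIM)]
    for t, reg in enumerate(regs):
        row, col = t // 4, 2 * (t % 4)
        matrix[row][col], matrix[row][col + 1] = unpack(reg)
    return matrix


def movmatrix_trans(src_regs):
    """Return, per thread, its fragment of the transpose of the warp's matrix."""
    m = gather(src_regs)
    transposed = [[m[c][r] for c in range(DIM)] for r in range(DIM)]
    return distribute(transposed)
```

With a model like this, the documentation could state the result for each thread directly (for example, thread 1 ends up holding elements `[2][0]` and `[3][0]` of the source matrix) and mention the PTX instruction only as the lowering.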
https://github.com/llvm/llvm-project/pull/190109