[llvm] [NVPTX] Add tcgen05 alloc/dealloc intrinsics (PR #124961)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 29 16:02:00 PST 2025


================
@@ -962,6 +962,109 @@ The ``griddepcontrol`` intrinsics allows the dependent grids and prerequisite gr
 For more information, refer 
 `PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-griddepcontrol>`__.
 
+TCGEN05 family of Intrinsics
+----------------------------
+
+The llvm.nvvm.tcgen05.* intrinsics model the TCGEN05 family of instructions
+exposed by PTX. These intrinsics use 'Tensor Memory' (henceforth ``tmem``).
+NVPTX represents this memory using ``addrspace(6)`` and is always 32-bits.
+
+For more information, refer PTX ISA
----------------
Artem-B wrote:

Edit: "refer to the PTX ISA" or "See the PTX ISA for more information".

https://github.com/llvm/llvm-project/pull/124961

