[llvm] [LLVM][NVPTX] Add codegen support for tcgen05.{ld, st} instructions (PR #126740)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 18 12:00:49 PST 2025


================
@@ -1175,6 +1175,101 @@ For more information, refer to the PTX ISA
 `<https://docs.nvidia.com/cuda/parallel-thread-execution/#tensorcore-5th-generation-instructions-tcgen05-fence>`_.
 
 
+'``llvm.nvvm.tcgen05.ld.*``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+  declare <n x i32> @llvm.nvvm.tcgen05.ld.<shape>.<num>(ptr addrspace(6) %tmem_addr, i1 %pack)
+
+  declare <n x i32> @llvm.nvvm.tcgen05.ld.16x32bx2.<num>(ptr addrspace(6) %tmem_addr, i64 %offset, i1 %pack)
----------------
Artem-B wrote:

After a closer look at the changes, I'm fine with them. While we could push a bit more into TableGen, we'd still likely need the custom lowering glue.

https://github.com/llvm/llvm-project/pull/126740

