[llvm] [LLVM][NVPTX] Add codegen support for tcgen05.{ld, st} instructions (PR #126740)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 18 12:00:49 PST 2025
================
@@ -1175,6 +1175,101 @@ For more information, refer to the PTX ISA
`<https://docs.nvidia.com/cuda/parallel-thread-execution/#tensorcore-5th-generation-instructions-tcgen05-fence>`_.
+'``llvm.nvvm.tcgen05.ld.*``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+ declare <n x i32> @llvm.nvvm.tcgen05.ld.<shape>.<num>(ptr addrspace(6) %tmem_addr, i1 %pack)
+
+ declare <n x i32> @llvm.nvvm.tcgen05.ld.16x32bx2.<num>(ptr addrspace(6) %tmem_addr, i64 %offset, i1 %pack)
----------------
Artem-B wrote:
After a closer look at the changes, I'm fine with them. While we could push a bit more into TableGen, we'd likely still need the custom lowering glue.
https://github.com/llvm/llvm-project/pull/126740