[llvm] [LLVM][NVPTX] Add codegen support for tcgen05.{ld, st} instructions (PR #126740)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 11 11:28:05 PST 2025


================
@@ -1175,6 +1175,101 @@ For more information, refer to the PTX ISA
 `<https://docs.nvidia.com/cuda/parallel-thread-execution/#tensorcore-5th-generation-instructions-tcgen05-fence>`_.
 
 
+'``llvm.nvvm.tcgen05.ld.*``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+  declare <n x i32> @llvm.nvvm.tcgen05.ld.<shape>.<num>(ptr addrspace(6) %tmem_addr, i1 %pack)
+
+  declare <n x i32> @llvm.nvvm.tcgen05.ld.16x32bx2.<num>(ptr addrspace(6) %tmem_addr, i64 %offset, i1 %pack)
----------------
Artem-B wrote:

`%pack` is the odd one out here. Is there a particular reason to make it a parameter rather than part of the intrinsic name?
Mapping a boolean argument to an instruction-name modifier is OK, but there's still a question of consistency.
We should summarize and document the rules of thumb we've established so far, so that we do not have to bikeshed over naming for every new PTX intrinsic.
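
To make the question concrete, here is a rough sketch of the two spellings. The `16x64b` shape, the `x2` count, and the `<2 x i32>` return type are assumptions chosen only to instantiate the template from the patch, and the `.pack` suffix is a hypothetical name that does not appear in the patch:

```llvm
; Parameter form, following the declaration template in this patch
; (shape 16x64b and num x2 are picked purely for illustration).
declare <2 x i32> @llvm.nvvm.tcgen05.ld.16x64b.x2(ptr addrspace(6) %tmem_addr, i1 %pack)

; Name form (hypothetical spelling, not part of the patch): packing is
; encoded as a suffix on the intrinsic name, so no i1 operand is needed.
declare <2 x i32> @llvm.nvvm.tcgen05.ld.16x64b.x2.pack(ptr addrspace(6) %tmem_addr)
```

Either form can be lowered to the same instruction; the difference is whether the modifier is visible in the intrinsic name or only as an operand at each call site.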


https://github.com/llvm/llvm-project/pull/126740

