[llvm] [LLVM][NVPTX] Add support for tensormap.cp_fenceproxy (PR #107555)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 10 15:54:27 PDT 2024
================
@@ -311,7 +311,37 @@ The ``@llvm.nvvm.fence.proxy.tensormap_generic.*`` is a uni-directional fence us
``@llvm.nvvm.fence.proxy.tensormap_generic.acquire.*`` ``fence.proxy.tensormap::generic.acquire.* [addr], size``
====================================================== =========================================================
-The address operand ``addr`` and the operand ``size`` together specify the memory range ``[addr, addr+size)`` on which the ordering guarantees on the memory accesses across the proxies is to be provided. The only supported value for the ``size`` operand is ``128`` and must be an immediate. Generic Addressing is used unconditionally, and the address specified by the operand addr must fall within the ``.global`` state space. Otherwise, the behavior is undefined. For more information, see `PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-membar>`_.
+The address operand ``addr`` and the operand ``size`` together specify the memory range ``[addr, addr+size)`` on which the ordering guarantees on the memory accesses across the proxies is to be provided. The only supported value for the ``size`` operand is ``128`` and must be an immediate. Generic Addressing is used unconditionally, and the address specified by the operand addr must fall within the ``.global`` state space. Otherwise, the behavior is undefined. For more information, see PTX ISA `<https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-membar>`_.
+
+'``llvm.nvvm.tensormap.cp_fenceproxy.global.shared.tensormap_generic.release.*.sync.aligned``'
----------------
gonzalobg wrote:
> If/when we will need to introduce other variants, e.g. shared.shared, then we can add intrinsics with the address space encoded in the name, and auto-upgrade the abbreviated one to the global.shared.
What do you mean by "auto-upgrade"? That new variants are added, but variants without modifiers are kept with their pre-existing meaning?
https://github.com/llvm/llvm-project/pull/107555
More information about the llvm-commits
mailing list