[llvm] [LLVM][NVPTX] Add NVPTX codegen support for fence.proxy.tensormap (PR #100748)
Pradeep Kumar via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 7 15:09:44 PDT 2024
================
@@ -251,6 +251,41 @@ Overview:
The '``@llvm.nvvm.barrier0()``' intrinsic emits a PTX ``bar.sync 0``
instruction, equivalent to the ``__syncthreads()`` call in CUDA.
+Membar/Fences
+-------------
+
+
+'``llvm.nvvm.fence.proxy.tensormap_generic.*``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+ declare void @llvm.nvvm.fence.proxy.tensormap_generic.release.cta()
+ declare void @llvm.nvvm.fence.proxy.tensormap_generic.release.cluster()
+ declare void @llvm.nvvm.fence.proxy.tensormap_generic.release.gpu()
+ declare void @llvm.nvvm.fence.proxy.tensormap_generic.release.sys()
+
+ declare void @llvm.nvvm.fence.proxy.tensormap_generic.acquire.cta(ptr %addr, i32 %size)
+ declare void @llvm.nvvm.fence.proxy.tensormap_generic.acquire.cluster(ptr %addr, i32 %size)
+ declare void @llvm.nvvm.fence.proxy.tensormap_generic.acquire.gpu(ptr %addr, i32 %size)
+ declare void @llvm.nvvm.fence.proxy.tensormap_generic.acquire.sys(ptr %addr, i32 %size)
+
+Overview:
+"""""""""
+
+The ``@llvm.nvvm.fence.proxy.tensormap_generic.*`` is a uni-directional fence used to establish ordering between a prior memory access performed via the generic proxy and a subsequent memory access performed via the tensormap proxy. ``nvvm.fence.proxy.tensormap_generic.release`` can form a release sequence that synchronizes with an acquire sequence that contains the ``nvvm.fence.proxy.tensormap_generic.acquire`` proxy fence. The following table describes the mapping between LLVM Intrinsic and the PTX instruction:
----------------
schwarzschild-radius wrote:
I did not add a definition of proxy because Gonzalo explained that the PTX definition is still evolving. He suggested that we just link to the PTX definition of proxy, and I have included that suggestion in the commit (https://github.com/llvm/llvm-project/blob/cf416e0a1bd1b3ff6edc662116295f6fffe24be0/llvm/docs/NVPTXUsage.rst?plain=1#L279). Please let me know if I am missing anything.
https://github.com/llvm/llvm-project/pull/100748