[llvm] [LLVM][NVPTX] Add NVPTX codegen support for fence.proxy.tensormap (PR #100748)

via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 1 10:15:45 PDT 2024


================
@@ -251,6 +251,34 @@ Overview:
 The '``@llvm.nvvm.barrier0()``' intrinsic emits a PTX ``bar.sync 0``
 instruction, equivalent to the ``__syncthreads()`` call in CUDA.
 
+Membar/Fences
+-------------
+
+
+'``llvm.nvvm.fence.proxy.tensormap.*``'
----------------
gonzalobg wrote:

Why does the intrinsic name drop `generic` from the PTX spelling (`tensormap::generic`)?

How does a user of the intrinsic know whether it fences "from generic to tensormap" or in the other direction, "from tensormap to generic"?

How will this be disambiguated if the opposite direction needs to be added?

https://github.com/llvm/llvm-project/pull/100748

