[llvm] [LLVM][NVPTX] Add NVPTX codegen support for fence.proxy.tensormap (PR #100748)

Pradeep Kumar via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 1 03:22:21 PDT 2024


================
@@ -1418,6 +1418,20 @@ let TargetPrefix = "nvvm" in {
   def int_nvvm_fence_sc_cluster:
       Intrinsic<[], [], [IntrNoCallback]>;
 
+// Proxy fence (uni-directional)
+foreach scope = ["cta", "cluster", "gpu", "sys"] in {
+
+  def int_nvvm_fence_proxy_tensormap_release_ # scope:
+        Intrinsic<[], [], [IntrNoCallback],
+                  "llvm.nvvm.fence.proxy.tensormap.release." # scope>;
+
+  def int_nvvm_fence_proxy_tensormap_acquire_ # scope:
+        Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty],
+                  [IntrNoCallback, ImmArg<ArgIndex<1>>],
----------------
schwarzschild-radius wrote:

> I'm not familiar enough with the new synchronization instructions, so I can't tell you what is the right way to specify their behavior in LLVM, but it's not uncommon for various barrier instructions to come with an IntrHasSideEffects to make sure that LLVM would not move them around.

By default, intrinsics are assumed to have side effects unless specified otherwise (https://github.com/llvm/llvm-project/blob/67730ae19c6bcb08dca292e0576b6cd55a843932/llvm/include/llvm/IR/Intrinsics.td#L168). I added the `IntrArgMemOnly` attribute to the acquire variants to relax the constraints on them.

> Also, are there any concerns regarding replicating/merging these instructions across different control flow paths? Do they need IntrConvergent?

The `fence.*` variants don't require `IntrConvergent`, since they can also be executed divergently.
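
For reference, here is a rough sketch of how the definitions could look with the attributes discussed above. It is not the exact code in the PR: it assumes the definitions sit inside the existing `let TargetPrefix = "nvvm" in { ... }` block of IntrinsicsNVVM.td, the position of `IntrArgMemOnly` in the property list is my choice, and the name string for the acquire variant is assumed by analogy with the release variant.

```tablegen
// Sketch only; assumes the enclosing `let TargetPrefix = "nvvm" in { ... }`
// block and the Intrinsic class from Intrinsics.td.
foreach scope = ["cta", "cluster", "gpu", "sys"] in {

  // No memory attribute is listed, so the release fence keeps the default
  // assumption that the intrinsic may have side effects.
  def int_nvvm_fence_proxy_tensormap_release_ # scope:
        Intrinsic<[], [], [IntrNoCallback],
                  "llvm.nvvm.fence.proxy.tensormap.release." # scope>;

  // IntrArgMemOnly restricts the assumed memory effects to memory reachable
  // through the pointer argument. IntrConvergent is deliberately omitted.
  // The name string below is an assumption, mirroring the release variant.
  def int_nvvm_fence_proxy_tensormap_acquire_ # scope:
        Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty],
                  [IntrNoCallback, IntrArgMemOnly, ImmArg<ArgIndex<1>>],
                  "llvm.nvvm.fence.proxy.tensormap.acquire." # scope>;
}
```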

https://github.com/llvm/llvm-project/pull/100748

