[llvm] [LLVM][NVPTX] Add NVPTX codegen support for fence.proxy.tensormap (PR #100748)
Pradeep Kumar via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 1 03:22:21 PDT 2024
================
@@ -1418,6 +1418,20 @@ let TargetPrefix = "nvvm" in {
def int_nvvm_fence_sc_cluster:
Intrinsic<[], [], [IntrNoCallback]>;
+// Proxy fence (uni-directional)
+foreach scope = ["cta", "cluster", "gpu", "sys"] in {
+
+ def int_nvvm_fence_proxy_tensormap_release_ # scope:
+ Intrinsic<[], [], [IntrNoCallback],
+ "llvm.nvvm.fence.proxy.tensormap.release." # scope>;
+
+ def int_nvvm_fence_proxy_tensormap_acquire_ # scope:
+ Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty],
+ [IntrNoCallback, ImmArg<ArgIndex<1>>],
----------------
schwarzschild-radius wrote:
> I'm not familiar enough with the new synchronization instructions, so I can't tell you what is the right way to specify their behavior in LLVM, but it's not uncommon for various barrier instructions to come with an IntrHasSideEffects to make sure that LLVM would not move them around.
By default, intrinsics are assumed to have side effects unless specified otherwise (https://github.com/llvm/llvm-project/blob/67730ae19c6bcb08dca292e0576b6cd55a843932/llvm/include/llvm/IR/Intrinsics.td#L168), so an explicit `IntrHasSideEffects` is not needed here. I added the `IntrArgMemOnly` attribute to relax the constraints for the acquire variants.
> Also, are there any concerns regarding replicating/merging these instructions across different control flow paths? Do they need IntrConvergent ?
The fence.* variants don't require `IntrConvergent`, since they can also be executed divergently.
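
For illustration, the two answers above could be combined in a declaration along the following lines. This is only a sketch, not the definition from the patch: the def name prefix `int_nvvm_example_fence_acquire_` is hypothetical, the name string is left out so that TableGen derives it from the def name, and the fragment assumes it sits inside llvm/include/llvm/IR/IntrinsicsNVVM.td, where `Intrinsic`, `llvm_ptr_ty` and the property records are already defined.

```
// Sketch only: a hypothetical acquire-side fence, one def per scope.
// No IntrHasSideEffects: with no IntrNoMem/IntrReadMem/IntrWriteMem listed,
// the intrinsic is still assumed to read and write memory.
// No IntrConvergent: the fence may be executed by divergent threads.
let TargetPrefix = "nvvm" in {
  foreach scope = ["cta", "cluster", "gpu", "sys"] in {
    def int_nvvm_example_fence_acquire_ # scope:
      Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty],
                [IntrNoCallback,          // does not call back into the module
                 IntrArgMemOnly,          // only touches memory reachable via its pointer argument
                 ImmArg<ArgIndex<1>>]>;   // operand 1 must be an immediate constant
  }
}
```

The only relaxation relative to the default is `IntrArgMemOnly`, which narrows the assumed memory effects to the pointer argument instead of all accessible memory.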
https://github.com/llvm/llvm-project/pull/100748