[llvm] [NVPTX] Add TMA Bulk Copy Intrinsics (PR #138679)

Durgadoss R via llvm-commits llvm-commits at lists.llvm.org
Wed May 7 01:08:35 PDT 2025


================
@@ -5323,6 +5323,20 @@ def int_nvvm_cp_async_bulk_shared_cta_to_global
        NoCapture<ArgIndex<0>>, NoCapture<ArgIndex<1>>,
        ImmArg<ArgIndex<4>>]>;
 
+// From Shared CTA to Global memory with bytemask
+def int_nvvm_cp_async_bulk_shared_cta_to_global_bytemask
+  : DefaultAttrsIntrinsic<[],
+      [llvm_global_ptr_ty, // dst_gmem_ptr
+       llvm_shared_ptr_ty, // src_smem_ptr
+       llvm_i32_ty,        // copy_size
+       llvm_i16_ty,        // byte_mask
+       llvm_i64_ty,        // cache_hint
+       llvm_i1_ty],        // Flag for cache_hint
+      [IntrConvergent, IntrArgMemOnly,
+       WriteOnly<ArgIndex<0>>, ReadOnly<ArgIndex<1>>,
+       NoCapture<ArgIndex<0>>, NoCapture<ArgIndex<1>>,
----------------
durga4github wrote:

Fixed in the latest revision.

https://github.com/llvm/llvm-project/pull/138679

