[llvm] [NVPTX] Add TMA Bulk Copy Intrinsics (PR #138679)
Durgadoss R via llvm-commits
llvm-commits at lists.llvm.org
Wed May 7 01:08:35 PDT 2025
================
@@ -5323,6 +5323,20 @@ def int_nvvm_cp_async_bulk_shared_cta_to_global
       NoCapture<ArgIndex<0>>, NoCapture<ArgIndex<1>>,
       ImmArg<ArgIndex<4>>]>;
+// From Shared CTA to Global memory with bytemask
+def int_nvvm_cp_async_bulk_shared_cta_to_global_bytemask
+  : DefaultAttrsIntrinsic<[],
+      [llvm_global_ptr_ty, // dst_gmem_ptr
+       llvm_shared_ptr_ty, // src_smem_ptr
+       llvm_i32_ty,        // copy_size
+       llvm_i16_ty,        // byte_mask
+       llvm_i64_ty,        // cache_hint
+       llvm_i1_ty],        // Flag for cache_hint
+      [IntrConvergent, IntrArgMemOnly,
+       WriteOnly<ArgIndex<0>>, ReadOnly<ArgIndex<1>>,
+       NoCapture<ArgIndex<0>>, NoCapture<ArgIndex<1>>,
----------------
durga4github wrote:
Fixed in the latest revision.
https://github.com/llvm/llvm-project/pull/138679