[llvm] [NVPTX] Add TMA Bulk Copy Intrinsics (PR #138679)
Durgadoss R via llvm-commits
llvm-commits at lists.llvm.org
Wed May 7 01:08:52 PDT 2025
================
@@ -525,18 +528,26 @@ class CpAsyncBulkStr<bit mc, bit ch> {
string C2C = "cp.async.bulk.shared::cluster.shared::cta.mbarrier::complete_tx::bytes";
}
-multiclass CP_ASYNC_BULK_S2G<NVPTXRegClass rc> {
- def NAME: NVPTXInst<(outs),
- (ins Int64Regs:$dst, rc:$src, Int32Regs:$size),
+def CP_ASYNC_BULK_S2G : NVPTXInst<(outs),
+ (ins Int64Regs:$dst, ADDR:$src, Int32Regs:$size),
!strconcat(CpAsyncBulkStr<0, 0>.S2G, " [$dst], [$src], $size;"), []>,
Requires<[hasPTX<80>, hasSM<90>]>;
- def NAME # _CH: NVPTXInst<(outs),
- (ins Int64Regs:$dst, rc:$src, Int32Regs:$size, Int64Regs:$ch),
- !strconcat(CpAsyncBulkStr<0, 1>.S2G, " [$dst], [$src], $size, $ch;"), []>,
- Requires<[hasPTX<80>, hasSM<90>]>;
-}
-defm CP_ASYNC_BULK_S2G : CP_ASYNC_BULK_S2G<Int64Regs>;
-defm CP_ASYNC_BULK_S2G_SHARED32 : CP_ASYNC_BULK_S2G<Int32Regs>;
+
+def CP_ASYNC_BULK_S2G_CH : NVPTXInst<(outs),
+ (ins Int64Regs:$dst, ADDR:$src, Int32Regs:$size, Int64Regs:$ch),
+ !strconcat(CpAsyncBulkStr<0, 1>.S2G, " [$dst], [$src], $size, $ch;"), []>,
----------------
durga4github wrote:
Done. Updated in the latest revision.
https://github.com/llvm/llvm-project/pull/138679