[llvm] [NVPTX] Add TMA Bulk Copy Intrinsics (PR #138679)

Durgadoss R via llvm-commits llvm-commits at lists.llvm.org
Wed May 7 01:08:52 PDT 2025


================
@@ -525,18 +528,26 @@ class CpAsyncBulkStr<bit mc, bit ch> {
   string C2C = "cp.async.bulk.shared::cluster.shared::cta.mbarrier::complete_tx::bytes";
 }
 
-multiclass CP_ASYNC_BULK_S2G<NVPTXRegClass rc> {
-  def NAME: NVPTXInst<(outs),
-            (ins Int64Regs:$dst, rc:$src, Int32Regs:$size),
+def CP_ASYNC_BULK_S2G : NVPTXInst<(outs),
+            (ins Int64Regs:$dst, ADDR:$src, Int32Regs:$size),
             !strconcat(CpAsyncBulkStr<0, 0>.S2G, " [$dst], [$src], $size;"), []>,
             Requires<[hasPTX<80>, hasSM<90>]>;
-  def NAME # _CH: NVPTXInst<(outs),
-                  (ins Int64Regs:$dst, rc:$src, Int32Regs:$size, Int64Regs:$ch),
-                  !strconcat(CpAsyncBulkStr<0, 1>.S2G, " [$dst], [$src], $size, $ch;"), []>,
-                  Requires<[hasPTX<80>, hasSM<90>]>;
-}
-defm CP_ASYNC_BULK_S2G : CP_ASYNC_BULK_S2G<Int64Regs>;
-defm CP_ASYNC_BULK_S2G_SHARED32 : CP_ASYNC_BULK_S2G<Int32Regs>;
+
+def CP_ASYNC_BULK_S2G_CH : NVPTXInst<(outs),
+            (ins Int64Regs:$dst, ADDR:$src, Int32Regs:$size, Int64Regs:$ch),
+            !strconcat(CpAsyncBulkStr<0, 1>.S2G, " [$dst], [$src], $size, $ch;"), []>,
----------------
durga4github wrote:

Done. Updated in the latest revision.

https://github.com/llvm/llvm-project/pull/138679

