[llvm] [NVPTX] Add Bulk Copy Prefetch Intrinsics (PR #123226)

Abhilash Majumder via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 21 22:11:28 PST 2025


================
@@ -547,6 +547,23 @@ multiclass CP_ASYNC_BULK_CTA_TO_CLUSTER<NVPTXRegClass rc> {
 defm CP_ASYNC_BULK_CTA_TO_CLUSTER : CP_ASYNC_BULK_CTA_TO_CLUSTER<Int64Regs>;
 defm CP_ASYNC_BULK_CTA_TO_CLUSTER_SHARED32 : CP_ASYNC_BULK_CTA_TO_CLUSTER<Int32Regs>;
 
+//------------------------------
+// Bulk Copy Prefetch Functions
+//------------------------------
+multiclass CP_ASYNC_BULK_PREFETCH_INTR {
+  defvar prefetch = "cp.async.bulk.prefetch.L2.global";
+  def NAME: NVPTXInst<(outs),
+            (ins Int64Regs:$src, Int32Regs:$size),
+            !strconcat(prefetch, " [$src], $size;"), []>,
+            Requires<[hasPTX<80>, hasSM<90>]>;
+  def NAME # _CH: NVPTXInst<(outs),
+                  (ins Int64Regs:$src, Int32Regs:$size, Int64Regs:$ch),
+                  !strconcat(prefetch, ".L2::cache_hint [$src], $size, $ch;"), []>,
+                  Requires<[hasPTX<80>, hasSM<90>]>;
+}
+
+defm CP_ASYNC_BULK_PREFETCH : CP_ASYNC_BULK_PREFETCH_INTR;
+
----------------
abhilash1910 wrote:

Yes, addressed this: removed the multiclass instantiation.
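
For reference, a minimal sketch of what dropping the multiclass could look like, assuming the two variants simply become standalone defs. The record names below are the ones `defm CP_ASYNC_BULK_PREFETCH` would have generated from the hunk above; the exact form in the updated patch may differ.

```tablegen
// Sketch only: standalone defs in place of the multiclass + defm pair.
// Operands, asm strings, and predicates are copied from the quoted hunk;
// this fragment relies on NVPTXInst, Int64Regs, Int32Regs, hasPTX and hasSM
// already being defined in the surrounding NVPTX .td file.
def CP_ASYNC_BULK_PREFETCH : NVPTXInst<(outs),
        (ins Int64Regs:$src, Int32Regs:$size),
        "cp.async.bulk.prefetch.L2.global [$src], $size;", []>,
        Requires<[hasPTX<80>, hasSM<90>]>;

// Variant carrying an L2 cache-hint operand.
def CP_ASYNC_BULK_PREFETCH_CH : NVPTXInst<(outs),
        (ins Int64Regs:$src, Int32Regs:$size, Int64Regs:$ch),
        "cp.async.bulk.prefetch.L2.global.L2::cache_hint [$src], $size, $ch;", []>,
        Requires<[hasPTX<80>, hasSM<90>]>;
```

With only one instantiation and no template parameters, the multiclass adds no reuse, so plain defs keep the same two records with less indirection.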

https://github.com/llvm/llvm-project/pull/123226

