[Mlir-commits] [mlir] [MLIR][NVVM] Add prefetch Ops (PR #141737)

Sun Jun 1 22:15:13 PDT 2025

================
@@ -2333,6 +2353,79 @@ def NVVM_CpAsyncBulkTensorSharedCTAToGlobalOp :
   let hasVerifier = 1;
 }
 
+//===----------------------------------------------------------------------===//
+// NVVM Prefetch Ops
+//===----------------------------------------------------------------------===//
+
+def NVVM_PrefetchL1Op : NVVM_Op<"prefetch.L1"> {
+  let summary = "Brings the cache line containing the specified address into L1 cache";
+  let description = [{
+    Brings the cache line containing the specified address into L1 cache.
+
+    Operand `addr` can be a global, local or generic address pointer.
+    No operation is performed if `addr` maps to a `shared` memory location.
+
+    [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-prefetch-prefetchu)
+  }];
+  let arguments = (ins AnyTypeOf<[LLVM_PointerGlobal,
+                                  LLVM_PointerLocal,
+                                  LLVM_PointerGeneric]>:$addr);
+  let assemblyFormat = "$addr attr-dict `:` type($addr)";
+
+  let extraClassDeclaration = [{
+    static llvm::Intrinsic::ID getIntrinsicID(Operation &op);
+  }];
+  let llvmBuilder = [{
+    auto intId = NVVM::PrefetchL1Op::getIntrinsicID(*op);
+    createIntrinsicCall(builder, intId, $addr);
+  }];
+}
+
+def NVVM_PrefetchL2Op : NVVM_Op<"prefetch.L2"> {
----------------
Wolfram70 wrote:

Our thinking was that, since an eviction priority can only be specified for a prefetch to L2, combining the two Ops would make the verifier a little bulkier, and we would probably also need another enum attribute to specify the cache level. What do you think? Would it be better to combine them into a single Op?
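
To make the trade-off concrete, here is a rough, self-contained sketch of the extra check a combined Op would need. It is not code from the PR: `CacheLevel`, `EvictPriority`, `CombinedPrefetch`, and `verifyCombinedPrefetch` are hypothetical stand-ins, and a real MLIR verifier would return a `LogicalResult` and emit a diagnostic rather than a string.

```cpp
#include <cassert>   // for the usage checks mentioned below
#include <optional>
#include <string>

// Hypothetical enum attribute that a combined Op would need in order to
// select the cache level (an assumption, not part of the PR).
enum class CacheLevel { L1, L2 };

// Hypothetical eviction-priority attribute; the names are illustrative only.
enum class EvictPriority { EvictLast, EvictNormal };

// Simplified model of a single combined prefetch Op.
struct CombinedPrefetch {
  CacheLevel level;
  std::optional<EvictPriority> evictPriority;
};

// Sketch of the additional verifier logic: an eviction priority is only
// accepted when the target cache level is L2.
// Returns an empty string on success, otherwise a diagnostic message.
std::string verifyCombinedPrefetch(const CombinedPrefetch &op) {
  if (op.evictPriority && op.level != CacheLevel::L2)
    return "eviction priority can only be specified for prefetch to L2";
  return "";
}
```

With this model, `verifyCombinedPrefetch({CacheLevel::L2, EvictPriority::EvictLast})` succeeds, while the same priority with `CacheLevel::L1` is rejected. With separate Ops, the L1 Op simply has no such attribute, so this check is not needed.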

https://github.com/llvm/llvm-project/pull/141737