[Mlir-commits] [mlir] [MLIR][NVGPU] Add `tma.fence.descriptor` OP (PR #133218)

Thu Mar 27 05:04:41 PDT 2025

================
@@ -433,6 +433,20 @@ def NVGPU_MBarrierTryWaitParityOp : NVGPU_Op<"mbarrier.try_wait.parity", []> {
   let assemblyFormat = "$barriers `[` $mbarId `]` `,` $phaseParity `,` $ticks attr-dict `:` type($barriers)";  
 }
 
+def NVGPU_TmaFenceOp : NVGPU_Op<"tma.fence.descriptor", []> {
+  let summary = "Insert fence given `nvgpu.tensormap.descriptor` ";
+  let description = [{
+    The Op fences the given `$tmaDescriptor`. This is necessary if the tensor map
+    descriptor was modified from the host using cudaMemcpy. In this case, the
+    kernel needs a fence, then it is safe to use `tensor.map` for load.
+  }];
+  let arguments = (ins NVGPU_TensorMapDescriptor:$tensorMapDescriptor);
----------------
durga4github wrote:

Should we also add an attr for the mem-scope (with `sys` as the default, to cater to the current usage)?
(or)
Should we name it explicitly `TmaFenceSysScopeOp` or something..?



https://github.com/llvm/llvm-project/pull/133218