[Mlir-commits] [mlir] [mlir][amdgpu] Add amdgpu.make_dma_descriptor (PR #169407)
Jakub Kuderski
llvmlistbot at llvm.org
Tue Nov 25 06:18:44 PST 2025
================
@@ -1192,4 +1227,91 @@ def AMDGPU_ScaledMFMAOp :
}];
let hasCanonicalizer = 1;
}
+
+def AMDGPU_MakeDmaBaseOp :
+ AMDGPU_Op<"make_dma_base", [Pure, AttrSizedOperandSegments]>,
+ Arguments<(ins
+ Arg<AnyMemRef, "buffer to read from">:$src,
+ Variadic<Index>:$srcIndices,
+ Arg<AnyMemRef, "buffer to write to">:$dst,
+ Variadic<Index>:$dstIndices)>,
+ Results<(outs AMDGPU_TDMBaseType: $base)> {
+
+ // TODO:
+ // * Add verifiers such that one of the memrefs is from LDS and the other global.
+ // * Add verifiers to make sure that the number of indices does not exceed the number of dimensions.
+
+ let summary = "Pair of base addresses used when moving tiles between LDS and global memory.";
+ let description = [{
+ This operation creates a pair of addresses that will be used by tensor_load_to_lds
+ and tensor_store_from_lds.
+
+ This operation creates a value corresponding roughly to the descriptor group 0
+ found in TensorLoadToLDSOp and TensorStoreFromLDSOp in the rocdl dialect.
+ }];
+
+ let assemblyFormat = [{
+ $src `[` $srcIndices `]` `,` $dst `[` $dstIndices `]` attr-dict `:` type($src) `,` type($dst) `->` type(results)
+ }];
+}
+
+def AMDGPU_MakeDmaDescriptorOp :
+ AMDGPU_Op<"make_dma_descriptor", [Pure, AttrSizedOperandSegments]>,
+ Arguments<(ins
+ AMDGPU_TDMBaseType: $base,
+ Variadic<Index>: $global_dynamic_sizes,
+ OptionalAttr<DenseI64ArrayAttr>: $global_static_sizes,
+ Variadic<Index>: $global_dynamic_strides,
+ OptionalAttr<DenseI64ArrayAttr>: $global_static_strides,
+ Variadic<Index>: $shared_dynamic_sizes,
+ OptionalAttr<DenseI64ArrayAttr>: $shared_static_sizes,
+ Optional<Index>: $pad,
+ OptionalAttr<IndexAttr>: $pad_const,
+ Optional<Index>: $every,
+ OptionalAttr<IndexAttr>: $every_const,
+ Optional<AnyMemRef>: $atomic_barrier_address,
+ Variadic<Index>: $atomic_barrier_dynamic_indices,
+ OptionalAttr<DenseI64ArrayAttr>: $atomic_barrier_static_indices,
+ Optional<Index>: $global_increment,
+ Optional<Index>: $lds_increment,
+ Optional<Index>: $iteration_count)>,
+ Results<(outs AMDGPU_TDMDescriptorType: $desc)> {
+
+ let summary = "Make all descriptor groups needed by TensorLoadToLDS/TensorStoreFromLDS.";
+ let description = [{
+ Make all descriptor groups needed by tensor memory operations.
+
+ The $base operand holds the pair of base addresses: one must be an address in LDS
+ while the other must be a global memory location.
+
+ $global_{static/dynamic}_sizes determine the size of the tensor.
+ $global_{static/dynamic}_strides determine the strides of the tensor.
+ $shared_{static/dynamic}_sizes determine the size of the tile.
+
+ Padding can be applied to the LDS address when copying from memory to LDS,
+ but not when copying from LDS to memory.
+ The values in the padded target addresses remain the same as before the operation was applied.
+
+ 2D and 3D tensors may be iterated over by setting $global_increment, $lds_increment, and $iteration_count.
+ $global_increment determines how much to increment the starting global memory address per iteration in units of the $base's element type.
+ $lds_increment determines how much to increment the starting LDS address per iteration in units of the $base's element type.
+ $iteration_count determines how many times to iterate.
----------------
kuhar wrote:
also here: can you add a few mlir examples?
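Something along these lines would already help. This is only a sketch for make_dma_base, derived from the assemblyFormat in this hunk; the SSA names, memref shapes, and the `!amdgpu.tdm_base<f16>` type spelling are assumptions and have not been checked against the type definition in the patch. The make_dma_descriptor syntax is not visible in this hunk, so no example is guessed for it here.

```mlir
// Sketch only: %global, %lds, and the index values are hypothetical SSA names,
// and the result type spelling is assumed rather than taken from this patch.
// One memref is in global memory, the other in LDS (workgroup address space).
%base = amdgpu.make_dma_base %global[%i, %j], %lds[%k, %l]
    : memref<64x64xf16>, memref<64x64xf16, #gpu.address_space<workgroup>>
    -> !amdgpu.tdm_base<f16>
```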
https://github.com/llvm/llvm-project/pull/169407