[Mlir-commits] [mlir] Fix bug in gpu.memcpy lowering for dynamically shaped operands. (PR #128820)
Matthias Springer
llvmlistbot at llvm.org
Thu Feb 27 00:27:29 PST 2025
================
@@ -17,3 +17,23 @@ module attributes {gpu.container_module} {
return
}
}
+
+// -----
+
+module attributes {gpu.container_module} {
+
+ // CHECK: func @dynamic
+ func.func @dynamic(%dst : memref<?x?xf32, 1>, %src : memref<?x?xf32>) {
+ // CHECK: %[[T0:.*]] = llvm.call @mgpuStreamCreate
+ %t0 = gpu.wait async
+ %t1 = gpu.memcpy async [%t0] %dst, %src : memref<?x?xf32, 1>, memref<?x?xf32>
----------------
matthias-springer wrote:
Before this PR, the implementation calculated the number of elements as `stride[0] * size[0]`. I'm not sure why it was written that way. The sparse test may start failing now that we no longer take the strides into account.
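
To make the difference concrete, here is a minimal sketch in plain Python (not the actual lowering code). It assumes the new element count is the product of the sizes, which this comment does not state explicitly. The helper names `row_major_strides`, `num_elements_old`, and `num_elements_from_sizes` are hypothetical and exist only for illustration.

```python
from math import prod


def row_major_strides(sizes):
    """Strides (in elements) of a contiguous row-major buffer with the given sizes."""
    strides = [1] * len(sizes)
    for i in range(len(sizes) - 2, -1, -1):
        strides[i] = strides[i + 1] * sizes[i + 1]
    return strides


def num_elements_old(sizes, strides):
    """Element count as described for the old code: stride[0] * size[0]."""
    return strides[0] * sizes[0]


def num_elements_from_sizes(sizes):
    """Assumed replacement: product of all sizes, ignoring strides."""
    return prod(sizes)


sizes = [4, 8]

# Contiguous layout: both formulas agree.
contiguous = row_major_strides(sizes)
old_contiguous = num_elements_old(sizes, contiguous)

# Hypothetical padded layout (rows 16 elements apart): the formulas diverge.
padded = [16, 1]
old_padded = num_elements_old(sizes, padded)

new_count = num_elements_from_sizes(sizes)
print(contiguous, old_contiguous, old_padded, new_count)
```

For a contiguous row-major buffer the two formulas give the same count. They differ only when the outermost stride is larger than the product of the inner sizes, so a test that relied on the stride-based count would be one that uses such a layout.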
https://github.com/llvm/llvm-project/pull/128820