[Mlir-commits] [mlir] Fix bug in gpu.memcpy lowering for dynamically shaped operands. (PR #128820)

Thu Feb 27 00:27:29 PST 2025

================
@@ -17,3 +17,23 @@ module attributes {gpu.container_module} {
     return
   }
 }
+
+// -----
+
+module attributes {gpu.container_module} {
+
+  // CHECK: func @dynamic
+  func.func @dynamic(%dst : memref<?x?xf32, 1>, %src : memref<?x?xf32>) {
+    // CHECK: %[[T0:.*]] = llvm.call @mgpuStreamCreate
+    %t0 = gpu.wait async
+    %t1 = gpu.memcpy async [%t0] %dst, %src : memref<?x?xf32, 1>, memref<?x?xf32>
----------------
matthias-springer wrote:

Before this PR, the implementation calculated the number of elements as `stride[0] * size[0]`. I'm not sure why it was written that way. The sparse test may start failing now that the strides are no longer taken into account.
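
To make the difference concrete, here is a rough sketch in Python rather than the actual lowering code. The helper names and the example shapes are hypothetical and chosen only for illustration. It compares `stride[0] * size[0]` with the product of the sizes, assuming row-major strided layouts:

```python
from math import prod

def count_from_outer_stride(sizes, strides):
    # The formula mentioned above: outermost stride times outermost size.
    return strides[0] * sizes[0]

def count_from_sizes(sizes):
    # Strides ignored: simply the product of all dimension sizes.
    return prod(sizes)

# Hypothetical 4x8 buffer, densely packed (row stride equals row length).
contiguous_sizes, contiguous_strides = (4, 8), (8, 1)

# Hypothetical 4x8 view into a wider buffer (row stride 16, so rows are padded).
padded_sizes, padded_strides = (4, 8), (16, 1)

contiguous_old = count_from_outer_stride(contiguous_sizes, contiguous_strides)
contiguous_new = count_from_sizes(contiguous_sizes)
padded_old = count_from_outer_stride(padded_sizes, padded_strides)
padded_new = count_from_sizes(padded_sizes)

print(contiguous_old, contiguous_new)
print(padded_old, padded_new)
```

For the densely packed layout both formulas agree. For the padded layout the stride-based formula also counts the gaps between rows, so any test that relies on a non-identity layout could observe a different copy size.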


https://github.com/llvm/llvm-project/pull/128820