[Mlir-commits] [mlir] Fix bug in gpu.memcpy lowering for dynamically shaped operands. (PR #128820)

Thu Feb 27 00:26:08 PST 2025

================
@@ -17,3 +17,23 @@ module attributes {gpu.container_module} {
     return
   }
 }
+
+// -----
+
+module attributes {gpu.container_module} {
+
+  // CHECK: func @dynamic
+  func.func @dynamic(%dst : memref<?x?xf32, 1>, %src : memref<?x?xf32>) {
+    // CHECK: %[[T0:.*]] = llvm.call @mgpuStreamCreate
+    %t0 = gpu.wait async
+    %t1 = gpu.memcpy async [%t0] %dst, %src : memref<?x?xf32, 1>, memref<?x?xf32>
----------------
matthias-springer wrote:

Is that the only failing test? Maybe we can put a `memref.cast` in the sparse compiler to cast away the layout map, along with a TODO for the sparse compiler folks to change the way `CoordinatesOp` is generated.

I'm also wondering whether the IR generated by the sparse compiler here is actually correct. That is, does the memcpy to the GPU actually work, given that there is a layout map? @aartbik @PeimingLiu
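
For illustration, a minimal sketch of what such a cast could look like. The function name, the `strided` layout on `%coords`, and the `index` element type are assumptions made up for this example, not taken from the sparse compiler's actual output. The cast is only meaningful if the buffer is in fact contiguous at runtime (offset 0, unit stride); otherwise the result is not a valid view of the data.

```mlir
// Hypothetical example: %coords stands in for a buffer whose type carries a
// non-identity (strided) layout map.
func.func @cast_away_layout(%coords : memref<?xindex, strided<[?], offset: ?>>,
                            %dst : memref<?xindex, 1>) {
  // Cast the dynamic strided layout to the identity layout. This assumes the
  // runtime offset is 0 and the runtime stride is 1.
  %plain = memref.cast %coords
      : memref<?xindex, strided<[?], offset: ?>> to memref<?xindex>
  // With the layout map gone, both gpu.memcpy operands have identity layouts.
  %t0 = gpu.wait async
  %t1 = gpu.memcpy async [%t0] %dst, %plain : memref<?xindex, 1>, memref<?xindex>
  gpu.wait [%t1]
  return
}
```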

https://github.com/llvm/llvm-project/pull/128820