[Mlir-commits] [mlir] [mlir][ArmSME] Lower transfer_write + transpose to vertical store (PR #71181)

Wed Nov 8 04:59:55 PST 2023

================
@@ -116,6 +135,25 @@ func.func @entry() {
   call @transfer_write_2d_mask(%A, %c0, %c0) : (memref<?x?xf32>, index, index) -> ()
   call @load_and_print(%A, %c0, %c0) : (memref<?x?xf32>, index, index) -> ()
 
+  // 4. Reload 3. + store + transpose.
+  // CHECK-LABEL: TILE BEGIN:
+  // CHECK-NEXT: ( 0, 0, 20, 30
+  // CHECK-NEXT: ( 0, 0, 21, 31
+  // CHECK-NEXT: ( 0, 0, 0, 0
+  // CHECK-NEXT: ( 3, 13, 0, 0
+  call @transfer_write_2d_transposed(%A, %c0, %c0) : (memref<?x?xf32>, index, index) -> ()
+  call @load_and_print(%A, %c0, %c0) : (memref<?x?xf32>, index, index) -> ()
+
+  // 5. Reload 4. + store + transpose but with mask (nrows=4, ncols=2).
+  // The mask applies after permutation
+  // CHECK-LABEL: TILE BEGIN:
+  // CHECK-NEXT: ( 0, 0, 20, 30
+  // CHECK-NEXT: ( 0, 0, 21, 31
+  // CHECK-NEXT: ( 20, 21, 0, 0
+  // CHECK-NEXT: ( 30, 31, 0, 0
----------------
c-rhodes wrote:

correct, took me a minute to figure it out again as well :) added a comment to clarify.

https://github.com/llvm/llvm-project/pull/71181