[PATCH] D156467: [mlir][ArmSME] Add conversion from ArmSME to SCF to materialize loops

Mon Jul 31 07:52:31 PDT 2023

awarzynski accepted this revision.
awarzynski added a comment.

LGTM, great work, thanks!

================
Comment at: mlir/include/mlir/Dialect/ArmSME/IR/ArmSME.td:355
+    $base `[` $indices `]` `,` $tile `,` $tile_slice_index
+      attr-dict `:` type($base) `,` type($tile) `,` type($result)
+  }];
----------------
Given that `type($tile)` will always be equal to `type($result)`, I wonder whether this wouldn't be cleaner:
```
let assemblyFormat = [{
    $base `[` $indices `]` `,` $tile `,` $tile_slice_index
      attr-dict `:` type($base) `,` type($tile)
  }];
```
or (with index type):
```
let assemblyFormat = [{
    $base `[` $indices `]` `,` $tile `,` $tile_slice_index
      attr-dict `:` type($base) `,` type($tile) `,` type($tile_slice_index)
  }]
```

================
Comment at: mlir/include/mlir/Dialect/ArmSME/IR/ArmSME.td:322-323
+
+    The memory is defined by a base and index and must be contiguous. The
+    memref must be either rank 1 or rank 2 with dynamic dimensions, since the
+    operation is scalable, and the element type must be a scalar that matches
----------------
dcaballe wrote:
> -> ~The slice of memory read is defined...?
[nit] Perhaps this is just me and also English is not my first language, but this reads a bit like:

> The memref must be either "rank 1" or "rank 2 with dynamic dimensions"

i.e. as if "dynamic dimensions" only referred to "rank 2". 

================
Comment at: mlir/lib/Conversion/ArmSMEToSCF/ArmSMEToSCF.cpp:71
+    // use as input tile to 'arm_sme.load_tile_slice' ops.
+    auto tileInit =
+        rewriter.create<arm_sme::CastTileToVector>(loc, tileType, tileId);
----------------
[nit] "Init" in `tileInit` suggests that something might be initialised. Perhaps `tileIdAsVector`?

================
Comment at: mlir/lib/Conversion/ArmSMEToSCF/ArmSMEToSCF.cpp:89
+    auto tileSliceIndex = forOp.getInductionVar();
+    // TODO: use indices
+    // Create 'arm_sme.load_tile_slice' to load tile slice from
----------------
Can you expand?

================
Comment at: mlir/lib/Conversion/ArmSMEToSCF/ArmSMEToSCF.cpp:101-102
+    // loop to simplify the canonicalizations.
+    rewriter.replaceOpWithNewOp<arm_sme::CastTileToVector>(tileLoadOp, tileType,
+                                                           tileId);
+
----------------
Is this 2nd cast really needed? Wouldn't this work:
```
rewriter.replaceOp(tileLoadOp, tileInit);
```
? I'm probably missing something obvious 🤔 .

================
Comment at: mlir/test/Dialect/ArmSME/invalid.mlir:77-81
+func.func @arm_sme_load_tile_slice__tile_and_result_type_mismatch(%src : memref<?x?xi8>, %tile : vector<[16]x[16]xi8>, %tile_slice_index : index) {
+  %c0 = arith.constant 0 : index
+  // expected-error at +1 {{expected result type to match tile type}}
+  %tile_update = arm_sme.load_tile_slice %src[%c0], %tile, %tile_slice_index : memref<?x?xi8>, vector<[16]x[16]xi8>, vector<[4]x[4]xi32>
+  return
----------------
Similar test for `arm_sme.store_tile_slice`? If we preserve this assembly format.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156467/new/

https://reviews.llvm.org/D156467