[all-commits] [llvm/llvm-project] 447bb5: [mlir][ArmSME] Introduce new lowering layer (Vecto...

Tue Jul 18 01:07:08 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 447bb5bee402eab94987ebbd8f29d696f946ba9e
      https://github.com/llvm/llvm-project/commit/447bb5bee402eab94987ebbd8f29d696f946ba9e
  Author: Andrzej Warzynski <andrzej.warzynski at arm.com>
  Date:   2023-07-18 (Tue, 18 Jul 2023)

  Changed paths:
    M mlir/include/mlir/Conversion/Passes.h
    M mlir/include/mlir/Conversion/Passes.td
    A mlir/include/mlir/Conversion/VectorToArmSME/VectorToArmSME.h
    M mlir/include/mlir/Dialect/ArmSME/IR/ArmSME.h
    M mlir/include/mlir/Dialect/ArmSME/IR/ArmSME.td
    M mlir/lib/Conversion/CMakeLists.txt
    A mlir/lib/Conversion/VectorToArmSME/CMakeLists.txt
    A mlir/lib/Conversion/VectorToArmSME/VectorToArmSME.cpp
    A mlir/lib/Conversion/VectorToArmSME/VectorToArmSMEPass.cpp
    M mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVMPass.cpp
    M mlir/lib/Dialect/ArmSME/IR/CMakeLists.txt
    M mlir/lib/Dialect/ArmSME/Transforms/CMakeLists.txt
    M mlir/lib/Dialect/ArmSME/Transforms/LegalizeForLLVMExport.cpp
    R mlir/lib/Dialect/ArmSME/Transforms/LowerVectorOps.cpp
    M mlir/test/Dialect/ArmSME/roundtrip.mlir
    A mlir/test/Dialect/ArmSME/vector-ops-to-llvm.mlir
    A mlir/test/Dialect/ArmSME/vector-ops-to-sme.mlir
    R mlir/test/Dialect/ArmSME/vector-ops.mlir
    M mlir/test/Integration/Dialect/Vector/CPU/ArmSME/vector-ops.mlir

  Log Message:
  -----------
  [mlir][ArmSME] Introduce new lowering layer (Vector -> ArmSME)

At the moment, the lowering from the Vector dialect to SME looks like
this:

  * Vector --> SME LLVM IR intrinsics

This patch introduces a new lowering layer between the Vector dialect
and the Arm SME extension:

  * Vector --> ArmSME dialect (custom Ops) --> SME LLVM IR intrinsics.

This is motivated by two considerations:
1. Storing `ZA` to memory (e.g. `vector.transfer_write`) requires an
   `scf.for` loop over all rows of `ZA`. Similar logic will apply to
   "load to ZA from memory". This is a rather complex transformation and
   a custom Op seems justified.
2. As discussed in [1], we need to prevent the LLVM type converter from
   having to convert types unsupported in LLVM, e.g.
   `vector<[16]x[16]xi8>`. A dedicated abstraction layer with custom Ops
   opens a path to some fine tuning (e.g. custom type converters) that
   will allow us to avoid this.
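
The row-by-row store described in point 1 can be modelled in plain
Python. This is only an illustrative sketch, not the lowering itself:
`store_tile`, `vscale` and the list-of-lists memory model are
hypothetical names chosen for this example, and the Python loop merely
stands in for the `scf.for` loop over the rows of `ZA`.
```python
# Illustrative model only: the names and data layout below are assumptions,
# not part of the ArmSME dialect or of this patch.

def store_tile(tile, memory, row_offset, col_offset):
    """Copy a square tile into a 2-D buffer one row at a time."""
    num_rows = len(tile)
    for row in range(num_rows):  # plays the role of the scf.for loop
        dest = memory[row_offset + row]
        dest[col_offset:col_offset + len(tile[row])] = tile[row]


vscale = 2         # runtime vector-length multiple (assumed value)
dim = 16 * vscale  # vector<[16]x[16]xi8> has 16 * vscale rows and columns
zero_tile = [[0] * dim for _ in range(dim)]

# Destination buffer, slightly larger than the tile and pre-filled with 0xFF
# so that the written region is easy to tell apart.
memory = [[0xFF] * (dim + 4) for _ in range(dim + 4)]
store_tile(zero_tile, memory, 0, 0)
```
Since the number of rows depends on `vscale`, it is only known at
runtime, so the loop cannot simply be unrolled at compile time.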

To facilitate this change, two new custom SME Ops are introduced:

  * `TileStoreOp`, and
  * `ZeroOp`.

Note that no new functionality is added - these Ops merely model what's
already supported. In particular, the following tile size is assumed
(dimension and element size are fixed):

  * `vector<[16]x[16]xi8>`

The new lowering layer is introduced via a conversion pass between the
Vector and the SME dialects. You can use the `-convert-vector-to-sme`
flag to run it. The following function:
```
func.func @example(%arg0 : memref<?x?xi8>) {
  // (...)
  %cst = arith.constant dense<0> : vector<[16]x[16]xi8>
  vector.transfer_write %cst, %arg0 : vector<[16]x[16]xi8>, memref<?x?xi8>
  return
}
```
would be lowered to:
```
func.func @example(%arg0: memref<?x?xi8>) {
  // (...)
  %0 = arm_sme.zero : vector<[16]x[16]xi8>
  arm_sme.tile_store %arg0[%c0, %c0], %0 : memref<?x?xi8>, vector<[16]x[16]xi8>
  return
}
```

Later, a mechanism will be introduced to guarantee that `arm_sme.zero`
and `arm_sme.tile_store` operate on the same virtual tile. For `i8`
elements this is not required as there is only one tile.
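
The single-tile remark follows from how SME partitions `ZA` by element
size. A small sketch of that relationship, assuming the architectural
rule that an element of N bytes yields N virtual tiles (the helper
`num_tiles` is a hypothetical name, not something this patch adds):
```python
# Illustrative helper only; not part of the ArmSME dialect.

def num_tiles(element_bits):
    """Number of virtual ZA tiles for a given element width in bits."""
    if element_bits not in (8, 16, 32, 64, 128):
        raise ValueError("unsupported element width")
    # One tile per byte of element size: i8 -> 1, i16 -> 2, i32 -> 4, ...
    return element_bits // 8


for bits in (8, 16, 32, 64, 128):
    print(f"i{bits}: {num_tiles(bits)} tile(s)")
```
For wider element types there are several tiles to choose from, which
is why a tile-tracking mechanism becomes necessary once they are
supported.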

To lower the above output to LLVM, use:
  * `-convert-vector-to-llvm="enable-arm-sme"`.

[1] https://github.com/openxla/iree/issues/14294

Reviewed By: WanderAway

Differential Revision: https://reviews.llvm.org/D154867