[all-commits] [llvm/llvm-project] f75d46: [mlir][ArmSME] Lower vector.outerproduct to FMOPA/...

Thu Sep 14 00:32:05 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: f75d46a7ec38318a55dc82b0d635cc203954a218
      https://github.com/llvm/llvm-project/commit/f75d46a7ec38318a55dc82b0d635cc203954a218
  Author: Cullen Rhodes <cullen.rhodes at arm.com>
  Date:   2023-09-14 (Thu, 14 Sep 2023)

  Changed paths:
    M mlir/include/mlir/Dialect/ArmSME/Utils/Utils.h
    M mlir/lib/Dialect/ArmSME/Transforms/LegalizeForLLVMExport.cpp
    M mlir/lib/Dialect/ArmSME/Utils/Utils.cpp
    M mlir/lib/Dialect/Vector/Transforms/LowerVectorContract.cpp
    M mlir/test/Dialect/ArmSME/vector-ops-to-llvm.mlir
    A mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-outerproduct-f32.mlir
    A mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-outerproduct-f64.mlir

  Log Message:
  -----------
  [mlir][ArmSME] Lower vector.outerproduct to FMOPA/BFMOPA (#65621)

This patch adds support for lowering vector.outerproduct to the ArmSME
MOPA intrinsic for the following types:

  vector<[8]xf16>,  vector<[8]xf16>  -> vector<[8]x[8]xf16>
  vector<[8]xbf16>, vector<[8]xbf16> -> vector<[8]x[8]xbf16>
  vector<[4]xf32>,  vector<[4]xf32>  -> vector<[4]x[4]xf32>
  vector<[2]xf64>,  vector<[2]xf64>  -> vector<[2]x[2]xf64>
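
The four supported shapes follow one pattern: each scalable dimension
holds as many elements as fit in 128 bits. The sketch below reproduces
that pattern and gives a scalar reference for outer-product-and-accumulate.
It is illustrative only. The 128-bit granule (the minimum streaming vector
length, scaled by vscale at run time) is an assumption not stated in the
commit message, and all helper names are hypothetical, not MLIR APIs.

```python
# Illustrative sketch; none of these names exist in MLIR.
MIN_SVL_BITS = 128  # assumed minimum streaming vector length (times vscale at run time)

ELEMENT_BITS = {"f16": 16, "bf16": 16, "f32": 32, "f64": 64}


def min_lanes(elem_type):
    """Elements per 128-bit granule, i.e. N in vector<[N]x<elem_type>>."""
    return MIN_SVL_BITS // ELEMENT_BITS[elem_type]


def outerproduct_types(elem_type):
    """Return (operand type, result type) strings for a supported element type."""
    n = min_lanes(elem_type)
    operand = f"vector<[{n}]x{elem_type}>"
    result = f"vector<[{n}]x[{n}]x{elem_type}>"
    return operand, result


def outerproduct_acc(lhs, rhs, acc):
    """Scalar reference: result[i][j] = acc[i][j] + lhs[i] * rhs[j]."""
    return [
        [acc[i][j] + lhs[i] * rhs[j] for j in range(len(rhs))]
        for i in range(len(lhs))
    ]
```

For example, outerproduct_types("f64") gives the last row of the list
above, and outerproduct_acc models what one MOPA accumulates into a tile
when every lane is active.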

The FP variants are lowered to FMOPA (non-widening) [1] and BFloat to
BFMOPA (non-widening) [2].

Note that at the ISA level these variants are implemented by different
architecture features, which are listed below:

  FMOPA (non-widening)
    * half-precision   - +sme2p1,+sme-f16f16
    * single-precision - +sme
    * double-precision - +sme-f64f64
  BFMOPA (non-widening)
    * half-precision   - +sme2p1,+b16b16
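
The feature list above can also be written as a lookup table, for
instance to decide which variants a given target can run. This is a
minimal sketch transcribed from the list. The table and function names
are hypothetical, and nothing like this exists in the ArmSME lowering.

```python
# Transcribed from the feature list above; names are hypothetical.
REQUIRED_FEATURES = {
    "f16": ("FMOPA", ("+sme2p1", "+sme-f16f16")),
    "f32": ("FMOPA", ("+sme",)),
    "f64": ("FMOPA", ("+sme-f64f64",)),
    "bf16": ("BFMOPA", ("+sme2p1", "+b16b16")),
}


def features_for(elem_type):
    """Return (instruction, comma-separated feature string) for an element type."""
    instruction, features = REQUIRED_FEATURES[elem_type]
    return instruction, ",".join(features)


def needs_sme2(elem_type):
    """True if the variant depends on an SME2-level feature (+sme2p1)."""
    return "+sme2p1" in REQUIRED_FEATURES[elem_type][1]
```

With this table, needs_sme2 is true only for f16 and bf16.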

There's currently no way to target different features when lowering to
ArmSME. Integration tests are added for F32 and F64. We use QEMU to run
the integration tests, but SME2 support isn't available yet (it's
targeted for QEMU 9.0), so integration tests for the F16 and BF16
variants are excluded.

Masking is currently unsupported.

Depends on #65450.

[1] https://developer.arm.com/documentation/ddi0602/2023-06/SME-Instructions/FMOPA--non-widening---Floating-point-outer-product-and-accumulate-
[2] https://developer.arm.com/documentation/ddi0602/2023-06/SME-Instructions/BFMOPA--non-widening---BFloat16-floating-point-outer-product-and-accumulate-