[all-commits] [llvm/llvm-project] 95ef8e: [mlir][ArmSME] Support 2-way widening outer produc...
Cullen Rhodes via All-commits
all-commits at lists.llvm.org
Wed Jan 31 01:13:31 PST 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 95ef8e386823717efeb2b7b1d02bfbb28473cccc
https://github.com/llvm/llvm-project/commit/95ef8e386823717efeb2b7b1d02bfbb28473cccc
Author: Cullen Rhodes <cullen.rhodes at arm.com>
Date: 2024-01-31 (Wed, 31 Jan 2024)
Changed paths:
M mlir/include/mlir/Dialect/ArmSME/IR/ArmSMEIntrinsicOps.td
M mlir/include/mlir/Dialect/ArmSME/IR/ArmSMEOps.td
M mlir/include/mlir/Dialect/ArmSME/Transforms/Passes.h
M mlir/include/mlir/Dialect/ArmSME/Transforms/Passes.td
M mlir/include/mlir/Dialect/ArmSME/Transforms/Transforms.h
M mlir/lib/Conversion/ArmSMEToLLVM/ArmSMEToLLVM.cpp
M mlir/lib/Dialect/ArmSME/Transforms/CMakeLists.txt
A mlir/lib/Dialect/ArmSME/Transforms/OuterProductFusion.cpp
M mlir/test/Conversion/ArmSMEToLLVM/arm-sme-to-llvm.mlir
M mlir/test/Dialect/ArmSME/invalid.mlir
A mlir/test/Dialect/ArmSME/outer-product-fusion.mlir
M mlir/test/Dialect/ArmSME/roundtrip.mlir
A mlir/test/Integration/Dialect/Vector/CPU/ArmSME/test-outerproduct-f16f16f32.mlir
M mlir/test/Target/LLVMIR/arm-sme.mlir
Log Message:
-----------
[mlir][ArmSME] Support 2-way widening outer products (#78975)
This patch introduces support for 2-way widening outer products. This
enables the fusion of 2 'arm_sme.outerproduct' operations that are
chained via the accumulator into a 2-way widening outer product
operation.
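The arithmetic behind the fusion can be sketched in plain Python. This is only an illustration, not code from the patch: the helper names are hypothetical, the interleaved operand layout is an assumption based on the 'arm_sme.fmopa_2way' description referred to below, and the widening of element types is not modelled (integers are used so the comparison is exact).

```python
def outer_product_acc(acc, lhs, rhs):
    """Model of one accumulating outer product: acc + lhs (x) rhs."""
    return [[acc[i][j] + lhs[i] * rhs[j] for j in range(len(rhs))]
            for i in range(len(lhs))]


def interleave(a, b):
    """Interleave two equal-length vectors: [a0, b0, a1, b1, ...]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    out = []
    for x, y in zip(a, b):
        out.extend([x, y])
    return out


def outer_product_2way(acc, lhs, rhs):
    """Model of a 2-way outer product on interleaved operands.

    Assumption: each result element sums two products, one per
    interleaved half, on top of the accumulator.
    """
    if len(lhs) % 2 or len(rhs) % 2:
        raise ValueError("interleaved operands must have even length")
    rows, cols = len(lhs) // 2, len(rhs) // 2
    return [[acc[i][j]
             + lhs[2 * i] * rhs[2 * j]
             + lhs[2 * i + 1] * rhs[2 * j + 1]
             for j in range(cols)]
            for i in range(rows)]


# Two outer products chained via the accumulator ...
a0, b0 = [1, 2, 3, 4], [5, 6, 7, 8]
a1, b1 = [9, 10, 11, 12], [13, 14, 15, 16]
zero = [[0] * 4 for _ in range(4)]
chained = outer_product_acc(outer_product_acc(zero, a0, b0), a1, b1)

# ... versus one 2-way operation on the interleaved inputs.
fused = outer_product_2way(zero, interleave(a0, a1), interleave(b0, b1))
```

With these inputs 'chained' and 'fused' hold the same 4x4 result, which is the equivalence the fusion relies on.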
Changes:
- Adds 'llvm.aarch64.sme.[us]mop[as].za32' intrinsics for 2-way variants.
  These map to instruction variants added in SME2 and use different
  intrinsics from the SME1 widening variants, whose intrinsics are
  already implemented.
- Adds the following operations:
  - fmopa_2way, fmops_2way
  - smopa_2way, smops_2way
  - umopa_2way, umops_2way
- Implements conversions for the above ops to intrinsics in
  ArmSMEToLLVM.
- Adds a pass 'arm-sme-outer-product-fusion' that fuses
  'arm_sme.outerproduct' operations.
For a detailed description of these operations see the
'arm_sme.fmopa_2way' description.
Separate operations are introduced, rather than a single one, because
the signed and unsigned variants can't be distinguished by type (e.g.,
ui16, si16): 'arith.extui' and 'arith.extsi' only support signless
integers. A single operation would therefore need to carry the
signedness some other way, and an attribute (for example) doesn't feel
right when floating-point types, where it wouldn't apply, are also
supported.
Furthermore, the SME FP8 extensions (FEAT_SME_F8F16, FEAT_SME_F8F32)
introduce FMOPA 2-way (FP8 to FP16) and 4-way (FP8 to FP32) variants but
no subtract variant. Whilst these are not supported in this patch, it
felt simpler to have separate ops for add/subtract given this.
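The signless-integer point can be illustrated with a small Python sketch. The helpers below are hypothetical models of zero- and sign-extension written for this example, not MLIR APIs; they show that the same 16-bit pattern widens to different values, so the signedness has to come from the operation rather than from the type.

```python
def extui(bits, from_width):
    """Zero-extend: treat the low from_width bits as unsigned."""
    return bits & ((1 << from_width) - 1)


def extsi(bits, from_width):
    """Sign-extend: treat the low from_width bits as two's complement."""
    value = bits & ((1 << from_width) - 1)
    sign_bit = 1 << (from_width - 1)
    return value - (1 << from_width) if value & sign_bit else value


# One signless 16-bit pattern, two possible widened values.
pattern = 0xFFFF
as_unsigned = extui(pattern, 16)
as_signed = extsi(pattern, 16)

# The widened products differ too, so a signed and an unsigned
# outer product over the same bits give different results.
unsigned_product = extui(pattern, 16) * extui(2, 16)
signed_product = extsi(pattern, 16) * extsi(2, 16)
```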