[all-commits] [llvm/llvm-project] 0d87e2: [mlir][tosa] Improve lowering to tosa.fully_connec...
Spenser Bauman via All-commits
all-commits at lists.llvm.org
Fri Dec 1 07:17:07 PST 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 0d87e2577914a6384f4ad5952b8fa9b0d8e48da8
https://github.com/llvm/llvm-project/commit/0d87e2577914a6384f4ad5952b8fa9b0d8e48da8
Author: Spenser Bauman <sbauman at mathworks.com>
Date: 2023-12-01 (Fri, 01 Dec 2023)
Changed paths:
M mlir/lib/Conversion/TosaToLinalg/TosaToLinalgNamed.cpp
M mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-named.mlir
A mlir/test/Integration/Dialect/Tosa/CPU/test-fully-connected.mlir
Log Message:
-----------
[mlir][tosa] Improve lowering to tosa.fully_connected (#73049)
The current lowering of tosa.fully_connected produces a linalg.matmul
followed by a linalg.generic to add the bias. The IR looks like the
following:
    %init = tensor.empty()
    %zero = linalg.fill ins(0 : f32) outs(%init)
    %prod = linalg.matmul ins(%A, %B) outs(%zero)

    // Add the bias
    %initB = tensor.empty()
    %result = linalg.generic ins(%prod, %bias) outs(%initB) {
      // add bias and product
    }
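
For context, the op being lowered is tosa.fully_connected, which takes an input, a weight matrix, and a per-output-channel bias. A minimal sketch with illustrative shapes (5 rows, 3 input channels, 6 output channels; these values are assumptions for illustration, not taken from the patch) looks roughly like:

    // TOSA keeps the weights in [out_channels, in_channels] order, so the
    // lowering also has to reorder them before forming a linalg.matmul.
    %out = tosa.fully_connected %input, %weights, %bias
        : (tensor<5x3xf32>, tensor<6x3xf32>, tensor<6xf32>) -> tensor<5x6xf32>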
This has two downsides:

1. The tensor.empty operations typically result in additional
   allocations after bufferization.
2. There is a redundant traversal of the data to add the bias to the
   matrix product.
This extra work can be avoided by leveraging the out-param of
linalg.matmul. The new IR sequence is:
    %init = tensor.empty()
    %broadcast = linalg.broadcast ins(%bias) outs(%init)
    %prod = linalg.matmul ins(%A, %B) outs(%broadcast)
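
Spelled out with illustrative types (shapes match the sketch above; %weights_t is a hypothetical name for the weights already brought into [in_channels, out_channels] order, a step the untyped snippet elides), the new sequence reads roughly as:

    %init = tensor.empty() : tensor<5x6xf32>

    // Materialize the bias directly into the matmul's destination.
    // Dimension 0 (the row dimension) is the broadcast dimension.
    %broadcast = linalg.broadcast ins(%bias : tensor<6xf32>)
                                  outs(%init : tensor<5x6xf32>)
                                  dimensions = [0]

    // linalg.matmul accumulates onto its outs operand, so the bias add
    // rides along with the matrix product instead of requiring a second
    // linalg.generic traversal.
    %prod = linalg.matmul ins(%input, %weights_t : tensor<5x3xf32>, tensor<3x6xf32>)
                          outs(%broadcast : tensor<5x6xf32>) -> tensor<5x6xf32>

This is a sketch rather than the exact IR the pass emits, but it shows the design point: linalg.matmul's destination is read-modify-write, so seeding it with the broadcast bias replaces both the linalg.fill and the trailing linalg.generic from the old sequence.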
In my experiments, this eliminates one loop and one allocation (post
bufferization) from the generated code.