[Mlir-commits] [mlir] [mlir][tosa] Fix bf16 reduction accumulator widening (PR #192045)
Luke Hutton
llvmlistbot at llvm.org
Tue Apr 14 09:16:23 PDT 2026
================
@@ -1172,9 +1172,11 @@ static LogicalResult reduceMatchAndRewriteHelper(OpTy op, uint64_t axis,
Value input = op->getOperand(0);
// Figure out the accType if needed
- bool widenAccTy = std::is_same_v<OpTy, tosa::ReduceSumOp> &&
- isa<FloatType>(elementTy) &&
- cast<FloatType>(elementTy).isBF16();
+ const bool needsFp32AccTy =
+ isa<FloatType>(elementTy) && cast<FloatType>(elementTy).isBF16();
+ const bool widenAccTy = (std::is_same_v<OpTy, tosa::ReduceSumOp> ||
+ std::is_same_v<OpTy, tosa::ReduceProductOp>) &&
----------------
lhutton1 wrote:
In general, an implementation can diverge from the spec pseudo-code as long as it passes conformance. Since this is widening the accumulator type, I don't see an issue here.
It seems the previous problem was that the linalg implementation, which used a bf16 accumulator type, truncated the result of each multiply rather than rounding to nearest. A round-to-nearest implementation would pass conformance. Perhaps it's worth leaving a comment to this effect so that the legalization can be improved in the future? Otherwise the changes LGTM!
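To illustrate the difference between the two rounding behaviours, here is a rough standalone sketch. It is not the linalg lowering itself: the helper names are hypothetical, bf16 is modeled as the top 16 bits of an IEEE-754 binary32 value, and NaN/Inf handling is omitted for brevity.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; none of these exist in MLIR.
static uint32_t floatBits(float f) {
  uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

static float bitsToFloat(uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// Truncation: drop the low 16 mantissa bits (rounds toward zero).
static float toBF16Truncate(float f) {
  return bitsToFloat(floatBits(f) & 0xFFFF0000u);
}

// Round-to-nearest-even on the same 16-bit boundary.
// Assumption: the input is finite (no NaN/Inf special-casing here).
static float toBF16RoundNearestEven(float f) {
  uint32_t u = floatBits(f);
  uint32_t bias = 0x7FFFu + ((u >> 16) & 1u);
  return bitsToFloat((u + bias) & 0xFFFF0000u);
}

// Product reduction with a bf16 accumulator: every partial product is
// rounded back to bf16 using the supplied rounding function.
static float reduceProductBF16Acc(const float *v, int n,
                                  float (*roundFn)(float)) {
  float acc = 1.0f;
  for (int i = 0; i < n; ++i)
    acc = roundFn(acc * v[i]);
  return acc;
}

// Product reduction with an fp32 accumulator: accumulate in fp32 and
// round to bf16 once at the end.
static float reduceProductFP32Acc(const float *v, int n) {
  float acc = 1.0f;
  for (int i = 0; i < n; ++i)
    acc *= v[i];
  return toBF16RoundNearestEven(acc);
}
```

With inputs 1.5 and 1.0078125 (both exactly representable in bf16), the exact product 1.51171875 lies halfway between two bf16 values. Truncation returns the lower one, while round-to-nearest-even and the widened fp32 accumulator both return the upper one.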
https://github.com/llvm/llvm-project/pull/192045