[Mlir-commits] [mlir] [mlir][tosa] Fix bf16 reduction accumulator widening (PR #192045)

Tue Apr 14 09:16:23 PDT 2026

================
@@ -1172,9 +1172,11 @@ static LogicalResult reduceMatchAndRewriteHelper(OpTy op, uint64_t axis,
   Value input = op->getOperand(0);
 
   // Figure out the accType if needed
-  bool widenAccTy = std::is_same_v<OpTy, tosa::ReduceSumOp> &&
-                    isa<FloatType>(elementTy) &&
-                    cast<FloatType>(elementTy).isBF16();
+  const bool needsFp32AccTy =
+      isa<FloatType>(elementTy) && cast<FloatType>(elementTy).isBF16();
+  const bool widenAccTy = (std::is_same_v<OpTy, tosa::ReduceSumOp> ||
+                           std::is_same_v<OpTy, tosa::ReduceProductOp>) &&
----------------
lhutton1 wrote:

In general, an implementation can diverge from the spec pseudo-code as long as it passes conformance. Since this change only widens the accumulator type, I don't see an issue here.

It seems the previous problem was that the linalg implementation, when using a bf16 accumulator type, truncated when multiplying rather than rounding to nearest. A round-to-nearest implementation would pass conformance. Perhaps it's worth leaving a comment to this effect so that the legalization can be improved in the future? Otherwise the changes LGTM!
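
To make the rounding difference concrete, here is a minimal standalone C++ sketch. It is not the linalg lowering itself: `truncateToBF16`, `roundToBF16`, `bf16ToFloat`, `AccMode` and `reduceProduct` are hypothetical helpers written only for illustration, and NaN handling is left out. It emulates a product reduction with a bf16 accumulator that truncates, a bf16 accumulator that rounds to nearest-even, and an fp32 accumulator.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helpers for illustration only; none of these are MLIR APIs.

// fp32 -> bf16 bits by dropping the low 16 bits (truncation toward zero).
static uint16_t truncateToBF16(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return static_cast<uint16_t>(bits >> 16);
}

// fp32 -> bf16 bits with round-to-nearest-even (NaN handling omitted).
static uint16_t roundToBF16(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  uint32_t lsb = (bits >> 16) & 1u; // lowest bit that survives the narrowing
  bits += 0x7FFFu + lsb;            // ties go to the even bf16 value
  return static_cast<uint16_t>(bits >> 16);
}

// bf16 bits -> fp32 (exact: bf16 is the upper half of an fp32 value).
static float bf16ToFloat(uint16_t h) {
  uint32_t bits = static_cast<uint32_t>(h) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

enum class AccMode { BF16Truncate, BF16RoundNearest, FP32 };

// Product reduction over bf16-representable inputs, with the accumulator
// narrowed after every multiply according to `mode`.
static float reduceProduct(const std::vector<float> &values, AccMode mode) {
  float acc = 1.0f;
  for (float v : values) {
    acc *= v;
    if (mode == AccMode::BF16Truncate)
      acc = bf16ToFloat(truncateToBF16(acc));
    else if (mode == AccMode::BF16RoundNearest)
      acc = bf16ToFloat(roundToBF16(acc));
    // AccMode::FP32 keeps full fp32 precision between steps.
  }
  // The reduction result is bf16 in every mode; narrow once at the end.
  return bf16ToFloat(roundToBF16(acc));
}
```

For the input `{1.0703125f, 1.0703125f}` (both exactly representable in bf16) the exact product is 1.14556884765625. The truncating accumulator yields 1.140625, while the round-to-nearest and fp32 accumulators both yield 1.1484375, which is the nearest bf16 value.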

https://github.com/llvm/llvm-project/pull/192045