[Mlir-commits] [mlir] Introduce `arith.scaling_extf` and `arith.scaling_truncf` (PR #141965)
Umang Yadav
llvmlistbot at llvm.org
Fri May 30 11:41:48 PDT 2025
================
@@ -1280,6 +1318,49 @@ def Arith_TruncFOp :
attr-dict `:` type($in) `to` type($out) }];
}
+//===----------------------------------------------------------------------===//
+// Scaling TruncFOp
+//===----------------------------------------------------------------------===//
+
+def Arith_ScalingTruncFOp
+ : Arith_Op<"scaling_truncf",
+ [Pure, SameInputOutputTensorDims,
+ DeclareOpInterfaceMethods<ArithRoundingModeInterface>,
+ DeclareOpInterfaceMethods<ArithFastMathInterface>,
+ DeclareOpInterfaceMethods<CastOpInterface>]>,
+ Arguments<(ins FloatLike:$in, FloatLike:$scale,
+ OptionalAttr<Arith_RoundingModeAttr>:$roundingmode,
+ OptionalAttr<Arith_FastMathAttr>:$fastmath)>,
+ Results<(outs FloatLike:$out)> {
+ let summary =
+ "cast from floating-point to narrower floating-point with scales";
+ let description = [{
+ This operation implements micro-scaling (OCP MXFP) quantization of the input using the provided scale values.
+ This quantization usually happens over a block of values. All values in that block share the same scale value for quantization purposes.
+ Therefore the original input of shape `<dim1 x dim2 x ... x dimN>` can be thought of as having shape `<dim1 x dim2 x ... x (dimN / blockSize) x blockSize>`,
+ assuming the quantization axis is the last axis.
+ The original scale values would therefore have shape `<dim1 x dim2 x ... x dimN-1 x (dimN / blockSize)>`.
+ `arith.scaling_truncf` is an elementwise operation. Therefore, before calling into `arith.scaling_truncf`, if `blockSize != 1`, the
+ scales must be broadcast appropriately so that they have the same shape as the input operand.
----------------
umangyadav wrote:
Done.
https://github.com/llvm/llvm-project/pull/141965
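
The shape bookkeeping in the quoted description can be illustrated with a small sketch. This is not code from the PR: `broadcast_scales` is a hypothetical helper, the input is assumed to be 2-D with the quantization axis last, and plain Python lists stand in for tensors. It only shows how per-block scales of shape `<dim1 x (dimN / blockSize)>` might be expanded to the input's shape `<dim1 x dimN>` before an elementwise op consumes them.

```python
def broadcast_scales(scales, block_size):
    """Repeat each per-block scale block_size times along the last axis.

    Hypothetical helper for illustration only: `scales` is a list of rows,
    each row holding one scale per block of the corresponding input row.
    """
    return [[s for s in row for _ in range(block_size)] for row in scales]


# Assumed example: a 2 x 4 input quantized along the last axis with
# block_size = 2, so the original scales have shape 2 x (4 / 2) = 2 x 2.
scales = [[1.0, 2.0], [4.0, 8.0]]
expanded = broadcast_scales(scales, 2)

# After broadcasting, every input element has its own (shared-per-block) scale.
print(expanded)  # [[1.0, 1.0, 2.0, 2.0], [4.0, 4.0, 8.0, 8.0]]
```

With `block_size == 1` the helper returns the scales unchanged, which matches the description's note that broadcasting is only needed when `blockSize != 1`.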