[Mlir-commits] [mlir] Sub-channel quantized type implementation (PR #120172)
Sandeep Dasgupta
llvmlistbot at llvm.org
Wed Mar 12 17:58:11 PDT 2025
================
@@ -410,6 +410,123 @@ int32_t UniformQuantizedPerAxisType::getQuantizedDimension() const {
return getImpl()->quantizedDimension;
}
+UniformQuantizedSubChannelType UniformQuantizedSubChannelType::get(
+ unsigned flags, Type storageType, Type expressedType,
+ DenseElementsAttr scales, DenseElementsAttr zeroPoints,
+ ArrayRef<int32_t> quantizedDimensions, ArrayRef<int64_t> blockSizes,
+ int64_t storageTypeMin, int64_t storageTypeMax) {
+ return Base::get(storageType.getContext(), flags, storageType, expressedType,
+ scales, zeroPoints, quantizedDimensions, blockSizes,
+ storageTypeMin, storageTypeMax);
+}
+
+UniformQuantizedSubChannelType UniformQuantizedSubChannelType::getChecked(
+ function_ref<InFlightDiagnostic()> emitError, unsigned flags,
+ Type storageType, Type expressedType, DenseElementsAttr scales,
+ DenseElementsAttr zeroPoints, ArrayRef<int32_t> quantizedDimensions,
+ ArrayRef<int64_t> blockSizes, int64_t storageTypeMin,
+ int64_t storageTypeMax) {
+ return Base::getChecked(emitError, storageType.getContext(), flags,
+ storageType, expressedType, scales, zeroPoints,
+ quantizedDimensions, blockSizes, storageTypeMin,
+ storageTypeMax);
+}
+
+LogicalResult UniformQuantizedSubChannelType::verifyInvariants(
+ function_ref<InFlightDiagnostic()> emitError, unsigned flags,
+ Type storageType, Type expressedType, DenseElementsAttr scales,
+ DenseElementsAttr zeroPoints, ArrayRef<int32_t> quantizedDimensions,
+ ArrayRef<int64_t> blockSizes, int64_t storageTypeMin,
+ int64_t storageTypeMax) {
+ if (failed(QuantizedType::verifyInvariants(emitError, flags, storageType,
+ expressedType, storageTypeMin,
+ storageTypeMax))) {
+ return failure();
+ }
+
+ // Uniform quantization requires fully expressed parameters, including
+ // expressed type.
+ if (!expressedType)
+ return emitError() << "uniform quantization requires expressed type";
----------------
sdasgup3 wrote:
There are two verification methods introduced for sub-channel quantization: (A) `UniformQuantizedSubChannelType::verifyInvariants`, and (B) `verifySubChannelQuantization`, which are complementary in nature.
(A) is used for all the checks that can be performed at the type level alone (without knowing the container tensor type), such as:
- Expressed type is floating point.
- Scale type to match expressedType.
- Zero-point type to match storageType.
- Shape of scales and zeroPoints match.
- Number of quantized dimensions and number of block sizes match.
- Each quantized dimension is >= 0.
- Each block size is > 0.
(A) is invoked as part of `QuantizedType` ctor.
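
To make the type-level checks in (A) concrete, here is a minimal standalone sketch. It is not the code from this PR: `checkTypeLevelInvariants` is a hypothetical helper, plain `std::vector` shapes stand in for `DenseElementsAttr`, and the element-type checks (expressed type, scale type, zero-point type) are left out because they need real MLIR types.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the shape-related part of check (A).
// Returns an empty string on success, or a diagnostic message on failure.
// Nothing here needs the container tensor type.
std::string checkTypeLevelInvariants(
    const std::vector<int64_t> &scalesShape,
    const std::vector<int64_t> &zeroPointsShape,
    const std::vector<int32_t> &quantizedDimensions,
    const std::vector<int64_t> &blockSizes) {
  // Shape of scales and zeroPoints must match.
  if (scalesShape != zeroPointsShape)
    return "scales and zero points must have the same shape";

  // One block size per quantized dimension.
  if (quantizedDimensions.size() != blockSizes.size())
    return "number of quantized dimensions and block sizes must match";

  // Each quantized dimension must be non-negative.
  for (int32_t dim : quantizedDimensions)
    if (dim < 0)
      return "quantized dimension must be >= 0";

  // Each block size must be strictly positive.
  for (int64_t block : blockSizes)
    if (block <= 0)
      return "block size must be > 0";

  return "";
}
```

For example, `checkTypeLevelInvariants({2, 4}, {2, 4}, {0, 1}, {2, 2})` returns an empty string, while passing a block size of 0 yields a diagnostic.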
(B) covers the complementary checks that become possible once the container type is known. (B) is invoked as part of the quant dialect ops' verification (`quant.qcast`, `quant.dcast`, etc.). Some example checks for (B) are:
- The container type should be a ranked tensor type.
- `dim(scales, i) = dim(container_type, i) / block_sizes(i)`, etc.
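
The shape relation in (B) can be sketched in the same standalone style. Again, this is an illustration rather than the implementation in the PR: `checkScalesAgainstContainer` is a hypothetical name, and the rank-equality and divisibility checks are assumptions added so that the division is well defined. Only the dimensions listed in `quantizedDimensions` are checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for check (B); needs the container tensor shape.
// Returns an empty string on success, or a diagnostic message on failure.
std::string checkScalesAgainstContainer(
    const std::vector<int64_t> &containerShape,
    const std::vector<int64_t> &scalesShape,
    const std::vector<int32_t> &quantizedDimensions,
    const std::vector<int64_t> &blockSizes) {
  // The type-level checks (A) are assumed to have passed already.
  assert(quantizedDimensions.size() == blockSizes.size());

  // Assumption: scales has the same rank as the container.
  if (scalesShape.size() != containerShape.size())
    return "scales rank must match container rank";

  for (std::size_t k = 0; k < quantizedDimensions.size(); ++k) {
    int32_t dim = quantizedDimensions[k];
    int64_t block = blockSizes[k];
    assert(dim >= 0 && block > 0); // guaranteed by (A)

    if (static_cast<std::size_t>(dim) >= containerShape.size())
      return "quantized dimension out of range for container rank";

    // Assumption: the block size evenly divides the container dimension.
    if (containerShape[dim] % block != 0)
      return "container dimension not divisible by block size";

    // dim(scales, i) = dim(container_type, i) / block_sizes(i)
    if (scalesShape[dim] != containerShape[dim] / block)
      return "scales dimension != container dimension / block size";
  }
  return "";
}
```

With a `4x8` container and block sizes `{2, 2}` on dimensions `{0, 1}`, the expected scales shape is `2x4`; any other scales shape is rejected.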
https://github.com/llvm/llvm-project/pull/120172