[Mlir-commits] [mlir] Extending UniformQuantizedType with interface-based support for new storage types in Quant dialect (PR #152966)

Javed Absar llvmlistbot at llvm.org
Mon Feb 9 04:25:35 PST 2026


================
@@ -159,6 +192,34 @@ def Builtin_Float8E4M3FN : Builtin_FloatType<"Float8E4M3FN", "f8E4M3FN"> {
 
     Described in: https://arxiv.org/abs/2209.05433
   }];
+
+  let extraClassDeclaration = [{
+    /// QuantStorageTypeInterface method implementations
+    /// Whether the storage type should default to signed when used in quantization.
+    bool shouldDefaultToSigned() const { return true; }
+    /// Get the bit width of this 8-bit floating point type.
+    unsigned getStorageWidth() const { return 8; }
+    
+    /// Get default maximum value for this 8-bit floating point type.
+    int64_t getDefaultMaximum(bool isSigned) const { return 448; }
+    /// Get default minimum value for this 8-bit floating point type.
+    int64_t getDefaultMinimum(bool isSigned) const { return -getDefaultMaximum(isSigned); }
----------------
javedabsar1 wrote:

    /// Get the default minimum and maximum values for this 8-bit floating point type.
    int64_t getDefaultMinimum(bool isSigned) const { return -getDefaultMaximum(isSigned); }
    int64_t getDefaultMaximum(bool isSigned) const { return 448; }
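
For context on the constant 448: it is the largest finite value representable in an f8E4M3FN-style layout (4 exponent bits, 3 mantissa bits, bias 7, no infinities, NaN only at exponent 1111 with mantissa 111). A minimal standalone sketch of that arithmetic follows. The names `f8E4M3FNMaxFinite` and `F8E4M3FNStorageSketch` are hypothetical and are not part of MLIR or of this PR; the struct only mirrors the min/max pairing suggested above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an MLIR API): largest finite magnitude of an
// f8E4M3FN-style format, assuming bias 7 and that only the bit pattern
// exponent=0b1111, mantissa=0b111 is reserved for NaN.
constexpr int64_t f8E4M3FNMaxFinite() {
  constexpr int kExponentBias = 7;
  constexpr int kMantissaBits = 3;
  constexpr int kMaxExponentField = 15; // 0b1111 still encodes finite values
  constexpr int kMaxMantissaField = 6;  // 0b110, since 0b111 would be NaN
  // value = 2^(exp - bias) * (1 + mantissa / 2^3)
  //       = 2^(exp - bias - 3) * (2^3 + mantissa)
  return (int64_t{1} << (kMaxExponentField - kExponentBias - kMantissaBits)) *
         ((int64_t{1} << kMantissaBits) + kMaxMantissaField);
}

// Hypothetical stand-in for the type's extra class declarations: the minimum
// is derived from the maximum, so the constant appears in one place only.
struct F8E4M3FNStorageSketch {
  constexpr int64_t getDefaultMinimum(bool isSigned) const {
    return -getDefaultMaximum(isSigned);
  }
  constexpr int64_t getDefaultMaximum(bool /*isSigned*/) const {
    return f8E4M3FNMaxFinite();
  }
};

// Compile-time checks of the arithmetic above.
static_assert(f8E4M3FNMaxFinite() == 448, "2^5 * 14 == 448");
static_assert(F8E4M3FNStorageSketch{}.getDefaultMinimum(true) == -448,
              "minimum mirrors the maximum");
```

Declaring `getDefaultMinimum` before `getDefaultMaximum` is fine in C++, because member function bodies are compiled as if they appeared after the complete class definition.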

https://github.com/llvm/llvm-project/pull/152966

