[Mlir-commits] [mlir] [mlir][AMDGPU] Add scaled floating point conversion ops fp8 (PR #141554)

Thu Jun 5 07:59:33 PDT 2025

================
@@ -112,6 +112,38 @@ def AMDGPU_ExtPackedFp8Op :
   }];
 }
 
+def AMDGPU_ScaledExtPackedOp
+    : AMDGPU_Op<"scaled_ext_packed", [Pure]>,
+      Arguments<(
+          ins AnyTypeOf<[VectorOfLengthAndType<[2, 3, 4], [F8E5M2, F8E4M3FN]>,
----------------
tgymnich wrote:

How do you think this should be implemented?
(1) Using the non-pk instructions, or
(2) using the pk instructions and leaving one of the 2 input vector elements undefined / zero (potentially inefficient).

The non-pk instructions are missing the bf16 cases, and the f16 cases have different semantics (e.g. they take an existing vector input for the result to be packed into).
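
To make option (2) concrete, here is a rough, host-side sketch of the padding step only. It is not code from the PR and not an MLIR lowering: `padToPairs` is a hypothetical helper, the fp8 values are treated as opaque bytes, and zero is assumed as the filler for the unused lane (undef would be the other choice).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (assumption, not part of the PR): group raw fp8 bytes
// into 2-element chunks suitable for a packed (pk) conversion. When the input
// length is odd (e.g. 3), the second lane of the last chunk is zero-filled.
std::vector<std::array<std::uint8_t, 2>>
padToPairs(const std::vector<std::uint8_t> &elems) {
  std::vector<std::array<std::uint8_t, 2>> pairs;
  for (std::size_t i = 0; i < elems.size(); i += 2) {
    // Use the next element if it exists, otherwise the zero filler.
    std::uint8_t second = (i + 1 < elems.size()) ? elems[i + 1] : 0;
    pairs.push_back({elems[i], second});
  }
  return pairs;
}
```

For a length-3 input this yields two chunks, so the second packed conversion would do work for a lane whose result is discarded, which is the potential inefficiency mentioned in (2).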

https://github.com/llvm/llvm-project/pull/141554