[Mlir-commits] [mlir] [mlir][AMDGPU] Add scaled floating point conversion ops fp8 (PR #141554)
Tim Gymnich
llvmlistbot at llvm.org
Thu Jun 5 07:59:33 PDT 2025
================
@@ -112,6 +112,38 @@ def AMDGPU_ExtPackedFp8Op :
}];
}
+def AMDGPU_ScaledExtPackedOp
+ : AMDGPU_Op<"scaled_ext_packed", [Pure]>,
+ Arguments<(
+ ins AnyTypeOf<[VectorOfLengthAndType<[2, 3, 4], [F8E5M2, F8E4M3FN]>,
----------------
tgymnich wrote:
How do you think this should be implemented?
(1) Using the non-pk instructions, or
(2) using the pk instructions and leaving one of the two input vector elements undefined or zero (potentially inefficient)?
The non-pk instructions are missing the bf16 cases, and the f16 cases have different semantics (e.g., they take an existing vector as input and pack the result into it).
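To make option (2) concrete, here is a rough Python sketch of the data flow only. It is not the actual lowering: `cvt_scale_pk` is a hypothetical stand-in for a packed conversion that always consumes two elements, and modelling the scale as a plain multiplication is an assumption made for illustration.

```python
from typing import List, Tuple


def cvt_scale_pk(pair: Tuple[float, float], scale: float) -> Tuple[float, float]:
    """Hypothetical stand-in for a packed (pk) conversion.

    It always consumes exactly two elements. Treating the scale as a
    multiplication is an assumption for illustration only.
    """
    lo, hi = pair
    return (lo * scale, hi * scale)


def scaled_ext_with_padding(values: List[float], scale: float) -> List[float]:
    """Option (2): convert in pairs, padding an odd tail with a filler lane."""
    out: List[float] = []
    for i in range(0, len(values), 2):
        chunk = values[i:i + 2]
        padded = len(chunk) == 1
        if padded:
            # Filler lane: its converted value is computed but thrown away,
            # which is where the potential inefficiency comes from.
            chunk = chunk + [0.0]
        lo, hi = cvt_scale_pk((chunk[0], chunk[1]), scale)
        out.append(lo)
        if not padded:
            out.append(hi)
    return out
```

For a 3-element input this issues two packed conversions and discards one lane of the second, so the result keeps the input length while half of the last conversion is wasted.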
https://github.com/llvm/llvm-project/pull/141554