[llvm] [AMDGPU] Use native instructions for f16 to u16/i16 saturated conversion (PR #186769)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 17 03:36:11 PDT 2026


================
@@ -632,6 +632,53 @@ def : GCNPat<
 >;
 }
 
+// These i16 conversions naturally saturate.
+let OtherPredicates = [Has16BitInsts], True16Predicate = NotHasTrue16BitInsts in {
+def : GCNPat<(i16 (fp_to_uint_sat (f16 (VOP3Mods f16:$src0, i32:$src0_modifiers)), i16)),
+             (V_CVT_U16_F16_e64 $src0_modifiers, $src0)>;
+def : GCNPat<(i16 (fp_to_sint_sat (f16 (VOP3Mods f16:$src0, i32:$src0_modifiers)), i16)),
+             (V_CVT_I16_F16_e64 $src0_modifiers, $src0)>;
+def : GCNPat<(i16 (fp_to_uint_sat f16:$src0, i16)), (V_CVT_U16_F16_e32 (f16 $src0))>;
+def : GCNPat<(i16 (fp_to_sint_sat f16:$src0, i16)), (V_CVT_I16_F16_e32 (f16 $src0))>;
----------------
jayfoad wrote:

You should not need _e32 patterns. The normal flow is that instruction selection selects _e64 instructions, and later SIShrinkInstructions shrinks them to _e32 if possible.
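
As a rough, untested sketch of what that suggestion would leave in this hunk: only the two _e64 patterns from the diff above are kept, unchanged, and the _e32 ones are dropped. The closing brace of the `let` block is not visible in the quoted hunk, so it is assumed here.

```
// These i16 conversions naturally saturate.
let OtherPredicates = [Has16BitInsts], True16Predicate = NotHasTrue16BitInsts in {
def : GCNPat<(i16 (fp_to_uint_sat (f16 (VOP3Mods f16:$src0, i32:$src0_modifiers)), i16)),
             (V_CVT_U16_F16_e64 $src0_modifiers, $src0)>;
def : GCNPat<(i16 (fp_to_sint_sat (f16 (VOP3Mods f16:$src0, i32:$src0_modifiers)), i16)),
             (V_CVT_I16_F16_e64 $src0_modifiers, $src0)>;
} // assumed end of the let block; not shown in the quoted hunk
```

With only these patterns, selection would produce the _e64 forms, and SIShrinkInstructions would be relied on to produce the _e32 forms where the operands allow it.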

https://github.com/llvm/llvm-project/pull/186769

