[llvm] select v_sat_pk from two i16 or v2i16 (PR #121124)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 6 02:30:10 PST 2025
================
@@ -315,6 +315,55 @@ def srl_16 : PatFrag<
(ops node:$src0), (srl_oneuse node:$src0, (i32 16))
>;
+def clamp_s16_u8 : PatFrag<
+ (ops node:$src),
+ (i16 (AMDGPUsmed3 $src, (i16 0), (i16 255)))
+>;
+
+def conc_lo_u8_i16 : PatFrags<
+ (ops node:$src0, node:$src1),
+ [
+ (or
+ (i16 $src0),
+ (shl (i16 $src1), (i16 8))
+ ),
+ (or
+ (and (i16 $src0), (i16 255)),
+ (shl (i16 $src1), (i16 8))
+ )
+ ]
+>;
+
+def clamp_v2i16_u8 : PatFrags<
+ (ops node:$src),
+ [
+ (v2i16 (smax (smin $src, (build_vector (i16 255), (i16 255))), (build_vector (i16 0), (i16 0)))),
+ (v2i16 (smin (smax $src, (build_vector (i16 0), (i16 0))), (build_vector (i16 255), (i16 255))))
+ ]
+>;
+
+def conc_lo_v2i16_i16 : PatFrags<
----------------
arsenm wrote:
Depends on what you mean by "fragile", but an optimization doesn't need to be robust. -100 on making v2i8 legal; that's a huge amount of effort for one operation.
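
For readers without the full patch, here is a rough scalar model of what the quoted fragments appear to match. It is only a sketch: the Python function names mirror the PatFrag names in the hunk for readability, `sat_pk_u8_i16` is a hypothetical helper, and the assumption that the selected instruction saturates each signed 16-bit input to an unsigned byte and packs the two bytes into 16 bits is inferred from the PR title and the patterns, not taken from the patch itself.

```python
def clamp_s16_u8(x: int) -> int:
    """Clamp a signed 16-bit value to [0, 255], i.e. smed3(x, 0, 255)."""
    assert -0x8000 <= x <= 0x7FFF
    return sorted((x, 0, 255))[1]  # median of the three values


def clamp_min_then_max(x: int) -> int:
    """First clamp_v2i16_u8 alternative, per lane: smax(smin(x, 255), 0)."""
    return max(min(x, 255), 0)


def clamp_max_then_min(x: int) -> int:
    """Second clamp_v2i16_u8 alternative, per lane: smin(smax(x, 0), 255)."""
    return min(max(x, 0), 255)


def conc_lo_u8_i16(lo: int, hi: int, mask_lo: bool = True) -> int:
    """Pack two bytes into an i16: lo | (hi << 8).

    mask_lo=True models the pattern alternative with (and $src0, 255);
    mask_lo=False models the one without the mask. Both agree whenever
    lo is already in [0, 255], which the clamp guarantees.
    """
    low = (lo & 0xFF) if mask_lo else lo
    return (low | (hi << 8)) & 0xFFFF  # shl and or on i16 wrap at 16 bits


def sat_pk_u8_i16(lo: int, hi: int) -> int:
    """Hypothetical model of the combined operation: clamp both, then pack."""
    return conc_lo_u8_i16(clamp_s16_u8(lo), clamp_s16_u8(hi))
```

Under this model, `sat_pk_u8_i16(300, -5)` clamps the inputs to 255 and 0 and packs them as `0x00FF`. The three clamp formulations agree for every signed 16-bit input, which is why the hunk can list both smin/smax orderings as alternatives of one fragment.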
https://github.com/llvm/llvm-project/pull/121124