[llvm] [AMDGPU][True16][MC][CodeGen] true16 for v_cndmask_b16 (PR #119736)

Fri Jan 3 13:48:17 PST 2025

================
@@ -1245,11 +1245,29 @@ class VOPSelectPat <ValueType vt> : GCNPat <
   (vt (select i1:$src0, vt:$src1, vt:$src2)),
   (V_CNDMASK_B32_e64 0, VSrc_b32:$src2, 0, VSrc_b32:$src1, SSrc_i1:$src0)
 >;
+class VOPSelectPat_t16 <ValueType vt> : GCNPat <
+  (vt (select i1:$src0, vt:$src1, vt:$src2)),
+  (V_CNDMASK_B16_t16_e64 0, VSrcT_b16:$src2, 0, VSrcT_b16:$src1, SSrc_i1:$src0)
+>;
+class VOPSelectPat_fake16 <ValueType vt> : GCNPat <
+  (vt (select i1:$src0, vt:$src1, vt:$src2)),
+  (V_CNDMASK_B16_fake16_e64 0, VSrc_b16:$src2, 0, VSrc_b16:$src1, SSrc_i1:$src0)
+>;
 
 def : VOPSelectModsPat <i32>;
 def : VOPSelectModsPat <f32>;
-def : VOPSelectPat <f16>;
-def : VOPSelectPat <i16>;
+let True16Predicate = NotHasTrue16BitInsts in {
+  def : VOPSelectPat <f16>;
----------------
broxigarchen wrote:

I see. Sorry I was confused at the beginning.

In Pre-GFX11 it is matched to v_cndmask_b32 and in post-gfx11 it is matched to v_cndmask_b16. Both instruction support omod/abs/neg/opsel srcmod. So I think we should have srcmod pattern for f16/i16 both pre-GFX11 and post-GFX11

I think it's better to merge this as it, and open another PR to address this seperately as this impacts all many targets. What do you think?



https://github.com/llvm/llvm-project/pull/119736