[llvm] [AMDGPU] selecting v_sat_pk instruction, version 2 (PR #123297)

Fri Jan 17 06:42:34 PST 2025

================
@@ -816,6 +816,14 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
                          {MVT::v4f32, MVT::v8f32, MVT::v16f32, MVT::v32f32},
                          Custom);
     }
+
+    // true 16 currently unsupported
+    if (!Subtarget->hasTrue16BitInsts() || (!Subtarget->useRealTrue16Insts() ||
----------------
Shoreshen wrote:

Hi @arsenm, after adding the node I get the following patterns:

```
def V_SAT_PK_U8_I16_e64: list<dag> Pattern = [(set i16:$vdst, (AMDGPUsat_pk_cast (i32 (VOP3Mods0 i32:$src0))))];
def V_SAT_PK_U8_I16_fake16_e64: list<dag> Pattern = [(set i16:$vdst, (AMDGPUsat_pk_cast (i32 (VOP3Mods0 i32:$src0))))];
def V_SAT_PK_U8_I16_t16_e64: list<dag> Pattern = [(set i16:$vdst, (AMDGPUsat_pk_cast (i32 (VOP3OpSelMods i32:$src0, i32:$src0_modifiers))))];
```

I think there are two problems:
1. The source type is i32 instead of v2i16.
2. The operand of AMDGPUsat_pk_cast is required to be one of the complex patterns VOP3Mods0 or VOP3OpSelMods.

If the instruction's patterns cannot cover every form of (i16 (AMDGPUsat_pk_cast v2i8)), we risk a selection failure.
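To illustrate why the i32 source type in problem 1 matters, here is a rough scalar model of what I understand v_sat_pk_u8_i16 to compute: the 32-bit source register is read as two signed 16-bit lanes, each lane is saturated to the unsigned 8-bit range, and the two bytes are packed into a 16-bit result. This is only a sketch based on my reading of the instruction name and ISA description; the function names `satU8`, `satPkU8I16`, and `checkExample` are hypothetical and do not exist in LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Clamp one signed 16-bit lane to the unsigned 8-bit range [0, 255].
inline uint8_t satU8(int16_t v) {
  if (v < 0)
    return 0;
  if (v > 255)
    return 255;
  return static_cast<uint8_t>(v);
}

// Hypothetical scalar model (assumption, not LLVM code): treat the 32-bit
// source as two i16 lanes (low half = element 0), saturate each lane, and
// pack the two bytes into a 16-bit result.
inline uint16_t satPkU8I16(uint32_t src) {
  int16_t lo = static_cast<int16_t>(src & 0xFFFFu);
  int16_t hi = static_cast<int16_t>(src >> 16);
  return static_cast<uint16_t>(satU8(lo) | (satU8(hi) << 8));
}

// Small worked example: lanes {300, -5} saturate to {255, 0}.
inline void checkExample() {
  uint32_t src =
      (static_cast<uint32_t>(static_cast<uint16_t>(-5)) << 16) | 300u;
  assert(satPkU8I16(src) == 0x00FF);
}
```

Under this model the register is 32 bits wide, but the value is logically a pair of i16 elements, which is why a pattern typed as i32 would not match a DAG node whose operand is v2i16.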

I also tried creating a new VOP_I16_V2I16 type, but that turns V_SAT_PK_U8_I16_e64 and V_SAT_PK_U8_I16_fake16_e64 into 4-operand instructions (with modifier, clamp, and opsel).

https://github.com/llvm/llvm-project/pull/123297