[llvm] [AMDGPU][True16] extractEltcheap check 16bit in true16 mode (PR #171762)

Thu Dec 11 08:42:58 PST 2025

================
@@ -586,14 +586,33 @@ define void @undef_hi_op_v2f16(half %arg0) {
 ; GFX9-NEXT:    ;;#ASMEND
 ; GFX9-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX11-LABEL: undef_hi_op_v2f16:
-; GFX11:       ; %bb.0:
-; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11-NEXT:    v_pk_add_f16 v0, v0, 1.0 op_sel_hi:[1,0]
-; GFX11-NEXT:    ;;#ASMSTART
-; GFX11-NEXT:    ; use v0
-; GFX11-NEXT:    ;;#ASMEND
-; GFX11-NEXT:    s_setpc_b64 s[30:31]
+; GFX11-FAKE16-LABEL: undef_hi_op_v2f16:
+; GFX11-FAKE16:       ; %bb.0:
+; GFX11-FAKE16-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-FAKE16-NEXT:    v_pk_add_f16 v0, v0, 1.0 op_sel_hi:[1,0]
+; GFX11-FAKE16-NEXT:    ;;#ASMSTART
+; GFX11-FAKE16-NEXT:    ; use v0
+; GFX11-FAKE16-NEXT:    ;;#ASMEND
+; GFX11-FAKE16-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX11-TRUE16-SDAG-LABEL: undef_hi_op_v2f16:
+; GFX11-TRUE16-SDAG:       ; %bb.0:
+; GFX11-TRUE16-SDAG-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-TRUE16-SDAG-NEXT:    v_add_f16_e32 v0.l, 1.0, v0.l
+; GFX11-TRUE16-SDAG-NEXT:    v_mov_b16_e32 v0.h, 0x7e00
----------------
broxigarchen wrote:

It seems that with the 16-bit IsExtractEltCheap check, isel tries to break
```
op(build_vector(undef, a, ...), build_vector(undef, b, ....))
```
=>
```
build_vector(undef, op(a,b), undef, ...)
```
This looks correct, since splitting exposes more possible optimizations, but in some cases it ends up as an additional mov.

It seems we need these kinds of optimizations for true16 mode:
```
build_vector ((op a,b) , constant) -> v_pk_op(reg_seq(a, constant), reg_seq(b, 0)) 
build_vector ((op a,b) , (op c,d)) -> v_pk_op(reg_seq(a, c), reg_seq(b, d)) 
```
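
As a rough sanity check of the two patterns above, here is a toy lane model in Python. It is only a sketch, not LLVM code: `build_vector`, `pk_op`, and the helper names are hypothetical stand-ins, lanes are plain integers, and the first pattern assumes the op has `0` as a right identity (true for integer add; floating-point add has signed-zero corner cases that this model ignores).

```python
from typing import Callable, Tuple

Lane = int
Vec2 = Tuple[Lane, Lane]          # (lo, hi) lanes of a packed 2x16 value
BinOp = Callable[[Lane, Lane], Lane]


def build_vector(lo: Lane, hi: Lane) -> Vec2:
    """Hypothetical stand-in for build_vector / reg_seq of two 16-bit lanes."""
    return (lo, hi)


def pk_op(op: BinOp, x: Vec2, y: Vec2) -> Vec2:
    """Hypothetical stand-in for a packed op: applies op lane by lane."""
    return (op(x[0], y[0]), op(x[1], y[1]))


def add(a: Lane, b: Lane) -> Lane:
    return a + b


def split_with_constant(op: BinOp, a: Lane, b: Lane, c: Lane) -> Vec2:
    # Split form: build_vector((op a, b), constant)
    return build_vector(op(a, b), c)


def packed_with_constant(op: BinOp, a: Lane, b: Lane, c: Lane) -> Vec2:
    # Packed form: v_pk_op(reg_seq(a, constant), reg_seq(b, 0)).
    # Assumes op(c, 0) == c, i.e. 0 is a right identity for op.
    return pk_op(op, build_vector(a, c), build_vector(b, 0))


def split_two_ops(op: BinOp, a: Lane, b: Lane, c: Lane, d: Lane) -> Vec2:
    # Split form: build_vector((op a, b), (op c, d))
    return build_vector(op(a, b), op(c, d))


def packed_two_ops(op: BinOp, a: Lane, b: Lane, c: Lane, d: Lane) -> Vec2:
    # Packed form: v_pk_op(reg_seq(a, c), reg_seq(b, d))
    return pk_op(op, build_vector(a, c), build_vector(b, d))
```

In this model both packed forms produce the same lanes as their split forms for `add`, which is the equivalence the proposed folds would rely on.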
I will do this in another patch.

https://github.com/llvm/llvm-project/pull/171762