[llvm] [AMDGPU][True16][Codegen] remove packed build_vector pattern from true16 (PR #148715)

Thu Jul 17 12:55:12 PDT 2025

================
@@ -77,11 +77,20 @@ define i32 @divergent_vec_0_i16(i16 %a) {
 ; GFX906-NEXT:    v_lshlrev_b32_e32 v0, 16, v0
 ; GFX906-NEXT:    s_setpc_b64 s[30:31]
 ;
----------------
Sisyph wrote:

To summarize my discussion on this with @broxigarchen,

- Fake16 exploits the fact that 16-bit values are always in the lo16 bits during isel, which is why it can select build_vector with 0 into lshlrev for these 16-bit call arguments. In true16 we should not exploit that; the better way to optimize is probably to change the calling convention to pack 16-bit values instead of leaving them unpacked.
- As he said above, we can do some optimization in the coalescer to remove the extra v_mov_b32.
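
For anyone skimming, here is a rough bit-level sketch of the first point. It is only a toy model, not LLVM code: the helper names are made up for illustration, and it assumes that in true16 a 16-bit value may sit in the high half of a 32-bit VGPR. Under that assumption, the shift only matches build_vector(0, a) when `a` is known to be in lo16.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def build_vector_v2i16(lo, hi):
    """Hypothetical model: pack two 16-bit elements into one 32-bit register image."""
    return ((hi & MASK16) << 16) | (lo & MASK16)


def lshlrev_b32(shift, src):
    """Hypothetical model of a 32-bit left shift (shift amount first, source second)."""
    return (src << shift) & MASK32


a = 0x1234
expected = build_vector_v2i16(0, a)  # element 0 is zero, element 1 is a

# Fake16-style assumption: a lives in lo16; the high half holds unrelated bits.
reg_with_a_in_lo = 0xBEEF0000 | a
shift_ok_when_a_in_lo = lshlrev_b32(16, reg_with_a_in_lo) == expected

# Assumed true16 situation: a lives in hi16; the low half holds unrelated bits.
reg_with_a_in_hi = (a << 16) | 0xBEEF
shift_ok_when_a_in_hi = lshlrev_b32(16, reg_with_a_in_hi) == expected

print(shift_ok_when_a_in_lo, shift_ok_when_a_in_hi)
```

In this model the shift discards whatever was in the high half, so it is only a valid stand-in for the build_vector when the operand is guaranteed to be in the low half.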

https://github.com/llvm/llvm-project/pull/148715