[llvm] [LoongArch] Try to widen shuffle mask (PR #136081)

Thu Apr 24 19:39:42 PDT 2025

================
@@ -6,7 +6,8 @@ define <32 x i8> @widen_shuffle_mask_v32i8_to_v16i16(<32 x i8> %a, <32 x i8> %b)
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    pcalau12i $a0, %pc_hi20(.LCPI0_0)
 ; CHECK-NEXT:    xvld $xr2, $a0, %pc_lo12(.LCPI0_0)
-; CHECK-NEXT:    xvshuf.b $xr0, $xr1, $xr0, $xr2
+; CHECK-NEXT:    xvshuf.h $xr2, $xr1, $xr0
+; CHECK-NEXT:    xvori.b $xr0, $xr2, 0
----------------
heiher wrote:

Thanks. Just to clarify: are you assuming `$xr0` holds the return value here? If so, would the following be equivalent?

```
; CHECK-NEXT:    pcalau12i $a0, %pc_hi20(.LCPI0_0)
; CHECK-NEXT:    xvld $xr2, $a0, %pc_lo12(.LCPI0_0)
; CHECK-NEXT:    xvshuf.h $xr0, $xr1, $xr0
```

Under the native ABI, 256-bit vector return values typically go through the stack, not `$xr` registers. If we're confident `xvori.b` isn't needed, it would be worth double-checking how this behaves under the native ABI's return conventions.

https://github.com/llvm/llvm-project/pull/136081