[llvm] [LoongArch] Fix fp_to_uint/fp_to_sint conversion errors for lasx (PR #137129)

Thu Apr 24 03:07:32 PDT 2025

================
@@ -31,9 +31,9 @@ define void @fptosi_v4f64_v4i32(ptr %res, ptr %in){
 ; CHECK-LABEL: fptosi_v4f64_v4i32:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    xvld $xr0, $a1, 0
+; CHECK-NEXT:    xvftintrz.l.d $xr0, $xr0
 ; CHECK-NEXT:    xvpermi.d $xr1, $xr0, 238
-; CHECK-NEXT:    xvfcvt.s.d $xr0, $xr1, $xr0
-; CHECK-NEXT:    xvftintrz.w.s $xr0, $xr0
+; CHECK-NEXT:    xvpickev.w $xr0, $xr1, $xr0
 ; CHECK-NEXT:    vst $vr0, $a0, 0
----------------
heiher wrote:

The previous implementation appears to convert `f64` to `f32` before performing the integer conversion, which loses precision at the `f64 -> f32` step whenever the value needs more than the 24 significand bits an `f32` provides.

```
d: float64
s: float32
w: int32

*a1: d0, d1, d2, d3 // (little-endian)
xr0: d3, d2, d1, d0 // xvld $xr0, $a1, 0
xr1: d3, d2, d3, d2 // xvpermi.d $xr1, $xr0, 238
xr0: s3, s2, s3, s2, s3, s2, s1, s0 // xvfcvt.s.d $xr0, $xr1, $xr0
                                       ^
                                       +-- Loss of precision at this step
xr0: w3, w2, w3, w2, w3, w2, w1, w0 // xvftintrz.w.s $xr0, $xr0
vr0: w3, w2, w1, w0 // vst $vr0, $a0, 0
*a0: w0, w1, w2, w3 // (little-endian)
```
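To make the precision loss concrete, here is a minimal Python sketch of the old sequence for a single element. It is only a scalar model, not the LASX instructions themselves: `to_f32`, `fptosi_via_f32`, and `fptosi_direct` are hypothetical helper names, and `struct` is used to round a Python float (an `f64`) to the nearest `f32`.

```python
import struct


def to_f32(x: float) -> float:
    """Round an f64 value to the nearest f32, then widen it back to f64."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fptosi_via_f32(d: float) -> int:
    # Models the old sequence: f64 -> f32 (rounds), then truncate toward zero.
    return int(to_f32(d))


def fptosi_direct(d: float) -> int:
    # Models a conversion that truncates the original f64 toward zero.
    return int(d)


if __name__ == "__main__":
    # Both inputs fit in i32 but need more than 24 significand bits.
    for d in (16777217.0, 123456789.0):
        print(d, fptosi_via_f32(d), fptosi_direct(d))
```

Both sample inputs are well inside the `i32` range, so the mismatch comes purely from the intermediate `f32` rounding.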

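In the updated version, it seems the `f64` values are first converted to `i64` and then truncated to their low 32 bits (the `xvpickev.w` step). This can produce incorrect results when the original `f64` value falls outside the range representable by a signed `i32`, because the truncation silently wraps instead of handling the overflow.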
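A small Python sketch of that concern, for one element. The helper names are hypothetical, and `clamp_to_i32` is only an assumed reference behavior for comparison (a result clamped to the `i32` range); this message does not state what the hardware returns for out-of-range inputs. The model also assumes the input fits in `i64`.

```python
I32_MIN, I32_MAX = -2**31, 2**31 - 1


def wrap_to_i32(v: int) -> int:
    """Keep the low 32 bits of v and reinterpret them as a signed i32."""
    v &= 0xFFFFFFFF
    return v - 2**32 if v >= 2**31 else v


def via_i64_then_truncate(d: float) -> int:
    # Models f64 -> i64 (truncate toward zero), then keeping the low word.
    return wrap_to_i32(int(d))


def clamp_to_i32(d: float) -> int:
    # Assumed reference only: truncate toward zero, then clamp to i32.
    return max(I32_MIN, min(I32_MAX, int(d)))


if __name__ == "__main__":
    for d in (100.7, -5.9, 3e9, -3e9):
        print(d, via_i64_then_truncate(d), clamp_to_i32(d))
```

For in-range inputs the two paths agree; for an input such as `3e9` the truncating path wraps around to a negative number.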

Would it make sense to go with a direct `f64 -> i32` conversion instead?

```
*a1: d0, d1, d2, d3 // (little-endian)
xr0: d3, d2, d1, d0 // xvld $xr0, $a1, 0
xr1: d3, d2, d3, d2 // xvpermi.d $xr1, $xr0, 238
xr0: w3, w2, w3, w2, w3, w2, w1, w0 // xvftintrz.w.d $xr0, $xr1, $xr0
vr0: w3, w2, w1, w0 // vst $vr0, $a0, 0
*a0: w0, w1, w2, w3 // (little-endian)
```
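For readers checking the `xr1` rows in the lane diagrams, the shuffle can be modeled in a few lines of Python. This is a sketch under the assumption that `xvpermi.d` selects each 64-bit result element with a 2-bit field of the immediate (element `i` uses bits `2*i+1:2*i`); `xvpermi_d` is a hypothetical helper, and lists are indexed from the least-significant element.

```python
def xvpermi_d(src, imm):
    """Model of a 4 x 64-bit permute: result[i] = src[(imm >> 2*i) & 3]."""
    assert len(src) == 4
    return [src[(imm >> (2 * i)) & 3] for i in range(4)]


if __name__ == "__main__":
    xr0 = ["d0", "d1", "d2", "d3"]  # index 0 is the least-significant element
    xr1 = xvpermi_d(xr0, 238)       # 238 == 0b11_10_11_10
    # Print most-significant element first, matching the diagram's order.
    print("xr1:", ", ".join(reversed(xr1)))
```

Under that assumption, immediate 238 copies the upper two doubles into both halves of `xr1`, which matches the `d3, d2, d3, d2` row.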

https://github.com/llvm/llvm-project/pull/137129