[llvm] [LoongArch] Fix fp_to_uint/fp_to_sint conversion errors for lasx (PR #137129)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 24 03:07:32 PDT 2025
================
@@ -31,9 +31,9 @@ define void @fptosi_v4f64_v4i32(ptr %res, ptr %in){
; CHECK-LABEL: fptosi_v4f64_v4i32:
; CHECK: # %bb.0:
; CHECK-NEXT: xvld $xr0, $a1, 0
+; CHECK-NEXT: xvftintrz.l.d $xr0, $xr0
; CHECK-NEXT: xvpermi.d $xr1, $xr0, 238
-; CHECK-NEXT: xvfcvt.s.d $xr0, $xr1, $xr0
-; CHECK-NEXT: xvftintrz.w.s $xr0, $xr0
+; CHECK-NEXT: xvpickev.w $xr0, $xr1, $xr0
; CHECK-NEXT: vst $vr0, $a0, 0
----------------
heiher wrote:
The previous implementation appears to convert `f64` to `f32` before performing the integer conversion, which loses precision at the `f64 -> f32` step.
```
d: float64
s: float32
w: int32
*a1: d0, d1, d2, d3 // (little-endian)
xr0: d3, d2, d1, d0 // xvld $xr0, $a1, 0
xr1: d3, d2, d3, d2 // xvpermi.d $xr1, $xr0, 238
xr0: s3, s2, s3, s2, s3, s2, s1, s0 // xvfcvt.s.d $xr0, $xr1, $xr0
                                       ^
                                       +-- Loss of precision at this step
xr0: w3, w2, w3, w2, w3, w2, w1, w0 // xvftintrz.w.s $xr0, $xr0
vr0: w3, w2, w1, w0 // vst $vr0, $a0, 0
*a0: w0, w1, w2, w3 // (little-endian)
```
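A small Python sketch of the precision concern for a single element. It uses `struct` to round a double to the nearest `f32`, and Python's `int()` stands in for the round-toward-zero conversion. The helper names are illustrative only; this models the arithmetic, not the LASX instructions themselves.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest f32, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def trunc_via_f32(x: float) -> int:
    """Old path: f64 -> f32 -> integer (round toward zero)."""
    return int(to_f32(x))


def trunc_direct(x: float) -> int:
    """Reference: f64 -> integer (round toward zero), no intermediate f32."""
    return int(x)


# Both inputs are exactly representable in f64 but not in f32.
for x in (16777217.0, 0.99999999999):
    print(x, trunc_via_f32(x), trunc_direct(x))
```

The second input is the more worrying case: rounding to `f32` moves it up to `1.0`, so the truncated integer changes from 0 to 1 even though the value is nowhere near the `i32` limits.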
In the updated version, it seems the `f64` values are first converted to `i64` (`xvftintrz.l.d`) and then truncated to `u32` (`xvpickev.w`). This can produce incorrect results when the original `f64` values fall outside the range representable by (signed) `i32`.
Would it make sense to go with a direct `f64 -> i32` conversion instead?
```
*a1: d0, d1, d2, d3 // (little-endian)
xr0: d3, d2, d1, d0 // xvld $xr0, $a1, 0
xr1: d3, d2, d3, d2 // xvpermi.d $xr1, $xr0, 238
xr0: w3, w2, w3, w2, w3, w2, w1, w0 // xvftintrz.w.d $xr0, $xr1, $xr0
vr0: w3, w2, w1, w0 // vst $vr0, $a0, 0
*a0: w0, w1, w2, w3 // (little-endian)
```
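To make the range concern concrete, here is a rough Python model of the two paths for a single element. It assumes the direct conversion clamps out-of-range inputs to the `i32` limits; that clamping is an assumption made for illustration, not a statement about what the hardware guarantees (and in LLVM IR an out-of-range `fptosi` result is poison anyway). The function names are hypothetical.

```python
I32_MIN = -(1 << 31)
I32_MAX = (1 << 31) - 1


def wrap_to_i32(v: int) -> int:
    """Keep the low 32 bits of v and reinterpret them as a signed i32."""
    low = v & 0xFFFFFFFF
    return low - (1 << 32) if low >= (1 << 31) else low


def f64_to_i32_via_i64(x: float) -> int:
    """Updated path: f64 -> i64 (round toward zero), then keep the low word.

    Assumes x already fits in i64, so only the final truncation is modelled.
    """
    return wrap_to_i32(int(x))


def f64_to_i32_clamped(x: float) -> int:
    """Direct f64 -> i32 model. ASSUMPTION: out-of-range inputs are clamped."""
    return max(I32_MIN, min(I32_MAX, int(x)))


for x in (-5.7, 3000000000.0, 4294967297.0):
    print(x, f64_to_i32_via_i64(x), f64_to_i32_clamped(x))
```

For in-range inputs such as `-5.7` the two paths agree. For `3000000000.0` the truncating path wraps to a negative number, and for `4294967297.0` it yields `1`, whereas the clamped model returns `I32_MAX` in both cases.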
https://github.com/llvm/llvm-project/pull/137129