[llvm-dev] The semantics of the fptrunc instruction with an example of incorrect optimisation

Ahmed Bougacha via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 21 13:25:14 PDT 2015


On Fri, Aug 21, 2015 at 12:36 PM, Dan Liew via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> I've recently been looking at how to implement in LLVM IR the rounding
> of floating point values when casting using different rounding modes
> and I've hit some problems.
>
> It seems that when casting down floats to less precise types the
> ``fptrunc`` LLVM IR instruction is used. The LLVM language reference
> suggests that it just truncates the value (which would be equivalent
> to rounding towards zero) but this seems to be very misleading because
> on the target I'm using (x86_64) that **is not** what happens.
>
> Consider the following example in C
>
> ```
> #include <stdio.h>
> #include <fenv.h>
> int main() {
>     double x = 0.3;
>     fesetround(FE_TONEAREST);
>     float y = (float) x;
>     printf("y (nearest):%a\n", y);
>     fesetround(FE_UPWARD);
>     y = (float) x;
>     printf("y (upward):%a\n", y);
>     fesetround(FE_DOWNWARD);
>     y = (float) x;
>     printf("y (downward):%a\n", y);
>     return (int) y;
> }
> ```


This sounds like https://llvm.org/bugs/show_bug.cgi?id=8100 : complete
support for FP rounding and exceptions (via `#pragma STDC FENV_ACCESS
ON`, which you need for fesetround to be "meaningful") isn't
implemented yet (and is probably a huge task, as you explain).

-Ahmed

>
> If I get the unoptimised LLVM IR for this by running ``clang -O0
> float.c -emit-llvm -c -o float.clang.o0.bc`` I can see that the cast
> of variable x is being handled using LLVM IR's ``fptrunc``
>
> ```
> ...
>   store double 3.000000e-01, double* %x, align 8
>   %call = call i32 @fesetround(i32 0) #3
>   %0 = load double, double* %x, align 8
>   %conv = fptrunc double %0 to float
> ....
> ```
>
> If I look at the codegened assembly I see that the ``cvtsd2ss`` x86
> instruction is used (how rounding is done is controlled by the MXCSR
> register apparently).  So this instruction might not "truncate"
> depending on how MXCSR is set.
>
> If I run the program
> ```
> $ clang -O0 float.c -lm -o float.clang.o0
> $ ./float.clang.o0
> y (nearest):0x1.333334p-2
> y (upward):0x1.333334p-2
> y (downward):0x1.333332p-2
> ```
>
> I can see that the last cast gives a different result because the
> rounding mode has been changed as expected.
>
> Now let's see what clang does when we ask it to optimize.
>
> ```
> $ ./float.clang.o3
> y (nearest):0x1.333334p-2
> y (upward):0x1.333334p-2
> y (downward):0x1.333334p-2
> ```
>
> The result of the last cast is wrong (note gcc at -O3 also seems to do
> this) and looking at the optimized LLVM IR reveals why
>
> ```
> define i32 @main() #0 {
> entry:
>   %call = tail call i32 @fesetround(i32 0) #2
>   %call2 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds
> ([16 x i8], [16 x i8]* @.str, i64 0, i64 0), double
> 0x3FD3333340000000) #2
>   %call3 = tail call i32 @fesetround(i32 2048) #2
>   %call6 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds
> ([15 x i8], [15 x i8]* @.str.1, i64 0, i64 0), double
> 0x3FD3333340000000) #2
>   %call7 = tail call i32 @fesetround(i32 1024) #2
>   %call10 = tail call i32 (i8*, ...) @printf(i8* getelementptr
> inbounds ([17 x i8], [17 x i8]* @.str.2, i64 0, i64 0), double
> 0x3FD3333340000000) #2
>   ret i32 0
> }
> ```
>
> the cast of a constant has been constant folded incorrectly (I guess
> that clang is assuming a particular rounding mode which in this case
> is sometimes the wrong rounding mode).
>
> I'm not sure if there's a good way to fix this. First I thought it
> would be better if the rounding mode was an operand to ``fptrunc``
> (which would make constant folding correct) but then I realized that
> for codegen to be always correct, every time a ``fptrunc`` is about to
> be executed the rounding mode might need to be reset, which most of the
> time would be a very wasteful thing to do.
>
> In general it's not (at least in C) always possible to know statically
> what the rounding mode is going to be at any point during the
> program because it's part of the currently executing thread's state.
>
> On the other hand LLVM IR isn't supposed to be tied to C so I feel
> like there ought to be a way to specify how certain floating point
> operations do rounding. (I think these rounding issues apply to more
> than just ``fptrunc``)
>
> Any thoughts on this? At the very least the LLVM IR documentation
> needs to be more specific about how rounding is done.
>
>
> Thanks,
> Dan.