[llvm-dev] [RFC] FP Environment and Rounding mode handling in LLVM

Fri Feb 5 03:18:38 PST 2016

Chandler Carruth via llvm-dev wrote on Fri, 05 Feb 2016:

> If "return" is provided for the exception behavior, then the i1 component
> of the result is true if an FP exception occurred and false otherwise. If
> "ignore" is provided then any FP exceptions are ignored and the i1 is
> always false. If "trap" is provided then the i1 is always false, but the
> call to the intrinsic might trap. We could either define a trap as
> precisely the same as a call to @llvm.trap(), or we could introduce an
> @llvm.fp.trap() and define it as a call to that.

Our run time library installs signal handlers/exception filters to  
catch FPU exceptions. Can that be modeled in this way too?
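
For comparison, here is a rough sketch of what the "return" behaviour quoted above amounts to when written in portable C++: the operation is followed by a poll of the sticky exception flags. The names CheckedResult and checked_div are made up for illustration and are not part of LLVM or of any run time library. A signal handler installed by a run time library works differently, since control leaves the computation instead of a flag being returned.

```cpp
#include <cfenv>

// Hypothetical result type: the value plus a flag mirroring the proposed i1.
struct CheckedResult {
    double value;
    bool raised;
};

// Hypothetical helper: divide and report whether an FP exception was flagged.
CheckedResult checked_div(double a, double b) {
    // volatile keeps the compiler from evaluating the division at compile time.
    volatile double x = a, y = b;
    // Note: this discards any flags that were already set by earlier code.
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double q = x / y;
    // FE_INEXACT is deliberately left out; almost every operation raises it.
    const bool raised =
        std::fetestexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW | FE_UNDERFLOW) != 0;
    return {q, raised};
}
```

With this sketch, checked_div(1.0, 0.0) reports raised == true, while checked_div(1.0, 2.0) reports false.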

> The frontend would then be responsible for lowering floating point
> arithmetic using these intrinsics. This may be somewhat challenging because
> in the frontend behavior is controlled dynamically in some languages. In
> those situations, we can either allow these intrinsics to accept
> non-constant arguments for %rounding_mode and %exception_behavior so that
> frontends can emit code that just dynamically computes them, or we could
> follow the same model that atomics use, and if the frontend cannot
> trivially compute a constant, it can emit a switch over the possible states
> with a specific intrinsic call in each case. I don't have strong opinions
> about which would be best, I think either could be made to work.

Our run time library provides calls to dynamically change the  
rounding mode of the FPU, and to dynamically mask individual floating  
point exceptions. With our current (non-llvm) code generators, we  
simply emit regular FPU instructions and depending on those settings,  
they always do "the right thing". It's true that we cannot perform a  
number of optimisations because of this, but on the other hand there  
is no overhead at run time for any kind of checks.
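
A minimal C++ sketch of that model, using the standard <cfenv> calls as a stand-in for our run time library routines (div_with_rounding is a hypothetical helper): the division itself is an ordinary instruction, and its result depends only on the mode the FPU is in at that moment. The volatile qualifiers are a workaround for this sketch; a real compiler would need proper support, such as GCC's -frounding-math.

```cpp
#include <cfenv>

// Hypothetical helper (not part of any RTL or of LLVM): divide a by b under
// the given rounding mode, then restore the mode that was active before.
double div_with_rounding(double a, double b, int mode) {
    // volatile keeps the compiler from folding the division at compile time
    // or moving it across the fesetround calls.
    volatile double x = a, y = b;
    const int saved = std::fegetround();
    std::fesetround(mode);
    volatile double q = x / y;  // an ordinary divide; the FPU applies the mode
    std::fesetround(saved);
    return q;
}
```

For example, div_with_rounding(1.0, 3.0, FE_UPWARD) is strictly greater than div_with_rounding(1.0, 3.0, FE_DOWNWARD), because 1/3 is not exactly representable and the same divide instruction rounds in opposite directions.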

If I understood your proposal above correctly, you propose that for  
LLVM this would be implemented by our frontend emitting a bunch of  
checking code for each (sequence of) FPU instructions to determine the  
current FPU exception mask and rounding mode? That seems rather heavy,  
even if LLVM can optimise away a bunch of those calls if they're  
annotated correctly as not changing any state themselves.
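
To illustrate the kind of checking code I mean, here is a C++ sketch of the "switch" variant for a single addition. add_const_mode is a hypothetical stand-in for an intrinsic call whose rounding mode is a compile-time constant; it is not an actual LLVM API. Every FP operation (or sequence of operations) would need a dispatch of this shape.

```cpp
#include <cfenv>

// Hypothetical stand-in for an FP add whose rounding mode is a constant.
// The dispatcher below only calls it when the FPU is already in mode Mode,
// so the body is a plain add.
template <int Mode>
double add_const_mode(double a, double b) {
    volatile double x = a, y = b;  // keep the add from being folded
    return x + y;
}

// Query the current rounding mode and pick the matching constant-mode variant.
double add_dynamic_mode(double a, double b) {
    switch (std::fegetround()) {
    case FE_DOWNWARD:
        return add_const_mode<FE_DOWNWARD>(a, b);
    case FE_UPWARD:
        return add_const_mode<FE_UPWARD>(a, b);
    case FE_TOWARDZERO:
        return add_const_mode<FE_TOWARDZERO>(a, b);
    default:  // FE_TONEAREST, or any mode this sketch does not know about
        return add_const_mode<FE_TONEAREST>(a, b);
    }
}
```

Compared with emitting the add directly, this costs one mode query and one branch per dispatch, plus four copies of the code.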

> When emitting constants and trying to respect floating point environment
> settings, frontends will have to emit runtime calls instead of actual
> constants. But this seems actually good because that is what we'll need
> anyways -- we aren't able to with full generality emulate all the
> environment options if I understand things correctly (and let me know if
> I've misunderstood).

You indeed can't, but I don't understand how calling these run time  
functions will help:
1) at compile time, you still can't do anything about it, unless you  
want to generate umpteen different versions of the FPU code that are  
then selected at run time depending on which results those functions  
returned (like with your "switch" proposal above, but I think that  
would completely kill performance in many cases -- atomics are used  
sparingly and are slow by definition; that's not true for floating  
point code)
2) at run time, you get the extra overhead of the extra function calls  
everywhere

I wonder whether this won't result in enormous code bloat, and under  
which circumstances this would result in better performance than  
simply an option whereby the frontend instructs LLVM to
1) assume that all FPU instructions may trap and may use any rounding mode
2) emit regular FPU opcodes without the need for any extra calls etc.

At least such an option would seem desirable for our language.

A similar option telling LLVM to stop assuming that null-pointer  
dereferences and integer divisions by zero are undefined would be even  
better. In our case they are not undefined: our non-LLVM code  
generators emit explicit checks for them only if the hardware/OS does  
not support exceptions for them.

Jonas