[llvm-dev] [RFC] FP Environment and Rounding mode handling in LLVM

Thu Feb 4 18:05:38 PST 2016

First, thanks Mehdi for putting something on llvm-dev and getting wider
awareness of this.

I am actually really interested in finding a way for LLVM to support the
interesting functionality we are missing from fenv-like interfaces. Things
like rounding modes, exceptions, etc. However, I think the current design
is going to be a really high burden for the entire optimizer and I think
there is a simpler model that we might pursue instead.

I'll start off with some underlying principles that I'm operating from:
a) Most code in the world will be very happy with the default floating
point environment, doesn't need to carefully model floating point
exceptions, etc. Essentially, I think that LLVM's behavior today is
probably right for most code. Now, the code which needs support for the
other features of floating point isn't bad or unimportant! But it is
relatively speaking rare, and so I think it is reasonable to optimize the
*representation* model for the common case provided we don't lose support
for functionality.

b) When outside the default floating point environment's rules, there are
few if any optimizations that we realistically expect from LLVM. Certainly,
any changes to the LLVM optimizer which impact code outside the default
need to be made *much* more carefully to avoid introducing subtle bugs.

OK, based on that, consider the following model:
We provide intrinsics that mirror the instructions 'fadd', 'fsub', 'fmul',
'fdiv', and 'frem' (so 5 total). From here on out, I'll exclusively use
'fadd' as my examples. The intrinsics would look like:
  declare {float, i1} @llvm.fadd.with.environment.f32(
      float %lhs, float %rhs, i8 %rounding_mode, i8 %exception_behavior)

Then we define specific values to be used for the IEEE rounding modes. And
we define values to control exception behavior. I'm not an expert on
floating point exceptions in particular (my platforms don't use them) but
I'm imagining three states: "ignore", "return", and "trap". I've used a
single 'i1' in the result, but I'm assuming it would need to be several i1s
or an iN in order to model the set of FP exceptions. I'm using i1 here just
to simplify the explanation; I think it generalizes and I'll let the experts
suggest the exact formulation.

If the default rounding mode is provided to these intrinsics and the
"ignore" exception behavior is provided, they behave exactly as the
existing instructions do, and instcombine should canonicalize to the
existing instructions.

The semantics of non-default rounding modes are to perform the operation
with that rounding mode.

If "return" is provided for the exception behavior, then the i1 component
of the result is true if an FP exception occurred and false otherwise. If
"ignore" is provided then any FP exceptions are ignored and the i1 is
always false. If "trap" is provided then the i1 is always false, but the
call to the intrinsic might trap. We could either define a trap as
precisely the same as a call to @llvm.trap(), or we could introduce an
@llvm.fp.trap() and define it as a call to that.
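To make the semantics above a bit more concrete, here is a rough C++
emulation of what one call to the fadd intrinsic might compute. Everything
named in it (Rounding, OnException, FAddResult, faddWithEnvironment) is made
up for illustration and is not part of the proposal. Two further assumptions:
the inexact exception is left out of the flag so that ordinary rounding does
not count as "an FP exception occurred", and std::abort() merely stands in
for a trap.

```cpp
#include <cassert>
#include <cfenv>
#include <cstdlib>

// Hypothetical encodings; the proposal does not fix concrete values.
enum class Rounding { NearestEven, TowardZero, Upward, Downward };
enum class OnException { Ignore, Return, Trap };

// Mirrors the {float, i1} aggregate returned by the sketched intrinsic.
struct FAddResult {
  float value;
  bool raised;
};

static int toFenv(Rounding rm) {
  switch (rm) {
    case Rounding::TowardZero:  return FE_TOWARDZERO;
    case Rounding::Upward:      return FE_UPWARD;
    case Rounding::Downward:    return FE_DOWNWARD;
    case Rounding::NearestEven: break;
  }
  return FE_TONEAREST;
}

FAddResult faddWithEnvironment(float lhs, float rhs, Rounding rm,
                               OnException eb) {
  // volatile keeps the compiler from folding the add or moving it across
  // the fenv calls.
  volatile float a = lhs, b = rhs, sum;

  std::fenv_t saved;
  int rc = std::feholdexcept(&saved);  // save caller's environment, clear flags
  assert(rc == 0);
  (void)rc;
  std::fesetround(toFenv(rm));
  sum = a + b;
  // Assumption: inexact is deliberately not part of the tested set.
  bool raised = std::fetestexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW |
                                  FE_UNDERFLOW) != 0;
  std::fesetenv(&saved);               // restore caller's rounding mode and flags

  if (eb == OnException::Trap && raised)
    std::abort();                      // stand-in for a trap
  return {sum, eb == OnException::Return && raised};
}
```

With "ignore" and round-to-nearest this collapses to a plain float add, which
is the case the text says should canonicalize to the existing instruction.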

The frontend would then be responsible for lowering floating point
arithmetic using these intrinsics. This may be somewhat challenging because
in some languages this behavior is controlled dynamically. In those
situations, we can either allow these intrinsics to accept non-constant
arguments for %rounding_mode and %exception_behavior so that frontends can
emit code that just dynamically computes them, or we could follow the same
model that atomics use: if the frontend cannot trivially compute a
constant, it can emit a switch over the possible states with a specific
intrinsic call in each case. I don't have strong opinions about which would
be best; I think either could be made to work.
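The switch-based option can be sketched in C++ as below, with a template
argument playing the role of the constant %rounding_mode. The names
faddConstMode and faddDynamicMode are invented for this sketch, and the
<cfenv> calls only emulate what a constant-mode intrinsic call would do.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical stand-in for an intrinsic call whose rounding mode is a
// compile-time constant (here a template argument).
template <int Mode>
float faddConstMode(float lhs, float rhs) {
  volatile float a = lhs, b = rhs, sum;  // keep the add between the fenv calls
  const int saved = std::fegetround();
  int rc = std::fesetround(Mode);
  assert(rc == 0);
  (void)rc;
  sum = a + b;
  std::fesetround(saved);                // put the caller's mode back
  return sum;
}

// What a frontend could emit when the mode is only known at run time: a
// switch over the possible states with one constant-mode call per case.
float faddDynamicMode(int mode, float lhs, float rhs) {
  switch (mode) {
    case FE_TOWARDZERO: return faddConstMode<FE_TOWARDZERO>(lhs, rhs);
    case FE_UPWARD:     return faddConstMode<FE_UPWARD>(lhs, rhs);
    case FE_DOWNWARD:   return faddConstMode<FE_DOWNWARD>(lhs, rhs);
    default:            return faddConstMode<FE_TONEAREST>(lhs, rhs);
  }
}
```

Each case then contains an operation with fully constant arguments, which is
the property the constant-only formulation would rely on.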

If we go with constant arguments being required, we could use "metadata
arguments" which aren't actually metadata but just encoded arguments for
intrinsics.

When emitting constants and trying to respect floating point environment
settings, frontends will have to emit runtime calls instead of actual
constants. But this actually seems good because that is what we'll need
anyway -- we aren't able to emulate all the environment options with full
generality, if I understand things correctly (and let me know if I've
misunderstood).

The two really big reasons why I like this model much more than extending
flags are:

1) This avoids implicit state. The implicit state of the floating point
environment makes things like code motion extremely hard to reason about. I
think we will just get it wrong too often to make this a good approach. By
modeling all of this as actual SSA values I think there is a much better
chance we'll get this stuff right. For example, fetestexcept could be
implemented by or-ing all the i1s for floating point exceptions and testing
the result. Then the backend can spill the state when necessary and reload
it when needed even if other floating point math is introduced. I admit that
first class aggregate returns aren't a beautiful way to encapsulate this,
but they are an *effective* way that we know how to work with in the LLVM IR.
If we ever come up with a better multi-def model, we can always switch
these and all the other intrinsics which need this to that model.
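As a rough illustration of the or-ing idea, the C++ sketch below has every
add hand back its exception bit as an ordinary value, and the caller combines
those bits explicitly. Checked, checkedAdd, and sumWithFlag are hypothetical
names, and tracking only overflow and invalid is an assumption made to keep
the example short.

```cpp
#include <cassert>
#include <cfenv>
#include <cstddef>

// Hypothetical stand-in for one intrinsic call in "return" mode: the result
// plus a bit saying whether this particular add raised an exception.
struct Checked {
  float value;
  bool raised;
};

Checked checkedAdd(float lhs, float rhs) {
  volatile float a = lhs, b = rhs, sum;  // keep the add between the fenv calls
  std::fenv_t saved;
  int rc = std::feholdexcept(&saved);    // start from clean flags
  assert(rc == 0);
  (void)rc;
  sum = a + b;
  // Assumption: only overflow and invalid are tracked in this sketch.
  bool raised = std::fetestexcept(FE_OVERFLOW | FE_INVALID) != 0;
  std::fesetenv(&saved);                 // leave the real environment untouched
  return {sum, raised};
}

// Sums xs[0..n) and ORs the per-operation bits into one explicit value.
// Testing that value afterwards plays the role of fetestexcept().
Checked sumWithFlag(const float *xs, std::size_t n) {
  Checked acc{0.0f, false};
  for (std::size_t i = 0; i < n; ++i) {
    Checked step = checkedAdd(acc.value, xs[i]);
    acc.value = step.value;
    acc.raised = acc.raised || step.raised;
  }
  return acc;
}
```

Because the accumulated bit is just a value, unrelated floating point math
placed between the adds cannot clobber it.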

2) Every pass will conservatively correctly model the operations. This is
most significant when modeling trapping on exceptions. We need every pass
to realize that control flow might not proceed past such operations. We
already have this logic for calls, and it seems a really nice fit for
allowing most of the optimizer to be unaware of these constructs while
respecting them and preserving behavior in the face of them.

I suspect that there are things this model doesn't handle that I've not
thought of (as this is outside the area of FP that I'm deeply familiar
with), but I really think this model would be easier to reason about and
would be much less invasive within the IR and optimizer. I wonder if folks
think this could work and would be up for moving their efforts in this
direction?

-Chandler

On Wed, Feb 3, 2016 at 3:04 PM Mehdi Amini <mehdi.amini at apple.com> wrote:

> Hi everyone,
>
> Sergey (CC’ed) worked on a series of patches to add support for
> floating-point environment and floating-point rounding modes in LLVM.
> This started *in 2014* and the patches, after multiple rounds of review in
> the last months (involving amongst others Steve Canon, Hal Finkel, David
> Majnemer, and myself), are getting very close (IMO) to being in a state
> where we can land them.
>
> This is the thread that started this development: "[LLVMdev] More careful
> treatment of floating point exceptions"
> http://marc.info/?l=llvm-dev&m=141113983302113&w=2
> And this is the thread where most of the discussion on the design
> occurred: "[PATCH] Flag to enable IEEE-754 friendly FP optimizations"
> http://marc.info/?l=llvm-commits&m=141235814915999&w=2
>
> Since Chandler raised some concerns on IRC today, I figured I should
> send a heads-up on this topic to allow anyone to comment on the current
> plan.
>
> We plan on adding two new FP env flags to the existing FMF (fast-math
> flags). Without these flags set, the optimizer has to assume that the FP
> env can be observed, or the rounding mode can be changed. For clang, these
> flags would be set unless a command line option requires preserving
> the FP env.
>
> Here is the list of patches:
>
> [FPEnv Core 01/14] Add flags and command-line switches for FPEnv:
> http://reviews.llvm.org/D14066
> [FPEnv Core 02/14] Add FPEnv access flags to fast-math flags:
> http://reviews.llvm.org/D14067
> [FPEnv Core 03/14] Make SelectionDAG aware of FPEnv flags:
> http://reviews.llvm.org/D14068
> [FPEnv Core 04/14] Skip constant folding to preserve FPEnv:
> http://reviews.llvm.org/D14069
> [FPEnv Core 05/14] Teach IR builder and folders about new flags:
> http://reviews.llvm.org/D14070
> [FPEnv Core 06/14] Do not fold constants on reading in IR asm/bitcode:
> http://reviews.llvm.org/D14071
> [FPEnv Core 07/14] Prevent undesired folding by InstSimplify:
> http://reviews.llvm.org/D14072
> [FPEnv Core 08/14] Do not simplify expressions with FPEnv access:
> http://reviews.llvm.org/D14073
> [FPEnv Core 09/14] Make Strict flag available for more clients:
> http://reviews.llvm.org/D14074
> [FPEnv Core 10/14] Use Strict in IRBuilder: http://reviews.llvm.org/D14075
> [FPEnv Core 11/14] Don't convert fpops to constexprs in SCCP:
> http://reviews.llvm.org/D14076
> [FPEnv Core 13/14] Don't hoist FP-ops with side-effects in LICM:
> http://reviews.llvm.org/D14078
> [FPEnv Core 14/14] Introduce F*_W_CHAIN instrs to prevent reordering:
> http://reviews.llvm.org/D14079
>
>
> —
> Mehdi
>
>