[LLVMdev] [RFC] Extend LLVM IR to express "fast-math" at a per-instruction level

Mon Nov 12 17:42:16 PST 2012

On Nov 12, 2012, at 10:39 AM, Joe Abbey <jabbey at arxan.com> wrote:

> Michael,
> 
> Since you won't be using metadata to store this information and are augmenting the IR, I'd recommend incrementing the bitcode version number.  The current version is stored in a local variable in BitcodeWriter.cpp:1814*
> 
> I would suspect then you'll also need to provide additional logic for reading:
> 
>       switch (module_version) {
>         default: return Error("Unknown bitstream version!");
>         case 2:
>           EncodesFastMathIR = true;
>           // Intentional fall-through: version 2 also uses relative IDs.
>         case 1:
>           UseRelativeIDs = true;
>           break;
>         case 0:
>           UseRelativeIDs = false;
>           break;
>       }

Couldn't this be handled by adding an extra operand to the binary operators?

-Chris
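
One way to read the suggestion above, as a minimal sketch rather than LLVM's actual reader: a binary-operator record keeps its existing operands and may carry one extra trailing operand holding the flag bits. A reader that sees the shorter record treats the flags as zero, so older bitcode stays readable without a version bump. The record layout and the names `BinOpRecord`, `decodeBinOp` and `decodeExample` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical decoded form of a binary-operator record.
struct BinOpRecord {
  uint64_t LHS = 0;
  uint64_t RHS = 0;
  uint64_t Opcode = 0;
  uint64_t Flags = 0; // stays 0 when the record has no flags operand
};

// Assumed layout: [lhs, rhs, opcode] from an older writer, or
// [lhs, rhs, opcode, flags] from a newer one.
// Returns false for a record with any other length.
inline bool decodeBinOp(const std::vector<uint64_t> &Record, BinOpRecord &Out) {
  if (Record.size() != 3 && Record.size() != 4)
    return false;
  Out.LHS = Record[0];
  Out.RHS = Record[1];
  Out.Opcode = Record[2];
  Out.Flags = Record.size() == 4 ? Record[3] : 0;
  return true;
}

// Usage: an older record decodes with no flags, a newer one keeps them.
inline void decodeExample() {
  BinOpRecord R;
  bool OldOK = decodeBinOp({1, 2, 0}, R);
  assert(OldOK && R.Flags == 0);
  bool NewOK = decodeBinOp({1, 2, 0, 5}, R);
  assert(NewOK && R.Flags == 5);
  (void)OldOK;
  (void)NewOK;
}
```

The trade-off in this sketch is that the reader infers the format from the record length instead of from a module-level version number.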

> 
> Joe
> 
> (*TODO: Put this somewhere else).
> 
> On Nov 9, 2012, at 5:34 PM, Michael Ilseman <milseman at apple.com> wrote:
> 
>> Revision 2
>> 
>> Revision 2 changes:
>> * Add in separate Reciprocal flag
>> * Clarified wording of flags, specified undefined values, not behavior
>> * Removed some confusing language
>> * Mentioned optimizations/analyses adding in flags due to inferred knowledge
>> 
>> Revision 1 changes:
>> * Removed Fusion flag from all sections
>> * Clarified and changed descriptions of remaining flags:
>>   * Make 'N' and 'I' flags be explicitly concerning values of operands, and
>>     producing undef values if a NaN/Inf is provided.
>>   * 'S' is now only about distinguishing between +/-0.
>>   * LangRef changes updated to reflect flags changes
>>   * Updated Question section given the now simpler set of flags
>>   * Optimizations changed to reflect 'N' and 'I' describing operands and not
>>     results
>> * Be explicit on what LLVM's default behavior is (no signaling NaNs, etc)
>> * Mention that this could be alternatively solved with metadata, and open the
>>   debate
>> 
>> 
>> Introduction
>> ---
>> 
>> LLVM IR currently does not have any support for specifying fine-grained control
>> over relaxing floating point requirements for the optimizer. The below is a
>> proposal to extend floating point IR instructions to support a number of flags
>> that a creator of IR can use to allow for greater optimizations when
>> desired. Such changes are sometimes referred to as fast-math, but this proposal
>> is about finer-grained specifications at a per-instruction level.
>> 
>> 
>> What this doesn't address
>> ---
>> 
>> Default behavior is retained, and this proposal is only addressing relaxing
>> restrictions. LLVM currently by default:
>> - ignores signaling NaNs
>> - assumes default rounding mode
>> - assumes FENV_ACCESS is off
>> 
>> Discussion on changing the default behavior of LLVM or allowing for more
>> restrictive behavior is outside the scope of this proposal. This proposal does
>> not address behavior of denormals, which is more of a backend concern.
>> 
>> Specifying exact precision control or requirements is outside the scope of this
>> proposal, and can probably be handled with the existing metadata implementation.
>> 
>> This proposal covers changes to and optimizations over LLVM IR, and changes to
>> codegen are outside the scope of this proposal. The flags described in the next
>> section exist only at the IR level, and will not be propagated into codegen or
>> the SelectionDAG.
>> 
>> 
>> Flags
>> ---
>> 
>> LLVM IR instructions will have the following flags that can be set by the
>> creator of the IR.
>> 
>> no NaNs (N)
>> - Allow optimizations that assume the arguments and result are not NaN. Such
>>   optimizations are required to retain defined behavior over NaNs, but the
>>   value of the result is undefined.
>> 
>> no Infs (I)
>> - Allow optimizations that assume the arguments and result are not
>>   +/-Inf. Such optimizations are required to retain defined behavior over
>>   +/-Inf, but the value of the result is undefined.
>> 
>> no signed zeros (S)
>> - Allow optimizations to treat the sign of a zero argument or result as
>>   insignificant.
>> 
>> allow reciprocal (R)
>> - Allow optimizations to use the reciprocal of an argument instead of dividing.
>> 
>> unsafe algebra (A)
>> - The optimizer is allowed to perform algebraically equivalent transformations
>>    that may dramatically change results in floating point. (e.g.
>>    reassociation).
>> 
>> Throughout I'll refer to these options in their short-hand, e.g. 'A'.
>> Internally, these flags are to reside in SubclassData.
>> 
>> Setting the 'A' flag implies the setting of all the others ('N', 'I', 'S', 'R').
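
The flag set and the "'A' implies all the others" rule can be sketched as a small bitmask. This is an illustration only: the bit positions and the names `FastMathFlag`, `AllFlags`, `setFlags` and `hasFlag` are assumptions, not the layout that would actually live in SubclassData.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignments for the five proposed flags.
enum FastMathFlag : uint8_t {
  NoNaNs = 1 << 0,          // N
  NoInfs = 1 << 1,          // I
  NoSignedZeros = 1 << 2,   // S
  AllowReciprocal = 1 << 3, // R
  UnsafeAlgebra = 1 << 4    // A
};

const uint8_t AllFlags =
    NoNaNs | NoInfs | NoSignedZeros | AllowReciprocal | UnsafeAlgebra;

// Adds ToSet to Current. Setting 'A' implies N, I, S and R, so the
// result is normalized to the full set whenever 'A' is requested.
inline uint8_t setFlags(uint8_t Current, uint8_t ToSet) {
  assert((ToSet & ~AllFlags) == 0 && "unknown flag bit");
  if (ToSet & UnsafeAlgebra)
    return AllFlags;
  return static_cast<uint8_t>(Current | ToSet);
}

inline bool hasFlag(uint8_t Flags, FastMathFlag F) { return (Flags & F) != 0; }
```

Normalizing on every write means a query for 'N' never has to check 'A' separately.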
>> 
>> 
>> Changes to LangRef
>> ---
>> 
>> Change the definitions of floating point arithmetic operations, below is how
>> fadd will change:
>> 
>> 'fadd' Instruction
>> Syntax:
>> 
>> <result> = fadd {flag}* <ty> <op1>, <op2>   ; yields {ty}:result
>> ...
>> Semantics:
>> ...
>> flag can be one of the following optimizer hints to enable otherwise unsafe
>> floating point optimizations:
>> N: no NaNs - Allow optimizations that assume the arguments and result are not
>>   NaN. Such optimizations are required to retain defined behavior over NaNs,
>>   but the value of the result is undefined.
>> I: no infs - Allow optimizations that assume the arguments and result are not
>>   +/-Inf. Such optimizations are required to retain defined behavior over
>>   +/-Inf, but the value of the result is undefined.
>> S: no signed zeros - Allow optimizations to treat the sign of a zero argument
>>   or result as insignificant.
>> A: unsafe algebra - The optimizer is allowed to perform algebraically
>>    equivalent transformations that may dramatically change results in floating
>>    point. (e.g.  reassociation).
>> 
>> fdiv will also mention that 'R' allows the fdiv to be replaced by a
>> multiply-by-reciprocal.
>> 
>> 
>> Changes to optimizations
>> ---
>> 
>> Optimizations should be allowed to perform unsafe optimizations provided the
>> instructions involved have the corresponding restrictions relaxed. When
>> combining instructions, optimizations must not drop restrictions that
>> previously existed: the combined instruction should carry only the flags
>> that all of the originals had (commonly, a bitwise-AND of the flags).
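
The bitwise-AND rule for combined instructions can be sketched as follows. The flag bit values and the name `combineFlags` are hypothetical and stand in for whatever the real instruction classes would expose.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits; the comments give the proposal's letters.
enum : uint8_t {
  NoNaNs = 1,          // N
  NoInfs = 2,          // I
  NoSignedZeros = 4,   // S
  AllowReciprocal = 8, // R
  UnsafeAlgebra = 16   // A
};

// Flags for one instruction that replaces two others: keep only the
// relaxations that both originals allowed.
inline uint8_t combineFlags(uint8_t First, uint8_t Second) {
  assert(First < 32 && Second < 32 && "unknown flag bit");
  return static_cast<uint8_t>(First & Second);
}
```

For example, combining an instruction marked N and I with one marked only N yields N, and combining anything with an unflagged instruction yields no flags at all.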
>> 
>> Below are some example optimizations that could be allowed with the given
>> relaxations.
>> 
>> N - no NaNs
>> x == x ==> true
>> 
>> S - no signed zeros
>> x - 0 ==> x
>> 0 - (x - y) ==> y - x
>> 
>> NIS - no signed zeros AND no NaNs AND no Infs
>> x * 0 ==> 0
>> 
>> NI - no infs AND no NaNs
>> x - x ==> 0
>> 
>> R - reciprocal
>>  x / y ==> x * (1/y)
>> 
>> A - unsafe-algebra
>> Reassociation
>>   (x + y) + z ==> x + (y + z)
>>   (x + C1) + C2 ==> x + (C1 + C2)
>> Redistribution
>>   (x * C) + x ==> x * (C+1)
>>   (x * C) + (x + x) ==> x * (C + 2)
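
The reason each rewrite above needs its flag can be checked numerically. The sketch below assumes IEEE-754 doubles and a compiler that is not itself running in a fast-math mode; the helper names are made up for the illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// x == x is false only when x is NaN, so folding it to true needs N.
inline bool selfEqual(double X) { return X == X; }

// x * 0 is NaN for +/-Inf or NaN and -0 for negative x, so folding it
// to 0 needs N, I and S.
inline double timesZero(double X) { return X * 0.0; }

// x - x is NaN when x is +/-Inf or NaN, so folding it to 0 needs N and I.
inline double minusSelf(double X) { return X - X; }

// Reassociation changes which intermediate result gets rounded.
inline double leftAssoc(double X, double Y, double Z) { return (X + Y) + Z; }
inline double rightAssoc(double X, double Y, double Z) { return X + (Y + Z); }

inline void checkExamples() {
  const double NaN = std::numeric_limits<double>::quiet_NaN();
  const double Inf = std::numeric_limits<double>::infinity();
  assert(!selfEqual(NaN));
  assert(std::isnan(timesZero(Inf)));
  assert(std::signbit(timesZero(-1.0))); // the result is -0, not +0
  assert(std::isnan(minusSelf(Inf)));
  // 1.0 is absorbed when added to -1e20 first, but survives otherwise.
  assert(leftAssoc(1e20, -1e20, 1.0) == 1.0);
  assert(rightAssoc(1e20, -1e20, 1.0) == 0.0);
  (void)NaN;
  (void)Inf;
}
```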
>> 
>> I propose to expand -instsimplify and -instcombine to perform these kinds of
>> optimizations. -reassociate will be expanded to reassociate floating point
>> operations when allowed. Similar to existing behavior regarding integer
>> wrapping, -early-cse will not CSE FP operations with mismatched flags, while
>> -gvn will (conservatively). This allows later optimizations to optimize the
>> expressions independently between runs of -early-cse and -gvn.
>> 
>> Optimizations and analyses that are able to infer certain properties of
>> instructions are allowed to set relevant flags. For example, if some analysis
>> has determined that the arguments and result of an instruction are not NaNs or
>> Infs, then it may set the 'N' and 'I' flags, allowing every other optimization
>> and analysis to benefit from this inferred knowledge.
>> 
>> Changes to frontends
>> ---
>> 
>> Frontends are free to generate code with flags set as they desire. Frontends
>> should continue to call llc with their desired options, as the flags apply only
>> at the IR level and not at codegen or the SelectionDAGs.
>> 
>> The intention behind the flags is to allow the IR creator to say something
>> along the lines of:
>> "If this operation is given a NaN, or the result is a NaN, then I don't care
>> what answer I get back. However, I expect my program to otherwise behave
>> properly."
>> 
>> Below is a suggested change to clang's command-line options.
>> 
>> -ffast-math
>> Currently described as:
>> Enable the *frontend*'s 'fast-math' mode. This has no effect on optimizations,
>> but provides a preprocessor macro __FAST_MATH__ the same as GCC's -ffast-math
>> flag
>> 
>> I propose to change the description and behavior to:
>> 
>> Enable 'fast-math' mode. This allows for optimizations that may produce
>> incorrect and unsafe results, and thus should only be used with care. This
>> also provides a preprocessor macro __FAST_MATH__ the same as GCC's -ffast-math
>> flag
>> 
>> I propose that this turn on all flags for all floating point instructions. If
>> this flag doesn't already cause clang to run llc with -enable-unsafe-fp-math,
>> then I propose that it does so as well.
>> 
>> (Optional)
>> I propose adding the below flags:
>> 
>> -ffinite-math-only
>> Allow optimizations to assume that floating point arguments and results are
>> not NaNs or +/-Inf. This may produce incorrect results, and so should be used
>> with care.
>> 
>> This would set the 'I' and 'N' bits on all generated floating point instructions.
>> 
>> -fno-signed-zeros
>> Allow optimizations to ignore the signedness of zero. This may produce
>> incorrect results, and so should be used with care.
>> 
>> This would set the 'S' bit on all FP instructions.
>> 
>> -freciprocal-math
>> Allow optimizations to use the reciprocal of an argument instead of using
>> division. This may produce less precise results, and so should be used with
>> care.
>> 
>> This would set the 'R' bit on all relevant FP instructions.
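
The option-to-flag mapping proposed above can be summarized in a lookup. This is a sketch of the proposal's table, not clang's option handling: the bit values and the name `flagsForOption` are hypothetical, and a real frontend would apply 'R' only to the instructions where it is relevant.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical flag bits; the comments give the proposal's letters.
enum : uint8_t {
  NoNaNs = 1,          // N
  NoInfs = 2,          // I
  NoSignedZeros = 4,   // S
  AllowReciprocal = 8, // R
  UnsafeAlgebra = 16   // A
};

// Returns the flag bits the proposal associates with a clang option,
// or 0 for an option that does not relax floating point semantics.
inline uint8_t flagsForOption(const std::string &Opt) {
  if (Opt == "-ffast-math")
    return NoNaNs | NoInfs | NoSignedZeros | AllowReciprocal | UnsafeAlgebra;
  if (Opt == "-ffinite-math-only")
    return NoNaNs | NoInfs;
  if (Opt == "-fno-signed-zeros")
    return NoSignedZeros;
  if (Opt == "-freciprocal-math")
    return AllowReciprocal;
  return 0;
}

// Usage: several options accumulate by OR-ing their bits together.
inline uint8_t flagsForFiniteNoSignedZeros() {
  uint8_t Flags = flagsForOption("-ffinite-math-only") |
                  flagsForOption("-fno-signed-zeros");
  assert(Flags == (NoNaNs | NoInfs | NoSignedZeros));
  return Flags;
}
```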
>> 
>> Changes to llvm cli tools
>> ---
>> opt and llc already have the command line options
>> -enable-unsafe-fp-math: Enable optimizations that may decrease FP precision
>> -enable-no-infs-fp-math: Enable FP math optimizations that assume no +-Infs
>> -enable-no-nans-fp-math: Enable FP math optimizations that assume no NaNs
>> However, opt makes no use of them as they are currently only considered to be
>> TargetOptions. llc will remain unchanged, as these options apply to DAG
>> optimizations while this proposal deals with IR optimizations.
>> 
>> (Optional)
>> Have an opt pass that adds the desired flags to floating point instructions.
>> 
>> 
>> Miscellaneous explanations in the form of Q&A
>> ---
>> 
>> Why not just have "fast-math" rather than individual flags?
>> 
>> Having the individual flags gives the granularity to choose the levels of
>> optimizations. For example, unsafe-algebra can lead to dramatically different
>> results in corner cases, and may not be desired when a user just wants to ensure
>> that x*0 folds to 0.
>> 
>> 
>> Why have these flags attached to the instruction itself, rather than be a
>> compiler mode?
>> 
>> Being attached to the instruction itself allows much greater flexibility both
>> for other optimizations and for the concerns of the source and target. For
>> example, a frontend may desire that x - x be folded to 0. This would require
>> no-NaNs for the subtract. However, the frontend may want to keep NaNs for its
>> comparisons.
>> 
>> Additionally, these properties can be set internally in the optimizer when the
>> property has been proven. For example, if x has been found to be positive, then
>> operations involving x and a constant can be marked to ignore signed zero.
>> 
>> Finally, having these flags allows for greater safety and optimization when code
>> with different flags is mixed. For example, a function author may set the
>> unsafe-algebra flag knowing that such transformations will not meaningfully
>> alter its result. If that function gets inlined into a caller, however, we don't
>> want to always assume that the function's expressions can be reassociated with
>> the caller's expressions. These properties allow us to preserve the
>> optimizations of the inlined function without affecting the caller.
>> 
>> 
>> Why not use metadata rather than flags?
>> 
>> There is existing metadata to denote precisions, and this proposal is orthogonal
>> to those efforts. While these properties could still be expressed as metadata,
>> the proposed flags are analogous to nsw/nuw and are inherent properties of the
>> IR instructions themselves that all transformations should respect.
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
