<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Michael,<div><br></div><div>Since you won't be using metadata to store this information and are augmenting the IR, I'd recommend incrementing the bitcode version number.  The current version is stored in a local variable in BitcodeWriter.cpp:1814*  </div><div><br></div><div>I suspect you'll then also need to provide additional logic for reading:</div><div><br></div><div><div style="margin: 0px; font-size: 11px; font-family: Menlo; ">      <span style="color: #bb2ca2">switch</span> (module_version) {</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(209, 47, 27); "><span style="color: #000000">        </span><span style="color: #bb2ca2">default</span><span style="color: #000000">: </span><span style="color: #bb2ca2">return</span><span style="color: #000000"> </span><span style="color: #31595d">Error</span><span style="color: #000000">(</span>"Unknown bitstream version!"<span style="color: #000000">);</span></div><div style="margin: 0px; font-size: 11px; font-family: Menlo; ">        <span style="color: rgb(187, 44, 162); ">case</span> <font color="#272ad8">2</font>:</div></div><div><font face="Menlo"><span style="font-size: 11px;"><span class="Apple-tab-span" style="white-space:pre">     </span>  <font color="#4f8187">EncodesFastMathIR</font></span></font><span style="font-family: Menlo; font-size: 11px; "> </span><span style="font-family: Menlo; font-size: 11px; ">=</span><span style="font-family: Menlo; font-size: 11px; "> </span><span style="font-family: Menlo; font-size: 11px; color: rgb(187, 44, 162); ">true</span><span style="font-family: Menlo; font-size: 11px; ">;</span><div style="margin: 0px; font-size: 11px; font-family: Menlo; ">        <span style="color: rgb(187, 44, 162); ">case</span> <span style="color: rgb(39, 42, 216); 
">1</span>:</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; ">          <span style="color: #4f8187">UseRelativeIDs</span> = <span style="color: #bb2ca2">true</span>;</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; ">          <span style="color: #bb2ca2">break</span>;</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; "><div style="margin: 0px; "> <span class="Apple-tab-span" style="white-space:pre">      </span><span style="color: rgb(187, 44, 162); ">case</span> <span style="color: rgb(39, 42, 216); ">0</span>:</div><div style="margin: 0px; ">          <span style="color: rgb(79, 129, 135); ">UseRelativeIDs</span> = <span style="color: rgb(187, 44, 162); ">false</span>;</div><div style="margin: 0px; ">          <span style="color: rgb(187, 44, 162); ">break</span>;</div><div style="margin: 0px; ">       </div></div><div style="margin: 0px; font-size: 11px; font-family: Menlo; ">      }</div></div><div><br></div><div>Joe</div><div><br></div><div><span style="-webkit-text-decorations-in-effect: underline; ">(*TODO: Put this somewhere else).</span></div><div><div><br></div><div><div>On Nov 9, 2012, at 5:34 PM, Michael Ilseman <<a href="mailto:milseman@apple.com">milseman@apple.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">Revision 2<br><br>Revision 2 changes:<br> * Add in separate Reciprocal flag<br> * Clarified wording of flags, specified undefined values, not behavior<br> * Removed some confusing language<br> * Mentioned optimizations/analyses adding in flags due to inferred knowledge<br><br>Revision 1 changes:<br> * Removed Fusion flag from all sections<br> * Clarified and changed descriptions of remaining flags:<br>   * Make 'N' and 'I' flags be explicitly concerning values of operands, and<br>     producing undef values if a NaN/Inf is provided.<br>   * 'S' is now only about distinguishing between +/-0.<br>   * LangRef changes updated to reflect flags changes<br>   * 
Updated Question section given the now simpler set of flags<br>   * Optimizations changed to reflect 'N' and 'I' describing operands and not<br>     results<br> * Be explicit on what LLVM's default behavior is (no signaling NaNs, etc)<br> * Mention that this could be alternatively solved with metadata, and open the<br>   debate<br><br><br>Introduction<br>---<br><br>LLVM IR currently does not have any support for specifying fine-grained control<br>over relaxing floating point requirements for the optimizer. Below is a<br>proposal to extend floating point IR instructions to support a number of flags<br>that a creator of IR can use to allow for greater optimizations when<br>desired. Such changes are sometimes referred to as fast-math, but this proposal<br>is about finer-grained specifications at a per-instruction level.<br><br><br>What this doesn't address<br>---<br><br>Default behavior is retained, and this proposal is only addressing relaxing<br>restrictions. LLVM currently by default:<br>- ignores signaling NaNs<br>- assumes default rounding mode<br>- assumes FENV_ACCESS is off<br><br>Discussion on changing the default behavior of LLVM or allowing for more<br>restrictive behavior is outside the scope of this proposal. This proposal does<br>not address behavior of denormals, which is more of a backend concern.<br><br>Specifying exact precision control or requirements is outside the scope of this<br>proposal, and can probably be handled with the existing metadata implementation.<br><br>This proposal covers changes to and optimizations over LLVM IR, and changes to<br>codegen are outside the scope of this proposal. The flags described in the next<br>section exist only at the IR level, and will not be propagated into codegen or<br>the SelectionDAG.<br><br><br>Flags<br>---<br><br>LLVM IR instructions will have the following flags that can be set by the<br>creator of the IR.<br><br>no NaNs (N)<br> - Allow optimizations that assume the arguments and result are not NaN. 
Such<br>   optimizations are required to retain defined behavior over NaNs, but the<br>   value of the result is undefined.<br><br>no Infs (I)<br> - Allow optimizations that assume the arguments and result are not<br>   +/-Inf. Such optimizations are required to retain defined behavior over<br>   +/-Inf, but the value of the result is undefined.<br><br>no signed zeros (S)<br> - Allow optimizations to treat the sign of a zero argument or result as<br>   insignificant.<br><br>allow reciprocal (R)<br> - Allow optimizations to use the reciprocal of an argument instead of dividing.<br><br>unsafe algebra (A)<br> - The optimizer is allowed to perform algebraically equivalent transformations<br>    that may dramatically change results in floating point (e.g.<br>    reassociation).<br><br>Throughout I'll refer to these options in their short-hand, e.g. 'A'.<br>Internally, these flags are to reside in SubclassData.<br><br>Setting the 'A' flag implies the setting of all the others ('N', 'I', 'S', 'R').<br><br><br>Changes to LangRef<br>---<br><br>Change the definitions of floating point arithmetic operations. Below is how<br>fadd will change:<br><br>'fadd' Instruction<br>Syntax:<br><br> &lt;result&gt; = fadd {flag}* &lt;ty&gt; &lt;op1&gt;, &lt;op2&gt;   ; yields {ty}:result<br>...<br>Semantics:<br>...<br>flag can be one of the following optimizer hints to enable otherwise unsafe<br>floating point optimizations:<br> N: no NaNs - Allow optimizations that assume the arguments and result are not<br>   NaN. Such optimizations are required to retain defined behavior over NaNs,<br>   but the value of the result is undefined.<br> I: no infs - Allow optimizations that assume the arguments and result are not<br>   +/-Inf. 
Such optimizations are required to retain defined behavior over<br>   +/-Inf, but the value of the result is undefined.<br> S: no signed zeros - Allow optimizations to treat the sign of a zero argument<br>   or result as insignificant.<br> A: unsafe algebra - The optimizer is allowed to perform algebraically<br>    equivalent transformations that may dramatically change results in floating<br>    point. (e.g.  reassociation).<br><br>fdiv will also mention that 'R' allows the fdiv to be replaced by a<br>multiply-by-reciprocal.<br><br><br>Changes to optimizations<br>---<br><br>Optimizations should be allowed to perform unsafe optimizations provided the<br>instructions involved have the corresponding restrictions relaxed. When<br>combining instructions, optimizations should do what makes sense to not remove<br>restrictions that previously existed (commonly, a bitwise-AND of the flags).<br><br>Below are some example optimizations that could be allowed with the given<br>relaxations.<br><br>N - no NaNs<br> x == x ==> true<br><br>S - no signed zeros<br> x - 0 ==> x<br> 0 - (x - y) ==> y - x<br><br>NIS - no signed zeros AND no NaNs AND no Infs<br> x * 0 ==> 0<br><br>NI - no infs AND no NaNs<br> x - x ==> 0<br><br>R - reciprocal<br>  x / y ==> x * (1/y)<br><br>A - unsafe-algebra<br> Reassociation<br>   (x + y) + z ==> x + (y + z)<br>   (x + C1) + C2 ==> x + (C1 + C2)<br> Redistribution<br>   (x * C) + x ==> x * (C+1)<br>   (x * C) + (x + x) ==> x * (C + 2)<br><br>I propose to expand -instsimplify and -instcombine to perform these kinds of<br>optimizations. -reassociate will be expanded to reassociate floating point<br>operations when allowed. Similar to existing behavior regarding integer<br>wrapping, -early-cse will not CSE FP operations with mismatched flags, while<br>-gvn will (conservatively). 
This allows later optimizations to optimize the<br>expressions independently between runs of -early-cse and -gvn.<br><br>Optimizations and analyses that are able to infer certain properties of<br>instructions are allowed to set relevant flags. For example, if some analysis<br>has determined that the arguments and result of an instruction are not NaNs or<br>Infs, then it may set the 'N' and 'I' flags, allowing every other optimization<br>and analysis to benefit from this inferred knowledge.<br><br>Changes to frontends<br>---<br><br>Frontends are free to generate code with flags set as they desire. Frontends<br>should continue to call llc with their desired options, as the flags apply only<br>at the IR level and not at codegen or the SelectionDAGs.<br><br>The intention behind the flags is to allow the IR creator to say something<br>along the lines of:<br>"If this operation is given a NaN, or the result is a NaN, then I don't care<br>what answer I get back. However, I expect my program to otherwise behave<br>properly."<br><br>Below is a suggested change to clang's command-line options.<br><br>-ffast-math<br> Currently described as:<br> Enable the *frontend*'s 'fast-math' mode. This has no effect on optimizations,<br> but provides a preprocessor macro __FAST_MATH__ the same as GCC's -ffast-math<br> flag<br><br> I propose to change the description and behavior to:<br><br> Enable 'fast-math' mode. This allows for optimizations that may produce<br> incorrect and unsafe results, and thus should only be used with care. This<br> also provides a preprocessor macro __FAST_MATH__ the same as GCC's -ffast-math<br> flag<br><br> I propose that this turn on all flags for all floating point instructions. 
If<br> this flag doesn't already cause clang to run llc with -enable-unsafe-fp-math,<br> then I propose that it does so as well.<br><br>(Optional)<br>I propose adding the below flags:<br><br>-ffinite-math-only<br> Allow optimizations to assume that floating point arguments and results are<br> not NaNs or +/-Inf. This may produce incorrect results, and so should be used<br> with care.<br><br> This would set the 'I' and 'N' bits on all generated floating point instructions.<br><br>-fno-signed-zeros<br> Allow optimizations to ignore the signedness of zero. This may produce<br> incorrect results, and so should be used with care.<br><br> This would set the 'S' bit on all FP instructions.<br><br>-freciprocal-math<br> Allow optimizations to use the reciprocal of an argument instead of using<br> division. This may produce less precise results, and so should be used with<br> care.<br><br> This would set the 'R' bit on all relevant FP instructions.<br><br>Changes to llvm cli tools<br>---<br>opt and llc already have the command line options<br> -enable-unsafe-fp-math: Enable optimizations that may decrease FP precision<br> -enable-no-infs-fp-math: Enable FP math optimizations that assume no +-Infs<br> -enable-no-nans-fp-math: Enable FP math optimizations that assume no NaNs<br>However, opt makes no use of them as they are currently only considered to be<br>TargetOptions. llc will remain unchanged, as these options apply to DAG<br>optimizations while this proposal deals with IR optimizations.<br><br>(Optional)<br>Have an opt pass that adds the desired flags to floating point instructions.<br><br><br>Miscellaneous explanations in the form of Q&A<br>---<br><br>Why not just have "fast-math" rather than individual flags?<br><br>Having the individual flags gives the granularity to choose the levels of<br>optimizations. 
For example, unsafe-algebra can lead to dramatically different<br>results in corner cases, and may not be desired when a user just wants to ensure<br>that x*0 folds to 0.<br><br><br>Why have these flags attached to the instruction itself, rather than be a<br>compiler mode?<br><br>Being attached to the instruction itself allows much greater flexibility both<br>for other optimizations and for the concerns of the source and target. For<br>example, a frontend may desire that x - x be folded to 0. This would require<br>no-NaNs and no-Infs for the subtract. However, the frontend may want to keep<br>NaNs for its comparisons.<br><br>Additionally, these properties can be set internally in the optimizer when the<br>property has been proven. For example, if x has been found to be positive, then<br>operations involving x and a constant can be marked to ignore signed zero.<br><br>Finally, having these flags allows for greater safety and optimization when code<br>with different flags is mixed. For example, a function author may set the<br>unsafe-algebra flag knowing that such transformations will not meaningfully<br>alter its result. If that function gets inlined into a caller, however, we don't<br>want to always assume that the function's expressions can be reassociated with<br>the caller's expressions. These properties allow us to preserve the<br>optimizations of the inlined function without affecting the caller.<br><br><br>Why not use metadata rather than flags?<br><br>There is existing metadata to denote precisions, and this proposal is orthogonal<br>to those efforts. 
While these properties could still be expressed as metadata,<br>the proposed flags are analogous to nsw/nuw and are inherent properties of the<br>IR instructions themselves that all transformations should respect.<br><br>_______________________________________________<br>LLVM Developers mailing list<br><a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a><br><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br></blockquote></div><br></div></body></html>