[LLVMdev] [RFC] Extend LLVM IR to express "fast-math" at a per-instruction level
milseman at apple.com
Tue Oct 30 10:18:40 PDT 2012
On Oct 30, 2012, at 8:23 AM, Dan Gohman <dan433584 at gmail.com> wrote:
> Hi Michael,
> On Mon, Oct 29, 2012 at 4:34 PM, Michael Ilseman <milseman at apple.com> wrote:
> no NaNs (N)
> - ignore the existence of NaNs when convenient
> no Infs (I)
> - ignore the existence of Infs when convenient
> no signed zeros (S)
> - ignore the existence of negative zero when convenient
> Does this mean ignore the possibility of NaNs as operands, as results, or both? Ditto for infinity and negative zero.
I wrote this thinking both, though I could certainly imagine it being clearer if defined as operands. The example optimizations section is written along the lines of ignoring both.
> Also, what does "ignore" mean? As worded, it seems to imply Undefined Behavior if the value is encountered. Is that intended?
What I'm intending is for optimizations to be allowed to ignore the possibility of those values. Thinking about it more, this is pretty vague. With your and Krzysztof's feedback in mind, I think something along the lines of:
no NaNs (N)
- The operands' values can be assumed to be non-NaN by the optimizer. The result of this operator is Undef if passed a NaN.
Might be more clear. I'll think about that more and revise the examples section too.
> allow fusion (F)
> - fuse FP operations when convenient, despite possible differences in rounding
> (e.g. form FMAs)
> What do you intend to be the relationship between this and @llvm.fmuladd? It's not clear whether you're trying to replace it or trying to set up an alternative for different use cases.
Interesting, I had not seen llvm.fmuladd. I'll have to think about this more; perhaps fmuladd can already provide what I was intending here.
> Is your wording of "fusing" intended to imply fusing with infinite intermediate precision only, or is mere increased precision also valid?
My intention is that increased precision is also valid, though I haven't thought too deeply about the difference.
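To make the rounding difference behind the 'F' flag concrete, here is a minimal sketch that compares a separately rounded multiply-add with a single-rounding ("infinite intermediate precision") result. The helper names are hypothetical, and the fused path is emulated with exact rational arithmetic rather than a hardware FMA, so treat it as an illustration rather than what any backend actually emits.

```python
from fractions import Fraction

def mul_add_unfused(a, b, c):
    # Two roundings: the product is rounded to double, then the sum is rounded.
    return a * b + c

def mul_add_fused(a, b, c):
    # Emulated fusion: compute a*b + c exactly, then round once to double.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# The exact product is 1 - 2**-60, which rounds to 1.0 as a double.
unfused = mul_add_unfused(a, b, c)
fused = mul_add_fused(a, b, c)
print(unfused, fused)
```

With these inputs the unfused form loses the low-order term entirely, while the single-rounding form keeps it.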
> unsafe algebra (A)
> - allow for algebraically equivalent transformations that may dramatically
> change results in floating point. (e.g. reassociation)
> Not all combinations make sense (e.g. 'A' pretty much implies all other flags).
> Basically, I have the below semilattice of sensible relations:
> A > S > I > N
> A > F
> Meaning that 'A' implies all the others, 'S' implies 'I' and 'N', etc.
> Why does it make sense for S to imply I and N? GCC's -fno-signed-zeros flag doesn't seem to imply -ffinite-math-only, among other things. The concept of negative zero isn't inherently linked with the concepts of infinity or NaN.
What I mean here is that I'm finding it hard to think of a case where a user would want to specify 'I' without also specifying 'N'. This is more a question of whether we could or should express this as a fast-math level rather than making each flag individually toggleable. Any thoughts on this?
> It might make sense to change the S, I, and N options to be some kind of finite
> option with levels 3, 2, and 1 respectively. F and A could be kept distinct. It
> is still the case that A would imply pretty much everything else.
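The implication relations quoted above (A > S > I > N and A > F) can be sketched as a closure over a set of flags. The enum and function names below are hypothetical and are not part of the proposal or of any LLVM API; the sketch only shows what "'A' implies all the others" would mean if the relations were enforced mechanically.

```python
from enum import Flag, auto

class FMF(Flag):
    # Hypothetical encoding of the five proposed per-instruction flags.
    N = auto()  # no NaNs
    I = auto()  # no Infs
    S = auto()  # no signed zeros
    F = auto()  # allow fusion
    A = auto()  # unsafe algebra

# Direct implications taken from the quoted semilattice.
IMPLIES = {
    FMF.A: FMF.S | FMF.F,
    FMF.S: FMF.I,
    FMF.I: FMF.N,
}

def closure(flags):
    """Return flags together with everything they transitively imply."""
    result = flags
    changed = True
    while changed:
        changed = False
        for flag, implied in IMPLIES.items():
            if flag in result and (result | implied) != result:
                result |= implied
                changed = True
    return result

print(closure(FMF.A))
print(closure(FMF.S))
```

Under this reading 'F' stays independent of S, I, and N, which matches the suggestion to keep F and A distinct from the finite-math levels.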
> N - no NaNs
> x == x ==> true
> This is not true if x is infinity.
> S - no signed zeros
> x - 0 ==> x
> 0 - (x - y) ==> y - x
> NS - no signed zeros AND no NaNs
> x * 0 ==> 0
> NI - no infs AND no NaNs
> x - x ==> 0
> Inf > x ==> true
> With the I flag, would the infinity as an operand make this undefined?
I'll think about this more with regards to the prior changes.
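As a quick sanity check on why the example rewrites above need their flags, the sketch below evaluates a few of them on IEEE 754 doubles with the special values the flags are meant to rule out. The variable names are made up for illustration, and Python floats are assumed to be IEEE 754 binary64, which holds on common platforms.

```python
import math

nan = float("nan")
inf = float("inf")

def is_negative_zero(v):
    # True only for -0.0; copysign exposes the sign bit of a zero.
    return v == 0.0 and math.copysign(1.0, v) < 0.0

# N: "x == x ==> true" does not hold when x is NaN.
nan_equals_itself = (nan == nan)

# S: "0 - (x - y) ==> y - x" changes the sign of zero for x = +0, y = -0.
x, y = 0.0, -0.0
s_before = 0.0 - (x - y)   # +0.0
s_after = y - x            # -0.0

# NS: "x * 0 ==> 0" is wrong for negative x (gives -0.0) and for NaN.
ns_negative = -1.0 * 0.0
ns_nan = nan * 0.0

# NI: "x - x ==> 0" is wrong for infinity; "Inf > x" is false for NaN and Inf.
ni_inf = inf - inf
inf_gt_nan = inf > nan
inf_gt_inf = inf > inf

print(nan_equals_itself, s_before, s_after, ns_negative, ns_nan, ni_inf)
```

Each rewrite is only value-preserving once the corresponding special values are excluded from the operands.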
> A - unsafe-algebra
> (x + C1) + C2 ==> x + (C1 + C2)
> (x * C) + x ==> x * (C+1)
> (x * C) + (x + x) ==> x * (C + 2)
> x / C ==> x * (1/C)
> These examples apply when the new constants are permitted, e.g. not denormal,
> and all the instructions involved have the needed flags.
> I'm confused. In other places, you seem to imply that reassociation would be valid even on non-constant values. It's not clear whether you meant to contradict that here.
Reassociation is still valid. These examples are just cases where there would be a clear optimization benefit to be had. I'll probably add in a general expression to clarify.
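To illustrate why these rewrites sit behind the 'A' flag, here is a small sketch showing two of the quoted patterns changing results in double precision. The specific constants are chosen for illustration and are not taken from the proposal.

```python
# (x + C1) + C2 ==> x + (C1 + C2)
x = 2.0**53          # above this magnitude, adjacent doubles are 2 apart
c1 = 1.0
c2 = 1.0
original = (x + c1) + c2        # each +1 is rounded away
reassociated = x + (c1 + c2)    # adding 2 at once is exact

# x / C ==> x * (1/C)
n = 49.0
quotient = n / n                # exactly 1.0
reciprocal = n * (1.0 / n)      # 1/49 is rounded before the multiply

print(original, reassociated)
print(quotient, reciprocal)
```

In both cases the transformed expression is algebraically equivalent but yields a different double, which is the kind of change the flag is meant to license.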
> I'm not too familiar with this option, but I recommend that 'all' turn on the
> 'F' bit for all FP instructions, 'default' do so when following the pragma, and
> 'off' never do so. This option should still be passed to the backend.
> Please coordinate with Lang and others who have already done a fair amount of work on FP_CONTRACT.
I will, thanks.
> I propose adding the below flags:
> Allow optimizations to assume that floating point arguments and results are
> not NaNs or +/-Inf. This may produce incorrect results, and so should be used with
> care.
> This would set the 'I' and 'N' bits on all generated floating point instructions.
> Allow optimizations to ignore the signedness of zero. This may produce
> incorrect results, and so should be used with care.
> This would set the 'S' bit on all FP instructions.
> These are established flags in GCC. Do you know if there are any semantic differences between your proposed semantics and the semantics of these flags in GCC? If so, it would be good to either change to match them, or document the differences.
I don't know of any differences, but I'll have to look into GCC's behavior more.
Thanks for the feedback!