<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi Duncan,<br>
<br>
<div class="moz-cite-prefix">On 23.09.2014 17:58, Duncan Sands
wrote:<br>
</div>
<blockquote cite="mid:54217C76.3020001@deepbluecap.com" type="cite">Hi
Oleg,
<br>
<br>
On 22/09/14 17:56, Oleg Ranevskyy wrote:
<br>
<blockquote type="cite">Hi Duncan,
<br>
<br>
On 17.09.2014 21:10, Duncan Sands wrote:
<br>
<blockquote type="cite">Hi Oleg,
<br>
<br>
On 17/09/14 18:45, Oleg Ranevskyy wrote:
<br>
<blockquote type="cite">Hi,
<br>
<br>
Thank you for all your helpful comments.
<br>
<br>
To sum up, below is the list of correct folding examples for
fadd:
<br>
(1) fadd %x, -0.0 -> %x
<br>
(2) fadd undef, undef -> undef
<br>
(3) fadd %x, undef -> NaN (undef is
a NaN which is
<br>
propagated)
<br>
<br>
Looking through the code I found the "NoNaNs" flag accessed
through an instance
<br>
of the FastMathFlags class.
<br>
(2) and (3) should probably depend on it.
<br>
If the flag is set, (2) and (3) cannot be folded as there
are no NaNs and we are
<br>
not guaranteed to get an arbitrary bit pattern from fadd,
right?
<br>
</blockquote>
<br>
I think it's exactly the other way round: if NoNans is set
then you can fold
<br>
(2) and (3) to undef. That's because (IIRC) the NoNans flag
promises that no
<br>
NaNs will be used by the program. However "undef" could be a
NaN, thus the
<br>
promise is broken, meaning the program is performing undefined
behaviour, and
<br>
you can do whatever you want.
<br>
</blockquote>
Oh, I see the point now. I thought if NoNaNs was set then no
NaNs were possible
<br>
at all. But undef is still an arbitrary bit pattern that might
occasionally be
<br>
the same as the one of a NaN. Thank you for the explanation.
<br>
<br>
Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to
undef, whereas
<br>
"fadd/fsub/fmul/fdiv %x, undef" is folded to either undef
(NoNaNs is set) or a
<br>
NaN (NoNaNs is not set).
<br>
</blockquote>
<br>
for fmul and fdiv, the reasoning does depend on fmul %x, 1.0
always being equal to %x (likewise: fdiv %x, 1.0 being equal to
%x). Is this true?
<br>
</blockquote>
Do you mean that we can't fold "fmul/fdiv undef, undef" to undef
unless "fmul/fdiv %x, 1.0" is guaranteed to be %x?<br>
If we let the first undef be an arbitrary bit pattern and choose the
second undef to be 1.0, we need a guarantee that the result has exactly
the bit pattern of the first undef. Do I get it right?<br>
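<br>
To make the question concrete, here is a minimal sketch in Python (not
LLVM code; the names bits, SAMPLES and identity_holds are made up for
illustration). It checks that x * 1.0 and x / 1.0 return the bit pattern
of x for a handful of non-NaN doubles. For a NaN input it only checks
that the result is still a NaN, because the payload of the result may
differ between platforms.<br>
<pre>
```python
import math
import struct


def bits(x):
    """Return the 64-bit IEEE-754 pattern of a Python float as an int."""
    return struct.unpack("=Q", struct.pack("=d", x))[0]


# Non-NaN doubles: signed zeros, a subnormal, the largest finite value
# and both infinities.
SAMPLES = [0.0, -0.0, 1.0, -1.5, 5e-324, 1.7976931348623157e308,
           float("inf"), float("-inf")]


def identity_holds(op):
    """True if op(x) has the same bit pattern as x for every sample."""
    return all(bits(op(x)) == bits(x) for x in SAMPLES)


mul_ok = identity_holds(lambda x: x * 1.0)
div_ok = identity_holds(lambda x: x / 1.0)

# Only NaN-ness is checked for a NaN input; the resulting payload is not
# something this sketch can promise on every platform.
nan_stays_nan = math.isnan(float("nan") * 1.0)

print(mul_ok, div_ok, nan_stays_nan)
```
</pre>
This only samples a few values, so it illustrates the property rather
than proving it.<br>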
<br>
I checked the standard regarding "x*1.0 == x" and found that only
section "10.4 Literal meaning and value-changing optimizations"
addresses this. I don't claim to fully understand this section yet,
but as I read it, language standards are required to preserve the
literal meaning of the source code, and applying the identity
property 1 × x is among the transformations that do so. Here is a
quote from IEEE-754:<br>
<br>
<i>"The following value-changing transformations, among others,
preserve the literal meaning of the source code:<br>
― Applying the identity property 0 + x when x is not zero and
is not a signaling NaN and the result has the same exponent as x.<br>
― Applying the identity property 1 × x when x is not a
signaling NaN and the result has the same exponent as x."</i><br>
<br>
Maybe Owen or Stephen would be able to clarify this.<br>
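<br>
As a side note on the "x is not zero" restriction in the quoted 0 + x
rule, here is a small Python sketch (the helper bits and the variable
names are made up for illustration, and the default round-to-nearest
mode is assumed). It shows how the sign of zero behaves for the fadd and
fsub cases from the summary.<br>
<pre>
```python
import struct


def bits(x):
    """Return the 64-bit IEEE-754 pattern of a Python float as an int."""
    return struct.unpack("=Q", struct.pack("=d", x))[0]


neg_zero = -0.0

# 0 + x is not an identity for x = -0.0: the sum is +0.0.
plus_zero_changes_sign = bits(0.0 + neg_zero) != bits(neg_zero)

# x + (-0.0) keeps the sign of zero, matching "fadd %x, -0.0 -> %x".
minus_zero_keeps_sign = (bits(neg_zero + (-0.0)) == bits(neg_zero)
                         and bits(0.0 + (-0.0)) == bits(0.0))

# x - (-0.0) behaves like x + (+0.0), so it changes x = -0.0 to +0.0.
# This is why case (8) needs %x to be different from -0.0.
fsub_neg_zero_changes = bits(neg_zero - (-0.0)) != bits(neg_zero)

# x - (+0.0) keeps x for both zeros.
fsub_pos_zero_keeps = (bits(neg_zero - 0.0) == bits(neg_zero)
                       and bits(0.0 - 0.0) == bits(0.0))

print(plus_zero_changes_sign, minus_zero_keeps_sign,
      fsub_neg_zero_changes, fsub_pos_zero_keeps)
```
</pre>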
<br>
Thank you.<br>
Oleg<br>
<blockquote cite="mid:54217C76.3020001@deepbluecap.com" type="cite">
<br>
Ciao, Duncan.
<br>
<br>
<blockquote type="cite">
<br>
Oleg
<br>
<blockquote type="cite">
<br>
<blockquote type="cite">
<br>
Other arithmetic FP operations (fsub, fmul, fdiv) also
propagate NaNs. Thus, the
<br>
same rules seem applicable to them as well:
<br>
---------------------------------------------------------------------
<br>
- fdiv:
<br>
(4) "fdiv %x, undef" is now folded to undef.
<br>
</blockquote>
<br>
But should be folded to NaN, not undef.
<br>
<br>
<blockquote type="cite"> The code comment states this
is done because undef might be a sNaN. We
<br>
can't rely on sNaNs as they can either be masked or the
platform might not have
<br>
FP exceptions at all. Nevertheless, such folding is still
correct due to the NaN
<br>
propagation rules we found in the Standard - undef might be
chosen to be a NaN
<br>
and its payload will be propagated.
<br>
Moreover, this looks similar to (3) and can be
folded to a NaN. /Is it
<br>
worth doing?/
<br>
</blockquote>
<br>
As the current folding to undef is wrong, it has to be fixed.
<br>
<br>
<blockquote type="cite">
<br>
(5) fdiv undef, undef -> undef
<br>
</blockquote>
<br>
Yup.
<br>
<br>
<blockquote type="cite">---------------------------------------------------------------------
<br>
- fmul:
<br>
(6) fmul undef, undef -> undef
<br>
</blockquote>
<br>
Yup.
<br>
<br>
<blockquote type="cite"> (7) fmul %x, undef
-> NaN or undef (undef is a NaN, which is
<br>
propagated)
<br>
</blockquote>
<br>
Should be folded to NaN, not undef.
<br>
<br>
<blockquote type="cite">---------------------------------------------------------------------
<br>
- fsub:
<br>
(8) fsub %x, -0.0 -> %x (if %x is
not -0.0; works this way
<br>
now)
<br>
</blockquote>
<br>
Should this be: fsub %x, +0.0 ?
<br>
</blockquote>
fsub %x, +0.0 is also covered and always folded to %x.
<br>
The version with -0.0 is similar except it additionally checks
if %x is not -0.0.
<br>
<blockquote type="cite">
<br>
<blockquote type="cite"> (9) fsub %x, undef
-> NaN or undef (undef is a NaN, which is
<br>
propagated)
<br>
</blockquote>
<br>
Should fold to NaN not undef.
<br>
<br>
<blockquote type="cite"> (10) fsub undef, undef ->
undef
<br>
</blockquote>
<br>
Yup.
<br>
<br>
Ciao, Duncan.
<br>
<br>
<blockquote type="cite">---------------------------------------------------------------------
<br>
<br>
I will be very thankful if you could review this final
summary and share your
<br>
thoughts.
<br>
<br>
Thank you.
<br>
<br>
P.S. Sorry for bothering you again and again.
<br>
Just want to make sure I clearly understand the subject in
order to make correct
<br>
code changes and to be able to help others with this in the
future.
<br>
<br>
Kind regards,
<br>
Oleg
<br>
<br>
On 16.09.2014 21:42, Duncan Sands wrote:
<br>
<blockquote type="cite">On 16/09/14 19:37, Owen Anderson
wrote:
<br>
<blockquote type="cite">As far as I know, LLVM does not
try very hard to guarantee constant folded
<br>
NaN payloads that match exactly what the target would
generate.
<br>
</blockquote>
<br>
I'm with Owen here. Unless ARM people object, I think it
is reasonable to say
<br>
that at the LLVM IR level we may assume that the IEEE
rules are followed.
<br>
<br>
Ciao, Duncan.
<br>
<br>
<blockquote type="cite">
<br>
—Owen
<br>
<br>
<blockquote type="cite">On Sep 16, 2014, at 10:30 AM,
Oleg Ranevskyy <a class="moz-txt-link-rfc2396E" href="mailto:llvm.mail.list@gmail.com">&lt;llvm.mail.list@gmail.com&gt;</a>
<br>
wrote:
<br>
<br>
Hi Duncan,
<br>
<br>
I reread everything we've discussed so far and would
like to pay closer
<br>
attention to the ARM FPSCR register mentioned by
Stephen.
<br>
It's really possible on ARM systems that floating
point operations on one or
<br>
more qNaN operands return a NaN different from the
operands. I.e. operand
<br>
NaN is not propagated. This happens when the "default
NaN" flag is set in
<br>
the FPSCR (floating point status and control
register). The result in this
<br>
case is some default NaN value.
<br>
<br>
This means "fadd %x, -0.0", which is currently folded
to %x by
<br>
InstructionSimplify, might produce a different result
if %x is a NaN. This
<br>
breaks the NaN propagation rules the IEEE standard
establishes and
<br>
significantly reduces folding capabilities for the FP
operations.
<br>
<br>
This also applies to "fadd undef, undef" and "fadd %x,
undef". We can't rely
<br>
on getting an arbitrary NaN here on ARMs.
<br>
<br>
Would you be able to confirm this please?
<br>
<br>
Thank you in advance for your time!
<br>
<br>
Kind regards,
<br>
Oleg
<br>
<br>
On 10.09.2014 22:50, Duncan Sands wrote:
<br>
<blockquote type="cite">Hi Oleg,
<br>
<br>
On 01/09/14 18:46, Oleg Ranevskyy wrote:
<br>
<blockquote type="cite">Hi Duncan,
<br>
<br>
I looked through the IEEE standard and here is
what I found:
<br>
<br>
*6.2 Operations with NaNs*
<br>
/"For an operation with quiet NaN inputs, other
than maximum and minimum
<br>
operations, if a floating-point result is to be
delivered the result shall
<br>
be a
<br>
quiet NaN which should be one of the input NaNs"/.
<br>
<br>
*6.2.3 NaN propagation*
<br>
/"An operation that propagates a NaN operand to
its result and has a
<br>
single NaN
<br>
as an input should produce a NaN with the payload
of the input NaN if
<br>
representable in the destination format"./
<br>
</blockquote>
<br>
thanks for finding this out.
<br>
<br>
<blockquote type="cite">
<br>
Floating point add propagates a NaN. There is no
conversion in the
<br>
context of
<br>
LLVM's fadd. So, if %x in "fadd %x, -0.0" is a
NaN, the result is also a
<br>
NaN
<br>
with the same payload.
<br>
</blockquote>
<br>
Yes, folding "fadd %x, -0.0" to "%x" is correct.
This implies that "fadd
<br>
undef, undef" can be folded to "undef".
<br>
<br>
<blockquote type="cite">
<br>
As regards "fadd %x, undef", where %x might be a
NaN and undef might be
<br>
chosen
<br>
to be (probably some different) NaN, and a
possibility to fold this to a
<br>
constant (NaN), the standard says:
<br>
/"If two or more inputs are NaN, then the payload
of the resulting NaN
<br>
should be
<br>
identical to the payload of one of the input NaNs
if representable in the
<br>
destination format. *This standard does not
specify which of the input
<br>
NaNs will
<br>
provide the payload*"/.
<br>
<br>
Thus, this makes it possible to fold "fadd %x,
undef" to a NaN. Is this
<br>
right?
<br>
</blockquote>
<br>
Yes, I agree.
<br>
<br>
Ciao, Duncan.
<br>
<br>
<blockquote type="cite">
<br>
Oleg
<br>
<br>
On 01.09.2014 10:04, Duncan Sands wrote:
<br>
<blockquote type="cite">Hi Oleg,
<br>
<br>
On 01/09/14 15:42, Oleg Ranevskyy wrote:
<br>
<blockquote type="cite">Hi,
<br>
<br>
Thank you for your comment, Owen.
<br>
My LLVM expertise is certainly not enough to
make such decisions yet.
<br>
Duncan, do you have any comments on this or do
you know anyone else
<br>
who can
<br>
decide about preserving NaN payloads?
<br>
</blockquote>
<br>
my take is that the first thing to do is to see
what the IEEE standard
<br>
says
<br>
about NaNs. Consider for example "fadd x,
-0.0". Does the standard
<br>
specify
<br>
the exact NaN bit pattern produced as output
when a particular NaN x is
<br>
input? Or does it just say that the output is a
NaN? If the standard
<br>
doesn't
<br>
care exactly which NaN is output, I think it is
reasonable for LLVM to
<br>
assume
<br>
it is whatever NaN is most convenient for LLVM;
in this case that means
<br>
using
<br>
x itself as the output.
<br>
<br>
However this approach does implicitly mean that
we may end up not folding
<br>
floating point operations completely
deterministically: depending on the
<br>
optimization that kicks in, in one case we might
fold to NaN A, and in
<br>
some
<br>
different optimization we might fold the same
expression to NaN B. I
<br>
think
<br>
this is pretty reasonable, but it is something
to be aware of.
<br>
<br>
Ciao, Duncan.
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>