<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    Hi Duncan,<br>

    <br>

    <div class="moz-cite-prefix">On 23.09.2014 17:58, Duncan Sands

      wrote:<br>

    </div>

    <blockquote cite="mid:54217C76.3020001@deepbluecap.com" type="cite">Hi

      Oleg,

      <br>

      <br>

      On 22/09/14 17:56, Oleg Ranevskyy wrote:

      <br>

      <blockquote type="cite">Hi Duncan,

        <br>

        <br>

        On 17.09.2014 21:10, Duncan Sands wrote:

        <br>

        <blockquote type="cite">Hi Oleg,

          <br>

          <br>

          On 17/09/14 18:45, Oleg Ranevskyy wrote:

          <br>

          <blockquote type="cite">Hi,

            <br>

            <br>

            Thank you for all your helpful comments.

            <br>

            <br>

            To sum up, below is the list of correct folding examples for

            fadd:

            <br>

                    (1)  fadd %x, -0.0            ->     %x

            <br>

                    (2)  fadd undef, undef    ->     undef

            <br>

                    (3)  fadd %x, undef         ->     NaN  (undef is

            a NaN which is

            <br>

            propagated)

            <br>

            <br>

            Looking through the code I found the "NoNaNs" flag accessed

            through an instance

            <br>

            of the FastMathFlags class.

            <br>

            (2) and (3) should probably depend on it.

            <br>

            If the flag is set, (2) and (3) cannot be folded as there

            are no NaNs and we are

            <br>

            not guaranteed to get an arbitrary bit pattern from fadd,

            right?

            <br>

          </blockquote>

          <br>

          I think it's exactly the other way round: if NoNans is set

          then you can fold

          <br>

          (2) and (3) to undef.  That's because (IIRC) the NoNans flag

          promises that no

          <br>

          NaNs will be used by the program. However "undef" could be a

          NaN, thus the

          <br>

          promise is broken, meaning the program is performing undefined

          behaviour, and

          <br>

          you can do whatever you want.

          <br>

        </blockquote>

        Oh, I see the point now. I thought if NoNaNs was set then no

        NaNs were possible

        <br>

        at all. But undef is still an arbitrary bit pattern that might

        occasionally be

        <br>

        the same as the one of a NaN. Thank you for the explanation.

        <br>

        <br>

        Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to

        undef, whereas

        <br>

        "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef

        (NoNaNs is set) or a

        <br>

        NaN (NoNaNs is not set).

        <br>

      </blockquote>

      <br>

      For fmul and fdiv, the reasoning does depend on fmul %x, 1.0

      always being equal to %x (likewise: fdiv %x, 1.0 being equal to

      %x).  Is this true?

      <br>

    </blockquote>

    Do you mean that we can't fold "fmul/fdiv undef, undef" to undef
    unless "fmul/fdiv %x, 1.0" is guaranteed to be %x?<br>
    If we choose one undef to be an arbitrary bit pattern and the other
    undef to be 1.0, we need a guarantee that the result has the bit
    pattern of the first undef. Do I understand this correctly?<br>
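    <br>
    For what it's worth, the guarantee in question can be probed on a
    host machine with a small Python sketch. It is only an illustration,
    assuming the host implements IEEE-754 binary64 arithmetic; it says
    nothing about what LLVM itself guarantees, and the helper name
    "bits" is made up for this example. NaN inputs are deliberately left
    out, since their handling is exactly the uncertain part.<br>

```python
import struct

def bits(x):
    """Return the raw 64-bit pattern of a Python float (IEEE-754 binary64)."""
    return struct.unpack("=Q", struct.pack("=d", x))[0]

# Non-NaN samples: both zeros, a subnormal, ordinary values, and infinities.
samples = [0.0, -0.0, 5e-324, 1.5, -2.75, 1e308, float("inf"), float("-inf")]

# True when x * 1.0 and x / 1.0 both return the exact bit pattern of x.
identity_holds = all(
    bits(x * 1.0) == bits(x) and bits(x / 1.0) == bits(x) for x in samples
)
print(identity_holds)
```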

    <br>

    I checked the standard for "x*1.0 == x" and found that only section
    "10.4 Literal meaning and value-changing optimizations" addresses
    it. I don't claim to fully understand this section yet, but as I
    read it, language standards are required to preserve the literal
    meaning of the source code, and applying the identity property
    1 × x is listed among the transformations that do so. Here is the
    quote from IEEE-754:<br>

    <br>

    <i>"The following value-changing transformations, among others,

      preserve the literal meaning of the source</i><i><br>

    </i><i>code:</i><i><br>

    </i><i>― Applying the identity property 0 + x when x is not zero and

      is not a signaling NaN and the result</i><i><br>

    </i><i>has the same exponent as x.</i><i><br>

    </i><i>― Applying the identity property 1 × x when x is not a

      signaling NaN and the result has the same</i><i><br>

    </i><i>exponent as x."</i><i><br>

    </i><i><br>

    </i>Maybe Owen or Stephen would be able to clarify this.<br>

    <br>
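    The "when x is not zero" condition on the 0 + x identity is what
    makes the sign of the zero constant matter for the fadd/fsub folds
    discussed earlier in this thread. Here is a small Python sketch,
    again assuming IEEE-754 binary64 with the default round-to-nearest
    mode on the host; the helper "same_bits" is hypothetical and only
    meant for non-NaN values:<br>

```python
import math

def same_bits(a, b):
    """Compare two non-NaN floats exactly, distinguishing +0.0 from -0.0."""
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)

zeros = [0.0, -0.0]

# Adding -0.0 keeps x for both zeros, matching the fold of "fadd %x, -0.0" to %x.
add_neg_zero = all(same_bits(x + (-0.0), x) for x in zeros)

# Adding +0.0 turns -0.0 into +0.0, so 0 + x is not an identity for zero x.
add_pos_zero = all(same_bits(x + 0.0, x) for x in zeros)

# Subtracting +0.0 keeps x; subtracting -0.0 fails for x = -0.0.
sub_pos_zero = all(same_bits(x - 0.0, x) for x in zeros)
sub_neg_zero = all(same_bits(x - (-0.0), x) for x in zeros)

print(add_neg_zero, add_pos_zero, sub_pos_zero, sub_neg_zero)
```

    This agrees with the existing behaviour described in the thread:
    "fsub %x, +0.0" can always be folded to %x, while "fsub %x, -0.0"
    needs the extra check that %x is not -0.0.<br>
    <br>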

    Thank you.<br>

    Oleg<br>

    <blockquote cite="mid:54217C76.3020001@deepbluecap.com" type="cite">

      <br>

      Ciao, Duncan.

      <br>

      <br>

      <blockquote type="cite">

        <br>

        Oleg

        <br>

        <blockquote type="cite">

          <br>

          <blockquote type="cite">

            <br>

            Other arithmetic FP operations (fsub, fmul, fdiv) also

            propagate NaNs. Thus, the

            <br>

            same rules seem applicable to them as well:

            <br>

---------------------------------------------------------------------

            <br>

            - fdiv:

            <br>

                    (4) "fdiv %x, undef" is now folded to undef.

            <br>

          </blockquote>

          <br>

          But should be folded to NaN, not undef.

          <br>

          <br>

          <blockquote type="cite">        The code comment states this

            is done because undef might be a sNaN. We

            <br>

            can't rely on sNaNs as they can either be masked or the

            platform might not have

            <br>

            FP exceptions at all. Nevertheless, such folding is still

            correct due to the NaN

            <br>

            propagation rules we found in the Standard - undef might be

            chosen to be a NaN

            <br>

            and its payload will be propagated.

            <br>

                    Moreover, this looks similar to (3) and can be

            folded to a NaN. /Is it

            <br>

            worth doing?/

            <br>

          </blockquote>

          <br>

          As the current folding to undef is wrong, it has to be fixed.

          <br>

          <br>

          <blockquote type="cite">

            <br>

                    (5) fdiv undef, undef    ->    undef

            <br>

          </blockquote>

          <br>

          Yup.

          <br>

          <br>

          <blockquote type="cite">---------------------------------------------------------------------

            <br>

            - fmul:

            <br>

                    (6) fmul undef, undef    ->    undef

            <br>

          </blockquote>

          <br>

          Yup.

          <br>

          <br>

          <blockquote type="cite">        (7) fmul %x, undef       

            -> NaN or undef (undef is a NaN, which is

            <br>

            propagated)

            <br>

          </blockquote>

          <br>

          Should be folded to NaN, not undef.

          <br>

          <br>

          <blockquote type="cite">---------------------------------------------------------------------

            <br>

            - fsub:

            <br>

                    (8) fsub %x, -0.0           ->    %x  (if %x is

            not -0.0; works this way

            <br>

            now)

            <br>

          </blockquote>

          <br>

          Should this be: fsub %x, +0.0 ?

          <br>

        </blockquote>

        fsub %x, +0.0 is also covered and always folded to %x.

        <br>

        The version with -0.0 is similar except it additionally checks

        if %x is not -0.0.

        <br>

        <blockquote type="cite">

          <br>

          <blockquote type="cite">        (9) fsub %x, undef       

            -> NaN or undef (undef is a NaN, which is

            <br>

            propagated)

            <br>

          </blockquote>

          <br>

          Should fold to NaN not undef.

          <br>

          <br>

          <blockquote type="cite">      (10) fsub undef, undef   ->

            undef

            <br>

          </blockquote>

          <br>

          Yup.

          <br>

          <br>

          Ciao, Duncan.

          <br>

          <br>

          <blockquote type="cite">---------------------------------------------------------------------

            <br>

            <br>

            I will be very thankful if you could review this final

            summary and share your

            <br>

            thoughts.

            <br>

            <br>

            Thank you.

            <br>

            <br>

            P.S. Sorry for bothering you again and again.

            <br>

            Just want to make sure I clearly understand the subject in

            order to make correct

            <br>

            code changes and to be able to help others with this in the

            future.

            <br>

            <br>

            Kind regards,

            <br>

            Oleg

            <br>

            <br>

            On 16.09.2014 21:42, Duncan Sands wrote:

            <br>

            <blockquote type="cite">On 16/09/14 19:37, Owen Anderson

              wrote:

              <br>

              <blockquote type="cite">As far as I know, LLVM does not

                try very hard to guarantee constant folded

                <br>

                NaN payloads that match exactly what the target would

                generate.

                <br>

              </blockquote>

              <br>

              I'm with Owen here.  Unless ARM people object, I think it

              is reasonable to say

              <br>

              that at the LLVM IR level we may assume that the IEEE

              rules are followed.

              <br>

              <br>

              Ciao, Duncan.

              <br>

              <br>

              <blockquote type="cite">

                <br>

                —Owen

                <br>

                <br>

                <blockquote type="cite">On Sep 16, 2014, at 10:30 AM,

                  Oleg Ranevskyy <a class="moz-txt-link-rfc2396E" href="mailto:llvm.mail.list@gmail.com">&lt;llvm.mail.list@gmail.com&gt;</a>

                  <br>

                  wrote:

                  <br>

                  <br>

                  Hi Duncan,

                  <br>

                  <br>

                  I reread everything we've discussed so far and would

                  like to pay closer

                  <br>

                  attention to the ARM FPSCR register mentioned by

                  Stephen.

                  <br>

                  It's really possible on ARM systems that floating

                  point operations on one or

                  <br>

                  more qNaN operands return a NaN different from the

                  operands. I.e. operand

                  <br>

                  NaN is not propagated. This happens when the "default

                  NaN" flag is set in

                  <br>

                  the FPSCR (floating point status and control

                  register). The result in this

                  <br>

                  case is some default NaN value.

                  <br>

                  <br>

                  This means "fadd %x, -0.0", which is currently folded

                  to %x by

                  <br>

                  InstructionSimplify, might produce a different result

                  if %x is a NaN. This

                  <br>

                  breaks the NaN propagation rules the IEEE standard

                  establishes and

                  <br>

                  significantly reduces folding capabilities for the FP

                  operations.

                  <br>

                  <br>

                  This also applies to "fadd undef, undef" and "fadd %x,

                  undef". We can't rely

                  <br>

                  on getting an arbitrary NaN here on ARMs.

                  <br>

                  <br>

                  Would you be able to confirm this please?

                  <br>

                  <br>

                  Thank you in advance for your time!

                  <br>

                  <br>

                  Kind regards,

                  <br>

                  Oleg

                  <br>

                  <br>

                  On 10.09.2014 22:50, Duncan Sands wrote:

                  <br>

                  <blockquote type="cite">Hi Oleg,

                    <br>

                    <br>

                    On 01/09/14 18:46, Oleg Ranevskyy wrote:

                    <br>

                    <blockquote type="cite">Hi Duncan,

                      <br>

                      <br>

                      I looked through the IEEE standard and here is

                      what I found:

                      <br>

                      <br>

                      *6.2 Operations with NaNs*

                      <br>

                      /"For an operation with quiet NaN inputs, other

                      than maximum and minimum

                      <br>

                      operations, if a floating-point result is to be

                      delivered the result shall

                      <br>

                      be a

                      <br>

                      quiet NaN which should be one of the input NaNs"/.

                      <br>

                      <br>

                      *6.2.3 NaN propagation*

                      <br>

                      /"An operation that propagates a NaN operand to

                      its result and has a

                      <br>

                      single NaN

                      <br>

                      as an input should produce a NaN with the payload

                      of the input NaN if

                      <br>

                      representable in the destination format"./

                      <br>

                    </blockquote>

                    <br>

                    thanks for finding this out.

                    <br>

                    <br>

                    <blockquote type="cite">

                      <br>

                      Floating point add propagates a NaN. There is no

                      conversion in the

                      <br>

                      context of

                      <br>

                      LLVM's fadd. So, if %x in "fadd %x, -0.0" is a

                      NaN, the result is also a

                      <br>

                      NaN

                      <br>

                      with the same payload.

                      <br>

                    </blockquote>

                    <br>

                    Yes, folding "fadd %x, -0.0" to "%x" is correct. 

                    This implies that "fadd

                    <br>

                    undef, undef" can be folded to "undef".

                    <br>

                    <br>

                    <blockquote type="cite">

                      <br>

                      As regards "fadd %x, undef", where %x might be a

                      NaN and undef might be

                      <br>

                      chosen

                      <br>

                      to be (probably some different) NaN, and a

                      possibility to fold this to a

                      <br>

                      constant (NaN), the standard says:

                      <br>

                      /"If two or more inputs are NaN, then the payload

                      of the resulting NaN

                      <br>

                      should be

                      <br>

                      identical to the payload of one of the input NaNs

                      if representable in the

                      <br>

                      destination format. *This standard does not

                      specify which of the input

                      <br>

                      NaNs will

                      <br>

                      provide the payload*"/.

                      <br>

                      <br>

                      Thus, this makes it possible to fold "fadd %x,

                      undef" to a NaN. Is this

                      <br>

                      right?

                      <br>

                    </blockquote>

                    <br>

                    Yes, I agree.

                    <br>

                    <br>

                    Ciao, Duncan.

                    <br>

                    <br>

                    <blockquote type="cite">

                      <br>

                      Oleg

                      <br>

                      <br>

                      On 01.09.2014 10:04, Duncan Sands wrote:

                      <br>

                      <blockquote type="cite">Hi Oleg,

                        <br>

                        <br>

                        On 01/09/14 15:42, Oleg Ranevskyy wrote:

                        <br>

                        <blockquote type="cite">Hi,

                          <br>

                          <br>

                          Thank you for your comment, Owen.

                          <br>

                          My LLVM expertise is certainly not enough to

                          make such decisions yet.

                          <br>

                          Duncan, do you have any comments on this or do

                          you know anyone else

                          <br>

                          who can

                          <br>

                          decide about preserving NaN payloads?

                          <br>

                        </blockquote>

                        <br>

                        my take is that the first thing to do is to see

                        what the IEEE standard

                        <br>

                        says

                        <br>

                        about NaNs.  Consider for example "fadd x,

                        -0.0". Does the standard

                        <br>

                        specify

                        <br>

                        the exact NaN bit pattern produced as output

                        when a particular NaN x is

                        <br>

                        input?  Or does it just say that the output is a

                        NaN? If the standard

                        <br>

                        doesn't

                        <br>

                        care exactly which NaN is output, I think it is

                        reasonable for LLVM to

                        <br>

                        assume

                        <br>

                        it is whatever NaN is most convenient for LLVM;

                        in this case that means

                        <br>

                        using

                        <br>

                        x itself as the output.

                        <br>

                        <br>

                        However this approach does implicitly mean that

                        we may end up not folding

                        <br>

                        floating point operations completely

                        deterministically: depending on the

                        <br>

                        optimization that kicks in, in one case we might

                        fold to NaN A, and in

                        <br>

                        some

                        <br>

                        different optimization we might fold the same

                        expression to NaN B.  I

                        <br>

                        think

                        <br>

                        this is pretty reasonable, but it is something

                        to be aware of.

                        <br>

                        <br>

                        Ciao, Duncan.

                        <br>

                      </blockquote>

                      <br>

                    </blockquote>

                    <br>

                  </blockquote>

                  <br>

                </blockquote>

                <br>

              </blockquote>

              <br>

            </blockquote>

            <br>

          </blockquote>

          <br>

        </blockquote>

        <br>

      </blockquote>

      <br>

    </blockquote>

    <br>

  </body>

</html>