<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">Hi Oleg,<br>

      <br>

      What is the status on this? <br>

      Did you move on with a patch?<br>

      <br>

      See also below:<br>

      <br>

      On 9/23/14 8:32 AM, Oleg Ranevskyy wrote:<br>

    </div>

    <blockquote cite="mid:54219274.2030405@gmail.com" type="cite">


      Hi Duncan,<br>

      <br>

      <div class="moz-cite-prefix">On 23.09.2014 17:58, Duncan Sands

        wrote:<br>

      </div>

      <blockquote cite="mid:54217C76.3020001@deepbluecap.com"

        type="cite">Hi Oleg, <br>

        <br>

        On 22/09/14 17:56, Oleg Ranevskyy wrote: <br>

        <blockquote type="cite">Hi Duncan, <br>

          <br>

          On 17.09.2014 21:10, Duncan Sands wrote: <br>

          <blockquote type="cite">Hi Oleg, <br>

            <br>

            On 17/09/14 18:45, Oleg Ranevskyy wrote: <br>

            <blockquote type="cite">Hi, <br>

              <br>

              Thank you for all your helpful comments. <br>

              <br>

              To sum up, below is the list of correct folding examples

              for fadd: <br>

                      (1)  fadd %x, -0.0            ->     %x <br>

                      (2)  fadd undef, undef    ->     undef <br>

                      (3)  fadd %x, undef         ->     NaN  (undef

              is a NaN which is <br>

              propagated) <br>

              <br>

              Looking through the code I found the "NoNaNs" flag

              accessed through an instance <br>

              of the FastMathFlags class. <br>

              (2) and (3) should probably depend on it. <br>

              If the flag is set, (2) and (3) cannot be folded as there

              are no NaNs and we are <br>

              not guaranteed to get an arbitrary bit pattern from fadd,

              right? <br>

            </blockquote>

            <br>

            I think it's exactly the other way round: if NoNaNs is set

            then you can fold <br>

            (2) and (3) to undef.  That's because (IIRC) the NoNaNs flag

            promises that no <br>

            NaNs will be used by the program. However "undef" could be a

            NaN, thus the <br>

            promise is broken, meaning the program is performing

            undefined behaviour, and <br>

            you can do whatever you want. <br>

          </blockquote>

          Oh, I see the point now. I thought if NoNaNs was set then no

          NaNs were possible <br>

          at all. But undef is still an arbitrary bit pattern that might

          occasionally be <br>

          the same as the one of a NaN. Thank you for the explanation. <br>

          <br>

          Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded

          to undef, whereas <br>

          "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef

          (NoNaNs is set) or a <br>

          NaN (NoNaNs is not set). <br>
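          <br>
          The rule above can be sketched as a small decision function.
          This is a Python model of the summary only, not LLVM code: the
          function name and its boolean parameters are hypothetical, and
          it takes the "undef, undef" case at face value. <br>
          <pre wrap="">
```python
def fold_binop_with_undef(lhs_undef, rhs_undef, no_nans):
    """Return "undef", "nan", or None (no fold) for fadd/fsub/fmul/fdiv,
    following the rules summarised in this thread."""
    if lhs_undef and rhs_undef:
        return "undef"           # both inputs are unconstrained
    if lhs_undef or rhs_undef:
        # With NoNaNs set, choosing undef = NaN already breaks the promise
        # made by the flag, so any result is allowed. Without the flag,
        # undef is chosen to be a NaN, which propagates to the result.
        return "undef" if no_nans else "nan"
    return None                  # no undef operand: the rule does not apply


print(fold_binop_with_undef(False, True, no_nans=False))  # prints: nan
print(fold_binop_with_undef(False, True, no_nans=True))   # prints: undef
```
</pre>
          The model only encodes the two cases listed above; it says
          nothing about which NaN constant would be picked. <br>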

        </blockquote>

        <br>

        for fmul and fdiv, the reasoning does depend on fmul %x, 1.0

        always being equal to %x (likewise: fdiv %x, 1.0 being equal to

        %x).  Is this true? <br>

      </blockquote>

      Do you mean that we can't apply "fmul/fdiv undef, undef" to undef

      folding if "fmul/fdiv %x, 1.0" is not guaranteed to be %x?<br>

      If we choose one undef to have an arbitrary bit pattern and

      another undef = 1.0, we need a guarantee to get the bit pattern of

      the first undef. Do I get it right?<br>

    </blockquote>

    I don't think so. I don't think it makes sense to try to reason

    about each individual case like that. <br>

    If you consider returning an undef, I believe the question is "can

    you form all the possible bit patterns that the undef represents from

    the expression?"<br>

    <br>

    if you have:  <br>

      op float undef, undef <br>

    <br>

    then, since the inputs are unconstrained, the question is: can "op"

    produce every possible float value as an output?<br>

    If yes then folding to undef is valid.<br>

    <br>

    For instance fabs(undef) can't fold to undef because the range of

    values undef represents is larger than what fabs can produce: fabs

    never returns a value with the sign bit set.<br>
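    <br>
    A minimal sketch of that range argument, in Python rather than LLVM
    IR. The helper name bits_of and the sample list are made up for
    illustration, and doubles stand in for whatever float type "op"
    works on: <br>
    <pre wrap="">
```python
import math
import struct

SIGN_BIT = 2**63  # top bit of an IEEE-754 binary64 value


def bits_of(x):
    """Return the 64-bit pattern of a Python float as an int."""
    return struct.unpack("=Q", struct.pack("=d", x))[0]


# A few representative values that undef could be chosen to be.
samples = [0.0, -0.0, 1.5, -1.5, math.inf, -math.inf]

# Collect the bit patterns fabs produces and keep those with the sign bit set.
fabs_outputs = {bits_of(math.fabs(x)) for x in samples}
sign_set = [b for b in fabs_outputs if b & SIGN_BIT]

# -1.5 is a value undef may represent, but fabs never produces it.
reachable = bits_of(-1.5) in fabs_outputs
print(sign_set, reachable)  # prints: [] False
```
</pre>
    Sampling is of course not a proof; the point is only that a single
    unreachable bit pattern is enough to make the fold to undef invalid.<br>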

    <br>

    -- <br>

    Mehdi<br>

    <br>

    <br>

    <blockquote cite="mid:54219274.2030405@gmail.com" type="cite"> <br>

      I checked the standard regarding "x*1.0 == x" and found that only

      "10.4 Literal meaning and value-changing optimizations" addresses

      this. I don't pretend to thoroughly understand this paragraph yet,

      but it seems to me that language standards are required to

      preserve the literal meaning of the source code. Applying the

      identity property x*1 is a part of this. Here is a quote from

      IEEE-754:<br>

      <br>

      <i>"The following value-changing transformations, among others,

        preserve the literal meaning of the source</i><i><br>

      </i><i>code:</i><i><br>

      </i><i>― Applying the identity property 0 + x when x is not zero

        and is not a signaling NaN and the result</i><i><br>

      </i><i>has the same exponent as x.</i><i><br>

      </i><i>― Applying the identity property 1 × x when x is not a

        signaling NaN and the result has the same</i><i><br>

      </i><i>exponent as x."</i><i><br>

      </i><i><br>

      </i>Maybe Owen or Stephen would be able to clarify this.<br>
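      <br>
      As a quick sanity check of the identity for ordinary inputs, here
      is a small Python sketch (bits_of is a hypothetical helper, and
      Python floats are assumed to be IEEE binary64 with default
      rounding). NaNs are left out on purpose, since signaling NaNs are
      exactly what the quoted text carves out:<br>
      <pre wrap="">
```python
import struct


def bits_of(x):
    """Return the 64-bit pattern of a Python float as an int."""
    return struct.unpack("=Q", struct.pack("=d", x))[0]


# Non-NaN samples: signed zeros, a subnormal, ordinary values, infinities.
samples = [0.0, -0.0, 5e-324, -3.75, 1e308, float("inf"), float("-inf")]

# Compare bit patterns rather than values so that -0.0 and +0.0 differ.
mul_identity_holds = all(bits_of(1.0 * x) == bits_of(x) for x in samples)
div_identity_holds = all(bits_of(x / 1.0) == bits_of(x) for x in samples)
print(mul_identity_holds, div_identity_holds)  # prints: True True
```
</pre>
      This says nothing about NaN inputs, which is where the open
      question lies.<br>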

      <br>

      Thank you.<br>

      Oleg<br>

      <blockquote cite="mid:54217C76.3020001@deepbluecap.com"

        type="cite"> <br>

        Ciao, Duncan. <br>

        <br>

        <blockquote type="cite"> <br>

          Oleg <br>

          <blockquote type="cite"> <br>

            <blockquote type="cite"> <br>

              Other arithmetic FP operations (fsub, fmul, fdiv) also

              propagate NaNs. Thus, the <br>

              same rules seem applicable to them as well: <br>

              ---------------------------------------------------------------------


              <br>

              - fdiv: <br>

                      (4) "fdiv %x, undef" is now folded to undef. <br>

            </blockquote>

            <br>

            But should be folded to NaN, not undef. <br>

            <br>

            <blockquote type="cite">        The code comment states this

              is done because undef might be a sNaN. We <br>

              can't rely on sNaNs as they can either be masked or the

              platform might not have <br>

              FP exceptions at all. Nevertheless, such folding is still

              correct due to the NaN <br>

              propagation rules we found in the Standard - undef might

              be chosen to be a NaN <br>

              and its payload will be propagated. <br>

                      Moreover, this looks similar to (3) and can be

              folded to a NaN. /Is it <br>

              worth doing?/ <br>

            </blockquote>

            <br>

            As the current folding to undef is wrong, it has to be

            fixed. <br>

            <br>

            <blockquote type="cite"> <br>

                      (5) fdiv undef, undef    ->    undef <br>

            </blockquote>

            <br>

            Yup. <br>

            <br>

            <blockquote type="cite">---------------------------------------------------------------------


              <br>

              - fmul: <br>

                      (6) fmul undef, undef    ->    undef <br>

            </blockquote>

            <br>

            Yup. <br>

            <br>

            <blockquote type="cite">        (7) fmul %x, undef       

              -> NaN or undef (undef is a NaN, which is <br>

              propagated) <br>

            </blockquote>

            <br>

            Should be folded to NaN, not undef. <br>

            <br>

            <blockquote type="cite">---------------------------------------------------------------------


              <br>

              - fsub: <br>

                      (8) fsub %x, -0.0           ->    %x  (if %x is

              not -0.0; works this way <br>

              now) <br>

            </blockquote>

            <br>

            Should this be: fsub %x, +0.0 ? <br>

          </blockquote>

          fsub %x, +0.0 is also covered and always folded to %x. <br>

          The version with -0.0 is similar except it additionally checks

          if %x is not -0.0. <br>
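          <br>
          The difference between the two zeros can be seen with a short
          Python sketch (helper names are made up, doubles and the
          default round-to-nearest mode are assumed): <br>
          <pre wrap="">
```python
import struct


def bits_of(x):
    """Return the 64-bit pattern of a Python float as an int."""
    return struct.unpack("=Q", struct.pack("=d", x))[0]


def same_bits(a, b):
    return bits_of(a) == bits_of(b)


samples = [0.0, -0.0, 1.0, -2.5, float("inf"), float("-inf")]

# x - (+0.0) reproduces x bit-for-bit for every non-NaN sample.
sub_pos_zero_ok = all(same_bits(x - 0.0, x) for x in samples)

# x - (-0.0) differs from x only for x = -0.0, where the result is +0.0.
sub_neg_zero_bad = [x for x in samples if not same_bits(x - (-0.0), x)]
print(sub_pos_zero_ok, sub_neg_zero_bad)  # prints: True [-0.0]
```
</pre>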

          <blockquote type="cite"> <br>

            <blockquote type="cite">        (9) fsub %x, undef       

              -> NaN or undef (undef is a NaN, which is <br>

              propagated) <br>

            </blockquote>

            <br>

            Should fold to NaN not undef. <br>

            <br>

            <blockquote type="cite">      (10) fsub undef, undef   ->

              undef <br>

            </blockquote>

            <br>

            Yup. <br>

            <br>

            Ciao, Duncan. <br>

            <br>

            <blockquote type="cite">---------------------------------------------------------------------


              <br>

              <br>

              I will be very thankful if you could review this final

              summary and share your <br>

              thoughts. <br>

              <br>

              Thank you. <br>

              <br>

              P.S. Sorry for bothering you again and again. <br>

              Just want to make sure I clearly understand the subject in

              order to make correct <br>

              code changes and to be able to help others with this in

              the future. <br>

              <br>

              Kind regards, <br>

              Oleg <br>

              <br>

              On 16.09.2014 21:42, Duncan Sands wrote: <br>

              <blockquote type="cite">On 16/09/14 19:37, Owen Anderson

                wrote: <br>

                <blockquote type="cite">As far as I know, LLVM does not

                  try very hard to guarantee constant folded <br>

                  NaN payloads that match exactly what the target would

                  generate. <br>

                </blockquote>

                <br>

                I'm with Owen here.  Unless ARM people object, I think

                it is reasonable to say <br>

                that at the LLVM IR level we may assume that the IEEE

                rules are followed. <br>

                <br>

                Ciao, Duncan. <br>

                <br>

                <blockquote type="cite"> <br>

                  —Owen <br>

                  <br>

                  <blockquote type="cite">On Sep 16, 2014, at 10:30 AM,

                    Oleg Ranevskyy <a moz-do-not-send="true"

                      class="moz-txt-link-rfc2396E"

                      href="mailto:llvm.mail.list@gmail.com">&lt;llvm.mail.list@gmail.com&gt;</a>

                    <br>

                    wrote: <br>

                    <br>

                    Hi Duncan, <br>

                    <br>

                    I reread everything we've discussed so far and would

                    like to pay closer <br>

                    attention to ARM's FPSCR register mentioned

                    by Stephen. <br>

                    It's really possible on ARM systems that floating

                    point operations on one or <br>

                    more qNaN operands return a NaN different from the

                    operands. I.e. operand <br>

                    NaN is not propagated. This happens when the

                    "default NaN" flag is set in <br>

                    the FPSCR (floating point status and control

                    register). The result in this <br>

                    case is some default NaN value. <br>

                    <br>

                    This means "fadd %x, -0.0", which is currently

                    folded to %x by <br>

                    InstructionSimplify, might produce a different

                    result if %x is a NaN. This <br>

                    breaks the NaN propagation rules the IEEE standard

                    establishes and <br>

                    significantly reduces folding capabilities for the

                    FP operations. <br>
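                    <br>
                    To make the concern concrete, here is a toy Python
                    model working on raw 64-bit patterns. It is not ARM
                    or LLVM code: the function name is hypothetical, and
                    the default NaN pattern used (positive, quiet, zero
                    payload) is an assumption of this sketch. <br>
                    <pre wrap="">
```python
QUIET_BIT = 0x0008000000000000
DEFAULT_NAN = 0x7FF8000000000000  # assumed default NaN pattern for binary64


def hardware_fadd_nan(x_nan_bits, default_nan_mode):
    """Model of the NaN produced by "fadd %x, -0.0" when %x is a NaN."""
    if default_nan_mode:
        return DEFAULT_NAN           # the operand's payload is discarded
    return x_nan_bits | QUIET_BIT    # IEEE-style propagation of the operand


x = 0x7FF8000000000ABC               # a quiet NaN with payload 0xABC
folded = x                           # what folding to %x yields
print(folded == hardware_fadd_nan(x, False))  # prints: True
print(folded == hardware_fadd_nan(x, True))   # prints: False
```
</pre>
                    With the flag set, the folded value and the modelled
                    hardware result are both NaNs but differ in their bits. <br>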

                    <br>

                    This also applies to "fadd undef, undef" and "fadd

                    %x, undef". We can't rely <br>

                    on getting an arbitrary NaN here on ARMs. <br>

                    <br>

                    Would you be able to confirm this please? <br>

                    <br>

                    Thank you in advance for your time! <br>

                    <br>

                    Kind regards, <br>

                    Oleg <br>

                    <br>

                    On 10.09.2014 22:50, Duncan Sands wrote: <br>

                    <blockquote type="cite">Hi Oleg, <br>

                      <br>

                      On 01/09/14 18:46, Oleg Ranevskyy wrote: <br>

                      <blockquote type="cite">Hi Duncan, <br>

                        <br>

                        I looked through the IEEE standard and here is

                        what I found: <br>

                        <br>

                        *6.2 Operations with NaNs* <br>

                        /"For an operation with quiet NaN inputs, other

                        than maximum and minimum <br>

                        operations, if a floating-point result is to be

                        delivered the result shall <br>

                        be a <br>

                        quiet NaN which should be one of the input

                        NaNs"/. <br>

                        <br>

                        *6.2.3 NaN propagation* <br>

                        /"An operation that propagates a NaN operand to

                        its result and has a <br>

                        single NaN <br>

                        as an input should produce a NaN with the

                        payload of the input NaN if <br>

                        representable in the destination format"./ <br>
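                        <br>
                        For readers less familiar with the encoding,
                        here is a small Python sketch of how a binary64
                        NaN splits into quiet bit and payload. The
                        helper names are made up, and the "top fraction
                        bit set means quiet" convention is the one IEEE
                        754-2008 recommends for binary formats, so this
                        is an assumption about the target: <br>
                        <pre wrap="">
```python
EXP_MASK = 0x7FF0000000000000   # 11 exponent bits of binary64
FRAC_MASK = 0x000FFFFFFFFFFFFF  # 52 fraction bits
QUIET_BIT = 0x0008000000000000  # top fraction bit, assumed 1 = quiet


def is_nan(b):
    """All exponent bits set and a non-zero fraction."""
    return (b & EXP_MASK) == EXP_MASK and (b & FRAC_MASK) != 0


def is_quiet(b):
    return is_nan(b) and (b & QUIET_BIT) != 0


def payload(b):
    """Fraction bits below the quiet bit."""
    return b & (FRAC_MASK ^ QUIET_BIT)


def propagate_single(nan_bits):
    """Model of the 6.2.3 recommendation for a single NaN input:
    the result is quiet and keeps the payload of the input."""
    return nan_bits | QUIET_BIT


qnan = 0x7FF8000000001234
snan = 0x7FF0000000001234
result = propagate_single(snan)
print(hex(payload(qnan)), is_quiet(result), payload(result) == payload(snan))
# prints: 0x1234 True True
```
</pre>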

                      </blockquote>

                      <br>

                      thanks for finding this out. <br>

                      <br>

                      <blockquote type="cite"> <br>

                        Floating point add propagates a NaN. There is no

                        conversion in the <br>

                        context of <br>

                        LLVM's fadd. So, if %x in "fadd %x, -0.0" is a

                        NaN, the result is also a <br>

                        NaN <br>

                        with the same payload. <br>

                      </blockquote>

                      <br>

                      Yes, folding "fadd %x, -0.0" to "%x" is correct. 

                      This implies that "fadd <br>

                      undef, undef" can be folded to "undef". <br>

                      <br>

                      <blockquote type="cite"> <br>

                        As regards "fadd %x, undef", where %x might be a

                        NaN and undef might be <br>

                        chosen <br>

                        to be a (probably different) NaN, and a

                        possibility to fold this to a <br>

                        constant (NaN), the standard says: <br>

                        /"If two or more inputs are NaN, then the

                        payload of the resulting NaN <br>

                        should be <br>

                        identical to the payload of one of the input

                        NaNs if representable in the <br>

                        destination format. *This standard does not

                        specify which of the input <br>

                        NaNs will <br>

                        provide the payload*"/. <br>

                        <br>

                        Thus, this makes it possible to fold "fadd %x,

                        undef" to a NaN. Is this <br>

                        right? <br>

                      </blockquote>

                      <br>

                      Yes, I agree. <br>

                      <br>

                      Ciao, Duncan. <br>

                      <br>

                      <blockquote type="cite"> <br>

                        Oleg <br>

                        <br>

                        On 01.09.2014 10:04, Duncan Sands wrote: <br>

                        <blockquote type="cite">Hi Oleg, <br>

                          <br>

                          On 01/09/14 15:42, Oleg Ranevskyy wrote: <br>

                          <blockquote type="cite">Hi, <br>

                            <br>

                            Thank you for your comment, Owen. <br>

                            My LLVM expertise is certainly not enough to

                            make such decisions yet. <br>

                            Duncan, do you have any comments on this or

                            do you know anyone else <br>

                            who can <br>

                            decide about preserving NaN payloads? <br>

                          </blockquote>

                          <br>

                          my take is that the first thing to do is to

                          see what the IEEE standard <br>

                          says <br>

                          about NaNs.  Consider for example "fadd x,

                          -0.0". Does the standard <br>

                          specify <br>

                          the exact NaN bit pattern produced as output

                          when a particular NaN x is <br>

                          input?  Or does it just say that the output is

                          a NaN? If the standard <br>

                          doesn't <br>

                          care exactly which NaN is output, I think it

                          is reasonable for LLVM to <br>

                          assume <br>

                          it is whatever NaN is most convenient for

                          LLVM; in this case that means <br>

                          using <br>

                          x itself as the output. <br>

                          <br>

                          However this approach does implicitly mean

                          that we may end up not folding <br>

                          floating point operations completely

                          deterministically: depending on the <br>

                          optimization that kicks in, in one case we

                          might fold to NaN A, and in <br>

                          some <br>

                          different optimization we might fold the same

                          expression to NaN B.  I <br>

                          think <br>

                          this is pretty reasonable, but it is something

                          to be aware of. <br>

                          <br>

                          Ciao, Duncan. <br>

                        </blockquote>

                        <br>

                      </blockquote>

                      <br>

                    </blockquote>

                    <br>

                  </blockquote>

                  <br>

                </blockquote>

                <br>

              </blockquote>

              <br>

            </blockquote>

            <br>

          </blockquote>

          <br>

        </blockquote>

        <br>

      </blockquote>

      <br>

      <br>


      <br>

      <pre wrap="">_______________________________________________

LLVM Developers mailing list

<a class="moz-txt-link-abbreviated" href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a class="moz-txt-link-freetext" href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a>

<a class="moz-txt-link-freetext" href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a>

</pre>

    </blockquote>

    <br>

  </body>

</html>