<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    Hi Duncan,<br>
    <br>
    <div class="moz-cite-prefix">On 23.09.2014 17:58, Duncan Sands
      wrote:<br>
    </div>
    <blockquote cite="mid:54217C76.3020001@deepbluecap.com" type="cite">Hi
      Oleg,
      <br>
      <br>
      On 22/09/14 17:56, Oleg Ranevskyy wrote:
      <br>
      <blockquote type="cite">Hi Duncan,
        <br>
        <br>
        On 17.09.2014 21:10, Duncan Sands wrote:
        <br>
        <blockquote type="cite">Hi Oleg,
          <br>
          <br>
          On 17/09/14 18:45, Oleg Ranevskyy wrote:
          <br>
          <blockquote type="cite">Hi,
            <br>
            <br>
            Thank you for all your helpful comments.
            <br>
            <br>
            To sum up, below is the list of correct folding examples for
            fadd:
            <br>
                    (1)  fadd %x, -0.0            ->     %x
            <br>
                    (2)  fadd undef, undef    ->     undef
            <br>
                    (3)  fadd %x, undef         ->     NaN  (undef is
            a NaN which is
            <br>
            propagated)
            <br>
            <br>
            Looking through the code I found the "NoNaNs" flag accessed
            through an instance
            <br>
            of the FastMathFlags class.
            <br>
            (2) and (3) should probably depend on it.
            <br>
            If the flag is set, (2) and (3) cannot be folded as there
            are no NaNs and we are
            <br>
            not guaranteed to get an arbitrary bit pattern from fadd,
            right?
            <br>
          </blockquote>
          <br>
          I think it's exactly the other way round: if NoNaNs is set
          then you can fold
          <br>
          (2) and (3) to undef.  That's because (IIRC) the NoNaNs flag
          promises that no
          <br>
          NaNs will be used by the program. However "undef" could be a
          NaN, thus the
          <br>
          promise is broken, meaning the program is performing undefined
          behaviour, and
          <br>
          you can do whatever you want.
          <br>
        </blockquote>
        Oh, I see the point now. I thought if NoNaNs was set then no
        NaNs were possible
        <br>
        at all. But undef is still an arbitrary bit pattern that might
        happen to be
        <br>
        the bit pattern of a NaN. Thank you for the explanation.
        <br>
        <br>
        Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to
        undef, whereas
        <br>
        "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef
        (NoNaNs is set) or a
        <br>
        NaN (NoNaNs is not set).
        <br>
      </blockquote>
      <br>
      for fmul and fdiv, the reasoning does depend on fmul %x, 1.0
      always being equal to %x (likewise: fdiv %x, 1.0 being equal to
      %x).  Is this true?
      <br>
    </blockquote>
    Do you mean that "fmul/fdiv undef, undef" can only be folded to undef
    if "fmul/fdiv %x, 1.0" is guaranteed to return %x?<br>
    That is, if we choose one undef to be an arbitrary bit pattern and the
    other undef to be 1.0, the fold is valid only if the result is
    guaranteed to have the bit pattern of the first undef. Is my
    understanding correct?<br>
    <br>
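    To make sure I am asking the right question, here is a minimal Python
    sketch of the non-NaN part of it. This is my own illustration, not LLVM
    code: the helper names "bits" and "identity_holds" are made up, and a
    Python float is assumed to be an IEEE-754 binary64 value. It only
    checks that multiplying or dividing by 1.0 returns exactly the bit
    pattern of the other operand for a few non-NaN inputs. NaN inputs are
    left out on purpose, since whether their payload survives is precisely
    what I am unsure about:<br>
    <pre>
```python
import struct


def bits(x):
    """Return the 64-bit pattern of a Python float as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


# Non-NaN samples: both zeros, the smallest subnormal, ordinary values,
# a large finite value and both infinities.
SAMPLES = [0.0, -0.0, 5e-324, 1.5, -2.25, 1e308, float("inf"), float("-inf")]


def identity_holds(op):
    """True if op(x) has exactly the bit pattern of x for every sample."""
    return all(bits(op(x)) == bits(x) for x in SAMPLES)


mul_ok = identity_holds(lambda x: x * 1.0)  # models fmul %x, 1.0
div_ok = identity_holds(lambda x: x / 1.0)  # models fdiv %x, 1.0
print(mul_ok, div_ok)
```
</pre>
    On an IEEE-754 machine I would expect both checks to pass, so the
    remaining doubt is limited to the case where the chosen bit pattern is
    a NaN.<br>
    <br>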
    I checked the standard regarding "x*1.0 == x" and found that only
    section "10.4 Literal meaning and value-changing optimizations"
    addresses this. I don't claim to fully understand this section yet,
    but it seems to me that language standards are required to preserve
    the literal meaning of the source code, and that applying the identity
    property 1 × x is one of the transformations that preserves it. Here
    is a quote from IEEE-754:<br>
    <br>
    <i>"The following value-changing transformations, among others,
      preserve the literal meaning of the source code:</i><br>
    <i>- Applying the identity property 0 + x when x is not zero and
      is not a signaling NaN and the result has the same exponent as x.</i><br>
    <i>- Applying the identity property 1 × x when x is not a
      signaling NaN and the result has the same exponent as x."</i><br>
    <br>
    Maybe Owen or Stephen would be able to clarify this.<br>
    <br>
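    The "x is not zero" condition in the first quoted item can be seen in
    a small Python sketch. Again, this is only my own illustration under
    the assumption that a Python float is an IEEE-754 binary64 value with
    the default rounding mode; the names "bits", "exact" and "INPUTS" are
    made up. It compares bit patterns for the zero-related identities
    discussed in this thread:<br>
    <pre>
```python
import struct


def bits(x):
    """Return the 64-bit pattern of a Python float as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


INPUTS = [0.0, -0.0, 1.5]  # +0.0, -0.0 and an ordinary value


def exact(op):
    """For each input, report whether op(x) is bit-identical to x."""
    return [bits(op(x)) == bits(x) for x in INPUTS]


add_neg_zero = exact(lambda x: x + (-0.0))  # models fadd %x, -0.0
add_pos_zero = exact(lambda x: 0.0 + x)     # models the identity 0 + x
sub_pos_zero = exact(lambda x: x - 0.0)     # models fsub %x, +0.0
sub_neg_zero = exact(lambda x: x - (-0.0))  # models fsub %x, -0.0

for name, result in [("fadd x, -0.0", add_neg_zero),
                     ("0 + x", add_pos_zero),
                     ("fsub x, +0.0", sub_pos_zero),
                     ("fsub x, -0.0", sub_neg_zero)]:
    print(name, result)
```
</pre>
    If I read the results correctly, only the -0.0 input breaks "0 + x"
    and "fsub x, -0.0", which matches both the "x is not zero" condition in
    the quote and the extra -0.0 check in (8).<br>
    <br>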
    Thank you.<br>
    Oleg<br>
    <blockquote cite="mid:54217C76.3020001@deepbluecap.com" type="cite">
      <br>
      Ciao, Duncan.
      <br>
      <br>
      <blockquote type="cite">
        <br>
        Oleg
        <br>
        <blockquote type="cite">
          <br>
          <blockquote type="cite">
            <br>
            Other arithmetic FP operations (fsub, fmul, fdiv) also
            propagate NaNs. Thus, the
            <br>
            same rules seem applicable to them as well:
            <br>
---------------------------------------------------------------------
            <br>
            - fdiv:
            <br>
                    (4) "fdiv %x, undef" is now folded to undef.
            <br>
          </blockquote>
          <br>
          But should be folded to NaN, not undef.
          <br>
          <br>
          <blockquote type="cite">        The code comment states this
            is done because undef might be a sNaN. We
            <br>
            can't rely on sNaNs as they can either be masked or the
            platform might not have
            <br>
            FP exceptions at all. Nevertheless, such folding is still
            correct due to the NaN
            <br>
            propagation rules we found in the Standard - undef might be
            chosen to be a NaN
            <br>
            and its payload will be propagated.
            <br>
                    Moreover, this looks similar to (3) and can be
            folded to a NaN. /Is it
            <br>
            worth doing?/
            <br>
          </blockquote>
          <br>
          As the current folding to undef is wrong, it has to be fixed.
          <br>
          <br>
          <blockquote type="cite">
            <br>
                    (5) fdiv undef, undef    ->    undef
            <br>
          </blockquote>
          <br>
          Yup.
          <br>
          <br>
          <blockquote type="cite">---------------------------------------------------------------------
            <br>
            - fmul:
            <br>
                    (6) fmul undef, undef    ->    undef
            <br>
          </blockquote>
          <br>
          Yup.
          <br>
          <br>
          <blockquote type="cite">        (7) fmul %x, undef       
            -> NaN or undef (undef is a NaN, which is
            <br>
            propagated)
            <br>
          </blockquote>
          <br>
          Should be folded to NaN, not undef.
          <br>
          <br>
          <blockquote type="cite">---------------------------------------------------------------------
            <br>
            - fsub:
            <br>
                    (8) fsub %x, -0.0           ->    %x  (if %x is
            not -0.0; works this way
            <br>
            now)
            <br>
          </blockquote>
          <br>
          Should this be: fsub %x, +0.0 ?
          <br>
        </blockquote>
        fsub %x, +0.0 is also covered and always folded to %x.
        <br>
        The version with -0.0 is similar, except that it additionally
        checks that %x is not -0.0.
        <br>
        <blockquote type="cite">
          <br>
          <blockquote type="cite">        (9) fsub %x, undef       
            -> NaN or undef (undef is a NaN, which is
            <br>
            propagated)
            <br>
          </blockquote>
          <br>
          Should fold to NaN not undef.
          <br>
          <br>
          <blockquote type="cite">      (10) fsub undef, undef   ->
            undef
            <br>
          </blockquote>
          <br>
          Yup.
          <br>
          <br>
          Ciao, Duncan.
          <br>
          <br>
          <blockquote type="cite">---------------------------------------------------------------------
            <br>
            <br>
            I will be very thankful if you could review this final
            summary and share your
            <br>
            thoughts.
            <br>
            <br>
            Thank you.
            <br>
            <br>
            P.S. Sorry for bothering you again and again.
            <br>
            Just want to make sure I clearly understand the subject in
            order to make correct
            <br>
            code changes and to be able to help others with this in the
            future.
            <br>
            <br>
            Kind regards,
            <br>
            Oleg
            <br>
            <br>
            On 16.09.2014 21:42, Duncan Sands wrote:
            <br>
            <blockquote type="cite">On 16/09/14 19:37, Owen Anderson
              wrote:
              <br>
              <blockquote type="cite">As far as I know, LLVM does not
                try very hard to guarantee constant folded
                <br>
                NaN payloads that match exactly what the target would
                generate.
                <br>
              </blockquote>
              <br>
              I'm with Owen here.  Unless ARM people object, I think it
              is reasonable to say
              <br>
              that at the LLVM IR level we may assume that the IEEE
              rules are followed.
              <br>
              <br>
              Ciao, Duncan.
              <br>
              <br>
              <blockquote type="cite">
                <br>
                —Owen
                <br>
                <br>
                <blockquote type="cite">On Sep 16, 2014, at 10:30 AM,
                  Oleg Ranevskyy <a class="moz-txt-link-rfc2396E" href="mailto:llvm.mail.list@gmail.com">&lt;llvm.mail.list@gmail.com&gt;</a>
                  <br>
                  wrote:
                  <br>
                  <br>
                  Hi Duncan,
                  <br>
                  <br>
                  I reread everything we've discussed so far and would
                  like to pay closer
                  <br>
                  attention to the ARM FPSCR register mentioned by
                  Stephen.
                  <br>
                  It's really possible on ARM systems that floating
                  point operations on one or
                  <br>
                  more qNaN operands return a NaN different from the
                  operands. I.e. operand
                  <br>
                  NaN is not propagated. This happens when the "default
                  NaN" flag is set in
                  <br>
                  the FPSCR (floating point status and control
                  register). The result in this
                  <br>
                  case is some default NaN value.
                  <br>
                  <br>
                  This means "fadd %x, -0.0", which is currently folded
                  to %x by
                  <br>
                  InstructionSimplify, might produce a different result
                  if %x is a NaN. This
                  <br>
                  breaks the NaN propagation rules the IEEE standard
                  establishes and
                  <br>
                  significantly reduces folding capabilities for the FP
                  operations.
                  <br>
                  <br>
                  This also applies to "fadd undef, undef" and "fadd %x,
                  undef". We can't rely
                  <br>
                  on getting an arbitrary NaN here on ARMs.
                  <br>
                  <br>
                  Would you be able to confirm this please?
                  <br>
                  <br>
                  Thank you in advance for your time!
                  <br>
                  <br>
                  Kind regards,
                  <br>
                  Oleg
                  <br>
                  <br>
                  On 10.09.2014 22:50, Duncan Sands wrote:
                  <br>
                  <blockquote type="cite">Hi Oleg,
                    <br>
                    <br>
                    On 01/09/14 18:46, Oleg Ranevskyy wrote:
                    <br>
                    <blockquote type="cite">Hi Duncan,
                      <br>
                      <br>
                      I looked through the IEEE standard and here is
                      what I found:
                      <br>
                      <br>
                      *6.2 Operations with NaNs*
                      <br>
                      /"For an operation with quiet NaN inputs, other
                      than maximum and minimum
                      <br>
                      operations, if a floating-point result is to be
                      delivered the result shall
                      <br>
                      be a
                      <br>
                      quiet NaN which should be one of the input NaNs"/.
                      <br>
                      <br>
                      *6.2.3 NaN propagation*
                      <br>
                      /"An operation that propagates a NaN operand to
                      its result and has a
                      <br>
                      single NaN
                      <br>
                      as an input should produce a NaN with the payload
                      of the input NaN if
                      <br>
                      representable in the destination format"./
                      <br>
                    </blockquote>
                    <br>
                    thanks for finding this out.
                    <br>
                    <br>
                    <blockquote type="cite">
                      <br>
                      Floating point add propagates a NaN. There is no
                      conversion in the
                      <br>
                      context of
                      <br>
                      LLVM's fadd. So, if %x in "fadd %x, -0.0" is a
                      NaN, the result is also a
                      <br>
                      NaN
                      <br>
                      with the same payload.
                      <br>
                    </blockquote>
                    <br>
                    Yes, folding "fadd %x, -0.0" to "%x" is correct. 
                    This implies that "fadd
                    <br>
                    undef, undef" can be folded to "undef".
                    <br>
                    <br>
                    <blockquote type="cite">
                      <br>
                      As regards "fadd %x, undef", where %x might be a
                      NaN and undef might be
                      <br>
                      chosen
                      <br>
                      to be (probably some different) NaN, and a
                      possibility to fold this to a
                      <br>
                      constant (NaN), the standard says:
                      <br>
                      /"If two or more inputs are NaN, then the payload
                      of the resulting NaN
                      <br>
                      should be
                      <br>
                      identical to the payload of one of the input NaNs
                      if representable in the
                      <br>
                      destination format. *This standard does not
                      specify which of the input
                      <br>
                      NaNs will
                      <br>
                      provide the payload*"/.
                      <br>
                      <br>
                      Thus, this makes it possible to fold "fadd %x,
                      undef" to a NaN. Is this
                      <br>
                      right?
                      <br>
                    </blockquote>
                    <br>
                    Yes, I agree.
                    <br>
                    <br>
                    Ciao, Duncan.
                    <br>
                    <br>
                    <blockquote type="cite">
                      <br>
                      Oleg
                      <br>
                      <br>
                      On 01.09.2014 10:04, Duncan Sands wrote:
                      <br>
                      <blockquote type="cite">Hi Oleg,
                        <br>
                        <br>
                        On 01/09/14 15:42, Oleg Ranevskyy wrote:
                        <br>
                        <blockquote type="cite">Hi,
                          <br>
                          <br>
                          Thank you for your comment, Owen.
                          <br>
                          My LLVM expertise is certainly not enough to
                          make such decisions yet.
                          <br>
                          Duncan, do you have any comments on this or do
                          you know anyone else
                          <br>
                          who can
                          <br>
                          decide about preserving NaN payloads?
                          <br>
                        </blockquote>
                        <br>
                        my take is that the first thing to do is to see
                        what the IEEE standard
                        <br>
                        says
                        <br>
                        about NaNs.  Consider for example "fadd x,
                        -0.0". Does the standard
                        <br>
                        specify
                        <br>
                        the exact NaN bit pattern produced as output
                        when a particular NaN x is
                        <br>
                        input?  Or does it just say that the output is a
                        NaN? If the standard
                        <br>
                        doesn't
                        <br>
                        care exactly which NaN is output, I think it is
                        reasonable for LLVM to
                        <br>
                        assume
                        <br>
                        it is whatever NaN is most convenient for LLVM;
                        in this case that means
                        <br>
                        using
                        <br>
                        x itself as the output.
                        <br>
                        <br>
                        However this approach does implicitly mean that
                        we may end up not folding
                        <br>
                        floating point operations completely
                        deterministically: depending on the
                        <br>
                        optimization that kicks in, in one case we might
                        fold to NaN A, and in
                        <br>
                        some
                        <br>
                        different optimization we might fold the same
                        expression to NaN B.  I
                        <br>
                        think
                        <br>
                        this is pretty reasonable, but it is something
                        to be aware of.
                        <br>
                        <br>
                        Ciao, Duncan.
                        <br>
                      </blockquote>
                      <br>
                    </blockquote>
                    <br>
                  </blockquote>
                  <br>
                </blockquote>
                <br>
              </blockquote>
              <br>
            </blockquote>
            <br>
          </blockquote>
          <br>
        </blockquote>
        <br>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>