<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    Patches should generally go to llvm-commits and will need test
    cases.  I haven't looked at this in detail, but the general approach
    seems reasonable.  <br>
    <br>
    <div class="moz-cite-prefix">On 04/15/2016 10:57 AM, Carlos Liam via
      llvm-dev wrote:<br>
    </div>
    <blockquote
      cite="mid:513BE407-8E8F-4E27-A729-4FA2DD0FA5B5@aarzee.me"
      type="cite">
      Is this patch sound? <a moz-do-not-send="true"
        href="https://ghostbin.com/paste/8wt63" class="">https://ghostbin.com/paste/8wt63</a>
      <div class="">I don't think it is getting triggered.<br class="">
        <div class="">
          <br class="Apple-interchange-newline">
          <span class=""> - CL</span>
        </div>
        <br class="">
        <div>
          <blockquote type="cite" class="">
            <div class="">On Apr 15, 2016, at 8:53 AM, Carlos Liam <<a
                moz-do-not-send="true" href="mailto:carlos@aarzee.me"
                class="">carlos@aarzee.me</a>> wrote:</div>
            <br class="Apple-interchange-newline">
            <div class="">
              <div style="word-wrap: break-word; -webkit-nbsp-mode:
                space; -webkit-line-break: after-white-space;" class="">
                <div class="">My understanding is that this checks
                  whether the bit width of the integer *type* fits in
                  the bit width of the mantissa, not the bit width of
                  the integer value.</div>
                <div class=""><br class="">
                </div>
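                <div class="">To make the type-versus-value distinction concrete, here is a minimal sketch of a check on the value itself. The helper name int_value_fits_float is hypothetical (it is not anything in LLVM), and the sketch assumes a binary float format, i.e. FLT_RADIX == 2.</div>

```c
#include <float.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when this particular value converts to float
   exactly. Assumes a binary float format (FLT_RADIX == 2). */
static bool int_value_fits_float(int32_t x) {
    /* Take the magnitude in unsigned arithmetic so INT32_MIN is safe. */
    uint32_t mag = x < 0 ? 0u - (uint32_t)x : (uint32_t)x;
    if (mag == 0)
        return true;
    int hi = 31;                       /* index of the highest set bit */
    while (((mag >> hi) & 1u) == 0)
        hi--;
    int lo = 0;                        /* index of the lowest set bit */
    while (((mag >> lo) & 1u) == 0)
        lo++;
    /* Exact when the set bits span no more than the significand width. */
    return hi - lo + 1 <= FLT_MANT_DIG;
}
```

                <div class="">A 32-bit type fails a width-only test against float, yet individual values such as 16777218 or INT32_MIN still pass this per-value test, which is the gap being pointed out here.</div>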
                <div class=""><span class=""> - CL</span>
                </div>
                <br class="">
                <div style="" class="">
                  <blockquote type="cite" class="">
                    <div class="">On Apr 14, 2016, at 6:02 PM, <a
                        moz-do-not-send="true"
                        href="mailto:escha@apple.com" class="">escha@apple.com</a>
                      wrote:</div>
                    <br class="Apple-interchange-newline">
                    <div class="">
                      <div style="word-wrap: break-word;
                        -webkit-nbsp-mode: space; -webkit-line-break:
                        after-white-space;" class="">We already do this
                        to some extent; see this code in
                        InstCombineCasts:
                        <div class=""><br class="">
                        </div>
                        <div class="">
                          <div style="margin: 0px; font-size: 11px;
                            line-height: normal; font-family: Menlo;
                            color: rgb(0, 132, 0);" class="">//
                            fpto{s/u}i({u/s}itofp(X)) --> X or
                            zext(X) or sext(X) or trunc(X)</div>
                          <div style="margin: 0px; font-size: 11px;
                            line-height: normal; font-family: Menlo;
                            color: rgb(0, 132, 0);" class="">// This is
                            safe if the intermediate type has enough
                            bits in its mantissa to</div>
                          <div style="margin: 0px; font-size: 11px;
                            line-height: normal; font-family: Menlo;
                            color: rgb(0, 132, 0);" class="">//
                            accurately represent all values of X.  For
                            example, this won't work with</div>
                          <div style="margin: 0px; font-size: 11px;
                            line-height: normal; font-family: Menlo;
                            color: rgb(0, 132, 0);" class="">// i64
                            -> float -> i64.</div>
                          <div style="margin: 0px; font-size: 11px;
                            line-height: normal; font-family: Menlo;
                            color: rgb(79, 129, 135);" class="">Instruction<span
                              style="font-variant-ligatures:
                              no-common-ligatures;" class=""> *</span>InstCombiner<span
                              style="font-variant-ligatures:
                              no-common-ligatures;" class="">::FoldItoFPtoI(</span>Instruction<span
                              style="font-variant-ligatures:
                              no-common-ligatures;" class=""> &FI) {</span></div>
                        </div>
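                        <div class="">As a rough illustration of what the quoted comment describes, a width-only test might look like the sketch below. This is an assumption-laden reading of the comment, not LLVM's implementation: the helper name int_type_fits_float is hypothetical, the float format is assumed to be binary (FLT_RADIX == 2), and exponent range is ignored.</div>

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Hypothetical helper, not LLVM's code: true when every value of an integer
   type with `int_bits` bits survives int -> float -> int unchanged.
   Assumes a binary float format (FLT_RADIX == 2). */
static bool int_type_fits_float(unsigned int_bits, bool is_signed) {
    assert(int_bits > 0);
    /* A signed type spends one bit on the sign, so its largest magnitude
       needs only int_bits - 1 significant bits (the minimum value is a
       power of two, which needs just one). */
    unsigned value_bits = is_signed ? int_bits - 1 : int_bits;
    /* FLT_MANT_DIG counts the implicit leading 1: 24 for IEEE single. */
    return value_bits <= (unsigned)FLT_MANT_DIG;
}
```

                        <div class="">With IEEE single precision this accepts a 16-bit integer type and rejects 32-bit and 64-bit ones, matching the i64 -> float -> i64 example in the comment.</div>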
                        <div class=""><span
                            style="font-variant-ligatures:
                            no-common-ligatures;" class=""><br class="">
                          </span></div>
                        <div class="">—escha</div>
                        <div class=""><br class="">
                          <div class="">
                            <div class="">
                              <blockquote type="cite" class="">
                                <div class="">On Apr 14, 2016, at 2:29
                                  PM, Carlos Liam via llvm-dev <<a
                                    moz-do-not-send="true"
                                    href="mailto:llvm-dev@lists.llvm.org"
                                    class="">llvm-dev@lists.llvm.org</a>>
                                  wrote:</div>
                                <br class="Apple-interchange-newline">
                                <div class="">
                                  <div class="">I'm talking about the IR
                                    level, not the C level. IR makes
                                    certain assumptions about the
                                    representation of floating point
                                    numbers. This has nothing to do with
                                    C; I only used it as an example.<br
                                      class="">
                                    <br class="">
                                    - CL<br class="">
                                    <br class="">
                                    <blockquote type="cite" class="">On
                                      Apr 14, 2016, at 4:49 PM, Martin
                                      J. O'Riordan <<a
                                        moz-do-not-send="true"
                                        href="mailto:martin.oriordan@movidius.com"
                                        class="">martin.oriordan@movidius.com</a>>
                                      wrote:<br class="">
                                      <br class="">
                                      I don't think that this is
                                      correct.<br class="">
                                      <br class="">
                                       | Let's say we have an int x, and
                                      we cast it to a float and back.
                                      Floats have 8 exponent bits and 23
                                      mantissa bits.<br class="">
                                      <br class="">
                                      'float', 'double' and 'long
                                      double' do not have specific
                                      representations, and a given
                                      implementation might choose
                                      different FP implementations for
                                      each.<br class="">
                                      <br class="">
                                      ISO C and C++ only guarantee that
                                      'long double' can accurately
                                      represent all values that may be
                                      represented by 'double', and that
                                      'double' can represent accurately
                                      all values that may be represented
                                      by 'float'; they do not state
                                      that 'float' has 8 bits of
                                      exponent and 23 bits of mantissa.<br
                                        class="">
                                      <br class="">
                                      And this is a particular problem I
                                      often face when porting
                                      floating-point code between
                                      platforms, each of which can
                                      genuinely claim to be ISO C
                                      compliant.<br class="">
                                      <br class="">
                                      It is "common" for 'float' to be
                                      IEEE 754 32-bit Single Precision
                                      compliant.<br class="">
                                      It is also "common" for 'double'
                                      to be IEEE 754 64-bit Double
                                      Precision compliant.<br class="">
                                      <br class="">
                                      But "common" does not mean
                                      "standard".  The 'clang'
                                      optimisations have to adhere to
                                      the ISO C/C++ Standards, and not
                                      what might be perceived as "the
                                      norm".  Floating-Point has for a
                                      very long time been a problem.<br
                                        class="">
                                      <br class="">
                                      o  How does the machine resolve FP
                                      arithmetic?<br class="">
                                      o  How does the compiler perform
                                      FP arithmetic - is it the same as
                                      the target machine or different?<br
                                        class="">
                                      o  How does the pre-processor
                                      evaluate FP arithmetic - is it the
                                      same as the target machine or
                                      different?<br class="">
                                      <br class="">
                                      These have been issues since the
                                      very first ISO C standard (ANSI
                                      C'89/ISO C'90) and before.  Very
                                      simple things like:<br class="">
                                      <br class="">
                                       #define MY_FP_VAL (3.14159 / 2.0)<br
                                        class="">
                                      <br class="">
                                      Where is that divide performed?
                                       In the compiler, subject to host
                                      FP rules?  In the compiler, subject
                                      to target rules?  Executed
                                      dynamically by the host?  The same
                                      problem occurs when performing
                                      constant folding in the compiler:
                                      should it follow a model that is
                                      different to what the target would
                                      do or not?  It is worse still when
                                      the pre-processor, compiler, and
                                      target are each different
                                      machines.<br class="">
                                      <br class="">
                                      These are huge problems in the FP
                                      world where exact equivalence and
                                      ordering of evaluation really
                                      matters (think partial ordering -
                                      not the happy unsaturated INT
                                      modulo 2^N world).<br class="">
                                      <br class="">
                                      On our architecture, we have
                                      chosen the 32-bit IEEE model
                                      provided by 'clang' for 'float'
                                      and 'double', but we have chosen
                                      the 64-bit IEEE model for 'long
                                      double'; other implementations are
                                      free to choose a different model.
                                       We also use IEEE 16-bit FP for
                                      'half' aka '__fp16'.  But IEEE
                                      also provides for 128-bit FP,
                                      256-bit FP, and there are FP
                                      implementations that use 80 bits.
                                       In fact, 'clang' does not
                                      preclude an implementation
                                      choosing IEEE 754 16-bit
                                      Half-Precision as its
                                      representation for 'float'.  This
                                      means 5 bits of exponent and
                                      10 bits of mantissa - and that is
                                      still ISO C compliant.<br class="">
                                      <br class="">
                                      Any target is free to choose the
                                      FP representation it prefers for
                                      'float', and that does not mean
                                      that it is bound to IEEE 754
                                      32-bit Single Precision
                                      Floating-Point.  Any FP
                                      optimisations within the compiler
                                      need to keep that target clearly
                                      in mind; I know, I've been burned
                                      by this before.<br class="">
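                                      <br class="">
                                      One way for C code to avoid hard-coding the 8/23 split is to ask the implementation through float.h. The sketch below is only illustrative: the helper name float_exact_int_limit is hypothetical, and it assumes the exponent range is wide enough that the significand is the limiting factor.<br class="">

```c
#include <float.h>
#include <limits.h>

/* Hypothetical helper: the largest N such that every integer in [-N, N] is
   exactly representable as a float on the compilation target, computed as
   FLT_RADIX ** FLT_MANT_DIG. Returns 0 if N does not fit in
   unsigned long long. */
static unsigned long long float_exact_int_limit(void) {
    unsigned long long limit = 1;
    for (int digit = 0; digit < FLT_MANT_DIG; digit++) {
        if (limit > ULLONG_MAX / FLT_RADIX)
            return 0;                  /* the product would overflow */
        limit *= FLT_RADIX;
    }
    return limit;
}
```

                                      Where 'float' is IEEE 754 single precision (FLT_RADIX 2, FLT_MANT_DIG 24) this yields 16777216; a target that picked half precision for 'float' would report a much smaller limit.<br class="">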
                                      <br class="">
                                       MartinO<br class="">
                                      <br class="">
                                      <br class="">
                                      -----Original Message-----<br
                                        class="">
                                      From: llvm-dev [<a
                                        moz-do-not-send="true"
                                        href="mailto:llvm-dev-bounces@lists.llvm.org"
                                        class="">mailto:llvm-dev-bounces@lists.llvm.org</a>]
                                      On Behalf Of Carlos Liam via
                                      llvm-dev<br class="">
                                      Sent: 14 April 2016 19:14<br
                                        class="">
                                      To: <a moz-do-not-send="true"
                                        href="mailto:llvm-dev@lists.llvm.org"
                                        class="">llvm-dev@lists.llvm.org</a><br
                                        class="">
                                      Subject: [llvm-dev] Integer ->
                                      Floating point -> Integer cast
                                      optimizations<br class="">
                                      <br class="">
                                      I brought this up in IRC and was
                                      told to consult someone who knows
                                      more about floating point numbers;
                                      I propose an optimization as
                                      follows.<br class="">
                                      <br class="">
                                      Let's say we have an int x, and we
                                      cast it to a float and back.
                                      Floats have 8 exponent bits and 23
                                      mantissa bits.<br class="">
                                      <br class="">
                                      If x matches the condition
                                      `countTrailingZeros(abs(x)) >=
                                      (floor(log2(abs(x))) - 23)`, then
                                      x is exactly representable as a
                                      float and we can remove the float
                                      casts.<br class="">
                                      <br class="">
                                      So, if we can establish that
                                      abs(x) is <= 2**24, we can
                                      remove the casts. LLVM does not
                                      currently perform that
                                      optimization on this C code:<br
                                        class="">
                                      <br class="">
                                      int floatcast(int x) {<br class="">
                                       if (abs(x) <= 16777216) { //
                                      abs(x) is definitely <= 2**24
                                      and fits into float's 24-bit
                                      significand exactly<br
                                        class="">
                                           float flt = (float)x;<br
                                        class="">
                                           return (int)flt;<br class="">
                                       }<br class="">
                                       return x;<br class="">
                                      }<br class="">
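                                      <br class="">
                                      The bound in this example can be checked by brute force. The sketch below is a test harness, not a proposed optimization; the helper names are hypothetical, and the result quoted afterwards assumes 'float' is IEEE 754 single precision with the default round-to-nearest mode.<br class="">

```c
#include <stdbool.h>

/* True when x survives int -> float -> int unchanged on this target. */
static bool roundtrips_through_float(int x) {
    float flt = (float)x;
    return (int)flt == x;
}

/* Hypothetical helper: smallest positive int up to `limit` that does not
   survive the round trip, or 0 if every value in [1, limit] does.
   Keep `limit` well below INT_MAX (for example 1 << 26) so that neither
   the loop counter nor the float -> int conversion can overflow. */
static int first_lossy_int(int limit) {
    for (int x = 1; x <= limit; x++) {
        if (!roundtrips_through_float(x))
            return x;
    }
    return 0;
}
```

                                      Under those assumptions first_lossy_int(1 << 24) returns 0 and first_lossy_int(1 << 25) returns 16777217, so 16777216 is the largest bound for which the casts in the example are a no-op for every input.<br class="">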
                                      <br class="">
                                      Things get more interesting when
                                      you bring in higher integers and
                                      trailing zeros. Floating point
                                      can't exactly represent integers
                                      that don't fit neatly into the
                                      mantissa; they have to round to a
                                      multiple of some power of 2. For
                                      example, integers between 2**24
                                      and 2**25 round to a multiple of
                                      2**1 - meaning that the result has
                                      *at least* 1 trailing zero.
                                      Integers between 2**25 and 2**26
                                      round to a multiple of 2**2 - with
                                      the result having at least 2
                                      trailing zeros. Et cetera. If we
                                      can prove that the input to these
                                      casts falls within one of those
                                      ranges *and* has at least the
                                      corresponding number of trailing
                                      zeros, we can eliminate the casts. LLVM
                                      does not currently perform this
                                      optimization on this C code:<br
                                        class="">
                                      <br class="">
                                      int floatcast(int x) {<br class="">
                                       if (16777217 <= abs(x)
                                      && abs(x) <= 33554432)
                                      { // abs(x) is definitely between
                                      2**24 and 2**25<br class="">
                                           float flt = (float)(x /
                                      abs(x) * (abs(x) & ~1)); // what's
                                      being cast to float definitely has
                                      at least one trailing zero in its
                                      absolute value; ~1 clears bit 0 and
                                      keeps the arithmetic signed<br class="">
                                           return (int)flt;<br class="">
                                       }<br class="">
                                       return x;<br class="">
                                      }<br class="">
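                                      <br class="">
                                      The rounding granularity can be observed directly. This is a small sketch with hypothetical helper names; it assumes 'float' is IEEE 754 single precision (24-bit significand) and the default round-to-nearest, ties-to-even mode.<br class="">

```c
#include <stdint.h>

/* Hypothetical helper: gap between a float of magnitude `mag` and the next
   larger float, for integer magnitudes. Assumes a 24-bit significand. */
static uint32_t float_spacing_at(uint32_t mag) {
    int hi = 31;                       /* index of the highest set bit */
    while (hi > 0 && ((mag >> hi) & 1u) == 0)
        hi--;
    /* Below 2**24 every integer is representable, so the gap is 1. */
    return hi < 24 ? 1u : (uint32_t)1 << (hi - 23);
}

/* x after an int -> float -> int round trip. Only call this with values
   whose float image still fits in int32_t. */
static int32_t through_float(int32_t x) {
    return (int32_t)(float)x;
}
```

                                      For instance, through_float(16777217) gives 16777216 (a multiple of 2) and through_float(33554433) gives 33554432 (a multiple of 4), while 16777218 already has the required trailing zero and comes back unchanged.<br class="">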
                                      <br class="">
                                      <br class="">
                                      - CL<br class="">
                                      <br class="">
_______________________________________________<br class="">
                                      LLVM Developers mailing list<br
                                        class="">
                                      <a moz-do-not-send="true"
                                        href="mailto:llvm-dev@lists.llvm.org"
                                        class="">llvm-dev@lists.llvm.org</a><br
                                        class="">
                                      <a moz-do-not-send="true"
                                        href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
                                        class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br
                                        class="">
                                      <br class="">
                                    </blockquote>
                                    <br class="">
                                  </div>
                                </div>
                              </blockquote>
                            </div>
                            <br class="">
                          </div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                </div>
                <br class="">
              </div>
            </div>
          </blockquote>
        </div>
        <br class="">
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>