<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    Patches should generally go to llvm-commits and will need test
    cases.  I haven't looked at this in detail, but the general approach
    seems reasonable.  <br>

    <br>

    <div class="moz-cite-prefix">On 04/15/2016 10:57 AM, Carlos Liam via

      llvm-dev wrote:<br>

    </div>

    <blockquote

      cite="mid:513BE407-8E8F-4E27-A729-4FA2DD0FA5B5@aarzee.me"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

      Is this patch sound? <a moz-do-not-send="true"

        href="https://ghostbin.com/paste/8wt63" class="">https://ghostbin.com/paste/8wt63</a>

      <div class="">I don't think it is getting triggered.<br class="">

        <div class="">

          <br class="Apple-interchange-newline">

          <span style="color: rgb(0, 0, 0); font-family: Helvetica;

            font-size: 12px; font-style: normal; font-variant: normal;

            font-weight: normal; letter-spacing: normal; line-height:

            normal; orphans: auto; text-align: start; text-indent: 0px;

            text-transform: none; white-space: normal; widows: auto;

            word-spacing: 0px; -webkit-text-stroke-width: 0px; display:

            inline !important; float: none;" class=""> - CL</span>

        </div>

        <br class="">

        <div>

          <blockquote type="cite" class="">

            <div class="">On Apr 15, 2016, at 8:53 AM, Carlos Liam <<a
                moz-do-not-send="true" href="mailto:carlos@aarzee.me"
                class="">carlos@aarzee.me</a>> wrote:</div>

            <br class="Apple-interchange-newline">

            <div class="">

              <meta http-equiv="Content-Type" content="text/html;

                charset=utf-8" class="">

              <div style="word-wrap: break-word; -webkit-nbsp-mode:

                space; -webkit-line-break: after-white-space;" class="">

                <div class="">My understanding is that this checks

                  whether the bit width of the integer *type* fits in

                  the bit width of the mantissa, not the bit width of

                  the integer value.</div>
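To make the distinction concrete, here is a minimal C sketch of a type-level check next to a value-level one. It is not LLVM's code: the helper names are hypothetical, and the precision figures (24 bits for IEEE 754 binary32, 53 for binary64) are passed in as assumptions.

```c
#include <stdint.h>

/* Type-level check: can every value of an integer type with `int_bits` bits
   (one of them a sign bit if `is_signed`) be held exactly in a binary float
   format with `precision_bits` bits of precision?
   Hypothetical helper for illustration, not an LLVM API. */
static int type_fits(int int_bits, int is_signed, int precision_bits) {
    return int_bits - (is_signed ? 1 : 0) <= precision_bits;
}

/* Value-level check: is this particular magnitude small enough, whatever the
   width of the type that carries it? This is a sufficient condition only;
   it ignores larger values that happen to have enough trailing zeros. */
static int value_fits(uint64_t magnitude, int precision_bits) {
    return precision_bits >= 64 || magnitude <= (UINT64_C(1) << precision_bits);
}
```

Under these assumptions an i32 fails the type-level check for float, yet any i32 value whose magnitude is known to be at most 2**24 passes the value-level one.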

                <div class=""><br class="">

                </div>

                <div class=""><span style="font-family: Helvetica;

                    font-size: 12px; font-style: normal;

                    font-variant-ligatures: normal;

                    font-variant-position: normal; font-variant-caps:

                    normal; font-variant-numeric: normal;

                    font-variant-alternates: normal;

                    font-variant-east-asian: normal; font-weight:

                    normal; letter-spacing: normal; line-height: normal;

                    orphans: auto; text-align: start; text-indent: 0px;

                    text-transform: none; white-space: normal; widows:

                    auto; word-spacing: 0px; -webkit-text-stroke-width:

                    0px; float: none; display: inline !important;"

                    class=""> - CL</span>

                </div>

                <br class="">

                <div style="" class="">

                  <blockquote type="cite" class="">

                    <div class="">On Apr 14, 2016, at 6:02 PM, <a
                        moz-do-not-send="true"
                        href="mailto:escha@apple.com" class="">escha@apple.com</a>
                      wrote:</div>

                    <br class="Apple-interchange-newline">

                    <div class="">

                      <meta http-equiv="Content-Type"

                        content="text/html; charset=utf-8" class="">

                      <div style="word-wrap: break-word;

                        -webkit-nbsp-mode: space; -webkit-line-break:

                        after-white-space;" class="">We already do this

                        to some extent; see this code in

                        InstCombineCasts:

                        <div class=""><br class="">

                        </div>

                        <div class="">

                          <div style="margin: 0px; font-size: 11px;

                            line-height: normal; font-family: Menlo;

                            color: rgb(0, 132, 0);" class="">//

                            fpto{s/u}i({u/s}itofp(X)) --> X or

                            zext(X) or sext(X) or trunc(X)</div>

                          <div style="margin: 0px; font-size: 11px;

                            line-height: normal; font-family: Menlo;

                            color: rgb(0, 132, 0);" class="">// This is

                            safe if the intermediate type has enough

                            bits in its mantissa to</div>

                          <div style="margin: 0px; font-size: 11px;

                            line-height: normal; font-family: Menlo;

                            color: rgb(0, 132, 0);" class="">//

                            accurately represent all values of X.  For

                            example, this won't work with</div>

                          <div style="margin: 0px; font-size: 11px;

                            line-height: normal; font-family: Menlo;

                            color: rgb(0, 132, 0);" class="">// i64

                            -> float -> i64.</div>

                          <div style="margin: 0px; font-size: 11px;

                            line-height: normal; font-family: Menlo;

                            color: rgb(79, 129, 135);" class="">Instruction<span

                              style="font-variant-ligatures:

                              no-common-ligatures;" class=""> *</span>InstCombiner<span

                              style="font-variant-ligatures:

                              no-common-ligatures;" class="">::FoldItoFPtoI(</span>Instruction<span

                              style="font-variant-ligatures:

                              no-common-ligatures;" class=""> &FI) {</span></div>

                        </div>

                        <div class=""><span

                            style="font-variant-ligatures:

                            no-common-ligatures;" class=""><br class="">

                          </span></div>

                        <div class="">—escha</div>

                        <div class=""><br class="">

                          <div class="">

                            <div class="">

                              <blockquote type="cite" class="">

                                <div class="">On Apr 14, 2016, at 2:29
                                  PM, Carlos Liam via llvm-dev <<a
                                    moz-do-not-send="true"
                                    href="mailto:llvm-dev@lists.llvm.org"
                                    class="">llvm-dev@lists.llvm.org</a>>
                                  wrote:</div>

                                <br class="Apple-interchange-newline">

                                <div class="">

                                  <div class="">I'm talking about the IR
                                    level, not the C level. IR makes
                                    certain assumptions about the
                                    representation of floating-point
                                    numbers. This has nothing to do with C; I
                                    only used it as an example.<br
                                      class="">

                                    <br class="">

                                    - CL<br class="">

                                    <br class="">

                                    <blockquote type="cite" class="">On
                                      Apr 14, 2016, at 4:49 PM, Martin
                                      J. O'Riordan <<a
                                        moz-do-not-send="true"
                                        href="mailto:martin.oriordan@movidius.com"
                                        class="">martin.oriordan@movidius.com</a>>
                                      wrote:<br class="">

                                      <br class="">

                                      I don't think that this is

                                      correct.<br class="">

                                      <br class="">

                                       | Let's say we have an int x, and

                                      we cast it to a float and back.

                                      Floats have 8 exponent bits and 23

                                      mantissa bits.<br class="">

                                      <br class="">

                                      'float', 'double' and 'long

                                      double' do not have specific

                                      representations, and a given

                                      implementation might choose

                                      different FP implementations for

                                      each.<br class="">

                                      <br class="">

                                      ISO C and C++ only guarantee that
                                      'long double' can accurately
                                      represent all values that may be
                                      represented by 'double', and that
                                      'double' can accurately represent
                                      all values that may be represented
                                      by 'float'; they do not state
                                      that 'float' has 8 bits of
                                      exponent and 23 bits of mantissa.<br
                                        class="">

                                      <br class="">

                                      And this is a particular problem I

                                      often face when porting

                                      floating-point code between

                                      platforms, each of which can

                                      genuinely claim to be ISO C

                                      compliant.<br class="">

                                      <br class="">

                                      It is "common" for 'float' to be

                                      IEEE 754 32-bit Single Precision

                                      compliant.<br class="">

                                      It is also "common" for 'double'

                                      to be IEEE 754 64-bit Double

                                      Precision compliant.<br class="">

                                      <br class="">

                                      But "common" does not mean

                                      "standard".  The 'clang'

                                      optimisations have to adhere to

                                      the ISO C/C++ Standards, and not

                                      what might be perceived as "the

                                      norm".  Floating-Point has for a

                                      very long time been a problem.<br

                                        class="">

                                      <br class="">

                                      o  How does the machine resolve FP

                                      arithmetic?<br class="">

                                      o  How does the compiler perform

                                      FP arithmetic - is it the same as

                                      the target machine or different?<br

                                        class="">

                                      o  How does the pre-processor

                                      evaluate FP arithmetic - is it the

                                      same as the target machine or

                                      different?<br class="">

                                      <br class="">

                                      These have been issues since the

                                      very first ISO C standard (ANSI

                                      C'89/ISO C'90) and before.  Very

                                      simple things like:<br class="">

                                      <br class="">

                                       #define MY_FP_VAL (3.14159 / 2.0)<br

                                        class="">

                                      <br class="">

                                      Where is that divide performed?
                                      In the compiler, subject to host
                                      FP rules?  In the compiler, subject
                                      to target rules?  Executed
                                      dynamically by the host?  The same
                                      problem occurs when performing
                                      constant folding in the compiler:
                                      should it follow a model that is
                                      different from what the target would
                                      do, or not?  It is worse still when the
                                      pre-processor, compiler, and
                                      target are each different
                                      machines.<br class="">

                                      <br class="">

                                      These are huge problems in the FP

                                      world where exact equivalence and

                                      ordering of evaluation really

                                      matters (think partial ordering -

                                      not the happy unsaturated INT

                                      modulo 2^N world).<br class="">

                                      <br class="">

                                      On our architecture, we have

                                      chosen the 32-bit IEEE model

                                      provided by 'clang' for 'float'

                                      and 'double', but we have chosen

                                      the 64-bit IEEE model for 'long

                                      double'; other implementations are

                                      free to choose a different model.

                                       We also use IEEE 16-bit FP for

                                      'half' aka '__fp16'.  But IEEE

                                      also provides for 128-bit FP,
                                      256-bit FP, and there are FP
                                      implementations that use 80 bits.
                                      In fact, 'clang' does not
                                      preclude an implementation
                                      choosing IEEE 754 16-bit
                                      Half-Precision as its
                                      representation for 'float'.  This
                                      means 5 bits of exponent and
                                      10 bits of mantissa - and that is
                                      still ISO C compliant.<br class="">

                                      <br class="">

                                      Any target is free to choose the

                                      FP representation it prefers for

                                      'float', and that does not mean

                                      that it is bound to IEEE 754

                                      32-bit Single Precision

                                      Floating-Point.  Any FP

                                      optimisations within the compiler

                                      need to keep that target clearly

                                      in mind; I know, I've been burned

                                      by this before.<br class="">
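One way to avoid hard-coding a mantissa width in C source is to ask the implementation through the float.h and limits.h macros. The following is only a sketch: the helper name is made up, it assumes int has no padding bits, and it simply gives up when the radix is not 2.

```c
#include <float.h>
#include <limits.h>

/* Returns 1 if every value of type int is exactly representable as float on
   this implementation, 0 otherwise (including when FLT_RADIX is not 2, a
   case this sketch does not try to handle).
   Assumption: int has no padding bits, so its value bits are its size in
   bits minus the sign bit. */
static int int_always_fits_in_float(void) {
    int value_bits = (int)(sizeof(int) * CHAR_BIT) - 1; /* exclude the sign bit */
    return FLT_RADIX == 2 && value_bits <= FLT_MANT_DIG;
}
```

On a target where 'float' is IEEE 754 single precision and int is 32 bits wide, FLT_MANT_DIG is 24 and the helper returns 0; a target with a different 'float' would answer according to its own macros.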

                                      <br class="">

                                       MartinO<br class="">

                                      <br class="">

                                      <br class="">

                                      -----Original Message-----<br

                                        class="">

                                      From: llvm-dev [<a
                                        moz-do-not-send="true"
                                        href="mailto:llvm-dev-bounces@lists.llvm.org"
                                        class="">mailto:llvm-dev-bounces@lists.llvm.org</a>]

                                      On Behalf Of Carlos Liam via

                                      llvm-dev<br class="">

                                      Sent: 14 April 2016 19:14<br

                                        class="">

                                      To: <a moz-do-not-send="true"

                                        href="mailto:llvm-dev@lists.llvm.org"

                                        class="">llvm-dev@lists.llvm.org</a><br

                                        class="">

                                      Subject: [llvm-dev] Integer ->

                                      Floating point -> Integer cast

                                      optimizations<br class="">

                                      <br class="">

                                      I brought this up in IRC and was

                                      told to consult someone who knows

                                      more about floating point numbers;

                                      I propose an optimization as

                                      follows.<br class="">

                                      <br class="">

                                      Let's say we have an int x, and we

                                      cast it to a float and back.

                                      Floats have 8 exponent bits and 23

                                      mantissa bits.<br class="">

                                      <br class="">

                                      If x matches the condition

                                      `countTrailingZeros(abs(x)) >

                                      (log2(abs(x)) - 23)`, then we can

                                      remove the float casts.<br

                                        class="">

                                      <br class="">
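The condition above can be sketched in C as a check on the span between the highest and lowest set bits of abs(x). This is an illustration only: the function name is hypothetical, the float format is assumed to be binary with the given number of significant bits (24 for IEEE 754 single precision, counting the implicit leading bit), and the comparison is the slightly looser `countTrailingZeros >= floor(log2) - 23` rather than the strict `>` written above.

```c
#include <stdint.h>

/* Returns 1 if x is exactly representable in a binary floating-point format
   with `precision_bits` significant bits, 0 otherwise.
   Hypothetical helper; 24 corresponds to IEEE 754 binary32. */
static int exactly_representable(int32_t x, int precision_bits) {
    /* abs(x) computed in unsigned arithmetic so INT32_MIN does not overflow */
    uint32_t mag = x < 0 ? 0u - (uint32_t)x : (uint32_t)x;
    if (mag == 0)
        return 1;

    int high = 31; /* index of the highest set bit, i.e. floor(log2(mag)) */
    while (((mag >> high) & 1u) == 0)
        --high;

    int low = 0; /* index of the lowest set bit, i.e. countTrailingZeros(mag) */
    while (((mag >> low) & 1u) == 0)
        ++low;

    /* All set bits must fit inside one window of precision_bits bits. */
    return high - low + 1 <= precision_bits;
}
```

With 24 bits of precision, 16777216 and 16777218 pass the check, while 16777217 does not.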

                                      So, if we can establish that
                                      abs(x) is <= 2**24, we can
                                      remove the casts: the 23 stored
                                      mantissa bits plus the implicit
                                      leading bit give 24 bits of
                                      precision. LLVM does not
                                      currently perform that
                                      optimization on this C code:<br
                                        class="">
                                      <br class="">
                                      int floatcast(int x) {<br class="">
                                      &nbsp;if (abs(x) <= 16777216) { //
                                      abs(x) is definitely <= 2**24
                                      and fits into our mantissa cleanly<br
                                        class="">

                                           float flt = (float)x;<br

                                        class="">

                                           return (int)flt;<br class="">

                                       }<br class="">

                                       return x;<br class="">

                                      }<br class="">

                                      <br class="">

                                      Things get more interesting when
                                      you bring in larger integers and
                                      trailing zeros. Floating point
                                      can't exactly represent integers
                                      that don't fit neatly into the
                                      mantissa; they have to round to a
                                      multiple of some power of 2. For
                                      example, integers between 2**24
                                      and 2**25 round to a multiple of
                                      2**1 - meaning that the result has
                                      *at least* 1 trailing zero.
                                      Integers between 2**25 and 2**26
                                      round to a multiple of 2**2 - with
                                      the result having at least 2
                                      trailing zeros. Et cetera. If we
                                      can prove that the input to these
                                      casts fits in one of those
                                      ranges *and* has at least the
                                      corresponding number of trailing
                                      zeros, we can eliminate the casts.
                                      LLVM does not currently perform this
                                      optimization on this C code:<br
                                        class="">
                                      <br class="">
                                      int floatcast(int x) {<br class="">
                                      &nbsp;if (16777217 <= abs(x)
                                      && abs(x) <= 33554432)
                                      { // abs(x) is definitely between
                                      2**24 and 2**25<br class="">
                                      &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;float flt = (float)(x /
                                      abs(x) * (abs(x) & (UINT32_MAX
                                      ^ 1))); // what's being cast to
                                      float definitely has at least one
                                      trailing zero in its absolute
                                      value<br class="">

                                           return (int)flt;<br class="">

                                       }<br class="">

                                       return x;<br class="">

                                      }<br class="">

                                      <br class="">
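The rounding behaviour described above can be checked by brute force. The sketch below assumes 'float' is IEEE 754 single precision and that the default rounding mode is in effect; the function names are made up for illustration.

```c
#include <stdint.h>

/* Round-trips a 32-bit integer through float.
   The caller must keep abs(x) well below 2**31, since converting an
   out-of-range float back to int32_t is undefined behaviour. */
static int32_t roundtrip(int32_t x) {
    float f = (float)x;
    return (int32_t)f;
}

/* Clears the low `zero_bits` bits of every value in [lo, hi] and counts how
   many of the resulting integers change when round-tripped through float. */
static long count_mismatches(int32_t lo, int32_t hi, int zero_bits) {
    const int64_t mask = ~((INT64_C(1) << zero_bits) - 1);
    long bad = 0;
    for (int64_t v = lo; v <= hi; ++v) {
        int32_t x = (int32_t)(v & mask);
        if (roundtrip(x) != x)
            ++bad;
    }
    return bad;
}
```

Calling count_mismatches(0, 16777216, 0) or count_mismatches(16777217, 33554432, 1) should report no mismatches under these assumptions, whereas count_mismatches(16777217, 33554432, 0) counts every odd value in that range.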

                                      <br class="">

                                      - CL<br class="">

                                      <br class="">

_______________________________________________<br class="">

                                      LLVM Developers mailing list<br

                                        class="">

                                      <a moz-do-not-send="true"

                                        href="mailto:llvm-dev@lists.llvm.org"

                                        class="">llvm-dev@lists.llvm.org</a><br

                                        class="">

                                      <a moz-do-not-send="true"

                                        href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"

                                        class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br

                                        class="">

                                      <br class="">

                                    </blockquote>

                                    <br class="">


                                  </div>

                                </div>

                              </blockquote>

                            </div>

                            <br class="">

                          </div>

                        </div>

                      </div>

                    </div>

                  </blockquote>

                </div>

                <br class="">

              </div>

            </div>

          </blockquote>

        </div>

        <br class="">

      </div>

      <br>


    </blockquote>

    <br>

  </body>

</html>