<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">


    <div class="moz-cite-prefix">On 10/02/2017 11:10 AM, Bruce Hoult via

      llvm-dev wrote:<br>

    </div>

    <blockquote

cite="mid:CAMU+Ekwm2tUVRPHZbBjBg6OCODTHfc46VDiQwfAU6=CLabUa6Q@mail.gmail.com"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

      <div dir="ltr">Is there anything that means, in particular, "go

        fast, even if it means not all bits are significant"?

        <div><br>

        </div>

        <div>I'm currently working on an llvm-based compiler for a GPU

          that is optimised for OpenGL, where 16 bit FP may not be quite

          accurate enough (or may be in some cases), but 32 bit FP is

          overkill. A lot of the fast, built in, operations end up with

          a few junk bits at the end (not add/sub/mul, but divide is

          available *only* using reciprocal).</div>

        <div><br>

        </div>

        <div>When implementing OpenCL, the specs and conformance tests

          require full IEEE accuracy. In some cases this requires a

          round of Newton-Raphson to clean up the accuracy, which is a

          significant though maybe not crippling performance penalty.

          But in other cases we need to do a lot of range reduction,

          some polynomial, and then generalise the result again. This

          can be an order of magnitude or more slower than using the

          not-quite-accurate-enough built in instruction.</div>

      </div>

    </blockquote>

    <br>

    This is what arcp is for (implying that you can use the reciprocal

    estimate and not worry about getting the exact answer). Now there's

    a separate question about how many Newton iterations to use, and we

    have a separate flag for that (-mrecip=...). Check out the

    implementation of TargetLoweringBase::getRecipEstimateSqrtEnabled

    to see how it's set up in the backend. This is, however, per function, so

    we don't currently have a per-operation control on this.<br>
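    <br>
    As a rough illustration of the accuracy/iteration trade-off (a sketch, not LLVM's actual lowering): the code below starts from a deliberately crude reciprocal estimate and applies Newton-Raphson steps x = x * (2 - d * x). The helper names and the 12-bit starting accuracy are assumptions made for the example; real hardware estimates and the step counts selected through -mrecip will differ.<br>
    <br>
```python
import math
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def crude_recip(d, good_bits=12):
    """Stand-in for a hardware reciprocal estimate (hypothetical):
    1/d with the significand truncated to `good_bits` bits."""
    mant, exp = math.frexp(1.0 / d)  # mant lies in [0.5, 1) for positive d
    scale = 2.0 ** good_bits
    return math.ldexp(math.floor(mant * scale) / scale, exp)


def refine(d, x, steps):
    """Apply `steps` Newton-Raphson iterations x = x * (2 - d * x).

    Each step's result is rounded to binary32. The arithmetic inside a
    step is done in binary64, which is kinder than real single-precision
    hardware would be (an assumption of this sketch).
    """
    for _ in range(steps):
        x = to_f32(x * (2.0 - d * x))
    return x


def rel_error(d, x):
    """Relative error of x as an approximation of 1/d."""
    return abs(d * x - 1.0)


if __name__ == "__main__":
    d = 3.0
    x0 = crude_recip(d)
    for steps in range(3):
        # Print the number of refinement steps and the remaining error.
        print(steps, rel_error(d, refine(d, x0, steps)))
```
    <br>
    Each step roughly squares the relative error until single-precision rounding dominates, which is why an operation that only needs "mostly right" bits can skip the refinement entirely.<br>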

    <br>

    <blockquote

cite="mid:CAMU+Ekwm2tUVRPHZbBjBg6OCODTHfc46VDiQwfAU6=CLabUa6Q@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><br>

        </div>

        <div>The OpenCL spec defines a number of compile flags

          controlling optimizations. Some seem to map well onto the

          flags already discussed here:</div>

        <div><br>

        </div>

        <div>-cl-mad-enable<br>

        </div>

        <div>-cl-no-signed-zeros<br>

        </div>

        <div>-cl-finite-math-only<br>

        </div>

        <div><br>

        </div>

        <div>However it looks to me that the following ones don't

          presently map well to LLVM:</div>

        <div><br>

        </div>

        <div>

          <div>-cl-unsafe-math-optimizations</div>

          <div>Allow optimizations for floating-point arithmetic that

            (a) assume that arguments and results are valid, (b) may

            violate IEEE 754 standard and (c) may violate the OpenCL

            numerical compliance requirements as defined in the SPIR-V

            OpenCL environment specification for single precision and

            double precision floating-point, and edge case behavior in

            the SPIR-V OpenCL environment specification. This option

            includes the -cl-no-signed-zeros and -cl-mad-enable options.</div>

        </div>

      </div>

    </blockquote>

    <br>

    I think the idea is that this flag, like

    -funsafe-math-optimizations, gets mapped to an appropriate

    collection of finer-grained flags internally.<br>

    <br>

    <blockquote

cite="mid:CAMU+Ekwm2tUVRPHZbBjBg6OCODTHfc46VDiQwfAU6=CLabUa6Q@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><br>

        </div>

        <div>

          <div>-cl-fast-relaxed-math</div>

          <div>Sets the optimization options -cl-finite-math-only and

            -cl-unsafe-math-optimizations. This allows optimizations for

            floating-point arithmetic that may violate the IEEE 754

            standard and the OpenCL numerical compliance requirements

            for single precision and double precision floating-point, as

            well as floating point edge case behavior. This option also

            relaxes the precision of commonly used math functions. This

            option causes the preprocessor macro __FAST_RELAXED_MATH__

            to be defined in the OpenCL program. The original and

            modified values are defined in the SPIR-V OpenCL environment

            specification.</div>

        </div>

        <div><br>

        </div>

        <div>I'd like to emphasise in the latter one: "This option also

          relaxes the precision of commonly used math functions."</div>

      </div>

    </blockquote>

    <br>

    Isn't this the "libm" flag that is proposed in this thread?<br>
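    <br>
    To make the "umbrella flag expands to finer-grained flags" idea concrete, here is a small sketch. The inclusion table follows the OpenCL text quoted above; the mapping from each OpenCL option to the fast-math flags proposed in this thread ('reassoc', 'libm', 'nnan', 'ninf', 'nsz', 'arcp', 'contract') is purely an assumption for illustration, not what any frontend is known to do.<br>
    <br>
```python
# Which options each umbrella option includes, per the quoted OpenCL text.
INCLUDES = {
    "-cl-unsafe-math-optimizations": ["-cl-no-signed-zeros", "-cl-mad-enable"],
    "-cl-fast-relaxed-math": ["-cl-finite-math-only",
                              "-cl-unsafe-math-optimizations"],
}

# ASSUMPTION: an illustrative mapping onto the fast-math flags proposed in
# this thread. The assignments are hypothetical, chosen only for the example.
OWN_FLAGS = {
    "-cl-mad-enable": {"contract"},
    "-cl-no-signed-zeros": {"nsz"},
    "-cl-finite-math-only": {"nnan", "ninf"},
    "-cl-unsafe-math-optimizations": {"reassoc", "arcp"},
    "-cl-fast-relaxed-math": {"libm"},
}


def expand(option):
    """Return the set of fine-grained flags implied by `option`,
    following the inclusion table recursively. Unknown options
    expand to the empty set."""
    flags = set(OWN_FLAGS.get(option, ()))
    for included in INCLUDES.get(option, ()):
        flags |= expand(included)
    return flags


if __name__ == "__main__":
    for opt in ("-cl-mad-enable",
                "-cl-unsafe-math-optimizations",
                "-cl-fast-relaxed-math"):
        print(opt, sorted(expand(opt)))
```
    <br>
    With a table like this there is no need for a single "fast" bit: the umbrella exists only at the command-line level, and the IR carries just the individual flags.<br>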

    <br>

     -Hal<br>

    <br>

    <blockquote

cite="mid:CAMU+Ekwm2tUVRPHZbBjBg6OCODTHfc46VDiQwfAU6=CLabUa6Q@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div><br>

        </div>

      </div>

      <div class="gmail_extra"><br>

        <div class="gmail_quote">On Mon, Oct 2, 2017 at 4:45 PM, Ristow,

          Warren via llvm-dev <span dir="ltr">&lt;<a

              moz-do-not-send="true"

              href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;</span>

          wrote:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div link="blue" vlink="purple" lang="EN-US">

              <div class="m_-6162699180708653109WordSection1">

                <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#44546a">I'm

                    not aware of any additional bits needed.  But

                    putting us right at the edge leaves me

                    uncomfortable.  So an implementation that isn't

                    limited by the 7 bits in SubclassOptionalData seems

                    sensible.</span></p>

                <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#44546a"> </span></p>

                <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#44546a">Thanks,</span></p>

                <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#44546a">-Warren</span></p>

                <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:'Calibri','sans-serif';color:#44546a"> </span></p>

                <div style="border:none;border-top:solid #b5c4df

                  1.0pt;padding:3.0pt 0in 0in 0in">

                  <p class="MsoNormal"><b><span

style="font-size:10.0pt;font-family:'Tahoma','sans-serif'">From:</span></b><span

style="font-size:10.0pt;font-family:'Tahoma','sans-serif'">

                      Sanjay Patel [mailto:<a moz-do-not-send="true"

                        href="mailto:spatel@rotateright.com"

                        target="_blank">spatel@rotateright.com</a><wbr>]

                      <br>

                      <b>Sent:</b> Monday, October 2, 2017 12:06 AM<br>

                      <b>To:</b> Ristow, Warren<br>

                      <b>Cc:</b> Hal Finkel; <a moz-do-not-send="true"

                        href="mailto:llvm-dev@lists.llvm.org"

                        target="_blank">llvm-dev@lists.llvm.org</a><br>

                      <b>Subject:</b> Re: [llvm-dev] Trouble when

                      suppressing a portion of fast-math-transformations</span></p>

                </div>

                <div>

                  <div class="h5">

                    <p class="MsoNormal"> </p>

                    <div>

                      <div>

                        <p class="MsoNormal">Are we confident that we

                          just need those 7 bits to represent all of the

                          relaxed FP states that we need/want to

                          support?

                        </p>

                      </div>

                      <div>

                        <p class="MsoNormal"> </p>

                      </div>

                      <div>

                        <p class="MsoNormal"

                          style="margin-bottom:12.0pt">I'm asking

                          because FMF in IR is currently mapped onto the

                          SubclassOptionalData of Value...and we have

                          exactly 7 bits there. :)</p>

                      </div>

                      <p class="MsoNormal">If we're redoing the

                        definitions, I'm wondering if we can share the

                        struct with the backend's SDNodeFlags, but that

                        already has one extra bit for vector reduction.

                        Should we give up on SubclassOptionalData for

                        FMF? We have a MD_fpmath enum value for

                        metadata, so we could move things over there?</p>

                      <div>

                        <p class="MsoNormal"> </p>

                      </div>

                    </div>

                    <div>

                      <p class="MsoNormal"> </p>

                      <div>

                        <p class="MsoNormal">On Fri, Sep 29, 2017 at

                          8:16 PM, Ristow, Warren via llvm-dev &lt;<a

                            moz-do-not-send="true"

                            href="mailto:llvm-dev@lists.llvm.org"

                            target="_blank">llvm-dev@lists.llvm.org</a>&gt;

                          wrote:</p>

                        <p class="MsoNormal">Hi Hal,<br>

                          <br>

                          >> 4. To fix this, I think that

                          additional fast-math-flags are likely<br>

                          >> needed in the IR.  Instead of the

                          following set:<br>

                          >><br>

                          >> 'nnan' + 'ninf' + 'nsz' + 'arcp' +

                          'contract'<br>

                          >><br>

                          >> something like this:<br>

                          >><br>

                          >> 'reassoc' + 'libm' + 'nnan' + 'ninf'

                          + 'nsz' + 'arcp' + 'contract'<br>

                          >><br>

                          >> would be more useful.  Related to

                          this, the current 'fast' flag which acts<br>

                          >> as an umbrella (enabling 'nnan' +

                          'ninf' + 'nsz' + 'arcp' + 'contract') may<br>

                          >> not be needed.  A discussion on this

                          point was raised last November on the<br>

                          >> mailing list:<br>

                          >><br>

                          >> <a moz-do-not-send="true"

href="http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html"

                            target="_blank">

                            http://lists.llvm.org/<wbr>pipermail/llvm-dev/2016-<wbr>November/107104.html</a><br>

                          ><br>

                          > I agree. I'm happy to help review the

                          patches. It will be best to have<br>

                          > only the finer-grained flags where

                          there's no "fast" flag that implies<br>

                          > all of the others.<br>

                          <br>

                          Thanks for the quick response, and for the

                          willingness to review.  I won't let<br>

                          this languish so long, like the post from last

                          November.<br>

                          <br>

                          Happy to hear that you feel it's best not to

                          have the umbrella "fast" flag.<br>

                          <br>

                          Thanks again,</p>

                        <div>

                          <div>

                            <p class="MsoNormal">-Warren<br>

                              ______________________________<wbr>_________________<br>

                              LLVM Developers mailing list<br>

                              <a moz-do-not-send="true"

                                href="mailto:llvm-dev@lists.llvm.org"

                                target="_blank">llvm-dev@lists.llvm.org</a><br>

                              <a moz-do-not-send="true"

                                href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"

                                target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a></p>

                          </div>

                        </div>

                      </div>

                      <p class="MsoNormal"> </p>

                    </div>

                  </div>

                </div>

              </div>

            </div>


          </blockquote>

        </div>

        <br>

      </div>

      <br>


    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Hal Finkel

Lead, Compiler Technology and Programming Languages

Leadership Computing Facility

Argonne National Laboratory</pre>

  </body>

</html>