<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Hi,</p>

    <p>The LLVM-VP extension (<a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D57504">https://reviews.llvm.org/D57504</a>)

      generalizes PatternMatch.h to match FP intrinsics as well as

      regular fp (vector) instructions with the same pattern. We use

      this to lift the pattern rewrites in InstSimplify and InstCombine

      to predicated vector instructions. The same logic could be applied

      to "scalar" constrained FP intrinsics. Hal has requested that the

      VP intrinsics model fp exception/rounding too.</p>

    <p>So the suggestion is to keep using fp exception/rounding mode

      arguments but teach LLVM to handle them in its optimizations

      and analyses.</p>
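    <p>To illustrate what "handling" those arguments could mean, here is a
      minimal sketch of a fold that consults the rounding mode and exception
      behavior before firing. The enums and the function name are hypothetical
      stand-ins and are not taken from D57504 or from LLVM itself; the rule
      shown (folding "x - (+0.0)" to "x") is only one example of a rewrite
      whose legality depends on the fp environment.</p>

```cpp
// Hypothetical stand-ins for the rounding-mode and exception-behavior
// arguments of a constrained FP operation. These are NOT LLVM's types.
enum class Rounding { Dynamic, ToNearest, Downward, Upward, TowardZero };
enum class Exceptions { Ignore, MayTrap, Strict };

// Decides whether "x - (+0.0)" may be rewritten to "x" under the given
// rounding mode and exception behavior.
bool canFoldFSubPosZero(Rounding rounding, Exceptions exceptions) {
    // (+0.0) - (+0.0) yields -0.0 when rounding toward negative infinity,
    // so the fold is unsafe whenever that mode cannot be ruled out.
    bool mayRoundDownward =
        rounding == Rounding::Downward || rounding == Rounding::Dynamic;

    // If x is a signaling NaN the subtraction raises "invalid"; this sketch
    // conservatively keeps the operation unless exceptions are ignored.
    bool exceptionsObservable = exceptions != Exceptions::Ignore;

    return !mayRoundDownward && !exceptionsObservable;
}
```

    <p>With this shape, an operation in the default environment (round to
      nearest, exceptions ignored) is still simplified, while one with dynamic
      rounding or strict exceptions is left alone.</p>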

    <tt>Example</tt><br>

    <tt>-----------</tt><tt><br>

    </tt><br>

    <tt>PatternMatch.h changes:

      <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D57504#change-cWgJ3XBlLNvs">https://reviews.llvm.org/D57504#change-cWgJ3XBlLNvs</a></tt><br>

    <tt>AddSub code in InstCombine:

      <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D57504#change-24P4gqRF9sNj">https://reviews.llvm.org/D57504#change-24P4gqRF9sNj</a></tt><br>

    <tt>Note that "visitPredicatedFSub" will match either the regular

      FSub instruction or the llvm.vp.fsub intrinsic.</tt><tt><br>

    </tt>
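    <p>As a rough, self-contained illustration of the "one pattern, two
      representations" idea, the sketch below uses a toy IR node rather than
      LLVM's classes. All names in it (Kind, Node, FSubMatch, matchAnyFSub) are
      hypothetical and do not reflect the actual PatternMatch.h interface in
      the patch.</p>

```cpp
// Toy IR: a node is either a plain fsub instruction, a predicated
// vp.fsub-style intrinsic call, or something else. Hypothetical types only.
enum class Kind { FSubInst, VPFSubIntrinsic, Other };

struct Node {
    Kind kind;
    int lhs;   // id of the first data operand
    int rhs;   // id of the second data operand
    int mask;  // id of the predicate operand, -1 if there is none
};

// Operands captured by a successful match.
struct FSubMatch {
    int lhs;
    int rhs;
    int mask;  // -1 when the matched node was the unpredicated instruction
};

// Matches "lhs - rhs" regardless of whether it is encoded as the regular
// instruction or as the predicated intrinsic, so that a rewrite rule built
// on top of it has to be written only once.
bool matchAnyFSub(const Node &n, FSubMatch &out) {
    if (n.kind != Kind::FSubInst && n.kind != Kind::VPFSubIntrinsic)
        return false;
    out.lhs = n.lhs;
    out.rhs = n.rhs;
    out.mask = (n.kind == Kind::VPFSubIntrinsic) ? n.mask : -1;
    return true;
}
```

    <p>A visitor written against matchAnyFSub sees the same operands in both
      cases and only needs to carry the mask over to whatever it emits.</p>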

    <p><br>

    </p>

    <p>- Simon</p>

    <p><br>

    </p>

    <div class="moz-cite-prefix">On 8/20/19 7:00 PM, Serge Pavlov via

      llvm-dev wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CACOhrX5geDxVPnNtX-kQZB5UgDKS1bS=fdpbfa80ek_bHpV9AA@mail.gmail.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <div dir="ltr">Hi all,<br>

        <br>

        During the review of <a href="https://reviews.llvm.org/D65997"

          moz-do-not-send="true">https://reviews.llvm.org/D65997</a> an

        issue was revealed, which relates to the decision of how the

        compiler should represent constrained floating point

        operations.<br>

        <br>

        If a floating point operation requires a rounding mode or

        exception behavior different from the default, it should be

        represented by a constrained intrinsic (<a

href="http://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics"

          moz-do-not-send="true">http://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics</a>).

        An important point is that according to the current design

        decision, if some part of a function contains such an intrinsic,

        all floating point operations in the function must be

        represented by constrained intrinsics as well. This decision

        is meant to prevent undesired movement of fp operations. The

        discussion is in the thread <a

          href="http://lists.llvm.org/pipermail/cfe-dev/2017-August/055325.html"

          moz-do-not-send="true">http://lists.llvm.org/pipermail/cfe-dev/2017-August/055325.html</a>,

        the relevant example is:<br>

        <br>

        <blockquote style="margin:0 0 0 40px;border:none;padding:0px">double

          f(double a, double b, double c) {<br>

            double d;<br>

            {<br>

          #pragma STDC FENV_ACCESS ON<br>

              feenableexcept(FE_OVERFLOW);<br>

              d = a * b;<br>

              fedisableexcept(FE_OVERFLOW);<br>

            }<br>

            return c * d;<br>

          }</blockquote>

        <br>

        The second fmul must not be hoisted up to before the

        fedisableexcept. Using constrained intrinsics is expected to

        help in this case as they are not handled by optimization

        passes.<br>

        <br>

        The concern is that using constrained intrinsics in a small

        region of a function results in using such intrinsics everywhere

        in the function, including functions that inline it. As

        constrained intrinsics inhibit optimizations, this can result

        in performance degradation.<br>

        <br>

        A couple of examples:<br>

        1. There is a performance critical function that makes most of

        its calculations in default fp mode, but at some points it enables

        fp exceptions and performs an action that can trigger such an

        exception. Using constrained intrinsics would result in

        performance loss, although the code that actually needs them is

        very compact.<br>

        2. Cores that are used for machine learning usually work with

        short data (half, bfloat16 or even shorter). Rounding control in

        this case is much more important than for big cores; using

        proper rounding in different parts of an algorithm can gain

        precision. Constrained intrinsics are the only way to enforce

        a particular rounding mode. However, using them results in poor

        optimization, which is intolerable. In such cores rounding mode

        may be encoded in instructions, so code movements cannot break

        semantics.<br>

        <br>

        Representation of fp operations could be more flexible, so that

        a user would not pay for rounding/exception control with

        performance degradation. For that we need to be able to mix

        constrained intrinsics and regular fp operations in a function.<br>

        <br>

        The question is: how can we prevent fp operations from moving

        across the boundaries of a region where specific rounding and/or

        exception behavior applies? Any ideas?

        <div><br>

          <div>

            <div dir="ltr" class="gmail_signature"

              data-smartmail="gmail_signature">Thanks,<br>

              --Serge<br>

            </div>

          </div>

        </div>

      </div>

      <br>


    </blockquote>

    <pre class="moz-signature" cols="72">-- 

Simon Moll

Researcher / PhD Student

Compiler Design Lab (Prof. Hack)

Saarland University, Computer Science

Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : <a class="moz-txt-link-abbreviated" href="mailto:moll@cs.uni-saarland.de">moll@cs.uni-saarland.de</a>

Fax. +49 (0)681 302-3065  : <a class="moz-txt-link-freetext" href="http://compilers.cs.uni-saarland.de/people/moll">http://compilers.cs.uni-saarland.de/people/moll</a></pre>

  </body>

</html>