<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html;

      charset=windows-1252">

  </head>

  <body>

    <p><br>

    </p>

    <div class="moz-cite-prefix">On 5/4/20 3:04 AM, Sjoerd Meijer via

      llvm-dev wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:VI1PR08MB2640697713F2F00B320B6EACFCA60@VI1PR08MB2640.eurprd08.prod.outlook.com">

      <meta http-equiv="Content-Type" content="text/html;

        charset=windows-1252">

      <style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        > The harm comes if the intrinsic ends up with the wrong

        value, or attached to the wrong loop.<br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        <br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        The intrinsic is marked as IntrNoDuplicate, so I wasn't worried

        about it ending up somewhere else. Also, it is a property of a

        specific loop, a tail-folded vector loop, that holds even after

        it is transformed I think. I.e. unrolling a vector loop is

        probably not what you want, but even if you do the element count

        would remain the same. But yes, I agree that a future whacky

        optimisation on vector loops could invalidate this, which you

        can then skip but then you lose out on it.... So, I really like

        this:</div>

    </blockquote>

    <p>This approach really doesn't work.  Not unless you're willing to

      impose legality restrictions on optimization passes to preserve

      the information.  <br>

    </p>

    <p><br>

    </p>

    <p>It's helpful to think of the optimizer as being adversarial.  The

      question is not "will the optimizer break this?"; it's "can a

      malicious optimizer break this?".  Unless you can reason from the

      spec (LangRef) that the answer is no, then the answer is yes.</p>

    <p><br>

    </p>

    <p>In your particular example, consider what might happen if loop

      fission runs on your vectorized loop, and we then recognize that

      iterations N through M of the first loop (after fission) were no-ops

      and split it into two loops over narrower ranges.  You'd have real

      trouble matching your intrinsic to anything meaningful in the

      backend, and getting it wrong would be a correctness bug.<br>

    </p>
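    <p>To make that concrete, here is a rough Python model of the
      situation. It is a sketch only: the vector width, the split point,
      and the helper names <code>mask</code> and <code>run</code> are
      made up for illustration, and this is not how any LLVM pass is
      written. The preheader records an element count of 10; once the
      loop is split, the first piece covers only 8 elements, yet a count
      left behind in its preheader would still say 10.</p>

<pre>
```python
VF = 4  # vector width; an arbitrary choice for this sketch


def mask(base, limit):
    """Lane i is active when base + i is below limit."""
    return [limit > base + i for i in range(VF)]


def run(start, stop, limit):
    """Indices touched by a tail-folded vector loop stepping from start to stop."""
    touched = []
    for base in range(start, stop, VF):
        touched += [base + i for i, on in enumerate(mask(base, limit)) if on]
    return touched


n = 10                     # element count recorded in the preheader
original = run(0, 12, n)   # the unsplit loop touches indices 0 through 9

# Hypothetical split: the first loop now covers only bases 0 and 4.
first_half = run(0, 8, n)  # eight elements, not ten
stale_count = n            # what a preheader intrinsic would still claim
```
</pre>

    <p>The per-iteration masks stay correct in both loops because they
      are computed from their own operands; only the detached count goes
      stale.</p>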

    <blockquote type="cite"

cite="mid:VI1PR08MB2640697713F2F00B320B6EACFCA60@VI1PR08MB2640.eurprd08.prod.outlook.com">

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        <br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        > If the problem is specifically figuring out the underlying

        element count given a predicate, maybe we could attack it from

        that angle?  For example, introduce a special intrinsic for

        deriving the mask (sort of like the SVE whilelo).</div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        <br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        That would be an excellent way of doing it and it would also map

        very well to MVE too, where we have a VCTP intrinsic/instruction

        that creates the mask/predicate (Vector Create Tail-Predicate).

        So I will go for this approach. Such an intrinsic was actually

        also proposed in Sam's original RFC (see <a

          href="https://lists.llvm.org/pipermail/llvm-dev/2019-May/132512.html"

          id="LPlnk982545" moz-do-not-send="true">

          https://lists.llvm.org/pipermail/llvm-dev/2019-May/132512.html</a>),

        but we hadn't implemented it yet. This intrinsic will probably

        look something like this:</div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        <br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        &nbsp;&nbsp;&nbsp; &lt;N x i1&gt; @llvm.loop.get.active.mask(AnyInt, AnyInt)<br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        <br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        It produces a &lt;N x i1&gt; predicate based on its two

        arguments, the number of elements and the vector trip count, and

        it will be used by the masked load/store

        instructions in the vector body. I will start drafting an

        implementation for this and continue with this in D79100.<br>

      </div>
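      <p>A minimal Python sketch of what a mask of this kind could
        compute, modelled on the whilelo-style semantics mentioned
        above. The operands used here (the index of the first lane and
        the element count) and the name <code>get_active_mask</code> are
        assumptions for illustration; the intrinsic as proposed takes
        the number of elements and the vector trip count, and its final
        definition may differ.</p>

<pre>
```python
def get_active_mask(base, n, vf):
    """Whilelo-style mask: lane i is on while base + i is below n."""
    return [n > base + lane for lane in range(vf)]


n, vf = 10, 4  # illustrative values only
masks = [get_active_mask(base, n, vf) for base in range(0, n, vf)]
tail = masks[-1]                            # mask of the final, partial iteration
total_active = sum(sum(m) for m in masks)   # active lanes over the whole loop
```
</pre>

      <p>Because the element count is an operand of the mask itself, a
        backend can read it from the instruction that feeds the masked
        loads/stores rather than pattern matching for it.</p>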

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        <br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        Thanks,</div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        Sjoerd.<br>

      </div>

      <div style="font-family:Calibri,Arial,Helvetica,sans-serif;

        font-size:12pt; color:rgb(0,0,0)">

        <br>

      </div>

      <div style="font-family:Calibri,Arial,Helvetica,sans-serif;

        font-size:12pt; color:rgb(0,0,0)">

        <br>

      </div>

      <hr tabindex="-1" style="display:inline-block; width:98%">

      <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"

          face="Calibri, sans-serif" color="#000000"><b>From:</b> Eli

          Friedman <a class="moz-txt-link-rfc2396E" href="mailto:efriedma@quicinc.com">&lt;efriedma@quicinc.com&gt;</a><br>

          <b>Sent:</b> 01 May 2020 21:11<br>

          <b>To:</b> Sjoerd Meijer <a class="moz-txt-link-rfc2396E" href="mailto:Sjoerd.Meijer@arm.com">&lt;Sjoerd.Meijer@arm.com&gt;</a>;

          llvm-dev <a class="moz-txt-link-rfc2396E" href="mailto:llvm-dev@lists.llvm.org">&lt;llvm-dev@lists.llvm.org&gt;</a><br>

          <b>Subject:</b> RE: [llvm-dev] LV: predication</font>

        <div> </div>

      </div>

      <div lang="EN-US">

        <div class="x_WordSection1">

          <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

            font-size: 11pt; font-family: 'Calibri',

            sans-serif;">

             </p>

          <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

            font-size: 11pt; font-family: 'Calibri',

            sans-serif;">

             </p>

          <div>

            <div style="border:none; border-top:solid #E1E1E1 1.0pt;

              padding:3.0pt 0in 0in 0in">

              <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

                font-size: 11pt; font-family: 'Calibri',

                sans-serif;margin-left:.5in">

                <b>From:</b> Sjoerd Meijer <a class="moz-txt-link-rfc2396E" href="mailto:Sjoerd.Meijer@arm.com">&lt;Sjoerd.Meijer@arm.com&gt;</a>

                <br>

                <b>Sent:</b> Friday, May 1, 2020 11:54 AM<br>

                <b>To:</b> Eli Friedman <a class="moz-txt-link-rfc2396E" href="mailto:efriedma@quicinc.com">&lt;efriedma@quicinc.com&gt;</a>;

                llvm-dev <a class="moz-txt-link-rfc2396E" href="mailto:llvm-dev@lists.llvm.org">&lt;llvm-dev@lists.llvm.org&gt;</a><br>

                <b>Subject:</b> [EXT] Re: [llvm-dev] LV: predication</p>

            </div>

          </div>

          <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

            font-size: 11pt; font-family: 'Calibri',

            sans-serif;margin-left:.5in">

             </p>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black">Hi Eli,</span></p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black"> </span></p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black">> The

                problem with your proposal, as written, is that the

                vectorizer is producing the intrinsic.  Because we don’t

                impose any ordering on optimizations before codegen,

                every optimization pass in LLVM would have to be taught

                to preserve any @llvm.set.loop.elements.i32 whenever it

                makes any change.  This is completely impractical

                because the intrinsic isn’t related to anything

                optimizations would normally look for: it’s a random

                intrinsic in the middle of nowhere.</span></p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black"> </span></p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black">I do see that

                point. But is that also not the beauty of it? It just

                sits in the preheader; if it gets removed, then so be it.

                And if it is not recognised, then also no harm done?</span></p>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;">

               </p>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;">

              The harm comes if the intrinsic ends up with the wrong

              value, or attached to the wrong loop.

            </p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black"> </span></p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black">> Probably

                the simplest path to get this working is to derive the

                number of elements in the backend (in HardwareLoops, or

                your tail predication pass). You should be able to

                figure it from the masks used in the

                llvm.masked.load/store instructions in the loop.</span></p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black"> </span></p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black">This is what

                we are currently doing, and it works excellently for simpler

                cases. For the more complicated cases that we now want

                to handle as well, the pattern matching just becomes a

                bit too horrible, and it is fragile too. All we need is

                the information that the vectoriser already has, and

                pass this on somehow.</span></p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black"> </span></p>

          </div>

          <div>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;margin-left:.5in">

              <span style="font-size:12.0pt; color:black">As I am really

                keen to simplify our backend pass, would there be another

                way to pass this information on? If emitting an

                intrinsic is a blocker, could this be done with a loop

                annotation?</span></p>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;">

               </p>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;">

              If the problem is specifically figuring out the underlying

              element count given a predicate, maybe we could attack it

              from that angle?  For example, introduce a special

              intrinsic for deriving the mask (sort of like the SVE

              whilelo).</p>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;">

               </p>

            <p class="x_MsoNormal" style="margin: 0in 0in 0.0001pt;

              font-size: 11pt; font-family: 'Calibri',

              sans-serif;">

              -Eli<span style="font-size:12.0pt; color:black"> </span></p>

          </div>

          <div>

            <div>

              <div>

                <p class="x_xmsonormal" style="margin: 0in 0in 0.0001pt;

                  font-size: 11pt; font-family: 'Calibri',

                  sans-serif;margin-left:.5in">

                  <span style="font-size:12.0pt; color:black"> </span></p>

              </div>

            </div>

          </div>

        </div>

      </div>

      <br>


    </blockquote>

  </body>

</html>