<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 03/14/2017 09:00 AM, Hal Finkel via
      llvm-dev wrote:<br>
    </div>
    <blockquote cite="mid:28be6f10-7567-6091-4e2a-8c6190d9fcd5@anl.gov"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <p><br>
      </p>
      <div class="moz-cite-prefix">On 03/14/2017 08:00 AM, Nema,
        Ashutosh wrote:<br>
      </div>
      <blockquote
cite="mid:CY4PR12MB17993294734263A4A7EB2C2AFB240@CY4PR12MB1799.namprd12.prod.outlook.com"
        type="cite">
        <meta name="Generator" content="Microsoft Word 15 (filtered
          medium)">
        <style><!--
/* Font Definitions */
@font-face
        {font-family:Helvetica;
        panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.HTMLPreformattedChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:Consolas;}
span.EmailStyle19
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:#1F497D;}
span.EmailStyle20
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
ol
        {margin-bottom:0in;}
ul
        {margin-bottom:0in;}
--></style>
        <div class="WordSection1">
          <p class="MsoNormal">Summarizing the discussion on the
            implementation approaches.<o:p></o:p></p>
          <p class="MsoNormal"><o:p> </o:p></p>
          <p class="MsoNormal">We discussed two approaches. The first
            runs ‘InnerLoopVectorizer’ again on the epilog loop
            immediately after vectorizing the original loop, within the
            same vectorization pass. The second re-runs the
            vectorization pass and limits the vectorization factor of
            the epilog loop through metadata.<o:p></o:p></p>
          <p class="MsoNormal"><o:p> </o:p></p>
          <p class="MsoNormal">&lt;Approach-2&gt;<o:p></o:p></p>
          <p class="MsoNormal">Challenges with re-running the vectorizer
            pass:<o:p></o:p></p>
          <p class="MsoListParagraph"
            style="text-indent:-.25in;mso-list:l0 level1 lfo2"><!--[if !supportLists]--><span
              style="mso-list:Ignore">1)<span style="font:7.0pt
                &quot;Times New Roman&quot;">      </span></span><!--[endif]-->Reusing
            the alias check result:<o:p></o:p></p>
          <p class="MsoListParagraph">When the vectorizer pass runs again,
            it sees the epilog loop as a new loop and may generate a new
            alias check. The cost of this new check may cancel out the
            gains of epilog vectorization.<o:p></o:p></p>
          <p class="MsoListParagraph">We should reuse the already computed
            alias check result instead of recomputing it.<o:p></o:p></p>
          <p class="MsoListParagraph"
            style="text-indent:-.25in;mso-list:l0 level1 lfo2"><!--[if !supportLists]--><span
              style="mso-list:Ignore">2)<span style="font:7.0pt
                &quot;Times New Roman&quot;">      </span></span><!--[endif]-->Rerunning
            the vectorizer and hoisting the new alias check:<o:p></o:p></p>
          <p class="MsoListParagraph">It is not possible to hoist the new
            alias check: it is not fully redundant (it is not dominated
            by the other checks), because it is not executed on all
            paths.<o:p></o:p></p>
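          <p class="MsoNormal">To make the cost argument in point 1
            concrete, here is a minimal sketch of a runtime alias check
            and of the difference between recomputing and reusing its
            result. It is a plain C++ model, not vectorizer code:
            addresses are modeled as integers, and Range, disjoint,
            guardRecomputed, and guardReused are hypothetical names.</p>
<pre>
```cpp
// Addresses are modeled as plain integers; all names here are hypothetical
// and exist only to illustrate the reuse argument.
struct Range {
  unsigned long long begin; // first byte accessed
  unsigned long long size;  // number of bytes accessed
};

static int aliasChecksRun = 0; // counts evaluations of the runtime check

// Runtime alias check: true when the two byte ranges do not overlap.
static bool disjoint(Range a, Range b) {
  ++aliasChecksRun;
  return a.begin + a.size <= b.begin || b.begin + b.size <= a.begin;
}

// Epilog treated as a brand-new loop: its guard evaluates the check again.
static bool guardRecomputed(Range dst, Range src) {
  bool mainOk = disjoint(dst, src);
  bool epilogOk = disjoint(dst, src);
  return mainOk && epilogOk;
}

// Result computed once for the main loop and reused for the epilog guard.
static bool guardReused(Range dst, Range src) {
  bool mainOk = disjoint(dst, src);
  bool epilogOk = mainOk; // no second evaluation
  return mainOk && epilogOk;
}
```
</pre>
          <p class="MsoNormal">In this model guardRecomputed evaluates
            the check twice per entry, while guardReused evaluates it
            once.</p>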
          <p class="MsoNormal"><o:p> </o:p></p>
          <p class="MsoNormal"><img id="Picture_x0020_1"
              src="cid:part1.02BF12B2.A8AAF4B3@anl.gov" height="156"
              width="567"><o:p></o:p></p>
          <p class="MsoNormal"><o:p> </o:p></p>
          <p class="MsoNormal">NOTE: We cannot move the alias check
            earlier, as it is expensive compared to the other checks.<o:p></o:p></p>
          <p class="MsoNormal"><o:p> </o:p></p>
          <p class="MsoNormal">&lt;Approach-1&gt;<o:p></o:p></p>
          <p class="MsoListParagraph"
            style="text-indent:-.25in;mso-list:l1 level1 lfo1"><!--[if !supportLists]--><span
              style="mso-list:Ignore">1)<span style="font:7.0pt
                &quot;Times New Roman&quot;">      </span></span><!--[endif]-->The
            current patch depends on the existing functionality of
            LoopVectorizer: it uses ‘InnerLoopVectorizer’ again to
            vectorize the epilog loop. Because this happens in the same
            vectorization pass, we have the flexibility to reuse the
            already computed alias check result and to limit the
            vectorization factor for the epilog loop.<o:p></o:p></p>
          <p class="MsoListParagraph"
            style="text-indent:-.25in;mso-list:l1 level1 lfo1"><!--[if !supportLists]--><span
              style="mso-list:Ignore">2)<span style="font:7.0pt
                &quot;Times New Roman&quot;">      </span></span><!--[endif]-->It
            does not generate the blocks for the new block layout
            explicitly; rather, it depends on
            ‘InnerLoopVectorizer::createEmptyLoop’ to generate the new
            block layout. The new block layout is generated
            automatically by calling ‘InnerLoopVectorizer::vectorize’
            again.<o:p></o:p></p>
          <p class="MsoListParagraph"
            style="text-indent:-.25in;mso-list:l1 level1 lfo1"><!--[if !supportLists]--><span
              style="mso-list:Ignore">3)<span style="font:7.0pt
                &quot;Times New Roman&quot;">      </span></span><!--[endif]-->A
            description of the block layout with epilog loop
            vectorization is available at<o:p></o:p></p>
          <p class="MsoListParagraph"><a moz-do-not-send="true"
href="https://reviews.llvm.org/file/data/fxg5vx3capyj257rrn5j/PHID-FILE-x6thnbf6ub55ep5yhalu/LayoutDescription.png">https://reviews.llvm.org/file/data/fxg5vx3capyj257rrn5j/PHID-FILE-x6thnbf6ub55ep5yhalu/LayoutDescription.png</a><o:p></o:p></p>
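          <p class="MsoNormal">As a rough illustration of what limiting
            the vectorization factor for the epilog loop means, the
            sketch below models the resulting loop structure in plain
            C++. It is not taken from the patch: addOne,
            splitIterations, and IterationSplit are hypothetical names,
            the inner lane loops stand in for vector operations, and
            both factors are assumed to be nonzero.</p>
<pre>
```cpp
// Hypothetical model of the loop structure after epilog vectorization.
// Each inner "lane" loop stands in for one vector operation of that width.
// Both mainVF and epilogVF are assumed to be greater than zero.
static void addOne(int *a, unsigned n, unsigned mainVF, unsigned epilogVF) {
  unsigned i = 0;
  for (; i + mainVF <= n; i += mainVF)       // main vector loop
    for (unsigned lane = 0; lane < mainVF; ++lane)
      a[i + lane] += 1;
  for (; i + epilogVF <= n; i += epilogVF)   // vectorized epilog, narrower VF
    for (unsigned lane = 0; lane < epilogVF; ++lane)
      a[i + lane] += 1;
  for (; i < n; ++i)                         // scalar remainder
    a[i] += 1;
}

// How many iterations each of the three loops above handles.
struct IterationSplit {
  unsigned mainIters;
  unsigned epilogIters;
  unsigned scalarIters;
};

static IterationSplit splitIterations(unsigned n, unsigned mainVF,
                                      unsigned epilogVF) {
  IterationSplit s;
  s.mainIters = n - n % mainVF;
  unsigned rest = n - s.mainIters;
  s.epilogIters = rest - rest % epilogVF;
  s.scalarIters = rest - s.epilogIters;
  return s;
}
```
</pre>
          <p class="MsoNormal">With illustrative values n = 23, mainVF =
            8, and epilogVF = 4, the main loop covers 16 iterations, the
            vectorized epilog covers 4, and 3 remain for the scalar
            loop.</p>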
          <p class="MsoNormal"><o:p> </o:p></p>
          <p class="MsoNormal">Approach-1 looks feasible; please comment
            if there are any objections.</p>
        </div>
      </blockquote>
      <br>
      I think this is reasonable. One thing: in the proposed block
      layout, if the alias check fails, we jump to "Min Iter Check 2".
      From there we re-check the alias-check result (which will be
      false again), and then jump to the scalar loop. This is one more
      branch than necessary in the case where the alias check fails. If
      the alias check fails, we should jump directly to the scalar loop.<br>
    </blockquote>
    <br>
    There's another issue as well. If the trip count is small, it is
    important that the critical path through the checks to the scalar
    loop is as small as possible. If we use this layout, then in the
    case where the trip count is very small, we've now introduced an
    extra check (or set of checks) to get to the scalar loop. We need to
    do it the other way: check the smaller trip count first. If that
    check fails, go to the scalar loop. Only if the smaller trip-count
    check succeeds do we check the larger trip count. The path length
    through the trip-count checks must be largest when we have the most
    work over which to
    amortize the checks (i.e. when the trip count is largest).<br>
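    <br>
    A minimal sketch of that ordering, written as a plain C++ model
    rather than vectorizer code. The names Path, Dispatch, and
    choosePath are hypothetical, and placing the alias check between
    the two trip-count checks is an assumption made for illustration:<br>
<pre>
```cpp
// Hypothetical model of the check ordering; not LLVM code.
enum class Path { Scalar, EpilogVector, MainVector };

struct Dispatch {
  Path path;  // loop that ends up executing first
  int checks; // runtime checks evaluated on the way there
};

// Assumed order: smaller trip-count check, alias check, larger trip-count
// check. noAlias is the already computed alias-check result.
static Dispatch choosePath(unsigned tripCount, bool noAlias, unsigned mainVF,
                           unsigned epilogVF) {
  int checks = 1;
  if (tripCount < epilogVF)      // too few iterations even for the epilog VF
    return {Path::Scalar, checks};
  ++checks;
  if (!noAlias)                  // failed alias check goes straight to scalar
    return {Path::Scalar, checks};
  ++checks;
  if (tripCount < mainVF)        // enough for the epilog VF only
    return {Path::EpilogVector, checks};
  return {Path::MainVector, checks};
}
```
</pre>
    With mainVF = 8 and epilogVF = 4 in this model, a trip count of 2
    reaches the scalar loop after one check, and a failed alias check
    reaches it after two, without the alias result being tested a
    second time.<br>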
    <br>
     -Hal<br>
    <br>
    <blockquote cite="mid:28be6f10-7567-6091-4e2a-8c6190d9fcd5@anl.gov"
      type="cite"> <br>
      Thanks again,<br>
      Hal<br>
      <br>
      <blockquote
cite="mid:CY4PR12MB17993294734263A4A7EB2C2AFB240@CY4PR12MB1799.namprd12.prod.outlook.com"
        type="cite">
        <div class="WordSection1">
          <p class="MsoNormal"><o:p></o:p></p>
          <p class="MsoNormal"><o:p> </o:p></p>
          ...</div>
      </blockquote>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
  </body>
</html>