<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:Wingdings;
        panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0cm;
        margin-right:0cm;
        margin-bottom:0cm;
        margin-left:36.0pt;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
span.EmailStyle18
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:514345176;
        mso-list-type:hybrid;
        mso-list-template-ids:1042861914 2055364180 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
        {mso-level-text:"%1\)";
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:22.5pt;
        text-indent:-18.0pt;}
@list l0:level2
        {mso-level-number-format:alpha-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:58.5pt;
        text-indent:-18.0pt;}
@list l0:level3
        {mso-level-number-format:roman-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:right;
        margin-left:94.5pt;
        text-indent:-9.0pt;}
@list l0:level4
        {mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:130.5pt;
        text-indent:-18.0pt;}
@list l0:level5
        {mso-level-number-format:alpha-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:166.5pt;
        text-indent:-18.0pt;}
@list l0:level6
        {mso-level-number-format:roman-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:right;
        margin-left:202.5pt;
        text-indent:-9.0pt;}
@list l0:level7
        {mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:238.5pt;
        text-indent:-18.0pt;}
@list l0:level8
        {mso-level-number-format:alpha-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:274.5pt;
        text-indent:-18.0pt;}
@list l0:level9
        {mso-level-number-format:roman-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:right;
        margin-left:310.5pt;
        text-indent:-9.0pt;}
@list l1
        {mso-list-id:777796762;
        mso-list-type:hybrid;
        mso-list-template-ids:799581388 97932382 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l1:level1
        {mso-level-start-at:0;
        mso-level-number-format:bullet;
        mso-level-text:-;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:22.5pt;
        text-indent:-18.0pt;
        font-family:"Calibri","sans-serif";
        mso-fareast-font-family:Calibri;
        mso-bidi-font-family:"Times New Roman";}
@list l1:level2
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:58.5pt;
        text-indent:-18.0pt;
        font-family:"Courier New";}
@list l1:level3
        {mso-level-number-format:bullet;
        mso-level-text:\F0A7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:94.5pt;
        text-indent:-18.0pt;
        font-family:Wingdings;}
@list l1:level4
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:130.5pt;
        text-indent:-18.0pt;
        font-family:Symbol;}
@list l1:level5
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:166.5pt;
        text-indent:-18.0pt;
        font-family:"Courier New";}
@list l1:level6
        {mso-level-number-format:bullet;
        mso-level-text:\F0A7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:202.5pt;
        text-indent:-18.0pt;
        font-family:Wingdings;}
@list l1:level7
        {mso-level-number-format:bullet;
        mso-level-text:\F0B7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:238.5pt;
        text-indent:-18.0pt;
        font-family:Symbol;}
@list l1:level8
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:274.5pt;
        text-indent:-18.0pt;
        font-family:"Courier New";}
@list l1:level9
        {mso-level-number-format:bullet;
        mso-level-text:\F0A7;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:310.5pt;
        text-indent:-18.0pt;
        font-family:Wingdings;}
ol
        {margin-bottom:0cm;}
ul
        {margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">This is somewhat similar to the previous thread regarding missed vectorization<o:p></o:p></p>
<p class="MsoNormal">opportunities (<a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084765.html">http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084765.html</a>),<o:p></o:p></p>
<p class="MsoNormal">but maybe different enough to require a new thread.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I’m seeing some missed vectorization opportunities in the loop vectorizer because SCEV<o:p></o:p></p>
<p class="MsoNormal">is not able to fold sext/zext expressions into recurrence expressions (AddRecExpr).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">This can manifest in multiple ways:<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt;text-indent:-18.0pt;mso-list:l1 level1 lfo2">
<![if !supportLists]><span style="mso-list:Ignore">-<span style="font:7.0pt "Times New Roman"">         
</span></span><![endif]>We cannot get the backedge-taken count because SCEV may be left with something like (sext {1,+,1}),<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt">which it can’t evaluate because the narrow recurrence may overflow<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt;text-indent:-18.0pt;mso-list:l1 level1 lfo2">
<![if !supportLists]><span style="mso-list:Ignore">-<span style="font:7.0pt "Times New Roman"">         
</span></span><![endif]>We cannot get SCEV AddRec expressions for pointers which need runtime checks, and the<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt">loop vectorizer fails with a “Can't vectorize due to memory conflicts” error.<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt"><o:p> </o:p></p>
<p class="MsoListParagraph" style="margin-left:0cm">I think there are two cases:<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt;text-indent:-18.0pt;mso-list:l0 level1 lfo4">
<![if !supportLists]><span style="mso-list:Ignore">1)<span style="font:7.0pt "Times New Roman"">     
</span></span><![endif]>It would be possible for SCEV to prove that it is safe to fold the sext/zext nodes into an AddRec<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt">expression, but this doesn’t happen because either the nsw/nuw flags have been lost or the code<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt">that would infer the nsw/nuw flags for that particular case is missing<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt;text-indent:-18.0pt;mso-list:l0 level1 lfo4">
<![if !supportLists]><span style="mso-list:Ignore">2)<span style="font:7.0pt "Times New Roman"">     
</span></span><![endif]>It is actually possible for some operations to overflow, so folding sext/zext nodes into AddRec<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt">expressions would be incorrect.<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt"><o:p> </o:p></p>
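<p class="MsoNormal">A minimal sketch of case 2), assuming a 16-bit recurrence {1,+,1} that is sign-extended to 32 bits. The helper names are hypothetical and this is plain C, not SCEV code. It finds the first iteration at which the extended narrow value and the wide recurrence disagree, which is why the fold needs a no-wrap proof.<o:p></o:p></p>

```c
#include <stdint.h>

/* sext i16 -> i32 of the narrow recurrence {1,+,1} at iteration n.
   The add is done modulo 2^16, mirroring a wrapping i16 add. */
static int32_t sext_of_narrow(uint32_t n) {
    uint32_t narrow = (1u + n) & 0xFFFFu;      /* bit pattern of the i16 value */
    return narrow >= 0x8000u ? (int32_t)narrow - 65536   /* sign bit set */
                             : (int32_t)narrow;
}

/* The folded form: the same recurrence {1,+,1} evaluated directly in i32.
   Assumption: n is small enough that 1 + n fits in int32_t. */
static int32_t wide_addrec(uint32_t n) {
    return (int32_t)(1u + n);
}

/* First iteration below `limit` where the two forms differ, or -1 if none. */
static int64_t first_mismatch(uint32_t limit) {
    for (uint32_t n = 0; n < limit; n++)
        if (sext_of_narrow(n) != wide_addrec(n))
            return (int64_t)n;
    return -1;
}
```

<p class="MsoNormal">Under these assumptions the two forms agree while the narrow value stays at or below 32767 and first differ at iteration 32767, where the narrow value wraps to -32768 but the wide one is 32768.<o:p></o:p></p>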
<p class="MsoListParagraph" style="margin-left:0cm">Here is an example where we fail to get the number of back-edge branches taken because of sext/zext<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:0cm">operations:<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:0cm"><o:p> </o:p></p>
<p class="MsoNormal">void test0(unsigned short a, unsigned short *  in, unsigned short * out) {<o:p></o:p></p>
<p class="MsoNormal">&nbsp;&nbsp;for (unsigned short w = 1; w &lt; a - 1; w++) // this will never overflow<o:p></o:p></p>
<p class="MsoNormal">      out[w] = in[w+7] * 2;<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:0cm">}<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:0cm"><o:p> </o:p></p>
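<p class="MsoNormal">To make the role of the extends in test0 concrete, here is a small sketch, assuming int is wider than 16 bits. C's integer promotions evaluate both w and a - 1 as int, so the comparison happens in the wide type while w itself is a 16-bit value. The helper names are hypothetical, and the closed form is what one would hope an analysis could derive, not something SCEV is claimed to produce today.<o:p></o:p></p>

```c
/* Trip count of test0's loop, found by running the loop header as written.
   `w < a - 1` compares two ints: w is zero-extended and a - 1 is computed
   in int, so a == 0 gives the bound -1 rather than 65535. */
static unsigned trip_count_simulated(unsigned short a) {
    unsigned n = 0;
    for (unsigned short w = 1; w < a - 1; w++)
        n++;
    return n;
}

/* Closed form in the wide type: max(a - 2, 0).
   Assumption: int is wider than 16 bits, so (int)a - 1 cannot overflow. */
static unsigned trip_count_closed_form(unsigned short a) {
    int bound = (int)a - 1;
    return bound > 1 ? (unsigned)(bound - 1) : 0u;
}
```

<p class="MsoNormal">Since a is at most 65535, w never exceeds 65534 before the exit test fails, so the 16-bit increment cannot wrap. That is the fact the analysis would have to prove in order to treat the zero-extended w as a wide recurrence.<o:p></o:p></p>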
<p class="MsoListParagraph" style="margin-left:0cm">Is there anyone working on improving the 1) aspect of SCEV? If so, maybe some coordination of effort<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:0cm">might be a good idea.<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:0cm"><o:p> </o:p></p>
<p class="MsoListParagraph" style="margin-left:0cm">Since the issue seems to be that certain operations can overflow and SCEV cannot properly reason about<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:0cm">overflows and extend operations, would it make more sense to try and:<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt;text-indent:-18.0pt;mso-list:l1 level1 lfo2">
<![if !supportLists]><span style="mso-list:Ignore">-<span style="font:7.0pt "Times New Roman"">         
</span></span><![endif]>Promote values that go into the trip count calculation and memory access indices to the smallest type<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt">which would remove sext/zext/trunc operations from the loop body. This should remove the sext/zext<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt">issue, as SCEV wouldn’t have to deal with these operations.<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt;text-indent:-18.0pt;mso-list:l1 level1 lfo2">
<![if !supportLists]><span style="mso-list:Ignore">-<span style="font:7.0pt "Times New Roman"">         
</span></span><![endif]>Add nsw/nuw flags where necessary<o:p></o:p></p>
<p class="MsoListParagraph" style="margin-left:22.5pt;text-indent:-18.0pt;mso-list:l1 level1 lfo2">
<![if !supportLists]><span style="mso-list:Ignore">-<span style="font:7.0pt "Times New Roman"">         
</span></span><![endif]>Add runtime checks (outside the loop) to detect overflows in the original loop<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Would there be any fundamental issue with this approach? I think it would be preferable to point fixes<o:p></o:p></p>
<p class="MsoNormal">for case 1), so if anyone is working on something similar it would be good to know.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
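<p class="MsoNormal">The promotion plus runtime-check idea can be sketched at the source level as loop versioning. This is only an illustration under stated assumptions, not LLVM code: the function names are hypothetical, the loop is a variant of test0 chosen so that the 16-bit induction variable really can wrap (when start is greater than end), and both arrays are assumed to hold 65536 elements.<o:p></o:p></p>

```c
#include <stddef.h>

/* Original form: the 16-bit induction variable wraps when start > end,
   so extends of w cannot be folded into a wide recurrence. */
static void scale_narrow(unsigned short start, unsigned short end,
                         const unsigned short *in, unsigned short *out) {
    for (unsigned short w = start; w != end; w++)
        out[w] = (unsigned short)(in[w] * 2);
}

/* Versioned form: a check outside the loop selects a promoted loop that
   provably does not wrap, and falls back to the original loop otherwise.
   Returns 1 when the promoted loop ran, 0 when the fallback ran. */
static int scale_versioned(unsigned short start, unsigned short end,
                           const unsigned short *in, unsigned short *out) {
    if (start <= end) {
        /* No wrap possible: the index is a plain wide recurrence and the
           loop body contains no extend or truncate of the index. */
        for (size_t w = start; w != end; w++)
            out[w] = (unsigned short)(in[w] * 2);
        return 1;
    }
    scale_narrow(start, end, in, out);  /* wrapping case keeps original semantics */
    return 0;
}
```

<p class="MsoNormal">The design choice here is that the check is evaluated once before the loop, so the promoted loop carries the no-wrap guarantee the email asks for while the fallback preserves the behavior of the original loop.<o:p></o:p></p>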
<p class="MsoNormal">Thanks,<o:p></o:p></p>
<p class="MsoNormal">Silviu<o:p></o:p></p>
</div>
</body>
</html>