<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:#1F497D;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:799957587;
mso-list-type:hybrid;
mso-list-template-ids:-1214880770 44585902 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-start-at:0;
mso-level-number-format:bullet;
mso-level-text:-;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-font-family:"Times New Roman";}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";
mso-bidi-font-family:"Times New Roman";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";
mso-bidi-font-family:"Times New Roman";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";
mso-bidi-font-family:"Times New Roman";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">></span><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> We also need to understand what to do with edge elements in the vector
if their loading is not required. We, probably, should issue a masked load in this case.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">The existing code solves such edge cases where the last element of an InterleaveGroup is absent by making sure the last iteration (and up to last VF iterations)
are peeled and executed scalarly; see requiresScalarEpilogue.</span><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p></o:p></span></p>
<p class="MsoNormal"><a name="_MailEndCompose"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></a></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">></span><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> All regressions that we see are in 32-bit mode.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">One place to find them, using the default BaseT::getInterleavedMemoryOpCost(), is DENBench’s RGB conversions.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Ayal.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><a name="_____replyseparator"></a><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Demikhovsky, Elena
<br>
<b>Sent:</b> Monday, August 08, 2016 00:09<br>
<b>To:</b> Michael Kuperstein <mkuper@google.com>; Renato Golin <renato.golin@linaro.org><br>
<b>Cc:</b> Matthew Simpson <mssimpso@codeaurora.org>; Nema, Ashutosh <Ashutosh.Nema@amd.com>; Sanjay Patel <spatel@rotateright.com>; llvm-dev <llvm-dev@lists.llvm.org>; Zaks, Ayal <ayal.zaks@intel.com><br>
<b>Subject:</b> RE: [llvm-dev] enabling interleaved access loop vectorization<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">We checked the gathered data again. All regressions that we see are in 32-bit mode. The 64-bit mode looks good overall.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal" style="margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0 level1 lfo2">
<![if !supportLists]><span style="font-family:"Calibri",sans-serif;color:#2F5496"><span style="mso-list:Ignore">-<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span dir="LTR"></span><b><i><span style="color:#2F5496"> Elena<o:p></o:p></span></i></b></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Michael Kuperstein [</span><a href="mailto:mkuper@google.com"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">mailto:mkuper@google.com</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">]
<br>
<b>Sent:</b> Saturday, August 06, 2016 02:56<br>
<b>To:</b> Renato Golin <</span><a href="mailto:renato.golin@linaro.org"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">renato.golin@linaro.org</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">><br>
<b>Cc:</b> Demikhovsky, Elena <</span><a href="mailto:elena.demikhovsky@intel.com"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">elena.demikhovsky@intel.com</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">>; Matthew
Simpson <</span><a href="mailto:mssimpso@codeaurora.org"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">mssimpso@codeaurora.org</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">>; Nema, Ashutosh <</span><a href="mailto:Ashutosh.Nema@amd.com"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">Ashutosh.Nema@amd.com</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">>;
Sanjay Patel <</span><a href="mailto:spatel@rotateright.com"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">spatel@rotateright.com</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">>; llvm-dev <</span><a href="mailto:llvm-dev@lists.llvm.org"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">llvm-dev@lists.llvm.org</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">>;
Zaks, Ayal <</span><a href="mailto:ayal.zaks@intel.com"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">ayal.zaks@intel.com</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">><br>
<b>Subject:</b> Re: [llvm-dev] enabling interleaved access loop vectorization<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt">On Fri, Aug 5, 2016 at 4:37 PM, Renato Golin <<a href="mailto:renato.golin@linaro.org" target="_blank">renato.golin@linaro.org</a>> wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt">
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:46.2pt;text-indent:-17.85pt">
On 6 August 2016 at 00:18, Michael Kuperstein <<a href="mailto:mkuper@google.com">mkuper@google.com</a>> wrote:<br>
> I agree that we can get *more* improvement with better cost modeling, but<br>
> I'd expect to be able to get *some* improvement the way things are right<br>
> now.<br>
<br>
Elena said she saw "some" improvements. :)<o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt">I didn't mean "some improvements, some regressions", I meant "some of the improvement we'd expect from the full solution". :-)<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt"> <o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt">
<p class="MsoNormal" style="mso-margin-top-alt:0cm;margin-right:0cm;margin-bottom:12.0pt;margin-left:46.2pt;text-indent:-17.85pt">
<br>
> That's why I'm curious about where we saw regressions - I'm wondering<br>
> whether there's really a significant cost modeling issue I'm missing, or<br>
> it's something that's easy to fix so that we can make forward progress,<br>
> while Ashutosh is working on the longer-term solution.<br>
<br>
Sounds like a task to try a few patterns and fiddle with the cost model.<br>
<br>
Arnold did a lot of those during the first months of the vectorizer,<br>
so it might be just a matter of finding the right heuristics, at least<br>
for the low hanging fruits.<br>
<br>
Of course, that'd also involve benchmarking everything else, to make<br>
sure the new heuristics doesn't introduce regressions on<br>
non-interleaved vectorisation.<o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt">I don't disagree with you.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt">All I'm saying is that before fiddling with the heuristics, it'd be good to understand what exactly breaks if we simply flip the flag. If the answer happens to be "nothing" - well, problem
solved. Unfortunately, according to Elena, that's not the answer. <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt">I'm going to play with it with our internal benchmarks, but it's my understanding that Elena/Ayal already have some idea of what the problems are.<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:46.2pt;text-indent:-17.85pt"><o:p> </o:p></p>
</div>
</div>
</div>
</div>
<p>---------------------------------------------------------------------<br>
Intel Israel (74) Limited</p>
<p>This e-mail and any attachments may contain confidential material for<br>
the sole use of the intended recipient(s). Any review or distribution<br>
by others is strictly prohibited. If you are not the intended<br>
recipient, please contact the sender and delete all copies.</p></body>
</html>