<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
        {font-family:SimSun;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:SimSun;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:"\@SimSun";
        panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p
        {mso-style-priority:99;
        mso-margin-top-alt:auto;
        margin-right:0cm;
        mso-margin-bottom-alt:auto;
        margin-left:0cm;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
        {mso-style-priority:99;
        mso-style-link:"Balloon Text Char";
        margin:0cm;
        margin-bottom:.0001pt;
        font-size:8.0pt;
        font-family:"Tahoma","sans-serif";}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0cm;
        margin-right:0cm;
        margin-bottom:0cm;
        margin-left:36.0pt;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
span.EmailStyle18
        {mso-style-type:personal;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
span.apple-converted-space
        {mso-style-name:apple-converted-space;}
span.EmailStyle20
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
span.BalloonTextChar
        {mso-style-name:"Balloon Text Char";
        mso-style-priority:99;
        mso-style-link:"Balloon Text";
        font-family:"Tahoma","sans-serif";}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:1256476253;
        mso-list-type:hybrid;
        mso-list-template-ids:103023524 1176397202 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
        {mso-level-text:"\(%1\)";
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:46.5pt;
        text-indent:-18.0pt;}
@list l0:level2
        {mso-level-number-format:alpha-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:82.5pt;
        text-indent:-18.0pt;}
@list l0:level3
        {mso-level-number-format:roman-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:right;
        margin-left:118.5pt;
        text-indent:-9.0pt;}
@list l0:level4
        {mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:154.5pt;
        text-indent:-18.0pt;}
@list l0:level5
        {mso-level-number-format:alpha-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:190.5pt;
        text-indent:-18.0pt;}
@list l0:level6
        {mso-level-number-format:roman-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:right;
        margin-left:226.5pt;
        text-indent:-9.0pt;}
@list l0:level7
        {mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:262.5pt;
        text-indent:-18.0pt;}
@list l0:level8
        {mso-level-number-format:alpha-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        margin-left:298.5pt;
        text-indent:-18.0pt;}
@list l0:level9
        {mso-level-number-format:roman-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:right;
        margin-left:334.5pt;
        text-indent:-9.0pt;}
ol
        {margin-bottom:0cm;}
ul
        {margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='color:#1F497D'>Hi Hal,<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>See my comments below.<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>Thanks,<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>-Hao<o:p></o:p></span></p><div style='border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt'><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Hal Finkel [mailto:hfinkel@anl.gov] <br><b>Sent:</b> 2015</span><span lang=ZH-CN style='font-size:10.0pt;font-family:SimSun'>年</span><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>3</span><span lang=ZH-CN style='font-size:10.0pt;font-family:SimSun'>月</span><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>21</span><span lang=ZH-CN style='font-size:10.0pt;font-family:SimSun'>日</span><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> 0:31<br><b>To:</b> Hao Liu<br><b>Cc:</b> llvm-commits@cs.uiuc.edu; Jiangning Liu; James Molloy; aschwaighofer@apple.com; Nadav Rotem; Elena Demikhovsky<br><b>Subject:</b> Re: [RFC][PATCH][LoopVectorize] Teach Loop Vectorizer about interleaved data accesses<o:p></o:p></span></p></div></div><div><p class=MsoNormal style='margin-bottom:12.0pt'><span style='font-size:10.0pt;font-family:"Arial","sans-serif";color:black'><br>Perhaps this is a silly question, but why do you need interleaved load/store to support this? If we know that we need to access A[0], A[2], A[4], A[6], can't we generate two vector loads, one for A[0...3], and one for A[4...7], and then shuffle the results together. You need to leave the vector loop one iteration early (so you don't access off the end of the original access range), but that does not seem like a big loss. If I'm right, then I'd love to see this implemented in a way that can take advantage of interleaved load/store on targets that support them, but not require target support.<br><br>Thanks again,<br>Hal</span><span style='font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D'><o:p></o:p></span></p><div><p class=MsoNormal><span style='color:#1F497D'>[Hao Liu] <o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>See my previous mail. If the interleaved number is 2, we can use 4 IRs (2 loads/stores, 2 shuffles), which seems beneficial. The problem is if we use apart IRs for the target who support interleaved accesses, we will pay extra much more effort to combine them together and it is vulnerable. <o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>I think one solution can be as follows:<o:p></o:p></span></p><p class=MsoListParagraph style='margin-left:41.25pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='color:#1F497D'><span style='mso-list:Ignore'>(1)<span style='font:7.0pt "Times New Roman"'>    </span></span></span><![endif]><span style='color:#1F497D'>For the targets which don’t support interleaved access, it it is beneficial, we can generate apart IRs.<o:p></o:p></span></p><p class=MsoListParagraph style='margin-left:41.25pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='color:#1F497D'><span style='mso-list:Ignore'>(2)<span style='font:7.0pt "Times New Roman"'>    </span></span></span><![endif]><span style='color:#1F497D'>For the targets which support interleaved access, we can generate one intrinsic.<o:p></o:p></span></p><p class=MsoNormal><span style='color:#1F497D'>As for the interleaved access intrinsics, different targets may behavior differently. For the ARM or AArch64 target, it can directly match one intrinsic into one ldN/stN. For X86, I think it can match one intrinsic into several indexed loads/stores in AVX (I’m not familiar with X86, but I think it can easily handle interleaved access intrinsics).<o:p></o:p></span></p></div></div></div></div></body></html>