<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:p="urn:schemas-microsoft-com:office:powerpoint" xmlns:a="urn:schemas-microsoft-com:office:access" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema" xmlns:b="urn:schemas-microsoft-com:office:publisher" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet" xmlns:oa="urn:schemas-microsoft-com:office:activation" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns:q="http://schemas.xmlsoap.org/soap/envelope/" xmlns:D="DAV:" xmlns:x2="http://schemas.microsoft.com/office/excel/2003/xml" xmlns:ois="http://schemas.microsoft.com/sharepoint/soap/ois/" xmlns:dir="http://schemas.microsoft.com/sharepoint/soap/directory/" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:dsp="http://schemas.microsoft.com/sharepoint/dsp" xmlns:udc="http://schemas.microsoft.com/data/udc" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sub="http://schemas.microsoft.com/sharepoint/soap/2002/1/alerts/" xmlns:ec="http://www.w3.org/2001/04/xmlenc#" xmlns:sp="http://schemas.microsoft.com/sharepoint/" xmlns:sps="http://schemas.microsoft.com/sharepoint/soap/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:udcxf="http://schemas.microsoft.com/data/udc/xmlfile" xmlns:wf="http://schemas.microsoft.com/sharepoint/soap/workflow/" xmlns:mver="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns:mrels="http://schemas.openxmlformats.org/package/2006/relationships" xmlns:ex12t="http://schemas.microsoft.com/exchange/services/2006/types" xmlns:ex12m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:Z="urn:schemas-microsoft-com:" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
{mso-style-priority:99;
mso-style-link:"Balloon Text Char";
margin:0in;
margin-bottom:.0001pt;
font-size:8.0pt;
font-family:"Tahoma","sans-serif";}
span.BalloonTextChar
{mso-style-name:"Balloon Text Char";
mso-style-priority:99;
mso-style-link:"Balloon Text";
font-family:"Tahoma","sans-serif";}
span.EmailStyle19
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.Section1
{page:Section1;}
-->
</style>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="2050" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal>Hello,<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>We are currently evaluating LLVM as a compiler technology
against Open64 and have a few questions regarding its currently status with
respect to optimizations in particular. <o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Control flow is often expensive on the architectures we are
considering and as such loop optimizations, e.g. unroll and jam, can play an
important role. For example, consider the following program<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>int a[4][4];<o:p></o:p></p>
<p class=MsoNormal>int b[4][4];<o:p></o:p></p>
<p class=MsoNormal>int c[4][4];<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>int main(void)<o:p></o:p></p>
<p class=MsoNormal>{<o:p></o:p></p>
<p class=MsoNormal> int i, j, k;<o:p></o:p></p>
<p class=MsoNormal> <o:p></o:p></p>
<p class=MsoNormal> for(i = 0; i < 4; i++)<o:p></o:p></p>
<p class=MsoNormal> {<o:p></o:p></p>
<p class=MsoNormal> for(j = 0; j < 4; j++)<o:p></o:p></p>
<p class=MsoNormal> {<o:p></o:p></p>
<p class=MsoNormal> c[i][j] = 0;<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> for (k = 0; k < 4; k++)<o:p></o:p></p>
<p class=MsoNormal> c[i][j]
= a[i][k] * b[k][j] + c[i][j];<o:p></o:p></p>
<p class=MsoNormal> }<o:p></o:p></p>
<p class=MsoNormal> }<o:p></o:p></p>
<p class=MsoNormal> <o:p></o:p></p>
<p class=MsoNormal> return c[2][3];<o:p></o:p></p>
<p class=MsoNormal>}<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>If I compile this with std-compile-opts the inner loops are completely
unrolled and thus unroll-and-jam is not important here but I’m assuming
this is down to the fact that the loop bounds are known and small. Assuming this
was not the case what I would like to be able to do is unroll the loop iterating
over “j” some number of times and then fuse the resulting instances
of the inner loop, e.g:<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>for(i=O; i<4; i++)<o:p></o:p></p>
<p class=MsoNormal> for(j=O; j<4; j+=2)<o:p></o:p></p>
<p class=MsoNormal> {<o:p></o:p></p>
<p class=MsoNormal> c[i][j] =
0;<o:p></o:p></p>
<p class=MsoNormal> c [i][j+1]= 0;<o:p></o:p></p>
<p class=MsoNormal> for(k=O; k<4; k++)<o:p></o:p></p>
<p class=MsoNormal> {<o:p></o:p></p>
<p class=MsoNormal> c[i] [j] =
a[i][k]*b[k] [j]+c[i][j];<o:p></o:p></p>
<p class=MsoNormal> c[i][j+1]= a[i][k]*b[k]
[j+1]+c[i] [j+1];<o:p></o:p></p>
<p class=MsoNormal> }<o:p></o:p></p>
<p class=MsoNormal> }<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Of course this is the well known unroll-and-jam optimization,
described by Car and Kennedy*, and I was wondering if there are plans to add this
and other additional loop optimizations to LLVM in the near future? <o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Thanks for any information you can provide around this.<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Ben<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal style='text-autospace:none'>*<span style='font-size:10.0pt;
font-family:"Arial","sans-serif"'>S. Carr and K. Kennedy, “Improving the
ratio of memory operations to floating-point operations in loops,” </span><span
style='font-size:9.0pt;font-family:"Arial","sans-serif"'>ACM </span><span
style='font-size:10.0pt;font-family:"Arial","sans-serif"'>Transactions on </span><span
style='font-size:9.0pt;font-family:"Arial","sans-serif"'>Programming Languages
and Systems, </span><span style='font-size:10.0pt;font-family:"Arial","sans-serif"'>vol.
16, no. 6, pp. 1768-1810, 1994.</span><o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><span style='color:navy'><img width=113 height=61
id="Picture_x0020_1" src="cid:image001.gif@01C8CBAB.9D509810"
alt="cid:image001.gif@01C89E6F.8C967C10"><o:p></o:p></span></p>
<table class=MsoNormalTable border=0 cellspacing=0 cellpadding=0>
<tr>
<td width=89 style='width:66.75pt;padding:.75pt .75pt .75pt .75pt'></td>
<td style='padding:.75pt .75pt .75pt .75pt'></td>
</tr>
</table>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
</div>
</body>
</html>