<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"Malgun Gothic";
panose-1:2 11 5 3 2 0 0 2 0 4;}
@font-face
{font-family:"\@Malgun Gothic";}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">Ping.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Additionally, I did not see the pass trigger anywhere in llvm-test-suite or the SPEC benchmarks, except for hmmer.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Thanks<o:p></o:p></p>
<p class="MsoNormal">JinGu Kang<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b>From:</b> Sanne Wouda &lt;Sanne.Wouda@arm.com&gt; <br>
<b>Sent:</b> 25 June 2021 11:23<br>
<b>To:</b> Jingu Kang &lt;Jingu.Kang@arm.com&gt;; llvm-dev@lists.llvm.org; Florian Hahn &lt;florian_hahn@apple.com&gt;<br>
<b>Cc:</b> nikic@php.net<br>
<b>Subject:</b> Re: [llvm-dev] Enabling Loop Distribution Pass as default in the pipeline of new pass manager<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Hi,</span><o:p></o:p></p>
</div>
<div>
<blockquote style="border:none #C8C8C8 1.0pt;border-left:solid #C8C8C8 2.25pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Do you have any data on how often LoopDistribute triggers on a larger set of programs (like llvm-test-suite + SPEC)? AFAIK the implementation is very limited at the moment (geared towards catching
the case in hmmer) and I suspect lack of generality is one of the reasons why it is not enabled by default yet.</span><o:p></o:p></p>
</div>
</blockquote>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">It would be good to have some fresh numbers on how often LoopDistribute triggers. From what I remember, there are a handful of cases in the test suite, but nothing that significantly affects performance
(other than hmmer, obviously).</span><o:p></o:p></p>
</div>
<blockquote style="border:none #C8C8C8 1.0pt;border-left:solid #C8C8C8 2.25pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Also, there's been an effort to improve the cost-modeling for LoopDistribute (</span><a href="https://reviews.llvm.org/D100381"><span style="font-size:12.0pt;color:black">https://reviews.llvm.org/D100381</span></a><span style="font-size:12.0pt;color:black">).
Should we make progress in that direction first, before enabling by default?</span><o:p></o:p></p>
</div>
</blockquote>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Unfortunately, there were some problems with this effort. First, the current implementation of LoopDistribute relies heavily on LoopAccessAnalysis, which made it difficult to adapt.</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">More importantly, though, I'm not convinced that LoopDistribute will be beneficial other than in cases where it enables more vectorization. (The memcpy detection in gcc might be interesting; I didn't
look at that.) Distribution reduces both instruction-level parallelism (ILP) and memory-level parallelism (MLP), which in some cases might be offset by lower register or cache pressure, but this is hard or impossible for the compiler to know.</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">While working on this, with a more aggressive LoopDistribute across several benchmarks, I did not see any improvements that didn't turn out to be noise, and plenty of cases where it was actively
degrading performance.</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Therefore, I'm not sure this direction is worth pursuing further, and I believe the current heuristic of "distribute when it enables new vectorization" is actually pretty reasonable, if not very
general.</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
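<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">To make the "distribute when it enables new vectorization" heuristic concrete, here is a minimal C sketch. The function names fused and distributed, the arrays, and the restrict qualifiers are assumptions chosen for illustration; this is not code from LLVM or hmmer, and it makes no claim that LoopDistribute fires on this exact loop.</span></p>
</div>

```c
#include <stddef.h>

/* Fused loop: the recurrence on b[] is a loop-carried dependence,
   so a vectorizer that treats the body as one unit has to keep the
   whole loop scalar. The restrict qualifiers (an assumption of this
   sketch) state that the four arrays do not overlap. */
void fused(size_t n, const int *restrict a, int *restrict b,
           const int *restrict c, int *restrict d) {
    for (size_t i = 1; i < n; ++i) {
        b[i] = b[i - 1] + a[i]; /* serial: needs b[i-1] from the previous iteration */
        d[i] = c[i] * 2;        /* independent across iterations */
    }
}

/* Distributed form: the same stores, split into two loops.
   The first loop keeps the recurrence and stays serial; the second
   has no cross-iteration dependence and is a vectorization candidate. */
void distributed(size_t n, const int *restrict a, int *restrict b,
                 const int *restrict c, int *restrict d) {
    for (size_t i = 1; i < n; ++i)
        b[i] = b[i - 1] + a[i];
    for (size_t i = 1; i < n; ++i)
        d[i] = c[i] * 2;
}
```

<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Both functions compute the same results when the arrays do not overlap. The split pays off only if the second loop actually vectorizes. Otherwise the two statements simply stop overlapping within one loop body, which is the ILP and MLP loss described above.</span></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>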
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Cheers,</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;color:black">Sanne</span><o:p></o:p></p>
</div>
</div>
</div>
</div>
</body>
</html>