<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
p.xmsonormal, li.xmsonormal, div.xmsonormal
{mso-style-name:x_msonormal;
margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle23
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-left:.5in"><b>From:</b> Sjoerd Meijer <Sjoerd.Meijer@arm.com>
<br>
<b>Sent:</b> Friday, May 1, 2020 11:54 AM<br>
<b>To:</b> Eli Friedman <efriedma@quicinc.com>; llvm-dev <llvm-dev@lists.llvm.org><br>
<b>Subject:</b> [EXT] Re: [llvm-dev] LV: predication<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black">Hi Eli,<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black">> The problem with your proposal, as written, is that the vectorizer is producing the intrinsic. Because we don’t impose any ordering on optimizations before codegen,
every optimization pass in LLVM would have to be taught to preserve any @llvm.set.loop.elements.i32 whenever it makes any change. This is completely impractical because the intrinsic isn’t related to anything optimizations would normally look for: it’s a
random intrinsic in the middle of nowhere.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black">I do see that point. But is that also not the beauty of it? It just sits in the preheader, if gets removed, then so be it. And if it not recognised, then also no harm done?<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The harm comes if the intrinsic ends up with the wrong value, or attached to the wrong loop.
<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black">> Probably the simplest path to get this working is to derive the number of elements in the backend (in HardwareLoops, or your tail predication pass). You should be able
to figure it from the masks used in the llvm.masked.load/store instructions in the loop.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black">This is what we are currently doing and works excellent for simpler cases. For the more complicated cases that we now what to handle as well, the pattern matching just
becomes a bit too horrible, and it is fragile too. All we need is the information that the vectoriser already has, and pass this on somehow.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black">As I am really keen to simply our backend pass, would there be another way to pass this information on? If emitting an intrinsic is a blocker, could this be done with a
loop annotation?<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">If the problem is specifically figuring out the underlying element count given a predicate, maybe we could attack it from that angle? For example, introduce a special intrinsic for deriving the mask (sort of like the SVE whilelo).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">-Eli<span style="font-size:12.0pt;color:black"> </span><o:p></o:p></p>
</div>
<div>
<div>
<div>
<p class="xmsonormal" style="margin-left:.5in"><span style="font-size:12.0pt;color:black"> </span><o:p></o:p></p>
</div>
</div>
</div>
</div>
</body>
</html>