<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hello Janek,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
It looks like you want to use a pragma to steer which hardware loop form is generated by providing very detailed target information, but I <span style="color: rgb(0, 0, 0); font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; background: var(--white);">think
a more typical use of pragmas is to override the cost model or a transformation threshold/argument. In this case, I would have guessed that the new pragma is meant to take precedence over TTI's isHardwareLoopProfitable hook, and so would have
expected something as simple as "hwloop(enable|disable)" initially. If you want to bring a hardware loop into a more efficient form, then I think that is mainly the responsibility of the hardware loop pass or a backend pass (see e.g. the ARM backend
passes). <span style="caret-color:rgb(0, 0, 0);background-color:rgb(255, 255, 255);display:inline !important">I think option 1 is a non-starter because it exposes all sorts of internals that we don't want to expose, for various reasons. Option 2 looks a lot better but
is still very specific.</span></span></div>
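<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
To make the precedence idea concrete, here is a minimal sketch, not actual LLVM code. The enum HwLoopPragma and the function shouldFormHardwareLoop are hypothetical names, and the boolean parameter only stands in for whatever isHardwareLoopProfitable would return. Legality checks (for example, whether the trip count can be computed) are left out.</div>
<pre>
```cpp
// Hypothetical tri-state for a per-loop pragma such as hwloop(enable|disable).
// None of these names exist in LLVM or Clang; they are for illustration only.
enum class HwLoopPragma { Unspecified, Enable, Disable };

// Decide whether a hardware loop should be formed for one loop.
// CostModelProfitable is an assumed stand-in for the target's answer from
// its isHardwareLoopProfitable hook. Legality is not modelled here.
inline bool shouldFormHardwareLoop(HwLoopPragma Pragma,
                                   bool CostModelProfitable) {
  switch (Pragma) {
  case HwLoopPragma::Enable:
    return true;  // The user's request overrides the cost model.
  case HwLoopPragma::Disable:
    return false; // The user's request overrides the cost model.
  case HwLoopPragma::Unspecified:
    break;        // No pragma on this loop.
  }
  return CostModelProfitable; // Defer to the target's cost model.
}
```
</pre>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
With this shape, hwloop(disable) suppresses a hardware loop even when the target reports it as profitable, hwloop(enable) requests one even when the target does not, and a loop without the pragma behaves as it does today.</div>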
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<span style="color: rgb(0, 0, 0); font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; background: var(--white);"><br>
</span></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<span style="color: rgb(0, 0, 0); font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; background: var(--white);">Cheers,<br>
Sjoerd.</span></div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> llvm-dev &lt;llvm-dev-bounces@lists.llvm.org&gt; on behalf of Janek Van Oirschot via llvm-dev &lt;llvm-dev@lists.llvm.org&gt;<br>
<b>Sent:</b> 13 April 2021 13:28<br>
<b>To:</b> llvm-dev@lists.llvm.org &lt;llvm-dev@lists.llvm.org&gt;; cfe-dev@lists.llvm.org &lt;cfe-dev@lists.llvm.org&gt;<br>
<b>Subject:</b> [llvm-dev] Loop pragma for hardware loops</font>
<div> </div>
</div>
<style>
<!--
@font-face
{font-family:"Cambria Math"}
@font-face
{font-family:Calibri}
p.x_MsoNormal, li.x_MsoNormal, div.x_MsoNormal
{margin:0cm;
font-size:12.0pt;
font-family:"Calibri",sans-serif}
span.x_EmailStyle17
{font-family:"Calibri",sans-serif;
color:windowtext}
.x_MsoChpDefault
{font-size:12.0pt;
font-family:"Calibri",sans-serif}
@page WordSection1
{margin:72.0pt 72.0pt 72.0pt 72.0pt}
div.x_WordSection1
{}
-->
</style>
<div lang="EN-GB" link="#0563C1" vlink="#954F72" style="word-wrap:break-word">
<div class="x_WordSection1">
<p class="x_MsoNormal"><span style="font-size:11.0pt">Hey all,</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">I'm looking to extend the current clang loop pragmas to also support hardware loops and allow a user to insert (or completely disable) hardware loop intrinsics on a per-loop basis.</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">One of the questions I have regarding this is how to go about incorporating the different hardware loop intrinsics in the pragma. A few options we came up with:</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">1. The pragma specifies which intrinsic to use for a loop:</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">#pragma loop hwloop(set_loop_i32)</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">or</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">#pragma loop hwloop(/*LivesInReg=*/ true, /*AddTestGuard=*/ true, /*NumBits=*/ 32)</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">2. The pragma adds some target-specific info (a string?) for use in the existing hwloop TTI hook or in a new hwloop TTI hook:</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">#pragma loop hwloop(target="bdnz") // PPC example</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">#pragma loop hwloop(target="bdz") // PPC example</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">or</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">#pragma loop hwloop(max-count=42, ...)</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">Option 1 requires the user to know about LLVM's hardware loop internals, so I'm leaning more towards option 2, as users are more likely to be aware of target-specific information (such as PPC's bdnz/bdz).</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">These are just some options we came up with; we would love to hear about other (better) options, if any.</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">Kind regards,</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt">Janek van Oirschot</span></p>
</div>
</div>
</body>
</html>