<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi Hal,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I have also been looking at adding support for capturing parallelism (both explicit and implicit)
<o:p></o:p></p>
<p class="MsoNormal">in LLVM. We had an initial discussion around this, and your proposal comes at the<o:p></o:p></p>
<p class="MsoNormal">right time. We support this initiative and can work together to get this support implemented<o:p></o:p></p>
<p class="MsoNormal">in LLVM.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">However, I have a slightly different view. Today, parallelism does not necessarily mean OpenMP<o:p></o:p></p>
<p class="MsoNormal">or SIMD; we are in the era of heterogeneous computing. I agree that your primary target<o:p></o:p></p>
<p class="MsoNormal">was thread-based parallelism, but I think we could extend this while we capture the parallelism<o:p></o:p></p>
<p class="MsoNormal">in the program.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">My idea is to capture parallelism the way you described, using 'metadata'. I agree that we should record<o:p></o:p></p>
<p class="MsoNormal">the parallel regions in the metadata (as given by the user). However, we could also provide placeholders<o:p></o:p></p>
<p class="MsoNormal">to record any additional information that the compiler writer needs, such as the number of threads,<o:p></o:p></p>
<p class="MsoNormal">scheduling parameters, chunk size, etc., which are perhaps specific to OpenMP.
<o:p></o:p></p>
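<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">To illustrate the placeholder idea, here is a minimal sketch in Python rather than LLVM IR. The names ParallelRegion and annotate_openmp and the tag keys are hypothetical and not part of LLVM or of the proposal. It only shows optional, standard-specific hints stored next to a parallel region without changing the region itself.<o:p></o:p></p>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class ParallelRegion:
    """Hypothetical record for one user-marked parallel region."""
    loop_id: str
    # Optional, standard-specific hints stored as plain key/value tags.
    tags: dict = field(default_factory=dict)


def annotate_openmp(region, num_threads, schedule, chunk_size):
    """Attach OpenMP-style hints as tags; the region itself is unchanged."""
    region.tags["num_threads"] = num_threads
    region.tags["schedule"] = schedule
    region.tags["chunk_size"] = chunk_size
    return region


# Example: one region carrying three OpenMP-flavoured hints.
region = annotate_openmp(ParallelRegion("loop0"), 8, "static", 64)
```
</pre>
<p class="MsoNormal">A different frontend (say, one for OpenACC) could attach its own keys to the same record in the same way.<o:p></o:p></p>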
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The point is that the same parallel loop could today be targeted at accelerators (like GPUs)<o:p></o:p></p>
<p class="MsoNormal">using another standard, OpenACC. We may also get a new standard to capture parallelism<o:p></o:p></p>
<p class="MsoNormal">for a different kind of parallel device, which could look quite different and would have to be specifically targeted.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Since we are at the intermediate layer, we could be independent of user-level standards like<o:p></o:p></p>
<p class="MsoNormal">OpenMP, OpenACC, OpenCL, Cilk+, C++ AMP, etc. and, at the same time, keep enough information at this stage<o:p></o:p></p>
<p class="MsoNormal">so that the compiler could generate efficient backend code for the target device. So, my suggestion is<o:p></o:p></p>
<p class="MsoNormal">to keep all this relevant information as 'tags' in the metadata, and it is up to the backend to use or<o:p></o:p></p>
<p class="MsoNormal">discard the information. As you said, if the backend ignores it, there should be no harm to the correctness<o:p></o:p></p>
<p class="MsoNormal">of the final code.<o:p></o:p></p>
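<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">As a rough illustration of the "use or discard" point, the Python sketch below runs the same loop through two toy backends. One ignores the tags entirely, and the other honours a hypothetical chunk_size tag. All names here (run_serial, run_chunked, square, the tag keys) are assumptions made for the example, not LLVM APIs. The only claim is that both paths yield the same result.<o:p></o:p></p>
<pre>
```python
def run_serial(body, n):
    """Backend that ignores all tags: a plain sequential loop."""
    return [body(i) for i in range(n)]


def run_chunked(body, n, tags):
    """Backend that honours a hypothetical 'chunk_size' tag, if present."""
    chunk = max(1, tags.get("chunk_size", n))
    out = []
    for start in range(0, n, chunk):
        # Each chunk could be handed to a separate worker; here it runs inline.
        out.extend(body(i) for i in range(start, min(start + chunk, n)))
    return out


def square(i):
    return i * i


tags = {"chunk_size": 4, "num_threads": 2}
print(run_serial(square, 10) == run_chunked(square, 10, tags))  # prints True
```
</pre>
<p class="MsoNormal">The unknown num_threads key is simply never read by either backend, which is the behaviour one would want for tags a target does not understand.<o:p></o:p></p>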
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The second point I wanted to make is about the intrinsics. I am not sure why we need these intrinsics at the<o:p></o:p></p>
<p class="MsoNormal">LLVM level, or why we would need conditional constructs for expressing parallelism. These<o:p></o:p></p>
<p class="MsoNormal">could be calls made directly to the runtime library at the code generation level.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Again, this is a very good initiative, and we would like to see such support in LLVM ASAP.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Prakash Raghavendra<o:p></o:p></p>
<p class="MsoNormal">AMD, Bangalore<o:p></o:p></p>
<p class="MsoNormal">Email: Prakash.raghavendra@amd.com<o:p></o:p></p>
<p class="MsoNormal">Phone: +91-80-3323 0753<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
</html>