<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=KOI8-R">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>No, we don't. We need to perform different kinds of analysis
for SPMD-mode constructs and non-SPMD constructs.</p>
<p>For SPMD mode, we need to globalize only reduction/lastprivate
variables. For non-SPMD mode, we need to globalize all
private/local variables that may escape their declaration context
in the construct.<br>
</p>
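<p>A minimal sketch of the contrast, assuming the compiler picks SPMD
mode for the combined construct and non-SPMD mode for the bare
<code>target</code> region. The function names are hypothetical, and
without offloading support the pragmas are ignored and the code simply
runs on the host:</p>

```cpp
#include <cassert>
#include <vector>

// Combined construct: a typical SPMD-mode candidate. Only the reduction
// variable `sum` has to be visible across threads (assumption: this is the
// mode the compiler selects for this construct).
long spmd_sum(const std::vector<int> &v) {
  const int *p = v.data();
  const int n = static_cast<int>(v.size());
  assert(p != nullptr || n == 0);
  long sum = 0;
#pragma omp target teams distribute parallel for reduction(+ : sum) \
    map(to : p[0 : n]) map(tofrom : sum)
  for (int i = 0; i < n; ++i)
    sum += p[i];
  return sum;
}

// Bare target region: a typical non-SPMD candidate. `scale` is declared by
// the single initial thread and then read inside the nested parallel region,
// so it escapes its declaration context and would need globalization.
void generic_scale(std::vector<int> &v, int factor) {
  int *p = v.data();
  const int n = static_cast<int>(v.size());
#pragma omp target map(tofrom : p[0 : n])
  {
    int scale = 2 * factor; // local to the target region, shared below
#pragma omp parallel for
    for (int i = 0; i < n; ++i)
      p[i] *= scale;
  }
}
```

<p>In the first function only <code>sum</code> crosses thread
boundaries; in the second, <code>scale</code> is an ordinary local that
the worker threads nevertheless have to see.</p>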
<pre class="moz-signature" cols="72">-------------
Best regards,
Alexey Bataev</pre>
<div class="moz-cite-prefix">22.01.2019 14:29, Doerfert, Johannes
Rudolf wrote:<br>
</div>
<blockquote type="cite"
cite="mid:DM5PR09MB3733476A7ED4F19112729AADBA980@DM5PR09MB3733.namprd09.prod.outlook.com">
<meta http-equiv="Content-Type" content="text/html;
charset=KOI8-R">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
<div id="divtagdefaultwrapper" style="font-size: 12pt; color:
rgb(0, 0, 0); font-family: Calibri, Helvetica, sans-serif,
&quot;EmojiFont&quot;, &quot;Apple Color Emoji&quot;,
&quot;Segoe UI Emoji&quot;, NotoColorEmoji, &quot;Segoe UI
Symbol&quot;, &quot;Android Emoji&quot;, EmojiSymbols;"
dir="ltr">
We would still know that. We can do exactly the same reasoning
as we do now. <br>
<p style="margin-top:0;margin-bottom:0">I think the important
question is, how different is the code generated for either
mode and can we hide (most of) the differences in the runtime.</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">If I understand you
correctly, you say the data sharing code looks very different
and the differences cannot be hidden, correct?</p>
<p style="margin-top:0;margin-bottom:0">It would be helpful for
me to understand your point if you could give me a piece of
OpenMP code for which the data sharing in SPMD mode and in
"guarded" mode is as different as possible. I can compile it in
both modes myself, so high-level OpenMP is fine (I will disable
SPMD mode manually in the source if necessary).</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">Thanks,</p>
<p style="margin-top:0;margin-bottom:0"> Johannes</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<br>
<br>
<div style="color: rgb(0, 0, 0);">
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
face="Calibri, sans-serif" color="#000000"><b>From:</b>
llvm-dev <a class="moz-txt-link-rfc2396E" href="mailto:llvm-dev-bounces@lists.llvm.org">&lt;llvm-dev-bounces@lists.llvm.org&gt;</a> on behalf
of Alexey Bataev via llvm-dev
<a class="moz-txt-link-rfc2396E" href="mailto:llvm-dev@lists.llvm.org">&lt;llvm-dev@lists.llvm.org&gt;</a><br>
<b>Sent:</b> Tuesday, January 22, 2019 13:10<br>
<b>To:</b> Doerfert, Johannes Rudolf<br>
<b>Cc:</b> Alexey Bataev; LLVM-Dev; Arpith Chacko Jacob;
<a class="moz-txt-link-abbreviated" href="mailto:openmp-dev@lists.llvm.org">openmp-dev@lists.llvm.org</a>; <a class="moz-txt-link-abbreviated" href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a><br>
<b>Subject:</b> Re: [llvm-dev] [RFC] Late (OpenMP) GPU
code "SPMD-zation"</font>
<div> </div>
</div>
<div>
<div>But we need to know the execution mode, SPMD or
"guarded".<br>
<br>
</div>
<pre class="x_moz-signature" cols="72">-------------
Best regards,
Alexey Bataev</pre>
<div class="x_moz-cite-prefix">22.01.2019 13:54, Doerfert,
Johannes Rudolf wrote:<br>
</div>
<blockquote type="cite">
<meta content="text/html; charset=koi8-r">
<div dir="auto" style="direction:ltr; margin:0; padding:0;
font-family:sans-serif; font-size:11pt; color:black">
We could still do that in clang, couldn't we?<br>
<br>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font
style="font-size:11pt" face="Calibri, sans-serif"
color="#000000"><b>From:</b> Alexey Bataev
<a class="x_moz-txt-link-rfc2396E OWAAutoLink"
href="mailto:a.bataev@outlook.com" id="LPlnk710514"
previewremoved="true" moz-do-not-send="true">
&lt;a.bataev@outlook.com&gt;</a><br>
<b>Sent:</b> Tuesday, January 22, 2019 12:52:42 PM<br>
<b>To:</b> Doerfert, Johannes Rudolf; <a
class="x_moz-txt-link-abbreviated OWAAutoLink"
href="mailto:cfe-dev@lists.llvm.org" id="LPlnk92484"
previewremoved="true" moz-do-not-send="true">
cfe-dev@lists.llvm.org</a><br>
<b>Cc:</b> <a class="x_moz-txt-link-abbreviated
OWAAutoLink" href="mailto:openmp-dev@lists.llvm.org"
id="LPlnk406132" previewremoved="true"
moz-do-not-send="true">
openmp-dev@lists.llvm.org</a>; LLVM-Dev; Finkel, Hal
J.; Alexey Bataev; Arpith Chacko Jacob<br>
<b>Subject:</b> Re: [RFC] Late (OpenMP) GPU code
"SPMD-zation"</font>
<div> </div>
</div>
<div>
<p>The globalization of local variables, for
example. It must be implemented in the compiler, not in
the runtime, to get good performance.</p>
<p><br>
</p>
<pre class="x_moz-signature" cols="72">-------------
Best regards,
Alexey Bataev</pre>
<div class="x_moz-cite-prefix">22.01.2019 13:43,
Doerfert, Johannes Rudolf wrote:<br>
</div>
<blockquote type="cite">
<meta content="text/html; charset=utf-8">
<div dir="auto" style="direction:ltr; margin:0;
padding:0; font-family:sans-serif; font-size:11pt;
color:black">
Could you elaborate on what you refer to wrt. data
sharing? What do we currently do in the Clang code
generation that we could not effectively implement
in the runtime, potentially with the support of an
LLVM pass?<br>
<br>
</div>
<div dir="auto" style="direction:ltr; margin:0;
padding:0; font-family:sans-serif; font-size:11pt;
color:black">
Thanks,<br>
</div>
<div dir="auto" style="direction:ltr; margin:0;
padding:0; font-family:sans-serif; font-size:11pt;
color:black">
James<br>
<br>
</div>
<hr tabindex="-1" style="display:inline-block;
width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font
style="font-size:11pt" face="Calibri, sans-serif"
color="#000000"><b>From:</b> Alexey Bataev
<a class="x_moz-txt-link-rfc2396E OWAAutoLink"
href="mailto:a.bataev@outlook.com"
id="LPlnk323824" previewremoved="true"
moz-do-not-send="true">
&lt;a.bataev@outlook.com&gt;</a><br>
<b>Sent:</b> Tuesday, January 22, 2019 12:34:01 PM<br>
<b>To:</b> Doerfert, Johannes Rudolf; <a
class="x_moz-txt-link-abbreviated OWAAutoLink"
href="mailto:cfe-dev@lists.llvm.org"
id="LPlnk882142" previewremoved="true"
moz-do-not-send="true">
cfe-dev@lists.llvm.org</a><br>
<b>Cc:</b> <a class="x_moz-txt-link-abbreviated
OWAAutoLink"
href="mailto:openmp-dev@lists.llvm.org"
id="LPlnk56115" previewremoved="true"
moz-do-not-send="true">
openmp-dev@lists.llvm.org</a>; LLVM-Dev; Finkel,
Hal J.; Alexey Bataev; Arpith Chacko Jacob<br>
<b>Subject:</b> Re: [RFC] Late (OpenMP) GPU code
"SPMD-zation"</font>
<div> </div>
</div>
<div>
<p><br>
</p>
<pre class="x_moz-signature" cols="72">-------------
Best regards,
Alexey Bataev</pre>
<div class="x_moz-cite-prefix">22.01.2019 13:17,
Doerfert, Johannes Rudolf wrote:<br>
</div>
<blockquote type="cite">
<pre class="x_moz-quote-pre">Where we are
------------
Currently, when we generate OpenMP target offloading code for GPUs, we
use sufficient syntactic criteria to decide between two execution modes:
1) SPMD -- All target threads (in an OpenMP team) run all the code.
2) "Guarded" -- The master thread (of an OpenMP team) runs the user
code. If an OpenMP distribute region is encountered, thus
if all threads (in the OpenMP team) are supposed to
execute the region, the master wakes up the idling
worker threads and points them to the correct piece of
code for distributed execution.
For a variety of reasons we (generally) prefer the first execution mode.
However, depending on the code, that might not be valid, or we might
just not know if it is in the Clang code generation phase.
The implementation of the "guarded" execution mode follows roughly the
state machine description in [1], though the implementation is different
(more general) nowadays.
What we want
------------
Increase the amount of code executed in SPMD mode and the use of
lightweight "guarding" schemes where appropriate.
How we (could) get there
------------------------
We propose the following two modifications in order:
1) Move the state machine logic into the OpenMP runtime library. That
means in SPMD mode all device threads will start the execution of
the user code, thus emerge from the runtime, while in guarded mode
only the master will escape the runtime and the other threads will
idle in their state machine code that is now just "hidden".
Why:
- The state machine code cannot be (reasonably) optimized anyway,
moving it into the library shouldn't hurt runtime but might even
improve compile time a little bit.
- The change should also simplify the Clang code generation as we
would generate structurally the same code for both execution modes
but only the runtime library calls, or their arguments, would
differ between them.
- The reason we should not "just start in SPMD mode" and "repair"
it later is simple, this way we always have semantically correct
and executable code.
- Finally, and most importantly, there is now only little
difference (see above) between the two modes in the code
generated by clang. If we later analyze the code trying to decide
if we can use SPMD mode instead of guarded mode the analysis and
transformation becomes much simpler.</pre>
</blockquote>
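<p>To make the difference between the two modes in item 1 concrete,
here is a sequential host-side model, not the device runtime. Every
name in it is hypothetical, and it glosses over details such as whether
the master itself takes part in a parallel region:</p>

```cpp
#include <cassert>
#include <functional>

// All names here are hypothetical; this is a sequential host-side model,
// not the device runtime interface.
enum class ExecMode { SPMD, Guarded };

struct Counts {
  int sequential_runs = 0; // threads that executed the sequential user code
  int parallel_runs = 0;   // threads that executed the parallel region body
};

using Body = std::function<void(int /*thread id*/)>;

// Models one team of `team_size` threads, one thread after another.
// `sequential` stands for user code outside a parallel region,
// `parallel_body` for the body of the parallel region that code reaches.
Counts run_team(ExecMode mode, int team_size, const Body &sequential,
                const Body &parallel_body) {
  assert(team_size > 0);
  Counts counts;
  if (mode == ExecMode::SPMD) {
    // Every thread leaves the runtime and runs all of the user code.
    for (int tid = 0; tid < team_size; ++tid) {
      sequential(tid);
      ++counts.sequential_runs;
      parallel_body(tid);
      ++counts.parallel_runs;
    }
    return counts;
  }
  // Guarded mode: only the master (thread 0) leaves the runtime.
  sequential(0);
  ++counts.sequential_runs;
  // On reaching the parallel region the master publishes the body; the
  // workers, idle in the state machine until now, pick it up and run it.
  const Body *published = &parallel_body;
  for (int tid = 0; tid < team_size; ++tid) {
    (*published)(tid);
    ++counts.parallel_runs;
  }
  return counts;
}
```

<p>In this model the sequential part runs once in guarded mode but once
per thread in SPMD mode, which is why a later switch to SPMD mode has to
account for its side effects.</p>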
<p>The last item is wrong, unfortunately. A lot of
things in the codegen depend on the execution
mode, e.g. correct support of data sharing. Of
course, we can try to generalize the codegen and
rely completely on the runtime, but the
performance is going to be very poor.</p>
<p>We still need static analysis in the compiler. I
agree that it is better to move this analysis to
the backend, at least after inlining, but at
the moment that is not possible. We need support
for late outlining, which will allow us to
implement better detection of SPMD constructs
and improve performance.<br>
</p>
<blockquote type="cite">
<pre class="x_moz-quote-pre"> 2) Implement a middle-end LLVM-IR pass that detects the guarded mode,
e.g., through the runtime library calls used, and that tries to
convert it into the SPMD mode potentially by introducing lightweight
guards in the process.
Why:
- After the inliner, and the canonicalizations, we have a clearer
picture of the code that is actually executed in the target
region and all the side effects it contains. Thus, we can make an
educated decision on the required amount of guards that prevent
unwanted side effects from happening after a move to SPMD mode.
- At this point we can more easily introduce different schemes to
avoid side effects by threads that were not supposed to run. We
can decide if a state machine is needed, conditionals should be
employed, masked instructions are appropriate, or "dummy" local
storage can be used to hide the side effect from the outside
world.
None of this has been implemented yet, but we plan to start in the immediate
future. Any comments, ideas, and criticism are welcome!
Cheers,
Johannes
P.S. [2-4] Provide further information on implementation and features.
[1] <a class="x_moz-txt-link-freetext OWAAutoLink" href="https://ieeexplore.ieee.org/document/7069297" id="LPlnk545306" previewremoved="true" moz-do-not-send="true">https://ieeexplore.ieee.org/document/7069297</a>
[2] <a class="x_moz-txt-link-freetext OWAAutoLink" href="https://dl.acm.org/citation.cfm?id=2833161" id="LPlnk848282" previewremoved="true" moz-do-not-send="true">https://dl.acm.org/citation.cfm?id=2833161</a>
[3] <a class="x_moz-txt-link-freetext OWAAutoLink" href="https://dl.acm.org/citation.cfm?id=3018870" id="LPlnk111280" previewremoved="true" moz-do-not-send="true">https://dl.acm.org/citation.cfm?id=3018870</a>
[4] <a class="x_moz-txt-link-freetext OWAAutoLink" href="https://dl.acm.org/citation.cfm?id=3148189" id="LPlnk967688" previewremoved="true" moz-do-not-send="true">https://dl.acm.org/citation.cfm?id=3148189</a>
</pre>
</blockquote>
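<p>The "lightweight guard" idea from item 2 can be sketched with
host-side OpenMP, as an analogy only. The function name and the data are
hypothetical, and the proposed pass would work on LLVM-IR rather than on
source code like this:</p>

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: a region with one sequential side effect followed by
// a parallel loop, rewritten so that all threads are active from the start.
int spmdized_region(std::vector<int> &data) {
  int *p = data.data();
  const int n = static_cast<int>(data.size());
  assert(p != nullptr || n == 0);
  int header = 0; // shared state touched by the formerly sequential code
#pragma omp parallel shared(header)
  {
    // Lightweight guard: every thread reaches this point, but only one
    // performs the side effect.
#pragma omp master
    { header += 1; }
    // The other threads must not read `header` before the update is done.
#pragma omp barrier
#pragma omp for
    for (int i = 0; i < n; ++i)
      p[i] += header;
  }
  return header;
}
```

<p>The guard plus barrier replaces the state machine here: no thread
idles in the runtime, and the side effect still happens exactly once.</p>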
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</body>
</html>