<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Jingyue,<div class=""><br class=""></div><div class="">I consider it a very important element of the design of convergent that it does not require baseline LLVM to contain a definition of uniformity, which would itself pull in a definition of SIMT/SPMD, warps, threads, etc.  The intention is that it should be a conservative (but hopefully not too conservative) approximation, and that implementations of specific GPU programming models (CUDA, OpenCL, individual GPU vendors, etc.) may layer more permissive semantics on top of it in code that is specific to that programming model.</div><div class=""><br class=""></div><div class="">—Owen</div><div class=""><br class=""><div class=""><div><blockquote type="cite" class=""><div class="">On Sep 22, 2015, at 10:33 AM, Jingyue Wu <<a href="mailto:jingyue@google.com" class="">jingyue@google.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hi Owen, <div class=""><br class=""></div><div class="">This is very interesting. </div><div class=""><br class=""></div><div class="">How different is "convergent" from "uniform"? An instruction is uniform if threads in the same SIMT unit (e.g. warp) do not diverge when executing this instruction. <div class=""><br class=""></div><div class="">I ask this because Bjarke recently came up with a mathematical definition of uniformity. I wonder if that is a foundation "convergent" needs as well. AFAICT, Bjarke's definition of "uniformity" is less restrictive than "convergent". For example, <span style="font-size:12.8px" class="">it allows loop unswitching of the following code if "c" is uniform, which seems to be a case you would ideally want to allow. 
</span></div><div class=""><br class=""></div><div class=""><span style="font-size:12.8px" class="">DISALLOWED:</span><br style="font-size:12.8px" class=""><span style="font-size:12.8px" class="">for (…) {</span><br style="font-size:12.8px" class=""><span style="font-size:12.8px" class="">  if (c) { … }</span><br style="font-size:12.8px" class=""><span style="font-size:12.8px" class="">  convergent();</span><br style="font-size:12.8px" class=""><span style="font-size:12.8px" class="">}</span><br class=""></div><div class=""><br class=""></div><div class="">Jingyue</div></div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Fri, Sep 4, 2015 at 1:25 PM, Owen Anderson via llvm-dev <span dir="ltr" class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi all,<br class="">

<br class="">

In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that should resolve a lot of the identified problems regarding loop unrolling, loop unswitching, etc.  Credit to John McCall for talking this over with me and seeding the core ideas.<br class="">

<br class="">

Today, convergent operations may only be moved into control-equivalent locations, or, in layman’s terms, a convergent operation may be neither sunk into nor hoisted out of a condition.  This causes problems for full loop unrolling, as the control dependence on the loop counter is eliminated, even though our intuition indicates that this dependence was somehow trivial.  More concretely, all known uses of convergent are OK with full unrolling, which makes this restriction undesirable.  Related problems arise in loop unswitching as well.<br class="">

<br class="">

The proposed change is to split the semantics of convergent into two annotations:<br class="">

        convergent - this operation may not be made control dependent on any additional values (aka may not be sunk into a condition)<br class="">

        nospeculate - this operation may not be added to any program trace on which it was not previously executed (same as notrap?)<br class="">

<br class="">

Most of today’s convergent operations (barriers, arithmetic gradients) would continue to be marked only as convergent.  The new semantics would allow full loop unrolling and clarify which loop unswitching operations are allowed; see the examples below.<br class="">

<br class="">

The one case where nospeculate would also be needed is texture fetches that compute implicit gradients.  Because the computed gradient forms part of the addressing mode, gibberish gradients here can cause invalid memory dereferences.<br class="">

<br class="">

—Owen<br class="">

<br class="">

——————————————————<br class="">

<br class="">

Loop Unswitching Examples<br class="">

<br class="">

ALLOWED:<br class="">

for (…) {<br class="">

  if (c) { convergent(); }<br class="">

}<br class="">

<br class="">

DISALLOWED:<br class="">

for (…) {<br class="">

  if (c) { … }<br class="">

  convergent();<br class="">

}<br class="">

<br class="">

<br class="">


</blockquote></div><br class=""></div>

</div></blockquote></div><br class=""></div></div></body></html>