<html><head></head><body><div class="ydpc04448f6yahoo-style-wrap" style="font-family:Helvetica Neue, Helvetica, Arial, sans-serif;font-size:13px;"><div></div>

        <div>I just remembered that I had put some thought about moving and eliminating ballots before. I was looking into</div><div>using cyclic equivalence classes, and that a ballot could only be moved to a location with the same equivalence</div><div>class, and two ballot() could be merged. I didn't think too much about it after that, because there was</div><div>no obvious way to handle uniform branches, which is important. I will see if I can figure something out that</div><div>makes sense. </div><div><br></div><div>- Jan</div>

        

        </div><div id="yahoo_quoted_9331569814" class="yahoo_quoted">

            <div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;">

                

                <div>

                    On Thursday, January 24, 2019, 9:06:14 PM EST, Jan Sjodin via llvm-dev <llvm-dev@lists.llvm.org> wrote:

                </div>

                <div><br></div>

                <div><br></div>

                <div><div id="yiv0918204397"><div><div class="yiv0918204397ydpdd21cc60yahoo-style-wrap" style="font-family:Helvetica Neue, Helvetica, Arial, sans-serif;font-size:13px;">

        <div><span>> for (...) {<br clear="none">>     ballot();<br clear="none">>     if (... /* non-uniform */) continue;<br clear="none">> }<br clear="none">> <br clear="none">> into<br clear="none">> <br clear="none">> for (...) {<br clear="none">>     do {<br clear="none">>         ballot();<br clear="none">>     } while (... /* non-uniform */);<br clear="none">> }</span><br clear="none"></div><div><br clear="none"></div><div>I'm not sure if I follow this example, could you and explain a bit more?</div><div>It looks to me that the condition in the "if" must be false (if the</div><div>same condition is used in the while), or we would</div><div> call ballot the wrong number of times.</div><br clear="none"><div>About the CSE, when would that be legal? I can imagine with uniform</div><div>branches that it could work, but would like to see an example to</div><div>fully understand this.</div><div><br clear="none"></div><div>I agree that it would be more conservative than if we model the threading,</div><div> but I'm not sure about the cost/benefit. I am mostly curious if it is</div><div> possible to have a single-thread view or not. Then we would have to</div><div>see if it is adequate.<br clear="none"></div></div><div class="yiv0918204397ydpdd21cc60yahoo-style-wrap" style="font-family:Helvetica Neue, Helvetica, Arial, sans-serif;font-size:13px;"><div><br clear="none"></div><div>- Jan<br clear="none"></div><div><br clear="none"></div>

        

        </div><div class="yiv0918204397yqt4735322840" id="yiv0918204397yqt39182"><div class="yiv0918204397ydp986081e1yahoo_quoted" id="yiv0918204397ydp986081e1yahoo_quoted_8798273484">

            <div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;">

                

                <div>

                    On Thursday, January 24, 2019, 10:31:47 AM EST, Connor Abbott <cwabbott0@gmail.com> wrote:

                </div>

                <div><br clear="none"></div>

                <div><br clear="none"></div>

                <div><div dir="ltr">I don't see how this would fix the continue vs. nested loop problem I<br clear="none">explained earlier. That is, how would this prevent turning:<br clear="none"><br clear="none">for (...) {<br clear="none">    ballot();<br clear="none">    if (... /* non-uniform */) continue;<br clear="none">}<br clear="none"><br clear="none">into<br clear="none"><br clear="none">for (...) {<br clear="none">    do {<br clear="none">        ballot();<br clear="none">    } while (... /* non-uniform */);<br clear="none">}<br clear="none"><br clear="none">and vice versa? Note that there's no duplication going on here, and<br clear="none">the single-threaded flow of control is exactly the same.<br clear="none"><br clear="none">Another reason this isn't so great is that it prevents e.g. CSE on<br clear="none">ballots that actually should be combined, since you're modelling it as<br clear="none">a write. It seems like the optimizer is going to need some special<br clear="none">knowledge of convergent things that fake memory constraints can't give<br clear="none">us.<br clear="none"><div class="yiv0918204397ydp986081e1yqt0920374022" id="yiv0918204397ydp986081e1yqtfd96929"><br clear="none">On Thu, Jan 24, 2019 at 4:06 PM Jan Sjodin <<a rel="nofollow" shape="rect" ymailto="mailto:jan_sjodin@yahoo.com" target="_blank" href="mailto:jan_sjodin@yahoo.com">jan_sjodin@yahoo.com</a>> wrote:<br clear="none">><br clear="none">><br clear="none">> I was looking into ballot() and how if it is possible to keep a single-threaded<br clear="none">> view of the code, but add some extra conditions that must hold after the<br clear="none">> transformation. I had the initial idea that each call to ballot() in a<br clear="none">> single-threaded program can be seen as a partial write to a memory<br clear="none">> location, and each location memory location is unique for every call site,<br clear="none">> plus there some externally observable side effect. We can abstract this<br clear="none">> away by tagging the calls, e.g. by using aliases.<br clear="none">><br clear="none">> For example:<br clear="none">><br clear="none">> if (...) {<br clear="none">>      foo1 = ballot();<br clear="none">> } else {<br clear="none">>      foo2 = ballot();<br clear="none">> }<br clear="none">><br clear="none">> simply becomes:<br clear="none">><br clear="none">> if (...) {<br clear="none">>      foo1 = ballot_1();<br clear="none">> } else {<br clear="none">>      foo2 = ballot_2();<br clear="none">> }<br clear="none">><br clear="none">><br clear="none">> and<br clear="none">><br clear="none">> if (...) {<br clear="none">> } else {<br clear="none">> }<br clear="none">> ballot();<br clear="none">><br clear="none">> becomes<br clear="none">><br clear="none">> if (...) {<br clear="none">> } else {<br clear="none">> }<br clear="none">> ballot_1();<br clear="none">><br clear="none">> In the first case it would prevent combining the two calls into one<br clear="none">> after the if. In the second example there is generally nothing that<br clear="none">> says it could not be transformed into the first example with two<br clear="none">> calls to ballot_1(), which should not be allowed.<br clear="none">><br clear="none">> Another form of duplication that we must allow are loop transforms,<br clear="none">> like unrolling or peeling. These might seem similar to the example<br clear="none">> above, since we are cloning code and with conditions etc. But<br clear="none">> they are different since they calls are in different loop iterations.<br clear="none">><br clear="none">> The condition that needs to be met is that:<br clear="none">><br clear="none">> There must be a single path between all cloned ballot_n() functions.<br clear="none">><br clear="none">> The reason for this condition is that if we clone the same call, then<br clear="none">> the copies must be mutually exclusive, but if they are cloned from<br clear="none">> a loop, there must be a path, or we would skip iterations.<br clear="none">><br clear="none">> If we want to be more sophisticated we can add:<br clear="none">><br clear="none">> If there is no such path, the calls must be separated by uniform branches.<br clear="none">><br clear="none">> After the transform everything should be re-tagged, since we already<br clear="none">> checked the calls and we don't want to check them again. Also, not all<br clear="none">> transforms need (or should) have the tagging checked. One example is<br clear="none">> inlining, where multiple copies are created, but they are clearly different<br clear="none">> calls. The tagging can be done temporarily for a single pass, and then<br clear="none">> eliminated. This could be a good tool for debugging as well, since it can<br clear="none">> detect if a transform is suspect.<br clear="none">><br clear="none">> The code would of course have to make sense as far as control flow. If<br clear="none">> we have:<br clear="none">><br clear="none">> for(;;) {<br clear="none">>    if(...) {<br clear="none">>       ballot();<br clear="none">>       break;<br clear="none">>    }<br clear="none">> }<br clear="none">><br clear="none">> This would have to be translated to:<br clear="none">><br clear="none">> for(;;) {<br clear="none">>     if(...) {<br clear="none">>       ballot();<br clear="none">>       if(UnknownConstant) {  // Not a uniform condition, but later on translated to "true"<br clear="none">>         break;<br clear="none">>       }<br clear="none">> }<br clear="none">><br clear="none">> However, I think that this is the way the code is generated today anyway.<br clear="none">> There would have to be some attribute that indicate that these calls (or functions)<br clear="none">> contain ballot or other cross-lane operations so they could be tagged and<br clear="none">> checked. The attribute would be used by the passes to know that the special<br clear="none">> conditions exist for those calls.<br clear="none">><br clear="none">> As far as what it means to have a path, it could be complicated.<br clear="none">> For example:<br clear="none">><br clear="none">> x = ...<br clear="none">> ballot_1();<br clear="none">><br clear="none">> could be transformed to:<br clear="none">><br clear="none">> if (x < 4711) {<br clear="none">>   ballot_1();<br clear="none">><br clear="none">> if(x >= 4711) {<br clear="none">>   ballot_1();<br clear="none">> }<br clear="none">><br clear="none">> So a simple path check would say there is a path, and the transform is legal,<br clear="none">> but if we examine the conditions, there is no path, and the transform should not be legal.<br clear="none">> It could be made even more obscure of course, but I don't see any optimizations really<br clear="none">> doing this kind of thing,<br clear="none">><br clear="none">> - Jan<br clear="none">><br clear="none">> On Saturday, December 29, 2018, 11:32:25 AM EST, Nicolai Hähnle via llvm-dev <<a rel="nofollow" shape="rect" ymailto="mailto:llvm-dev@lists.llvm.org" target="_blank" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br clear="none">><br clear="none">><br clear="none">> On 20.12.18 18:03, Connor Abbott wrote:<br clear="none">> >    We already have the notion of "convergent" functions like<br clear="none">> >    syncthreads(), to which we cannot add control-flow dependencies.<br clear="none">> >    That is, it's legal to hoist syncthreads out of an "if", but it's<br clear="none">> >    not legal to sink it into an "if".  It's not clear to me why we<br clear="none">> >    can't have "anticonvergent" (terrible name) functions which cannot<br clear="none">> >    have control-flow dependencies removed from them?  ballot() would be<br clear="none">> >    both convergent and anticonvergent.<br clear="none">> ><br clear="none">> >    Would that solve your problem?<br clear="none">> ><br clear="none">> ><br clear="none">> > I think it's important to note that we already have such an attribute,<br clear="none">> > although with the opposite sense - it's impossible to remove control<br clear="none">> > flow dependencies from a call unless you mark it as "speculatable".<br clear="none">><br clear="none">> This isn't actually true. If both sides of an if/else have the same<br clear="none">> non-speculative function call, it can still be moved out of control flow.<br clear="none">><br clear="none">> That's because doing so doesn't change anything at all from a<br clear="none">> single-threaded perspective. Hence why I think we should model the<br clear="none">> communication between threads honestly.<br clear="none">><br clear="none">><br clear="none">> > However, this doesn't prevent<br clear="none">> ><br clear="none">> > if (...) {<br clear="none">> > } else {<br clear="none">> > }<br clear="none">> > foo = ballot();<br clear="none">> ><br clear="none">> > from being turned into<br clear="none">> ><br clear="none">> > if (...) {<br clear="none">> >      foo1 = ballot();<br clear="none">> > } else {<br clear="none">> >      foo2 = ballot();<br clear="none">> > }<br clear="none">> > foo = phi(foo1, foo2)<br clear="none">> ><br clear="none">> > and vice versa. We have a "noduplicate" attribute which prevents<br clear="none">> > transforming the first into the second, but not the other way around. Of<br clear="none">> > course we could keep going this way and add a "nocombine" attribute to<br clear="none">> > complement noduplicate. But even then, there are even still problematic<br clear="none">> > transforms. For example, take this program, which is simplified from a<br clear="none">> > real game that doesn't work with the AMDGPU backend:<br clear="none">> ><br clear="none">> > while (cond1 /* uniform */) {<br clear="none">> >      ballot();<br clear="none">> >      ...<br clear="none">> >      if (cond2 /* non-uniform */) continue;<br clear="none">> >      ...<br clear="none">> > }<br clear="none">> ><br clear="none">> > In SPIR-V, when using structured control flow, the semantics of this are<br clear="none">> > pretty clearly defined. In particular, there's a continue block after<br clear="none">> > the body of the loop where control flow re-converges, and the only back<br clear="none">> > edge is from the continue block, so the ballot is in uniform control<br clear="none">> > flow. But LLVM will get rid of the continue block since it's empty, and<br clear="none">> > re-analyze the loop as two nested loops, splitting the loop header in<br clear="none">> > two, producing a CFG which corresponds to this:<br clear="none">> ><br clear="none">> > while (cond1 /* uniform */) {<br clear="none">> >      do {<br clear="none">> >          ballot();<br clear="none">> >           ...<br clear="none">> >      } while (cond2 /* non-uniform */);<br clear="none">> >      ...<br clear="none">> > }<br clear="none">> ><br clear="none">> > Now, in an implementation where control flow re-converges at the<br clear="none">> > immediate post-dominator, this won't do the right thing anymore. In<br clear="none">> > order to handle it correctly, you'd effectively need to always flatten<br clear="none">> > nested loops, which will probably be really bad for performance if the<br clear="none">> > programmer actually wanted the second thing. It also makes it impossible<br clear="none">> > when translating a high-level language to LLVM to get the "natural"<br clear="none">> > behavior which game developers actually expect. This is exactly the sort<br clear="none">> > of "spooky action at a distance" which makes me think that everything<br clear="none">> > we've done so far is really insufficient, and we need to add an explicit<br clear="none">> > notion of control-flow divergence and reconvergence to the IR. We need a<br clear="none">> > way to say that control flow re-converges at the continue block, so that<br clear="none">> > LLVM won't eliminate it, and we can vectorize it correctly without<br clear="none">> > penalizing cases where it's better for control flow not to re-converge.<br clear="none">><br clear="none">> Well said!<br clear="none">><br clear="none">> Cheers,<br clear="none">><br clear="none">> Nicolai<br clear="none">> --<br clear="none">> Lerne, wie die Welt wirklich ist,<br clear="none">> Aber vergiss niemals, wie sie sein sollte.<br clear="none">> _______________________________________________<br clear="none">> LLVM Developers mailing list<br clear="none">> <a rel="nofollow" shape="rect" ymailto="mailto:llvm-dev@lists.llvm.org" target="_blank" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br clear="none">> <a rel="nofollow" shape="rect" target="_blank" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a></div></div></div>

            </div>

        </div></div></div></div><div class="yqt4735322840" id="yqt28192">_______________________________________________<br clear="none">LLVM Developers mailing list<br clear="none"><a shape="rect" ymailto="mailto:llvm-dev@lists.llvm.org" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br clear="none"><a shape="rect" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br clear="none"></div></div>

            </div>

        </div></body></html>