<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sat, Aug 29, 2015 at 3:29 AM, escha <span dir="ltr">&lt;<a href="mailto:escha@apple.com" target="_blank">escha@apple.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div class="h5"><div><blockquote type="cite"><div><div style="word-wrap:break-word"><div>I deliberately didn’t add any constraints on duplication, both because IME it’s difficult to deal with in practice, and because I have use cases in mind that don’t care about duplication the way that a barrier does.</div><div><br></div><div>—Owen</div></div></div></blockquote></div><br></div></div><div>Would it be correct to say a barrier needs noduplicate if it uses a particular argument to represent a particular barrier counter, whereas if the barrier has no such argument, it could be marked convergent-only?</div><div><br></div><div>So you could loop-unroll the latter, but not the former.</div></div></blockquote><div>If convergent were fully supported and mature in LLVM, I think we could mark all barriers as convergent-only. For example, full loop unrolling and inlining should also be fine for numbered barriers, at least on CUDA. Do you have an example in mind where the numbering is what makes the difference?</div><div><br></div><div>Bjarke<br></div></div></div></div>