<div dir="ltr">I'm using 7.0. I am attaching the reduced example. <div><br></div><div>nvcc <a href="http://sync.cu">sync.cu</a> -arch=sm_35 -ptx</div><div><br></div><div>gives </div><div><br></div><div><div> // .globl _Z3foov</div><div>.visible .entry _Z3foov(</div><div><br></div><div>)</div><div>{</div><div> .reg .pred %p<2>;</div><div> .reg .s32 %r<3>;</div><div><br></div><div><br></div><div> mov.u32 %r1, %tid.x;</div><div> and.b32 %r2, %r1, 1;</div><div> setp.eq.b32 %p1, %r2, 1;</div><div> @!%p1 bra BB7_2;</div><div> bra.uni BB7_1;</div><div><br></div><div>BB7_1:</div><div> bar.sync 0;</div><div> bra.uni BB7_3;</div><div><br></div><div>BB7_2:</div><div> bar.sync 0;</div><div><br></div><div>BB7_3:</div><div> ret;</div><div>}</div></div><div><br></div><div>As you see, bar.sync is duplicated. </div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 21, 2015 at 1:56 PM, Justin Holewinski <span dir="ltr"><<a href="mailto:jholewinski@nvidia.com" target="_blank">jholewinski@nvidia.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">That’s an interesting case. AFAIK, inlining should be restricted here for the reason you mention. Inlining should only be valid if it doesn’t duplicate the barrier. Which nvcc shows this behavior?<br>
<br>
Adding Vinod and Yuan for comment.<br>
<div><div class="h5"><br>
> On Aug 21, 2015, at 4:24 PM, Jingyue Wu <<a href="mailto:jingyue@google.com">jingyue@google.com</a>> wrote:<br>
><br>
> Hi Justin,<br>
><br>
> Is a compiler allowed to inline a function that calls __syncthreads? I saw nvcc does that, but not sure it's valid though. For example,<br>
><br>
> void foo() {<br>
> __syncthreads();<br>
> }<br>
><br>
> if (threadIdx.x % 2 == 0) {<br>
> ...<br>
> foo();<br>
> } else {<br>
> ...<br>
> foo();<br>
> }<br>
><br>
> Before inlining, all threads meet at one __syncthreads(). After inlining<br>
><br>
> if (threadIdx.x % 2 == 0) {<br>
> ...<br>
> __syncthreads();<br>
> } else {<br>
> ...<br>
> __syncthreads();<br>
> }<br>
><br>
> The __syncthreads call is duplicated, and it's no longer guaranteed that all threads can meet one __syncthreads().<br>
><br>
> Any thoughts?<br>
><br>
> Jingyue<br>
<br>
<br>
</div></div>-----------------------------------------------------------------------------------<br>
This email message is for the sole use of the intended recipient(s) and may contain<br>
confidential information. Any unauthorized review, use, disclosure or distribution<br>
is prohibited. If you are not the intended recipient, please contact the sender by<br>
reply email and destroy all copies of the original message.<br>
-----------------------------------------------------------------------------------<br>
</blockquote></div><br></div>