[llvm-dev] [CUDA/NVPTX] is inlining __syncthreads allowed?

Jingyue Wu via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 21 15:11:35 PDT 2015


I'm using 7.0. I am attaching the reduced example.

nvcc sync.cu -arch=sm_35 -ptx

gives

        // .globl       _Z3foov
.visible .entry _Z3foov(

)
{
        .reg .pred      %p<2>;
        .reg .s32       %r<3>;


        mov.u32         %r1, %tid.x;
        and.b32         %r2, %r1, 1;
        setp.eq.b32     %p1, %r2, 1;
        @!%p1 bra       BB7_2;
        bra.uni         BB7_1;

BB7_1:
        bar.sync        0;
        bra.uni         BB7_3;

BB7_2:
        bar.sync        0;

BB7_3:
        ret;
}

As you see, bar.sync is duplicated.

On Fri, Aug 21, 2015 at 1:56 PM, Justin Holewinski <jholewinski at nvidia.com>
wrote:

> That’s an interesting case.  AFAIK, inlining should be restricted here for
> the reason you mention.  Inlining should only be valid if it doesn’t
> duplicate the barrier.  Which nvcc shows this behavior?
>
> Adding Vinod and Yuan for comment.
>
> > On Aug 21, 2015, at 4:24 PM, Jingyue Wu <jingyue at google.com> wrote:
> >
> > Hi Justin,
> >
> > Is a compiler allowed to inline a function that calls __syncthreads? I
> saw nvcc does that, but not sure it's valid though. For example,
> >
> > void foo() {
> >   __syncthreads();
> > }
> >
> > if (threadIdx.x % 2 == 0) {
> >   ...
> >   foo();
> > } else {
> >   ...
> >   foo();
> > }
> >
> > Before inlining, all threads meet at one __syncthreads(). After inlining
> >
> > if (threadIdx.x % 2 == 0) {
> >   ...
> >   __syncthreads();
> > } else {
> >   ...
> >   __syncthreads();
> > }
> >
> > The __syncthreads call is duplicated, and it's no longer guaranteed that
> all threads can meet one __syncthreads().
> >
> > Any thoughts?
> >
> > Jingyue
>
>
>
> -----------------------------------------------------------------------------------
> This email message is for the sole use of the intended recipient(s) and
> may contain
> confidential information.  Any unauthorized review, use, disclosure or
> distribution
> is prohibited.  If you are not the intended recipient, please contact the
> sender by
> reply email and destroy all copies of the original message.
>
> -----------------------------------------------------------------------------------
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150821/5538a63f/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sync.cu
Type: application/octet-stream
Size: 273 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150821/5538a63f/attachment.obj>


More information about the llvm-dev mailing list