[llvm-dev] [CUDA/NVPTX] is inlining __syncthreads allowed?
Jingyue Wu via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 21 15:11:35 PDT 2015
I'm using 7.0. I am attaching the reduced example.
nvcc sync.cu -arch=sm_35 -ptx
gives
// .globl _Z3foov
.visible .entry _Z3foov(
)
{
.reg .pred %p<2>;
.reg .s32 %r<3>;
mov.u32 %r1, %tid.x;
and.b32 %r2, %r1, 1;
setp.eq.b32 %p1, %r2, 1;
@!%p1 bra BB7_2;
bra.uni BB7_1;
BB7_1:
bar.sync 0;
bra.uni BB7_3;
BB7_2:
bar.sync 0;
BB7_3:
ret;
}
As you see, bar.sync is duplicated.
On Fri, Aug 21, 2015 at 1:56 PM, Justin Holewinski <jholewinski at nvidia.com>
wrote:
> That’s an interesting case. AFAIK, inlining should be restricted here for
> the reason you mention. Inlining should only be valid if it doesn’t
> duplicate the barrier. Which nvcc shows this behavior?
>
> Adding Vinod and Yuan for comment.
>
> > On Aug 21, 2015, at 4:24 PM, Jingyue Wu <jingyue at google.com> wrote:
> >
> > Hi Justin,
> >
> > Is a compiler allowed to inline a function that calls __syncthreads? I
> saw nvcc does that, but not sure it's valid though. For example,
> >
> > void foo() {
> > __syncthreads();
> > }
> >
> > if (threadIdx.x % 2 == 0) {
> > ...
> > foo();
> > } else {
> > ...
> > foo();
> > }
> >
> > Before inlining, all threads meet at one __syncthreads(). After inlining
> >
> > if (threadIdx.x % 2 == 0) {
> > ...
> > __syncthreads();
> > } else {
> > ...
> > __syncthreads();
> > }
> >
> > The __syncthreads call is duplicated, and it's no longer guaranteed that
> all threads can meet one __syncthreads().
> >
> > Any thoughts?
> >
> > Jingyue
>
>
>
> -----------------------------------------------------------------------------------
> This email message is for the sole use of the intended recipient(s) and
> may contain
> confidential information. Any unauthorized review, use, disclosure or
> distribution
> is prohibited. If you are not the intended recipient, please contact the
> sender by
> reply email and destroy all copies of the original message.
>
> -----------------------------------------------------------------------------------
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150821/5538a63f/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sync.cu
Type: application/octet-stream
Size: 273 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150821/5538a63f/attachment.obj>
More information about the llvm-dev
mailing list