[PATCH] D12246: [NVPTX] change threading intrinsics from noduplicate to convergent

Jingyue Wu via llvm-commits llvm-commits at lists.llvm.org
Sun Aug 30 22:22:29 PDT 2015


In Bjarke's example, all threads call __syncthreads() in their first
iteration, and only in their first iteration (assuming *bound > 0).
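
To make the disagreement concrete, here is a minimal host-side C++ sketch (not CUDA, and not what LLVM actually emits). It records which static barrier call site each simulated thread reaches in Bjarke's loop, before and after an assumed unroll-by-2 that peels one iteration into a prologue when the trip count is odd. The names Site, run_original, run_unrolled and sites_reached are hypothetical and exist only for illustration.

```cpp
#include <set>
#include <vector>

// Hypothetical labels for the static call sites of the barrier.
enum Site { ORIGINAL, PROLOGUE, UNROLLED };

// Original loop: a single call site, reached in the iteration where i == 0.
std::vector<Site> run_original(int bound) {
    std::vector<Site> hits;
    for (int i = 0; i < bound; ++i) {
        if (i == 0) hits.push_back(ORIGINAL);   // stands in for __syncthreads()
    }
    return hits;
}

// Assumed shape after unrolling by 2: an odd trip count runs one peeled
// iteration first, then the main loop advances two iterations at a time.
std::vector<Site> run_unrolled(int bound) {
    std::vector<Site> hits;
    int i = 0;
    if (bound > 0 && bound % 2 != 0) {
        if (i == 0) hits.push_back(PROLOGUE);   // duplicate copy of the barrier
        ++i;
    }
    for (; i < bound; i += 2) {
        if (i == 0) hits.push_back(UNROLLED);   // first copy of the body
        // The second copy handles i + 1, which is never 0, so no barrier there.
    }
    return hits;
}

// Distinct call sites reached across a group of threads, one bound per thread.
std::set<Site> sites_reached(const std::vector<int>& bounds, bool unrolled) {
    std::set<Site> sites;
    for (int b : bounds) {
        std::vector<Site> hits = unrolled ? run_unrolled(b) : run_original(b);
        sites.insert(hits.begin(), hits.end());
    }
    return sites;
}
```

With per-thread bounds {3, 4, 5, 6}, every thread still reaches exactly one barrier, in its first iteration, in both versions. In the original loop they all reach the same call site; in the assumed unrolled form the odd-bound threads reach the prologue copy and the even-bound threads reach the copy in the main loop. Whether that split matters is the question this thread is debating.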

On Sun, Aug 30, 2015 at 10:12 PM, Xuetian Weng <wengxt at gmail.com> wrote:

> wengxt marked an inline comment as done.
> wengxt added a comment.
>
> In http://reviews.llvm.org/D12246#235675, @broune wrote:
>
> >   for (int i = 0; i < *bound; ++i) {
> >     if (i == 0)
> >       __syncthreads();
> >   }
> >
> >
> > This input program is valid as long as *bound > 0 has the same value
> > across the block. Here loop-unrolling by a factor of 2 will separate
> > off the first iteration of the loop into a duplicate body for the case
> > where *bound is odd. I checked with an example loop that's similar but
> > that doesn't use __syncthreads() and LLVM does do unrolling by a factor
> > of 2 in this way. If whether *bound is odd is divergent, then only part
> > of the warp would execute the __syncthreads() in the duplicate odd-case
> > unrolled loop body. So I think that unrolling does have to be careful
> > with divergent trip counts for loops that include __syncthreads() in
> > cases such as this.
>
>
> IMHO, in a loop it is not possible for two threads running different
> iterations to sync together; that would itself be divergent. In that
> sense, if two function call instructions converge in the original code,
> they will also converge after unrolling.
>
> Thus I don't think there is any problem in this example.
>
>
> http://reviews.llvm.org/D12246
>
>
>
>