[PATCH] D12246: [NVPTX] change threading intrinsics from noduplicate to convergent

Bjarke Roune via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 31 10:07:48 PDT 2015


On Sat, Aug 29, 2015 at 3:29 AM, escha <escha at apple.com> wrote:

>> I deliberately didn’t add any constraints on duplication, both because IME
>> it’s difficult to deal with in practice, and because I have use cases in
>> mind that don’t care about duplication the way that a barrier does.
>>
>> -Owen
>>
>
> Would it be correct to say a barrier needs noduplicate if it uses a
> particular argument to represent a particular barrier counter, whereas if
> the barrier has no such argument, it could be marked convergent-only?
>
> So you could loop-unroll the latter, but not the former.
>
If convergent were fully supported and mature in LLVM, I think we could
mark all the barriers as convergent-only. For example, full loop unrolling
and inlining should also be fine for numbered barriers, at least on CUDA.
Is there an example that you have in mind where the numbering is what
makes the difference?

Bjarke