[PATCH] D16941: [NVPTX] Mark nvvm synchronizing intrinsics as convergent.
Hal Finkel via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 5 18:23:08 PST 2016
hfinkel added a subscriber: hfinkel.
hfinkel accepted this revision.
hfinkel added a reviewer: hfinkel.
hfinkel added a comment.
This revision is now accepted and ready to land.
In http://reviews.llvm.org/D16941#345516, @jingyue wrote:
> LGTM, but do you have a test where LLVM generates wrong code if `__syncthreads` is not marked convergent?
> FYI, http://reviews.llvm.org/D12246 has lots of discussion on this. Replacing noduplicate with convergent on these NVPTX thread intrinsics is correct. For example, inlining a function that contains `__syncthreads` is OK. According to the PTX ISA, `bar.sync` should only be executed uniformly, so inlining won't introduce new divergence.
> The problem is that, before we replace them, we need to fix several places in LLVM (such as SpeculativeExecution, TryToSinkInstruction in InstCombine, and GVN PRE) to handle convergent correctly.
While you're there ;) - you might look at making our handling of noduplicate more consistent as well. I just noticed, for example, that loop unswitching checks for convergent but not for noduplicate.
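To make the distinction in this thread concrete, here is a minimal CUDA sketch. It is not taken from the patch: the kernel `reverse_block`, the helper `stage_and_sync`, and the constant `kBlock` are made-up names for illustration. It shows the two cases discussed above: inlining a function that contains `__syncthreads` is harmless (which noduplicate would generally forbid), while making the barrier control-dependent on a possibly divergent condition is what convergent is meant to rule out.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kBlock = 256;  // hypothetical block size for this sketch

// Hypothetical helper that contains a barrier. Inlining it into its caller
// is fine: the call site below sits in uniform control flow, so the inlined
// barrier is still reached by every thread of the block.
__device__ void stage_and_sync(float *tile, const float *in, int i, int n) {
  tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;  // a select, not a branch
  __syncthreads();
}

__global__ void reverse_block(const float *in, float *out, int n) {
  __shared__ float tile[kBlock];
  int i = blockIdx.x * blockDim.x + threadIdx.x;

  stage_and_sync(tile, in, i, n);

  // This branch can diverge in the last block when n is not a multiple of
  // the block size. A transform that sank the barrier into this branch
  // would make it control-dependent on (i < n), so some threads could skip
  // it. That is the kind of motion the convergent attribute prohibits.
  if (i < n) {
    out[i] = tile[blockDim.x - 1 - threadIdx.x];
  }
}

int main() {
  const int n = kBlock;  // one full block keeps the host-side check simple
  float *in = nullptr, *out = nullptr;
  if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess ||
      cudaMallocManaged(&out, n * sizeof(float)) != cudaSuccess) {
    std::fprintf(stderr, "allocation failed\n");
    return 1;
  }
  for (int i = 0; i < n; ++i) in[i] = static_cast<float>(i);

  reverse_block<<<1, kBlock>>>(in, out, n);
  if (cudaDeviceSynchronize() != cudaSuccess) {
    std::fprintf(stderr, "kernel failed\n");
    return 1;
  }

  // Each thread should have read the element staged by its mirror thread.
  int bad = 0;
  for (int i = 0; i < n; ++i) {
    if (out[i] != in[n - 1 - i]) ++bad;
  }
  std::printf("mismatches: %d\n", bad);

  cudaFree(in);
  cudaFree(out);
  return bad != 0;
}
```

The same reasoning applies to loop unswitching: hoisting a condition out of a loop that contains the barrier changes which conditions the barrier is control-dependent on, which is presumably why that pass already checks for convergent.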