[PATCH] D16941: [NVPTX] Mark nvvm synchronizing intrinsics as convergent.
Jingyue Wu via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 5 18:04:20 PST 2016
jingyue added a comment.
LGTM, but do you have a test where LLVM generates wrong code if `__syncthreads` is not marked convergent?
FYI, http://reviews.llvm.org/D12246 has lots of discussion on this. Replacing noduplicate with convergent on these NVPTX thread intrinsics is correct. For example, inlining a function that contains `__syncthreads` is OK: according to the PTX ISA, `bar.sync` should only be executed uniformly, so inlining won't introduce new divergence.
The problem is that, before we replace them, we need to fix several places in LLVM (such as SpeculativeExecution, TryToSinkInstruction in InstCombine, and GVN PRE) to handle convergent correctly.
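To illustrate the kind of transformation that convergent is meant to forbid, here is a minimal CUDA sketch. The kernel `reverse_even` and the constant `kBlock` are hypothetical names made up for this example; they are not from the patch, and the rewrite shown in the comment is not claimed to be what any particular pass emits today.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kBlock = 256;  // hypothetical block size for this sketch

// Hypothetical kernel: each even thread reads a shared-memory slot that
// was written by a different thread in the same block.
__global__ void reverse_even(const int *in, int *out) {
    __shared__ int tile[kBlock];
    int t = threadIdx.x;
    tile[t] = in[t];

    // Reached by every thread in the block, i.e. executed uniformly.
    __syncthreads();

    if (t % 2 == 0) {
        out[t] = tile[kBlock - 1 - t];
    }
    // A transformation that moved the barrier into the branch would give
    //     if (t % 2 == 0) { __syncthreads(); out[t] = tile[kBlock - 1 - t]; }
    // Only the even threads would then reach the barrier, so it would be
    // control-dependent on a divergent condition. Marking the intrinsic
    // convergent tells LLVM not to add such control dependencies.
}

int main() {
    int h_in[kBlock];
    int h_out[kBlock] = {0};
    for (int i = 0; i < kBlock; ++i) h_in[i] = i;

    int *d_in = nullptr;
    int *d_out = nullptr;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(h_out));

    reverse_even<<<1, kBlock>>>(d_in, d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    printf("out[0] = %d\n", h_out[0]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Inlining `reverse_even`'s body into a caller, by contrast, leaves the barrier exactly as uniform as the call site was, which is why convergent (unlike noduplicate) does not need to block it.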