[llvm-dev] [CUDA/NVPTX] is inlining __syncthreads allowed?
Jingyue Wu via llvm-dev
llvm-dev at lists.llvm.org
Fri Aug 21 13:24:11 PDT 2015
Hi Justin,
Is a compiler allowed to inline a function that calls __syncthreads? I have
seen nvcc do that, but I am not sure it is valid. For example:
__device__ void foo() {
  __syncthreads();
}
if (threadIdx.x % 2 == 0) {
  ...
  foo();
} else {
  ...
  foo();
}
Before inlining, all threads meet at one __syncthreads(). After inlining:
if (threadIdx.x % 2 == 0) {
  ...
  __syncthreads();
} else {
  ...
  __syncthreads();
}
The __syncthreads call is duplicated: even-numbered threads now wait at one
barrier instruction and odd-numbered threads at another, so it is no longer
guaranteed that all threads meet at the same __syncthreads().
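For comparison, here is a minimal sketch of a form where the question does
not arise, because the barrier is hoisted out of the divergent branch and
every thread reaches the same single __syncthreads(). The kernel name
uniform_barrier, the shared buffer buf, the block size of 64, and the
placeholder work in each branch are all assumptions made for illustration;
they are not taken from the code above, and whether this rewrite is
equivalent depends on what the elided code actually does.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: branch-specific work happens under the divergent
// condition, but the barrier itself sits in uniform control flow.
__global__ void uniform_barrier(int *out) {
  __shared__ int buf[64];  // assumes blockDim.x == 64
  int tid = threadIdx.x;

  if (tid % 2 == 0) {
    buf[tid] = tid;   // placeholder work for even threads (hypothetical)
  } else {
    buf[tid] = -tid;  // placeholder work for odd threads (hypothetical)
  }

  // Exactly one barrier instruction, reached by every thread in the block.
  __syncthreads();

  // Reading a neighbour's slot relies on the barrier above.
  out[tid] = buf[(tid + 1) % blockDim.x];
}

int main() {
  const int n = 64;
  int h_out[n];
  int *d_out = nullptr;

  cudaMalloc(&d_out, n * sizeof(int));
  uniform_barrier<<<1, n>>>(d_out);  // one block of 64 threads
  cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_out);

  printf("out[0] = %d, out[1] = %d\n", h_out[0], h_out[1]);
  return 0;
}
```

With the barrier written once, outside the if/else, inlining or duplicating
code inside the branches cannot change how many distinct barrier
instructions the threads arrive at.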
Any thoughts?
Jingyue