[llvm-dev] [CUDA/NVPTX] is inlining __syncthreads allowed?

Jingyue Wu via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 21 13:24:11 PDT 2015


Hi Justin,

Is a compiler allowed to inline a function that calls __syncthreads()? I have
seen nvcc do this, but I am not sure it is valid. For example,

__device__ void foo() {
  __syncthreads();
}

if (threadIdx.x % 2 == 0) {
  ...
  foo();
} else {
  ...
  foo();
}

Before inlining, all threads meet at one __syncthreads(). After inlining, the
code becomes:

if (threadIdx.x % 2 == 0) {
  ...
  __syncthreads();
} else {
  ...
  __syncthreads();
}

The __syncthreads() call is now duplicated, and it is no longer guaranteed
that all threads meet at the same __syncthreads().
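
For comparison, here is a minimal sketch of a variant where the question does
not arise, assuming the two branches do not need the barrier at different
points: the call to foo() is hoisted out of the divergent branches, so there
is a single barrier at one program location whether or not foo() is inlined.
The kernel name, the shared buffer, the block size of 64, and the host-side
check in main() are all hypothetical and only there to make the example
self-contained. This is not a statement of what nvcc or LLVM guarantees.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same helper as in the example above.
__device__ void foo() {
  __syncthreads();
}

// Hypothetical kernel: the divergent branches only do per-thread work,
// and every thread then reaches the same call to foo().
__global__ void kernel(int *out) {
  __shared__ int buf[64];  // assumes blockDim.x == 64
  int tid = threadIdx.x;
  if (tid % 2 == 0) {
    buf[tid] = tid;    // work done by even threads
  } else {
    buf[tid] = -tid;   // work done by odd threads
  }
  foo();  // one barrier outside the branches; inlining yields one __syncthreads()
  out[tid] = buf[(tid + 1) % blockDim.x];  // read the neighbour's value
}

int main() {
  const int n = 64;  // one block of 64 threads, matching buf above
  int *d_out = nullptr;
  int h_out[n];
  if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return 1;
  kernel<<<1, n>>>(d_out);
  // cudaMemcpy on the default stream waits for the kernel to finish.
  if (cudaMemcpy(h_out, d_out, n * sizeof(int),
                 cudaMemcpyDeviceToHost) != cudaSuccess) return 1;
  cudaFree(d_out);
  // Each thread should have read the value written by its neighbour.
  for (int i = 0; i < n; ++i) {
    int j = (i + 1) % n;
    int expected = (j % 2 == 0) ? j : -j;
    if (h_out[i] != expected) {
      printf("mismatch at %d\n", i);
      return 1;
    }
  }
  printf("ok\n");
  return 0;
}
```

This only covers the case where the barrier can be moved. It does not answer
whether the inlined form above, with __syncthreads() in both branches, is
valid.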

Any thoughts?

Jingyue