[PATCH] CUDA: Add option to allow host device functions to call host functions

Mon Feb 23 18:10:54 PST 2015

That is part of a disagreement I've been having with someone who uses this feature. He considers this supported behavior rather than an erroneous program. So if there is code like:

  void bar() {}
  __host__ __device__ void foo() { bar(); }

And if foo is never called from device code, then the program "makes sense": you never attempt to execute host code on the GPU, and the compiled program runs as expected. This is a silly example, but he is doing some template metaprogramming to generate kernels for both host and device, which makes his use case understandable. With this patch, the code we generate also runs correctly, so the option isn't just for analysis; it is useful in our code generation too.

Normally I would think this could be fixed by ifdef-guarding on __CUDA_ARCH__, but if bar were to perform a templated kernel launch, which happens in this client's code, then that would not be allowed under nvcc.
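For reference, a minimal sketch of what ifdef-guarding on __CUDA_ARCH__ could look like for the foo/bar example. Two things here are assumptions made for illustration and are not from the client's code: bar returns an int so the effect is observable, and the attribute macros are defined away when the file is not compiled as CUDA, so the sketch also builds with a plain C++ compiler. It does not cover the templated kernel launch case.

```cpp
// Outside a CUDA compilation, make the attributes no-ops (illustration only).
#if !defined(__CUDACC__) && !defined(__CUDA__)
#define __host__
#define __device__
#endif

// Host-only helper. Returning a value is an assumption for illustration;
// the bar in the example above returns void.
int bar() { return 42; }

// __CUDA_ARCH__ is defined only during the device compilation pass, so the
// call to bar() is visible to the host pass and hidden from the device pass.
__host__ __device__ int foo() {
#ifndef __CUDA_ARCH__
  return bar();  // host pass: calling a host function is fine
#else
  return -1;     // device pass: hypothetical placeholder, bar() is not callable
#endif
}
```

Called from host code, foo() forwards to bar(); the device-side copy of foo never references bar at all.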

http://reviews.llvm.org/D7841
