[PATCH] D44747: Set calling convention for CUDA kernel

John McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 27 13:22:09 PDT 2018


rjmccall added inline comments.


================
Comment at: lib/Sema/SemaOverload.cpp:1492
+      Changed = true;
+    }
+
----------------
yaxunl wrote:
> rjmccall wrote:
> > It's cheaper not to check the CUDA language mode here; pulling the CC out of the FPT is easy.
> > 
> > Why is this necessary, anyway?  From the spec, it doesn't look to me like kernel function pointers can be converted to ordinary function pointers.  A kernel function pointer is supposed to be declared with something like `__global__ void (*fn)(void)`.  You'll need to change your patch to SemaType to apply the CC even when compiling for the host, of course.
> > 
> > I was going to say that you should use this CC in your validation that calls with execution configurations go to kernel functions, but... I can't actually find where you do that validation.
> > 
> > Do you need these function pointers to be a different size from the host function pointer?
> In CUDA, `__global__` can only be used with function declaration or definition. Using it in function pointer declaration will result in a warning: 'global' attribute only applies to functions.
> 
> Also, there is this lit test in SemaCUDA:
> 
> ```
> __global__ void kernel() {}
> 
> typedef void (*fn_ptr_t)();
> 
> __host__ fn_ptr_t get_ptr_h() {
>   return kernel;
> }
> 
> ```
> It allows implicit conversion of `__global__ void()` to void(*)(), therefore I need the above change to drop the CUDA kernel calling convention in such implicit conversion.
I see.  I must have mis-read the specification, but I see that the code samples I can find online agree with that test case.  So `__global__` function pointers are just treated as function pointers, and it's simply undefined behavior if you try to call a function pointer that happens to be a kernel without an execution configuration, or contrariwise if you use an execution configuration to call a function pointer that isn't a kernel.

In that case, I think the best solution is to just immediately strip `__global__` from the type of a DRE to a kernel function, since `__global__` isn't supposed to be part of the user-facing type system.


https://reviews.llvm.org/D44747





More information about the cfe-commits mailing list