[PATCH] D44747: Set calling convention for CUDA kernel
Artem Belevich via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 3 10:51:27 PDT 2018
tra added inline comments.
================
Comment at: lib/Sema/SemaType.cpp:3319-3330
+ // Attribute AT_CUDAGlobal affects the calling convention for AMDGPU targets.
+ // This is the simplest place to infer calling convention for CUDA kernels.
+ if (S.getLangOpts().CUDA && S.getLangOpts().CUDAIsDevice) {
+ for (const AttributeList *Attr = D.getDeclSpec().getAttributes().getList();
+ Attr; Attr = Attr->getNext()) {
+ if (Attr->getKind() == AttributeList::AT_CUDAGlobal) {
+ CC = CC_CUDAKernel;
----------------
tra wrote:
> This apparently breaks compilation of some CUDA code in our internal tests. I'm working on minimizing a reproduction case. Should this code be enabled for AMD GPUs only?
Here's a small snippet of code that compiled and worked before this patch:
```
template <typename T>
__global__ void EmptyKernel(void) { }

struct Dummy {
  /// Type definition of the EmptyKernel kernel entry point
  typedef void (*EmptyKernelPtr)();
  EmptyKernelPtr Empty() { return EmptyKernel<void>; }
};
```
AFAICT, it's currently impossible to apply __global__ to function pointer types, so there's no way to make the code above work with this patch applied.
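For illustration only, here is a host-only C++ sketch of the type relationship the snippet relies on. It assumes __global__ can be defined away as an empty macro when no CUDA headers are present (a hypothetical stand-in, not how clang handles the attribute), so the kernel has plain type void(). The static_assert spells out the requirement that the kernel's address has exactly the type EmptyKernelPtr. Presumably, once the kernel's type carries a distinct calling convention, that no longer holds and the return statement is rejected.

```cpp
#include <type_traits>

// Assumption: when no CUDA headers have defined __global__, define it away so
// the snippet builds as ordinary host C++. This is a stand-in for illustration.
#ifndef __global__
#define __global__
#endif

template <typename T>
__global__ void EmptyKernel(void) { }

struct Dummy {
  /// Type definition of the EmptyKernel kernel entry point
  typedef void (*EmptyKernelPtr)();
  // Compiles only if &EmptyKernel<void> converts to EmptyKernelPtr.
  EmptyKernelPtr Empty() { return EmptyKernel<void>; }
};

// In this host-only build the kernel's address has exactly the pointer type,
// which is what the return statement above depends on.
static_assert(
    std::is_same<decltype(&EmptyKernel<void>), Dummy::EmptyKernelPtr>::value,
    "kernel address must have the plain function pointer type");
```

This says nothing about how the fix should look; it only shows the type equivalence the existing code depends on.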
Repository:
rL LLVM
https://reviews.llvm.org/D44747