[PATCH] D44747: Set calling convention for CUDA kernel

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 3 10:51:27 PDT 2018


tra added inline comments.


================
Comment at: lib/Sema/SemaType.cpp:3319-3330
+  // Attribute AT_CUDAGlobal affects the calling convention for AMDGPU targets.
+  // This is the simplest place to infer calling convention for CUDA kernels.
+  if (S.getLangOpts().CUDA && S.getLangOpts().CUDAIsDevice) {
+    for (const AttributeList *Attr = D.getDeclSpec().getAttributes().getList();
+         Attr; Attr = Attr->getNext()) {
+      if (Attr->getKind() == AttributeList::AT_CUDAGlobal) {
+        CC = CC_CUDAKernel;
----------------
tra wrote:
> This apparently breaks compilation of some CUDA code in our internal tests. I'm working on minimizing a reproduction case. Should this code be enabled for AMD GPUs only?
Here's a small snippet of code that used to compile and work before this patch:

```
template <typename T>
__global__ void EmptyKernel(void) { }

struct Dummy {
  /// Type definition of the EmptyKernel kernel entry point
  typedef void (*EmptyKernelPtr)();
  EmptyKernelPtr Empty() { return EmptyKernel<void>; }
};
```
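AFAICT, it's currently impossible to apply __global__ to function pointer types, so there's no way to make the code above work with this patch applied. Presumably EmptyKernel<void> now gets CC_CUDAKernel as part of its type while EmptyKernelPtr keeps the default calling convention, so the return statement in Empty() no longer type-checks.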
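
For anyone who wants to see the general mechanism without a CUDA toolchain, here is a rough analogy in plain C++17. It is not CUDA and not what the patch does: it uses noexcept, which (like a calling convention in Clang) is part of the function type, so a function lacking the property can't initialize a pointer type that has it. The names Plain, Marked, PlainPtr and MarkedPtr are made up for illustration. Note the analogy is imperfect: noexcept can be dropped implicitly, whereas as far as I know there's no implicit conversion between calling conventions in either direction.

```cpp
#include <type_traits>

// Hypothetical stand-ins: one function whose type carries an extra
// property (noexcept, part of the type since C++17) and one without it.
void Marked() noexcept {}
void Plain() {}

using PlainPtr = void (*)();
using MarkedPtr = void (*)() noexcept;

// Dropping the property is an allowed implicit conversion for noexcept.
constexpr bool kCanDrop = std::is_convertible<MarkedPtr, PlainPtr>::value;

// Adding the property is not: the pointer type demands something the
// function type does not have, so the conversion is rejected.
constexpr bool kCanAdd = std::is_convertible<PlainPtr, MarkedPtr>::value;

static_assert(kCanDrop, "noexcept pointer converts to plain pointer");
static_assert(!kCanAdd, "plain pointer does not convert to noexcept pointer");

// Compiles: the types match exactly.
MarkedPtr GetMarked() { return Marked; }

// Would not compile if uncommented, same flavor of error as in Dummy::Empty():
// MarkedPtr GetPlain() { return Plain; }
```

The difference from noexcept is that user code can at least spell `void (*)() noexcept`, whereas the typedef in the snippet above has no way to ask for the kernel calling convention.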


Repository:
  rL LLVM

https://reviews.llvm.org/D44747




