[PATCH] D94732: [CUDA] Normalize handling of defaulted dtor.

Richard Smith - zygoloid via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jan 20 14:16:45 PST 2021


rsmith added inline comments.


================
Comment at: clang/lib/Sema/SemaDeclCXX.cpp:15162-15170
+  bool SkipDtorChecks = VD->getType()->isArrayType();
+
+  // CUDA: Skip destructor checks for host-only variables during device-side
+  // compilation
+  SkipDtorChecks |=
+      (LangOpts.CUDAIsDevice && VD->hasGlobalStorage() &&
+       !(VD->hasAttr<CUDADeviceAttr>() || VD->hasAttr<CUDAConstantAttr>() ||
----------------
tra wrote:
> rsmith wrote:
> > Is this safe? What happens if the destructor for the variable is a template, and instantiating that template results in a reference to a device function? Eg:
> > 
> > ```
> > template<typename T> __device__ void f() {}
> > template<typename T> struct A {
> >   ~A() { f<<<>>>(); }
> > };
> > A a;
> > ```
> This is business as usual -- we catch it during host compilation, where `a` is instantiated.
> 
> ```
> h.cu:3:10: error: no matching function for call to 'f'
>   ~A() { f<T>(); }
>          ^~~~
> h.cu:5:8: note: in instantiation of member function 'A<int>::~A' requested here
> A<int> a;
>        ^
> h.cu:1:51: note: candidate function not viable: call to __device__ function from __host__ function
> template<typename T> __attribute__((device)) void f() {}
> 
> 1 error generated when compiling for host.
> ```
> 
> If it were a `__device__ A<int> a;` , then we catch it during GPU compilation and also complain that we can't have dynamic initializers.
> 
Sorry, the testcase wasn't quite right; I meant for `f` to be `__global__`, not `__device__`, so that the kernel call to it works. Fixed example:

```
extern "C" int cudaConfigureCall(int a, int b);
template<typename T> __attribute__((__global__)) void f(T) {}
template<typename T> struct A {
  ~A() { f<<<1, 1>>>(T()); }
};
A<int> a;
```

I think that this is valid. In order for it to work, we need to trigger instantiation of `f<int>` on the device side of the compilation. In order to do that, we need to trigger instantiation of `A<int>::~A()`, so we need to mark it referenced on the device side. (This is, I think, in line with the general principle that we want to do the same template instantiations of host functions on both sides of the compilation, so that both sides agree on which kernel functions are referenced.)
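
A minimal standalone sketch of the same situation, assuming a normal CUDA toolchain where the runtime headers are pre-included (so the hand-written `cudaConfigureCall` declaration from the testcase above is not needed). The `main` function is an addition for illustration only, so that the file builds as a complete program; it is not part of the review's testcase.

```cuda
// Kernel template. Nothing in this file names f<int> directly, so it is
// instantiated only if some instantiated host code refers to it.
template <typename T> __global__ void f(T) {}

template <typename T> struct A {
  // Host-side destructor. Instantiating its body is the only thing that
  // references f<T>, and therefore the only thing that makes the kernel
  // exist on the device side.
  ~A() { f<<<1, 1>>>(T()); }
};

// Host-only variable with global storage: no __device__ or __constant__
// attribute. Its destructor is what the patch would skip checking during
// device-side compilation.
A<int> a;

// Illustration only. Launching a kernel from a static destructor at program
// exit is not something real code should rely on; the point here is which
// templates get instantiated, not the runtime behaviour.
int main() { return 0; }
```

One way to check the concern, as far as I know, is to compile this for the device only (for example with clang's `--cuda-device-only -S`) and look for a definition of the `f<int>` kernel in the output. If `A<int>::~A()` is not marked referenced on the device side, the expectation would be that the kernel is missing there even though the host side still refers to it.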


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D94732


