[PATCH] D25809: [CUDA] Improved target attribute-based overloading.

Fri Oct 21 16:03:42 PDT 2016

jlebar added inline comments.

================
Comment at: lib/Sema/SemaCUDA.cpp:87
+
+  if ((HasHostAttr && HasDeviceAttr) || ForceCUDAHostDeviceDepth > 0)
+    return CFT_HostDevice;
----------------
Checking ForceCUDAHostDeviceDepth here is...yeah.  Especially because the other overload of this function doesn't do it.

I know you get rid of it in the next patch, but how much work would it be to get rid of it here?  It definitely makes this patch harder to check.

================
Comment at: lib/Sema/SemaCUDA.cpp:790
+                                   LookupResult &Previous) {
+  CUDAFunctionTarget NewTarget = IdentifyCUDATarget(NewFD);
+  for (auto OldND : Previous) {
----------------
Nit, can we assert that we're in cuda mode?

================
Comment at: lib/Sema/SemaCUDA.cpp:791
+  CUDAFunctionTarget NewTarget = IdentifyCUDATarget(NewFD);
+  for (auto OldND : Previous) {
+    FunctionDecl *OldFD = OldND->getAsFunction();
----------------
If this is just a Decl* or NamedDecl*, can we write out the type?

================
Comment at: lib/Sema/SemaCUDA.cpp:793
+    FunctionDecl *OldFD = OldND->getAsFunction();
+    if (!OldFD || OldFD->isFunctionTemplateSpecialization())
+      continue;
----------------
Please add a comment explaining why we ignore template specializations.

================
Comment at: lib/Sema/SemaTemplate.cpp:7044

+      // Filter out matches that have different target.
+      if (LangOpts.CUDA &&
----------------
Can we have a better comment here?  (And, can we expand upon it in D25845?)

================
Comment at: lib/Sema/SemaTemplate.cpp:8117
+    if (LangOpts.CUDA &&
+        IdentifyCUDATarget(Specialization) != IdentifyCUDATarget(Attr)) {
+      FailedCandidates.addCandidate().set(
----------------
Can we just pass D here (and thus not write the new overload of IdentifyCUDATarget)?

================
Comment at: test/SemaCUDA/function-template-overload.cu:34
+
+// Can't overload HD template with H or D template, though functions are OK.
+template <typename T> __host__ __device__ HDType overload_hd(T a) { return HDType(); }
----------------
"though non-template functions are OK"?

================
Comment at: test/SemaCUDA/function-template-overload.cu:44
+// explicitly specialize or instantiate function tempaltes.
+template <> __host__ HType overload_hd(int a);
+// expected-error at -1 {{no function template matches function template specialization 'overload_hd'}}
----------------
This is OK:

  template <typename T> __host__ __device__ HDType overload_hd(T a);
  template <> __host__ HType overload_hd(int a);

but this is not OK:

  template <typename T> __host__ HType overload_h_d(T a) { return HType(); }
  template <typename T> __device__ DType overload_h_d(T a) { return DType(); }
  template  <> __device__ __host__ DType overload_h_d(long a);

Is the rule that you *can* specialize an HD function template as H or D, but you can't go in reverse?  If so, I am not getting that from the commit message or comments here.

Like in D25705, please clarify this somewhere in the code (and commit message) if this isn't just me blatantly misreading things (which is possible).

================
Comment at: test/SemaCUDA/function-template-overload.cu:44
+// explicitly specialize or instantiate function tempaltes.
+template <> __host__ HType overload_hd(int a);
+// expected-error at -1 {{no function template matches function template specialization 'overload_hd'}}
----------------
Please ignore the above comment; it is not correct, but I cannot delete it or edit it in phab.  :-/

https://reviews.llvm.org/D25809