[PATCH] D79526: [CUDA][HIP] Workaround for resolving host device function against wrong-sided function
Artem Belevich via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri May 8 14:31:38 PDT 2020
tra added a comment.
This one is just an FYI. I've managed to reduce the failure in the first version of this patch, and it looks rather odd because the reduced test case has nothing to do with CUDA. Instead, it appears to introduce a difference in the compilation of regular host-only C++ code with `-x cuda` vs `-x c++`. I'm not sure how/why the first version caused this or why the latest one fixes it. It may be worth double-checking that we're not missing something here.
template <class a> a b;
auto c(...);
template <class d> constexpr auto c(d) -> decltype(0);
struct e {
  template <class ad, class... f> static auto g(ad, f...) {
    h<e, decltype(b<f>)...>;
  }
  struct i {
    template <class, class... f> static constexpr auto j(f... k) { c(k...); }
  };
  template <class, class... f> static auto h() { i::j<int, f...>; }
};
class l {
  l() {
    e::g([] {}, this);
  }
};
The latest version of this patch works, but the previous one failed with an error when the example was compiled as CUDA, though not when it was compiled as C++:
$ bin/clang++ -x cuda argmax.cc -ferror-limit=1 -fsyntax-only --cuda-host-only -nocudalib -nocudainc -fsized-deallocation -std=c++17
argmax.cc:9:68: error: function 'c' with deduced return type cannot be used before it is defined
template <class, class... f> static constexpr auto j(f... k) { c(k...); }
^
argmax.cc:11:53: note: in instantiation of function template specialization 'e::i::j<int, l *>' requested here
template <class, class... f> static auto h() { i::j<int, f...>; }
^
argmax.cc:6:5: note: in instantiation of function template specialization 'e::h<e, l *>' requested here
h<e, decltype(b<f>)...>;
^
argmax.cc:15:8: note: in instantiation of function template specialization 'e::g<(lambda at argmax.cc:15:10), l *>' requested here
e::g([] {}, this);
^
argmax.cc:2:6: note: 'c' declared here
auto c(...);
^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
2 errors generated when compiling for host.
$ bin/clang++ -x c++ argmax.cc -ferror-limit=1 -fsyntax-only --cuda-host-only -nocudalib -nocudainc -fsized-deallocation -std=c++17
clang-11: warning: argument unused during compilation: '-nocudainc' [-Wunused-command-line-argument]
argmax.cc:11:50: warning: expression result unused [-Wunused-value]
template <class, class... f> static auto h() { i::j<int, f...>; }
^~~~~~~~~~~~~~~
argmax.cc:6:5: note: in instantiation of function template specialization 'e::h<e, l *>' requested here
h<e, decltype(b<f>)...>;
^
argmax.cc:15:8: note: in instantiation of function template specialization 'e::g<(lambda at argmax.cc:15:10), l *>' requested here
e::g([] {}, this);
^
argmax.cc:6:5: warning: expression result unused [-Wunused-value]
h<e, decltype(b<f>)...>;
^~~~~~~~~~~~~~~~~~~~~~~
argmax.cc:15:8: note: in instantiation of function template specialization 'e::g<(lambda at argmax.cc:15:10), l *>' requested here
e::g([] {}, this);
^
argmax.cc:3:35: warning: inline function 'c<l *>' is not defined [-Wundefined-inline]
template <class d> constexpr auto c(d) -> decltype(0);
^
argmax.cc:9:68: note: used here
template <class, class... f> static constexpr auto j(f... k) { c(k...); }
^
3 warnings generated.
================
Comment at: clang/include/clang/Sema/Sema.h:11663
+ bool IgnoreImplicitHDAttr = false,
+ bool *IsImplicitHDAttr = nullptr);
CUDAFunctionTarget IdentifyCUDATarget(const ParsedAttributesView &Attrs);
----------------
Plumbing an optional output argument through multiple levels of callers is rather hard to follow, especially considering that it's not set in all code paths. Perhaps we can turn IsImplicitHDAttr into a separate function and call it from isBetterOverloadCandidate().
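For illustration, a minimal sketch of what such a standalone query could look like. The helper name, its placement, and the exact attribute checks are my assumptions for discussion, not what the patch currently does:

#include "clang/AST/Attr.h"
#include "clang/AST/Decl.h"

using namespace clang;

// Hypothetical standalone predicate: true if FD ended up host device only
// because both attributes were attached implicitly (e.g. for constexpr
// functions or lambdas) rather than spelled in the source.
static bool isImplicitHostDeviceFunction(const FunctionDecl *FD) {
  const auto *HostA = FD->getAttr<CUDAHostAttr>();
  const auto *DeviceA = FD->getAttr<CUDADeviceAttr>();
  return HostA && DeviceA && HostA->isImplicit() && DeviceA->isImplicit();
}

isBetterOverloadCandidate() could then call such a predicate directly on each candidate's FunctionDecl instead of having IdentifyCUDATarget() report the information through an output parameter.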
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D79526/new/
https://reviews.llvm.org/D79526