[PATCH] D78655: [HIP] Let lambda be host device by default

Paul Fultz II via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Apr 30 13:26:51 PDT 2020


pfultz2 added a comment.

> I.e. if I pass a mutable lambda by reference to the GPU kernel

I don't think we are enabling passing host objects by reference through functions, although it could be possible for another lambda to capture the mutable lambda by reference.
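
To make the second case concrete, here is a minimal host-only sketch in plain C++ (no HIP attributes) of one lambda capturing a mutable lambda by reference. The names `call_through_reference`, `counter`, and `wrapper` are hypothetical and only for illustration; the sketch shows the shared state on the host and says nothing about what would happen on the device.

```cpp
#include <cassert>

// Host-only illustration: `counter` is a mutable lambda holding its own
// copy of `start`, and `wrapper` captures the closure object `counter`
// by reference, so both names refer to the same state.
inline int call_through_reference()
{
    int start = 0;
    auto counter = [start]() mutable { return ++start; };
    auto wrapper = [&counter] { return counter(); };

    int first = wrapper();   // counter's internal state is now 1
    int second = wrapper();  // counter's internal state is now 2
    assert(first == 1 && second == 2);
    (void)first;
    (void)second;

    // Calling the original closure directly observes the updates made
    // through the reference held by `wrapper`.
    return counter();
}
```

If `wrapper` were run on the device while `counter` lived in host memory, the reference would point at a host object, which is the situation under discussion.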

> will the same lambda called on host do the same thing when it's called on the device?

Yes, just the same as capturing a host variable by reference and using it on the device.

> In principle it would work if GPU and host operate on a uniform memory

A unified memory is not necessary. What is needed is coordination between the compiler and the runtime.

We don't support capturing host variables by reference, so maybe we can restrict the implicit HD to lambdas that don't capture by reference?

> According to cppreference, it's only true since C++17 and, AFAICT, only for capture-less lambdas.

You can capture as well, if it's in a `constexpr` context.
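
As a small sketch of that point, assuming a C++17 compiler, the following plain C++ shows a capturing lambda evaluated inside a `constexpr` function. The name `add_captured` is hypothetical and only for illustration.

```cpp
// Requires C++17 or later.
constexpr int add_captured(int base)
{
    // The closure captures `base` by value. No constexpr keyword is
    // written on the lambda; its call operator is implicitly constexpr
    // because the body satisfies the constexpr requirements.
    auto add = [base](int x) { return base + x; };
    return add(2);
}

// The capturing lambda is evaluated at compile time here.
static_assert(add_captured(40) == 42, "capturing lambda used in a constant expression");
```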

> Considering they are not always constexpr, this assertion is not true, either.

Yes, we seem to delay this. It is always HD but not always emitted for both host and device.

The issue would be if users tried to detect HD using SFINAE. It could be a false claim, but maybe it doesn't matter. More importantly, if the lambda is called in an unevaluated context, will the compiler still emit the function, or will it produce a hard error instead of a substitution failure? I assume something like this would compile:

  #include <type_traits>

  // Selected only if f() is a valid expression for a host caller.
  template<class F>
  __host__ auto is_host(F f) -> decltype(f(), std::true_type{});
  std::false_type is_host(...);

  // Selected only if f() is a valid expression for a device caller.
  template<class F>
  __device__ auto is_device(F f) -> decltype(f(), std::true_type{});
  std::false_type is_device(...);

  __host__ void f();

  void g()
  {
      // The lambda calls a host-only function but would be implicitly HD.
      auto l = []{ f(); };
      using on_host = decltype(is_host(l));
      static_assert(on_host{}, "Lambda not on host");
      using on_device = decltype(is_device(l));
      static_assert(on_device{}, "Lambda not on device");
  }



> If/when operator() does get constexpr treatment by compiler, we should already derive HD attributes from constexpr. If we do not, then that's what needs to be fixed.

How does the compiler implement this? Does it add the `constexpr` attribute onto the operator(), or does the constexpr evaluation visit the lambda as if it were `constexpr`? It seems the latter would be more efficient, and it would be similar to what we are doing with HD. The only difference is that a function can be overloaded with `__host__` and `__device__`, whereas that is not possible with `constexpr`. So a difference could be detected by the user, but maybe that doesn't matter.

> That at least would make sense from consistency standpoint as we currently do treat all other constexpr functions as HD.

I mean consistent across the different attributes, not in the interpretation of constexpr. A lambda that only calls constexpr functions implicitly has the `constexpr` attribute. So a lambda that only calls device (or HD) functions should implicitly have the `__device__` attribute.
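
The `constexpr` half of that analogy can be sketched in plain C++17. The names `square` and `squared_plus_one` are hypothetical; the `__device__` half is the proposal under review and is not shown, since it depends on how the patch ends up behaving.

```cpp
// Requires C++17 or later.
constexpr int square(int x) { return x * x; }

// No constexpr keyword on the lambda itself. Its call operator is
// implicitly constexpr because the body only calls constexpr functions.
constexpr auto squared_plus_one = [](int x) { return square(x) + 1; };

// The implicit attribute is observable: the call is a constant expression.
static_assert(squared_plus_one(3) == 10, "lambda usable in a constant expression");
```

By the same reasoning, the suggestion is that a lambda whose body only calls device or HD functions would pick up `__device__` without the user writing it.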


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78655/new/

https://reviews.llvm.org/D78655




