[PATCH] D102507: [HIP] Support <functional> in device code

Thu Jun 24 08:59:41 PDT 2021

yaxunl added a comment.

In D102507#2833594 <https://reviews.llvm.org/D102507#2833594>, @ldionne wrote:

> In D102507#2830688 <https://reviews.llvm.org/D102507#2830688>, @yaxunl wrote:
>
>> In D102507#2792087 <https://reviews.llvm.org/D102507#2792087>, @rsmith wrote:
>>
>>> @ldionne How should we go about establishing whether libc++ would be prepared to officially support CUDA? Right now, Clang's CUDA support is patching in attributes onto libc++ functions from the outside, which doesn't seem like a sustainable model.
>>
>> ping
>
> If the current approach is to patch libc++ from the outside, then yeah, that's most definitely not a great design IMO. It's going to be very brittle. I think it *may* be reasonable to support this in libc++, but I'd like to see some sort of basic explanation of what the changes would be so we can have a discussion and make our mind up about whether we can support this, and what's the best way of doing it.

Thanks Louis. Please allow me to have a brief explanation about our plan to support libc++ for HIP device compilation.

HIP functions can have `__device__`, `__host__`, or `__device__ __host__` attributes, indicating the target of a function. `__device__` function can only be executed on device (GPU). `__host__` functions can only be executed on host. `__device__ __host__` functions can be executed on both device and host. By default (without explicit device/host attributes) a non-constexpr function is a host function, a constexpr function is `__device__ __host__` function. This also applies to member functions of class. Clang is able to resolve overloaded functions differing only by device/function attributes.

Currently libc++ functions are host functions by default, except constexpr functions. As such the non-constexpr libc++ functions can only be called by host functions in HIP programs. This is similar to C++ programs.

By supporting libc++ in HIP device compilation we mean "allowing libc++ functions to be executed on device in HIP programs". To achieve this we can take 3 approaches:

1. Many libc++ functions are generic regarding device or host, i.e., their code is common for device and host. For such functions we can make them `__device__ __host__` functions.

2. Some libc++ functions are mostly common for device or host with minor differences. For such functions, we can make them `__device__ __host__` and use `#if __HIP_DEVICE_COMPILE__` (indicating device compilation) for the minor difference in the function body.

3. Some libc++ functions have different implementations for device and host. We can leave these host functions as they are and adding overloaded `__device__` functions.

There are two ways to mark libc++ functions as `__device__ __host__`:

1. Define a macro which expands to empty for non-HIP programs and expands to `__device__ __host__` for HIP and add it to each libc++ function which is to be marked as `__device__ __host__`.

2. Define macros which expand to empty for non-HIP programs and expand to `#pragma clang force_cuda_host_device begin/end` for HIP and put them at the beginning and end of a file where all the functions are to be marked as `__device__ __host__`.

We plan to implement libc++ support in HIP device compilation in a progressive approach, header by header, and document the supported libc++ headers. We will prioritize libc++ headers to support based on 1) user requests 2) whether it has already been supported through clang wrapper headers (patching) 4) usefulness for device execution 3) availability of lower level support with HIP runtime.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102507/new/

https://reviews.llvm.org/D102507