[cfe-dev] [RFC] Delayed target-specific diagnostic when compiling for the devices.

Tue Jan 15 15:04:35 PST 2019

> __host__ __device__ functions are still device functions and it means
> that they must be emitted when you compile for the device

That is not the case for templated or inline __host__ __device__
functions.  They explicitly are not emitted for host/device unless they are
called from a host/device context.  CUDA code relies heavily on this fact.
As a result, you are allowed to do "host-only" things from a __host__
__device__ function so long as it's not codegen'ed for device.  Similarly,
you can do "device-only" things from a __host__ __device__ function so long
as it's not codegen'ed for host.
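
To make that concrete, here is a minimal sketch, written so that it also
builds as plain C++ without a CUDA toolchain. The names host_only_digits and
label_digits are made up for this example, and the #ifndef block is an
assumption that turns the attributes into no-ops outside CUDA compilation.
As I understand the behavior described above, the host-only call inside the
template should only become an error on the device side if label_digits is
actually reached from device code.

```cpp
#include <cassert>
#include <cstdio>

// Assumption: outside CUDA compilation the attributes are defined away so
// that this sketch still compiles as ordinary C++.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Host-only function (hypothetical): it carries no __device__ attribute.
// snprintf with a null buffer and size 0 returns the formatted length.
int host_only_digits(int x) { return std::snprintf(nullptr, 0, "%d", x); }

// Templated __host__ __device__ function that does a "host-only" thing.
// It is only codegen'ed for a given side when it is used from that side.
template <typename T>
__host__ __device__ int label_digits(T x) {
  return host_only_digits(static_cast<int>(x));
}

// Host-side use only: no device code in this file ever calls label_digits.
inline void host_side_check() {
  assert(label_digits(42) == 2);
}
```

In this sketch nothing on the device side calls label_digits, which is the
situation where the wrong-side call is tolerated.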

The notion of "deferred diagnostics" in clang's CUDA support is explicitly
there to handle the case when we do not know whether or not a __host__
__device__ function must be emitted for host or device and so we don't know
whether or not to raise an error when you do a "wrong-side" thing (i.e.
you're compiling for device and you did a host-only thing, or you're
compiling for host and you did a device-only thing).
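
The bookkeeping behind this can be sketched as a toy model in C++. This is
not clang's actual data structure or API; DeferredDiags, defer, addCall and
markEmitted are hypothetical names chosen for illustration. The idea it
shows is only the one described above: a diagnostic is attached to the
function it occurred in, and it is reported only once that function is known
to be emitted.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model of deferred diagnostics (not clang's implementation).
struct DeferredDiags {
  std::map<std::string, std::vector<std::string>> pending;  // function -> diags
  std::map<std::string, std::vector<std::string>> callees;  // caller -> callees
  std::set<std::string> emitted;      // functions known to be emitted
  std::vector<std::string> reported;  // diagnostics actually shown

  // Record a wrong-side diagnostic without reporting it yet.
  void defer(const std::string& fn, const std::string& msg) {
    pending[fn].push_back(msg);
  }

  // Record that `caller` calls `callee`.
  void addCall(const std::string& caller, const std::string& callee) {
    callees[caller].push_back(callee);
  }

  // Called once we learn that `fn` must be emitted: report its deferred
  // diagnostics, then do the same for everything it calls.
  void markEmitted(const std::string& fn) {
    if (!emitted.insert(fn).second) return;  // already handled
    auto p = pending.find(fn);
    if (p != pending.end()) {
      for (const auto& msg : p->second) reported.push_back(fn + ": " + msg);
      pending.erase(p);
    }
    auto c = callees.find(fn);
    if (c != callees.end())
      for (const auto& callee : c->second) markEmitted(callee);
  }
};

// Two functions do a wrong-side thing, but only one is reachable from
// something that must be emitted, so only one diagnostic is reported.
inline std::vector<std::string> demo() {
  DeferredDiags d;
  d.defer("hd_unused", "host-only call in device code");
  d.defer("hd_used", "host-only call in device code");
  d.addCall("kernel", "hd_used");
  d.markEmitted("kernel");
  return d.reported;
}
```

Here demo() reports the diagnostic for hd_used and stays silent about
hd_unused, since nothing that is emitted ever reaches it.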

On Tue, Jan 15, 2019 at 2:58 PM Alexey Bataev <a.bataev at outlook.com> wrote:

> __host__ __device__ functions are still device functions and it means that
> they must be emitted when you compile for the device. You know that the
> user marked those functions as device functions. In OpenMP, you cannot
> say before the codegen phase whether a function is used on the device or
> not. We should not emit every available function, only those that are
> used (implicitly or explicitly, directly or indirectly) in the target
> regions.
>
> Best regards,
> Alexey Bataev
>
> >> On Jan 15, 2019, at 17:34, John McCall <jmccall at apple.com> wrote:
> >>
> >> On 15 Jan 2019, at 17:20, Alexey Bataev wrote:
> >> This is not only for asm, we need to delay all target-specific
> >> diagnostics.
> >> I'm not saying that we need to move the host diagnostic, only the
> >> diagnostic for the device compilation.
> >> As for CUDA, it is a little bit different. In CUDA the programmer must
> >> explicitly mark the device functions, while in OpenMP it must be done
> >> implicitly. Thus, we cannot reuse the solution used for CUDA.
> >
> > All it means is that you can't just use the solution used for CUDA "off
> > the shelf".  The basic idea of associating diagnostics with the current
> > function and then emitting those diagnostics later when you realize that
> > you have to emit that function is still completely applicable.
> >
> > John.
>