[PATCH] D61458: [hip] Relax CUDA call restriction within `decltype` context.

Justin Lebar via cfe-commits cfe-commits at lists.llvm.org
Thu May 2 20:00:49 PDT 2019


> In any case, it seems like your examples argue for disallowing a
return-type mismatch between host and device overloads, not disallowing
observing the type?

Oh no, we have to allow return-type mismatches between host and device
overloads, that is a common thing in CUDA code I've seen.  You can safely
observe this difference *so long as you're inside of a function*.  This is
because we have this caller-sensitive function parsing thing.  When parsing
a __host__ __device__ function, we look at the caller to understand what
context we're in.

What I think you can't do is observe the return-type mismatch between host
and device overloads *from outside of a function*, e.g. from within a
trailing return type.

But perhaps rsmith or another expert can take my attempt at a
contract above and trap me in a Faustian contradiction.
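
A rough way to picture the "outside of a function" case, as a plain C++
model rather than real CUDA. Everything here is an assumption made for
illustration: `DEVICE_PASS` is a hypothetical macro standing in for "we are
compiling for device", `Kernel` and `launched_width` are made-up names, and
the `#ifdef` only imitates which `bar` overload would be picked. It is a
sketch of the problem, not of how clang implements anything.

```cpp
// Plain C++ model of the host and device compilation passes; NOT real CUDA.
// DEVICE_PASS is a hypothetical macro meaning "this is the device pass".
#include <cstdint>

#ifdef DEVICE_PASS
std::int16_t bar();  // models: __device__ int16 bar();
#else
std::int8_t bar();   // models: __host__ int8 bar();
#endif

// Models: __host__ __device__ auto foo() -> decltype(bar());
// The trailing return type is resolved at the declaration, outside of any
// function body, so there is no caller to tell us which context we are in.
auto foo() -> decltype(bar());

// Models: template <int N> __global__ void kernel();
template <int N>
struct Kernel { static constexpr int width = N; };

// Models launch_kernel(): the template argument depends on foo's type.
constexpr int launched_width() {
  return Kernel<sizeof(decltype(foo()))>::width;
}

// Models the static_assert in host_fn(): holds only in the host pass.
static_assert(sizeof(decltype(foo())) == sizeof(std::int8_t),
              "foo's return type as seen from host code");
```

Compiled as is (for example with `-fsyntax-only`), the assertion holds and
the model instantiates `Kernel<1>`. Compiled with `-DDEVICE_PASS`, the same
source asks for `Kernel<2>` and the assertion fails, even though nothing
here is device code. That is the disagreement between the two passes that
the contract is meant to rule out.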

On Thu, May 2, 2019 at 7:47 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:

> Thanks, Justin. It seems like we have the standard set of options: We can
> disallow the mismatch. We can allow it with a warning. We can allow it
> without a warning. We can say that if the mismatch contributes to the type
> of a kernel function, that's ill-formed (NDR).
>
> In any case, it seems like your examples argue for disallowing a
> return-type mismatch between host and device overloads, not disallowing
> observing the type? Or maybe disallowing observing the type only when
> there's a mismatch?
>
>  -Hal
>
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
> ------------------------------
> *From:* Justin Lebar <jlebar at google.com>
> *Sent:* Thursday, May 2, 2019 9:16 PM
> *To:* reviews+D61458+public+f6ea501465ad52d2 at reviews.llvm.org
> *Cc:* michael.hliao at gmail.com; Artem Belevich; John McCall; Liu, Yaxun
> (Sam); Finkel, Hal J.; Richard Smith; Clang Commits; mlekena at skidmore.edu;
> blitzrakete at gmail.com; Han Shen
> *Subject:* Re: [PATCH] D61458: [hip] Relax CUDA call restriction within
> `decltype` context.
>
> > So, actually, I wonder if that's not the right answer. We generally
> allow different overloads to have different return types. What if, for
> example, the return type on the host is __float128 and on the device it's
> `MyLongFloatTy`?
>
> The problem is that conceptually compiling for host/device does not create
> a new set of overloads.
>
> When we compile for (say) host, we build a full AST for all functions,
> including device functions, and that AST must pass sema checks.  This is
> significant for example because when compiling for device we need to know
> which kernel templates were instantiated on the host side, so we know which
> kernels to emit.
>
> Here's a contrived example.
>
> ```
> __host__ int8 bar();
> __device__ int16 bar();
> __host__ __device__ auto foo() -> decltype(bar()) {}
>
> template <int N> __global__ void kernel();
>
> void launch_kernel() {
>   kernel<sizeof(decltype(foo()))><<<...>>>();
> }
> ```
>
> This template instantiation had better be the same when compiling for host
> and device.
>
> That's contrived, but consider this much simpler case:
>
> ```
> void host_fn() {
>   static_assert(sizeof(decltype(foo())) == sizeof(int8));
> }
> ```
>
> If we let foo return int16 in device mode, this static_assert will fail
> when compiling in *device* mode even though host_fn is never called on the
> device.  https://gcc.godbolt.org/z/gYq901
>
> Why are we doing sema checks on the host code when compiling for device?
> See the contrived example above: we need quite a bit of info about the host
> code to infer those templates.
>
> On Thu, May 2, 2019 at 7:05 PM Hal Finkel via Phabricator <
> reviews at reviews.llvm.org> wrote:
>
> hfinkel added a comment.
>
> In D61458#1488970 <https://reviews.llvm.org/D61458#1488970>, @jlebar
> wrote:
>
> > Here's one for you:
> >
> >   __host__ float bar();
> >   __device__ int bar();
> >   __host__ __device__ auto foo() -> decltype(bar()) {}
> >
> >
> > What is the return type of `foo`?  :)
> >
> > I don't believe the right answer is, "float when compiling for host, int
> when compiling for device."
>
>
> So, actually, I wonder if that's not the right answer. We generally allow
> different overloads to have different return types. What if, for example,
> the return type on the host is __float128 and on the device it's
> `MyLongFloatTy`?
>
> > I'd be happy if we said this was an error, so long as it's well-defined
> what exactly we're disallowing.  But I bet @rsmith can come up with
> substantially more evil testcases than this.
>
>
>
>
> Repository:
>   rG LLVM Github Monorepo
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D61458/new/
>
> https://reviews.llvm.org/D61458
>
>
>
>