[PATCH] D18458: [CUDA] Mangle __host__ __device__ functions differently than __host__ or __device__ functions.

Justin Lebar via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 25 13:05:19 PDT 2016

jlebar added a comment.

> In http://reviews.llvm.org/D18458#383276, @jlebar wrote:
>
> > In http://reviews.llvm.org/D18458#383266, @rsmith wrote:
>
> This makes the "`constexpr` implies `__host__` `__device__`" patch look slightly questionable: two translation units defining the same `constexpr` function will mangle that function differently depending on whether the translation unit is built with CUDA support enabled. That will cause you to get duplicates of static locals and the like (but I suppose you do anyway between the host and the device, so maybe that's not much more broken than it would be regardless).

The breakage seems to be worse than this.  :(  Eigen seems to do the following:

  foo.h:
    #ifdef __CUDACC__  // If compiling CUDA code
    #define HOST_DEVICE __host__ __device__
    #else
    #define HOST_DEVICE
    #endif
    HOST_DEVICE void foo();
  foo.cc:  // Compiled as CUDA
    #include "foo.h"
    HOST_DEVICE void foo() { ... }
  bar.cc:  // *Not* compiled as CUDA
    #include "foo.h"
    void bar() { foo(); }

With this patch, foo() has a different mangled name in foo.o and bar.o, and
we're hosed.

If we think this use case is reasonable (I think it is), then we cannot mangle
__host__ __device__ functions differently when doing host compilation.  That in
turn restricts us to saying that H and HD functions with the same signature
cannot overload.  This leaves us with two options:

1. No overloading between HD and H or D functions with the same signature.

  I don't see how to do this while still letting constexpr be HD; the issue is that there are constexpr std math functions that we want to overload for device.  We could make constexpr imply something other than HD, but if that new thing can overload with D, then I think we still have the same problem.
2. No overloading between HD and H, but OK to overload HD and D.

  If we did this, we'd still need to give D functions a different mangled name, but we would avoid the problem above of referencing symbols defined in a file compiled as CUDA from a file compiled without CUDA.

  tra pointed out a problem with this, which is that if someone (say, nvidia) gave us a C++ library consisting of precompiled device code plus headers, we wouldn't be able to link with it, because we would use different mangling.

  I also don't like this because it's inconsistent to say HD can overload D but not H.  But that's a minor point at this point.

Richard, what do you think?  Maybe you have an alternative idea?

