[PATCH] D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables.

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 29 13:56:40 PDT 2019

tra added a comment.

In D62603#1521979 <https://reviews.llvm.org/D62603#1521979>, @yaxunl wrote:

> > I think `static __device__` globals would fall into the same category -- nominally they should not be visible outside of device-side object file, but in practice we do need to make them visible from the host side of the same TU.
> Are you sure nvcc supports accessing static `__device__` variables in host code? That would be expensive to implement.

Address (of the shadow, translatable to device address) and size -- yes. Values -- no.

E.g. you can pass `&array` as a parameter to the kernel. Host-side code will use the shadow's address, but the device-side kernel will get the real device-side address, which the runtime translates from the shadow address.
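
To make the shadow-to-device translation concrete, here is a rough host-only C++ sketch. It is a toy model under stated assumptions, not how the CUDA runtime is actually implemented: `ToyRuntime`, `SymbolInfo`, `register_var`, `translate`, `array_shadow` and the address `0x7000` are all hypothetical names and values made up for illustration. The only idea taken from the discussion is that the host knows a variable's shadow address and size, and something in the runtime maps that shadow address to the real device address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Toy model: a "device address" is just an integer here.
using DeviceAddr = std::uint64_t;

// What the host can learn about a device variable: address and size, not values.
struct SymbolInfo {
    DeviceAddr device_addr;  // where the variable would live on the device
    std::size_t size;        // size in bytes
};

// Hypothetical stand-in for the runtime's table of registered variables,
// keyed by the address of the host-side shadow.
class ToyRuntime {
public:
    void register_var(const void* shadow, DeviceAddr dev, std::size_t size) {
        assert(shadow != nullptr && size > 0);
        table_[shadow] = SymbolInfo{dev, size};
    }

    // Translate a shadow address; returns false if the shadow was never registered.
    bool translate(const void* shadow, SymbolInfo& out) const {
        auto it = table_.find(shadow);
        if (it == table_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::unordered_map<const void*, SymbolInfo> table_;
};

// Host-side shadow of a nominally static device array (hypothetical).
static int array_shadow[4];

// Walks through the register-then-translate flow once.
bool demo() {
    ToyRuntime rt;
    rt.register_var(&array_shadow, 0x7000, sizeof(array_shadow));
    SymbolInfo info{};
    return rt.translate(&array_shadow, info) &&
           info.device_addr == 0x7000 &&
           info.size == sizeof(array_shadow);
}
```

Note that the lookup is keyed by the shadow's address rather than by symbol name, so in this model two `static` variables with the same name in different TUs would not collide.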

> Instead of looking up dynamic symbol tables only, now we need to look up symbol tables for local symbols. Also we have to differentiate local symbols that have the same name. This also means users cannot strip symbol tables.

I'm not sure I understand what you're saying. The CUDA runtime and device-side object file management are a black box to me, so I don't know exactly how NVIDIA has implemented this on the device side, but the fact remains: the host must have some way to refer to (some) device-side entities, specifically kernels and global variables, whether they are nominally static or not.

  rC Clang


