[PATCH] D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc

Fri Jan 15 13:23:06 PST 2021

yaxunl added a comment.

In D85223#2452363 <https://reviews.llvm.org/D85223#2452363>, @JonChesterfield wrote:

> I concede that making the variables external, and trying to give them unique names, does work around static variables not working. I believe static variables are subjected to more aggressive optimisation than external ones but the effect might not be significant.
>
> This "works" in cuda today because the loader ignores the local annotation when accessing the variable. There is some probably unintended behaviour when multiple static variables have the same name in that the first one wins.
>
> The corresponding change to the hsa loader is trivial. Why is making the symbols external, with the associated complexity in picking non-conflicting names, considered better than changing the loader?

Three reasons:

1. The loader would prefer to look up symbols in dynsym only, which conforms better to standard dynamic linker behavior and is more efficient than searching all symbols.

2. Symbols with the same name can come from different compilation units and end up as local symbols with the same name in the binary. How would the loader know which is which?

3. If a device symbol is static but is actually accessed by host code in the same compilation unit, the device symbol has de facto external linkage, since it is truly accessed by someone outside the device object (this is due to the unfortunate fact that a single source file ends up with a host object and a device object even though they are supposed to be the same compilation unit). Keeping the device symbol with internal linkage will cause the compiler to over-optimize the device code.
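
To make reasons 1 and 2 concrete, here is a minimal C++ sketch of a name-based symbol lookup. It is a toy model, not the HSA or CUDA loader: the Symbol struct, the Binding enum, both lookup functions, the file names a.cu and b.cu, and the uniquified names such as x__uid_a are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified symbol-table entry (not a real ELF or HSA structure).
enum class Binding { Local, Global };

struct Symbol {
    std::string name;
    Binding binding;
    std::string origin;  // source file that defined the symbol (illustration only)
};

// Name-only search over every symbol, ignoring binding: the first match wins.
std::optional<Symbol> lookupAnyBinding(const std::vector<Symbol>& table,
                                       const std::string& name) {
    for (const Symbol& s : table)
        if (s.name == name) return s;
    return std::nullopt;
}

// Search restricted to global symbols, loosely modeling a dynsym-only lookup.
std::optional<Symbol> lookupGlobalOnly(const std::vector<Symbol>& table,
                                       const std::string& name) {
    for (const Symbol& s : table)
        if (s.binding == Binding::Global && s.name == name) return s;
    return std::nullopt;
}

// Two compilation units each define "static __device__ int x": both stay local.
std::vector<Symbol> makeStaticTable() {
    return {{"x", Binding::Local, "a.cu"}, {"x", Binding::Local, "b.cu"}};
}

// The same variables externalized under assumed, made-up unique names.
std::vector<Symbol> makeExternalizedTable() {
    return {{"x__uid_a", Binding::Global, "a.cu"},
            {"x__uid_b", Binding::Global, "b.cu"}};
}

void demonstrate() {
    const std::vector<Symbol> statics = makeStaticTable();
    // Ignoring binding always yields a.cu's x; b.cu's x is unreachable by name.
    assert(lookupAnyBinding(statics, "x")->origin == "a.cu");
    // A global-only lookup finds neither local symbol.
    assert(!lookupGlobalOnly(statics, "x").has_value());

    const std::vector<Symbol> externals = makeExternalizedTable();
    // With unique external names, each variable resolves unambiguously.
    assert(lookupGlobalOnly(externals, "x__uid_a")->origin == "a.cu");
    assert(lookupGlobalOnly(externals, "x__uid_b")->origin == "b.cu");
}
```

In this model, a lookup that ignores binding reproduces the "first one wins" behavior described in the quoted comment, while the externalized table lets a global-only lookup find each variable without ambiguity.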


https://reviews.llvm.org/D85223