[PATCH] D44435: Add the module name to __cuda_module_ctor and __cuda_module_dtor for unique function names

Wed Mar 14 10:40:43 PDT 2018

rjmccall added inline comments.

================
Comment at: lib/CodeGen/CGCUDANV.cpp:281

+  // get name from the module to generate unique ctor name for every module
+  SmallString<128> ModuleName
----------------
SimeonEhrig wrote:
> tra wrote:
> > SimeonEhrig wrote:
> > > rjmccall wrote:
> > > > Please explain in the comment *why* you're doing this.  It's just for debugging, right?  So that it's known which object file the constructor function comes from.
> > > The motivation is the same as in this review: https://reviews.llvm.org/D34059
> > > We are trying to enable incremental compilation of CUDA runtime code, so we need unique ctor/dtor names to handle the CUDA device code across different modules.
> > I'm also interested in the motivation for this change.
> > 
> > Also, if the goal is to have a unique module identifier, would compiling two different files with the same name be a problem? If the goal is to help identify a module, this may be OK, if not ideal. If you really need a unique name, then you may need to do something more elaborate. NVCC appears to use some random number (or hash of something?) for that.
> We need this modification for our C++ interpreter Cling, which we want to extend to interpret CUDA runtime code. Effectively, it is a JIT that reads the program code line by line. Every line gets its own llvm::Module. The interpreter works with incremental and lazy compilation, and it is because of the lazy compilation that we need this modification. In CUDA mode, clang generates a __cuda_module_ctor and a __cuda_module_dtor for every module if the compiler was started with a path to a fatbinary file. But the ctor also depends on the source code that is translated to LLVM IR in the module. For example, if a __global__ kernel is defined, CodeGen adds a call to __cuda_register_globals() to the ctor. Lazy compilation, however, prevents us from translating a function that has already been translated. Without the modification, the interpreter thinks that the ctor is always the same and uses the first translation of the function that was generated. Therefore, it is impossible to add new kernels.
I'm not asking you to explain to *me* why you're doing this, I'm asking you to explain *in the comment* why you're doing this.

That said, we should discuss this.  It sounds like you need the function to have a unique name because otherwise you're seeing inter-module conflicts between incremental slices.  Since the function is emitted with internal linkage, I assume that those conflicts must be because you're promoting internal linkage to external in order to make incremental processing able to link to declarations from an earlier slice of the translation unit.  I really think that a better solution would be to change how we assign LLVM linkage to static global declarations in IRGen — basically, recognizing the difference between internal linkage (where different parts of the translation unit can still refer to the same entity) and no linkage at all (where they cannot).  We could then continue to emit truly private entities, like global ctors/dtors, lambda bodies, block functions, and so on, with internal/private linkage without worrying about how your pass will mess up the linkage later.
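
For concreteness, the naming scheme under discussion can be sketched roughly as follows. This is a standalone illustration and not the code from the patch: `sanitizeModuleName` and `makeCtorName` are hypothetical names, plain `std::string` stands in for the LLVM string types, and the exact sanitizing rule is an assumption.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from D44435): map a module identifier, for
// example a file path, to a suffix that is safe to use in a symbol name.
// Every character that is not a letter or digit becomes '_'.
std::string sanitizeModuleName(const std::string &ModuleName) {
  std::string Result;
  Result.reserve(ModuleName.size());
  for (unsigned char C : ModuleName)
    Result.push_back(std::isalnum(C) ? static_cast<char>(C) : '_');
  return Result;
}

// Hypothetical helper: build a per-module constructor name by appending
// the sanitized module identifier to the fixed "__cuda_module_ctor_" prefix.
std::string makeCtorName(const std::string &ModuleName) {
  return "__cuda_module_ctor_" + sanitizeModuleName(ModuleName);
}
```

With this sketch, `makeCtorName("foo.cu")` yields `__cuda_module_ctor_foo_cu`. The scheme only distinguishes modules whose identifiers differ after sanitizing: `a/b.cu` and `a_b.cu` map to the same name, and two modules with the same identifier always collide, which is the concern raised above about files sharing a name.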

Repository:
  rC Clang

https://reviews.llvm.org/D44435