[PATCH] D44985: Remove initializer for CUDA shared variable

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 29 08:42:41 PDT 2018


yaxunl added a comment.

In https://reviews.llvm.org/D44985#1050876, @rjmccall wrote:

> In https://reviews.llvm.org/D44985#1050674, @yaxunl wrote:
>
> > In https://reviews.llvm.org/D44985#1050670, @rjmccall wrote:
> >
> > > What exactly are you trying to express here?  Are you just trying to make these external declarations when compiling for the device because `__shared__` variables are actually defined on the host?  That should be handled by the frontend by setting up the AST so that these declarations are not definitions.
> >
> >
> > No. These variables are not like external symbols defined on the host. They behave like global variables in the kernel code but never initialized. Currently no targets are able to initialize them and it is users' responsibility to initialize them explicitly.
> >
> > Giving them an initial value will cause error in some backends since they cannot handle them, therefore put undef as initializer.
>
>
> So undef is being used as a special marker to the backends that it's okay not to try to initialize these variables?


I think using undef as the initializer tells the LLVM passes and the backend that this global variable contains an undefined value. I am not sure whether this is better than having no initializer at all. I saw this code in CodeGenModule::getOrCreateStaticVarDecl:

  // Local address space cannot have an initializer.
  llvm::Constant *Init = nullptr;
  if (Ty.getAddressSpace() != LangAS::opencl_local)
    Init = EmitNullConstant(Ty);
  else
    Init = llvm::UndefValue::get(LTy);

which means an OpenCL static variable in the local address space (equivalent to the CUDA shared address space) gets an undef initializer.

A CUDA shared variable, in CodeGenFunction::EmitStaticVarDecl, first goes through the call to CodeGenModule::getOrCreateStaticVarDecl and gets a zeroinitializer, then reaches line 400:

    // Whatever initializer such variable may have when it gets here is
    // a no-op and should not be emitted.
    bool isCudaSharedVar = getLangOpts().CUDA && getLangOpts().CUDAIsDevice &&
                           D.hasAttr<CUDASharedAttr>();
    // If this value has an initializer, emit it.
    if (D.getInit() && !isCudaSharedVar)
      var = AddInitializerToStaticVarDecl(D, var);

Although this prevents the initializer from D being added, var already has a zeroinitializer from CodeGenModule::getOrCreateStaticVarDecl, so that initializer needs to be overwritten with undef.

A better solution would probably be to do this in CodeGenModule::getOrCreateStaticVarDecl, side by side with the OpenCL code.
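
To illustrate the idea of a single decision point, here is a rough standalone model. It is not the clang code and does not use any clang or LLVM API: AddrSpace, InitKind and chooseStaticVarInit are hypothetical names made up for this sketch, and the boolean parameter stands in for the existing isCudaSharedVar check.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not clang types.
enum class AddrSpace { Default, OpenCLLocal };
enum class InitKind { Zero, Undef };

// Models choosing the initializer once, when the static variable is created:
// OpenCL local variables and CUDA device-side __shared__ variables both get
// undef, everything else gets a zero initializer.
constexpr InitKind chooseStaticVarInit(AddrSpace AS, bool IsCudaDeviceShared) {
  if (AS == AddrSpace::OpenCLLocal || IsCudaDeviceShared)
    return InitKind::Undef;
  return InitKind::Zero;
}

// Small usage example covering the three cases discussed above.
inline void demoStaticVarInit() {
  // Ordinary static variable: zero initializer.
  assert(chooseStaticVarInit(AddrSpace::Default, false) == InitKind::Zero);
  // OpenCL local variable: undef, matching the existing code path.
  assert(chooseStaticVarInit(AddrSpace::OpenCLLocal, false) == InitKind::Undef);
  // CUDA __shared__ variable on the device: undef, with no later overwrite.
  assert(chooseStaticVarInit(AddrSpace::Default, true) == InitKind::Undef);
}
```

With the choice made in one place, the variable would never receive a zeroinitializer that has to be replaced afterwards. Whether the real change can be structured this way depends on what information is available inside getOrCreateStaticVarDecl.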


https://reviews.llvm.org/D44985




