[PATCH] D95558: [NFC][CUDA] Refactor registering device variable

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Feb 1 12:45:39 PST 2021


tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.

Couple of minor nits LGTM otherwise,



================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:924
+
+void CGNVCUDARuntime::adjustShadowVarLinkage(
+    const VarDecl *D, llvm::GlobalValue::LinkageTypes &Linkage) {
----------------
yaxunl wrote:
> tra wrote:
> > The name and the functionality do not match -- despite what the name says, the function will set internal linkage on any var with device-side attributes.
> > 
> > Rename to `internalizeDeviceSideVars` ?
> done
My suggestion could be improved, too. `Vars` -> `Var` as we're dealing with only one VarDecl here.


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4270-4271
     } else {
-      // Host-side shadows of external declarations of device-side
-      // global variables become internal definitions. These have to
-      // be internal in order to prevent name conflicts with global
-      // host variables with the same name in a different TUs.
-      if (D->hasAttr<CUDADeviceAttr>() || D->hasAttr<CUDAConstantAttr>()) {
-        Linkage = llvm::GlobalValue::InternalLinkage;
-        // Shadow variables and their properties must be registered with CUDA
-        // runtime. Skip Extern global variables, which will be registered in
-        // the TU where they are defined.
-        //
-        // Don't register a C++17 inline variable. The local symbol can be
-        // discarded and referencing a discarded local symbol from outside the
-        // comdat (__cuda_register_globals) is disallowed by the ELF spec.
-        // TODO: Reject __device__ constexpr and __device__ inline in Sema.
-        if (!D->hasExternalStorage() && !D->isInline())
-          getCUDARuntime().registerDeviceVar(D, *GV, !D->hasDefinition(),
-                                             D->hasAttr<CUDAConstantAttr>());
-      } else if (D->hasAttr<CUDASharedAttr>()) {
-        // __shared__ variables are odd. Shadows do get created, but
-        // they are not registered with the CUDA runtime, so they
-        // can't really be used to access their device-side
-        // counterparts. It's not clear yet whether it's nvcc's bug or
-        // a feature, but we've got to do the same for compatibility.
-        Linkage = llvm::GlobalValue::InternalLinkage;
-      } else if (D->getType()->isCUDADeviceBuiltinSurfaceType() ||
-                 D->getType()->isCUDADeviceBuiltinTextureType()) {
-        // Builtin surfaces and textures and their template arguments are
-        // also registered with CUDA runtime.
-        Linkage = llvm::GlobalValue::InternalLinkage;
-        const ClassTemplateSpecializationDecl *TD =
-            cast<ClassTemplateSpecializationDecl>(
-                D->getType()->getAs<RecordType>()->getDecl());
-        const TemplateArgumentList &Args = TD->getTemplateArgs();
-        if (TD->hasAttr<CUDADeviceBuiltinSurfaceTypeAttr>()) {
-          assert(Args.size() == 2 &&
-                 "Unexpected number of template arguments of CUDA device "
-                 "builtin surface type.");
-          auto SurfType = Args[1].getAsIntegral();
-          if (!D->hasExternalStorage())
-            getCUDARuntime().registerDeviceSurf(D, *GV, !D->hasDefinition(),
-                                                SurfType.getSExtValue());
-        } else {
-          assert(Args.size() == 3 &&
-                 "Unexpected number of template arguments of CUDA device "
-                 "builtin texture type.");
-          auto TexType = Args[1].getAsIntegral();
-          auto Normalized = Args[2].getAsIntegral();
-          if (!D->hasExternalStorage())
-            getCUDARuntime().registerDeviceTex(D, *GV, !D->hasDefinition(),
-                                               TexType.getSExtValue(),
-                                               Normalized.getZExtValue());
-        }
-      }
+      getCUDARuntime().internalizeDeviceSideVars(D, Linkage);
+      getCUDARuntime().handleVarRegistration(D, *GV);
     }
----------------
Should we fold `internalizeDeviceSideVar` into `handleVarRegistration` or call it from there?
I don't think we'll have any independent use for it in CGM.  It seems to be an implementation detail for `handleVarRegistration` and may not even need to be virtual.



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95558/new/

https://reviews.llvm.org/D95558



More information about the cfe-commits mailing list