[PATCH] D11664: [CUDA] Implemented additional processing steps needed to link with CUDA libdevice bitcode.

Mon Aug 24 13:22:19 PDT 2015

echristo added inline comments.

================
Comment at: lib/CodeGen/CodeGenAction.cpp:166-170
@@ +165,7 @@
+        std::vector<const char *> ModuleFuncNames;
+        // We need to internalize contents of the linked module but it
+        // has to be done *after* the linking because internalized
+        // symbols will not be linked in otherwise.
+        // In order to do that, we preserve current list of function names in
+        // the module and then pass it to Internalize pass to preserve.
+        if (LangOpts.CUDA && LangOpts.CUDAIsDevice &&
----------------
Can you explain this in a different way, perhaps? I'm not sure what you mean here.

================
Comment at: lib/CodeGen/CodeGenAction.cpp:181-190
@@ -166,2 +180,12 @@
           return;
+        if (LangOpts.CUDA && LangOpts.CUDAIsDevice &&
+            LangOpts.CUDAUsesLibDevice) {
+          legacy::PassManager passes;
+          passes.add(createInternalizePass(ModuleFuncNames));
+          // Considering that most of the functions we've linked are
+          // not going to be used, we may want to eliminate them
+          // early.
+          passes.add(createGlobalDCEPass());
+          passes.run(*TheModule);
+        }
       }
----------------
Should this be part of the normal IPO pass run instead? This seems like an odd place for it. Can you explain a bit more why it needs to happen here?
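
As a reading aid, the ordering described in the patch comments (record the module's function names, link, internalize everything else, then run global DCE) can be modeled with a toy sketch. This is only an illustration under stated assumptions: ToyFunction, ToyModule, and every helper below are hypothetical stand-ins built on standard containers, not the LLVM API and not the code in the patch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model of a module: function name -> linkage and callees.
struct ToyFunction {
  bool Internal = false;             // true once the function is internalized
  std::vector<std::string> Callees;  // names of the functions it calls
};
using ToyModule = std::map<std::string, ToyFunction>;

// Step 1: remember which functions existed before linking.
std::set<std::string> snapshotNames(const ToyModule &M) {
  std::set<std::string> Names;
  for (const auto &Entry : M)
    Names.insert(Entry.first);
  return Names;
}

// Step 2: link the library in; existing definitions are kept as they are.
void linkIn(ToyModule &M, const ToyModule &Lib) {
  for (const auto &Entry : Lib)
    M.insert(Entry);
}

// Step 3: internalize every function whose name is not in Preserve.
void internalizeExcept(ToyModule &M, const std::set<std::string> &Preserve) {
  for (auto &Entry : M)
    if (Preserve.count(Entry.first) == 0)
      Entry.second.Internal = true;
}

// Step 4: drop internal functions that no external function can reach.
void globalDCE(ToyModule &M) {
  std::set<std::string> Live;
  std::vector<std::string> Work;
  for (const auto &Entry : M)
    if (!Entry.second.Internal && Live.insert(Entry.first).second)
      Work.push_back(Entry.first);
  while (!Work.empty()) {
    std::string Name = Work.back();
    Work.pop_back();
    auto It = M.find(Name);
    if (It == M.end())
      continue;
    for (const auto &Callee : It->second.Callees)
      if (Live.insert(Callee).second)
        Work.push_back(Callee);
  }
  for (auto It = M.begin(); It != M.end();) {
    if (Live.count(It->first) == 0)
      It = M.erase(It);
    else
      ++It;
  }
}

// Runs the four steps in order and returns the surviving function names.
std::vector<std::string> linkInternalizeAndPrune(ToyModule M,
                                                 const ToyModule &Lib) {
  const std::set<std::string> Preserve = snapshotNames(M);
  linkIn(M, Lib);
  internalizeExcept(M, Preserve);
  globalDCE(M);
  std::vector<std::string> Names;
  for (const auto &Entry : M)
    Names.push_back(Entry.first);
  return Names;
}
```

In this model, a library function survives only if something from the original module reaches it, which is the effect the patch appears to be after by pairing the internalize step with global DCE right after linking.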


http://reviews.llvm.org/D11664