[PATCH] D123810: [Cuda] Add initial support for wrapping CUDA images in the new driver.

Tue May 10 14:52:42 PDT 2022

tra added inline comments.

================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:188
+  static inline OffloadKind getEmptyKey() {
+    return static_cast<OffloadKind>(0xFFFF);
+  }
----------------
Extend `enum OffloadKind` to include these special kinds. 

================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:201
   static DeviceFile getEmptyKey() {
-    return {DenseMapInfo<StringRef>::getEmptyKey(),
+    return {static_cast<OffloadKind>(DenseMapInfo<uint16_t>::getEmptyKey()),
             DenseMapInfo<StringRef>::getEmptyKey(),
----------------
Why do we limit ourselves to uint16_t here? Can't we just use `OffloadKind` itself and get rid of these casts?

================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:726
+
+  if (Error Err = executeCommands(*FatBinaryPath, CmdArgs))
+    return std::move(Err);
----------------
We should have a way to pass extra options to fatbinary, too. E.g. we may want to use `--compress-all`. Also, we may need to pass through `-g` for debug builds.

Oh. Debug builds. Makes me wonder if cuda-gdb will be able to find GPU binaries packaged by the new driver. If it does not, it will be a rather serious problem. It would likely affect various profiling tools the same way, too. Can you give it a try?

================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1192
     // Run LTO on any bitcode files and replace the input with the result.
-    if (Error Err = linkBitcodeFiles(LinkerInput.getSecond(), TheTriple,
-                                     File.Arch, WholeProgram))
+    if (Error Err = linkBitcodeFiles(LinkerInputFiles, TheTriple, File.Arch,
+                                     WholeProgram))
----------------
Nit. We should think of changing `linkBitcodeFiles` to return `llvm::ErrorOr<bool>` so we can return `WholeArchive` value, instead of modifying it as an argument.

================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1200
+      LinkedImages.emplace_back(OFK_OpenMP, TheTriple.getTriple(), File.Arch,
+                                LinkerInputFiles.front());
       continue;
----------------
What's expected to happen if we have more than one input?
If only one is ever expected, I'd add an assert.

================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1206
     // CUDA in non-RDC mode.
     if (WholeProgram && TheTriple.isNVPTX()) {
+      assert(!LinkerInputFiles.empty() && "No non-RDC image to embed");
----------------
Should `EmbedBitcode` be mutually exclusive vs `WholeArchive`? 

Can we ever end up with both unset? 

================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1210
+        LinkedImages.emplace_back(Kind, TheTriple.getTriple(), File.Arch,
+                                  LinkerInputFiles.front());
       continue;
----------------
ditto about the assert on the number of inputs.

================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1347
+
+    auto FileOrErr = compileModule(M);
+    if (!FileOrErr)
----------------
The `M` contains generated registration glue at this point, right? It may be worth a comment to explain what is it that we're compiling here.

================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:271
+llvm::Error wrapCudaBinary(llvm::Module &M, llvm::ArrayRef<char> Images) {
+  return Error::success();
+}
----------------
A "not-implemented-yet/TODO" comment would be appropriate here.

================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.h:20
+
+/// Wrap the input fatbinary image into the module \p M as global symbols and
+/// registers the images with the CUDA runtime.
----------------
It should be either "Wraps/registers" or "Wrap/register".

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123810/new/

https://reviews.llvm.org/D123810