[PATCH] D29654: [OpenMP] Integrate OpenMP target region cubin into host binary

Fri Mar 31 13:18:11 PDT 2017

rnk added inline comments.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:324
+         "CUDA toolchain not expected for an OpenMP host device.");
+  if (JA.isDeviceOffloading(Action::OFK_OpenMP)) {
+    if (Output.isFilename()) {
----------------
This entire if block constructs a completely unrelated link line from the rest of the function. They should be separate functions. In fact, `CudaToolChain::buildLinker()` should probably return a different tool depending on whether you're using OpenMP or not, and this logic should be in a different tool. Maybe NVPTX::Linker should be called NVPTX::FatLink as well.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:388-394
+      const char *CopyExec = Args.MakeArgString(getToolChain().GetProgramPath(
+          C.getDriver().IsCLMode() ? "copy" : "cp"));
+      ArgStringList CopyCmdArgs;
+      CopyCmdArgs.push_back(II.getFilename());
+      CopyCmdArgs.push_back(CubinF);
+      C.addCommand(
+          llvm::make_unique<Command>(JA, *this, CopyExec, CopyCmdArgs, Inputs));
----------------
Hahnfeld wrote:
> Copying files around would be something new in Clang that someone more experienced has to weight. @rnk, @hfinkel?
I don't think `copy` is an actual binary on Windows, it's a builtin cmd command. At least, that's what where copy suggests. clang-cl can also run on Linux, so you probably want to check `#ifdef LLVM_ON_WIN32` for this.

Honestly, I'd prefer it if you instead found a way to get the driver to rename the inputs right before executing nvlink. We have cross-platform utilities in Support for doing renames. You can probably make some kind of custom Command to do this.

================
Comment at: test/Driver/openmp-offload.c:594
+/// Check cubin file generation and usage by nvlink
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=CHK-CUBIN %s
----------------
In this case, it would be nicer if you arranged for ptxas to output to a .cubin file.

================
Comment at: test/Driver/openmp-offload.c:601
+// CHK-CUBIN-NEXT: nvlink" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda" "{{.*}}" "{{.*}}-openmp-nvptx64-nvidia-cuda-{{.*}}.cubin"
+
+/// ###########################################################################
----------------
Can you add a test that links two input ptxas .o files into a binary? You can use `touch` to make the .o files.

Repository:
  rL LLVM

https://reviews.llvm.org/D29654