[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

Tue May 29 11:08:36 PDT 2018

gtbercea added inline comments.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:536
+  }
 }

----------------
sfantao wrote:
> What prevents all this from being done in the bundler? If I understand it correctly, if the bundler implemented this wrapping, all the checks for libraries wouldn't be required, and only two changes would be required in the driver:
> 
> - Generate a fatbin instead of a cubin. This is straightforward to do by changing the device assembling job. In terms of the loading of the kernels by the device API, doing it through fatbin or cubin should be equivalent, except that fatbin enables storing the PTX format and JIT for newer GPUs.
> - Use the NVIDIA linker as the host linker.
> 
> This last requirement could be problematic if we get two targets attempting to use different (incompatible) linkers. If we get this kind of incompatibility we should get the appropriate diagnostic.
What prevents it is the fact that the bundler is called AFTER the HOST and DEVICE object files have been produced. The creation of the fatbin (FATBINARY + CLANG++) needs to happen within the NVPTX toolchain.

Repository:
  rC Clang

https://reviews.llvm.org/D47394