[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

Tue Jun 5 15:48:41 PDT 2018

tra added a comment.

With the updated patch description + the discussion I'm OK with the approach from the general "how do we compile/use CUDA" point of view. I'll leave the question of whether the approach works for OpenMP to someone more familiar with it.

While I'm not completely convinced that [fatbin]->.c->[clang]->.o (with device code only)->[ld -r] -> host.o (host+device code) is ideal (things could be done with smaller number of tool invocations), it should help to deal with -rdc compilation until we get a chance to improve support for it in Clang. We may revisit and change this portion of the pipeline when clang can incorporate -rdc GPU binaries in a way compatible with CUDA tools.

Repository:
  rC Clang

https://reviews.llvm.org/D47394