[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain
Gheorghe-Teodor Bercea via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Jun 1 06:47:28 PDT 2018
gtbercea added a comment.
> I disagree in this context because this patch currently means that static archives will only work with NVPTX and there is no clear path how to "fix" things for other offloading targets. I'll try to work on my proposal over the next few days (sorry, very busy week...), maybe I can put together a prototype of my idea.
Other toolchains can also have static linking if they:
1. ditch the clang-offload-bundler for generating/consuming object files.
2. implement a link step on the device toolchain which can identify the vendor-specific object file inside the host object file. (This is how the so-called "bundling" should have been done in the first place, not with a custom tool which limits the functionality of the compiler.) Identifying toolchain-specific objects/binaries is a task that belongs within the device-specific toolchain and SHOULD NOT be factored out, because you can't treat objects that are different by definition in the same way. Ignoring their differences leads to those objects not being linkable. On top of that, factoring out introduces custom object formats which only CLANG understands AND it introduces compilation steps that impede static linking.
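To make point 2 concrete, here is a rough sketch of what "identifying the vendor-specific object inside the host object" could look like. It is not code from this patch: it assumes the host object is a 64-bit little-endian ELF file and that the device image sits in a named section. The section name `.nv_fatbin` and the helper `find_section` are assumptions for illustration only.

```python
import struct

# ELF64 little-endian layouts (file header and section header, 64 bytes each).
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
SECTION_HEADER = struct.Struct("<IIQQQQIIQQ")


def find_section(obj: bytes, wanted: str):
    """Return the raw bytes of the section named `wanted`, or None if absent.

    Hypothetical helper: a device toolchain's link step could use something
    like this to pull its own image out of a host object file.
    """
    if len(obj) < ELF_HEADER.size or obj[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if obj[4] != 2 or obj[5] != 1:
        raise ValueError("sketch handles only 64-bit little-endian ELF")

    fields = ELF_HEADER.unpack_from(obj, 0)
    shoff, shentsize, shnum, shstrndx = fields[6], fields[11], fields[12], fields[13]
    if shnum == 0:
        return None

    headers = [
        SECTION_HEADER.unpack_from(obj, shoff + i * shentsize)
        for i in range(shnum)
    ]
    # File offset of the section-name string table.
    strtab_off = headers[shstrndx][4]

    for sh in headers:
        name_start = strtab_off + sh[0]
        name_end = obj.index(b"\0", name_start)
        name = obj[name_start:name_end].decode("ascii", errors="replace")
        if name == wanted:
            offset, size = sh[4], sh[5]
            return obj[offset:offset + size]
    return None
```

A device link step could call `find_section(host_object_bytes, ".nv_fatbin")` on every host object and archive member, and hand only the extracted images to the device linker. The host linker never has to understand the device format.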
I'm surprised you now disagree with this technique; when I first introduced you to it in an e-mail off-list, you agreed with it.
As for this patch, the only new CUDA tool it calls is the FATBINARY tool, which is invoked on the device-specific side of the toolchain, so you can't possibly object to that. The CUDA toolchain already calls FATBINARY, so it's not a novelty. That step is essential to making device-side objects identifiable by NVLINK (which we already call).
The only step you might object to is the partial linking step which, as I explained in my original post, I envisage will be improved over time as more toolchains support this scheme. I think this is a true solution to the problem. What you are proposing is a workaround that doesn't tackle the problem head-on.
Repository:
rC Clang
https://reviews.llvm.org/D47394