[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

Tue Jul 31 12:39:29 PDT 2018

sfantao added inline comments.

================
Comment at: include/clang/Driver/Compilation.h:312
+  /// \param skipBundler - bool value set once by the driver.
+  void setSkipOffloadBundler(bool skipBundler);
+
----------------
gtbercea wrote:
> sfantao wrote:
> > gtbercea wrote:
> > > sfantao wrote:
> > > > Why is this a property of the compilation and not of a set of actions referring to a given target? That would allow one to combine in the same compilation targets requiring the bundler and targets that wouldn't. 
> > > This was a way to pass this information to the OpenMP NVPTX device toolchain.
> > > 
> > > Both the Driver OpenMP NVPTX toolchain need to agree on the usage of the new scheme (proposed in this patch) or the old scheme (the one that is in the compiler today).
> > > 
> > > 
> > I understand, but the way I see it is that it is the toolchain that skips the bundler not the compilation. I understand that as of this patch, you skip only if there is a single nvptx target. If you have more than one target, as some tests do, some toolchains will still need the bundler. So, we are making what happens with the nvptx target dependent of other toolchains. Is this an intended effect of this patch?
> Bundler is skipped only for the OpenMP NVPTX toolchain. I'm not sure what you mean by "other toolchain".
Is skipped for the NVPTX toolchain if there are no "other" device toolchains requested. Say I have a working pipeline that does static linking with nvptx correctly. Then on top of that I add another device to `-fopenmp-targets`, that pipeline will now fail even for nvptx, right?

================
Comment at: lib/Driver/Compilation.cpp:276
+void Compilation::setSkipOffloadBundler(bool skipBundler) {
+  skipOffloadBundler = skipBundler;
+}
----------------
gtbercea wrote:
> sfantao wrote:
> > gtbercea wrote:
> > > sfantao wrote:
> > > > Given the logic you have below, you are assuming this is not set to false ever. It would be wise to get an assertion here in case you end up having toolchains skipping and others don't. If that is just not supported a diagnostic should be added instead.
> > > > 
> > > > The convention is that local variables use CamelCase.
> > > The checks I added in the Driver will set this flag to true if all toolchains Clang offloads to support the skipping of the bundler/unbundler for object files. Currently only NVPTX toolchain can skip the bundler/unbundler for object files so the code path in this patch will be taken only for:
> > > 
> > > -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda
> > Ok, if that is the case, just add an assertion here.
> If one of the toolchains in the list of toolchains can't skip then none of them skip. If all can skip then they all skip. What assertion would you like me to add?
If SkipOffloadBundler is set to true you don't expect it to be set to false afterwards, right? That should be asserted.

================
Comment at: lib/Driver/ToolChains/Cuda.cpp:496
+              ? CudaVirtualArchToString(VirtualArchForCudaArch(gpu_arch))
+              : GPUArch.str().c_str();
+      const char *PtxF =
----------------
gtbercea wrote:
> sfantao wrote:
> > Why don't create fatbins instead of cubins in all cases. For the purposes of OpenMP they are equivalent, i.e. nvlink can interpret them in the same way, no?
> I'm not sure why the comment is attached to this particular line in the code.
> 
> But the reason why I don't use fatbins everywhere is because I want to leave the previous toolchain intact. So when the bundler is not skipped we do precisely what we did before.
The comment was here, because this is where you generate the command to create the fatbin - no other intended meaning.

Given that fatbin can be linked with nvlink to get a device cubin the toolchain won't need to change regardless of whether bundling is used or not, for the bundler the device images are just bits. 

Repository:
  rC Clang

https://reviews.llvm.org/D47394