[PATCH] D42922: [CUDA] Register relocatable GPU binaries

Fri Feb 16 10:32:51 PST 2018

tra added inline comments.

================
Comment at: lib/CodeGen/CGCUDANV.cpp:330-331
   // the GPU side.
   for (const std::string &GpuBinaryFileName :
        CGM.getCodeGenOpts().CudaGpuBinaryFileNames) {
     llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> GpuBinaryOrErr =
----------------
Hahnfeld wrote:
> tra wrote:
> > Hahnfeld wrote:
> > > Can we actually have multiple GPU binaries here? If yes, how do I get there?
> > Yes. `clang --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_50...` will compile for sm_35 and sm_50 and then will pass the names of GPU-side objects to the host compilation via `-fcuda-include-gpubinary`.
> I'm not sure if that's true anymore: I think they are now combined by `fatbinary`. This seems to be confirmed by `test/Driver/cuda-options.cu`. If that was the only use case, we may try to get rid of this possibility, let me check this.
You are correct. All GPU binaries are in the single fatbin now.
That said, you could still pass extra -fcuda-include-gpubinary to cc1 manually, but I see no practical reason to do it -- single fatbin serves the purpose better.

We should remove this loop and make CGM.getCodeGenOpts().CudaGpuBinaryFileNames a scalar.

https://reviews.llvm.org/D42922