[PATCH] D42922: [CUDA] Register relocatable GPU binaries
Jonas Hahnfeld via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Feb 16 10:36:24 PST 2018
Hahnfeld marked an inline comment as done.
Hahnfeld added inline comments.
================
Comment at: lib/CodeGen/CGCUDANV.cpp:330-331
// the GPU side.
for (const std::string &GpuBinaryFileName :
CGM.getCodeGenOpts().CudaGpuBinaryFileNames) {
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> GpuBinaryOrErr =
----------------
tra wrote:
> Hahnfeld wrote:
> > tra wrote:
> > > Hahnfeld wrote:
> > > > Can we actually have multiple GPU binaries here? If yes, how do I get there?
> > > Yes. `clang --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_50...` will compile for sm_35 and sm_50 and then will pass the names of GPU-side objects to the host compilation via `-fcuda-include-gpubinary`.
> > I'm not sure if that's true anymore: I think they are now combined by `fatbinary`. This seems to be confirmed by `test/Driver/cuda-options.cu`. If that was the only use case, we may try to get rid of this possibility, let me check this.
> You are correct. All GPU binaries are in a single fatbin now.
> That said, you could still pass extra `-fcuda-include-gpubinary` options to cc1 manually, but I see no practical reason to do so -- a single fatbin serves the purpose better.
>
> We should remove this loop and make `CGM.getCodeGenOpts().CudaGpuBinaryFileNames` a scalar.
>
OK, I'll do this in a preparatory patch and rebase this one on top. That actually explains why my changes have always worked even though they didn't handle the loop correctly :-)
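
For illustration, a rough sketch of what the scalar version might look like. This is not the actual preparation patch: `CodeGenOptsSketch`, the singular field name `CudaGpuBinaryFileName` and `loadGpuBinary` are hypothetical stand-ins, and plain `std::ifstream` is used here instead of `llvm::MemoryBuffer` so the snippet compiles on its own.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical stand-in for the relevant part of the codegen options.
struct CodeGenOptsSketch {
  // Was a list of names (CudaGpuBinaryFileNames); with a single fatbin,
  // one name suffices. An empty string means "no GPU binary given".
  std::string CudaGpuBinaryFileName;
};

// Reads the single GPU binary into Contents. Returns false if no file
// name was given or the file cannot be opened; no loop is needed.
bool loadGpuBinary(const CodeGenOptsSketch &Opts, std::string &Contents) {
  if (Opts.CudaGpuBinaryFileName.empty())
    return false;
  std::ifstream In(Opts.CudaGpuBinaryFileName, std::ios::binary);
  if (!In)
    return false;
  Contents.assign(std::istreambuf_iterator<char>(In),
                  std::istreambuf_iterator<char>());
  return true;
}
```

With that shape, the registration code would run at most once per module rather than once per loop iteration, which matches how the driver already behaves after `fatbinary` has combined the per-arch binaries.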
https://reviews.llvm.org/D42922