[llvm-dev] JIT compiling CUDA source code
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Fri Nov 20 06:17:44 PST 2020
This sounds very similar to what I had developed here:
https://github.com/hfinkel/llvm-project-cxxjit/tree/cxxjit/clang --
please look at the code in
https://github.com/hfinkel/llvm-project-cxxjit/blob/cxxjit/clang/lib/CodeGen/JIT.cpp,
etc., for an example of how you can get JIT'd CUDA kernels up and running.
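
Roughly speaking, the key point is that the device-side IR never goes
into the host JIT (hence the data-layout mismatch below); it gets
lowered to PTX and handed to the CUDA driver API, which JIT-compiles
it for the actual GPU. The runtime half looks something like this (a
minimal sketch, not the JIT.cpp code itself; the kernel name
"myKernel", the 1x1x1 launch geometry, and the omitted error checks
are all simplifications):

    #include <cuda.h>

    // Load JIT-generated PTX text and run a kernel from it. All
    // CUresult error checks are omitted for brevity.
    void launchFromPTX(const char *PTX) {
      cuInit(0);
      CUdevice Dev;
      cuDeviceGet(&Dev, 0);
      CUcontext Ctx;
      cuCtxCreate(&Ctx, 0, Dev);

      // The CUDA driver JIT-compiles the PTX for the present GPU here.
      CUmodule Mod;
      cuModuleLoadData(&Mod, PTX);

      CUfunction Fn;
      cuModuleGetFunction(&Fn, Mod, "myKernel");

      // Assumes a kernel taking no parameters; otherwise pass them
      // through the kernelParams array.
      cuLaunchKernel(Fn, /*grid*/ 1, 1, 1, /*block*/ 1, 1, 1,
                     /*sharedMemBytes*/ 0, /*stream*/ nullptr,
                     /*kernelParams*/ nullptr, /*extra*/ nullptr);
      cuCtxSynchronize();
    }
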
-Hal
On 11/19/20 12:10 PM, Geoff Levner via llvm-dev wrote:
> I have made a bit of progress... When compiling CUDA source code in
> memory, the Compilation instance returned by
> Driver::BuildCompilation() contains two clang Commands: one for the
> host and one for the CUDA device. I can execute both commands using
> EmitLLVMOnlyActions. I add the Module from the host compilation to my
> JIT as usual, but... what to do with the Module from the device
> compilation? If I just add it to the JIT, I get an error message like
> this:
>
> Added modules have incompatible data layouts:
> e-i64:64-i128:128-v16:16-v32:32-n16:32:64 (module) vs
> e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128
> (jit)
>
> Any suggestions as to what to do with the Module containing CUDA
> kernel code, so that the host Module can invoke it?
>
> Geoff
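
Regarding the data-layout error above: that is what happens when the
NVPTX module is handed to a JIT configured for the host target. Rather
than adding the device Module to the host JIT, one way to handle it is
to lower it to PTX with LLVM's NVPTX backend and feed the resulting
string to cuModuleLoadData() as in the sketch near the top of this
message. A rough sketch, assuming LLVM 11-era APIs (the sm_35 target
and the thin error handling are simplifications):

    #include "llvm/ADT/SmallString.h"
    #include "llvm/IR/LegacyPassManager.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/TargetRegistry.h"
    #include "llvm/Support/TargetSelect.h"
    #include "llvm/Support/raw_ostream.h"
    #include "llvm/Target/TargetMachine.h"

    // Lower the device-side Module to PTX text instead of JITing it.
    std::string emitPTX(llvm::Module &DeviceModule) {
      LLVMInitializeNVPTXTargetInfo();
      LLVMInitializeNVPTXTarget();
      LLVMInitializeNVPTXTargetMC();
      LLVMInitializeNVPTXAsmPrinter();

      std::string Err;
      const std::string Triple = "nvptx64-nvidia-cuda";
      const llvm::Target *T =
          llvm::TargetRegistry::lookupTarget(Triple, Err);

      // sm_35 is a placeholder; use the compute capability you target.
      llvm::TargetMachine *TM = T->createTargetMachine(
          Triple, "sm_35", "", llvm::TargetOptions(), llvm::None);
      DeviceModule.setDataLayout(TM->createDataLayout());

      llvm::SmallString<1024> PTX;
      llvm::raw_svector_ostream OS(PTX);
      llvm::legacy::PassManager PM;
      // For the NVPTX backend, "assembly" output is PTX text.
      if (TM->addPassesToEmitFile(PM, OS, nullptr,
                                  llvm::CGFT_AssemblyFile))
        return "";
      PM.run(DeviceModule);
      return std::string(PTX.str());
    }
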
>
> On Tue, Nov 17, 2020 at 6:39 PM Geoff Levner <glevner at gmail.com> wrote:
>
> We have an application that allows the user to compile and execute
> C++ code on the fly, using Orc JIT v2, via the LLJIT class. And we
> would like to extend it to allow the user to provide CUDA source
> code as well, for GPU programming. But I am having a hard time
> figuring out how to do it.
>
> To JIT compile C++ code, we do basically the following (a simplified
> code sketch follows the list):
>
> 1. call Driver::BuildCompilation(), which returns a clang Command
> to execute
> 2. create a CompilerInvocation using the arguments from the Command
> 3. create a CompilerInstance around the CompilerInvocation
> 4. use the CompilerInstance to execute an EmitLLVMOnlyAction
> 5. retrieve the resulting Module from the action and add it to the JIT
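>
> In code, that flow looks roughly like the following (simplified:
> diagnostics wiring and error handling are mostly omitted, and
> buildModule() is just an illustrative name, not our real API):
>
>     #include "clang/Basic/Diagnostic.h"
>     #include "clang/CodeGen/CodeGenAction.h"
>     #include "clang/Driver/Compilation.h"
>     #include "clang/Driver/Driver.h"
>     #include "clang/Frontend/CompilerInstance.h"
>     #include "clang/Frontend/CompilerInvocation.h"
>     #include "llvm/IR/LLVMContext.h"
>     #include "llvm/IR/Module.h"
>     #include <memory>
>
>     std::unique_ptr<llvm::Module>
>     buildModule(clang::driver::Driver &D,
>                 llvm::ArrayRef<const char *> Args,
>                 llvm::LLVMContext &Ctx,
>                 clang::DiagnosticsEngine &Diags) {
>       // 1. Ask the driver to plan the compilation.
>       std::unique_ptr<clang::driver::Compilation> C(
>           D.BuildCompilation(Args));
>       const clang::driver::Command &Cmd = *C->getJobs().begin();
>
>       // 2. Build a CompilerInvocation from the -cc1 arguments.
>       auto Inv = std::make_shared<clang::CompilerInvocation>();
>       clang::CompilerInvocation::CreateFromArgs(
>           *Inv, Cmd.getArguments(), Diags);
>
>       // 3. Wrap the invocation in a CompilerInstance.
>       clang::CompilerInstance Clang;
>       Clang.setInvocation(std::move(Inv));
>       Clang.createDiagnostics();
>
>       // 4. Run an EmitLLVMOnlyAction (IR generation only).
>       clang::EmitLLVMOnlyAction Action(&Ctx);
>       if (!Clang.ExecuteAction(Action))
>         return nullptr;
>
>       // 5. Take the Module; the caller adds it to the LLJIT.
>       return Action.takeModule();
>     }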
>
> But compiling C++ requires only a single clang command. When you
> add CUDA to the equation, several other steps appear. If you use
> the clang front end to compile, clang does the following (a way to
> inspect these jobs is sketched after the list):
>
> 1. compiles the device source code to PTX
> 2. assembles the resulting PTX using the CUDA ptxas command
> 3. builds a "fat binary" using the CUDA fatbinary command
> 4. compiles the host source code and links in the fat binary
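>
> (A handy way to see those steps concretely is to print the job list
> the driver builds; a small sketch, where D is a configured
> clang::driver::Driver and kernel.cu is a placeholder file name:)
>
>     // Dump the commands clang would run for a CUDA compile.
>     std::unique_ptr<clang::driver::Compilation> C(
>         D.BuildCompilation({"clang++", "-c", "kernel.cu"}));
>     C->getJobs().Print(llvm::errs(), "\n", /*Quote=*/true);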
>
> So my question is: how do we replicate that process in memory, to
> generate modules that we can add to our JIT?
>
> I am no CUDA expert, and not much of a clang expert either, so if
> anyone out there can point me in the right direction, I would be
> grateful.
>
> Geoff
>
>