[llvm-dev] JIT compiling CUDA source code

Geoff Levner via llvm-dev llvm-dev at lists.llvm.org
Fri Nov 20 07:28:48 PST 2020


Thanks very much, Hal! At first glance, your code suggests this is a lot
more effort than I was hoping to put in, but I will give it a careful study.

Geoff

On Fri, Nov 20, 2020 at 3:17 PM Hal Finkel <hal.finkel.llvm at gmail.com>
wrote:

> This sounds very similar to what I had developed here:
> https://github.com/hfinkel/llvm-project-cxxjit/tree/cxxjit/clang --
> please look at the code in
> https://github.com/hfinkel/llvm-project-cxxjit/blob/cxxjit/clang/lib/CodeGen/JIT.cpp,
> etc., for an example of how you can get JIT'd CUDA kernels up and running.
>
>  -Hal
> On 11/19/20 12:10 PM, Geoff Levner via llvm-dev wrote:
>
> I have made a bit of progress... When compiling CUDA source code in
> memory, the Compilation instance returned by Driver::BuildCompilation()
> contains two clang Commands: one for the host and one for the CUDA device.
> I can execute both commands using EmitLLVMOnlyActions. I add the Module
> from the host compilation to my JIT as usual, but... what to do with the
> Module from the device compilation? If I just add it to the JIT, I get an
> error message like this:
>
>     Added modules have incompatible data layouts:
>         e-i64:64-i128:128-v16:16-v32:32-n16:32:64 (module) vs
>         e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128 (jit)
>
> Any suggestions as to what to do with the Module containing CUDA kernel
> code, so that the host Module can invoke it?
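>
> For what it's worth, a minimal sketch of one option (not necessarily
> what JIT.cpp does): the device Module targets NVPTX, so instead of
> adding it to the host JIT you can lower it to PTX text with an NVPTX
> TargetMachine and load that at runtime through the CUDA driver API
> (cuModuleLoadData). "DeviceModule" and the "sm_70" CPU here are
> placeholders:
>
>     #include "llvm/IR/LegacyPassManager.h"
>     #include "llvm/Support/TargetRegistry.h"
>     #include "llvm/Target/TargetMachine.h"
>
>     // Assumes the NVPTX backend has been initialized
>     // (LLVMInitializeNVPTXTarget() and friends).
>     std::string Error;
>     const llvm::Target *Tgt = llvm::TargetRegistry::lookupTarget(
>         DeviceModule->getTargetTriple(), Error);
>     std::unique_ptr<llvm::TargetMachine> TM(Tgt->createTargetMachine(
>         DeviceModule->getTargetTriple(), /*CPU=*/"sm_70",
>         /*Features=*/"", llvm::TargetOptions(), llvm::None));
>
>     // Run the codegen pipeline, capturing the PTX in a string.
>     llvm::SmallString<0> PTX;
>     llvm::raw_svector_ostream OS(PTX);
>     llvm::legacy::PassManager PM;
>     TM->addPassesToEmitFile(PM, OS, /*DwoOut=*/nullptr,
>                             llvm::CGFT_AssemblyFile);
>     PM.run(*DeviceModule);
>     // PTX now holds assembly text for cuModuleLoadData() / ptxas.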
>
> Geoff
>
> On Tue, Nov 17, 2020 at 6:39 PM Geoff Levner <glevner at gmail.com> wrote:
>
>> We have an application that allows the user to compile and execute C++
>> code on the fly, using Orc JIT v2, via the LLJIT class. And we would like
>> to extend it to allow the user to provide CUDA source code as well, for GPU
>> programming. But I am having a hard time figuring out how to do it.
>>
>> To JIT compile C++ code, we do basically as follows (a code sketch
>> appears after the list):
>>
>> 1. call Driver::BuildCompilation(), which returns a clang Command to
>> execute
>> 2. create a CompilerInvocation using the arguments from the Command
>> 3. create a CompilerInstance around the CompilerInvocation
>> 4. use the CompilerInstance to execute an EmitLLVMOnlyAction
>> 5. retrieve the resulting Module from the action and add it to the JIT
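>>
>> A rough sketch of those five steps (clang 11-era APIs; assumes Diags,
>> an argv-style Args array, and an LLJIT instance JIT already exist, and
>> error handling is omitted):
>>
>>     #include "clang/CodeGen/CodeGenAction.h"
>>     #include "clang/Driver/Compilation.h"
>>     #include "clang/Driver/Driver.h"
>>     #include "clang/Frontend/CompilerInstance.h"
>>     #include "llvm/ExecutionEngine/Orc/LLJIT.h"
>>
>>     // 1. build the compilation and take its single clang Command
>>     clang::driver::Driver D("clang++",
>>                             llvm::sys::getDefaultTargetTriple(), Diags);
>>     std::unique_ptr<clang::driver::Compilation> C(
>>         D.BuildCompilation(Args));
>>     const clang::driver::Command &Cmd = *C->getJobs().begin();
>>
>>     // 2. turn the Command's arguments into a CompilerInvocation
>>     auto Inv = std::make_shared<clang::CompilerInvocation>();
>>     clang::CompilerInvocation::CreateFromArgs(*Inv, Cmd.getArguments(),
>>                                               Diags);
>>
>>     // 3. build a CompilerInstance around the invocation
>>     clang::CompilerInstance Clang;
>>     Clang.setInvocation(std::move(Inv));
>>     Clang.createDiagnostics();
>>
>>     // 4. emit LLVM IR in memory instead of running the backend
>>     auto Ctx = std::make_unique<llvm::LLVMContext>();
>>     clang::EmitLLVMOnlyAction Act(Ctx.get());
>>     if (Clang.ExecuteAction(Act)) {
>>         // 5. hand the resulting Module to the JIT
>>         llvm::cantFail(JIT->addIRModule(llvm::orc::ThreadSafeModule(
>>             Act.takeModule(),
>>             llvm::orc::ThreadSafeContext(std::move(Ctx)))));
>>     }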
>>
>> But compiling C++ requires only a single clang command. Adding CUDA to
>> the equation adds several more steps. If you compile with the clang
>> front end, clang does the following (see the sketch after this list):
>>
>> 1. compiles the device source code to PTX
>> 2. compiles the resulting PTX code using the CUDA ptxas command
>> 3. builds a "fat binary" using the CUDA fatbinary command
>> 4. compiles the host source code and links in the fat binary
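>>
>> You can see those stages in the driver's job list. A small sketch
>> (reusing the Driver and Args from the previous snippet; the tool names
>> are indicative):
>>
>>     // With a .cu input, BuildCompilation() yields several Commands:
>>     // clang (device -> PTX), ptxas, fatbinary, then clang (host).
>>     std::unique_ptr<clang::driver::Compilation> C(
>>         D.BuildCompilation(Args));
>>     for (const clang::driver::Command &Cmd : C->getJobs())
>>         llvm::errs() << Cmd.getCreator().getName() << "\n";
>>
>> Passing --cuda-host-only or --cuda-device-only to the driver restricts
>> the job list to one side, which may help when replicating the pipeline
>> in memory.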
>>
>> So my question is: how do we replicate that process in memory, to
>> generate modules that we can add to our JIT?
>>
>> I am no CUDA expert, and not much of a clang expert either, so if anyone
>> out there can point me in the right direction, I would be grateful.
>>
>> Geoff
>>
>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>