<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <blockquote type="cite">My impression is that he actually uses nvcc
      to compile the CUDA kernels, not clang</blockquote>
    The constructor here makes it look very much as though the CUDA
    command-line options are added to a clang::CompilerInstance. I might be
    wrong, but you could try to follow the trace and see where it ends up:<br>
    <br>
<a class="moz-txt-link-freetext" href="https://github.com/root-project/cling/blob/master/lib/Interpreter/IncrementalCUDADeviceCompiler.cpp">https://github.com/root-project/cling/blob/master/lib/Interpreter/IncrementalCUDADeviceCompiler.cpp</a><br>
    <br>
    Disclaimer: I am not familiar with the details of Simeon's work, or with
    cling, or even with JITing CUDA :) Maybe Simeon can confirm or deny my
    guess.<br>
    <br>
    <br>
    On 22/11/2020 09:09, Vassil Vassilev wrote:<br>
    <blockquote type="cite"
      cite="mid:0649b677-c765-68ea-3d15-801413e6539d@gmail.com"> Adding
      Simeon in the loop for Cling and CUDA. </blockquote>
    Thanks, hi Simeon!<br>
    <br>
    <br>
    <div class="moz-cite-prefix">On 22/11/2020 09:22, Geoff Levner
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAHMBa1sOAaKtLxuHsCQRs_4CA+3DAgdo=d9SWtvr_F5LS8-jZw@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="auto">
        <div>Hi, Stefan.
          <div dir="auto"><br>
          </div>
          <div dir="auto">Yes, when compiling from the command line,
            clang does all the work for you transparently. But behind
            the scenes it performs two passes: one to compile source
            code for the host, and one to compile CUDA kernels. </div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">When compiling in memory, as far as I can
            tell, you have to perform those two passes yourself. And the
            CUDA pass produces a Module that is incompatible with the
            host Module. You cannot simply add it to the JIT. I don't
            know what to do with it. </div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">And yes, I did watch Simeon's presentation,
            but he didn't get into that level of detail (or if he did, I
            missed it). My impression is that he actually uses nvcc to
            compile the CUDA kernels, not clang, using his own parser to
            separate and adapt the source code... </div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">Thanks, </div>
          <div dir="auto">Geoff </div>
          <br>
          <br>
          <div class="gmail_quote">
            <div dir="ltr" class="gmail_attr">Le dim. 22 nov. 2020 à
              01:03, Stefan Gränitz <<a
                href="mailto:stefan.graenitz@gmail.com" target="_blank"
                rel="noreferrer" moz-do-not-send="true">stefan.graenitz@gmail.com</a>>
              a écrit :<br>
            </div>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div> Hi Geoff<br>
                <br>
                It looks like clang handles all of that for you: <a
                  href="https://llvm.org/docs/CompileCudaWithLLVM.html"
                  rel="noreferrer noreferrer" target="_blank"
                  moz-do-not-send="true">https://llvm.org/docs/CompileCudaWithLLVM.html</a><br>
                <br>
                And, probably related: CUDA support has been added to
                Cling and there was a presentation for it at the last
                Dev Meeting <a
                  href="https://www.youtube.com/watch?v=XjjZRhiFDVs"
                  rel="noreferrer noreferrer" target="_blank"
                  moz-do-not-send="true">https://www.youtube.com/watch?v=XjjZRhiFDVs</a><br>
                <br>
                Best,<br>
                Stefan<br>
                <br>
                <div>On 20/11/2020 12:09, Geoff Levner via llvm-dev
                  wrote:<br>
                </div>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div>Thanks for that, Valentin.</div>
                    <div><br>
                    </div>
                    <div>To be sure I understand what you are saying...
                      Assume we are talking about a single .cu file
                      containing both a C++ function and a CUDA kernel
                      that it invokes, using <<<>>>
                      syntax. Are you suggesting that we bypass clang
                      altogether and use the Nvidia API to compile and
                      install the CUDA kernel? If we do that, how will
                      the JIT-compiled C++ function find the kernel?</div>
                    <div><br>
                    </div>
                    <div>Geoff<br>
                    </div>
                  </div>
                  <br>
                  <div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Thu, Nov 19,
                      2020 at 6:34 PM Valentin Churavy <<a
                        href="mailto:v.churavy@gmail.com"
                        rel="noreferrer noreferrer" target="_blank"
                        moz-do-not-send="true">v.churavy@gmail.com</a>>
                      wrote:<br>
                    </div>
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px 0.8ex;border-left:1px solid
                      rgb(204,204,204);padding-left:1ex">
                      <div dir="ltr">
                        <div>It sounds like right now you are emitting an
                          LLVM module?<br>
                        </div>
                        <div>The best strategy is probably to emit a PTX
                          module and then pass that to the CUDA driver.
                          This is what we do on the Julia side in
                          CUDA.jl.</div>
                        <div><br>
                        </div>
                        <div>Nvidia has a somewhat helpful tutorial on
                          this at <a
href="https://github.com/NVIDIA/cuda-samples/blob/c4e2869a2becb4b6d9ce5f64914406bf5e239662/Samples/vectorAdd_nvrtc/vectorAdd.cpp"
                            rel="noreferrer noreferrer" target="_blank"
                            moz-do-not-send="true">https://github.com/NVIDIA/cuda-samples/blob/c4e2869a2becb4b6d9ce5f64914406bf5e239662/Samples/vectorAdd_nvrtc/vectorAdd.cpp</a></div>
                        <div>and <a
href="https://github.com/NVIDIA/cuda-samples/blob/c4e2869a2becb4b6d9ce5f64914406bf5e239662/Samples/simpleDrvRuntime/simpleDrvRuntime.cpp"
                            rel="noreferrer noreferrer" target="_blank"
                            moz-do-not-send="true">https://github.com/NVIDIA/cuda-samples/blob/c4e2869a2becb4b6d9ce5f64914406bf5e239662/Samples/simpleDrvRuntime/simpleDrvRuntime.cpp</a></div>
                        <div><br>
                        </div>
                        <div>Hope that helps.</div>
                        <div>-V<br>
                        </div>
                        <div><br>
                        </div>
                      </div>
                      <br>
                      <div class="gmail_quote">
                        <div dir="ltr" class="gmail_attr">On Thu, Nov
                          19, 2020 at 12:11 PM Geoff Levner via llvm-dev
                          <<a href="mailto:llvm-dev@lists.llvm.org"
                            rel="noreferrer noreferrer" target="_blank"
                            moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
                          wrote:<br>
                        </div>
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
                          0.8ex;border-left:1px solid
                          rgb(204,204,204);padding-left:1ex">
                          <div dir="ltr">
                            <div>I have made a bit of progress... When
                              compiling CUDA source code in memory, the
                              Compilation instance returned by
                              Driver::BuildCompilation() contains two
                              clang Commands: one for the host and one
                              for the CUDA device. I can execute both
                              commands using EmitLLVMOnlyActions. I add
                              the Module from the host compilation to my
                              JIT as usual, but... what to do with the
                              Module from the device compilation? If I
                              just add it to the JIT, I get an error
                              message like this:</div>
                            <div><br>
                            </div>
                            <div>    Added modules have incompatible
                              data layouts:
                              e-i64:64-i128:128-v16:16-v32:32-n16:32:64
                              (module) vs
e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128
                              (jit)</div>
                            <div><br>
                            </div>
                            <div>Any suggestions as to what to do with
                              the Module containing CUDA kernel code, so
                              that the host Module can invoke it?</div>
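                            <div><br>
                            </div>
                            <div>One possible route, as a sketch rather
                              than a confirmed answer: don't add the device
                              Module to the (x86-64) JIT at all; lower it
                              to PTX text with an NVPTX TargetMachine and
                              hand that to the CUDA driver
                              (cuModuleLoadData). The triple and sm_70
                              below are illustrative:</div>
                            <pre>
#include "llvm/ADT/SmallString.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/TargetRegistry.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Target/TargetMachine.h"

// Sketch: lower the device-side Module to PTX text ("assembly" for NVPTX).
// Error handling elided; requires the NVPTX backend to be built into LLVM.
std::string deviceModuleToPTX(llvm::Module &M) {
  llvm::InitializeAllTargetInfos();
  llvm::InitializeAllTargets();
  llvm::InitializeAllTargetMCs();
  llvm::InitializeAllAsmPrinters();

  std::string Err;
  const llvm::Target *T =
      llvm::TargetRegistry::lookupTarget("nvptx64-nvidia-cuda", Err);
  llvm::TargetMachine *TM = T->createTargetMachine(
      "nvptx64-nvidia-cuda", "sm_70", "", llvm::TargetOptions(), llvm::None);
  M.setDataLayout(TM->createDataLayout());

  llvm::SmallString&lt;0&gt; PTX;
  llvm::raw_svector_ostream OS(PTX);
  llvm::legacy::PassManager PM;
  TM->addPassesToEmitFile(PM, OS, nullptr, llvm::CGFT_AssemblyFile);
  PM.run(M);
  return PTX.str().str();
}
</pre>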
                            <div><br>
                            </div>
                            <div>Geoff<br>
                            </div>
                            <br>
                            <div class="gmail_quote">
                              <div dir="ltr" class="gmail_attr">On Tue,
                                Nov 17, 2020 at 6:39 PM Geoff Levner
                                <<a href="mailto:glevner@gmail.com"
                                  rel="noreferrer noreferrer"
                                  target="_blank" moz-do-not-send="true">glevner@gmail.com</a>>
                                wrote:<br>
                              </div>
                              <blockquote class="gmail_quote"
                                style="margin:0px 0px 0px
                                0.8ex;border-left:1px solid
                                rgb(204,204,204);padding-left:1ex">
                                <div dir="ltr">
                                  <div>We have an application that
                                    allows the user to compile and
                                    execute C++ code on the fly, using
                                    Orc JIT v2, via the LLJIT class. And
                                    we would like to extend it to allow
                                    the user to provide CUDA source code
                                    as well, for GPU programming. But I
                                    am having a hard time figuring out
                                    how to do it.</div>
                                  <div><br>
                                  </div>
                                  <div>To JIT compile C++ code, we do
                                    basically as follows (a rough code
                                    sketch follows the list):</div>
                                  <div><br>
                                  </div>
                                  <div>1. call
                                    Driver::BuildCompilation(), which
                                    returns a clang Command to execute</div>
                                  <div>2. create a CompilerInvocation
                                    using the arguments from the Command</div>
                                  <div>3. create a CompilerInstance
                                    around the CompilerInvocation</div>
                                  <div>4. use the CompilerInstance to
                                    execute an EmitLLVMOnlyAction</div>
                                  <div>5. retrieve the resulting Module
                                    from the action and add it to the
                                    JIT</div>
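                                  <div><br>
                                  </div>
                                  <div>Rough sketch of those five steps
                                    (clang/LLVM 11-era APIs; the function
                                    name addCXXToJIT and the assumption
                                    that a DiagnosticsEngine, an LLJIT and
                                    a ThreadSafeContext already exist are
                                    mine):</div>
                                  <pre>
#include "clang/CodeGen/CodeGenAction.h"
#include "clang/Driver/Compilation.h"
#include "clang/Driver/Driver.h"
#include "clang/Frontend/CompilerInstance.h"
#include "llvm/ExecutionEngine/Orc/LLJIT.h"
#include "llvm/Support/Host.h"

void addCXXToJIT(llvm::orc::LLJIT &JIT, llvm::orc::ThreadSafeContext TSCtx,
                 clang::DiagnosticsEngine &Diags, const char *SourcePath) {
  // 1. Ask the driver what it would run for this file (one cc1 job for C++).
  clang::driver::Driver D("clang++", llvm::sys::getDefaultTargetTriple(), Diags);
  std::unique_ptr&lt;clang::driver::Compilation&gt; C(
      D.BuildCompilation({"clang++", "-fsyntax-only", SourcePath}));
  const clang::driver::Command &Cmd = *C->getJobs().begin();

  // 2./3. Turn the cc1 arguments into a CompilerInvocation and wrap it in a
  //       CompilerInstance.
  auto Inv = std::make_shared&lt;clang::CompilerInvocation&gt;();
  clang::CompilerInvocation::CreateFromArgs(*Inv, Cmd.getArguments(), Diags);
  clang::CompilerInstance CI;
  CI.setInvocation(std::move(Inv));
  CI.createDiagnostics();

  // 4. Run the EmitLLVMOnlyAction (produces a Module, no output file).
  clang::EmitLLVMOnlyAction Act(TSCtx.getContext());
  if (!CI.ExecuteAction(Act))
    return;

  // 5. Hand the resulting Module to the JIT.
  llvm::cantFail(JIT.addIRModule(
      llvm::orc::ThreadSafeModule(Act.takeModule(), TSCtx)));
}
</pre>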
                                  <div><br>
                                  </div>
                                  <div>But compiling C++ requires only a
                                    single clang command. When you add CUDA
                                    to the equation, you add several other
                                    steps. If you use the clang front end
                                    to compile, clang does the following (a
                                    sketch of inspecting these jobs follows
                                    the list):</div>
                                  <div><br>
                                  </div>
                                  <div>1. compiles the device source code,
                                    producing PTX<br>
                                  </div>
                                  <div>2. compiles the resulting PTX
                                    code using the CUDA ptxas command<br>
                                  </div>
                                  <div>3. builds a "fat binary" using
                                    the CUDA fatbinary command</div>
                                  <div>4. compiles the host source code
                                    and links in the fat binary</div>
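                                  <div><br>
                                  </div>
                                  <div>One way to see those steps
                                    concretely (a sketch; the file name,
                                    the sm_70 arch and the Diags variable
                                    are assumptions) is to ask the driver
                                    which jobs it would run for a .cu
                                    file:</div>
                                  <pre>
// With a default CUDA setup this prints something like:
// clang (device compile), ptxas, fatbinary, clang (host compile).
clang::driver::Driver D("clang++", llvm::sys::getDefaultTargetTriple(), Diags);
std::unique_ptr&lt;clang::driver::Compilation&gt; C(D.BuildCompilation(
    {"clang++", "-c", "--cuda-gpu-arch=sm_70", "kernel.cu"}));
for (const clang::driver::Command &Job : C->getJobs())
  llvm::errs() &lt;&lt; Job.getExecutable() &lt;&lt; "\n";
</pre>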
                                  <div><br>
                                  </div>
                                  <div>So my question is: how do we
                                    replicate that process in memory, to
                                    generate modules that we can add to
                                    our JIT?</div>
                                  <div><br>
                                  </div>
                                  <div>I am no CUDA expert, and not much
                                    of a clang expert either, so if
                                    anyone out there can point me in the
                                    right direction, I would be
                                    grateful.</div>
                                  <div><br>
                                  </div>
                                  <div>Geoff</div>
                                  <div><br>
                                  </div>
                                </div>
                              </blockquote>
                            </div>
                          </div>
_______________________________________________<br>
                          LLVM Developers mailing list<br>
                          <a href="mailto:llvm-dev@lists.llvm.org"
                            rel="noreferrer noreferrer" target="_blank"
                            moz-do-not-send="true">llvm-dev@lists.llvm.org</a><br>
                          <a
                            href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
                            rel="noreferrer noreferrer noreferrer"
                            target="_blank" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
                        </blockquote>
                      </div>
                    </blockquote>
                  </div>
                  <br>
                  <fieldset></fieldset>
                  <pre>_______________________________________________
LLVM Developers mailing list
<a href="mailto:llvm-dev@lists.llvm.org" rel="noreferrer noreferrer" target="_blank" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer noreferrer" target="_blank" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
                </blockquote>
                <pre cols="72">-- 
<a href="https://flowcrypt.com/pub/stefan.graenitz@gmail.com" rel="noreferrer noreferrer" target="_blank" moz-do-not-send="true">https://flowcrypt.com/pub/stefan.graenitz@gmail.com</a></pre>
              </div>
            </blockquote>
          </div>
        </div>
      </div>
    </blockquote>
    <pre class="moz-signature" cols="72">-- 
<a class="moz-txt-link-freetext" href="https://flowcrypt.com/pub/stefan.graenitz@gmail.com">https://flowcrypt.com/pub/stefan.graenitz@gmail.com</a></pre>
  </body>
</html>