<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">On 7/11/20 12:58 AM, Richard Smith
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAOfiQq=r3uL8rJisi4QwG-RZ+RmTF4p4yZnD41Oc8fuM=hV9Gg@mail.gmail.com">
      <div dir="ltr">
        <div dir="ltr">On Fri, 10 Jul 2020 at 13:59, Vassil Vassilev via
          cfe-dev &lt;<a href="mailto:cfe-dev@lists.llvm.org"
            moz-do-not-send="true">cfe-dev@lists.llvm.org</a>&gt; wrote:<br>
        </div>
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div>
              <div>Hi Richard,</div>
              <div><br>
              </div>
              <div>On 7/10/20 11:10 PM, Richard Smith wrote:<br>
              </div>
              <blockquote type="cite">
                <div dir="ltr">Hi Vassil,
                  <div><br>
                  </div>
                  <div>This is a very exciting proposal that I can
                    imagine bringing important benefits to the existing
                    cling users and also to the clang user and developer
                    community. Thank you for all the work you and your
                    team have done on cling so far and for offering to
                    bring that work under the LLVM umbrella!</div>
                  <div><br>
                  </div>
                  <div>Are you imagining cling being part of the clang
                    repository, or a separate LLVM subproject (with only
                    the changes necessary to support cling-style uses of
                    the clang libraries added to the clang tree)?</div>
                </div>
              </blockquote>
              <p><br>
              </p>
              <p>  Good question. In principle, cling was developed with
                the idea that it would become a separate LLVM subproject,
                although I could easily see it fitting in clang/tools/.<br>
              </p>
              <p><br>
              </p>
              <p>  Nominally, cling has "high-energy physics"-specific
                features such as the so-called 'meta commands'. For
                example, `[cling] .L some_file` would try to load a
                library called some_file.so and if it does not exist,
                try #include-ing a header with that name; `[cling] .x
                script.C` includes script.C and calls a function named
                `script`. I can imagine that the broader community may not
                like or use that. If we start trimming down features like
                that then it won't really be cling anymore. Here is what
                I would imagine as a way forward:</p>
              <p>  1. Land as many cling/"incremental
                compilation"-related patches as we can in clang.<br>
                  2. Build a simple tool, let's use a strawman name --
                clang-repl, which only does the basics. For example, one
                can feed it incremental C++ and execute it.<br>
                  3. Rework cling to use that infrastructure -- ideally,
                implementing its specific meta commands and other
                domain-specific features such as dynamic scopes.</p>
              <p>  We could move any of the cling features which the
                broader community finds useful closer to clang. For the
                moment I am being conservative as this will also give us
                the opportunity to rethink some of the features.</p>
              <p>  The hard part is what lives where. First bullet point
                is clear. The second -- not so much. Clang has a
                clang-interpreter in its examples folder and it looks a
                little unmaintained. Maybe we can start repurposing that
                to match 2.</p>
              <p>  As for cling itself, there are some challenges we
                should try to solve. Our community lives downstream
                (currently llvm-5) and a straightforward llvm upgrade +
                bugfixing takes around 3 months due to the nature of our
                software stacks. It would be a non-trivial task to move
                the cling-based development to llvm upstream. My worry
                is that HEP-cling will soon diverge from LLVM-cling if we
                don't get both communities on the same codebase (we have
                experienced such a problem with the getFullyQualified*
                interfaces). I am hoping that a middleman, such as
                clang-repl, can help. When we move parts of cling into
                clang, we will develop and test the required
                functionality using clang-repl. This way users will
                enjoy a cling-like experience and, when cling upgrades its
                llvm, its codebase will become smaller.</p>
              <p>  Am I making sense?</p>
            </div>
          </blockquote>
          <div>Yes, the above all makes sense to me. I agree that there
            should be only one thing named 'cling', and that it should
            broadly have the feature set that current 'cling' has. I
            think there are a couple of ways we can get there while
            still providing a minimalist interpreter to a broader
            audience: either we can build a simpler clang-interpreter
            and a more advanced cling binary from a common set of
            libraries, or we could produce a configurable binary that's
            able to serve both roles depending on configuration or a
            plugin or scripting system.</div>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>  Good point. We could make it extensible, and actually that
      should be a design goal. How exactly to do that is not very
      clear to me. Can you elaborate on what you had in mind as a
      configuration or scripting system? (I think I know what you
      meant by a plugin system.) I will give an example with 3 distinct
      features in cling which we have implemented over the years and
      which had different requirements:</p>
    <p>  * <a moz-do-not-send="true"
        href="https://llvm.org/devmtg/2013-11/slides/Vassilev-Poster.pdf">AST-based
        automatic differentiation</a> with the <a
        moz-do-not-send="true" href="https://github.com/vgvassilev/clad">clad
        library</a> -- here we essentially extend cling's runtime by
      providing the `clad::differentiate`, `clad::gradient`,
      `clad::hessian` and `clad::jacobian` primitives. Each primitive is
      a specially annotated wrapper over a function, say `double
      pow2(double x) { return x*x; }; auto pow2dx =
      clad::differentiate(pow2,/*wrt*/0);`. Here we let clang build a
      valid AST, and the plugin creates the first-order derivative and
      swaps the DeclRefExpr just before codegen so that we call the
      derivative instead. This is achievable with the current clang plugin
      system (a bit problematic on Windows, as clang plugins do not work
      there).</p>
    <p>  * Language extensions which require Sema support -- we have a
      legacy feature which defines a variable on the prompt if it is not
      already defined (something like an implicit auto): `cling[] i = 13;`
      should be translated into `cling[] auto i = 13;` if `i` is
      undefined. We solve that by adding a last-resort lookup callback
      which marks `i` as having a dependent type so that we can produce
      an AST which we can later 'fix'.</p>
    <p>  * Language extensions which require delayed lookup rules (aka
      dynamic scope) -- ROOT has an I/O system bound to cling, so people
      can write: `if (TFile::Open("file_that_has_hist_cpp_obj.root"))
      hist->Draw();`. Here we use the approach from the previous
      bullet and synthesize `if
      (TFile::Open("file_that_has_hist_cpp_obj.root"))
      eval&lt;void&gt;("hist->Draw()", /*escape some context*/...);`.</p>
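    <p>  To make the first bullet a bit more concrete, here is a rough,
      hand-written sketch of what the substitution amounts to for
      `pow2`. It is not output from clad: the names `pow2_dx` and
      `central_difference` are made up for illustration, and the plugin
      derives the function from the AST, so its actual naming and
      signatures may differ.</p>
    <pre>
```cpp
// Original function from the example in the first bullet.
double pow2(double x) { return x * x; }

// Hand-written stand-in for the first-order derivative of pow2 with
// respect to parameter 0. The name pow2_dx is hypothetical; a plugin
// would build this from the AST instead of from source text.
double pow2_dx(double x) { return x + x; }

// Central finite difference, used here only to cross-check the
// hand-written derivative numerically. This helper is an assumption of
// the sketch and is not part of any library.
double central_difference(double (*f)(double), double x, double h) {
  return (f(x + h) - f(x - h)) / (2.0 * h);
}
```
    </pre>
    <p>  With this in place, a call that the user wrote against the
      wrapper would end up calling `pow2_dx`, and `pow2_dx(3.0)` should
      agree with `central_difference(pow2, 3.0, 1e-3)` up to rounding.</p>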
    <p><br>
    </p>
    <p>  The implementation of these three features is possible with
      current clang. The issue is that it feels more like hacking clang
      than extending it. If we can come up with a sound way of
      implementing these features, that would be awesome.<br>
    </p>
    <p><br>
    </p>
    <blockquote type="cite"
cite="mid:CAOfiQq=r3uL8rJisi4QwG-RZ+RmTF4p4yZnD41Oc8fuM=hV9Gg@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_quote">
          <div><br>
          </div>
          <div>One other thing I think we should consider: there will be
            substantial overlap between the incremental compilation,
            code generation, REPL, etc. of cling and that of lldb.</div>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>  I would love to hear opinions from the lldb folks. We have
      chatted a number of times and I have looked at how they do it. I
      think lldb spawns (or used to spawn, last time I looked) a compiler
      instance per input line. That is not acceptable for cling due to
      its high-performance requirements. Most of the issues that need
      solving for lldb come from materializing debug information into
      an AST. LLDB folks, correct me if I am wrong.</p>
    <p>  That said, it does not mean that we should not aim to
      centralize the incremental compilation for both projects. We
      should, but it may be challenging because each project's focus
      defines different priorities.<br>
    </p>
    <p><br>
    </p>
    <blockquote type="cite"
cite="mid:CAOfiQq=r3uL8rJisi4QwG-RZ+RmTF4p4yZnD41Oc8fuM=hV9Gg@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_quote">
          <div> For the initial integration of cling into LLVM, there's
            probably not much we can do about that, but it would seem
            beneficial for both cling and lldb if common parts could be
            shared where possible. As an extreme example, if we could
            fully unify the projects to the point where a user could
            switch into an 'lldb mode' in the middle of a cling session
            to do step-by-step debugging of code entered into the REPL,
            that would seem like an incredibly useful feature. Perhaps
            there's some common set of base functionality that can be
            factored out of lldb and cling and unified. It would likely
            be a good idea to start talking to the lldb folks about that
            early, in case it guides your work porting cling to trunk.</div>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
    <p>  Indeed. There have been user requests to be able to run
      step-by-step in cling. That would be the ultimate long-term goal!<br>
    </p>
    <p><br>
    </p>
    <blockquote type="cite"
cite="mid:CAOfiQq=r3uL8rJisi4QwG-RZ+RmTF4p4yZnD41Oc8fuM=hV9Gg@mail.gmail.com">
      <div dir="ltr">
        <div class="gmail_quote">
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex">
            <div>
              <blockquote type="cite">
                <div class="gmail_quote">
                  <div dir="ltr" class="gmail_attr">On Thu, 9 Jul 2020
                    at 13:46, Vassil Vassilev via cfe-dev &lt;<a
                      href="mailto:cfe-dev@lists.llvm.org"
                      target="_blank" moz-do-not-send="true">cfe-dev@lists.llvm.org</a>&gt;
                    wrote:<br>
                  </div>
                  <blockquote class="gmail_quote" style="margin:0px 0px
                    0px 0.8ex;border-left:1px solid
                    rgb(204,204,204);padding-left:1ex">Motivation<br>
                    ===<br>
                    <br>
                    Over the last decade we have developed an
                    interactive, interpretative <br>
                    C++ (aka REPL) as part of the high-energy physics
                    (HEP) data analysis <br>
                    project -- ROOT [1-2]. We invested a significant 
                    effort to replace the <br>
                    CINT C++ interpreter with a newly implemented REPL
                    based on llvm -- <br>
                    cling [3]. The cling infrastructure is a core
                    component of the data <br>
                    analysis framework of ROOT and has been running in
                    production for approximately 5 <br>
                    years.<br>
                    <br>
                    Cling is also  a standalone tool, which has a
                    growing community outside <br>
                    of our field. Cling’s user community includes users
                    in finance, biology <br>
                    and in a few companies with proprietary software.
                    For example, there is <br>
                    a xeus-cling jupyter kernel [4]. One of the major
                    challenges we face in <br>
                    fostering that community is that our cling-related patches
                    live in llvm and clang <br>
                    forks. The benefits of using the LLVM community
                    standards for code <br>
                    reviews, release cycles and integration have been
                    mentioned a number of <br>
                    times by our "external" users.<br>
                    <br>
                    Last year we were awarded an NSF grant to improve
                    cling's sustainability <br>
                    and make it a standalone tool. We thank the LLVM
                    Foundation Board for <br>
                    supporting us with a non-binding letter of
                    collaboration which was <br>
                    essential for getting this grant.<br>
                    <br>
                    <br>
                    Background<br>
                    ===<br>
                    <br>
                    Cling is a C++ interpreter built on top of clang and
                    llvm. In a <br>
                    nutshell, it uses clang's incremental compilation
                    facilities to process <br>
                    code chunk-by-chunk by assuming an ever-growing
                    translation unit [5]. <br>
                    Then code is lowered into llvm IR and run by the
                    llvm jit. Cling has <br>
                    implemented some language "extensions" such as
                    executing statements at <br>
                    global scope and error recovery. Cling is at the
                    core of HEP -- it <br>
                    is heavily used during data analysis of exabytes of
                    particle physics <br>
                    data coming from the Large Hadron Collider (LHC) and
                    other particle <br>
                    physics experiments.<br>
                    <br>
                    <br>
                    Plans<br>
                    ===<br>
                    <br>
                    The project foresees three main directions -- move
                    parts of cling <br>
                    upstream along with the clang and llvm features that
                    enable them; extend <br>
                    and generalize the language interoperability layer
                    around cling; and <br>
                    extend and generalize the OpenCL/CUDA support in
                    cling. We are at the <br>
                    early stages of the project and this email intends
                    to be an RFC for the <br>
                    first part -- upstreaming parts of cling. Please do
                    share your thoughts <br>
                    on the rest, too.<br>
                    <br>
                    <br>
                    Moving Parts of Cling Upstream<br>
                    ---<br>
                    <br>
                    Over the years we have slowly moved some patches
                    upstream. However, we <br>
                    still have around 100 patches in the clang fork.
                    Most of them are in the <br>
                    context of extending the incremental compilation
                    support for clang. The <br>
                    incremental compilation poses some challenges in the
                    clang <br>
                    infrastructure. For example, we need to tune CodeGen
                    to work with <br>
                    multiple llvm::Module instances, and finalize per
                    each <br>
                    end-of-translation unit (we have multiple of them).
                    Other changes <br>
                    include small adjustments in the FileManager's
                    caching mechanism, and <br>
                    bug fixes in the SourceManager (code which can be
                    reached mostly from <br>
                    within our setup). One conclusion we can draw from
                    our research is that <br>
                    the clang infrastructure fits amazingly well to
                    something which was not <br>
                    its main use case. The grand total of our diffs
                    against clang-9 is: `62 <br>
                    files changed, 1294 insertions(+), 231
                    deletions(-)`. Cling is currently <br>
                    being upgraded from llvm-5 to llvm-9.<br>
                    <br>
                    A major weakness of cling's infrastructure is that
                    it does not work with <br>
                    the clang Action infrastructure due to the lack of
                    an <br>
                    IncrementalAction.  A possible way forward would be
                    to implement a <br>
                    clang::IncrementalAction as a starting point. This
                    way we should be able <br>
                    to reduce the amount of setup necessary to use the
                    incremental <br>
                    infrastructure in clang. However, this will be a bit
                    of a testing <br>
                    challenge -- cling lives downstream and some of the
                    new code may be <br>
                    impossible to pick straight away and use. Building a
                    mainline example <br>
                    tool such as clang-repl, which gives us a way to test
                    the incremental <br>
                    case, or repurposing the already existing
                    clang-interpreter may be able to <br>
                    address the issue. The major risk of the task is
                    ending up with code in the <br>
                    clang mainline which is untested by its HEP
                    production environment.<br>
                    There are several other types of patches to the ROOT
                    fork of Clang, <br>
                    including ones in the context of
                    performance, C++ modules <br>
                    support (D41416), and storage (does not have a patch
                    yet but has an open <br>
                    projects entry and somebody working on it). These
                    patches can be <br>
                    considered in parallel, independently of the rest.<br>
                    <br>
                    Extend and Generalize the Language Interoperability
                    Layer Around Cling<br>
                    ---<br>
                    <br>
                    HEP has extensive experience with on-demand python
                    interoperability <br>
                    using cppyy[6], which is built around the type
                    information provided by <br>
                    cling. Unlike tools with custom parsers such as swig
                    and sip and tools <br>
                    built on top of C-APIs such as boost.python and
                    pybind11, cling can <br>
                    provide information about memory management patterns
                    (e.g. refcounting) <br>
                    and instantiate templates on the fly. We feel that
                    this functionality may not <br>
                    be of general interest to the llvm community but we
                    will prepare another <br>
                    RFC and send it here later on to gather feedback.<br>
                    <br>
                    <br>
                    Extend and Generalize the OpenCL/CUDA Support in
                    Cling<br>
                    ---<br>
                    <br>
                    Cling can incrementally compile CUDA code [7-8]
                    allowing easier set up <br>
                    and enabling some interesting use cases. There are a
                    number of planned <br>
                    improvements including talking to HIP [9] and SYCL
                    to support more <br>
                    hardware architectures.<br>
                    <br>
                    <br>
                    <br>
                    The primary focus of our work is to upstream the
                    functionality required <br>
                    to build an incremental compiler and to rework cling
                    to build against vanilla <br>
                    clang and llvm. The last two points are to give the
                    scope of the work <br>
                    which we will be doing over the next 2-3 years. We will
                    send RFCs here for <br>
                    both of them to trigger technical discussion if
                    there is interest in <br>
                    pursuing this direction.<br>
                    <br>
                    <br>
                    Collaboration<br>
                    ===<br>
                    <br>
                    Open source development nowadays relies on
                    reviewers. LLVM is no <br>
                    different and we will probably disturb a good number
                    of people in the <br>
                    community ;) We would like to invite anybody
                    interested in joining our <br>
                    incremental C++ activities to our open calls, held
                    every second week. <br>
                    Announcements will be made via the Google group
                    compiler-research-announce <br>
                    (<a
                      href="https://groups.google.com/g/compiler-research-announce"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://groups.google.com/g/compiler-research-announce</a>).<br>
                    <br>
                    <br>
                    <br>
                    Many thanks!<br>
                    <br>
                    <br>
                    David &amp; Vassil<br>
                    <br>
                    References<br>
                    ===<br>
                    [1] ROOT GitHub <a
                      href="https://github.com/root-project/root"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://github.com/root-project/root</a><br>
                    [2] ROOT <a href="https://root.cern"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://root.cern</a><br>
                    [3] Cling <a
                      href="https://github.com/root-project/cling"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://github.com/root-project/cling</a><br>
                    [4] Xeus-Cling <br>
                    <a
href="https://blog.jupyter.org/xeus-is-now-a-jupyter-subproject-c4ec5a1bf30b"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://blog.jupyter.org/xeus-is-now-a-jupyter-subproject-c4ec5a1bf30b</a><br>
                    [5] Cling – The New Interactive Interpreter for ROOT
                    6, <br>
                    <a
                      href="https://iopscience.iop.org/article/10.1088/1742-6596/396/5/052071"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://iopscience.iop.org/article/10.1088/1742-6596/396/5/052071</a><br>
                    [6] High-performance Python-C++ bindings with PyPy
                    and Cling, <br>
                    <a
                      href="https://dl.acm.org/doi/10.5555/3019083.3019087"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://dl.acm.org/doi/10.5555/3019083.3019087</a><br>
                    [7] <br>
                    <a
href="https://indico.cern.ch/event/697389/contributions/3085538/attachments/1712698/2761717/2018_09_10_cling_CUDA.pdf"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://indico.cern.ch/event/697389/contributions/3085538/attachments/1712698/2761717/2018_09_10_cling_CUDA.pdf</a><br>
                    [8] CUDA C++ in Jupyter: Adding CUDA Runtime Support
                    to Cling, <br>
                    <a
                      href="https://zenodo.org/record/3713753#.Xu8jqvJRXxU"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://zenodo.org/record/3713753#.Xu8jqvJRXxU</a><br>
                    [9] HIP Programming Guide <br>
                    <a
href="https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html</a><br>
                    <br>
                    _______________________________________________<br>
                    cfe-dev mailing list<br>
                    <a href="mailto:cfe-dev@lists.llvm.org"
                      target="_blank" moz-do-not-send="true">cfe-dev@lists.llvm.org</a><br>
                    <a
                      href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev"
                      rel="noreferrer" target="_blank"
                      moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
                  </blockquote>
                </div>
              </blockquote>
              <p><br>
              </p>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
  </body>
</html>