[llvm-dev] [cfe-dev] [RFC] Moving (parts of) the Cling REPL in Clang
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Thu Jul 9 19:25:21 PDT 2020
I think that it would be great to have infrastructure for incremental
C++ compilation, supporting interactive use, just-in-time compilation,
and so on. I think that the best way to deal with the patches, etc., as
well as IncrementalAction, is to first send an RFC explaining the
overall design.
-Hal
On 7/9/20 3:46 PM, Vassil Vassilev via cfe-dev wrote:
> Motivation
> ===
>
> Over the last decade we have developed an interactive, interpretative
> C++ (aka REPL) as part of the high-energy physics (HEP) data analysis
> project -- ROOT [1-2]. We invested a significant effort to replace
> the CINT C++ interpreter with a newly implemented REPL based on llvm
> -- cling [3]. The cling infrastructure is a core component of the data
> analysis framework of ROOT and has been running in production for
> approximately 5 years.
>
> Cling is also a standalone tool, which has a growing community
> outside of our field. Cling’s user community includes users in
> finance, biology and in a few companies with proprietary software. For
> example, there is a xeus-cling jupyter kernel [4]. One of the major
> challenges we face in fostering that community is the cling-related
> patches in our llvm and clang forks. The benefits of using the LLVM
> community standards for code reviews, release cycles and integration
> have been mentioned a number of times by our "external" users.
>
> Last year we were awarded an NSF grant to improve cling's
> sustainability and make it a standalone tool. We thank the LLVM
> Foundation Board for supporting us with a non-binding letter of
> collaboration which was essential for getting this grant.
>
>
> Background
> ===
>
> Cling is a C++ interpreter built on top of clang and llvm. In a
> nutshell, it uses clang's incremental compilation facilities to
> process code chunk-by-chunk by assuming an ever-growing translation
> unit [5]. The code is then lowered into llvm IR and run by the llvm
> jit. Cling has implemented some language "extensions" such as
> executing statements at the global scope and error recovery. Cling is
> at the core of HEP -- it is heavily used during data analysis of
> exabytes of particle physics data coming from the Large Hadron
> Collider (LHC) and other particle physics experiments.
>
>
> Plans
> ===
>
> The project foresees three main directions -- move parts of cling
> upstream along with the clang and llvm features that enable them;
> extend and generalize the language interoperability layer around
> cling; and extend and generalize the OpenCL/CUDA support in cling. We
> are at the early stages of the project and this email is intended as
> an RFC for the first part -- upstreaming parts of cling. Please do share
> your thoughts on the rest, too.
>
>
> Moving Parts of Cling Upstream
> ---
>
> Over the years we have slowly moved some patches upstream. However, we
> still have around 100 patches in the clang fork. Most of them are in
> the context of extending the incremental compilation support for
> clang. The incremental compilation poses some challenges in the clang
> infrastructure. For example, we need to tune CodeGen to work with
> multiple llvm::Module instances, and to finalize at each end of
> translation unit (we have multiple of them). Other changes
> include small adjustments in the FileManager's caching mechanism, and
> bug fixes in the SourceManager (code which can be reached mostly from
> within our setup). One conclusion we can draw from our research is
> that the clang infrastructure is amazingly well suited to something
> which was not its main use case. The grand total of our diffs against
> clang-9 is: `62 files changed, 1294 insertions(+), 231 deletions(-)`.
> Cling is currently being upgraded from llvm-5 to llvm-9.
>
> A major weakness of cling's infrastructure is that it does not work
> with the clang Action infrastructure due to the lack of an
> IncrementalAction. A possible way forward would be to implement a
> clang::IncrementalAction as a starting point. This way we should be
> able to reduce the amount of setup necessary to use the incremental
> infrastructure in clang. However, this will be a bit of a testing
> challenge -- cling lives downstream and some of the new code may be
> impossible to pick up and use straight away. Building a mainline
> example tool such as clang-repl, which would give us a way to test the
> incremental case, or repurposing the already existing
> clang-interpreter may address the issue. The major risk of the task is
> ending up with code in the clang mainline which is untested by the HEP
> production environment.
>
> There are several other types of patches to the ROOT fork of Clang,
> including ones in the context of performance, C++ modules support
> (D41416), and storage (there is no patch yet, but there is an open
> projects entry and somebody working on it). These patches can be
> considered in parallel, independently of the rest.
>
> Extend and Generalize the Language Interoperability Layer Around Cling
> ---
>
> HEP has extensive experience with on-demand python interoperability
> using cppyy [6], which is built around the type information provided by
> cling. Unlike tools with custom parsers such as swig and sip and tools
> built on top of C-APIs such as boost.python and pybind11, cling can
> provide information about memory management patterns (e.g.
> refcounting) and instantiate templates on the fly. We feel that this
> functionality may not be of general interest to the llvm community but
> we will prepare another RFC and send it here later on to gather
> feedback.
>
>
> Extend and Generalize the OpenCL/CUDA Support in Cling
> ---
>
> Cling can incrementally compile CUDA code [7-8], allowing easier setup
> and enabling some interesting use cases. There are a number of planned
> improvements including talking to HIP [9] and SYCL to support more
> hardware architectures.
>
>
>
> The primary focus of our work is to upstream the functionality required
> to build an incremental compiler and to rework cling to build against
> vanilla clang and llvm. The last two points are there to give the scope
> of the work which we will be doing over the next 2-3 years. We will
> send RFCs for both of them here to trigger technical discussion if
> there is interest in pursuing this direction.
>
>
> Collaboration
> ===
>
> Open source development nowadays relies on reviewers. LLVM is no
> different and we will probably disturb a good number of people in the
> community ;) We would like to invite anybody interested in joining our
> incremental C++ activities to our open calls, held every second week.
> Announcements will be made via the Google group:
> compiler-research-announce
> (https://groups.google.com/g/compiler-research-announce).
>
>
>
> Many thanks!
>
>
> David & Vassil
>
> References
> ===
> [1] ROOT GitHub https://github.com/root-project/root
> [2] ROOT https://root.cern
> [3] Cling https://github.com/root-project/cling
> [4] Xeus-Cling
> https://blog.jupyter.org/xeus-is-now-a-jupyter-subproject-c4ec5a1bf30b
> [5] Cling – The New Interactive Interpreter for ROOT 6,
> https://iopscience.iop.org/article/10.1088/1742-6596/396/5/052071
> [6] High-performance Python-C++ bindings with PyPy and Cling,
> https://dl.acm.org/doi/10.5555/3019083.3019087
> [7]
> https://indico.cern.ch/event/697389/contributions/3085538/attachments/1712698/2761717/2018_09_10_cling_CUDA.pdf
> [8] CUDA C++ in Jupyter: Adding CUDA Runtime Support to Cling,
> https://zenodo.org/record/3713753#.Xu8jqvJRXxU
> [9] HIP Programming Guide
> https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory