[cfe-dev] [RFC] Moving (parts of) the Cling REPL in Clang
Hal Finkel via cfe-dev
cfe-dev at lists.llvm.org
Fri Jul 10 13:55:44 PDT 2020
On 7/10/20 1:57 PM, Vassil Vassilev wrote:
> On 7/10/20 6:43 AM, JF Bastien wrote:
>> I like cling, and having it integrated with the rest of the project
>> would be neat. I agree with Hal’s suggestion to explain the design of
>> what remains. It sounds like a pretty small amount of code.
>
>
> JF, Hal, did you mean you want a design document of how cling works in
> general, or a design RFC for the patches we have? A design document for
> cling would be quite large and will take us some time to write up.
> OTOH, we could relatively easily give a rationale for each patch.
I had in mind something that's probably in between. Something that
explains the patches and enough about how they fit into a larger system
that we can reason about the context.
-Hal
>
>
>>
>>
>>> On Jul 9, 2020, at 7:25 PM, Hal Finkel via cfe-dev
>>> <cfe-dev at lists.llvm.org> wrote:
>>>
>>> I think that it would be great to have infrastructure for
>>> incremental C++ compilation, supporting interactive use,
>>> just-in-time compilation, and so on. I think that the best way to
>>> deal with the patches, etc., as well as IncrementalAction, is to
>>> first send an RFC explaining the overall design.
>>>
>>> -Hal
>>>
>>> On 7/9/20 3:46 PM, Vassil Vassilev via cfe-dev wrote:
>>>> Motivation
>>>> ===
>>>>
>>>> Over the last decade we have developed an interactive,
>>>> interpretative C++ (aka REPL) as part of the high-energy physics
>>>> (HEP) data analysis project -- ROOT [1-2]. We invested a
>>>> significant effort to replace the CINT C++ interpreter with a
>>>> newly implemented REPL based on llvm -- cling [3]. The cling
>>>> infrastructure is a core component of the data analysis framework
>>>> of ROOT and has been running in production for approximately 5
>>>> years.
>>>>
>>>> Cling is also a standalone tool, which has a growing community
>>>> outside of our field. Cling’s user community includes users in
>>>> finance, biology and in a few companies with proprietary software.
>>>> For example, there is a xeus-cling jupyter kernel [4]. One of the
>>>> major challenges we face to foster that community is our
>>>> cling-related patches in llvm and clang forks. The benefits of
>>>> using the LLVM community standards for code reviews, release cycles
>>>> and integration have been mentioned a number of times by our
>>>> "external" users.
>>>>
>>>> Last year we were awarded an NSF grant to improve cling's
>>>> sustainability and make it a standalone tool. We thank the LLVM
>>>> Foundation Board for supporting us with a non-binding letter of
>>>> collaboration which was essential for getting this grant.
>>>>
>>>>
>>>> Background
>>>> ===
>>>>
>>>> Cling is a C++ interpreter built on top of clang and llvm. In a
>>>> nutshell, it uses clang's incremental compilation facilities to
>>>> process code chunk-by-chunk by assuming an ever-growing translation
>>>> unit [5]. Then code is lowered into llvm IR and run by the llvm
>>>> jit. Cling has implemented some language "extensions" such as
>>>> execution statements on the global scope and error recovery. Cling
>>>> is at the core of HEP -- it is heavily used during data analysis of
>>>> exabytes of particle physics data coming from the Large Hadron
>>>> Collider (LHC) and other particle physics experiments.
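
To make the chunk-by-chunk model described above concrete, here is a minimal toy sketch in C++. It is not cling's implementation: every name in it (ToyIncrementalSession, ChunkKind, __toy_wrapper_N) is hypothetical, the caller has to say whether a chunk is a declaration or a statement, and nothing is actually compiled. It only shows one way a statement could be made legal at global scope (by wrapping it in a uniquely named function) while all accepted chunks accumulate into a single growing translation unit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: none of these names exist in cling or clang.
enum class ChunkKind { Declaration, Statement };

class ToyIncrementalSession {
public:
  // Accepts one input chunk and returns the text that would be handed to
  // the compiler for it. A statement is wrapped in a uniquely named
  // function so that it becomes legal at global scope.
  std::string process(ChunkKind kind, const std::string &code) {
    assert(!code.empty() && "empty chunks are not modelled");
    std::string lowered = code;
    if (kind == ChunkKind::Statement) {
      const std::string name =
          "__toy_wrapper_" + std::to_string(wrappers_.size());
      lowered = "void " + name + "() { " + code + " }";
      wrappers_.push_back(name);
    }
    chunks_.push_back(lowered);
    return lowered;
  }

  // The ever-growing translation unit: every chunk accepted so far.
  std::string translationUnit() const {
    std::string tu;
    for (const std::string &chunk : chunks_) {
      tu += chunk;
      tu += '\n';
    }
    return tu;
  }

  // Names of the wrapper functions a JIT would look up and call, in order.
  const std::vector<std::string> &wrappers() const { return wrappers_; }

private:
  std::vector<std::string> chunks_;
  std::vector<std::string> wrappers_;
};
```

Feeding the session the declaration `int x = 40;` followed by the statement `x += 2;` leaves the declaration untouched and turns the statement into a function named `__toy_wrapper_0`; a real tool would then lower only the new chunk to IR and have the JIT run the wrapper.
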
>>>>
>>>>
>>>> Plans
>>>> ===
>>>>
>>>> The project foresees three main directions -- move parts of cling
>>>> upstream along with the clang and llvm features that enable them;
>>>> extend and generalize the language interoperability layer around
>>>> cling; and extend and generalize the OpenCL/CUDA support in cling.
>>>> We are at the early stages of the project and this email intends to
>>>> be an RFC for the first part -- upstreaming parts of cling. Please
>>>> do share your thoughts on the rest, too.
>>>>
>>>>
>>>> Moving Parts of Cling Upstream
>>>> ---
>>>>
>>>> Over the years we have slowly moved some patches upstream. However,
>>>> we still have around 100 patches in the clang fork. Most of them
>>>> are in the context of extending the incremental compilation support
>>>> for clang. Incremental compilation poses some challenges for the
>>>> clang infrastructure. For example, we need to tune CodeGen to work
>>>> with multiple llvm::Module instances, and finalize at each end of
>>>> translation unit (we have multiple of them). Other changes
>>>> include small adjustments in the FileManager's caching mechanism,
>>>> and bug fixes in the SourceManager (code which can be reached
>>>> mostly from within our setup). One conclusion we can draw from our
>>>> research is that the clang infrastructure fits amazingly well to
>>>> something which was not its main use case. The grand total of our
>>>> diffs against clang-9 is: `62 files changed, 1294 insertions(+),
>>>> 231 deletions(-)`. Cling is currently being upgraded from llvm-5 to
>>>> llvm-9.
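
The "multiple modules, finalized once per end of translation unit" requirement mentioned above can be pictured with a small toy model. This is a sketch under stated assumptions, not clang's CodeGen: ToyModule and ToyCodeGen are hypothetical names, a "module" is just two lists of symbol names, and the only behaviour modelled is that each finalized chunk gets its own module while a symbol table survives across modules, so a later module refers to earlier definitions as externals.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for one finalized unit of generated code.
// Hypothetical; this is not llvm::Module.
struct ToyModule {
  std::vector<std::string> defined;   // symbols defined in this module
  std::vector<std::string> external;  // symbols owned by earlier modules
};

class ToyCodeGen {
public:
  // Records a definition in the module currently being built.
  void define(const std::string &symbol) {
    assert(finalized_.count(symbol) == 0 && "redefinition is not modelled");
    current_.defined.push_back(symbol);
  }

  // Records a use. Only symbols owned by an already finalized module need
  // an external declaration; uses within the current module stay local.
  void reference(const std::string &symbol) {
    if (finalized_.count(symbol) != 0)
      current_.external.push_back(symbol);
  }

  // Models the per-chunk "end of translation unit": hands the current
  // module over and starts a fresh one, keeping the symbol table.
  ToyModule finalize() {
    ToyModule done = current_;
    finalized_.insert(done.defined.begin(), done.defined.end());
    current_ = ToyModule{};
    return done;
  }

private:
  ToyModule current_;
  std::set<std::string> finalized_;
};
```

Defining `x` and finalizing, then defining `f` that references `x` and finalizing again, yields two modules: the first defines `x` with no externals, the second defines `f` and lists `x` as external. A generator that assumes a single module and a single end of translation unit would have no place for that second step.
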
>>>>
>>>> A major weakness of cling's infrastructure is that it does not work
>>>> with the clang Action infrastructure due to the lack of an
>>>> IncrementalAction. A possible way forward would be to implement a
>>>> clang::IncrementalAction as a starting point. This way we should be
>>>> able to reduce the amount of setup necessary to use the incremental
>>>> infrastructure in clang. However, this will be a bit of a testing
>>>> challenge -- cling lives downstream and some of the new code may be
>>>> impossible to pick straight away and use. Building a mainline
>>>> example tool such as clang-repl, which gives us a way to test the
>>>> incremental case, or repurposing the already existing
>>>> clang-interpreter may address the issue. The major risk of the
>>>> task is landing code in the clang mainline which is not exercised
>>>> by cling's HEP production environment.
>>>> There are several other types of patches to the ROOT fork of Clang,
>>>> including ones in the context of performance, towards C++ modules
>>>> support (D41416), and storage (does not have a patch yet but has an
>>>> open projects entry and somebody working on it). These patches can
>>>> be considered in parallel, independently of the rest.
>>>>
>>>> Extend and Generalize the Language Interoperability Layer Around Cling
>>>> ---
>>>>
>>>> HEP has extensive experience with on-demand python interoperability
>>>> using cppyy[6], which is built around the type information provided
>>>> by cling. Unlike tools with custom parsers such as swig and sip and
>>>> tools built on top of C-APIs such as boost.python and pybind11,
>>>> cling can provide information about memory management patterns (e.g.
>>>> refcounting) and instantiate templates on the fly. We feel that
>>>> this functionality may not be of general interest to the llvm community
>>>> but we will prepare another RFC and send it here later on to gather
>>>> feedback.
>>>>
>>>>
>>>> Extend and Generalize the OpenCL/CUDA Support in Cling
>>>> ---
>>>>
>>>> Cling can incrementally compile CUDA code [7-8] allowing easier set
>>>> up and enabling some interesting use cases. There are a number of
>>>> planned improvements including talking to HIP [9] and SYCL to
>>>> support more hardware architectures.
>>>>
>>>>
>>>>
>>>> The primary focus of our work is upstreaming the functionality
>>>> required to build an incremental compiler and reworking cling to
>>>> build against vanilla clang and llvm. The last two points are to
>>>> give the scope of the work which we will be doing over the next 2-3
>>>> years. We will send RFCs for both of them here to trigger technical
>>>> discussion if there is interest in pursuing this direction.
>>>>
>>>>
>>>> Collaboration
>>>> ===
>>>>
>>>> Open source development nowadays relies on reviewers. LLVM is no
>>>> different and we will probably disturb a good number of people in
>>>> the community ;) We would like to invite anybody interested in
>>>> joining our incremental C++ activities to our open calls, held
>>>> every second week. Announcements will be made via the google group
>>>> compiler-research-announce
>>>> (https://groups.google.com/g/compiler-research-announce).
>>>>
>>>>
>>>>
>>>> Many thanks!
>>>>
>>>>
>>>> David & Vassil
>>>>
>>>> References
>>>> ===
>>>> [1] ROOT GitHub https://github.com/root-project/root
>>>> [2] ROOT https://root.cern
>>>> [3] Cling https://github.com/root-project/cling
>>>> [4] Xeus-Cling
>>>> https://blog.jupyter.org/xeus-is-now-a-jupyter-subproject-c4ec5a1bf30b
>>>> [5] Cling – The New Interactive Interpreter for ROOT 6,
>>>> https://iopscience.iop.org/article/10.1088/1742-6596/396/5/052071
>>>> [6] High-performance Python-C++ bindings with PyPy and Cling,
>>>> https://dl.acm.org/doi/10.5555/3019083.3019087
>>>> [7]
>>>> https://indico.cern.ch/event/697389/contributions/3085538/attachments/1712698/2761717/2018_09_10_cling_CUDA.pdf
>>>> [8] CUDA C++ in Jupyter: Adding CUDA Runtime Support to Cling,
>>>> https://zenodo.org/record/3713753#.Xu8jqvJRXxU
>>>> [9] HIP Programming Guide
>>>> https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>> --
>>> Hal Finkel
>>> Lead, Compiler Technology and Programming Languages
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>>
>
>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory