[cfe-dev] [RFC] Moving (parts of) the Cling REPL in Clang

Fri Jul 10 13:59:22 PDT 2020

Hi Richard,

On 7/10/20 11:10 PM, Richard Smith wrote:
> Hi Vassil,
>
> This is a very exciting proposal that I can imagine bringing important 
> benefits to the existing cling users and also to the clang user and 
> developer community. Thank you for all the work you and your team have 
> done on cling so far and for offering to bring that work under the 
> LLVM umbrella!
>
> Are you imagining cling being part of the clang repository, or a 
> separate LLVM subproject (with only the changes necessary to support 
> cling-style uses of the clang libraries added to the clang tree)?

   Good question. In principle cling was developed with the idea of
becoming a separate LLVM subproject, although I could easily see it
fitting in clang/tools/.

   Nominally, cling has "high-energy physics"-specific features such as
the so-called 'meta commands'. For example, `[cling] .L some_file` would
try to load a library called some_file.so and, if it does not exist, try
#include-ing a header with that name; `[cling] .x script.C` includes
script.C and calls a function named `script`. I can imagine that the
broader community may not like or use that. If we start trimming down
features like that then it won't really be cling anymore. Here is what I
would imagine as a way forward:

   1. Land as many cling/"incremental compilation"-related patches as we 
can in clang.
   2. Build a simple tool, let's use a strawman name -- clang-repl, 
which only does the basics. For example, one can feed it incremental C++ 
and execute it.
   3. Rework cling to use that infrastructure -- ideally, implementing
its specific meta commands and other domain-specific features such as
dynamic scopes.
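
   To make the split between steps 2 and 3 concrete, here is a rough
sketch of how a front end might separate the two meta commands mentioned
above from plain incremental C++. It is only an illustration and not
cling's actual implementation: the names `Step`, `stem` and `planInput`,
and the `libraryExists` callback, are hypothetical, and the real lookup
rules in cling are more involved.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of two meta commands; NOT cling's real code.
// Each input line becomes a list of abstract steps for an interpreter.
struct Step {
  enum Kind { LoadLibrary, IncludeFile, CallFunction, CompileCode } kind;
  std::string arg;
};

// Strip directory and extension: "macros/script.C" -> "script".
inline std::string stem(const std::string &path) {
  std::size_t slash = path.find_last_of('/');
  std::size_t start = (slash == std::string::npos) ? 0 : slash + 1;
  std::size_t dot = path.find_last_of('.');
  if (dot == std::string::npos || dot <= start)
    dot = path.size();
  return path.substr(start, dot - start);
}

// libraryExists is an assumed callback that answers whether a shared
// library with the given file name can be found.
inline std::vector<Step>
planInput(const std::string &line,
          const std::function<bool(const std::string &)> &libraryExists) {
  const std::string loadCmd = ".L ";
  const std::string execCmd = ".x ";

  if (line.compare(0, loadCmd.size(), loadCmd) == 0) {
    // `.L name`: prefer name.so, otherwise fall back to #include-ing name.
    std::string name = line.substr(loadCmd.size());
    if (libraryExists(name + ".so"))
      return {Step{Step::LoadLibrary, name + ".so"}};
    return {Step{Step::IncludeFile, name}};
  }

  if (line.compare(0, execCmd.size(), execCmd) == 0) {
    // `.x script.C`: include the file, then call the function `script`.
    std::string file = line.substr(execCmd.size());
    return {Step{Step::IncludeFile, file},
            Step{Step::CallFunction, stem(file)}};
  }

  // Anything else is ordinary incremental C++ for the compiler.
  return {Step{Step::CompileCode, line}};
}
```

   In this picture only the last branch (hand the line to the incremental
compiler) would belong in something like clang-repl; the two meta-command
branches would stay on the cling side.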

   We could move any of the cling features which the broader community 
finds useful closer to clang. For the moment I am being conservative as 
this will also give us the opportunity to rethink some of the features.

   The hard part is what lives where. The first bullet point is clear.
The second -- not so much. Clang has a clang-interpreter in its examples
folder and it looks a little unmaintained. Maybe we can start
repurposing that to match point 2.

   As for cling itself there are some challenges we should try to solve.
Our community lives downstream (currently llvm-5) and a straightforward
llvm upgrade + bugfixing takes around 3 months due to the nature of our
software stacks. It would be a non-trivial task to move the cling-based
development to llvm upstream. My worry is that HEP-cling will soon
diverge from LLVM-cling if we don't get both communities on the same
codebase (we have experienced such a problem with the getFullyQualified*
interfaces). I am hoping that a middleman, such as clang-repl, can help.
When we move parts of cling into clang we will develop and test the
required functionality using clang-repl. This way users will enjoy a
cling-like experience and, when cling upgrades its llvm, its codebase
will become smaller in size.

   Am I making sense?

>
> On Thu, 9 Jul 2020 at 13:46, Vassil Vassilev via cfe-dev 
> <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>
>     Motivation
>     ===
>
>     Over the last decade we have developed an interactive, interpretative
>     C++ (aka REPL) as part of the high-energy physics (HEP) data analysis
>     project -- ROOT [1-2]. We invested significant effort to replace the
>     CINT C++ interpreter with a newly implemented REPL based on llvm --
>     cling [3]. The cling infrastructure is a core component of the data
>     analysis framework of ROOT and has been running in production for
>     approximately 5 years.
>
>     Cling is also a standalone tool, which has a growing community outside
>     of our field. Cling’s user community includes users in finance, biology
>     and in a few companies with proprietary software. For example, there is
>     a xeus-cling jupyter kernel [4]. One of the major challenges we face in
>     fostering that community is that our cling-related patches live in llvm
>     and clang forks. The benefits of using the LLVM community standards for
>     code reviews, release cycles and integration have been mentioned a
>     number of times by our "external" users.
>
>     Last year we were awarded an NSF grant to improve cling's sustainability
>     and make it a standalone tool. We thank the LLVM Foundation Board for
>     supporting us with a non-binding letter of collaboration which was
>     essential for getting this grant.
>
>
>     Background
>     ===
>
>     Cling is a C++ interpreter built on top of clang and llvm. In a
>     nutshell, it uses clang's incremental compilation facilities to process
>     code chunk-by-chunk by assuming an ever-growing translation unit [5].
>     Then code is lowered into llvm IR and run by the llvm jit. Cling has
>     implemented some language "extensions" such as execution statements on
>     the global scope and error recovery. Cling is at the core of HEP -- it
>     is heavily used during data analysis of exabytes of particle physics
>     data coming from the Large Hadron Collider (LHC) and other particle
>     physics experiments.
>
>
>     Plans
>     ===
>
>     The project foresees three main directions -- move parts of cling
>     upstream along with the clang and llvm features that enable them; extend
>     and generalize the language interoperability layer around cling; and
>     extend and generalize the OpenCL/CUDA support in cling. We are at the
>     early stages of the project and this email intends to be an RFC for the
>     first part -- upstreaming parts of cling. Please do share your thoughts
>     on the rest, too.
>
>
>     Moving Parts of Cling Upstream
>     ---
>
>     Over the years we have slowly moved some patches upstream. However, we
>     still have around 100 patches in the clang fork. Most of them are in the
>     context of extending the incremental compilation support for clang.
>     Incremental compilation poses some challenges for the clang
>     infrastructure. For example, we need to tune CodeGen to work with
>     multiple llvm::Module instances, and finalize at the end of each
>     translation unit (we have multiple of them). Other changes
>     include small adjustments in the FileManager's caching mechanism, and
>     bug fixes in the SourceManager (code which can be reached mostly from
>     within our setup). One conclusion we can draw from our research is that
>     the clang infrastructure fits amazingly well to something which was not
>     its main use case. The grand total of our diffs against clang-9 is:
>     `62 files changed, 1294 insertions(+), 231 deletions(-)`. Cling is
>     currently being upgraded from llvm-5 to llvm-9.
>
>     A major weakness of cling's infrastructure is that it does not work with
>     the clang Action infrastructure due to the lack of an
>     IncrementalAction. A possible way forward would be to implement a
>     clang::IncrementalAction as a starting point. This way we should be able
>     to reduce the amount of setup necessary to use the incremental
>     infrastructure in clang. However, this will be a bit of a testing
>     challenge -- cling lives downstream and some of the new code may be
>     impossible to pick up straight away and use. Building a mainline example
>     tool such as clang-repl, which gives us a way to test the incremental
>     case, or repurposing the already existing clang-interpreter may be able
>     to address the issue. The major risk of the task is ending up with code
>     in the clang mainline which is untested by its HEP production environment.
>     There are several other types of patches to the ROOT fork of Clang,
>     including ones in the context of performance, towards C++ modules
>     support (D41416), and storage (does not have a patch yet but has an open
>     projects entry and somebody working on it). These patches can be
>     considered in parallel, independently of the rest.
>
>     Extend and Generalize the Language Interoperability Layer Around Cling
>     ---
>
>     HEP has extensive experience with on-demand python interoperability
>     using cppyy [6], which is built around the type information provided by
>     cling. Unlike tools with custom parsers such as swig and sip and tools
>     built on top of C-APIs such as boost.python and pybind11, cling can
>     provide information about memory management patterns (e.g. refcounting)
>     and instantiate templates on the fly. We feel that this functionality
>     may not be of general interest to the llvm community but we will prepare
>     another RFC and send it here later on to gather feedback.
>
>
>     Extend and Generalize the OpenCL/CUDA Support in Cling
>     ---
>
>     Cling can incrementally compile CUDA code [7-8] allowing easier set up
>     and enabling some interesting use cases. There are a number of planned
>     improvements including talking to HIP [9] and SYCL to support more
>     hardware architectures.
>
>
>
>     The primary focus of our work is upstreaming the functionality required
>     to build an incremental compiler and reworking cling to build against
>     vanilla clang and llvm. The last two points are to give the scope of the
>     work which we will be doing over the next 2-3 years. We will send RFCs
>     here for both of them to trigger technical discussion if there is
>     interest in pursuing this direction.
>
>
>     Collaboration
>     ===
>
>     Open source development nowadays relies on reviewers. LLVM is no
>     different and we will probably disturb a good number of people in the
>     community ;) We would like to invite anybody interested in joining our
>     incremental C++ activities to our open calls, held every second week.
>     Announcements will be made via the google group
>     compiler-research-announce
>     (https://groups.google.com/g/compiler-research-announce).
>
>
>
>     Many thanks!
>
>
>     David & Vassil
>
>     References
>     ===
>     [1] ROOT GitHub https://github.com/root-project/root
>     [2] ROOT https://root.cern
>     [3] Cling https://github.com/root-project/cling
>     [4] Xeus-Cling
>     https://blog.jupyter.org/xeus-is-now-a-jupyter-subproject-c4ec5a1bf30b
>     [5] Cling – The New Interactive Interpreter for ROOT 6,
>     https://iopscience.iop.org/article/10.1088/1742-6596/396/5/052071
>     [6] High-performance Python-C++ bindings with PyPy and Cling,
>     https://dl.acm.org/doi/10.5555/3019083.3019087
>     [7]
>     https://indico.cern.ch/event/697389/contributions/3085538/attachments/1712698/2761717/2018_09_10_cling_CUDA.pdf
>     [8] CUDA C++ in Jupyter: Adding CUDA Runtime Support to Cling,
>     https://zenodo.org/record/3713753#.Xu8jqvJRXxU
>     [9] HIP Programming Guide
>     https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html
>
