[LLVMdev] LLVM as a shared library

Wed Aug 6 01:34:07 PDT 2014

On 5 Aug 2014, at 21:17, Filip Pizlo <fpizlo at apple.com> wrote:

> - Engage with non-WebKit embedders that currently use the C++ API to figure out what it would take to get them to switch to the C API.

I maintain a reasonable amount of out-of-tree code that embeds LLVM in various things, including a couple of language front ends, an out-of-tree back end, and some tools for interfacing with experimental hardware, and so suffer the pain of C++ API changes on a fairly frequent basis.  The C APIs generally feel clunky to use.  If we have a stable C API that is useful, I'd love to see C++ wrappers so that we don't have to suffer things like iterators.
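As a rough illustration of the wrapper idea: the C API exposes iteration as first/next function pairs (for example LLVMGetFirstFunction and LLVMGetNextFunction), and a thin header-only C++ wrapper can turn such a pair into something a range-based for loop accepts.  The sketch below uses a made-up stand-in for the C side so that it compiles on its own; Item, ListGetFirst, ListGetNext, ItemGetValue and ItemRange are all hypothetical names, not part of LLVM.

```cpp
// Hypothetical C-style API: an opaque handle plus first/next accessors.
// These stand in for a pair such as LLVMGetFirstFunction/LLVMGetNextFunction.
struct Item {
  int Value;
  Item *Next;
};
typedef Item *ItemRef;
inline ItemRef ListGetFirst(ItemRef Head) { return Head; }
inline ItemRef ListGetNext(ItemRef I) { return I->Next; }
inline int ItemGetValue(ItemRef I) { return I->Value; }

// Thin C++ wrapper: exposes the first/next pair as begin()/end().
class ItemRange {
public:
  class iterator {
  public:
    explicit iterator(ItemRef I) : Cur(I) {}
    ItemRef operator*() const { return Cur; }
    iterator &operator++() {
      Cur = ListGetNext(Cur);
      return *this;
    }
    bool operator!=(const iterator &Other) const { return Cur != Other.Cur; }

  private:
    ItemRef Cur;
  };

  explicit ItemRange(ItemRef ListHead) : Head(ListHead) {}
  iterator begin() const { return iterator(ListGetFirst(Head)); }
  iterator end() const { return iterator(nullptr); } // null ends the walk

private:
  ItemRef Head;
};

// Client code no longer spells out the first/next loop by hand.
inline int sumValues(ItemRef Head) {
  int Sum = 0;
  for (ItemRef I : ItemRange(Head))
    Sum += ItemGetValue(I);
  return Sum;
}
```

Because the wrapper only calls the C entry points, it can live entirely in a header and inherits whatever stability guarantee the C API has.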

Currently, we conflate 'C' and 'stable'.  In many cases, I'd prefer a C++ API to a C one, or would be happy to write (or even use automatically generated) thin C wrappers around C++.  The requirement is not the language, it's the stability.  I presume this also applies to WebKit: there's no reason why a C++ library should prefer a C API for talking to a C++ library.  Stability isn't just a binary thing.  We care about several different definitions:

- ABI doesn't change.  I can keep using the same binary with new LLVM shared libraries.  Symbols are versioned and new code can just use the new version.

- API doesn't change.  I have to recompile, but there are SOVERSION bumps whenever I need to and so the version that I need can easily coexist on the same system with newer ones until I get around to recompiling.

- API doesn't change gratuitously.  Public APIs change, but only after a deprecation period and not simply to please some developer's aesthetic.  We don't randomly rename classes or change capitalisation of functions without at least shipping one release with both the new and old versions working and being marked as deprecated.

Most of LLVM fails even to meet the third requirement.  For most of the code that I maintain, I have no strong requirements for the first, would be very happy with the second, and would find the third acceptable.  
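To make the third level concrete, here is a minimal sketch of what a rename with a deprecation period could look like from the library side, assuming a C++14 compiler for the [[deprecated]] attribute.  The mylib namespace, the Value class and the GetName/getName pair are invented for the example and do not correspond to any actual LLVM change.

```cpp
#include <string>
#include <utility>

// Hypothetical library header for release N+1.  The accessor was renamed
// from GetName to getName, but the old spelling ships for one more release.
namespace mylib {

class Value {
public:
  explicit Value(std::string N) : Name(std::move(N)) {}

  // New spelling.
  const std::string &getName() const { return Name; }

  // Old spelling: still works, but the compiler warns at every call site.
  [[deprecated("use getName(); GetName() goes away in the next release")]]
  const std::string &GetName() const { return getName(); }

private:
  std::string Name;
};

} // namespace mylib

// Out-of-tree code that has already migrated to the new spelling.
inline std::string describe(const mylib::Value &V) {
  return "value " + V.getName();
}
```

Code that still calls GetName() keeps compiling against release N+1 and gets a warning naming the replacement, so the out-of-tree maintainer has a full release cycle to migrate.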

Some things clearly can't be supported by a set-in-stone interface.  Much as I'd love for out-of-tree back ends to be something that people could just ship as plugins, it's not really feasible.  There are lots of things in the back end interface that need fixing, and having to support an interface that's defined before they're fixed would be a lot of pain.  I also don't see a benefit in exporting these 

A few of the things that I maintain out of tree are optimisations.  We added some infrastructure a few releases ago for plugging optimisations into the pipeline, but the APIs required to write optimisations change a lot.  They can't be C APIs, because they require inheriting from FunctionPass or similar (we could, perhaps, have a CFunctionPass class that had callbacks for the virtual functions, but it would be quite clunky).
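For what it's worth, the CFunctionPass idea could look roughly like the sketch below.  Nothing like this exists in LLVM as far as this message is concerned: Function and FunctionPass here are simplified stand-ins rather than the real classes, and FunctionRef, FunctionGetNumInsts, CFunctionPassCallbacks and the callback names are all hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-ins for the real LLVM classes so that the sketch compiles on its own.
struct Function {
  std::string Name;
  unsigned NumInsts;
};

class FunctionPass {
public:
  virtual ~FunctionPass() = default;
  virtual bool doInitialization() { return false; }
  virtual bool runOnFunction(Function &F) = 0; // true if the IR was modified
};

// Hypothetical C-facing surface: an opaque handle, one accessor, and a table
// of callbacks.  In a real C header these would be declared extern "C".
typedef Function *FunctionRef;
inline unsigned FunctionGetNumInsts(FunctionRef F) { return F->NumInsts; }

typedef struct {
  int (*do_initialization)(void *Ctx);              // optional, may be null
  int (*run_on_function)(void *Ctx, FunctionRef F); // required
} CFunctionPassCallbacks;

// The adapter: each virtual function forwards to the matching callback.
class CFunctionPass final : public FunctionPass {
public:
  CFunctionPass(CFunctionPassCallbacks Callbacks, void *Context)
      : CBs(Callbacks), Ctx(Context) {
    assert(CBs.run_on_function && "run_on_function is required");
  }
  bool doInitialization() override {
    return CBs.do_initialization && CBs.do_initialization(Ctx) != 0;
  }
  bool runOnFunction(Function &F) override {
    return CBs.run_on_function(Ctx, &F) != 0;
  }

private:
  CFunctionPassCallbacks CBs;
  void *Ctx;
};

// A client "pass" written in C style: counts instructions, modifies nothing.
inline int countInsts(void *Ctx, FunctionRef F) {
  *static_cast<unsigned *>(Ctx) += FunctionGetNumInsts(F);
  return 0;
}

// Drives the adapter the way a pass manager might, over two dummy functions.
inline unsigned runCountingPass() {
  unsigned Total = 0;
  CFunctionPassCallbacks CBs = {nullptr, countInsts};
  CFunctionPass Pass(CBs, &Total);
  FunctionPass &P = Pass;
  std::vector<Function> Module = {{"f", 3}, {"g", 4}};
  P.doInitialization();
  for (Function &F : Module)
    P.runOnFunction(F);
  return Total;
}
```

The clunkiness shows up quickly: every virtual function needs a slot in the callback table, state has to travel through a void pointer, and anything the pass wants to inspect needs its own C accessor.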

One of the goals of LLVM was that you'd be able to write optimisations that made use of the LLVM infrastructure but only made sense for a particular source language, or even set of idioms used with a particular library.  I'd love to see, for example, Qt ship with a plugin that adds optimisations for their slots and signals mechanism.  I wouldn't want this in the LLVM tree, because it's completely useless to anyone not using Qt, but currently the only way of guaranteeing that it will work one svn revision into the future is to put it in the LLVM tree.

I currently have a GSoC student working on using LLVM for high-performance packet filtering.  The front-end code could probably use the existing C APIs, but then there will be optimisations that are unlikely to make sense for any code that doesn't have this particular structure (e.g. prefetching the next packet based on knowledge of the structure of the network stack's ring buffers).  Having an API that was at least useable for two releases for doing this, even if it spat out a lot of deprecation warnings in the second release, would be immensely helpful.

David