[LLVMdev] LLVM as a shared library

Nick Lewycky nicholas at mxc.ca
Wed Aug 6 00:00:24 PDT 2014


Filip Pizlo wrote:
> This is exciting!
>
> I would be happy to help.
>
>
>> On Aug 5, 2014, at 12:38 PM, Chris Bieneman<beanz at apple.com>  wrote:
>>
>> Hello LLVM community,
>>
>> Over the last few years the LLVM team here at Apple and development teams elsewhere have been busily working on finding new and interesting uses for LLVM. Some of these uses are traditional compilers, but a growing number of them aren’t. Some of LLVM’s new clients, like WebKit, are embedding LLVM into existing applications. These embedded uses of LLVM have their own unique challenges.
>>
>> Over the next few months, a few of us at Apple are going to be working on tackling a few new problems that we would like solved in open source so other projects can benefit from them. Some of these efforts will be non-trivial, so we’d like to start a few discussions over the next few weeks.
>>
>> Our primary goals are to (1) make it easier to embed LLVM into external projects as a shared library, and (2) generally improve the performance of LLVM as a shared library.
>>
>> The list of the problems we’re currently planning to tackle is:
>>
>> (1) Reduce or eliminate static initializers, global constructors, and global destructors
>> (2) Clean up cross compiling in the CMake build system
>> (3) Update LLVM debugging mechanisms for being part of a dynamic library
>> (4) Move overridden sys calls (like abort) into the tools, rather than the libraries
>> (5) Update TableGen to support stripping unused content (i.e. Intrinsics for backends you’re not building)
>
> Also:
>
> (6) Determine if command line options are the best way of passing configuration settings into LLVM.

They're already banned, so there isn't anything left to determine here, 
just code to fix.

> It’s an awkward abstraction when LLVM is embedded. I suspect (6) will be closely related to (1) since command line option parsing was the hardest impediment to getting rid of static initializers.

Yes, for all these reasons. Two libraries may be using LLVM under the 
hood, unaware of each other, so they can't both share global state. 
Command line flags block that. Our command-line tools should be parsing 
their own flags and setting state through some other mechanism, and that 
state mustn't be more global than an LLVMContext.
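
To make the "no more global than an LLVMContext" point concrete, here is 
a minimal sketch in C. Every name in it (ExampleContext, ExampleOptions, 
the example_context_* functions) is made up for illustration and is not 
an existing LLVM API. The only idea it shows is that each embedder owns 
its own options object instead of writing to process-wide flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not LLVM's API. */

/* Settings that would otherwise live in process-wide command-line flags. */
typedef struct ExampleOptions {
    bool verbose;
    int opt_level;
} ExampleOptions;

/* Each embedder creates its own context, so settings never leak between
 * two libraries that happen to share one copy of the compiler. */
typedef struct ExampleContext {
    ExampleOptions options;
} ExampleContext;

/* Returns a context with default settings, or NULL if allocation fails. */
ExampleContext *example_context_create(void) {
    ExampleContext *ctx = malloc(sizeof *ctx);
    if (ctx != NULL) {
        ctx->options.verbose = false;
        ctx->options.opt_level = 2;
    }
    return ctx;
}

/* A tool would parse its own flags and then call setters like this one. */
void example_context_set_opt_level(ExampleContext *ctx, int level) {
    assert(ctx != NULL);
    ctx->options.opt_level = level;
}

int example_context_get_opt_level(const ExampleContext *ctx) {
    assert(ctx != NULL);
    return ctx->options.opt_level;
}

void example_context_destroy(ExampleContext *ctx) {
    free(ctx);
}
```

With this shape, one client lowering its optimization level has no 
effect on another client's context in the same process.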

> My understanding of the shared library proposal is that the library only exposes the C API since the C++ API is not intended to allow for binary compatibility.  So, I think we need to add the following either as an explicit goal of the shared library work or as a closely related project:
>
> (7) Make the C API truly great.
>
> I think it’s harmful to LLVM in the long run if external embedders use the C++ API.

The quality with which we maintain the C API today suggests that we 
collectively think of it as an albatross to be suffered. There is work 
necessary to change that perception too.

> I think that one way of ensuring that they don’t have an excuse to do it is to flesh out some things:
>
> - Add more tests of the C API to ensure that people don’t break it accidentally and to give more gravitas to the C API backwards compatibility claims.

Yes, for well-designed high level APIs like libLTO and libIndex. For 
other APIs, we should remove the backwards compatibility guarantees ...

> - Increase C API coverage.

... which in turn allows us to do this.

Designing a good high-level API is hard (even libLTO has very ugly 
cracks in its API surface), and that makes it hard to increase coverage 
that way. What actually happens is that people write C APIs that closely 
match the C++ APIs in order to access them from other languages, but 
there's no way we can guarantee compatibility without freezing the C++ 
API too, which we never will. This isn't a theoretical problem either. 
Look at this case:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140804/229354.html
where we made a straightforward update to the LLVM IR, but in theory a 
user of the C API would be able to observe the difference, and that 
could in turn break a C API user that was relying on the way old LLVM 
worked.

The solution is to offer two levels of C API. The first is intended for 
people to use to bind to their own language. It matches the C++ API 
closely and changes when the C++ API changes. (It could even be 
partially/wholly auto-generated via a clang tool?) Users of it will be 
broken by newer versions.

Secondly, some people really want a stable interface, so we give them an 
API expressed in higher-level tasks they want to achieve, so that we can 
change LLVM's underlying workings without disturbing the API. That can 
be made ABI stable.
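
A rough sketch of how the two levels might sit side by side, in C. All 
of the names below (ExampleModule, ExampleCompiler, the example_* 
functions) are invented for illustration and do not correspond to 
anything in llvm-c. In a real library the struct definitions would sit 
in a private source file and only the function declarations would be 
public.

```c
#include <assert.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not LLVM's API. */

/* Internal state. Its layout is free to change between releases. */
typedef struct ExampleModule {
    int instruction_count;
    int optimized;
} ExampleModule;

/* Level 1: binding-level functions that mirror the internals one to one.
 * They change whenever the internals change, with no stability promise. */
ExampleModule *example_unstable_module_create(int instruction_count) {
    ExampleModule *m = malloc(sizeof *m);
    if (m != NULL) {
        m->instruction_count = instruction_count;
        m->optimized = 0;
    }
    return m;
}

void example_unstable_module_optimize(ExampleModule *m) {
    assert(m != NULL);
    m->optimized = 1;
}

void example_unstable_module_destroy(ExampleModule *m) {
    free(m);
}

/* Level 2: a task-oriented API. Callers hold only a pointer to the
 * handle and ask for an outcome ("build this"), never for a specific
 * sequence of internal steps. */
typedef struct ExampleCompiler {
    ExampleModule *module;
} ExampleCompiler;

ExampleCompiler *example_compiler_create(void) {
    ExampleCompiler *c = malloc(sizeof *c);
    if (c != NULL) {
        c->module = NULL;
    }
    return c;
}

/* Returns 0 on success, -1 on failure. How the build is carried out can
 * change without altering this signature. */
int example_compiler_build(ExampleCompiler *c, int instruction_count) {
    assert(c != NULL);
    ExampleModule *m = example_unstable_module_create(instruction_count);
    if (m == NULL) {
        return -1;
    }
    example_unstable_module_optimize(m);
    example_unstable_module_destroy(c->module); /* drop any earlier build */
    c->module = m;
    return 0;
}

/* Returns 1 once a build has succeeded, 0 otherwise. */
int example_compiler_is_ready(const ExampleCompiler *c) {
    assert(c != NULL);
    return c->module != NULL && c->module->optimized;
}

void example_compiler_destroy(ExampleCompiler *c) {
    if (c != NULL) {
        example_unstable_module_destroy(c->module);
        free(c);
    }
}
```

A client written against the example_compiler_* functions keeps working 
if ExampleModule gains or loses fields, while a client of the 
example_unstable_* functions has to follow every such change.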

> 	- For example, WebKit currently sidesteps the C API to pass some commandline options to LLVM.  We don’t want that.

Seconded!

> 	- Add more support for reasoning about targets and triples.  WebKit still has to hardcode triples in some places even though it only ever does in-process JITing where host==target.  That’s weird.

Sounds good.

> 	- Expose debugging and runtime stuff and make sure that there’s a coherent integration story with the MCJIT C API.
> 		- Currently it’s difficult to round-trip debug info: creating it in C is awkward and parsing DWARF sections that MCJIT generates involves lots of weirdness.  WebKit has its own DWARF parser for this, which shouldn’t be necessary.
> 		- WebKit is about to have its own copies of both a compactunwind and EH frame parser.  The contributor who “wrote” the EH frame parser actually just took it from LLVM.  The licenses are compatible, but nonetheless, copy-paste from LLVM into WebKit should be discouraged.

I am not familiar with the MCJIT C API, but this sounds reasonable. I'll 
trust that you know what you're doing.

> - Engage with non-WebKit embedders that currently use the C++ API to figure out what it would take to get them to switch to the C API.

Engage with our users? That's crazy talk! ;)

Nick

> I think that a lot of the time when C API discussions arise, lots of embedders give excuses for using the C++ API.  WebKit used the C API for generating IR and even doing some IR manipulation, and for driving the MCJIT.  It’s been a positive experience and we enjoy the binary compatibility that it gives us.  I think it would be great to see if other embedders can do the same.
>
> -Filip
>
>>
>> We will be sending more specific proposals and patches for each of the changes listed above starting this week. If you’re interested in these problems and their solutions, please speak up and help us develop a solution that will work for your needs and ours.
>>
>> Thanks,
>> -Chris
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



