[LLVMdev] LLVM as a shared library

Eric Christopher echristo at gmail.com
Tue Aug 5 17:31:02 PDT 2014


On Tue, Aug 5, 2014 at 4:42 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>
> On Aug 5, 2014, at 3:19 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> On Tue, Aug 5, 2014 at 2:57 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>
>
> On Aug 5, 2014, at 2:51 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> On Tue, Aug 5, 2014 at 2:49 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>
>
> On Aug 5, 2014, at 1:46 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> (7) Make the C API truly great.
>
> I think it’s harmful to LLVM in the long run if external embedders use the
> C++ API.  I think that one way of ensuring that they don’t have an excuse to
> do it is to flesh out some things:
>
> - Add more tests of the C API to ensure that people don’t break it
> accidentally and to give more gravitas to the C API backwards compatibility
> claims.
> - Increase C API coverage.
>      - For example, WebKit currently sidesteps the C API to pass some
> commandline options to LLVM.  We don’t want that.
>      - Add more support for reasoning about targets and triples.  WebKit
> still has to hardcode triples in some places even though it only ever does
> in-process JITing where host==target. That’s weird.
>      - Expose debugging and runtime stuff and make sure that there’s a
> coherent integration story with the MCJIT C API.
>              - Currently it’s difficult to round-trip debug info: creating
> it in C is awkward and parsing DWARF sections that MCJIT generates involves
> lots of weirdness.  WebKit has its own DWARF parser for this, which
> shouldn’t be necessary.
>              - WebKit is about to have its own copies of both a
> compactunwind and EH frame parser.  The contributor who “wrote” the EH frame
> parser actually just took it from LLVM.  The licenses are compatible, but
> nonetheless, copy-paste from LLVM into WebKit should be discouraged.
> - Engage with non-WebKit embedders that currently use the C++ API to figure
> out what it would take to get them to switch to the C API.
>
> I think that a lot of time when C API discussions arise, lots of embedders
> give excuses for using the C++ API.  WebKit used the C API for generating IR
> and even doing some IR manipulation, and for driving the MCJIT.  It’s been a
> positive experience and we enjoy the binary compatibility that it gives us.
> I think it would be great to see if other embedders can do the same.
>
>
> Honestly I think if you want to make the C API great we should burn it
> to the ground and come up with another one - and one that can be
> versioned as well so we don't have the problems of being limited in
> what we can do to llvm by needing compatibility with the C API.
>
>
> Can you come up with specific reasons why building a new API would be better
> for the community than maintaining the one we’ve got?
>
>
> Rafael came up with a few in his, but also having an API that lightly
> wraps the C++ api is hard if we want to change a major C++ interface
> or completely remove a class, etc. There's no existing way in the API
> to either version or remove an interface given our current promises.
>
>
> Can you give a specific example of an intended C++ API change that wasn’t
> possible because of a C API?
>
> Just because you have an API doesn’t mean that things can’t be deprecated,
> or that the API layer can’t be hacked to give the illusion of old behavior.
> Can you give an example of a C API deprecation proposal that was intended to
> make some C++ change possible, that was rejected on the grounds that it
> would break the C API?
>
> I can only recall cases where the C API was broken by accident because of
> lack of testing, and in all of those cases, the issue was either resolved,
> or there is a plan to resolve it and a workaround was made available.
>
>
> Sure, these are going to be a bit vague because I'm a bit busy at the
> moment, but I recall a couple of times during the year that we've had
>
> API up for review (or even committed temporarily) that exposed
> internal constants via enums, and I think Rafael had some issues with
> visibility changes for the same reasons.
>
> In a more recent case here's a thread:
>
> [LLVMdev] Inconsistent third field in global_ctors (was Re: [llvm]
> r214321 - UseListOrder: Visit global values)
>
>
> There appears to be a patch up for review that takes care of this with a
> slightly careful dance and there is a PR tracking deprecating the bad
> construct (two-field version) eventually.  So, I don’t think this qualifies
> as a change that was made impossible by the C API, since the patch
> demonstrates that it *is* possible.
>
>
> and
>
> [PATCH] Add return value attribute to C interface
>
>
> This appears to be an observation that we should extend the API to better
> handle attributes.  I agree with that observation and with the general
> sentiment that strings are better than bits.  This isn’t a reason to burn
> the API to the ground.  Someone should just make the change, which would
> probably involve allowing C API clients to use either bits or strings for
> the time being.
>
>
> also I think the conversation we were having in here:
>
> [PATCH] Expose MCInst in C Disassembler API
>
>
> The arguments against exposing MCInst were pretty vague.  I never agreed
> with any of them.  It was an obviously useful addition and someone should
> still do it.  That being said, this thread just covered adding more stuff to
> the C API; it’s not an example of a C++ change that couldn’t be made because
> of the C API.
>
>
> is somewhat relevant :)
>
> Just a couple of quick things I could find with a search. I could
> probably dig up more given some more time.
>
>
> Your current examples are just small bugs that can be fixed - and in two out
> of the three examples, there are sensible patches up for review already.
>

As I said, I haven't spent a lot of time on it. It's friction etc.

How about a theoretical?

Let's say we write a decent PRE pass that encompasses these two passes:

/** See llvm::createMergedLoadStoreMotionPass function. */
void LLVMAddMergedLoadStoreMotionPass(LLVMPassManagerRef PM);

/** See llvm::createGVNPass function. */
void LLVMAddGVNPass(LLVMPassManagerRef PM);

then, why not delete the existing passes? No need keeping dead code
around right? Except we can't because the passes are in the C API and
someone might be using them.


Another example: We've already had multiple C lto APIs, moving those
has only been possible because we just change everyone and there are
only 2 users.


Here's a different theoretical that I'm considering - the
TargetMachine interface:

LLVMTargetMachineRef LLVMCreateTargetMachine(LLVMTargetRef T,
  const char *Triple, const char *CPU, const char *Features,
  LLVMCodeGenOptLevel Level, LLVMRelocMode Reloc, LLVMCodeModel CodeModel);

The last 3 arguments here are a little weird.

Optimization doesn't necessarily make sense within the target machine
and should probably be pulled out somewhere else - perhaps the
PassManagerBuilder? Perhaps somewhere else? The code model and reloc
model make a bit more sense, but are more of an overarching object
file matter for the first. The second is definitely something that
deals with a particular machine. Now, let's say we come up with
extensions to allow multiple architectures generate code within a
single object file. The relocation model will need to be part of some
overarching class on top of the TargetMachine that'll need the
relocation model.

This probably won't be necessary until we hit something like ARM/X86
in the same module. Though with the existing ARM/Thumb support and the
way those subtargets are actually held as target machines we'll
actually need this if we want to handle emitting arm and thumb into
the same module. Now we can try rewriting that support ala mips16 for
the mips port and probably should, but with the increasing relevance
of accelerator computing I see this happening sooner rather than
later. I don't want to close the door on it by having a C API that
can't be revved or changed and is even more extensive than this
relatively odd problem I mention here.

I realize it seems mostly like "aaa! change! aaa! fear!" but we've
been able to do pretty well over the years with the C API by keeping
it very general and having a lot of forethought with what we allow
into it. If we're going to expand the C API greatly (as it sounds like
you want) then I feel we're going to run into issues with wanting to
change the C API and not being able to - hence the request for things
like versioning etc.

-eric




More information about the llvm-dev mailing list