[LLVMdev] New JIT APIs
Philip Reames
listmail at philipreames.com
Wed Jan 14 11:08:04 PST 2015
On 01/14/2015 12:05 AM, Lang Hames wrote:
> Hi All,
>
> The attached patch (against r225842) contains some new JIT APIs that
> I've been working on. I'm going to start breaking it up, tidying it
> up, and submitting patches to llvm-commits soon, but while I'm working
> on that I thought I'd put the whole patch out for the curious to start
> playing around with and/or commenting on.
>
> The aim of these new APIs is to cleanly support a wider range of JIT
> use cases in LLVM, and to recover some of the functionality lost when
> the legacy JIT was removed. In particular, I wanted to see if I could
> re-enable lazy compilation while following MCJIT's design philosophy
> of relying on the MC layer and module-at-a-time compilation. The
> attached patch goes some way to addressing these aims, though there's
> a lot still to do.
In terms of the overall idea, I like what you're proposing. However, I
want to be very clear: you are not planning on removing any
functionality from the existing (fairly low level) MCJIT interface,
right? We've built our own infrastructure around that and require a few
features it doesn't sound like you're planning on supporting in the new
abstractions. (The biggest one is that we "install" code into a
different location from where it was compiled.)
I really like the idea of having a low level JIT interface for advanced
users and an easy starting point for folks getting started.
>
> The 20,000 ft overview, for those who want to get straight to the code:
>
> The new APIs are not built on top of the MCJIT class, as I didn't want
> a single class trying to be all things to all people. Instead, the new
> APIs consist of a set of software components for building JITs. The
> idea is that you should be able to take these off the shelf and
> compose them reasonably easily to get the behavior that you want. In
> the future I hope that people who are working on LLVM-based JITs, if
> they find this approach useful, will contribute back components that
> they've built locally and that they think would be useful for a wider
> audience. As a demonstration of the practicality of this approach the
> attached patch contains a class, MCJITReplacement, that composes some
> of the components to re-create the behavior of MCJIT. This works well
> enough to pass all MCJIT regression and unit tests on Darwin, and all
> but four regression tests on Linux. The patch also contains the
> desired "new" feature: Function-at-a-time lazy jitting in roughly the
> style of the legacy JIT. The attached lazydemo.tgz file contains a
> program which composes the new JIT components (including the
> lazy-jitting component) to lazily execute bitcode. I've tested this
> program on Darwin and it can run non-trivial benchmark programs, e.g.
> 401.bzip2 from SPEC2006.
>
> These new APIs are named after the motivating feature: On Request
> Compilation, or ORC. I believe the logo potential is outstanding. I'm
> picturing an Orc riding a Dragon. If I'm honest this was at least 45%
> of my motivation for doing this project*.
>
> You'll find the new headers in
> llvm/include/llvm/ExecutionEngine/OrcJIT/*.h, and the implementation
> files in lib/ExecutionEngine/OrcJIT/*.
>
> I imagine there will be a number of questions about the design and
> implementation. I've tried to preempt a few below, but please fire
> away with anything I've left out.
>
> Also, thanks to Jim Grosbach, Michael Illseman, David Blaikie, Pete
> Cooper, Eric Christopher, and Louis Gerbarg for taking time out to
> review, discuss and test this thing as I've worked on it.
>
> Cheers,
> Lang.
>
> Possible questions:
>
> (1)
> Q. Are you trying to kill off MCJIT?
> A. There are no plans to remove MCJIT. The new APIs are designed to
> live alongside it.
>
> (2)
> Q. What do "JIT components" look like, and how do you compose them?
> A. The classes and functions you'll find in OrcJIT/*.h fall into two
> rough categories: Layers and Utilities. Layers are classes that
> implement a small common interface that makes them easy to compose:
>
> class SomeLayer {
> private:
>   // Implementation details
> public:
>   // Implementation details
>
>   typedef ??? Handle;
>
>   template <typename ModuleSet>
>   Handle addModuleSet(ModuleSet&& Ms);
>
>   void removeModuleSet(Handle H);
>
>   uint64_t getSymbolAddress(StringRef Name, bool ExportedSymbolsOnly);
>
>   uint64_t lookupSymbolAddressIn(Handle H, StringRef Name,
>                                  bool ExportedSymbolsOnly);
> };
>
> Layers are usually designed to sit one-on-top-of-another, with each
> doing some sort of useful work before handing off to the layer below
> it. The layers that are currently included in the patch are the
> CompileOnDemandLayer, which breaks up modules and redirects calls to
> not-yet-compiled functions back into the JIT; the LazyEmitLayer, which
> defers adding modules to the layer below until a symbol in the module
> is actually requested; the IRCompilingLayer, which compiles bitcode to
> objects; and the ObjectLinkingLayer, which links sets of objects in
> memory using RuntimeDyld.
>
> Utilities are everything that's not a layer. Ideally the heavy lifting
> is done by the utilities. Layers just wrap certain use cases to make
> them easy to compose.
>
> Clients are free to use utilities directly, or compose layers, or
> implement new utilities or layers.
>
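To make the layering idea in (2) concrete, here is a toy sketch. It is not
the Orc code from the patch: the names ToyBaseLayer and ToyLazyLayer, the
use of (name, address) pairs as "modules", and the integer handles are all
assumptions made for illustration. Only the shape of the interface
(addModuleSet, removeModuleSet, getSymbolAddress) follows the description
above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy "module": a list of (symbol name, address) pairs. Hypothetical
// stand-in for a real module; nothing is actually compiled here.
using ToyModule = std::vector<std::pair<std::string, uint64_t>>;
using ToyModuleSet = std::vector<ToyModule>;

// Bottom layer: records symbols as soon as a set is added.
class ToyBaseLayer {
public:
  using Handle = unsigned;

  Handle addModuleSet(ToyModuleSet Ms) {
    Handle H = NextHandle++;
    for (const ToyModule &M : Ms)
      for (const auto &Sym : M)
        Symbols[H][Sym.first] = Sym.second;
    return H;
  }

  void removeModuleSet(Handle H) { Symbols.erase(H); }

  // Returns 0 for unknown symbols.
  uint64_t getSymbolAddress(const std::string &Name) const {
    for (const auto &Entry : Symbols) {
      auto It = Entry.second.find(Name);
      if (It != Entry.second.end())
        return It->second;
    }
    return 0;
  }

  std::size_t numEmitted() const { return Symbols.size(); }

private:
  Handle NextHandle = 1;
  std::map<Handle, std::map<std::string, uint64_t>> Symbols;
};

// Upper layer: holds each set back until one of its symbols is requested,
// then hands the set to the layer below.
template <typename BaseLayerT> class ToyLazyLayer {
public:
  using Handle = unsigned;

  explicit ToyLazyLayer(BaseLayerT &Base) : Base(Base) {}

  Handle addModuleSet(ToyModuleSet Ms) {
    Handle H = NextHandle++;
    Pending[H] = std::move(Ms);
    return H;
  }

  void removeModuleSet(Handle H) {
    Pending.erase(H);
    auto It = Emitted.find(H);
    if (It != Emitted.end()) {
      Base.removeModuleSet(It->second);
      Emitted.erase(It);
    }
  }

  uint64_t getSymbolAddress(const std::string &Name) {
    for (auto It = Pending.begin(); It != Pending.end(); ++It) {
      if (defines(It->second, Name)) {
        Emitted[It->first] = Base.addModuleSet(std::move(It->second));
        Pending.erase(It);
        break;
      }
    }
    return Base.getSymbolAddress(Name);
  }

private:
  static bool defines(const ToyModuleSet &Ms, const std::string &Name) {
    for (const ToyModule &M : Ms)
      for (const auto &Sym : M)
        if (Sym.first == Name)
          return true;
    return false;
  }

  BaseLayerT &Base;
  Handle NextHandle = 1;
  std::map<Handle, ToyModuleSet> Pending;
  std::map<Handle, typename BaseLayerT::Handle> Emitted;
};
```

Because ToyLazyLayer is a template over the layer beneath it, swapping in a
different lower layer with the same three methods needs no change to the
upper layer, which is the composability property the answer describes.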
> (3)
> Q. Why "addModuleSet" rather than "addModule"?
> A. Allowing multiple modules to be passed around together allows
> layers lower in the stack to perform interesting optimizations. E.g.
> direct calls between objects that are allocated sufficiently close in
> memory. To add a single Module you just add a single-element set.
Please add a utility function for a single Module if you haven't
already. For a method-based JIT use case, multiple Modules just aren't
that useful.
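Something along these lines is what I have in mind. This is only a sketch:
addModule is a hypothetical helper that is not in the patch, and
CountingLayer is a stand-in layer that exists purely to exercise it. The
only assumption about a real layer is that it exposes a Handle type and an
addModuleSet that accepts a std::vector of modules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical convenience wrapper: puts one module into a single-element
// set and forwards it to the layer's addModuleSet.
template <typename LayerT, typename ModuleT>
typename LayerT::Handle addModule(LayerT &Layer, ModuleT M) {
  std::vector<ModuleT> Ms;
  Ms.push_back(std::move(M));
  return Layer.addModuleSet(std::move(Ms));
}

// Stand-in layer that only records how large each added set was.
struct CountingLayer {
  using Handle = unsigned;
  std::vector<std::size_t> SetSizes;

  template <typename ModuleSet> Handle addModuleSet(ModuleSet &&Ms) {
    SetSizes.push_back(Ms.size());
    return static_cast<Handle>(SetSizes.size());
  }
};
```

Since the module is moved into the vector, the same helper would also work
for move-only module types such as a std::unique_ptr to a module.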
>
> (4)
> Q. What happened to "finalize"?
> A. In the Orc APIs, getSymbolAddress automatically finalizes as
> necessary before returning addresses to the client. When you get an
> address back from getSymbolAddress, that address is ready to call.
As long as this is true for the high level API and *not* the low level
one (as is true today), this seems fine. I don't really like the
finalize mechanism we have, but we do need a mechanism to get at the
code before relocations have been applied.
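For reference, the finalize-on-lookup behavior described in (4) can be
sketched with a toy class. This is an illustration only: ToyJIT, addSymbol
and finalizeCount are hypothetical names, and the "finalize" step here just
flips a flag where a real JIT would apply relocations and set memory
permissions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

class ToyJIT {
public:
  // Adding code leaves the JIT in a not-yet-finalized state.
  void addSymbol(const std::string &Name, uint64_t Addr) {
    Symbols[Name] = Addr;
    Finalized = false;
  }

  // Finalizes on demand, so any nonzero address returned is ready to call.
  uint64_t getSymbolAddress(const std::string &Name) {
    auto It = Symbols.find(Name);
    if (It == Symbols.end())
      return 0;
    if (!Finalized)
      finalize();
    return It->second;
  }

  unsigned finalizeCount() const { return FinalizeCount; }

private:
  // Stand-in for applying relocations and making memory executable.
  void finalize() {
    ++FinalizeCount;
    Finalized = true;
  }

  std::map<std::string, uint64_t> Symbols;
  bool Finalized = true;
  unsigned FinalizeCount = 0;
};
```

A lower-level interface that exposes code before relocation would need a
separate entry point that skips the finalize step; the sketch does not
attempt to model that.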
>
> (5)
> Q. What does "removeModuleSet" do?
> A. It removes the modules represented by the handle from the JIT. The
> meaning of this is specific to each layer, but generally speaking it
> means that any memory allocated for those modules (and their
> corresponding Objects, linked sections, etc) has been freed, and the
> symbols those modules provided are now undefined. Calling
> getSymbolAddress for a symbol that was defined in a module that has
> been removed is expected to return '0'.
>
> (5a)
> Q. How are the linked sections freed? RTDyldMemoryManager doesn't have
> any "free.*Section" methods.
> A. Each ModuleSet gets its own RTDyldMemoryManager, and that is
> destroyed when the module set is freed. The choice of
> RTDyldMemoryManager is up to the client, but the standard memory
> managers will free the memory allocated for the linked sections when
> they're destroyed.
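The ownership scheme in (5) and (5a) can be sketched as plain RAII. The
code below is a toy under stated assumptions, not the patch: ToyMemoryManager
stands in for an RTDyldMemoryManager, ToyLinkingLayer for a linking layer,
and the LiveBytes counter exists only so that the frees can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical memory manager: owns section memory and releases it in its
// destructor. There is deliberately no per-section "free" method.
class ToyMemoryManager {
public:
  explicit ToyMemoryManager(std::size_t &LiveBytes) : LiveBytes(LiveBytes) {}

  ~ToyMemoryManager() {
    for (const auto &S : Sections)
      LiveBytes -= S.size();
  }

  uint8_t *allocateSection(std::size_t Size) {
    Sections.emplace_back(Size);
    LiveBytes += Size;
    return Sections.back().data();
  }

private:
  std::size_t &LiveBytes;
  std::vector<std::vector<uint8_t>> Sections;
};

// Hypothetical layer: every set owns its memory manager, so erasing the
// set's entry frees its sections and undefines its symbols together.
class ToyLinkingLayer {
  struct Entry {
    std::unique_ptr<ToyMemoryManager> MM;
    std::map<std::string, uint64_t> Symbols;
  };

public:
  using Handle = unsigned;

  // SymbolSizes maps each symbol name to the size of its section.
  Handle addModuleSet(const std::map<std::string, std::size_t> &SymbolSizes,
                      std::unique_ptr<ToyMemoryManager> MM) {
    Entry E;
    for (const auto &KV : SymbolSizes)
      E.Symbols[KV.first] = static_cast<uint64_t>(
          reinterpret_cast<std::uintptr_t>(MM->allocateSection(KV.second)));
    E.MM = std::move(MM);
    Handle H = NextHandle++;
    Sets.emplace(H, std::move(E));
    return H;
  }

  void removeModuleSet(Handle H) { Sets.erase(H); }

  // Returns 0 once the defining set has been removed.
  uint64_t getSymbolAddress(const std::string &Name) const {
    for (const auto &KV : Sets) {
      auto It = KV.second.Symbols.find(Name);
      if (It != KV.second.Symbols.end())
        return It->second;
    }
    return 0;
  }

private:
  Handle NextHandle = 1;
  std::map<Handle, Entry> Sets;
};
```

Tying the manager's lifetime to the set means removal needs no cooperation
from the memory manager beyond a destructor that releases what it allocated.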
>
> (6)
> Q. How does the CompileOnDemand layer redirect calls to the JIT?
> A. It currently uses double-indirection: Function bodies are extracted
> into new modules, and the body of the original function is replaced
> with an indirect call to the extracted body. The pointer for the
> indirect call is initialized by the JIT to point at some inline
> assembly which is injected into the module, and this calls back in to
> the JIT to trigger compilation of the extracted body. In the future I
> plan to make the redirection strategy a parameter of the
> CompileOnDemand layer. Double-indirection is the safest: It preserves
> function-pointer equality and works with non-writable executable
> memory. However, there's no reason we couldn't use single indirection
> (for extra speed where pointer-equality isn't required), or
> patchpoints (for clients who can allocate writable/executable memory),
> or any combination of the three. My intent is that this should be up
> to the client.
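The double-indirection scheme can be illustrated in ordinary C++ with a
function pointer. This is a simplified sketch, not what the patch does: the
patch injects inline assembly and calls back into the JIT, whereas here the
"compile callback" is a normal function and "compiling" is just a counter
increment. All names (square, squareBody, squareCompileCallback,
SquareImplPtr, CompileCount) are made up for the example.

```cpp
#include <cassert>

int CompileCount = 0;

// The extracted body. In a real JIT this would be compiled on first use.
int squareBody(int X) { return X * X; }

int squareCompileCallback(int X);

// The pointer used by the indirect call. It starts out aimed at the
// compile callback rather than at the body.
int (*SquareImplPtr)(int) = squareCompileCallback;

// The stub that replaces the original function. Its address never
// changes, so pointers taken to it before and after compilation are equal.
int square(int X) { return SquareImplPtr(X); }

// First call lands here: "compile" the body, redirect the pointer, and
// then run the body so the caller sees a normal return value.
int squareCompileCallback(int X) {
  ++CompileCount;             // stand-in for JIT-compiling the body
  SquareImplPtr = squareBody; // only data is patched, never code
  return SquareImplPtr(X);
}
```

Only the data pointer is rewritten, which is why this approach works when
the executable memory itself is not writable. Single indirection would hand
out the pointer's target directly and so lose the stable address.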
>
> As a brief note: the CompileOnDemand layer
> doesn't handle lazy compilation itself, just lazy symbol resolution
> (i.e. symbols are resolved on first call, not when compiling). If
> you've put the CompileOnDemand layer on top of the LazyEmitLayer then
> deferring symbol lookup automatically defers compilation. (E.g. You
> can remove the LazyEmitLayer in main.cpp of the lazydemo and you'll
> get indirection and callbacks, but no lazy compilation).
>
> (7)
> Q. Do the new APIs support cross-target JITing like MCJIT does?
> A. Yes.
>
> (7.a)
> Q. Do the new APIs support cross-target (or cross process) lazy-jitting?
> A. Not yet, but all that is required is for us to add a small amount
> of runtime to the JIT'd process to call back in to the JIT via some
> RPC mechanism. There are no significant barriers to implementing this
> that I'm aware of.
>
> (8)
> Q. Do any of the components implement the ExecutionEngine interface?
> A. None of the components do, but the MCJITReplacement class does.
>
> (9)
> Q. Does this address any of the long-standing issues with MCJIT -
> Stackmap parsing? Debugging? Thread-local-storage?
> A. No, but it doesn't get in the way either. These features are still
> on the road-map (such as it exists) and I'm hoping that the modular
> nature of Orc will let us play around with new features like this
> without any risk of disturbing existing clients, and so allow us to
> make faster progress.
>
> (10)
> Q. Why is part X of the patch (ugly | buggy | in the wrong place) ?
> A. I'm still tidying the patch up - please save patch-specific
> feedback for llvm-commits; otherwise we'll get cross-talk between
> the threads. The patches should be coming soon.
>
> ---
>
> As mentioned above, I'm happy to answer further general questions
> about what these APIs can do, or where I see them going. Feedback on
> the patch itself should be directed to the llvm-commits list when I
> start posting patches there for discussion.
>
>
> * Marketing slogans abound: "Very MachO". "Some warts". "Surprisingly
> friendly with ELF". "Not yet on speaking terms with DWARF".
>
>