[LLVMdev] New JIT APIs

Philip Reames listmail at philipreames.com
Wed Jan 14 11:08:04 PST 2015


On 01/14/2015 12:05 AM, Lang Hames wrote:
> Hi All,
>
> The attached patch (against r225842) contains some new JIT APIs that 
> I've been working on. I'm going to start breaking it up, tidying it 
> up, and submitting patches to llvm-commits soon, but while I'm working 
> on that I thought I'd put the whole patch out for the curious to start 
> playing around with and/or commenting on.
>
> The aim of these new APIs is to cleanly support a wider range of JIT 
> use cases in LLVM, and to recover some of the functionality lost when 
> the legacy JIT was removed. In particular, I wanted to see if I could 
> re-enable lazy compilation while following MCJIT's design philosophy 
> of relying on the MC layer and module-at-a-time compilation. The 
> attached patch goes some way to addressing these aims, though there's 
> a lot still to do.
In terms of the overall idea, I like what you're proposing.  However, I 
want to be very clear: you are not planning on removing any 
functionality from the existing (fairly low-level) MCJIT interface, 
right?  We've built our own infrastructure around that and require a few 
features it doesn't sound like you're planning on supporting in the new 
abstractions.  (The biggest one is that we "install" code into a 
different location from where it was compiled.)

I really like the idea of having a low level JIT interface for advanced 
users and an easy starting point for folks getting started.

>
> The 20,000 ft overview, for those who want to get straight to the code:
>
> The new APIs are not built on top of the MCJIT class, as I didn't want 
> a single class trying to be all things to all people. Instead, the new 
> APIs consist of a set of software components for building JITs. The 
> idea is that you should be able to take these off the shelf and 
> compose them reasonably easily to get the behavior that you want. In 
> the future I hope that people who are working on LLVM-based JITs, if 
> they find this approach useful, will contribute back components that 
> they've built locally and that they think would be useful for a wider 
> audience. As a demonstration of the practicality of this approach the 
> attached patch contains a class, MCJITReplacement, that composes some 
> of the components to re-create the behavior of MCJIT. This works well 
> enough to pass all MCJIT regression and unit tests on Darwin, and all 
> but four regression tests on Linux. The patch also contains the 
> desired "new" feature: Function-at-a-time lazy jitting in roughly the 
> style of the legacy JIT. The attached lazydemo.tgz file contains a 
> program which composes the new JIT components (including the 
> lazy-jitting component) to lazily execute bitcode. I've tested this 
> program on Darwin and it can run non-trivial benchmark programs, e.g. 
> 401.bzip2 from SPEC2006.
>
> These new APIs are named after the motivating feature: On Request 
> Compilation, or ORC. I believe the logo potential is outstanding. I'm 
> picturing an Orc riding a Dragon. If I'm honest this was at least 45% 
> of my motivation for doing this project*.
>
> You'll find the new headers in 
> llvm/include/llvm/ExecutionEngine/OrcJIT/*.h, and the implementation 
> files in lib/ExecutionEngine/OrcJIT/*.
>
> I imagine there will be a number of questions about the design and 
> implementation. I've tried to preempt a few below, but please fire 
> away with anything I've left out.
>
> Also, thanks to Jim Grosbach, Michael Illseman, David Blaikie, Pete 
> Cooper, Eric Christopher, and Louis Gerbarg for taking time out to 
> review, discuss and test this thing as I've worked on it.
>
> Cheers,
> Lang.
>
> Possible questions:
>
> (1)
> Q. Are you trying to kill off MCJIT?
> A. There are no plans to remove MCJIT. The new APIs are designed to 
> live alongside it.
>
> (2)
> Q. What do "JIT components" look like, and how do you compose them?
> A. The classes and functions you'll find in OrcJIT/*.h fall into two 
> rough categories: Layers and Utilities. Layers are classes that 
> implement a small common interface that makes them easy to compose:
>
> class SomeLayer {
> private:
>   // Implementation details
> public:
>   // Layer interface:
>
>   typedef ??? Handle;
>
>   template <typename ModuleSet>
>   Handle addModuleSet(ModuleSet&& Ms);
>
>   void removeModuleSet(Handle H);
>
>   uint64_t getSymbolAddress(StringRef Name, bool ExportedSymbolsOnly);
>
>   uint64_t lookupSymbolAddressIn(Handle H, StringRef Name, bool 
> ExportedSymbolsOnly);
> };
>
> Layers are usually designed to sit one-on-top-of-another, with each 
> doing some sort of useful work before handing off to the layer below 
> it. The layers that are currently included in the patch are the 
> CompileOnDemandLayer, which breaks up modules and redirects calls to 
> not-yet-compiled functions back into the JIT; the LazyEmitLayer, which 
> defers adding modules to the layer below until a symbol in the module 
> is actually requested; the IRCompilingLayer, which compiles bitcode to 
> objects; and the ObjectLinkingLayer, which links sets of objects in 
> memory using RuntimeDyld.
>
> Utilities are everything that's not a layer. Ideally the heavy lifting 
> is done by the utilities. Layers just wrap certain use cases to make 
> them easy to compose.
>
> Clients are free to use utilities directly, or compose layers, or 
> implement new utilities or layers.
>
> (3)
> Q. Why "addModuleSet" rather than "addModule"?
> A. Allowing multiple modules to be passed around together allows 
> layers lower in the stack to perform interesting optimizations, e.g. 
> making direct calls between objects that are allocated sufficiently 
> close in memory. To add a single Module you just add a single-element set.
Please add a utility function for a single Module if you haven't 
already.  For a method-based JIT use case, multiple Modules just aren't 
that useful.
>
> (4)
> Q. What happened to "finalize"?
> A. In the Orc APIs, getSymbolAddress automatically finalizes as 
> necessary before returning addresses to the client. When you get an 
> address back from getSymbolAddress, that address is ready to call.
As long as this is true for the high-level API and *not* the low-level 
one (as is true today), this seems fine.  I don't really like the 
finalize mechanism we have, but we do need a mechanism to get at the 
code before relocations have been applied.
>
> (5)
> Q. What does "removeModuleSet" do?
> A. It removes the modules represented by the handle from the JIT. The 
> meaning of this is specific to each layer, but generally speaking it 
> means that any memory allocated for those modules (and their 
> corresponding Objects, linked sections, etc) has been freed, and the 
> symbols those modules provided are now undefined. Calling 
> getSymbolAddress for a symbol that was defined in a module that has 
> been removed is expected to return '0'.
>
> (5a)
> Q. How are the linked sections freed? RTDyldMemoryManager doesn't have 
> any "free.*Section" methods.
> A. Each ModuleSet gets its own RTDyldMemoryManager, and that is 
> destroyed when the module set is freed. The choice of 
> RTDyldMemoryManager is up to the client, but the standard memory 
> managers will free the memory allocated for the linked sections when 
> they're destroyed.
>
> (6)
> Q. How does the CompileOnDemand layer redirect calls to the JIT?
> A. It currently uses double-indirection: Function bodies are extracted 
> into new modules, and the body of the original function is replaced 
> with an indirect call to the extracted body. The pointer for the 
> indirect call is initialized by the JIT to point at some inline 
> assembly which is injected into the module, and this calls back in to 
> the JIT to trigger compilation of the extracted body. In the future I 
> plan to make the redirection strategy a parameter of the 
> CompileOnDemand layer. Double-indirection is the safest: It preserves 
> function-pointer equality and works with non-writable executable 
> memory. However, there's no reason we couldn't use single indirection 
> (for extra speed where pointer-equality isn't required), or 
> patchpoints (for clients who can allocate writable/executable memory), 
> or any combination of the three. My intent is that this should be up 
> to the client.
>
> As a brief note: the CompileOnDemand layer 
> doesn't handle lazy compilation itself, just lazy symbol resolution 
> (i.e. symbols are resolved on first call, not when compiling). If 
> you've put the CompileOnDemand layer on top of the LazyEmitLayer then 
> deferring symbol lookup automatically defers compilation. (E.g. You 
> can remove the LazyEmitLayer in main.cpp of the lazydemo and you'll 
> get indirection and callbacks, but no lazy compilation).
>
> (7)
> Q. Do the new APIs support cross-target JITing like MCJIT does?
> A. Yes.
>
> (7.a)
> Q. Do the new APIs support cross-target (or cross process) lazy-jitting?
> A. Not yet, but all that is required is for us to add a small amount 
> of runtime to the JIT'd process to call back in to the JIT via some 
> RPC mechanism. There are no significant barriers to implementing this 
> that I'm aware of.
>
> (8)
> Q. Do any of the components implement the ExecutionEngine interface?
> A. None of the components do, but the MCJITReplacement class does.
>
> (9)
> Q. Does this address any of the long-standing issues with MCJIT - 
> Stackmap parsing? Debugging? Thread-local-storage?
> A. No, but it doesn't get in the way either. These features are still 
> on the road-map (such as it exists) and I'm hoping that the modular 
> nature of Orc will allow us to play around with new features like this 
> without any risk of disturbing existing clients, and so allow us to 
> make faster progress.
>
> (10)
> Q. Why is part X of the patch (ugly | buggy | in the wrong place) ?
> A. I'm still tidying the patch up - please save patch specific 
> feedback for llvm-commits, otherwise we'll get cross-talk between 
> the threads. The patches should be coming soon.
>
> ---
>
> As mentioned above, I'm happy to answer further general questions 
> about what these APIs can do, or where I see them going. Feedback on 
> the patch itself should be directed to the llvm-commits list when I 
> start posting patches there for discussion.
>
>
> * Marketing slogans abound: "Very MachO". "Some warts". "Surprisingly 
> friendly with ELF". "Not yet on speaking terms with DWARF".
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


