[LLVMdev] New JIT APIs

Fri Jan 16 14:08:49 PST 2015

Hi Lang,

Lang Hames wrote:
> Hi Armin,
>
> > The MCJIT API can only be used once to JIT compile external sources
> to executable code into the address space of a running process.

That means: after the first successful JIT compile it isn't possible to
do it again (within the same active process) ... because of some
resource issues.

>
> I'm not sure exactly what you mean by "can only be used once" in this 
> context. Regardless, the new APIs are definitely designed to make it 
> easier to load, unload and replace modules, and I hope they will
> support a wider range of use cases off-the-shelf than MCJIT does.

OK ... sounds interesting, I will test it.

Regards

Armin

>
> Cheers,
> Lang.
>
> On Fri, Jan 16, 2015 at 2:41 AM, Armin Steinhoff <armin at steinhoff.de>
> wrote:
>
>
>     Hi Lang,
>
>     We are using the JIT API of TCC and the MCJIT API in order to
>     import external code into a running control application process.
>
>     The MCJIT API can only be used once to JIT compile external sources
>     to executable code into the address space of a running process.
>
>     Does your JIT API have the same restriction? It would be very nice if
>     your JIT API could provide functionality similar to that of TCC.
>
>     Best Regards
>
>     Armin
>
>
>     Lang Hames wrote:
>>     Hi All,
>>
>>     The attached patch (against r225842) contains some new JIT APIs
>>     that I've been working on. I'm going to start breaking it up,
>>     tidying it up, and submitting patches to llvm-commits soon, but
>>     while I'm working on that I thought I'd put the whole patch out
>>     for the curious to start playing around with and/or commenting on.
>>
>>     The aim of these new APIs is to cleanly support a wider range of
>>     JIT use cases in LLVM, and to recover some of the functionality
>>     lost when the legacy JIT was removed. In particular, I wanted to
>>     see if I could re-enable lazy compilation while following MCJIT's
>>     design philosophy of relying on the MC layer and module-at-a-time
>>     compilation. The attached patch goes some way to addressing these
>>     aims, though there's a lot still to do.
>>
>>     The 20,000 ft overview, for those who want to get straight to the
>>     code:
>>
>>     The new APIs are not built on top of the MCJIT class, as I didn't
>>     want a single class trying to be all things to all people.
>>     Instead, the new APIs consist of a set of software components for
>>     building JITs. The idea is that you should be able to take these
>>     off the shelf and compose them reasonably easily to get the
>>     behavior that you want. In the future I hope that people who are
>>     working on LLVM-based JITs, if they find this approach useful,
>>     will contribute back components that they've built locally and
>>     that they think would be useful for a wider audience. As a
>>     demonstration of the practicality of this approach the attached
>>     patch contains a class, MCJITReplacement, that composes some of
>>     the components to re-create the behavior of MCJIT. This works
>>     well enough to pass all MCJIT regression and unit tests on
>>     Darwin, and all but four regression tests on Linux. The patch
>>     also contains the desired "new" feature: Function-at-a-time lazy
>>     jitting in roughly the style of the legacy JIT. The attached
>>     lazydemo.tgz file contains a program which composes the new JIT
>>     components (including the lazy-jitting component) to lazily
>>     execute bitcode. I've tested this program on Darwin and it can
>>     run non-trivial benchmark programs, e.g. 401.bzip2 from SPEC2006.
>>
>>     These new APIs are named after the motivating feature: On Request
>>     Compilation, or ORC. I believe the logo potential is outstanding.
>>     I'm picturing an Orc riding a Dragon. If I'm honest this was at
>>     least 45% of my motivation for doing this project*.
>>
>>     You'll find the new headers in
>>     llvm/include/llvm/ExecutionEngine/OrcJIT/*.h, and the
>>     implementation files in lib/ExecutionEngine/OrcJIT/*.
>>
>>     I imagine there will be a number of questions about the design
>>     and implementation. I've tried to preempt a few below, but please
>>     fire away with anything I've left out.
>>
>>     Also, thanks to Jim Grosbach, Michael Illseman, David Blaikie,
>>     Pete Cooper, Eric Christopher, and Louis Gerbarg for taking time
>>     out to review, discuss and test this thing as I've worked on it.
>>
>>     Cheers,
>>     Lang.
>>
>>     Possible questions:
>>
>>     (1)
>>     Q. Are you trying to kill off MCJIT?
>>     A. There are no plans to remove MCJIT. The new APIs are designed
>>     to live alongside it.
>>
>>     (2)
>>     Q. What do "JIT components" look like, and how do you compose them?
>>     A. The classes and functions you'll find in OrcJIT/*.h fall into
>>     two rough categories: Layers and Utilities. Layers are classes
>>     that implement a small common interface that makes them easy to
>>     compose:
>>
>>     class SomeLayer {
>>     private:
>>       // Implementation details
>>     public:
>>       // Implementation details
>>
>>       typedef ??? Handle;
>>
>>       template <typename ModuleSet>
>>       Handle addModuleSet(ModuleSet&& Ms);
>>
>>       void removeModuleSet(Handle H);
>>
>>       uint64_t getSymbolAddress(StringRef Name, bool ExportedSymbolsOnly);
>>
>>       uint64_t lookupSymbolAddressIn(Handle H, StringRef Name,
>>                                      bool ExportedSymbolsOnly);
>>     };
>>
>>     Layers are usually designed to sit one-on-top-of-another, with
>>     each doing some sort of useful work before handing off to the
>>     layer below it. The layers that are currently included in the
>>     patch are the CompileOnDemandLayer, which breaks up modules
>>     and redirects calls to not-yet-compiled functions back into the
>>     JIT; the LazyEmitLayer, which defers adding modules to the layer
>>     below until a symbol in the module is actually requested; the
>>     IRCompilingLayer, which compiles bitcode to objects; and the
>>     ObjectLinkingLayer, which links sets of objects in memory using
>>     RuntimeDyld.
>>
>>     Utilities are everything that's not a layer. Ideally the heavy
>>     lifting is done by the utilities. Layers just wrap certain
>>     use-cases to make them easy to compose.
>>
>>     Clients are free to use utilities directly, or compose layers, or
>>     implement new utilities or layers.
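
As a rough illustration of the layer concept in (2), here is a minimal, self-contained sketch that does not use the patch's classes. ToyModule, ToyModuleSet, ToyLinkingLayer and ToyCountingLayer are hypothetical stand-ins invented for this example. Only the method names (addModuleSet, removeModuleSet, getSymbolAddress, lookupSymbolAddressIn) follow the interface quoted above, and the ExportedSymbolsOnly parameter is omitted for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a module: a bag of symbol -> address entries.
struct ToyModule {
  std::map<std::string, uint64_t> Symbols;
};

typedef std::vector<ToyModule> ToyModuleSet;

// Bottom of the stack: stores module sets and answers symbol queries.
class ToyLinkingLayer {
public:
  typedef unsigned Handle;

  Handle addModuleSet(ToyModuleSet Ms) {
    Handle H = NextHandle++;
    Sets.insert(std::make_pair(H, std::move(Ms)));
    return H;
  }

  void removeModuleSet(Handle H) { Sets.erase(H); }

  // Searches every live module set; returns 0 if the symbol is unknown.
  uint64_t getSymbolAddress(const std::string &Name) const {
    for (const auto &KV : Sets)
      if (uint64_t Addr = lookupSymbolAddressIn(KV.first, Name))
        return Addr;
    return 0;
  }

  // Searches only the module set identified by H.
  uint64_t lookupSymbolAddressIn(Handle H, const std::string &Name) const {
    auto SetI = Sets.find(H);
    if (SetI == Sets.end())
      return 0;
    for (const ToyModule &M : SetI->second) {
      auto SymI = M.Symbols.find(Name);
      if (SymI != M.Symbols.end())
        return SymI->second;
    }
    return 0;
  }

private:
  Handle NextHandle = 1;
  std::map<Handle, ToyModuleSet> Sets;
};

// An upper layer: does a little work of its own (counting modules), then
// hands off to whatever layer sits below it.
template <typename BaseLayerT> class ToyCountingLayer {
public:
  typedef typename BaseLayerT::Handle Handle;

  explicit ToyCountingLayer(BaseLayerT &Base) : Base(Base) {}

  Handle addModuleSet(ToyModuleSet Ms) {
    ModulesAdded += Ms.size();
    return Base.addModuleSet(std::move(Ms));
  }

  void removeModuleSet(Handle H) { Base.removeModuleSet(H); }

  uint64_t getSymbolAddress(const std::string &Name) const {
    return Base.getSymbolAddress(Name);
  }

  uint64_t lookupSymbolAddressIn(Handle H, const std::string &Name) const {
    return Base.lookupSymbolAddressIn(H, Name);
  }

  std::size_t modulesAdded() const { return ModulesAdded; }

private:
  BaseLayerT &Base;
  std::size_t ModulesAdded = 0;
};
```

In this toy model, a stack is built by constructing ToyCountingLayer<ToyLinkingLayer> over a ToyLinkingLayer instance. A single module is added as a one-element ToyModuleSet, and a lookup after removeModuleSet yields 0.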
>>
>>     (3)
>>     Q. Why "addModuleSet" rather than "addModule"?
>>     A. Allowing multiple modules to be passed around together allows
>>     layers lower in the stack to perform interesting optimizations.
>>     E.g. direct calls between objects that are allocated sufficiently
>>     close in memory. To add a single Module you just add a
>>     single-element set.
>>
>>     (4)
>>     Q. What happened to "finalize"?
>>     A. In the Orc APIs, getSymbolAddress automatically finalizes as
>>     necessary before returning addresses to the client. When you get
>>     an address back from getSymbolAddress, that address is ready to call.
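
To illustrate what "ready to call" means on the client side, here is a minimal sketch. The lookup returns a plain uint64_t, and the client casts it to a function pointer of the appropriate type. toyGetSymbolAddress, toyAddOne and toIntFn are hypothetical names for this example. The lookup returns the address of an ordinary compiled function rather than JIT'd code, and the cast assumes a platform where function pointers fit in a uintptr_t.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// An ordinary function standing in for JIT'd code.
int toyAddOne(int X) { return X + 1; }

// Hypothetical lookup: returns the address for a known name, or 0.
uint64_t toyGetSymbolAddress(const std::string &Name) {
  if (Name == "toyAddOne")
    return static_cast<uint64_t>(reinterpret_cast<uintptr_t>(&toyAddOne));
  return 0;
}

// Casts an address to a callable int(int) function pointer. The caller is
// responsible for checking that the address is non-zero first.
typedef int (*IntFnPtr)(int);
IntFnPtr toIntFn(uint64_t Addr) {
  return reinterpret_cast<IntFnPtr>(static_cast<uintptr_t>(Addr));
}
```

A client would check the returned address against 0 before casting and calling it, since 0 signals an unknown symbol.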
>>
>>     (5)
>>     Q. What does "removeModuleSet" do?
>>     A. It removes the modules represented by the handle from the JIT.
>>     The meaning of this is specific to each layer, but generally
>>     speaking it means that any memory allocated for those modules
>>     (and their corresponding Objects, linked sections, etc) has been
>>     freed, and the symbols those modules provided are now undefined.
>>     Calling getSymbolAddress for a symbol that was defined in a
>>     module that has been removed is expected to return '0'.
>>
>>     (5a)
>>     Q. How are the linked sections freed? RTDyldMemoryManager doesn't
>>     have any "free.*Section" methods.
>>     A. Each ModuleSet gets its own RTDyldMemoryManager, and that is
>>     destroyed when the module set is freed. The choice of
>>     RTDyldMemoryManager is up to the client, but the standard memory
>>     managers will free the memory allocated for the linked sections
>>     when they're destroyed.
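
The ownership model described in (5a) can be sketched with plain RAII. This is a toy model, not the RuntimeDyld code. ToyMemoryManager, ToyObjectLayer and LiveSectionBytes are hypothetical names, and the "sections" are ordinary heap buffers rather than executable memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Counts live section bytes across all managers, so frees are observable.
static std::size_t LiveSectionBytes = 0;

// Hypothetical memory manager: owns every section allocated through it and
// releases them all when it is destroyed.
class ToyMemoryManager {
public:
  ~ToyMemoryManager() {
    for (const auto &S : Sections)
      LiveSectionBytes -= S.size();
  }

  uint8_t *allocateSection(std::size_t Size) {
    Sections.emplace_back(Size);
    LiveSectionBytes += Size;
    return Sections.back().data();
  }

private:
  std::vector<std::vector<uint8_t>> Sections;
};

// Each module set gets its own manager. Erasing the handle destroys the
// manager, which in turn frees that set's sections.
class ToyObjectLayer {
public:
  typedef unsigned Handle;

  Handle addModuleSet(std::unique_ptr<ToyMemoryManager> MM) {
    Handle H = NextHandle++;
    Managers[H] = std::move(MM);
    return H;
  }

  ToyMemoryManager &memoryManagerFor(Handle H) { return *Managers.at(H); }

  void removeModuleSet(Handle H) { Managers.erase(H); }

private:
  Handle NextHandle = 1;
  std::map<Handle, std::unique_ptr<ToyMemoryManager>> Managers;
};
```

Because no explicit per-section free call exists in this model either, removing one module set releases exactly the sections that were allocated through its manager and leaves other sets untouched.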
>>
>>     (6)
>>     Q. How does the CompileOnDemand layer redirect calls to the JIT?
>>     A. It currently uses double-indirection: Function bodies are
>>     extracted into new modules, and the body of the original function
>>     is replaced with an indirect call to the extracted body. The
>>     pointer for the indirect call is initialized by the JIT to point
>>     at some inline assembly which is injected into the module, and
>>     this calls back in to the JIT to trigger compilation of the
>>     extracted body. In the future I plan to make the redirection
>>     strategy a parameter of the CompileOnDemand layer.
>>     Double-indirection is the safest: It preserves function-pointer
>>     equality and works with non-writable executable memory, however
>>     there's no reason we couldn't use single indirection (for extra
>>     speed where pointer-equality isn't required), or patchpoints (for
>>     clients who can allocate writable/executable memory), or any
>>     combination of the three. My intent is that this should be up to
>>     the client.
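
The double-indirection scheme can be modelled in ordinary C++, with a function pointer standing in for the indirect-call slot and a plain function standing in for the injected assembly. Everything here (squareStub, squareBody, compileCallback, SquarePtr, CompileCount) is hypothetical. The real layer rewrites IR, and its callback triggers actual compilation rather than selecting an already-compiled function.

```cpp
#include <cassert>

// Stands in for the extracted function body, which in the real scheme
// would not exist as machine code until it is first needed.
static int squareBody(int X) { return X * X; }

static int compileCallback(int X);

// The indirect-call slot. It initially points at the callback that
// re-enters the "JIT".
static int (*SquarePtr)(int) = &compileCallback;

// Number of times the "JIT" was asked to compile the body.
static int CompileCount = 0;

// Stands in for the injected callback: "compile" the body, redirect the
// slot to it, then complete the call that triggered compilation.
static int compileCallback(int X) {
  ++CompileCount;
  SquarePtr = &squareBody;
  return SquarePtr(X);
}

// The original function, whose body is now just an indirect call. Its
// address never changes, so pointers to it taken before and after
// compilation compare equal.
int squareStub(int X) { return SquarePtr(X); }
```

Only the data slot SquarePtr is written after startup, which is why this approach does not need writable executable memory. The cost is an extra indirect call on every invocation.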
>>
>>     As a brief note: the CompileOnDemand layer
>>     doesn't handle lazy compilation itself, just lazy symbol
>>     resolution (i.e. symbols are resolved on first call, not when
>>     compiling). If you've put the CompileOnDemand layer on top of the
>>     LazyEmitLayer then deferring symbol lookup automatically defers
>>     compilation. (E.g. You can remove the LazyEmitLayer in main.cpp
>>     of the lazydemo and you'll get indirection and callbacks, but no
>>     lazy compilation).
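
The deferral performed by a lazy-emit style layer can be sketched as follows. ToyModule, ToyEagerLayer and ToyLazyEmitLayer are hypothetical names. "Emitting" a module here just means handing it to the base layer, which counts how many modules it was asked to process. For brevity the sketch adds single modules (addModule) rather than module sets.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a module: a bag of symbol -> address entries.
struct ToyModule {
  std::map<std::string, uint64_t> Symbols;
};

// Base layer: does the expensive work as soon as a module is added.
class ToyEagerLayer {
public:
  void addModule(const ToyModule &M) {
    ++ModulesCompiled;
    for (const auto &KV : M.Symbols)
      Addresses[KV.first] = KV.second;
  }

  uint64_t getSymbolAddress(const std::string &Name) const {
    auto I = Addresses.find(Name);
    return I == Addresses.end() ? 0 : I->second;
  }

  unsigned ModulesCompiled = 0;

private:
  std::map<std::string, uint64_t> Addresses;
};

// Holds modules back until one of their symbols is requested.
class ToyLazyEmitLayer {
public:
  explicit ToyLazyEmitLayer(ToyEagerLayer &Base) : Base(Base) {}

  void addModule(ToyModule M) { Pending.push_back(std::move(M)); }

  uint64_t getSymbolAddress(const std::string &Name) {
    // Emit the first pending module that defines Name, if there is one.
    for (auto I = Pending.begin(); I != Pending.end(); ++I) {
      if (I->Symbols.count(Name)) {
        Base.addModule(*I);
        Pending.erase(I);
        break;
      }
    }
    return Base.getSymbolAddress(Name);
  }

private:
  ToyEagerLayer &Base;
  std::vector<ToyModule> Pending;
};
```

With the lazy layer in place, the base layer's ModulesCompiled counter stays at 0 until a symbol is looked up. Calling ToyEagerLayer::addModule directly instead corresponds to removing the lazy layer: every module is processed up front.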
>>
>>     (7)
>>     Q. Do the new APIs support cross-target JITing like MCJIT does?
>>     A. Yes.
>>
>>     (7.a)
>>     Q. Do the new APIs support cross-target (or cross process)
>>     lazy-jitting?
>>     A. Not yet, but all that is required is for us to add a small
>>     amount of runtime to the JIT'd process to call back in to the JIT
>>     via some RPC mechanism. There are no significant barriers to
>>     implementing this that I'm aware of.
>>
>>     (8)
>>     Q. Do any of the components implement the ExecutionEngine interface?
>>     A. None of the components do, but the MCJITReplacement class does.
>>
>>     (9)
>>     Q. Does this address any of the long-standing issues with MCJIT -
>>     Stackmap parsing? Debugging? Thread-local-storage?
>>     A. No, but it doesn't get in the way either. These features are
>>     still on the road-map (such as it exists) and I'm hoping that the
>>     modular nature of Orc will allow us to play around with new features
>>     like this without any risk of disturbing existing clients, and so
>>     allow us to make faster progress.
>>
>>     (10)
>>     Q. Why is part X of the patch (ugly | buggy | in the wrong place) ?
>>     A. I'm still tidying the patch up - please save patch specific
>>     feedback for llvm-commits, otherwise we'll get cross-talk
>>     between the threads. The patches should be coming soon.
>>
>>     ---
>>
>>     As mentioned above, I'm happy to answer further general questions
>>     about what these APIs can do, or where I see them going. Feedback
>>     on the patch itself should be directed to the llvm-commits list
>>     when I start posting patches there for discussion.
>>
>>
>>     * Marketing slogans abound: "Very MachO". "Some warts".
>>     "Surprisingly friendly with ELF". "Not yet on speaking terms with
>>     DWARF".
>>
>>
>>     _______________________________________________
>>     LLVM Developers mailing list
>>     LLVMdev at cs.uiuc.edu  <mailto:LLVMdev at cs.uiuc.edu>          http://llvm.cs.uiuc.edu
>>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
