[LLVMdev] New JIT APIs

Philip Reames listmail at philipreames.com
Fri Jan 16 15:02:36 PST 2015


On 01/16/2015 02:08 PM, Armin Steinhoff wrote:
> Hi Lang,
>
> Lang Hames schrieb:
>> Hi Armin,
>>
>> > The MCJIT API can only be used once to JIT compile external sources 
>> to executable code into the address space of a running process.
>
> That means: after the first successful JIT compile it isn't possible 
> to do it again (within the same active process) ... because of some 
> resource issues.
Er, this is definitely something specific to your use case or 
environment.  I'm doing thousands of compiles in the same process on an 
extremely regular basis with no problems.
>
>>
>> I'm not sure exactly what you mean by "can only be used once" in this 
>> context. Regardless, the new APIs are definitely designed to make it 
>> easier to load, unload and replace modules, and I hope they will 
>> support a wider range of use cases off-the-shelf than MCJIT does.
>
> OK ... sounds interesting, I will test it.
>
>
> Regards
>
> Armin
>
>
>>
>> Cheers,
>> Lang.
>>
>> On Fri, Jan 16, 2015 at 2:41 AM, Armin Steinhoff
>> <armin at steinhoff.de> wrote:
>>
>>
>>     Hi Lang,
>>
>>     we are using the JIT API of TCC and  the MCJIT API in order to
>>     import external code into a running control application process.
>>
>>     The MCJIT API can only be used once to JIT compile external
>>     sources to executable code into the address space of a running
>>     process.
>>
>>     Does your JIT API have the same restriction? It would be very
>>     nice if your JIT API could provide functionality similar to
>>     that provided by TCC.
>>
>>     Best Regards
>>
>>     Armin
>>
>>
>>     Lang Hames schrieb:
>>>     Hi All,
>>>
>>>     The attached patch (against r225842) contains some new JIT APIs
>>>     that I've been working on. I'm going to start breaking it up,
>>>     tidying it up, and submitting patches to llvm-commits soon, but
>>>     while I'm working on that I thought I'd put the whole patch out
>>>     for the curious to start playing around with and/or commenting on.
>>>
>>>     The aim of these new APIs is to cleanly support a wider range of
>>>     JIT use cases in LLVM, and to recover some of the functionality
>>>     lost when the legacy JIT was removed. In particular, I wanted to
>>>     see if I could re-enable lazy compilation while following
>>>     MCJIT's design philosophy of relying on the MC layer and
>>>     module-at-a-time compilation. The attached patch goes some way
>>>     to addressing these aims, though there's a lot still to do.
>>>
>>>     The 20,000 ft overview, for those who want to get straight to
>>>     the code:
>>>
>>>     The new APIs are not built on top of the MCJIT class, as I
>>>     didn't want a single class trying to be all things to all
>>>     people. Instead, the new APIs consist of a set of software
>>>     components for building JITs. The idea is that you should be
>>>     able to take these off the shelf and compose them reasonably
>>>     easily to get the behavior that you want. In the future I hope
>>>     that people who are working on LLVM-based JITs, if they find
>>>     this approach useful, will contribute back components that
>>>     they've built locally and that they think would be useful for a
>>>     wider audience. As a demonstration of the practicality of this
>>>     approach the attached patch contains a class, MCJITReplacement,
>>>     that composes some of the components to re-create the behavior
>>>     of MCJIT. This works well enough to pass all MCJIT regression
>>>     and unit tests on Darwin, and all but four regression tests on
>>>     Linux. The patch also contains the desired "new" feature:
>>>     Function-at-a-time lazy jitting in roughly the style of the
>>>     legacy JIT. The attached lazydemo.tgz file contains a program
>>>     which composes the new JIT components (including the
>>>     lazy-jitting component) to lazily execute bitcode. I've tested
>>>     this program on Darwin and it can run non-trivial benchmark
>>>     programs, e.g. 401.bzip2 from SPEC2006.
>>>
>>>     These new APIs are named after the motivating feature: On
>>>     Request Compilation, or ORC. I believe the logo potential is
>>>     outstanding. I'm picturing an Orc riding a Dragon. If I'm honest
>>>     this was at least 45% of my motivation for doing this project*.
>>>
>>>     You'll find the new headers in
>>>     llvm/include/llvm/ExecutionEngine/OrcJIT/*.h, and the
>>>     implementation files in lib/ExecutionEngine/OrcJIT/*.
>>>
>>>     I imagine there will be a number of questions about the design
>>>     and implementation. I've tried to preempt a few below, but
>>>     please fire away with anything I've left out.
>>>
>>>     Also, thanks to Jim Grosbach, Michael Illseman, David Blaikie,
>>>     Pete Cooper, Eric Christopher, and Louis Gerbarg for taking time
>>>     out to review, discuss and test this thing as I've worked on it.
>>>
>>>     Cheers,
>>>     Lang.
>>>
>>>     Possible questions:
>>>
>>>     (1)
>>>     Q. Are you trying to kill off MCJIT?
>>>     A. There are no plans to remove MCJIT. The new APIs are designed
>>>     to live alongside it.
>>>
>>>     (2)
>>>     Q. What do "JIT components" look like, and how do you compose them?
>>>     A. The classes and functions you'll find in OrcJIT/*.h fall into
>>>     two rough categories: Layers and Utilities. Layers are classes
>>>     that implement a small common interface that makes them easy to
>>>     compose:
>>>
>>>     class SomeLayer {
>>>     private:
>>>       // Implementation details
>>>     public:
>>>       // Implementation details
>>>
>>>       typedef ??? Handle;
>>>
>>>       template <typename ModuleSet>
>>>       Handle addModuleSet(ModuleSet&& Ms);
>>>
>>>       void removeModuleSet(Handle H);
>>>
>>>       uint64_t getSymbolAddress(StringRef Name,
>>>                                 bool ExportedSymbolsOnly);
>>>
>>>       uint64_t lookupSymbolAddressIn(Handle H, StringRef Name,
>>>                                      bool ExportedSymbolsOnly);
>>>     };
>>>
>>>     Layers are usually designed to sit one-on-top-of-another, with
>>>     each doing some sort of useful work before handing off to the
>>>     layer below it. The layers that are currently included in the
>>>     patch are the CompileOnDemandLayer, which breaks up modules
>>>     and redirects calls to not-yet-compiled functions back into the
>>>     JIT; the LazyEmitLayer, which defers adding modules to the layer
>>>     below until a symbol in the module is actually requested; the
>>>     IRCompilingLayer, which compiles bitcode to objects; and the
>>>     ObjectLinkingLayer, which links sets of objects in memory using
>>>     RuntimeDyld.
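
To make the stacking idea concrete, here is a minimal sketch in plain C++. It assumes nothing about the real Orc headers: ToyModule, ToyLinkingLayer and ToyCountingLayer are hypothetical names invented for illustration, and a "module" is reduced to a table of symbol names and addresses. It only shows the shape: an upper layer does some work of its own and then forwards to the layer below through the same small interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a module: just a table of symbol names and addresses.
// Hypothetical type for illustration; it is not llvm::Module.
struct ToyModule {
  std::map<std::string, std::uint64_t> Symbols;
};
using ToyModuleSet = std::vector<ToyModule>;

// Bottom of the stack: records the symbols of each module set under a handle.
class ToyLinkingLayer {
public:
  using Handle = unsigned;

  Handle addModuleSet(ToyModuleSet Ms) {
    Handle H = NextHandle++;
    for (const ToyModule &M : Ms)
      Tables[H].insert(M.Symbols.begin(), M.Symbols.end());
    return H;
  }

  void removeModuleSet(Handle H) { Tables.erase(H); }

  // Searches every live module set; returns 0 if the symbol is unknown.
  std::uint64_t getSymbolAddress(const std::string &Name) const {
    for (const auto &Entry : Tables) {
      auto I = Entry.second.find(Name);
      if (I != Entry.second.end())
        return I->second;
    }
    return 0;
  }

private:
  Handle NextHandle = 1;
  std::map<Handle, std::map<std::string, std::uint64_t>> Tables;
};

// An upper layer: does a little work of its own (counting modules) and then
// hands everything to the layer below it.
template <typename BaseLayerT> class ToyCountingLayer {
public:
  using Handle = typename BaseLayerT::Handle;

  explicit ToyCountingLayer(BaseLayerT &B) : Base(B) {}

  Handle addModuleSet(ToyModuleSet Ms) {
    ModulesSeen += Ms.size();
    return Base.addModuleSet(std::move(Ms));
  }

  void removeModuleSet(Handle H) { Base.removeModuleSet(H); }

  std::uint64_t getSymbolAddress(const std::string &Name) const {
    return Base.getSymbolAddress(Name);
  }

  std::size_t modulesSeen() const { return ModulesSeen; }

private:
  BaseLayerT &Base;
  std::size_t ModulesSeen = 0;
};

// Usage sketch: stack the counting layer on top of the linking layer.
inline void demoLayerStack() {
  ToyLinkingLayer Link;
  ToyCountingLayer<ToyLinkingLayer> Top(Link);
  ToyModule M;
  M.Symbols["foo"] = 0x1000;
  ToyModuleSet Ms;
  Ms.push_back(M);
  Top.addModuleSet(Ms);
  assert(Top.getSymbolAddress("foo") == 0x1000);
}
```

Because the upper layer is a template over its base, any class with the same member names can sit underneath it, which is roughly the composability described above.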
>>>
>>>     Utilities are everything that's not a layer. Ideally the heavy
>>>     lifting is done by the utilities. Layers just wrap certain
>>>     use cases to make them easy to compose.
>>>
>>>     Clients are free to use utilities directly, or compose layers,
>>>     or implement new utilities or layers.
>>>
>>>     (3)
>>>     Q. Why "addModuleSet" rather than "addModule"?
>>>     A. Allowing multiple modules to be passed around together allows
>>>     layers lower in the stack to perform interesting optimizations.
>>>     E.g. direct calls between objects that are allocated
>>>     sufficiently close in memory. To add a single Module you just
>>>     add a single-element set.
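
As a rough illustration of the "single-element set" point, the sketch below wraps one module in a one-element container before handing it to a layer. ToyModule, ToyLayer and the addModule helper are hypothetical names used only here; the patch itself may or may not offer such a convenience function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical module type and layer, for illustration only.
struct ToyModule {
  std::string Name;
};

class ToyLayer {
public:
  using Handle = std::size_t;

  // Accepts any container of modules, mirroring a templated addModuleSet.
  template <typename ModuleSet> Handle addModuleSet(ModuleSet &&Ms) {
    Sets.emplace_back(Ms.begin(), Ms.end());
    return Sets.size() - 1;
  }

  std::size_t modulesIn(Handle H) const { return Sets.at(H).size(); }

private:
  std::vector<std::vector<ToyModule>> Sets;
};

// Convenience wrapper: a single module is added as a one-element set.
template <typename LayerT>
typename LayerT::Handle addModule(LayerT &Layer, ToyModule M) {
  std::vector<ToyModule> Singleton;
  Singleton.push_back(std::move(M));
  return Layer.addModuleSet(std::move(Singleton));
}

// Usage sketch.
inline void demoSingleModule() {
  ToyLayer L;
  ToyLayer::Handle H = addModule(L, ToyModule{"only"});
  assert(L.modulesIn(H) == 1);
}
```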
>>>
>>>     (4)
>>>     Q. What happened to "finalize"?
>>>     A. In the Orc APIs, getSymbolAddress automatically finalizes as
>>>     necessary before returning addresses to the client. When you get
>>>     an address back from getSymbolAddress, that address is ready to
>>>     call.
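
The "finalize on lookup" behavior can be pictured with a toy model like the one below. It is a sketch under stated assumptions, not the Orc implementation: ToyAutoFinalizingLayer is a hypothetical class, and "finalizing" is reduced to moving entries from a pending list into a ready table, standing in for whatever a real JIT must do before code is callable.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy layer: symbols are queued when added and only become "ready"
// (finalized) the first time an address is requested.
class ToyAutoFinalizingLayer {
public:
  void addSymbol(const std::string &Name, std::uint64_t Addr) {
    Pending.emplace_back(Name, Addr);
  }

  // Finalizes any pending work, then returns the address (0 if unknown).
  std::uint64_t getSymbolAddress(const std::string &Name) {
    finalizePending();
    auto I = Ready.find(Name);
    return I == Ready.end() ? 0 : I->second;
  }

  bool hasPendingWork() const { return !Pending.empty(); }
  unsigned finalizeCount() const { return FinalizeCount; }

private:
  // Stand-in for the real finalization work; a no-op if nothing is pending.
  void finalizePending() {
    if (Pending.empty())
      return;
    for (const auto &P : Pending)
      Ready[P.first] = P.second;
    Pending.clear();
    ++FinalizeCount;
  }

  std::vector<std::pair<std::string, std::uint64_t>> Pending;
  std::map<std::string, std::uint64_t> Ready;
  unsigned FinalizeCount = 0;
};

// Usage sketch: the client never calls a separate finalize step.
inline void demoAutoFinalize() {
  ToyAutoFinalizingLayer L;
  L.addSymbol("foo", 0x1000);
  assert(L.hasPendingWork());
  assert(L.getSymbolAddress("foo") == 0x1000);
  assert(!L.hasPendingWork());
}
```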
>>>
>>>     (5)
>>>     Q. What does "removeModuleSet" do?
>>>     A. It removes the modules represented by the handle from the
>>>     JIT. The meaning of this is specific to each layer, but
>>>     generally speaking it means that any memory allocated for those
>>>     modules (and their corresponding Objects, linked sections, etc)
>>>     has been freed, and the symbols those modules provided are now
>>>     undefined. Calling getSymbolAddress for a symbol that was
>>>     defined in a module that has been removed is expected to return '0'.
>>>
>>>     (5a)
>>>     Q. How are the linked sections freed? RTDyldMemoryManager
>>>     doesn't have any "free.*Section" methods.
>>>     A. Each ModuleSet gets its own RTDyldMemoryManager, and that is
>>>     destroyed when the module set is freed. The choice of
>>>     RTDyldMemoryManager is up to the client, but the standard memory
>>>     managers will free the memory allocated for the linked sections
>>>     when they're destroyed.
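
The ownership pattern behind (5) and (5a) can be sketched as follows. ToyMemoryManager and ToyLayer are hypothetical stand-ins (this is not the RTDyldMemoryManager interface), and the two-argument addModuleSet signature is an assumption made for the example. The point is only that each module set owns its memory manager, so removing the set destroys the manager, releases its sections, and makes the set's symbols resolve to 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical memory manager: owns section memory, frees it on destruction.
class ToyMemoryManager {
public:
  explicit ToyMemoryManager(int &LiveManagers) : Live(LiveManagers) { ++Live; }
  ~ToyMemoryManager() { --Live; } // Section buffers are released here too.
  ToyMemoryManager(const ToyMemoryManager &) = delete;
  ToyMemoryManager &operator=(const ToyMemoryManager &) = delete;

  unsigned char *allocateSection(std::size_t Size) {
    Sections.emplace_back(Size);
    return Sections.back().data();
  }

private:
  int &Live;
  std::vector<std::vector<unsigned char>> Sections;
};

class ToyLayer {
public:
  using Handle = unsigned;

  // Each module set arrives with its own memory manager (assumed signature).
  Handle addModuleSet(std::map<std::string, std::uint64_t> Symbols,
                      std::unique_ptr<ToyMemoryManager> MM) {
    Handle H = NextHandle++;
    Sets.emplace(H, Entry{std::move(Symbols), std::move(MM)});
    return H;
  }

  // Dropping the entry destroys its memory manager and forgets its symbols.
  void removeModuleSet(Handle H) { Sets.erase(H); }

  // Returns 0 for symbols that were never defined or have been removed.
  std::uint64_t getSymbolAddress(const std::string &Name) const {
    for (const auto &KV : Sets) {
      auto I = KV.second.Symbols.find(Name);
      if (I != KV.second.Symbols.end())
        return I->second;
    }
    return 0;
  }

private:
  struct Entry {
    std::map<std::string, std::uint64_t> Symbols;
    std::unique_ptr<ToyMemoryManager> MM;
  };
  Handle NextHandle = 1;
  std::map<Handle, Entry> Sets;
};

// Usage sketch: removal frees the manager and undefines the symbol.
inline void demoRemoveModuleSet() {
  int Live = 0;
  ToyLayer L;
  std::unique_ptr<ToyMemoryManager> MM(new ToyMemoryManager(Live));
  std::map<std::string, std::uint64_t> Syms;
  Syms["foo"] = 0x2000;
  ToyLayer::Handle H = L.addModuleSet(Syms, std::move(MM));
  L.removeModuleSet(H);
  assert(Live == 0 && L.getSymbolAddress("foo") == 0);
}
```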
>>>
>>>     (6)
>>>     Q. How does the CompileOnDemand layer redirect calls to the JIT?
>>>     A. It currently uses double-indirection: Function bodies are
>>>     extracted into new modules, and the body of the original
>>>     function is replaced with an indirect call to the extracted
>>>     body. The pointer for the indirect call is initialized by the
>>>     JIT to point at some inline assembly which is injected into the
>>>     module, and this calls back in to the JIT to trigger compilation
>>>     of the extracted body. In the future I plan to make the
>>>     redirection strategy a parameter of the CompileOnDemand layer.
>>>     Double-indirection is the safest: It preserves function-pointer
>>>     equality and works with non-writable executable memory, however
>>>     there's no reason we couldn't use single indirection (for extra
>>>     speed where pointer-equality isn't required), or patchpoints
>>>     (for clients who can allocate writable/executable memory), or
>>>     any combination of the three. My intent is that this should be
>>>     up to the client.
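
A toy version of the double-indirection scheme, written in ordinary C++ rather than IR and inline assembly, might look like the sketch below. All names (square, squareBody, compileCallback, SquareBodyPtr) are hypothetical, and "compilation" is simulated by retargeting a function pointer. It illustrates why the scheme preserves function-pointer equality: callers only ever see the address of the stub, which never changes.

```cpp
#include <cassert>

namespace toy {

using Fn = int (*)(int);

int CompileCount = 0; // How many times the pretend JIT was invoked.

// The extracted function body, as if it lived in its own module.
int squareBody(int X) { return X * X; }

int compileCallback(int X); // Stand-in for the injected callback stub.

// First level of indirection: a pointer the JIT can retarget.
Fn SquareBodyPtr = compileCallback;

// Second level: the original function, now a stub whose address never
// changes, so pointers taken to it stay valid and compare equal.
int square(int X) { return SquareBodyPtr(X); }

// Reached only until the body has been "compiled": retarget, then forward.
int compileCallback(int X) {
  ++CompileCount; // Pretend compilation of the extracted body happens here.
  SquareBodyPtr = squareBody;
  return SquareBodyPtr(X);
}

// Usage sketch: the stub's address is stable across the first call.
inline void demoDoubleIndirection() {
  Fn Before = &square;
  assert(square(3) == 9);
  assert(Before == &square);
}

} // namespace toy
```

Note that only the pointer is written after the first call, never the code of the stub, which matches the remark above about working with non-writable executable memory.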
>>>
>>>     A brief note: the CompileOnDemand layer doesn't handle lazy
>>>     compilation itself, just lazy symbol resolution (i.e. symbols
>>>     are resolved on first call, not when compiling). If you've put
>>>     the CompileOnDemand layer on top of the LazyEmitLayer then
>>>     deferring symbol lookup automatically defers compilation. (E.g.
>>>     you can remove the LazyEmitLayer in main.cpp of the lazydemo and
>>>     you'll get indirection and callbacks, but no lazy compilation.)
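
The division of labor in that note can be sketched with two toy layers. ToyCompileLayer and ToyLazyEmitLayer are hypothetical names, and a module set is reduced to a symbol table; the sketch only assumes the behavior described earlier in this mail, namely that a lazy-emit layer holds module sets back until one of their symbols is requested. Adding directly to the compile layer would compile immediately, while going through the lazy layer defers the work until lookup.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A module set reduced to its symbol table (illustrative only).
using ToyModuleSet = std::map<std::string, std::uint64_t>;

// Eager layer: "compiles" a module set as soon as it is added.
class ToyCompileLayer {
public:
  void addModuleSet(const ToyModuleSet &Ms) {
    ++CompileCount;
    Symbols.insert(Ms.begin(), Ms.end());
  }

  std::uint64_t getSymbolAddress(const std::string &Name) const {
    auto I = Symbols.find(Name);
    return I == Symbols.end() ? 0 : I->second;
  }

  unsigned compileCount() const { return CompileCount; }

private:
  ToyModuleSet Symbols;
  unsigned CompileCount = 0;
};

// Lazy layer: holds module sets back until one of their symbols is requested.
template <typename BaseLayerT> class ToyLazyEmitLayer {
public:
  explicit ToyLazyEmitLayer(BaseLayerT &B) : Base(B) {}

  void addModuleSet(ToyModuleSet Ms) { Pending.push_back(std::move(Ms)); }

  std::uint64_t getSymbolAddress(const std::string &Name) {
    for (auto I = Pending.begin(); I != Pending.end(); ++I) {
      if (I->count(Name) != 0) {
        Base.addModuleSet(*I); // Emit only the set that defines Name.
        Pending.erase(I);
        break;
      }
    }
    return Base.getSymbolAddress(Name);
  }

private:
  BaseLayerT &Base;
  std::vector<ToyModuleSet> Pending;
};

// Usage sketch: nothing is compiled until a symbol is looked up.
inline void demoLazyEmit() {
  ToyCompileLayer Compile;
  ToyLazyEmitLayer<ToyCompileLayer> Lazy(Compile);
  ToyModuleSet Ms;
  Ms["foo"] = 0x10;
  Lazy.addModuleSet(Ms);
  assert(Compile.compileCount() == 0);
  assert(Lazy.getSymbolAddress("foo") == 0x10);
  assert(Compile.compileCount() == 1);
}
```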
>>>
>>>     (7)
>>>     Q. Do the new APIs support cross-target JITing like MCJIT does?
>>>     A. Yes.
>>>
>>>     (7.a)
>>>     Q. Do the new APIs support cross-target (or cross process)
>>>     lazy-jitting?
>>>     A. Not yet, but all that is required is for us to add a small
>>>     amount of runtime to the JIT'd process to call back in to the
>>>     JIT via some RPC mechanism. There are no significant barriers to
>>>     implementing this that I'm aware of.
>>>
>>>     (8)
>>>     Q. Do any of the components implement the ExecutionEngine interface?
>>>     A. None of the components do, but the MCJITReplacement class does.
>>>
>>>     (9)
>>>     Q. Does this address any of the long-standing issues with MCJIT
>>>     - Stackmap parsing? Debugging? Thread-local-storage?
>>>     A. No, but it doesn't get in the way either. These features are
>>>     still on the road-map (such as it exists) and I'm hoping that
>>>     the modular nature of Orc will let us play around with new
>>>     features like this without any risk of disturbing existing
>>>     clients, and so allow us to make faster progress.
>>>
>>>     (10)
>>>     Q. Why is part X of the patch (ugly | buggy | in the wrong place) ?
>>>     A. I'm still tidying the patch up - please save patch-specific
>>>     feedback for llvm-commits, otherwise we'll get cross-talk
>>>     between the threads. The patches should be coming soon.
>>>
>>>     ---
>>>
>>>     As mentioned above, I'm happy to answer further general
>>>     questions about what these APIs can do, or where I see them
>>>     going. Feedback on the patch itself should be directed to the
>>>     llvm-commits list when I start posting patches there for discussion.
>>>
>>>
>>>     * Marketing slogans abound: "Very MachO". "Some warts".
>>>     "Surprisingly friendly with ELF". "Not yet on speaking terms
>>>     with DWARF".
>>>
>>>
>>>     _______________________________________________
>>>     LLVM Developers mailing list
>>>     LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
