[LLVMdev] New JIT APIs

Armin Steinhoff armin at steinhoff.de
Sun Jan 18 06:49:24 PST 2015


Philip Reames wrote:
>
> On 01/16/2015 02:08 PM, Armin Steinhoff wrote:
>> Hi Lang,
>>
>> Lang Hames wrote:
>>> Hi Armin,
>>>
>>> > The MCJIT API can only be used once to JIT compile external sources 
>>> to executable code into the address space of a running process.
>>
>> That means: after the first successful JIT compile it isn't possible 
>> to do it again (within the same active process) ... because of some 
>> resource issues.
> Er, this is definitely something specific to your use case or 
> environment.  I'm doing thousands of compiles in the same process on 
> an extremely regular basis with no problems.

Good to know ... I started 2 years ago and used the tool "lli" as a JIT 
example, but didn't look into the atexit handling in detail.
Is there in the meantime any documentation available about the "MCJIT API"?
I'm not a specialist in compiler construction ... so it is a PITA for 
me to go through the jungle of class definitions (most of them without 
any comments explaining their semantics).

Would it be possible to develop a user interface for the JIT compile API, 
comparable to the user interface of e.g. Clang?

Thanks so far

Armin


>>
>>>
>>> I'm not sure exactly what you mean by "can only be used once" in 
>>> this context. Regardless, the new APIs are definitely designed to 
>>> make it easier to load, unload and replace modules, and I hope they 
>>> will support a wider range of use cases off-the-shelf than MCJIT does.
>>
>> OK ... sounds interesting, I will test it.
>>
>>
>> Regards
>>
>> Armin
>>
>>
>>>
>>> Cheers,
>>> Lang.
>>>
>>> On Fri, Jan 16, 2015 at 2:41 AM, Armin Steinhoff
>>> <armin at steinhoff.de> wrote:
>>>
>>>
>>>     Hi Lang,
>>>
>>>     we are using the JIT API of TCC and the MCJIT API in order to
>>>     import external code into a running control application process.
>>>
>>>     The MCJIT API can only be used once to JIT compile external
>>>     sources to executable code into the address space of a running
>>>     process.
>>>
>>>     Does your JIT API have the same restriction?  It would be very
>>>     nice if your JIT API could provide similar functionality to
>>>     that provided by TCC.
>>>
>>>     Best Regards
>>>
>>>     Armin
>>>
>>>
>>>     Lang Hames wrote:
>>>>     Hi All,
>>>>
>>>>     The attached patch (against r225842) contains some new JIT APIs
>>>>     that I've been working on. I'm going to start breaking it up,
>>>>     tidying it up, and submitting patches to llvm-commits soon, but
>>>>     while I'm working on that I thought I'd put the whole patch out
>>>>     for the curious to start playing around with and/or commenting on.
>>>>
>>>>     The aim of these new APIs is to cleanly support a wider range
>>>>     of JIT use cases in LLVM, and to recover some of the
>>>>     functionality lost when the legacy JIT was removed. In
>>>>     particular, I wanted to see if I could re-enable lazy
>>>>     compilation while following MCJIT's design philosophy of
>>>>     relying on the MC layer and module-at-a-time compilation. The
>>>>     attached patch goes some way to addressing these aims, though
>>>>     there's a lot still to do.
>>>>
>>>>     The 20,000 ft overview, for those who want to get straight to
>>>>     the code:
>>>>
>>>>     The new APIs are not built on top of the MCJIT class, as I
>>>>     didn't want a single class trying to be all things to all
>>>>     people. Instead, the new APIs consist of a set of software
>>>>     components for building JITs. The idea is that you should be
>>>>     able to take these off the shelf and compose them reasonably
>>>>     easily to get the behavior that you want. In the future I hope
>>>>     that people who are working on LLVM-based JITs, if they find
>>>>     this approach useful, will contribute back components that
>>>>     they've built locally and that they think would be useful for a
>>>>     wider audience. As a demonstration of the practicality of this
>>>>     approach the attached patch contains a class, MCJITReplacement,
>>>>     that composes some of the components to re-create the behavior
>>>>     of MCJIT. This works well enough to pass all MCJIT regression
>>>>     and unit tests on Darwin, and all but four regression tests on
>>>>     Linux. The patch also contains the desired "new" feature:
>>>>     Function-at-a-time lazy jitting in roughly the style of the
>>>>     legacy JIT. The attached lazydemo.tgz file contains a program
>>>>     which composes the new JIT components (including the
>>>>     lazy-jitting component) to lazily execute bitcode. I've tested
>>>>     this program on Darwin and it can run non-trivial benchmark
>>>>     programs, e.g. 401.bzip2 from SPEC2006.
>>>>
>>>>     These new APIs are named after the motivating feature: On
>>>>     Request Compilation, or ORC. I believe the logo potential is
>>>>     outstanding. I'm picturing an Orc riding a Dragon. If I'm
>>>>     honest this was at least 45% of my motivation for doing this
>>>>     project*.
>>>>
>>>>     You'll find the new headers in
>>>>     llvm/include/llvm/ExecutionEngine/OrcJIT/*.h, and the
>>>>     implementation files in lib/ExecutionEngine/OrcJIT/*.
>>>>
>>>>     I imagine there will be a number of questions about the design
>>>>     and implementation. I've tried to preempt a few below, but
>>>>     please fire away with anything I've left out.
>>>>
>>>>     Also, thanks to Jim Grosbach, Michael Illseman, David Blaikie,
>>>>     Pete Cooper, Eric Christopher, and Louis Gerbarg for taking
>>>>     time out to review, discuss and test this thing as I've worked
>>>>     on it.
>>>>
>>>>     Cheers,
>>>>     Lang.
>>>>
>>>>     Possible questions:
>>>>
>>>>     (1)
>>>>     Q. Are you trying to kill off MCJIT?
>>>>     A. There are no plans to remove MCJIT. The new APIs are
>>>>     designed to live alongside it.
>>>>
>>>>     (2)
>>>>     Q. What do "JIT components" look like, and how do you compose them?
>>>>     A. The classes and functions you'll find in OrcJIT/*.h fall
>>>>     into two rough categories: Layers and Utilities. Layers are
>>>>     classes that implement a small common interface that makes them
>>>>     easy to compose:
>>>>
>>>>     class SomeLayer {
>>>>     private:
>>>>       // Implementation details.
>>>>     public:
>>>>       typedef ??? Handle;
>>>>
>>>>       template <typename ModuleSet>
>>>>       Handle addModuleSet(ModuleSet&& Ms);
>>>>
>>>>       void removeModuleSet(Handle H);
>>>>
>>>>       uint64_t getSymbolAddress(StringRef Name,
>>>>                                 bool ExportedSymbolsOnly);
>>>>
>>>>       uint64_t lookupSymbolAddressIn(Handle H, StringRef Name,
>>>>                                      bool ExportedSymbolsOnly);
>>>>     };
>>>>
>>>>     Layers are usually designed to sit one-on-top-of-another, with
>>>>     each doing some sort of useful work before handing off to the
>>>>     layer below it. The layers that are currently included in the
>>>>     patch are the CompileOnDemandLayer, which breaks up modules
>>>>     and redirects calls to not-yet-compiled functions back into the
>>>>     JIT; the LazyEmitLayer, which defers adding modules to the
>>>>     layer below until a symbol in the module is actually requested;
>>>>     the IRCompilingLayer, which compiles bitcode to objects; and
>>>>     the ObjectLinkingLayer, which links sets of objects in memory
>>>>     using RuntimeDyld.
>>>>
>>>>     Utilities are everything that's not a layer. Ideally the heavy
>>>>     lifting is done by the utilities. Layers just wrap certain
>>>>     use cases to make them easy to compose.
>>>>
>>>>     Clients are free to use utilities directly, or compose layers,
>>>>     or implement new utilities or layers.
>>>>
>>>>     (3)
>>>>     Q. Why "addModuleSet" rather than "addModule"?
>>>>     A. Allowing multiple modules to be passed around together
>>>>     allows layers lower in the stack to perform interesting
>>>>     optimizations, e.g. enabling direct calls between objects that
>>>>     are allocated sufficiently close in memory. To add a single
>>>>     Module you just add a single-element set.
>>>>
>>>>     (4)
>>>>     Q. What happened to "finalize"?
>>>>     A. In the Orc APIs, getSymbolAddress automatically finalizes as
>>>>     necessary before returning addresses to the client. When you
>>>>     get an address back from getSymbolAddress, that address is
>>>>     ready to call.
>>>>
>>>>     (5)
>>>>     Q. What does "removeModuleSet" do?
>>>>     A. It removes the modules represented by the handle from the
>>>>     JIT. The meaning of this is specific to each layer, but
>>>>     generally speaking it means that any memory allocated for those
>>>>     modules (and their corresponding Objects, linked sections, etc)
>>>>     has been freed, and the symbols those modules provided are now
>>>>     undefined. Calling getSymbolAddress for a symbol that was
>>>>     defined in a module that has been removed is expected to return
>>>>     '0'.
>>>>
>>>>     (5a)
>>>>     Q. How are the linked sections freed? RTDyldMemoryManager
>>>>     doesn't have any "free.*Section" methods.
>>>>     A. Each ModuleSet gets its own RTDyldMemoryManager, and that is
>>>>     destroyed when the module set is freed. The choice of
>>>>     RTDyldMemoryManager is up to the client, but the standard
>>>>     memory managers will free the memory allocated for the linked
>>>>     sections when they're destroyed.
>>>>
>>>>     (6)
>>>>     Q. How does the CompileOnDemand layer redirect calls to the JIT?
>>>>     A. It currently uses double-indirection: Function bodies are
>>>>     extracted into new modules, and the body of the original
>>>>     function is replaced with an indirect call to the extracted
>>>>     body. The pointer for the indirect call is initialized by the
>>>>     JIT to point at some inline assembly which is injected into the
>>>>     module, and this calls back in to the JIT to trigger
>>>>     compilation of the extracted body. In the future I plan to make
>>>>     the redirection strategy a parameter of the CompileOnDemand
>>>>     layer. Double-indirection is the safest: It preserves
>>>>     function-pointer equality and works with non-writable
>>>>     executable memory, however there's no reason we couldn't use
>>>>     single indirection (for extra speed where pointer-equality
>>>>     isn't required), or patchpoints (for clients who can allocate
>>>>     writable/executable memory), or any combination of the three.
>>>>     My intent is that this should be up to the client.
>>>>
>>>>     One brief note: the CompileOnDemand layer doesn't handle lazy
>>>>     compilation itself, just lazy symbol resolution (i.e. symbols
>>>>     are resolved on first call, not when compiling). If you've put
>>>>     the CompileOnDemand layer on top of the LazyEmitLayer then
>>>>     deferring symbol lookup automatically defers compilation.
>>>>     (E.g. you can remove the LazyEmitLayer in main.cpp of the
>>>>     lazydemo and you'll get indirection and callbacks, but no lazy
>>>>     compilation.)
>>>>
>>>>     (7)
>>>>     Q. Do the new APIs support cross-target JITing like MCJIT does?
>>>>     A. Yes.
>>>>
>>>>     (7.a)
>>>>     Q. Do the new APIs support cross-target (or cross process)
>>>>     lazy-jitting?
>>>>     A. Not yet, but all that is required is for us to add a small
>>>>     amount of runtime to the JIT'd process to call back in to the
>>>>     JIT via some RPC mechanism. There are no significant barriers
>>>>     to implementing this that I'm aware of.
>>>>
>>>>     (8)
>>>>     Q. Do any of the components implement the ExecutionEngine
>>>>     interface?
>>>>     A. None of the components do, but the MCJITReplacement class does.
>>>>
>>>>     (9)
>>>>     Q. Does this address any of the long-standing issues with MCJIT
>>>>     - Stackmap parsing? Debugging? Thread-local-storage?
>>>>     A. No, but it doesn't get in the way either. These features are
>>>>     still on the road-map (such as it exists) and I'm hoping that
>>>>     the modular nature of Orc will allow us to play around with new
>>>>     features like this without any risk of disturbing existing
>>>>     clients, and so allow us to make faster progress.
>>>>
>>>>     (10)
>>>>     Q. Why is part X of the patch (ugly | buggy | in the wrong place) ?
>>>>     A. I'm still tidying the patch up - please save patch-specific
>>>>     feedback for llvm-commits, otherwise we'll get cross-talk
>>>>     between the threads. The patches should be coming soon.
>>>>
>>>>     ---
>>>>
>>>>     As mentioned above, I'm happy to answer further general
>>>>     questions about what these APIs can do, or where I see them
>>>>     going. Feedback on the patch itself should be directed to the
>>>>     llvm-commits list when I start posting patches there for
>>>>     discussion.
>>>>
>>>>
>>>>     * Marketing slogans abound: "Very MachO". "Some warts".
>>>>     "Surprisingly friendly with ELF". "Not yet on speaking terms
>>>>     with DWARF".
>>>>
>>>>
>>>>     _______________________________________________
>>>>     LLVM Developers mailing list
>>>>     LLVMdev at cs.uiuc.edu  <mailto:LLVMdev at cs.uiuc.edu>          http://llvm.cs.uiuc.edu
>>>>     http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
