[LLVMdev] New JIT APIs
Armin Steinhoff
armin at steinhoff.de
Sun Jan 18 06:49:24 PST 2015
Philip Reames schrieb:
>
> On 01/16/2015 02:08 PM, Armin Steinhoff wrote:
>> Hi Lang,
>>
>> Lang Hames schrieb:
>>> Hi Armin,
>>>
>>> > The MCJIT API can only be used once to JIT compile external sources
>>> to executable code into the address space of a running process.
>>
>> That means: after the first successful JIT compile it isn't possible
>> to do it again (within the same active process) ... because of some
>> resource issues.
> Er, this is definitely something specific to your use case or
> environment. I'm doing thousands of compiles in the same process on
> an extremely regular basis with no problems.
Good to know ... I started two years ago and used the tool "lli" as a JIT
example, but didn't look into the atexit handling in detail.
Is there, in the meantime, any documentation available for the MCJIT API?
I'm not a specialist in compiler construction ... so it is a PITA for me
to work through the jungle of class definitions (most of them without any
comments explaining their semantics).
Would it not be possible to develop a user interface for the JIT
compilation API comparable to the user interface of, e.g., Clang?
Thanks so far
Armin
>>
>>>
>>> I'm not sure exactly what you mean by "can only be used once" in
>>> this context. Regardless, the new APIs are definitely designed to
>>> make it easier to load, unload and replace modules, and I hope they
>>> will support a wider range of use cases off-the-shelf than MCJIT does.
>>
>> OK ... sounds interesting, I will test it.
>>
>>
>> Regards
>>
>> Armin
>>
>>
>>>
>>> Cheers,
>>> Lang.
>>>
>>> On Fri, Jan 16, 2015 at 2:41 AM, Armin Steinhoff <armin at steinhoff.de> wrote:
>>>
>>>
>>> Hi Lang,
>>>
>>> we are using the JIT API of TCC and the MCJIT API in order to
>>> import external code into a running control application process.
>>>
>>> The MCJIT API can only be used once to JIT compile external
>>> sources to executable code into the address space of a running
>>> process.
>>>
>>> Does your JIT API have the same restriction? It would be very nice
>>> if your JIT API could provide similar functionality to that provided
>>> by TCC.
>>>
>>> Best Regards
>>>
>>> Armin
>>>
>>>
>>> Lang Hames schrieb:
>>>> Hi All,
>>>>
>>>> The attached patch (against r225842) contains some new JIT APIs
>>>> that I've been working on. I'm going to start breaking it up,
>>>> tidying it up, and submitting patches to llvm-commits soon, but
>>>> while I'm working on that I thought I'd put the whole patch out
>>>> for the curious to start playing around with and/or commenting on.
>>>>
>>>> The aim of these new APIs is to cleanly support a wider range
>>>> of JIT use cases in LLVM, and to recover some of the
>>>> functionality lost when the legacy JIT was removed. In
>>>> particular, I wanted to see if I could re-enable lazy
>>>> compilation while following MCJIT's design philosophy of
>>>> relying on the MC layer and module-at-a-time compilation. The
>>>> attached patch goes some way to addressing these aims, though
>>>> there's a lot still to do.
>>>>
>>>> The 20,000 ft overview, for those who want to get straight to
>>>> the code:
>>>>
>>>> The new APIs are not built on top of the MCJIT class, as I
>>>> didn't want a single class trying to be all things to all
>>>> people. Instead, the new APIs consist of a set of software
>>>> components for building JITs. The idea is that you should be
>>>> able to take these off the shelf and compose them reasonably
>>>> easily to get the behavior that you want. In the future I hope
>>>> that people who are working on LLVM-based JITs, if they find
>>>> this approach useful, will contribute back components that
>>>> they've built locally and that they think would be useful for a
>>>> wider audience. As a demonstration of the practicality of this
>>>> approach the attached patch contains a class, MCJITReplacement,
>>>> that composes some of the components to re-create the behavior
>>>> of MCJIT. This works well enough to pass all MCJIT regression
>>>> and unit tests on Darwin, and all but four regression tests on
>>>> Linux. The patch also contains the desired "new" feature:
>>>> Function-at-a-time lazy jitting in roughly the style of the
>>>> legacy JIT. The attached lazydemo.tgz file contains a program
>>>> which composes the new JIT components (including the
>>>> lazy-jitting component) to lazily execute bitcode. I've tested
>>>> this program on Darwin and it can run non-trivial benchmark
>>>> programs, e.g. 401.bzip2 from SPEC2006.
>>>>
>>>> These new APIs are named after the motivating feature: On
>>>> Request Compilation, or ORC. I believe the logo potential is
>>>> outstanding. I'm picturing an Orc riding a Dragon. If I'm
>>>> honest this was at least 45% of my motivation for doing this
>>>> project*.
>>>>
>>>> You'll find the new headers in
>>>> llvm/include/llvm/ExecutionEngine/OrcJIT/*.h, and the
>>>> implementation files in lib/ExecutionEngine/OrcJIT/*.
>>>>
>>>> I imagine there will be a number of questions about the design
>>>> and implementation. I've tried to preempt a few below, but
>>>> please fire away with anything I've left out.
>>>>
>>>> Also, thanks to Jim Grosbach, Michael Illseman, David Blaikie,
>>>> Pete Cooper, Eric Christopher, and Louis Gerbarg for taking
>>>> time out to review, discuss and test this thing as I've worked
>>>> on it.
>>>>
>>>> Cheers,
>>>> Lang.
>>>>
>>>> Possible questions:
>>>>
>>>> (1)
>>>> Q. Are you trying to kill off MCJIT?
>>>> A. There are no plans to remove MCJIT. The new APIs are
>>>> designed to live alongside it.
>>>>
>>>> (2)
>>>> Q. What do "JIT components" look like, and how do you compose them?
>>>> A. The classes and functions you'll find in OrcJIT/*.h fall
>>>> into two rough categories: Layers and Utilities. Layers are
>>>> classes that implement a small common interface that makes them
>>>> easy to compose:
>>>>
>>>> class SomeLayer {
>>>> private:
>>>>   // Implementation details
>>>> public:
>>>>   // Implementation details
>>>>
>>>>   typedef ??? Handle;
>>>>
>>>>   template <typename ModuleSet>
>>>>   Handle addModuleSet(ModuleSet&& Ms);
>>>>
>>>>   void removeModuleSet(Handle H);
>>>>
>>>>   uint64_t getSymbolAddress(StringRef Name, bool ExportedSymbolsOnly);
>>>>
>>>>   uint64_t lookupSymbolAddressIn(Handle H, StringRef Name,
>>>>                                  bool ExportedSymbolsOnly);
>>>> };
>>>>
>>>> Layers are usually designed to sit one-on-top-of-another, with
>>>> each doing some sort of useful work before handing off to the
>>>> layer below it. The layers that are currently included in the
>>>> patch are the the CompileOnDemandLayer, which breaks up modules
>>>> and redirects calls to not-yet-compiled functions back into the
>>>> JIT; the LazyEmitLayer, which defers adding modules to the
>>>> layer below until a symbol in the module is actually requested;
>>>> the IRCompilingLayer, which compiles bitcode to objects; and
>>>> the ObjectLinkingLayer, which links sets of objects in memory
>>>> using RuntimeDyld.
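>>>>
>>>> To make the composition concrete, here is a minimal sketch of stacking
>>>> those four layers bottom-up. The template parameters and constructor
>>>> arguments (the TargetMachine TM, the SimpleCompiler functor) are
>>>> assumptions for illustration, not necessarily the patch's exact
>>>> signatures:
>>>>
>>>> // Bottom of the stack: links objects into memory via RuntimeDyld.
>>>> ObjectLinkingLayer<> ObjLayer;
>>>> // Compiles IR to objects and hands them to ObjLayer.
>>>> IRCompilingLayer<decltype(ObjLayer)> CompileLayer(ObjLayer,
>>>>                                                   SimpleCompiler(*TM));
>>>> // Defers handing modules to CompileLayer until a symbol is requested.
>>>> LazyEmitLayer<decltype(CompileLayer)> LazyLayer(CompileLayer);
>>>> // Splits out function bodies and redirects calls back into the JIT.
>>>> CompileOnDemandLayer<decltype(LazyLayer)> CODLayer(LazyLayer);
>>>>
>>>> A client then adds modules to, and looks symbols up in, whichever
>>>> layer sits at the top of its stack; each layer forwards work to the
>>>> one below.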
>>>>
>>>> Utilities are everything that's not a layer. Ideally the heavy
>>>> lifting is done by the utilities. Layers just wrap certain
>>>> use cases to make them easy to compose.
>>>>
>>>> Clients are free to use utilities directly, or compose layers,
>>>> or implement new utilities or layers.
>>>>
>>>> (3)
>>>> Q. Why "addModuleSet" rather than "addModule"?
>>>> A. Allowing multiple modules to be passed around together
>>>> allows layers lower in the stack to perform interesting
>>>> optimizations, e.g. using direct calls between objects that are
>>>> allocated sufficiently close in memory. To add a single Module
>>>> you just add a single-element set.
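>>>>
>>>> For example (a hypothetical sketch; the container type the layers
>>>> accept for a module set is an assumption here):
>>>>
>>>> std::vector<std::unique_ptr<Module>> Ms;
>>>> Ms.push_back(std::move(M));                  // a single-element set
>>>> auto H = Layer.addModuleSet(std::move(Ms));  // keep H to remove it later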
>>>>
>>>> (4)
>>>> Q. What happened to "finalize"?
>>>> A. In the Orc APIs, getSymbolAddress automatically finalizes as
>>>> necessary before returning addresses to the client. When you
>>>> get an address back from getSymbolAddress, that address is
>>>> ready to call.
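>>>>
>>>> So lookup-then-call is all a client needs (sketch only; the symbol
>>>> name, signature, and ArgC/ArgV are placeholders):
>>>>
>>>> // getSymbolAddress finalizes as required; the result is ready to call.
>>>> if (uint64_t Addr = Layer.getSymbolAddress("main", true)) {
>>>>   auto *Main = reinterpret_cast<int (*)(int, char **)>(
>>>>       static_cast<uintptr_t>(Addr));
>>>>   Main(ArgC, ArgV);
>>>> }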
>>>>
>>>> (5)
>>>> Q. What does "removeModuleSet" do?
>>>> A. It removes the modules represented by the handle from the
>>>> JIT. The meaning of this is specific to each layer, but
>>>> generally speaking it means that any memory allocated for those
>>>> modules (and their corresponding Objects, linked sections, etc)
>>>> has been freed, and the symbols those modules provided are now
>>>> undefined. Calling getSymbolAddress for a symbol that was
>>>> defined in a module that has been removed is expected to return
>>>> '0'.
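>>>>
>>>> Continuing the hypothetical sketch from above:
>>>>
>>>> Layer.removeModuleSet(H);   // frees objects, linked sections, etc.
>>>> assert(Layer.getSymbolAddress("foo", true) == 0 &&
>>>>        "symbols from a removed module set should no longer resolve");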
>>>>
>>>> (5a)
>>>> Q. How are the linked sections freed? RTDyldMemoryManager
>>>> doesn't have any "free.*Section" methods.
>>>> A. Each ModuleSet gets its own RTDyldMemoryManager, and that is
>>>> destroyed when the module set is freed. The choice of
>>>> RTDyldMemoryManager is up to the client, but the standard
>>>> memory managers will free the memory allocated for the linked
>>>> sections when they're destroyed.
>>>>
>>>> (6)
>>>> Q. How does the CompileOnDemand layer redirect calls to the JIT?
>>>> A. It currently uses double-indirection: Function bodies are
>>>> extracted into new modules, and the body of the original
>>>> function is replaced with an indirect call to the extracted
>>>> body. The pointer for the indirect call is initialized by the
>>>> JIT to point at some inline assembly which is injected into the
>>>> module, and this calls back in to the JIT to trigger
>>>> compilation of the extracted body. In the future I plan to make
>>>> the redirection strategy a parameter of the CompileOnDemand
>>>> layer. Double-indirection is the safest: It preserves
>>>> function-pointer equality and works with non-writable
>>>> executable memory; however, there's no reason we couldn't use
>>>> single indirection (for extra speed where pointer-equality
>>>> isn't required), or patchpoints (for clients who can allocate
>>>> writable/executable memory), or any combination of the three.
>>>> My intent is that this should be up to the client.
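>>>>
>>>> In plain C++ terms (a made-up sketch of the behaviour; the real patch
>>>> performs this rewrite on IR and injects assembly rather than source),
>>>> double-indirection looks roughly like this:
>>>>
>>>> static int foo_body(int X);             // extracted body, compiled lazily
>>>> static int foo_jit_callback(int X);     // stand-in for the injected stub
>>>>
>>>> static int (*foo_ptr)(int) = &foo_jit_callback;
>>>>
>>>> int foo(int X) { return foo_ptr(X); }   // original: now an indirect call
>>>>
>>>> static int foo_jit_callback(int X) {
>>>>   // In the real system this re-enters the JIT, which compiles foo_body
>>>>   // and repoints foo_ptr at it before resuming the call.
>>>>   foo_ptr = &foo_body;
>>>>   return foo_body(X);
>>>> }
>>>>
>>>> static int foo_body(int X) { return X + 1; }   // placeholder body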
>>>>
>>>> As a brief note: the CompileOnDemand
>>>> layer doesn't handle lazy compilation itself, just lazy symbol
>>>> resolution (i.e. symbols are resolved on first call, not when
>>>> compiling). If you've put the CompileOnDemand layer on top of
>>>> the LazyEmitLayer then deferring symbol lookup automatically
>>>> defers compilation. (E.g. You can remove the LazyEmitLayer in
>>>> main.cpp of the lazydemo and you'll get indirection and
>>>> callbacks, but no lazy compilation).
>>>>
>>>> (7)
>>>> Q. Do the new APIs support cross-target JITing like MCJIT does?
>>>> A. Yes.
>>>>
>>>> (7.a)
>>>> Q. Do the new APIs support cross-target (or cross process)
>>>> lazy-jitting?
>>>> A. Not yet, but all that is required is for us to add a small
>>>> amount of runtime to the JIT'd process to call back in to the
>>>> JIT via some RPC mechanism. There are no significant barriers
>>>> to implementing this that I'm aware of.
>>>>
>>>> (8)
>>>> Q. Do any of the components implement the ExecutionEngine
>>>> interface?
>>>> A. None of the components do, but the MCJITReplacement class does.
>>>>
>>>> (9)
>>>> Q. Does this address any of the long-standing issues with MCJIT
>>>> - Stackmap parsing? Debugging? Thread-local-storage?
>>>> A. No, but it doesn't get in the way either. These features are
>>>> still on the road-map (such as it exists) and I'm hoping that
>>>> the modular nature of Orc will allow us to play around with new
>>>> features like this without any risk of disturbing existing
>>>> clients, and so allow us to make faster progress.
>>>>
>>>> (10)
>>>> Q. Why is part X of the patch (ugly | buggy | in the wrong place) ?
>>>> A. I'm still tidying the patch up - please save patch specific
>>>> feedback for llvm-commits, otherwise we'll get cross-talk
>>>> between the threads. The patches should be coming soon.
>>>>
>>>> ---
>>>>
>>>> As mentioned above, I'm happy to answer further general
>>>> questions about what these APIs can do, or where I see them
>>>> going. Feedback on the patch itself should be directed to the
>>>> llvm-commits list when I start posting patches there for
>>>> discussion.
>>>>
>>>>
>>>> * Marketing slogans abound: "Very MachO". "Some warts".
>>>> "Surprisingly friendly with ELF". "Not yet on speaking terms
>>>> with DWARF".
>>>>
>>>>