[LLVMdev] RFC: MCJIT enhancements

Jim Grosbach grosbach at apple.com
Thu Jul 26 14:36:30 PDT 2012


Hi Andy,

This all sounds fantastic. I'm sure we'll have some spirited discussions about the details, but I completely agree with the approach and goals. Good stuff, and thanks for working on this!

-Jim

On Jul 24, 2012, at 3:24 PM, "Kaylor, Andrew" <andrew.kaylor at intel.com> wrote:

> Following up and expanding on an earlier conversation, I'd like to discuss making several non-trivial changes to the MCJIT engine and related objects.  There may be some interdependencies between these changes, but I think that they can be logically grouped as follows:
>  
> * Lazy module compilation
>  
> * Enhance the JIT memory manager interface to enable section-based memory protection
>  
> * Clean up object/memory ownership problems
>  
> * Introduce a mechanism for caching generated objects
>  
> * Support JIT events for MCJIT
>  
> * Support ELF generation on Windows
>  
>  
> Now let me expand on each of these a bit.
>  
> ---------------------
> Lazy compilation
> ---------------------
>  
> Right now a Module is passed to the EngineBuilder::create call and is compiled immediately within the MCJIT constructor.  I don't believe it is possible to perform lazy compilation on a per-function basis as was done in the legacy JIT interface, but there's no reason that module compilation and loading can't be deferred until a function is requested.  I think this is a fairly straightforward and non-controversial change.
>  

Yep, this is also effectively a prerequisite for meaningful lazy JITing of multiple modules.
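
To make the deferred-compilation idea concrete, here is a minimal, self-contained sketch. It does not use any LLVM API: LazyEngine, CompileFn and the integer "addresses" are hypothetical stand-ins, and the only point is that compilation moves out of the constructor and into the first function lookup.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for an engine that defers module compilation.
// None of these names are LLVM APIs.
class LazyEngine {
public:
  using SymbolTable = std::map<std::string, int>; // name -> fake address
  using CompileFn = std::function<SymbolTable()>;

  // Store the compile step instead of running it in the constructor.
  explicit LazyEngine(CompileFn Fn) : Compile(std::move(Fn)) {
    assert(Compile && "a compile step must be supplied");
  }

  // The first lookup compiles the whole module; later lookups reuse the
  // result. Returns 0 for an unknown symbol.
  int getPointerToFunction(const std::string &Name) {
    if (!Compiled) {
      Symbols = Compile();
      Compiled = true;
      ++CompileCount;
    }
    auto It = Symbols.find(Name);
    return It == Symbols.end() ? 0 : It->second;
  }

  bool isCompiled() const { return Compiled; }
  int compileCount() const { return CompileCount; }

private:
  CompileFn Compile;
  SymbolTable Symbols;
  bool Compiled = false;
  int CompileCount = 0;
};
```

With several modules, each one could carry its own deferred compile step, so that only modules whose functions are actually requested get compiled.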

>  
> ----------------------------------
> Memory manager interface
> ----------------------------------
>  
> The memory manager changes are a bit more complicated.  In the current MCJIT implementation, MCJIT clients may specify a memory manager through the EngineBuilder and this memory manager will be passed along to the MCJIT engine.  The JITMemoryManager interface was originally created to suit the needs of the legacy JIT engine.  As the MCJIT engine has come into existence, its memory needs have been somewhat wedged into the old JITMemoryManager interface.  The result has been that there are effectively two interfaces existing side by side in the same abstract base class, and implementations will typically only support one or the other set of functions.  This is further complicated by the RuntimeDyldMemoryManager, which doesn't inherit from the JITMemoryManager but does share a subset of functions and is implemented as a wrapper around the client-supplied memory manager in the current MCJIT code.  There's also an MCJITMemoryManager class which derives from RuntimeDyld and is used by MCJIT but isn't exposed to the client at all.
>  
> I'd like to see this situation cleaned up as we move forward with a way for the client to specify an MCJITMemoryManager directly and a separation of the legacy memory manager interface from the new interface.  In addition, I'd like the new memory manager interface to be extended with the functions that the RuntimeDyld needs to manage setting of section-specific permissions including non-writable code and read-only data.

The current situation is a bit ad hoc because we have been trying to minimize the impact of MCJIT bring-up on any clients of the old JIT. Now that the MCJIT is becoming more mature, we can start being a bit more aggressive about that and make the interfaces conform to what the MCJIT really needs.
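
As a rough illustration of what section-based protection could look like, here is a self-contained sketch. The class name, SectionKind, and applyPermissions are assumptions made for this example and are not part of any existing LLVM interface. Permissions are only recorded as flags here; a real implementation would presumably call the platform's page-protection facility at the point marked in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this is not the JITMemoryManager API.
enum class SectionKind { Code, ReadOnlyData, ReadWriteData };

constexpr unsigned PermRead = 1u;
constexpr unsigned PermWrite = 2u;
constexpr unsigned PermExec = 4u;

class SectionMemoryManagerSketch {
public:
  // Every section starts out read/write so the loader can copy bytes in
  // and apply relocations.
  std::uint8_t *allocateSection(SectionKind Kind, std::size_t Size) {
    Sections.push_back(
        {Kind, std::vector<std::uint8_t>(Size), PermRead | PermWrite});
    return Sections.back().Bytes.data();
  }

  // Called once loading and relocation are finished. A real manager would
  // change the page protections here; this sketch only records the flags.
  void applyPermissions() {
    for (Section &S : Sections) {
      switch (S.Kind) {
      case SectionKind::Code:
        S.Perms = PermRead | PermExec; // non-writable code
        break;
      case SectionKind::ReadOnlyData:
        S.Perms = PermRead; // read-only data
        break;
      case SectionKind::ReadWriteData:
        S.Perms = PermRead | PermWrite;
        break;
      }
    }
  }

  unsigned permissionsOf(std::size_t Index) const {
    assert(Index < Sections.size() && "section index out of range");
    return Sections[Index].Perms;
  }

private:
  struct Section {
    SectionKind Kind;
    std::vector<std::uint8_t> Bytes;
    unsigned Perms;
  };
  std::vector<Section> Sections;
};
```

The design point is the two-phase lifecycle: allocation hands out writable memory, and a separate finalization step tightens permissions per section kind.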

>  
>  
> ----------------------
> Object ownership
> ----------------------
>  
> Within the current MCJIT implementation there are a number of ugly cross dependencies in object/memory life cycle, making the MCJIT/RuntimeDyld/JITMemoryManager relationship fragile and very dependent upon a particular order of object destruction. 
>  
> In the current implementation, the buffer into which code is generated is owned by the MCJIT component, but when sections from the generated code are loaded, they are loaded into memory owned by the JITMemoryManager.  During object loading, an ObjectFile is created within the RuntimeDyld which references both of these buffers.  The situation is further complicated in the case where the RuntimeDyldELF object attempts to register the generated object with GDB, because the GDB interface requires a reference to both memory buffers.  The GDB-required references are currently maintained in an ObjectImage instance which is held by RuntimeDyldELF.
>  
> For those who are visually oriented, I am attaching a diagram which shows the object relationships.
>  
> I would like to change this by introducing an ObjectBuffer which would be allocated by the MCJIT object at compilation time and then passed to the RuntimeDyld::loadObject.  RuntimeDyld::loadObject would hand this ObjectBuffer off to the new ObjectImage instance (which it already creates today).  The ObjectImage would be returned from the RuntimeDyld::loadObject to MCJIT and MCJIT would own that object.
>  
> Again, a diagram is attached.
>  
>  
Sounds reasonable.
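
The proposed hand-off can be sketched with plain ownership transfer. ObjectBuffer, ObjectImage and loadObject are the names from the proposal, but the signatures, the Loader and Engine classes, and the use of std::unique_ptr are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Holds the bytes produced by code generation (illustrative only).
struct ObjectBuffer {
  std::vector<std::uint8_t> Bytes;
};

// Takes ownership of the buffer it was built from, so the image and its
// backing bytes share a single lifetime.
class ObjectImage {
public:
  explicit ObjectImage(std::unique_ptr<ObjectBuffer> B)
      : Buffer(std::move(B)) {
    assert(Buffer && "an image needs a backing buffer");
  }
  std::size_t size() const { return Buffer->Bytes.size(); }

private:
  std::unique_ptr<ObjectBuffer> Buffer;
};

// Stand-in for the dynamic loader: consumes the buffer, returns the image.
class Loader {
public:
  std::unique_ptr<ObjectImage> loadObject(std::unique_ptr<ObjectBuffer> B) {
    return std::unique_ptr<ObjectImage>(new ObjectImage(std::move(B)));
  }
};

// Stand-in for the engine: allocates the buffer at compile time and ends
// up owning the resulting image.
class Engine {
public:
  void compileAndLoad(std::size_t FakeObjectSize) {
    std::unique_ptr<ObjectBuffer> Buffer(new ObjectBuffer);
    Buffer->Bytes.assign(FakeObjectSize, 0); // pretend codegen output
    Image = Dyld.loadObject(std::move(Buffer));
  }
  const ObjectImage *image() const { return Image.get(); }

private:
  Loader Dyld;
  std::unique_ptr<ObjectImage> Image;
};
```

Because each object has exactly one owner at any time, nothing in this arrangement depends on a particular order of destruction between the engine and the loader.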

> -------------------
> Object caching
> -------------------
>  
> I would like to introduce a mechanism whereby JITed objects could be cached and loaded from cache at a future time.  It seems that the caching should logically be performed transparently within the MCJIT engine, but it also seems that the caching mechanism will need to be provided by the client (or at least under client control).  I think that the best way to accomplish this is to introduce a new interface (ObjectManager?), similar to the memory manager, that can be specified by the client and used by the MCJIT engine.  What I have in mind is that when MCJIT is about to compile a module, it will first check with the ObjectManager to see if a pre-compiled image is available.  If so, it will simply pass that image to the RuntimeDyld for loading.  If not, it will compile as usual.  Likewise, after compilation (but before loading) MCJIT would offer the compiled image to the ObjectManager and the ObjectManager could save the image to a file cache (or whatever).
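
The check-then-compile-then-offer flow described above could look roughly like the following. ObjectManager is the tentative name from the proposal; the lookup and store methods, the string module identifier, and the in-memory cache are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

using ObjectBytes = std::string; // stand-in for a compiled object image

// Hypothetical client-supplied interface, analogous to the memory manager.
class ObjectManager {
public:
  virtual ~ObjectManager() = default;
  // Returns true and fills Out if a pre-compiled image is available.
  virtual bool lookup(const std::string &ModuleId, ObjectBytes &Out) = 0;
  // Offered after compilation, before loading; the manager may save it.
  virtual void store(const std::string &ModuleId, const ObjectBytes &Obj) = 0;
};

// Simplest possible manager: keeps images in a map instead of a file cache.
class InMemoryObjectManager : public ObjectManager {
public:
  bool lookup(const std::string &ModuleId, ObjectBytes &Out) override {
    auto It = Cache.find(ModuleId);
    if (It == Cache.end())
      return false;
    Out = It->second;
    return true;
  }
  void store(const std::string &ModuleId, const ObjectBytes &Obj) override {
    Cache[ModuleId] = Obj;
  }

private:
  std::map<std::string, ObjectBytes> Cache;
};

// What the engine would do when it is about to compile a module. Manager
// may be null, in which case the module is always compiled.
ObjectBytes getObjectForModule(const std::string &ModuleId,
                               ObjectManager *Manager,
                               const std::function<ObjectBytes()> &Compile) {
  ObjectBytes Obj;
  if (Manager && Manager->lookup(ModuleId, Obj))
    return Obj;      // cache hit: skip compilation entirely
  Obj = Compile();   // cache miss: compile as usual
  if (Manager)
    Manager->store(ModuleId, Obj);
  return Obj;        // the caller would pass this on for loading
}
```

How a module is identified for cache lookup is left open here; a real cache would also need some way to detect a stale image.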
>  
>  
> -----------------
> MCJIT Events
> -----------------
>  
> The legacy JIT interface supported a JITEventListener interface that would provide notification when new functions were JITed.  That interface was used to enable source-level profiling of JITed code.  There is currently no equivalent for the MCJIT engine.  I'd like to correct that.
>  
> In order to get to the function level, some code somewhere will need to parse the format-specific object code that is generated.  I think that it makes sense for that to be pushed to the event listener itself since some listeners may be satisfied with the raw object itself as input.
>  
> Most of the support necessary to parse object images to obtain source-level debug information is already present in LLVM somewhere, though not all of it is handily exposed through interfaces in the include tree.  In order to implement a default event listener that provides the basic functionality of the existing JIT event listeners, some extensions to the existing DIContext and related DebugInfo classes will be necessary.  The development of an interface to expose more DWARF information may prove worthy of its own discussion, as it would also enable better testing, but for now I'll just mention it as a requirement here.
>  
> It may make sense to subsume the object loading event under the ObjectManager interface described in the "object caching" section above rather than overloading the existing JITEventListener interface within MCJIT, as the other events in that interface don't properly apply to MCJIT.
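
One possible shape for an object-level notification is sketched below. The listener receives the raw object and decides for itself whether to parse it. ObjectEventListener, notifyObjectEmitted and EngineEvents are hypothetical names chosen for this example; this is not the existing JITEventListener interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical listener interface: the engine hands over the raw object
// and leaves any format-specific parsing to the listener.
class ObjectEventListener {
public:
  virtual ~ObjectEventListener() = default;
  virtual void notifyObjectEmitted(const std::string &RawObject) = 0;
};

// A listener that is satisfied with the raw object: it only records sizes.
class SizeRecordingListener : public ObjectEventListener {
public:
  void notifyObjectEmitted(const std::string &RawObject) override {
    Sizes.push_back(RawObject.size());
  }
  std::vector<std::size_t> Sizes;
};

// Stand-in for the part of the engine that tracks and notifies listeners.
// Listeners are not owned; callers must unregister before destroying them.
class EngineEvents {
public:
  void registerListener(ObjectEventListener *L) {
    assert(L && "listener must not be null");
    Listeners.push_back(L);
  }
  void unregisterListener(ObjectEventListener *L) {
    Listeners.erase(std::remove(Listeners.begin(), Listeners.end(), L),
                    Listeners.end());
  }
  // Would be called once per loaded object.
  void objectEmitted(const std::string &RawObject) {
    for (ObjectEventListener *L : Listeners)
      L->notifyObjectEmitted(RawObject);
  }

private:
  std::vector<ObjectEventListener *> Listeners;
};
```

A profiling listener would parse the object's symbol and debug information inside notifyObjectEmitted to recover function-level data.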
>  
>  
> -------------------------------
> ELF Support on Windows
> -------------------------------
>  
> There are various reasons that it would be nice to be able to support generation of ELF objects on Windows through the MCJIT interface, one of which is that it would allow users to debug JITed code with GDB (i.e. MinGW).  There was a proposal by Eli Bendersky to enable ELF generation on Windows by extending the target triple handling.  We have been using that approach, and it works.  However, at the time it was proposed there seemed to be a general feeling that it would be preferable to have a general way to specify a non-default object format for any platform rather than a Windows/ELF-specific extension.
>   
> The original discussion can be found here:
>  
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136053.html
>  
> Unfortunately, I don't believe a consensus was ever reached as to what the correct way to implement this would be.  I'd like to re-open that discussion now so that we can find a solution that would be generally accepted for inclusion in LLVM.
>  
> We can, of course, discuss any and all of these in more detail as the implementation unfolds, but I wanted to get all of it out there first as a sort of road map to get general input on the overall direction of things in the MCJIT space.
>  
> Thanks to all who took the time to read this, and thank you in advance for your feedback.
>  
> -Andy
>  
> <mcjit-current-ownership.png> <mcjit-proposed-ownership.png>
