[LLVMdev] RFC: MCJIT enhancements - Object Caching

Jim Grosbach grosbach at apple.com
Mon Jul 30 10:17:03 PDT 2012


This sounds interesting, but strikes me as significantly outside the scope of the MCJIT itself. That is, this is the sort of thing a client application or framework would build on top of the MCJIT.

-Jim

On Jul 30, 2012, at 6:20 AM, "Demikhovsky, Elena" <elena.demikhovsky at intel.com> wrote:

> I have a vision for object caching.
> 
> A client that is interested in object caching will need some help from the compiler and from LLVM. The object file on disk should contain information that allows the client to decide whether or not to rebuild.
> The client (a backend engine that translates IR to JIT code) compares the data from the object file found on disk with the module IR.
> I call the information that the object file may contain a “Signature”. It can be stored in the “.comment” section.
> The signature includes:
> * Name of the source file, with full path. (It may be .ll, .cl, .cpp or any other extension that the compilation was started from.)
> * Time stamp of the source file.
> * Name and version of the compiler.
> * Compilation flags.
> * Environment variables (all of them, or those that affect the compilation process).
> The compiler can generate this data and put it in metadata.
>  
> If we take the Intel OpenCL product as an example, clang can generate this special metadata. MCJIT, while building an object file, will compose a “.comment” section containing the signature described above.
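
A minimal sketch of the rebuild decision described above, assuming the signature has already been read back from the cached object file. The `Signature` struct, its field names, and `needsRebuild` are hypothetical illustrations and not existing LLVM types; how the data is written to or read from ".comment" is not shown.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical record of the inputs that produced a cached object file.
// The fields mirror the list in the message; the names are illustrative.
struct Signature {
  std::string SourcePath;      // full path of the file compilation started from
  long long SourceTimestamp;   // time stamp of that source file
  std::string CompilerVersion; // name and version of the compiler
  std::string Flags;           // compilation flags
  std::map<std::string, std::string> Env; // environment variables that matter
};

bool operator==(const Signature &A, const Signature &B) {
  return std::tie(A.SourcePath, A.SourceTimestamp, A.CompilerVersion, A.Flags,
                  A.Env) ==
         std::tie(B.SourcePath, B.SourceTimestamp, B.CompilerVersion, B.Flags,
                  B.Env);
}

// The cached object is reusable only if every recorded input still matches.
bool needsRebuild(const Signature &Cached, const Signature &Current) {
  return !(Cached == Current);
}
```

Comparing every field is the conservative choice: any difference forces a rebuild, at the cost of some unnecessary recompilation when an irrelevant environment variable changes.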
>  
> > I would like to introduce a mechanism whereby JITed objects could be cached and loaded from cache at a future time.  It seems that the caching should logically be performed transparently within the MCJIT engine, but it also seems that the caching mechanism will need to be provided by the client (or at least under client control).  I think that the best way to accomplish this is to introduce a new interface (ObjectManager?), similar to the memory manager, that can be specified by the client and used by the MCJIT engine.  What I have in mind is that when MCJIT is about to compile a module, it will first check with the ObjectManager to see if a pre-compiled image is available.  If so, it will simply pass that image to the RuntimeDyld for loading.  If not, it will compile as usual.  Likewise, after compilation (but before loading) MCJIT would offer the compiled image to the ObjectManager and the ObjectManager could save the image to a file cache (or whatever).
>  
>  
> - Elena
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Kaylor, Andrew
> Sent: Wednesday, July 25, 2012 01:25
> To: llvmdev at cs.uiuc.edu
> Subject: [LLVMdev] RFC: MCJIT enhancements
>  
> Following up and expanding on an earlier conversation, I'd like to discuss making several non-trivial changes to the MCJIT engine and related objects.  There may be some interdependencies between these changes, but I think that they can be logically grouped as follows:
>  
> * Lazy module compilation
>  
> * Enhance the JIT memory manager interface to enable section-based memory protection
>  
> * Clean up object/memory ownership problems
>  
> * Introduce a mechanism for caching generated objects
>  
> * Support JIT events for MCJIT
>  
> * Support ELF generation on Windows
>  
>  
> Now let me expand on each of these a bit.
>  
> ---------------------
> Lazy compilation
> ---------------------
>  
> Right now a Module is passed to the EngineBuilder::create call and is compiled immediately within the MCJIT constructor.  I don't believe it is possible to perform lazy compilation on a per-function basis as was done in the legacy JIT interface, but there's no reason that module compilation and loading can't be deferred until a function is requested.  I think this is a fairly straightforward and non-controversial change.
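
The deferral described above could be sketched roughly as follows. This is a stand-alone illustration, not MCJIT code: `LazyEngine`, `getAddress`, and the `CompileFn` callback are hypothetical names, and the "compile and load the module" step is reduced to a callback that returns a name-to-address table.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for an engine that defers whole-module compilation.
class LazyEngine {
public:
  using SymbolTable = std::map<std::string, unsigned long>;
  // Represents "compile and load the module"; returns function addresses.
  using CompileFn = std::function<SymbolTable()>;

  // Constructing the engine only stores the callback; nothing is compiled.
  explicit LazyEngine(CompileFn Fn) : Compile(std::move(Fn)) {}

  // The module is compiled on the first request for any function.
  // Returns 0 if the module does not define the requested name.
  unsigned long getAddress(const std::string &Name) {
    if (!Compiled) {
      Symbols = Compile();
      Compiled = true;
    }
    auto It = Symbols.find(Name);
    return It == Symbols.end() ? 0 : It->second;
  }

  bool isCompiled() const { return Compiled; }

private:
  CompileFn Compile;
  SymbolTable Symbols;
  bool Compiled = false;
};
```

Note that the granularity is still the whole module: the first lookup pays for compiling everything, which matches the limitation stated in the text.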
>  
>  
> ----------------------------------
> Memory manager interface
> ----------------------------------
>  
> The memory manager changes are a bit more complicated.  In the current MCJIT implementation, MCJIT clients may specify a memory manager through the EngineBuilder and this memory manager will be passed along to the MCJIT engine.  The JITMemoryManager interface was originally created to suit the needs of the legacy JIT engine.  As the MCJIT engine has come into existence, its memory needs have been somewhat wedged into the old JITMemoryManager interface.  The result has been that there are effectively two interfaces existing side by side in the same abstract base class and implementations will typically only support one or the other set of functions.  This is further complicated by the RuntimeDyldMemoryManager, which doesn't inherit from the JITMemoryManager but does share a subset of functions and is implemented as a wrapper around the client-supplied memory manager in the current MCJIT code.  There's also an MCJITMemoryManager class which derives from RuntimeDyld and is used by MCJIT but isn't exposed to the client at all.
>  
> I'd like to see this situation cleaned up as we move forward with a way for the client to specify an MCJITMemoryManager directly and a separation of the legacy memory manager interface from the new interface.  In addition, I'd like the new memory manager interface to be extended with the functions that the RuntimeDyld needs to manage setting of section-specific permissions including non-writable code and read-only data.
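
One possible shape for a memory manager interface with section-specific permissions is sketched below. All names here (`SectionMemoryManagerSketch`, `allocateSection`, `applyPermissions`, `RecordingMemoryManager`) are assumptions made for illustration. The toy implementation only records the permissions it would set; a real one would change page protections with something like `mprotect` or `VirtualProtect` on page-aligned allocations.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

enum class SectionKind { Code, ReadOnlyData, ReadWriteData };
enum Permission : unsigned { Read = 1, Write = 2, Execute = 4 };

// Hypothetical interface: the loader requests memory per section, fills it
// in, and then asks the manager to apply the final permissions.
class SectionMemoryManagerSketch {
public:
  virtual ~SectionMemoryManagerSketch() = default;
  virtual std::uint8_t *allocateSection(SectionKind Kind, std::size_t Size) = 0;
  virtual void applyPermissions() = 0;
};

// Toy implementation that records permissions instead of changing them.
class RecordingMemoryManager : public SectionMemoryManagerSketch {
public:
  struct Section {
    SectionKind Kind;
    std::vector<std::uint8_t> Bytes;
    unsigned Perms;
  };

  std::uint8_t *allocateSection(SectionKind Kind, std::size_t Size) override {
    // Every section starts writable so the loader can copy and relocate it.
    Sections.push_back(
        Section{Kind, std::vector<std::uint8_t>(Size), Read | Write});
    return Sections.back().Bytes.data();
  }

  void applyPermissions() override {
    for (Section &S : Sections) {
      switch (S.Kind) {
      case SectionKind::Code:
        S.Perms = Read | Execute; // non-writable code
        break;
      case SectionKind::ReadOnlyData:
        S.Perms = Read; // read-only data
        break;
      case SectionKind::ReadWriteData:
        S.Perms = Read | Write;
        break;
      }
    }
  }

  const std::deque<Section> &sections() const { return Sections; }

private:
  std::deque<Section> Sections; // deque keeps earlier elements in place
};
```

Splitting allocation from a later "apply permissions" step reflects the constraint that sections must be writable while they are being loaded and relocated, and can only be locked down afterwards.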
>  
>  
> ----------------------
> Object ownership
> ----------------------
>  
> Within the current MCJIT implementation there are a number of ugly cross dependencies in object/memory life cycle, making the MCJIT/RuntimeDyld/JITMemoryManager relationship fragile and very dependent upon a particular order of object destruction. 
>  
> In the current implementation, the buffer into which code is generated is owned by the MCJIT component, but when sections from the generated code are loaded, they are loaded into memory owned by the JITMemoryManager.  During object loading, an ObjectFile is created within the RuntimeDyld which references both of these buffers.  The situation is further complicated in the case where the RuntimeDyldELF object attempts to register the generated object with GDB, because the GDB interface requires a reference to both memory buffers.  The GDB-required references are currently maintained in an ObjectImage instance which is held by RuntimeDyldELF.
>  
> For those who are visually oriented, I am attaching a diagram which shows the object relationships.
>  
> I would like to change this by introducing an ObjectBuffer which would be allocated by the MCJIT object at compilation time and then passed to the RuntimeDyld::loadObject.  RuntimeDyld::loadObject would hand this ObjectBuffer off to the new ObjectImage instance (which it already creates today).  The ObjectImage would be returned from the RuntimeDyld::loadObject to MCJIT and MCJIT would own that object.
>  
> Again, a diagram is attached.
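
The proposed ownership chain can be illustrated with `std::unique_ptr`. The types below reuse the names from the proposal (`ObjectBuffer`, `ObjectImage`, `loadObject`) but are simplified stand-ins written for this sketch, not the LLVM classes, and `Engine` is a hypothetical placeholder for MCJIT.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-in for the buffer that holds generated object code.
struct ObjectBuffer {
  std::string Bytes;
};

// Owns the buffer it was built from, so the two share a single lifetime.
class ObjectImage {
public:
  explicit ObjectImage(std::unique_ptr<ObjectBuffer> Buf)
      : Buffer(std::move(Buf)) {}
  const std::string &bytes() const { return Buffer->Bytes; }

private:
  std::unique_ptr<ObjectBuffer> Buffer;
};

// Stand-in for RuntimeDyld::loadObject: takes ownership of the buffer,
// wraps it in an image, and returns the image to the caller.
std::unique_ptr<ObjectImage> loadObject(std::unique_ptr<ObjectBuffer> Buf) {
  return std::unique_ptr<ObjectImage>(new ObjectImage(std::move(Buf)));
}

// Stand-in for MCJIT: allocates the buffer at compilation time and ends up
// owning the image that loading returns.
class Engine {
public:
  void compileAndLoad(const std::string &GeneratedCode) {
    std::unique_ptr<ObjectBuffer> Buf(new ObjectBuffer{GeneratedCode});
    Image = loadObject(std::move(Buf));
  }
  const ObjectImage *image() const { return Image.get(); }

private:
  std::unique_ptr<ObjectImage> Image;
};
```

With ownership expressed this way, destroying the engine releases the image and its buffer together, so nothing depends on a particular destruction order.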
>  
>  
> -------------------
> Object caching
> -------------------
>  
> I would like to introduce a mechanism whereby JITed objects could be cached and loaded from cache at a future time.  It seems that the caching should logically be performed transparently within the MCJIT engine, but it also seems that the caching mechanism will need to be provided by the client (or at least under client control).  I think that the best way to accomplish this is to introduce a new interface (ObjectManager?), similar to the memory manager, that can be specified by the client and used by the MCJIT engine.  What I have in mind is that when MCJIT is about to compile a module, it will first check with the ObjectManager to see if a pre-compiled image is available.  If so, it will simply pass that image to the RuntimeDyld for loading.  If not, it will compile as usual.  Likewise, after compilation (but before loading) MCJIT would offer the compiled image to the ObjectManager and the ObjectManager could save the image to a file cache (or whatever).
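
The check-then-compile-then-offer flow described above might look roughly like this. `ObjectManager` is the name floated in the proposal, but the methods `lookup` and `store`, the `InMemoryObjectCache` class, and the `obtainImage` helper are all hypothetical. Module identity is reduced to a string key and the object image to a byte string.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical client-supplied interface, modeled on the proposal text.
class ObjectManager {
public:
  virtual ~ObjectManager() = default;
  // Returns true and fills Image if a pre-compiled object is available.
  virtual bool lookup(const std::string &ModuleId, std::string &Image) = 0;
  // Called after compilation, before loading, so the client can save it.
  virtual void store(const std::string &ModuleId, const std::string &Image) = 0;
};

// Example client policy: keep images in memory. A file cache would
// implement the same two methods against the file system.
class InMemoryObjectCache : public ObjectManager {
public:
  bool lookup(const std::string &ModuleId, std::string &Image) override {
    auto It = Images.find(ModuleId);
    if (It == Images.end())
      return false;
    Image = It->second;
    return true;
  }
  void store(const std::string &ModuleId, const std::string &Image) override {
    Images[ModuleId] = Image;
  }

private:
  std::map<std::string, std::string> Images;
};

// Stand-in for the engine side: the returned image is what would be handed
// to the loader in either case.
std::string obtainImage(ObjectManager &OM, const std::string &ModuleId,
                        const std::function<std::string()> &Compile) {
  std::string Image;
  if (OM.lookup(ModuleId, Image))
    return Image;            // cache hit: skip compilation
  Image = Compile();         // cache miss: compile as usual
  OM.store(ModuleId, Image); // offer the result to the client
  return Image;
}
```

Keeping the policy behind an interface leaves decisions such as where images live and when they are stale entirely under client control, which is the concern raised in the text.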
>  
>  
> -----------------
> MCJIT Events
> -----------------
>  
> The legacy JIT interface supported a JITEventListener interface that would provide notification when new functions were JITed.  That interface was used to enable source-level profiling of JITed code.  There is currently no equivalent for the MCJIT engine.  I'd like to correct that.
>  
> In order to get to the function level, some code somewhere will need to parse the format-specific object code that is generated.  I think that it makes sense for that to be pushed to the event listener itself since some listeners may be satisfied with the raw object itself as input.
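
As a rough illustration of pushing format-specific work into the listener, the sketch below hands each listener the raw object bytes and lets it decide how much to parse. `ObjectEventListener`, `onObjectLoaded`, `SizeRecorder`, `FormatSniffer`, and `emitObject` are hypothetical names, not the JITEventListener API. The only format knowledge used is the four-byte ELF magic number (0x7F followed by "ELF").

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical listener interface: receives the raw object image only.
class ObjectEventListener {
public:
  virtual ~ObjectEventListener() = default;
  virtual void onObjectLoaded(const std::string &RawObject) = 0;
};

// A listener that is satisfied with the raw object and never parses it.
class SizeRecorder : public ObjectEventListener {
public:
  void onObjectLoaded(const std::string &RawObject) override {
    Sizes.push_back(RawObject.size());
  }
  std::vector<std::size_t> Sizes;
};

// A listener that does its own (minimal) format-specific inspection.
class FormatSniffer : public ObjectEventListener {
public:
  void onObjectLoaded(const std::string &RawObject) override {
    // The literal is split so "\x7f" is not read as a longer hex escape.
    IsELF = RawObject.compare(0, 4, "\x7f" "ELF") == 0;
  }
  bool IsELF = false;
};

// Stand-in for the engine notifying every registered listener.
void emitObject(const std::string &RawObject,
                const std::vector<ObjectEventListener *> &Listeners) {
  for (ObjectEventListener *L : Listeners)
    L->onObjectLoaded(RawObject);
}
```

A listener that needs function-level information would do real object and debug-info parsing where `FormatSniffer` only checks the magic number.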
>  
> Most of the support necessary to parse object images to obtain source-level debug information is already present in LLVM somewhere, though not all of it is handily exposed through interfaces in the include tree.  In order to implement a default event listener that provides the basic functionality of the existing JIT event listeners, some extensions to the existing DIContext and related DebugInfo classes will be necessary.  The development of an interface to expose more DWARF information may prove worthy of its own discussion, as it would also enable better testing, but for now I'll just mention it as a requirement here.
>  
> It may make sense to subsume the object loading event under the ObjectManager interface described in the "object caching" section above rather than overloading the existing JITEventListener interface within MCJIT, as the other events in that interface don't properly apply to MCJIT.
>  
>  
> -------------------------------
> ELF Support on Windows
> -------------------------------
>  
> There are various reasons that it would be nice to be able to support generation of ELF objects on Windows through the MCJIT interface, one of which is that it would allow users to debug JITed code with GDB (i.e. MinGW).  There was a proposal by Eli Bendersky to enable ELF generation on Windows by extending the target triple handling.  We have been using that approach, and it works.  However, at the time it was proposed there seemed to be a general feeling that it would be preferable to have a general way to specify a non-default object format for any platform rather than a Windows/ELF-specific extension.
>   
> The original discussion can be found here:
>  
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136053.html
>  
> Unfortunately, I don't believe a consensus was ever reached as to what the correct way to implement this would be.  I'd like to re-open that discussion now so that we can find a solution that would be generally accepted for inclusion in LLVM.
>  
> We can, of course, discuss any and all of these in more detail as the implementation unfolds, but I wanted to get all of it out there first as a sort of road map to get general input on the overall direction of things in the MCJIT space.
>  
> Thanks to all who took the time to read this, and thank you in advance for your feedback.
>  
> -Andy
>  