[PATCH][llvm-c] Expose MC JIT

Filip Pizlo fpizlo at apple.com
Mon Apr 22 14:09:08 PDT 2013


Here it is.  The C API no longer exposes finalizeObject() in any way, and makes sure to call it before doing anything for which the user would expect permissions and cache invalidation to have already been applied.
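
To make the shape of that concrete, here is a minimal sketch of the pattern, not the patch itself.  Every name in it (ToyEngine, toy_engine_get_pointer, toy_c_api_get_pointer, and so on) is hypothetical and stands in for the real ExecutionEngine and C API entry points; the only things taken from this thread are that the engine-level lookup triggers emission without finalizing, that finalization is idempotent, and that the C-level wrapper finalizes before handing a pointer back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an MCJIT-like engine; not the real LLVM type. */
typedef struct {
    bool code_emitted;   /* code generated and relocations applied */
    bool finalized;      /* permissions applied, code cache invalidated */
    int  finalize_calls; /* how many times finalization work actually ran */
} ToyEngine;

typedef int (*ToyFn)(void);

/* Stands in for the JIT-compiled function. */
static int toy_function(void) { return 42; }

/* Idempotent: calling it again after the first time does no extra work. */
static void toy_finalize_object(ToyEngine *ee) {
    if (ee->finalized)
        return;
    ee->finalized = true;
    ee->finalize_calls++;
}

/* Engine-level lookup: triggers emission but does not finalize,
   mirroring the behaviour described for getPointerToFunction(). */
static ToyFn toy_engine_get_pointer(ToyEngine *ee) {
    ee->code_emitted = true;
    return toy_function;
}

/* C-API-level wrapper: the caller never sees a finalize call, because the
   wrapper finalizes before returning a pointer the caller might execute. */
static ToyFn toy_c_api_get_pointer(ToyEngine *ee) {
    ToyFn fn = toy_engine_get_pointer(ee);
    toy_finalize_object(ee);
    return fn;
}
```

A client of the wrapper just calls toy_c_api_get_pointer() and runs the result; repeated lookups are harmless because the finalize step is a no-op after the first time.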



-Filip


On Apr 22, 2013, at 2:02 PM, Filip Pizlo <fpizlo at apple.com> wrote:

> OK - I think that calling finalizeObject() within the C API is a reasonable kludge for now.  I'll have a patch shortly.
> 
> More broadly, getting MCJIT into a better shape is on my critical path, as well.  See: https://bugs.webkit.org/show_bug.cgi?id=112840.  I'd like to help out with the de-sloppiness-ifying of it, so I'd like to understand better what the end goal is.  I already know that I want to control memory management, cache management, and page permissions myself as a client, and preferably I want to accomplish this by supplying my own RuntimeDyld (or JITMemoryManager).  I also know that I'll want to improve error tolerance - for example in case of memory allocation failure (honestly, I haven't investigated this much, so this may already be supported - I just don't see how through the API).  So, I'm particularly interested in seeing how to get the MCJIT into a great shape while supporting both the currently-envisioned use cases in LLVM and its other clients, and also the way that WebKit will use it!  This should be fun. :-)
> 
> -Filip
> 
> 
> On Apr 22, 2013, at 11:51 AM, "Kaylor, Andrew" <andrew.kaylor at intel.com> wrote:
> 
>> I wasn’t sure the C API needed to support the case of JITing into another address space.  I guess it makes sense that we’d want to make that available eventually.
>>  
>> I do think it would be better to have MCJIT finalize the object from getPointerToFunction(), but as I said we can’t do that until there’s another way to trigger code generation.  This is part of the sloppiness I referred to earlier.  There is a web of interactions going on within MCJIT and there are some constraints on the order in which things can be done.  Right now MCJIT has a patchwork way of handling this, but it really needs to be cleaned up to manage things in a more intentional way.
>>  
>> Anyway, my immediate goal here is to figure out the simplest reasonable thing we can do to get your patch working in a sustainable way.
>>  
>> I don’t know if anything other than lli is relying on the current side-effects of MCJIT::getPointerToFunction().  I’d prefer to hold off on putting new things in the MCJIT interface until the aforementioned clean-up is done.  I was suggesting putting the call to finalizeObject in the C API because I’d rather rush something into implementation than rush something into an interface, but depending on your timeline for making the C API I could try to bump the priority of the MCJIT state handling clean-up on my end.
>>  
>> -Andy
>>  
>> From: Filip Pizlo [mailto:fpizlo at apple.com] 
>> Sent: Monday, April 22, 2013 11:25 AM
>> To: Kaylor, Andrew
>> Cc: David Tweed; Eric Christopher; llvm-commits at cs.uiuc.edu
>> Subject: Re: [PATCH][llvm-c] Expose MC JIT
>>  
>> Wouldn't it be better to just have MCJIT call finalizeObject in getPointerToFunction and friends, and eradicate ExecutionEngine::finalizeObject() completely?
>>  
>> OTOH, if I understand right and the goal of the MCJIT is to support the notion of JITing into someone else's address space, then finalizeObject() makes loads of sense: you want the client to be able to say when it happens. If that's true then it should be exposed to the C API. 
>> 
>> -Filip
>> 
>> On Apr 22, 2013, at 10:57 AM, "Kaylor, Andrew" <andrew.kaylor at intel.com> wrote:
>> 
>> It definitely won’t be necessary to expose more things through the C API.  At most, things will need to be added to the implementation.
>>  
>> I’m wondering if we can even avoid having to put the ‘finalize’ concept into the API.
>>  
>> As I understand it, the normal work flow would be something like this:
>>  
>> 1.      Create a module and populate it
>> 2.      Create an execution engine for the module
>> 3.      Get a pointer to a function in the module
>> 4.      Execute the function
>>  
>> If that’s right, I guess MCJIT trips because while its implementation of getPointerTo[Named]Function triggers compilation it doesn’t cause permissions to be applied or invalidate the code cache.  This happens because MCJIT needs to handle the case where the generated code is going to be executed in another process (and possibly on another system), so it doesn’t make assumptions about when everything is in its final place.  The C API, however, could arguably make such assumptions.
>>  
>> What this boils down to is that somewhere between step 2 above and step 4 above we need to:
>>  
>> 1.      Generate the code
>> 2.      Apply relocations
>> 3.      Apply memory permissions
>> 4.      Invalidate the code cache
>>  
>> MCJIT does 1 and 2, if necessary, in response to a getPointerToFunction() call.  Arguably it could also do 3 and 4 there, since in the remote case the client code isn’t going to want pointers to functions.  The trouble is that there are places (such as lli) where we are using that call to trigger code emission even though we may still want to move things around before 3 and 4 happen.  That wouldn’t be a problem if we exposed a function to trigger code emission directly, but we don’t right now.
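The last two of the four steps above can be sketched outside of LLVM entirely.  The following is a rough illustration assuming a POSIX system and a GCC- or Clang-compatible compiler; publish_code is a hypothetical helper, the memcpy merely stands in for steps 1 and 2, and the bytes are never executed.  It is not how SectionMemoryManager is implemented.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: places `len` bytes of already-generated code into
   fresh memory, then applies permissions and invalidates the instruction
   cache.  Returns NULL on failure. */
static void *publish_code(const unsigned char *code, size_t len) {
    if (len == 0)
        return NULL;

    /* Start with writable, non-executable memory. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    /* Steps 1 and 2 (generate code, apply relocations) are represented
       here by a plain copy of bytes that are assumed to be ready. */
    memcpy(mem, code, len);

    /* Step 3: apply memory permissions (read + execute, no write). */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return NULL;
    }

    /* Step 4: invalidate the instruction cache for the range.  This is
       the step that matters on architectures such as ARM. */
    __builtin___clear_cache((char *)mem, (char *)mem + len);
    return mem;
}
```

Doing step 4 in the same place as step 3 is the arrangement suggested in this thread: whoever applies the permissions also invalidates the cache, so the client cannot forget it.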
>>  
>> However, it may be reasonable for the C API implementation to call MCJIT::finalizeObject when its getPointerToFunction equivalent is called.
>>  
>> That leaves us with invalidating the code cache.  I don’t see any reason that the memory manager shouldn’t do that automatically when the applyPermissions function is called.  I notice that currently invalidateInstructionCache is part of the SectionMemoryManager interface but not the RuntimeDyldMemoryManager, so that’s a problem already if we don’t just make it part of applyPermissions().
>>  
>> So what I’m thinking is that if you can add a call to MCJIT::finalizeObject in the appropriate places in the C API implementation then the FinalizeAllObjects method can be removed completely.  I’ll add an invalidateInstructionCache() call in the SectionMemoryManager::applyPermissions() implementation, and that should take care of the ARM issue.
>>  
>> Sounds good. 
>> 
>> 
>>  
>> Does that sound reasonable?
>>  
>> -Andy
>>  
>>  
>> From: Filip Pizlo [mailto:fpizlo at apple.com] 
>> Sent: Monday, April 22, 2013 10:19 AM
>> To: Kaylor, Andrew
>> Cc: David Tweed; Eric Christopher; llvm-commits at cs.uiuc.edu
>> Subject: Re: [PATCH][llvm-c] Expose MC JIT
>>  
>>  
>> On Apr 22, 2013, at 10:06 AM, "Kaylor, Andrew" <andrew.kaylor at intel.com> wrote:
>> 
>> 
>> 
>> The state management in MCJIT is quite sloppy right now.  I agree that invalidating the code cache is an issue that needs to be considered.  It seems to me that MCJIT itself ought to be able to do that when it needs to be done if it were paying attention.  At the very least, it could happen in the 'finalizeObject' method.  That's somewhat tangential to the patch at hand, but it is a consideration.  If there's something quick I can do to MCJIT to make this work, that's probably preferable to pushing something into the C-interface implementation.  I'll give that some thought today, but if anyone else is interested I'd be happy to make it a discussion rather than just a private rumination.
>>  
>> Hopefully we can do this without exposing more stuff via the C API.  I think that finalizeObject() should do this, but I will think about it some more.
>> 
>> 
>> 
>> 
>> Otherwise, my main question about the patch has to do with the nature of the C-interface API.  Is that API treated as a contract that needs to be respected from release to release or are we free to tinker with it as needed?  The thing that worries me is how this interface will survive the transition to multiple module support.
>>  
>> I believe that this is the goal of the C API, yes.  It is also a goal of this patch to be forward-compatible in this way.
>>  
>> My patch defends against this in two ways:
>>  
>> 1) When creating the MCJIT via the C API, I just follow the same convention as other ExecutionEngines do: you specify a module, but you can add one later.  Right now calling AddModule on an MCJIT instance will crash, and the documentation tells you this.  Once the MCJIT supports multiple modules, I believe that this should Just Work - you will then be able to call AddModule.
>>  
>> 2) I don't expose finalizeObject() directly.  Instead I created a new API called FinalizeAllObjects(), which requires that all modules associated with the execution engine get finalized at the time of call.  My understanding is that it is safe to finalizeObject() if you've already done it before - it's currently idempotent.  So if MCJIT goes multi-module, then this API will still have a well-defined behavior, and this behavior will not be different from what my patch does.
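A small sketch of the semantics described in point 2, with made-up types: MultiEngine, multi_add_module and multi_finalize_all_objects are hypothetical names, and the fixed-size array is only there to keep the example short.  The point is that "finalize everything" stays well defined, and stays idempotent, whether the engine holds one module or several.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MULTI_MAX_MODULES 8

/* Hypothetical per-module state; not an LLVM type. */
typedef struct {
    bool finalized;
} MultiModule;

/* Hypothetical engine that may own more than one module. */
typedef struct {
    MultiModule modules[MULTI_MAX_MODULES];
    size_t module_count;
    int finalize_work_done; /* counts modules that actually needed work */
} MultiEngine;

/* Adds an unfinalized module; returns false if the engine is full. */
static bool multi_add_module(MultiEngine *ee) {
    if (ee->module_count == MULTI_MAX_MODULES)
        return false;
    ee->modules[ee->module_count].finalized = false;
    ee->module_count++;
    return true;
}

/* Finalizes every module that has not been finalized yet.  Calling it
   again with no new modules does nothing. */
static void multi_finalize_all_objects(MultiEngine *ee) {
    for (size_t i = 0; i < ee->module_count; i++) {
        if (ee->modules[i].finalized)
            continue;
        ee->modules[i].finalized = true;
        ee->finalize_work_done++;
    }
}
```

With a single module this behaves exactly like a one-shot finalize; once more modules can be added, the same call picks up only the new ones.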
>>  
>> -Filip
>>  
>> 
>> 
>> 
>> 
>> -Andy
>> 
>> -----Original Message-----
>> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of David Tweed
>> Sent: Monday, April 22, 2013 7:33 AM
>> To: 'Eric Christopher'; Filip Pizlo
>> Cc: llvm-commits at cs.uiuc.edu
>> Subject: RE: [PATCH][llvm-c] Expose MC JIT
>> 
>> Hi,
>> 
>> Not a comment on the general idea (which seems like a good one), but asking a detail question: some architectures (such as ARM) require invalidating cache entries at more times than, e.g., x86. (IIRC ARM needs to invalidate the cache at the transition time between being "data" and "instructions".) It looks to me like this can be done in LLVMFinalizeAllObjects(), but I'm cc:ing someone much, much more knowledgeable about this than me... It's certainly desirable that the API provides enough points at which the MC JIT is in command that it can call all the (possibly platform specific) permissions/cache invalidation operations needed on that memory without user code needing to do it.
>> 
>> Other than that, the patch looks like a good patch to me (but again Andrew is the main authority).
>> 
>> Cheers,
>> Dave
>> 
>> 
>> -----Original Message-----
>> From: llvm-commits-bounces at cs.uiuc.edu
>> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Eric Christopher
>> Sent: 22 April 2013 14:22
>> To: Filip Pizlo
>> Cc: llvm-commits at cs.uiuc.edu
>> Subject: Re: [PATCH][llvm-c] Expose MC JIT
>> 
>> The API and rationale seem totally reasonable to me. Up to Andy to approve though.
>> 
>> -eric
>> 
>> On Sun, Apr 21, 2013 at 7:14 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>> 
>> 
>> OK - I have a new patch for review, which incorporates Andrew's feedback.
>> 
>> The patch exposes the MCJIT via the C API.
>> 
>> The current API uses the SectionMemoryManager by default.  I plan to expose the ability to have the C API call back into the client's code, and use the client's own memory manager, in the future.  But even then, the default will be SectionMemoryManager.  Because this requires calling applyPermissions(), I also expose the ExecutionEngine::finalizeObject() method.  This was tricky - I take it that in the future, this method will take a Module *M parameter to specify which module to finalize.  In order to not create future confusion in the C API, I expose this as LLVMFinalizeAllObjects() and specify the API's semantics as being that all objects associated with the execution engine should be finalized.
>> 
>> The patch also exposes the NoFramePointerElim option.  The manner in which options are exposed is designed for forward compatibility; you supply an options struct along with a size which you zero-fill prior to manipulating.  This is similar to the idiom I've seen used in other C APIs like BerkeleyDB.
>> 
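For readers who have not met the struct-plus-size idiom, here is a minimal sketch of it.  ToyJITOptions, ToyJIT and toy_create_jit are hypothetical names, and the fields other than the frame-pointer flag are invented for illustration; this is not the signature of the function in the patch.  The sketch assumes that a caller built against an older header passes a smaller size, and it rejects a size larger than the library knows about, which is the check described elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct.  New fields are only ever appended, so an
   older client's struct is a prefix of a newer one. */
typedef struct {
    unsigned opt_level;             /* invented field, for illustration */
    int      no_frame_pointer_elim; /* option named in this thread */
} ToyJITOptions;

/* Hypothetical engine that records the options it was created with. */
typedef struct {
    ToyJITOptions effective;
} ToyJIT;

/* Returns 0 on success, 1 if the caller's struct is larger than expected. */
static int toy_create_jit(ToyJIT *out, const void *passed_options,
                          size_t size_of_passed_options) {
    ToyJITOptions opts;

    /* A larger struct means the caller knows fields this library does
       not; report an error instead of silently ignoring them. */
    if (size_of_passed_options > sizeof(opts))
        return 1;

    /* Zero-fill first so fields the caller did not supply take their
       default (zero) value, then copy only what the caller provided. */
    memset(&opts, 0, sizeof(opts));
    if (passed_options != NULL && size_of_passed_options > 0)
        memcpy(&opts, passed_options, size_of_passed_options);

    out->effective = opts;
    return 0;
}
```

On the client side the usage is: zero-fill a ToyJITOptions, set only the fields you care about, and pass its address together with sizeof(ToyJITOptions).  A client compiled against an older, shorter struct keeps working because the missing tail is treated as zeros.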
>> 
>> I considered having separate C function calls for each option, in the style of the ExecutionEngineBuilder API - but while that idiom feels right to me in C++, it feels less C-like.  As well, the current options approach exposes not just parts of the Builder but also part of TargetOptions (namely, NoFramePointerElim).  It's also more concise in practice.
>> 
>> I plan to expose more innards through the LLVMMCJITCompilerOptions in 
>> the future.  I'd be happy to do more of that in one go if that was 
>> preferred; but I thought that a baby step would be the best thing for now.
>> 
>> 
>> 
>> -Filip
>> 
>> 
>> On Apr 21, 2013, at 6:26 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>> 
>> 
>> On Apr 15, 2013, at 10:42 AM, "Kaylor, Andrew" <andrew.kaylor at intel.com> wrote:
>> 
>> OK, let me start by saying that MCJIT does take ownership of the memory manager.  It doesn't use an OwningPtr, which would make this clear, but it does delete the pointer in its destructor.  I think this is happening because we needed some finer control over when the MM got deleted.  I should probably revisit this and at least add some comments to make it clear what's happening and why.  It might not even be an issue anymore, because I did some work a while ago to try to clean up object ownership issues.
>> 
>> 
>> Actually, I was just confused.  MCJIT deletes MemMgr, which is always 
>> aliased to Dyld, as far as I can tell.  So I was just wrong. :-)
>> 
>> 
>> That said, I have been meaning for some time to break apart the JIT and MCJIT interfaces.  The fact that they are both abstracted by ExecutionEngine and EngineBuilder complicates that, but it really needs to be done (as you are seeing).
>> 
>> For now, would it be possible to have the C-interface provide a wrapper that supplies empty implementations of the irrelevant functions when creating a memory manager for MCJIT?
>> 
>> 
>> I think this is sensible.  I will proceed in this way.
>> 
>> Thanks!
>> 
>> 
>> -Andy
>> 
>> From: Filip Pizlo [mailto:fpizlo at apple.com]
>> Sent: Saturday, April 13, 2013 1:18 AM
>> To: Kaylor, Andrew
>> Cc: llvm-commits at cs.uiuc.edu
>> Subject: Re: [PATCH][llvm-c] Expose MC JIT
>> 
>> Ah - good thing you pointed this out.  I just realized that my patch 
>> is wrong.  Perhaps I can get some feedback on the best way to architect this.
>> 
>> Here's the problem:
>> 
>> - MCJIT does not take ownership of the memory manager.  Hence allocating one in the constructor is wrong; it'll leak when MCJIT dies.  But deleting the memory manager passed to MCJIT would be a change in behavior, and I'm not sure if it's in line with either what existing users expect or what was intended.  Insofar as the JIT instance corresponds to ownership of modules, it feels like it shouldn't also take ownership of the memory manager; for example you might imagine wanting to throw away the MCJIT but keep the code it generated and continue to use the memory manager to track it - and eventually free it.  But EngineBuilder currently claims that the ExecutionEngine takes ownership of the JMM - I'm assuming that this is just wrong documentation, and that EngineBuilder's use of the same JMM option for both JIT and MCJIT is just not right.
>> 
>> - I'd like to expose SectionMemoryManager and, eventually in a separate patch, the ability to create custom RTDyldMemoryManagers via the C API.  I'd prefer this to be an RTDyldMemoryManager and not a JITMemoryManager, since the latter has a load of methods that are not relevant to MCJIT.  But EngineBuilder wants a JITMemoryManager.  This would mean that the C API would have to pass its RTDyldMemoryManager via a cast to JITMemoryManager just so MCJIT could then use it as an RTDyldMemoryManager again.  Seems wrong.  I'm assuming that the correct long-term thing is to fix the EngineBuilder to not pass the JMM to the MCJIT, since it's good to expose the fact that the MCJIT actually just wants an RTDyldMemoryManager instead.
>> 
>> 
>> 
>> In short, I'd like to have a separate EngineBuilder setting for the RTDyldMemoryManager.  If this is specified and you end up using the JIT and not MCJIT, you get an error.  If you use the MCJIT, then the RTDyldMemoryManager option overrides the JMM option.  Or something similar.
>> 
>> 
>> 
>> Does that make sense?
>> 
>> -Filip
>> 
>> 
>> On Apr 12, 2013, at 5:38 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>> 
>> 
>> Thanks for the feedback!  I will try this change and see what happens.
>> 
>> -Filip
>> 
>> 
>> On Apr 12, 2013, at 5:35 PM, "Kaylor, Andrew" <andrew.kaylor at intel.com> wrote:
>> 
>> Hi Filip,
>> 
>> I'll take a closer look at your patches on Monday, but my initial 
>> input is that the default memory manager used should be 
>> SectionMemoryManager rather than the DefaultJITMemoryManager.
>> 
>> Thanks,
>> Andy
>> 
>> From: llvm-commits-bounces at cs.uiuc.edu 
>> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Filip Pizlo
>> Sent: Friday, April 12, 2013 4:49 PM
>> To: llvm-commits at cs.uiuc.edu
>> Subject: Re: [PATCH][llvm-c] Expose MC JIT
>> 
>> Revised patches included.
>> 
>> I added additional ruggedizing to the LLVMCreateMCJITCompilerForModule 
>> function, so that if it detects that the passed struct is larger than 
>> expected, it reports an error instead of continuing.
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> 
>> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcjit.patch
Type: application/octet-stream
Size: 5543 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130422/4be255f6/attachment.obj>