ExecutionEngine::finalizeModule()

Mon Oct 21 13:15:40 PDT 2013

Hi,

Sure, I'll do the SmallPtrSet patch.

I'll get to the other issues later.

Yaron

2013/10/21 Kaylor, Andrew <andrew.kaylor at intel.com>

>  Hi Yaron,****
>
> ** **
>
> Regarding the name changes, my expectation is that typically the module(s)
> will have already been loaded before the finalize method is called.  I was
> thinking of a name more like “prepareModuleForExecution”.****
>
> ** **
>
> Regarding the data structure, I had considered the linear time find issue
> and I did look over the data structures in the llvm programmer’s manual
> before choosing the list.  Mostly I chose it because it seemed like the
> simplest way to implement what I wanted.  I wasn’t worried about the find
> time because the only case where find is used outside of assert statements
> is in the deprecated getPointerToFunction method.****
>
> ** **
>
> That said, I’m not against the suggestion of using SmallPtrSet.  I was
> hoping to provide an interface to the module container class that would
> allow the core MCJIT implementation to be independent of the container
> class implementation so it could be optimized as needed without requiring
> changes to the base MCJIT code.  As I presented it in the patch using three
> internal containers would require a custom iterator implementation to walk
> the full set of containers, but that could be avoided by making the changes
> you suggest.****
>
> ** **
>
> Would you be interested in taking over this patch and working it through
> to completion?  I’ve got some other things I need to work on, and it would
> be very helpful to me if you could do this.****
>
> ** **
>
> Thanks,****
>
> Andy****
>
> ** **
>
> ** **
>
> P.S. One other thing that I noticed while considering your comments…. The
> old getPointerToFunction method takes an llvm::Function pointer as input,
> while the new getFunctionAddress takes the name of the function.  There are
> a number of reasons for this change.  We want MCJIT to be capable of
> discarding IR objects once code has been generated.  Also, there will be
> cases where MCJIT has never had IR for loaded objects.  (I have a patch in
> the works for this latter case now.)  For these reasons, we need an
> interface that will find symbol addresses with just a symbol name.
> However, I noticed while considering your comments that searching for an
> owning module using a function name is necessarily a linear search, whereas
> if we had the llvm::Function object from which we can get the parent module
> directly, it would just be a simple matter of verifying that we own that
> module.****
>
> ** **
>
> Perhaps both forms of the function are worth having.  On the other hand,
> the search by name is always going to happen when we’re linking so we
> should probably try to find a way to optimize that path.  One possibility
> would be to build some kind of map of names to modules as modules are
> added.  This would add a lot of upfront cost and wouldn’t be worth doing in
> many cases, but in the case of a REPL that is likely to have a low
> function-to-module ratio, contain a lot of modules and require a lot of
> linking this might be a critical optimization.  My suggestion here would be
> to provide a client option to enable this sort of behavior.****
>
> ** **
>
> ** **
>
> ****
>
> *From:* Yaron Keren [mailto:yaron.keren at gmail.com]
> *Sent:* Saturday, October 19, 2013 1:58 AM
>
> *To:* Kaylor, Andrew
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: ExecutionEngine::finalizeModule()****
>
> ** **
>
> Hi,****
>
> ** **
>
> Couple of ideas how to fix the FIXMEs.****
>
> ** **
>
> finalizeModule should be called loadAndFinalizeModule ****
>
> and****
>
> finalizeObject should be called loadAndFinalizeModules****
>
> to convey their functions.****
>
> ** **
>
> About the data structures, while the combined list solves nicely the
> looping find case it introduces a linear time find is time linear with
> possibly a high constant: http://llvm.org/docs/ProgrammersManual.html#list says "std::list
> is an extremely inefficient class that is rarely useful. It performs a heap
> allocation for every element inserted into it, thus having an extremely
> high constant factor, particularly for small data types."****
>
> ** **
>
> LLVM has many excellent datatypes available.****
>
> ** **
>
> The SmallPtrSet
> http://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallptrset-h  ****
>
> would be a great fit for MCJIT needs: Iteration is still supported and
> other operations are constant time.****
>
> This will make a big difference when there are 1000s of modules.****
>
> ** **
>
> The price we'd have to pay for using SmallPtrSet is that we'll have three
> different SmallPtrSets so any operation on all three sets will require
> three loops, set by set.****
>
> ** **
>
> How bad is it ? there are two cases where a loop on all owned modules is
> required:****
>
> ** **
>
> 1) In MCJIT.CPP, we need several times to find a module from the three
> sets.****
>
> This can be done by a find function of the OwningModuleContainer which
> will find in the three sets and using it.****
>
> ** **
>
> 2) in ~OwningModuleContainer****
>
> We will need to destruct the three sets in three loops.****
>
> ** **
>
> Not too bad in exchange for runtime performance that will scale much
> better to the number of modules.****
>
> ** **
>
> Yaron****
>
> ** **
>
> ** **
>
> 2013/10/19 Kaylor, Andrew <andrew.kaylor at intel.com>****
>
>  Hi Yaron,****
>
>  ****
>
> Here’s a first attempt at implementing what I described below.  It’s still
> got one significant place that I know needs revision (marked by a FIXME
> comment), but it seems to be functional.****
>
>  ****
>
> Would you mind checking it out and telling me what you think?****
>
>  ****
>
> Thanks,****
>
> Andy****
>
>  ****
>
> *From:* Yaron Keren [mailto:yaron.keren at gmail.com]
> *Sent:* Friday, October 18, 2013 3:04 PM****
>
>
> *To:* Kaylor, Andrew
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: ExecutionEngine::finalizeModule()****
>
>  ****
>
> This data structure sounds good .****
>
>  ****
>
> I actually didn't consider lazy compilation with regard about module
> states because I knew that "MCJIT does not have lazy compilation".  So the
> loaded state purpose is having modules in "standby" should they be
> required. OK. ****
>
>  ****
>
> If you put one function into one module exactly (call this a
> function-module) you almost have the lazy compilation capability. Maybe
> it's possible to split a module into function-modules automatically in a
> addModuleFunctions() function.****
>
>  ****
>
> Some questions -****
>
>  ****
>
> 1) Is the purpose of having loaded and finalized states to enable an
> external action on the module between loading and finalizing?****
>
>  ****
>
> 2) Is it OK to finalize a set of modules, add more modules and finalize
> again?****
>
>  ****
>
> Yaron****
>
>  ****
>
>  ****
>
> 2013/10/19 Kaylor, Andrew <andrew.kaylor at intel.com>****
>
>  I’ve actually got an idea for this and am working on a patch today.
> I’ll send it to you for input when I have something working.****
>
>  ****
>
> Basically my idea is to replace the existing Modules vector that MCJIT
> inherits from ExecutionEngine with a smarter container class that will
> internally maintain separate lists/sets for added, loaded and finalized
> modules.   This class would provide iterator access that MCJIT could use to
> iterate through the entire combined set of modules (I think I can make that
> work reasonably) but also iterators to go through just the modules in a
> particular state.  This would let us get rid of the clumsy ModuleState
> thing completely and provide faster access to the subsets we need.****
>
>  ****
>
> The “loaded” state is needed because when relocations are processed
> (during the finalize handling) they are applied to all modules that have
> been loaded and the finalize operation on the memory manager (which sets
> page protection, invalidates code caches, etc.) is applied to the sections
> of all loaded modules.  It would require a significant amount of extra
> infrastructure to make this behave otherwise and I couldn’t think of a
> compelling reason why it would be needed.****
>
>  ****
>
> But maybe there’s some confusion here.  The emitted state is entirely
> useless because it is always immediately followed by object loading.  The
> state I’m currently calling “loaded” is probably what you’re referring to
> as “compiled”.  We definitely do not want to immediately compile modules
> when addModule is called because that would eliminate the only facsimile of
> lazy compilation MCJIT currently has.  The idea is that you can add a
> module to provide function definitions but if nothing in that module is
> referenced by a module in which a function is executed (note the wording
> here because it’s not exactly the ideal case) then the module never gets
> compiled.****
>
>  ****
>
> Eventually, we’ll want to add another level of lazy compilation in which
> we generate stubs to resolve external function during linking/finalization
> and only compile their dependent modules if the stub is called.  But that’s
> a job for another day.****
>
>  ****
>
> -Andy****
>
>  ****
>
> *From:* Yaron Keren [mailto:yaron.keren at gmail.com]
> *Sent:* Friday, October 18, 2013 1:41 PM****
>
>
> *To:* Kaylor, Andrew
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: ExecutionEngine::finalizeModule()****
>
>  ****
>
> Ok, it's simpler than I thought since only three states are relevant.****
>
>  ****
>
> But first - do we need really all three states? what good is loaded state?
> is just means being on the list, and any operation that is possible on a
> loaded module should be possible before adding the module to MCJIT, right?
> ****
>
>  ****
>
> If we merge loading and emitted by having the addModule immediately emit
> the loaded module we'll need only two states:  "Compiled" and "Linked" (or
> "Finalized").****
>
>  ****
>
> addModule emits module and adds the module to an "Compiled" vector of
> modules.****
>
> Finalize loops over "Compiled" vector, each module is finalized, added to
> "Linked" list and then the "Compiled" list is cleared.****
>
>  ****
>
> If we still need three lists then just the same, three vectors would be
> required with add inserting the loaded vector, emit moving modules from
> loaded to compiled and finalize moving them from compiled to linked.****
>
>  ****
>
> Yaron****
>
>  ****
>
>  ****
>
>  ****
>
>  ****
>
> 2013/10/18 Kaylor, Andrew <andrew.kaylor at intel.com>****
>
>  Your suggestion to introduce an optimization within MCJIT to avoid
> inefficiency in finalizeLoadedModules is the correct direction.  I
> definitely prefer to improve the MCJIT implementation rather than
> increasing the surface of the interface and requiring clients to optimize
> their own use of MCJIT.****
>
>  ****
>
> We’ve talked about removing runFunction from the MCJIT interface.  It
> really doesn’t belong there and was just implemented as a convenience to
> ease the transition of legacy code to MCJIT.  As such, I think of it in the
> same way I think of finalizeObject.  That is, I’m hoping it will go away.
> It might make sense to provide an independent class or perhaps a static
> function that provides the signature decoding that’s contained in
> MCJIT::runFunction, which would ease the pain of deprecating it.  That
> method isn’t entirely complete as it stands, BTW.****
>
>  ****
>
> The Module state handling in MCJIT definitely isn’t mature.  The only
> states that really have any significance are “Added” (which needs a better
> name), “Loaded” and “Finalized”.  The current state tracking is just
> something I put in there to get the multiple module support up and
> running.  I would be happy to see it replaced with something better.****
>
>  ****
>
> -Andy****
>
>  ****
>
>  ****
>
> *From:* Yaron Keren [mailto:yaron.keren at gmail.com]
> *Sent:* Thursday, October 17, 2013 4:02 PM
> *To:* Kaylor, Andrew
> *Cc:* llvm-commits at cs.uiuc.edu
> *Subject:* Re: ExecutionEngine::finalizeModule()****
>
>  ****
>
> The changes you are making make sense. Return valid ready-to-run function
> address rather than depend upon the user calling finalize is right.****
>
>  ****
>
> The issue is with scalability of the current  implementation. Let's take
> your Kaliedoscope example (which was very helpful) as the use-case but
> every REPL will do the same.****
>
>  ****
>
> After adding  a new module it and only it needs to be finalized to run the
> new code. But finalizeLoadedModules loops over all modules to see which one
> should be finalized. It's not efficient. ****
>
>  ****
>
> I initially thought using finalizeModule would be the solution, calling it
> after creating a module, saving the loop.****
>
>  ****
>
> However this goes deeper and finalizeModule isn't the solution.  As you
> say getFunctionAddress calls finalizeLoadedModules() if a symbol found.
> finalizeLoadedModules always loops on all modules so it's wasteful. In an
> extreme case, we may know that all modules are already finalized but can't
> prevent the needless loop on module.****
>
>  ****
>
> The solution would be other data structures. ****
>
> Simplified case, instead of looping on all modules and checking module
> states  MCJIT could have a vector of un-finalized modules and a vector of
> finalized modules. New modules enter the un-finalized vector
> and finalizeLoadedModules() loops over the un-finalized vector, finalize
> every module, appends them to finalized vector and clears the unfinalized
> vector. ****
>
>  ****
>
> This is simplified since there are more ModuleStates so maybe other data
> structures should be used. I have not looked in depth the ModuleStates.
>
> Regarding getFunctionAddress, runFunction() does not use it but calls the
> older getPointerToFunction, so either runFunction() should be changed to
> the newer API or we still need to manually finalize modules.
>
> Yaron****
>
>  ****
>
> 2013/10/18 Kaylor, Andrew <andrew.kaylor at intel.com>****
>
>  Hi Yaron,****
>
>  ****
>
> What do you see as the use case for adding this?  I was thinking about
> removing finalizeObject() from the interface -- it’s mostly there to
> support code that was written when it was necessary.  It was probably a
> misstep to introduce finalizeObject in the EE interface in the first
> place.  It’s a case of leaking an implementation detail into the interface.
> ****
>
>  ****
>
> With the current trunk implementation of MCJIT, calling getFunctionAddress
> will implicitly finalize the module that contains the function (as well as
> any other modules that have been loaded).  The idea is that the address you
> get back is immediately useful (which was not true of the older
> getPointerToFunction method).****
>
>  ****
>
> Also note that within MCJIT calling finalizeModule has the effect of
> finalizing all loaded modules.  In fact, I plan to update the
> implementation of finalizeModule to call finalizeLoadedModules soon.****
>
>  ****
>
> -Andy****
>
>  ****
>
>  ****
>
> *From:* Yaron Keren [mailto:yaron.keren at gmail.com]
> *Sent:* Thursday, October 17, 2013 2:32 PM
> *To:* llvm-commits at cs.uiuc.edu; Kaylor, Andrew
> *Subject:* ExecutionEngine::finalizeModule()****
>
>  ****
>
> Hi,****
>
>  ****
>
> Make finalizeModule() accessible for MCJIT created through ExecutionEngine.
> ****
>
>  ****
>
> Maybe it would make sense to expose finalizeLoadedModules() as well.****
>
>  ****
>
> Yaron****
>
>  ****
>
>  ****
>
>   ****
>
>   ****
>
>   ****
>
>  ** **
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131021/f27c7f02/attachment.html>