[llvm] r271865 - [Kaleidoscope][BuildingAJIT] Add tutorial text for Chapter 2.

Sun Jun 5 20:28:13 PDT 2016

Author: lhames
Date: Sun Jun  5 22:28:12 2016
New Revision: 271865

URL: http://llvm.org/viewvc/llvm-project?rev=271865&view=rev
Log:
[Kaleidoscope][BuildingAJIT] Add tutorial text for Chapter 2.

This chapter discusses IR optimizations, the ORC IRTransformLayer, and the ORC
layer concept itself.

The text is still pretty rough, but I think the main ideas are there. Feedback
is very welcome, as always.


Modified:
    llvm/trunk/docs/tutorial/BuildingAJIT2.rst

Modified: llvm/trunk/docs/tutorial/BuildingAJIT2.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/tutorial/BuildingAJIT2.rst?rev=271865&r1=271864&r2=271865&view=diff
==============================================================================

--- llvm/trunk/docs/tutorial/BuildingAJIT2.rst (original)
+++ llvm/trunk/docs/tutorial/BuildingAJIT2.rst Sun Jun  5 22:28:12 2016
@@ -12,22 +12,296 @@ we welcome any feedback.
 Chapter 2 Introduction
 ======================
 
-Welcome to Chapter 2 of the "Building an ORC-based JIT in LLVM" tutorial. This
-chapter shows you how to add IR optimization support to the KaleidoscopeJIT
-class that was introduced in `Chapter 1 <BuildingAJIT1.html>`_ by adding a
-new *ORC Layer* -- IRTransformLayer.
-
-**To be done:**
-
-**(1) Briefly describe FunctionPassManager and the optimizeModule
-method (reference the Kaleidoscope language tutorial chapter 4 for more detail
-about IR optimization - it's covered in detail there, here it just provides a
-motivation for learning about layers).**
+Welcome to Chapter 2 of the "Building an ORC-based JIT in LLVM" tutorial. In
+`Chapter 1 <BuildingAJIT1.html>`_ of this series we examined a basic JIT
+class, KaleidoscopeJIT, that could take LLVM IR modules as input and produce
+executable code in memory. KaleidoscopeJIT was able to do this with relatively
+little code by composing two off-the-shelf *ORC layers*: IRCompileLayer and
+ObjectLinkingLayer, to do much of the heavy lifting.
+
+In this layer we'll learn more about the ORC layer concept by using a new layer,
+IRTransformLayer, to add IR optimization support to KaleidoscopeJIT.
+
+Optimizing Modules using the IRTransformLayer
+=============================================
+
+In `Chapter 4 <LangImpl4.html>`_ of the "Implementing a language with LLVM"
+tutorial series the llvm *FunctionPassManager* is introduced as a means for
+optimizing LLVM IR. Interested readers may read that chapter for details, but
+in short, to optimize a Module we create an llvm::FunctionPassManager
+instance, configure it with a set of optimizations, then run the PassManager on
+a Module to mutate it into a (hopefully) more optimized but semantically
+equivalent form. In the original tutorial series the FunctionPassManager was
+created outside the KaleidoscopeJIT, and modules were optimized before being
+added to it. In this Chapter we will make optimization a phase of our JIT
+instead. For now, this will provide us a motivation to learn more about ORC
+layers, but in the long term making optimization part of our JIT will yield an
+important benefit: When we begin lazily compiling code (i.e. deferring
+compilation of each function until the first time it's run), having
+optimization managed by our JIT will allow us to optimize lazily too, rather
+than having to do all our optimization up-front.
+
+To add optimization support to our JIT we will take the KaleidoscopeJIT from
+Chapter 1 and compose an ORC *IRTransformLayer* on top. We will look at how the
+IRTransformLayer works in more detail below, but the interface is simple: the
+constructor for this layer takes a reference to the layer below (as all layers
+do) plus an *IR optimization function* that it will apply to each Module that
+is added via addModuleSet:
+
+.. code-block: c++
+
+  class KaleidoscopeJIT {
+  private:
+    std::unique_ptr<TargetMachine> TM;
+    const DataLayout DL;
+    ObjectLinkingLayer<> ObjectLayer;
+    IRCompileLayer<decltype(ObjectLayer)> CompileLayer;
+
+    typedef std::function<std::unique_ptr<Module>(std::unique_ptr<Module>)>
+      OptimizeFunction;
+
+    IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;
+
+  public:
+    typedef decltype(OptimizeLayer)::ModuleSetHandleT ModuleHandle;
+
+    KaleidoscopeJIT()
+        : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
+          CompileLayer(ObjectLayer, SimpleCompiler(*TM)),
+          OptimizeLayer(CompileLayer,
+                        [this](std::unique_ptr<Module> M) {
+                          return optimizeModule(std::move(M));
+                        }) {
+      llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
+    }
+
+Our extended KaleidoscopeJIT class starts out the same as it did in Chapter 1,
+but after the CompileLayer we introduce a typedef for our optimization function.
+In this case we use a std::function (a handy wrapper for "function-like" things)
+from a single unique_ptr<Module> input to a std::unique_ptr<Module> output. With
+our optimization function typedef in place we can declare our OptimizeLayer,
+which sits on top of our CompileLayer.
+
+To initialize our OptimizeLayer we pass it a reference to the CompileLayer
+below (standard practice for layers), and we initialize the OptimizeFunction
+using a lambda. In the lambda, we just call out to the "optimizeModule" function
+that we will define below.
+
+.. code-block:
+
+  // ...
+  auto Resolver = createLambdaResolver(
+      [&](const std::string &Name) {
+        if (auto Sym = OptimizeLayer.findSymbol(Name, false))
+          return Sym.toRuntimeDyldSymbol();
+        return RuntimeDyld::SymbolInfo(nullptr);
+      },
+  // ...
+  // Add the set to the JIT with the resolver we created above and a newly
+  // created SectionMemoryManager.
+  return OptimizeLayer.addModuleSet(std::move(Ms),
+                                    make_unique<SectionMemoryManager>(),
+                                    std::move(Resolver));
+  // ...
+
+  // ...
+  return OptimizeLayer.findSymbol(MangledNameStream.str(), true);
+  // ...
+
+  // ...
+  OptimizeLayer.removeModuleSet(H);
+  // ...
+
+Next we need to replace references to 'CompileLayer' with references to
+OptimizeLayer in our key methods: addModule, findSymbol, and removeModule. In
+addModule we need to be careful to replace both references: the findSymbol call
+inside our resolver, and the call through to addModuleSet.
+
+.. code-block: c++
+
+  std::unique_ptr<Module> optimizeModule(std::unique_ptr<Module> M) {
+    // Create a function pass manager.
+    auto FPM = llvm::make_unique<legacy::FunctionPassManager>(M.get());
+
+    // Add some optimizations.
+    FPM->add(createInstructionCombiningPass());
+    FPM->add(createReassociatePass());
+    FPM->add(createGVNPass());
+    FPM->add(createCFGSimplificationPass());
+    FPM->doInitialization();
+
+    // Run the optimizations over all functions in the module being added to
+    // the JIT.
+    for (auto &F : *M)
+      FPM->run(F);
+
+    return M;
+  }
+
+At the bottom of our JIT we add a private method to do the actual optimization:
+*optimizeModule*. This function sets up a FunctionPassManager, adds some passes
+to it, runs it over every function in the module, and then returns the mutated
+module. The specific optimizations used are the same ones used in
+`Chapter 4 <LangImpl4.html>`_ of the "Implementing a language with LLVM"
+tutorial series -- readers may visit that chapter for a more in-depth
+discussion of them, and of IR optimization in general.
+
+And that's it: When a module is added to our JIT the OptimizeLayer will now
+pass it to our optimizeModule function before passing the transformed module
+on to the CompileLayer below. Of course, we could have called optimizeModule
+directly in our addModule function and not gone to the bother of using the
+IRTransformLayer, but it gives us an opportunity to see how layers compose, and
+how one can be implemented, because IRTransformLayer turns out to be one of
+the simplest implementations of the *layer* concept that can be devised:
+
+.. code-block:
+
+  template <typename BaseLayerT, typename TransformFtor>
+  class IRTransformLayer {
+  public:
+    typedef typename BaseLayerT::ModuleSetHandleT ModuleSetHandleT;
+
+    IRTransformLayer(BaseLayerT &BaseLayer,
+                     TransformFtor Transform = TransformFtor())
+      : BaseLayer(BaseLayer), Transform(std::move(Transform)) {}
+
+    template <typename ModuleSetT, typename MemoryManagerPtrT,
+              typename SymbolResolverPtrT>
+    ModuleSetHandleT addModuleSet(ModuleSetT Ms,
+                                  MemoryManagerPtrT MemMgr,
+                                  SymbolResolverPtrT Resolver) {
+
+      for (auto I = Ms.begin(), E = Ms.end(); I != E; ++I)
+        *I = Transform(std::move(*I));
+
+      return BaseLayer.addModuleSet(std::move(Ms), std::move(MemMgr),
+                                  std::move(Resolver));
+    }
+
+    void removeModuleSet(ModuleSetHandleT H) { BaseLayer.removeModuleSet(H); }
+
+    JITSymbol findSymbol(const std::string &Name, bool ExportedSymbolsOnly) {
+      return BaseLayer.findSymbol(Name, ExportedSymbolsOnly);
+    }
+
+    JITSymbol findSymbolIn(ModuleSetHandleT H, const std::string &Name,
+                           bool ExportedSymbolsOnly) {
+      return BaseLayer.findSymbolIn(H, Name, ExportedSymbolsOnly);
+    }
+
+    void emitAndFinalize(ModuleSetHandleT H) {
+      BaseLayer.emitAndFinalize(H);
+    }
+
+    TransformFtor& getTransform() { return Transform; }
+
+    const TransformFtor& getTransform() const { return Transform; }
+
+  private:
+    BaseLayerT &BaseLayer;
+    TransformFtor Transform;
+  };
+
+This is the whole definition of IRTransformLayer, from
+``llvm/include/llvm/ExecutionEngine/Orc/IRTransformLayer.h``, stripped of its
+comments. It is a template class with two template arguments: ``BaesLayerT`` and
+``TransformFtor`` that provide the type of the base layer, and the type of the
+"transform functor" (in our case a std::function) respectively. The body of the
+class is concerned with two very simple jobs: (1) Running every IR Module that
+is added with addModuleSet through the transform functor, and (2) conforming to
+the ORC layer interface, which is:
+
++------------------------------------------------------------------------------+
+|     Interface    |                         Description                       |
++==================+===========================================================+
+|                  | Provides a handle that can be used to identify a module   |
+| ModuleSetHandleT | set when calling findSymbolIn, removeModuleSet, or        |
+|                  | emitAndFinalize.                                          |
++------------------+-----------------------------------------------------------+
+|                  | Takes a given set of Modules and makes them "available    |
+|                  | for execution. This means that symbols in those modules   |
+|                  | should be searchable via findSymbol and findSymbolIn, and |
+|                  | the address of the symbols should be read/writable (for   |
+|                  | data symbols), or executable (for function symbols) after |
+|                  | JITSymbol::getAddress() is called. Note: This means that  |
+|   addModuleSet   | addModuleSet doesn't have to compile (or do any other     |
+|                  | work) up-front. It *can*, like IRCompileLayer, act        |
+|                  | eagerly, but it can also simply record the module and     |
+|                  | take no further action until somebody calls               |
+|                  | JITSymbol::getAddress(). In IRTransformLayer's case       |
+|                  | addModuleSet eagerly applies the transform functor to     |
+|                  | each module in the set, then passes the resulting set     |
+|                  | of mutated modules down to the layer below.               |
++------------------+-----------------------------------------------------------+
+|                  | Removes a set of modules from the JIT. Code or data       |
+|  removeModuleSet | defined in these modules will no longer be available, and |
+|                  | the memory holding the JIT'd definitions will be freed.   |
++------------------+-----------------------------------------------------------+
+|                  | Searches for the named symbol in all modules that have    |
+|                  | previously been added via addModuleSet (and not yet       |
+|    findSymbol    | removed by a call to removeModuleSet). In                 |
+|                  | IRTransformLayer we just pass the query on to the layer   |
+|                  | below. In our REPL this is our default way to search for  |
+|                  | function definitions.                                     |
++------------------+-----------------------------------------------------------+
+|                  | Searches for the named symbol in the module set indicated |
+|                  | by the given ModuleSetHandleT. This is just an optimized  |
+|                  | search, better for lookup-speed when you know exactly     |
+|                  | a symbol definition should be found. In IRTransformLayer  |
+|   findSymbolIn   | we just pass this query on to the layer below. In our     |
+|                  | REPL we use this method to search for functions           |
+|                  | representing top-level expressions, since we know exactly |
+|                  | where we'll find them: in the top-level expression module |
+|                  | we just added.                                            |
++------------------+-----------------------------------------------------------+
+|                  | Forces all of the actions required to make the code and   |
+|                  | data in a module set (represented by a ModuleSetHandleT)  |
+|                  | accessible. Behaves as if some symbol in the set had been |
+|                  | searched for and JITSymbol::getSymbolAddress called. This |
+| emitAndFinalize  | is rarely needed, but can be useful when dealing with     |
+|                  | layers that usually behave lazily if the user wants to    |
+|                  | trigger early compilation (for example, to use idle CPU   |
+|                  | time to eagerly compile code in the background).          |
++------------------+-----------------------------------------------------------+
+
+This interface attempts to capture the natural operations of a JIT (with some
+wrinkles like emitAndFinalize for performance), similar to the basic JIT API
+operations we identified in Chapter 1. Conforming to the layer concept allows
+classes to compose neatly by implementing their behaviors in terms of the these
+same operations, carried out on the layer below. For example, an eager layer
+(like IRTransformLayer) can implement addModuleSet by running each module in the
+set through its transform up-front and immediately passing the result to the
+layer below. A lazy layer, by contrast, could implement addModuleSet by
+squirreling away the modules doing no other up-front work, but applying the
+transform (and calling addModuleSet on the layer below) when the client calls
+findSymbol instead. The JIT'd program behavior will be the same either way, but
+these choices will have different performance characteristics: Doing work
+eagerly means the JIT takes longer up-front, but proceeds smoothly once this is
+done. Deferring work allows the JIT to get up-and-running quickly, but will
+force the JIT to pause and wait whenever some code or data is needed that hasn't
+already been procesed.
+
+Our current REPL is eager: Each function definition is optimized and compiled as
+soon as it's typed in. If we were to make the transform layer lazy (but not
+change things otherwise) we could defer optimization until the first time we
+reference a function in a top-level expression (see if you can figure out why,
+then check out the answer below [1]_). In the next chapter, however we'll
+introduce fully lazy compilation, in which function's aren't compiled until
+they're first called at run-time. At this point the trade-offs get much more
+interesting: the lazier we are, the quicker we can start executing the first
+function, but the more often we'll have to pause to compile newly encountered
+functions. If we only code-gen lazily, but optimize eagerly, we'll have a slow
+startup (which everything is optimized) but relatively short pauses as each
+function just passes through code-gen. If we both optimize and code-gen lazily
+we can start executing the first function more quickly, but we'll have longer
+pauses as each function has to be both optimized and code-gen'd when it's first
+executed. Things become even more interesting if we consider interproceedural
+optimizations like inlining, which must be performed eagerly. These are
+complex trade-offs, and there is no one-size-fits all solution to them, but by
+providing composable layers we leave the decisions to the person implementing
+the JIT, and make it easy for them to experiment with different configurations.
 
-**(2) Describe IRTransformLayer, show how it is used to call our optimizeModule
-method.**
-
-**(3) Describe the ORC Layer concept using IRTransformLayer as an example.**
+`Next: Adding Per-function Lazy Compilation <BuildingAJIT3.html>`_
 
 Full Code Listing
 =================
@@ -47,4 +321,9 @@ Here is the code:
 .. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter2/KaleidoscopeJIT.h
    :language: c++
 
-`Next: Adding Per-function Lazy Compilation <BuildingAJIT3.html>`_
+.. [1] When we add our top-level expression to the JIT, any calls to functions
+       that we defined earlier will appear to the ObjectLinkingLayer as
+       external symbols. The ObjectLinkingLayer will call the SymbolResolver
+       that we defined in addModuleSet, which in turn calls findSymbol on the
+       OptimizeLayer, at which point even a lazy transform layer will have to
+       do its work.