[llvm] r366075 - [ORC] Start adding ORCv1 to ORCv2 transition tips to the ORCv2 doc.
Lang Hames via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 15 08:36:37 PDT 2019
Author: lhames
Date: Mon Jul 15 08:36:37 2019
New Revision: 366075
URL: http://llvm.org/viewvc/llvm-project?rev=366075&view=rev
Log:
[ORC] Start adding ORCv1 to ORCv2 transition tips to the ORCv2 doc.
Added:
llvm/trunk/docs/ORCv2.rst
- copied, changed from r366072, llvm/trunk/docs/ORCv2DesignAndImplementation.rst
Removed:
llvm/trunk/docs/ORCv2DesignAndImplementation.rst
Modified:
llvm/trunk/docs/index.rst
Copied: llvm/trunk/docs/ORCv2.rst (from r366072, llvm/trunk/docs/ORCv2DesignAndImplementation.rst)
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ORCv2.rst?p2=llvm/trunk/docs/ORCv2.rst&p1=llvm/trunk/docs/ORCv2DesignAndImplementation.rst&r1=366072&r2=366075&rev=366075&view=diff
==============================================================================
--- llvm/trunk/docs/ORCv2DesignAndImplementation.rst (original)
+++ llvm/trunk/docs/ORCv2.rst Mon Jul 15 08:36:37 2019
@@ -16,7 +16,7 @@ Use-cases
=========
ORC provides a modular API for building JIT compilers. There are a range
-of use cases for such an API:
+of use cases for such an API. For example:
1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
compiled from a toy language: Kaleidoscope.
@@ -56,11 +56,11 @@ ORC provides the following features:
deferring compilation until first call.
- *Support for custom compilers and program representations*. Clients can supply
- custom compilers for each symbol that they define in their JIT session. ORC
- will run the user-supplied compiler when a definition of a symbol is
- needed. ORC is actually fully language agnostic: LLVM IR is not treated
- specially, and is supported via the same wrapper mechanism (the
- ``MaterializationUnit`` class) that is used for custom compilers.
+ custom compilers for each symbol that they define in their JIT session. ORC
+ will run the user-supplied compiler when a definition of a symbol is
+ needed. ORC is actually fully language agnostic: LLVM IR is not treated
+ specially, and is supported via the same wrapper mechanism (the
+ ``MaterializationUnit`` class) that is used for custom compilers.
- *Concurrent JIT'd code* and *concurrent compilation*. JIT'd code may spawn
multiple threads, and may re-enter the JIT (e.g. for lazy compilation)
@@ -311,10 +311,129 @@ Supporting Custom Compilers
TBD.
-Low Level (MCJIT style) Use
-===========================
+Transitioning from ORCv1 to ORCv2
+=================================
-TBD.
+Since LLVM 7.0, new ORC development has focused on adding support for concurrent
+compilation. In order to enable concurrency, new APIs were introduced
+(ExecutionSession, JITDylib, etc.) and new implementations of existing layers
+were written. In LLVM 8.0 the old layer implementations, which do not support
+concurrency, were renamed (with a "Legacy" prefix) but remained in tree. In
+LLVM 9.0 we added a deprecation warning for the old layers and utilities,
+and in LLVM 10.0 the old layers and utilities will be removed.
+
+Clients currently using the legacy (ORCv1) layers and utilities will usually
+find it easy to transition to the newer (ORCv2) variants. Most of the ORCv1
+layers and utilities have ORCv2 counterparts [2]_ that can be
+substituted. However, there are some differences between ORCv1 and ORCv2 to be
+aware of:
+
+ 1. All JIT stacks now need an ExecutionSession instance which manages the
+ string pool, error reporting, synchronization, and symbol lookup.
+
+ 2. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) to reduce memory
+ overhead and improve lookup performance. To get a uniqued string, call
+ ``intern`` on your ExecutionSession instance:
+
+ .. code-block:: c++
+
+ ExecutionSession ES;
+
+ /// ...
+
+ auto MainSymbolName = ES.intern("main");
+
+ 3. Program representations (Modules, Object Files, etc.) are no longer added
+ *to* layers. Instead, they are added *to* JITDylibs *by* layers. The layer
+ determines how the program representation will be compiled if it is needed.
+ The JITDylib provides the symbol table, enforces linkage rules (e.g.
+ rejecting duplicate definitions), and synchronizes concurrent compiles.
+
+ Most ORCv1 clients (or MCJIT clients wanting to try out ORCv2) should
+ simply add code to the default *main* JITDylib provided by the
+ ExecutionSession:
+
+ .. code-block:: c++
+
+ ExecutionSession ES;
+ RTDyldObjectLinkingLayer ObjLinkingLayer(
+ ES, []() { return llvm::make_unique<SectionMemoryManager>(); });
+ IRCompileLayer CompileLayer(ES, ObjLinkingLayer, SimpleIRCompiler(TM));
+
+ auto M = loadModule(...);
+
+ if (auto Err = CompileLayer.add(ES.getMainJITDylib(), M))
+ return Err;
+
+ 4. IR layers require ThreadSafeModule instances, rather than
+ std::unique_ptr<Module>s. A ThreadSafeModule instance is a pair of a
+ std::unique_ptr<Module> and a ThreadSafeContext, which is in turn a
+ pair of a std::unique_ptr<LLVMContext> and a lock. This allows the JIT
+ to ensure that the LLVMContext for a module is locked before the module
+ is accessed. Multiple ThreadSafeModules may share a ThreadSafeContext
+ value, but in that case the modules cannot be compiled
+ concurrently [3]_.
+
+ ThreadSafeContexts may be constructed explicitly:
+
+ .. code-block:: c++
+
+ // ThreadSafeContext shared between two modules.
+ ThreadSafeContext TSCtx(llvm::make_unique<LLVMContext>());
+ ThreadSafeModule TSM1(
+ llvm::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);
+ ThreadSafeModule TSM2(
+ llvm::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);
+
+ Alternatively, a ThreadSafeContext can be created implicitly by passing a new
+ LLVMContext to the ThreadSafeModule constructor:
+
+ .. code-block:: c++
+
+ // Constructing a ThreadSafeModule (and implicitly a ThreadSafeContext)
+ // from a pair of a Module and a Context.
+ auto Ctx = llvm::make_unique<LLVMContext>();
+ auto M = llvm::make_unique<Module>("M", *Ctx);
+ return ThreadSafeModule(std::move(M), std::move(Ctx));
+
+ 5. The symbol resolution and lookup schemes have been fundamentally changed.
+ Symbol lookup has been removed from the layer interface. Instead,
+ symbols are looked up via the ``ExecutionSession::lookup`` method by
+ scanning a list of JITDylibs.
+
+ SymbolResolvers have been removed entirely. Resolution rules now follow the
+ linkage relationship between JITDylibs. For example, to resolve a reference
+ to a symbol *F* from a module *M* that has been added to JITDylib *J1* we
+ would first search for a definition of *F* in *J1* then (if no definition
+ was found) search each of the JITDylibs that *J1* links against.
+
+ While the new resolution scheme is, strictly speaking, less flexible than
+ the old scheme of customizable resolvers, this has not yet led to problems
+ in practice. Instead, using standard linker rules has removed a lot of
+ boilerplate while providing correct [4]_ behavior for common and weak symbols.
+
+ One notable difference is in exposing in-process symbols to the JIT. To
+ support this (without requiring the set of symbols to be enumerated up
+ front), JITDylibs allow a *GeneratorFunction* to be attached that
+ generates new definitions upon lookup. Reflecting the process's symbols into
+ the JIT can be done by writing:
+
+ .. code-block:: c++
+
+ ExecutionSession ES;
+ const DataLayout &DL = ...;
+
+ {
+ auto ProcessSymbolsGenerator =
+ DynamicLibrarySearchGenerator::GetForCurrentProcess(DL.getGlobalPrefix());
+ if (!ProcessSymbolsGenerator)
+ return ProcessSymbolsGenerator.takeError();
+ ES.getMainJITDylib().setGenerator(std::move(*ProcessSymbolsGenerator));
+ }
+
+ 6. Module removal is not yet supported. There is no equivalent of the ORCv1
+ layers' removeModule/removeObject methods. Work on resource tracking
+ and removal in ORCv2 is ongoing.
Future Features
===============
@@ -322,4 +441,24 @@ Future Features
TBD: Speculative compilation. Object Caches.
.. [1] Formats/architectures vary in terms of supported features. MachO and
- ELF tend to have better support than COFF. Patches very welcome!
\ No newline at end of file
+ ELF tend to have better support than COFF. Patches very welcome!
+
+.. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
+ ``RemoteObjectServerLayer`` do not have counterparts in the new
+ system. In the case of ``LazyEmittingLayer`` it was simply no longer
+ needed: in ORCv2, deferring compilation until symbols are looked up is
+ the default. The removal of ``RemoteObjectClientLayer`` and
+ ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split
+ across processes; however, this functionality appears not to have been
+ used.
+
+.. [3] Sharing ThreadSafeModules in a concurrent compilation can be dangerous:
+ if interdependent modules are loaded on the same context but compiled
+ on different threads, a deadlock may occur (with each compile waiting for
+ the other(s) to complete, and the other(s) unable to proceed because the
+ context is locked).
+
+.. [4] Mostly. Weak definitions are handled correctly within dylibs, but if
+ multiple dylibs provide a weak definition of a symbol each will end up
+ with its own definition (similar to how weak symbols in Windows DLLs
+ behave). This will be fixed in the future.
\ No newline at end of file
Removed: llvm/trunk/docs/ORCv2DesignAndImplementation.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/ORCv2DesignAndImplementation.rst?rev=366074&view=auto
==============================================================================
--- llvm/trunk/docs/ORCv2DesignAndImplementation.rst (original)
+++ llvm/trunk/docs/ORCv2DesignAndImplementation.rst (removed)
@@ -1,325 +0,0 @@
-===============================
-ORC Design and Implementation
-===============================
-
-Introduction
-============
-
-This document aims to provide a high-level overview of the design and
-implementation of the ORC JIT APIs. Except where otherwise stated, all
-discussion applies to the design of the APIs as of LLVM version 9 (ORCv2).
-
-.. contents::
- :local:
-
-Use-cases
-=========
-
-ORC provides a modular API for building JIT compilers. There are a range
-of use cases for such an API:
-
-1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
-compiled from a toy language: Kaleidoscope.
-
-2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
-evaluation. In this use case, cross compilation allows expressions compiled
-in the debugger process to be executed on the debug target process, which may
-be on a different device/architecture.
-
-3. In high-performance JITs (e.g. JVMs, Julia) that want to make use of LLVM's
-optimizations within an existing JIT infrastructure.
-
-4. In interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter.
-
-By adopting a modular, library-based design we aim to make ORC useful in as many
-of these contexts as possible.
-
-Features
-========
-
-ORC provides the following features:
-
-- *JIT-linking* links relocatable object files (COFF, ELF, MachO) [1]_ into a
- target process at runtime. The target process may be the same process that
- contains the JIT session object and jit-linker, or may be another process
- (even one running on a different machine or architecture) that communicates
- with the JIT via RPC.
-
-- *LLVM IR compilation*, which is provided by off the shelf components
- (IRCompileLayer, SimpleCompiler, ConcurrentIRCompiler) that make it easy to
- add LLVM IR to a JIT'd process.
-
-- *Eager and lazy compilation*. By default, ORC will compile symbols as soon as
- they are looked up in the JIT session object (``ExecutionSession``). Compiling
- eagerly by default makes it easy to use ORC as a simple in-memory compiler for
- an existing JIT. ORC also provides a simple mechanism, lazy-reexports, for
- deferring compilation until first call.
-
-- *Support for custom compilers and program representations*. Clients can supply
- custom compilers for each symbol that they define in their JIT session. ORC
- will run the user-supplied compiler when a definition of a symbol is
- needed. ORC is actually fully language agnostic: LLVM IR is not treated
- specially, and is supported via the same wrapper mechanism (the
- ``MaterializationUnit`` class) that is used for custom compilers.
-
-- *Concurrent JIT'd code* and *concurrent compilation*. JIT'd code may spawn
- multiple threads, and may re-enter the JIT (e.g. for lazy compilation)
- concurrently from multiple threads. The ORC APIs also support running multiple
- compilers concurrently, and provides off-the-shelf infrastructure to track
- dependencies on running compiles (e.g. to ensure that we never call into code
- until it is safe to do so, even if that involves waiting on multiple
- compiles).
-
-- *Orthogonality* and *composability*: Each of the features above can be used (or
- not) independently. It is possible to put ORC components together to make a
- non-lazy, in-process, single threaded JIT or a lazy, out-of-process,
- concurrent JIT, or anything in between.
-
-LLJIT and LLLazyJIT
-===================
-
-ORC provides two basic JIT classes off-the-shelf. These are useful both as
-examples of how to assemble ORC components to make a JIT, and as replacements
-for earlier LLVM JIT APIs (e.g. MCJIT).
-
-The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
-compilation of LLVM IR and linking of relocatable object files. All operations
-are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
-as soon as you attempt to look up its address). LLJIT is a suitable replacement
-for MCJIT in most cases (note: some more advanced features, e.g.
-JITEventListeners are not supported yet).
-
-The LLLazyJIT extends LLJIT and adds a CompileOnDemandLayer to enable lazy
-compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule
-method, function bodies in that module will not be compiled until they are first
-called. LLLazyJIT aims to provide a replacement of LLVM's original (pre-MCJIT)
-JIT API.
-
-LLJIT and LLLazyJIT instances can be created using their respective builder
-classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
-module ``M`` loaded on a ThreadSafeContext ``Ctx``:
-
-.. code-block:: c++
-
- // Try to detect the host arch and construct an LLJIT instance.
- auto JIT = LLJITBuilder().create();
-
- // If we could not construct an instance, return an error.
- if (!JIT)
- return JIT.takeError();
-
- // Add the module.
- if (auto Err = JIT->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
- return Err;
-
- // Look up the JIT'd code entry point.
- auto EntrySym = JIT->lookup("entry");
- if (!EntrySym)
- return EntrySym.takeError();
-
- auto *Entry = (void(*)())EntrySym.getAddress();
-
- Entry();
-
-The builder classes provide a number of configuration options that can be
-specified before the JIT instance is constructed. For example:
-
-.. code-block:: c++
-
- // Build an LLLazyJIT instance that uses four worker threads for compilation,
- // and jumps to a specific error handler (rather than null) on lazy compile
- // failures.
-
- void handleLazyCompileFailure() {
- // JIT'd code will jump here if lazy compilation fails, giving us an
- // opportunity to exit or throw an exception into JIT'd code.
- throw JITFailed();
- }
-
- auto JIT = LLLazyJITBuilder()
- .setNumCompileThreads(4)
- .setLazyCompileFailureAddr(
- toJITTargetAddress(&handleLazyCompileFailure))
- .create();
-
- // ...
-
-For users wanting to get started with LLJIT a minimal example program can be
-found at ``llvm/examples/HowToUseLLJIT``.
-
-Design Overview
-===============
-
-ORC's JIT'd program model aims to emulate the linking and symbol resolution
-rules used by the static and dynamic linkers. This allows ORC to JIT
-arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
-clang) that uses constructs like symbol linkage and visibility, and weak and
-common symbol definitions.
-
-To see how this works, imagine a program ``foo`` which links against a pair
-of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
-system might look like:
-
-.. code-block:: bash
-
- $ clang++ -shared -o libA.dylib a1.cpp a2.cpp
- $ clang++ -shared -o libB.dylib b1.cpp b2.cpp
- $ clang++ -o myapp myapp.cpp -L. -lA -lB
- $ ./myapp
-
-In ORC, this would translate into API calls on a "CXXCompilingLayer" (with error
-checking omitted for brevity) as:
-
-.. code-block:: c++
-
- ExecutionSession ES;
- RTDyldObjectLinkingLayer ObjLinkingLayer(
- ES, []() { return llvm::make_unique<SectionMemoryManager>(); });
- CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);
-
- // Create JITDylib "A" and add code to it using the CXX layer.
- auto &LibA = ES.createJITDylib("A");
- CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
- CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));
-
- // Create JITDylib "B" and add code to it using the CXX layer.
- auto &LibB = ES.createJITDylib("B");
- CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
- CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));
-
- // Specify the search order for the main JITDylib. This is equivalent to a
- // "links against" relationship in a command-line link.
- ES.getMainJITDylib().setSearchOrder({{&LibA, false}, {&LibB, false}});
- CXXLayer.add(ES.getMainJITDylib(), MemoryBuffer::getFile("main.cpp"));
-
- // Look up the JIT'd main, cast it to a function pointer, then call it.
- auto MainSym = ExitOnErr(ES.lookup({&ES.getMainJITDylib()}, "main"));
- auto *Main = (int(*)(int, char*[]))MainSym.getAddress();
-
- int Result = Main(...);
-
-
-This example tells us nothing about *how* or *when* compilation will happen.
-That will depend on the implementation of the hypothetical CXXCompilingLayer,
-but the linking rules will be the same regardless. For example, if a1.cpp and
-a2.cpp both define a function "foo" the API should generate a duplicate
-definition error. On the other hand, if a1.cpp and b1.cpp both define "foo"
-there is no error (different dynamic libraries may define the same symbol). If
-main.cpp refers to "foo", it should bind to the definition in LibA rather than
-the one in LibB, since main.cpp is part of the "main" dylib, and the main dylib
-links against LibA before LibB.
-
-Many JIT clients will have no need for this strict adherence to the usual
-ahead-of-time linking rules and should be able to get by just fine by putting
-all of their code in a single JITDylib. However, clients who want to JIT code
-for languages/projects that traditionally rely on ahead-of-time linking (e.g.
-C++) will find that this feature makes life much easier.
-
-Symbol lookup in ORC serves two other important functions, beyond basic lookup:
-(1) It triggers compilation of the symbol(s) searched for, and (2) it provides
-the synchronization mechanism for concurrent compilation. The pseudo-code for
-the lookup process is:
-
-.. code-block:: none
-
- construct a query object from a query set and query handler
- lock the session
- lodge query against requested symbols, collect required materializers (if any)
- unlock the session
- dispatch materializers (if any)
-
-In this context a materializer is something that provides a working definition
-of a symbol upon request. Generally materializers wrap compilers, but they may
-also wrap a linker directly (if the program representation backing the
-definitions is an object file), or even just a class that writes bits directly
-into memory (if the definitions are stubs). Materialization is the blanket term
-for any actions (compiling, linking, splatting bits, registering with runtimes,
-etc.) that are required to generate a symbol definition that is safe to call or
-access.
-
-As each materializer completes its work it notifies the JITDylib, which in turn
-notifies any query objects that are waiting on the newly materialized
-definitions. Each query object maintains a count of the number of symbols that
-it is still waiting on, and once this count reaches zero the query object calls
-the query handler with a *SymbolMap* (a map of symbol names to addresses)
-describing the result. If any symbol fails to materialize the query immediately
-calls the query handler with an error.
-
-The collected materialization units are sent to the ExecutionSession to be
-dispatched, and the dispatch behavior can be set by the client. By default each
-materializer is run on the calling thread. Clients are free to create new
-threads to run materializers, or to send the work to a work queue for a thread
-pool (this is what LLJIT/LLLazyJIT do).
-
-Top Level APIs
-==============
-
-Many of ORC's top-level APIs are visible in the example above:
-
-- *ExecutionSession* represents the JIT'd program and provides context for the
- JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the
- materializers.
-
-- *JITDylibs* provide the symbol tables.
-
-- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
- allow clients to add uncompiled program representations supported by those
- compilers to JITDylibs.
-
-Several other important APIs are used explicitly. JIT clients need not be aware
-of them, but Layer authors will use them:
-
-- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
- program representation (in this example, C++ source) in a MaterializationUnit,
- which is then stored in the JITDylib. MaterializationUnits are responsible for
- describing the definitions they provide, and for unwrapping the program
- representation and passing it back to the layer when compilation is required
- (this ownership shuffle makes writing thread-safe layers easier, since the
- ownership of the program representation will be passed back on the stack,
- rather than having to be fished out of a Layer member, which would require
- synchronization).
-
-- *MaterializationResponsibility* - When a MaterializationUnit hands a program
- representation back to the layer it comes with an associated
- MaterializationResponsibility object. This object tracks the definitions
- that must be materialized and provides a way to notify the JITDylib once they
- are either successfully materialized or a failure occurs.
-
-Handy utilities
-===============
-
-TBD: absolute symbols, aliases, off-the-shelf layers.
-
-Laziness
-========
-
-Laziness in ORC is provided by a utility called "lazy-reexports". The aim of
-this utility is to re-use the synchronization provided by the symbol lookup
-mechanism to make it safe to lazily compile functions, even if calls to the
-stub occur simultaneously on multiple threads of JIT'd code. It does this by
-reducing lazy compilation to symbol lookup: The lazy stub performs a lookup of
-its underlying definition on first call, updating the function body pointer
-once the definition is available. If additional calls arrive on other threads
-while compilation is ongoing they will be safely blocked by the normal lookup
-synchronization guarantee (no result until the result is safe) and can also
-proceed as soon as compilation completes.
-
-TBD: Usage example.
-
-Supporting Custom Compilers
-===========================
-
-TBD.
-
-Low Level (MCJIT style) Use
-===========================
-
-TBD.
-
-Future Features
-===============
-
-TBD: Speculative compilation. Object Caches.
-
-.. [1] Formats/architectures vary in terms of supported features. MachO and
- ELF tend to have better support than COFF. Patches very welcome!
\ No newline at end of file
Modified: llvm/trunk/docs/index.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/index.rst?rev=366075&r1=366074&r2=366075&view=diff
==============================================================================
--- llvm/trunk/docs/index.rst (original)
+++ llvm/trunk/docs/index.rst Mon Jul 15 08:36:37 2019
@@ -89,7 +89,7 @@ intermediate LLVM representation.
GetElementPtr
Frontend/PerformanceTips
MCJITDesignAndImplementation
- ORCv2DesignAndImplementation
+ ORCv2
CodeOfConduct
CompileCudaWithLLVM
ReportingGuide
@@ -383,9 +383,9 @@ For API clients and LLVM developers.
:doc:`MCJITDesignAndImplementation`
Describes the inner workings of MCJIT execution engine.
-:doc:`ORCv2DesignAndImplementation`
+:doc:`ORCv2`
Describes the design and implementation of the ORC APIs, including some
- usage examples.
+ usage examples, and a guide for users transitioning from ORCv1 to ORCv2.
:doc:`BranchWeightMetadata`
Provides information about Branch Prediction Information.