[www-releases] r372328 - Check in 9.0.0 source and docs

Thu Sep 19 07:32:55 PDT 2019

Added: www-releases/trunk/9.0.0/docs/_sources/MIRLangRef.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/MIRLangRef.rst.txt?rev=372328&view=auto
==============================================================================

--- www-releases/trunk/9.0.0/docs/_sources/MIRLangRef.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/MIRLangRef.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,782 @@
+========================================
+Machine IR (MIR) Format Reference Manual
+========================================
+
+.. contents::
+   :local:
+
+.. warning::
+  This is a work in progress.
+
+Introduction
+============
+
+This document is a reference manual for the Machine IR (MIR) serialization
+format. MIR is a human readable serialization format that is used to represent
+LLVM's :ref:`machine specific intermediate representation
+<machine code representation>`.
+
+The MIR serialization format is designed to be used for testing the code
+generation passes in LLVM.
+
+Overview
+========
+
+The MIR serialization format uses a YAML container. YAML is a standard
+data serialization language, and the full YAML language spec can be read at
+`yaml.org
+<http://www.yaml.org/spec/1.2/spec.html#Introduction>`_.
+
+A MIR file is split up into a series of `YAML documents`_. The first document
+can contain an optional embedded LLVM IR module, and the rest of the documents
+contain the serialized machine functions.
+
+.. _YAML documents: http://www.yaml.org/spec/1.2/spec.html#id2800132
+
+MIR Testing Guide
+=================
+
+You can use the MIR format for testing in two different ways:
+
+- You can write MIR tests that invoke a single code generation pass using the
+  ``-run-pass`` option in llc.
+
+- You can use llc's ``-stop-after`` option with existing or new LLVM assembly
+  tests and check the MIR output of a specific code generation pass.
+
+Testing Individual Code Generation Passes
+-----------------------------------------
+
+The ``-run-pass`` option in llc allows you to create MIR tests that invoke just
+a single code generation pass. When this option is used, llc will parse an
+input MIR file, run the specified code generation pass(es), and output the
+resulting MIR code.
+
+You can generate an input MIR file for the test by using the ``-stop-after`` or
+``-stop-before`` option in llc. For example, if you would like to write a test
+for the post register allocation pseudo instruction expansion pass, you can
+specify the machine copy propagation pass in the ``-stop-after`` option, as it
+runs just before the pass that we are trying to test:
+
+   ``llc -stop-after=machine-cp bug-trigger.ll > test.mir``
+
+If the same pass is run multiple times, a run index can be included
+after the name with a comma.
+
+   ``llc -stop-after=dead-mi-elimination,1 bug-trigger.ll > test.mir``
+
+After generating the input MIR file, you'll have to add a run line that uses
+the ``-run-pass`` option to it. In order to test the post register allocation
+pseudo instruction expansion pass on X86-64, a run line like the one shown
+below can be used:
+
+    ``# RUN: llc -o - %s -mtriple=x86_64-- -run-pass=postrapseudos | FileCheck %s``
+
+The MIR files are target dependent, so they have to be placed in the target
+specific test directories (``test/CodeGen/TARGETNAME``). They also need to
+specify a target triple or a target architecture either in the run line or in
+the embedded LLVM IR module.
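+
+As a rough sketch of how these pieces fit together, a minimal test file for the
+run line above might look like the following. The function name ``copy_test``
+and the ``CHECK`` pattern are illustrative assumptions rather than an excerpt
+from an existing test; adjust them to the output your pass actually produces:
+
+.. code-block:: text
+
+    # RUN: llc -o - %s -mtriple=x86_64-- -run-pass=postrapseudos | FileCheck %s
+    # CHECK: MOV64rr
+    ---
+    name:            copy_test
+    tracksRegLiveness: true
+    body: |
+      bb.0:
+        liveins: $rdi
+
+        $rax = COPY $rdi
+        RETQ $rax
+    ...
+
+This sketch has no embedded LLVM IR module, so the target triple has to come
+from the run line.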
+
+Simplifying MIR files
+^^^^^^^^^^^^^^^^^^^^^
+
+The MIR code coming out of ``-stop-after``/``-stop-before`` is very verbose.
+Tests are more accessible and future proof when simplified:
+
+- Use the ``-simplify-mir`` option with llc.
+
+- Machine function attributes often have default values or the test works just
+  as well with default values. Typical candidates for this are: `alignment:`,
+  `exposesReturnsTwice`, `legalized`, `regBankSelected`, `selected`.
+  The whole `frameInfo` section is often unnecessary if there is no special
+  frame usage in the function. `tracksRegLiveness` on the other hand is often
+  necessary for some passes that care about block livein lists.
+
+- The (global) `liveins:` list is typically only interesting for early
+  instruction selection passes and can be removed when testing later passes.
+  The per-block `liveins:` on the other hand are necessary if
+  `tracksRegLiveness` is true.
+
+- Branch probability data in block `successors:` lists can be dropped if the
+  test doesn't depend on it. Example:
+  `successors: %bb.1(0x40000000), %bb.2(0x40000000)` can be replaced with
+  `successors: %bb.1, %bb.2`.
+
+- MIR code contains a whole IR module. This is necessary because there are
+  no equivalents in MIR for global variables, references to external functions,
+  function attributes, metadata, or debug info. Instead, some MIR data
+  references the IR constructs. You can often remove them if the test doesn't
+  depend on them.
+
+- Alias Analysis is performed on IR values. These are referenced by memory
+  operands in MIR. Example: `:: (load 8 from %ir.foobar, !alias.scope !9)`.
+  If the test doesn't depend on (good) alias analysis the references can be
+  dropped: `:: (load 8)`
+
+- MIR blocks can reference IR blocks for debug printing, profile information
+  or debug locations. Example: `bb.42.myblock` in MIR references the IR block
+  `myblock`. It is usually possible to drop the `.myblock` reference and simply
+  use `bb.42`.
+
+- If there are no memory operands or blocks referencing the IR then the
+  IR function can be replaced by a parameterless dummy function like
+  `define void @func() { ret void }`.
+
+- It is possible to drop the whole IR section of the MIR file if it only
+  contains dummy functions (see above). The .mir loader will create the
+  IR functions automatically in this case.
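+
+As an illustration of the points above, a block emitted by ``-stop-after``
+such as the following (the block names and the memory operand are made up for
+this sketch):
+
+.. code-block:: text
+
+    bb.0.entry:
+      successors: %bb.1.then(0x40000000), %bb.2.else(0x40000000)
+      liveins: $rdi
+
+      $eax = MOV32rm $rdi, 1, $noreg, 0, $noreg :: (load 4 from %ir.x)
+      <instructions>
+
+could usually be reduced to this form, provided the test does not depend on the
+branch probabilities, the alias information, or the IR block names:
+
+.. code-block:: text
+
+    bb.0:
+      successors: %bb.1, %bb.2
+      liveins: $rdi
+
+      $eax = MOV32rm $rdi, 1, $noreg, 0, $noreg :: (load 4)
+      <instructions>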
+
+.. _limitations:
+
+Limitations
+-----------
+
+Currently the MIR format has several limitations in terms of which state it
+can serialize:
+
+- The target-specific state in the target-specific ``MachineFunctionInfo``
+  subclasses isn't serialized at the moment.
+
+- The target-specific ``MachineConstantPoolValue`` subclasses (in the ARM and
+  SystemZ backends) aren't serialized at the moment.
+
+- The ``MCSymbol`` machine operands don't support temporary or local symbols.
+
+- A lot of the state in ``MachineModuleInfo`` isn't serialized - only the CFI
+  instructions and the variable debug information from MMI is serialized right
+  now.
+
+These limitations impose restrictions on what you can test with the MIR format.
+For now, tests that depend on the state of temporary or local ``MCSymbol``
+operands, or on the exception handling state in MMI, can't use the MIR format.
+Likewise, tests that depend on the state of the target specific
+``MachineFunctionInfo`` or ``MachineConstantPoolValue`` subclasses can't use
+the MIR format at the moment.
+
+High Level Structure
+====================
+
+.. _embedded-module:
+
+Embedded Module
+---------------
+
+When the first YAML document contains a `YAML block literal string`_, the MIR
+parser will treat this string as an LLVM assembly language string that
+represents an embedded LLVM IR module.
+Here is an example of a YAML document that contains an LLVM module:
+
+.. code-block:: llvm
+
+       define i32 @inc(i32* %x) {
+       entry:
+         %0 = load i32, i32* %x
+         %1 = add i32 %0, 1
+         store i32 %1, i32* %x
+         ret i32 %1
+       }
+
+.. _YAML block literal string: http://www.yaml.org/spec/1.2/spec.html#id2795688
+
+Machine Functions
+-----------------
+
+The remaining YAML documents contain the machine functions. This is an example
+of such a YAML document:
+
+.. code-block:: text
+
+     ---
+     name:            inc
+     tracksRegLiveness: true
+     liveins:
+       - { reg: '$rdi' }
+     callSites:
+       - { bb: 0, offset: 3, fwdArgRegs:
+           - { arg: 0, reg: '$edi' } }
+     body: |
+       bb.0.entry:
+         liveins: $rdi
+
+         $eax = MOV32rm $rdi, 1, _, 0, _
+         $eax = INC32r killed $eax, implicit-def dead $eflags
+         MOV32mr killed $rdi, 1, _, 0, _, $eax
+         CALL64pcrel32 @foo <regmask...>
+         RETQ $eax
+     ...
+
+The document above consists of attributes that represent the various
+properties and data structures in a machine function.
+
+The attribute ``name`` is required, and its value should be identical to the
+name of a function that this machine function is based on.
+
+The attribute ``body`` is a `YAML block literal string`_. Its value represents
+the function's machine basic blocks and their machine instructions.
+
+The attribute ``callSites`` is a representation of call site information which
+keeps track of call instructions and registers used to transfer call arguments.
+
+Machine Instructions Format Reference
+=====================================
+
+The machine basic blocks and their instructions are represented using a custom,
+human readable serialization language. This language is used in the
+`YAML block literal string`_ that corresponds to the machine function's body.
+
+A source string that uses this language contains a list of machine basic
+blocks, which are described in the section below.
+
+Machine Basic Blocks
+--------------------
+
+A machine basic block is defined in a single block definition source construct
+that contains the block's ID.
+The example below defines two blocks with the IDs zero and one:
+
+.. code-block:: text
+
+    bb.0:
+      <instructions>
+    bb.1:
+      <instructions>
+
+A machine basic block can also have a name. It should be specified after the ID
+in the block's definition:
+
+.. code-block:: text
+
+    bb.0.entry:       ; This block's name is "entry"
+       <instructions>
+
+The block's name should be identical to the name of the IR block that this
+machine block is based on.
+
+.. _block-references:
+
+Block References
+^^^^^^^^^^^^^^^^
+
+The machine basic blocks are identified by their ID numbers. Individual
+blocks are referenced using the following syntax:
+
+.. code-block:: text
+
+    %bb.<id>
+
+Example:
+
+.. code-block:: llvm
+
+    %bb.0
+
+The following syntax is also supported, but the former syntax is preferred for
+block references:
+
+.. code-block:: text
+
+    %bb.<id>[.<name>]
+
+Example:
+
+.. code-block:: llvm
+
+    %bb.1.then
+
+Successors
+^^^^^^^^^^
+
+The machine basic block's successors have to be specified before any of the
+instructions:
+
+.. code-block:: text
+
+    bb.0.entry:
+      successors: %bb.1.then, %bb.2.else
+      <instructions>
+    bb.1.then:
+      <instructions>
+    bb.2.else:
+      <instructions>
+
+The branch weights can be specified in brackets after the successor blocks.
+The example below defines a block that has two successors with branch weights
+of 32 and 16:
+
+.. code-block:: text
+
+    bb.0.entry:
+      successors: %bb.1.then(32), %bb.2.else(16)
+
+.. _bb-liveins:
+
+Live In Registers
+^^^^^^^^^^^^^^^^^
+
+The machine basic block's live in registers have to be specified before any of
+the instructions:
+
+.. code-block:: text
+
+    bb.0.entry:
+      liveins: $edi, $esi
+
+The list of live in registers and successors can be empty. The language also
+allows multiple live in register and successor lists - they are combined into
+one list by the parser.
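+
+For instance, under the rule above the two block headers in the following
+sketch should be read as equivalent (it reuses the registers from the previous
+example):
+
+.. code-block:: text
+
+    bb.0.entry:
+      liveins: $edi
+      liveins: $esi
+
+    ; is combined by the parser into the same list as:
+
+    bb.0.entry:
+      liveins: $edi, $esi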
+
+Miscellaneous Attributes
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The attributes ``IsAddressTaken``, ``IsLandingPad`` and ``Alignment`` can be
+specified in brackets after the block's definition:
+
+.. code-block:: text
+
+    bb.0.entry (address-taken):
+      <instructions>
+    bb.2.else (align 4):
+      <instructions>
+    bb.3(landing-pad, align 4):
+      <instructions>
+
+.. TODO: Describe the way the reference to an unnamed LLVM IR block can be
+   preserved.
+
+Machine Instructions
+--------------------
+
+A machine instruction is composed of a name,
+:ref:`machine operands <machine-operands>`,
+:ref:`instruction flags <instruction-flags>`, and machine memory operands.
+
+The instruction's name is usually specified before the operands. The example
+below shows an instance of the X86 ``RETQ`` instruction with a single machine
+operand:
+
+.. code-block:: text
+
+    RETQ $eax
+
+However, if the machine instruction has one or more explicitly defined register
+operands, the instruction's name has to be specified after them. The example
+below shows an instance of the AArch64 ``LDPXpost`` instruction with three
+defined register operands:
+
+.. code-block:: text
+
+    $sp, $fp, $lr = LDPXpost $sp, 2
+
+The instruction names are serialized using the exact definitions from the
+target's ``*InstrInfo.td`` files, and they are case sensitive. This means that
+similar instruction names like ``TSTri`` and ``tSTRi`` represent different
+machine instructions.
+
+.. _instruction-flags:
+
+Instruction Flags
+^^^^^^^^^^^^^^^^^
+
+The flag ``frame-setup`` or ``frame-destroy`` can be specified before the
+instruction's name:
+
+.. code-block:: text
+
+    $fp = frame-setup ADDXri $sp, 0, 0
+
+.. code-block:: text
+
+    $x21, $x20 = frame-destroy LDPXi $sp
+
+Bundled Instructions
+^^^^^^^^^^^^^^^^^^^^
+
+The syntax for bundled instructions is the following:
+
+.. code-block:: text
+
+    BUNDLE implicit-def $r0, implicit-def $r1, implicit $r2 {
+      $r0 = SOME_OP $r2
+      $r1 = ANOTHER_OP internal $r0
+    }
+
+The first instruction is often a bundle header. The instructions between ``{``
+and ``}`` are bundled with the first instruction.
+
+.. _registers:
+
+Registers
+---------
+
+Registers are one of the key primitives in the machine instructions
+serialization language. They are primarily used in the
+:ref:`register machine operands <register-operands>`,
+but they can also be used in a number of other places, like the
+:ref:`basic block's live in list <bb-liveins>`.
+
+The physical registers are identified by their name and by the '$' prefix sigil.
+They use the following syntax:
+
+.. code-block:: text
+
+    $<name>
+
+The example below shows three X86 physical registers:
+
+.. code-block:: text
+
+    $eax
+    $r15
+    $eflags
+
+The virtual registers are identified by their ID number and by the '%' sigil.
+They use the following syntax:
+
+.. code-block:: text
+
+    %<id>
+
+Example:
+
+.. code-block:: text
+
+    %0
+
+The null registers are represented using an underscore ('``_``'). They can also be
+represented using a '``$noreg``' named register, although the former syntax
+is preferred.
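+
+For example, the two lines below are intended to describe the same instruction,
+once with each spelling of the null register (a sketch based on the ``MOV32rm``
+example used earlier in this document):
+
+.. code-block:: text
+
+    $eax = MOV32rm $rdi, 1, _, 0, _
+    $eax = MOV32rm $rdi, 1, $noreg, 0, $noreg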
+
+.. _machine-operands:
+
+Machine Operands
+----------------
+
+There are seventeen different kinds of machine operands, and all of them can be
+serialized.
+
+Immediate Operands
+^^^^^^^^^^^^^^^^^^
+
+The immediate machine operands are untyped, 64-bit signed integers. The
+example below shows an instance of the X86 ``MOV32ri`` instruction that has an
+immediate machine operand ``-42``:
+
+.. code-block:: text
+
+    $eax = MOV32ri -42
+
+An immediate operand is also used to represent a subregister index when the
+machine instruction has one of the following opcodes:
+
+- ``EXTRACT_SUBREG``
+
+- ``INSERT_SUBREG``
+
+- ``REG_SEQUENCE``
+
+- ``SUBREG_TO_REG``
+
+In that case, the machine operand is printed using the subregister index name
+defined by the target.
+
+For example:
+
+In AArch64RegisterInfo.td:
+
+.. code-block:: text
+
+  def sub_32 : SubRegIndex<32>;
+
+If the third operand is an immediate with the value ``15`` (a target-dependent
+value), then based on the instruction's opcode and the operand's index, the
+operand will be printed as ``%subreg.sub_32``:
+
+.. code-block:: text
+
+    %1:gpr64 = SUBREG_TO_REG 0, %0, %subreg.sub_32
+
+For integers wider than 64 bits, we use a special machine operand, ``MO_CImmediate``,
+which stores the immediate in a ``ConstantInt`` using an ``APInt`` (LLVM's
+arbitrary precision integers).
+
+.. TODO: Describe the FPIMM immediate operands.
+
+.. _register-operands:
+
+Register Operands
+^^^^^^^^^^^^^^^^^
+
+The :ref:`register <registers>` primitive is used to represent the register
+machine operands. The register operands can also have optional
+:ref:`register flags <register-flags>`,
+:ref:`a subregister index <subregister-indices>`,
+and a reference to the tied register operand.
+The full syntax of a register operand is shown below:
+
+.. code-block:: text
+
+    [<flags>] <register> [ :<subregister-idx-name> ] [ (tied-def <tied-op>) ]
+
+This example shows an instance of the X86 ``XOR32rr`` instruction that has
+5 register operands with different register flags:
+
+.. code-block:: text
+
+  dead $eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, implicit-def $al
+
+.. _register-flags:
+
+Register Flags
+~~~~~~~~~~~~~~
+
+The table below shows all of the possible register flags along with the
+corresponding internal ``llvm::RegState`` representation:
+
+.. list-table::
+   :header-rows: 1
+
+   * - Flag
+     - Internal Value
+
+   * - ``implicit``
+     - ``RegState::Implicit``
+
+   * - ``implicit-def``
+     - ``RegState::ImplicitDefine``
+
+   * - ``def``
+     - ``RegState::Define``
+
+   * - ``dead``
+     - ``RegState::Dead``
+
+   * - ``killed``
+     - ``RegState::Kill``
+
+   * - ``undef``
+     - ``RegState::Undef``
+
+   * - ``internal``
+     - ``RegState::InternalRead``
+
+   * - ``early-clobber``
+     - ``RegState::EarlyClobber``
+
+   * - ``debug-use``
+     - ``RegState::Debug``
+
+   * - ``renamable``
+     - ``RegState::Renamable``
+
+.. _subregister-indices:
+
+Subregister Indices
+~~~~~~~~~~~~~~~~~~~
+
+The register machine operands can reference a portion of a register by using
+the subregister indices. The example below shows an instance of the ``COPY``
+pseudo instruction that uses the X86 ``sub_8bit`` subregister index to copy
+the lower 8 bits from the 32-bit virtual register 0 to the 8-bit virtual
+register 1:
+
+.. code-block:: text
+
+    %1 = COPY %0:sub_8bit
+
+The names of the subregister indices are target specific, and are typically
+defined in the target's ``*RegisterInfo.td`` file.
+
+Constant Pool Indices
+^^^^^^^^^^^^^^^^^^^^^
+
+A constant pool index (CPI) operand is printed using its index in the
+function's ``MachineConstantPool`` and an offset.
+
+For example, a CPI with the index 1 and offset 8:
+
+.. code-block:: text
+
+    %1:gr64 = MOV64ri %const.1 + 8
+
+For a CPI with the index 0 and offset -12:
+
+.. code-block:: text
+
+    %1:gr64 = MOV64ri %const.0 - 12
+
+A constant pool entry is bound to an LLVM IR ``Constant`` or a target-specific
+``MachineConstantPoolValue``. When serializing all the function's constants the
+following format is used:
+
+.. code-block:: text
+
+    constants:
+      - id:               <index>
+        value:            <value>
+        alignment:        <alignment>
+        isTargetSpecific: <target-specific>
+
+where ``<index>`` is a 32-bit unsigned integer, ``<value>`` is an `LLVM IR Constant
+<https://www.llvm.org/docs/LangRef.html#constants>`_, ``<alignment>`` is a 32-bit
+unsigned integer, and ``<target-specific>`` is either true or false.
+
+Example:
+
+.. code-block:: text
+
+    constants:
+      - id:               0
+        value:            'double 3.250000e+00'
+        alignment:        8
+      - id:               1
+        value:            'g-(LPC0+8)'
+        alignment:        4
+        isTargetSpecific: true
+
+Global Value Operands
+^^^^^^^^^^^^^^^^^^^^^
+
+The global value machine operands reference the global values from the
+:ref:`embedded LLVM IR module <embedded-module>`.
+The example below shows an instance of the X86 ``MOV64rm`` instruction that has
+a global value operand named ``G``:
+
+.. code-block:: text
+
+    $rax = MOV64rm $rip, 1, _, @G, _
+
+The named global values are represented using an identifier with the '@' prefix.
+If the identifier doesn't match the regular expression
+`[-a-zA-Z$._][-a-zA-Z$._0-9]*`, then this identifier must be quoted.
+
+The unnamed global values are represented using an unsigned numeric value with
+the '@' prefix, like in the following examples: ``@0``, ``@989``.
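+
+As a sketch of the quoting rule, a global whose name contains a space
+(``my global`` is a made-up name for this example) would be referenced as:
+
+.. code-block:: text
+
+    $rax = MOV64rm $rip, 1, _, @"my global", _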
+
+Target-dependent Index Operands
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A target index operand is a target-specific index and an offset. The
+target-specific index is printed using target-specific names and a positive or
+negative offset.
+
+For example, the ``amdgpu-constdata-start`` is associated with the index ``0``
+in the AMDGPU backend. A target index operand with the index 0 and the offset 8
+is therefore printed as:
+
+.. code-block:: text
+
+    $sgpr2 = S_ADD_U32 _, target-index(amdgpu-constdata-start) + 8, implicit-def _, implicit-def _
+
+Jump-table Index Operands
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A jump-table index operand with the index 0 is printed as follows:
+
+.. code-block:: text
+
+    tBR_JTr killed $r0, %jump-table.0
+
+A machine jump-table entry contains a list of ``MachineBasicBlocks``. When
+serializing all the function's jump-table entries, the following format is used:
+
+.. code-block:: text
+
+    jumpTable:
+      kind:             <kind>
+      entries:
+        - id:             <index>
+          blocks:         [ <bbreference>, <bbreference>, ... ]
+
+where ``<kind>`` describes how the jump table is represented and emitted (plain
+address, relocations, PIC, etc.), each ``<index>`` is a 32-bit unsigned integer,
+and ``blocks`` contains a list of
+:ref:`machine basic block references <block-references>`.
+
+Example:
+
+.. code-block:: text
+
+    jumpTable:
+      kind:             inline
+      entries:
+        - id:             0
+          blocks:         [ '%bb.3', '%bb.9', '%bb.4.d3' ]
+        - id:             1
+          blocks:         [ '%bb.7', '%bb.7', '%bb.4.d3', '%bb.5' ]
+
+External Symbol Operands
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+An external symbol operand is represented using an identifier with the ``&``
+prefix. The identifier is surrounded with double quotes and escaped if it
+contains any special non-printable characters.
+
+Example:
+
+.. code-block:: text
+
+    CALL64pcrel32 &__stack_chk_fail, csr_64, implicit $rsp, implicit-def $rsp
+
+MCSymbol Operands
+^^^^^^^^^^^^^^^^^
+
+An MCSymbol operand holds a pointer to an ``MCSymbol``. For the limitations
+of this operand in MIR, see :ref:`limitations <limitations>`.
+
+The syntax is:
+
+.. code-block:: text
+
+    EH_LABEL <mcsymbol Ltmp1>
+
+CFIIndex Operands
+^^^^^^^^^^^^^^^^^
+
+A CFI Index operand holds an index into a per-function side-table,
+``MachineFunction::getFrameInstructions()``, which references all the frame
+instructions in a ``MachineFunction``. A ``CFI_INSTRUCTION`` may look like it
+contains multiple operands, but the only operand it contains is the CFI Index.
+The other operands are tracked by the ``MCCFIInstruction`` object.
+
+The syntax is:
+
+.. code-block:: text
+
+    CFI_INSTRUCTION offset $w30, -16
+
+which may be emitted later in the MC layer as:
+
+.. code-block:: text
+
+    .cfi_offset w30, -16
+
+IntrinsicID Operands
+^^^^^^^^^^^^^^^^^^^^
+
+An Intrinsic ID operand contains a generic intrinsic ID or a target-specific ID.
+
+The syntax for the ``returnaddress`` intrinsic is:
+
+.. code-block:: text
+
+   $x0 = COPY intrinsic(@llvm.returnaddress)
+
+Predicate Operands
+^^^^^^^^^^^^^^^^^^
+
+A Predicate operand contains an IR predicate from ``CmpInst::Predicate``, like
+``ICMP_EQ``, etc.
+
+For an int eq predicate ``ICMP_EQ``, the syntax is:
+
+.. code-block:: text
+
+   %2:gpr(s32) = G_ICMP intpred(eq), %0, %1
+
+.. TODO: Describe the parser's default behaviour when optional YAML attributes
+   are missing.
+.. TODO: Describe the syntax for virtual register YAML definitions.
+.. TODO: Describe the machine function's YAML flag attributes.
+.. TODO: Describe the syntax for the register mask machine operands.
+.. TODO: Describe the frame information YAML mapping.
+.. TODO: Describe the syntax of the stack object machine operands and their
+   YAML definitions.
+.. TODO: Describe the syntax of the block address machine operands.
+.. TODO: Describe the syntax of the metadata machine operands, and the
+   instructions debug location attribute.
+.. TODO: Describe the syntax of the register live out machine operands.
+.. TODO: Describe the syntax of the machine memory operands.

Added: www-releases/trunk/9.0.0/docs/_sources/MarkdownQuickstartTemplate.md.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/MarkdownQuickstartTemplate.md.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/MarkdownQuickstartTemplate.md.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/MarkdownQuickstartTemplate.md.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,157 @@
+# Markdown Quickstart Template
+
+## Introduction and Quickstart
+
+This document is meant to get you writing documentation as fast as possible
+even if you have no previous experience with Markdown. The goal is to take
+someone in the state of "I want to write documentation and get it added to
+LLVM's docs" and turn that into useful documentation mailed to llvm-commits
+with as little nonsense as possible.
+
+You can find this document in `docs/MarkdownQuickstartTemplate.md`. You
+should copy it, open the new file in your text editor, write your docs, and
+then send the new document to llvm-commits for review.
+
+Focus on *content*. It is easy to fix the Markdown syntax
+later if necessary, although Markdown tries to imitate common
+plain-text conventions so it should be quite natural. A basic knowledge of
+Markdown syntax is useful when writing the document, so the last
+~half of this document (starting with [Example Section](#example-section)) gives examples
+which should cover 99% of use cases.
+
+Let me say that again: focus on *content*. But if you really need to verify
+Sphinx's output, see `docs/README.txt` for information.
+
+Once you have finished with the content, please send the `.md` file to
+llvm-commits for review.
+
+## Guidelines
+
+Try to answer the following questions in your first section:
+
+1. Why would I want to read this document?
+
+2. What should I know to be able to follow along with this document?
+
+3. What will I have learned by the end of this document?
+
+Common names for the first section are `Introduction`, `Overview`, or
+`Background`.
+
+If possible, make your document a "how to". Give it a name `HowTo*.md`
+like the other "how to" documents. This format is usually the easiest
+for another person to understand and also the most useful.
+
+You generally should not be writing documentation other than a "how to"
+unless there is already a "how to" about your topic. The reason for this
+is that without a "how to" document to read first, it is difficult for a
+person to understand a more advanced document.
+
+Focus on content (yes, I had to say it again).
+
+The rest of this document shows example Markdown markup constructs
+that are meant to be read by you in your text editor after you have copied
+this file into a new file for the documentation you are about to write.
+
+## Example Section
+
+Your text can be *emphasized*, **bold**, or `monospace`.
+
+Use blank lines to separate paragraphs.
+
+Headings (like `Example Section` just above) give your document its
+structure.
+
+### Example Subsection
+
+Make a link [like this](http://llvm.org/). There is also a more
+sophisticated syntax which [can be more readable] for longer links since
+it disrupts the flow less. You can put the `[link name]: <URL>` block
+pretty much anywhere later in the document.
+
+[can be more readable]: http://en.wikipedia.org/wiki/LLVM
+
+Lists can be made like this:
+
+1. A list starting with `[0-9].` will be automatically numbered.
+
+1. This is a second list element.
+
+   1. Use indentation to create nested lists.
+
+You can also use unordered lists.
+
+* Stuff.
+
+  + Deeper stuff.
+
+* More stuff.
+
+#### Example Subsubsection
+
+You can make blocks of code like this:
+
+```
+int main() {
+  return 0;
+}
+```
+
+As an extension to markdown, you can also specify a highlighter to use.
+
+``` C++
+int main() {
+  return 0;
+}
+```
+
+For a shell session, use a `console` code block.
+
+```console
+$ echo "Goodbye cruel world!"
+$ rm -rf /
+```
+
+If you need to show LLVM IR use the `llvm` code block.
+
+``` llvm
+define i32 @test1() {
+entry:
+  ret i32 0
+}
+```
+
+Some other common code blocks you might need are `c`, `objc`, `make`,
+and `cmake`. If you need something beyond that, you can look at the [full
+list] of supported code blocks.
+
+[full list]: http://pygments.org/docs/lexers/
+
+However, don't waste time fiddling with syntax highlighting when you could
+be adding meaningful content. When in doubt, show preformatted text
+without any syntax highlighting like this:
+
+                          .
+                           +:.
+                       ..:: ::
+                    .++:+:: ::+:.:.
+                   .:+           :
+            ::.::..::            .+.
+          ..:+    ::              :
+    ......+:.                    ..
+          :++.    ..              :
+            .+:::+::              :
+            ..   . .+            ::
+                     +.:      .::+.
+                      ...+. .: .
+                         .++:..
+                          ...
+
+##### Hopefully you won't need to be this deep
+
+If you need to do fancier things than what has been shown in this document,
+you can mail the list or check the [Common Mark spec].  Sphinx specific
+integration documentation can be found in the [recommonmark docs].
+
+[Common Mark spec]: http://spec.commonmark.org/0.28/
+[recommonmark docs]: http://recommonmark.readthedocs.io/en/latest/index.html

Added: www-releases/trunk/9.0.0/docs/_sources/MarkedUpDisassembly.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/MarkedUpDisassembly.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/MarkedUpDisassembly.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/MarkedUpDisassembly.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,86 @@
+=======================================
+LLVM's Optional Rich Disassembly Output
+=======================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+LLVM's default disassembly output is raw text. To allow consumers more ability
+to introspect the instructions' textual representation or to reformat for a more
+user friendly display there is an optional rich disassembly output.
+
+This optional output is sufficient to reference into individual portions of the
+instruction text. This is intended for clients like disassemblers, list file
+generators, and pretty-printers, which need more than the raw instructions and
+the ability to print them.
+
+To provide this functionality the assembly text is marked up with annotations.
+The markup is simple enough in syntax to be robust even in the case of version
+mismatches between consumers and producers. That is, the syntax generally does
+not carry semantics beyond "this text has an annotation," so consumers can
+simply ignore annotations they do not understand or do not care about.
+
+After calling ``LLVMCreateDisasm()`` to create a disassembler context, the
+optional output is enabled with this call:
+
+.. code-block:: c
+
+    LLVMSetDisasmOptions(DC, LLVMDisassembler_Option_UseMarkup);
+
+Then subsequent calls to ``LLVMDisasmInstruction()`` will return output strings
+with the marked up annotations.
+
+Instruction Annotations
+=======================
+
+.. _contextual markups:
+
+Contextual markups
+------------------
+
+Annotated assembly display will supply contextual markup to help clients more
+efficiently implement things like pretty printers. Most markup will be target
+independent, so clients can effectively provide good display without any target
+specific knowledge.
+
+Annotated assembly goes through the normal instruction printer, but optionally
+includes contextual tags on portions of the instruction string. An annotation
+is any '<' '>' delimited section of text(1).
+
+.. code-block:: bat
+
+    annotation: '<' tag-name tag-modifier-list ':' annotated-text '>'
+    tag-name: identifier
+    tag-modifier-list: comma delimited identifier list
+
+The tag-name is an identifier which gives the type of the annotation. For the
+first pass, this will be very simple, with memory references, registers, and
+immediates having the tag names "mem", "reg", and "imm", respectively.
+
+The tag-modifier-list is typically additional target-specific context, such as
+register class.
+
+Clients should accept and ignore any tag-names or tag-modifiers they do not
+understand, allowing the annotations to grow in richness without breaking older
+clients.
+
+For example, an ARM load of a stack-relative location
+might be annotated as:
+
+.. code-block:: text
+
+   ldr <reg gpr:r0>, <mem regoffset:[<reg gpr:sp>, <imm:#4>]>
+
+
+1: For assembly dialects in which '<' and/or '>' are legal tokens, a literal token is escaped by following immediately with a repeat of the character.  For example, a literal '<' character is output as '<<' in an annotated assembly string.
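+
+As an illustration of how a client might consume this syntax, the sketch below
+strips every annotation from a marked-up string and returns the plain
+instruction text. ``stripMarkup`` is a hypothetical helper, not part of LLVM's
+API. It assumes the escaping rule from the note above (a doubled ``<`` or ``>``
+is always a literal character) and it ignores all tag names and modifiers, as
+clients are asked to do for tags they do not understand.
+
+.. code-block:: c++
+
+    #include <cstddef>
+    #include <string>
+
+    // Hypothetical helper: returns Text with all annotation markup removed.
+    std::string stripMarkup(const std::string &Text) {
+      std::string Out;
+      for (std::size_t I = 0; I < Text.size(); ++I) {
+        char C = Text[I];
+        bool Doubled = I + 1 < Text.size() && Text[I + 1] == C;
+        if ((C == '<' || C == '>') && Doubled) {
+          Out += C; // Escaped literal '<' or '>': keep one copy.
+          ++I;
+        } else if (C == '<') {
+          // Start of an annotation: skip the tag name and modifier list,
+          // which end at the first ':'.
+          std::size_t Colon = Text.find(':', I);
+          if (Colon == std::string::npos)
+            break; // Malformed input: stop and return what we have.
+          I = Colon;
+        } else if (C != '>') {
+          Out += C; // Ordinary text; a lone '>' closes an annotation.
+        }
+      }
+      return Out;
+    }
+
+Applied to the ARM example above, this sketch yields ``ldr r0, [sp, #4]``.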
+
+C API Details
+-------------
+
+The intended consumers of this information use the C API, so the disassembler's
+C API provides an option to produce disassembled instructions with annotations:
+``LLVMSetDisasmOptions()`` with the ``LLVMDisassembler_Option_UseMarkup``
+option (see above).

Added: www-releases/trunk/9.0.0/docs/_sources/MeetupGuidelines.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/MeetupGuidelines.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/MeetupGuidelines.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/MeetupGuidelines.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,82 @@
+=====================================
+How to start LLVM Social in your town
+=====================================
+
+Here are several ideas you can take into account when designing your specific
+LLVM Social.
+
+Before you start, it is essential to make sure that the meetup is as welcoming
+as any other event related to LLVM. Therefore you shall follow LLVM's
+`Code of Conduct <https://llvm.org/docs/CodeOfConduct.html>`_.
+
+Other than that - your mileage may vary. Please adapt your social to what works
+best for your specific situation.
+
+General suggestions
+-------------------
+
+* We highly recommend that you join the official LLVM meetup organization. In
+  addition to covering the cost of the meetup, all LLVM meetups are advertised
+  together and easily found by potential attendees. Please contact
+  arnaud.degrandmaison at llvm.org for more details.
+* Beware of cultural differences: what works well in one region may not work in
+  another part of the world.
+* Do not organize the meetup alone. Try to work with a couple of other
+  organizers. This is more motivating for an organizer, and it makes the
+  meetup more resilient over time.
+* Each event can have a different form such as a social event, or
+  a hackathon/workshop, or a 'mini-conference' with one or more talks. You do
+  not have to stick to one format forever.
+* Whatever format you choose, `LLVM Weekly <http://llvmweekly.org/>`_ is an
+  excellent topic starter: go through the 3-4 recent LLVM Weekly posts and
+  prepare a list of the most interesting/notable news and discuss them with the
+  group.
+
+Advertisement
+-------------
+
+* Try to advertise via similar meetups/user groups
+* Advertise your meetup on the mailing lists (llvm-dev, cfe-dev, lldb-dev,
+  ...). Feel free to post to all of them, or at least to llvm-dev.
+  But as these mailing lists have high traffic and some LLVM developers are not
+  very active on them, you may reach more interested people using the mailing
+  feature from meetup.com.
+* Advertise the meetup on Twitter and mention
+  `@llvmweekly <http://twitter.com/llvmweekly>`_ and
+  `@llvmorg <http://twitter.com/llvmorg>`_.
+* Announce the next meetup in advance, and send a reminder a week or so later.
+
+Tech talks
+----------
+
+* It's a great idea to have several talks scheduled for several upcoming
+  meetups to get the ball rolling.
+* Keep looking for speakers far in advance; ideally, you should have 2-3
+  speakers ready in the pipeline.
+* Try to record the talks if possible. It adds visibility to the meetup and is
+  just a good idea in general. Any modern smartphone or tablet should work, but
+  you can also get a camera. An external microphone is recommended for better
+  sound.
+
+Where to host the meetup?
+-------------------------
+
+* Look around for bars/cafés with projectors.
+* Talk to tech companies in the area.
+* Some co-working spaces provide their facilities for non-profit (i.e., you do
+  not charge attendees any fees) meetups.
+* Ask nearby universities or university departments.
+
+How to pick the date?
+---------------------
+
+* Make sure you do not clash with similar meetups in the city (e.g.,
+  C++ user groups).
+* Prefer not to have a meetup in the same week as other similar meetups
+  (e.g., it's not a good idea to have an LLVM meetup on Thursday after a
+  C++ meetup on Wednesday).
+* Meetups on weekends may attract people who live far away from the city,
+  but the people who live in the city may not attend.
+* Make a poll, but beware that not every responder will join (we had ~20 votes
+  on the poll, while only ~8 people attended).
+

Added: www-releases/trunk/9.0.0/docs/_sources/MemorySSA.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/MemorySSA.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/MemorySSA.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/MemorySSA.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,364 @@
+=========
+MemorySSA
+=========
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+``MemorySSA`` is an analysis that allows us to cheaply reason about the
+interactions between various memory operations. Its goal is to replace
+``MemoryDependenceAnalysis`` for most (if not all) use-cases. This is because,
+unless you're very careful, use of ``MemoryDependenceAnalysis`` can easily
+result in quadratic-time algorithms in LLVM. Additionally, ``MemorySSA`` doesn't
+have as many arbitrary limits as ``MemoryDependenceAnalysis``, so you should get
+better results, too.
+
+At a high level, one of the goals of ``MemorySSA`` is to provide an SSA based
+form for memory, complete with def-use and use-def chains, which
+enables users to quickly find the may-defs and may-uses of memory operations.
+It can also be thought of as a way to cheaply give versions to the complete
+state of heap memory, and associate memory operations with those versions.
+
+This document goes over how ``MemorySSA`` is structured, and some basic
+intuition on how ``MemorySSA`` works.
+
+A paper on MemorySSA (with notes about how it's implemented in GCC) `can be
+found here <http://www.airs.com/dnovillo/Papers/mem-ssa.pdf>`_. Though, it's
+relatively out-of-date; the paper references multiple heap partitions, but GCC
+eventually swapped to just using one, like we now have in LLVM.  Like
+GCC's, LLVM's MemorySSA is intraprocedural.
+
+
+MemorySSA Structure
+===================
+
+MemorySSA is a virtual IR. After it's built, ``MemorySSA`` will contain a
+structure that maps ``Instruction``\ s to ``MemoryAccess``\ es, which are
+``MemorySSA``'s parallel to LLVM ``Instruction``\ s.
+
+Each ``MemoryAccess`` can be one of three types:
+
+- ``MemoryPhi``
+- ``MemoryUse``
+- ``MemoryDef``
+
+``MemoryPhi``\ s are ``PhiNode``\ s, but for memory operations. If at any
+point we have two (or more) ``MemoryDef``\ s that could flow into a
+``BasicBlock``, the block's top ``MemoryAccess`` will be a
+``MemoryPhi``. As in LLVM IR, ``MemoryPhi``\ s don't correspond to any
+concrete operation. As such, ``BasicBlock``\ s are mapped to ``MemoryPhi``\ s
+inside ``MemorySSA``, whereas ``Instruction``\ s are mapped to ``MemoryUse``\ s
+and ``MemoryDef``\ s.
+
+Note also that in SSA, Phi nodes merge must-reach definitions (that is,
+definitions that *must* be new versions of variables). In MemorySSA, PHI nodes
+merge may-reach definitions (that is, until disambiguated, the versions that
+reach a phi node may or may not clobber a given variable).
+
+``MemoryUse``\ s are operations which use but don't modify memory. An example of
+a ``MemoryUse`` is a ``load``, or a ``readonly`` function call.
+
+``MemoryDef``\ s are operations which may either modify memory, or which
+introduce some kind of ordering constraints. Examples of ``MemoryDef``\ s
+include ``store``\ s, function calls, ``load``\ s with ``acquire`` (or higher)
+ordering, volatile operations, memory fences, etc.
+
+Every function that exists has a special ``MemoryDef`` called ``liveOnEntry``.
+It dominates every ``MemoryAccess`` in the function that ``MemorySSA`` is being
+run on, and implies that we've hit the top of the function. It's the only
+``MemoryDef`` that maps to no ``Instruction`` in LLVM IR. Use of
+``liveOnEntry`` implies that the memory being used is either undefined or
+defined before the function begins.
+
+An example of all of this overlaid on LLVM IR (obtained by running ``opt
+-passes='print<memoryssa>' -disable-output`` on an ``.ll`` file) is below. When
+viewing this example, it may be helpful to view it in terms of clobbers. The
+operands of a given ``MemoryAccess`` are all (potential) clobbers of said
+MemoryAccess, and the value produced by a ``MemoryAccess`` can act as a clobber
+for other ``MemoryAccess``\ es. Another useful way of looking at it is in
+terms of heap versions.  In that view, operands of a given
+``MemoryAccess`` are the version of the heap before the operation, and
+if the access produces a value, the value is the new version of the heap
+after the operation.
+
+.. code-block:: llvm
+
+  define void @foo() {
+  entry:
+    %p1 = alloca i8
+    %p2 = alloca i8
+    %p3 = alloca i8
+    ; 1 = MemoryDef(liveOnEntry)
+    store i8 0, i8* %p3
+    br label %while.cond
+
+  while.cond:
+    ; 6 = MemoryPhi({entry,1},{if.end,4})
+    br i1 undef, label %if.then, label %if.else
+
+  if.then:
+    ; 2 = MemoryDef(6)
+    store i8 0, i8* %p1
+    br label %if.end
+
+  if.else:
+    ; 3 = MemoryDef(6)
+    store i8 1, i8* %p2
+    br label %if.end
+
+  if.end:
+    ; 5 = MemoryPhi({if.then,2},{if.else,3})
+    ; MemoryUse(5)
+    %1 = load i8, i8* %p1
+    ; 4 = MemoryDef(5)
+    store i8 2, i8* %p2
+    ; MemoryUse(1)
+    %2 = load i8, i8* %p3
+    br label %while.cond
+  }
+
+The ``MemorySSA`` IR is shown in comments that precede the instructions they map
+to (if such an instruction exists). For example, ``1 = MemoryDef(liveOnEntry)``
+is a ``MemoryAccess`` (specifically, a ``MemoryDef``), and it describes the LLVM
+instruction ``store i8 0, i8* %p3``. Other places in ``MemorySSA`` refer to this
+particular ``MemoryDef`` as ``1`` (much like how one can refer to ``load i8, i8*
+%p1`` in LLVM with ``%1``). Again, ``MemoryPhi``\ s don't correspond to any LLVM
+Instruction, so the line directly below a ``MemoryPhi`` isn't special.
+
+Going from the top down:
+
+- ``6 = MemoryPhi({entry,1},{if.end,4})`` notes that, when entering
+  ``while.cond``, the reaching definition for it is either ``1`` or ``4``. This
+  ``MemoryPhi`` is referred to in the textual IR by the number ``6``.
+- ``2 = MemoryDef(6)`` notes that ``store i8 0, i8* %p1`` is a definition,
+  and its reaching definition before it is ``6``, or the ``MemoryPhi`` after
+  ``while.cond``. (See the `Build-time use optimization`_ and `Precision`_
+  sections below for why this ``MemoryDef`` isn't linked to a separate,
+  disambiguated ``MemoryPhi``.)
+- ``3 = MemoryDef(6)`` notes that ``store i8 1, i8* %p2`` is a definition; its
+  reaching definition is also ``6``.
+- ``5 = MemoryPhi({if.then,2},{if.else,3})`` notes that the clobber before
+  this block could either be ``2`` or ``3``.
+- ``MemoryUse(5)`` notes that ``load i8, i8* %p1`` is a use of memory, and that
+  it's clobbered by ``5``.
+- ``4 = MemoryDef(5)`` notes that ``store i8 2, i8* %p2`` is a definition; its
+  reaching definition is ``5``.
+- ``MemoryUse(1)`` notes that ``load i8, i8* %p3`` is just a user of memory,
+  and the last thing that could clobber this use is above ``while.cond`` (i.e.
+  the store to ``%p3``). In heap versioning parlance, it really only depends on
+  the heap version 1, and is unaffected by the new heap versions generated since
+  then.
+
+As an aside, ``MemoryAccess`` is a ``Value`` mostly for convenience; it's not
+meant to interact with LLVM IR.
+
+Design of MemorySSA
+===================
+
+``MemorySSA`` is an analysis that can be built for any arbitrary function. When
+it's built, it does a pass over the function's IR in order to build up its
+mapping of ``MemoryAccess``\ es. You can then query ``MemorySSA`` for things
+like the dominance relation between ``MemoryAccess``\ es, and get the
+``MemoryAccess`` for any given ``Instruction``.
+
+When ``MemorySSA`` is done building, it also hands you a ``MemorySSAWalker``
+that you can use (see below).
+
+
+The walker
+----------
+
+A structure that helps ``MemorySSA`` do its job is the ``MemorySSAWalker``, or
+the walker, for short. The goal of the walker is to provide answers to clobber
+queries beyond what's represented directly by ``MemoryAccess``\ es. For example,
+given:
+
+.. code-block:: llvm
+
+  define void @foo() {
+    %a = alloca i8
+    %b = alloca i8
+
+    ; 1 = MemoryDef(liveOnEntry)
+    store i8 0, i8* %a
+    ; 2 = MemoryDef(1)
+    store i8 0, i8* %b
+  }
+
+The store to ``%a`` is clearly not a clobber for the store to ``%b``. It would
+be the walker's goal to figure this out, and return ``liveOnEntry`` when queried
+for the clobber of ``MemoryAccess`` ``2``.
+
+By default, ``MemorySSA`` provides a walker that can optimize ``MemoryDef``\ s
+and ``MemoryUse``\ s by consulting whatever alias analysis stack you happen to
+be using. Walkers were built to be flexible, though, so it's entirely reasonable
+(and expected) to create more specialized walkers (e.g. one that specifically
+queries ``GlobalsAA``, one that always stops at ``MemoryPhi`` nodes, etc).
+
+
+Locating clobbers yourself
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you choose to make your own walker, you can find the clobber for a
+``MemoryAccess`` by walking every ``MemoryDef`` that dominates said
+``MemoryAccess``. The structure of ``MemoryDef``\ s makes this relatively simple;
+they ultimately form a linked list of every clobber that dominates the
+``MemoryAccess`` that you're trying to optimize. In other words, the
+``definingAccess`` of a ``MemoryDef`` is always the nearest dominating
+``MemoryDef`` or ``MemoryPhi`` of said ``MemoryDef``.
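+
+The chain walk described above can be sketched with a toy model. Everything in
+this block is hypothetical: ``ToyDef`` is not LLVM's ``MemoryAccess`` class, an
+exact string match stands in for an alias-analysis query, and ``MemoryPhi``
+nodes are not modeled at all.
+
+.. code-block:: c++
+
+  #include <string>
+
+  // Toy stand-in for a MemoryDef (an assumption for illustration only).
+  struct ToyDef {
+    int ID;                 // 0 plays the role of liveOnEntry.
+    std::string Location;   // The single location this def writes.
+    const ToyDef *Defining; // Nearest dominating def; null for liveOnEntry.
+  };
+
+  // Walks the defining-access chain upward from Start and returns the first
+  // def that writes Location, or the liveOnEntry def if none does.
+  const ToyDef *findClobber(const ToyDef *Start, const std::string &Location) {
+    const ToyDef *Cur = Start;
+    while (Cur->Defining != nullptr && Cur->Location != Location)
+      Cur = Cur->Defining;
+    return Cur;
+  }
+
+With a chain that mirrors the two-store example from the walker section (def
+``1`` writes ``a``, def ``2`` writes ``b``), starting the walk at the defining
+access of def ``2`` and asking about ``b`` ends at the toy ``liveOnEntry``.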
+
+
+Build-time use optimization
+---------------------------
+
+``MemorySSA`` will optimize some ``MemoryAccess``\ es at build-time.
+Specifically, we optimize the operand of every ``MemoryUse`` to point to the
+actual clobber of said ``MemoryUse``. This can be seen in the above example; the
+second ``MemoryUse`` in ``if.end`` has an operand of ``1``, which is a
+``MemoryDef`` from the entry block.  This is done to make walking,
+value numbering, etc, faster and easier.
+
+It is not possible to optimize ``MemoryDef`` in the same way, as we
+restrict ``MemorySSA`` to one heap variable and, thus, one Phi node
+per block.
+
+
+Invalidation and updating
+-------------------------
+
+Because ``MemorySSA`` keeps track of LLVM IR, it needs to be updated whenever
+the IR is updated. "Update", in this case, includes the addition, deletion, and
+motion of ``Instruction``\ s. The update API is being made on an as-needed basis.
+If you'd like examples, ``GVNHoist`` is a user of ``MemorySSA``'s update API.
+
+
+Phi placement
+^^^^^^^^^^^^^
+
+``MemorySSA`` only places ``MemoryPhi``\ s where they're actually
+needed. That is, it is a pruned SSA form, like LLVM's SSA form.  For
+example, consider:
+
+.. code-block:: llvm
+
+  define void @foo() {
+  entry:
+    %p1 = alloca i8
+    %p2 = alloca i8
+    %p3 = alloca i8
+    ; 1 = MemoryDef(liveOnEntry)
+    store i8 0, i8* %p3
+    br label %while.cond
+
+  while.cond:
+    ; 3 = MemoryPhi({entry,1},{if.end,2})
+    br i1 undef, label %if.then, label %if.else
+
+  if.then:
+    br label %if.end
+
+  if.else:
+    br label %if.end
+
+  if.end:
+    ; MemoryUse(1)
+    %1 = load i8, i8* %p1
+    ; 2 = MemoryDef(3)
+    store i8 2, i8* %p2
+    ; MemoryUse(1)
+    %2 = load i8, i8* %p3
+    br label %while.cond
+  }
+
+Because we removed the stores from ``if.then`` and ``if.else``, a ``MemoryPhi``
+for ``if.end`` would be pointless, so we don't place one. So, if you need to
+place a ``MemoryDef`` in ``if.then`` or ``if.else``, you'll need to also create
+a ``MemoryPhi`` for ``if.end``.
+
+If it turns out that this is a large burden, we can just place ``MemoryPhi``\ s
+everywhere. Because we have Walkers that are capable of optimizing above said
+phis, doing so shouldn't prohibit optimizations.
+
+
+Non-Goals
+---------
+
+``MemorySSA`` is meant to reason about the relation between memory
+operations, and enable quicker querying.
+It isn't meant to be the single source of truth for all potential memory-related
+optimizations. Specifically, care must be taken when trying to use ``MemorySSA``
+to reason about atomic or volatile operations, as in:
+
+.. code-block:: llvm
+
+  define i8 @foo(i8* %a) {
+  entry:
+    br i1 undef, label %if.then, label %if.end
+
+  if.then:
+    ; 1 = MemoryDef(liveOnEntry)
+    %0 = load volatile i8, i8* %a
+    br label %if.end
+
+  if.end:
+    %av = phi i8 [0, %entry], [%0, %if.then]
+    ret i8 %av
+  }
+
+Going solely by ``MemorySSA``'s analysis, hoisting the ``load`` to ``entry`` may
+seem legal. Because it's a volatile load, though, it's not.
+
+
+Design tradeoffs
+----------------
+
+Precision
+^^^^^^^^^
+
+``MemorySSA`` in LLVM deliberately trades off precision for speed.
+Let us think about memory variables as if they were disjoint partitions of the
+heap (that is, if you have one variable, as above, it represents the entire
+heap, and if you have multiple variables, each one represents some
+disjoint portion of the heap).
+
+First, because alias analysis results conflict with each other, and
+each result may be what an analysis wants (e.g.
+TBAA may say no-alias, and something else may say must-alias), it is
+not possible to partition the heap the way every optimization wants.
+Second, some alias analysis results are not transitive (i.e. A noalias B,
+and B noalias C, does not mean A noalias C), so it is not possible to
+come up with a precise partitioning in all cases without variables to
+represent every pair of possible aliases.  Thus, partitioning
+precisely may require introducing at least N^2 new virtual variables,
+phi nodes, etc.
+
+Each of these variables may be clobbered at multiple def sites.
+
+To give an example, if you were to split up struct fields into
+individual variables, all aliasing operations that may-def multiple struct
+fields will may-def more than one of them.  This is pretty common (calls,
+copies, field stores, etc).
+
+Experience with SSA forms for memory in other compilers has shown that
+it is simply not possible to do this precisely, and in fact, doing it
+precisely is not worth it, because now all the optimizations have to
+walk tons and tons of virtual variables and phi nodes.
+
+So we partition.  At the point at which you partition, again,
+experience has shown us there is no point in partitioning to more than
+one variable.  It simply generates more IR, and optimizations still
+have to query something to disambiguate further anyway.
+
+As a result, LLVM partitions to one variable.
+
+Use Optimization
+^^^^^^^^^^^^^^^^
+
+Unlike other partitioned forms, LLVM's ``MemorySSA`` does make one
+useful guarantee: all loads are optimized to point at the thing that
+actually clobbers them. This gives some nice properties.  For example,
+for a given store, you can find all loads actually clobbered by that
+store by walking the immediate uses of the store.

Added: www-releases/trunk/9.0.0/docs/_sources/MergeFunctions.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/MergeFunctions.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/MergeFunctions.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/MergeFunctions.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,785 @@
+=================================
+MergeFunctions pass, how it works
+=================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+Sometimes code contains equal functions, or functions that do exactly the same
+thing even though they are not equal at the IR level (e.g., multiplication by 2
+and 'shl 1'). This can happen for several reasons, mainly the use of
+templates and automatic code generators. Sometimes, though, users themselves
+write the same thing twice :-)
+
+The main purpose of this pass is to recognize such functions and merge them.
+
+This document extends the comments in the pass source and describes the pass
+logic. It describes the algorithm used to compare functions and
+explains how equal functions can be combined correctly to keep the module
+valid.
+
+The material is presented top-down, so the reader can start with the
+high-level ideas and end with low-level algorithm details, which prepares
+him or her for reading the sources.
+
+The main goal is to describe the algorithm, its logic, and the underlying
+concept. If you *don't want* to read the source code, but want to understand
+the pass algorithms, this document is good for you. The author tries not to
+repeat the source code and covers only common cases, so that this document
+does not need updating after every minor code change.
+
+
+What should I know to be able to follow along with this document?
+-----------------------------------------------------------------
+
+The reader should be familiar with common compiler-engineering principles and
+LLVM code fundamentals. In this article, we assume the reader is familiar with
+the `Static Single Assignment
+<http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+concept and has an understanding of
+`IR structure <http://llvm.org/docs/LangRef.html#high-level-structure>`_.
+
+We will use terms such as
+"`module <http://llvm.org/docs/LangRef.html#high-level-structure>`_",
+"`function <http://llvm.org/docs/ProgrammersManual.html#the-function-class>`_",
+"`basic block <http://en.wikipedia.org/wiki/Basic_block>`_",
+"`user <http://llvm.org/docs/ProgrammersManual.html#the-user-class>`_",
+"`value <http://llvm.org/docs/ProgrammersManual.html#the-value-class>`_",
+"`instruction
+<http://llvm.org/docs/ProgrammersManual.html#the-instruction-class>`_".
+
+As a good starting point, the Kaleidoscope tutorial can be used:
+
+:doc:`tutorial/index`
+
+It's especially important to understand chapter 3 of the tutorial:
+
+:doc:`tutorial/LangImpl03`
+
+The reader should also know how passes work in LLVM. This article can serve
+as a reference and a starting point:
+
+:doc:`WritingAnLLVMPass`
+
+What else? Well, perhaps the reader should also have some experience in LLVM
+pass debugging and bug-fixing.
+
+Narrative structure
+-------------------
+The article consists of three parts. The first part explains pass functionality
+on the top-level. The second part describes the comparison procedure itself.
+The third part describes the merging process.
+
+In every part, the author tries to present the contents top-down.
+The top-level methods are described first, followed by the terminal ones at
+the end of each part. If the reader sees a reference to a method that
+hasn't been described yet, they will find its description a bit below.
+
+Basics
+======
+
+How to do it?
+-------------
+Do we need to merge functions? The obvious answer is: yes, duplicates are quite
+common. We usually *do* have duplicates and it would be good to get rid
+of them. But how do we detect duplicates? This is the idea: we split functions
+into smaller bricks or parts and compare the number of "bricks". If the counts
+are equal, we compare the "bricks" themselves, and then draw our conclusions
+about the functions themselves.
+
+What could the difference be? For example, on a machine with 64-bit pointers
+(let's assume we have only one address space), one function stores a 64-bit
+integer, while another one stores a pointer. If the target is the machine
+mentioned above, and if functions are identical, except the parameter type (we
+could consider it as a part of function type), then we can treat a ``uint64_t``
+and a ``void*`` as equal.
+
+This is just an example; more possible details are described a bit below.
+
+As another example, the reader may imagine two more functions. The first
+function performs a multiplication by 2, while the second one performs a
+logical left shift by 1.
+
+Possible solutions
+^^^^^^^^^^^^^^^^^^
+Let's briefly consider the possible options for how and what we have to
+implement in order to create full-featured function merging, and also what it
+would mean for us.
+
+Equal function detection obviously requires a "detector" method to be
+implemented, and this method should answer the question "are the functions
+equal?". The "detector" method consists of tiny "sub-detectors", each of which
+answers exactly the same question, but for function parts.
+
+As the second step, we should merge equal functions, so there should be a
+"merger" method. The "merger" accepts two functions *F1* and *F2*, and produces
+the *F1F2* function, the result of merging.
+
+Having such routines in our hands, we can process a whole module, and merge all
+equal functions.
+
+In this case, we have to compare every function with every other function. As
+the reader may notice, this approach seems to be quite expensive. Of course we
+could introduce hashing and other helpers, but that is still just an
+optimization, and the complexity remains at the level of O(N*N).
+
+Can we reach another level? Could we introduce logarithmical search, or random
+access lookup? The answer is: "yes".
+
+Random-access
+"""""""""""""
+How could this be done? Just convert each function to a number, and gather
+all of them in a special hash-table. Functions with equal hashes are equal.
+Good hashing means that every function part must be taken into account. That
+means we have to convert every function part into some number, and then add it
+into the hash. The lookup time would be small, but such an approach adds some
+delay due to the hashing routine.
+
+Logarithmical search
+""""""""""""""""""""
+We could introduce a total ordering on the set of functions; once they are
+ordered, we could implement a logarithmical search. Lookup time still depends
+on N, but only adds a small delay (*log(N)*).
+
+Present state
+"""""""""""""
+Both of the approaches (random-access and logarithmical) have been implemented
+and tested and both give a very good improvement. What was most
+surprising is that logarithmical search was faster; sometimes by up to 15%. The
+hashing method needs some extra CPU time, which is the main reason why it works
+slower; in most cases, total "hashing" time is greater than total
+"logarithmical-search" time.
+
+So, preference has been granted to the "logarithmical search".
+
+If needed, though, *logarithmical-search* (read "total-ordering") could
+be used as a milestone on our way to the *random-access* implementation.
+
+Every comparison is based either on number comparison or on flag comparison. In
+the *random-access* approach, we could use the same comparison algorithm.
+During comparison, we exit once we find a difference, but here we might have
+to scan the whole function body every time (note, it could be slower). Like in
+"total-ordering", we will track every number and flag, but instead of
+comparing them, we should collect the sequence of numbers and then create the
+hash number. So, once again, *total-ordering* could be considered a milestone
+for an even faster (in theory) random-access approach.
+
+MergeFunctions, main fields and runOnModule
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+There are two main fields in the class:
+
+``FnTree`` - the set of all unique functions. It keeps items that couldn't be
+merged with each other. It is defined as:
+
+``std::set<FunctionNode> FnTree;``
+
+Here ``FunctionNode`` is a wrapper for the ``llvm::Function`` class, with an
+implemented "<" operator on the set of functions (below we explain how it works
+exactly; this is a key point in fast function comparison).
+
+``Deferred`` - the merging process can affect bodies of functions that are in
+``FnTree`` already. Obviously, such functions should be rechecked. In this
+case, we remove them from ``FnTree``, and mark them to be rescanned, namely
+put them into the ``Deferred`` list.
+
+runOnModule
+"""""""""""
+The algorithm is pretty simple:
+
+1. Put all module's functions into the *worklist*.
+
+2. Scan the *worklist*'s functions twice: first enumerate only strong functions
+   and then only weak ones:
+
+   2.1. Loop body: take a function from the *worklist* (call it *FCur*) and try
+   to insert it into *FnTree*: check whether *FCur* is equal to one of the
+   functions in *FnTree*. If there *is* an equal function in *FnTree*
+   (call it *FExists*): merge function *FCur* with *FExists*. Otherwise add
+   the function from the *worklist* to *FnTree*.
+
+3. Once the *worklist* scanning and merging operations are complete, check the
+   *Deferred* list. If it is not empty, refill the *worklist* contents with the
+   *Deferred* list and redo step 2. If the *Deferred* list is empty, exit
+   from the method.
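+
+The steps above can be sketched with a toy model. This is only an illustration
+under stated assumptions, not the code from the pass: ``ToyFunction``,
+``BodyLess`` and ``mergeEqual`` are hypothetical names, two functions count as
+equal when their body strings match, and "merging" just records which function
+a duplicate was folded into. Because this toy merging never changes a function
+body, the *Deferred* list of step 3 stays empty and is left out.
+
+.. code-block:: c++
+
+  #include <initializer_list>
+  #include <map>
+  #include <set>
+  #include <string>
+  #include <vector>
+
+  // Toy stand-in for a function (an assumption for illustration only).
+  struct ToyFunction {
+    std::string Name;
+    std::string Body;
+    bool IsWeak;
+  };
+
+  // Orders functions by body only; plays the role of FunctionNode's "<".
+  struct BodyLess {
+    bool operator()(const ToyFunction &L, const ToyFunction &R) const {
+      return L.Body < R.Body;
+    }
+  };
+
+  // Returns a map from each duplicate's name to the function it merged into.
+  std::map<std::string, std::string>
+  mergeEqual(const std::vector<ToyFunction> &Worklist) {
+    std::set<ToyFunction, BodyLess> FnTree;
+    std::map<std::string, std::string> MergedInto;
+    // Step 2: scan twice, strong functions first and weak ones second.
+    for (bool Weak : {false, true}) {
+      for (const ToyFunction &FCur : Worklist) {
+        if (FCur.IsWeak != Weak)
+          continue;
+        // Step 2.1: try to insert FCur; a failed insert means that an equal
+        // function (FExists) is already in the tree.
+        auto Result = FnTree.insert(FCur);
+        if (!Result.second)
+          MergedInto[FCur.Name] = Result.first->Name;
+      }
+    }
+    return MergedInto;
+  }
+
+Each insertion attempt is a single ``std::set`` lookup, which is where the
+logarithmic search time discussed earlier comes from.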
+
+Comparison and logarithmical search
+"""""""""""""""""""""""""""""""""""
+Let's recall our task: for every function *F* from module *M*, we have to find
+equal functions *F'* in the shortest time possible, and merge them into a
+single function.
+
+Defining total ordering among the functions set allows us to organize
+functions into a binary tree. The lookup procedure complexity would be
+estimated as O(log(N)) in this case. But how do we define *total-ordering*?
+
+We have to introduce a single rule applicable to every pair of functions, and
+then, following this rule, evaluate which of them is greater. What kind of rule
+could it be? Let's declare it as the "compare" method that returns one of 3
+possible values:
+
+-1, left is *less* than right,
+
+0, left and right are *equal*,
+
+1, left is *greater* than right.
+
+Of course, this means that we have to maintain
+*strict and non-strict order relation properties*:
+
+* reflexivity (``a <= a``, ``a == a``, ``a >= a``),
+* antisymmetry (if ``a <= b`` and ``b <= a`` then ``a == b``),
+* transitivity (``a <= b`` and ``b <= c``, then ``a <= c``),
+* asymmetry (if ``a < b``, then neither ``a > b`` nor ``a == b``).
+
+As mentioned before, the comparison routine consists of
+"sub-comparison-routines", with each of them also consisting of
+"sub-comparison-routines", and so on. Finally, it ends up with primitive
+comparison.
+
+Below, we will use the following operations:
+
+#. ``cmpNumbers(number1, number2)`` is a method that returns -1 if left is less
+   than right; 0, if left and right are equal; and 1 otherwise.
+
+#. ``cmpFlags(flag1, flag2)`` is a hypothetical method that compares two flags.
+   The logic is the same as in ``cmpNumbers``, where ``true`` is 1, and
+   ``false`` is 0.
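+
+As a rough illustration, the two helpers above could be sketched as follows.
+The signatures are assumptions made for this sketch (``cmpFlags`` in
+particular is hypothetical, as noted above); see *MergeFunctions.cpp* for the
+real code.
+
```cpp
#include <cstdint>

// Three-way comparison of two numbers: returns -1, 0 or 1.
int cmpNumbers(std::uint64_t L, std::uint64_t R) {
  if (L < R) return -1;
  if (L > R) return 1;
  return 0;
}

// Hypothetical flag comparison: false is treated as 0 and true as 1.
int cmpFlags(bool L, bool R) {
  return cmpNumbers(static_cast<std::uint64_t>(L),
                    static_cast<std::uint64_t>(R));
}
```
+
+Both helpers return only -1, 0 or 1, so their results can be passed straight
+up through nested comparison routines.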
+
+The rest of the article is based on the *MergeFunctions.cpp* source code
+(found in *<llvm_dir>/lib/Transforms/IPO/MergeFunctions.cpp*). We would like
+to ask the reader to keep this file open, so we can use it as a reference
+for further explanations.
+
+Now, we're ready to proceed to the next chapter and see how it works.
+
+Functions comparison
+====================
+First, let's define how exactly we compare complex objects.
+
+Comparison of a complex object (function, basic block, etc.) is mostly based on
+the comparison results of its sub-objects. It is similar to the following
+comparison of "tree" objects:
+
+#. For two trees *T1* and *T2* we perform *depth-first-traversal* and have
+   two sequences as a product: "*T1Items*" and "*T2Items*".
+
+#. We then compare chains "*T1Items*" and "*T2Items*" in
+   the most-significant-item-first order. The result of items comparison
+   would be the result of *T1* and *T2* comparison itself.
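+
+The two steps above can be sketched as follows, assuming that the depth-first
+traversal has already flattened both trees into sequences of integer items.
+The item type and the rule for sequences of different length are assumptions
+of this sketch, not something taken from the pass itself.
+
```cpp
#include <cstddef>
#include <vector>

// Compares two flattened item sequences, most significant item first.
// The first pair of differing items decides the result. If one sequence is a
// prefix of the other, the shorter one is treated as less (an assumption of
// this sketch).
int cmpSequences(const std::vector<int> &T1Items,
                 const std::vector<int> &T2Items) {
  std::size_t I = 0;
  for (; I < T1Items.size() && I < T2Items.size(); ++I) {
    if (T1Items[I] < T2Items[I]) return -1;
    if (T1Items[I] > T2Items[I]) return 1;
  }
  if (I < T2Items.size()) return -1; // T1Items ran out first
  if (I < T1Items.size()) return 1;  // T2Items ran out first
  return 0;                          // same length, all items equal
}
```
+
+Because the first differing item decides the result, the comparison can stop
+early without visiting the remaining items.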
+
+FunctionComparator::compare(void)
+---------------------------------
+A brief look at the source code tells us that the comparison starts in the
+``int FunctionComparator::compare(void)`` method.
+
+1. The first parts to be compared are the function's attributes and some
+properties that are outside the "attributes" term, but still could make the
+function different without changing its body. This part of the comparison is
+usually done with simple *cmpNumbers* or *cmpFlags* operations (e.g.
+``cmpFlags(F1->hasGC(), F2->hasGC())``). Below is a full list of function
+properties to be compared at this stage:
+
+  * *Attributes* (those are returned by ``Function::getAttributes()``
+    method).
+
+  * *GC*, for equivalence, *RHS* and *LHS* should be both either without
+    *GC* or with the same one.
+
+  * *Section*, just like a *GC*: *RHS* and *LHS* should be defined in the
+    same section.
+
+  * *Variable arguments*. *LHS* and *RHS* should be both either with or
+    without *var-args*.
+
+  * *Calling convention* should be the same.
+
+2. Function type. Checked by ``FunctionComparator::cmpType(Type*, Type*)``
+method. It checks return type and parameters type; the method itself will be
+described later.
+
+3. Associate function formal parameters with each other. Then, when comparing
+function bodies, if we see the usage of *LHS*'s *i*-th argument in *LHS*'s
+body, we want to see the usage of *RHS*'s *i*-th argument at the same place in
+*RHS*'s body; otherwise the functions are different. At this stage we grant
+preference to the values we met later in the function body (the value we met
+first would be *less*). This is done by the
+``FunctionComparator::cmpValues(const Value*, const Value*)`` method (described
+a bit later).
+
+4. Function body comparison. As written in the method comments:
+
+"We do a CFG-ordered walk since the actual ordering of the blocks in the linked
+list is immaterial. Our walk starts at the entry block for both functions, then
+takes each block from each terminator in order. As an artifact, this also means
+that unreachable blocks are ignored."
+
+So, using this walk we get BBs from *left* and *right* in the same order, and
+compare them with the ``FunctionComparator::compare(const BasicBlock*, const
+BasicBlock*)`` method.
+
+We also associate BBs with each other, like we did with function formal
+arguments (see the ``cmpValues`` method below).
+
+FunctionComparator::cmpType
+---------------------------
+Consider how type comparison works.
+
+1. Coerce pointer to integer. If the left type is a pointer, try to coerce it to
+the integer type. This can be done if its address space is 0, or if address
+spaces are ignored altogether. Do the same thing for the right type.
+
+2. If the left and right types are equal, return 0. Otherwise we need to give
+preference to one of them, so proceed to the next step.
+
+3. If the types are of different kinds (different type IDs), return the result
+of the type ID comparison, treating the IDs as numbers (use the ``cmpNumbers``
+operation).
+
+4. If the types are vectors or integers, return the result of comparing their
+pointers as numbers.
+
+5. Check whether type ID belongs to the next group (call it equivalent-group):
+
+   * Void
+
+   * Float
+
+   * Double
+
+   * X86_FP80
+
+   * FP128
+
+   * PPC_FP128
+
+   * Label
+
+   * Metadata.
+
+   If the ID belongs to the group above, return 0, since it's enough to see
+   that the types have the same ``TypeID``. No additional information is
+   required.
+
+6. Left and right are pointers. Return the result of address space comparison
+(numbers comparison).
+
+7. Complex types (structures, arrays, etc.). Follow the complex objects
+comparison technique (see the very first paragraph of this chapter). Both
+*left* and *right* are expanded and their element types are checked the same
+way. If we get -1 or 1 at some stage, return it. Otherwise return 0.
+
+8. Steps 1-7 describe all the possible cases. If we passed steps 1-7 and didn't
+reach a conclusion, invoke ``llvm_unreachable``, since this is a quite
+unexpected case.
+
+cmpValues(const Value*, const Value*)
+-------------------------------------
+Method that compares local values.
+
+This method gives us an answer to a very curious question: whether we could
+treat local values as equal, and which value is greater otherwise. It's
+better to start with an example:
+
+Consider the situation when we're looking at the same place in left
+function "*FL*" and in right function "*FR*". Every part of *left* place is
+equal to the corresponding part of *right* place, and (!) both parts use
+*Value* instances, for example:
+
+.. code-block:: text
+
+   instr0 i32 %LV   ; left side, function FL
+   instr0 i32 %RV   ; right side, function FR
+
+So, now our conclusion depends on *Value* instances comparison.
+
+The main purpose of this method is to determine relation between such values.
+
+What can we expect from equal functions? At the same place, in functions
+"*FL*" and "*FR*" we expect to see *equal* values, or values *defined* at
+the same place in "*FL*" and "*FR*".
+
+Consider a small example here:
+
+.. code-block:: text
+
+  define void %f(i32 %pf0, i32 %pf1) {
+    instr0 i32 %pf0
+    instr1 i32 %pf1
+    instr2 i32 123
+  }
+
+.. code-block:: text
+
+  define void %g(i32 %pg0, i32 %pg1) {
+    instr0 i32 %pg0
+    instr1 i32 %pg0
+    instr2 i32 123
+  }
+
+In this example, *pf0* is associated with *pg0*, *pf1* is associated with
+*pg1*, and we also declare that *pf0* < *pf1*, and thus *pg0* < *pf1*.
+
+Instructions with opcode "*instr0*" would be *equal*, since their types and
+opcodes are equal, and values are *associated*.
+
+Instructions with opcode "*instr1*" from *f* are *greater* than instructions
+with opcode "*instr1*" from *g*; here we have equal types and opcodes, but
+"*pf1*" is greater than "*pg0*".
+
+Instructions with opcode "*instr2*" are equal, because their opcodes and
+types are equal, and the same constant is used as a value.
+
+What we associate in cmpValues?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Function arguments. The *i*-th argument from the left function is associated
+  with the *i*-th argument from the right function.
+* BasicBlock instances. In the basic-block enumeration loop we associate the
+  *i*-th BasicBlock from the left function with the *i*-th BasicBlock from the
+  right function.
+* Instructions.
+* Instruction operands. Note that we can meet a *Value* here that we have never
+  seen before. In this case it is not a function argument, nor a *BasicBlock*,
+  nor an *Instruction*. It is a global value, and it is a constant, since
+  constants are the only globals expected here. The method also compares
+  constants that are of the same type; if the right constant can be losslessly
+  bit-casted to the left one, we compare them as well.
+
+How to implement cmpValues?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+*Association* is a case of equality for us. We just treat such values as equal,
+but, in general, we need to implement an antisymmetric relation. As mentioned
+above, to understand what is *less*, we can use the order in which we
+meet values. If both values have the same order in a function (met at the same
+time), we then treat the values as *associated*. Otherwise, it depends on which
+was met first.
+
+Every time we run the top-level compare method, we initialize two identical
+maps (one for the left side, another one for the right side):
+
+``map<Value, int> sn_mapL, sn_mapR;``
+
+The key of the map is the *Value* itself; the mapped value is its order (call
+it *serial number*).
+
+To add a value *V* we need to perform the following procedure:
+
+``sn_map.insert(std::make_pair(V, sn_map.size()));``
+
+For the first *Value*, map will return *0*, for the second *Value* map will
+return *1*, and so on.
+
+We can then check whether left and right values met at the same time with
+a simple comparison:
+
+``cmpNumbers(sn_mapL[Left], sn_mapR[Right]);``
+
+Of course, we can combine insertion and comparison:
+
+.. code-block:: c++
+
+  std::pair<iterator, bool>
+    LeftRes = sn_mapL.insert(std::make_pair(Left, sn_mapL.size())),
+    RightRes = sn_mapR.insert(std::make_pair(Right, sn_mapR.size()));
+  return cmpNumbers(LeftRes.first->second, RightRes.first->second);
+
+Let's look at how the whole method could be implemented.
+
+1. We have to start with the bad news. Consider function self-referencing and
+cross-referencing cases:
+
+.. code-block:: c++
+
+  // self-reference
+  unsigned fact0(unsigned n) { return n > 1 ? n * fact0(n-1) : 1; }
+  unsigned fact1(unsigned n) { return n > 1 ? n * fact1(n-1) : 1; }
+
+  // cross-reference
+  unsigned pong(unsigned n); // forward declaration so that ping can call pong
+  unsigned ping(unsigned n) { return n != 0 ? pong(n-1) : 0; }
+  unsigned pong(unsigned n) { return n != 0 ? ping(n-1) : 0; }
+
+..
+
+  This comparison was implemented in the initial version of the
+  *MergeFunctions* pass. Unfortunately, it is not transitive, and this is the
+  only case we can't convert to a less-equal-greater comparison. It is a rare
+  case, 4-5 functions out of 10000 (checked in the test-suite), and we hope the
+  reader will forgive us for such a sacrifice in order to get the O(log(N)) pass
+  time.
+
+2. If the left/right *Value* is a constant, we have to compare them. Return 0
+if it is the same constant, or use the ``cmpConstants`` method otherwise.
+
+3. If the left/right value is an *InlineAsm* instance, return the result of
+*Value* pointers comparison.
+
+4. Explicit association of *L* (left value) and *R* (right value). We need to
+find out whether the values were met at the same time, and thus are
+*associated*. Otherwise we need a rule that says when we treat *L* < *R*. Now
+it is easy: we just return the result of numbers comparison:
+
+.. code-block:: c++
+
+   std::pair<iterator, bool>
+     LeftRes = sn_mapL.insert(std::make_pair(Left, sn_mapL.size())),
+     RightRes = sn_mapR.insert(std::make_pair(Right, sn_mapR.size()));
+   if (LeftRes.first->second == RightRes.first->second) return 0;
+   if (LeftRes.first->second < RightRes.first->second) return -1;
+   return 1;
+
+Now, when *cmpValues* returns 0, we can proceed with the comparison procedure.
+Otherwise, if we get -1 or 1, we need to pass this result to the top level
+and finish the comparison procedure.
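+
+The serial-number idea can be tried out in isolation with the toy model below.
+It is only a sketch: values are identified by name rather than by *Value*
+pointer, the struct name is made up for this example, and constants and
+*InlineAsm* (steps 2 and 3) are not handled.
+
```cpp
#include <map>
#include <string>
#include <utility>

// Toy model of the serial-number maps. Values are identified by name here;
// the real pass works with Value pointers.
struct ToyValueNumbering {
  std::map<std::string, int> sn_mapL, sn_mapR;

  // Returns 0 if Left and Right received the same serial number (they are
  // associated), -1 if Left was met earlier than Right, and 1 otherwise.
  int cmpValues(const std::string &Left, const std::string &Right) {
    // insert() leaves an existing entry untouched, so a value keeps the serial
    // number it was given when it was first met.
    auto LeftRes =
        sn_mapL.insert(std::make_pair(Left, static_cast<int>(sn_mapL.size())));
    auto RightRes =
        sn_mapR.insert(std::make_pair(Right, static_cast<int>(sn_mapR.size())));
    int LeftSN = LeftRes.first->second;
    int RightSN = RightRes.first->second;
    if (LeftSN == RightSN) return 0;
    return LeftSN < RightSN ? -1 : 1;
  }
};
```
+
+Feeding it the operands of functions *f* and *g* from the example above gives
+0 for the pair (*pf0*, *pg0*) and 1 for the pair (*pf1*, *pg0*), which matches
+the conclusion that "*instr1*" from *f* is greater than "*instr1*" from *g*.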
+
+cmpConstants
+------------
+Performs constants comparison as follows:
+
+1. Compare constant types using the ``cmpType`` method. If the result is -1 or
+1, go to step 2; otherwise proceed to step 3.
+
+2. If the types are different, we can still check whether the constants could
+be losslessly bitcasted to each other. The explanation below is a modification
+of the ``canLosslesslyBitCastTo`` method.
+
+   2.1 Check whether constants are of the first class types
+   (``isFirstClassType`` check):
+
+   2.1.1. If both constants are *not* of the first class type: return result
+   of ``cmpType``.
+
+   2.1.2. Otherwise, if left type is not of the first class, return -1. If
+   right type is not of the first class, return 1.
+
+   2.1.3. If both types are of the first class type, proceed to the next step
+   (2.1.3.1).
+
+   2.1.3.1. If types are vectors, compare their bitwidth using the
+   *cmpNumbers*. If result is not 0, return it.
+
+   2.1.3.2. Different types, but not vectors:
+
+   * if both of them are pointers, good for us, we can proceed to step 3.
+   * if only one of the types is a pointer, return the result of *isPointer*
+     flags comparison (*cmpFlags* operation).
+   * otherwise we have no methods to prove bitcastability, and thus return
+     the result of types comparison (-1 or 1).
+
+The steps below are for the case when the types are equal, or the case when
+the constants are bitcastable:
+
+3. One of the constants is a "*null*" value. Return the result of the
+``cmpFlags(L->isNullValue(), R->isNullValue())`` comparison.
+
+4. Compare value IDs, and return result if it is not 0:
+
+.. code-block:: c++
+
+  if (int Res = cmpNumbers(L->getValueID(), R->getValueID()))
+    return Res;
+
+5. Compare the contents of the constants. The comparison depends on the kind
+of constants, but at this stage it is just a lexicographical comparison; see
+how it was described at the beginning of the "*Functions comparison*" chapter.
+Mathematically, it is equivalent to the following: we encode the left constant
+and the right constant (in a way similar to what the *bitcode-writer* does),
+and then compare the left code sequence and the right code sequence.
+
+compare(const BasicBlock*, const BasicBlock*)
+---------------------------------------------
+Compares two *BasicBlock* instances.
+
+It enumerates instructions from left *BB* and right *BB*.
+
+1. It assigns serial numbers to the left and right instructions, using
+``cmpValues`` method.
+
+2. If one of left or right is a *GEP* (``GetElementPtr``), then treat the *GEP*
+as greater than other instructions. If both instructions are *GEPs*, use the
+``cmpGEP`` method for comparison. If the result is -1 or 1, pass it to the
+top-level comparison (return it).
+
+3. Otherwise, compare the instructions as regular operations:
+
+   3.1. Compare operations. Call the ``cmpOperation`` method. If the result is
+   -1 or 1, return it.
+
+   3.2. Compare number of operands, if result is -1 or 1, return it.
+
+   3.3. Compare operands themselves, use ``cmpValues`` method. Return result
+   if it is -1 or 1.
+
+   3.4. Compare type of operands, using ``cmpType`` method. Return result if
+   it is -1 or 1.
+
+   3.5. Proceed to the next instruction.
+
+4. We can finish instruction enumeration in 3 cases:
+
+   4.1. We reached the end of both left and right basic-blocks. We didn't
+   exit on steps 1-3, so contents are equal, return 0.
+
+   4.2. We have reached the end of the left basic-block. Return -1.
+
+   4.3. Return 1 (we reached the end of the right basic block).
+
+cmpGEP
+------
+Compares two GEPs (``getelementptr`` instructions).
+
+It differs from the regular operations comparison in only one thing: the
+possibility of using the ``accumulateConstantOffset`` method.
+
+So, if we get a constant offset for both left and right *GEPs*, we compare the
+offsets as numbers and return the comparison result.
+
+Otherwise treat it like a regular operation (see the previous paragraph).
+
+cmpOperation
+------------
+Compares instruction opcodes and some important operation properties.
+
+1. Compare opcodes. If they differ, return the result.
+
+2. Compare the number of operands. If it differs, return the result.
+
+3. Compare operation types, using *cmpType*. Likewise, if the types are
+different, return the result.
+
+4. Compare *subclassOptionalData*: get it with the
+``getRawSubclassOptionalData`` method and compare it as numbers.
+
+5. Compare operand types.
+
+6. For some particular instructions, check the equivalence (relation in our
+case) of some significant attributes. For example, we have to compare alignment
+for ``load`` instructions.
+
+O(log(N))
+---------
+The methods described above implement an order relation, which can be used
+for comparing nodes in a binary tree. So we can organize the set of functions
+into a binary tree and reduce the cost of the lookup procedure from
+O(N*N) to O(log(N)).
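+
+To see how a three-way comparison plugs into an ordered tree, consider the
+sketch below. Everything in it is a stand-in invented for illustration: a
+"function" is modelled as a vector of opcode numbers, and ``std::set`` plays
+the role of ``FnTree``. It does not reproduce the data structures of the pass.
+
```cpp
#include <set>
#include <vector>

// Stand-in for a function: a sequence of opcode numbers (hypothetical).
using ToyFunction = std::vector<int>;

// Three-way comparison; std::vector's operator< is lexicographic.
int cmpToyFunctions(const ToyFunction &L, const ToyFunction &R) {
  if (L < R) return -1;
  if (R < L) return 1;
  return 0;
}

// Adapts the three-way result to the "less than" predicate std::set expects.
struct ToyFunctionLess {
  bool operator()(const ToyFunction &L, const ToyFunction &R) const {
    return cmpToyFunctions(L, R) == -1;
  }
};

using ToyFnTree = std::set<ToyFunction, ToyFunctionLess>;

// Returns true if an equal function was already in the tree (a merge
// candidate). Otherwise inserts F and returns false.
bool insertOrFindEqual(ToyFnTree &Tree, const ToyFunction &F) {
  return !Tree.insert(F).second;
}
```
+
+Each insertion or lookup in the set needs O(log(N)) comparisons, which is
+where the logarithmic lookup cost comes from.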
+
+Merging process, mergeTwoFunctions
+==================================
+Once *MergeFunctions* detects that the current function (*G*) is equal to one
+that was analyzed before (function *F*), it calls
+``mergeTwoFunctions(Function*, Function*)``.
+
+The operation affects the ``FnTree`` contents in the following way: *F* will
+stay in ``FnTree``. *G*, being equal to *F*, will not be added to ``FnTree``.
+Calls of *G* will be replaced with something else, which changes the bodies of
+the callers. So, functions that call *G* are put into the ``Deferred`` set,
+removed from ``FnTree``, and analyzed again.
+
+The approach is as follows:
+
+1. The most desirable case: we can use an alias and both *F* and *G* are weak.
+We turn both of them into aliases to a third strong function *H*. Actually *H*
+is *F*. See below how it's made (but it's better to look straight into the
+source code). Well, this is a case when we can just replace *G* with *F*
+everywhere; we use the ``replaceAllUsesWith`` operation here (*RAUW*).
+
+2. *F* could not be overridden, while *G* could. It would be good to do the
+following: after merging, the places where the overridable function was used
+still use an overridable stub. So try to make *G* an alias to *F*, or create an
+overridable tail call wrapper around *F* and replace *G* with that call.
+
+3. Neither *F* nor *G* could be overridden. We can't use *RAUW*. We can just
+change the callers: call *F* instead of *G*. That's what
+``replaceDirectCallers`` does.
+
+Below is a detailed body description.
+
+If "F" may be overridden
+------------------------
+As follows from the ``mayBeOverridden`` comments: "whether the definition of
+this global may be replaced by something non-equivalent at link time". If so,
+that's ok: we can use an alias to *F* instead of *G* or change the call
+instructions themselves.
+
+HasGlobalAliases, removeUsers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+First consider the case when we have global aliases of one function name to
+another. Our purpose is to turn both of them into aliases to a third strong
+function. However, if we keep *F* alive and without major changes we can leave
+it in ``FnTree``. Try to combine these two goals.
+
+Do stub replacement of *F* itself with an alias to *F*.
+
+1. Create a stub function *H*, with the same name and attributes as function
+*F*. It takes the maximum alignment of *F* and *G*.
+
+2. Replace all uses of function *F* with uses of function *H*. This is a
+two-step procedure. First of all, we must take into account that all functions
+from which *F* is called will be changed, since we change the call argument
+(from *F* to *H*). So we must review these caller functions again after
+this procedure. We remove the callers from ``FnTree``; the method named
+``removeUsers(F)`` does that (don't confuse it with ``replaceAllUsesWith``):
+
+   2.1. Inside ``removeUsers(Value* V)`` we go through all the values that use
+   value *V* (or *F* in our context). If a value is an instruction, we go to
+   the function that holds this instruction and mark it as
+   to-be-analyzed-again (put it into the ``Deferred`` set); we also remove the
+   caller from ``FnTree``.
+
+   2.2. Now we can do the replacement: call ``F->replaceAllUsesWith(H)``.
+
+3. *H* (that now "officially" plays *F*'s role) is replaced with an alias to
+*F*. Do the same with *G*: replace it with an alias to *F*. So finally
+everywhere *F* was used, we use *H*, which is an alias to *F*, and everywhere
+*G* was used we also have an alias to *F*.
+
+4. Set *F* linkage to private. Make it strong :-)
+
+No global aliases, replaceDirectCallers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+If global aliases are not supported, we call ``replaceDirectCallers``. It just
+goes through all calls of *G* and replaces them with calls of *F*. If you look
+into the method you will see that it scans all uses of *G* too, and if a use is
+a callee (the user is a call instruction and *G* is used as what is to be
+called), we replace it with a use of *F*.
+
+If "F" could not be overridden, fix it!
+"""""""""""""""""""""""""""""""""""""""
+
+We call ``writeThunkOrAlias(Function *F, Function *G)``. Here we try to replace
+*G* with an alias to *F* first. The following conditions are essential:
+
+* the target should support global aliases,
+* the address of *G* itself should not be significant, not named and not
+  referenced anywhere,
+* the function should come with external, local or weak linkage.
+
+Otherwise we write a thunk: a wrapper that has *G's* interface and calls *F*,
+so *G* could be replaced with this wrapper.
+
+*writeAlias*
+
+As follows from the *llvm* reference:
+
+"Aliases act as *second name* for the aliasee value". So we just want to create
+a second name for *F* and use it instead of *G*:
+
+1. create the global alias itself (*GA*),
+
+2. adjust the alignment of *F* so that it is the maximum of the current and
+*G's* alignment;
+
+3. replace uses of *G*:
+
+   3.1. first mark all callers of *G* as to-be-analyzed-again, using
+   ``removeUsers`` method (see chapter above),
+
+   3.2. call ``G->replaceAllUsesWith(GA)``.
+
+4. Get rid of *G*.
+
+*writeThunk*
+
+As written in the method comments:
+
+"Replace G with a simple tail call to bitcast(F). Also replace direct uses of G
+with bitcast(F). Deletes G."
+
+In general it does the same as usual when we want to replace a callee, except
+for the first point:
+
+1. We generate a tail call wrapper around *F*, but with an interface that
+allows using it instead of *G*.
+
+2. "As usual": ``removeUsers`` and then ``replaceAllUsesWith``.
+
+3. Get rid of *G*.
+
+

Added: www-releases/trunk/9.0.0/docs/_sources/NVPTXUsage.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/NVPTXUsage.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/NVPTXUsage.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/NVPTXUsage.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,972 @@
+=============================
+User Guide for NVPTX Back-end
+=============================
+
+.. contents::
+   :local:
+   :depth: 3
+
+
+Introduction
+============
+
+To support GPU programming, the NVPTX back-end supports a subset of LLVM IR
+along with a defined set of conventions used to represent GPU programming
+concepts. This document provides an overview of the general usage of the
+back-end, including a description of the conventions used and the set of
+accepted LLVM IR.
+
+.. note:: 
+   
+   This document assumes a basic familiarity with CUDA and the PTX
+   assembly language. Information about the CUDA Driver API and the PTX assembly
+   language can be found in the `CUDA documentation
+   <http://docs.nvidia.com/cuda/index.html>`_.
+
+
+
+Conventions
+===========
+
+Marking Functions as Kernels
+----------------------------
+
+In PTX, there are two types of functions: *device functions*, which are only
+callable by device code, and *kernel functions*, which are callable by host
+code. By default, the back-end will emit device functions. Metadata is used to
+declare a function as a kernel function. This metadata is attached to the
+``nvvm.annotations`` named metadata object, and has the following format:
+
+.. code-block:: text
+
+   !0 = !{<function-ref>, !"kernel", i32 1}
+
+The first parameter is a reference to the kernel function. The following
+example shows a kernel function calling a device function in LLVM IR. The
+function ``@my_kernel`` is callable from host code, but ``@my_fmad`` is not.
+
+.. code-block:: llvm
+
+    define float @my_fmad(float %x, float %y, float %z) {
+      %mul = fmul float %x, %y
+      %add = fadd float %mul, %z
+      ret float %add
+    }
+
+    define void @my_kernel(float* %ptr) {
+      %val = load float, float* %ptr
+      %ret = call float @my_fmad(float %val, float %val, float %val)
+      store float %ret, float* %ptr
+      ret void
+    }
+
+    !nvvm.annotations = !{!1}
+    !1 = !{void (float*)* @my_kernel, !"kernel", i32 1}
+
+When compiled, the PTX kernel functions are callable by host-side code.
+
+
+.. _address_spaces:
+
+Address Spaces
+--------------
+
+The NVPTX back-end uses the following address space mapping:
+
+   ============= ======================
+   Address Space Memory Space
+   ============= ======================
+   0             Generic
+   1             Global
+   2             Internal Use
+   3             Shared
+   4             Constant
+   5             Local
+   ============= ======================
+
+Every global variable and pointer type is assigned to one of these address
+spaces, with 0 being the default address space. Intrinsics are provided which
+can be used to convert pointers between the generic and non-generic address
+spaces.
+
+As an example, the following IR will define an array ``@g`` that resides in
+global device memory.
+
+.. code-block:: llvm
+
+    @g = internal addrspace(1) global [4 x i32] [ i32 0, i32 1, i32 2, i32 3 ]
+
+LLVM IR functions can read and write to this array, and host-side code can
+copy data to it by name with the CUDA Driver API.
+
+Note that since address space 0 is the generic space, it is illegal to have
+global variables in address space 0.  Address space 0 is the default address
+space in LLVM, so the ``addrspace(N)`` annotation is *required* for global
+variables.
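+
+For tools that need to print or validate address spaces, the table above can
+be written down as a small lookup. The helper names below are hypothetical and
+exist only for this sketch; only the numeric values and memory space names
+come from the table.
+
```cpp
#include <string>

// Hypothetical helper: maps an NVPTX address space number to the memory
// space name from the table above. Unknown numbers yield an empty string.
std::string memorySpaceName(unsigned AddrSpace) {
  switch (AddrSpace) {
  case 0: return "Generic";
  case 1: return "Global";
  case 2: return "Internal Use";
  case 3: return "Shared";
  case 4: return "Constant";
  case 5: return "Local";
  default: return "";
  }
}

// Address space 0 is the generic space, which global variables must not use.
bool isGenericAddressSpace(unsigned AddrSpace) { return AddrSpace == 0; }
```
+
+A front end could use a check like ``isGenericAddressSpace`` to reject global
+variables that lack an explicit ``addrspace(N)`` annotation.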
+
+
+Triples
+-------
+
+The NVPTX target uses the module triple to select between 32/64-bit code
+generation and the driver-compiler interface to use. The triple architecture
+can be one of ``nvptx`` (32-bit PTX) or ``nvptx64`` (64-bit PTX). The
+operating system should be one of ``cuda`` or ``nvcl``, which determines the
+interface used by the generated code to communicate with the driver.  Most
+users will want to use ``cuda`` as the operating system, which makes the
+generated PTX compatible with the CUDA Driver API.
+
+Example: 32-bit PTX for CUDA Driver API: ``nvptx-nvidia-cuda``
+
+Example: 64-bit PTX for CUDA Driver API: ``nvptx64-nvidia-cuda``
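+
+As a small illustration of how the architecture component selects the code
+generation mode, the sketch below inspects a triple as a plain string. The
+function names are invented for this example, and the sketch does not use or
+model LLVM's own triple handling.
+
```cpp
#include <string>

// Returns the architecture component of a triple, i.e. the text before the
// first '-'. If there is no '-', the whole string is returned.
std::string tripleArch(const std::string &Triple) {
  return Triple.substr(0, Triple.find('-'));
}

// True for triples whose architecture is nvptx64 (64-bit PTX).
bool is64BitPTX(const std::string &Triple) {
  return tripleArch(Triple) == "nvptx64";
}
```
+
+With the two example triples above, ``is64BitPTX`` is false for
+``nvptx-nvidia-cuda`` and true for ``nvptx64-nvidia-cuda``.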
+
+
+
+.. _nvptx_intrinsics:
+
+NVPTX Intrinsics
+================
+
+Address Space Conversion
+------------------------
+
+'``llvm.nvvm.ptr.*.to.gen``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+These are overloaded intrinsics.  You can use these on any pointer types.
+
+.. code-block:: llvm
+
+    declare i8* @llvm.nvvm.ptr.global.to.gen.p0i8.p1i8(i8 addrspace(1)*)
+    declare i8* @llvm.nvvm.ptr.shared.to.gen.p0i8.p3i8(i8 addrspace(3)*)
+    declare i8* @llvm.nvvm.ptr.constant.to.gen.p0i8.p4i8(i8 addrspace(4)*)
+    declare i8* @llvm.nvvm.ptr.local.to.gen.p0i8.p5i8(i8 addrspace(5)*)
+
+Overview:
+"""""""""
+
+The '``llvm.nvvm.ptr.*.to.gen``' intrinsics convert a pointer in a non-generic
+address space to a generic address space pointer.
+
+Semantics:
+""""""""""
+
+These intrinsics modify the pointer value to be a valid generic address space
+pointer.
+
+
+'``llvm.nvvm.ptr.gen.to.*``' Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+These are overloaded intrinsics.  You can use these on any pointer types.
+
+.. code-block:: llvm
+
+    declare i8 addrspace(1)* @llvm.nvvm.ptr.gen.to.global.p1i8.p0i8(i8*)
+    declare i8 addrspace(3)* @llvm.nvvm.ptr.gen.to.shared.p3i8.p0i8(i8*)
+    declare i8 addrspace(4)* @llvm.nvvm.ptr.gen.to.constant.p4i8.p0i8(i8*)
+    declare i8 addrspace(5)* @llvm.nvvm.ptr.gen.to.local.p5i8.p0i8(i8*)
+
+Overview:
+"""""""""
+
+The '``llvm.nvvm.ptr.gen.to.*``' intrinsics convert a pointer in the generic
+address space to a pointer in the target address space.  Note that these
+intrinsics are only useful if the target address space of the pointer is
+known.  It is not legal to use address space conversion intrinsics to convert
+a pointer from one non-generic address space to another non-generic address
+space.
+
+Semantics:
+""""""""""
+
+These intrinsics modify the pointer value to be a valid pointer in the target
+non-generic address space.
+
+
+Reading PTX Special Registers
+-----------------------------
+
+'``llvm.nvvm.read.ptx.sreg.*``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+    declare i32 @llvm.nvvm.read.ptx.sreg.tid.x()
+    declare i32 @llvm.nvvm.read.ptx.sreg.tid.y()
+    declare i32 @llvm.nvvm.read.ptx.sreg.tid.z()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ntid.z()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.y()
+    declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.z()
+    declare i32 @llvm.nvvm.read.ptx.sreg.nctaid.x()
+    declare i32 @llvm.nvvm.read.ptx.sreg.nctaid.y()
+    declare i32 @llvm.nvvm.read.ptx.sreg.nctaid.z()
+    declare i32 @llvm.nvvm.read.ptx.sreg.warpsize()
+
+Overview:
+"""""""""
+
+The '``@llvm.nvvm.read.ptx.sreg.*``' intrinsics provide access to the PTX
+special registers, in particular the kernel launch bounds.  These registers
+map in the following way to CUDA builtins:
+
+   ============= =====================================
+   CUDA Builtin  PTX Special Register Intrinsic
+   ============= =====================================
+   ``threadIdx`` ``@llvm.nvvm.read.ptx.sreg.tid.*``
+   ``blockIdx``  ``@llvm.nvvm.read.ptx.sreg.ctaid.*``
+   ``blockDim``  ``@llvm.nvvm.read.ptx.sreg.ntid.*``
+   ``gridDim``   ``@llvm.nvvm.read.ptx.sreg.nctaid.*``
+   ============= =====================================
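+
+A common use of these registers is computing a global thread index, which in
+CUDA is usually written as ``blockIdx.x * blockDim.x + threadIdx.x``. The
+plain C++ function below models that arithmetic on already-read register
+values; the function and parameter names are made up for this sketch, and it
+does not call the intrinsics itself.
+
```cpp
// Computes the x-dimension global thread index from three special register
// values: tid.x (thread index within the block), ntid.x (threads per block)
// and ctaid.x (block index within the grid).
unsigned globalThreadIndexX(unsigned TidX, unsigned NtidX, unsigned CtaidX) {
  return CtaidX * NtidX + TidX;
}
```
+
+For example, thread 5 of block 2 with 256 threads per block gets index
+2 * 256 + 5 = 517.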
+
+
+Barriers
+--------
+
+'``llvm.nvvm.barrier0``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+  declare void @llvm.nvvm.barrier0()
+
+Overview:
+"""""""""
+
+The '``@llvm.nvvm.barrier0()``' intrinsic emits a PTX ``bar.sync 0``
+instruction, equivalent to the ``__syncthreads()`` call in CUDA.
+
+
+Other Intrinsics
+----------------
+
+For the full set of NVPTX intrinsics, please see the
+``include/llvm/IR/IntrinsicsNVVM.td`` file in the LLVM source tree.
+
+
+.. _libdevice:
+
+Linking with Libdevice
+======================
+
+The CUDA Toolkit comes with an LLVM bitcode library called ``libdevice`` that
+implements many common mathematical functions. This library can be used as a
+high-performance math library for any compilers using the LLVM NVPTX target.
+The library can be found under ``nvvm/libdevice/`` in the CUDA Toolkit and
+there is a separate version for each compute architecture.
+
+For a list of all math functions implemented in libdevice, see
+`libdevice Users Guide <http://docs.nvidia.com/cuda/libdevice-users-guide/index.html>`_.
+
+To accommodate various math-related compiler flags that can affect code
+generation of libdevice code, the library code depends on a special LLVM IR
+pass (``NVVMReflect``) to handle conditional compilation within LLVM IR. This
+pass looks for calls to the ``@__nvvm_reflect`` function and replaces them
+with constants based on the defined reflection parameters. Such conditional
+code often follows a pattern:
+
+.. code-block:: c++
+
+  float my_function(float a) {
+    if (__nvvm_reflect("FASTMATH"))
+      return my_function_fast(a);
+    else
+      return my_function_precise(a);
+  }
+
+The default value for all unspecified reflection parameters is zero.
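+
+The default-to-zero rule can be modelled as a simple lookup. This is only a
+sketch of the rule, not of the ``NVVMReflect`` pass: the function name and the
+use of ``std::map`` as the parameter store are assumptions made for
+illustration.
+
```cpp
#include <map>
#include <string>

// Models the lookup rule only: a reflection parameter that was not specified
// reads as 0.
int reflectValue(const std::map<std::string, int> &Params,
                 const std::string &Name) {
  auto It = Params.find(Name);
  return It == Params.end() ? 0 : It->second;
}
```
+
+With this rule, a conditional such as ``if (__nvvm_reflect("FASTMATH"))``
+takes the ``else`` branch whenever ``FASTMATH`` has not been set.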
+
+The ``NVVMReflect`` pass should be executed early in the optimization
+pipeline, immediately after the link stage. The ``internalize`` pass is also
+recommended to remove unused math functions from the resulting PTX. For an
+input IR module ``module.bc``, the following compilation flow is recommended:
+
+1. Save list of external functions in ``module.bc``
+2. Link ``module.bc`` with ``libdevice.compute_XX.YY.bc``
+3. Internalize all functions not in list from (1)
+4. Eliminate all unused internal functions
+5. Run ``NVVMReflect`` pass
+6. Run standard optimization pipeline
+
+.. note::
+
+  ``linkonce`` and ``linkonce_odr`` linkage types are not suitable for the
+  libdevice functions. It is possible to link two IR modules that have been
+  linked against libdevice using different reflection variables.
+
+Since the ``NVVMReflect`` pass replaces conditionals with constants, it will
+often leave behind dead code of the form:
+
+.. code-block:: llvm
+
+  entry:
+    ..
+    br i1 true, label %foo, label %bar
+  foo:
+    ..
+  bar:
+    ; Dead code
+    ..
+
+Therefore, it is recommended that ``NVVMReflect`` is executed early in the
+optimization pipeline before dead-code elimination.
+
+The NVPTX TargetMachine knows how to schedule ``NVVMReflect`` at the beginning
+of your pass manager; just use the following code when setting up your pass
+manager:
+
+.. code-block:: c++
+
+    std::unique_ptr<TargetMachine> TM = ...;
+    PassManagerBuilder PMBuilder(...);
+    if (TM)
+      TM->adjustPassManager(PMBuilder);
+
+Reflection Parameters
+---------------------
+
+The libdevice library currently uses the following reflection parameters to
+control code generation:
+
+==================== ======================================================
+Flag                 Description
+==================== ======================================================
+``__CUDA_FTZ=[0,1]`` Use optimized code paths that flush subnormals to zero
+==================== ======================================================
+
+The value of this flag is determined by the "nvvm-reflect-ftz" module flag.
+The following sets the ftz flag to 1.
+
+.. code-block:: llvm
+
+    !llvm.module.flags = !{!0}
+    !0 = !{i32 4, !"nvvm-reflect-ftz", i32 1}
+
+(``i32 4`` indicates that the value set here overrides the value in another
+module we link with.  See the `LangRef <LangRef.html#module-flags-metadata>`_
+for details.)
+
+Executing PTX
+=============
+
+The most common way to execute PTX assembly on a GPU device is to use the CUDA
+Driver API. This API is a low-level interface to the GPU driver and allows for
+JIT compilation of PTX code to native GPU machine code.
+
+Initializing the Driver API:
+
+.. code-block:: c++
+
+    CUdevice device;
+    CUcontext context;
+
+    // Initialize the driver API
+    cuInit(0);
+    // Get a handle to the first compute device
+    cuDeviceGet(&device, 0);
+    // Create a compute device context
+    cuCtxCreate(&context, 0, device);
+
+JIT compiling a PTX string to a device binary:
+
+.. code-block:: c++
+
+    CUmodule module;
+    CUfunction function;
+
+    // JIT compile a null-terminated PTX string
+    cuModuleLoadData(&module, (void*)PTXString);
+
+    // Get a handle to the "myfunction" kernel function
+    cuModuleGetFunction(&function, module, "myfunction");
+
+For full examples of executing PTX assembly, please see the `CUDA Samples
+<https://developer.nvidia.com/cuda-downloads>`_ distribution.
+
+
+Common Issues
+=============
+
+ptxas complains of undefined function: __nvvm_reflect
+-----------------------------------------------------
+
+When linking with libdevice, the ``NVVMReflect`` pass must be used. See
+:ref:`libdevice` for more information.
+
+
+Tutorial: A Simple Compute Kernel
+=================================
+
+To start, let us take a look at a simple compute kernel written directly in
+LLVM IR. The kernel implements vector addition, where each thread computes one
+element of the output vector C from the input vectors A and B.  To make this
+easier, we also assume that only a single CTA (thread block) will be launched,
+and that it will be one dimensional.
+
+
+The Kernel
+----------
+
+.. code-block:: llvm
+
+  target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
+  target triple = "nvptx64-nvidia-cuda"
+
+  ; Intrinsic to read X component of thread ID
+  declare i32 @llvm.nvvm.read.ptx.sreg.tid.x() readnone nounwind
+
+  define void @kernel(float addrspace(1)* %A,
+                      float addrspace(1)* %B,
+                      float addrspace(1)* %C) {
+  entry:
+    ; What is my ID?
+    %id = tail call i32 @llvm.nvvm.read.ptx.sreg.tid.x() readnone nounwind
+
+    ; Compute pointers into A, B, and C
+    %ptrA = getelementptr float, float addrspace(1)* %A, i32 %id
+    %ptrB = getelementptr float, float addrspace(1)* %B, i32 %id
+    %ptrC = getelementptr float, float addrspace(1)* %C, i32 %id
+
+    ; Read A, B
+    %valA = load float, float addrspace(1)* %ptrA, align 4
+    %valB = load float, float addrspace(1)* %ptrB, align 4
+
+    ; Compute C = A + B
+    %valC = fadd float %valA, %valB
+
+    ; Store back to C
+    store float %valC, float addrspace(1)* %ptrC, align 4
+
+    ret void
+  }
+
+  !nvvm.annotations = !{!0}
+  !0 = !{void (float addrspace(1)*,
+               float addrspace(1)*,
+               float addrspace(1)*)* @kernel, !"kernel", i32 1}
+
+
+We can use the LLVM ``llc`` tool to directly run the NVPTX code generator:
+
+.. code-block:: text
+
+  # llc -mcpu=sm_20 kernel.ll -o kernel.ptx
+
+
+.. note::
+
+  If you want to generate 32-bit code, change ``p:64:64:64`` to ``p:32:32:32``
+  in the module data layout string and use ``nvptx-nvidia-cuda`` as the
+  target triple.
+
+
+The output we get from ``llc`` (as of LLVM 3.4):
+
+.. code-block:: text
+
+  //
+  // Generated by LLVM NVPTX Back-End
+  //
+
+  .version 3.1
+  .target sm_20
+  .address_size 64
+
+    // .globl kernel
+                                          // @kernel
+  .visible .entry kernel(
+    .param .u64 kernel_param_0,
+    .param .u64 kernel_param_1,
+    .param .u64 kernel_param_2
+  )
+  {
+    .reg .f32   %f<4>;
+    .reg .s32   %r<2>;
+    .reg .s64   %rl<8>;
+
+  // %bb.0:                                // %entry
+    ld.param.u64    %rl1, [kernel_param_0];
+    mov.u32         %r1, %tid.x;
+    mul.wide.s32    %rl2, %r1, 4;
+    add.s64         %rl3, %rl1, %rl2;
+    ld.param.u64    %rl4, [kernel_param_1];
+    add.s64         %rl5, %rl4, %rl2;
+    ld.param.u64    %rl6, [kernel_param_2];
+    add.s64         %rl7, %rl6, %rl2;
+    ld.global.f32   %f1, [%rl3];
+    ld.global.f32   %f2, [%rl5];
+    add.f32         %f3, %f1, %f2;
+    st.global.f32   [%rl7], %f3;
+    ret;
+  }
+
+
+Dissecting the Kernel
+---------------------
+
+Now let us dissect the LLVM IR that makes up this kernel. 
+
+Data Layout
+^^^^^^^^^^^
+
+The data layout string determines the size in bits of common data types, their
+ABI alignment, and their storage size.  For NVPTX, you should use one of the
+following:
+
+32-bit PTX:
+
+.. code-block:: llvm
+
+  target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
+
+64-bit PTX:
+
+.. code-block:: llvm
+
+  target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
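+
+The only difference between the two strings is the pointer component
+(``p:32:32:32`` versus ``p:64:64:64``). As a rough illustration, the
+host-side sketch below extracts the pointer size from such a string. The
+function name ``pointerSizeInBits`` is hypothetical, and the parser is a
+simplification: it handles only the default ``p:`` component and ignores
+address-space-qualified forms.
+
+.. code-block:: c++
+
+  #include <sstream>
+  #include <string>
+
+  // Returns the pointer size in bits taken from the "p:<size>:..."
+  // component of a data layout string, or 0 if no such component exists.
+  unsigned pointerSizeInBits(const std::string &Layout) {
+    std::istringstream In(Layout);
+    std::string Spec;
+    // Components are separated by '-'.
+    while (std::getline(In, Spec, '-'))
+      if (Spec.rfind("p:", 0) == 0) // Component starts with "p:".
+        return static_cast<unsigned>(std::stoul(Spec.substr(2)));
+    return 0;
+  }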
+
+
+Target Intrinsics
+^^^^^^^^^^^^^^^^^
+
+In this example, we use the ``@llvm.nvvm.read.ptx.sreg.tid.x`` intrinsic to
+read the X component of the current thread's ID, which corresponds to a read
+of register ``%tid.x`` in PTX. The NVPTX back-end supports a large set of
+intrinsics.  A short list is shown below; please see
+``include/llvm/IR/IntrinsicsNVVM.td`` for the full list.
+
+
+================================================ ====================
+Intrinsic                                        CUDA Equivalent
+================================================ ====================
+``i32 @llvm.nvvm.read.ptx.sreg.tid.{x,y,z}``     threadIdx.{x,y,z}
+``i32 @llvm.nvvm.read.ptx.sreg.ctaid.{x,y,z}``   blockIdx.{x,y,z}
+``i32 @llvm.nvvm.read.ptx.sreg.ntid.{x,y,z}``    blockDim.{x,y,z}
+``i32 @llvm.nvvm.read.ptx.sreg.nctaid.{x,y,z}``  gridDim.{x,y,z}
+``void @llvm.nvvm.barrier0()``                   __syncthreads()
+================================================ ====================
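+
+The tutorial kernel needs only ``tid.x`` because it assumes a single
+one-dimensional CTA. With more than one CTA, the values in the table are
+conventionally combined into a global index, as in CUDA's
+``blockIdx.x * blockDim.x + threadIdx.x``. The sketch below shows that
+arithmetic on the host; ``globalIndexX`` is a hypothetical helper, and its
+arguments stand for the values the ``ctaid.x``, ``ntid.x``, and ``tid.x``
+intrinsics would return on the device.
+
+.. code-block:: c++
+
+  #include <cstdint>
+
+  // Hypothetical host-side helper mirroring the usual device computation:
+  //   CtaIdX -> llvm.nvvm.read.ptx.sreg.ctaid.x  (blockIdx.x)
+  //   NTidX  -> llvm.nvvm.read.ptx.sreg.ntid.x   (blockDim.x)
+  //   TidX   -> llvm.nvvm.read.ptx.sreg.tid.x    (threadIdx.x)
+  std::uint32_t globalIndexX(std::uint32_t CtaIdX, std::uint32_t NTidX,
+                             std::uint32_t TidX) {
+    return CtaIdX * NTidX + TidX;
+  }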
+
+
+Address Spaces
+^^^^^^^^^^^^^^
+
+You may have noticed that all of the pointer types in the LLVM IR example had
+an explicit address space specifier. What is address space 1? NVIDIA GPU
+devices (generally) have four types of memory:
+
+- Global: Large, off-chip memory
+- Shared: Small, on-chip memory shared among all threads in a CTA
+- Local: Per-thread, private memory
+- Constant: Read-only memory shared across all threads
+
+These different types of memory are represented in LLVM IR as address spaces.
+There is also a fifth address space used by the NVPTX code generator that
+corresponds to the "generic" address space.  This address space can represent
+addresses in any other address space (with a few exceptions).  This allows
+users to write IR functions that can load/store memory using the same
+instructions. Intrinsics are provided to convert pointers between the generic
+and non-generic address spaces.
+
+See :ref:`address_spaces` and :ref:`nvptx_intrinsics` for more information.
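+
+As a quick reference, the sketch below maps address space numbers to the
+memory types listed above. The numeric values are the ones used by the NVPTX
+back-end (see the address space section referenced above); the enum and
+function names are hypothetical and exist only for illustration.
+
+.. code-block:: c++
+
+  #include <string>
+
+  // Hypothetical enum; the values mirror the NVPTX address space numbers.
+  // Address space 2 is reserved for internal use and is not modeled here.
+  enum class NVPTXAddrSpace : unsigned {
+    Generic = 0,
+    Global = 1,
+    Shared = 3,
+    Constant = 4,
+    Local = 5
+  };
+
+  // Returns a readable name for an address space number, or "unknown".
+  std::string addrSpaceName(unsigned AS) {
+    switch (static_cast<NVPTXAddrSpace>(AS)) {
+    case NVPTXAddrSpace::Generic:  return "generic";
+    case NVPTXAddrSpace::Global:   return "global";
+    case NVPTXAddrSpace::Shared:   return "shared";
+    case NVPTXAddrSpace::Constant: return "constant";
+    case NVPTXAddrSpace::Local:    return "local";
+    }
+    return "unknown";
+  }
+
+In the tutorial kernel, ``float addrspace(1)*`` is therefore a pointer into
+global memory.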
+
+
+Kernel Metadata
+^^^^^^^^^^^^^^^
+
+In PTX, a function can be either a `kernel` function (callable from the host
+program), or a `device` function (callable only from GPU code). You can think
+of `kernel` functions as entry-points in the GPU program. To mark an LLVM IR
+function as a `kernel` function, we make use of special LLVM metadata. The
+NVPTX back-end will look for a named metadata node called
+``nvvm.annotations``. This named metadata must contain a list of metadata that
+describe the IR. For our purposes, we need to declare a metadata node that
+assigns the "kernel" attribute to the LLVM IR function that should be emitted
+as a PTX `kernel` function. These metadata nodes take the form:
+
+.. code-block:: text
+
+  !{<function ref>, !"kernel", i32 1}
+
+For the previous example, we have:
+
+.. code-block:: llvm
+
+  !nvvm.annotations = !{!0}
+  !0 = !{void (float addrspace(1)*,
+               float addrspace(1)*,
+               float addrspace(1)*)* @kernel, !"kernel", i32 1}
+
+Here, we have a single metadata declaration in ``nvvm.annotations``. This
+metadata annotates our ``@kernel`` function with the ``kernel`` attribute.
+
+
+Running the Kernel
+------------------
+
+Generating PTX from LLVM IR is all well and good, but how do we execute it on
+a real GPU device? The CUDA Driver API provides a convenient mechanism for
+loading and JIT compiling PTX to a native GPU device, and launching a kernel.
+The API is similar to OpenCL.  A simple example showing how to load and
+execute our vector addition code is shown below. Note that for brevity this
+code does not perform much error checking!
+
+.. note::
+
+  You can also use the ``ptxas`` tool provided by the CUDA Toolkit to offline
+  compile PTX to machine code (SASS) for a specific GPU architecture. Such
+  binaries can be loaded by the CUDA Driver API in the same way as PTX. This
+  can be useful for reducing startup time by precompiling the PTX kernels.
+
+
+.. code-block:: c++
+
+  #include <iostream>
+  #include <fstream>
+  #include <cassert>
+  #include "cuda.h"
+
+
+  void checkCudaErrors(CUresult err) {
+    assert(err == CUDA_SUCCESS);
+  }
+
+  /// main - Program entry point
+  int main(int argc, char **argv) {
+    CUdevice    device;
+    CUmodule    cudaModule;
+    CUcontext   context;
+    CUfunction  function;
+    CUlinkState linker;
+    int         devCount;
+
+    // CUDA initialization
+    checkCudaErrors(cuInit(0));
+    checkCudaErrors(cuDeviceGetCount(&devCount));
+    checkCudaErrors(cuDeviceGet(&device, 0));
+
+    char name[128];
+    checkCudaErrors(cuDeviceGetName(name, 128, device));
+    std::cout << "Using CUDA Device [0]: " << name << "\n";
+
+    int devMajor, devMinor;
+    checkCudaErrors(cuDeviceComputeCapability(&devMajor, &devMinor, device));
+    std::cout << "Device Compute Capability: "
+              << devMajor << "." << devMinor << "\n";
+    if (devMajor < 2) {
+      std::cerr << "ERROR: Device 0 is not SM 2.0 or greater\n";
+      return 1;
+    }
+
+    std::ifstream t("kernel.ptx");
+    if (!t.is_open()) {
+      std::cerr << "kernel.ptx not found\n";
+      return 1;
+    }
+    std::string str((std::istreambuf_iterator<char>(t)),
+                      std::istreambuf_iterator<char>());
+
+    // Create driver context
+    checkCudaErrors(cuCtxCreate(&context, 0, device));
+
+    // Create module for object
+    checkCudaErrors(cuModuleLoadDataEx(&cudaModule, str.c_str(), 0, 0, 0));
+
+    // Get kernel function
+    checkCudaErrors(cuModuleGetFunction(&function, cudaModule, "kernel"));
+
+    // Device data
+    CUdeviceptr devBufferA;
+    CUdeviceptr devBufferB;
+    CUdeviceptr devBufferC;
+
+    checkCudaErrors(cuMemAlloc(&devBufferA, sizeof(float)*16));
+    checkCudaErrors(cuMemAlloc(&devBufferB, sizeof(float)*16));
+    checkCudaErrors(cuMemAlloc(&devBufferC, sizeof(float)*16));
+
+    float* hostA = new float[16];
+    float* hostB = new float[16];
+    float* hostC = new float[16];
+
+    // Populate input
+    for (unsigned i = 0; i != 16; ++i) {
+      hostA[i] = (float)i;
+      hostB[i] = (float)(2*i);
+      hostC[i] = 0.0f;
+    }
+
+    checkCudaErrors(cuMemcpyHtoD(devBufferA, &hostA[0], sizeof(float)*16));
+    checkCudaErrors(cuMemcpyHtoD(devBufferB, &hostB[0], sizeof(float)*16));
+
+
+    unsigned blockSizeX = 16;
+    unsigned blockSizeY = 1;
+    unsigned blockSizeZ = 1;
+    unsigned gridSizeX  = 1;
+    unsigned gridSizeY  = 1;
+    unsigned gridSizeZ  = 1;
+
+    // Kernel parameters
+    void *KernelParams[] = { &devBufferA, &devBufferB, &devBufferC };
+
+    std::cout << "Launching kernel\n";
+
+    // Kernel launch
+    checkCudaErrors(cuLaunchKernel(function, gridSizeX, gridSizeY, gridSizeZ,
+                                   blockSizeX, blockSizeY, blockSizeZ,
+                                   0, NULL, KernelParams, NULL));
+
+    // Retrieve device data
+    checkCudaErrors(cuMemcpyDtoH(&hostC[0], devBufferC, sizeof(float)*16));
+
+
+    std::cout << "Results:\n";
+    for (unsigned i = 0; i != 16; ++i) {
+      std::cout << hostA[i] << " + " << hostB[i] << " = " << hostC[i] << "\n";
+    }
+
+
+    // Clean up after ourselves
+    delete [] hostA;
+    delete [] hostB;
+    delete [] hostC;
+
+    // Clean-up
+    checkCudaErrors(cuMemFree(devBufferA));
+    checkCudaErrors(cuMemFree(devBufferB));
+    checkCudaErrors(cuMemFree(devBufferC));
+    checkCudaErrors(cuModuleUnload(cudaModule));
+    checkCudaErrors(cuCtxDestroy(context));
+
+    return 0;
+  }
+
+
+You will need to link with the CUDA driver and specify the path to cuda.h.
+
+.. code-block:: text
+
+  # clang++ sample.cpp -o sample -O2 -g -I/usr/local/cuda-5.5/include -lcuda
+
+We don't need to specify a path to ``libcuda.so`` since this is installed in a
+system location by the driver, not the CUDA toolkit.
+
+If everything goes as planned, you should see the following output when
+running the compiled program:
+
+.. code-block:: text
+
+  Using CUDA Device [0]: GeForce GTX 680
+  Device Compute Capability: 3.0
+  Launching kernel
+  Results:
+  0 + 0 = 0
+  1 + 2 = 3
+  2 + 4 = 6
+  3 + 6 = 9
+  4 + 8 = 12
+  5 + 10 = 15
+  6 + 12 = 18
+  7 + 14 = 21
+  8 + 16 = 24
+  9 + 18 = 27
+  10 + 20 = 30
+  11 + 22 = 33
+  12 + 24 = 36
+  13 + 26 = 39
+  14 + 28 = 42
+  15 + 30 = 45
+
+.. note::
+
+  You will likely see a different device identifier based on your hardware.
+
+
+Tutorial: Linking with Libdevice
+================================
+
+In this tutorial, we show a simple example of linking LLVM IR with the
+libdevice library. We will use the same kernel as the previous tutorial,
+except that we will compute ``C = pow(A, B)`` instead of ``C = A + B``.
+Libdevice provides an ``__nv_powf`` function that we will use.
+
+.. code-block:: llvm
+
+  target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
+  target triple = "nvptx64-nvidia-cuda"
+
+  ; Intrinsic to read X component of thread ID
+  declare i32 @llvm.nvvm.read.ptx.sreg.tid.x() readnone nounwind
+  ; libdevice function
+  declare float @__nv_powf(float, float)
+
+  define void @kernel(float addrspace(1)* %A,
+                      float addrspace(1)* %B,
+                      float addrspace(1)* %C) {
+  entry:
+    ; What is my ID?
+    %id = tail call i32 @llvm.nvvm.read.ptx.sreg.tid.x() readnone nounwind
+
+    ; Compute pointers into A, B, and C
+    %ptrA = getelementptr float, float addrspace(1)* %A, i32 %id
+    %ptrB = getelementptr float, float addrspace(1)* %B, i32 %id
+    %ptrC = getelementptr float, float addrspace(1)* %C, i32 %id
+
+    ; Read A, B
+    %valA = load float, float addrspace(1)* %ptrA, align 4
+    %valB = load float, float addrspace(1)* %ptrB, align 4
+
+    ; Compute C = pow(A, B)
+    %valC = call float @__nv_powf(float %valA, float %valB)
+
+    ; Store back to C
+    store float %valC, float addrspace(1)* %ptrC, align 4
+
+    ret void
+  }
+
+  !nvvm.annotations = !{!0}
+  !0 = !{void (float addrspace(1)*,
+               float addrspace(1)*,
+               float addrspace(1)*)* @kernel, !"kernel", i32 1}
+
+
+To compile this kernel, we perform the following steps:
+
+1. Link with libdevice
+2. Internalize all but the public kernel function
+3. Run ``NVVMReflect`` and set ``__CUDA_FTZ`` to 0
+4. Optimize the linked module
+5. Codegen the module
+
+
+These steps can be performed by the LLVM ``llvm-link``, ``opt``, and ``llc``
+tools. In a complete compiler, these steps can also be performed entirely
+programmatically by setting up an appropriate pass configuration (see
+:ref:`libdevice`).
+
+.. code-block:: text
+
+  # llvm-link t2.bc libdevice.compute_20.10.bc -o t2.linked.bc
+  # opt -internalize -internalize-public-api-list=kernel -nvvm-reflect-list=__CUDA_FTZ=0 -nvvm-reflect -O3 t2.linked.bc -o t2.opt.bc
+  # llc -mcpu=sm_20 t2.opt.bc -o t2.ptx
+
+.. note::
+
+  The ``-nvvm-reflect-list=__CUDA_FTZ=0`` is not strictly required, as any
+  undefined variables will default to zero. It is shown here for evaluation
+  purposes.
+
+
+This gives us the following PTX (excerpt):
+
+.. code-block:: text
+
+  //
+  // Generated by LLVM NVPTX Back-End
+  //
+
+  .version 3.1
+  .target sm_20
+  .address_size 64
+
+    // .globl kernel
+                                          // @kernel
+  .visible .entry kernel(
+    .param .u64 kernel_param_0,
+    .param .u64 kernel_param_1,
+    .param .u64 kernel_param_2
+  )
+  {
+    .reg .pred  %p<30>;
+    .reg .f32   %f<111>;
+    .reg .s32   %r<21>;
+    .reg .s64   %rl<8>;
+
+  // %bb.0:                                // %entry
+    ld.param.u64  %rl2, [kernel_param_0];
+    mov.u32   %r3, %tid.x;
+    ld.param.u64  %rl3, [kernel_param_1];
+    mul.wide.s32  %rl4, %r3, 4;
+    add.s64   %rl5, %rl2, %rl4;
+    ld.param.u64  %rl6, [kernel_param_2];
+    add.s64   %rl7, %rl3, %rl4;
+    add.s64   %rl1, %rl6, %rl4;
+    ld.global.f32   %f1, [%rl5];
+    ld.global.f32   %f2, [%rl7];
+    setp.eq.f32 %p1, %f1, 0f3F800000;
+    setp.eq.f32 %p2, %f2, 0f00000000;
+    or.pred   %p3, %p1, %p2;
+    @%p3 bra  BB0_1;
+    bra.uni   BB0_2;
+  BB0_1:
+    mov.f32   %f110, 0f3F800000;
+    st.global.f32   [%rl1], %f110;
+    ret;
+  BB0_2:                                  // %__nv_isnanf.exit.i
+    abs.f32   %f4, %f1;
+    setp.gtu.f32  %p4, %f4, 0f7F800000;
+    @%p4 bra  BB0_4;
+  // %bb.3:                                // %__nv_isnanf.exit5.i
+    abs.f32   %f5, %f2;
+    setp.le.f32 %p5, %f5, 0f7F800000;
+    @%p5 bra  BB0_5;
+  BB0_4:                                  // %.critedge1.i
+    add.f32   %f110, %f1, %f2;
+    st.global.f32   [%rl1], %f110;
+    ret;
+  BB0_5:                                  // %__nv_isinff.exit.i
+
+    ...
+
+  BB0_26:                                 // %__nv_truncf.exit.i.i.i.i.i
+    mul.f32   %f90, %f107, 0f3FB8AA3B;
+    cvt.rzi.f32.f32 %f91, %f90;
+    mov.f32   %f92, 0fBF317200;
+    fma.rn.f32  %f93, %f91, %f92, %f107;
+    mov.f32   %f94, 0fB5BFBE8E;
+    fma.rn.f32  %f95, %f91, %f94, %f93;
+    mul.f32   %f89, %f95, 0f3FB8AA3B;
+    // inline asm
+    ex2.approx.ftz.f32 %f88,%f89;
+    // inline asm
+    add.f32   %f96, %f91, 0f00000000;
+    ex2.approx.f32  %f97, %f96;
+    mul.f32   %f98, %f88, %f97;
+    setp.lt.f32 %p15, %f107, 0fC2D20000;
+    selp.f32  %f99, 0f00000000, %f98, %p15;
+    setp.gt.f32 %p16, %f107, 0f42D20000;
+    selp.f32  %f110, 0f7F800000, %f99, %p16;
+    setp.eq.f32 %p17, %f110, 0f7F800000;
+    @%p17 bra   BB0_28;
+  // %bb.27:
+    fma.rn.f32  %f110, %f110, %f108, %f110;
+  BB0_28:                                 // %__internal_accurate_powf.exit.i
+    setp.lt.f32 %p18, %f1, 0f00000000;
+    setp.eq.f32 %p19, %f3, 0f3F800000;
+    and.pred    %p20, %p18, %p19;
+    @!%p20 bra  BB0_30;
+    bra.uni   BB0_29;
+  BB0_29:
+    mov.b32    %r9, %f110;
+    xor.b32   %r10, %r9, -2147483648;
+    mov.b32    %f110, %r10;
+  BB0_30:                                 // %__nv_powf.exit
+    st.global.f32   [%rl1], %f110;
+    ret;
+  }
+

Added: www-releases/trunk/9.0.0/docs/_sources/ORCv2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/ORCv2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/ORCv2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/ORCv2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,632 @@
+===============================
+ORC Design and Implementation
+===============================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+This document aims to provide a high-level overview of the design and
+implementation of the ORC JIT APIs. Except where otherwise stated, all
+discussion applies to the design of the APIs as of LLVM version 9 (ORCv2).
+
+Use-cases
+=========
+
+ORC provides a modular API for building JIT compilers. There are a range
+of use cases for such an API. For example:
+
+1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
+compiled from a toy language: Kaleidoscope.
+
+2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
+evaluation. In this use case, cross compilation allows expressions compiled
+in the debugger process to be executed on the debug target process, which may
+be on a different device/architecture.
+
+3. In high-performance JITs (e.g. JVMs, Julia) that want to make use of LLVM's
+optimizations within an existing JIT infrastructure.
+
+4. In interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter.
+
+By adoping a modular, library-based design we aim to make ORC useful in as many
+of these contexts as possible.
+
+Features
+========
+
+ORC provides the following features:
+
+- *JIT-linking* links relocatable object files (COFF, ELF, MachO) [1]_ into a
+  target process at runtime. The target process may be the same process that
+  contains the JIT session object and jit-linker, or may be another process
+  (even one running on a different machine or architecture) that communicates
+  with the JIT via RPC.
+
+- *LLVM IR compilation*, which is provided by off-the-shelf components
+  (IRCompileLayer, SimpleCompiler, ConcurrentIRCompiler) that make it easy to
+  add LLVM IR to a JIT'd process.
+
+- *Eager and lazy compilation*. By default, ORC will compile symbols as soon as
+  they are looked up in the JIT session object (``ExecutionSession``). Compiling
+  eagerly by default makes it easy to use ORC as a simple in-memory compiler for
+  an existing JIT. ORC also provides a simple mechanism, lazy-reexports, for
+  deferring compilation until first call.
+
+- *Support for custom compilers and program representations*. Clients can supply
+  custom compilers for each symbol that they define in their JIT session. ORC
+  will run the user-supplied compiler when a definition of a symbol is
+  needed. ORC is actually fully language agnostic: LLVM IR is not treated
+  specially, and is supported via the same wrapper mechanism (the
+  ``MaterializationUnit`` class) that is used for custom compilers.
+
+- *Concurrent JIT'd code* and *concurrent compilation*. JIT'd code may spawn
+  multiple threads, and may re-enter the JIT (e.g. for lazy compilation)
+  concurrently from multiple threads. The ORC APIs also support running multiple
+  compilers concurrently, and provide off-the-shelf infrastructure to track
+  dependencies on running compiles (e.g. to ensure that we never call into code
+  until it is safe to do so, even if that involves waiting on multiple
+  compiles).
+
+- *Orthogonality* and *composability*: Each of the features above can be used (or
+  not) independently. It is possible to put ORC components together to make a
+  non-lazy, in-process, single threaded JIT or a lazy, out-of-process,
+  concurrent JIT, or anything in between.
+
+LLJIT and LLLazyJIT
+===================
+
+ORC provides two basic JIT classes off-the-shelf. These are useful both as
+examples of how to assemble ORC components to make a JIT, and as replacements
+for earlier LLVM JIT APIs (e.g. MCJIT).
+
+The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
+compilation of LLVM IR and linking of relocatable object files. All operations
+are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
+as soon as you attempt to look up its address). LLJIT is a suitable replacement
+for MCJIT in most cases (note: some more advanced features, e.g.
+JITEventListeners, are not supported yet).
+
+The LLLazyJIT extends LLJIT and adds a CompileOnDemandLayer to enable lazy
+compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule
+method, function bodies in that module will not be compiled until they are first
+called. LLLazyJIT aims to provide a replacement for LLVM's original (pre-MCJIT)
+JIT API.
+
+LLJIT and LLLazyJIT instances can be created using their respective builder
+classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
+module ``M`` loaded on a ThreadSafeContext ``Ctx``:
+
+.. code-block:: c++
+
+  // Try to detect the host arch and construct an LLJIT instance.
+  auto JIT = LLJITBuilder().create();
+
+  // If we could not construct an instance, return an error.
+  if (!JIT)
+    return JIT.takeError();
+
+  // Add the module.
+  if (auto Err = JIT->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
+    return Err;
+
+  // Look up the JIT'd code entry point.
+  auto EntrySym = JIT->lookup("entry");
+  if (!EntrySym)
+    return EntrySym.takeError();
+
+  auto *Entry = (void(*)())EntrySym->getAddress();
+
+  Entry();
+
+The builder classes provide a number of configuration options that can be
+specified before the JIT instance is constructed. For example:
+
+.. code-block:: c++
+
+  // Build an LLLazyJIT instance that uses four worker threads for compilation,
+  // and jumps to a specific error handler (rather than null) on lazy compile
+  // failures.
+
+  void handleLazyCompileFailure() {
+    // JIT'd code will jump here if lazy compilation fails, giving us an
+    // opportunity to exit or throw an exception into JIT'd code.
+    throw JITFailed();
+  }
+
+  auto JIT = LLLazyJITBuilder()
+               .setNumCompileThreads(4)
+               .setLazyCompileFailureAddr(
+                   toJITTargetAddress(&handleLazyCompileFailure))
+               .create();
+
+  // ...
+
+For users wanting to get started with LLJIT a minimal example program can be
+found at ``llvm/examples/HowToUseLLJIT``.
+
+Design Overview
+===============
+
+ORC's JIT'd program model aims to emulate the linking and symbol resolution
+rules used by the static and dynamic linkers. This allows ORC to JIT
+arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
+clang) that uses constructs like symbol linkage and visibility, and weak and
+common symbol definitions.
+
+To see how this works, imagine a program ``myapp`` which links against a pair
+of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
+program might look like:
+
+.. code-block:: bash
+
+  $ clang++ -shared -o libA.dylib a1.cpp a2.cpp
+  $ clang++ -shared -o libB.dylib b1.cpp b2.cpp
+  $ clang++ -o myapp myapp.cpp -L. -lA -lB
+  $ ./myapp
+
+In ORC, this would translate into API calls on a "CXXCompileLayer" (with error
+checking omitted for brevity) as:
+
+.. code-block:: c++
+
+  ExecutionSession ES;
+  RTDyldObjectLinkingLayer ObjLinkingLayer(
+      ES, []() { return llvm::make_unique<SectionMemoryManager>(); });
+  CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);
+
+  // Create JITDylib "A" and add code to it using the CXX layer.
+  auto &LibA = ES.createJITDylib("A");
+  CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
+  CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));
+
+  // Create JITDylib "B" and add code to it using the CXX layer.
+  auto &LibB = ES.createJITDylib("B");
+  CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
+  CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));
+
+  // Specify the search order for the main JITDylib. This is equivalent to a
+  // "links against" relationship in a command-line link.
+  ES.getMainJITDylib().setSearchOrder({{&LibA, false}, {&LibB, false}});
+  CXXLayer.add(ES.getMainJITDylib(), MemoryBuffer::getFile("main.cpp"));
+
+  // Look up the JIT'd main, cast it to a function pointer, then call it.
+  auto MainSym = ExitOnErr(ES.lookup({&ES.getMainJITDylib()}, "main"));
+  auto *Main = (int(*)(int, char*[]))MainSym.getAddress();
+
+  int Result = Main(...);
+
+This example tells us nothing about *how* or *when* compilation will happen.
+That will depend on the implementation of the hypothetical CXXCompileLayer.
+The same linker-based symbol resolution rules will apply regardless of that
+implementation, however. For example, if a1.cpp and a2.cpp both define a
+function "foo" then ORCv2 will generate a duplicate definition error. On the
+other hand, if a1.cpp and b1.cpp both define "foo" there is no error (different
+dynamic libraries may define the same symbol). If main.cpp refers to "foo", it
+should bind to the definition in LibA rather than the one in LibB, since
+main.cpp is part of the "main" dylib, and the main dylib links against LibA
+before LibB.
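+
+The resolution rules in the paragraph above can be modeled without any LLVM
+code. The sketch below is a toy, not the ORC implementation: ``ToyDylib`` and
+``resolve`` are hypothetical names, and placing the main dylib first in the
+search order is an assumption of the model.
+
+.. code-block:: c++
+
+  #include <set>
+  #include <string>
+  #include <vector>
+
+  // Hypothetical stand-in for a JITDylib: a named set of defined symbols.
+  struct ToyDylib {
+    std::string Name;
+    std::set<std::string> Symbols;
+
+    // Returns false if Sym is already defined in this dylib, which models
+    // the duplicate definition error.
+    bool define(const std::string &Sym) { return Symbols.insert(Sym).second; }
+  };
+
+  // Walks the search order and returns the first dylib that defines Sym,
+  // or nullptr if none of them does.
+  const ToyDylib *resolve(const std::vector<const ToyDylib *> &SearchOrder,
+                          const std::string &Sym) {
+    for (const ToyDylib *D : SearchOrder)
+      if (D->Symbols.count(Sym))
+        return D;
+    return nullptr;
+  }
+
+Defining ``foo`` twice in the model of LibA fails, defining it once each in
+LibA and LibB succeeds, and resolving ``foo`` with the order main, LibA, LibB
+returns LibA.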
+
+Many JIT clients will have no need for this strict adherence to the usual
+ahead-of-time linking rules, and should be able to get by just fine by putting
+all of their code in a single JITDylib. However, clients who want to JIT code
+for languages/projects that traditionally rely on ahead-of-time linking (e.g.
+C++) will find that this feature makes life much easier.
+
+Symbol lookup in ORC serves two other important functions, beyond providing
+addresses for symbols: (1) It triggers compilation of the symbol(s) searched for
+(if they have not been compiled already), and (2) it provides the
+synchronization mechanism for concurrent compilation. The pseudo-code for the
+lookup process is:
+
+.. code-block:: none
+
+  construct a query object from a query set and query handler
+  lock the session
+  lodge query against requested symbols, collect required materializers (if any)
+  unlock the session
+  dispatch materializers (if any)
+
+In this context a materializer is something that provides a working definition
+of a symbol upon request. Usually materializers are just wrappers for compilers,
+but they may also wrap a jit-linker directly (if the program representation
+backing the definitions is an object file), or may even be a class that writes
+bits directly into memory (for example, if the definitions are
+stubs). Materialization is the blanket term for any actions (compiling, linking,
+splatting bits, registering with runtimes, etc.) that are required to generate a
+symbol definition that is safe to call or access.
+
+As each materializer completes its work it notifies the JITDylib, which in turn
+notifies any query objects that are waiting on the newly materialized
+definitions. Each query object maintains a count of the number of symbols that
+it is still waiting on, and once this count reaches zero the query object calls
+the query handler with a *SymbolMap* (a map of symbol names to addresses)
+describing the result. If any symbol fails to materialize the query immediately
+calls the query handler with an error.
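+
+The counting scheme described above can be sketched as a small host-only
+model. ``ToyQuery`` and ``ToySymbolMap`` are hypothetical names, not ORC
+classes, and the error path (calling the handler with an error when a symbol
+fails to materialize) is left out to keep the sketch short.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <cstdint>
+  #include <functional>
+  #include <map>
+  #include <set>
+  #include <string>
+  #include <utility>
+
+  // Hypothetical stand-in for the SymbolMap: symbol name -> address.
+  using ToySymbolMap = std::map<std::string, std::uint64_t>;
+
+  // Hypothetical query object. It runs the handler once, when the last
+  // symbol it was waiting on has been resolved.
+  class ToyQuery {
+  public:
+    ToyQuery(std::set<std::string> Symbols,
+             std::function<void(const ToySymbolMap &)> Handler)
+        : Pending(std::move(Symbols)), OnComplete(std::move(Handler)) {}
+
+    // Number of symbols this query is still waiting on.
+    std::size_t outstanding() const { return Pending.size(); }
+
+    // Called when a materializer has produced an address for Name.
+    void notifyResolved(const std::string &Name, std::uint64_t Addr) {
+      if (Pending.erase(Name) == 0)
+        return; // Not waiting on this symbol (unknown or already resolved).
+      Result[Name] = Addr;
+      if (Pending.empty())
+        OnComplete(Result);
+    }
+
+  private:
+    std::set<std::string> Pending;
+    std::function<void(const ToySymbolMap &)> OnComplete;
+    ToySymbolMap Result;
+  };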
+
+The collected materialization units are sent to the ExecutionSession to be
+dispatched, and the dispatch behavior can be set by the client. By default each
+materializer is run on the calling thread. Clients are free to create new
+threads to run materializers, or to send the work to a work queue for a thread
+pool (this is what LLJIT/LLLazyJIT do).
+
+Top Level APIs
+==============
+
+Many of ORC's top-level APIs are visible in the example above:
+
+- *ExecutionSession* represents the JIT'd program and provides context for the
+  JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the
+  materializers.
+
+- *JITDylibs* provide the symbol tables.
+
+- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
+  allow clients to add uncompiled program representations supported by those
+  compilers to JITDylibs.
+
+Several other important APIs are used explicitly. JIT clients need not be aware
+of them, but Layer authors will use them:
+
+- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
+  program representation (in this example, C++ source) in a MaterializationUnit,
+  which is then stored in the JITDylib. MaterializationUnits are responsible for
+  describing the definitions they provide, and for unwrapping the program
+  representation and passing it back to the layer when compilation is required
+  (this ownership shuffle makes writing thread-safe layers easier, since the
+  ownership of the program representation will be passed back on the stack,
+  rather than having to be fished out of a Layer member, which would require
+  synchronization).
+
+- *MaterializationResponsibility* - When a MaterializationUnit hands a program
+  representation back to the layer it comes with an associated
+  MaterializationResponsibility object. This object tracks the definitions
+  that must be materialized and provides a way to notify the JITDylib once they
+  are either successfully materialized or a failure occurs.
+
+Handy utilities
+===============
+
+TBD: absolute symbols, aliases, off-the-shelf layers.
+
+Laziness
+========
+
+Laziness in ORC is provided by a utility called "lazy-reexports". The aim of
+this utility is to re-use the synchronization provided by the symbol lookup
+mechanism to make it safe to lazily compile functions, even if calls to the
+stub occur simultaneously on multiple threads of JIT'd code. It does this by
+reducing lazy compilation to symbol lookup: The lazy stub performs a lookup of
+its underlying definition on first call, updating the function body pointer
+once the definition is available. If additional calls arrive on other threads
+while compilation is ongoing they will be safely blocked by the normal lookup
+synchronization guarantee (no result until the result is safe) and can also
+proceed as soon as compilation completes.
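+
+As a rough illustration of reducing lazy compilation to lookup, the sketch
+below models a stub in plain C++. ``ToyLazyStub`` is a hypothetical name, the
+``LookupFn`` callback stands in for an ORC symbol lookup, and a ``std::mutex``
+substitutes for the lookup synchronization that the real utility relies on.
+
```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical lazy stub for functions of type int(int). The first call runs
// the lookup callback (standing in for an ORC symbol lookup, which would
// trigger compilation) and caches the resulting function body pointer.
class ToyLazyStub {
public:
  using Fn = int (*)(int);

  explicit ToyLazyStub(std::function<Fn()> LookupFn)
      : Lookup(std::move(LookupFn)) {}

  int operator()(int Arg) {
    Fn Target;
    {
      // Callers arriving while the lookup is in flight block here, and
      // proceed once the body pointer has been updated.
      std::lock_guard<std::mutex> Guard(M);
      if (!Body)
        Body = Lookup();
      Target = Body;
    }
    return Target(Arg);
  }

private:
  std::function<Fn()> Lookup;
  std::mutex M;
  Fn Body = nullptr;
};
```
+
+However many times the stub is called, the lookup callback runs only on the
+first call; later calls go straight to the cached body pointer.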
+
+TBD: Usage example.
+
+Supporting Custom Compilers
+===========================
+
+TBD.
+
+Transitioning from ORCv1 to ORCv2
+=================================
+
+Since LLVM 7.0, new ORC development work has focused on adding support for
+concurrent JIT compilation. The new APIs (including new layer interfaces and
+implementations, and new utilities) that support concurrency are collectively
+referred to as ORCv2, and the original, non-concurrent layers and utilities
+are now referred to as ORCv1.
+
+The majority of the ORCv1 layers and utilities were renamed with a 'Legacy'
+prefix in LLVM 8.0, and have deprecation warnings attached in LLVM 9.0. In LLVM
+10.0 ORCv1 will be removed entirely.
+
+Transitioning from ORCv1 to ORCv2 should be easy for most clients. Most of the
+ORCv1 layers and utilities have ORCv2 counterparts [2]_ that can be directly
+substituted. However there are some design differences between ORCv1 and ORCv2
+to be aware of:
+
+  1. ORCv2 fully adopts the JIT-as-linker model that began with MCJIT. Modules
+     (and other program representations, e.g. Object Files) are no longer added
+     directly to JIT classes or layers. Instead, they are added to ``JITDylib``
+     instances *by* layers. The ``JITDylib`` determines *where* the definitions
+     reside, the layers determine *how* the definitions will be compiled.
+     Linkage relationships between ``JITDylibs`` determine how inter-module
+     references are resolved, and symbol resolvers are no longer used. See the
+     section `Design Overview`_ for more details.
+
+     Unless multiple JITDylibs are needed to model linkage relationships, ORCv1
+     clients should place all code in the main JITDylib (returned by
+     ``ExecutionSession::getMainJITDylib()``). MCJIT clients should use LLJIT
+     (see `LLJIT and LLLazyJIT`_).
+
+  2. All JIT stacks now need an ``ExecutionSession`` instance. ExecutionSession
+     manages the string pool, error reporting, synchronization, and symbol
+     lookup.
+
+  3. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) rather than
+     string values in order to reduce memory overhead and improve lookup
+     performance. See the subsection `How to manage symbol strings`_.
+
+  4. IR layers require ThreadSafeModule instances, rather than
+     std::unique_ptr<Module>s. ThreadSafeModule is a wrapper that ensures that
+     Modules that use the same LLVMContext are not accessed concurrently.
+     See `How to use ThreadSafeModule and ThreadSafeContext`_.
+
+  5. Symbol lookup is no longer handled by layers. Instead, there is a
+     ``lookup`` method on ExecutionSession that takes a list of JITDylibs to scan.
+
+     .. code-block:: c++
+
+       ExecutionSession ES;
+       JITDylib &JD1 = ...;
+       JITDylib &JD2 = ...;
+
+       auto Sym = ES.lookup({&JD1, &JD2}, ES.intern("_main"));
+
+  6. Module removal is not yet supported. There is no equivalent of the
+     layer concept removeModule/removeObject methods. Work on resource tracking
+     and removal in ORCv2 is ongoing.
+
+For code examples and suggestions of how to use the ORCv2 APIs, please see
+the section `How-tos`_.
+
+How-tos
+=======
+
+How to manage symbol strings
+############################
+
+Symbol strings in ORC are uniqued to improve lookup performance, reduce memory
+overhead, and allow symbol names to function as efficient keys. To get the
+unique ``SymbolStringPtr`` for a string value, call the
+``ExecutionSession::intern`` method:
+
+  .. code-block:: c++
+
+    ExecutionSession ES;
+    /// ...
+    auto MainSymbolName = ES.intern("main");
+
+If you wish to perform lookup using the C/IR name of a symbol you will also
+need to apply the platform linker-mangling before interning the string. On
+Linux this mangling is a no-op, but on other platforms it usually involves
+adding a prefix to the string (e.g. '_' on Darwin). The mangling scheme is
+based on the DataLayout for the target. Given a DataLayout and an
+ExecutionSession, you can create a MangleAndInterner function object that
+will perform both jobs for you:
+
+  .. code-block:: c++
+
+    ExecutionSession ES;
+    const DataLayout &DL = ...;
+    MangleAndInterner Mangle(ES, DL);
+
+    // ...
+
+    // Portable IR-symbol-name lookup:
+    auto Sym = ES.lookup({&ES.getMainJITDylib()}, Mangle("main"));
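+
+To make the two steps concrete, here is a self-contained sketch of uniquing
+and prefix mangling. ``ToyStringPool`` and ``ToyMangleAndInterner`` are
+hypothetical names, not ORC classes, and the mangling shown handles only a
+single-character global prefix, which is a simplification of what a real
+DataLayout-driven mangler does.
+
```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical string pool: interning returns a pointer to the single pooled
// copy of a string, so equal names compare equal by pointer.
class ToyStringPool {
public:
  const std::string *intern(const std::string &S) {
    // Pointers to unordered_set elements stay valid across rehashing.
    return &*Pool.insert(S).first;
  }

private:
  std::unordered_set<std::string> Pool;
};

// Hypothetical mangle-and-intern helper. GlobalPrefix plays the role of the
// prefix a DataLayout would report: '\0' for none, '_' on Darwin.
class ToyMangleAndInterner {
public:
  ToyMangleAndInterner(ToyStringPool &P, char Prefix)
      : Pool(P), GlobalPrefix(Prefix) {}

  const std::string *operator()(const std::string &Name) const {
    std::string Mangled;
    if (GlobalPrefix != '\0')
      Mangled.push_back(GlobalPrefix);
    Mangled += Name;
    return Pool.intern(Mangled);
  }

private:
  ToyStringPool &Pool;
  char GlobalPrefix;
};
```
+
+With a ``'_'`` prefix, ``Mangle("main")`` yields the pooled string ``_main``;
+with no prefix it yields ``main``, and the two results are distinct pool
+entries.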
+
+How to create JITDylibs and set up linkage relationships
+########################################################
+
+In ORC, all symbol definitions reside in JITDylibs. JITDylibs are created by
+calling the ``ExecutionSession::createJITDylib`` method with a unique name:
+
+  .. code-block:: c++
+
+    ExecutionSession ES;
+    auto &JD = ES.createJITDylib("libFoo.dylib");
+
+The JITDylib is owned by the ``ExecutionSession`` instance and will be freed
+when it is destroyed.
+
+A JITDylib representing the JIT main program is created by ExecutionSession by
+default. A reference to it can be obtained by calling
+``ExecutionSession::getMainJITDylib()``:
+
+  .. code-block:: c++
+
+    ExecutionSession ES;
+    auto &MainJD = ES.getMainJITDylib();
+
+How to use ThreadSafeModule and ThreadSafeContext
+#################################################
+
+ThreadSafeModule and ThreadSafeContext are wrappers around Modules and
+LLVMContexts respectively. A ThreadSafeModule is a pair of a
+std::unique_ptr<Module> and a (possibly shared) ThreadSafeContext value. A
+ThreadSafeContext is a pair of a std::unique_ptr<LLVMContext> and a lock.
+This design serves two purposes: providing both a locking scheme and lifetime
+management for LLVMContexts. The ThreadSafeContext may be locked to prevent
+accidental concurrent access by two Modules that use the same LLVMContext.
+The underlying LLVMContext is freed once all ThreadSafeContext values pointing
+to it are destroyed, allowing the context memory to be reclaimed as soon as
+the Modules referring to it are destroyed.
+
+ThreadSafeContexts can be explicitly constructed from a
+std::unique_ptr<LLVMContext>:
+
+  .. code-block:: c++
+
+    ThreadSafeContext TSCtx(llvm::make_unique<LLVMContext>());
+
+ThreadSafeModules can be constructed from a pair of a std::unique_ptr<Module>
+and a ThreadSafeContext value. ThreadSafeContext values may be shared between
+multiple ThreadSafeModules:
+
+  .. code-block:: c++
+
+    ThreadSafeModule TSM1(
+      llvm::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);
+
+    ThreadSafeModule TSM2(
+      llvm::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);
+
+Before using a ThreadSafeContext, clients should ensure that either the context
+is only accessible on the current thread, or that the context is locked. In the
+example above (where the context is never locked) we rely on the fact that both
+``TSM1``, ``TSM2``, and ``TSCtx`` are all created on one thread. If a context is
+going to be shared between threads then it must be locked before the context,
+or any Modules attached to it, are accessed. When code is added to in-tree IR
+layers this locking is done automatically by the
+``BasicIRLayerMaterializationUnit::materialize`` method. In all other
+situations, for example when writing a custom IR materialization unit, or
+constructing a new ThreadSafeModule from higher-level program representations,
+locking must be done explicitly:
+
+  .. code-block:: c++
+
+    void HighLevelRepresentationLayer::emit(MaterializationResponsibility R,
+                                            HighLevelProgramRepresentation H) {
+      // Get or create a context value that may be shared between threads.
+      ThreadSafeContext TSCtx = getContext();
+
+      // Lock the context to prevent concurrent access.
+      auto Lock = TSCtx.getLock();
+
+      // IRGen a module onto the locked Context.
+      ThreadSafeModule TSM(IRGen(H, *TSCtx.getContext()), TSCtx);
+
+      // Emit the module to the base layer with the context still locked.
+      BaseIRLayer.emit(std::move(R), std::move(TSM));
+    }
+
+Clients wishing to maximize possibilities for concurrent compilation will want
+to create every new ThreadSafeModule on a new ThreadSafeContext. For this reason
+a convenience constructor for ThreadSafeModule is provided that implicitly
+constructs a new ThreadSafeContext value from a std::unique_ptr<LLVMContext>:
+
+  .. code-block:: c++
+
+    // Maximize concurrency opportunities by loading every module on a
+    // separate context.
+    for (const auto &IRPath : IRPaths) {
+      auto Ctx = llvm::make_unique<LLVMContext>();
+      auto M = llvm::make_unique<Module>("M", *Ctx);
+      CompileLayer.add(ES.getMainJITDylib(),
+                       ThreadSafeModule(std::move(M), std::move(Ctx)));
+    }
+
+Clients who plan to run single-threaded may choose to save memory by loading
+all modules on the same context:
+
+  .. code-block:: c++
+
+    // Save memory by using one context for all Modules:
+    ThreadSafeContext TSCtx(llvm::make_unique<LLVMContext>());
+    for (const auto &IRPath : IRPaths) {
+      ThreadSafeModule TSM(parsePath(IRPath, *TSCtx.getContext()), TSCtx);
+      CompileLayer.add(ES.getMainJITDylib(), std::move(TSM));
+    }
+
+How to Add Process and Library Symbols to the JITDylibs
+=======================================================
+
+JIT'd code typically needs access to symbols in the host program or in
+supporting libraries. References to process symbols can be "baked in" to code
+as it is compiled by turning external references into pre-resolved integer
+constants. However, this ties the JIT'd code to the current process's virtual
+memory layout (meaning that it cannot be cached between runs) and makes
+debugging lower level program representations difficult (as all external
+references are opaque integer values). A better solution is to maintain symbolic
+external references and let the JIT linker bind them for you at runtime. To
+allow the JIT linker to find these external definitions their addresses must
+be added to a JITDylib that the JIT'd definitions link against.
+
+Adding definitions for external symbols can be done using the absoluteSymbols
+function:
+
+  .. code-block:: c++
+
+    const DataLayout &DL = getDataLayout();
+    MangleAndInterner Mangle(ES, DL);
+
+    auto &JD = ES.getMainJITDylib();
+
+    JD.define(
+      absoluteSymbols({
+        { Mangle("puts"), pointerToJITTargetAddress(&puts)},
+        { Mangle("gets"), pointerToJITTargetAddress(&gets)}
+      }));
+
+Manually adding absolute symbols for a large or changing interface is
+cumbersome, so ORC provides an alternative that generates new definitions on
+demand:
+*definition generators*. If a definition generator is attached to a JITDylib,
+then any unsuccessful lookup on that JITDylib will fall back to calling the
+definition generator, and the definition generator may choose to generate a new
+definition for the missing symbols. Of particular use here is the
+``DynamicLibrarySearchGenerator`` utility. This can be used to reflect the whole
+exported symbol set of the process or a specific dynamic library, or a subset
+of either of these determined by a predicate.
+
+For example, to load the whole interface of a runtime library:
+
+  .. code-block:: c++
+
+    const DataLayout &DL = getDataLayout();
+    auto &JD = ES.getMainJITDylib();
+
+    JD.setGenerator(DynamicLibrarySearchGenerator::Load("/path/to/lib",
+                                                        DL.getGlobalPrefix()));
+
+    // IR added to JD can now link against all symbols exported by the library
+    // at '/path/to/lib'.
+    CompileLayer.add(JD, loadModule(...));
+
+Or, to expose a whitelisted set of symbols from the main process:
+
+  .. code-block:: c++
+
+    const DataLayout &DL = getDataLayout();
+    MangleAndInterner Mangle(ES, DL);
+
+    auto &JD = ES.getMainJITDylib();
+
+    DenseSet<SymbolStringPtr> Whitelist({
+        Mangle("puts"),
+        Mangle("gets")
+      });
+
+    // Use GetForCurrentProcess with a predicate function that checks the
+    // whitelist.
+    JD.setGenerator(
+      DynamicLibrarySearchGenerator::GetForCurrentProcess(
+        DL.getGlobalPrefix(),
+        [&](const SymbolStringPtr &S) { return Whitelist.count(S); }));
+
+    // IR added to JD can now link against any symbols exported by the process
+    // and contained in the whitelist.
+    CompileLayer.add(JD, loadModule(...));
+
+Future Features
+===============
+
+TBD: Speculative compilation. Object Caches.
+
+.. [1] Formats/architectures vary in terms of supported features. MachO and
+       ELF tend to have better support than COFF. Patches very welcome!
+
+.. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
+       ``RemoteObjectServerLayer`` do not have counterparts in the new
+       system. In the case of ``LazyEmittingLayer`` it was simply no longer
+       needed: in ORCv2, deferring compilation until symbols are looked up is
+       the default. The removal of ``RemoteObjectClientLayer`` and
+       ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split
+       across processes, however this functionality appears not to have been
+       used.
+
+.. [3] Sharing ThreadSafeModules in a concurrent compilation can be dangerous:
+       if interdependent modules are loaded on the same context, but compiled
+       on different threads a deadlock may occur (with each compile waiting for
+       the other(s) to complete, and the other(s) unable to proceed because the
+       context is locked).
+
+.. [4] Mostly. Weak definitions are handled correctly within dylibs, but if
+       multiple dylibs provide a weak definition of a symbol each will end up
+       with its own definition (similar to how weak symbols in Windows DLLs
+       behave). This will be fixed in the future.
\ No newline at end of file

Added: www-releases/trunk/9.0.0/docs/_sources/OptBisect.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/OptBisect.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/OptBisect.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/OptBisect.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,193 @@
+====================================================
+Using -opt-bisect-limit to debug optimization errors
+====================================================
+.. contents::
+   :local:
+   :depth: 1
+
+Introduction
+============
+
+The -opt-bisect-limit option provides a way to disable all optimization passes
+above a specified limit without modifying the way in which the Pass Managers
+are populated.  The intention of this option is to assist in tracking down
+problems where incorrect transformations during optimization result in incorrect
+run-time behavior.
+
+This feature is implemented on an opt-in basis.  Passes which can be safely
+skipped while still allowing correct code generation call a function to
+check the opt-bisect limit before performing optimizations.  Passes which
+either must be run or do not modify the IR do not perform this check and are
+therefore never skipped.  Generally, this means analysis passes, passes
+that are run at CodeGenOpt::None and passes which are required for register
+allocation.
+
+The -opt-bisect-limit option can be used with any tool, including front ends
+such as clang, that uses the core LLVM library for optimization and code
+generation.  The exact syntax for invoking the option is discussed below.
+
+This feature is not intended to replace other debugging tools such as bugpoint.
+Rather it provides an alternate course of action when reproducing the problem
+requires a complex build infrastructure that would make using bugpoint
+impractical or when reproducing the failure requires a sequence of
+transformations that is difficult to replicate with tools like opt and llc.
+
+
+Getting Started
+===============
+
+The -opt-bisect-limit command line option can be passed directly to tools such
+as opt, llc and lli.  The syntax is as follows:
+
+::
+
+  <tool name> [other options] -opt-bisect-limit=<limit>
+
+If a value of -1 is used the tool will perform all optimizations but a message
+will be printed to stderr for each optimization that could be skipped
+indicating the index value that is associated with that optimization.  To skip
+optimizations, pass the value of the last optimization to be performed as the
+opt-bisect-limit.  All optimizations with a higher index value will be skipped.
+
+In order to use the -opt-bisect-limit option with a driver that provides a
+wrapper around the LLVM core library, an additional prefix option may be
+required, as defined by the driver.  For example, to use this option with
+clang, the "-mllvm" prefix must be used.  A typical clang invocation would look
+like this:
+
+::
+
+  clang -O2 -mllvm -opt-bisect-limit=256 my_file.c
+
+The -opt-bisect-limit option may also be applied to link-time optimizations by
+using a prefix to indicate that this is a plug-in option for the linker. The
+following syntax will set a bisect limit for LTO transformations:
+
+::
+
+  # When using lld, or ld64 (macOS)
+  clang -flto -Wl,-mllvm,-opt-bisect-limit=256 my_file.o my_other_file.o
+  # When using Gold
+  clang -flto -Wl,-plugin-opt,-opt-bisect-limit=256 my_file.o my_other_file.o
+
+LTO passes are run by a library instance invoked by the linker. Therefore any
+passes run in the primary driver compilation phase are not affected by options
+passed via '-Wl,-plugin-opt' and LTO passes are not affected by options
+passed to the driver-invoked LLVM invocation via '-mllvm'.
+
+
+Bisection Index Values
+======================
+
+The granularity of the optimizations associated with a single index value is
+variable.  Depending on how the optimization pass has been instrumented the
+value may be associated with as much as all transformations that would have
+been performed by an optimization pass on an IR unit for which it is invoked
+(for instance, during a single call of runOnFunction for a FunctionPass) or as
+little as a single transformation. The index values may also be nested so that
+if an invocation of the pass is not skipped, individual transformations within
+that invocation may still be skipped.
+
+The order of the values assigned is guaranteed to remain stable and consistent
+from one run to the next up to and including the value specified as the limit.
+Above the limit value skipping of optimizations can cause a change in the
+numbering, but because all optimizations above the limit are skipped this
+is not a problem.
+
+When an opt-bisect index value refers to an entire invocation of the run
+function for a pass, the pass will query whether or not it should be skipped
+each time it is invoked and each invocation will be assigned a unique value.
+For example, if a FunctionPass is used with a module containing three functions
+a different index value will be assigned to the pass for each of the functions
+as the pass is run. The pass may be run on two functions but skipped for the
+third.
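+
+The numbering behavior described above can be modeled with a small counter.
+In the sketch below, ``ToyBisect`` and ``runToyPass`` are hypothetical names
+and the code is not LLVM's OptBisect implementation; it only assumes that
+every query takes the next index and that an index above the limit means
+"skip".
+
```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the OptBisect object: every query is assigned the
// next index, and the pass runs only while that index is within the limit.
class ToyBisect {
public:
  explicit ToyBisect(int Limit) : Limit(Limit) {}

  // Returns true if this pass invocation should run. A limit of -1 means
  // "run everything".
  bool shouldRun() {
    ++LastIndex;
    return Limit == -1 || LastIndex <= Limit;
  }

  // Index assigned to the most recent query (0 if there were none).
  int lastIndex() const { return LastIndex; }

private:
  int Limit;
  int LastIndex = 0;
};

// Runs one hypothetical function pass over each function in turn and returns
// the names of the functions on which it actually ran.
std::vector<std::string> runToyPass(ToyBisect &Bisect,
                                    const std::vector<std::string> &Functions) {
  std::vector<std::string> Ran;
  for (const std::string &F : Functions)
    if (Bisect.shouldRun())
      Ran.push_back(F);
  return Ran;
}
```
+
+With a limit of 2 and three functions, the pass runs on the first two
+functions and is skipped for the third, while all three invocations still
+receive distinct index values.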
+
+If the pass internally performs operations on a smaller IR unit the pass must be
+specifically instrumented to enable bisection at this finer level of granularity
+(see below for details).
+
+
+Example Usage
+=============
+
+.. code-block:: console
+
+  $ opt -O2 -o test-opt.bc -opt-bisect-limit=16 test.ll
+
+  BISECT: running pass (1) Simplify the CFG on function (g)
+  BISECT: running pass (2) SROA on function (g)
+  BISECT: running pass (3) Early CSE on function (g)
+  BISECT: running pass (4) Infer set function attributes on module (test.ll)
+  BISECT: running pass (5) Interprocedural Sparse Conditional Constant Propagation on module (test.ll)
+  BISECT: running pass (6) Global Variable Optimizer on module (test.ll)
+  BISECT: running pass (7) Promote Memory to Register on function (g)
+  BISECT: running pass (8) Dead Argument Elimination on module (test.ll)
+  BISECT: running pass (9) Combine redundant instructions on function (g)
+  BISECT: running pass (10) Simplify the CFG on function (g)
+  BISECT: running pass (11) Remove unused exception handling info on SCC (<<null function>>)
+  BISECT: running pass (12) Function Integration/Inlining on SCC (<<null function>>)
+  BISECT: running pass (13) Deduce function attributes on SCC (<<null function>>)
+  BISECT: running pass (14) Remove unused exception handling info on SCC (f)
+  BISECT: running pass (15) Function Integration/Inlining on SCC (f)
+  BISECT: running pass (16) Deduce function attributes on SCC (f)
+  BISECT: NOT running pass (17) Remove unused exception handling info on SCC (g)
+  BISECT: NOT running pass (18) Function Integration/Inlining on SCC (g)
+  BISECT: NOT running pass (19) Deduce function attributes on SCC (g)
+  BISECT: NOT running pass (20) SROA on function (g)
+  BISECT: NOT running pass (21) Early CSE on function (g)
+  BISECT: NOT running pass (22) Speculatively execute instructions if target has divergent branches on function (g)
+  ... etc. ...
+
+
+Pass Skipping Implementation
+============================
+
+The -opt-bisect-limit implementation depends on individual passes opting in to
+the opt-bisect process.  The OptBisect object that manages the process is
+entirely passive and has no knowledge of how any pass is implemented.  When a
+pass that may be skipped is run, it should call the OptBisect object to see
+whether it should be skipped.
+
+The OptBisect object is intended to be accessed through LLVMContext and each
+Pass base class contains a helper function that abstracts the details in order
+to make this check uniform across all passes.  These helper functions are:
+
+.. code-block:: c++
+
+  bool ModulePass::skipModule(Module &M);
+  bool CallGraphSCCPass::skipSCC(CallGraphSCC &SCC);
+  bool FunctionPass::skipFunction(const Function &F);
+  bool BasicBlockPass::skipBasicBlock(const BasicBlock &BB);
+  bool LoopPass::skipLoop(const Loop *L);
+
+A MachineFunctionPass should use FunctionPass::skipFunction() as such:
+
+.. code-block:: c++
+
+  bool MyMachineFunctionPass::runOnMachineFunction(MachineFunction &MF) {
+    // skipFunction() takes the IR Function that backs this MachineFunction.
+    if (skipFunction(MF.getFunction()))
+      return false;
+    // Otherwise, run the pass normally.
+  }
+
+In addition to checking with the OptBisect class to see if the pass should be
+skipped, the skipFunction(), skipLoop() and skipBasicBlock() helper functions
+also look for the presence of the "optnone" function attribute.  The calling
+pass will be unable to determine whether it is being skipped because the
+"optnone" attribute is present or because the opt-bisect-limit has been
+reached.  This is desirable because the behavior should be the same in either
+case.
+
+The majority of LLVM passes which can be skipped have already been instrumented
+in the manner described above.  If you are adding a new pass or believe you
+have found a pass which is not being included in the opt-bisect process but
+should be, you can add it as described above.
+
+
+Adding Finer Granularity
+========================
+
+Once the pass in which an incorrect transformation is performed has been
+determined, it may be useful to perform further analysis in order to determine
+which specific transformation is causing the problem.  Debug counters
+can be used for this purpose.

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/CodeViewSymbols.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/CodeViewSymbols.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/CodeViewSymbols.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/CodeViewSymbols.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,272 @@
+=====================================
+CodeView Symbol Records
+=====================================
+
+
+.. contents::
+   :local:
+
+.. _symbols_intro:
+
+Introduction
+============
+
+This document describes the usage and serialization format of the various
+CodeView symbol records that LLVM understands.  As with
+:doc:`CodeView Type Records <CodeViewTypes>`, we describe only the important
+types which are generated by modern C++ toolchains.
+
+Record Categories
+=================
+
+Symbol records share one major similarity with :doc:`type records <CodeViewTypes>`:
+They start with the same :ref:`record prefix <leaf_types>`, which we will not describe
+again (refer to the previous link for a description).  As a result of this, a sequence
+of symbol records can be processed with largely the same code as that which processes
+type records.  There are several important differences between symbol and type records:
+
+* Symbol records only appear in the :doc:`PublicStream`, :doc:`GlobalStream`, and
+  :doc:`Module Info Streams <ModiStream>`.
+* Type records only appear in the :doc:`TPI & IPI streams <TpiStream>`.
+* While types are referenced from other CodeView records via :ref:`type indices <type_indices>`,
+  symbol records are referenced by the byte offset of the record in the stream that it appears
+  in.
+* Types can reference types (via type indices), and symbols can reference both types (via type
+  indices) and symbols (via offsets), but types can never reference symbols.
+* There is no notion of :ref:`Leaf Records <leaf_types>` and :ref:`Member Records <member_types>`
+  as there are with types.  Every symbol record describes its own length.
+* Certain special symbol records begin a "scope".  For these records, all following records
+  up until the next ``S_END`` record are "children" of this symbol record.  For example,
+  given a symbol record which describes a certain function, all local variables of this
+  function would appear following the function up until the corresponding ``S_END`` record.
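+
+The scope rule in the last bullet can be sketched as a single pass with a
+stack. The sketch is illustrative only: ``ToyRecord`` and ``computeParents``
+are hypothetical names, records are reduced to a kind string plus a flag
+saying whether the kind opens a scope, and the example assumes that
+``S_GPROC32`` and ``S_BLOCK32`` are scope-opening records.
+
```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified symbol record.
struct ToyRecord {
  std::string Kind; // e.g. "S_GPROC32", "S_LOCAL", "S_END"
  bool OpensScope;  // true for records that begin a scope
};

// For each record, returns the index of the scope record that encloses it,
// or -1 if the record is at the top level. An S_END record is reported as a
// child of the scope it closes.
std::vector<int> computeParents(const std::vector<ToyRecord> &Records) {
  std::vector<int> Parents;
  std::vector<int> OpenScopes; // indices of scopes that are not yet closed
  for (int I = 0, E = static_cast<int>(Records.size()); I != E; ++I) {
    const ToyRecord &R = Records[I];
    Parents.push_back(OpenScopes.empty() ? -1 : OpenScopes.back());
    if (R.Kind == "S_END") {
      if (!OpenScopes.empty())
        OpenScopes.pop_back(); // close the innermost open scope
    } else if (R.OpensScope) {
      OpenScopes.push_back(I);
    }
  }
  return Parents;
}
```
+
+For a function record followed by a local, a nested block with its own local,
+and two ``S_END`` records, the locals are attributed to the function and the
+block respectively, and any record after the final ``S_END`` is back at the
+top level.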
+
+Finally, there are three general categories of symbol record, grouped by where they are legal
+to appear in a PDB file: Public Symbols (which appear only in the
+:doc:`publics stream <PublicStream>`), Global Symbols (which appear only in the
+:doc:`globals stream <GlobalStream>`) and module symbols (which appear in the
+:doc:`module info stream <ModiStream>`).
+
+
+.. _public_symbols:
+
+Public Symbols
+--------------
+
+Public symbols are the CodeView equivalent of DWARF ``.debug_pubnames``.  There
+is one public symbol record for every function or variable in the program that
+has a mangled name.  The :doc:`Publics Stream <PublicStream>`, which contains these
+records, additionally contains a hash table that allows one to quickly locate a
+record by mangled name.
+
+S_PUB32 (0x110e)
+^^^^^^^^^^^^^^^^
+
+There is only one type of public symbol, an ``S_PUB32``, which describes a mangled
+name, a flag indicating what kind of symbol it is (e.g. function, variable), and
+the symbol's address.  The :ref:`dbi_section_map_substream` of the
+:doc:`DBI Stream <DbiStream>` can be consulted to determine what module this address
+corresponds to, and from there that module's :doc:`module debug stream <ModiStream>`
+can be consulted to locate full information for the symbol with the given address.
+
+.. _global_symbols:
+
+Global Symbols
+--------------
+
+While there is one :ref:`public symbol <public_symbols>` for every symbol in the
+program with `external` linkage, there is one global symbol for every symbol in the
+program with linkage (including internal linkage).  As a result, global symbols do
+not describe a mangled name *or* an address, since symbols with internal linkage
+need not have any mangling at all, and also may not have an address.  Thus, all
+global symbols simply refer directly to the full symbol record via a module/offset
+combination.
+
+Similarly to :ref:`public symbols <public_symbols>`, all global symbols are contained
+in a single :doc:`Globals Stream <GlobalStream>`, which contains a hash table mapping
+fully qualified name to the corresponding record in the globals stream (which as
+mentioned, then contains information allowing one to locate the full record in the
+corresponding module symbol stream).
+
+Note that a consequence and limitation of this design is that program-wide lookup
+by anything other than an exact textually matching fully-qualified name of whatever
+the compiler decided to emit is impractical.  This differs from DWARF, where even
+though we don't necessarily have O(1) lookup by basename within a given scope (including
+O(1) scope), we at least have O(n) access within a given scope.
+
+.. important::
+   Program-wide lookup of names by anything other than an exact textually matching fully
+   qualified name is not possible.
+
+
+S_GDATA32 (0x110d)
+^^^^^^^^^^^^^^^^^^
+
+S_GTHREAD32 (0x1113)
+^^^^^^^^^^^^^^^^^^^^
+
+S_PROCREF (0x1125)
+^^^^^^^^^^^^^^^^^^
+
+S_LPROCREF (0x1127)
+^^^^^^^^^^^^^^^^^^^
+
+S_GMANDATA (0x111d)
+^^^^^^^^^^^^^^^^^^^
+
+.. _module_symbols:
+
+Module Symbols
+--------------
+
+S_END (0x0006)
+^^^^^^^^^^^^^^
+
+S_FRAMEPROC (0x1012)
+^^^^^^^^^^^^^^^^^^^^
+
+S_OBJNAME (0x1101)
+^^^^^^^^^^^^^^^^^^
+
+S_THUNK32 (0x1102)
+^^^^^^^^^^^^^^^^^^
+
+S_BLOCK32 (0x1103)
+^^^^^^^^^^^^^^^^^^
+
+S_LABEL32 (0x1105)
+^^^^^^^^^^^^^^^^^^
+
+S_REGISTER (0x1106)
+^^^^^^^^^^^^^^^^^^^
+
+S_BPREL32 (0x110b)
+^^^^^^^^^^^^^^^^^^
+
+S_LPROC32 (0x110f)
+^^^^^^^^^^^^^^^^^^
+
+S_GPROC32 (0x1110)
+^^^^^^^^^^^^^^^^^^
+
+S_REGREL32 (0x1111)
+^^^^^^^^^^^^^^^^^^^
+
+S_COMPILE2 (0x1116)
+^^^^^^^^^^^^^^^^^^^
+
+S_UNAMESPACE (0x1124)
+^^^^^^^^^^^^^^^^^^^^^
+
+S_TRAMPOLINE (0x112c)
+^^^^^^^^^^^^^^^^^^^^^
+
+S_SECTION (0x1136)
+^^^^^^^^^^^^^^^^^^
+
+S_COFFGROUP (0x1137)
+^^^^^^^^^^^^^^^^^^^^
+
+S_EXPORT (0x1138)
+^^^^^^^^^^^^^^^^^
+
+S_CALLSITEINFO (0x1139)
+^^^^^^^^^^^^^^^^^^^^^^^
+
+S_FRAMECOOKIE (0x113a)
+^^^^^^^^^^^^^^^^^^^^^^
+
+S_COMPILE3 (0x113c)
+^^^^^^^^^^^^^^^^^^^
+
+S_ENVBLOCK (0x113d)
+^^^^^^^^^^^^^^^^^^^
+
+S_LOCAL (0x113e)
+^^^^^^^^^^^^^^^^
+
+S_DEFRANGE (0x113f)
+^^^^^^^^^^^^^^^^^^^
+
+S_DEFRANGE_SUBFIELD (0x1140)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+S_DEFRANGE_REGISTER (0x1141)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+S_DEFRANGE_FRAMEPOINTER_REL (0x1142)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+S_DEFRANGE_SUBFIELD_REGISTER (0x1143)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+S_DEFRANGE_FRAMEPOINTER_REL_FULL_SCOPE (0x1144)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+S_DEFRANGE_REGISTER_REL (0x1145)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+S_LPROC32_ID (0x1146)
+^^^^^^^^^^^^^^^^^^^^^
+
+S_GPROC32_ID (0x1147)
+^^^^^^^^^^^^^^^^^^^^^
+
+S_BUILDINFO (0x114c)
+^^^^^^^^^^^^^^^^^^^^
+
+S_INLINESITE (0x114d)
+^^^^^^^^^^^^^^^^^^^^^
+
+S_INLINESITE_END (0x114e)
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+S_PROC_ID_END (0x114f)
+^^^^^^^^^^^^^^^^^^^^^^
+
+S_FILESTATIC (0x1153)
+^^^^^^^^^^^^^^^^^^^^^
+
+S_LPROC32_DPC (0x1155)
+^^^^^^^^^^^^^^^^^^^^^^
+
+S_LPROC32_DPC_ID (0x1156)
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+S_CALLEES (0x115a)
+^^^^^^^^^^^^^^^^^^
+
+S_CALLERS (0x115b)
+^^^^^^^^^^^^^^^^^^
+
+S_HEAPALLOCSITE (0x115e)
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+S_FASTLINK (0x1167)
+^^^^^^^^^^^^^^^^^^^
+
+S_INLINEES (0x1168)
+^^^^^^^^^^^^^^^^^^^
+
+.. _module_and_global_symbols:
+
+Symbols which can go in either/both of the module info stream & global stream
+-----------------------------------------------------------------------------
+
+S_CONSTANT (0x1107)
+^^^^^^^^^^^^^^^^^^^
+
+S_UDT (0x1108)
+^^^^^^^^^^^^^^
+
+S_LDATA32 (0x110c)
+^^^^^^^^^^^^^^^^^^
+
+S_LTHREAD32 (0x1112)
+^^^^^^^^^^^^^^^^^^^^
+
+S_LMANDATA (0x111c)
+^^^^^^^^^^^^^^^^^^^
+
+S_MANCONSTANT (0x112d)
+^^^^^^^^^^^^^^^^^^^^^^
+

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/CodeViewTypes.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/CodeViewTypes.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/CodeViewTypes.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/CodeViewTypes.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,261 @@
+=====================================
+CodeView Type Records
+=====================================
+
+
+.. contents::
+   :local:
+
+.. _types_intro:
+
+Introduction
+============
+
+This document describes the usage and serialization format of the various
+CodeView type records that LLVM understands.  This document does not describe
+every single CodeView type record that is defined.  In some cases, this is
+because the records are clearly deprecated and can only appear in very old
+software (e.g. the 16-bit types).  In other cases, it is because the records
+have never been observed in practice.  This could be because they are only
+generated for non-C++ code (e.g. Visual Basic, C#), or because they have been
+made obsolete by newer records, or any number of other reasons.  However, the
+records we describe here should cover 99% of type records that one can expect
+to encounter when dealing with modern C++ toolchains.
+
+Record Categories
+=================
+
+We can think of a sequence of CodeView type records as an array of variable length
+`leaf records`.  Each such record describes its own length as part of a fixed-size
+header, as well as the kind of record it is.  Leaf records are either padded to 4
+bytes (if this type stream appears in a TPI/IPI stream of a PDB) or not padded at
+all (if this type stream appears in the ``.debug$T`` section of an object file).
+Padding is implemented by inserting a decreasing sequence of
+:ref:`padding records <padding_records>` that terminates with ``LF_PAD0``.
+
+The final category of record is a ``member record``.  One particular leaf type,
+``LF_FIELDLIST``, contains a series of embedded records.  While the outer
+``LF_FIELDLIST`` describes its length (like any other leaf record), the embedded
+records, called ``member records``, do not.
+
+.. _leaf_types:
+
+Leaf Records
+------------
+
+All leaf records begin with the following 4 byte prefix:
+
+.. code-block:: c++
+
+  struct RecordHeader {
+    uint16_t RecordLen;  // Record length, not including this 2 byte field.
+    uint16_t RecordKind; // Record kind enum.
+  };
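+
+A minimal sketch of walking such a sequence, assuming the records sit back to
+back in a little-endian byte buffer.  The helper names (``readU16LE``,
+``countLeafRecords``) are hypothetical and not part of LLVM.  Because
+``RecordLen`` excludes its own 2 bytes, the next record starts at
+``Offset + 2 + RecordLen``.
+
```cpp
#include <cstddef>
#include <cstdint>

// Reads a little-endian uint16_t at Data[Offset] (hypothetical helper).
constexpr uint16_t readU16LE(const uint8_t *Data, size_t Offset) {
  return static_cast<uint16_t>(Data[Offset] | (Data[Offset + 1] << 8));
}

// Counts the leaf records stored in Data[0, Size).  Stops early when it
// meets a truncated or malformed record.
constexpr size_t countLeafRecords(const uint8_t *Data, size_t Size) {
  size_t Count = 0;
  size_t Offset = 0;
  while (Offset + 4 <= Size) { // room for the 4 byte prefix
    size_t RecordLen = readU16LE(Data, Offset);
    // RecordLen covers RecordKind (2 bytes) plus the payload, so it is >= 2.
    if (RecordLen < 2 || Offset + 2 + RecordLen > Size)
      break;
    ++Count;
    Offset += 2 + RecordLen; // skip the length field, then the record body
  }
  return Count;
}
```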
+
+LF_POINTER (0x1002)
+^^^^^^^^^^^^^^^^^^^
+
+**Usage:** Describes a pointer to another type.
+
+**Layout:**
+
+.. code-block:: none
+
+  .--------------------.-- +0
+  |    Referent Type   |
+  .--------------------.-- +4
+  |     Attributes     |
+  .--------------------.-- +8
+  |  Member Ptr Info   |       Only present if |Attributes| indicates this is a member pointer.
+  .--------------------.-- +E
+
+Attributes is a bitfield with the following layout:
+
+.. code-block:: none
+
+    .-----------------------------------------------------------------------------------------------------.
+    |     Unused                   |  Flags  |       Size       |   Modifiers   |  Mode   |      Kind     |
+    .-----------------------------------------------------------------------------------------------------.
+    |                              |         |                  |               |         |               |
+   +0x20                         +0x16     +0x13               +0xD            +0x8      +0x5            +0x0
+
+where the various fields are defined by the following enums:
+
+.. code-block:: c++
+
+  enum class PointerKind : uint8_t {
+    Near16 = 0x00,                // 16 bit pointer
+    Far16 = 0x01,                 // 16:16 far pointer
+    Huge16 = 0x02,                // 16:16 huge pointer
+    BasedOnSegment = 0x03,        // based on segment
+    BasedOnValue = 0x04,          // based on value of base
+    BasedOnSegmentValue = 0x05,   // based on segment value of base
+    BasedOnAddress = 0x06,        // based on address of base
+    BasedOnSegmentAddress = 0x07, // based on segment address of base
+    BasedOnType = 0x08,           // based on type
+    BasedOnSelf = 0x09,           // based on self
+    Near32 = 0x0a,                // 32 bit pointer
+    Far32 = 0x0b,                 // 16:32 pointer
+    Near64 = 0x0c                 // 64 bit pointer
+  };
+  enum class PointerMode : uint8_t {
+    Pointer = 0x00,                 // "normal" pointer
+    LValueReference = 0x01,         // "old" reference
+    PointerToDataMember = 0x02,     // pointer to data member
+    PointerToMemberFunction = 0x03, // pointer to member function
+    RValueReference = 0x04          // r-value reference
+  };
+  enum class PointerModifiers : uint8_t {
+    None = 0x00,                    // "normal" pointer
+    Flat32 = 0x01,                  // "flat" pointer
+    Volatile = 0x02,                // pointer is marked volatile
+    Const = 0x04,                   // pointer is marked const
+    Unaligned = 0x08,               // pointer is marked unaligned
+    Restrict = 0x10,                // pointer is marked restrict
+  };
+  enum class PointerFlags : uint8_t {
+    WinRTSmartPointer = 0x01,       // pointer is a WinRT smart pointer
+    LValueRefThisPointer = 0x02,    // pointer is a 'this' pointer of a member function with ref qualifier (e.g. void X::foo() &)
+    RValueRefThisPointer = 0x04     // pointer is a 'this' pointer of a member function with ref qualifier (e.g. void X::foo() &&)
+  };
+
+The ``Size`` field of the Attributes bitmask is a 1-byte value indicating the
+pointer size.  For example, a ``void*`` would have a size of either 4 or 8 depending
+on the target architecture.  On the other hand, if ``Mode`` indicates that this is
+a pointer to member function or pointer to data member, then the size can be any
+implementation-defined number.
+
+The ``Member Ptr Info`` field of the ``LF_POINTER`` record is only present if the
+attributes indicate that this is a pointer to member.
+
+Note that "plain" pointers to primitive types are not represented by ``LF_POINTER``
+records; they are indicated by special reserved :ref:`TypeIndex values <type_indices>`.
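+
+As an illustration, the Attributes bitfield can be unpacked with shifts and
+masks.  This is only a sketch: the bit positions are read off the diagram
+above (Kind at bit 0, Mode at bit 5, Modifiers at bit 8, Size at bit 0xD,
+Flags at bit 0x13), and the ``PointerAttributes`` type and function names
+are hypothetical.
+
```cpp
#include <cstdint>

// Unpacked view of the LF_POINTER Attributes field (hypothetical type).
struct PointerAttributes {
  uint8_t Kind;      // PointerKind,      bits [0x0, 0x5)
  uint8_t Mode;      // PointerMode,      bits [0x5, 0x8)
  uint8_t Modifiers; // PointerModifiers, bits [0x8, 0xD)
  uint8_t Size;      // pointer size,     bits [0xD, 0x13)
  uint8_t Flags;     // PointerFlags,     bits [0x13, 0x16)
};

constexpr PointerAttributes decodePointerAttributes(uint32_t Attrs) {
  return PointerAttributes{static_cast<uint8_t>(Attrs & 0x1F),
                           static_cast<uint8_t>((Attrs >> 5) & 0x07),
                           static_cast<uint8_t>((Attrs >> 8) & 0x1F),
                           static_cast<uint8_t>((Attrs >> 13) & 0x3F),
                           static_cast<uint8_t>((Attrs >> 19) & 0x07)};
}

// True if Mode is PointerToDataMember (0x02) or PointerToMemberFunction
// (0x03), i.e. when the Member Ptr Info field is expected to be present.
constexpr bool isPointerToMember(uint32_t Attrs) {
  return decodePointerAttributes(Attrs).Mode == 0x02 ||
         decodePointerAttributes(Attrs).Mode == 0x03;
}
```
+
+Under these assumptions, an Attributes value of ``0x1000c`` decodes to
+``Kind = Near64``, ``Mode = Pointer`` and ``Size = 8``.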
+
+
+
+LF_MODIFIER (0x1001)
+^^^^^^^^^^^^^^^^^^^^
+
+LF_PROCEDURE (0x1008)
+^^^^^^^^^^^^^^^^^^^^^
+
+LF_MFUNCTION (0x1009)
+^^^^^^^^^^^^^^^^^^^^^
+
+LF_LABEL (0x000e)
+^^^^^^^^^^^^^^^^^
+
+LF_ARGLIST (0x1201)
+^^^^^^^^^^^^^^^^^^^
+
+LF_FIELDLIST (0x1203)
+^^^^^^^^^^^^^^^^^^^^^
+
+LF_ARRAY (0x1503)
+^^^^^^^^^^^^^^^^^
+
+LF_CLASS (0x1504)
+^^^^^^^^^^^^^^^^^
+
+LF_STRUCTURE (0x1505)
+^^^^^^^^^^^^^^^^^^^^^
+
+LF_INTERFACE (0x1519)
+^^^^^^^^^^^^^^^^^^^^^
+
+LF_UNION (0x1506)
+^^^^^^^^^^^^^^^^^
+
+LF_ENUM (0x1507)
+^^^^^^^^^^^^^^^^
+
+LF_TYPESERVER2 (0x1515)
+^^^^^^^^^^^^^^^^^^^^^^^
+
+LF_VFTABLE (0x151d)
+^^^^^^^^^^^^^^^^^^^
+
+LF_VTSHAPE (0x000a)
+^^^^^^^^^^^^^^^^^^^
+
+LF_BITFIELD (0x1205)
+^^^^^^^^^^^^^^^^^^^^
+
+LF_FUNC_ID (0x1601)
+^^^^^^^^^^^^^^^^^^^
+
+LF_MFUNC_ID (0x1602)
+^^^^^^^^^^^^^^^^^^^^
+
+LF_BUILDINFO (0x1603)
+^^^^^^^^^^^^^^^^^^^^^
+
+LF_SUBSTR_LIST (0x1604)
+^^^^^^^^^^^^^^^^^^^^^^^
+
+LF_STRING_ID (0x1605)
+^^^^^^^^^^^^^^^^^^^^^
+
+LF_UDT_SRC_LINE (0x1606)
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+LF_UDT_MOD_SRC_LINE (0x1607)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+LF_METHODLIST (0x1206)
+^^^^^^^^^^^^^^^^^^^^^^
+
+LF_PRECOMP (0x1509)
+^^^^^^^^^^^^^^^^^^^
+
+LF_ENDPRECOMP (0x0014)
+^^^^^^^^^^^^^^^^^^^^^^
+
+.. _member_types:
+
+Member Records
+--------------
+
+LF_BCLASS (0x1400)
+^^^^^^^^^^^^^^^^^^
+
+LF_BINTERFACE (0x151a)
+^^^^^^^^^^^^^^^^^^^^^^
+
+LF_VBCLASS (0x1401)
+^^^^^^^^^^^^^^^^^^^
+
+LF_IVBCLASS (0x1402)
+^^^^^^^^^^^^^^^^^^^^
+
+LF_VFUNCTAB (0x1409)
+^^^^^^^^^^^^^^^^^^^^
+
+LF_STMEMBER (0x150e)
+^^^^^^^^^^^^^^^^^^^^
+
+LF_METHOD (0x150f)
+^^^^^^^^^^^^^^^^^^
+
+LF_MEMBER (0x150d)
+^^^^^^^^^^^^^^^^^^
+
+LF_NESTTYPE (0x1510)
+^^^^^^^^^^^^^^^^^^^^
+
+LF_ONEMETHOD (0x1511)
+^^^^^^^^^^^^^^^^^^^^^
+
+LF_ENUMERATE (0x1502)
+^^^^^^^^^^^^^^^^^^^^^
+
+LF_INDEX (0x1404)
+^^^^^^^^^^^^^^^^^
+
+.. _padding_records:
+
+Padding Records
+---------------
+
+LF_PADn (0xf0 + n)
+^^^^^^^^^^^^^^^^^^

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/DbiStream.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/DbiStream.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/DbiStream.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/DbiStream.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,465 @@
+=====================================
+The PDB DBI (Debug Info) Stream
+=====================================
+
+.. contents::
+   :local:
+
+.. _dbi_intro:
+
+Introduction
+============
+
+The PDB DBI Stream (Index 3) is one of the largest and most important streams
+in a PDB file.  It contains information about how the program was compiled
+(e.g. compilation flags), the compilands (e.g. object files) that
+were used to link together the program, the source files which were used
+to build the program, as well as references to other streams that contain more
+detailed information about each compiland, such as the CodeView symbol records
+contained within each compiland and the source and line information for
+functions and other symbols within each compiland.
+
+
+.. _dbi_header:
+
+Stream Header
+=============
+At offset 0 of the DBI Stream is a header with the following layout:
+
+
+.. code-block:: c++
+
+  struct DbiStreamHeader {
+    int32_t VersionSignature;
+    uint32_t VersionHeader;
+    uint32_t Age;
+    uint16_t GlobalStreamIndex;
+    uint16_t BuildNumber;
+    uint16_t PublicStreamIndex;
+    uint16_t PdbDllVersion;
+    uint16_t SymRecordStream;
+    uint16_t PdbDllRbld;
+    int32_t ModInfoSize;
+    int32_t SectionContributionSize;
+    int32_t SectionMapSize;
+    int32_t SourceInfoSize;
+    int32_t TypeServerMapSize;
+    uint32_t MFCTypeServerIndex;
+    int32_t OptionalDbgHeaderSize;
+    int32_t ECSubstreamSize;
+    uint16_t Flags;
+    uint16_t Machine;
+    uint32_t Padding;
+  };
+  
+- **VersionSignature** - Unknown meaning.  Appears to always be ``-1``.
+
+- **VersionHeader** - A value from the following enum.
+
+.. code-block:: c++
+
+  enum class DbiStreamVersion : uint32_t {
+    VC41 = 930803,
+    V50 = 19960307,
+    V60 = 19970606,
+    V70 = 19990903,
+    V110 = 20091201
+  };
+
+Similar to the :doc:`PDB Stream <PdbStream>`, this value always appears to be
+``V70``, and it is not clear what the other values are for.
+
+- **Age** - The number of times the PDB has been written.  Equal to the same
+  field from the :ref:`PDB Stream header <pdb_stream_header>`.
+  
+- **GlobalStreamIndex** - The index of the :doc:`Global Symbol Stream <GlobalStream>`,
+  which contains CodeView symbol records for all global symbols.  Actual records
+  are stored in the symbol record stream, and are referenced from this stream.
+  
+- **BuildNumber** - A bitfield containing values representing the major and minor
+  version number of the toolchain (e.g. 12.0 for MSVC 2013) used to build the
+  program, with the following layout:
+
+.. code-block:: c++
+
+  uint16_t MinorVersion : 8;
+  uint16_t MajorVersion : 7;
+  uint16_t NewVersionFormat : 1;
+
+For the purposes of LLVM, we assume ``NewVersionFormat`` to be always ``true``.
+If it is ``false``, the layout above does not apply and the reader should consult
+the `Microsoft Source Code <https://github.com/Microsoft/microsoft-pdb>`__ for
+further guidance.
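+
+The ``BuildNumber`` layout can be decoded with plain masks, as sketched here.
+The sketch assumes that bitfield members are allocated starting from the least
+significant bit (so ``MinorVersion`` is the low byte); the function names are
+hypothetical.
+
```cpp
#include <cstdint>

// MinorVersion: bits [0, 8), assuming LSB-first bitfield allocation.
constexpr uint8_t buildMinorVersion(uint16_t BuildNumber) {
  return static_cast<uint8_t>(BuildNumber & 0xFF);
}

// MajorVersion: bits [8, 15).
constexpr uint8_t buildMajorVersion(uint16_t BuildNumber) {
  return static_cast<uint8_t>((BuildNumber >> 8) & 0x7F);
}

// NewVersionFormat: bit 15.
constexpr bool isNewVersionFormat(uint16_t BuildNumber) {
  return ((BuildNumber >> 15) & 0x1) != 0;
}
```
+
+With this layout, version 12.0 in the new format would be stored as ``0x8C00``.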
+  
+- **PublicStreamIndex** - The index of the :doc:`Public Symbol Stream <PublicStream>`,
+  which contains CodeView symbol records for all public symbols.  Actual records
+  are stored in the symbol record stream, and are referenced from this stream.
+  
+- **PdbDllVersion** - The version number of ``mspdbXXXX.dll`` used to produce this
+  PDB.  Note that this does not apply to LLVM, as LLVM does not use ``mspdbXXXX.dll``.
+  
+- **SymRecordStream** - The stream containing all CodeView symbol records used
+  by the program.  This is used for deduplication, so that many different
+  compilands can refer to the same symbols without having to include the full record
+  content inside of each module stream.
+  
+- **PdbDllRbld** - Unknown
+
+- **MFCTypeServerIndex** - The index of the MFC type server in the
+  :ref:`dbi_type_server_map_substream`.
+
+- **Flags** - A bitfield with the following layout, containing various
+  information about how the program was built:
+  
+.. code-block:: c++
+
+  uint16_t WasIncrementallyLinked : 1;
+  uint16_t ArePrivateSymbolsStripped : 1;
+  uint16_t HasConflictingTypes : 1;
+  uint16_t Reserved : 13;
+
+The only one of these that is not self-explanatory is ``HasConflictingTypes``.
+Although undocumented, ``link.exe`` contains a hidden flag ``/DEBUG:CTYPES``.
+If it is passed to ``link.exe``, this field will be set.  Otherwise it will
+not be set.  It is unclear what this flag does, although it seems to have
+subtle implications on the algorithm used to look up type records.
+
+- **Machine** - A value from the `CV_CPU_TYPE_e <https://msdn.microsoft.com/en-us/library/b2fc64ek.aspx>`__
+  enumeration.  Common values are ``0x8664`` (x86-64) and ``0x14C`` (x86).
+
+Immediately after the fixed-size DBI Stream header are ``7`` variable-length
+`substreams`.  The following ``7`` fields of the DBI Stream header specify the
+number of bytes of the corresponding substream.  Each substream's contents will
+be described in detail :ref:`below <dbi_substreams>`.  The length of the entire
+DBI Stream should equal ``64`` (the length of the header above) plus the value
+of each of the following ``7`` fields.
+
+- **ModInfoSize** - The length of the :ref:`dbi_mod_info_substream`.
+  
+- **SectionContributionSize** - The length of the :ref:`dbi_sec_contr_substream`.
+
+- **SectionMapSize** - The length of the :ref:`dbi_section_map_substream`.
+
+- **SourceInfoSize** - The length of the :ref:`dbi_file_info_substream`.
+
+- **TypeServerMapSize** - The length of the :ref:`dbi_type_server_map_substream`.
+
+- **OptionalDbgHeaderSize** - The length of the :ref:`dbi_optional_dbg_stream`.
+
+- **ECSubstreamSize** - The length of the :ref:`dbi_ec_substream`.
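+
+The size bookkeeping can be sketched as follows.  The struct repeats the header
+layout from above; ``DbiSubstreamOffsets`` and ``computeSubstreamOffsets`` are
+hypothetical names.  The sketch assumes all size fields are non-negative and
+that the substreams appear in the order used by the substream sections of this
+document (with the EC substream before the optional debug header).
+
```cpp
#include <cstdint>

struct DbiStreamHeader {
  int32_t VersionSignature;
  uint32_t VersionHeader;
  uint32_t Age;
  uint16_t GlobalStreamIndex;
  uint16_t BuildNumber;
  uint16_t PublicStreamIndex;
  uint16_t PdbDllVersion;
  uint16_t SymRecordStream;
  uint16_t PdbDllRbld;
  int32_t ModInfoSize;
  int32_t SectionContributionSize;
  int32_t SectionMapSize;
  int32_t SourceInfoSize;
  int32_t TypeServerMapSize;
  uint32_t MFCTypeServerIndex;
  int32_t OptionalDbgHeaderSize;
  int32_t ECSubstreamSize;
  uint16_t Flags;
  uint16_t Machine;
  uint32_t Padding;
};
static_assert(sizeof(DbiStreamHeader) == 64, "header is expected to be 64 bytes");

// Byte offsets of each substream from the start of the DBI stream.
struct DbiSubstreamOffsets {
  uint32_t ModInfo;
  uint32_t SectionContribution;
  uint32_t SectionMap;
  uint32_t SourceInfo;
  uint32_t TypeServerMap;
  uint32_t EC;
  uint32_t OptionalDbgHeader;
  uint32_t End; // expected total length of the DBI stream
};

constexpr DbiSubstreamOffsets computeSubstreamOffsets(const DbiStreamHeader &H) {
  DbiSubstreamOffsets O{};
  O.ModInfo = static_cast<uint32_t>(sizeof(DbiStreamHeader));
  O.SectionContribution = O.ModInfo + static_cast<uint32_t>(H.ModInfoSize);
  O.SectionMap = O.SectionContribution + static_cast<uint32_t>(H.SectionContributionSize);
  O.SourceInfo = O.SectionMap + static_cast<uint32_t>(H.SectionMapSize);
  O.TypeServerMap = O.SourceInfo + static_cast<uint32_t>(H.SourceInfoSize);
  O.EC = O.TypeServerMap + static_cast<uint32_t>(H.TypeServerMapSize);
  O.OptionalDbgHeader = O.EC + static_cast<uint32_t>(H.ECSubstreamSize);
  O.End = O.OptionalDbgHeader + static_cast<uint32_t>(H.OptionalDbgHeaderSize);
  return O;
}
```
+
+A reader could compare ``End`` against the actual stream length as a basic
+sanity check before parsing any substream.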
+
+.. _dbi_substreams:
+
+Substreams
+==========
+
+.. _dbi_mod_info_substream:
+
+Module Info Substream
+^^^^^^^^^^^^^^^^^^^^^
+
+Begins at offset ``0`` immediately after the :ref:`header <dbi_header>`.  The
+module info substream is an array of variable-length records, each one
+describing a single module (e.g. object file) linked into the program.  Each
+record in the array has the format:
+  
+.. code-block:: c++
+
+  struct ModInfo {
+    uint32_t Unused1;
+    struct SectionContribEntry {
+      uint16_t Section;
+      char Padding1[2];
+      int32_t Offset;
+      int32_t Size;
+      uint32_t Characteristics;
+      uint16_t ModuleIndex;
+      char Padding2[2];
+      uint32_t DataCrc;
+      uint32_t RelocCrc;
+    } SectionContr;
+    uint16_t Flags;
+    uint16_t ModuleSymStream;
+    uint32_t SymByteSize;
+    uint32_t C11ByteSize;
+    uint32_t C13ByteSize;
+    uint16_t SourceFileCount;
+    char Padding[2];
+    uint32_t Unused2;
+    uint32_t SourceFileNameIndex;
+    uint32_t PdbFilePathNameIndex;
+    char ModuleName[];
+    char ObjFileName[];
+  };
+  
+- **SectionContr** - Describes the properties of the section in the final binary
+  which contains the code and data from this module.
+
+  ``SectionContr.Characteristics`` corresponds to the ``Characteristics`` field
+  of the `IMAGE_SECTION_HEADER <https://msdn.microsoft.com/en-us/library/windows/desktop/ms680341(v=vs.85).aspx>`__
+  structure.
+  
+
+- **Flags** - A bitfield with the following format:
+  
+.. code-block:: c++
+
+  // ``true`` if this ModInfo has been written since reading the PDB.  This is
+  // likely used to support incremental linking, so that the linker can decide
+  // if it needs to commit changes to disk.
+  uint16_t Dirty : 1;
+  // ``true`` if EC information is present for this module. EC is presumed to
+  // stand for "Edit & Continue", which LLVM does not support.  So this flag
+  // will always be false.
+  uint16_t EC : 1;
+  uint16_t Unused : 6;
+  // Type Server Index for this module.  This is assumed to be related to /Zi,
+  // but as LLVM treats /Zi as /Z7, this field will always be invalid for LLVM
+  // generated PDBs.
+  uint16_t TSM : 8;
+  
+
+- **ModuleSymStream** - The index of the stream that contains symbol information
+  for this module.  This includes CodeView symbol information as well as source
+  and line information.  If this field is -1, then no additional debug info will
+  be present for this module (for example, this is what happens when you strip
+  private symbols from a PDB).
+
+- **SymByteSize** - The number of bytes of data from the stream identified by
+  ``ModuleSymStream`` that represent CodeView symbol records.
+
+- **C11ByteSize** - The number of bytes of data from the stream identified by
+  ``ModuleSymStream`` that represent C11-style CodeView line information.
+
+- **C13ByteSize** - The number of bytes of data from the stream identified by
+  ``ModuleSymStream`` that represent C13-style CodeView line information.  At
+  most one of ``C11ByteSize`` and ``C13ByteSize`` will be non-zero.  Modern PDBs
+  always use C13 instead of C11.
+
+- **SourceFileCount** - The number of source files that contributed to this
+  module during compilation.
+
+- **SourceFileNameIndex** - The offset in the names buffer of the primary
+  translation unit used to build this module.  All PDB files observed to date
+  always have this value equal to 0.
+
+- **PdbFilePathNameIndex** - The offset in the names buffer of the PDB file
+  containing this module's symbol information.  This has only been observed
+  to be non-zero for the special ``* Linker *`` module.
+
+- **ModuleName** - The module name.  This is usually either a full path to an
+  object file (either directly passed to ``link.exe`` or from an archive) or
+  a string of the form ``Import:<dll name>``.
+
+- **ObjFileName** - The object file name.  In the case of a module that is
+  passed directly to ``link.exe``, this is the same as **ModuleName**.
+  In the case of a module that comes from an archive, this is usually the full
+  path to the archive.
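+
+A small sketch of interpreting the ``Flags`` and ``ModuleSymStream`` fields of
+a ``ModInfo`` record.  It assumes bitfield members are allocated from the least
+significant bit, so ``Dirty`` is bit 0, ``EC`` is bit 1 and ``TSM`` is the high
+byte.  The function names are hypothetical.
+
```cpp
#include <cstdint>

// Dirty: bit 0 of ModInfo::Flags (assuming LSB-first bitfield allocation).
constexpr bool modInfoIsDirty(uint16_t Flags) { return (Flags & 0x0001) != 0; }

// EC: bit 1 of ModInfo::Flags.
constexpr bool modInfoHasEC(uint16_t Flags) { return (Flags & 0x0002) != 0; }

// TSM: the type server index stored in bits [8, 16).
constexpr uint8_t modInfoTypeServerIndex(uint16_t Flags) {
  return static_cast<uint8_t>(Flags >> 8);
}

// A ModuleSymStream of -1 (0xFFFF as a uint16_t) means the module has no
// additional debug info stream.
constexpr bool modInfoHasSymbolStream(uint16_t ModuleSymStream) {
  return ModuleSymStream != 0xFFFF;
}
```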
+
+.. _dbi_sec_contr_substream:
+
+Section Contribution Substream
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Begins at offset ``0`` immediately after the :ref:`dbi_mod_info_substream` ends,
+and consumes ``Header->SectionContributionSize`` bytes.  This substream begins
+with a single ``uint32_t`` which will be one of the following values:
+  
+.. code-block:: c++
+
+  enum class SectionContrSubstreamVersion : uint32_t {
+    Ver60 = 0xeffe0000 + 19970605,
+    V2 = 0xeffe0000 + 20140516
+  };
+  
+``Ver60`` is the only value which has been observed in a PDB so far.  Following
+this is an array of fixed-length structures.  If the version is ``Ver60``,
+it is an array of ``SectionContribEntry`` structures (this is the nested structure
+from the ``ModInfo`` type).  If the version is ``V2``, it is an array of
+``SectionContribEntry2`` structures, defined as follows:
+  
+.. code-block:: c++
+
+  struct SectionContribEntry2 {
+    SectionContribEntry SC;
+    uint32_t ISectCoff;
+  };
+  
+The purpose of the second field is not well understood.  The name implies that
+it is the index of the COFF section, but this also describes the existing field
+``SectionContribEntry::Section``.
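+
+Since both entry types have a fixed length, the number of entries follows from
+the substream size and the version.  The sketch below assumes a 28 byte
+``SectionContribEntry`` (the sum of its field sizes as declared in ``ModInfo``)
+and a 32 byte ``SectionContribEntry2``; the constant and function names are
+hypothetical.
+
```cpp
#include <cstdint>

constexpr uint32_t SectionContribVer60 = 0xeffe0000 + 19970605;
constexpr uint32_t SectionContribV2 = 0xeffe0000 + 20140516;

// Size in bytes of one array element for the given version, or 0 if the
// version is not one of the two known values.
constexpr uint32_t sectionContribEntrySize(uint32_t Version) {
  return Version == SectionContribVer60 ? 28u
         : Version == SectionContribV2  ? 32u
                                        : 0u;
}

// Number of entries following the 4 byte version field, or -1 if the
// version is unknown or the size is not a whole number of entries.
constexpr int64_t sectionContribEntryCount(uint32_t Version,
                                           int32_t SubstreamSize) {
  uint32_t EntrySize = sectionContribEntrySize(Version);
  if (EntrySize == 0 || SubstreamSize < 4)
    return -1;
  uint32_t Payload = static_cast<uint32_t>(SubstreamSize) - 4;
  if (Payload % EntrySize != 0)
    return -1;
  return Payload / EntrySize;
}
```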
+  
+
+.. _dbi_section_map_substream:
+
+Section Map Substream
+^^^^^^^^^^^^^^^^^^^^^
+Begins at offset ``0`` immediately after the :ref:`dbi_sec_contr_substream` ends,
+and consumes ``Header->SectionMapSize`` bytes.  This substream begins with a ``4``
+byte header followed by an array of fixed-length records.  The header and records
+have the following layout:
+  
+.. code-block:: c++
+
+  struct SectionMapHeader {
+    uint16_t Count;    // Number of segment descriptors
+    uint16_t LogCount; // Number of logical segment descriptors
+  };
+  
+  struct SectionMapEntry {
+    uint16_t Flags;         // See the SectionMapEntryFlags enum below.
+    uint16_t Ovl;           // Logical overlay number
+    uint16_t Group;         // Group index into descriptor array.
+    uint16_t Frame;
+    uint16_t SectionName;   // Byte index of segment / group name in string table, or 0xFFFF.
+    uint16_t ClassName;     // Byte index of class in string table, or 0xFFFF.
+    uint32_t Offset;        // Byte offset of the logical segment within physical segment.  If group is set in flags, this is the offset of the group.
+    uint32_t SectionLength; // Byte count of the segment or group.
+  };
+  
+  enum class SectionMapEntryFlags : uint16_t {
+    Read = 1 << 0,              // Segment is readable.
+    Write = 1 << 1,             // Segment is writable.
+    Execute = 1 << 2,           // Segment is executable.
+    AddressIs32Bit = 1 << 3,    // Descriptor describes a 32-bit linear address.
+    IsSelector = 1 << 8,        // Frame represents a selector.
+    IsAbsoluteAddress = 1 << 9, // Frame represents an absolute address.
+    IsGroup = 1 << 10           // If set, descriptor represents a group.
+  };
+  
+Many of these fields are not well understood, so will not be discussed further.
+
+.. _dbi_file_info_substream:
+
+File Info Substream
+^^^^^^^^^^^^^^^^^^^
+Begins at offset ``0`` immediately after the :ref:`dbi_section_map_substream` ends,
+and consumes ``Header->SourceInfoSize`` bytes.  This substream defines the mapping
+from module to the source files that contribute to that module.  Since multiple
+modules can use the same source file (for example, a header file), this substream
+uses a string table to store each unique file name only once, and then has each
+module use offsets into the string table rather than embedding the string's value
+directly.  The format of this substream is as follows:
+  
+.. code-block:: c++
+
+  struct FileInfoSubstream {
+    uint16_t NumModules;
+    uint16_t NumSourceFiles;
+    
+    uint16_t ModIndices[NumModules];
+    uint16_t ModFileCounts[NumModules];
+    uint32_t FileNameOffsets[NumSourceFiles];
+    char NamesBuffer[][NumSourceFiles];
+  };
+
+**NumModules** - The number of modules for which source file information is
+contained within this substream.  Should match the corresponding value from the
+:ref:`dbi_header`.
+
+**NumSourceFiles** - In theory this is supposed to contain the number of source
+files for which this substream contains information.  But that would present a
+problem in that the width of this field being ``16``-bits would prevent one from
+having more than 64K source files in a program.  In early versions of the file
+format, this seems to have been the case.  In order to support more than this, this
+field is simply ignored, and the count is computed dynamically by summing up the
+values of the ``ModFileCounts`` array (discussed below).  In short, this value
+should be ignored.
+
+**ModIndices** - This array is present, but does not appear to be useful.
+
+**ModFileCounts** - An array of ``NumModules`` integers, each one containing
+the number of source files which contribute to the module at the specified index.
+While each individual module is limited to 64K contributing source files, the
+union of all modules' source files may be greater than 64K.  The real number of
+source files is thus computed by summing this array.  Note that summing this array
+does not give the number of `unique` source files, only the total number of source
+file contributions to modules.
+
+**FileNameOffsets** - An array of **NumSourceFiles** integers (where **NumSourceFiles**
+here refers to the 32-bit value obtained from summing **ModFileCounts**), where
+each integer is an offset into **NamesBuffer** pointing to a null terminated string.
+
+**NamesBuffer** - An array of null terminated strings containing the actual source
+file names.
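+
+A sketch of locating the arrays described above inside a raw substream buffer.
+It assumes little-endian data and ignores the 16-bit ``NumSourceFiles`` field
+in favor of the sum of ``ModFileCounts``, as recommended above.
+``FileInfoLayout``, ``computeFileInfoLayout`` and ``readU16LE`` are
+hypothetical names.
+
```cpp
#include <cstddef>
#include <cstdint>

// Reads a little-endian uint16_t at Data[Offset] (hypothetical helper).
constexpr uint16_t readU16LE(const uint8_t *Data, size_t Offset) {
  return static_cast<uint16_t>(Data[Offset] | (Data[Offset + 1] << 8));
}

// Where the variable-length arrays of a File Info substream start.
struct FileInfoLayout {
  bool Valid;                // false if the buffer is too small
  uint16_t NumModules;
  uint32_t NumSourceFiles;   // sum of ModFileCounts, not the header field
  size_t FileNameOffsetsPos; // byte offset of FileNameOffsets
  size_t NamesBufferPos;     // byte offset of NamesBuffer
};

constexpr FileInfoLayout computeFileInfoLayout(const uint8_t *Data,
                                               size_t Size) {
  FileInfoLayout L{}; // Valid starts out false
  if (Size < 4)
    return L;
  L.NumModules = readU16LE(Data, 0);
  // The 16-bit NumSourceFiles field at offset 2 is deliberately skipped.
  size_t ModFileCountsPos = 4 + 2 * static_cast<size_t>(L.NumModules);
  L.FileNameOffsetsPos = ModFileCountsPos + 2 * static_cast<size_t>(L.NumModules);
  if (L.FileNameOffsetsPos > Size)
    return L;
  for (size_t I = 0; I < L.NumModules; ++I)
    L.NumSourceFiles += readU16LE(Data, ModFileCountsPos + 2 * I);
  L.NamesBufferPos = L.FileNameOffsetsPos + 4 * static_cast<size_t>(L.NumSourceFiles);
  L.Valid = L.NamesBufferPos <= Size;
  return L;
}
```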
+
+.. _dbi_type_server_map_substream:
+
+Type Server Map Substream
+^^^^^^^^^^^^^^^^^^^^^^^^^
+Begins at offset ``0`` immediately after the :ref:`dbi_file_info_substream`
+ends, and consumes ``Header->TypeServerMapSize`` bytes.  Neither the purpose
+nor the layout of this substream is understood, although it is assumed to be
+related somehow to the usage of ``/Zi`` and ``mspdbsrv.exe``.  This substream
+will not be discussed further.
+
+.. _dbi_ec_substream:
+
+EC Substream
+^^^^^^^^^^^^
+Begins at offset ``0`` immediately after the
+:ref:`dbi_type_server_map_substream` ends, and consumes
+``Header->ECSubstreamSize`` bytes.  This is presumed to be related to Edit &
+Continue support in MSVC.  LLVM does not support Edit & Continue, so this
+stream will not be discussed further.
+
+.. _dbi_optional_dbg_stream:
+
+Optional Debug Header Stream
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Begins at offset ``0`` immediately after the :ref:`dbi_ec_substream` ends, and
+consumes ``Header->OptionalDbgHeaderSize`` bytes.  This field is an array of
+stream indices (i.e. ``uint16_t`` values), each of which identifies a stream
+index in the larger MSF file which contains some additional debug information.
+Each position of this array has a special meaning, allowing one to determine
+what kind of debug information is at the referenced stream.  ``11`` indices
+are currently understood, although it's possible there may be more.  The
+layout of each stream generally corresponds exactly to a particular type
+of debug data directory from the PE/COFF file.  The format of these fields
+can be found in the `Microsoft PE/COFF Specification <https://www.microsoft.com/en-us/download/details.aspx?id=19509>`__.
+If any of these fields is -1, it means the corresponding type of debug info is
+not present in the PDB.
+
+**FPO Data** - ``DbgStreamArray[0]``.  The data in the referenced stream is an
+array of ``FPO_DATA`` structures.  This contains the relocated contents of
+any ``.debug$F`` section from any of the linker inputs.
+
+**Exception Data** - ``DbgStreamArray[1]``.  The data in the referenced stream
+is a debug data directory of type ``IMAGE_DEBUG_TYPE_EXCEPTION``.
+
+**Fixup Data** - ``DbgStreamArray[2]``.  The data in the referenced stream is a
+debug data directory of type ``IMAGE_DEBUG_TYPE_FIXUP``.
+
+**Omap To Src Data** - ``DbgStreamArray[3]``.  The data in the referenced stream
+is a debug data directory of type ``IMAGE_DEBUG_TYPE_OMAP_TO_SRC``.  This 
+is used for mapping addresses between instrumented and uninstrumented code.
+
+**Omap From Src Data** - ``DbgStreamArray[4]``.  The data in the referenced stream
+is a debug data directory of type ``IMAGE_DEBUG_TYPE_OMAP_FROM_SRC``.  This 
+is used for mapping addresses between instrumented and uninstrumented code.
+
+**Section Header Data** - ``DbgStreamArray[5]``.  A dump of all section headers from
+the original executable.
+
+**Token / RID Map** - ``DbgStreamArray[6]``.  The layout of this stream is not
+understood, but it is assumed to be a mapping from ``CLR Token`` to 
+``CLR Record ID``.  Refer to `ECMA 335 <http://www.ecma-international.org/publications/standards/Ecma-335.htm>`__
+for more information.
+
+**Xdata** - ``DbgStreamArray[7]``.  A copy of the ``.xdata`` section from the
+executable.
+
+**Pdata** - ``DbgStreamArray[8]``. This is assumed to be a copy of the ``.pdata``
+section from the executable, but that would make it identical to
+``DbgStreamArray[1]``.  The difference between these two indices is not well
+understood.
+
+**New FPO Data** - ``DbgStreamArray[9]``.  The data in the referenced stream is a
+debug data directory of type ``IMAGE_DEBUG_TYPE_FPO``.  Note that this is different
+from ``DbgStreamArray[0]`` in that ``.debug$F`` sections are only emitted by MASM.
+Thus, it is possible for both to appear in the same PDB if both MASM object files
+and cl object files are linked into the same program.
+
+**Original Section Header Data** - ``DbgStreamArray[10]``.  Similar to 
+``DbgStreamArray[5]``, but contains the section headers before any binary translation
+has been performed.  This can be used in conjunction with ``DbgStreamArray[3]``
+and ``DbgStreamArray[4]`` to map instrumented and uninstrumented addresses.
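+
+The slot assignments listed above can be captured in an enumeration, as in the
+following sketch.  The enumerator and function names are illustrative only.
+``Count`` is assumed to be ``OptionalDbgHeaderSize / sizeof(uint16_t)``, and a
+slot holding -1 (``0xFFFF``) is treated as absent.
+
```cpp
#include <cstddef>
#include <cstdint>

// Positions in the optional debug header array, in the order listed above.
enum class DbgHeaderType : uint16_t {
  FPO = 0,
  Exception = 1,
  Fixup = 2,
  OmapToSrc = 3,
  OmapFromSrc = 4,
  SectionHdr = 5,
  TokenRidMap = 6,
  Xdata = 7,
  Pdata = 8,
  NewFPO = 9,
  SectionHdrOrig = 10
};

constexpr uint16_t InvalidStreamIndex = 0xFFFF; // -1 as a uint16_t

// Returns the MSF stream index stored for Type, or InvalidStreamIndex if
// the array is too short to contain that slot.
constexpr uint16_t getDbgStreamIndex(const uint16_t *DbgStreamArray,
                                     size_t Count, DbgHeaderType Type) {
  return static_cast<size_t>(Type) < Count
             ? DbgStreamArray[static_cast<size_t>(Type)]
             : InvalidStreamIndex;
}

// True if the PDB carries the given kind of debug data.
constexpr bool hasDbgStream(const uint16_t *DbgStreamArray, size_t Count,
                            DbgHeaderType Type) {
  return getDbgStreamIndex(DbgStreamArray, Count, Type) != InvalidStreamIndex;
}
```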

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/GlobalStream.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/GlobalStream.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/GlobalStream.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/GlobalStream.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,3 @@
+=====================================
+The PDB Global Symbol Stream
+=====================================

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/HashTable.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/HashTable.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/HashTable.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/HashTable.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,103 @@
+The PDB Serialized Hash Table Format
+====================================
+
+.. contents::
+   :local:
+
+.. _hash_intro:
+
+Introduction
+============
+
+One of the design goals of the PDB format is to provide accelerated access to
+debug information, and for this reason there are several occasions where hash
+tables are serialized and embedded directly in the file, rather than requiring
+a consumer to read a list of values and reconstruct the hash table on the fly.
+
+The serialization format supports hash tables of arbitrarily large size and
+capacity, as well as arbitrary value types and hash functions.  The only supported
+key type is a uint32.  The only requirement is that the producer and consumer
+agree on the hash function.  As such, the hash function is not discussed
+further in this document; it is assumed that for a particular instance of a PDB
+file hash table, the appropriate hash function is being used.
+
+On-Disk Format
+==============
+
+.. code-block:: none
+
+  .--------------------.-- +0
+  |        Size        |
+  .--------------------.-- +4
+  |      Capacity      |
+  .--------------------.-- +8
+  | Present Bit Vector |
+  .--------------------.-- +N
+  | Deleted Bit Vector |
+  .--------------------.-- +M                  ─╮
+  |        Key         |                        │
+  .--------------------.-- +M+4                 │
+  |       Value        |                        │
+  .--------------------.-- +M+4+sizeof(Value)   │
+           ...                                  ├─ |Capacity| Bucket entries
+  .--------------------.                        │
+  |        Key         |                        │
+  .--------------------.                        │
+  |       Value        |                        │
+  .--------------------.                       ─╯
+
+- **Size** - The number of values contained in the hash table.
+
+- **Capacity** - The number of buckets in the hash table.  Producers should
+  keep ``Size`` no greater than ``2/3*Capacity+1``, i.e. a load factor of
+  roughly two thirds.
+
+- **Present Bit Vector** - A serialized bit vector which contains information
+  about which buckets have valid values.  If the bucket has a value, the
+  corresponding bit will be set, and if the bucket doesn't have a value (either
+  because the bucket is empty or because the value is a tombstone value) the bit
+  will be unset.
+
+- **Deleted Bit Vector** - A serialized bit vector which contains information
+  about which buckets have tombstone values.  If the entry in this bucket is
+  deleted, the bit will be set, otherwise it will be unset.
+
+- **Keys and Values** - A list of ``Capacity`` hash buckets, where the first
+  entry is the key (always a uint32), and the second entry is the value.  The
+  state of each bucket (valid, empty, deleted) can be determined by examining
+  the present and deleted bit vectors.
+
+
+.. _hash_bit_vectors:
+
+Present and Deleted Bit Vectors
+===============================
+
+The bit vectors indicating the status of each bucket are serialized as follows:
+
+.. code-block:: none
+
+  .--------------------.-- +0
+  |     Word Count     |
+  .--------------------.-- +4
+  |        Word_0      |        ─╮
+  .--------------------.-- +8    │
+  |        Word_1      |         │
+  .--------------------.-- +12   ├─ |Word Count| values
+           ...                   │
+  .--------------------.         │
+  |       Word_N       |         │
+  .--------------------.        ─╯
+
+The words, when viewed as a contiguous block of bytes, represent a bit vector
+with the following layout:
+
+.. code-block:: none
+
+    .------------.         .------------.------------.
+    |   Word_N   |   ...   |   Word_1   |   Word_0   |
+    .------------.         .------------.------------.
+    |            |         |            |            |
+  +N*32      +(N-1)*32    +64          +32          +0
+
+where the k'th bit of this bit vector represents the status of the k'th bucket
+in the hash table.
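+
+The bucket states described above can be recovered from the two bit vectors
+roughly as follows.  This is a minimal sketch, not LLVM's implementation: the
+names ``BucketState``, ``testBit`` and ``bucketState`` are hypothetical, the
+words are assumed to have been decoded from little-endian already, and bit
+``k % 32`` is assumed to be counted from the least significant bit of
+``Word_(k / 32)``.
+
```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical names, for illustration only.
enum class BucketState { Empty, Valid, Deleted };

// Returns bit K of a bit vector stored as 32-bit words, where Word_0 holds
// bits 0..31 (assumed LSB-first). Bits past the serialized words are unset.
inline bool testBit(const std::vector<uint32_t> &Words, uint32_t K) {
  uint32_t WordIndex = K / 32;
  if (WordIndex >= Words.size())
    return false;
  return ((Words[WordIndex] >> (K % 32)) & 1u) != 0;
}

// Classifies a bucket using the present and deleted bit vectors.
inline BucketState bucketState(const std::vector<uint32_t> &Present,
                               const std::vector<uint32_t> &Deleted,
                               uint32_t Bucket) {
  if (testBit(Present, Bucket))
    return BucketState::Valid;
  if (testBit(Deleted, Bucket))
    return BucketState::Deleted;
  return BucketState::Empty;
}
```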

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/ModiStream.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/ModiStream.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/ModiStream.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/ModiStream.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,80 @@
+=====================================
+The Module Information Stream
+=====================================
+
+.. contents::
+   :local:
+
+.. _modi_stream_intro:
+
+Introduction
+============
+
+The Module Info Stream (henceforth referred to as the Modi stream) contains
+information about a single module (object file, import library, etc.) that
+contributes to the binary this PDB contains debug information about.  There
+is one modi stream for each module, and the mapping between modi stream index
+and module is contained in the :doc:`DBI Stream <DbiStream>`.  The modi stream
+for a single module contains line information for the compiland, as well as
+all CodeView information for the symbols defined in the compiland.  Finally,
+there is a "global refs" substream which is not well understood.
+
+.. _modi_stream_layout:
+
+Stream Layout
+=============
+
+A modi stream is laid out as follows:
+
+
+.. code-block:: c++
+
+  struct ModiStream {
+    uint32_t Signature;
+    uint8_t Symbols[SymbolSize-4];
+    uint8_t C11LineInfo[C11Size];
+    uint8_t C13LineInfo[C13Size];
+
+    uint32_t GlobalRefsSize;
+    uint8_t GlobalRefs[GlobalRefsSize];
+  };
+
+- **Signature** - Unknown.  In practice only the value of ``4`` has been
+  observed.  It is hypothesized that this value corresponds to the set of
+  ``CV_SIGNATURE_xx`` defines in ``cvinfo.h``, with the value of ``4``
+  meaning that this module has C13 line information (as opposed to C11 line
+  information).  A corollary of this is that we expect to only ever see
+  C13 line info, and that we do not understand the format of C11 line info.
+
+- **Symbols** - The :ref:`CodeView Symbol Substream <modi_symbol_substream>`.
+  ``SymbolSize`` is equal to the value of ``SymByteSize`` for the
+  corresponding module's entry in the :ref:`Module Info Substream
+  <dbi_mod_info_substream>` of the :doc:`DBI Stream <DbiStream>`.
+
+- **C11LineInfo** - A block containing CodeView line information in C11
+  format.  ``C11Size`` is equal to the value of ``C11ByteSize`` from the
+  :ref:`Module Info Substream <dbi_mod_info_substream>` of the
+  :doc:`DBI Stream <DbiStream>`.  If this value is ``0``, then C11 line
+  information is not present.  As mentioned previously, the format of
+  C11 line info is not understood and we assume all line information in
+  modern PDBs to be in C13 format.
+
+- **C13LineInfo** - A block containing CodeView line information in C13
+  format.  ``C13Size`` is equal to the value of ``C13ByteSize`` from the
+  :ref:`Module Info Substream <dbi_mod_info_substream>` of the
+  :doc:`DBI Stream <DbiStream>`.  If this value is ``0``, then C13 line
+  information is not present.
+
+- **GlobalRefs** - The meaning of this substream is not understood.
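+
+Since the substream sizes come from the DBI stream rather than from the modi
+stream itself, a reader has to compute the substream offsets.  The following
+is a minimal sketch of that arithmetic under the layout above; ``ModiLayout``
+and ``computeModiLayout`` are hypothetical names, and the sketch assumes
+``SymByteSize`` is at least 4 so that the signature is present.
+
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: byte offsets of each part of a modi stream.
struct ModiLayout {
  uint32_t SymbolsOffset;        // first byte after the 4-byte Signature
  uint32_t C11Offset;            // start of C11LineInfo
  uint32_t C13Offset;            // start of C13LineInfo
  uint32_t GlobalRefsSizeOffset; // start of the GlobalRefsSize field
  uint32_t GlobalRefsOffset;     // start of the GlobalRefs bytes
};

// Sizes are the SymByteSize, C11ByteSize and C13ByteSize values taken from
// the module's entry in the DBI stream.
inline ModiLayout computeModiLayout(uint32_t SymByteSize, uint32_t C11ByteSize,
                                    uint32_t C13ByteSize) {
  // Assumption of this sketch: SymByteSize counts the 4-byte signature.
  assert(SymByteSize >= 4);
  ModiLayout L;
  L.SymbolsOffset = 4;
  L.C11Offset = SymByteSize;
  L.C13Offset = L.C11Offset + C11ByteSize;
  L.GlobalRefsSizeOffset = L.C13Offset + C13ByteSize;
  L.GlobalRefsOffset = L.GlobalRefsSizeOffset + 4;
  return L;
}
```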
+
+.. _modi_symbol_substream:
+
+The CodeView Symbol Substream
+=============================
+
+The CodeView Symbol Substream is an array of variable length
+records describing the functions, variables, inlining information,
+and other symbols defined in the compiland.  The entire array consumes
+``SymbolSize-4`` bytes.  The format of a CodeView Symbol Record (and
+thus of an array of CodeView Symbol Records) is described in
+:doc:`CodeViewSymbols`.

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/MsfFile.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/MsfFile.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/MsfFile.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/MsfFile.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,181 @@
+=====================================
+The MSF File Format
+=====================================
+
+.. contents::
+   :local:
+
+.. _msf_layout:
+
+File Layout
+===========
+
+The MSF file format consists of the following components:
+
+1. :ref:`msf_superblock`
+2. :ref:`msf_freeblockmap` (also known as Free Page Map, or FPM)
+3. Data
+
+Each component is stored as an indexed block, the length of which is specified
+in ``SuperBlock::BlockSize``. The file consists of 1 or more iterations of the
+following pattern (sometimes referred to as an "interval"):
+
+1. 1 block of data
+2. Free Block Map 1 (corresponds to ``SuperBlock::FreeBlockMapBlock`` 1)
+3. Free Block Map 2 (corresponds to ``SuperBlock::FreeBlockMapBlock`` 2)
+4. ``SuperBlock::BlockSize - 3`` blocks of data
+
+In the first interval, the first data block is used to store
+:ref:`msf_superblock`.
+
+The following diagram demonstrates the general layout of the file (\| denotes
+the end of an interval, and is for visualization purposes only):
+
++-------------+-----------------------+------------------+------------------+----------+----+------+------+------+-------------+----+-----+
+| Block Index | 0                     | 1                | 2                | 3 - 4095 | \| | 4096 | 4097 | 4098 | 4099 - 8191 | \| | ... |
++=============+=======================+==================+==================+==========+====+======+======+======+=============+====+=====+
+| Meaning     | :ref:`msf_superblock` | Free Block Map 1 | Free Block Map 2 | Data     | \| | Data | FPM1 | FPM2 | Data        | \| | ... |
++-------------+-----------------------+------------------+------------------+----------+----+------+------+------+-------------+----+-----+
+
+The file may end after any block, including immediately after an FPM1.
+
+.. note::
+  LLVM only supports 4096 byte blocks (sometimes referred to as the "BigMsf"
+  variant), so the rest of this document will assume a block size of 4096.
+
+.. _msf_superblock:
+
+The Superblock
+==============
+At file offset 0 in an MSF file is the MSF *SuperBlock*, which is laid out as
+follows:
+
+.. code-block:: c++
+
+  struct SuperBlock {
+    char FileMagic[sizeof(Magic)];
+    ulittle32_t BlockSize;
+    ulittle32_t FreeBlockMapBlock;
+    ulittle32_t NumBlocks;
+    ulittle32_t NumDirectoryBytes;
+    ulittle32_t Unknown;
+    ulittle32_t BlockMapAddr;
+  };
+
+- **FileMagic** - Must be equal to ``"Microsoft C/C++ MSF 7.00\r\n"``
+  followed by the bytes ``1A 44 53 00 00 00``.
+- **BlockSize** - The block size of the internal file system.  Valid values are
+  512, 1024, 2048, and 4096 bytes.  Certain aspects of the MSF file layout vary
+  depending on the block sizes.  For the purposes of LLVM, we handle only block
+  sizes of 4KiB, and all further discussion assumes a block size of 4KiB.
+- **FreeBlockMapBlock** - The index of a block within the file, at which begins
+  a bitfield representing the set of all blocks within the file which are "free"
+  (i.e. the data within that block is not used).  See :ref:`msf_freeblockmap`
+  for more information.
+  **Important**: ``FreeBlockMapBlock`` can only be ``1`` or ``2``!
+- **NumBlocks** - The total number of blocks in the file.  ``NumBlocks *
+  BlockSize`` should equal the size of the file on disk.
+- **NumDirectoryBytes** - The size of the stream directory, in bytes.  The
+  stream directory contains information about each stream's size and the set of
+  blocks that it occupies.  It will be described in more detail later.
+- **BlockMapAddr** - The index of a block within the MSF file.  At this block is
+  an array of ``ulittle32_t``'s listing the blocks that the stream directory
+  resides on.  For large MSF files, the stream directory (which describes the
+  block layout of each stream) may not fit entirely on a single block.  As a
+  result, this extra layer of indirection is introduced, whereby this block
+  contains the list of blocks that the stream directory occupies, and the stream
+  directory itself can be stitched together accordingly.  The number of
+  ``ulittle32_t``'s in this array is given by ``ceil(NumDirectoryBytes /
+  BlockSize)``.
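+
+The constraints listed above lend themselves to a quick sanity check before
+any further parsing.  The sketch below is illustrative only: the
+``SuperBlockFields`` struct and both function names are hypothetical, the
+fields are assumed to be already decoded from little-endian, and the
+requirement that ``BlockMapAddr`` be smaller than ``NumBlocks`` is an
+assumption that follows from it being a block index within the file.
+
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the decoded SuperBlock fields used below.
struct SuperBlockFields {
  uint32_t BlockSize;
  uint32_t FreeBlockMapBlock;
  uint32_t NumBlocks;
  uint32_t NumDirectoryBytes;
  uint32_t BlockMapAddr;
};

// Checks the constraints stated in the field descriptions.
inline bool isPlausibleSuperBlock(const SuperBlockFields &SB,
                                  uint64_t FileSize) {
  bool ValidBlockSize = SB.BlockSize == 512 || SB.BlockSize == 1024 ||
                        SB.BlockSize == 2048 || SB.BlockSize == 4096;
  if (!ValidBlockSize)
    return false;
  if (SB.FreeBlockMapBlock != 1 && SB.FreeBlockMapBlock != 2)
    return false;
  if (uint64_t(SB.NumBlocks) * SB.BlockSize != FileSize)
    return false;
  // Assumption: the block map must live inside the file.
  return SB.BlockMapAddr < SB.NumBlocks;
}

// Number of ulittle32_t entries in the array at BlockMapAddr, i.e.
// ceil(NumDirectoryBytes / BlockSize).
inline uint32_t numDirectoryBlocks(const SuperBlockFields &SB) {
  assert(SB.BlockSize != 0);
  return static_cast<uint32_t>(
      (uint64_t(SB.NumDirectoryBytes) + SB.BlockSize - 1) / SB.BlockSize);
}
```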
+
+.. _msf_freeblockmap:
+
+The Free Block Map
+==================
+
+The Free Block Map (sometimes referred to as the Free Page Map, or FPM) is a
+series of blocks which contains a bit flag for every block in the file. The
+flag will be set to 0 if the block is in use, and 1 if the block is unused.
+
+Each file contains two FPMs, one of which is active at any given time. This
+feature is designed to support incremental and atomic updates of the underlying
+MSF file. While writing to an MSF file, if the active FPM is FPM1, you can
+write your new modified bitfield to FPM2, and vice versa. Only when you commit
+the file to disk do you need to swap the value in the SuperBlock to point to
+the new ``FreeBlockMapBlock``.
+
+The Free Block Maps are stored as a series of single blocks throughout the file
+at intervals of ``BlockSize`` blocks. Because each FPM block is of size ``BlockSize``
+bytes, it contains 8 times as many bits as an interval has blocks. This means
+that the first block of each FPM refers to the first 8 intervals of the file
+(the first 32768 blocks), the second block of each FPM refers to the next 8
+intervals, and so on. This results in far more FPM blocks being present than are
+required, but in order to maintain backwards compatibility the format must stay
+this way.
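+
+To make the arithmetic concrete, the following sketch lists the file blocks
+that hold the active FPM and computes how many of them are actually needed to
+cover the file.  The function names are hypothetical, and the sketch assumes
+that an FPM block only exists if its index is below ``NumBlocks``.
+
```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: file block indices of the active FPM, one per
// interval, starting at FreeBlockMapBlock and repeating every BlockSize
// blocks.
inline std::vector<uint32_t> fpmBlockIndices(uint32_t FreeBlockMapBlock,
                                             uint32_t NumBlocks,
                                             uint32_t BlockSize) {
  assert(FreeBlockMapBlock == 1 || FreeBlockMapBlock == 2);
  assert(BlockSize != 0);
  std::vector<uint32_t> Indices;
  for (uint64_t Index = FreeBlockMapBlock; Index < NumBlocks;
       Index += BlockSize)
    Indices.push_back(static_cast<uint32_t>(Index));
  return Indices;
}

// Hypothetical helper: FPM blocks needed to hold one bit per file block.
// Each FPM block carries BlockSize * 8 bits.
inline uint32_t fpmBlocksNeeded(uint32_t NumBlocks, uint32_t BlockSize) {
  assert(BlockSize != 0);
  uint64_t BitsPerFpmBlock = uint64_t(BlockSize) * 8;
  return static_cast<uint32_t>((NumBlocks + BitsPerFpmBlock - 1) /
                               BitsPerFpmBlock);
}
```
+
+For a 10000-block file with 4096-byte blocks, three FPM1 blocks are present
+(at indices 1, 4097 and 8193) although a single one suffices to describe
+every block.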
+
+The Stream Directory
+====================
+The Stream Directory is the root of all access to the other streams in an MSF
+file.  Beginning at byte 0 of the stream directory is the following structure:
+
+.. code-block:: c++
+
+  struct StreamDirectory {
+    ulittle32_t NumStreams;
+    ulittle32_t StreamSizes[NumStreams];
+    ulittle32_t StreamBlocks[NumStreams][];
+  };
+
+And this structure occupies exactly ``SuperBlock->NumDirectoryBytes`` bytes.
+Note that each of the last two arrays is of variable length, and in particular
+that the second array is jagged.
+
+**Example:** Suppose a hypothetical PDB file with a 4KiB block size, and 4
+streams of lengths {1000 bytes, 8000 bytes, 16000 bytes, 9000 bytes}.
+
+Stream 0: ceil(1000 / 4096) = 1 block
+
+Stream 1: ceil(8000 / 4096) = 2 blocks
+
+Stream 2: ceil(16000 / 4096) = 4 blocks
+
+Stream 3: ceil(9000 / 4096) = 3 blocks
+
+In total, 10 blocks are used.  Let's see what the stream directory might look
+like:
+
+.. code-block:: c++
+
+  struct StreamDirectory {
+    ulittle32_t NumStreams = 4;
+    ulittle32_t StreamSizes[] = {1000, 8000, 16000, 9000};
+    ulittle32_t StreamBlocks[][] = {
+      {4},
+      {5, 6},
+      {11, 9, 7, 8},
+      {10, 15, 12}
+    };
+  };
+
+In total, this occupies ``15 * 4 = 60`` bytes, so
+``SuperBlock->NumDirectoryBytes`` would equal ``60``, and the block at
+``SuperBlock->BlockMapAddr`` would hold an array of one ``ulittle32_t``, since
+``60 <= SuperBlock->BlockSize``.
+
+Note also that the streams are discontiguous, and that part of stream 3 is in the
+middle of part of stream 2.  You cannot assume anything about the layout of the
+blocks!
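+
+Because the jagged ``StreamBlocks`` array has no explicit row lengths, each
+row's length has to be derived from the corresponding stream size.  A minimal
+sketch of such a parser follows; ``ParsedDirectory`` and
+``parseStreamDirectory`` are hypothetical names, and the input is assumed to
+be the already stitched-together directory, decoded into host-order 32-bit
+words.
+
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch.
struct ParsedDirectory {
  std::vector<uint32_t> StreamSizes;
  std::vector<std::vector<uint32_t>> StreamBlocks;
};

// Splits the directory words into sizes and per-stream block lists.
// Returns false if the input is too short for what it claims to contain.
inline bool parseStreamDirectory(const std::vector<uint32_t> &Words,
                                 uint32_t BlockSize, ParsedDirectory &Out) {
  assert(BlockSize != 0);
  if (Words.empty())
    return false;
  std::size_t Pos = 0;
  uint32_t NumStreams = Words[Pos++];
  if (Words.size() - Pos < NumStreams)
    return false;
  auto SizesBegin = Words.begin() + static_cast<std::ptrdiff_t>(Pos);
  Out.StreamSizes.assign(SizesBegin, SizesBegin + NumStreams);
  Pos += NumStreams;
  Out.StreamBlocks.clear();
  for (uint32_t Size : Out.StreamSizes) {
    // Each stream occupies ceil(Size / BlockSize) blocks.
    uint64_t NumBlocks = (uint64_t(Size) + BlockSize - 1) / BlockSize;
    if (Words.size() - Pos < NumBlocks)
      return false;
    auto First = Words.begin() + static_cast<std::ptrdiff_t>(Pos);
    auto Last = First + static_cast<std::ptrdiff_t>(NumBlocks);
    Out.StreamBlocks.emplace_back(First, Last);
    Pos += static_cast<std::size_t>(NumBlocks);
  }
  return true;
}
```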
+
+Alignment and Block Boundaries
+==============================
+As may be clear by now, it is possible for a single field (whether it be a high
+level record, a long string field, or even a single ``uint16``) to begin and
+end in separate blocks.  For example, if the block size is 4096 bytes, and a
+``uint16`` field begins at the last byte of the current block, then it would
+need to end on the first byte of the next block.  Since blocks are not
+necessarily contiguously laid out in the file, this means that both the consumer
+and the producer of an MSF file must be prepared to split data apart
+accordingly.  In the aforementioned example, because integers are stored
+little-endian, the low byte of the ``uint16`` would be written to the last
+byte of the current block, and the high byte would be written to the first
+byte of the stream's next block, which could be tens of thousands of bytes
+later (or even earlier!) in the file, depending on what the stream directory
+says.
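+
+One way to cope with this is to translate every stream offset to a file
+offset individually.  The sketch below does so one byte at a time, which is
+slow but makes the block crossing explicit; ``streamToFileOffset`` and
+``readStreamBytes`` are hypothetical names, and ``Blocks`` is assumed to be
+the stream's block list taken from the stream directory.
+
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: maps an offset within a stream to an offset within
// the file, using the stream's list of block indices.
inline uint64_t streamToFileOffset(const std::vector<uint32_t> &Blocks,
                                   uint32_t BlockSize, uint64_t StreamOffset) {
  assert(BlockSize != 0);
  uint64_t BlockInStream = StreamOffset / BlockSize;
  assert(BlockInStream < Blocks.size());
  return uint64_t(Blocks[static_cast<std::size_t>(BlockInStream)]) * BlockSize +
         StreamOffset % BlockSize;
}

// Hypothetical helper: copies Length bytes starting at StreamOffset. Each
// byte is looked up separately, so a field that straddles two blocks is
// reassembled even when those blocks are far apart in the file.
inline std::vector<uint8_t> readStreamBytes(const std::vector<uint8_t> &File,
                                            const std::vector<uint32_t> &Blocks,
                                            uint32_t BlockSize,
                                            uint64_t StreamOffset,
                                            std::size_t Length) {
  std::vector<uint8_t> Out;
  Out.reserve(Length);
  for (std::size_t I = 0; I < Length; ++I) {
    uint64_t FileOffset = streamToFileOffset(Blocks, BlockSize, StreamOffset + I);
    assert(FileOffset < File.size());
    Out.push_back(File[static_cast<std::size_t>(FileOffset)]);
  }
  return Out;
}
```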

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/PdbStream.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/PdbStream.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/PdbStream.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/PdbStream.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,154 @@
+========================================
+The PDB Info Stream (aka the PDB Stream)
+========================================
+
+.. contents::
+   :local:
+
+.. _pdb_stream_header:
+
+Stream Header
+=============
+At offset 0 of the PDB Stream is a header with the following layout:
+
+
+.. code-block:: c++
+
+  struct PdbStreamHeader {
+    ulittle32_t Version;
+    ulittle32_t Signature;
+    ulittle32_t Age;
+    Guid UniqueId;
+  };
+
+- **Version** - A value from the following enum:
+
+.. code-block:: c++
+
+  enum class PdbStreamVersion : uint32_t {
+    VC2 = 19941610,
+    VC4 = 19950623,
+    VC41 = 19950814,
+    VC50 = 19960307,
+    VC98 = 19970604,
+    VC70Dep = 19990604,
+    VC70 = 20000404,
+    VC80 = 20030901,
+    VC110 = 20091201,
+    VC140 = 20140508,
+  };
+
+While the meaning of this field appears to be obvious, in practice we have
+never observed a value other than ``VC70``, even with modern versions of
+the toolchain, and it is unclear why the other values exist.  It is assumed
+that certain aspects of the PDB stream's layout, and perhaps even that of
+the other streams, will change if the value is something other than ``VC70``.
+
+- **Signature** - A 32-bit time-stamp generated with a call to ``time()`` at
+  the time the PDB file is written.  Note that due to the inherent uniqueness
+  problems of using a timestamp with 1-second granularity, this field does not
+  really serve its intended purpose, and as such is typically ignored in favor
+  of the ``Guid`` field, described below.
+  
+- **Age** - The number of times the PDB file has been written.  This can be used
+  along with ``Guid`` to match the PDB to its corresponding executable.
+  
+- **UniqueId** - The PDB's ``Guid``, a 128-bit identifier guaranteed to be
+  unique across space and time.  In general, this can be thought of as the
+  result of calling the Win32 API
+  `UuidCreate <https://msdn.microsoft.com/en-us/library/windows/desktop/aa379205(v=vs.85).aspx>`__,
+  although LLVM cannot rely on that, as it must work on non-Windows platforms.
+  
+.. _pdb_named_stream_map:
+
+Named Stream Map
+================
+
+Following the header is a serialized hash table whose key type is a string, and
+whose value type is an integer.  The existence of a mapping ``X -> Y`` means
+that the stream with the name ``X`` has stream index ``Y`` in the underlying MSF
+file.  Note that not all streams are named (for example, the 
+:doc:`TPI Stream <TpiStream>` has a fixed index and as such there is no need to
+look up its index by name).  In practice, there are usually only a small number
+of named streams and these are enumerated in the table of streams in :doc:`index`.
+A corollary of this is if a stream does have a name (and as such is in the named
+stream map) then consulting the Named Stream Map is likely to be the only way to
+discover the stream's MSF stream index.  Several important streams (such as the
+global string table, which is called ``/names``) can only be located this way, and
+so it is important to both produce and consume this correctly as tools will not
+function correctly without it.
+
+.. important::
+   Some streams are located by fixed indices (e.g. the TPI Stream has index 2), but
+   other streams are located by fixed names (e.g. the string table is called
+   ``/names``) and can only be located by consulting the Named Stream Map.
+
+The on-disk layout of the Named Stream Map consists of 2 components.  The first is
+a buffer of string data prefixed by a 32-bit length.  The second is a serialized
+hash table whose key and value types are both ``uint32_t``.  The key is the offset
+of a null-terminated string in the string data buffer specifying the name of the
+stream, and the value is the MSF stream index of the stream with said name. 
+Note that although the key is an integer, the hash function used to find the right
+bucket hashes the string at the corresponding offset in the string data buffer.
+
+The on-disk layout of the serialized hash table is described at :doc:`HashTable`.
+
+Note that the entire Named Stream Map is not length-prefixed, so the only way to
+get to the data following it is to de-serialize it in its entirety.
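+
+Once both components have been de-serialized, resolving a name amounts to
+comparing it against the string found at each key's offset.  The sketch below
+uses a linear scan over the occupied buckets instead of the hash function,
+which this document does not specify; ``lookupNamedStream`` is a hypothetical
+name, and ``Entries`` is assumed to hold the (key, value) pairs of the valid
+buckets only.
+
```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: finds the MSF stream index for Name. Each entry's
// first member is an offset into StringData, the second is a stream index.
inline bool lookupNamedStream(
    const std::vector<char> &StringData,
    const std::vector<std::pair<uint32_t, uint32_t>> &Entries,
    const std::string &Name, uint32_t &StreamIndex) {
  for (const auto &Entry : Entries) {
    if (Entry.first >= StringData.size())
      continue; // offset points outside the string buffer
    auto Begin = StringData.begin() + Entry.first;
    auto End = std::find(Begin, StringData.end(), '\0');
    if (End == StringData.end())
      continue; // string is not null-terminated
    if (std::string(Begin, End) == Name) {
      StreamIndex = Entry.second;
      return true;
    }
  }
  return false;
}
```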
+
+  
+.. _pdb_stream_features:
+
+PDB Feature Codes
+=================
+Following the Named Stream Map, and consuming all remaining bytes of the PDB
+Stream is a list of values from the following enumeration:
+
+.. code-block:: c++
+
+  enum class PdbRaw_FeatureSig : uint32_t {
+    VC110 = 20091201,
+    VC140 = 20140508,
+    NoTypeMerge = 0x4D544F4E,
+    MinimalDebugInfo = 0x494E494D,
+  };
+  
+The meaning of these values is summarized by the following table:
+
++------------------+-------------------------------------------------+
+| Flag             | Meaning                                         |
++==================+=================================================+
+| VC110            | - No other feature flags are present            |
+|                  | - PDB contains an :doc:`IPI Stream <TpiStream>` |
++------------------+-------------------------------------------------+
+| VC140            | - Other feature flags may be present            |
+|                  | - PDB contains an :doc:`IPI Stream <TpiStream>` |
++------------------+-------------------------------------------------+
+| NoTypeMerge      | - Presumably duplicate types can appear in the  |
+|                  |   TPI Stream, although it's unclear why this    |
+|                  |   might happen.                                 |
++------------------+-------------------------------------------------+
+| MinimalDebugInfo | - Program was linked with /DEBUG:FASTLINK       |
+|                  | - There is no TPI / IPI stream, all type info   |
+|                  |   is contained in the original object files.    |
++------------------+-------------------------------------------------+
+  
+Matching a PDB to its executable
+================================
+The linker is responsible for writing both the PDB and the final executable, and
+as a result is the only entity capable of writing the information necessary to
+match the PDB to the executable.
+
+In order to accomplish this, the linker generates a guid for the PDB (or
+re-uses the existing guid if it is linking incrementally) and increments the Age
+field.
+
+The executable is a PE/COFF file, and part of a PE/COFF file is the presence of
+a number of "directories".  For our purposes here, we are interested in the "debug
+directory".  The exact format of a debug directory is described by the
+`IMAGE_DEBUG_DIRECTORY structure <https://msdn.microsoft.com/en-us/library/windows/desktop/ms680307(v=vs.85).aspx>`__.
+For this particular case, the linker emits a debug directory of type
+``IMAGE_DEBUG_TYPE_CODEVIEW``.  The format of this record is defined in
+``llvm/DebugInfo/CodeView/CVDebugRecord.h``, but it suffices to say here only
+that it includes the same ``Guid`` and ``Age`` fields.  At runtime, a
+debugger or tool can scan the COFF executable image for the presence of
+a debug directory of the correct type and verify that the Guid and Age match.

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/PublicStream.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/PublicStream.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/PublicStream.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/PublicStream.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,3 @@
+=====================================
+The PDB Public Symbol Stream
+=====================================

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/TpiStream.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/TpiStream.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/TpiStream.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/TpiStream.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,314 @@
+=====================================
+The PDB TPI and IPI Streams
+=====================================
+
+.. contents::
+   :local:
+
+.. _tpi_intro:
+
+Introduction
+============
+
+The PDB TPI Stream (Index 2) and IPI Stream (Index 4) contain information about
+all types used in the program.  Each is organized as a :ref:`header <tpi_header>`
+followed by a list of :doc:`CodeView Type Records <CodeViewTypes>`.  Types are
+referenced from various streams and records throughout the PDB by their
+:ref:`type index <type_indices>`.  In general, the sequence of type records
+following the :ref:`header <tpi_header>` forms a topologically sorted DAG
+(directed acyclic graph), which means that a type record B can only refer to
+the type A if ``A.TypeIndex < B.TypeIndex``.  While there are rare cases where
+this property will not hold (particularly when dealing with object files
+compiled with MASM), an implementation should try very hard to make this
+property hold, as it means the entire type graph can be constructed in a single
+pass.
+
+.. important::
+   Type records form a topologically sorted DAG (directed acyclic graph).
+
+.. _tpi_ipi:
+
+TPI vs IPI Stream
+=================
+
+Recent versions of the PDB format (aka all versions covered by this document)
+have 2 streams with identical layout, henceforth referred to as the TPI stream
+and IPI stream.  Subsequent contents of this document describing the on-disk
+format apply equally whether it is for the TPI Stream or the IPI Stream.  The
+only difference between the two is in *which* CodeView records are allowed to
+appear in each one, summarized by the following table:
+
++----------------------+---------------------+
+|    TPI Stream        |    IPI Stream       |
++======================+=====================+
+|  LF_POINTER          | LF_FUNC_ID          |
++----------------------+---------------------+
+|  LF_MODIFIER         | LF_MFUNC_ID         |
++----------------------+---------------------+
+|  LF_PROCEDURE        | LF_BUILDINFO        |
++----------------------+---------------------+
+|  LF_MFUNCTION        | LF_SUBSTR_LIST      |
++----------------------+---------------------+
+|  LF_LABEL            | LF_STRING_ID        |
++----------------------+---------------------+
+|  LF_ARGLIST          | LF_UDT_SRC_LINE     |
++----------------------+---------------------+
+|  LF_FIELDLIST        | LF_UDT_MOD_SRC_LINE |
++----------------------+---------------------+
+|  LF_ARRAY            |                     |
++----------------------+---------------------+
+|  LF_CLASS            |                     |
++----------------------+---------------------+
+|  LF_STRUCTURE        |                     |
++----------------------+---------------------+
+|  LF_INTERFACE        |                     |
++----------------------+---------------------+
+|  LF_UNION            |                     |
++----------------------+---------------------+
+|  LF_ENUM             |                     |
++----------------------+---------------------+
+|  LF_TYPESERVER2      |                     |
++----------------------+---------------------+
+|  LF_VFTABLE          |                     |
++----------------------+---------------------+
+|  LF_VTSHAPE          |                     |
++----------------------+---------------------+
+|  LF_BITFIELD         |                     |
++----------------------+---------------------+
+|  LF_METHODLIST       |                     |
++----------------------+---------------------+
+|  LF_PRECOMP          |                     |
++----------------------+---------------------+
+|  LF_ENDPRECOMP       |                     |
++----------------------+---------------------+
+
+The usage of these records is described in more detail in
+:doc:`CodeView Type Records <CodeViewTypes>`.
+
+.. _type_indices:
+
+Type Indices
+============
+
+A type index is a 32-bit integer that uniquely identifies a type inside of an
+object file's ``.debug$T`` section or a PDB file's TPI or IPI stream.  The
+value of the type index for the first type record from the TPI stream is given
+by the ``TypeIndexBegin`` member of the :ref:`TPI Stream Header <tpi_header>`
+although in practice this value is always equal to 0x1000 (4096).
+
+Any type index with a high bit set is considered to come from the IPI stream,
+although this appears to be more of a hack, and LLVM does not generate type
+indices of this nature.  They can, however, be observed in Microsoft PDBs
+occasionally, so one should be prepared to handle them.  Note that having the
+high bit set is not a necessary condition for a type index to come from the
+IPI stream; it is only a sufficient one.
+
+Once the high bit is cleared, any type index >= ``TypeIndexBegin`` is presumed
+to come from the appropriate stream, and any type index less than this is a
+bitmask which can be decomposed as follows:
+
+.. code-block:: none
+
+  .---------------------------.------.----------.
+  |           Unused          | Mode |   Kind   |
+  '---------------------------'------'----------'
+  |+32                        |+12   |+8        |+0
+
+
+- **Kind** - A value from the following enum:
+
+.. code-block:: c++
+
+  enum class SimpleTypeKind : uint32_t {
+    None = 0x0000,          // uncharacterized type (no type)
+    Void = 0x0003,          // void
+    NotTranslated = 0x0007, // type not translated by cvpack
+    HResult = 0x0008,       // OLE/COM HRESULT
+
+    SignedCharacter = 0x0010,   // 8 bit signed
+    UnsignedCharacter = 0x0020, // 8 bit unsigned
+    NarrowCharacter = 0x0070,   // really a char
+    WideCharacter = 0x0071,     // wide char
+    Character16 = 0x007a,       // char16_t
+    Character32 = 0x007b,       // char32_t
+
+    SByte = 0x0068,       // 8 bit signed int
+    Byte = 0x0069,        // 8 bit unsigned int
+    Int16Short = 0x0011,  // 16 bit signed
+    UInt16Short = 0x0021, // 16 bit unsigned
+    Int16 = 0x0072,       // 16 bit signed int
+    UInt16 = 0x0073,      // 16 bit unsigned int
+    Int32Long = 0x0012,   // 32 bit signed
+    UInt32Long = 0x0022,  // 32 bit unsigned
+    Int32 = 0x0074,       // 32 bit signed int
+    UInt32 = 0x0075,      // 32 bit unsigned int
+    Int64Quad = 0x0013,   // 64 bit signed
+    UInt64Quad = 0x0023,  // 64 bit unsigned
+    Int64 = 0x0076,       // 64 bit signed int
+    UInt64 = 0x0077,      // 64 bit unsigned int
+    Int128Oct = 0x0014,   // 128 bit signed int
+    UInt128Oct = 0x0024,  // 128 bit unsigned int
+    Int128 = 0x0078,      // 128 bit signed int
+    UInt128 = 0x0079,     // 128 bit unsigned int
+
+    Float16 = 0x0046,                 // 16 bit real
+    Float32 = 0x0040,                 // 32 bit real
+    Float32PartialPrecision = 0x0045, // 32 bit PP real
+    Float48 = 0x0044,                 // 48 bit real
+    Float64 = 0x0041,                 // 64 bit real
+    Float80 = 0x0042,                 // 80 bit real
+    Float128 = 0x0043,                // 128 bit real
+
+    Complex16 = 0x0056,                 // 16 bit complex
+    Complex32 = 0x0050,                 // 32 bit complex
+    Complex32PartialPrecision = 0x0055, // 32 bit PP complex
+    Complex48 = 0x0054,                 // 48 bit complex
+    Complex64 = 0x0051,                 // 64 bit complex
+    Complex80 = 0x0052,                 // 80 bit complex
+    Complex128 = 0x0053,                // 128 bit complex
+
+    Boolean8 = 0x0030,   // 8 bit boolean
+    Boolean16 = 0x0031,  // 16 bit boolean
+    Boolean32 = 0x0032,  // 32 bit boolean
+    Boolean64 = 0x0033,  // 64 bit boolean
+    Boolean128 = 0x0034, // 128 bit boolean
+  };
+
+- **Mode** - A value from the following enum:
+
+.. code-block:: c++
+
+  enum class SimpleTypeMode : uint32_t {
+    Direct = 0,        // Not a pointer
+    NearPointer = 1,   // Near pointer
+    FarPointer = 2,    // Far pointer
+    HugePointer = 3,   // Huge pointer
+    NearPointer32 = 4, // 32 bit near pointer
+    FarPointer32 = 5,  // 32 bit far pointer
+    NearPointer64 = 6, // 64 bit near pointer
+    NearPointer128 = 7 // 128 bit near pointer
+  };
+
+Note that for pointers, the bitness is represented in the mode.  So a ``void*``
+would have a type index with ``Mode=NearPointer32, Kind=Void`` if built for
+32-bits but a type index with ``Mode=NearPointer64, Kind=Void`` if built for
+64-bits.
+
+By convention, the type index for ``std::nullptr_t`` is constructed the same
+way as the type index for ``void*``, but using the bitless enumeration value
+``NearPointer``.
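+
+As a rough illustration of the note above, the following sketch composes simple
+type indices from a mode and a kind.  It assumes the bit layout of simple type
+indices described earlier in this document (``Kind`` in the low 8 bits, ``Mode``
+in the bits directly above it) and that ``Void`` has the value ``0x0003``.  The
+helper ``makeSimpleTypeIndex`` is a hypothetical name, not an LLVM API.
+
```cpp
#include <cstdint>

// Subset of the enums above, repeated so the sketch is self-contained.
// Void = 0x0003 is assumed from the full SimpleTypeKind enum.
enum class SimpleTypeKind : uint32_t { Void = 0x0003 };

enum class SimpleTypeMode : uint32_t {
  Direct = 0,
  NearPointer = 1,
  NearPointer32 = 4,
  NearPointer64 = 6
};

// Hypothetical helper. Assumed layout: Kind in bits 0-7, Mode starting at bit 8.
constexpr uint32_t makeSimpleTypeIndex(SimpleTypeMode Mode,
                                       SimpleTypeKind Kind) {
  return (static_cast<uint32_t>(Mode) << 8) | static_cast<uint32_t>(Kind);
}

// void* in a 32-bit build, void* in a 64-bit build, and std::nullptr_t.
constexpr uint32_t VoidPtr32 =
    makeSimpleTypeIndex(SimpleTypeMode::NearPointer32, SimpleTypeKind::Void);
constexpr uint32_t VoidPtr64 =
    makeSimpleTypeIndex(SimpleTypeMode::NearPointer64, SimpleTypeKind::Void);
constexpr uint32_t NullPtrT =
    makeSimpleTypeIndex(SimpleTypeMode::NearPointer, SimpleTypeKind::Void);
```
+
+Under these assumptions the three values differ only in the mode bits, which is
+why the bitness of a pointer never needs a separate kind.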
+
+.. _tpi_header:
+
+Stream Header
+=============
+At offset 0 of the TPI Stream is a header with the following layout:
+
+.. code-block:: c++
+
+  struct TpiStreamHeader {
+    uint32_t Version;
+    uint32_t HeaderSize;
+    uint32_t TypeIndexBegin;
+    uint32_t TypeIndexEnd;
+    uint32_t TypeRecordBytes;
+
+    uint16_t HashStreamIndex;
+    uint16_t HashAuxStreamIndex;
+    uint32_t HashKeySize;
+    uint32_t NumHashBuckets;
+
+    int32_t HashValueBufferOffset;
+    uint32_t HashValueBufferLength;
+
+    int32_t IndexOffsetBufferOffset;
+    uint32_t IndexOffsetBufferLength;
+
+    int32_t HashAdjBufferOffset;
+    uint32_t HashAdjBufferLength;
+  };
+
+- **Version** - A value from the following enum.
+
+.. code-block:: c++
+
+  enum class TpiStreamVersion : uint32_t {
+    V40 = 19950410,
+    V41 = 19951122,
+    V50 = 19961031,
+    V70 = 19990903,
+    V80 = 20040203,
+  };
+
+Similar to the :doc:`PDB Stream <PdbStream>`, this value always appears to be
+``V80``, and no other values have been observed.  It is assumed that should
+another value be observed, the layout described by this document may not be
+accurate.
+
+- **HeaderSize** - ``sizeof(TpiStreamHeader)``
+
+- **TypeIndexBegin** - The numeric value of the type index representing the
+  first type record in the TPI stream.  This is usually the value 0x1000 as
+  type indices lower than this are reserved (see :ref:`Type Indices
+  <type_indices>` for
+  a discussion of reserved type indices).
+
+- **TypeIndexEnd** - One greater than the numeric value of the type index
+  representing the last type record in the TPI stream.  The total number of
+  type records in the TPI stream can be computed as ``TypeIndexEnd -
+  TypeIndexBegin``.
+
+- **TypeRecordBytes** - The number of bytes of type record data following the
+  header.
+
+- **HashStreamIndex** - The index of a stream which contains a list of hashes
+  for every type record.  This value may be -1 (that is, ``0xFFFF`` when read
+  as a ``uint16_t``), indicating that hash information is not present.  In
+  practice a valid stream index is always observed, so any producer
+  implementation should be prepared to emit this stream to ensure
+  compatibility with tools which may expect it to be present.
+
+- **HashAuxStreamIndex** - Presumably the index of a stream which contains a
+  separate hash table, although this has not been observed in practice and it's
+  unclear what it might be used for.
+
+- **HashKeySize** - The size of a hash value (usually 4 bytes).
+
+- **NumHashBuckets** - The number of buckets used to generate the hash values
+  in the aforementioned hash streams.
+
+- **HashValueBufferOffset / HashValueBufferLength** - The offset and size within
+  the TPI Hash Stream of the list of hash values.  It should be assumed that
+  there are either 0 hash values, or a number equal to the number of type
+  records in the TPI stream (``TypeIndexEnd - TypeIndexBegin``).  Thus, if
+  ``HashValueBufferLength`` is neither 0 nor equal to ``(TypeIndexEnd -
+  TypeIndexBegin) * HashKeySize``, we can consider the PDB malformed.
+
+- **IndexOffsetBufferOffset / IndexOffsetBufferLength** - The offset and size
+  within the TPI Hash Stream of the Type Index Offsets Buffer.  This is a list
+  of pairs of uint32_t's where the first value is a :ref:`Type Index
+  <type_indices>` and the second value is the offset in the type record data of
+  the type with this index.  This can be used to do a binary search followed by
+  a linear search to get O(log n) lookup by type index.
+
+- **HashAdjBufferOffset / HashAdjBufferLength** - The offset and size within
+  the TPI hash stream of a serialized hash table whose keys are the hash values
+  in the hash value buffer and whose values are type indices.  This appears to
+  be useful in incremental linking scenarios, so that if a type is modified an
+  entry can be created mapping the old hash value to the new type index so that
+  a PDB file consumer can always have the most up to date version of the type
+  without forcing the incremental linker to garbage collect and update
+  references that point to the old version to now point to the new version.
+  The layout of this hash table is described in :doc:`HashTable`.
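+
+The consistency rules in the field descriptions above can be sketched as
+follows.  This is a minimal sketch rather than LLVM's actual validation code;
+the function names are hypothetical, and the struct is repeated only so that
+the example is self-contained.
+
```cpp
#include <cstdint>

struct TpiStreamHeader {
  uint32_t Version;
  uint32_t HeaderSize;
  uint32_t TypeIndexBegin;
  uint32_t TypeIndexEnd;
  uint32_t TypeRecordBytes;

  uint16_t HashStreamIndex;
  uint16_t HashAuxStreamIndex;
  uint32_t HashKeySize;
  uint32_t NumHashBuckets;

  int32_t HashValueBufferOffset;
  uint32_t HashValueBufferLength;

  int32_t IndexOffsetBufferOffset;
  uint32_t IndexOffsetBufferLength;

  int32_t HashAdjBufferOffset;
  uint32_t HashAdjBufferLength;
};

// The fields above add up to 56 bytes with no padding on common ABIs.
static_assert(sizeof(TpiStreamHeader) == 56, "unexpected header size");

// Hypothetical helper: number of type records described by the header.
// Only meaningful when TypeIndexBegin <= TypeIndexEnd.
constexpr uint32_t numTypeRecords(const TpiStreamHeader &H) {
  return H.TypeIndexEnd - H.TypeIndexBegin;
}

// Hypothetical helper: the hash value buffer is either empty or holds exactly
// one HashKeySize-byte value per type record.
constexpr bool hashValueBufferIsConsistent(const TpiStreamHeader &H) {
  return H.TypeIndexBegin <= H.TypeIndexEnd &&
         (H.HashValueBufferLength == 0 ||
          static_cast<uint64_t>(H.HashValueBufferLength) ==
              static_cast<uint64_t>(numTypeRecords(H)) * H.HashKeySize);
}
```
+
+The multiplication is done in 64 bits so that a hostile header cannot make the
+check pass through 32-bit overflow.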
+
+.. _tpi_records:
+
+CodeView Type Record List
+=========================
+Following the header, there are ``TypeRecordBytes`` bytes of data that
+represent a variable length array of :doc:`CodeView type records
+<CodeViewTypes>`.  The number of such records (i.e. the length of the array)
+can be determined by computing the value ``Header.TypeIndexEnd -
+Header.TypeIndexBegin``.
+
+O(log(n)) access is provided by way of the Type Index Offsets array (if
+present) described previously.
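+
+A lookup through the Type Index Offsets array might look like the sketch below.
+It assumes that each type record starts with a little-endian ``uint16_t``
+length that does not count the length field itself (see the CodeView type
+record documentation), and that the array is sorted by type index.  The names
+``TypeIndexOffset`` and ``findTypeRecord`` are hypothetical.
+
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One entry of the Type Index Offsets buffer.
struct TypeIndexOffset {
  uint32_t Type;   // type index
  uint32_t Offset; // offset of that record within the type record data
};

// Returns the byte offset of the record with type index TI, or -1 if it
// cannot be found. Assumes Index is sorted by Type.
int64_t findTypeRecord(const std::vector<TypeIndexOffset> &Index,
                       const std::vector<uint8_t> &Records, uint32_t TI) {
  assert(std::is_sorted(Index.begin(), Index.end(),
                        [](const TypeIndexOffset &A, const TypeIndexOffset &B) {
                          return A.Type < B.Type;
                        }));

  // Binary search for the last entry whose type index is <= TI.
  auto It = std::upper_bound(
      Index.begin(), Index.end(), TI,
      [](uint32_t V, const TypeIndexOffset &E) { return V < E.Type; });
  if (It == Index.begin())
    return -1;
  --It;

  // Linear walk from that entry, one record at a time.
  uint32_t Cur = It->Type;
  size_t Off = It->Offset;
  while (Off + 2 <= Records.size()) {
    if (Cur == TI)
      return static_cast<int64_t>(Off);
    // Assumed record prefix: 16-bit little-endian length, excluding itself.
    size_t Len = Records[Off] | (static_cast<size_t>(Records[Off + 1]) << 8);
    Off += 2 + Len;
    ++Cur;
  }
  return -1;
}
```
+
+Because the array typically holds only a sample of the type indices, the
+linear walk after the binary search is what bridges the gap between two
+sampled entries.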

Added: www-releases/trunk/9.0.0/docs/_sources/PDB/index.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/PDB/index.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/PDB/index.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/PDB/index.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,169 @@
+=====================================
+The PDB File Format
+=====================================
+
+.. contents::
+   :local:
+
+.. _pdb_intro:
+
+Introduction
+============
+
+PDB (Program Database) is a file format invented by Microsoft that contains
+debug information that can be consumed by debuggers and other tools.  Since
+officially supported APIs exist on Windows for querying debug information from
+PDBs even without the user understanding the internals of the file format, a
+large ecosystem of tools has been built for Windows to consume this format.  In
+order for Clang to be able to generate programs that can interoperate with these
+tools, it is necessary for us to generate PDB files ourselves.
+
+At the same time, LLVM has a long history of being able to cross-compile from
+any platform to any platform, and we wish for the same to be true here.  So it
+is necessary for us to understand the PDB file format at the byte-level so that
+we can generate PDB files entirely on our own.
+
+This manual describes what we know about the PDB file format today: the layout
+of the file, the various streams contained within, the format of individual
+records within, and more.
+
+We would like to extend our heartfelt gratitude to Microsoft, without whom we
+would not be where we are today.  Much of the knowledge contained within this
+manual was learned through reading code published by Microsoft on their `GitHub
+repo <https://github.com/Microsoft/microsoft-pdb>`__.
+
+.. _pdb_layout:
+
+File Layout
+===========
+
+.. important::
+   Unless otherwise specified, all numeric values are encoded in little endian.
+   If you see a type such as ``uint16_t`` or ``uint64_t`` going forward, always
+   assume it is little endian!
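+
+A reader that must also run on big-endian hosts can decode such fields byte by
+byte instead of casting the buffer.  The helpers below are a minimal sketch
+with hypothetical names, not LLVM APIs.
+
```cpp
#include <cstdint>

// Decode a little-endian 16-bit value from two bytes.
constexpr uint16_t readLE16(const uint8_t *P) {
  return static_cast<uint16_t>(P[0] | (P[1] << 8));
}

// Decode a little-endian 32-bit value from four bytes.
constexpr uint32_t readLE32(const uint8_t *P) {
  return static_cast<uint32_t>(P[0]) | (static_cast<uint32_t>(P[1]) << 8) |
         (static_cast<uint32_t>(P[2]) << 16) |
         (static_cast<uint32_t>(P[3]) << 24);
}
```
+
+For example, the bytes ``0B CA 31 01`` decode to 20040203, the ``V80`` value
+of ``TpiStreamVersion``.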
+
+.. toctree::
+   :hidden:
+
+   MsfFile
+   PdbStream
+   TpiStream
+   DbiStream
+   ModiStream
+   PublicStream
+   GlobalStream
+   HashTable
+   CodeViewSymbols
+   CodeViewTypes
+
+.. _msf:
+
+The MSF Container
+-----------------
+A PDB file is an MSF (Multi-Stream Format) file.  An MSF file is a "file system
+within a file".  It contains multiple streams (aka files) which can represent
+arbitrary data, and these streams are divided into blocks which may not
+necessarily be contiguously laid out within the MSF container file.
+Additionally, the MSF contains a stream directory (aka MFT) which describes how
+the streams (files) are laid out within the MSF.
+
+For more information about the MSF container format, stream directory, and
+block layout, see :doc:`MsfFile`.
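+
+To make the idea of non-contiguous blocks concrete, the sketch below
+reassembles one stream from a list of block indices.  How the block size, the
+block list, and the stream size are obtained from the stream directory is
+covered in :doc:`MsfFile`; here they are simply passed in, and ``readStream``
+is a hypothetical helper.
+
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies StreamSize bytes out of File by visiting Blocks in order. Returns an
// empty vector if the block list is out of range or too short for StreamSize.
std::vector<uint8_t> readStream(const std::vector<uint8_t> &File,
                                size_t BlockSize,
                                const std::vector<uint32_t> &Blocks,
                                size_t StreamSize) {
  assert(BlockSize > 0 && "block size must be non-zero");
  std::vector<uint8_t> Out;
  Out.reserve(StreamSize);
  for (uint32_t Block : Blocks) {
    if (Out.size() == StreamSize)
      break;
    size_t Begin = static_cast<size_t>(Block) * BlockSize;
    if (Begin >= File.size())
      return {}; // block index points past the end of the file
    // The last block of a stream may be only partially used.
    size_t N = std::min({BlockSize, StreamSize - Out.size(),
                         File.size() - Begin});
    Out.insert(Out.end(), File.data() + Begin, File.data() + Begin + N);
  }
  if (Out.size() != StreamSize)
    return {}; // not enough blocks for the requested size
  return Out;
}
```
+
+A stream made of blocks 3 and 1, in that order, is read back as the contents
+of block 3 followed by the contents of block 1, regardless of where those
+blocks sit in the file.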
+
+.. _streams:
+
+Streams
+-------
+The PDB format contains a number of streams which describe various information
+such as the types, symbols, source files, and compilands (e.g. object files)
+of a program, as well as some additional streams containing hash tables that are
+used by debuggers and other tools to provide fast lookup of records and types
+by name, and various other information about how the program was compiled such
+as the specific toolchain used, and more.  A summary of streams contained in a
+PDB file is as follows:
+
++--------------------+------------------------------+-------------------------------------------+
+| Name               | Stream Index                 | Contents                                  |
++====================+==============================+===========================================+
+| Old Directory      | - Fixed Stream Index 0       | - Previous MSF Stream Directory           |
++--------------------+------------------------------+-------------------------------------------+
+| PDB Stream         | - Fixed Stream Index 1       | - Basic File Information                  |
+|                    |                              | - Fields to match EXE to this PDB         |
+|                    |                              | - Map of named streams to stream indices  |
++--------------------+------------------------------+-------------------------------------------+
+| TPI Stream         | - Fixed Stream Index 2       | - CodeView Type Records                   |
+|                    |                              | - Index of TPI Hash Stream                |
++--------------------+------------------------------+-------------------------------------------+
+| DBI Stream         | - Fixed Stream Index 3       | - Module/Compiland Information            |
+|                    |                              | - Indices of individual module streams    |
+|                    |                              | - Indices of public / global streams      |
+|                    |                              | - Section Contribution Information        |
+|                    |                              | - Source File Information                 |
+|                    |                              | - References to streams containing        |
+|                    |                              |   FPO / PGO Data                          |
++--------------------+------------------------------+-------------------------------------------+
+| IPI Stream         | - Fixed Stream Index 4       | - CodeView Type Records                   |
+|                    |                              | - Index of IPI Hash Stream                |
++--------------------+------------------------------+-------------------------------------------+
+| /LinkInfo          | - Contained in PDB Stream    | - Unknown                                 |
+|                    |   Named Stream map           |                                           |
++--------------------+------------------------------+-------------------------------------------+
+| /src/headerblock   | - Contained in PDB Stream    | - Summary of embedded source file content |
+|                    |   Named Stream map           |   (e.g. natvis files)                     |
++--------------------+------------------------------+-------------------------------------------+
+| /names             | - Contained in PDB Stream    | - PDB-wide global string table used for   |
+|                    |   Named Stream map           |   string de-duplication                   |
++--------------------+------------------------------+-------------------------------------------+
+| Module Info Stream | - Contained in DBI Stream    | - CodeView Symbol Records for this module |
+|                    | - One for each compiland     | - Line Number Information                 |
++--------------------+------------------------------+-------------------------------------------+
+| Public Stream      | - Contained in DBI Stream    | - Public (Exported) Symbol Records        |
+|                    |                              | - Index of Public Hash Stream             |
++--------------------+------------------------------+-------------------------------------------+
+| Global Stream      | - Contained in DBI Stream    | - Single combined master symbol-table     |
+|                    |                              | - Index of Global Hash Stream             |
++--------------------+------------------------------+-------------------------------------------+
+| TPI Hash Stream    | - Contained in TPI Stream    | - Hash table for looking up TPI records   |
+|                    |                              |   by name                                 |
++--------------------+------------------------------+-------------------------------------------+
+| IPI Hash Stream    | - Contained in IPI Stream    | - Hash table for looking up IPI records   |
+|                    |                              |   by name                                 |
++--------------------+------------------------------+-------------------------------------------+
+
+More information about the structure of each of these can be found on the
+following pages:
+
+:doc:`PdbStream`
+   Information about the PDB Info Stream and how it is used to match PDBs to EXEs.
+
+:doc:`TpiStream`
+   Information about the TPI stream and the CodeView records contained within.
+
+:doc:`DbiStream`
+   Information about the DBI stream and relevant substreams including the
+   Module Substreams, source file information, and CodeView symbol records
+   contained within.
+
+:doc:`ModiStream`
+   Information about the Module Information Stream, of which there is one for
+   each compilation unit and the format of symbols contained within.
+
+:doc:`PublicStream`
+   Information about the Public Symbol Stream.
+
+:doc:`GlobalStream`
+   Information about the Global Symbol Stream.
+
+:doc:`HashTable`
+   Information about the serialized hash table format used internally to
+   represent things such as the Named Stream Map and the Hash Adjusters in the
+   :doc:`TPI/IPI Stream <TpiStream>`.
+
+CodeView
+========
+CodeView is another format which comes into the picture.  While MSF defines
+the structure of the overall file, and PDB defines the set of streams that
+appear within the MSF file and the format of those streams, CodeView defines
+the format of **symbol and type records** that appear within specific streams.
+Refer to the pages on :doc:`CodeViewSymbols` and :doc:`CodeViewTypes` for
+more information about the CodeView format.

Added: www-releases/trunk/9.0.0/docs/_sources/Packaging.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Packaging.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Packaging.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Packaging.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,73 @@
+========================
+Advice on Packaging LLVM
+========================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+LLVM sets certain default configure options to make sure our developers don't
+break things for constrained platforms.  These settings are not optimal for most
+desktop systems, and we hope that packagers (e.g., Red Hat, Debian, MacPorts,
+etc.) will tweak them.  This document lists settings we suggest you tweak.
+
+LLVM's API changes with each release, so users are likely to want, for example,
+both LLVM-2.6 and LLVM-2.7 installed at the same time to support apps developed
+against each.
+
+Compile Flags
+=============
+
+LLVM runs much more quickly when it's optimized and assertions are removed.
+However, such a build is currently incompatible with users who build without
+defining ``NDEBUG``, and the lack of assertions makes it hard to debug problems
+in user code.  We recommend allowing users to install both optimized and debug
+versions of LLVM in parallel.  The following configure flags are relevant:
+
+``--disable-assertions``
+    Builds LLVM with ``NDEBUG`` defined.  Changes the LLVM ABI.  Also available
+    by setting ``DISABLE_ASSERTIONS=0|1`` in ``make``'s environment.  This
+    defaults to enabled regardless of the optimization setting, but it slows
+    things down.
+
+``--enable-debug-symbols``
+    Builds LLVM with ``-g``.  Also available by setting ``DEBUG_SYMBOLS=0|1`` in
+    ``make``'s environment.  This defaults to disabled when optimizing, so you
+    should turn it back on to let users debug their programs.
+
+``--enable-optimized``
+    (For svn checkouts) Builds LLVM with ``-O2`` and, by default, turns off
+    debug symbols.  Also available by setting ``ENABLE_OPTIMIZED=0|1`` in
+    ``make``'s environment.  This defaults to enabled when not in a
+    checkout.
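+
+The ABI remark above can be illustrated with a class whose layout depends on
+``NDEBUG``.  The struct below is hypothetical and not taken from LLVM's
+headers; it only shows the kind of mismatch that arises when a library and its
+client disagree about ``NDEBUG``.
+
```cpp
// Hypothetical class: the debug-only member changes the object's size.
struct ValueHandle {
  void *Ptr;
#ifndef NDEBUG
  unsigned Epoch; // bookkeeping that exists only in assertion-enabled builds
#endif
};

#ifdef NDEBUG
constexpr bool AssertionsEnabled = false;
#else
constexpr bool AssertionsEnabled = true;
#endif

// With assertions the struct is larger than a bare pointer; without them it is
// exactly one pointer wide.
constexpr bool LayoutMatchesBuildMode =
    AssertionsEnabled ? sizeof(ValueHandle) > sizeof(void *)
                      : sizeof(ValueHandle) == sizeof(void *);
```
+
+If a library were compiled with one setting and its client with the other, the
+two sides would disagree about the size of such a type, which is why shipping
+both flavors side by side is the safer choice.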
+
+C++ Features
+============
+
+RTTI
+    LLVM disables RTTI by default.  Add ``REQUIRES_RTTI=1`` to your environment
+    while running ``make`` to re-enable it.  This will allow users to build with
+    RTTI enabled and still inherit from LLVM classes.
+
+Shared Library
+==============
+
+Configure with ``--enable-shared`` to build
+``libLLVM-<major>.<minor>.(so|dylib)`` and link the tools against it.  This
+saves lots of binary size at the cost of some startup time.
+
+Dependencies
+============
+
+``--enable-libffi``
+    Depend on `libffi <http://sources.redhat.com/libffi/>`_ to allow the LLVM
+    interpreter to call external functions.
+
+``--with-oprofile``
+
+    Depend on `libopagent
+    <http://oprofile.sourceforge.net/doc/devel/index.html>`_ (>=version 0.9.4)
+    to let the LLVM JIT tell oprofile about function addresses and line
+    numbers.

Added: www-releases/trunk/9.0.0/docs/_sources/Passes.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Passes.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Passes.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Passes.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,1231 @@
+..
+    If Passes.html is up to date, the following "one-liner" should print
+    an empty diff.
+
+    egrep -e '^<tr><td><a href="#.*">-.*</a></td><td>.*</td></tr>$' \
+          -e '^  <a name=".*">.*</a>$' < Passes.html >html; \
+    perl >help <<'EOT' && diff -u help html; rm -f help html
+    open HTML, "<Passes.html" or die "open: Passes.html: $!\n";
+    while (<HTML>) {
+      m:^<tr><td><a href="#(.*)">-.*</a></td><td>.*</td></tr>$: or next;
+      $order{$1} = sprintf("%03d", 1 + int %order);
+    }
+    open HELP, "../Release/bin/opt -help|" or die "open: opt -help: $!\n";
+    while (<HELP>) {
+      m:^    -([^ ]+) +- (.*)$: or next;
+      my $o = $order{$1};
+      $o = "000" unless defined $o;
+      push @x, "$o<tr><td><a href=\"#$1\">-$1</a></td><td>$2</td></tr>\n";
+      push @y, "$o  <a name=\"$1\">-$1: $2</a>\n";
+    }
+    @x = map { s/^\d\d\d//; $_ } sort @x;
+    @y = map { s/^\d\d\d//; $_ } sort @y;
+    print @x, @y;
+    EOT
+
+    This (real) one-liner can also be helpful when converting comments to HTML:
+
+    perl -e '$/ = undef; for (split(/\n/, <>)) { s:^ *///? ?::; print "  <p>\n" if !$on && $_ =~ /\S/; print "  </p>\n" if $on && $_ =~ /^\s*$/; print "  $_\n"; $on = ($_ =~ /\S/); } print "  </p>\n" if $on'
+
+====================================
+LLVM's Analysis and Transform Passes
+====================================
+
+.. contents::
+    :local:
+
+Introduction
+============
+
+This document serves as a high level summary of the optimization features that
+LLVM provides.  Optimizations are implemented as Passes that traverse some
+portion of a program to either collect information or transform the program.
+The table below divides the passes that LLVM provides into three categories.
+Analysis passes compute information that other passes can use or for debugging
+or program visualization purposes.  Transform passes can use (or invalidate)
+the analysis passes.  Transform passes all mutate the program in some way.
+Utility passes provide some utility but don't otherwise fit categorization.
+For example, passes to extract functions to bitcode or write a module to bitcode
+are neither analysis nor transform passes.  The table of contents above
+provides a quick summary of each pass and links to the more complete pass
+description later in the document.
+
+Analysis Passes
+===============
+
+This section describes the LLVM Analysis Passes.
+
+``-aa-eval``: Exhaustive Alias Analysis Precision Evaluator
+-----------------------------------------------------------
+
+This is a simple N^2 alias analysis accuracy evaluator.  Basically, for each
+function in the program, it simply queries to see how the alias analysis
+implementation answers alias queries between each pair of pointers in the
+function.
+
+This is inspired by and adapted from code by: Naveen Neelakantam, Francesco
+Spadini, and Wojciech Stryjewski.
+
+``-basicaa``: Basic Alias Analysis (stateless AA impl)
+------------------------------------------------------
+
+A basic alias analysis pass that implements identities (two different globals
+cannot alias, etc), but does no stateful analysis.
+
+``-basiccg``: Basic CallGraph Construction
+------------------------------------------
+
+Yet to be written.
+
+``-count-aa``: Count Alias Analysis Query Responses
+---------------------------------------------------
+
+A pass which can be used to count how many alias queries are being made and how
+the alias analysis implementation being used responds.
+
+.. _passes-da:
+
+``-da``: Dependence Analysis
+----------------------------
+
+Dependence analysis framework, which is used to detect dependences in memory
+accesses.
+
+``-debug-aa``: AA use debugger
+------------------------------
+
+This simple pass checks alias analysis users to ensure that if they create a
+new value, they do not query AA without informing it of the value.  It acts as
+a shim over any other AA pass you want.
+
+Yes, keeping track of every value in the program is expensive, but this is a
+debugging pass.
+
+``-domfrontier``: Dominance Frontier Construction
+-------------------------------------------------
+
+This pass is a simple dominator construction algorithm for finding forward
+dominator frontiers.
+
+``-domtree``: Dominator Tree Construction
+-----------------------------------------
+
+This pass is a simple dominator construction algorithm for finding forward
+dominators.
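+
+For readers unfamiliar with the notion, dominators can be computed with a
+simple iterative data-flow scheme.  The sketch below is for illustration only
+and is not the algorithm LLVM implements; ``computeDominators`` is a
+hypothetical name, block 0 is taken to be the entry block, and every block is
+assumed to be reachable from it.
+
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Preds[B] lists the predecessors of block B; block 0 is the entry block.
// Returns Dom, where Dom[B][D] is true if block D dominates block B.
std::vector<std::vector<bool>>
computeDominators(const std::vector<std::vector<size_t>> &Preds) {
  const size_t N = Preds.size();
  // Start from "everything dominates everything" and shrink to a fixed point.
  std::vector<std::vector<bool>> Dom(N, std::vector<bool>(N, true));
  if (N == 0)
    return Dom;
  Dom[0].assign(N, false);
  Dom[0][0] = true; // the entry block is dominated only by itself

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (size_t B = 1; B < N; ++B) {
      // Dom(B) = {B} united with the intersection of Dom(P) over all preds P.
      std::vector<bool> New(N, true);
      for (size_t P : Preds[B]) {
        assert(P < N && "predecessor index out of range");
        for (size_t D = 0; D < N; ++D)
          New[D] = New[D] && Dom[P][D];
      }
      New[B] = true;
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}
```
+
+In a diamond-shaped CFG (0 branches to 1 and 2, which both jump to 3), block 3
+is dominated by block 0 and by itself, but by neither 1 nor 2.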
+
+
+``-dot-callgraph``: Print Call Graph to "dot" file
+--------------------------------------------------
+
+This pass, only available in ``opt``, prints the call graph into a ``.dot``
+graph.  This graph can then be processed with the "dot" tool to convert it to
+postscript or some other suitable format.
+
+``-dot-cfg``: Print CFG of function to "dot" file
+-------------------------------------------------
+
+This pass, only available in ``opt``, prints the control flow graph into a
+``.dot`` graph.  This graph can then be processed with the :program:`dot` tool
+to convert it to postscript or some other suitable format.
+
+``-dot-cfg-only``: Print CFG of function to "dot" file (with no function bodies)
+--------------------------------------------------------------------------------
+
+This pass, only available in ``opt``, prints the control flow graph into a
+``.dot`` graph, omitting the function bodies.  This graph can then be processed
+with the :program:`dot` tool to convert it to postscript or some other suitable
+format.
+
+``-dot-dom``: Print dominance tree of function to "dot" file
+------------------------------------------------------------
+
+This pass, only available in ``opt``, prints the dominator tree into a ``.dot``
+graph.  This graph can then be processed with the :program:`dot` tool to
+convert it to postscript or some other suitable format.
+
+``-dot-dom-only``: Print dominance tree of function to "dot" file (with no function bodies)
+-------------------------------------------------------------------------------------------
+
+This pass, only available in ``opt``, prints the dominator tree into a ``.dot``
+graph, omitting the function bodies.  This graph can then be processed with the
+:program:`dot` tool to convert it to postscript or some other suitable format.
+
+``-dot-postdom``: Print postdominance tree of function to "dot" file
+--------------------------------------------------------------------
+
+This pass, only available in ``opt``, prints the post dominator tree into a
+``.dot`` graph.  This graph can then be processed with the :program:`dot` tool
+to convert it to postscript or some other suitable format.
+
+``-dot-postdom-only``: Print postdominance tree of function to "dot" file (with no function bodies)
+---------------------------------------------------------------------------------------------------
+
+This pass, only available in ``opt``, prints the post dominator tree into a
+``.dot`` graph, omitting the function bodies.  This graph can then be processed
+with the :program:`dot` tool to convert it to postscript or some other suitable
+format.
+
+``-globalsmodref-aa``: Simple mod/ref analysis for globals
+----------------------------------------------------------
+
+This simple pass provides alias and mod/ref information for global values that
+do not have their address taken, and keeps track of whether functions read or
+write memory (are "pure").  For this simple (but very common) case, we can
+provide pretty accurate and useful information.
+
+``-instcount``: Counts the various types of ``Instruction``\ s
+--------------------------------------------------------------
+
+This pass collects the count of all instructions and reports them.
+
+``-intervals``: Interval Partition Construction
+-----------------------------------------------
+
+This analysis calculates and represents the interval partition of a function,
+or a preexisting interval partition.
+
+In this way, the interval partition may be used to reduce a flow graph down to
+its degenerate single node interval partition (unless it is irreducible).
+
+``-iv-users``: Induction Variable Users
+---------------------------------------
+
+Bookkeeping for "interesting" users of expressions computed from induction
+variables.
+
+``-lazy-value-info``: Lazy Value Information Analysis
+-----------------------------------------------------
+
+Interface for lazy computation of value constraint information.
+
+``-libcall-aa``: LibCall Alias Analysis
+---------------------------------------
+
+LibCall Alias Analysis.
+
+``-lint``: Statically lint-checks LLVM IR
+-----------------------------------------
+
+This pass statically checks for common and easily-identified constructs which
+produce undefined or likely unintended behavior in LLVM IR.
+
+It is not a guarantee of correctness, in two ways.  First, it isn't
+comprehensive.  There are checks which could be done statically which are not
+yet implemented.  Some of these are indicated by TODO comments, but those
+aren't comprehensive either.  Second, many conditions cannot be checked
+statically.  This pass does no dynamic instrumentation, so it can't check for
+all possible problems.
+
+Another limitation is that it assumes all code will be executed.  A store
+through a null pointer in a basic block which is never reached is harmless, but
+this pass will warn about it anyway.
+
+Optimization passes may make conditions that this pass checks for more or less
+obvious.  If an optimization pass appears to be introducing a warning, it may
+be that the optimization pass is merely exposing an existing condition in the
+code.
+
+This code may be run before :ref:`instcombine <passes-instcombine>`.  In many
+cases, instcombine checks for the same kinds of things and turns instructions
+with undefined behavior into unreachable (or equivalent).  Because of this,
+this pass makes some effort to look through bitcasts and so on.
+
+``-loops``: Natural Loop Information
+------------------------------------
+
+This analysis is used to identify natural loops and determine the loop depth of
+various nodes of the CFG.  Note that the loops identified may actually be
+several natural loops that share the same header node... not just a single
+natural loop.
+
+``-memdep``: Memory Dependence Analysis
+---------------------------------------
+
+An analysis that determines, for a given memory operation, what preceding
+memory operations it depends on.  It builds on alias analysis information, and
+tries to provide a lazy, caching interface to a common kind of alias
+information query.
+
+``-module-debuginfo``: Decodes module-level debug info
+------------------------------------------------------
+
+This pass decodes the debug info metadata in a module and prints it in a
+(sufficiently-prepared-) human-readable form.
+
+For example, run this pass from ``opt`` along with the ``-analyze`` option, and
+it'll print to standard output.
+
+``-postdomfrontier``: Post-Dominance Frontier Construction
+----------------------------------------------------------
+
+This pass is a simple post-dominator construction algorithm for finding
+post-dominator frontiers.
+
+``-postdomtree``: Post-Dominator Tree Construction
+--------------------------------------------------
+
+This pass is a simple post-dominator construction algorithm for finding
+post-dominators.
+
+``-print-alias-sets``: Alias Set Printer
+----------------------------------------
+
+Yet to be written.
+
+``-print-callgraph``: Print a call graph
+----------------------------------------
+
+This pass, only available in ``opt``, prints the call graph to standard error
+in a human-readable form.
+
+``-print-callgraph-sccs``: Print SCCs of the Call Graph
+-------------------------------------------------------
+
+This pass, only available in ``opt``, prints the SCCs of the call graph to
+standard error in a human-readable form.
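+
+A strongly connected component (SCC) of the call graph is a maximal set of
+functions that can all reach one another, for example a group of mutually
+recursive functions.  The sketch below finds SCCs with Tarjan's algorithm on a
+plain adjacency list.  It is an illustration with hypothetical names and is
+not taken from LLVM's sources.
+
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Succs[V] lists the callees of function V. Returns the SCCs in bottom-up
// order (callees before callers); each SCC is sorted by node number.
std::vector<std::vector<size_t>>
findSCCs(const std::vector<std::vector<size_t>> &Succs) {
  const size_t N = Succs.size();
  const size_t Unvisited = N; // valid DFS indices are 0 .. N-1
  std::vector<size_t> Index(N, Unvisited), Low(N, 0), Stack;
  std::vector<bool> OnStack(N, false);
  std::vector<std::vector<size_t>> SCCs;
  size_t Next = 0;

  std::function<void(size_t)> Visit = [&](size_t V) {
    Index[V] = Low[V] = Next++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (size_t W : Succs[V]) {
      assert(W < N && "successor index out of range");
      if (Index[W] == Unvisited) {
        Visit(W);
        Low[V] = std::min(Low[V], Low[W]);
      } else if (OnStack[W]) {
        Low[V] = std::min(Low[V], Index[W]);
      }
    }
    // V is the root of an SCC: pop its members off the stack.
    if (Low[V] == Index[V]) {
      std::vector<size_t> SCC;
      size_t W;
      do {
        W = Stack.back();
        Stack.pop_back();
        OnStack[W] = false;
        SCC.push_back(W);
      } while (W != V);
      std::sort(SCC.begin(), SCC.end());
      SCCs.push_back(SCC);
    }
  };

  for (size_t V = 0; V < N; ++V)
    if (Index[V] == Unvisited)
      Visit(V);
  return SCCs;
}
```
+
+For a graph where 0 calls 1, 1 and 2 call each other, and 2 calls 3, the
+sketch reports three SCCs: the leaf 3, the mutually recursive pair 1 and 2,
+and finally the caller 0.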
+
+``-print-cfg-sccs``: Print SCCs of each function CFG
+----------------------------------------------------
+
+This pass, only available in ``opt``, prints the SCCs of each function CFG to
+standard error in a human-readable form.
+
+``-print-dom-info``: Dominator Info Printer
+-------------------------------------------
+
+Dominator Info Printer.
+
+``-print-externalfnconstants``: Print external fn callsites passed constants
+----------------------------------------------------------------------------
+
+This pass, only available in ``opt``, prints out call sites to external
+functions that are called with constant arguments.  This can be useful when
+looking for standard library functions we should constant fold or handle in
+alias analyses.
+
+``-print-function``: Print function to stderr
+---------------------------------------------
+
+The ``PrintFunctionPass`` class is designed to be pipelined with other
+``FunctionPasses``, and prints out the functions of the module as they are
+processed.
+
+``-print-module``: Print module to stderr
+-----------------------------------------
+
+This pass simply prints out the entire module when it is executed.
+
+.. _passes-print-used-types:
+
+``-print-used-types``: Find Used Types
+--------------------------------------
+
+This pass is used to seek out all of the types in use by the program.  Note
+that this analysis explicitly does not include types only used by the symbol
+table.
+
+``-regions``: Detect single entry single exit regions
+-----------------------------------------------------
+
+The ``RegionInfo`` pass detects single entry single exit regions in a function,
+where a region is defined as any subgraph that is connected to the remaining
+graph at only two spots.  Furthermore, a hierarchical region tree is built.
+
+``-scalar-evolution``: Scalar Evolution Analysis
+------------------------------------------------
+
+The ``ScalarEvolution`` analysis can be used to analyze and categorize scalar
+expressions in loops.  It specializes in recognizing general induction
+variables, representing them with the abstract and opaque ``SCEV`` class.
+Given this analysis, trip counts of loops and other important properties can be
+obtained.
+
+This analysis is primarily useful for induction variable substitution and
+strength reduction.
+
+``-scev-aa``: ScalarEvolution-based Alias Analysis
+--------------------------------------------------
+
+Simple alias analysis implemented in terms of ``ScalarEvolution`` queries.
+
+This differs from traditional loop dependence analysis in that it tests for
+dependencies within a single iteration of a loop, rather than dependencies
+between different iterations.
+
+``ScalarEvolution`` has a more complete understanding of pointer arithmetic
+than ``BasicAliasAnalysis``' collection of ad-hoc analyses.
+
+``-stack-safety``: Stack Safety Analysis
+------------------------------------------------
+
+The ``StackSafety`` analysis can be used to determine if stack allocated
+variables can be considered safe from memory access bugs.
+
+This analysis' primary purpose is to be used by sanitizers to avoid unnecessary
+instrumentation of safe variables.
+
+``-targetdata``: Target Data Layout
+-----------------------------------
+
+Provides other passes access to information on the size and alignment
+required by the target ABI for various data types.
+
+Transform Passes
+================
+
+This section describes the LLVM Transform Passes.
+
+``-adce``: Aggressive Dead Code Elimination
+-------------------------------------------
+
+ADCE aggressively tries to eliminate code.  This pass is similar to :ref:`DCE
+<passes-dce>` but it assumes that values are dead until proven otherwise.  This
+is similar to :ref:`SCCP <passes-sccp>`, except applied to the liveness of
+values.
+
+``-always-inline``: Inliner for ``always_inline`` functions
+-----------------------------------------------------------
+
+A custom inliner that handles only functions that are marked as "always
+inline".
+
+``-argpromotion``: Promote 'by reference' arguments to scalars
+--------------------------------------------------------------
+
+This pass promotes "by reference" arguments to be "by value" arguments.  In
+practice, this means looking for internal functions that have pointer
+arguments.  If it can prove, through the use of alias analysis, that an
+argument is *only* loaded, then it can pass the value into the function instead
+of the address of the value.  This can cause recursive simplification of code
+and lead to the elimination of allocas (especially in C++ template code like
+the STL).
+
+This pass also handles aggregate arguments that are passed into a function,
+scalarizing them if the elements of the aggregate are only loaded.  Note that
+it refuses to scalarize aggregates which would require passing in more than
+three operands to the function, because passing thousands of operands for a
+large array or structure is unprofitable!
+
+Note that this transformation could also be done for arguments that are only
+stored to (returning the value instead), but does not currently.  This case
+would be best handled when and if LLVM starts supporting multiple return values
+from functions.
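+
+As a source-level analogy (a sketch only; the pass itself operates on LLVM IR,
+and the function names below are hypothetical), promoting a pointer argument
+that is only loaded looks roughly like this:
+
+.. code-block:: c++
+
+  // Before: the internal function receives an address it only loads from.
+  static int addOneByReference(const int *p) { return *p + 1; }
+
+  int callerBefore() {
+    int local = 41;  // needs a memory location because its address is taken
+    return addOneByReference(&local);
+  }
+
+  // After: the loaded value is passed directly, so the caller's variable no
+  // longer needs an address.
+  static int addOneByValue(int v) { return v + 1; }
+
+  int callerAfter() {
+    int local = 41;
+    return addOneByValue(local);
+  }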
+
+``-bb-vectorize``: Basic-Block Vectorization
+--------------------------------------------
+
+This pass combines instructions inside basic blocks to form vector
+instructions.  It iterates over each basic block, attempting to pair compatible
+instructions, repeating this process until no additional pairs are selected for
+vectorization.  When the outputs of some pair of compatible instructions are
+used as inputs by some other pair of compatible instructions, those pairs are
+part of a potential vectorization chain.  Instruction pairs are only fused into
+vector instructions when they are part of a chain longer than some threshold
+length.  Moreover, the pass attempts to find the best possible chain for each
+pair of compatible instructions.  These heuristics are intended to prevent
+vectorization in cases where it would not yield a performance increase of the
+resulting code.
+
+``-block-placement``: Profile Guided Basic Block Placement
+----------------------------------------------------------
+
+This pass is a very simple profile guided basic block placement algorithm.  The
+idea is to put frequently executed blocks together at the start of the function
+and hopefully increase the number of fall-through conditional branches.  If
+there is no profile information for a particular function, this pass basically
+orders blocks in depth-first order.
+
+``-break-crit-edges``: Break critical edges in CFG
+--------------------------------------------------
+
+Break all of the critical edges in the CFG by inserting a dummy basic block.
+It may be "required" by passes that cannot deal with critical edges.  This
+transformation obviously invalidates the CFG, but can update forward dominator
+(set, immediate dominators, tree, and frontier) information.
+
+``-codegenprepare``: Optimize for code generation
+-------------------------------------------------
+
+This pass munges the code in the input function to better prepare it for
+SelectionDAG-based code generation.  This works around limitations in its
+basic-block-at-a-time approach.  It should eventually be removed.
+
+``-constmerge``: Merge Duplicate Global Constants
+-------------------------------------------------
+
+Merges duplicate global constants together into a single constant that is
+shared.  This is useful because some passes (e.g., TraceValues) insert a lot of
+string constants into the program, regardless of whether or not an existing
+string is available.
+
+``-constprop``: Simple constant propagation
+-------------------------------------------
+
+This pass implements constant propagation and merging.  It looks for
+instructions involving only constant operands and replaces them with a constant
+value instead of an instruction.  For example:
+
+.. code-block:: llvm
+
+  add i32 1, 2
+
+becomes
+
+.. code-block:: llvm
+
+  i32 3
+
+NOTE: this pass has a habit of making definitions be dead.  It is a good idea
+to run a :ref:`Dead Instruction Elimination <passes-die>` pass sometime after
+running this pass.
+
+.. _passes-dce:
+
+``-dce``: Dead Code Elimination
+-------------------------------
+
+Dead code elimination is similar to :ref:`dead instruction elimination
+<passes-die>`, but it rechecks instructions that were used by removed
+instructions to see if they are newly dead.
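+
+The rechecking step can be sketched with a worklist over a toy instruction
+list.  The ``Inst`` structure and the ``runToyDCE`` function are hypothetical
+illustrations, not LLVM's implementation:
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <vector>
+
+  // Hypothetical toy IR: each instruction lists the indices of the
+  // instructions it uses as operands.
+  struct Inst {
+    std::vector<std::size_t> operands;
+    bool hasSideEffects = false;
+    bool removed = false;
+  };
+
+  // Counts the remaining (not removed) users of instruction `idx`.
+  static std::size_t countUses(const std::vector<Inst> &insts,
+                               std::size_t idx) {
+    std::size_t uses = 0;
+    for (const Inst &inst : insts) {
+      if (inst.removed)
+        continue;
+      for (std::size_t op : inst.operands)
+        if (op == idx)
+          ++uses;
+    }
+    return uses;
+  }
+
+  // Removes dead instructions and returns how many were removed.
+  std::size_t runToyDCE(std::vector<Inst> &insts) {
+    std::vector<std::size_t> worklist;
+    for (std::size_t i = 0; i < insts.size(); ++i)
+      worklist.push_back(i);
+
+    std::size_t removedCount = 0;
+    while (!worklist.empty()) {
+      std::size_t idx = worklist.back();
+      worklist.pop_back();
+      Inst &inst = insts[idx];
+      if (inst.removed || inst.hasSideEffects || countUses(insts, idx) != 0)
+        continue;
+      inst.removed = true;
+      ++removedCount;
+      // Operands of a removed instruction may have just become dead, so
+      // they are rechecked.  A single sweep would miss them.
+      for (std::size_t op : inst.operands)
+        worklist.push_back(op);
+    }
+    return removedCount;
+  }
+
+For a chain in which instruction 2 uses 1 and 1 uses 0, removing the unused
+instruction 2 makes 1 dead, which in turn makes 0 dead.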
+
+``-deadargelim``: Dead Argument Elimination
+-------------------------------------------
+
+This pass deletes dead arguments from internal functions.  Dead argument
+elimination removes arguments which are directly dead, as well as arguments
+only passed into function calls as dead arguments of other functions.  This
+pass also deletes dead return values in a similar way.
+
+This pass is often useful as a cleanup pass to run after aggressive
+interprocedural passes, which add possibly-dead arguments.
+
+``-deadtypeelim``: Dead Type Elimination
+----------------------------------------
+
+This pass is used to clean up the output of GCC.  It eliminates names for types
+that are unused in the entire translation unit, using the :ref:`find used types
+<passes-print-used-types>` pass.
+
+.. _passes-die:
+
+``-die``: Dead Instruction Elimination
+--------------------------------------
+
+Dead instruction elimination performs a single pass over the function, removing
+instructions that are obviously dead.
+
+``-dse``: Dead Store Elimination
+--------------------------------
+
+A trivial dead store elimination that only considers basic-block local
+redundant stores.
+
+.. _passes-functionattrs:
+
+``-functionattrs``: Deduce function attributes
+----------------------------------------------
+
+A simple interprocedural pass which walks the call-graph, looking for functions
+which do not access or only read non-local memory, and marking them
+``readnone``/``readonly``.  In addition, it marks function arguments (of
+pointer type) "``nocapture``" if a call to the function does not create any
+copies of the pointer value that outlive the call.  This more or less means
+that the pointer is only dereferenced, and not returned from the function or
+stored in a global.  This pass is implemented as a bottom-up traversal of the
+call-graph.
+
+``-globaldce``: Dead Global Elimination
+---------------------------------------
+
+This transform is designed to eliminate unreachable internal globals from the
+program.  It uses an aggressive algorithm, searching out globals that are known
+to be alive.  After it finds all of the globals which are needed, it deletes
+whatever is left over.  This allows it to delete recursive chunks of the
+program which are unreachable.
+
+``-globalopt``: Global Variable Optimizer
+-----------------------------------------
+
+This pass transforms simple global variables that never have their address
+taken.  If obviously true, it marks read/write globals as constant, deletes
+variables only stored to, etc.
+
+``-gvn``: Global Value Numbering
+--------------------------------
+
+This pass performs global value numbering to eliminate fully and partially
+redundant instructions.  It also performs redundant load elimination.
+
+.. _passes-indvars:
+
+``-indvars``: Canonicalize Induction Variables
+----------------------------------------------
+
+This transformation analyzes and transforms the induction variables (and
+computations derived from them) into simpler forms suitable for subsequent
+analysis and transformation.
+
+This transformation makes the following changes to each loop with an
+identifiable induction variable:
+
+* All loops are transformed to have a *single* canonical induction variable
+  which starts at zero and steps by one.
+* The canonical induction variable is guaranteed to be the first PHI node in
+  the loop header block.
+* Any pointer arithmetic recurrences are raised to use array subscripts.
+
+If the trip count of a loop is computable, this pass also makes the following
+changes:
+
+* The exit condition for the loop is canonicalized to compare the induction
+  value against the exit value.  This turns loops like:
+
+  .. code-block:: c++
+
+    for (i = 7; i*i < 1000; ++i)
+
+    into
+
+  .. code-block:: c++
+
+    for (i = 0; i != 25; ++i)
+
+* Any use outside of the loop of an expression derived from the indvar is
+  changed to compute the derived value outside of the loop, eliminating the
+  dependence on the exit value of the induction variable.  If the only purpose
+  of the loop is to compute the exit value of some derived expression, this
+  transformation will make the loop dead.
+
+This transformation should be followed by strength reduction after all of the
+desired loop transformations have been performed.  Additionally, on targets
+where it is profitable, the loop could be transformed to count down to zero
+(the "do loop" optimization).
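+
+The loop rewrite shown above can be checked at the source level.  In this
+sketch (the function names are hypothetical), the original loop runs for
+``i = 7`` through ``i = 31``, which is 25 iterations, and uses of ``i`` in
+the body are re-expressed in terms of the canonical variable as ``iv + 7``:
+
+.. code-block:: c++
+
+  // Original form: starts at 7 and exits on a non-trivial condition.
+  int sumOriginal() {
+    int sum = 0;
+    for (int i = 7; i * i < 1000; ++i)
+      sum += i;
+    return sum;
+  }
+
+  // Canonical form: starts at zero, steps by one, and compares against the
+  // computed trip count.  The old value of i is derived as iv + 7.
+  int sumCanonical() {
+    int sum = 0;
+    for (int iv = 0; iv != 25; ++iv)
+      sum += iv + 7;
+    return sum;
+  }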
+
+``-inline``: Function Integration/Inlining
+------------------------------------------
+
+Bottom-up inlining of functions into their callers.
+
+.. _passes-instcombine:
+
+``-instcombine``: Combine redundant instructions
+------------------------------------------------
+
+Combine instructions to form fewer, simpler instructions.  This pass does not
+modify the CFG. This pass is where algebraic simplification happens.
+
+This pass combines things like:
+
+.. code-block:: llvm
+
+  %Y = add i32 %X, 1
+  %Z = add i32 %Y, 1
+
+into:
+
+.. code-block:: llvm
+
+  %Z = add i32 %X, 2
+
+This is a simple worklist driven algorithm.
+
+This pass guarantees that the following canonicalizations are performed on the
+program:
+
+#. If a binary operator has a constant operand, it is moved to the right-hand
+   side.
+#. Bitwise operators with constant operands are always grouped so that shifts
+   are performed first, then ``or``\ s, then ``and``\ s, then ``xor``\ s.
+#. Compare instructions are converted from ``<``, ``>``, ``≤``, or ``≥`` to
+   ``=`` or ``≠`` if possible.
+#. All ``cmp`` instructions on boolean values are replaced with logical
+   operations.
+#. ``add X, X`` is represented as ``mul X, 2`` ⇒ ``shl X, 1``
+#. Multiplies with a constant power-of-two argument are transformed into
+   shifts.
+#. … etc.
+
+This pass can also simplify calls to specific well-known functions (e.g.
+runtime library functions).  For example, a call ``exit(3)`` that occurs within
+the ``main()`` function can be transformed into simply ``return 3``. Whether or
+not library calls are simplified is controlled by the
+:ref:`-functionattrs <passes-functionattrs>` pass and LLVM's knowledge of
+library calls on different targets.
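+
+Two of the canonicalizations listed above rely on simple arithmetic
+identities.  The following sketch (hypothetical helper names, using unsigned
+32-bit arithmetic so that wraparound is well defined in C++) shows that the
+forms compute the same value:
+
+.. code-block:: c++
+
+  #include <cstdint>
+
+  // add X, X  ==  mul X, 2  ==  shl X, 1
+  std::uint32_t addSelf(std::uint32_t x) { return x + x; }
+  std::uint32_t mulByTwo(std::uint32_t x) { return x * 2u; }
+  std::uint32_t shlByOne(std::uint32_t x) { return x << 1; }
+
+  // A multiply by a constant power of two is equivalent to a shift.
+  std::uint32_t mulByEight(std::uint32_t x) { return x * 8u; }
+  std::uint32_t shlByThree(std::uint32_t x) { return x << 3; }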
+
+.. _passes-aggressive-instcombine:
+
+``-aggressive-instcombine``: Combine expression patterns
+--------------------------------------------------------
+
+Combine expression patterns to form expressions with fewer, simple instructions.
+This pass does not modify the CFG.
+
+For example, this pass reduces the width of expressions post-dominated by a
+TruncInst to a smaller width when applicable.
+
+It differs from the instcombine pass in that it contains pattern optimizations
+whose complexity is higher than O(1); thus, it should run fewer times than the
+instcombine pass.
+
+``-internalize``: Internalize Global Symbols
+--------------------------------------------
+
+This pass loops over all of the functions in the input module, looking for a
+main function.  If a main function is found, all other functions and all global
+variables with initializers are marked as internal.
+
+``-ipconstprop``: Interprocedural constant propagation
+------------------------------------------------------
+
+This pass implements an *extremely* simple interprocedural constant propagation
+pass.  It could certainly be improved in many different ways, like using a
+worklist.  This pass makes arguments dead, but does not remove them.  The
+existing dead argument elimination pass should be run after this to clean up
+the mess.
+
+``-ipsccp``: Interprocedural Sparse Conditional Constant Propagation
+--------------------------------------------------------------------
+
+An interprocedural variant of :ref:`Sparse Conditional Constant Propagation
+<passes-sccp>`.
+
+``-jump-threading``: Jump Threading
+-----------------------------------
+
+Jump threading tries to find distinct threads of control flow running through a
+basic block.  This pass looks at blocks that have multiple predecessors and
+multiple successors.  If one or more of the predecessors of the block can be
+proven to always cause a jump to one of the successors, we forward the edge
+from the predecessor to the successor by duplicating the contents of this
+block.
+
+An example of when this can occur is code like this:
+
+.. code-block:: c++
+
+  if () { ...
+    X = 4;
+  }
+  if (X < 3) {
+
+In this case, the unconditional branch at the end of the first if can be
+revectored to the false side of the second if.
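+
+A fuller version of this situation, written as a source-level sketch (the
+function names and the arithmetic on ``result`` are hypothetical placeholders
+for real work), shows the effect of threading the edge:
+
+.. code-block:: c++
+
+  // Before: the second test is evaluated on every path.
+  int beforeThreading(bool cond, int x) {
+    int result = 0;
+    if (cond) {
+      result += 10;
+      x = 4;
+    }
+    if (x < 3)
+      result += 1;
+    else
+      result += 2;
+    return result;
+  }
+
+  // After: on the path through the first "if", x is known to be 4, so
+  // (x < 3) is known to be false and control goes straight to the false
+  // side.  The shared code is duplicated on that path.
+  int afterThreading(bool cond, int x) {
+    int result = 0;
+    if (cond) {
+      result += 10;
+      result += 2;
+      return result;
+    }
+    if (x < 3)
+      result += 1;
+    else
+      result += 2;
+    return result;
+  }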
+
+``-lcssa``: Loop-Closed SSA Form Pass
+-------------------------------------
+
+This pass transforms loops by placing phi nodes at the end of the loops for all
+values that are live across the loop boundary.  For example, it turns the left
+into the right code:
+
+.. code-block:: c++
+
+  for (...)                for (...)
+      if (c)                   if (c)
+          X1 = ...                 X1 = ...
+      else                     else
+          X2 = ...                 X2 = ...
+      X3 = phi(X1, X2)         X3 = phi(X1, X2)
+  ... = X3 + 4              X4 = phi(X3)
+                              ... = X4 + 4
+
+This is still valid LLVM; the extra phi nodes are purely redundant, and will be
+trivially eliminated by ``InstCombine``.  The major benefit of this
+transformation is that it makes many other loop optimizations, such as
+``LoopUnswitch``\ ing, simpler.
+
+.. _passes-licm:
+
+``-licm``: Loop Invariant Code Motion
+-------------------------------------
+
+This pass performs loop invariant code motion, attempting to remove as much
+code from the body of a loop as possible.  It does this by either hoisting code
+into the preheader block, or by sinking code to the exit blocks if it is safe.
+This pass also promotes must-aliased memory locations in the loop to live in
+registers, thus hoisting and sinking "invariant" loads and stores.
+
+This pass uses alias analysis for two purposes:
+
+#. Moving loop invariant loads and calls out of loops.  If we can determine
+   that a load or call inside of a loop never aliases anything stored to, we
+   can hoist it or sink it like any other instruction.
+
+#. Scalar Promotion of Memory.  If there is a store instruction inside of the
+   loop, we try to move the store to happen AFTER the loop instead of inside of
+   the loop.  This can only happen if a few conditions are true:
+
+   #. The pointer stored through is loop invariant.
+   #. There are no stores or loads in the loop which *may* alias the pointer.
+      There are no calls in the loop which mod/ref the pointer.
+
+   If these conditions are true, we can promote the loads and stores in the
+   loop of the pointer to use a temporary alloca'd variable.  We then use the
+   :ref:`mem2reg <passes-mem2reg>` functionality to construct the appropriate
+   SSA form for the variable.
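+
+Scalar promotion can be pictured at the source level as follows.  This is a
+sketch with hypothetical function names, and it assumes that ``out`` does not
+alias any element of ``data``, which is the kind of fact the pass obtains
+from alias analysis:
+
+.. code-block:: c++
+
+  #include <cstddef>
+
+  // Before: a load and a store through `out` on every iteration.
+  void accumulateInLoop(const int *data, std::size_t n, int *out) {
+    for (std::size_t i = 0; i < n; ++i)
+      *out += data[i];
+  }
+
+  // After: the memory location is promoted to a temporary that can live in
+  // a register; one load happens before the loop and one store after it.
+  void accumulatePromoted(const int *data, std::size_t n, int *out) {
+    int tmp = *out;
+    for (std::size_t i = 0; i < n; ++i)
+      tmp += data[i];
+    *out = tmp;
+  }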
+
+``-loop-deletion``: Delete dead loops
+-------------------------------------
+
+This file implements the Dead Loop Deletion Pass.  This pass is responsible for
+eliminating loops with non-infinite computable trip counts that have no side
+effects or volatile instructions, and do not contribute to the computation of
+the function's return value.
+
+.. _passes-loop-extract:
+
+``-loop-extract``: Extract loops into new functions
+---------------------------------------------------
+
+A pass wrapper around the ``ExtractLoop()`` scalar transformation to extract
+each top-level loop into its own new function.  If the loop is the *only* loop
+in a given function, it is not touched.  This is a pass most useful for
+debugging via bugpoint.
+
+``-loop-extract-single``: Extract at most one loop into a new function
+----------------------------------------------------------------------
+
+Similar to :ref:`Extract loops into new functions <passes-loop-extract>`, this
+pass extracts one natural loop from the program into a function if it can.
+This is used by :program:`bugpoint`.
+
+``-loop-reduce``: Loop Strength Reduction
+-----------------------------------------
+
+This pass performs a strength reduction on array references inside loops that
+have as one or more of their components the loop induction variable.  This is
+accomplished by creating a new value to hold the initial value of the array
+access for the first iteration, and then creating a new GEP instruction in the
+loop to increment the value by the appropriate amount.
+
+``-loop-rotate``: Rotate Loops
+------------------------------
+
+A simple loop rotation transformation.
+
+``-loop-simplify``: Canonicalize natural loops
+----------------------------------------------
+
+This pass performs several transformations to transform natural loops into a
+simpler form, which makes subsequent analyses and transformations simpler and
+more effective.
+
+Loop pre-header insertion guarantees that there is a single, non-critical entry
+edge from outside of the loop to the loop header.  This simplifies a number of
+analyses and transformations, such as :ref:`LICM <passes-licm>`.
+
+Loop exit-block insertion guarantees that all exit blocks from the loop (blocks
+which are outside of the loop that have predecessors inside of the loop) only
+have predecessors from inside of the loop (and are thus dominated by the loop
+header).  This simplifies transformations such as store-sinking that are built
+into LICM.
+
+This pass also guarantees that loops will have exactly one backedge.
+
+Note that the :ref:`simplifycfg <passes-simplifycfg>` pass will clean up blocks
+which are split out but end up being unnecessary, so usage of this pass should
+not pessimize generated code.
+
+This pass obviously modifies the CFG, but updates loop information and
+dominator information.
+
+``-loop-unroll``: Unroll loops
+------------------------------
+
+This pass implements a simple loop unroller.  It works best when loops have
+been canonicalized by the :ref:`indvars <passes-indvars>` pass, allowing it to
+determine the trip counts of loops easily.
+
+``-loop-unroll-and-jam``: Unroll and Jam loops
+----------------------------------------------
+
+This pass implements a simple version of the classical unroll and jam loop
+optimisation.  It transforms the loop on the left into the code on the right:
+
+.. code-block:: c++
+
+  for i.. i+= 1              for i.. i+= 4
+    for j..                    for j..
+      code(i, j)                 code(i, j)
+                                 code(i+1, j)
+                                 code(i+2, j)
+                                 code(i+3, j)
+                             remainder loop
+
+This can be seen as unrolling the outer loop and "jamming" (fusing) the inner
+loops into one. When variables or loads can be shared in the new inner loop, this
+can lead to significant performance improvements. It uses
+:ref:`Dependence Analysis <passes-da>` for proving the transformations are safe.
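+
+The same transformation, written out as compilable code with an unroll factor
+of four, might look like the sketch below.  The functions and the computation
+they perform are hypothetical; the point is that ``w[j]`` is loaded once and
+shared by four outer iterations in the jammed inner loop:
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <vector>
+
+  // Original nest: out[i] is the sum over j of (i + 1) * w[j].
+  std::vector<long> rowSumsOriginal(std::size_t rows,
+                                    const std::vector<int> &w) {
+    std::vector<long> out(rows, 0);
+    for (std::size_t i = 0; i < rows; ++i)
+      for (std::size_t j = 0; j < w.size(); ++j)
+        out[i] += static_cast<long>(i + 1) * w[j];
+    return out;
+  }
+
+  // Outer loop unrolled by four, with the four inner loops fused into one.
+  std::vector<long> rowSumsUnrollAndJam(std::size_t rows,
+                                        const std::vector<int> &w) {
+    std::vector<long> out(rows, 0);
+    std::size_t i = 0;
+    for (; i + 4 <= rows; i += 4) {
+      for (std::size_t j = 0; j < w.size(); ++j) {
+        const long wj = w[j];  // shared by the four unrolled iterations
+        out[i] += static_cast<long>(i + 1) * wj;
+        out[i + 1] += static_cast<long>(i + 2) * wj;
+        out[i + 2] += static_cast<long>(i + 3) * wj;
+        out[i + 3] += static_cast<long>(i + 4) * wj;
+      }
+    }
+    // Remainder loop for the iterations that do not fill a group of four.
+    for (; i < rows; ++i)
+      for (std::size_t j = 0; j < w.size(); ++j)
+        out[i] += static_cast<long>(i + 1) * w[j];
+    return out;
+  }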
+
+``-loop-unswitch``: Unswitch loops
+----------------------------------
+
+This pass transforms loops that contain branches on loop-invariant conditions
+to have multiple loops.  For example, it turns the left into the right code:
+
+.. code-block:: c++
+
+  for (...)                  if (lic)
+      A                          for (...)
+      if (lic)                       A; B; C
+          B                  else
+      C                          for (...)
+                                     A; C
+
+This can increase the size of the code exponentially (doubling it every time a
+loop is unswitched) so we only unswitch if the resultant code will be smaller
+than a threshold.
+
+This pass expects :ref:`LICM <passes-licm>` to be run before it to hoist
+invariant conditions out of the loop, to make the unswitching opportunity
+obvious.
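+
+As a compilable sketch of the schematic above (the function names and the
+concrete statements standing in for ``A``, ``B``, and ``C`` are
+hypothetical), unswitching hoists the invariant test out of the loop and
+duplicates the loop body:
+
+.. code-block:: c++
+
+  #include <vector>
+
+  // Before: the loop-invariant condition is tested on every iteration.
+  int loopWithInvariantBranch(const std::vector<int> &v, bool lic) {
+    int acc = 0;
+    for (int x : v) {
+      acc += x;    // A
+      if (lic)
+        acc *= 2;  // B
+      acc -= 1;    // C
+    }
+    return acc;
+  }
+
+  // After: one test outside, and two specialized copies of the loop.
+  int loopUnswitched(const std::vector<int> &v, bool lic) {
+    int acc = 0;
+    if (lic) {
+      for (int x : v) {
+        acc += x;  // A
+        acc *= 2;  // B
+        acc -= 1;  // C
+      }
+    } else {
+      for (int x : v) {
+        acc += x;  // A
+        acc -= 1;  // C
+      }
+    }
+    return acc;
+  }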
+
+``-loweratomic``: Lower atomic intrinsics to non-atomic form
+------------------------------------------------------------
+
+This pass lowers atomic intrinsics to non-atomic form for use in a known
+non-preemptible environment.
+
+The pass does not verify that the environment is non-preemptible (in general
+this would require knowledge of the entire call graph of the program including
+any libraries which may not be available in bitcode form); it simply lowers
+every atomic intrinsic.
+
+``-lowerinvoke``: Lower invokes to calls, for unwindless code generators
+------------------------------------------------------------------------
+
+This transformation is designed for use by code generators which do not yet
+support stack unwinding.  This pass converts ``invoke`` instructions to
+``call`` instructions, so that any exception-handling ``landingpad`` blocks
+become dead code (which can be removed by running the ``-simplifycfg`` pass
+afterwards).
+
+``-lowerswitch``: Lower ``SwitchInst``\ s to branches
+-----------------------------------------------------
+
+Rewrites switch instructions with a sequence of branches, which allows targets
+to get away with not implementing the switch instruction until it is
+convenient.
+
+.. _passes-mem2reg:
+
+``-mem2reg``: Promote Memory to Register
+----------------------------------------
+
+This file promotes memory references to be register references.  It promotes
+alloca instructions which only have loads and stores as uses.  An ``alloca`` is
+transformed by using dominator frontiers to place phi nodes, then traversing
+the function in depth-first order to rewrite loads and stores as appropriate.
+This is just the standard SSA construction algorithm to construct "pruned" SSA
+form.
+
+``-memcpyopt``: MemCpy Optimization
+-----------------------------------
+
+This pass performs various transformations related to eliminating ``memcpy``
+calls, or transforming sets of stores into ``memset``\ s.
+
+``-mergefunc``: Merge Functions
+-------------------------------
+
+This pass looks for equivalent functions that are mergeable and folds them.
+
+A total ordering is introduced on the set of functions: we define a comparison
+that determines, for any two functions, which of them is greater.  This allows
+the functions to be arranged in a binary tree.
+
+For every new function, we check the tree for an equivalent.
+
+If an equivalent exists, we fold the two functions.  If both functions are
+overridable, we move the functionality into a new internal function and leave
+two overridable thunks to it.
+
+If there is no equivalent, we add the function to the tree.
+
+The lookup routine has O(log(n)) complexity, while the whole merging process
+has complexity O(n*log(n)).
+
+Read
+:doc:`this <MergeFunctions>`
+article for more details.
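+
+The lookup-or-insert scheme described above can be sketched with an ordered
+map.  Everything here is a hypothetical stand-in: a function is reduced to a
+name plus a ``body`` string, and the ordering of ``std::string`` plays the
+role of the function comparison.  The real pass compares IR and also creates
+thunks, which this sketch omits:
+
+.. code-block:: c++
+
+  #include <map>
+  #include <string>
+  #include <utility>
+  #include <vector>
+
+  // Hypothetical stand-in for a function.
+  struct ToyFunction {
+    std::string name;
+    std::string body;
+  };
+
+  // Maps each function name to the name of the function it is folded into
+  // (its own name if it is the first function with that body).
+  std::map<std::string, std::string>
+  mergeToyFunctions(const std::vector<ToyFunction> &fns) {
+    std::map<std::string, std::string> tree;  // body -> canonical name
+    std::map<std::string, std::string> foldedInto;
+    for (const ToyFunction &fn : fns) {
+      // insert() is an O(log n) lookup; it leaves the tree unchanged and
+      // returns the existing entry when an equivalent is already present.
+      auto result = tree.insert(std::make_pair(fn.body, fn.name));
+      foldedInto[fn.name] = result.first->second;
+    }
+    return foldedInto;
+  }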
+
+``-mergereturn``: Unify function exit nodes
+-------------------------------------------
+
+Ensure that functions have at most one ``ret`` instruction in them.
+Additionally, it keeps track of which node is the new exit node of the CFG.
+
+``-partial-inliner``: Partial Inliner
+-------------------------------------
+
+This pass performs partial inlining, typically by inlining an ``if`` statement
+that surrounds the body of the function.
+
+``-prune-eh``: Remove unused exception handling info
+----------------------------------------------------
+
+This file implements a simple interprocedural pass which walks the call-graph,
+turning invoke instructions into call instructions if and only if the callee
+cannot throw an exception.  It implements this as a bottom-up traversal of the
+call-graph.
+
+``-reassociate``: Reassociate expressions
+-----------------------------------------
+
+This pass reassociates commutative expressions in an order that is designed to
+promote better constant propagation, GCSE, :ref:`LICM <passes-licm>`, PRE, etc.
+
+For example: 4 + (x + 5) ⇒ x + (4 + 5)
+
+In the implementation of this algorithm, constants are assigned rank = 0,
+function arguments are rank = 1, and other values are assigned ranks
+corresponding to the reverse post order traversal of current function (starting
+at 2), which effectively gives values in deep loops higher rank than values not
+in loops.
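+
+The effect of ranking on the example above can be sketched as follows.  The
+``Operand`` structure and ``reassociateAdd`` function are hypothetical, and
+ordering the operands from highest to lowest rank is an assumption made for
+this illustration; what matters is that the rank 0 constants end up adjacent
+so they can be folded:
+
+.. code-block:: c++
+
+  #include <algorithm>
+  #include <string>
+  #include <vector>
+
+  // Hypothetical operand of a chain of additions.  `value` is meaningful
+  // only for constants, which have rank 0.
+  struct Operand {
+    std::string name;
+    int rank;
+    long value;
+  };
+
+  // Sorts operands by decreasing rank and folds the trailing constants
+  // into a single constant operand.
+  std::vector<Operand> reassociateAdd(std::vector<Operand> ops) {
+    std::stable_sort(ops.begin(), ops.end(),
+                     [](const Operand &a, const Operand &b) {
+                       return a.rank > b.rank;
+                     });
+    long folded = 0;
+    bool sawConstant = false;
+    while (!ops.empty() && ops.back().rank == 0) {
+      folded += ops.back().value;
+      sawConstant = true;
+      ops.pop_back();
+    }
+    if (sawConstant)
+      ops.push_back(Operand{std::to_string(folded), 0, folded});
+    return ops;
+  }
+
+With the operands ``4``, ``x``, and ``5``, the result is ``x`` followed by the
+single constant ``9``.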
+
+``-reg2mem``: Demote all values to stack slots
+----------------------------------------------
+
+This file demotes all registers to memory references.  It is intended to be the
+inverse of :ref:`mem2reg <passes-mem2reg>`.  By converting to ``load``
+instructions, the only values live across basic blocks are ``alloca``
+instructions and ``load`` instructions before ``phi`` nodes.  It is intended
+that this should make CFG hacking much easier.  To make later hacking easier,
+the entry block is split into two, such that all introduced ``alloca``
+instructions (and nothing else) are in the entry block.
+
+``-sroa``: Scalar Replacement of Aggregates
+------------------------------------------------------
+
+The well-known scalar replacement of aggregates transformation.  This transform
+breaks up ``alloca`` instructions of aggregate type (structure or array) into
+individual ``alloca`` instructions for each member if possible.  Then, if
+possible, it transforms the individual ``alloca`` instructions into nice clean
+scalar SSA form.
+
+.. _passes-sccp:
+
+``-sccp``: Sparse Conditional Constant Propagation
+--------------------------------------------------
+
+Sparse conditional constant propagation and merging, which can be summarized
+as:
+
+* Assumes values are constant unless proven otherwise
+* Assumes BasicBlocks are dead unless proven otherwise
+* Proves values to be constant, and replaces them with constants
+* Proves conditional branches to be unconditional
+
+Note that this pass has a habit of making definitions be dead.  It is a good
+idea to run a :ref:`DCE <passes-dce>` pass sometime after running this pass.
+
+.. _passes-simplifycfg:
+
+``-simplifycfg``: Simplify the CFG
+----------------------------------
+
+Performs dead code elimination and basic block merging.  Specifically:
+
+* Removes basic blocks with no predecessors.
+* Merges a basic block into its predecessor if there is only one and the
+  predecessor only has one successor.
+* Eliminates PHI nodes for basic blocks with a single predecessor.
+* Eliminates a basic block that only contains an unconditional branch.
+
+``-sink``: Code sinking
+-----------------------
+
+This pass moves instructions into successor blocks, when possible, so that they
+aren't executed on paths where their results aren't needed.
+
+``-strip``: Strip all symbols from a module
+-------------------------------------------
+
+Performs code stripping.  This transformation can delete:
+
+* names for virtual registers
+* symbols for internal globals and functions
+* debug information
+
+Note that this transformation makes code much less readable, so it should only
+be used in situations where the strip utility would be used, such as reducing
+code size or making it harder to reverse engineer code.
+
+``-strip-dead-debug-info``: Strip debug info for unused symbols
+---------------------------------------------------------------
+
+.. FIXME: this description is the same as for -strip
+
+Performs code stripping.  This transformation can delete:
+
+* names for virtual registers
+* symbols for internal globals and functions
+* debug information
+
+Note that this transformation makes code much less readable, so it should only
+be used in situations where the strip utility would be used, such as reducing
+code size or making it harder to reverse engineer code.
+
+``-strip-dead-prototypes``: Strip Unused Function Prototypes
+------------------------------------------------------------
+
+This pass loops over all of the functions in the input module, looking for dead
+declarations and removes them.  Dead declarations are declarations of functions
+for which no implementation is available (i.e., declarations for unused library
+functions).
+
+``-strip-debug-declare``: Strip all ``llvm.dbg.declare`` intrinsics
+-------------------------------------------------------------------
+
+.. FIXME: this description is the same as for -strip
+
+This pass implements code stripping.  Specifically, it can delete:
+
+#. names for virtual registers
+#. symbols for internal globals and functions
+#. debug information
+
+Note that this transformation makes code much less readable, so it should only
+be used in situations where the 'strip' utility would be used, such as reducing
+code size or making it harder to reverse engineer code.
+
+``-strip-nondebug``: Strip all symbols, except dbg symbols, from a module
+-------------------------------------------------------------------------
+
+.. FIXME: this description is the same as for -strip
+
+This pass implements code stripping.  Specifically, it can delete:
+
+#. names for virtual registers
+#. symbols for internal globals and functions
+#. debug information
+
+Note that this transformation makes code much less readable, so it should only
+be used in situations where the 'strip' utility would be used, such as reducing
+code size or making it harder to reverse engineer code.
+
+``-tailcallelim``: Tail Call Elimination
+----------------------------------------
+
+This file transforms calls of the current function (self recursion) followed by
+a return instruction with a branch to the entry of the function, creating a
+loop.  This pass also implements the following extensions to the basic
+algorithm:
+
+#. Trivial instructions between the call and return do not prevent the
+   transformation from taking place, though currently the analysis cannot
+   support moving any really useful instructions (only dead ones).
+#. This pass transforms functions that are prevented from being tail recursive
+   by an associative expression to use an accumulator variable, thus compiling
+   the typical naive factorial or fib implementation into efficient code.
+#. TRE is performed if the function returns void, if the return returns the
+   result returned by the call, or if the function returns a run-time constant
+   on all exits from the function.  It is possible, though unlikely, that the
+   return returns something else (like constant 0), and can still be TRE'd.  It
+   can be TRE'd if *all other* return instructions in the function return the
+   exact same value.
+#. If it can prove that callees do not access their caller stack frame, they
+   are marked as eligible for tail call elimination (by the code generator).
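+
+The accumulator extension in item 2 can be pictured at the source level.  The
+sketch below is only an illustration: the pass itself works on LLVM IR, and the
+function names ``factorialNaive`` and ``factorialLoop`` are made up for this
+example.  The second function shows roughly the loop shape that results once
+the pending multiplication is moved into an accumulator.
+
```cpp
#include <cstdint>

// Naive form: the multiplication happens after the recursive call returns,
// so the self-call is not in tail position.
std::uint64_t factorialNaive(std::uint64_t N) {
  if (N <= 1)
    return 1;
  return N * factorialNaive(N - 1);
}

// Approximate shape after an accumulator is introduced: the associative
// multiplication is folded into Acc, and the self-call becomes a branch
// back to the top of the loop.
std::uint64_t factorialLoop(std::uint64_t N) {
  std::uint64_t Acc = 1; // holds the product of the values seen so far
  while (N > 1) {
    Acc *= N;
    --N;
  }
  return Acc;
}
```
+
+Both functions compute the same value; the second one needs no stack frame per
+step, which is the point of the transformation.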
+
+Utility Passes
+==============
+
+This section describes the LLVM Utility Passes.
+
+``-deadarghaX0r``: Dead Argument Hacking (BUGPOINT USE ONLY; DO NOT USE)
+------------------------------------------------------------------------
+
+Same as dead argument elimination, but deletes arguments to functions which are
+external.  This is only for use by :doc:`bugpoint <Bugpoint>`.
+
+``-extract-blocks``: Extract Basic Blocks From Module (for bugpoint use)
+------------------------------------------------------------------------
+
+This pass is used by bugpoint to extract all blocks from the module into their
+own functions.
+
+``-instnamer``: Assign names to anonymous instructions
+------------------------------------------------------
+
+This is a little utility pass that gives instructions names.  This is mostly
+useful when diffing the effect of an optimization, because deleting an unnamed
+instruction can change all other instruction numbering, making the diff very
+noisy.
+
+.. _passes-verify:
+
+``-verify``: Module Verifier
+----------------------------
+
+Verifies LLVM IR code.  This is useful to run after an optimization which is
+undergoing testing.  Note that llvm-as verifies its input before emitting
+bitcode, and also that malformed bitcode is likely to make LLVM crash.  All
+language front-ends are therefore encouraged to verify their output before
+performing optimizing transformations.
+
+#. Both of a binary operator's parameters are of the same type.
+#. Verify that the indices of mem access instructions match other operands.
+#. Verify that arithmetic and other things are only performed on first-class
+   types.  For example, verify that shifts and logical operations only happen
+   on integral types.
+#. All of the constants in a switch statement are of the correct type.
+#. The code is in valid SSA form.
+#. It is illegal to put a label into any other type (like a structure) or to
+   return one.
+#. Only phi nodes can be self referential: ``%x = add i32 %x, %x`` is
+   invalid.
+#. PHI nodes must have an entry for each predecessor, with no extras.
+#. PHI nodes must be the first thing in a basic block, all grouped together.
+#. PHI nodes must have at least one entry.
+#. All basic blocks should only end with terminator insts, not contain them.
+#. The entry node to a function must not have predecessors.
+#. All Instructions must be embedded into a basic block.
+#. Functions cannot take a void-typed parameter.
+#. Verify that a function's argument list agrees with its declared type.
+#. It is illegal to specify a name for a void value.
+#. It is illegal to have an internal global value with no initializer.
+#. It is illegal to have a ``ret`` instruction that returns a value that does
+   not agree with the function return value type.
+#. Function call argument types match the function prototype.
+#. All other things that are tested by asserts spread about the code.
+
+Note that this does not provide full security verification (like Java), but
+instead just tries to ensure that code is well-formed.
+
+``-view-cfg``: View CFG of function
+-----------------------------------
+
+Displays the control flow graph using the GraphViz tool.
+
+``-view-cfg-only``: View CFG of function (with no function bodies)
+------------------------------------------------------------------
+
+Displays the control flow graph using the GraphViz tool, but omitting function
+bodies.
+
+``-view-dom``: View dominance tree of function
+----------------------------------------------
+
+Displays the dominator tree using the GraphViz tool.
+
+``-view-dom-only``: View dominance tree of function (with no function bodies)
+-----------------------------------------------------------------------------
+
+Displays the dominator tree using the GraphViz tool, but omitting function
+bodies.
+
+``-view-postdom``: View postdominance tree of function
+------------------------------------------------------
+
+Displays the post dominator tree using the GraphViz tool.
+
+``-view-postdom-only``: View postdominance tree of function (with no function bodies)
+-------------------------------------------------------------------------------------
+
+Displays the post dominator tree using the GraphViz tool, but omitting function
+bodies.
+
+``-transform-warning``: Report missed forced transformations
+------------------------------------------------------------
+
+Emits warnings about not yet applied forced transformations (e.g. from
+``#pragma omp simd``).

Added: www-releases/trunk/9.0.0/docs/_sources/Phabricator.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Phabricator.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Phabricator.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Phabricator.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,250 @@
+.. _phabricator-reviews:
+
+=============================
+Code Reviews with Phabricator
+=============================
+
+.. contents::
+  :local:
+
+If you prefer to use a web user interface for code reviews, you can now submit
+your patches for Clang and LLVM at `LLVM's Phabricator`_ instance.
+
+While Phabricator is a useful tool for some, the relevant -commits mailing list
+is the system of record for all LLVM code review. The mailing list should be
+added as a subscriber on all reviews, and Phabricator users should be prepared
+to respond to free-form comments in mail sent to the commits list.
+
+Sign up
+-------
+
+To get started with Phabricator, navigate to `https://reviews.llvm.org`_ and
+click the power icon in the top right. You can register with a GitHub account,
+a Google account, or you can create your own profile.
+
+Make *sure* that the email address registered with Phabricator is subscribed
+to the relevant -commits mailing list. If you are not subscribed to the commit
+list, all mail sent by Phabricator on your behalf will be held for moderation.
+
+Note that if you use your Subversion user name as Phabricator user name,
+Phabricator will automatically connect your submits to your Phabricator user in
+the `Code Repository Browser`_.
+
+Requesting a review via the command line
+----------------------------------------
+
+Phabricator has a tool called *Arcanist* to upload patches from
+the command line. To get you set up, follow the
+`Arcanist Quick Start`_ instructions.
+
+You can learn more about how to use arc to interact with
+Phabricator in the `Arcanist User Guide`_.
+
+.. _phabricator-request-review-web:
+
+Requesting a review via the web interface
+-----------------------------------------
+
+The tool to create and review patches in Phabricator is called
+*Differential*.
+
+Note that you can upload patches created through various diff tools,
+including git and svn. To make reviews easier, please always include
+**as much context as possible** with your diff! Don't worry, Phabricator
+will automatically send a diff with a smaller context in the review
+email, but having the full file in the web interface will help the
+reviewer understand your code.
+
+To get a full diff, use one of the following commands (or just use Arcanist
+to upload your patch):
+
+* ``git show HEAD -U999999 > mypatch.patch``
+* ``git format-patch -U999999 @{u}``
+* ``svn diff --diff-cmd=diff -x -U999999``
+
+To upload a new patch:
+
+* Click *Differential*.
+* Click *+ Create Diff*.
+* Paste the text diff or browse to the patch file. Click *Create Diff*.
+* Leave this first Repository field blank. (We'll fill in the Repository
+  later, when sending the review.)
+* Leave the drop down on *Create a new Revision...* and click *Continue*.
+* Enter a descriptive title and summary.  The title and summary are usually
+  in the form of a :ref:`commit message <commit messages>`.
+* Add reviewers (see below for advice). (If you set the Repository field
+  correctly, llvm-commits or cfe-commits will be subscribed automatically;
+  otherwise, you will have to manually subscribe them.)
+* In the Repository field, enter the name of the project (LLVM, Clang,
+  etc.) to which the review should be sent.
+* Click *Save*.
+
+To submit an updated patch:
+
+* Click *Differential*.
+* Click *+ Create Diff*.
+* Paste the updated diff or browse to the updated patch file. Click *Create Diff*.
+* Select the review you want to update from the *Attach To* dropdown and click
+  *Continue*.
+* Leave the Repository field blank. (We previously filled out the Repository
+  for the review request.)
+* Add comments about the changes in the new diff. Click *Save*.
+
+Choosing reviewers: You typically pick one or two people as initial reviewers.
+This choice is not crucial, because you are merely suggesting and not requiring
+them to participate. Many people will see the email notification on cfe-commits
+or llvm-commits, and if the subject line suggests the patch is something they
+should look at, they will.
+
+
+.. _finding-potential-reviewers:
+
+Finding potential reviewers
+---------------------------
+
+Here are a couple of ways to pick the initial reviewer(s):
+
+* Use ``svn blame`` and the commit log to find names of people who have
+  recently modified the same area of code that you are modifying.
+* Look in CODE_OWNERS.TXT to see who might be responsible for that area.
+* If you've discussed the change on a dev list, the people who participated
+  might be appropriate reviewers.
+
+Even if you think the code owner is the busiest person in the world, it's still
+okay to put them as a reviewer. Being the code owner means they have accepted
+responsibility for making sure the review happens.
+
+Reviewing code with Phabricator
+-------------------------------
+
+Phabricator allows you to add inline comments as well as overall comments
+to a revision. To add an inline comment, select the lines of code you want
+to comment on by clicking and dragging the line numbers in the diff pane.
+When you have added all your comments, scroll to the bottom of the page and
+click the Submit button.
+
+You can add overall comments in the text box at the bottom of the page.
+When you're done, click the Submit button.
+
+Phabricator has many useful features, for example allowing you to select
+diffs between different versions of the patch as it was reviewed in the
+*Revision Update History*. Most features are self descriptive - explore, and
+if you have a question, drop by on #llvm in IRC to get help.
+
+Note that as e-mail is the system of reference for code reviews, and some
+people prefer it over a web interface, we do not generate automated mail
+when a review changes state, for example by clicking "Accept Revision" in
+the web interface. Thus, please type LGTM into the comment box to accept
+a change from Phabricator.
+
+Committing a change
+-------------------
+
+Once a patch has been reviewed and approved on Phabricator it can then be
+committed to trunk. If you do not have commit access, someone has to
+commit the change for you (with attribution). It is sufficient to add
+a comment to the approved review indicating you cannot commit the patch
+yourself. If you have commit access, there are multiple workflows to commit the
+change. Whichever method you follow, it is recommended that your commit message
+ends with the line:
+
+::
+
+  Differential Revision: <URL>
+
+where ``<URL>`` is the URL for the code review, starting with
+``https://reviews.llvm.org/``.
+
+This allows people reading the version history to see the review for
+context. This also allows Phabricator to detect the commit, close the
+review, and add a link from the review to the commit.
+
+Note that if you use the Arcanist tool the ``Differential Revision`` line will
+be added automatically. If you don't want to use Arcanist, you can add the
+``Differential Revision`` line (as the last line) to the commit message
+yourself.
+
+Using the Arcanist tool can simplify the process of committing reviewed code as
+it will retrieve reviewers, the ``Differential Revision``, etc. from the review
+and place it in the commit message. You may also commit an accepted change
+directly using ``git llvm push``, per the section in the :ref:`getting started
+guide <commit_from_git>`.
+
+Note that if you commit the change without using Arcanist and forget to add the
+``Differential Revision`` line to your commit message then it is recommended
+that you close the review manually. In the web UI, under "Leap Into Action" put
+the SVN revision number in the Comment, set the Action to "Close Revision" and
+click Submit.  Note the review must have been Accepted first.
+
+
+Committing someone's change from Phabricator
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+On a clean Git repository on an up to date ``master`` branch run the
+following (where ``<Revision>`` is the Phabricator review number):
+
+::
+
+  arc patch D<Revision>
+
+
+This will create a new branch called ``arcpatch-D<Revision>`` based on the
+current ``master`` and will create a commit corresponding to ``D<Revision>`` with a
+commit message derived from information in the Phabricator review.
+
+Check you are happy with the commit message and amend it if necessary. Then,
+make sure the commit is up-to-date, and commit it. This can be done by running
+the following:
+
+::
+
+  git pull --rebase origin master
+  git show # Ensure the patch looks correct.
+  ninja check-$whatever # Rerun the appropriate tests if needed.
+  git llvm push
+
+Subversion and Arcanist (deprecated)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To download a change from Phabricator and commit it with subversion, you should
+first make sure you have a clean working directory. Then run the following
+(where ``<Revision>`` is the Phabricator review number):
+
+::
+
+  arc patch D<Revision>
+  arc commit --revision D<Revision>
+
+The first command will take the latest version of the reviewed patch and apply
+it to the working copy. The second command will commit this revision to trunk.
+
+Abandoning a change
+-------------------
+
+If you decide you should not commit the patch, you should explicitly abandon
+the review so that reviewers don't think it is still open. In the web UI,
+scroll to the bottom of the page where normally you would enter an overall
+comment. In the drop-down Action list, which defaults to "Comment," you should
+select "Abandon Revision" and then enter a comment explaining why. Click the
+Submit button to finish closing the review.
+
+Status
+------
+
+Please let us know whether you like it and what could be improved! We're still
+working on setting up a bug tracker, but you can email klimek-at-google-dot-com
+and chandlerc-at-gmail-dot-com and CC the llvm-dev mailing list with questions
+until then. We also could use help implementing improvements. This sadly is
+really painful and hard because the Phabricator codebase is in PHP and not as
+testable as you might like. However, we've put exactly what we're deploying up
+on an `llvm-reviews GitHub project`_ where folks can hack on it and post pull
+requests. We're looking into what the right long-term hosting for this is, but
+note that it is a derivative of an existing open source project, and so not
+trivially a good fit for an official LLVM project.
+
+.. _LLVM's Phabricator: https://reviews.llvm.org
+.. _`https://reviews.llvm.org`: https://reviews.llvm.org
+.. _Code Repository Browser: https://reviews.llvm.org/diffusion/
+.. _Arcanist Quick Start: https://secure.phabricator.com/book/phabricator/article/arcanist_quick_start/
+.. _Arcanist User Guide: https://secure.phabricator.com/book/phabricator/article/arcanist/
+.. _llvm-reviews GitHub project: https://github.com/r4nt/llvm-reviews/

Added: www-releases/trunk/9.0.0/docs/_sources/ProgrammersManual.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/ProgrammersManual.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/ProgrammersManual.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/ProgrammersManual.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,4179 @@
+========================
+LLVM Programmer's Manual
+========================
+
+.. contents::
+   :local:
+
+.. warning::
+   This is always a work in progress.
+
+.. _introduction:
+
+Introduction
+============
+
+This document is meant to highlight some of the important classes and interfaces
+available in the LLVM source-base.  This manual is not intended to explain what
+LLVM is, how it works, and what LLVM code looks like.  It assumes that you know
+the basics of LLVM and are interested in writing transformations or otherwise
+analyzing or manipulating the code.
+
+This document should get you oriented so that you can find your way in the
+continuously growing source code that makes up the LLVM infrastructure.  Note
+that this manual is not intended to serve as a replacement for reading the
+source code, so if you think there should be a method in one of these classes to
+do something, but it's not listed, check the source.  Links to the `doxygen
+<http://llvm.org/doxygen/>`__ sources are provided to make this as easy as
+possible.
+
+The first section of this document describes general information that is useful
+to know when working in the LLVM infrastructure, and the second describes the
+Core LLVM classes.  In the future this manual will be extended with information
+describing how to use extension libraries, such as dominator information, CFG
+traversal routines, and useful utilities like the ``InstVisitor`` (`doxygen
+<http://llvm.org/doxygen/InstVisitor_8h_source.html>`__) template.
+
+.. _general:
+
+General Information
+===================
+
+This section contains general information that is useful if you are working in
+the LLVM source-base, but that isn't specific to any particular API.
+
+.. _stl:
+
+The C++ Standard Template Library
+---------------------------------
+
+LLVM makes heavy use of the C++ Standard Template Library (STL), perhaps much
+more than you are used to, or have seen before.  Because of this, you might want
+to do a little background reading in the techniques used and capabilities of the
+library.  There are many good pages that discuss the STL, and several books on
+the subject that you can get, so it will not be discussed in this document.
+
+Here are some useful links:
+
+#. `cppreference.com
+   <http://en.cppreference.com/w/>`_ - an excellent
+   reference for the STL and other parts of the standard C++ library.
+
+#. `C++ In a Nutshell <http://www.tempest-sw.com/cpp/>`_ - This is an O'Reilly
+   book in the making.  It has a decent Standard Library Reference that rivals
+   Dinkumware's, and is unfortunately no longer free since the book has been
+   published.
+
+#. `C++ Frequently Asked Questions <http://www.parashift.com/c++-faq-lite/>`_.
+
+#. `SGI's STL Programmer's Guide <http://www.sgi.com/tech/stl/>`_ - Contains a
+   useful `Introduction to the STL
+   <http://www.sgi.com/tech/stl/stl_introduction.html>`_.
+
+#. `Bjarne Stroustrup's C++ Page
+   <http://www.research.att.com/%7Ebs/C++.html>`_.
+
+#. `Bruce Eckel's Thinking in C++, 2nd ed. Volume 2 Revision 4.0
+   (even better, get the book)
+   <http://www.mindview.net/Books/TICPP/ThinkingInCPP2e.html>`_.
+
+You are also encouraged to take a look at the :doc:`LLVM Coding Standards
+<CodingStandards>` guide which focuses on how to write maintainable code more
+than where to put your curly braces.
+
+.. _resources:
+
+Other useful references
+-----------------------
+
+#. `Using static and shared libraries across platforms
+   <http://www.fortran-2000.com/ArnaudRecipes/sharedlib.html>`_
+
+.. _apis:
+
+Important and useful LLVM APIs
+==============================
+
+Here we highlight some LLVM APIs that are generally useful and good to know
+about when writing transformations.
+
+.. _isa:
+
+The ``isa<>``, ``cast<>`` and ``dyn_cast<>`` templates
+------------------------------------------------------
+
+The LLVM source-base makes extensive use of a custom form of RTTI.  These
+templates have many similarities to the C++ ``dynamic_cast<>`` operator, but
+they don't have some drawbacks (primarily stemming from the fact that
+``dynamic_cast<>`` only works on classes that have a v-table).  Because they are
+used so often, you must know what they do and how they work.  All of these
+templates are defined in the ``llvm/Support/Casting.h`` (`doxygen
+<http://llvm.org/doxygen/Casting_8h_source.html>`__) file (note that you very
+rarely have to include this file directly).
+
+``isa<>``:
+  The ``isa<>`` operator works exactly like the Java "``instanceof``" operator.
+  It returns true or false depending on whether a reference or pointer points to
+  an instance of the specified class.  This can be very useful for constraint
+  checking of various sorts (example below).
+
+``cast<>``:
+  The ``cast<>`` operator is a "checked cast" operation.  It converts a pointer
+  or reference from a base class to a derived class, causing an assertion
+  failure if it is not really an instance of the right type.  This should be
+  used in cases where you have some information that makes you believe that
+  something is of the right type.  An example of the ``isa<>`` and ``cast<>``
+  template is:
+
+  .. code-block:: c++
+
+    static bool isLoopInvariant(const Value *V, const Loop *L) {
+      if (isa<Constant>(V) || isa<Argument>(V) || isa<GlobalValue>(V))
+        return true;
+
+      // Otherwise, it must be an instruction...
+      return !L->contains(cast<Instruction>(V)->getParent());
+    }
+
+  Note that you should **not** use an ``isa<>`` test followed by a ``cast<>``;
+  for that, use the ``dyn_cast<>`` operator.
+
+``dyn_cast<>``:
+  The ``dyn_cast<>`` operator is a "checking cast" operation.  It checks to see
+  if the operand is of the specified type, and if so, returns a pointer to it
+  (this operator does not work with references).  If the operand is not of the
+  correct type, a null pointer is returned.  Thus, this works very much like
+  the ``dynamic_cast<>`` operator in C++, and should be used in the same
+  circumstances.  Typically, the ``dyn_cast<>`` operator is used in an ``if``
+  statement or some other flow control statement like this:
+
+  .. code-block:: c++
+
+    if (auto *AI = dyn_cast<AllocaInst>(Val)) {
+      // ...
+    }
+
+  This form of the ``if`` statement effectively combines together a call to
+  ``isa<>`` and a call to ``cast<>`` into one statement, which is very
+  convenient.
+
+  Note that the ``dyn_cast<>`` operator, like C++'s ``dynamic_cast<>`` or Java's
+  ``instanceof`` operator, can be abused.  In particular, you should not use big
+  chained ``if/then/else`` blocks to check for lots of different variants of
+  classes.  If you find yourself wanting to do this, it is much cleaner and more
+  efficient to use the ``InstVisitor`` class to dispatch over the instruction
+  type directly.
+
+``isa_and_nonnull<>``:
+  The ``isa_and_nonnull<>`` operator works just like the ``isa<>`` operator,
+  except that it allows for a null pointer as an argument (for which it
+  returns false).  This can sometimes be useful, allowing you to combine several
+  null checks into one.
+
+``cast_or_null<>``:
+  The ``cast_or_null<>`` operator works just like the ``cast<>`` operator,
+  except that it allows for a null pointer as an argument (which it then
+  propagates).  This can sometimes be useful, allowing you to combine several
+  null checks into one.
+
+``dyn_cast_or_null<>``:
+  The ``dyn_cast_or_null<>`` operator works just like the ``dyn_cast<>``
+  operator, except that it allows for a null pointer as an argument (which it
+  then propagates).  This can sometimes be useful, allowing you to combine
+  several null checks into one.
+
+These six templates can be used with any classes, whether they have a v-table
+or not.  If you want to add support for these templates, see the document
+:doc:`How to set up LLVM-style RTTI for your class hierarchy
+<HowToSetUpLLVMStyleRTTI>`.
+
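+To make the mechanics concrete, here is a minimal sketch of how such templates
+can work without a v-table.  Everything in it is an assumption made for
+illustration: the ``Shape`` hierarchy is hypothetical, and the ``isa``,
+``cast`` and ``dyn_cast`` functions are simplified stand-ins, not the
+implementations in ``llvm/Support/Casting.h``.  The idea it shows is that each
+derived class answers "is this object one of mine?" through a static
+``classof`` function that inspects a kind tag.
+
```cpp
#include <cassert>

// Hypothetical hierarchy: the Kind tag stands in for run-time type info.
class Shape {
public:
  enum ShapeKind { SK_Square, SK_Circle };
  explicit Shape(ShapeKind K) : Kind(K) {}
  ShapeKind getKind() const { return Kind; }

private:
  const ShapeKind Kind;
};

class Square : public Shape {
public:
  explicit Square(double S) : Shape(SK_Square), Side(S) {}
  static bool classof(const Shape *S) { return S->getKind() == SK_Square; }
  double Side;
};

class Circle : public Shape {
public:
  explicit Circle(double R) : Shape(SK_Circle), Radius(R) {}
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
  double Radius;
};

// Simplified stand-ins for the real templates (pointer arguments only).
template <typename To, typename From> bool isa(const From *V) {
  assert(V && "isa<> used on a null pointer");
  return To::classof(V);
}

template <typename To, typename From> To *cast(From *V) {
  assert(isa<To>(V) && "cast<> argument of incompatible type");
  return static_cast<To *>(V);
}

template <typename To, typename From> To *dyn_cast(From *V) {
  return isa<To>(V) ? static_cast<To *>(V) : nullptr;
}

// dyn_cast<> where the type is unknown, cast<> where it is already implied.
double area(Shape *S) {
  if (auto *Sq = dyn_cast<Square>(S))
    return Sq->Side * Sq->Side;
  Circle *C = cast<Circle>(S); // only two kinds exist, so this must hold
  return 3.141592653589793 * C->Radius * C->Radius;
}
```
+
+In ``area``, ``dyn_cast<>`` replaces an ``isa<>`` test followed by a
+``cast<>``, and ``cast<>`` is reserved for the branch where the type is already
+known.
+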
+.. _string_apis:
+
+Passing strings (the ``StringRef`` and ``Twine`` classes)
+---------------------------------------------------------
+
+Although LLVM generally does not do much string manipulation, we do have several
+important APIs which take strings.  Two important examples are the Value class
+-- which has names for instructions, functions, etc. -- and the ``StringMap``
+class which is used extensively in LLVM and Clang.
+
+These are generic classes, and they need to be able to accept strings which may
+have embedded null characters.  Therefore, they cannot simply take a ``const
+char *``, and taking a ``const std::string&`` requires clients to perform a heap
+allocation which is usually unnecessary.  Instead, many LLVM APIs use a
+``StringRef`` or a ``const Twine&`` for passing strings efficiently.
+
+.. _StringRef:
+
+The ``StringRef`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``StringRef`` data type represents a reference to a constant string (a
+character array and a length) and supports the common operations available on
+``std::string``, but does not require heap allocation.
+
+It can be implicitly constructed using a C style null-terminated string, an
+``std::string``, or explicitly with a character pointer and length.  For
+example, the ``StringMap`` find function is declared as:
+
+.. code-block:: c++
+
+  iterator find(StringRef Key);
+
+and clients can call it using any one of:
+
+.. code-block:: c++
+
+  Map.find("foo");                 // Lookup "foo"
+  Map.find(std::string("bar"));    // Lookup "bar"
+  Map.find(StringRef("\0baz", 4)); // Lookup "\0baz"
+
+Similarly, APIs which need to return a string may return a ``StringRef``
+instance, which can be used directly or converted to an ``std::string`` using
+the ``str`` member function.  See ``llvm/ADT/StringRef.h`` (`doxygen
+<http://llvm.org/doxygen/StringRef_8h_source.html>`__) for more
+information.
+
+You should rarely use the ``StringRef`` class directly.  Because it contains
+pointers to external memory, it is not generally safe to store an instance of the
+class (unless you know that the external storage will not be freed).
+``StringRef`` is small and pervasive enough in LLVM that it should always be
+passed by value.
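+
+The following sketch illustrates the "character array and a length" idea with
+a hypothetical ``StrView`` type, so that it compiles without any LLVM headers.
+It is not ``llvm::StringRef``; it only mirrors the three ways of constructing
+one described above, and shows why an explicit length is needed to keep
+embedded null characters.
+
```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a string reference: a pointer plus a length.
// It owns nothing, so copying it never allocates.
struct StrView {
  const char *Data;
  std::size_t Length;

  StrView(const char *CStr) : Data(CStr), Length(std::strlen(CStr)) {}
  StrView(const std::string &S) : Data(S.data()), Length(S.size()) {}
  StrView(const char *Ptr, std::size_t Len) : Data(Ptr), Length(Len) {}

  // Copies the referenced characters into an owning std::string.
  std::string str() const { return std::string(Data, Length); }
};

// Takes the view by value, as recommended for small reference types.
std::size_t keyLength(StrView Key) { return Key.Length; }
```
+
+With these definitions, ``keyLength("\0baz")`` sees an empty key because the
+C string constructor stops at the first null character, whereas
+``keyLength(StrView("\0baz", 4))`` sees all four characters.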
+
+The ``Twine`` class
+^^^^^^^^^^^^^^^^^^^
+
+The ``Twine`` (`doxygen <http://llvm.org/doxygen/classllvm_1_1Twine.html>`__)
+class is an efficient way for APIs to accept concatenated strings.  For example,
+a common LLVM paradigm is to name one instruction based on the name of another
+instruction with a suffix, for example:
+
+.. code-block:: c++
+
+    New = CmpInst::Create(..., SO->getName() + ".cmp");
+
+The ``Twine`` class is effectively a lightweight `rope
+<http://en.wikipedia.org/wiki/Rope_(computer_science)>`_ which points to
+temporary (stack allocated) objects.  Twines can be implicitly constructed as
+the result of the plus operator applied to strings (i.e., a C string, an
+``std::string``, or a ``StringRef``).  The twine delays the actual concatenation
+of strings until it is actually required, at which point it can be efficiently
+rendered directly into a character array.  This avoids unnecessary heap
+allocation involved in constructing the temporary results of string
+concatenation.  See ``llvm/ADT/Twine.h`` (`doxygen
+<http://llvm.org/doxygen/Twine_8h_source.html>`__) and :ref:`here <dss_twine>`
+for more information.
+
+As with a ``StringRef``, ``Twine`` objects point to external memory and should
+almost never be stored or mentioned directly.  They are intended solely for use
+when defining a function which should be able to efficiently accept concatenated
+strings.
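+
+As a rough illustration of delayed concatenation, the sketch below uses a
+hypothetical two-piece ``LazyConcat`` class.  It is far simpler than the real
+``Twine`` and shares none of its code; the names ``LazyConcat`` and
+``setName`` are invented for this example.  It only shows the principle: the
+argument records where the pieces live, and the single allocation happens when
+the callee finally asks for the full string.
+
```cpp
#include <string>

// Hypothetical stand-in for a two-piece twine.  It stores a reference and a
// pointer to the pieces and copies nothing until str() is called.
class LazyConcat {
  const std::string &LHS;
  const char *RHS;

public:
  LazyConcat(const std::string &L, const char *R) : LHS(L), RHS(R) {}

  // Renders both pieces into one buffer with a single reservation.
  std::string str() const {
    std::string Out;
    Out.reserve(LHS.size() + std::char_traits<char>::length(RHS));
    Out += LHS;
    Out += RHS;
    return Out;
  }
};

// A callee that accepts a concatenated name; the concatenation happens here,
// not at the call site.
std::string setName(const LazyConcat &Name) { return Name.str(); }
```
+
+Because ``LazyConcat`` only refers to its pieces, it must not outlive them,
+which is the same reason ``Twine`` objects should not be stored.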
+
+.. _formatting_strings:
+
+Formatting strings (the ``formatv`` function)
+---------------------------------------------
+While LLVM doesn't necessarily do a lot of string manipulation and parsing, it
+does do a lot of string formatting.  From diagnostic messages, to llvm tool
+outputs such as ``llvm-readobj`` to printing verbose disassembly listings and
+LLDB runtime logging, the need for string formatting is pervasive.
+
+The ``formatv`` function is similar in spirit to ``printf``, but uses a different syntax
+which borrows heavily from Python and C#.  Unlike ``printf`` it deduces the type
+to be formatted at compile time, so it does not need a format specifier such as
+``%d``.  This reduces the mental overhead of trying to construct portable format
+strings, especially for platform-specific types like ``size_t`` or pointer types.
+Unlike both ``printf`` and Python, it additionally fails to compile if LLVM does
+not know how to format the type.  These two properties ensure that the function
+is both safer and simpler to use than traditional formatting methods such as 
+the ``printf`` family of functions.
+
+Simple formatting
+^^^^^^^^^^^^^^^^^
+
+A call to ``formatv`` involves a single **format string** consisting of 0 or more
+**replacement sequences**, followed by a variable length list of **replacement values**.
+A replacement sequence is a string of the form ``{N[[,align]:style]}``.
+
+``N`` refers to the 0-based index of the argument from the list of replacement
+values.  Note that this means it is possible to reference the same parameter
+multiple times, possibly with different style and/or alignment options, in any order.
+
+``align`` is an optional string specifying the width of the field to format
+the value into, and the alignment of the value within the field.  It is specified as
+an optional **alignment style** followed by a positive integral **field width**.  The
+alignment style can be one of the characters ``-`` (left align), ``=`` (center align),
+or ``+`` (right align).  The default is right aligned.  
+
+``style`` is an optional, type-specific string that controls the formatting of
+the value.  For example, to format a floating point value as a percentage,
+you can use the style option ``P``.
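+
+The alignment rules described above can be sketched in a few lines of plain
+C++.  The helper ``alignField`` below is hypothetical and is not how
+``formatv`` is implemented.  It assumes that a value wider than the field is
+left untouched, and that an odd amount of centering padding puts the extra
+space on the right.
+
```cpp
#include <cstddef>
#include <string>

// Hypothetical helper that pads Value to Width according to an alignment
// style character: '-' (left), '=' (center), anything else (right).
std::string alignField(const std::string &Value, char Style,
                       std::size_t Width) {
  if (Value.size() >= Width)
    return Value; // assumption: values are never truncated

  std::size_t Pad = Width - Value.size();
  std::size_t Left;
  switch (Style) {
  case '-':
    Left = 0; // all padding goes on the right
    break;
  case '=':
    Left = Pad / 2; // any odd leftover space goes on the right
    break;
  default:
    Left = Pad; // '+' and the default: all padding goes on the left
    break;
  }
  return std::string(Left, ' ') + Value + std::string(Pad - Left, ' ');
}
```
+
+For instance, ``alignField("a", '=', 7)`` yields the value surrounded by three
+spaces on each side.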
+
+Custom formatting
+^^^^^^^^^^^^^^^^^
+
+There are two ways to customize the formatting behavior for a type.
+
+1. Provide a template specialization of ``llvm::format_provider<T>`` for your
+   type ``T`` with the appropriate static format method.
+
+  .. code-block:: c++
+  
+    namespace llvm {
+      template<>
+      struct format_provider<MyFooBar> {
+        static void format(const MyFooBar &V, raw_ostream &Stream, StringRef Style) {
+          // Do whatever is necessary to format `V` into `Stream`
+        }
+      };
+      void foo() {
+        MyFooBar X;
+        std::string S = formatv("{0}", X);
+      }
+    }
+    
+  This is a useful extensibility mechanism for adding support for formatting your own
+  custom types with your own custom Style options.  But it does not help when you want
+  to extend the mechanism for formatting a type that the library already knows how to
+  format.  For that, we need something else.
+    
+2. Provide a **format adapter** inheriting from ``llvm::FormatAdapter<T>``.
+
+  .. code-block:: c++
+  
+    namespace anything {
+      struct format_int_custom : public llvm::FormatAdapter<int> {
+        explicit format_int_custom(int N) : llvm::FormatAdapter<int>(N) {}
+        void format(llvm::raw_ostream &Stream, llvm::StringRef Style) override {
+          // Do whatever is necessary to format `this->Item` into `Stream`
+        }
+      };
+    }
+    namespace llvm {
+      void foo() {
+        std::string S = formatv("{0}", anything::format_int_custom(42));
+      }
+    }
+    
+  If the type is detected to be derived from ``FormatAdapter<T>``, ``formatv``
+  will call the ``format`` method on the argument, passing in the specified
+  style.  This allows one to provide custom formatting of any type, including
+  one which already has a builtin format provider.
+
+``formatv`` Examples
+^^^^^^^^^^^^^^^^^^^^
+The following is an incomplete set of examples demonstrating the usage of
+``formatv``.  More information can be found by reading the doxygen
+documentation or by looking at the unit test suite.
+
+
+.. code-block:: c++
+  
+  std::string S;
+  // Simple formatting of basic types and implicit string conversion.
+  S = formatv("{0} ({1:P})", 7, 0.35);  // S == "7 (35.00%)"
+  
+  // Out-of-order referencing and multi-referencing
+  outs() << formatv("{0} {2} {1} {0}", 1, "test", 3); // prints "1 3 test 1"
+  
+  // Left, right, and center alignment
+  S = formatv("{0,7}",  'a');  // S == "      a";
+  S = formatv("{0,-7}", 'a');  // S == "a      ";
+  S = formatv("{0,=7}", 'a');  // S == "   a   ";
+  S = formatv("{0,+7}", 'a');  // S == "      a";
+  
+  // Custom styles
+  S = formatv("{0:N} - {0:x} - {1:E}", 12345, 123908342); // S == "12,345 - 0x3039 - 1.24E8"
+  
+  // Adapters
+  S = formatv("{0}", fmt_align(42, AlignStyle::Center, 7));  // S == "  42   "
+  S = formatv("{0}", fmt_repeat("hi", 3)); // S == "hihihi"
+  S = formatv("{0}", fmt_pad("hi", 2, 6)); // S == "  hi      "
+  
+  // Ranges
+  std::vector<int> V = {8, 9, 10};
+  S = formatv("{0}", make_range(V.begin(), V.end())); // S == "8, 9, 10"
+  S = formatv("{0:$[+]}", make_range(V.begin(), V.end())); // S == "8+9+10"
+  S = formatv("{0:$[ + ]@[x]}", make_range(V.begin(), V.end())); // S == "0x8 + 0x9 + 0xA"
+
+.. _error_apis:
+
+Error handling
+--------------
+
+Proper error handling helps us identify bugs in our code, and helps end-users
+understand errors in their tool usage. Errors fall into two broad categories:
+*programmatic* and *recoverable*, with different strategies for handling and
+reporting.
+
+Programmatic Errors
+^^^^^^^^^^^^^^^^^^^
+
+Programmatic errors are violations of program invariants or API contracts, and
+represent bugs within the program itself. Our aim is to document invariants, and
+to abort quickly at the point of failure (providing some basic diagnostic) when
+invariants are broken at runtime.
+
+The fundamental tools for handling programmatic errors are assertions and the
+llvm_unreachable function. Assertions are used to express invariant conditions,
+and should include a message describing the invariant:
+
+.. code-block:: c++
+
+  assert(isPhysReg(R) && "All virt regs should have been allocated already.");
+
+The llvm_unreachable function can be used to document areas of control flow
+that should never be entered if the program invariants hold:
+
+.. code-block:: c++
+
+  enum { Foo, Bar, Baz } X = foo();
+
+  switch (X) {
+    case Foo: /* Handle Foo */; break;
+    case Bar: /* Handle Bar */; break;
+    default:
+      llvm_unreachable("X should be Foo or Bar here");
+  }
+
+Recoverable Errors
+^^^^^^^^^^^^^^^^^^
+
+Recoverable errors represent an error in the program's environment, for example
+a resource failure (a missing file, a dropped network connection, etc.), or
+malformed input. These errors should be detected and communicated to a level of
+the program where they can be handled appropriately. Handling the error may be
+as simple as reporting the issue to the user, or it may involve attempts at
+recovery.
+
+.. note::
+
+   While it would be ideal to use this error handling scheme throughout
+   LLVM, there are places where this hasn't been practical to apply. In
+   situations where you absolutely must emit a non-programmatic error and
+   the ``Error`` model isn't workable you can call ``report_fatal_error``,
+   which will call installed error handlers, print a message, and exit the
+   program.
+
+Recoverable errors are modeled using LLVM's ``Error`` scheme. This scheme
+represents errors using function return values, similar to classic C integer
+error codes, or C++'s ``std::error_code``. However, the ``Error`` class is
+actually a lightweight wrapper for user-defined error types, allowing arbitrary
+information to be attached to describe the error. This is similar to the way C++
+exceptions allow throwing of user-defined types.
+
+Success values are created by calling ``Error::success()``, e.g.:
+
+.. code-block:: c++
+
+  Error foo() {
+    // Do something.
+    // Return success.
+    return Error::success();
+  }
+
+Success values are very cheap to construct and return - they have minimal
+impact on program performance.
+
+Failure values are constructed using ``make_error<T>``, where ``T`` is any class
+that inherits from the ``ErrorInfo`` utility, e.g.:
+
+.. code-block:: c++
+
+  class BadFileFormat : public ErrorInfo<BadFileFormat> {
+  public:
+    static char ID;
+    std::string Path;
+
+    BadFileFormat(StringRef Path) : Path(Path.str()) {}
+
+    void log(raw_ostream &OS) const override {
+      OS << Path << " is malformed";
+    }
+
+    std::error_code convertToErrorCode() const override {
+      return make_error_code(object_error::parse_failed);
+    }
+  };
+
+  char BadFileFormat::ID; // This should be defined in the C++ file.
+
+  Error printFormattedFile(StringRef Path) {
+    if (<file is badly formatted>)
+      return make_error<BadFileFormat>(Path);
+    // print file contents.
+    return Error::success();
+  }
+
+Error values can be implicitly converted to bool: true for error, false for
+success, enabling the following idiom:
+
+.. code-block:: c++
+
+  Error mayFail();
+
+  Error foo() {
+    if (auto Err = mayFail())
+      return Err;
+    // Success! We can proceed.
+    ...
+
+For functions that can fail but need to return a value the ``Expected<T>``
+utility can be used. Values of this type can be constructed with either a
+``T``, or an ``Error``. ``Expected<T>`` values are also implicitly convertible to
+boolean, but with the opposite convention to ``Error``: true for success, false
+for error. If success, the ``T`` value can be accessed via the dereference
+operator. If failure, the ``Error`` value can be extracted using the
+``takeError()`` method. Idiomatic usage looks like:
+
+.. code-block:: c++
+
+  Expected<FormattedFile> openFormattedFile(StringRef Path) {
+    // If badly formatted, return an error.
+    if (auto Err = checkFormat(Path))
+      return std::move(Err);
+    // Otherwise return a FormattedFile instance.
+    return FormattedFile(Path);
+  }
+
+  Error processFormattedFile(StringRef Path) {
+    // Try to open a formatted file
+    if (auto FileOrErr = openFormattedFile(Path)) {
+      // On success, grab a reference to the file and continue.
+      auto &File = *FileOrErr;
+      ...
+    } else
+      // On error, extract the Error value and return it.
+      return FileOrErr.takeError();
+  }
+
+If an ``Expected<T>`` value is in success mode then the ``takeError()`` method
+will return a success value. Using this fact, the above function can be
+rewritten as:
+
+.. code-block:: c++
+
+  Error processFormattedFile(StringRef Path) {
+    // Try to open a formatted file
+    auto FileOrErr = openFormattedFile(Path);
+    if (auto Err = FileOrErr.takeError())
+      // On error, extract the Error value and return it.
+      return Err;
+    // On success, grab a reference to the file and continue.
+    auto &File = *FileOrErr;
+    ...
+  }
+
+This second form is often more readable for functions that involve multiple
+``Expected<T>`` values as it limits the indentation required.
+
+All ``Error`` instances, whether success or failure, must be either checked or
+moved from (via ``std::move`` or a return) before they are destructed.
+Accidentally discarding an unchecked error will cause a program abort at the
+point where the unchecked value's destructor is run, making it easy to identify
+and fix violations of this rule.
+
+Success values are considered checked once they have been tested (by invoking
+the boolean conversion operator):
+
+.. code-block:: c++
+
+  if (auto Err = mayFail(...))
+    return Err; // Failure value - move error to caller.
+
+  // Safe to continue: Err was checked.
+
+In contrast, the following code will always cause an abort, even if ``mayFail``
+returns a success value:
+
+.. code-block:: c++
+
+    mayFail();
+    // Program will always abort here, even if mayFail() returns Success, since
+    // the value is not checked.
+
+Failure values are considered checked once a handler for the error type has
+been activated:
+
+.. code-block:: c++
+
+  handleErrors(
+    processFormattedFile(...),
+    [](const BadFileFormat &BFF) {
+      report("Unable to process " + BFF.Path + ": bad format");
+    },
+    [](const FileNotFound &FNF) {
+      report("File not found " + FNF.Path);
+    });
+
+The ``handleErrors`` function takes an error as its first argument, followed by
+a variadic list of "handlers", each of which must be a callable type (a
+function, lambda, or class with a call operator) with one argument. The
+``handleErrors`` function will visit each handler in the sequence and check its
+argument type against the dynamic type of the error, running the first handler
+that matches. This is the same decision process that is used to decide which catch
+clause to run for a C++ exception.
+
+Since the list of handlers passed to ``handleErrors`` may not cover every error
+type that can occur, the ``handleErrors`` function also returns an Error value
+that must be checked or propagated. If the error value that is passed to
+``handleErrors`` does not match any of the handlers it will be returned from
+``handleErrors``. Idiomatic use of ``handleErrors`` thus looks like:
+
+.. code-block:: c++
+
+  if (auto Err =
+        handleErrors(
+          processFormattedFile(...),
+          [](const BadFileFormat &BFF) {
+            report("Unable to process " + BFF.Path + ": bad format");
+          },
+          [](const FileNotFound &FNF) {
+            report("File not found " + FNF.Path);
+          }))
+    return Err;
+
+In cases where you truly know that the handler list is exhaustive the
+``handleAllErrors`` function can be used instead. This is identical to
+``handleErrors`` except that it will terminate the program if an unhandled
+error is passed in, and can therefore return void. The ``handleAllErrors``
+function should generally be avoided: the introduction of a new error type
+elsewhere in the program can easily turn a formerly exhaustive list of errors
+into a non-exhaustive list, risking unexpected program termination. Where
+possible, use handleErrors and propagate unknown errors up the stack instead.
+
+For tool code, where errors can be handled by printing an error message then
+exiting with an error code, the :ref:`ExitOnError <err_exitonerr>` utility
+may be a better choice than handleErrors, as it simplifies control flow when
+calling fallible functions.
+
+In situations where it is known that a particular call to a fallible function
+will always succeed (for example, a call to a function that can only fail on a
+subset of inputs with an input that is known to be safe) the
+:ref:`cantFail <err_cantfail>` functions can be used to remove the error type,
+simplifying control flow.
+
+StringError
+"""""""""""
+
+Many kinds of errors have no recovery strategy; the only action that can be
+taken is to report them to the user so that the user can attempt to fix the
+environment. In this case representing the error as a string makes perfect
+sense. LLVM provides the ``StringError`` class for this purpose. It takes two
+arguments: a string error message, and an equivalent ``std::error_code`` for
+interoperability. It also provides a ``createStringError`` function to simplify
+common usage of this class:
+
+.. code-block:: c++
+
+  // These two lines of code are equivalent:
+  make_error<StringError>("Bad executable", errc::executable_format_error);
+  createStringError(errc::executable_format_error, "Bad executable");
+
+If you're certain that the error you're building will never need to be converted
+to a ``std::error_code`` you can use the ``inconvertibleErrorCode()`` function:
+
+.. code-block:: c++
+
+  createStringError(inconvertibleErrorCode(), "Bad executable");
+
+This should be done only after careful consideration. If any attempt is made to
+convert this error to a ``std::error_code`` it will trigger immediate program
+termination. Unless you are certain that your errors will not need
+interoperability you should look for an existing ``std::error_code`` that you
+can convert to, and even (as painful as it is) consider introducing a new one as
+a stopgap measure.
+
+``createStringError`` can take ``printf`` style format specifiers to provide a
+formatted message:
+
+.. code-block:: c++
+
+  createStringError(errc::executable_format_error,
+                    "Bad executable: %s", FileName);
+
+Interoperability with std::error_code and ErrorOr
+"""""""""""""""""""""""""""""""""""""""""""""""""
+
+Many existing LLVM APIs use ``std::error_code`` and its partner ``ErrorOr<T>``
+(which plays the same role as ``Expected<T>``, but wraps a ``std::error_code``
+rather than an ``Error``). The infectious nature of error types means that an
+attempt to change one of these functions to return ``Error`` or ``Expected<T>``
+instead often results in an avalanche of changes to callers, callers of callers,
+and so on. (The first such attempt, returning an ``Error`` from
+MachOObjectFile's constructor, was abandoned after the diff reached 3000 lines,
+impacted half a dozen libraries, and was still growing).
+
+To solve this problem, the ``Error``/``std::error_code`` interoperability requirement was
+introduced. Two pairs of functions allow any ``Error`` value to be converted to a
+``std::error_code``, any ``Expected<T>`` to be converted to an ``ErrorOr<T>``, and vice
+versa:
+
+.. code-block:: c++
+
+  std::error_code errorToErrorCode(Error Err);
+  Error errorCodeToError(std::error_code EC);
+
+  template <typename T> ErrorOr<T> expectedToErrorOr(Expected<T> TOrErr);
+  template <typename T> Expected<T> errorOrToExpected(ErrorOr<T> TOrEC);
+
+
+Using these APIs it is easy to make surgical patches that update individual
+functions from ``std::error_code`` to ``Error``, and from ``ErrorOr<T>`` to
+``Expected<T>``.
+
+Returning Errors from error handlers
+""""""""""""""""""""""""""""""""""""
+
+Error recovery attempts may themselves fail. For that reason, ``handleErrors``
+actually recognises three different forms of handler signature:
+
+.. code-block:: c++
+
+  // Error must be handled, no new errors produced:
+  void(UserDefinedError &E);
+
+  // Error must be handled, new errors can be produced:
+  Error(UserDefinedError &E);
+
+  // Original error can be inspected, then re-wrapped and returned (or a new
+  // error can be produced):
+  Error(std::unique_ptr<UserDefinedError> E);
+
+Any error returned from a handler will be returned from the ``handleErrors``
+function so that it can be handled itself, or propagated up the stack.
+
+.. _err_exitonerr:
+
+Using ExitOnError to simplify tool code
+"""""""""""""""""""""""""""""""""""""""
+
+Library code should never call ``exit`` for a recoverable error; however, in
+tool code (especially command line tools) this can be a reasonable approach.
+Calling ``exit`` upon encountering an error dramatically simplifies control flow
+as the error no longer needs to be propagated up the stack. This allows code to
+be written in straight-line style, as long as each fallible call is wrapped in a
+check and call to exit. The ``ExitOnError`` class supports this pattern by
+providing call operators that inspect ``Error`` values, stripping the error away
+in the success case and logging to ``stderr`` then exiting in the failure case.
+
+To use this class, declare a global ``ExitOnError`` variable in your program:
+
+.. code-block:: c++
+
+  ExitOnError ExitOnErr;
+
+Calls to fallible functions can then be wrapped with a call to ``ExitOnErr``,
+turning them into non-failing calls:
+
+.. code-block:: c++
+
+  Error mayFail();
+  Expected<int> mayFail2();
+
+  void foo() {
+    ExitOnErr(mayFail());
+    int X = ExitOnErr(mayFail2());
+  }
+
+On failure, the error's log message will be written to ``stderr``, optionally
+preceded by a string "banner" that can be set by calling the ``setBanner``
+method. A mapping can also be supplied from ``Error`` values to exit codes using
+the ``setExitCodeMapper`` method:
+
+.. code-block:: c++
+
+  int main(int argc, char *argv[]) {
+    ExitOnErr.setBanner(std::string(argv[0]) + " error:");
+    ExitOnErr.setExitCodeMapper(
+      [](const Error &Err) {
+        if (Err.isA<BadFileFormat>())
+          return 2;
+        return 1;
+      });
+    ...
+  }
+
+Use ``ExitOnError`` in your tool code where possible as it can greatly improve
+readability.
+
+.. _err_cantfail:
+
+Using cantFail to simplify safe callsites
+"""""""""""""""""""""""""""""""""""""""""
+
+Some functions may only fail for a subset of their inputs, so calls using known
+safe inputs can be assumed to succeed.
+
+The cantFail functions encapsulate this by wrapping an assertion that their
+argument is a success value and, in the case of Expected<T>, unwrapping the
+T value:
+
+.. code-block:: c++
+
+  Error onlyFailsForSomeXValues(int X);
+  Expected<int> onlyFailsForSomeXValues2(int X);
+
+  void foo() {
+    cantFail(onlyFailsForSomeXValues(KnownSafeValue));
+    int Y = cantFail(onlyFailsForSomeXValues2(KnownSafeValue));
+    ...
+  }
+
+Like the ExitOnError utility, cantFail simplifies control flow. Their treatment
+of error cases is very different, however: where ExitOnError is guaranteed to
+terminate the program on an error input, cantFail simply asserts that the result
+is success. In debug builds this will result in an assertion failure if an error
+is encountered. In release builds the behavior of cantFail for failure values is
+undefined. As such, care must be taken in the use of cantFail: clients must be
+certain that a cantFail wrapped call really cannot fail with the given
+arguments.
+
+Use of the cantFail functions should be rare in library code, but they are
+likely to be of more use in tool and unit-test code where inputs and/or
+mocked-up classes or functions may be known to be safe.
+
+Fallible constructors
+"""""""""""""""""""""
+
+Some classes require resource acquisition or other complex initialization that
+can fail during construction. Unfortunately constructors can't return errors,
+and having clients test objects after they're constructed to ensure that they're
+valid is error prone as it's all too easy to forget the test. To work around
+this, use the named constructor idiom and return an ``Expected<T>``:
+
+.. code-block:: c++
+
+  class Foo {
+  public:
+
+    static Expected<Foo> Create(Resource R1, Resource R2) {
+      Error Err = Error::success();
+      Foo F(R1, R2, Err);
+      if (Err)
+        return std::move(Err);
+      return std::move(F);
+    }
+
+  private:
+
+    Foo(Resource R1, Resource R2, Error &Err) {
+      ErrorAsOutParameter EAO(&Err);
+      if (auto Err2 = R1.acquire()) {
+        Err = std::move(Err2);
+        return;
+      }
+      Err = R2.acquire();
+    }
+  };
+
+
+Here, the named constructor passes an ``Error`` by reference into the actual
+constructor, which the constructor can then use to return errors. The
+``ErrorAsOutParameter`` utility sets the ``Error`` value's checked flag on entry
+to the constructor so that the error can be assigned to, then resets it on exit
+to force the client (the named constructor) to check the error.
+
+By using this idiom, clients attempting to construct a Foo receive either a
+well-formed Foo or an Error, never an object in an invalid state.
+
+Propagating and consuming errors based on types
+"""""""""""""""""""""""""""""""""""""""""""""""
+
+In some contexts, certain types of error are known to be benign. For example,
+when walking an archive, some clients may be happy to skip over badly formatted
+object files rather than terminating the walk immediately. Skipping badly
+formatted objects could be achieved using an elaborate handler method, but the
+Error.h header provides two utilities that make this idiom much cleaner: the
+type inspection method, ``isA``, and the ``consumeError`` function:
+
+.. code-block:: c++
+
+  Error walkArchive(Archive A) {
+    for (unsigned I = 0; I != A.numMembers(); ++I) {
+      auto ChildOrErr = A.getMember(I);
+      if (auto Err = ChildOrErr.takeError()) {
+        if (!Err.isA<BadFileFormat>())
+          return Err;
+        // Skip badly formatted members: there is no valid Child to use.
+        consumeError(std::move(Err));
+        continue;
+      }
+      auto &Child = *ChildOrErr;
+      // Use Child
+      ...
+    }
+    return Error::success();
+  }
+
+Concatenating Errors with joinErrors
+""""""""""""""""""""""""""""""""""""
+
+In the archive walking example above ``BadFileFormat`` errors are simply
+consumed and ignored. If the client had wanted to report these errors after
+completing the walk over the archive they could use the ``joinErrors`` utility:
+
+.. code-block:: c++
+
+  Error walkArchive(Archive A) {
+    Error DeferredErrs = Error::success();
+    for (unsigned I = 0; I != A.numMembers(); ++I) {
+      auto ChildOrErr = A.getMember(I);
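+      if (auto Err = ChildOrErr.takeError()) {
+        if (!Err.isA<BadFileFormat>())
+          // Other errors end the walk; keep the deferred errors attached.
+          return joinErrors(std::move(DeferredErrs), std::move(Err));
+        // Defer badly formatted members and move on to the next one.
+        DeferredErrs = joinErrors(std::move(DeferredErrs), std::move(Err));
+        continue;
+      }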
+      auto &Child = *ChildOrErr;
+      // Use Child
+      ...
+    }
+    return DeferredErrs;
+  }
+
+The ``joinErrors`` routine builds a special error type called ``ErrorList``,
+which holds a list of user defined errors. The ``handleErrors`` routine
+recognizes this type and will attempt to handle each of the contained errors in
+order. If all contained errors can be handled, ``handleErrors`` will return
+``Error::success()``, otherwise ``handleErrors`` will concatenate the remaining
+errors and return the resulting ``ErrorList``.
+
+Building fallible iterators and iterator ranges
+"""""""""""""""""""""""""""""""""""""""""""""""
+
+The archive walking examples above retrieve archive members by index; however,
+this requires considerable boilerplate for iteration and error checking. We can
+clean this up by using the "fallible iterator" pattern, which supports the
+following natural iteration idiom for fallible containers like Archive:
+
+.. code-block:: c++
+
+  Error Err = Error::success();
+  for (auto &Child : Ar->children(Err)) {
+    // Use Child - only enter the loop when it's valid
+
+    // Allow early exit from the loop body, since we know that Err is success
+    // when we're inside the loop.
+    if (BailOutOn(Child))
+      return Error::success();
+
+    ...
+  }
+  // Check Err after the loop to ensure it didn't break due to an error.
+  if (Err)
+    return Err;
+
+To enable this idiom, iterators over fallible containers are written in a
+natural style, with their ``++`` and ``--`` operators replaced with fallible
+``Error inc()`` and ``Error dec()`` functions. E.g.:
+
+.. code-block:: c++
+
+  class FallibleChildIterator {
+  public:
+    FallibleChildIterator(Archive &A, unsigned ChildIdx);
+    Archive::Child &operator*();
+    friend bool operator==(const FallibleChildIterator &LHS,
+                           const FallibleChildIterator &RHS);
+
+    // operator++/operator-- replaced with fallible increment / decrement:
+    Error inc() {
+      if (!A.childValid(ChildIdx + 1))
+        return make_error<BadArchiveMember>(...);
+      ++ChildIdx;
+      return Error::success();
+    }
+
+    Error dec() { ... }
+  };
+
+Instances of this kind of fallible iterator interface are then wrapped with the
+fallible_iterator utility which provides ``operator++`` and ``operator--``,
+returning any errors via a reference passed in to the wrapper at construction
+time. The fallible_iterator wrapper takes care of (a) jumping to the end of the
+range on error, and (b) marking the error as checked whenever an iterator is
+compared to ``end`` and found to be unequal (in particular: this marks the
+error as checked throughout the body of a range-based for loop), enabling early
+exit from the loop without redundant error checking.
+
+Instances of the fallible iterator interface (e.g. FallibleChildIterator above)
+are wrapped using the ``make_fallible_itr`` and ``make_fallible_end``
+functions. E.g.:
+
+.. code-block:: c++
+
+  class Archive {
+  public:
+    using child_iterator = fallible_iterator<FallibleChildIterator>;
+
+    child_iterator child_begin(Error &Err) {
+      return make_fallible_itr(FallibleChildIterator(*this, 0), Err);
+    }
+
+    child_iterator child_end() {
+      return make_fallible_end(FallibleChildIterator(*this, size()));
+    }
+
+    iterator_range<child_iterator> children(Error &Err) {
+      return make_range(child_begin(Err), child_end());
+    }
+  };
+
+Using the fallible_iterator utility allows for both natural construction of
+fallible iterators (using failing ``inc`` and ``dec`` operations) and
+relatively natural use of C++ iterator/loop idioms.
+
+More information on Error and its related utilities can be found in the
+Error.h header file.
+
+.. _function_apis:
+
+Passing functions and other callable objects
+--------------------------------------------
+
+Sometimes you may want a function to be passed a callback object. In order to
+support lambda expressions and other function objects, you should not use the
+traditional C approach of taking a function pointer and an opaque cookie:
+
+.. code-block:: c++
+
+    void takeCallback(bool (*Callback)(Function *, void *), void *Cookie);
+
+Instead, use one of the following approaches:
+
+Function template
+^^^^^^^^^^^^^^^^^
+
+If you don't mind putting the definition of your function into a header file,
+make it a function template that is templated on the callable type.
+
+.. code-block:: c++
+
+    template<typename Callable>
+    void takeCallback(Callable Callback) {
+      Callback(1, 2, 3);
+    }
+
+The ``function_ref`` class template
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``function_ref``
+(`doxygen <http://llvm.org/doxygen/classllvm_1_1function__ref_3_01Ret_07Params_8_8_8_08_4.html>`__) class
+template represents a reference to a callable object, templated over the type
+of the callable. This is a good choice for passing a callback to a function,
+if you don't need to hold onto the callback after the function returns. In this
+way, ``function_ref`` is to ``std::function`` as ``StringRef`` is to
+``std::string``.
+
+``function_ref<Ret(Param1, Param2, ...)>`` can be implicitly constructed from
+any callable object that can be called with arguments of type ``Param1``,
+``Param2``, ..., and returns a value that can be converted to type ``Ret``.
+For example:
+
+.. code-block:: c++
+
+    void visitBasicBlocks(Function *F, function_ref<bool (BasicBlock*)> Callback) {
+      for (BasicBlock &BB : *F)
+        if (Callback(&BB))
+          return;
+    }
+
+can be called using:
+
+.. code-block:: c++
+
+    visitBasicBlocks(F, [&](BasicBlock *BB) {
+      if (process(BB))
+        return isEmpty(BB);
+      return false;
+    });
+
+Note that a ``function_ref`` object contains pointers to external memory, so it
+is not generally safe to store an instance of the class (unless you know that
+the external storage will not be freed). If you need this ability, consider
+using ``std::function``. ``function_ref`` is small enough that it should always
+be passed by value.
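+
+The reason a ``function_ref`` must not outlive its callable becomes clearer
+with a stripped-down model.  The following sketch is an illustration under
+stated assumptions, not LLVM's implementation: ``FunctionRefSketch`` and
+``countIf`` are hypothetical names, and the class keeps only the address of
+the callable plus a small trampoline function that knows its real type.
+
+.. code-block:: c++
+
+  #include <cstdint>
+  #include <type_traits>
+  #include <utility>
+  #include <vector>
+
+  // Hypothetical, simplified stand-in for llvm::function_ref.
+  template <typename Fn> class FunctionRefSketch;
+
+  template <typename Ret, typename... Params>
+  class FunctionRefSketch<Ret(Params...)> {
+    // Trampoline that restores the erased type before calling.
+    Ret (*Callback)(std::intptr_t, Params...);
+    // Address of the caller's object; nothing is copied or owned.
+    std::intptr_t Callable;
+
+    template <typename CallableT>
+    static Ret callbackFn(std::intptr_t C, Params... Ps) {
+      return (*reinterpret_cast<CallableT *>(C))(std::forward<Params>(Ps)...);
+    }
+
+  public:
+    // The enable_if keeps this template from hijacking copy construction.
+    template <typename CallableT,
+              typename = typename std::enable_if<!std::is_same<
+                  typename std::decay<CallableT>::type,
+                  FunctionRefSketch>::value>::type>
+    FunctionRefSketch(CallableT &&C)
+        : Callback(
+              callbackFn<typename std::remove_reference<CallableT>::type>),
+          Callable(reinterpret_cast<std::intptr_t>(&C)) {}
+
+    Ret operator()(Params... Ps) const {
+      return Callback(Callable, std::forward<Params>(Ps)...);
+    }
+  };
+
+  // The callback is only used during the call, so a non-owning reference is
+  // enough and the wrapper is cheap to pass by value.
+  int countIf(const std::vector<int> &Values,
+              FunctionRefSketch<bool(int)> Pred) {
+    int Count = 0;
+    for (int V : Values)
+      if (Pred(V))
+        ++Count;
+    return Count;
+  }
+
+A capturing lambda can be passed directly, for example
+``countIf(Values, [&](int V) { return V > Threshold; })``.  The lambda lives
+until the call returns, which is exactly the lifetime the wrapper needs;
+storing the wrapper beyond that point would leave it pointing at a destroyed
+object.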
+
+.. _DEBUG:
+
+The ``LLVM_DEBUG()`` macro and ``-debug`` option
+------------------------------------------------
+
+Often when working on your pass you will put a bunch of debugging printouts and
+other code into your pass.  After you get it working, you want to remove it, but
+you may need it again in the future (to work out new bugs that you run across).
+
+Naturally, because of this, you don't want to delete the debug printouts, but
+you don't want them to always be noisy.  A standard compromise is to comment
+them out, allowing you to enable them if you need them in the future.
+
+The ``llvm/Support/Debug.h`` (`doxygen
+<http://llvm.org/doxygen/Debug_8h_source.html>`__) file provides a macro named
+``LLVM_DEBUG()`` that is a much nicer solution to this problem.  Basically, you can
+put arbitrary code into the argument of the ``LLVM_DEBUG`` macro, and it is only
+executed if '``opt``' (or any other tool) is run with the '``-debug``' command
+line argument:
+
+.. code-block:: c++
+
+  LLVM_DEBUG(dbgs() << "I am here!\n");
+
+Then you can run your pass like this:
+
+.. code-block:: none
+
+  $ opt < a.bc > /dev/null -mypass
+  <no output>
+  $ opt < a.bc > /dev/null -mypass -debug
+  I am here!
+
+Using the ``LLVM_DEBUG()`` macro instead of a home-brewed solution allows you to not
+have to create "yet another" command line option for the debug output for your
+pass.  Note that ``LLVM_DEBUG()`` macros are disabled for non-asserts builds, so they
+do not cause a performance impact at all (for the same reason, they should also
+not contain side-effects!).
+
+One additional nice thing about the ``LLVM_DEBUG()`` macro is that you can enable or
+disable it directly in gdb.  Just use "``set DebugFlag=0``" or "``set
+DebugFlag=1``" from gdb if the program is running.  If the program hasn't
+been started yet, you can always just run it with ``-debug``.
+
+.. _DEBUG_TYPE:
+
+Fine grained debug info with ``DEBUG_TYPE`` and the ``-debug-only`` option
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Sometimes you may find yourself in a situation where enabling ``-debug`` just
+turns on **too much** information (such as when working on the code generator).
+If you want to enable debug information with more fine-grained control, you
+should define the ``DEBUG_TYPE`` macro and use the ``-debug-only`` option as
+follows:
+
+.. code-block:: c++
+
+  #define DEBUG_TYPE "foo"
+  LLVM_DEBUG(dbgs() << "'foo' debug type\n");
+  #undef  DEBUG_TYPE
+  #define DEBUG_TYPE "bar"
+  LLVM_DEBUG(dbgs() << "'bar' debug type\n");
+  #undef  DEBUG_TYPE
+
+Then you can run your pass like this:
+
+.. code-block:: none
+
+  $ opt < a.bc > /dev/null -mypass
+  <no output>
+  $ opt < a.bc > /dev/null -mypass -debug
+  'foo' debug type
+  'bar' debug type
+  $ opt < a.bc > /dev/null -mypass -debug-only=foo
+  'foo' debug type
+  $ opt < a.bc > /dev/null -mypass -debug-only=bar
+  'bar' debug type
+  $ opt < a.bc > /dev/null -mypass -debug-only=foo,bar
+  'foo' debug type
+  'bar' debug type
+
+Of course, in practice, you should only set ``DEBUG_TYPE`` at the top of a file,
+to specify the debug type for the entire module. Be careful that you only do
+this after including Debug.h and not around any #include of headers. Also, you
+should use names more meaningful than "foo" and "bar", because there is no
+system in place to ensure that names do not conflict. If two different modules
+use the same string, they will both be turned on when the name is specified.
+This allows, for example, all debug information for instruction scheduling to be
+enabled with ``-debug-only=InstrSched``, even if the source lives in multiple
+files. The name must not include a comma (,) as that is used to separate the
+arguments of the ``-debug-only`` option.
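+
+The comma restriction follows from how a list such as ``foo,bar`` has to be
+interpreted.  The sketch below is a conceptual model only, not LLVM's
+implementation: ``parseDebugOnly`` and ``isEnabled`` are hypothetical helpers
+that split the option value on commas and then compare each debug type
+against the resulting names.
+
+.. code-block:: c++
+
+  #include <set>
+  #include <sstream>
+  #include <string>
+
+  // Hypothetical model: split the -debug-only value into individual names.
+  std::set<std::string> parseDebugOnly(const std::string &Option) {
+    std::set<std::string> Types;
+    std::istringstream In(Option);
+    std::string Name;
+    while (std::getline(In, Name, ','))
+      if (!Name.empty())
+        Types.insert(Name);
+    return Types;
+  }
+
+  // A debug statement is enabled when its DEBUG_TYPE is one of the names.
+  bool isEnabled(const std::set<std::string> &Types,
+                 const std::string &DebugType) {
+    return Types.count(DebugType) != 0;
+  }
+
+In this model a debug type that itself contained a comma could never match,
+because the option value is always broken apart at commas first.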
+
+For performance reasons, ``-debug-only`` is not available in optimized builds
+(``--enable-optimized``) of LLVM.
+
+The ``DEBUG_WITH_TYPE`` macro is also available for situations where you would
+like to set ``DEBUG_TYPE``, but only for one specific ``LLVM_DEBUG`` statement.
+It takes an additional first parameter, which is the type to use.  For example,
+the preceding example could be written as:
+
+.. code-block:: c++
+
+  DEBUG_WITH_TYPE("foo", dbgs() << "'foo' debug type\n");
+  DEBUG_WITH_TYPE("bar", dbgs() << "'bar' debug type\n");
+
+.. _Statistic:
+
+The ``Statistic`` class & ``-stats`` option
+-------------------------------------------
+
+The ``llvm/ADT/Statistic.h`` (`doxygen
+<http://llvm.org/doxygen/Statistic_8h_source.html>`__) file provides a class
+named ``Statistic`` that is used as a unified way to keep track of what the LLVM
+compiler is doing and how effective various optimizations are.  It is useful to
+see what optimizations are contributing to making a particular program run
+faster.
+
+Often you may run your pass on some big program, and you're interested to see
+how many times it makes a certain transformation.  Although you can do this with
+hand inspection, or some ad-hoc method, this is a real pain and not very useful
+for big programs.  Using the ``Statistic`` class makes it very easy to keep
+track of this information, and the calculated information is presented in a
+uniform manner with the rest of the passes being executed.
+
+There are many examples of ``Statistic`` uses, but the basics of using it are as
+follows:
+
+Define your statistic like this:
+
+.. code-block:: c++
+
+  #define DEBUG_TYPE "mypassname"   // This goes after any #includes.
+  STATISTIC(NumXForms, "The # of times I did stuff");
+
+The ``STATISTIC`` macro defines a static variable, whose name is specified by
+the first argument.  The pass name is taken from the ``DEBUG_TYPE`` macro, and
+the description is taken from the second argument.  The variable defined
+("NumXForms" in this case) acts like an unsigned integer.
+
+Whenever you make a transformation, bump the counter:
+
+.. code-block:: c++
+
+  ++NumXForms;   // I did stuff!
+
+That's all you have to do.  To get '``opt``' to print out the statistics
+gathered, use the '``-stats``' option:
+
+.. code-block:: none
+
+  $ opt -stats -mypassname < program.bc > /dev/null
+  ... statistics output ...
+
+Note that in order to use the '``-stats``' option, LLVM must be
+compiled with assertions enabled.
+
+When running ``opt`` on a C file from the SPEC benchmark suite, it gives a
+report that looks like this:
+
+.. code-block:: none
+
+   7646 bitcodewriter   - Number of normal instructions
+    725 bitcodewriter   - Number of oversized instructions
+ 129996 bitcodewriter   - Number of bitcode bytes written
+   2817 raise           - Number of insts DCEd or constprop'd
+   3213 raise           - Number of cast-of-self removed
+   5046 raise           - Number of expression trees converted
+     75 raise           - Number of other getelementptr's formed
+    138 raise           - Number of load/store peepholes
+     42 deadtypeelim    - Number of unused typenames removed from symtab
+    392 funcresolve     - Number of varargs functions resolved
+     27 globaldce       - Number of global variables removed
+      2 adce            - Number of basic blocks removed
+    134 cee             - Number of branches revectored
+     49 cee             - Number of setcc instruction eliminated
+    532 gcse            - Number of loads removed
+   2919 gcse            - Number of instructions removed
+     86 indvars         - Number of canonical indvars added
+     87 indvars         - Number of aux indvars removed
+     25 instcombine     - Number of dead inst eliminate
+    434 instcombine     - Number of insts combined
+    248 licm            - Number of load insts hoisted
+   1298 licm            - Number of insts hoisted to a loop pre-header
+      3 licm            - Number of insts hoisted to multiple loop preds (bad, no loop pre-header)
+     75 mem2reg         - Number of alloca's promoted
+   1444 cfgsimplify     - Number of blocks simplified
+
+Obviously, with so many optimizations, having a unified framework for this stuff
+is very nice.  Making your pass fit well into the framework makes it more
+maintainable and useful.
+
+.. _DebugCounters:
+
+Adding debug counters to aid in debugging your code
+---------------------------------------------------
+
+Sometimes, when writing new passes, or trying to track down bugs, it
+is useful to be able to control whether certain things in your pass
+happen or not.  For example, there are times the minimization tooling
+can only easily give you large testcases.  You would like to narrow
+your bug down to a specific transformation happening or not happening,
+automatically, using bisection.  This is where debug counters help.
+They provide a framework for making parts of your code only execute a
+certain number of times.
+
+The ``llvm/Support/DebugCounter.h`` (`doxygen
+<http://llvm.org/doxygen/DebugCounter_8h_source.html>`__) file
+provides a class named ``DebugCounter`` that can be used to create
+command line counter options that control execution of parts of your code.
+
+Define your DebugCounter like this:
+
+.. code-block:: c++
+
+  DEBUG_COUNTER(DeleteAnInstruction, "passname-delete-instruction",
+                "Controls which instructions get deleted");
+
+The ``DEBUG_COUNTER`` macro defines a static variable, whose name
+is specified by the first argument.  The name of the counter
+(which is used on the command line) is specified by the second
+argument, and the description used in the help is specified by the
+third argument.
+
+Guard whatever code you want to control with a call to
+``DebugCounter::shouldExecute``.
+
+.. code-block:: c++
+
+  if (DebugCounter::shouldExecute(DeleteAnInstruction))
+    I->eraseFromParent();
+
+That's all you have to do.  Now, using opt, you can control when this code triggers using
+the '``--debug-counter``' option.  There are two counters provided, ``skip`` and ``count``.
+``skip`` is the number of times to skip execution of the codepath.  ``count`` is the number
+of times, once we are done skipping, to execute the codepath.
+
+.. code-block:: none
+
+  $ opt --debug-counter=passname-delete-instruction-skip=1,passname-delete-instruction-count=2 -passname
+
+This will skip the above code the first time we hit it, then execute it twice, then skip the rest of the executions.
+
+So if executed on the following code:
+
+.. code-block:: llvm
+
+  %1 = add i32 %a, %b
+  %2 = add i32 %a, %b
+  %3 = add i32 %a, %b
+  %4 = add i32 %a, %b
+
+It would delete ``%2`` and ``%3``.
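+
+The skip/count behavior described above can be modeled in a few lines of plain
+C++.  This is only a sketch of the semantics as stated in this section, not the
+``DebugCounter`` implementation; ``ToyCounter`` and ``allowedExecutions`` are
+hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a single debug counter.
struct ToyCounter {
  long Skip;  // executions to skip first
  long Count; // executions to allow once skipping is done
  long Seen;  // how many times the guarded code has been reached
  ToyCounter(long Skip, long Count) : Skip(Skip), Count(Count), Seen(0) {}

  // True only for the executions in the window [Skip, Skip + Count).
  bool shouldExecute() {
    long Index = Seen++;
    return Index >= Skip && Index < Skip + Count;
  }
};

// Returns the 1-based positions of the executions that were allowed to run.
std::vector<int> allowedExecutions(long Skip, long Count, int Total) {
  assert(Skip >= 0 && Count >= 0 && "this model uses non-negative values");
  ToyCounter Counter(Skip, Count);
  std::vector<int> Allowed;
  for (int I = 1; I <= Total; ++I)
    if (Counter.shouldExecute())
      Allowed.push_back(I);
  return Allowed;
}
```

+With ``Skip`` of 1 and ``Count`` of 2 over four executions, the model allows
+the second and third, matching the ``%2`` and ``%3`` example.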
+
+A utility is provided in ``utils/bisect-skip-count`` to binary search
+skip and count arguments. It can be used to automatically minimize the
+skip and count for a debug-counter variable.
+
+.. _ViewGraph:
+
+Viewing graphs while debugging code
+-----------------------------------
+
+Several of the important data structures in LLVM are graphs: for example CFGs
+made out of LLVM :ref:`BasicBlocks <BasicBlock>`, CFGs made out of LLVM
+:ref:`MachineBasicBlocks <MachineBasicBlock>`, and :ref:`Instruction Selection
+DAGs <SelectionDAG>`.  In many cases, while debugging various parts of the
+compiler, it is nice to instantly visualize these graphs.
+
+LLVM provides several callbacks that are available in a debug build to do
+exactly that.  If you call the ``Function::viewCFG()`` method, for example, the
+current LLVM tool will pop up a window containing the CFG for the function where
+each basic block is a node in the graph, and each node contains the instructions
+in the block.  Similarly, there also exists ``Function::viewCFGOnly()`` (does
+not include the instructions), the ``MachineFunction::viewCFG()`` and
+``MachineFunction::viewCFGOnly()``, and the ``SelectionDAG::viewGraph()``
+methods.  Within GDB, for example, you can usually use something like ``call
+DAG.viewGraph()`` to pop up a window.  Alternatively, you can sprinkle calls to
+these functions in your code in places you want to debug.
+
+Getting this to work requires a small amount of setup.  On Unix systems
+with X11, install the `graphviz <http://www.graphviz.org>`_ toolkit, and make
+sure 'dot' and 'gv' are in your path.  If you are running on macOS, download
+and install the macOS `Graphviz program
+<http://www.pixelglow.com/graphviz/>`_ and add
+``/Applications/Graphviz.app/Contents/MacOS/`` (or wherever you install it) to
+your path. The programs need not be present when configuring, building or
+running LLVM and can simply be installed when needed during an active debug
+session.
+
+``SelectionDAG`` has been extended to make it easier to locate *interesting*
+nodes in large complex graphs.  From gdb, if you ``call DAG.setGraphColor(node,
+"color")``, then the next ``call DAG.viewGraph()`` would highlight the node in
+the specified color (choices of colors can be found at `colors
+<http://www.graphviz.org/doc/info/colors.html>`_.) More complex node attributes
+can be provided with ``call DAG.setGraphAttrs(node, "attributes")`` (choices can
+be found at `Graph attributes <http://www.graphviz.org/doc/info/attrs.html>`_.)
+If you want to restart and clear all the current graph attributes, then you can
+``call DAG.clearGraphAttrs()``.
+
+Note that graph visualization features are compiled out of Release builds to
+reduce file size.  This means that you need a Debug+Asserts or Release+Asserts
+build to use these features.
+
+.. _datastructure:
+
+Picking the Right Data Structure for a Task
+===========================================
+
+LLVM has a plethora of data structures in the ``llvm/ADT/`` directory, and we
+commonly use STL data structures.  This section describes the trade-offs you
+should consider when you pick one.
+
+The first step is a choose your own adventure: do you want a sequential
+container, a set-like container, or a map-like container?  The most important
+thing when choosing a container is the algorithmic properties of how you plan to
+access the container.  Based on that, you should use:
+
+
+* a :ref:`map-like <ds_map>` container if you need efficient look-up of a
+  value based on another value.  Map-like containers also support efficient
+  queries for containment (whether a key is in the map).  Map-like containers
+  generally do not support efficient reverse mapping (values to keys).  If you
+  need that, use two maps.  Some map-like containers also support efficient
+  iteration through the keys in sorted order.  Map-like containers are the most
+  expensive sort; only use them if you need one of these capabilities.
+
+* a :ref:`set-like <ds_set>` container if you need to put a bunch of stuff into
+  a container that automatically eliminates duplicates.  Some set-like
+  containers support efficient iteration through the elements in sorted order.
+  Set-like containers are more expensive than sequential containers.
+
+* a :ref:`sequential <ds_sequential>` container if you need the most efficient
+  way to add elements and to keep track of the order in which they are added
+  to the collection.  Sequential containers permit duplicates and support
+  efficient iteration, but do not support efficient look-up based on a key.
+
+* a :ref:`string <ds_string>` container if you are working with character or
+  byte arrays.  String containers are specialized sequential containers or
+  reference structures.
+
+* a :ref:`bit <ds_bit>` container if you need an efficient way to store and
+  perform set operations on sets of numeric IDs, while automatically
+  eliminating duplicates.  Bit containers require a maximum of 1 bit for each
+  identifier you want to store.
+
+Once the proper category of container is determined, you can fine tune the
+memory use, constant factors, and cache behaviors of access by intelligently
+picking a member of the category.  Note that constant factors and cache behavior
+can be a big deal.  If you have a vector that usually only contains a few
+elements (but could contain many), for example, it's much better to use
+:ref:`SmallVector <dss_smallvector>` than :ref:`vector <dss_vector>`.  Doing so
+avoids (relatively) expensive malloc/free calls, which dwarf the cost of adding
+the elements to the container.
+
+.. _ds_sequential:
+
+Sequential Containers (std::vector, std::list, etc)
+---------------------------------------------------
+
+There are a variety of sequential containers available for you, based on your
+needs.  Pick the first in this section that will do what you want.
+
+.. _dss_arrayref:
+
+llvm/ADT/ArrayRef.h
+^^^^^^^^^^^^^^^^^^^
+
+The ``llvm::ArrayRef`` class is the preferred class to use in an interface that
+accepts a sequential list of elements in memory and just reads from them.  By
+taking an ``ArrayRef``, the API can be passed a fixed size array, an
+``std::vector``, an ``llvm::SmallVector`` and anything else that is contiguous
+in memory.
+
+.. _dss_fixedarrays:
+
+Fixed Size Arrays
+^^^^^^^^^^^^^^^^^
+
+Fixed size arrays are very simple and very fast.  They are good if you know
+exactly how many elements you have, or you have a (low) upper bound on how many
+you have.
+
+.. _dss_heaparrays:
+
+Heap Allocated Arrays
+^^^^^^^^^^^^^^^^^^^^^
+
+Heap allocated arrays (``new[]`` + ``delete[]``) are also simple.  They are good
+if the number of elements is variable, if you know how many elements you will
+need before the array is allocated, and if the array is usually large (if not,
+consider a :ref:`SmallVector <dss_smallvector>`).  The cost of a heap allocated
+array is the cost of the new/delete (aka malloc/free).  Also note that if you
+are allocating an array of a type with a constructor, the constructors and
+destructors will be run for every element in the array (re-sizable vectors only
+construct those elements actually used).
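+
+The point about constructors can be observed directly.  The following sketch
+counts default-constructor calls for ``new[]`` and for a ``std::vector`` that
+reserves the same capacity but only fills part of it.  ``Tracked`` and
+``countConstructions`` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Counts how many times its default constructor has run.
struct Tracked {
  static int Constructed;
  Tracked() { ++Constructed; }
};
int Tracked::Constructed = 0;

// Returns {constructions for new[], constructions for a reserved vector}
// when room for Capacity elements is requested but only Used are needed.
std::pair<int, int> countConstructions(std::size_t Capacity, std::size_t Used) {
  assert(Used <= Capacity && "cannot use more elements than were requested");

  Tracked::Constructed = 0;
  Tracked *Array = new Tracked[Capacity]; // constructs all Capacity elements
  int ForArray = Tracked::Constructed;
  delete[] Array;

  Tracked::Constructed = 0;
  std::vector<Tracked> Vec;
  Vec.reserve(Capacity);                  // allocates but constructs nothing
  for (std::size_t I = 0; I != Used; ++I)
    Vec.emplace_back();                   // constructs one element per call
  int ForVector = Tracked::Constructed;

  return std::make_pair(ForArray, ForVector);
}
```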
+
+.. _dss_tinyptrvector:
+
+llvm/ADT/TinyPtrVector.h
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+``TinyPtrVector<Type>`` is a highly specialized collection class that is
+optimized to avoid allocation in the case when a vector has zero or one
+elements.  It has two major restrictions: 1) it can only hold values of pointer
+type, and 2) it cannot hold a null pointer.
+
+Since this container is highly specialized, it is rarely used.
+
+.. _dss_smallvector:
+
+llvm/ADT/SmallVector.h
+^^^^^^^^^^^^^^^^^^^^^^
+
+``SmallVector<Type, N>`` is a simple class that looks and smells just like
+``vector<Type>``: it supports efficient iteration, lays out elements in memory
+order (so you can do pointer arithmetic between elements), supports efficient
+push_back/pop_back operations, supports efficient random access to its elements,
+etc.
+
+The main advantage of SmallVector is that it allocates space for some number of
+elements (N) **in the object itself**.  Because of this, if the SmallVector is
+dynamically smaller than N, no malloc is performed.  This can be a big win in
+cases where the malloc/free call is far more expensive than the code that
+fiddles around with the elements.
+
+This is good for vectors that are "usually small" (e.g. the number of
+predecessors/successors of a block is usually less than 8).  On the other hand,
+this makes the size of the SmallVector itself large, so you don't want to
+allocate lots of them (doing so will waste a lot of space).  As such,
+SmallVectors are most useful when on the stack.
+
+SmallVector also provides a nice portable and efficient replacement for
+``alloca``.
+
+SmallVector has grown a few other minor advantages over std::vector, causing
+``SmallVector<Type, 0>`` to be preferred over ``std::vector<Type>``.
+
+#. std::vector is exception-safe, and some implementations have pessimizations
+   that copy elements when SmallVector would move them.
+
+#. SmallVector understands ``llvm::is_trivially_copyable<Type>`` and uses realloc aggressively.
+
+#. Many LLVM APIs take a SmallVectorImpl as an out parameter (see the note
+   below).
+
+#. SmallVector with N equal to 0 is smaller than std::vector on 64-bit
+   platforms, since it uses ``unsigned`` (instead of ``void*``) for its size
+   and capacity.
+
+.. note::
+
+   Prefer to use ``SmallVectorImpl<T>`` as a parameter type.
+
+   In APIs that don't care about the "small size" (most?), prefer to use
+   the ``SmallVectorImpl<T>`` class, which is basically just the "vector
+   header" (and methods) without the elements allocated after it. Note that
+   ``SmallVector<T, N>`` inherits from ``SmallVectorImpl<T>`` so the
+   conversion is implicit and costs nothing. E.g.
+
+   .. code-block:: c++
+
+      // BAD: Clients cannot pass e.g. SmallVector<Foo, 4>.
+      void hardcodedSmallSize(SmallVector<Foo, 2> &Out);
+      // GOOD: Clients can pass any SmallVector<Foo, N>.
+      void allowsAnySmallSize(SmallVectorImpl<Foo> &Out);
+
+      void someFunc() {
+        SmallVector<Foo, 8> Vec;
+        hardcodedSmallSize(Vec); // Error.
+        allowsAnySmallSize(Vec); // Works.
+      }
+
+   Even though it has "``Impl``" in the name, this is so widely used that
+   it really isn't "private to the implementation" anymore. A name like
+   ``SmallVectorHeader`` would be more appropriate.
+
+.. _dss_vector:
+
+<vector>
+^^^^^^^^
+
+``std::vector<T>`` is well loved and respected.  However, ``SmallVector<T, 0>``
+is often a better option due to the advantages listed above.  std::vector is
+still useful when you need to store more than ``UINT32_MAX`` elements or when
+interfacing with code that expects vectors :).
+
+One worthwhile note about std::vector: avoid code like this:
+
+.. code-block:: c++
+
+  for ( ... ) {
+     std::vector<foo> V;
+     // make use of V.
+  }
+
+Instead, write this as:
+
+.. code-block:: c++
+
+  std::vector<foo> V;
+  for ( ... ) {
+     // make use of V.
+     V.clear();
+  }
+
+Doing so will save (at least) one heap allocation and free per iteration of the
+loop.
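+
+A compilable version of the hoisted form might look like the sketch below.
+The function name ``sumDoubledBatches`` and the doubling step are made up for
+the example; the relevant part is that ``clear()`` drops the elements while
+the vector keeps its buffer for the next iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the doubled entries of every batch, reusing one scratch vector so its
// heap buffer is recycled across iterations instead of reallocated.
long sumDoubledBatches(const std::vector<std::vector<int>> &Batches) {
  long Total = 0;
  std::vector<int> Scratch; // hoisted out of the loop
  for (const std::vector<int> &Batch : Batches) {
    for (int Value : Batch)
      Scratch.push_back(2 * Value);
    for (int Doubled : Scratch)
      Total += Doubled;
    std::size_t CapacityBefore = Scratch.capacity();
    Scratch.clear(); // drops the elements but keeps the allocation
    assert(Scratch.capacity() == CapacityBefore && "clear() kept the buffer");
    (void)CapacityBefore; // silence unused-variable warnings without asserts
  }
  return Total;
}
```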
+
+.. _dss_deque:
+
+<deque>
+^^^^^^^
+
+``std::deque`` is, in some senses, a generalized version of ``std::vector``.
+Like ``std::vector``, it provides constant time random access and other similar
+properties, but it also provides efficient access to the front of the list.  It
+does not guarantee contiguity of elements within memory.
+
+In exchange for this extra flexibility, ``std::deque`` has significantly higher
+constant factor costs than ``std::vector``.  If possible, use ``std::vector`` or
+something cheaper.
+
+.. _dss_list:
+
+<list>
+^^^^^^
+
+``std::list`` is an extremely inefficient class that is rarely useful.  It
+performs a heap allocation for every element inserted into it, thus having an
+extremely high constant factor, particularly for small data types.
+``std::list`` also only supports bidirectional iteration, not random access
+iteration.
+
+In exchange for this high cost, std::list supports efficient access to both ends
+of the list (like ``std::deque``, but unlike ``std::vector`` or
+``SmallVector``).  In addition, the iterator invalidation characteristics of
+std::list are stronger than those of a vector class: inserting an element into
+the list or removing one from it does not invalidate iterators or pointers to
+other elements in the list.
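+
+The iterator stability can be seen in a small standalone sketch, where an
+iterator to one element survives insertions and an erasure elsewhere in the
+list.  The function name ``valueAfterUnrelatedEdits`` is hypothetical.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Keeps an iterator to one element while the list is modified around it.
int valueAfterUnrelatedEdits() {
  std::list<int> Values = {10, 20, 30};
  std::list<int>::iterator Middle = std::next(Values.begin()); // points at 20

  Values.push_front(5);         // insert at the front
  Values.push_back(40);         // insert at the back
  Values.insert(Middle, 15);    // insert directly before the element
  Values.erase(Values.begin()); // erase a different element (the 5)

  assert(Values.size() == 5 && "list now holds 10, 15, 20, 30, 40");
  return *Middle; // still refers to the same element
}
```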
+
+.. _dss_ilist:
+
+llvm/ADT/ilist.h
+^^^^^^^^^^^^^^^^
+
+``ilist<T>`` implements an 'intrusive' doubly-linked list.  It is intrusive,
+because it requires the element to store and provide access to the prev/next
+pointers for the list.
+
+``ilist`` has the same drawbacks as ``std::list``, and additionally requires an
+``ilist_traits`` implementation for the element type, but it provides some novel
+characteristics.  In particular, it can efficiently store polymorphic objects,
+the traits class is informed when an element is inserted or removed from the
+list, and ``ilist``\ s are guaranteed to support a constant-time splice
+operation.
+
+These properties are exactly what we want for things like ``Instruction``\ s and
+basic blocks, which is why these are implemented with ``ilist``\ s.
+
+Related classes of interest are explained in the following subsections:
+
+* :ref:`ilist_traits <dss_ilist_traits>`
+
+* :ref:`iplist <dss_iplist>`
+
+* :ref:`llvm/ADT/ilist_node.h <dss_ilist_node>`
+
+* :ref:`Sentinels <dss_ilist_sentinel>`
+
+.. _dss_packedvector:
+
+llvm/ADT/PackedVector.h
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Useful for storing a vector of values using only a small number of bits for each
+value.  Apart from the standard operations of a vector-like container, it can
+also perform an 'or' set operation.
+
+For example:
+
+.. code-block:: c++
+
+  enum State {
+      None = 0x0,
+      FirstCondition = 0x1,
+      SecondCondition = 0x2,
+      Both = 0x3
+  };
+
+  State get() {
+      PackedVector<State, 2> Vec1;
+      Vec1.push_back(FirstCondition);
+
+      PackedVector<State, 2> Vec2;
+      Vec2.push_back(SecondCondition);
+
+      Vec1 |= Vec2;
+      return Vec1[0]; // returns 'Both'.
+  }
+
+.. _dss_ilist_traits:
+
+ilist_traits
+^^^^^^^^^^^^
+
+``ilist_traits<T>`` is ``ilist<T>``'s customization mechanism. ``iplist<T>``
+(and consequently ``ilist<T>``) publicly derive from this traits class.
+
+.. _dss_iplist:
+
+iplist
+^^^^^^
+
+``iplist<T>`` is ``ilist<T>``'s base and as such supports a slightly narrower
+interface.  Notably, inserters from ``T&`` are absent.
+
+``ilist_traits<T>`` is a public base of this class and can be used for a wide
+variety of customizations.
+
+.. _dss_ilist_node:
+
+llvm/ADT/ilist_node.h
+^^^^^^^^^^^^^^^^^^^^^
+
+``ilist_node<T>`` implements the forward and backward links that are expected
+by the ``ilist<T>`` (and analogous containers) in the default manner.
+
+``ilist_node<T>``\ s are meant to be embedded in the node type ``T``; usually
+``T`` publicly derives from ``ilist_node<T>``.
+
+.. _dss_ilist_sentinel:
+
+Sentinels
+^^^^^^^^^
+
+``ilist``\ s have another specialty that must be considered.  To be a good
+citizen in the C++ ecosystem, it needs to support the standard container
+operations, such as ``begin`` and ``end`` iterators, etc.  Also, the
+``operator--`` must work correctly on the ``end`` iterator in the case of
+non-empty ``ilist``\ s.
+
+The only sensible solution to this problem is to allocate a so-called *sentinel*
+along with the intrusive list, which serves as the ``end`` iterator, providing
+the back-link to the last element.  However, in keeping with the C++
+convention, it is illegal to apply ``operator++`` beyond the sentinel, and the
+sentinel must not be dereferenced.
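+
+To make the role of the sentinel concrete, here is a deliberately simplified
+intrusive list in plain C++.  It is not ``ilist`` and does not use
+``ilist_traits``; the types ``ToyLink``, ``ToyItem`` and ``ToyIntrusiveList``
+are hypothetical, and the sentinel is assumed to be a bare link stored inside
+the list object.

```cpp
#include <cassert>

// Prev/next links that every element embeds (not ilist_node).
struct ToyLink {
  ToyLink *Prev;
  ToyLink *Next;
  ToyLink() : Prev(this), Next(this) {}
};

// An element type that stores its own links by deriving from ToyLink.
struct ToyItem : ToyLink {
  int Value;
  explicit ToyItem(int Value) : Value(Value) {}
};

// Circular list whose sentinel link plays the role of the end() position.
class ToyIntrusiveList {
  ToyLink Sentinel; // never dereferenced as a ToyItem

public:
  ToyLink *end() { return &Sentinel; }
  ToyLink *begin() { return Sentinel.Next; }
  bool empty() const { return Sentinel.Next == &Sentinel; }

  // Links Item in just before end(); the list does not own or allocate it.
  void pushBack(ToyItem &Item) {
    Item.Prev = Sentinel.Prev;
    Item.Next = &Sentinel;
    Sentinel.Prev->Next = &Item;
    Sentinel.Prev = &Item;
  }

  // Stepping back from end() follows the sentinel's back-link.
  ToyItem &back() {
    assert(!empty() && "the sentinel itself must not be dereferenced");
    return *static_cast<ToyItem *>(end()->Prev);
  }
};

// Builds a three-element list and reads the last value through end().
int lastValueViaSentinel() {
  ToyItem A(1), B(2), C(3);
  ToyIntrusiveList List;
  List.pushBack(A);
  List.pushBack(B);
  List.pushBack(C);
  return List.back().Value;
}
```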
+
+These constraints give the ``ilist`` some implementation freedom in how it
+allocates and stores the sentinel.  The corresponding policy is dictated by
+``ilist_traits<T>``.  By default a ``T`` gets heap-allocated whenever the need
+for a sentinel arises.
+
+While the default policy is sufficient in most cases, it may break down when
+``T`` does not provide a default constructor.  Also, in the case of many
+instances of ``ilist``\ s, the memory overhead of the associated sentinels is
+wasted.  To alleviate the situation with numerous and voluminous
+``T``-sentinels, sometimes a trick is employed, leading to *ghostly sentinels*.
+
+Ghostly sentinels are obtained by specially-crafted ``ilist_traits<T>`` which
+superpose the sentinel with the ``ilist`` instance in memory.  Pointer
+arithmetic is used to obtain the sentinel, which is relative to the ``ilist``'s
+``this`` pointer.  The ``ilist`` is augmented by an extra pointer, which serves
+as the back-link of the sentinel.  This is the only field in the ghostly
+sentinel which can be legally accessed.
+
+.. _dss_other:
+
+Other Sequential Container options
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Other STL containers are available, such as ``std::string``.
+
+There are also various STL adapter classes such as ``std::queue``,
+``std::priority_queue``, ``std::stack``, etc.  These provide simplified access
+to an underlying container but don't affect the cost of the container itself.
+
+.. _ds_string:
+
+String-like containers
+----------------------
+
+There are a variety of ways to pass around and use strings in C and C++, and
+LLVM adds a few new options to choose from.  Pick the first option on this list
+that will do what you need; they are ordered according to their relative cost.
+
+Note that it is generally preferred to *not* pass strings around as ``const
+char*``'s.  These have a number of problems, including the fact that they
+cannot represent embedded nul (``"\0"``) characters, and do not have a length
+available efficiently.  The general replacement for '``const char*``' is
+StringRef.
+
+For more information on choosing string containers for APIs, please see
+:ref:`Passing Strings <string_apis>`.
+
+.. _dss_stringref:
+
+llvm/ADT/StringRef.h
+^^^^^^^^^^^^^^^^^^^^
+
+The StringRef class is a simple value class that contains a pointer to a
+character and a length, and is quite related to the :ref:`ArrayRef
+<dss_arrayref>` class (but specialized for arrays of characters).  Because
+StringRef carries a length with it, it safely handles strings with embedded nul
+characters in it, getting the length does not require a strlen call, and it even
+has very convenient APIs for slicing and dicing the character range that it
+represents.
+
+StringRef is ideal for passing simple strings around that are known to be live,
+either because they are C string literals, std::string, a C array, or a
+SmallVector.  Each of these cases has an efficient implicit conversion to
+StringRef, which doesn't result in a dynamic strlen being executed.
+
+StringRef has a few major limitations which make more powerful string containers
+useful:
+
+#. You cannot directly convert a StringRef to a 'const char*' because there is
+   no way to add a trailing nul (unlike the .c_str() method on various stronger
+   classes).
+
+#. StringRef doesn't own or keep alive the underlying string bytes.
+   As such it can easily lead to dangling pointers, and is not suitable for
+   embedding in data structures in most cases (instead, use an std::string or
+   something like that).
+
+#. For the same reason, StringRef cannot be used as the return value of a
+   method if the method "computes" the result string.  Instead, use std::string.
+
+#. StringRef doesn't allow you to mutate the pointed-to string bytes, and it
+   doesn't allow you to insert or remove bytes from the range.  For editing
+   operations like this, it interoperates with the :ref:`Twine <dss_twine>`
+   class.
+
+Because of its strengths and limitations, it is very common for a function to
+take a StringRef and for a method on an object to return a StringRef that points
+into some string that it owns.
+
+.. _dss_twine:
+
+llvm/ADT/Twine.h
+^^^^^^^^^^^^^^^^
+
+The Twine class is used as an intermediary datatype for APIs that want to take a
+string that can be constructed inline with a series of concatenations.  Twine
+works by forming recursive instances of the Twine datatype (a simple value
+object) on the stack as temporary objects, linking them together into a tree
+which is then linearized when the Twine is consumed.  Twine is only safe to use
+as the argument to a function, and should always be a const reference, e.g.:
+
+.. code-block:: c++
+
+  void foo(const Twine &T);
+  ...
+  StringRef X = ...
+  unsigned i = ...
+  foo(X + "." + Twine(i));
+
+This example forms a string like "blarg.42" by concatenating the values
+together, and does not form intermediate strings containing "blarg" or "blarg.".
+
+Because Twine is constructed with temporary objects on the stack, and because
+these instances are destroyed at the end of the current statement, it is an
+inherently dangerous API.  For example, this simple variant contains undefined
+behavior and will probably crash:
+
+.. code-block:: c++
+
+  void foo(const Twine &T);
+  ...
+  StringRef X = ...
+  unsigned i = ...
+  const Twine &Tmp = X + "." + Twine(i);
+  foo(Tmp);
+
+... because the temporaries are destroyed before the call.  That said, Twines
+are much more efficient than intermediate std::string temporaries, and they work
+really well with StringRef.  Just be aware of their limitations.
+
+.. _dss_smallstring:
+
+llvm/ADT/SmallString.h
+^^^^^^^^^^^^^^^^^^^^^^
+
+SmallString is a subclass of :ref:`SmallVector <dss_smallvector>` that adds some
+convenience APIs like += that takes StringRefs.  SmallString avoids allocating
+memory in the case when the preallocated space is enough to hold its data, and
+it falls back to general heap allocation when required.  Since it owns its data,
+it is very safe to use and supports full mutation of the string.
+
+Like SmallVector, the big downside to SmallString is its sizeof.  While
+SmallStrings are optimized for small strings, they are not particularly small.
+This means that they work great for temporary scratch buffers on the stack, but
+should not generally be put into the heap: it is very rare to see a SmallString
+as the member of a frequently-allocated heap data structure or returned
+by-value.
+
+.. _dss_stdstring:
+
+std::string
+^^^^^^^^^^^
+
+The standard C++ std::string class is a very general class that (like
+SmallString) owns its underlying data.  sizeof(std::string) is very reasonable
+so it can be embedded into heap data structures and returned by-value.  On the
+other hand, std::string is highly inefficient for inline editing (e.g.
+concatenating a bunch of stuff together) and because it is provided by the
+standard library, its performance characteristics depend a lot on the host
+standard library (e.g. libc++ and MSVC provide a highly optimized string class,
+GCC contains a really slow implementation).
+
+The major disadvantage of std::string is that almost every operation that makes
+it larger can allocate memory, which is slow.  As such, it is better to use
+SmallVector or Twine as a scratch buffer, but then use std::string to persist
+the result.
+
+.. _ds_set:
+
+Set-Like Containers (std::set, SmallSet, SetVector, etc)
+--------------------------------------------------------
+
+Set-like containers are useful when you need to canonicalize multiple values
+into a single representation.  There are several different choices for how to do
+this, providing various trade-offs.
+
+.. _dss_sortedvectorset:
+
+A sorted 'vector'
+^^^^^^^^^^^^^^^^^
+
+If you intend to insert a lot of elements, then do a lot of queries, a great
+approach is to use an std::vector (or other sequential container) with
+std::sort+std::unique to remove duplicates.  This approach works really well if
+your usage pattern has these two distinct phases (insert then query), and can be
+coupled with a good choice of :ref:`sequential container <ds_sequential>`.
+
+This combination provides several nice properties: the result data is
+contiguous in memory (good for cache locality), has few allocations, is easy to
+address (iterators in the final vector are just indices or pointers), and can be
+efficiently queried with a standard binary search (e.g.
+``std::lower_bound``; if you want the whole range of elements comparing
+equal, use ``std::equal_range``).
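+
+The two phases can be sketched with the standard library alone.  The helper
+names ``makeSortedUnique``, ``containsSorted``, ``buildAndQuery`` and
+``countDistinct`` are hypothetical, and ``int`` is used as the element type
+purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Phase 1: turn an arbitrary vector into a sorted, duplicate-free one.
void makeSortedUnique(std::vector<int> &Values) {
  std::sort(Values.begin(), Values.end());
  Values.erase(std::unique(Values.begin(), Values.end()), Values.end());
}

// Phase 2: membership query by binary search over the sorted vector.
bool containsSorted(const std::vector<int> &Values, int Key) {
  assert(std::is_sorted(Values.begin(), Values.end()) && "run phase 1 first");
  std::vector<int>::const_iterator It =
      std::lower_bound(Values.begin(), Values.end(), Key);
  return It != Values.end() && *It == Key;
}

// Runs both phases: build the set from Values, then query it for Key.
bool buildAndQuery(std::vector<int> Values, int Key) {
  makeSortedUnique(Values);
  return containsSorted(Values, Key);
}

// Number of distinct values left after phase 1.
std::size_t countDistinct(std::vector<int> Values) {
  makeSortedUnique(Values);
  return Values.size();
}
```

+Note that ``std::lower_bound`` only finds the first position that is not less
+than the key, so the result still has to be compared against the key itself.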
+
+.. _dss_smallset:
+
+llvm/ADT/SmallSet.h
+^^^^^^^^^^^^^^^^^^^
+
+If you have a set-like data structure that is usually small and whose elements
+are reasonably small, a ``SmallSet<Type, N>`` is a good choice.  This set has
+space for N elements in place (thus, if the set is dynamically smaller than N,
+no malloc traffic is required) and accesses them with a simple linear search.
+When the set grows beyond N elements, it allocates a more expensive
+representation that guarantees efficient access (for most types, it falls back
+to :ref:`std::set <dss_set>`, but for pointers it uses something far better,
+:ref:`SmallPtrSet <dss_smallptrset>`).
+
+The magic of this class is that it handles small sets extremely efficiently, but
+gracefully handles extremely large sets without loss of efficiency.
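+
+The small-then-large strategy can be illustrated with a toy class.  This is
+not how ``SmallSet`` is written; ``ToySmallSet`` and ``growsPastInline`` are
+hypothetical, and the sketch assumes a default-constructible element type and
+omits erasure and iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>

// Toy model of the idea: linear search while small, std::set once large.
template <typename T, std::size_t N> class ToySmallSet {
  static_assert(N > 0, "this sketch needs at least one inline slot");
  T Inline[N];           // in-object storage for the first N elements
  std::size_t NumInline; // how many Inline slots are in use
  std::set<T> Large;     // used only after growing past N elements

  bool isSmall() const { return Large.empty(); }

public:
  ToySmallSet() : NumInline(0) {}

  bool contains(const T &Value) const {
    if (isSmall())
      return std::find(Inline, Inline + NumInline, Value) !=
             Inline + NumInline;
    return Large.count(Value) != 0;
  }

  // Returns true if Value was newly inserted.
  bool insert(const T &Value) {
    if (!isSmall())
      return Large.insert(Value).second;
    if (contains(Value))
      return false;
    if (NumInline < N) {
      Inline[NumInline++] = Value;
      return true;
    }
    // Out of inline space: move everything into the large representation.
    Large.insert(Inline, Inline + NumInline);
    Large.insert(Value);
    NumInline = 0;
    return true;
  }

  std::size_t size() const { return isSmall() ? NumInline : Large.size(); }
  bool usesHeapSet() const { return !isSmall(); }
};

// Inserts 0..Count-1 twice into a ToySmallSet<int, 4> and reports whether it
// had to switch to the heap-allocated representation.
bool growsPastInline(int Count) {
  ToySmallSet<int, 4> Set;
  for (int Round = 0; Round != 2; ++Round)
    for (int I = 0; I != Count; ++I)
      Set.insert(I);
  assert(Set.size() == static_cast<std::size_t>(Count) && "duplicates dropped");
  return Set.usesHeapSet();
}
```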
+
+.. _dss_smallptrset:
+
+llvm/ADT/SmallPtrSet.h
+^^^^^^^^^^^^^^^^^^^^^^
+
+``SmallPtrSet`` has all the advantages of ``SmallSet`` (and a ``SmallSet`` of
+pointers is transparently implemented with a ``SmallPtrSet``). If more than N
+insertions are performed, a single quadratically probed hash table is allocated
+and grows as needed, providing extremely efficient access (constant time
+insertion/deletion/queries with low constant factors) and is very stingy with
+malloc traffic.
+
+Note that, unlike :ref:`std::set <dss_set>`, the iterators of ``SmallPtrSet``
+are invalidated whenever an insertion occurs.  Also, the values visited by the
+iterators are not visited in sorted order.
+
+.. _dss_stringset:
+
+llvm/ADT/StringSet.h
+^^^^^^^^^^^^^^^^^^^^
+
+``StringSet`` is a thin wrapper around :ref:`StringMap\<char\> <dss_stringmap>`,
+and it allows efficient storage and retrieval of unique strings.
+
+Functionally analogous to ``SmallSet<StringRef>``, ``StringSet`` also supports
+iteration. (The iterator dereferences to a ``StringMapEntry<char>``, so you
+need to call ``i->getKey()`` to access the item of the StringSet.)  On the
+other hand, ``StringSet`` doesn't support range-insertion and
+copy-construction, which :ref:`SmallSet <dss_smallset>` and :ref:`SmallPtrSet
+<dss_smallptrset>` do support.
+
+.. _dss_denseset:
+
+llvm/ADT/DenseSet.h
+^^^^^^^^^^^^^^^^^^^
+
+DenseSet is a simple quadratically probed hash table.  It excels at supporting
+small values: it uses a single allocation to hold all of the values that are
+currently inserted in the set.  DenseSet is a great way to unique small values
+that are not simple pointers (use :ref:`SmallPtrSet <dss_smallptrset>` for
+pointers).  Note that DenseSet has the same requirements for the value type that
+:ref:`DenseMap <dss_densemap>` has.
+
+.. _dss_sparseset:
+
+llvm/ADT/SparseSet.h
+^^^^^^^^^^^^^^^^^^^^
+
+SparseSet holds a small number of objects identified by unsigned keys of
+moderate size.  It uses a lot of memory, but provides operations that are almost
+as fast as a vector.  Typical keys are physical registers, virtual registers, or
+numbered basic blocks.
+
+SparseSet is useful for algorithms that need very fast clear/find/insert/erase
+and fast iteration over small sets.  It is not intended for building composite
+data structures.
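+
+To make the trade-off concrete, here is a small sketch of the classic
+sparse/dense array technique that this kind of container is built on.  It is
+not LLVM's code: ``TinySparseSet`` and its members are hypothetical names, and
+keys are assumed to be ``unsigned`` values below a fixed universe size.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <vector>
+
+  class TinySparseSet {
+    std::vector<unsigned> Dense;  // The keys currently in the set, unordered.
+    std::vector<unsigned> Sparse; // Sparse[K] is K's claimed index in Dense.
+
+  public:
+    explicit TinySparseSet(unsigned Universe) : Sparse(Universe, 0) {}
+
+    bool contains(unsigned K) const {
+      unsigned I = Sparse[K];
+      return I < Dense.size() && Dense[I] == K; // Rejects stale entries.
+    }
+
+    bool insert(unsigned K) {
+      if (contains(K))
+        return false;
+      Sparse[K] = static_cast<unsigned>(Dense.size());
+      Dense.push_back(K);
+      return true;
+    }
+
+    bool erase(unsigned K) {
+      if (!contains(K))
+        return false;
+      unsigned I = Sparse[K];
+      unsigned Last = Dense.back();
+      Dense[I] = Last; // Move the last key into the hole.
+      Sparse[Last] = I;
+      Dense.pop_back();
+      return true;
+    }
+
+    // Sparse is left untouched: contains() ignores whatever it still holds.
+    void clear() { Dense.clear(); }
+
+    std::size_t size() const { return Dense.size(); }
+  };
+
+The sparse array is as large as the key universe, which is where the memory
+goes, while iteration only walks the dense array.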
+
+.. _dss_sparsemultiset:
+
+llvm/ADT/SparseMultiSet.h
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+SparseMultiSet adds multiset behavior to SparseSet, while retaining SparseSet's
+desirable attributes. Like SparseSet, it typically uses a lot of memory, but
+provides operations that are almost as fast as a vector.  Typical keys are
+physical registers, virtual registers, or numbered basic blocks.
+
+SparseMultiSet is useful for algorithms that need very fast
+clear/find/insert/erase of the entire collection, and iteration over sets of
+elements sharing a key. It is often a more efficient choice than using composite
+data structures (e.g. vector-of-vectors, map-of-vectors). It is not intended for
+building composite data structures.
+
+.. _dss_FoldingSet:
+
+llvm/ADT/FoldingSet.h
+^^^^^^^^^^^^^^^^^^^^^
+
+FoldingSet is an aggregate class that is really good at uniquing
+expensive-to-create or polymorphic objects.  It is a combination of a chained
+hash table with intrusive links (uniqued objects are required to inherit from
+FoldingSetNode) that uses :ref:`SmallVector <dss_smallvector>` as part of its ID
+process.
+
+Consider a case where you want to implement a "getOrCreateFoo" method for a
+complex object (for example, a node in the code generator).  The client has a
+description of **what** it wants to generate (it knows the opcode and all the
+operands), but we don't want to 'new' a node, then try inserting it into a set
+only to find out it already exists, at which point we would have to delete it
+and return the node that already exists.
+
+To support this style of client, FoldingSet performs a query with a
+FoldingSetNodeID (which wraps SmallVector) that can be used to describe the
+element that we want to query for.  The query either returns the element
+matching the ID or it returns an opaque ID that indicates where insertion should
+take place.  Construction of the ID usually does not require heap traffic.
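+
+The look-up-before-allocate idea can be sketched with standard containers
+alone.  This is only an analogy, not the FoldingSet API: ``Node``,
+``NodeTable`` and ``getOrCreate`` are hypothetical names, and a ``std::map``
+keyed by a vector of integers stands in for the ID and the hash table.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <map>
+  #include <memory>
+  #include <utility>
+  #include <vector>
+
+  struct Node {
+    unsigned Opcode;
+    std::vector<int> Operands;
+  };
+
+  class NodeTable {
+    // The key plays the role of the ID: the opcode followed by the operands.
+    std::map<std::vector<int>, std::unique_ptr<Node>> Nodes;
+
+  public:
+    Node *getOrCreate(unsigned Opcode, const std::vector<int> &Operands) {
+      // Describe the node we want without allocating it.
+      std::vector<int> ID;
+      ID.push_back(static_cast<int>(Opcode));
+      ID.insert(ID.end(), Operands.begin(), Operands.end());
+
+      // One query yields either the existing node or the insertion position.
+      auto Pos = Nodes.lower_bound(ID);
+      if (Pos != Nodes.end() && Pos->first == ID)
+        return Pos->second.get();
+
+      // Only now pay for the allocation, reusing the position as a hint.
+      std::unique_ptr<Node> N(new Node{Opcode, Operands});
+      Pos = Nodes.emplace_hint(Pos, std::move(ID), std::move(N));
+      return Pos->second.get();
+    }
+
+    std::size_t size() const { return Nodes.size(); }
+  };
+
+Asking twice for the same opcode and operands returns the same ``Node``
+pointer, and no node is ever created just to be thrown away.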
+
+Because FoldingSet uses intrusive links, it can support polymorphic objects in
+the set (for example, you can have SDNode instances mixed with LoadSDNodes).
+Because the elements are individually allocated, pointers to the elements are
+stable: inserting or removing elements does not invalidate any pointers to other
+elements.
+
+.. _dss_set:
+
+<set>
+^^^^^
+
+``std::set`` is a reasonable all-around set class, which is decent at many
+things but great at nothing.  std::set allocates memory for each element
+inserted (thus it is very malloc intensive) and typically stores three pointers
+per element in the set (thus adding a large amount of per-element space
+overhead).  It offers guaranteed log(n) performance, which is not particularly
+fast from a complexity standpoint (particularly if the elements of the set are
+expensive to compare, like strings), and has extremely high constant factors for
+lookup, insertion and removal.
+
+The advantages of std::set are that its iterators are stable (deleting or
+inserting an element from the set does not affect iterators or pointers to other
+elements) and that iteration over the set is guaranteed to be in sorted order.
+If the elements in the set are large, then the relative overhead of the pointers
+and malloc traffic is not a big deal, but if the elements of the set are small,
+std::set is almost never a good choice.
+
+.. _dss_setvector:
+
+llvm/ADT/SetVector.h
+^^^^^^^^^^^^^^^^^^^^
+
+LLVM's ``SetVector<Type>`` is an adapter class that combines your choice of a
+set-like container with a :ref:`Sequential Container <ds_sequential>`.  The
+important property that this provides is efficient insertion with uniquing
+(duplicate elements are ignored) with iteration support.  It implements this by
+inserting elements into both a set-like container and the sequential container,
+using the set-like container for uniquing and the sequential container for
+iteration.
+
+The difference between SetVector and other sets is that the order of iteration
+is guaranteed to match the order of insertion into the SetVector.  This property
+is really important for things like sets of pointers.  Because pointer values
+are non-deterministic (e.g. they vary across runs of the program on different
+machines), iterating over the pointers in an ordinary set does not happen in a
+well-defined order.
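+
+The adapter idea is small enough to sketch with standard containers.  The
+class below is a hypothetical illustration (``TinySetVector`` is not an LLVM
+name, and the real class offers far more): a ``std::set`` does the uniquing
+and a ``std::vector`` records the insertion order.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <set>
+  #include <vector>
+
+  template <typename T> class TinySetVector {
+    std::set<T> Seen;     // Used only to detect duplicates.
+    std::vector<T> Order; // Used only for iteration, in insertion order.
+
+  public:
+    // Returns true if V was newly inserted, false if it was a duplicate.
+    bool insert(const T &V) {
+      if (!Seen.insert(V).second)
+        return false;
+      Order.push_back(V);
+      return true;
+    }
+
+    typename std::vector<T>::const_iterator begin() const {
+      return Order.begin();
+    }
+    typename std::vector<T>::const_iterator end() const { return Order.end(); }
+    std::size_t size() const { return Order.size(); }
+  };
+
+Inserting 3, 1, 3, 2 and then iterating visits 3, 1, 2, whatever order the
+underlying set would have used.  Every element is stored twice, once per
+container.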
+
+The drawback of SetVector is that it requires twice as much space as a normal
+set and has the sum of constant factors from the set-like container and the
+sequential container that it uses.  Use it **only** if you need to iterate over
+the elements in a deterministic order.  SetVector is also expensive to delete
+elements out of (linear time), unless you use its "pop_back" method, which is
+faster.
+
+``SetVector`` is an adapter class that defaults to using ``std::vector`` and a
+size 16 ``SmallSet`` for the underlying containers, so it is quite expensive.
+However, ``"llvm/ADT/SetVector.h"`` also provides a ``SmallSetVector`` class,
+which defaults to using a ``SmallVector`` and ``SmallSet`` of a specified size.
+If you use this, and if your sets are dynamically smaller than ``N``, you will
+save a lot of heap traffic.
+
+.. _dss_uniquevector:
+
+llvm/ADT/UniqueVector.h
+^^^^^^^^^^^^^^^^^^^^^^^
+
+UniqueVector is similar to :ref:`SetVector <dss_setvector>` but it retains a
+unique ID for each element inserted into the set.  It internally contains a map
+and a vector, and it assigns a unique ID for each value inserted into the set.
+
+UniqueVector is very expensive: its cost is the sum of the cost of maintaining
+both the map and vector, it has high complexity, high constant factors, and
+produces a lot of malloc traffic.  It should be avoided.
+
+.. _dss_immutableset:
+
+llvm/ADT/ImmutableSet.h
+^^^^^^^^^^^^^^^^^^^^^^^
+
+ImmutableSet is an immutable (functional) set implementation based on an AVL
+tree.  Adding or removing elements is done through a Factory object and results
+in the creation of a new ImmutableSet object.  If an ImmutableSet already exists
+with the given contents, then the existing one is returned; equality is compared
+with a FoldingSetNodeID.  The time and space complexity of add or remove
+operations is logarithmic in the size of the original set.
+
+There is no method for returning an element of the set; you can only check for
+membership.
+
+.. _dss_otherset:
+
+Other Set-Like Container Options
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The STL provides several other options, such as std::multiset and the various
+"hash_set" like containers (whether from C++ TR1 or from the SGI library).  We
+never use hash_set and unordered_set because they are generally very expensive
+(each insertion requires a malloc) and very non-portable.
+
+std::multiset is useful if you're not interested in elimination of duplicates,
+but has all the drawbacks of :ref:`std::set <dss_set>`.  A sorted vector
+(where you don't delete duplicate entries) or some other approach is almost
+always better.
+
+.. _ds_map:
+
+Map-Like Containers (std::map, DenseMap, etc)
+---------------------------------------------
+
+Map-like containers are useful when you want to associate data with a key.  As
+usual, there are a lot of different ways to do this. :)
+
+.. _dss_sortedvectormap:
+
+A sorted 'vector'
+^^^^^^^^^^^^^^^^^
+
+If your usage pattern follows a strict insert-then-query approach, you can
+trivially use the same approach as :ref:`sorted vectors for set-like containers
+<dss_sortedvectorset>`.  The only difference is that your query function (which
+uses std::lower_bound to get efficient log(n) lookup) should only compare the
+key, not both the key and value.  This yields the same advantages as sorted
+vectors for sets.
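+
+A sketch of such a key-only query, assuming ``int`` keys, ``std::string``
+values and at most one entry per key (``Entry``, ``sortByKey`` and ``lookup``
+are names invented for this example):
+
+.. code-block:: c++
+
+  #include <algorithm>
+  #include <string>
+  #include <utility>
+  #include <vector>
+
+  using Entry = std::pair<int, std::string>;
+
+  // End of the insert phase: order the entries by key only.
+  inline void sortByKey(std::vector<Entry> &V) {
+    std::sort(V.begin(), V.end(),
+              [](const Entry &A, const Entry &B) { return A.first < B.first; });
+  }
+
+  // Query phase: the comparator looks at the key and never at the value.
+  inline const std::string *lookup(const std::vector<Entry> &V, int Key) {
+    auto It = std::lower_bound(
+        V.begin(), V.end(), Key,
+        [](const Entry &E, int K) { return E.first < K; });
+    return (It != V.end() && It->first == Key) ? &It->second : nullptr;
+  }
+
+``lookup`` returns a null pointer when the key is absent, so callers do not
+need a sentinel value.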
+
+.. _dss_stringmap:
+
+llvm/ADT/StringMap.h
+^^^^^^^^^^^^^^^^^^^^
+
+Strings are commonly used as keys in maps, and they are difficult to support
+efficiently: they are variable length, inefficient to hash and compare when
+long, expensive to copy, etc.  StringMap is a specialized container designed to
+cope with these issues.  It supports mapping an arbitrary range of bytes to an
+arbitrary other object.
+
+The StringMap implementation uses a quadratically-probed hash table, where the
+buckets store a pointer to the heap allocated entries (and some other stuff).
+The entries in the map must be heap allocated because the strings are variable
+length.  The string data (key) and the element object (value) are stored in the
+same allocation with the string data immediately after the element object.
+This container guarantees that "``(char*)(&Value+1)``" points to the key string
+for a value.
+
+The StringMap is very fast for several reasons: quadratic probing is very cache
+efficient for lookups, the hash value of strings in buckets is not recomputed
+when looking up an element, StringMap rarely has to touch the memory for
+unrelated objects when looking up a value (even when hash collisions happen),
+hash table growth does not recompute the hash values for strings already in the
+table, and each pair in the map is stored in a single allocation (the string
+data is stored in the same allocation as the Value of a pair).
+
+StringMap also provides query methods that take byte ranges, so it only ever
+copies a string if a value is inserted into the table.
+
+StringMap iteration order, however, is not guaranteed to be deterministic, so
+any uses which require that should instead use a std::map.
+
+.. _dss_indexmap:
+
+llvm/ADT/IndexedMap.h
+^^^^^^^^^^^^^^^^^^^^^
+
+IndexedMap is a specialized container for mapping small dense integers (or
+values that can be mapped to small dense integers) to some other type.  It is
+internally implemented as a vector with a mapping function that maps the keys
+to the dense integer range.
+
+This is useful for cases like virtual registers in the LLVM code generator: they
+have a dense mapping that is offset by a compile-time constant (the first
+virtual register ID).
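+
+The idea fits in a few lines.  The class below is a hypothetical sketch, not
+the IndexedMap interface: it assumes ``unsigned`` keys and a mapping function
+that simply subtracts the first key.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <vector>
+
+  template <typename T> class TinyIndexedMap {
+    unsigned FirstKey;      // Offset of the first key in the dense range.
+    std::vector<T> Storage; // Storage[Key - FirstKey] holds the value for Key.
+
+    std::size_t toIndex(unsigned Key) const { return Key - FirstKey; }
+
+  public:
+    explicit TinyIndexedMap(unsigned First) : FirstKey(First) {}
+
+    // Make sure Key has a slot; new slots are value-initialized.
+    void grow(unsigned Key) {
+      if (toIndex(Key) >= Storage.size())
+        Storage.resize(toIndex(Key) + 1);
+    }
+
+    T &operator[](unsigned Key) { return Storage[toIndex(Key)]; }
+  };
+
+A lookup is one subtraction and one vector access, with no hashing or
+comparison involved.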
+
+.. _dss_densemap:
+
+llvm/ADT/DenseMap.h
+^^^^^^^^^^^^^^^^^^^
+
+DenseMap is a simple quadratically probed hash table.  It excels at supporting
+small keys and values: it uses a single allocation to hold all of the pairs
+that are currently inserted in the map.  DenseMap is a great way to map
+pointers to pointers, or map other small types to each other.
+
+There are several aspects of DenseMap that you should be aware of, however.
+The iterators in a DenseMap are invalidated whenever an insertion occurs,
+unlike map.  Also, because DenseMap allocates space for a large number of
+key/value pairs (it starts with 64 by default), it will waste a lot of space if
+your keys or values are large.  Finally, you must implement a partial
+specialization of DenseMapInfo for the key that you want, if it isn't already
+supported.  This is required to tell DenseMap about two special marker values
+(which can never be inserted into the map) that it needs internally.
+
+DenseMap's find_as() method supports lookup operations using an alternate key
+type.  This is useful in cases where the normal key type is expensive to
+construct, but cheap to compare against.  The DenseMapInfo is responsible for
+defining the appropriate comparison and hashing methods for each alternate key
+type used.
+
+.. _dss_valuemap:
+
+llvm/IR/ValueMap.h
+^^^^^^^^^^^^^^^^^^
+
+ValueMap is a wrapper around a :ref:`DenseMap <dss_densemap>` mapping
+``Value*``\ s (or subclasses) to another type.  When a Value is deleted or
+RAUW'ed, ValueMap will update itself so the new version of the key is mapped to
+the same value, just as if the key were a WeakVH.  You can configure exactly how
+this happens, and what else happens on these two events, by passing a ``Config``
+parameter to the ValueMap template.
+
+.. _dss_intervalmap:
+
+llvm/ADT/IntervalMap.h
+^^^^^^^^^^^^^^^^^^^^^^
+
+IntervalMap is a compact map for small keys and values.  It maps key intervals
+instead of single keys, and it will automatically coalesce adjacent intervals.
+When the map only contains a few intervals, they are stored in the map object
+itself to avoid allocations.
+
+The IntervalMap iterators are quite big, so they should not be passed around as
+STL iterators.  The heavyweight iterators allow a smaller data structure.
+
+.. _dss_map:
+
+<map>
+^^^^^
+
+std::map has similar characteristics to :ref:`std::set <dss_set>`: it uses a
+single allocation per pair inserted into the map, it offers log(n) lookup with
+an extremely large constant factor, imposes a space penalty of 3 pointers per
+pair in the map, etc.
+
+std::map is most useful when your keys or values are very large, if you need to
+iterate over the collection in sorted order, or if you need stable iterators
+into the map (i.e. they don't get invalidated if an insertion or deletion of
+another element takes place).
+
+.. _dss_mapvector:
+
+llvm/ADT/MapVector.h
+^^^^^^^^^^^^^^^^^^^^
+
+``MapVector<KeyT,ValueT>`` provides a subset of the DenseMap interface.  The
+main difference is that the iteration order is guaranteed to be the insertion
+order, making it an easy (but somewhat expensive) fix for non-deterministic
+iteration over maps of pointers.
+
+It is implemented by mapping from key to an index in a vector of key,value
+pairs.  This provides fast lookup and iteration, but has two main drawbacks:
+the key is stored twice and removing elements takes linear time.  If it is
+necessary to remove elements, it's best to remove them in bulk using
+``remove_if()``.
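+
+The layout described above can be sketched as follows.  ``TinyMapVector`` is a
+hypothetical stand-in that uses a ``std::map`` for the index, and it only shows
+insertion and iteration:
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <map>
+  #include <utility>
+  #include <vector>
+
+  template <typename K, typename V> class TinyMapVector {
+    std::map<K, std::size_t> Index;       // Key -> position in Entries.
+    std::vector<std::pair<K, V>> Entries; // Pairs, in insertion order.
+
+  public:
+    // Returns the value for Key, appending a default value if it is new.
+    V &operator[](const K &Key) {
+      auto Res = Index.insert(std::make_pair(Key, Entries.size()));
+      if (Res.second)
+        Entries.push_back(std::make_pair(Key, V()));
+      return Entries[Res.first->second].second;
+    }
+
+    typename std::vector<std::pair<K, V>>::const_iterator begin() const {
+      return Entries.begin();
+    }
+    typename std::vector<std::pair<K, V>>::const_iterator end() const {
+      return Entries.end();
+    }
+    std::size_t size() const { return Entries.size(); }
+  };
+
+Both drawbacks are visible here: each key lives in ``Index`` and in
+``Entries``, and erasing from the middle of ``Entries`` would shift every later
+pair and invalidate the stored positions.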
+
+.. _dss_inteqclasses:
+
+llvm/ADT/IntEqClasses.h
+^^^^^^^^^^^^^^^^^^^^^^^
+
+IntEqClasses provides a compact representation of equivalence classes of small
+integers.  Initially, each integer in the range 0..n-1 has its own equivalence
+class.  Classes can be joined by passing two class representatives to the
+join(a, b) method.  Two integers are in the same class when findLeader() returns
+the same representative.
+
+Once all equivalence classes are formed, the map can be compressed so each
+integer 0..n-1 maps to an equivalence class number in the range 0..m-1, where m
+is the total number of equivalence classes.  The map must be uncompressed before
+it can be edited again.
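+
+The behaviour described above can be illustrated with a small union-find
+sketch.  It is a hypothetical model, not the IntEqClasses source:
+``TinyEqClasses`` is an invented name, its ``join`` accepts any member of a
+class, and its ``compress`` returns the numbering instead of changing the
+object in place.
+
+.. code-block:: c++
+
+  #include <utility>
+  #include <vector>
+
+  class TinyEqClasses {
+    // Leader[I] <= I always holds; a class leader maps to itself.
+    std::vector<unsigned> Leader;
+
+  public:
+    explicit TinyEqClasses(unsigned N) : Leader(N) {
+      for (unsigned I = 0; I != N; ++I)
+        Leader[I] = I; // Initially every integer is its own class.
+    }
+
+    unsigned findLeader(unsigned A) const {
+      while (Leader[A] != A)
+        A = Leader[A];
+      return A;
+    }
+
+    // Merge the classes of A and B; the smaller leader represents the result.
+    unsigned join(unsigned A, unsigned B) {
+      A = findLeader(A);
+      B = findLeader(B);
+      if (A > B)
+        std::swap(A, B);
+      Leader[B] = A;
+      return A;
+    }
+
+    // Number the classes 0..m-1 in order of their leaders.
+    std::vector<unsigned> compress() const {
+      std::vector<unsigned> Class(Leader.size());
+      unsigned NumClasses = 0;
+      for (unsigned I = 0; I != Leader.size(); ++I)
+        Class[I] = (Leader[I] == I) ? NumClasses++ : Class[Leader[I]];
+      return Class;
+    }
+  };
+
+Because a leader is never larger than the members it represents, ``compress``
+can number everything in a single forward pass.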
+
+.. _dss_immutablemap:
+
+llvm/ADT/ImmutableMap.h
+^^^^^^^^^^^^^^^^^^^^^^^
+
+ImmutableMap is an immutable (functional) map implementation based on an AVL
+tree.  Adding or removing elements is done through a Factory object and results
+in the creation of a new ImmutableMap object.  If an ImmutableMap already exists
+with the given key set, then the existing one is returned; equality is compared
+with a FoldingSetNodeID.  The time and space complexity of add or remove
+operations is logarithmic in the size of the original map.
+
+.. _dss_othermap:
+
+Other Map-Like Container Options
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The STL provides several other options, such as std::multimap and the various
+"hash_map" like containers (whether from C++ TR1 or from the SGI library).  We
+never use hash_map and unordered_map because they are generally very expensive
+(each insertion requires a malloc) and very non-portable.
+
+std::multimap is useful if you want to map a key to multiple values, but has all
+the drawbacks of std::map.  A sorted vector or some other approach is almost
+always better.
+
+.. _ds_bit:
+
+Bit storage containers (BitVector, SparseBitVector)
+---------------------------------------------------
+
+Unlike the other containers, there are only two bit storage containers, and
+choosing when to use each is relatively straightforward.
+
+One additional option is ``std::vector<bool>``: we discourage its use for two
+reasons: 1) the implementation in many common compilers (e.g.  commonly
+available versions of GCC) is extremely inefficient and 2) the C++ standards
+committee is likely to deprecate this container and/or change it significantly
+somehow.  In any case, please don't use it.
+
+.. _dss_bitvector:
+
+BitVector
+^^^^^^^^^
+
+The BitVector container provides a dynamic size set of bits for manipulation.
+It supports individual bit setting/testing, as well as set operations.  The set
+operations take time O(size of bitvector), but operations are performed one word
+at a time, instead of one bit at a time.  This makes the BitVector very fast for
+set operations compared to other containers.  Use the BitVector when you expect
+the number of set bits to be high (i.e. a dense set).
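+
+To show what "one word at a time" means, here is a hedged sketch of a
+fixed-size bit vector built on 64-bit words.  ``TinyBitVector`` is a
+hypothetical class, not the BitVector implementation, and its set operations
+assume both operands were created with the same size.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <cstdint>
+  #include <vector>
+
+  class TinyBitVector {
+    std::vector<std::uint64_t> Words; // 64 bits per element, all zero at first.
+
+  public:
+    explicit TinyBitVector(std::size_t NumBits) : Words((NumBits + 63) / 64) {}
+
+    void set(std::size_t I) { Words[I / 64] |= std::uint64_t(1) << (I % 64); }
+    bool test(std::size_t I) const { return (Words[I / 64] >> (I % 64)) & 1; }
+
+    // Set union: each loop iteration combines 64 bits.
+    TinyBitVector &operator|=(const TinyBitVector &RHS) {
+      for (std::size_t W = 0; W != Words.size(); ++W)
+        Words[W] |= RHS.Words[W];
+      return *this;
+    }
+
+    // Set intersection, again 64 bits per iteration.
+    TinyBitVector &operator&=(const TinyBitVector &RHS) {
+      for (std::size_t W = 0; W != Words.size(); ++W)
+        Words[W] &= RHS.Words[W];
+      return *this;
+    }
+  };
+
+The cost of a union or intersection depends on the size of the vector, not on
+how many bits are set, which is why this layout suits dense sets.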
+
+.. _dss_smallbitvector:
+
+SmallBitVector
+^^^^^^^^^^^^^^
+
+The SmallBitVector container provides the same interface as BitVector, but it is
+optimized for the case where only a small number of bits, less than 25 or so,
+are needed.  It also transparently supports larger bit counts, but slightly less
+efficiently than a plain BitVector, so SmallBitVector should only be used when
+larger counts are rare.
+
+At this time, SmallBitVector does not support set operations (and, or, xor), and
+its operator[] does not provide an assignable lvalue.
+
+.. _dss_sparsebitvector:
+
+SparseBitVector
+^^^^^^^^^^^^^^^
+
+The SparseBitVector container is much like BitVector, with one major difference:
+only the bits that are set are stored.  This makes the SparseBitVector much
+more space efficient than BitVector when the set is sparse, as well as making
+set operations O(number of set bits) instead of O(size of universe).  The
+downside to the SparseBitVector is that setting and testing of random bits is
+O(N), and on large SparseBitVectors, this can be slower than BitVector.  In our
+implementation, setting or testing bits in sorted order (either forwards or
+reverse) is O(1) worst case.  Testing and setting bits within 128 bits (depends
+on size) of the current bit is also O(1).  As a general statement,
+testing/setting bits in a SparseBitVector is O(distance away from last set bit).
+
+.. _debugging:
+
+Debugging
+=========
+
+A handful of `GDB pretty printers
+<https://sourceware.org/gdb/onlinedocs/gdb/Pretty-Printing.html>`__ are
+provided for some of the core LLVM libraries. To use them, execute the
+following (or add it to your ``~/.gdbinit``)::
+
+  source /path/to/llvm/src/utils/gdb-scripts/prettyprinters.py
+
+It also might be handy to enable the `print pretty
+<http://ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_57.html>`__ option to
+avoid data structures being printed as a big block of text.
+
+.. _common:
+
+Helpful Hints for Common Operations
+===================================
+
+This section describes how to perform some very simple transformations of LLVM
+code.  This is meant to give examples of common idioms used, showing the
+practical side of LLVM transformations.
+
+Because this is a "how-to" section, you should also read about the main classes
+that you will be working with.  The :ref:`Core LLVM Class Hierarchy Reference
+<coreclasses>` contains details and descriptions of the main classes that you
+should know about.
+
+.. _inspection:
+
+Basic Inspection and Traversal Routines
+---------------------------------------
+
+The LLVM compiler infrastructure has many different data structures that may be
+traversed.  Following the example of the C++ standard template library, the
+techniques used to traverse these various data structures are all basically the
+same.  For an enumerable sequence of values, the ``XXXbegin()`` function (or
+method) returns an iterator to the start of the sequence, the ``XXXend()``
+function returns an iterator pointing to one past the last valid element of the
+sequence, and there is some ``XXXiterator`` data type that is common between the
+two operations.
+
+Because the pattern for iteration is common across many different aspects of the
+program representation, the standard template library algorithms may be used on
+them, and it is easier to remember how to iterate.  First we show a few common
+examples of the data structures that need to be traversed.  Other data
+structures are traversed in very similar ways.
+
+.. _iterate_function:
+
+Iterating over the ``BasicBlock`` in a ``Function``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+It's quite common to have a ``Function`` instance that you'd like to transform
+in some way; in particular, you'd like to manipulate its ``BasicBlock``\ s.  To
+facilitate this, you'll need to iterate over all of the ``BasicBlock``\ s that
+constitute the ``Function``.  The following is an example that prints the name
+of a ``BasicBlock`` and the number of ``Instruction``\ s it contains:
+
+.. code-block:: c++
+
+  Function &Func = ...;
+  for (BasicBlock &BB : Func)
+    // Print out the name of the basic block if it has one, and then the
+    // number of instructions that it contains
+    errs() << "Basic block (name=" << BB.getName() << ") has "
+               << BB.size() << " instructions.\n";
+
+.. _iterate_basicblock:
+
+Iterating over the ``Instruction`` in a ``BasicBlock``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Just like when dealing with ``BasicBlock``\ s in ``Function``\ s, it's easy to
+iterate over the individual instructions that make up ``BasicBlock``\ s.  Here's
+a code snippet that prints out each instruction in a ``BasicBlock``:
+
+.. code-block:: c++
+
+  BasicBlock &BB = ...;
+  for (Instruction &I : BB)
+     // The next statement works since operator<<(ostream&,...)
+     // is overloaded for Instruction&
+     errs() << I << "\n";
+
+
+However, this isn't really the best way to print out the contents of a
+``BasicBlock``!  Since the ostream operators are overloaded for virtually
+anything you'll care about, you could have just invoked the print routine on the
+basic block itself: ``errs() << BB << "\n";``.
+
+.. _iterate_insiter:
+
+Iterating over the ``Instruction`` in a ``Function``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you're finding that you commonly iterate over a ``Function``'s
+``BasicBlock``\ s and then that ``BasicBlock``'s ``Instruction``\ s,
+``InstIterator`` should be used instead.  You'll need to include
+``llvm/IR/InstIterator.h`` (`doxygen
+<http://llvm.org/doxygen/InstIterator_8h.html>`__) and then instantiate
+``InstIterator``\ s explicitly in your code.  Here's a small example that shows
+how to dump all instructions in a function to the standard error stream:
+
+.. code-block:: c++
+
+  #include "llvm/IR/InstIterator.h"
+
+  // F is a pointer to a Function instance
+  for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
+    errs() << *I << "\n";
+
+Easy, isn't it?  You can also use ``InstIterator``\ s to fill a work list with
+its initial contents.  For example, if you wanted to initialize a work list to
+contain all instructions in a ``Function`` F, all you would need to do is
+something like:
+
+.. code-block:: c++
+
+  std::set<Instruction*> worklist;
+  // or better yet, SmallPtrSet<Instruction*, 64> worklist;
+
+  for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
+    worklist.insert(&*I);
+
+The STL set ``worklist`` would now contain all instructions in the ``Function``
+pointed to by F.
+
+.. _iterate_convert:
+
+Turning an iterator into a class pointer (and vice-versa)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Sometimes, it'll be useful to grab a reference (or pointer) to a class instance
+when all you've got at hand is an iterator.  Well, extracting a reference or a
+pointer from an iterator is very straight-forward.  Assuming that ``i`` is a
+``BasicBlock::iterator`` and ``j`` is a ``BasicBlock::const_iterator``:
+
+.. code-block:: c++
+
+  Instruction &inst = *i;        // Grab a reference to the instruction
+  Instruction *pinst = &*i;      // Grab a pointer to the instruction
+  const Instruction &cinst = *j; // Same, through a const_iterator
+
+However, the iterators you'll be working with in the LLVM framework are special:
+they will automatically convert to a ptr-to-instance type whenever they need to.
+Instead of dereferencing the iterator and then taking the address of the result,
+you can simply assign the iterator to the proper pointer type and you get the
+dereference and address-of operation as a result of the assignment (behind the
+scenes, this is a result of overloading casting mechanisms).  Thus the second
+line of the last example,
+
+.. code-block:: c++
+
+  Instruction *pinst = &*i;
+
+is semantically equivalent to
+
+.. code-block:: c++
+
+  Instruction *pinst = i;
+
+It's also possible to turn a class pointer into the corresponding iterator, and
+this is a constant time operation (very efficient).  The following code snippet
+illustrates use of the conversion constructors provided by LLVM iterators.  By
+using these, you can explicitly grab the iterator of something without actually
+obtaining it via iteration over some structure:
+
+.. code-block:: c++
+
+  void printNextInstruction(Instruction* inst) {
+    BasicBlock::iterator it(inst);
+    ++it; // After this line, it refers to the instruction after *inst
+    if (it != inst->getParent()->end()) errs() << *it << "\n";
+  }
+
+Unfortunately, these implicit conversions come at a cost; they prevent these
+iterators from conforming to standard iterator conventions, and thus from being
+usable with standard algorithms and containers.  For example, they prevent the
+following code, where ``B`` is a ``BasicBlock``, from compiling:
+
+.. code-block:: c++
+
+  llvm::SmallVector<llvm::Instruction *, 16>(B->begin(), B->end());
+
+Because of this, these implicit conversions may be removed some day, and
+``operator*`` changed to return a pointer instead of a reference.
+
+.. _iterate_complex:
+
+Finding call sites: a slightly more complex example
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Say that you're writing a FunctionPass and would like to count all the locations
+in the entire module (that is, across every ``Function``) where a certain
+function (i.e., some ``Function *``) is already in scope.  As you'll learn
+later, you may want to use an ``InstVisitor`` to accomplish this in a much more
+straight-forward manner, but this example will allow us to explore how you'd do
+it if you didn't have ``InstVisitor`` around.  In pseudo-code, this is what we
+want to do:
+
+.. code-block:: none
+
+  initialize callCounter to zero
+  for each Function f in the Module
+    for each BasicBlock b in f
+      for each Instruction i in b
+        if (i is a CallInst and calls the given function)
+          increment callCounter
+
+And the actual code is (remember, because we're writing a ``FunctionPass``, our
+``FunctionPass``-derived class simply has to override the ``runOnFunction``
+method):
+
+.. code-block:: c++
+
+  Function* targetFunc = ...;
+
+  class OurFunctionPass : public FunctionPass {
+    public:
+      static char ID; // Pass identification, required by FunctionPass
+
+      OurFunctionPass(): FunctionPass(ID), callCounter(0) { }
+
+      bool runOnFunction(Function& F) override {
+        for (BasicBlock &B : F) {
+          for (Instruction &I: B) {
+            if (auto *CI = dyn_cast<CallInst>(&I)) {
+              // We know we've encountered a call instruction, so we
+              // need to determine if it's a call to the
+              // function pointed to by targetFunc or not.
+              if (CI->getCalledFunction() == targetFunc)
+                ++callCounter;
+            }
+          }
+        }
+        return false; // This pass only counts; it does not modify F.
+      }
+
+    private:
+      unsigned callCounter;
+  };
+
+  char OurFunctionPass::ID = 0;
+
+.. _calls_and_invokes:
+
+Treating calls and invokes the same way
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You may have noticed that the previous example was a bit oversimplified in that
+it did not deal with call sites generated by 'invoke' instructions.  In this,
+and in other situations, you may find that you want to treat ``CallInst``\ s and
+``InvokeInst``\ s the same way, even though their most-specific common base
+class is ``Instruction``, which includes lots of less closely-related things.
+For these cases, LLVM provides a handy wrapper class called ``CallSite``
+(`doxygen <http://llvm.org/doxygen/classllvm_1_1CallSite.html>`__).  It is
+essentially a wrapper around an ``Instruction`` pointer, with some methods that
+provide functionality common to ``CallInst``\ s and ``InvokeInst``\ s.
+
+This class has "value semantics": it should be passed by value, not by reference
+and it should not be dynamically allocated or deallocated using ``operator new``
+or ``operator delete``.  It is efficiently copyable, assignable and
+constructible, with costs equivalent to those of a bare pointer.  If you look at
+its definition, it has only a single pointer member.
+
+.. _iterate_chains:
+
+Iterating over def-use & use-def chains
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Frequently, we might have an instance of the ``Value`` class (`doxygen
+<http://llvm.org/doxygen/classllvm_1_1Value.html>`__) and we want to determine
+which ``User``\ s use the ``Value``.  The list of all ``User``\ s of a particular
+``Value`` is called a *def-use* chain.  For example, let's say we have a
+``Function*`` named ``F`` that points to a particular function ``foo``.  Finding
+all of the instructions that *use* ``foo`` is as simple as iterating over the
+*def-use* chain of ``F``:
+
+.. code-block:: c++
+
+  Function *F = ...;
+
+  for (User *U : F->users()) {
+    if (Instruction *Inst = dyn_cast<Instruction>(U)) {
+      errs() << "F is used in instruction:\n";
+      errs() << *Inst << "\n";
+    }
+  }
+
+Alternatively, it's common to have an instance of the ``User`` class (`doxygen
+<http://llvm.org/doxygen/classllvm_1_1User.html>`__) and need to know what
+``Value``\ s are used by it.  The list of all ``Value``\ s used by a ``User`` is
+known as a *use-def* chain.  Instances of class ``Instruction`` are common
+``User``\ s, so we might want to iterate over all of the values that a particular
+instruction uses (that is, the operands of the particular ``Instruction``):
+
+.. code-block:: c++
+
+  Instruction *pi = ...;
+
+  for (Use &U : pi->operands()) {
+    Value *v = U.get();
+    // ...
+  }
+
+Declaring objects as ``const`` is an important tool for enforcing mutation-free
+algorithms (such as analyses, etc.).  For this purpose, the above iterators come
+in constant flavors as ``Value::const_use_iterator`` and
+``Value::const_op_iterator``.  They automatically arise when calling
+``use/op_begin()`` on ``const Value*``\ s or ``const User*``\ s respectively.
+Upon dereferencing, they return ``const Use*``\ s.  Otherwise the above patterns
+remain unchanged.
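+
+The two chain directions can be pictured with a toy model in plain C++.  This is
+only a sketch under simplifying assumptions: ``ValueSketch`` and ``UserSketch``
+are hypothetical stand-ins rather than LLVM classes, and they keep plain vectors
+where LLVM links ``Use`` objects intrusively.
+

```cpp
#include <cassert>
#include <vector>

struct UserSketch;

// Hypothetical stand-in for llvm::Value: it remembers who uses it.
struct ValueSketch {
  std::vector<UserSketch *> Users; // def-use chain of this value
};

// Hypothetical stand-in for llvm::User: a value that also has operands.
struct UserSketch : ValueSketch {
  std::vector<ValueSketch *> Operands; // use-def chain of this user

  // Record the edge in both directions so either chain can be walked.
  void addOperand(ValueSketch *V) {
    Operands.push_back(V);
    V->Users.push_back(this);
  }
};
```
+
+Walking ``Users`` of a value corresponds to the *def-use* loop above, and walking
+``Operands`` of a user corresponds to the *use-def* loop.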
+
+.. _iterate_preds:
+
+Iterating over predecessors & successors of blocks
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Iterating over the predecessors and successors of a block is quite easy with the
+routines defined in ``"llvm/IR/CFG.h"``.  Just use code like this to
+iterate over all predecessors of ``BB``:
+
+.. code-block:: c++
+
+  #include "llvm/IR/CFG.h"
+  BasicBlock *BB = ...;
+
+  for (BasicBlock *Pred : predecessors(BB)) {
+    // ...
+  }
+
+Similarly, to iterate over successors use ``successors``.
+
+.. _simplechanges:
+
+Making simple changes
+---------------------
+
+There are some primitive transformation operations present in the LLVM
+infrastructure that are worth knowing about.  When performing transformations,
+it's fairly common to manipulate the contents of basic blocks.  This section
+describes some of the common methods for doing so and gives example code.
+
+.. _schanges_creating:
+
+Creating and inserting new ``Instruction``\ s
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+*Instantiating Instructions*
+
+Creation of ``Instruction``\ s is straight-forward: simply call the constructor
+for the kind of instruction to instantiate and provide the necessary parameters.
+For example, an ``AllocaInst`` only *requires* a (const-ptr-to) ``Type``.  Thus:
+
+.. code-block:: c++
+
+  auto *ai = new AllocaInst(Type::Int32Ty);
+
+will create an ``AllocaInst`` instance that represents the allocation of one
+integer in the current stack frame, at run time.  Each ``Instruction`` subclass
+is likely to have varying default parameters which change the semantics of the
+instruction, so refer to the `doxygen documentation for the subclass of
+Instruction <http://llvm.org/doxygen/classllvm_1_1Instruction.html>`_ that
+you're interested in instantiating.
+
+*Naming values*
+
+It is very useful to name the values of instructions when you're able to, as
+this facilitates the debugging of your transformations.  If you end up looking
+at generated LLVM machine code, you definitely want to have logical names
+associated with the results of instructions!  By supplying a value for the
+``Name`` (default) parameter of the ``Instruction`` constructor, you associate a
+logical name with the result of the instruction's execution at run time.  For
+example, say that I'm writing a transformation that dynamically allocates space
+for an integer on the stack, and that integer is going to be used as some kind
+of index by some other code.  To accomplish this, I place an ``AllocaInst`` at
+the first point in the first ``BasicBlock`` of some ``Function``, and I'm
+intending to use it within the same ``Function``.  I might do:
+
+.. code-block:: c++
+
+  auto *pa = new AllocaInst(Type::Int32Ty, 0, "indexLoc");
+
+where ``indexLoc`` is now the logical name of the instruction's execution value,
+which is a pointer to an integer on the run time stack.
+
+*Inserting instructions*
+
+There are essentially three ways to insert an ``Instruction`` into an existing
+sequence of instructions that form a ``BasicBlock``:
+
+* Insertion into an explicit instruction list
+
+  Given a ``BasicBlock* pb``, an ``Instruction* pi`` within that ``BasicBlock``,
+  and a newly-created instruction we wish to insert before ``*pi``, we do the
+  following:
+
+  .. code-block:: c++
+
+      BasicBlock *pb = ...;
+      Instruction *pi = ...;
+      auto *newInst = new Instruction(...);
+
+      pb->getInstList().insert(pi->getIterator(), newInst); // Inserts newInst before pi in pb
+
+  Appending to the end of a ``BasicBlock`` is so common that the ``Instruction``
+  class and ``Instruction``-derived classes provide constructors which take a
+  pointer to a ``BasicBlock`` to be appended to.  For example code that looked
+  like:
+
+  .. code-block:: c++
+
+    BasicBlock *pb = ...;
+    auto *newInst = new Instruction(...);
+
+    pb->getInstList().push_back(newInst); // Appends newInst to pb
+
+  becomes:
+
+  .. code-block:: c++
+
+    BasicBlock *pb = ...;
+    auto *newInst = new Instruction(..., pb);
+
+  which is much cleaner, especially if you are creating long instruction
+  streams.
+
+* Insertion into an implicit instruction list
+
+  ``Instruction`` instances that are already in ``BasicBlock``\ s are implicitly
+  associated with an existing instruction list: the instruction list of the
+  enclosing basic block.  Thus, we could have accomplished the same thing as the
+  above code without being given a ``BasicBlock`` by doing:
+
+  .. code-block:: c++
+
+    Instruction *pi = ...;
+    auto *newInst = new Instruction(...);
+
+    pi->getParent()->getInstList().insert(pi->getIterator(), newInst);
+
+  In fact, this sequence of steps occurs so frequently that the ``Instruction``
+  class and ``Instruction``-derived classes provide constructors which take (as
+  a default parameter) a pointer to an ``Instruction`` which the newly-created
+  ``Instruction`` should precede.  That is, ``Instruction`` constructors are
+  capable of inserting the newly-created instance into the ``BasicBlock`` of a
+  provided instruction, immediately before that instruction.  Using an
+  ``Instruction`` constructor with an ``insertBefore`` (default) parameter, the
+  above code becomes:
+
+  .. code-block:: c++
+
+    Instruction* pi = ...;
+    auto *newInst = new Instruction(..., pi);
+
+  which is much cleaner, especially if you're creating a lot of instructions and
+  adding them to ``BasicBlock``\ s.
+
+* Insertion using an instance of ``IRBuilder``
+
+  Inserting several ``Instruction``\ s can be quite laborious using the previous
+  methods. The ``IRBuilder`` is a convenience class that can be used to add
+  several instructions to the end of a ``BasicBlock`` or before a particular
+  ``Instruction``. It also supports constant folding and renaming named
+  registers (see ``IRBuilder``'s template arguments).
+
+  The example below demonstrates a very simple use of the ``IRBuilder`` where
+  three instructions are inserted before the instruction ``pi``. The first two
+  instructions are call instructions and the third instruction multiplies the
+  return values of the two calls.
+
+  .. code-block:: c++
+
+    Instruction *pi = ...;
+    IRBuilder<> Builder(pi);
+    CallInst* callOne = Builder.CreateCall(...);
+    CallInst* callTwo = Builder.CreateCall(...);
+    Value* result = Builder.CreateMul(callOne, callTwo);
+
+  The example below is similar to the above example except that the created
+  ``IRBuilder`` inserts instructions at the end of the ``BasicBlock`` ``pb``.
+
+  .. code-block:: c++
+
+    BasicBlock *pb = ...;
+    IRBuilder<> Builder(pb);
+    CallInst* callOne = Builder.CreateCall(...);
+    CallInst* callTwo = Builder.CreateCall(...);
+    Value* result = Builder.CreateMul(callOne, callTwo);
+
+  See :doc:`tutorial/LangImpl03` for a practical use of the ``IRBuilder``.
+
+
+.. _schanges_deleting:
+
+Deleting Instructions
+^^^^^^^^^^^^^^^^^^^^^
+
+Deleting an instruction from an existing sequence of instructions that form a
+BasicBlock_ is very straight-forward: just call the instruction's
+``eraseFromParent()`` method.  For example:
+
+.. code-block:: c++
+
+  Instruction *I = ...;
+  I->eraseFromParent();
+
+This unlinks the instruction from its containing basic block and deletes it.  If
+you'd just like to unlink the instruction from its containing basic block but
+not delete it, you can use the ``removeFromParent()`` method.
+
+.. _schanges_replacing:
+
+Replacing an Instruction with another Value
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Replacing individual instructions
+"""""""""""""""""""""""""""""""""
+
+Including "`llvm/Transforms/Utils/BasicBlockUtils.h
+<http://llvm.org/doxygen/BasicBlockUtils_8h_source.html>`_" permits use of two
+very useful replace functions: ``ReplaceInstWithValue`` and
+``ReplaceInstWithInst``.
+
+.. _schanges_deleting_sub:
+
+Deleting Instructions
+"""""""""""""""""""""
+
+* ``ReplaceInstWithValue``
+
+  This function replaces all uses of a given instruction with a value, and then
+  removes the original instruction.  The following example illustrates the
+  replacement of the result of a particular ``AllocaInst`` that allocates memory
+  for a single integer with a null pointer to an integer.
+
+  .. code-block:: c++
+
+    AllocaInst* instToReplace = ...;
+    BasicBlock::iterator ii(instToReplace);
+
+    ReplaceInstWithValue(instToReplace->getParent()->getInstList(), ii,
+                         Constant::getNullValue(PointerType::getUnqual(Type::Int32Ty)));
+
+* ``ReplaceInstWithInst``
+
+  This function replaces a particular instruction with another instruction,
+  inserting the new instruction into the basic block at the location where the
+  old instruction was, and replacing any uses of the old instruction with the
+  new instruction.  The following example illustrates the replacement of one
+  ``AllocaInst`` with another.
+
+  .. code-block:: c++
+
+    AllocaInst* instToReplace = ...;
+    BasicBlock::iterator ii(instToReplace);
+
+    ReplaceInstWithInst(instToReplace->getParent()->getInstList(), ii,
+                        new AllocaInst(Type::Int32Ty, 0, "ptrToReplacedInt"));
+
+
+Replacing multiple uses of Users and Values
+"""""""""""""""""""""""""""""""""""""""""""
+
+You can use ``Value::replaceAllUsesWith`` and ``User::replaceUsesOfWith`` to
+change more than one use at a time.  See the doxygen documentation for the
+`Value Class <http://llvm.org/doxygen/classllvm_1_1Value.html>`_ and `User Class
+<http://llvm.org/doxygen/classllvm_1_1User.html>`_, respectively, for more
+information.
+
+.. _schanges_deletingGV:
+
+Deleting GlobalVariables
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Deleting a global variable from a module is just as easy as deleting an
+Instruction.  First, you must have a pointer to the global variable that you
+wish to delete.  You use this pointer to erase it from its parent, the module.
+For example:
+
+.. code-block:: c++
+
+  GlobalVariable *GV = ...;
+
+  GV->eraseFromParent();
+
+
+.. _threading:
+
+Threads and LLVM
+================
+
+This section describes the interaction of the LLVM APIs with multithreading,
+both on the part of client applications, and in the JIT, in the hosted
+application.
+
+Note that LLVM's support for multithreading is still relatively young.  Up
+through version 2.5, the execution of threaded hosted applications was
+supported, but not threaded client access to the APIs.  While this use case is
+now supported, clients *must* adhere to the guidelines specified below to ensure
+proper operation in multithreaded mode.
+
+Note that, on Unix-like platforms, LLVM requires the presence of GCC's atomic
+intrinsics in order to support threaded operation.  If you need a
+multithreading-capable LLVM on a platform without a suitably modern system
+compiler, consider compiling LLVM and LLVM-GCC in single-threaded mode, and
+using the resultant compiler to build a copy of LLVM with multithreading
+support.
+
+.. _shutdown:
+
+Ending Execution with ``llvm_shutdown()``
+-----------------------------------------
+
+When you are done using the LLVM APIs, you should call ``llvm_shutdown()`` to
+deallocate memory used for internal structures.
+
+.. _managedstatic:
+
+Lazy Initialization with ``ManagedStatic``
+------------------------------------------
+
+``ManagedStatic`` is a utility class in LLVM used to implement static
+initialization of static resources, such as the global type tables.  In a
+single-threaded environment, it implements a simple lazy initialization scheme.
+When LLVM is compiled with support for multi-threading, however, it uses
+double-checked locking to implement thread-safe lazy initialization.
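+
+Double-checked locking itself can be sketched in standard C++ as follows.
+``LazyStatic`` is a hypothetical name chosen for illustration; this is not how
+``ManagedStatic`` is actually implemented, and the sketch assumes that ``T`` is
+default-constructible.
+

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical illustration only; this is not LLVM's ManagedStatic.
template <typename T> class LazyStatic {
  std::atomic<T *> Ptr{nullptr};
  std::mutex Lock;

public:
  LazyStatic() = default;
  LazyStatic(const LazyStatic &) = delete;
  LazyStatic &operator=(const LazyStatic &) = delete;
  ~LazyStatic() { delete Ptr.load(std::memory_order_relaxed); }

  T &get() {
    // First check: once the object exists, no lock is taken.
    T *P = Ptr.load(std::memory_order_acquire);
    if (!P) {
      std::lock_guard<std::mutex> Guard(Lock);
      // Second check: another thread may have constructed it meanwhile.
      P = Ptr.load(std::memory_order_relaxed);
      if (!P) {
        P = new T();
        Ptr.store(P, std::memory_order_release);
      }
    }
    return *P;
  }

  bool isConstructed() const {
    return Ptr.load(std::memory_order_acquire) != nullptr;
  }
};
```
+
+The object is only built on the first call to ``get()``; later calls pay for one
+atomic load and never touch the mutex.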
+
+.. _llvmcontext:
+
+Achieving Isolation with ``LLVMContext``
+----------------------------------------
+
+``LLVMContext`` is an opaque class in the LLVM API which clients can use to
+operate multiple, isolated instances of LLVM concurrently within the same
+address space.  For instance, in a hypothetical compile-server, the compilation
+of an individual translation unit is conceptually independent from all the
+others, and it would be desirable to be able to compile incoming translation
+units concurrently on independent server threads.  Fortunately, ``LLVMContext``
+exists to enable just this kind of scenario!
+
+Conceptually, ``LLVMContext`` provides isolation.  Every LLVM entity
+(``Module``\ s, ``Value``\ s, ``Type``\ s, ``Constant``\ s, etc.) in LLVM's
+in-memory IR belongs to an ``LLVMContext``.  Entities in different contexts
+*cannot* interact with each other: ``Module``\ s in different contexts cannot be
+linked together, ``Function``\ s cannot be added to ``Module``\ s in different
+contexts, etc.  What this means is that it is safe to compile on multiple
+threads simultaneously, as long as no two threads operate on entities within the
+same context.
+
+In practice, very few places in the API require the explicit specification of a
+``LLVMContext``, other than the ``Type`` creation/lookup APIs.  Because every
+``Type`` carries a reference to its owning context, most other entities can
+determine what context they belong to by looking at their own ``Type``.  If you
+are adding new entities to LLVM IR, please try to maintain this interface
+design.
+
+.. _jitthreading:
+
+Threads and the JIT
+-------------------
+
+LLVM's "eager" JIT compiler is safe to use in threaded programs.  Multiple
+threads can call ``ExecutionEngine::getPointerToFunction()`` or
+``ExecutionEngine::runFunction()`` concurrently, and multiple threads can run
+code output by the JIT concurrently.  The user must still ensure that only one
+thread accesses IR in a given ``LLVMContext`` while another thread might be
+modifying it.  One way to do that is to always hold the JIT lock while accessing
+IR outside the JIT (the JIT *modifies* the IR by adding ``CallbackVH``\ s).
+Another way is to only call ``getPointerToFunction()`` from the
+``LLVMContext``'s thread.
+
+When the JIT is configured to compile lazily (using
+``ExecutionEngine::DisableLazyCompilation(false)``), there is currently a `race
+condition <https://bugs.llvm.org/show_bug.cgi?id=5184>`_ in updating call sites
+after a function is lazily-jitted.  It's still possible to use the lazy JIT in a
+threaded program if you ensure that only one thread at a time can call any
+particular lazy stub and that the JIT lock guards any IR access, but we suggest
+using only the eager JIT in threaded programs.
+
+.. _advanced:
+
+Advanced Topics
+===============
+
+This section describes some of the advanced or obscure APIs that most clients
+do not need to be aware of.  These APIs tend to manage the inner workings of the
+LLVM system, and only need to be accessed in unusual circumstances.
+
+.. _SymbolTable:
+
+The ``ValueSymbolTable`` class
+------------------------------
+
+The ``ValueSymbolTable`` (`doxygen
+<http://llvm.org/doxygen/classllvm_1_1ValueSymbolTable.html>`__) class provides
+a symbol table that the :ref:`Function <c_Function>` and Module_ classes use for
+naming value definitions.  The symbol table can provide a name for any Value_.
+
+Note that the ``SymbolTable`` class should not be directly accessed by most
+clients.  It should only be used when iteration over the symbol table names
+themselves is required, which is very special purpose.  Note that not all LLVM
+Value_\ s have names, and those without names (i.e. they have an empty name) do
+not exist in the symbol table.
+
+Symbol tables support iteration over the values in the symbol table with
+``begin/end/iterator`` and support querying to see if a specific name is in the
+symbol table (with ``lookup``).  The ``ValueSymbolTable`` class exposes no
+public mutator methods; instead, simply call ``setName`` on a value, which will
+automatically insert it into the appropriate symbol table.
+
+.. _UserLayout:
+
+The ``User`` and owned ``Use`` classes' memory layout
+-----------------------------------------------------
+
+The ``User`` (`doxygen <http://llvm.org/doxygen/classllvm_1_1User.html>`__)
+class provides a basis for expressing the ownership of ``User`` towards other
+`Value instance <http://llvm.org/doxygen/classllvm_1_1Value.html>`_\ s.  The
+``Use`` (`doxygen <http://llvm.org/doxygen/classllvm_1_1Use.html>`__) helper
+class is employed to do the bookkeeping and to facilitate *O(1)* addition and
+removal.
+
+.. _Use2User:
+
+Interaction and relationship between ``User`` and ``Use`` objects
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A subclass of ``User`` can choose between incorporating its ``Use`` objects or
+referring to them out-of-line by means of a pointer.  A mixed variant (some
+``Use``\ s inline, others hung off) is impractical and breaks the invariant that
+the ``Use`` objects belonging to the same ``User`` form a contiguous array.
+
+We have 2 different layouts in the ``User`` (sub)classes:
+
+* Layout a)
+
+  The ``Use`` object(s) are inside (resp. at fixed offset) of the ``User``
+  object and there are a fixed number of them.
+
+* Layout b)
+
+  The ``Use`` object(s) are referenced by a pointer to an array from the
+  ``User`` object and there may be a variable number of them.
+
+As of v2.4 each layout still possesses a direct pointer to the start of the
+array of ``Use``\ s.  Though not mandatory for layout a), we stick to this
+redundancy for the sake of simplicity.  The ``User`` object also stores the
+number of ``Use`` objects it has. (Theoretically this information can also be
+calculated given the scheme presented below.)
+
+Special forms of allocation operators (``operator new``) enforce the following
+memory layouts:
+
+* Layout a) is modelled by prepending the ``User`` object by the ``Use[]``
+  array.
+
+  .. code-block:: none
+
+    ...---.---.---.---.-------...
+      | P | P | P | P | User
+    '''---'---'---'---'-------'''
+
+* Layout b) is modelled by pointing at the ``Use[]`` array.
+
+  .. code-block:: none
+
+    .-------...
+    | User
+    '-------'''
+        |
+        v
+        .---.---.---.---...
+        | P | P | P | P |
+        '---'---'---'---'''
+
+*(In the above figures* '``P``' *stands for the* ``Use**`` *that is stored in
+each* ``Use`` *object in the member* ``Use::Prev`` *)*
+
+.. _Waymarking:
+
+The waymarking algorithm
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Since the ``Use`` objects are deprived of the direct (back)pointer to their
+``User`` objects, there must be a fast and exact method to recover it.  This is
+accomplished by the following scheme:
+
+A bit-encoding in the 2 LSBits (least significant bits) of the ``Use::Prev``
+makes it possible to find the start of the ``User`` object:
+
+* ``00`` --- binary digit 0
+
+* ``01`` --- binary digit 1
+
+* ``10`` --- stop and calculate (``s``)
+
+* ``11`` --- full stop (``S``)
+
+Given a ``Use*``, all we have to do is to walk till we get a stop and we either
+have a ``User`` immediately behind or we have to walk to the next stop picking
+up digits and calculating the offset:
+
+.. code-block:: none
+
+  .---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.----------------
+  | 1 | s | 1 | 0 | 1 | 0 | s | 1 | 1 | 0 | s | 1 | 1 | s | 1 | S | User (or User*)
+  '---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'----------------
+      |+15                |+10            |+6         |+3     |+1
+      |                   |               |           |       | __>
+      |                   |               |           | __________>
+      |                   |               | ______________________>
+      |                   | ______________________________________>
+      | __________________________________________________________>
+
+Only the significant number of bits need to be stored between the stops, so that
+the *worst case is 20 memory accesses* when there are 1000 ``Use`` objects
+associated with a ``User``.
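+
+The walk can also be sketched in C++.  The sketch below models the tag sequence
+as a string of ``0``, ``1``, ``s`` and ``S`` characters, using the same
+construction as the Haskell reference implementation in the next section.  The
+function names are hypothetical and the code is illustrative, not LLVM's actual
+implementation.
+

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Binary representation of N (N >= 1), most significant digit first.
static std::string binaryDigits(std::size_t N) {
  std::string Digits;
  for (; N != 0; N /= 2)
    Digits.insert(Digits.begin(), static_cast<char>('0' + N % 2));
  return Digits;
}

// Hypothetical helper: start from the full stop "S" and, Steps times,
// prepend an 's' stop followed by the current length written in binary.
std::string makeWaymarks(unsigned Steps) {
  std::string Tags = "S";
  for (unsigned I = 0; I < Steps; ++I)
    Tags = "s" + binaryDigits(Tags.size()) + Tags;
  return Tags;
}

// Number of tags from position Pos up to and including the final 'S',
// computed by reading only a short prefix starting at Pos.
std::size_t distanceToUser(const std::string &Tags, std::size_t Pos) {
  std::size_t Walked = 0;
  // Walk forward until we hit a stop.
  while (Tags[Pos] != 's' && Tags[Pos] != 'S') {
    ++Pos;
    ++Walked;
  }
  if (Tags[Pos] == 'S')
    return Walked + 1; // Full stop: the User follows immediately.
  // Step over the 's', then pick up the digits that encode the remainder.
  ++Pos;
  ++Walked;
  std::size_t Offset = 0;
  while (Tags[Pos] == '0' || Tags[Pos] == '1') {
    Offset = Offset * 2 + static_cast<std::size_t>(Tags[Pos] - '0');
    ++Pos;
    ++Walked;
  }
  return Walked + Offset;
}
```
+
+For every position in a generated sequence, ``distanceToUser`` should return the
+number of tags remaining, which is the offset at which the ``User`` (or
+``User*``) would be found.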
+
+.. _ReferenceImpl:
+
+Reference implementation
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following literate Haskell fragment demonstrates the concept:
+
+.. code-block:: haskell
+
+  > import Test.QuickCheck
+  >
+  > digits :: Int -> [Char] -> [Char]
+  > digits 0 acc = '0' : acc
+  > digits 1 acc = '1' : acc
+  > digits n acc = digits (n `div` 2) $ digits (n `mod` 2) acc
+  >
+  > dist :: Int -> [Char] -> [Char]
+  > dist 0 [] = ['S']
+  > dist 0 acc = acc
+  > dist 1 acc = let r = dist 0 acc in 's' : digits (length r) r
+  > dist n acc = dist (n - 1) $ dist 1 acc
+  >
+  > takeLast n ss = reverse $ take n $ reverse ss
+  >
+  > test = takeLast 40 $ dist 20 []
+  >
+
+Printing <test> gives: ``"1s100000s11010s10100s1111s1010s110s11s1S"``
+
+The reverse algorithm computes the length of the string just by examining a
+certain prefix:
+
+.. code-block:: haskell
+
+  > pref :: [Char] -> Int
+  > pref "S" = 1
+  > pref ('s':'1':rest) = decode 2 1 rest
+  > pref (_:rest) = 1 + pref rest
+  >
+  > decode walk acc ('0':rest) = decode (walk + 1) (acc * 2) rest
+  > decode walk acc ('1':rest) = decode (walk + 1) (acc * 2 + 1) rest
+  > decode walk acc _ = walk + acc
+  >
+
+Now, as expected, printing <pref test> gives ``40``.
+
+We can *quickCheck* this with the following property:
+
+.. code-block:: haskell
+
+  > testcase = dist 2000 []
+  > testcaseLength = length testcase
+  >
+  > identityProp n = n > 0 && n <= testcaseLength ==> length arr == pref arr
+  >     where arr = takeLast n testcase
+  >
+
+As expected <quickCheck identityProp> gives:
+
+::
+
+  *Main> quickCheck identityProp
+  OK, passed 100 tests.
+
+Let's be a bit more exhaustive:
+
+.. code-block:: haskell
+
+  >
+  > deepCheck p = check (defaultConfig { configMaxTest = 500 }) p
+  >
+
+And here is the result of <deepCheck identityProp>:
+
+::
+
+  *Main> deepCheck identityProp
+  OK, passed 500 tests.
+
+.. _Tagging:
+
+Tagging considerations
+^^^^^^^^^^^^^^^^^^^^^^
+
+To maintain the invariant that the 2 LSBits of each ``Use**`` in ``Use`` never
+change after being set up, setters of ``Use::Prev`` must re-tag the new
+``Use**`` on every modification.  Accordingly getters must strip the tag bits.
+
+For layout b) instead of the ``User`` we find a pointer (``User*`` with LSBit
+set).  Following this pointer brings us to the ``User``.  A portable trick
+ensures that the first bytes of ``User`` (if interpreted as a pointer) never have
+the LSBit set. (Portability relies on the fact that all known compilers
+place the ``vptr`` in the first word of the instances.)
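+
+The re-tagging setter and tag-stripping getter can be sketched as below.
+``TaggedSlot`` is a hypothetical class and ``int **`` merely stands in for
+``Use**``; the sketch assumes pointers are at least 4-byte aligned so that the
+two low bits are free.
+

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the Prev field of a Use; not LLVM's class.
class TaggedSlot {
  std::uintptr_t Bits;
  static constexpr std::uintptr_t TagMask = 3;

public:
  // The 2-bit tag is chosen once, at construction.
  explicit TaggedSlot(unsigned Tag) : Bits(Tag & TagMask) {}

  unsigned getTag() const { return static_cast<unsigned>(Bits & TagMask); }

  // Getter: strip the tag bits before handing the pointer out.
  int **getPrev() const { return reinterpret_cast<int **>(Bits & ~TagMask); }

  // Setter: store the new pointer and re-apply the existing tag.
  void setPrev(int **NewPrev) {
    std::uintptr_t Raw = reinterpret_cast<std::uintptr_t>(NewPrev);
    assert((Raw & TagMask) == 0 && "pointer must leave the low 2 bits free");
    Bits = Raw | (Bits & TagMask);
  }
};
```
+
+However often the pointer is replaced, the tag observed through ``getTag()``
+stays the one chosen at construction, which is the invariant described above.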
+
+.. _polymorphism:
+
+Designing Type Hierarchies and Polymorphic Interfaces
+-----------------------------------------------------
+
+There are two different design patterns that tend to result in the use of
+virtual dispatch for methods in a type hierarchy in C++ programs. The first is
+a genuine type hierarchy where different types in the hierarchy model
+a specific subset of the functionality and semantics, and these types nest
+strictly within each other. Good examples of this can be seen in the ``Value``
+or ``Type`` type hierarchies.
+
+A second is the desire to dispatch dynamically across a collection of
+polymorphic interface implementations. This latter use case can be modeled with
+virtual dispatch and inheritance by defining an abstract interface base class
+which all implementations derive from and override. However, this
+implementation strategy forces an **"is-a"** relationship to exist that is not
+actually meaningful. There is often not some nested hierarchy of useful
+generalizations which code might interact with and move up and down. Instead,
+there is a singular interface which is dispatched across a range of
+implementations.
+
+The preferred implementation strategy for the second use case is that of
+generic programming (sometimes called "compile-time duck typing" or "static
+polymorphism"). For example, a template over some type parameter ``T`` can be
+instantiated across any particular implementation that conforms to the
+interface or *concept*. A good example here is the highly generic properties of
+any type which models a node in a directed graph. LLVM models these primarily
+through templates and generic programming. Such templates include the
+``LoopInfoBase`` and ``DominatorTreeBase``. When this type of polymorphism
+truly needs **dynamic** dispatch you can generalize it using a technique
+called *concept-based polymorphism*. This pattern emulates the interfaces and
+behaviors of templates using a very limited form of virtual dispatch for type
+erasure inside its implementation. You can find examples of this technique in
+the ``PassManager.h`` system, and there is a more detailed introduction to it
+by Sean Parent in several of his talks and papers:
+
+#. `Inheritance Is The Base Class of Evil
+   <http://channel9.msdn.com/Events/GoingNative/2013/Inheritance-Is-The-Base-Class-of-Evil>`_
+   - The GoingNative 2013 talk describing this technique, and probably the best
+   place to start.
+#. `Value Semantics and Concepts-based Polymorphism
+   <http://www.youtube.com/watch?v=_BpMYeUFXv8>`_ - The C++Now! 2012 talk
+   describing this technique in more detail.
+#. `Sean Parent's Papers and Presentations
+   <http://github.com/sean-parent/sean-parent.github.com/wiki/Papers-and-Presentations>`_
+   - A Github project full of links to slides, video, and sometimes code.
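+
+As a rough illustration of concept-based polymorphism, the sketch below erases
+two unrelated types behind one wrapper.  All names (``AnyPass``,
+``LoopPrinter``, ``DeadCodePass``) are hypothetical and chosen for this example;
+this is not the code in ``PassManager.h``.
+

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Two unrelated types that both happen to provide name(); no common base.
struct LoopPrinter {
  std::string name() const { return "loop-printer"; }
};
struct DeadCodePass {
  std::string name() const { return "dead-code"; }
};

// Hypothetical type-erased wrapper: virtual dispatch is an implementation
// detail hidden inside, so clients never derive from an interface class.
class AnyPass {
  struct Concept {
    virtual ~Concept() = default;
    virtual std::string name() const = 0;
  };
  template <typename T> struct Model final : Concept {
    explicit Model(T Value) : Impl(std::move(Value)) {}
    std::string name() const override { return Impl.name(); }
    T Impl;
  };
  std::unique_ptr<Concept> Self;

public:
  // Accept any type that models the "has name()" concept.
  template <typename T>
  AnyPass(T Value) : Self(new Model<T>(std::move(Value))) {}

  std::string name() const { return Self->name(); }
};

// Dispatch dynamically across a heterogeneous collection.
std::vector<std::string> collectNames(const std::vector<AnyPass> &Passes) {
  std::vector<std::string> Names;
  for (const AnyPass &P : Passes)
    Names.push_back(P.name());
  return Names;
}
```
+
+The implementations stay plain value types; only the wrapper knows about the
+virtual call, so no artificial **"is-a"** relationship is imposed on them.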
+
+When deciding between creating a type hierarchy (with either tagged or virtual
+dispatch) and using templates or concepts-based polymorphism, consider whether
+there is some refinement of an abstract base class which is a semantically
+meaningful type on an interface boundary. If anything more refined than the
+root abstract interface is meaningless to talk about as a partial extension of
+the semantic model, then your use case likely fits better with polymorphism and
+you should avoid using virtual dispatch. However, there may be some exigent
+circumstances that require one technique or the other to be used.
+
+If you do need to introduce a type hierarchy, we prefer to use explicitly
+closed type hierarchies with manual tagged dispatch and/or RTTI rather than the
+open inheritance model and virtual dispatch that is more common in C++ code.
+This is because LLVM rarely encourages library consumers to extend its core
+types, and leverages the closed and tag-dispatched nature of its hierarchies to
+generate significantly more efficient code. We have also found that a large
+amount of our usage of type hierarchies fits better with tag-based pattern
+matching rather than dynamic dispatch across a common interface. Within LLVM we
+have built custom helpers to facilitate this design. See this document's
+section on :ref:`isa and dyn_cast <isa>` and our :doc:`detailed document
+<HowToSetUpLLVMStyleRTTI>` which describes how you can implement this
+pattern for use with the LLVM helpers.
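+
+A closed, tag-dispatched hierarchy might look like the following sketch.  It
+uses only standard C++: ``dynCastSketch`` is a hand-rolled, hypothetical
+stand-in for LLVM's ``dyn_cast`` helper, and the ``Shape`` classes are
+illustrative.
+

```cpp
#include <cassert>

class Shape {
public:
  // The set of kinds is closed: adding a subclass means adding a tag here.
  enum ShapeKind { SK_Square, SK_Circle };
  ShapeKind getKind() const { return Kind; }

protected:
  explicit Shape(ShapeKind K) : Kind(K) {}

private:
  const ShapeKind Kind;
};

class Square : public Shape {
  double Side;

public:
  explicit Square(double S) : Shape(SK_Square), Side(S) {}
  double area() const { return Side * Side; }
  static bool classof(const Shape *S) { return S->getKind() == SK_Square; }
};

class Circle : public Shape {
  double Radius;

public:
  explicit Circle(double R) : Shape(SK_Circle), Radius(R) {}
  double area() const { return 3.14159265358979 * Radius * Radius; }
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
};

// Hypothetical stand-in for dyn_cast: check the tag, then static_cast.
template <typename To> const To *dynCastSketch(const Shape *S) {
  return To::classof(S) ? static_cast<const To *>(S) : nullptr;
}

// Tag-based dispatch: a switch over the kind instead of a virtual call.
double areaOf(const Shape &S) {
  switch (S.getKind()) {
  case Shape::SK_Square:
    return static_cast<const Square &>(S).area();
  case Shape::SK_Circle:
    return static_cast<const Circle &>(S).area();
  }
  return 0.0;
}
```
+
+No class here has a virtual function, so objects carry no ``vptr``, and the
+``switch`` in ``areaOf`` is the kind of tag-based pattern matching the text
+refers to.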
+
+.. _abi_breaking_checks:
+
+ABI Breaking Checks
+-------------------
+
+Checks and asserts that alter the LLVM C++ ABI are predicated on the
+preprocessor symbol `LLVM_ENABLE_ABI_BREAKING_CHECKS` -- LLVM
+libraries built with `LLVM_ENABLE_ABI_BREAKING_CHECKS` are not ABI
+compatible with LLVM libraries built without it defined.  By default,
+turning on assertions also turns on `LLVM_ENABLE_ABI_BREAKING_CHECKS`
+so a default +Asserts build is not ABI compatible with a
+default -Asserts build.  Clients that want ABI compatibility
+between +Asserts and -Asserts builds should use the CMake build system
+to set `LLVM_ENABLE_ABI_BREAKING_CHECKS` independently
+of `LLVM_ENABLE_ASSERTIONS`.
+
+.. _coreclasses:
+
+The Core LLVM Class Hierarchy Reference
+=======================================
+
+``#include "llvm/IR/Type.h"``
+
+header source: `Type.h <http://llvm.org/doxygen/Type_8h_source.html>`_
+
+doxygen info: `Type Classes <http://llvm.org/doxygen/classllvm_1_1Type.html>`_
+
+The Core LLVM classes are the primary means of representing the program being
+inspected or transformed.  The core LLVM classes are defined in header files in
+the ``include/llvm/IR`` directory, and implemented in the ``lib/IR``
+directory. It's worth noting that, for historical reasons, this library is
+called ``libLLVMCore.so``, not ``libLLVMIR.so`` as you might expect.
+
+.. _Type:
+
+The Type class and Derived Types
+--------------------------------
+
+``Type`` is a superclass of all type classes.  Every ``Value`` has a ``Type``.
+``Type`` cannot be instantiated directly but only through its subclasses.
+Certain primitive types (``VoidType``, ``LabelType``, ``FloatType`` and
+``DoubleType``) have hidden subclasses.  They are hidden because they offer no
+useful functionality beyond what the ``Type`` class offers except to distinguish
+themselves from other subclasses of ``Type``.
+
+All other types are subclasses of ``DerivedType``.  Types can be named, but this
+is not a requirement.  There exists exactly one instance of a given shape at any
+one time.  This allows type equality to be performed with address equality of
+the Type Instance.  That is, given two ``Type*`` values, the types are identical
+if the pointers are identical.
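+
+The uniquing idea can be sketched with a small interning table.
+``ContextSketch`` and ``IntTypeSketch`` are hypothetical names for this
+example; the real type tables are more elaborate.
+

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical miniature of an integer type; only the context may create one.
class IntTypeSketch {
  unsigned Bits;
  explicit IntTypeSketch(unsigned B) : Bits(B) {}
  friend class ContextSketch;

public:
  unsigned getBitWidth() const { return Bits; }
};

// Hypothetical owner that hands out exactly one instance per bit width.
class ContextSketch {
  std::map<unsigned, std::unique_ptr<IntTypeSketch>> IntTypes;

public:
  const IntTypeSketch *getIntType(unsigned Bits) {
    std::unique_ptr<IntTypeSketch> &Slot = IntTypes[Bits];
    if (!Slot)
      Slot.reset(new IntTypeSketch(Bits)); // First request: create and cache.
    return Slot.get();                     // Later requests: same pointer.
  }
};
```
+
+Because every request for the same shape returns the same object, comparing two
+pointers is enough to decide whether two types are identical.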
+
+.. _m_Type:
+
+Important Public Methods
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+* ``bool isIntegerTy() const``: Returns true for any integer type.
+
+* ``bool isFloatingPointTy()``: Return true if this is one of the six
+  floating point types.
+
+* ``bool isSized()``: Return true if the type has known size.  Things
+  that don't have a size are abstract types, labels and void.
+
+.. _derivedtypes:
+
+Important Derived Types
+^^^^^^^^^^^^^^^^^^^^^^^
+
+``IntegerType``
+  Subclass of DerivedType that represents integer types of any bit width.  Any
+  bit width between ``IntegerType::MIN_INT_BITS`` (1) and
+  ``IntegerType::MAX_INT_BITS`` (~8 million) can be represented.
+
+  * ``static const IntegerType* get(unsigned NumBits)``: get an integer
+    type of a specific bit width.
+
+  * ``unsigned getBitWidth() const``: Get the bit width of an integer type.
+
+``SequentialType``
+  This is subclassed by ArrayType and VectorType.
+
+  * ``const Type * getElementType() const``: Returns the type of each
+    of the elements in the sequential type.
+
+  * ``uint64_t getNumElements() const``: Returns the number of elements
+    in the sequential type.
+
+``ArrayType``
+  This is a subclass of SequentialType and defines the interface for array
+  types.
+
+``PointerType``
+  Subclass of Type for pointer types.
+
+``VectorType``
+  Subclass of SequentialType for vector types.  A vector type is similar to an
+  ArrayType but is distinguished because it is a first class type whereas
+  ArrayType is not.  Vector types are used for vector operations and are usually
+  small vectors of an integer or floating point type.
+
+``StructType``
+  Subclass of DerivedType for struct types.
+
+.. _FunctionType:
+
+``FunctionType``
+  Subclass of DerivedType for function types.
+
+  * ``bool isVarArg() const``: Returns true if it's a vararg function.
+
+  * ``const Type * getReturnType() const``: Returns the return type of the
+    function.
+
+  * ``const Type * getParamType (unsigned i)``: Returns the type of the ith
+    parameter.
+
+  * ``const unsigned getNumParams() const``: Returns the number of formal
+    parameters.
+
+.. _Module:
+
+The ``Module`` class
+--------------------
+
+``#include "llvm/IR/Module.h"``
+
+header source: `Module.h <http://llvm.org/doxygen/Module_8h_source.html>`_
+
+doxygen info: `Module Class <http://llvm.org/doxygen/classllvm_1_1Module.html>`_
+
+The ``Module`` class represents the top level structure present in LLVM
+programs.  An LLVM module is effectively either a translation unit of the
+original program or a combination of several translation units merged by the
+linker.  The ``Module`` class keeps track of a list of :ref:`Function
+<c_Function>`\ s, a list of GlobalVariable_\ s, and a SymbolTable_.
+Additionally, it contains a few helpful member functions that try to make common
+operations easy.
+
+.. _m_Module:
+
+Important Public Members of the ``Module`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* ``Module::Module(StringRef ModuleID, LLVMContext &C)``
+
+  Constructing a Module_ is easy.  You provide an identifier for it (probably
+  based on the name of the translation unit) and the ``LLVMContext`` that will
+  own its types and constants.
+
+* | ``Module::iterator`` - Typedef for function list iterator
+  | ``Module::const_iterator`` - Typedef for const_iterator.
+  | ``begin()``, ``end()``, ``size()``, ``empty()``
+
+  These are forwarding methods that make it easy to access the contents of a
+  ``Module`` object's :ref:`Function <c_Function>` list.
+
+* ``Module::FunctionListType &getFunctionList()``
+
+  Returns the list of :ref:`Function <c_Function>`\ s.  This is necessary to use
+  when you need to update the list or perform a complex action that doesn't have
+  a forwarding method.
+
+----------------
+
+* | ``Module::global_iterator`` - Typedef for global variable list iterator
+  | ``Module::const_global_iterator`` - Typedef for const_iterator.
+  | ``global_begin()``, ``global_end()``, ``global_size()``, ``global_empty()``
+
+  These are forwarding methods that make it easy to access the contents of a
+  ``Module`` object's GlobalVariable_ list.
+
+* ``Module::GlobalListType &getGlobalList()``
+
+  Returns the list of GlobalVariable_\ s.  This is necessary to use when you
+  need to update the list or perform a complex action that doesn't have a
+  forwarding method.
+
+----------------
+
+* ``SymbolTable *getSymbolTable()``
+
+  Return a pointer to the SymbolTable_ for this ``Module``.
+
+----------------
+
+* ``Function *getFunction(StringRef Name) const``
+
+  Look up the specified function in the ``Module`` SymbolTable_.  If it does not
+  exist, return ``null``.
+
+* ``FunctionCallee getOrInsertFunction(const std::string &Name,
+  const FunctionType *T)``
+
+  Look up the specified function in the ``Module`` SymbolTable_.  If
+  it does not exist, add an external declaration for the function and
+  return it. Note that the function signature already present may not
+  match the requested signature. Thus, in order to enable the common
+  usage of passing the result directly to EmitCall, the return type is
+  a struct of ``{FunctionType *T, Constant *FunctionPtr}``, rather
+  than simply the ``Function*`` with potentially an unexpected
+  signature.
+
+* ``std::string getTypeName(const Type *Ty)``
+
+  If there is at least one entry in the SymbolTable_ for the specified Type_,
+  return it.  Otherwise return the empty string.
+
+* ``bool addTypeName(const std::string &Name, const Type *Ty)``
+
+  Insert an entry in the SymbolTable_ mapping ``Name`` to ``Ty``.  If there is
+  already an entry for this name, true is returned and the SymbolTable_ is not
+  modified.
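+
+A minimal sketch of the lookup and iteration members described above, assuming
+the LLVM 9 era headers (where ``getOrInsertFunction`` accepts a ``StringRef``
+and a ``FunctionType *``).  The helper name ``declareAndCountDeclarations`` is
+hypothetical:
+
+.. code-block:: c++
+
+  #include "llvm/ADT/StringRef.h"
+  #include "llvm/IR/DerivedTypes.h"
+  #include "llvm/IR/Function.h"
+  #include "llvm/IR/LLVMContext.h"
+  #include "llvm/IR/Module.h"
+  #include "llvm/IR/Type.h"
+
+  // Hypothetical helper: declares "void <Name>()" in M if it is not already
+  // present, then counts the functions in M that have no body.
+  unsigned declareAndCountDeclarations(llvm::Module &M, llvm::StringRef Name) {
+    llvm::FunctionType *FT = llvm::FunctionType::get(
+        llvm::Type::getVoidTy(M.getContext()), /*isVarArg=*/false);
+    M.getOrInsertFunction(Name, FT); // no-op if Name is already declared
+
+    unsigned NumDecls = 0;
+    for (const llvm::Function &F : M) // forwards to Module::begin()/end()
+      if (F.isDeclaration())
+        ++NumDecls;
+    return NumDecls;
+  }
+
+Calling the helper twice with the same name should not add a second function,
+and ``getFunction()`` should return a null pointer for names that were never
+inserted.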
+
+.. _Value:
+
+The ``Value`` class
+-------------------
+
+``#include "llvm/IR/Value.h"``
+
+header source: `Value.h <http://llvm.org/doxygen/Value_8h_source.html>`_
+
+doxygen info: `Value Class <http://llvm.org/doxygen/classllvm_1_1Value.html>`_
+
+The ``Value`` class is the most important class in the LLVM source base.  It
+represents a typed value that may be used (among other things) as an operand to
+an instruction.  There are many different types of ``Value``\ s, such as
+Constant_\ s and Argument_\ s.  Even Instruction_\ s and :ref:`Function
+<c_Function>`\ s are ``Value``\ s.
+
+A particular ``Value`` may be used many times in the LLVM representation for a
+program.  For example, an incoming argument to a function (represented with an
+instance of the Argument_ class) is "used" by every instruction in the function
+that references the argument.  To keep track of this relationship, the ``Value``
+class keeps a list of all of the ``User``\ s that are using it (the User_ class
+is a base class for all nodes in the LLVM graph that can refer to ``Value``\ s).
+This use list is how LLVM represents def-use information in the program, and is
+accessible through the ``use_*`` methods, shown below.
+
+Because LLVM is a typed representation, every LLVM ``Value`` is typed, and this
+Type_ is available through the ``getType()`` method.  In addition, all LLVM
+values can be named.  The "name" of the ``Value`` is a symbolic string printed
+in the LLVM code:
+
+.. code-block:: llvm
+
+  %foo = add i32 1, 2
+
+.. _nameWarning:
+
+The name of this instruction is "foo". **NOTE** that the name of any value may
+be missing (an empty string), so names should **ONLY** be used for debugging
+(making the source code easier to read, debugging printouts).  They should not
+be used to keep track of values or map between them.  For this purpose, use a
+``std::map`` of pointers to the ``Value`` itself instead.
+
+One important aspect of LLVM is that there is no distinction between an SSA
+variable and the operation that produces it.  Because of this, any reference to
+the value produced by an instruction (or the value available as an incoming
+argument, for example) is represented as a direct pointer to the instance of the
+class that represents this value.  Although this may take some getting used to,
+it simplifies the representation and makes it easier to manipulate.
+
+.. _m_Value:
+
+Important Public Members of the ``Value`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* | ``Value::use_iterator`` - Typedef for iterator over the use-list
+  | ``Value::const_use_iterator`` - Typedef for const_iterator over the
+    use-list
+  | ``unsigned use_size()`` - Returns the number of users of the value.
+  | ``bool use_empty()`` - Returns true if there are no users.
+  | ``use_iterator use_begin()`` - Get an iterator to the start of the
+    use-list.
+  | ``use_iterator use_end()`` - Get an iterator to the end of the use-list.
+  | ``User *use_back()`` - Returns the last element in the list.
+
+  These methods are the interface to access the def-use information in LLVM.
+  As with all other iterators in LLVM, the naming conventions follow the
+  conventions defined by the STL_.
+
+* ``Type *getType() const``
+
+  This method returns the Type of the Value.
+
+* | ``bool hasName() const``
+  | ``std::string getName() const``
+  | ``void setName(const std::string &Name)``
+
+  This family of methods is used to access and assign a name to a ``Value``.
+  Be aware of the :ref:`precaution above <nameWarning>`.
+
+* ``void replaceAllUsesWith(Value *V)``
+
+  This method traverses the use list of a ``Value`` changing all User_\ s of the
+  current value to refer to "``V``" instead.  For example, if you detect that an
+  instruction always produces a constant value (for example through constant
+  folding), you can replace all uses of the instruction with the constant like
+  this:
+
+  .. code-block:: c++
+
+    Inst->replaceAllUsesWith(ConstVal);
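+
+To make the use-list behaviour concrete, the sketch below builds a tiny
+function, counts the uses of one instruction, and then redirects them to a
+constant.  It assumes the LLVM 9 era ``IRBuilder`` interface and uses
+``getNumUses()``, which is not listed above.  The function name
+``demoReplaceAllUses`` and the IR it builds are purely illustrative:
+
+.. code-block:: c++
+
+  #include "llvm/IR/BasicBlock.h"
+  #include "llvm/IR/Constants.h"
+  #include "llvm/IR/DerivedTypes.h"
+  #include "llvm/IR/Function.h"
+  #include "llvm/IR/IRBuilder.h"
+  #include "llvm/IR/LLVMContext.h"
+  #include "llvm/IR/Module.h"
+
+  // Hypothetical demo.  Builds
+  //   define i32 @demo(i32 %x) {
+  //     %sum = add i32 %x, %x
+  //     %twice = mul i32 %sum, %sum
+  //     ret i32 %twice
+  //   }
+  // replaces every use of %sum with the constant 7, and returns the number of
+  // uses %sum had before the replacement.
+  unsigned demoReplaceAllUses(llvm::Module &M) {
+    llvm::LLVMContext &Ctx = M.getContext();
+    llvm::Type *Int32Ty = llvm::Type::getInt32Ty(Ctx);
+    llvm::Type *Params[] = {Int32Ty};
+    llvm::FunctionType *FT =
+        llvm::FunctionType::get(Int32Ty, Params, /*isVarArg=*/false);
+    llvm::Function *F = llvm::Function::Create(
+        FT, llvm::Function::ExternalLinkage, "demo", &M);
+    llvm::IRBuilder<> Builder(llvm::BasicBlock::Create(Ctx, "entry", F));
+
+    llvm::Value *X = &*F->arg_begin();
+    llvm::Value *Sum = Builder.CreateAdd(X, X, "sum");
+    llvm::Value *Twice = Builder.CreateMul(Sum, Sum, "twice");
+    Builder.CreateRet(Twice);
+
+    // Each operand slot counts as one use, so %sum has two uses here even
+    // though it has a single user (%twice).
+    unsigned UsesBefore = Sum->getNumUses();
+    Sum->replaceAllUsesWith(llvm::ConstantInt::get(Int32Ty, 7));
+    return UsesBefore;
+  }
+
+After the call, ``%sum`` has an empty use-list and could be erased; the ``mul``
+now reads the constant in both operand slots.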
+
+.. _User:
+
+The ``User`` class
+------------------
+
+``#include "llvm/IR/User.h"``
+
+header source: `User.h <http://llvm.org/doxygen/User_8h_source.html>`_
+
+doxygen info: `User Class <http://llvm.org/doxygen/classllvm_1_1User.html>`_
+
+Superclass: Value_
+
+The ``User`` class is the common base class of all LLVM nodes that may refer to
+``Value``\ s.  It exposes a list of "Operands" that are all of the ``Value``\ s
+that the User is referring to.  The ``User`` class itself is a subclass of
+``Value``.
+
+The operands of a ``User`` point directly to the LLVM ``Value`` that it refers
+to.  Because LLVM uses Static Single Assignment (SSA) form, there can only be
+one definition referred to, allowing this direct connection.  This connection
+provides the use-def information in LLVM.
+
+.. _m_User:
+
+Important Public Members of the ``User`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``User`` class exposes the operand list in two ways: through an index access
+interface and through an iterator based interface.
+
+* | ``Value *getOperand(unsigned i)``
+  | ``unsigned getNumOperands()``
+
+  These two methods expose the operands of the ``User`` in a convenient form for
+  direct access.
+
+* | ``User::op_iterator`` - Typedef for iterator over the operand list
+  | ``op_iterator op_begin()`` - Get an iterator to the start of the operand
+    list.
+  | ``op_iterator op_end()`` - Get an iterator to the end of the operand list.
+
+  Together, these methods make up the iterator based interface to the operands
+  of a ``User``.
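+
+The two access styles can be compared side by side.  The following sketch
+counts the constant operands of a ``User`` once through each interface; the
+helper and driver names are hypothetical, and the driver relies on the
+assumption that a non-zero struct constant such as ``{ i32 1, i32 2 }`` is
+itself a ``User`` whose operands are its fields:
+
+.. code-block:: c++
+
+  #include "llvm/IR/Constants.h"
+  #include "llvm/IR/LLVMContext.h"
+  #include "llvm/IR/Type.h"
+  #include "llvm/IR/User.h"
+  #include "llvm/Support/Casting.h"
+
+  // Hypothetical helper: counts the constant operands of U through the
+  // index-based interface.
+  unsigned countConstantOperands(const llvm::User &U) {
+    unsigned Count = 0;
+    for (unsigned I = 0, E = U.getNumOperands(); I != E; ++I)
+      if (llvm::isa<llvm::Constant>(U.getOperand(I)))
+        ++Count;
+    return Count;
+  }
+
+  // The same walk through the iterator-based interface.
+  unsigned countConstantOperandsIter(const llvm::User &U) {
+    unsigned Count = 0;
+    for (auto It = U.op_begin(), End = U.op_end(); It != End; ++It)
+      if (llvm::isa<llvm::Constant>(It->get()))
+        ++Count;
+    return Count;
+  }
+
+  // Hypothetical driver: checks both helpers against { i32 1, i32 2 }.
+  bool demoOperandWalk(llvm::LLVMContext &Ctx) {
+    llvm::Type *Int32Ty = llvm::Type::getInt32Ty(Ctx);
+    llvm::Constant *Fields[] = {llvm::ConstantInt::get(Int32Ty, 1),
+                                llvm::ConstantInt::get(Int32Ty, 2)};
+    llvm::Constant *S = llvm::ConstantStruct::getAnon(Fields);
+    return countConstantOperands(*S) == 2 &&
+           countConstantOperandsIter(*S) == 2;
+  }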
+
+
+.. _Instruction:
+
+The ``Instruction`` class
+-------------------------
+
+``#include "llvm/IR/Instruction.h"``
+
+header source: `Instruction.h
+<http://llvm.org/doxygen/Instruction_8h_source.html>`_
+
+doxygen info: `Instruction Class
+<http://llvm.org/doxygen/classllvm_1_1Instruction.html>`_
+
+Superclasses: User_, Value_
+
+The ``Instruction`` class is the common base class for all LLVM instructions.
+It provides only a few methods, but is a very commonly used class.  The primary
+data tracked by the ``Instruction`` class itself is the opcode (instruction
+type) and the parent BasicBlock_ the ``Instruction`` is embedded into.  To
+represent a specific type of instruction, one of many subclasses of
+``Instruction`` is used.
+
+Because the ``Instruction`` class subclasses the User_ class, its operands can
+be accessed in the same way as for other ``User``\ s (with the
+``getOperand()``/``getNumOperands()`` and ``op_begin()``/``op_end()`` methods).
+An important file for the ``Instruction`` class is the
+``llvm/IR/Instruction.def`` file.  This file contains metadata about the
+various types of instructions in LLVM.  It describes the enum values that are
+used as opcodes
+(for example ``Instruction::Add`` and ``Instruction::ICmp``), as well as the
+concrete sub-classes of ``Instruction`` that implement the instruction (for
+example BinaryOperator_ and CmpInst_).  Unfortunately, the use of macros in this
+file confuses doxygen, so these enum values don't show up correctly in the
+`doxygen output <http://llvm.org/doxygen/classllvm_1_1Instruction.html>`_.
+
+.. _s_Instruction:
+
+Important Subclasses of the ``Instruction`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. _BinaryOperator:
+
+* ``BinaryOperator``
+
+  This subclass represents all two-operand instructions whose operands must be
+  the same type, except for the comparison instructions.
+
+.. _CastInst:
+
+* ``CastInst``
+
+  This subclass is the parent of all the casting instructions.  It provides
+  common operations on cast instructions.
+
+.. _CmpInst:
+
+* ``CmpInst``
+
+  This subclass represents the two comparison instructions,
+  `ICmpInst <LangRef.html#i_icmp>`_ (integer operands), and
+  `FCmpInst <LangRef.html#i_fcmp>`_ (floating point operands).
+
+.. _m_Instruction:
+
+Important Public Members of the ``Instruction`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* ``BasicBlock *getParent()``
+
+  Returns the BasicBlock_ that this
+  ``Instruction`` is embedded into.
+
+* ``bool mayWriteToMemory()``
+
+  Returns true if the instruction may write to memory, e.g. it is a ``call``,
+  ``invoke``, or ``store``.
+
+* ``unsigned getOpcode()``
+
+  Returns the opcode for the ``Instruction``.
+
+* ``Instruction *clone() const``
+
+  Returns another instance of the specified instruction, identical in all ways
+  to the original except that the instruction has no parent (i.e. it's not
+  embedded into a BasicBlock_), and it has no name.
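+
+The behaviour of ``clone()`` can be checked with a free-standing instruction.
+The sketch below is illustrative only: ``demoClone`` is a hypothetical name,
+and the code assumes the LLVM 9 era API, in which ``BinaryOperator::Create``
+can build an instruction without a parent block and ``deleteValue()`` frees
+it:
+
+.. code-block:: c++
+
+  #include "llvm/IR/Constants.h"
+  #include "llvm/IR/InstrTypes.h"
+  #include "llvm/IR/Instruction.h"
+  #include "llvm/IR/LLVMContext.h"
+  #include "llvm/IR/Type.h"
+
+  // Hypothetical demo: creates a free-standing "%sum = add i32 1, 2", clones
+  // it, and reports whether the clone has the same opcode and operands but no
+  // parent and no name.
+  bool demoClone(llvm::LLVMContext &Ctx) {
+    llvm::Type *Int32Ty = llvm::Type::getInt32Ty(Ctx);
+    llvm::Instruction *Add = llvm::BinaryOperator::Create(
+        llvm::Instruction::Add, llvm::ConstantInt::get(Int32Ty, 1),
+        llvm::ConstantInt::get(Int32Ty, 2), "sum");
+    llvm::Instruction *Copy = Add->clone();
+
+    bool Ok = Copy->getOpcode() == llvm::Instruction::Add &&
+              Copy->getNumOperands() == 2 &&
+              Copy->getOperand(0) == Add->getOperand(0) &&
+              Copy->getOperand(1) == Add->getOperand(1) &&
+              Copy->getParent() == nullptr && !Copy->hasName() &&
+              !Copy->mayWriteToMemory();
+
+    // Neither instruction was inserted into a BasicBlock, so nothing owns
+    // them and they have to be freed by hand.
+    Copy->deleteValue();
+    Add->deleteValue();
+    return Ok;
+  }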
+
+.. _Constant:
+
+The ``Constant`` class and subclasses
+-------------------------------------
+
+Constant represents a base class for different types of constants.  It is
+subclassed by ConstantInt, ConstantArray, etc. for representing the various
+types of Constants.  GlobalValue_ is also a subclass, which represents the
+address of a global variable or function.
+
+.. _s_Constant:
+
+Important Subclasses of Constant
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* ConstantInt : This subclass of Constant represents an integer constant of
+  any width.
+
+  * ``const APInt& getValue() const``: Returns the underlying
+    value of this constant, an APInt value.
+
+  * ``int64_t getSExtValue() const``: Converts the underlying APInt value to an
+    int64_t via sign extension.  If the value (not the bit width) of the APInt
+    is too large to fit in an int64_t, an assertion will result.  For this
+    reason, use of this method is discouraged.
+
+  * ``uint64_t getZExtValue() const``: Converts the underlying APInt value
+    to a uint64_t via zero extension.  If the value (not the bit width) of the
+    APInt is too large to fit in a uint64_t, an assertion will result.  For this
+    reason, use of this method is discouraged.
+
+  * ``static ConstantInt* get(const APInt& Val)``: Returns the ConstantInt
+    object that represents the value provided by ``Val``.  The type is implied
+    as the IntegerType that corresponds to the bit width of ``Val``.
+
+  * ``static ConstantInt* get(const Type *Ty, uint64_t Val)``: Returns the
+    ConstantInt object that represents the value provided by ``Val`` for integer
+    type ``Ty``.
+
+* ConstantFP : This class represents a floating point constant.
+
+  * ``double getValue() const``: Returns the underlying value of this constant.
+
+* ConstantArray : This represents a constant array.
+
+  * ``const std::vector<Use> &getValues() const``: Returns a vector of
+    component constants that make up this array.
+
+* ConstantStruct : This represents a constant struct.
+
+  * ``const std::vector<Use> &getValues() const``: Returns a vector of
+    component constants that make up this struct.
+
+* GlobalValue : This represents either a global variable or a function.  In
+  either case, the value is a constant fixed address (after linking).
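+
+Because ``getSExtValue()`` asserts when the value does not fit, callers that
+cannot rule out wide integers may want to check first.  The following is one
+possible sketch, with the hypothetical helper name ``tryGetSExt``.  It assumes
+the LLVM 9 era headers, where the ``APInt`` overload of ``ConstantInt::get``
+also takes an ``LLVMContext &``, and it uses ``APInt::isSignedIntN`` for the
+range check:
+
+.. code-block:: c++
+
+  #include "llvm/ADT/APInt.h"
+  #include "llvm/IR/Constants.h"
+  #include "llvm/IR/LLVMContext.h"
+  #include <cstdint>
+
+  // Hypothetical helper: stores C as a signed 64-bit value in Out and returns
+  // true, or returns false without touching Out if the value does not fit.
+  bool tryGetSExt(const llvm::ConstantInt &C, int64_t &Out) {
+    const llvm::APInt &V = C.getValue();
+    if (!V.isSignedIntN(64)) // value (not bit width) too large for int64_t
+      return false;
+    Out = V.getSExtValue();
+    return true;
+  }
+
+  // Hypothetical driver: an i8 with all bits set reads as -1 when
+  // sign-extended and 255 when zero-extended; the largest signed i128 value
+  // does not fit in 64 bits.
+  bool demoConstantInt(llvm::LLVMContext &Ctx) {
+    llvm::ConstantInt *Small =
+        llvm::ConstantInt::get(Ctx, llvm::APInt(8, 0xFF));
+    llvm::ConstantInt *Big =
+        llvm::ConstantInt::get(Ctx, llvm::APInt::getSignedMaxValue(128));
+    int64_t Out = 0;
+    return tryGetSExt(*Small, Out) && Out == -1 &&
+           Small->getZExtValue() == 255 && !tryGetSExt(*Big, Out);
+  }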
+
+.. _GlobalValue:
+
+The ``GlobalValue`` class
+-------------------------
+
+``#include "llvm/IR/GlobalValue.h"``
+
+header source: `GlobalValue.h
+<http://llvm.org/doxygen/GlobalValue_8h_source.html>`_
+
+doxygen info: `GlobalValue Class
+<http://llvm.org/doxygen/classllvm_1_1GlobalValue.html>`_
+
+Superclasses: Constant_, User_, Value_
+
+Global values (GlobalVariable_\ s or :ref:`Function <c_Function>`\ s) are the
+only LLVM values that are visible in the bodies of all :ref:`Function
+<c_Function>`\ s.  Because they are visible at global scope, they are also
+subject to linking with other globals defined in different translation units.
+To control the linking process, ``GlobalValue``\ s know their linkage rules.
+Specifically, ``GlobalValue``\ s know whether they have internal or external
+linkage, as defined by the ``LinkageTypes`` enumeration.
+
+If a ``GlobalValue`` has internal linkage (equivalent to being ``static`` in C),
+it is not visible to code outside the current translation unit, and does not
+participate in linking.  If it has external linkage, it is visible to external
+code, and does participate in linking.  In addition to linkage information,
+``GlobalValue``\ s keep track of which Module_ they are currently part of.
+
+Because ``GlobalValue``\ s are memory objects, they are always referred to by
+their **address**.  As such, the Type_ of a global is always a pointer to its
+contents.  It is important to remember this when using the ``GetElementPtrInst``
+instruction because this pointer must be dereferenced first.  For example, if
+you have a ``GlobalVariable`` (a subclass of ``GlobalValue``) that is an array
+of 24 ints, type ``[24 x i32]``, then the ``GlobalVariable`` is a pointer to
+that array.  Although the address of the first element of this array and the
+value of the ``GlobalVariable`` are the same, they have different types.  The
+``GlobalVariable``'s type is ``[24 x i32]*``.  The type of the first element's
+address is ``i32*``.  Because of this, accessing a global value requires you to
+dereference the pointer with ``GetElementPtrInst`` first; only then can its
+elements be accessed.
+This is explained in the `LLVM Language Reference Manual
+<LangRef.html#globalvars>`_.
+
+.. _m_GlobalValue:
+
+Important Public Members of the ``GlobalValue`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* | ``bool hasInternalLinkage() const``
+  | ``bool hasExternalLinkage() const``
+  | ``void setInternalLinkage(bool HasInternalLinkage)``
+
+  These methods manipulate the linkage characteristics of the ``GlobalValue``.
+
+* ``Module *getParent()``
+
+  This returns the Module_ that the
+  GlobalValue is currently embedded into.
+
+.. _c_Function:
+
+The ``Function`` class
+----------------------
+
+``#include "llvm/IR/Function.h"``
+
+header source: `Function.h <http://llvm.org/doxygen/Function_8h_source.html>`_
+
+doxygen info: `Function Class
+<http://llvm.org/doxygen/classllvm_1_1Function.html>`_
+
+Superclasses: GlobalValue_, Constant_, User_, Value_
+
+The ``Function`` class represents a single procedure in LLVM.  It is actually
+one of the more complex classes in the LLVM hierarchy because it must keep track
+of a large amount of data.  The ``Function`` class keeps track of a list of
+BasicBlock_\ s, a list of formal Argument_\ s, and a SymbolTable_.
+
+The list of BasicBlock_\ s is the most commonly used part of ``Function``
+objects.  The list imposes an implicit ordering of the blocks in the function,
+which indicates how the code will be laid out by the backend.  Additionally, the
+first BasicBlock_ is the implicit entry node for the ``Function``.  It is not
+legal in LLVM to explicitly branch to this initial block.  There are no implicit
+exit nodes, and in fact there may be multiple exit nodes from a single
+``Function``.  If the BasicBlock_ list is empty, this indicates that the
+``Function`` is actually a function declaration: the actual body of the function
+hasn't been linked in yet.
+
+In addition to a list of BasicBlock_\ s, the ``Function`` class also keeps track
+of the list of formal Argument_\ s that the function receives.  This container
+manages the lifetime of the Argument_ nodes, just like the BasicBlock_ list does
+for the BasicBlock_\ s.
+
+The SymbolTable_ is a very rarely used LLVM feature that is only used when you
+have to look up a value by name.  Aside from that, the SymbolTable_ is used
+internally to make sure that there are no conflicts between the names of
+Instruction_\ s, BasicBlock_\ s, or Argument_\ s in the function body.
+
+Note that ``Function`` is a GlobalValue_ and therefore also a Constant_.  The
+value of the function is its address (after linking) which is guaranteed to be
+constant.
+
+.. _m_Function:
+
+Important Public Members of the ``Function`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* ``Function(const FunctionType *Ty, LinkageTypes Linkage,
+  const std::string &N = "", Module* Parent = 0)``
+
+  Constructor used when you need to create new ``Function``\ s to add to the
+  program.  The constructor must specify the type of the function to create and
+  what type of linkage the function should have.  The FunctionType_ argument
+  specifies the formal arguments and return value for the function.  The same
+  FunctionType_ value can be used to create multiple functions.  The ``Parent``
+  argument specifies the Module in which the function is defined.  If this
+  argument is provided, the function will automatically be inserted into that
+  module's list of functions.
+
+* ``bool isDeclaration()``
+
+  Returns true if the ``Function`` has no body defined.  If the function is
+  "external", it does not have a body, and thus must be resolved by linking with
+  a function defined in a different translation unit.
+
+* | ``Function::iterator`` - Typedef for basic block list iterator
+  | ``Function::const_iterator`` - Typedef for const_iterator.
+  | ``begin()``, ``end()``, ``size()``, ``empty()``
+
+  These are forwarding methods that make it easy to access the contents of a
+  ``Function`` object's BasicBlock_ list.
+
+* ``Function::BasicBlockListType &getBasicBlockList()``
+
+  Returns the list of BasicBlock_\ s.  This is necessary to use when you need to
+  update the list or perform a complex action that doesn't have a forwarding
+  method.
+
+* | ``Function::arg_iterator`` - Typedef for the argument list iterator
+  | ``Function::const_arg_iterator`` - Typedef for const_iterator.
+  | ``arg_begin()``, ``arg_end()``, ``arg_size()``, ``arg_empty()``
+
+  These are forwarding methods that make it easy to access the contents of a
+  ``Function`` object's Argument_ list.
+
+* ``Function::ArgumentListType &getArgumentList()``
+
+  Returns the list of Argument_\ s.  This is necessary to use when you need to
+  update the list or perform a complex action that doesn't have a forwarding
+  method.
+
+* ``BasicBlock &getEntryBlock()``
+
+  Returns the entry ``BasicBlock`` for the function.  Because the entry block
+  for the function is always the first block, this returns the first block of
+  the ``Function``.
+
+* | ``Type *getReturnType()``
+  | ``FunctionType *getFunctionType()``
+
+  This traverses the Type_ of the ``Function`` and returns the return type of
+  the function, or the FunctionType_ of the actual function.
+
+* ``SymbolTable *getSymbolTable()``
+
+  Return a pointer to the SymbolTable_ for this ``Function``.
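+
+As a hedged sketch of how these members fit together, the code below creates a
+function with a one-instruction body.  It assumes the LLVM 9 era factory
+functions ``Function::Create``, ``BasicBlock::Create`` and
+``ReturnInst::Create`` (the headers of that release do not expose the
+constructors directly); ``makeStub`` is a hypothetical name:
+
+.. code-block:: c++
+
+  #include "llvm/ADT/StringRef.h"
+  #include "llvm/IR/BasicBlock.h"
+  #include "llvm/IR/DerivedTypes.h"
+  #include "llvm/IR/Function.h"
+  #include "llvm/IR/Instructions.h"
+  #include "llvm/IR/LLVMContext.h"
+  #include "llvm/IR/Module.h"
+
+  // Hypothetical helper: adds "define internal void @<Name>(i32, i32)" to M
+  // with a body consisting of a single "ret void".
+  llvm::Function *makeStub(llvm::Module &M, llvm::StringRef Name) {
+    llvm::LLVMContext &Ctx = M.getContext();
+    llvm::Type *Int32Ty = llvm::Type::getInt32Ty(Ctx);
+    llvm::Type *Params[] = {Int32Ty, Int32Ty};
+    llvm::FunctionType *FT = llvm::FunctionType::get(
+        llvm::Type::getVoidTy(Ctx), Params, /*isVarArg=*/false);
+
+    // Passing the Module inserts the new function into M's function list.
+    llvm::Function *F =
+        llvm::Function::Create(FT, llvm::Function::InternalLinkage, Name, &M);
+
+    // Until a BasicBlock is added, F is only a declaration.  The first block
+    // appended becomes the entry block.
+    llvm::BasicBlock *Entry = llvm::BasicBlock::Create(Ctx, "entry", F);
+    llvm::ReturnInst::Create(Ctx, Entry); // appends "ret void" to Entry
+    return F;
+  }
+
+For the returned function, ``isDeclaration()`` should be false, ``arg_size()``
+should be 2, and ``getEntryBlock()`` should refer to the same block as
+``*begin()``.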
+
+.. _GlobalVariable:
+
+The ``GlobalVariable`` class
+----------------------------
+
+``#include "llvm/IR/GlobalVariable.h"``
+
+header source: `GlobalVariable.h
+<http://llvm.org/doxygen/GlobalVariable_8h_source.html>`_
+
+doxygen info: `GlobalVariable Class
+<http://llvm.org/doxygen/classllvm_1_1GlobalVariable.html>`_
+
+Superclasses: GlobalValue_, Constant_, User_, Value_
+
+Global variables are represented with the (surprise surprise) ``GlobalVariable``
+class.  Like functions, ``GlobalVariable``\ s are also subclasses of
+GlobalValue_, and as such are always referenced by their address (global values
+must live in memory, so their "name" refers to their constant address).  See
+GlobalValue_ for more on this.  Global variables may have an initial value
+(which must be a Constant_), and if they have an initializer, they may be marked
+as "constant" themselves (indicating that their contents never change at
+runtime).
+
+.. _m_GlobalVariable:
+
+Important Public Members of the ``GlobalVariable`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* ``GlobalVariable(const Type *Ty, bool isConstant, LinkageTypes Linkage,
+  Constant *Initializer = 0, const std::string &Name = "", Module* Parent = 0)``
+
+  Create a new global variable of the specified type.  If ``isConstant`` is true
+  then the global variable will be marked as unchanging for the program.  The
+  Linkage parameter specifies the type of linkage (internal, external, weak,
+  linkonce, appending) for the variable.  If the linkage is InternalLinkage,
+  WeakAnyLinkage, WeakODRLinkage, LinkOnceAnyLinkage or LinkOnceODRLinkage, then
+  the resultant global variable will have internal linkage.  AppendingLinkage
+  concatenates together all instances (in different translation units) of the
+  variable into a single variable but is only applicable to arrays.  See the
+  `LLVM Language Reference <LangRef.html#modulestructure>`_ for further details
+  on linkage types.  Optionally an initializer, a name, and the module to put
+  the variable into may be specified for the global variable as well.
+
+* ``bool isConstant() const``
+
+  Returns true if this is a global variable that is known not to be modified at
+  runtime.
+
+* ``bool hasInitializer()``
+
+  Returns true if this ``GlobalVariable`` has an initializer.
+
+* ``Constant *getInitializer()``
+
+  Returns the initial value for a ``GlobalVariable``.  It is not legal to call
+  this method if there is no initializer.
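+
+The sketch below creates the ``[24 x i32]`` global discussed in the
+GlobalValue_ section.  It is written against the LLVM 9 era constructor that
+takes the ``Module`` as its first argument, which differs from the signature
+listed above; the helper name ``makeTable`` and the variable name ``table``
+are illustrative assumptions:
+
+.. code-block:: c++
+
+  #include "llvm/IR/Constants.h"
+  #include "llvm/IR/DerivedTypes.h"
+  #include "llvm/IR/GlobalVariable.h"
+  #include "llvm/IR/LLVMContext.h"
+  #include "llvm/IR/Module.h"
+
+  // Hypothetical helper: adds
+  //   @table = internal constant [24 x i32] zeroinitializer
+  // to M and returns it.
+  llvm::GlobalVariable *makeTable(llvm::Module &M) {
+    llvm::Type *Int32Ty = llvm::Type::getInt32Ty(M.getContext());
+    llvm::ArrayType *ArrTy = llvm::ArrayType::get(Int32Ty, 24);
+    // The Module-taking constructor inserts the new variable into M's list of
+    // globals, so M owns it from here on.
+    return new llvm::GlobalVariable(M, ArrTy, /*isConstant=*/true,
+                                    llvm::GlobalValue::InternalLinkage,
+                                    llvm::Constant::getNullValue(ArrTy),
+                                    "table");
+  }
+
+On the result, ``getType()`` should be a pointer type, while
+``getValueType()`` (not listed above) should give the array type of the
+contents.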
+
+.. _BasicBlock:
+
+The ``BasicBlock`` class
+------------------------
+
+``#include "llvm/IR/BasicBlock.h"``
+
+header source: `BasicBlock.h
+<http://llvm.org/doxygen/BasicBlock_8h_source.html>`_
+
+doxygen info: `BasicBlock Class
+<http://llvm.org/doxygen/classllvm_1_1BasicBlock.html>`_
+
+Superclass: Value_
+
+This class represents a single-entry, single-exit section of the code, commonly
+known as a basic block by the compiler community.  The ``BasicBlock`` class
+maintains a list of Instruction_\ s, which form the body of the block.  Matching
+the language definition, the last element of this list of instructions is always
+a terminator instruction.
+
+In addition to tracking the list of instructions that make up the block, the
+``BasicBlock`` class also keeps track of the :ref:`Function <c_Function>` that
+it is embedded into.
+
+Note that ``BasicBlock``\ s themselves are Value_\ s, because they are
+referenced by instructions like branches and can go in the switch tables.
+``BasicBlock``\ s have type ``label``.
+
+.. _m_BasicBlock:
+
+Important Public Members of the ``BasicBlock`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* ``BasicBlock(const std::string &Name = "", Function *Parent = 0)``
+
+  The ``BasicBlock`` constructor is used to create new basic blocks for
+  insertion into a function.  The constructor optionally takes a name for the
+  new block, and a :ref:`Function <c_Function>` to insert it into.  If the
+  ``Parent`` parameter is specified, the new ``BasicBlock`` is automatically
+  inserted at the end of the specified :ref:`Function <c_Function>`.  If it is
+  not specified, the ``BasicBlock`` must be manually inserted into the
+  :ref:`Function <c_Function>`.
+
+* | ``BasicBlock::iterator`` - Typedef for instruction list iterator
+  | ``BasicBlock::const_iterator`` - Typedef for const_iterator.
+  | ``begin()``, ``end()``, ``front()``, ``back()``,
+    ``size()``, ``empty()`` - STL-style functions for accessing the
+    instruction list.
+
+  These methods and typedefs are forwarding functions that have the same
+  semantics as the standard library methods of the same names.  These methods
+  expose the underlying instruction list of a basic block in a way that is easy
+  to manipulate.  To get the full complement of container operations (including
+  operations to update the list), you must use the ``getInstList()`` method.
+
+* ``BasicBlock::InstListType &getInstList()``
+
+  This method is used to get access to the underlying container that actually
+  holds the Instructions.  This method must be used when there isn't a
+  forwarding function in the ``BasicBlock`` class for the operation that you
+  would like to perform.  Because there are no forwarding functions for
+  "updating" operations, you need to use this if you want to update the contents
+  of a ``BasicBlock``.
+
+* ``Function *getParent()``
+
+  Returns a pointer to the :ref:`Function <c_Function>` the block is embedded
+  into, or a null pointer if it is homeless.
+
+* ``Instruction *getTerminator()``
+
+  Returns a pointer to the terminator instruction that appears at the end of the
+  ``BasicBlock``.  If there is no terminator instruction, or if the last
+  instruction in the block is not a terminator, then a null pointer is returned.
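+
+The null-pointer behaviour of ``getTerminator()`` and ``getParent()`` can be
+observed on a block that has not been inserted into any function.  This is a
+minimal sketch assuming the LLVM 9 era ``BasicBlock::Create`` and
+``ReturnInst::Create`` factories; ``demoTerminator`` is a hypothetical name:
+
+.. code-block:: c++
+
+  #include "llvm/IR/BasicBlock.h"
+  #include "llvm/IR/Instructions.h"
+  #include "llvm/IR/LLVMContext.h"
+
+  // Hypothetical demo: checks getTerminator() on a homeless block before and
+  // after a "ret void" is appended to it.
+  bool demoTerminator(llvm::LLVMContext &Ctx) {
+    llvm::BasicBlock *BB = llvm::BasicBlock::Create(Ctx, "orphan");
+    bool EmptyHasNone =
+        BB->getParent() == nullptr && BB->getTerminator() == nullptr;
+
+    llvm::Instruction *Ret = llvm::ReturnInst::Create(Ctx, BB);
+    bool NowTerminated = BB->getTerminator() == Ret && &BB->back() == Ret &&
+                         BB->size() == 1;
+
+    // The block was never inserted into a Function, so it is freed by hand;
+    // deleting it also destroys the instructions it contains.
+    delete BB;
+    return EmptyHasNone && NowTerminated;
+  }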
+
+.. _Argument:
+
+The ``Argument`` class
+----------------------
+
+This subclass of Value defines the interface for incoming formal arguments to a
+function.  A Function maintains a list of its formal arguments.  An argument has
+a pointer to the parent Function.
+
+

Added: www-releases/trunk/9.0.0/docs/_sources/Projects.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Projects.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Projects.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Projects.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,257 @@
+========================
+Creating an LLVM Project
+========================
+
+.. contents::
+   :local:
+
+Overview
+========
+
+The LLVM build system is designed to facilitate the building of third party
+projects that use LLVM header files, libraries, and tools.  In order to use
+these facilities, a ``Makefile`` from a project must do the following things:
+
+* Set ``make`` variables. There are several variables that a ``Makefile`` needs
+  to set to use the LLVM build system:
+
+  * ``PROJECT_NAME`` - The name by which your project is known.
+  * ``LLVM_SRC_ROOT`` - The root of the LLVM source tree.
+  * ``LLVM_OBJ_ROOT`` - The root of the LLVM object tree.
+  * ``PROJ_SRC_ROOT`` - The root of the project's source tree.
+  * ``PROJ_OBJ_ROOT`` - The root of the project's object tree.
+  * ``PROJ_INSTALL_ROOT`` - The root installation directory.
+  * ``LEVEL`` - The relative path from the current directory to the
+    project's root (``$PROJ_OBJ_ROOT``).
+
+* Include ``Makefile.config`` from ``$(LLVM_OBJ_ROOT)``.
+
+* Include ``Makefile.rules`` from ``$(LLVM_SRC_ROOT)``.
+
+There are two ways that you can set all of these variables:
+
+* You can write your own ``Makefiles`` which hard-code these values.
+
+* You can use the pre-made LLVM sample project. This sample project includes
+  ``Makefiles``, a configure script that can be used to configure the location
+  of LLVM, and the ability to support multiple object directories from a single
+  source directory.
+
+If you want to devise your own build system, studying other projects and LLVM
+``Makefiles`` will probably provide enough information on how to write your own
+``Makefiles``.
+
+Source Tree Layout
+==================
+
+In order to use the LLVM build system, you will want to organize your source
+code so that it can benefit from the build system's features.  Mainly, you want
+your source tree layout to look similar to the LLVM source tree layout.
+
+Underneath your top level directory, you should have the following directories:
+
+**lib**
+
+    This subdirectory should contain all of your library source code.  For each
+    library that you build, you will have one directory in **lib** that will
+    contain that library's source code.
+
+    Libraries can be object files, archives, or dynamic libraries.  The **lib**
+    directory is just a convenient place for libraries as it places them all in
+    a directory from which they can be linked later.
+
+**include**
+
+    This subdirectory should contain any header files that are global to your
+    project. By global, we mean that they are used by more than one library or
+    executable of your project.
+
+    By placing your header files in **include**, they will be found
+    automatically by the LLVM build system.  For example, if you have a file
+    **include/jazz/note.h**, then your source files can include it simply with
+    **#include "jazz/note.h"**.
+
+**tools**
+
+    This subdirectory should contain all of your source code for executables.
+    For each program that you build, you will have one directory in **tools**
+    that will contain that program's source code.
+
+**test**
+
+    This subdirectory should contain tests that verify that your code works
+    correctly.  Automated tests are especially useful.
+
+    Currently, the LLVM build system provides basic support for tests. The LLVM
+    system provides the following:
+
+* LLVM contains regression tests in ``llvm/test``.  These tests are run by the
+  :doc:`Lit <CommandGuide/lit>` testing tool.  This test procedure uses ``RUN``
+  lines in the actual test case to determine how to run the test.  See the
+  :doc:`TestingGuide` for more details.
+
+* LLVM contains an optional package called ``llvm-test``, which provides
+  benchmarks and programs that are known to compile with the Clang front
+  end. You can use these programs to test your code, gather statistical
+  information, and compare it to the current LLVM performance statistics.
+
+  Currently, there is no way to hook your tests directly into the ``llvm/test``
+  testing harness. You will simply need to find a way to use the source
+  provided within that directory on your own.
+
+Typically, you will want to build your **lib** directory first followed by your
+**tools** directory.
+
+Writing LLVM Style Makefiles
+============================
+
+The LLVM build system provides a convenient way to build libraries and
+executables.  Most of your project Makefiles will only need to define a few
+variables.  Below is a list of the variables one can set and what they can
+do:
+
+Required Variables
+------------------
+
+``LEVEL``
+
+    This variable is the relative path from this ``Makefile`` to the top
+    directory of your project's source code.  For example, if your source code
+    is in ``/tmp/src``, then the ``Makefile`` in ``/tmp/src/jump/high``
+    would set ``LEVEL`` to ``"../.."``.
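+
+    As a rough illustration, the relationship between the two paths can be
+    sketched in shell. The ``level_for`` helper below is hypothetical and is
+    not part of the LLVM build system; it assumes both arguments are absolute
+    paths and that a ``Makefile`` in the top directory itself would use ``.``.
+
```shell
#!/bin/sh
# Hypothetical helper (not part of the LLVM build system): print the LEVEL
# value for a Makefile directory, given the project's top source directory.
# Assumes both arguments are absolute paths without "." or ".." components.
level_for() {
    top=${1%/}
    dir=${2%/}
    # Assumption: a Makefile in the top directory itself uses ".".
    if [ "$dir" = "$top" ]; then
        echo "."
        return 0
    fi
    case "$dir" in
        "$top"/*) rest=${dir#"$top"/} ;;
        *) echo "error: $dir is not under $top" >&2; return 1 ;;
    esac
    # One ".." for the first path component, one more per remaining component.
    level=".."
    while [ "${rest#*/}" != "$rest" ]; do
        rest=${rest#*/}
        level="$level/.."
    done
    echo "$level"
}

level_for /tmp/src /tmp/src/jump/high   # prints ../..
```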
+
+Variables for Building Subdirectories
+-------------------------------------
+
+``DIRS``
+
+    This is a space separated list of subdirectories that should be built.  They
+    will be built, one at a time, in the order specified.
+
+``PARALLEL_DIRS``
+
+    This is a list of directories that can be built in parallel. These will be
+    built after the directories in DIRS have been built.
+
+``OPTIONAL_DIRS``
+
+    This is a list of directories that can be built if they exist, but will not
+    cause an error if they do not exist.  They are built serially in the order
+    in which they are listed.
+
+Variables for Building Libraries
+--------------------------------
+
+``LIBRARYNAME``
+
+    This variable contains the base name of the library that will be built.  For
+    example, to build a library named ``libsample.a``, ``LIBRARYNAME`` should
+    be set to ``sample``.
+
+``BUILD_ARCHIVE``
+
+    By default, a library is a ``.o`` file that is linked directly into a
+    program.  To build an archive (also known as a static library), set the
+    ``BUILD_ARCHIVE`` variable.
+
+``SHARED_LIBRARY``
+
+    If ``SHARED_LIBRARY`` is defined in your Makefile, a shared (or dynamic)
+    library will be built.
+
+Variables for Building Programs
+-------------------------------
+
+``TOOLNAME``
+
+    This variable contains the name of the program that will be built.  For
+    example, to build an executable named ``sample``, ``TOOLNAME`` should be set
+    to ``sample``.
+
+``USEDLIBS``
+
+    This variable holds a space separated list of libraries that should be
+    linked into the program.  These libraries must be libraries that come from
+    your **lib** directory.  The libraries must be specified without their
+    ``lib`` prefix.  For example, to link ``libsample.a``, you would set
+    ``USEDLIBS`` to ``sample.a``.
+
+    Note that this works only for statically linked libraries.
+
+``LLVMLIBS``
+
+    This variable holds a space separated list of libraries that should be
+    linked into the program.  These libraries must be LLVM libraries.  The
+    libraries must be specified without their ``lib`` prefix.  For example, to
+    link with a driver that performs an IR transformation you might set
+    ``LLVMLIBS`` to this minimal set of libraries ``LLVMSupport.a LLVMCore.a
+    LLVMBitReader.a LLVMAsmParser.a LLVMAnalysis.a LLVMTransformUtils.a
+    LLVMScalarOpts.a LLVMTarget.a``.
+
+    Note that this works only for statically linked libraries. LLVM is split
+    into a large number of static libraries, and the list of libraries you
+    require may be much longer than the list above. To see a full list of
+    libraries use: ``llvm-config --libs all``.  Using ``LINK_COMPONENTS``, as
+    described below, obviates the need to set ``LLVMLIBS``.
+
+``LINK_COMPONENTS``
+
+    This variable holds a space separated list of components that the LLVM
+    ``Makefiles`` pass to the ``llvm-config`` tool to generate a link line for
+    the program. For example, to link with all LLVM libraries use
+    ``LINK_COMPONENTS = all``.
+
+``LIBS``
+
+    To link dynamic libraries, add ``-l<library base name>`` to the ``LIBS``
+    variable.  The LLVM build system will look in the same places for dynamic
+    libraries as it does for static libraries.
+
+    For example, to link ``libsample.so``, you would have the following line in
+    your ``Makefile``:
+
+        .. code-block:: makefile
+
+          LIBS += -lsample
+
+Note that ``LIBS`` must occur in the Makefile after the inclusion of
+``Makefile.common``.
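+
+The naming conventions used in the examples for ``LIBRARYNAME``, ``USEDLIBS``,
+and ``LIBS`` above can be summarized with a small shell sketch. The helper
+names are hypothetical and exist only for illustration; they assume a file name
+of the form ``lib<name>.<ext>`` with a single extension.
+
```shell
#!/bin/sh
# Hypothetical helpers that derive Makefile variable values from a library
# file name, following the examples given in the text. They assume a name
# of the form lib<name>.<ext> with a single extension.

# libsample.a -> sample (value for LIBRARYNAME)
libraryname_value() {
    base=${1#lib}
    printf '%s\n' "${base%.*}"
}

# libsample.a -> sample.a (entry for USEDLIBS: "lib" prefix removed)
usedlibs_entry() {
    printf '%s\n' "${1#lib}"
}

# libsample.so -> -lsample (entry for LIBS)
libs_entry() {
    base=${1#lib}
    printf '%s\n' "-l${base%.*}"
}

libraryname_value libsample.a
usedlibs_entry libsample.a
libs_entry libsample.so
```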
+
+Miscellaneous Variables
+-----------------------
+
+``CFLAGS`` & ``CPPFLAGS``
+
+    These variables can be used to add options to the C and C++ compilers,
+    respectively.  They are typically used to add options that tell the compiler
+    the location of additional directories to search for header files.
+
+    It is highly suggested that you append to ``CFLAGS`` and ``CPPFLAGS`` as
+    opposed to overwriting them.  The master ``Makefiles`` may already have
+    useful options in them that you may not want to overwrite.
+
+Placement of Object Code
+========================
+
+The final location of built libraries and executables will depend upon whether
+you do a ``Debug``, ``Release``, or ``Profile`` build.
+
+Libraries
+
+    All libraries (static and dynamic) will be stored in
+    ``PROJ_OBJ_ROOT/<type>/lib``, where *type* is ``Debug``, ``Release``, or
+    ``Profile`` for a debug, optimized, or profiled build, respectively.
+
+Executables
+
+    All executables will be stored in ``PROJ_OBJ_ROOT/<type>/bin``, where *type*
+    is ``Debug``, ``Release``, or ``Profile`` for a debug, optimized, or
+    profiled build, respectively.
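+
+The two layouts above follow the same pattern, which can be sketched as a
+shell function. ``product_dir`` is a hypothetical name used only for this
+illustration; the build system computes these locations itself.
+
```shell
#!/bin/sh
# Hypothetical helper: print where a build product is expected to land.
#   $1: PROJ_OBJ_ROOT (top of the object tree)
#   $2: build type: Debug, Release, or Profile
#   $3: product kind: lib (libraries) or bin (executables)
product_dir() {
    case "$2" in
        Debug|Release|Profile) ;;
        *) echo "error: unknown build type: $2" >&2; return 1 ;;
    esac
    case "$3" in
        lib|bin) ;;
        *) echo "error: unknown product kind: $3" >&2; return 1 ;;
    esac
    printf '%s/%s/%s\n' "$1" "$2" "$3"
}

product_dir /tmp/obj Debug lib     # prints /tmp/obj/Debug/lib
product_dir /tmp/obj Release bin   # prints /tmp/obj/Release/bin
```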
+
+Further Help
+============
+
+If you have any questions or need any help creating an LLVM project, the LLVM
+team would be more than happy to help.  You can always post your questions to
+the `LLVM Developers Mailing List
+<http://lists.llvm.org/pipermail/llvm-dev/>`_.

Added: www-releases/trunk/9.0.0/docs/_sources/Proposals/GitHubMove.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Proposals/GitHubMove.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Proposals/GitHubMove.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Proposals/GitHubMove.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,1086 @@
+==============================
+Moving LLVM Projects to GitHub
+==============================
+
+Current Status
+==============
+
+We are planning to complete the transition to GitHub by Oct 21, 2019.  See
+the GitHub migration `status page <https://llvm.org/GitHubMigrationStatus.html>`_
+for the latest updates and instructions for how to migrate your workflows.
+
+.. contents:: Table of Contents
+  :depth: 4
+  :local:
+
+Introduction
+============
+
+This is a proposal to move our current revision control system from our own
+hosted Subversion to GitHub. Below are the financial and technical arguments as
+to why we are proposing such a move and how people (and validation
+infrastructure) will continue to work with a Git-based LLVM.
+
+What This Proposal is *Not* About
+=================================
+
+Changing the development policy.
+
+This proposal relates only to moving the hosting of our source-code repository
+from SVN hosted on our own servers to Git hosted on GitHub. We are not proposing
+using GitHub's issue tracker, pull-requests, or code-review.
+
+Contributors will continue to earn commit access on demand under the Developer
+Policy, except that a GitHub account will be required instead of SVN
+username/password-hash.
+
+Why Git, and Why GitHub?
+========================
+
+Why Move At All?
+----------------
+
+This discussion began because we currently host our own Subversion server
+and Git mirror on a voluntary basis. The LLVM Foundation sponsors the server and
+provides limited support, but there is only so much it can do.
+
+Volunteers are not sysadmins themselves, but compiler engineers that happen
+to know a thing or two about hosting servers. We also don't have 24/7 support,
+and we sometimes wake up to see that continuous integration is broken because
+the SVN server is either down or unresponsive.
+
+We should take advantage of one of the services out there (GitHub, GitLab,
+and BitBucket, among others) that offer better service (24/7 stability, disk
+space, Git server, code browsing, forking facilities, etc) for free.
+
+Why Git?
+--------
+
+Many new coders nowadays start with Git, and a lot of people have never used
+SVN, CVS, or anything else. Websites like GitHub have changed the landscape
+of open source contributions, reducing the cost of first contribution and
+fostering collaboration.
+
+Git is also the version control many LLVM developers use. Despite the
+sources being stored in a SVN server, these developers are already using Git
+through the Git-SVN integration.
+
+Git allows you to:
+
+* Commit, squash, merge, and fork locally without touching the remote server.
+* Maintain local branches, enabling multiple threads of development.
+* Collaborate on these branches (e.g. through your own fork of llvm on GitHub).
+* Inspect the repository history (blame, log, bisect) without Internet access.
+* Maintain remote forks and branches on Git hosting services and
+  integrate back to the main repository.
+
+In addition, because Git seems to be replacing many OSS projects' version
+control systems, there are many tools that are built over Git.
+Future tooling may support Git first (if not only).
+
+Why GitHub?
+-----------
+
+GitHub, like GitLab and BitBucket, provides free code hosting for open source
+projects. Any of these could replace the code-hosting infrastructure that we
+have today.
+
+These services also have a dedicated team to monitor, migrate, improve and
+distribute the contents of the repositories depending on region and load.
+
+GitHub has one important advantage over GitLab and
+BitBucket: it offers read-write **SVN** access to the repository
+(https://github.com/blog/626-announcing-svn-support).
+This would enable people to continue working post-migration as though our code
+were still canonically in an SVN repository.
+
+In addition, there are already multiple LLVM mirrors on GitHub, indicating that
+part of our community has already settled there.
+
+On Managing Revision Numbers with Git
+-------------------------------------
+
+The current SVN repository hosts all the LLVM sub-projects alongside each other.
+A single revision number (e.g. r123456) thus identifies a consistent version of
+all LLVM sub-projects.
+
+Git does not use a sequential integer revision number but instead uses a hash to
+identify each commit.
+
+The loss of a sequential integer revision number has been a sticking point in
+past discussions about Git:
+
+- "The 'branch' I most care about is mainline, and losing the ability to say
+  'fixed in r1234' (with some sort of monotonically increasing number) would
+  be a tragic loss." [LattnerRevNum]_
+- "I like those results sorted by time and the chronology should be obvious, but
+  timestamps are incredibly cumbersome and make it difficult to verify that a
+  given checkout matches a given set of results." [TrickRevNum]_
+- "There is still the major regression with unreadable version numbers.
+  Given the amount of Bugzilla traffic with 'Fixed in...', that's a
+  non-trivial issue." [JSonnRevNum]_
+- "Sequential IDs are important for LNT and llvmlab bisection tool." [MatthewsRevNum]_.
+
+However, Git can emulate this increasing revision number:
+``git rev-list --count <commit-hash>``. This identifier is unique only
+within a single branch, but this means the tuple `(num, branch-name)` uniquely
+identifies a commit.
+
+We can thus use this revision number to ensure that e.g. `clang -v` reports a
+user-friendly revision number (e.g. `master-12345` or `4.0-5321`), addressing
+the objections raised above with respect to this aspect of Git.
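+
+A minimal sketch of such an identifier, assuming a linear history on the branch
+(no merge commits), so that the count matches the commit's position on that
+branch. The ``friendly_rev`` name is hypothetical:
+
```shell
#!/bin/sh
# Hypothetical helper: print "<branch>-<count>", where <count> is the
# number of commits reachable from the tip of <branch>.
# Must be run inside a Git working tree.
friendly_rev() {
    branch=$1
    count=$(git rev-list --count "$branch") || return 1
    printf '%s-%s\n' "$branch" "$count"
}
```
+
+Inside a checkout, ``friendly_rev master`` would print something of the form
+``master-12345``.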
+
+What About Branches and Merges?
+-------------------------------
+
+In contrast to SVN, Git makes branching easy. Git's commit history is
+represented as a DAG, a departure from SVN's linear history. However, we propose
+to prohibit merge commits in our canonical Git repository.
+
+Unfortunately, GitHub does not support server side hooks to enforce such a
+policy.  We must rely on the community to avoid pushing merge commits.
+
+GitHub offers a feature called `Status Checks`: a branch protected by
+`status checks` requires commits to be whitelisted before the push can happen.
+We could supply a pre-push hook on the client side that would run and check the
+history, before whitelisting the commit being pushed [statuschecks]_.
+However, this solution would be somewhat fragile (how do you update a script
+installed on every developer machine?) and would prevent SVN access to the
+repository.
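+
+The history check that such a client-side hook might run could look roughly
+like the following. This is only a sketch: ``range_is_linear`` is a
+hypothetical name, and wiring it into a ``pre-push`` hook as well as the
+whitelisting step are left out.
+
```shell
#!/bin/sh
# Hypothetical check: succeed (exit status 0) only if no merge commit is
# reachable from <new> without also being reachable from <old>.
# Must be run inside a Git working tree.
range_is_linear() {
    old=$1
    new=$2
    # --merges keeps only commits with more than one parent.
    merges=$(git rev-list --merges --count "$old..$new") || return 2
    [ "$merges" -eq 0 ]
}
```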
+
+What About Commit Emails?
+-------------------------
+
+We will need a new bot to send emails for each commit. This proposal leaves the
+email format unchanged besides the commit URL.
+
+Straw Man Migration Plan
+========================
+
+Step #1 : Before The Move
+-------------------------
+
+1. Update docs to mention the move, so people are aware of what is going on.
+2. Set up a read-only version of the GitHub project, mirroring our current SVN
+   repository.
+3. Add the required bots to implement the commit emails, as well as the
+   umbrella repository update (if the multirepo is selected) or the read-only
+   Git views for the sub-projects (if the monorepo is selected).
+
+Step #2 : Git Move
+------------------
+
+4. Update the buildbots to pick up updates and commits from the GitHub
+   repository. Not all bots have to migrate at this point, but it'll help
+   provide infrastructure testing.
+5. Update Phabricator to pick up commits from the GitHub repository.
+6. LNT and llvmlab have to be updated: they rely on a unique, monotonically
+   increasing integer across branches [MatthewsRevNum]_.
+7. Instruct downstream integrators to pick up commits from the GitHub
+   repository.
+8. Review and prepare an update for the LLVM documentation.
+
+Until this point nothing has changed for developers; it will just
+boil down to a lot of work for buildbot and other infrastructure
+owners.
+
+The migration will pause here until all dependencies have cleared, and all
+problems have been solved.
+
+Step #3: Write Access Move
+--------------------------
+
+9. Collect developers' GitHub account information, and add them to the project.
+10. Switch the SVN repository to read-only and allow pushes to the GitHub repository.
+11. Update the documentation.
+12. Mirror Git to SVN.
+
+Step #4 : Post Move
+-------------------
+
+13. Archive the SVN repository.
+14. Update links on the LLVM website pointing to viewvc/klaus/phab etc. to
+    point to GitHub instead.
+
+GitHub Repository Description
+=============================
+
+Monorepo
+----------------
+
+The LLVM git repository hosted at https://github.com/llvm/llvm-project contains all
+sub-projects in a single source tree.  It is often referred to as a monorepo and
+mimics an export of the current SVN repository, with each sub-project having its
+own top-level directory. Not all sub-projects are used for building toolchains.
+For example, www/ and test-suite/ are not part of the monorepo.
+
+Putting all sub-projects in a single checkout makes cross-project refactoring
+naturally simple:
+
+ * New sub-projects can be trivially split out for better reuse and/or layering
+   (e.g., to allow libSupport and/or LIT to be used by runtimes without adding a
+   dependency on LLVM).
+ * Changing an API in LLVM and upgrading the sub-projects will always be done in
+   a single commit, designing away a common source of temporary build breakage.
+ * Moving code across sub-projects (during refactoring for instance) in a single
+   commit enables accurate `git blame` when tracking code change history.
+ * Tooling based on `git grep` works natively across sub-projects, making it
+   easier to find refactoring opportunities across projects (for example reusing
+   a data structure initially in LLDB by moving it into libSupport).
+ * Having all the sources present encourages maintaining the other sub-projects
+   when changing API.
+
+Finally, the monorepo maintains the property of the existing SVN repository that
+the sub-projects move synchronously, and a single revision number (or commit
+hash) identifies the state of the development across all projects.
+
+.. _build_single_project:
+
+Building a single sub-project
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Even though there is a single source tree, you are not required to build
+all sub-projects together.  It is trivial to configure builds for a single
+sub-project.
+
+For example::
+
+  mkdir build && cd build
+  # Configure only LLVM (default)
+  cmake path/to/monorepo
+  # Configure LLVM and lld
+  cmake path/to/monorepo -DLLVM_ENABLE_PROJECTS=lld
+  # Configure LLVM and clang
+  cmake path/to/monorepo -DLLVM_ENABLE_PROJECTS=clang
+
+.. _git-svn-mirror:
+
+Outstanding Questions
+---------------------
+
+Read-only sub-project mirrors
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+With the Monorepo, it is undecided whether the existing single-subproject
+mirrors (e.g. https://git.llvm.org/git/compiler-rt.git) will continue to
+be maintained.
+
+Read/write SVN bridge
+^^^^^^^^^^^^^^^^^^^^^
+
+GitHub supports a read/write SVN bridge for its repositories.  However,
+there have been issues with this bridge working correctly in the past,
+so it's not clear if this is something that will be supported going forward.
+
+Monorepo Drawbacks
+------------------
+
+ * Using the monolithic repository may add overhead for those contributing to a
+   standalone sub-project, particularly on runtimes like libcxx and compiler-rt
+   that don't rely on LLVM; currently, a fresh clone of libcxx is only 15MB (vs.
+   1GB for the monorepo), and the commit rate of LLVM may cause more frequent
+   `git push` collisions when upstreaming. Affected contributors may be able to
+   use the SVN bridge or the single-subproject Git mirrors. However, it's
+   undecided if these projects will continue to be maintained.
+ * Using the monolithic repository may add overhead for those *integrating* a
+   standalone sub-project, even if they aren't contributing to it, due to the
+   same disk space concern as the point above. The availability of the
+   sub-project Git mirrors would address this.
+ * Preservation of the existing read/write SVN-based workflows relies on the
+   GitHub SVN bridge, which is an extra dependency. Maintaining this locks us
+   into GitHub and could restrict future workflow changes.
+
+Workflows
+^^^^^^^^^
+
+ * :ref:`Checkout/Clone a Single Project, with Commit Access <workflow-checkout-commit>`.
+ * :ref:`Checkout/Clone Multiple Projects, with Commit Access <workflow-monocheckout-multicommit>`.
+ * :ref:`Commit an API Change in LLVM and Update the Sub-projects <workflow-cross-repo-commit>`.
+ * :ref:`Branching/Stashing/Updating for Local Development or Experiments <workflow-mono-branching>`.
+ * :ref:`Bisecting <workflow-mono-bisecting>`.
+
+Workflow Before/After
+=====================
+
+This section goes through a few examples of workflows, intended to illustrate
+how end-users or developers would interact with the repository for
+various use-cases.
+
+.. _workflow-checkout-commit:
+
+Checkout/Clone a Single Project, with Commit Access
+---------------------------------------------------
+
+Currently
+^^^^^^^^^
+
+::
+
+  # direct SVN checkout
+  svn co https://user@llvm.org/svn/llvm-project/llvm/trunk llvm
+  # or using the read-only Git view, with git-svn
+  git clone http://llvm.org/git/llvm.git
+  cd llvm
+  git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
+  git config svn-remote.svn.fetch :refs/remotes/origin/master
+  git svn rebase -l  # -l avoids fetching ahead of the git mirror.
+
+Commits are performed using `svn commit` or with the sequence `git commit` and
+`git svn dcommit`.
+
+.. _workflow-multicheckout-nocommit:
+
+Monorepo Variant
+^^^^^^^^^^^^^^^^
+
+With the monorepo variant, there are a few options, depending on your
+constraints. First, you could just clone the full repository::
+
+  git clone https://github.com/llvm/llvm-project.git
+
+At this point you have every sub-project (llvm, clang, lld, lldb, ...), which
+:ref:`doesn't imply you have to build all of them <build_single_project>`. You
+can still build only compiler-rt for instance. In this way it's not different
+from someone who would check out all the projects with SVN today.
+
+If you want to avoid checking out all the sources, you can hide the other
+directories using a Git sparse checkout::
+
+  git config core.sparseCheckout true
+  echo /compiler-rt > .git/info/sparse-checkout
+  git read-tree -mu HEAD
+
+The data for all sub-projects is still in your `.git` directory, but in your
+checkout, you only see `compiler-rt`.
+Before you push, you'll need to fetch and rebase (`git pull --rebase`) as
+usual.
+
+Note that when you fetch you'll likely pull in changes to sub-projects you don't
+care about. If you are using sparse checkout, the files from other projects
+won't appear on your disk. The only effect is that your commit hash changes.
+
+You can check whether the changes in the last fetch are relevant to your commit
+by running::
+
+  git log origin/master@{1}..origin/master -- libcxx
+
+This command can be hidden in a script so that `git llvmpush` would perform all
+these steps, fail only if such a dependent change exists, and show immediately
+the change that prevented the push. An immediate repeat of the command would
+(almost) certainly result in a successful push.
+Note that today with SVN or git-svn, this step is not possible since the
+"rebase" implicitly happens while committing (unless a conflict occurs).
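+
+The relevance check at the core of such a script might be sketched as follows.
+The function names are hypothetical, and the fetch, rebase, and push steps of
+an eventual `git llvmpush` are deliberately left out. The sketch assumes the
+upstream branch is tracked as ``origin/master``.
+
```shell
#!/bin/sh
# Hypothetical helper: list commits that the most recent fetch added to
# origin/master and that touch the given sub-project directory.
# Relies on the reflog entry origin/master@{1} being available, which
# requires earlier fetches to have been recorded in the reflog.
fetched_changes_in() {
    subproject=$1
    git log --oneline "origin/master@{1}..origin/master" -- "$subproject"
}

# Succeeds when the last fetch brought in nothing under the sub-project.
# Returns 2 if the range could not be inspected at all.
last_fetch_is_unrelated() {
    changes=$(fetched_changes_in "$1") || return 2
    [ -z "$changes" ]
}
```
+
+For example, ``last_fetch_is_unrelated libcxx`` run from the top of a checkout
+reports whether the commits just fetched touch ``libcxx``.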
+
+Checkout/Clone Multiple Projects, with Commit Access
+----------------------------------------------------
+
+Let's look at how to assemble llvm+clang+libcxx at a given revision.
+
+Currently
+^^^^^^^^^
+
+::
+
+  svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm -r $REVISION
+  cd llvm/tools
+  svn co http://llvm.org/svn/llvm-project/clang/trunk clang -r $REVISION
+  cd ../projects
+  svn co http://llvm.org/svn/llvm-project/libcxx/trunk libcxx -r $REVISION
+
+Or using git-svn::
+
+  git clone http://llvm.org/git/llvm.git
+  cd llvm/
+  git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
+  git config svn-remote.svn.fetch :refs/remotes/origin/master
+  git svn rebase -l
+  git checkout `git svn find-rev -B r258109`
+  cd tools
+  git clone http://llvm.org/git/clang.git
+  cd clang/
+  git svn init https://llvm.org/svn/llvm-project/clang/trunk --username=<username>
+  git config svn-remote.svn.fetch :refs/remotes/origin/master
+  git svn rebase -l
+  git checkout `git svn find-rev -B r258109`
+  cd ../../projects/
+  git clone http://llvm.org/git/libcxx.git
+  cd libcxx
+  git svn init https://llvm.org/svn/llvm-project/libcxx/trunk --username=<username>
+  git config svn-remote.svn.fetch :refs/remotes/origin/master
+  git svn rebase -l
+  git checkout `git svn find-rev -B r258109`
+
+Note that the list would be longer with more sub-projects.
+
+.. _workflow-monocheckout-multicommit:
+
+Monorepo Variant
+^^^^^^^^^^^^^^^^
+
+The repository natively contains the source for every sub-project at the right
+revision, which makes this straightforward::
+
+  git clone https://github.com/llvm/llvm-project.git
+  cd llvm-project
+  git checkout $REVISION
+
+As before, at this point clang, llvm, and libcxx are stored in directories
+alongside each other.
+
+.. _workflow-cross-repo-commit:
+
+Commit an API Change in LLVM and Update the Sub-projects
+--------------------------------------------------------
+
+Today this is possible, even though not common (at least not documented) for
+subversion users and for git-svn users. For example, few Git users try to update
+LLD or Clang in the same commit as they change an LLVM API.
+
+The multirepo variant does not address this: one would have to commit and push
+separately in every individual repository. It would be possible to establish a
+protocol whereby users add a special token to their commit messages that causes
+the umbrella repo's updater bot to group all of them into a single revision.
+
+The monorepo variant handles this natively.
+
+Branching/Stashing/Updating for Local Development or Experiments
+----------------------------------------------------------------
+
+Currently
+^^^^^^^^^
+
+SVN does not allow this use case, but developers that are currently using
+git-svn can do it. Let's look at what this means in practice when dealing with
+multiple sub-projects.
+
+To update the repository to tip of trunk::
+
+  git pull
+  cd tools/clang
+  git pull
+  cd ../../projects/libcxx
+  git pull
+
+To create a new branch::
+
+  git checkout -b MyBranch
+  cd tools/clang
+  git checkout -b MyBranch
+  cd ../../projects/libcxx
+  git checkout -b MyBranch
+
+To switch branches::
+
+  git checkout AnotherBranch
+  cd tools/clang
+  git checkout AnotherBranch
+  cd ../../projects/libcxx
+  git checkout AnotherBranch
+
+.. _workflow-mono-branching:
+
+Monorepo Variant
+^^^^^^^^^^^^^^^^
+
+Regular Git commands are sufficient, because everything is in a single
+repository:
+
+To update the repository to tip of trunk::
+
+  git pull
+
+To create a new branch::
+
+  git checkout -b MyBranch
+
+To switch branches::
+
+  git checkout AnotherBranch
+
+Bisecting
+---------
+
+Assume a developer is looking for a bug in clang (or lld, or lldb, ...).
+
+Currently
+^^^^^^^^^
+
+SVN does not have builtin bisection support, but the single revision across
+sub-projects makes it possible to script around.
+
+Using the existing Git read-only view of the repositories, it is possible to use
+the native Git bisection script over the llvm repository, and use some scripting
+to synchronize the clang repository to match the llvm revision.
+
+.. _workflow-mono-bisecting:
+
+Monorepo Variant
+^^^^^^^^^^^^^^^^
+
+Bisecting on the monorepo is straightforward, and very similar to the above,
+except that the bisection script does not need to include the
+`git submodule update` step.
+
+The same example, finding which commit introduces a regression where clang-3.9
+crashes but clang-3.8 passes, will look like::
+
+  git bisect start releases/3.9.x releases/3.8.x
+  git bisect run ./bisect_script.sh
+
+With the `bisect_script.sh` script being::
+
+  #!/bin/sh
+  cd $BUILD_DIR
+
+  ninja clang || exit 125   # an exit code of 125 asks "git bisect"
+                            # to "skip" the current commit
+
+  ./bin/clang some_crash_test.cpp
+
+Also, since the monorepo handles commits that update multiple projects at once,
+you're less likely to encounter a build failure where one commit changes an API in
+LLVM and a later one "fixes" the build in clang.
+
+Moving Local Branches to the Monorepo
+=====================================
+
+Suppose you have been developing against the existing LLVM git
+mirrors.  You have one or more git branches that you want to migrate
+to the "final monorepo".
+
+The simplest way to migrate such branches is with the
+``migrate-downstream-fork.py`` tool at
+https://github.com/jyknight/llvm-git-migration.
+
+Basic migration
+---------------
+
+Basic instructions for ``migrate-downstream-fork.py`` are in the
+Python script and are expanded on below to a more general recipe::
+
+  # Make a repository which will become your final local mirror of the
+  # monorepo.
+  mkdir my-monorepo
+  git -C my-monorepo init
+
+  # Add a remote to the monorepo.
+  git -C my-monorepo remote add upstream/monorepo https://github.com/llvm/llvm-project.git
+
+  # Add remotes for each git mirror you use, from upstream as well as
+  # your local mirror.  All projects are listed here but you need only
+  # import those for which you have local branches.
+  my_projects=( clang
+                clang-tools-extra
+                compiler-rt
+                debuginfo-tests
+                libcxx
+                libcxxabi
+                libunwind
+                lld
+                lldb
+                llvm
+                openmp
+                polly )
+  for p in ${my_projects[@]}; do
+    git -C my-monorepo remote add upstream/split/${p} https://github.com/llvm-mirror/${p}.git
+    git -C my-monorepo remote add local/split/${p} https://my.local.mirror.org/${p}.git
+  done
+
+  # Pull in all the commits.
+  git -C my-monorepo fetch --all
+
+  # Run migrate-downstream-fork to rewrite local branches on top of
+  # the upstream monorepo.
+  (
+     cd my-monorepo
+     migrate-downstream-fork.py \
+       refs/remotes/local \
+       refs/tags \
+       --new-repo-prefix=refs/remotes/upstream/monorepo \
+       --old-repo-prefix=refs/remotes/upstream/split \
+       --source-kind=split \
+       --revmap-out=monorepo-map.txt
+  )
+
+  # Octopus-merge the resulting local split histories to unify them.
+
+  # Assumes local work on local split mirrors is on master (and
+  # upstream is presumably represented by some other branch like
+  # upstream/master).
+  my_local_branch="master"
+
+  git -C my-monorepo branch --no-track local/octopus/${my_local_branch} \
+    $(git -C my-monorepo merge-base refs/remotes/upstream/monorepo/master \
+                                    refs/remotes/local/split/llvm/${my_local_branch})
+  git -C my-monorepo checkout local/octopus/${my_local_branch}
+
+  subproject_branches=()
+  for p in ${my_projects[@]}; do
+    subproject_branch=${p}/local/monorepo/${my_local_branch}
+    git -C my-monorepo branch ${subproject_branch} \
+      refs/remotes/local/split/${p}/${my_local_branch}
+    if [[ "${p}" != "llvm" ]]; then
+      subproject_branches+=( ${subproject_branch} )
+    fi
+  done
+
+  git -C my-monorepo merge ${subproject_branches[@]}
+
+  for p in ${my_projects[@]}; do
+    subproject_branch=${p}/local/monorepo/${my_local_branch}
+    git -C my-monorepo branch -d ${subproject_branch}
+  done
+
+  # Create local branches for upstream monorepo branches.
+  for ref in $(git -C my-monorepo for-each-ref --format="%(refname)" \
+                   refs/remotes/upstream/monorepo); do
+    upstream_branch=${ref#refs/remotes/upstream/monorepo/}
+    git -C my-monorepo branch upstream/${upstream_branch} ${ref}
+  done
+
+The above gets you to a state like the following::
+
+  U1 - U2 - U3 <- upstream/master
+    \   \    \
+     \   \    - Llld1 - Llld2 -
+      \   \                    \
+       \   - Lclang1 - Lclang2-- Lmerge <- local/octopus/master
+        \                      /
+         - Lllvm1 - Lllvm2-----
+
+Each branched component has its branch rewritten on top of the
+monorepo and all components are unified by a giant octopus merge.
+
+If additional active local branches need to be preserved, the above
+operations following the assignment to ``my_local_branch`` should be
+done for each branch.  Ref paths will need to be updated to map the
+local branch to the corresponding upstream branch.  If local branches
+have no corresponding upstream branch, then the creation of
+``local/octopus/<local branch>`` need not use ``git-merge-base`` to
+pinpoint its root commit; it may simply be branched from the
+appropriate component branch (say, ``llvm/local_release_X``).
+
+Zipping local history
+---------------------
+
+The octopus merge is suboptimal for many cases, because walking back
+through the history of one component leaves the other components fixed
+at a history that likely makes things unbuildable.
+
+Some downstream users track the order commits were made to subprojects
+with some kind of "umbrella" project that imports the project git
+mirrors as submodules, similar to the multirepo umbrella proposed
+above.  Such an umbrella repository looks something like this::
+
+   UM1 ---- UM2 -- UM3 -- UM4 ---- UM5 ---- UM6 ---- UM7 ---- UM8 <- master
+   |        |             |        |        |        |        |
+  Lllvm1   Llld1         Lclang1  Lclang2  Lllvm2   Llld2     Lmyproj1
+
+The vertical bars represent submodule updates to a particular local
+commit in the project mirror.  ``UM3`` in this case is a commit of
+some local umbrella repository state that is not a submodule update,
+perhaps a ``README`` or project build script update.  Commit ``UM8``
+updates a submodule of local project ``myproj``.
+
+The tool ``zip-downstream-fork.py`` at
+https://github.com/greened/llvm-git-migration/tree/zip can be used to
+convert the umbrella history into a monorepo-based history with
+commits in the order implied by submodule updates::
+
+  U1 - U2 - U3 <- upstream/master
+   \    \    \
+    \    -----\---------------                                    local/zip--.
+     \         \              \                                               |
+    - Lllvm1 - Llld1 - UM3 -  Lclang1 - Lclang2 - Lllvm2 - Llld2 - Lmyproj1 <-'
+
+
+The ``U*`` commits represent upstream commits to the monorepo master
+branch.  Each submodule update in the local ``UM*`` commits brought in
+a subproject tree at some local commit.  The trees in the ``L*1``
+commits represent merges from upstream.  These result in edges from
+the ``U*`` commits to their corresponding rewritten ``L*1`` commits.
+The ``L*2`` commits did not do any merges from upstream.
+
+Note that the merge from ``U2`` to ``Lclang1`` appears redundant, but
+if, say, ``U3`` changed some files in upstream clang, the ``Lclang1``
+commit appearing after the ``Llld1`` commit would actually represent a
+clang tree *earlier* in the upstream clang history.  We want the
+``local/zip`` branch to accurately represent the state of our umbrella
+history and so the edge ``U2 -> Lclang1`` is a visual reminder of what
+clang's tree actually looks like in ``Lclang1``.
+
+Even so, the edge ``U3 -> Llld1`` could be problematic for future
+merges from upstream.  git will think that we've already merged from
+``U3``, and we have, except for the state of the clang tree.  One
+possible mitigation strategy is to manually diff clang between ``U2``
+and ``U3`` and apply those updates to ``local/zip``.  Another,
+possibly simpler strategy is to freeze local work on downstream
+branches and merge all submodules from the latest upstream before
+running ``zip-downstream-fork.py``.  If downstream merged each project
+from upstream in lockstep without any intervening local commits, then
+things should be fine without any special action.  We anticipate this
+to be the common case.
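+
+As a rough sketch of the first strategy, assume ``U2`` and ``U3`` are
+shell variables holding the hashes of the corresponding upstream
+monorepo commits (the values below are placeholders) and that the
+zipped branch is named ``local/zip/master`` as in the examples below.
+The patch file name is likewise only illustrative.  If the patch does
+not apply cleanly, ``git apply --3way`` may help::
+
+  U2=REPLACE_WITH_U2_HASH   # placeholder
+  U3=REPLACE_WITH_U3_HASH   # placeholder
+
+  (
+    cd my-monorepo
+    # Capture the upstream clang changes made between U2 and U3.
+    git diff ${U2} ${U3} -- clang > /tmp/clang-U2-U3.patch
+
+    # Apply them on top of the zipped history, then review and commit.
+    git checkout local/zip/master
+    git apply --index /tmp/clang-U2-U3.patch
+    git commit -m "Bring clang up to date with upstream U3"
+  )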
+
+The tree for ``Lclang1`` outside of clang will represent the state of
+things at ``U3`` since all of the upstream projects not participating
+in the umbrella history should be in a state respecting the commit
+``U3``.  The trees for llvm and lld should correctly represent commits
+``Lllvm1`` and ``Llld1``, respectively.
+
+Commit ``UM3`` changed files not related to submodules and we need
+somewhere to put them.  It is not safe in general to put them in the
+monorepo root directory because they may conflict with files in the
+monorepo.  Let's assume we want them in a directory ``local`` in the
+monorepo.
+
+**Example 1: Umbrella looks like the monorepo**
+
+For this example, we'll assume that each subproject appears in its own
+top-level directory in the umbrella, just as they do in the monorepo.
+Let's also assume that we want the files in directory ``myproj`` to
+appear in ``local/myproj``.
+
+Given the above run of ``migrate-downstream-fork.py``, a recipe to
+create the zipped history is below::
+
+  # Import any non-LLVM repositories the umbrella references.
+  git -C my-monorepo remote add localrepo \
+                                https://my.local.mirror.org/localrepo.git
+  git -C my-monorepo fetch localrepo
+
+  subprojects=( clang clang-tools-extra compiler-rt debuginfo-tests libclc
+                libcxx libcxxabi libunwind lld lldb llgo llvm openmp
+                parallel-libs polly pstl )
+
+  # Import histories for upstream split projects (this was probably
+  # already done for the ``migrate-downstream-fork.py`` run).
+  for project in ${subprojects[@]}; do
+    git -C my-monorepo remote add upstream/split/${project} \
+                   https://github.com/llvm-mirror/${project}.git
+    git -C my-monorepo fetch upstream/split/${project}
+  done
+
+  # Import histories for downstream split projects (this was probably
+  # already done for the ``migrate-downstream-fork.py`` run).
+  for project in ${subprojects[@]}; do
+    git -C my-monorepo remote add local/split/${project} \
+                   https://my.local.mirror.org/${project}.git
+    git -C my-monorepo fetch local/split/${project}
+  done
+
+  # Import umbrella history.
+  git -C my-monorepo remote add umbrella \
+                                https://my.local.mirror.org/umbrella.git
+  git -C my-monorepo fetch umbrella
+
+  # Put myproj in local/myproj
+  echo "myproj local/myproj" > my-monorepo/submodule-map.txt
+
+  # Rewrite history
+  (
+    cd my-monorepo
+    zip-downstream-fork.py \
+      refs/remotes/umbrella \
+      --new-repo-prefix=refs/remotes/upstream/monorepo \
+      --old-repo-prefix=refs/remotes/upstream/split \
+      --revmap-in=monorepo-map.txt \
+      --revmap-out=zip-map.txt \
+      --subdir=local \
+      --submodule-map=submodule-map.txt \
+      --update-tags
+   )
+
+   # Create the zip branch (assuming umbrella master is wanted).
+   git -C my-monorepo branch --no-track local/zip/master refs/remotes/umbrella/master
+
+Note that if the umbrella has submodules to non-LLVM repositories,
+``zip-downstream-fork.py`` needs to know about them to be able to
+rewrite commits.  That is why the first step above is to fetch commits
+from such repositories.
+
+With ``--update-tags`` the tool will migrate annotated tags pointing
+to submodule commits that were inlined into the zipped history.  If
+the umbrella pulled in an upstream commit that happened to have a tag
+pointing to it, that tag will be migrated, which is almost certainly
+not what is wanted.  The tag can always be moved back to its original
+commit after rewriting, or the ``--update-tags`` option may be
+discarded and any local tags would then be migrated manually.
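+
+As a sketch of moving such a tag back, assume a hypothetical annotated
+tag named ``some-upstream-tag`` and a shell variable ``ORIG`` holding
+the hash of the commit the tag pointed to before the rewrite (both are
+placeholders).  Note that recreating the tag this way replaces its
+original tagger information and message::
+
+  ORIG=REPLACE_WITH_ORIGINAL_COMMIT_HASH   # placeholder
+
+  # -f overwrites the existing tag; -a -m recreates it as an annotated tag.
+  git -C my-monorepo tag -f -a -m "Restore tag to its original commit" \
+                         some-upstream-tag ${ORIG}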
+
+**Example 2: Nested sources layout**
+
+The tool handles nested submodules (e.g. llvm is a submodule in
+umbrella and clang is a submodule in llvm).  The file
+``submodule-map.txt`` is a list of pairs, one per line.  The first
+pair item describes the path to a submodule in the umbrella
+repository.  The second pair item describes the path where trees for
+that submodule should be written in the zipped history.
+
+Let's say your umbrella repository is actually the llvm repository and
+it has submodules in the "nested sources" layout (clang in
+tools/clang, etc.).  Let's also say ``projects/myproj`` is a submodule
+pointing to some downstream repository.  The submodule map file should
+look like this (we still want myproj mapped the same way as
+previously)::
+
+  tools/clang clang
+  tools/clang/tools/extra clang-tools-extra
+  projects/compiler-rt compiler-rt
+  projects/debuginfo-tests debuginfo-tests
+  projects/libclc libclc
+  projects/libcxx libcxx
+  projects/libcxxabi libcxxabi
+  projects/libunwind libunwind
+  tools/lld lld
+  tools/lldb lldb
+  projects/openmp openmp
+  tools/polly polly
+  projects/myproj local/myproj
+
+If a submodule path does not appear in the map, the tool assumes it
+should be placed in the same place in the monorepo.  That means if you
+use the "nested sources" layout in your umbrella, you *must* provide
+map entries for all of the projects in your umbrella (except llvm).
+Otherwise trees from submodule updates will appear underneath llvm in
+the zipped history.
+
+Because llvm is itself the umbrella, we use ``--subdir`` to write its
+content into ``llvm`` in the zipped history::
+
+  # Import any non-LLVM repositories the umbrella references.
+  git -C my-monorepo remote add localrepo \
+                                https://my.local.mirror.org/localrepo.git
+  git -C my-monorepo fetch localrepo
+
+  subprojects=( clang clang-tools-extra compiler-rt debuginfo-tests libclc
+                libcxx libcxxabi libunwind lld lldb llgo llvm openmp
+                parallel-libs polly pstl )
+
+  # Import histories for upstream split projects (this was probably
+  # already done for the ``migrate-downstream-fork.py`` run).
+  for project in ${subprojects[@]}; do
+    git -C my-monorepo remote add upstream/split/${project} \
+                   https://github.com/llvm-mirror/${project}.git
+    git -C my-monorepo fetch upstream/split/${project}
+  done
+
+  # Import histories for downstream split projects (this was probably
+  # already done for the ``migrate-downstream-fork.py`` run).
+  for project in ${subprojects[@]}; do
+    git -C my-monorepo remote add local/split/${project} \
+                   https://my.local.mirror.org/${project}.git
+    git -C my-monorepo fetch local/split/${project}
+  done
+
+  # Import umbrella history.  We want this under a different refspec
+  # so zip-downstream-fork.py knows what it is.
+  git -C my-monorepo remote add umbrella \
+                                 https://my.local.mirror.org/llvm.git
+  git -C my-monorepo fetch umbrella
+
+  # Create the submodule map.
+  echo "tools/clang clang" > my-monorepo/submodule-map.txt
+  echo "tools/clang/tools/extra clang-tools-extra" >> my-monorepo/submodule-map.txt
+  echo "projects/compiler-rt compiler-rt" >> my-monorepo/submodule-map.txt
+  echo "projects/debuginfo-tests debuginfo-tests" >> my-monorepo/submodule-map.txt
+  echo "projects/libclc libclc" >> my-monorepo/submodule-map.txt
+  echo "projects/libcxx libcxx" >> my-monorepo/submodule-map.txt
+  echo "projects/libcxxabi libcxxabi" >> my-monorepo/submodule-map.txt
+  echo "projects/libunwind libunwind" >> my-monorepo/submodule-map.txt
+  echo "tools/lld lld" >> my-monorepo/submodule-map.txt
+  echo "tools/lldb lldb" >> my-monorepo/submodule-map.txt
+  echo "projects/openmp openmp" >> my-monorepo/submodule-map.txt
+  echo "tools/polly polly" >> my-monorepo/submodule-map.txt
+  echo "projects/myproj local/myproj" >> my-monorepo/submodule-map.txt
+
+  # Rewrite history
+  (
+    cd my-monorepo
+    zip-downstream-fork.py \
+      refs/remotes/umbrella \
+      --new-repo-prefix=refs/remotes/upstream/monorepo \
+      --old-repo-prefix=refs/remotes/upstream/split \
+      --revmap-in=monorepo-map.txt \
+      --revmap-out=zip-map.txt \
+      --subdir=llvm \
+      --submodule-map=submodule-map.txt \
+      --update-tags
+   )
+
+   # Create the zip branch (assuming umbrella master is wanted).
+   git -C my-monorepo branch --no-track local/zip/master refs/remotes/umbrella/master
+
+
+Comments at the top of ``zip-downstream-fork.py`` describe in more
+detail how the tool works and various implications of its operation.
+
+Importing local repositories
+----------------------------
+
+You may have additional repositories that integrate with the LLVM
+ecosystem, essentially extending it with new tools.  If such
+repositories are tightly coupled with LLVM, it may make sense to
+import them into your local mirror of the monorepo.
+
+If such repositories participated in the umbrella repository used
+during the zipping process above, they will automatically be added to
+the monorepo.  For downstream repositories that don't participate in
+an umbrella setup, the ``import-downstream-repo.py`` tool at
+https://github.com/greened/llvm-git-migration/tree/import can help with
+getting them into the monorepo.  A recipe follows::
+
+  # Import downstream repo history into the monorepo.
+  git -C my-monorepo remote add myrepo https://my.local.mirror.org/myrepo.git
+  git fetch myrepo
+
+  my_local_tags=( refs/tags/release
+                  refs/tags/hotfix )
+
+  (
+    cd my-monorepo
+    import-downstream-repo.py \
+      refs/remotes/myrepo \
+      ${my_local_tags[@]} \
+      --new-repo-prefix=refs/remotes/upstream/monorepo \
+      --subdir=myrepo \
+      --tag-prefix="myrepo-"
+   )
+
+   # Preserve release branches.
+   for ref in $(git -C my-monorepo for-each-ref --format="%(refname)" \
+                  refs/remotes/myrepo/release); do
+     branch=${ref#refs/remotes/myrepo/}
+     git -C my-monorepo branch --no-track myrepo/${branch} ${ref}
+   done
+
+   # Preserve master.
+   git -C my-monorepo branch --no-track myrepo/master refs/remotes/myrepo/master
+
+   # Merge master.
+   git -C my-monorepo checkout local/zip/master  # Or local/octopus/master
+   git -C my-monorepo merge myrepo/master
+
+You may want to merge other corresponding branches, for example
+``myrepo`` release branches if they were in lockstep with LLVM project
+releases.
+
+``--tag-prefix`` tells ``import-downstream-repo.py`` to rename
+annotated tags with the given prefix.  Due to limitations with
+``fast_filter_branch.py``, unannotated tags cannot be renamed
+(``fast_filter_branch.py`` considers them branches, not tags).  Since
+the upstream monorepo had its tags rewritten with an "llvmorg-"
+prefix, name conflicts should not be an issue.  ``--tag-prefix`` can
+be used to more clearly indicate which tags correspond to various
+imported repositories.
+
+Given this repository history::
+
+  R1 - R2 - R3 <- master
+       ^
+       |
+    release/1
+
+The above recipe results in a history like this::
+
+  U1 - U2 - U3 <- upstream/master
+   \    \    \
+    \    -----\---------------                                         local/zip--.
+     \         \              \                                                    |
+    - Lllvm1 - Llld1 - UM3 -  Lclang1 - Lclang2 - Lllvm2 - Llld2 - Lmyproj1 - M1 <-'
+                                                                             /
+                                                                 R1 - R2 - R3  <-.
+                                                                      ^           |
+                                                                      |           |
+                                                               myrepo-release/1   |
+                                                                                  |
+                                                                   myrepo/master--'
+
+Commits ``R1``, ``R2`` and ``R3`` have trees that *only* contain blobs
+from ``myrepo``.  If you require commits from ``myrepo`` to be
+interleaved with commits on local project branches (for example,
+interleaved with ``llvm1``, ``llvm2``, etc. above) and myrepo doesn't
+appear in an umbrella repository, a new tool will need to be
+developed.  Creating such a tool would involve:
+
+1. Modifying ``fast_filter_branch.py`` to optionally take a
+   revlist directly rather than generating it itself
+
+2. Creating a tool to generate an interleaved ordering of local
+   commits based on some criteria (``zip-downstream-fork.py`` uses the
+   umbrella history as its criterion)
+
+3. Generating such an ordering and feeding it to
+   ``fast_filter_branch.py`` as a revlist
+
+Some care will also likely need to be taken to handle merge commits,
+to ensure the parents of such commits migrate correctly.
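+
+As a rough illustration of step 2, one possible criterion is commit
+date.  Assuming a hypothetical local branch ``local/llvm`` whose
+history should be interleaved with ``myrepo/master``, ``git rev-list``
+can already produce such an ordering.  The output file name is only
+illustrative, and feeding the list to ``fast_filter_branch.py`` would
+still require the modification described in step 1::
+
+  # List the commits of both branches, oldest first, ordered by commit
+  # timestamp while never listing a child before its parents.
+  git -C my-monorepo rev-list --reverse --date-order \
+      local/llvm myrepo/master > interleaved-revlist.txt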
+
+Scrubbing the Local Monorepo
+----------------------------
+
+Once all of the migrating, zipping and importing is done, it's time to
+clean up.  The python tools use ``git-fast-import`` which leaves a lot
+of cruft around and we want to shrink our new monorepo mirror as much
+as possible.  Here is one way to do it::
+
+  git -C my-monorepo checkout master
+
+  # Delete branches we no longer need.  Do this for any other branches
+  # you merged above.
+  git -C my-monorepo branch -D local/zip/master || true
+  git -C my-monorepo branch -D local/octopus/master || true
+
+  # Remove remotes.
+  git -C my-monorepo remote remove upstream/monorepo
+
+  for p in ${my_projects[@]}; do
+    git -C my-monorepo remote remove upstream/split/${p}
+    git -C my-monorepo remote remove local/split/${p}
+  done
+
+  git -C my-monorepo remote remove localrepo
+  git -C my-monorepo remote remove umbrella
+  git -C my-monorepo remote remove myrepo
+
+  # Add anything else here you don't need.  refs/tags/release is
+  # listed below assuming tags have been rewritten with a local prefix.
+  # If not, remove it from this list.
+  refs_to_clean=(
+    refs/original
+    refs/remotes
+    refs/tags/backups
+    refs/tags/release
+  )
+
+  git -C my-monorepo for-each-ref --format="%(refname)" ${refs_to_clean[@]} |
+    xargs -n1 --no-run-if-empty git -C my-monorepo update-ref -d
+
+  git -C my-monorepo reflog expire --all --expire=now
+
+  # fast_filter_branch.py might have gc running in the background.
+  while ! git -C my-monorepo \
+    -c gc.reflogExpire=0 \
+    -c gc.reflogExpireUnreachable=0 \
+    -c gc.rerereresolved=0 \
+    -c gc.rerereunresolved=0 \
+    -c gc.pruneExpire=now \
+    gc --prune=now; do
+    continue
+  done
+
+  # Takes a LOOOONG time!
+  git -C my-monorepo repack -A -d -f --depth=250 --window=250
+
+  git -C my-monorepo prune-packed
+  git -C my-monorepo prune
+
+You should now have a trim monorepo.  Upload it to your git server and
+happy hacking!
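+
+Before uploading, it may be worth checking the result.  A minimal
+sketch using standard git commands::
+
+  # Report the object count and on-disk size of the repository.
+  git -C my-monorepo count-objects -vH
+
+  # Verify the connectivity and validity of the remaining objects.
+  git -C my-monorepo fsck --full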
+
+References
+==========
+
+.. [LattnerRevNum] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041739.html
+.. [TrickRevNum] Andrew Trick, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041721.html
+.. [JSonnRevNum] Joerg Sonnenberg, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041688.html
+.. [MatthewsRevNum] Chris Matthews, http://lists.llvm.org/pipermail/cfe-dev/2016-July/049886.html
+.. [statuschecks] GitHub status-checks, https://help.github.com/articles/about-required-status-checks/

Added: www-releases/trunk/9.0.0/docs/_sources/Proposals/TestSuite.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Proposals/TestSuite.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Proposals/TestSuite.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Proposals/TestSuite.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,321 @@
+=====================
+Test-Suite Extensions
+=====================
+
+.. contents::
+   :depth: 1
+   :local:
+
+Abstract
+========
+
+These are ideas for additional programs, benchmarks, applications and
+algorithms that could be added to the LLVM Test-Suite.
+The test-suite could be much larger than it is now, which would help us
+detect compiler errors (crashes, miscompiles) during development.
+
+Most probably, the reason why the programs below have not been added to
+the test-suite yet is that nobody has found time to do it. But there
+might be other issues as well, such as
+
+ * Licensing (Support can still be added as external module,
+              like for the SPEC benchmarks)
+
+ * Language (in particular, there is no official LLVM frontend
+             for Fortran yet)
+
+ * Parallelism (currently, all programs in test-suite use
+                one thread only)
+
+Benchmarks
+==========
+
+SPEC CPU 2017
+-------------
+https://www.spec.org/cpu2017/
+
+The following have not been included yet because they contain Fortran
+code.
+
+In the case of cactuBSSN, only a small portion is Fortran. The host's
+Fortran compiler could be used for these parts.
+
+Note that CMake's Ninja generator has difficulties with Fortran. See the
+`CMake documentation <https://cmake.org/cmake/help/v3.13/generator/Ninja.html#fortran-support>`_
+for details.
+
+ * 503.bwaves_r/603.bwaves_s
+ * 507.cactuBSSN_r
+ * 521.wrf_r/621.wrf_s
+ * 527.cam4_r/627.cam4_s
+ * 628.pop2_s
+ * 548.exchange2_r/648.exchange2_s
+ * 549.fotonik3d_r/649.fotonik3d_s
+ * 554.roms_r/654.roms_s
+
+SPEC OMP2012
+------------
+https://www.spec.org/omp2012/
+
+ * 350.md
+ * 351.bwaves
+ * 352.nab
+ * 357.bt331
+ * 358.botsalgn
+ * 359.botsspar
+ * 360.ilbdc
+ * 362.fma3d
+ * 363.swim
+ * 367.imagick
+ * 370.mgrid331
+ * 371.applu331
+ * 372.smithwa
+ * 376.kdtree
+
+OpenCV
+------
+https://opencv.org/
+
+OpenMP 4.x SIMD Benchmarks
+--------------------------
+https://github.com/flwende/simd_benchmarks
+
+PWM-benchmarking
+----------------
+https://github.com/tbepler/PWM-benchmarking
+
+SLAMBench
+---------
+https://github.com/pamela-project/slambench
+
+FireHose
+--------
+http://firehose.sandia.gov/
+
+A Benchmark for the C/C++ Standard Library
+------------------------------------------
+https://github.com/hiraditya/std-benchmark
+
+OpenBenchmarking.org CPU / Processor Suite
+------------------------------------------
+https://openbenchmarking.org/suite/pts/cpu
+
+This is a subset of the
+`Phoronix Test Suite <https://github.com/phoronix-test-suite/phoronix-test-suite/>`_
+and is itself a collection of benchmark suites.
+
+Parboil Benchmarks
+------------------
+http://impact.crhc.illinois.edu/parboil/parboil.aspx
+
+MachSuite
+---------
+https://breagen.github.io/MachSuite/
+
+Rodinia
+-------
+http://lava.cs.virginia.edu/Rodinia/download_links.htm
+
+Rodinia has already been partially included in
+MultiSource/Benchmarks/Rodinia. Benchmarks still missing are:
+
+ * streamcluster
+ * particlefilter
+ * nw
+ * nn
+ * myocyte
+ * mummergpu
+ * lud
+ * leukocyte
+ * lavaMD
+ * kmeans
+ * hotspot3D
+ * heartwall
+ * cfd
+ * bfs
+ * b+tree
+
+vecmathlib tests harness
+------------------------
+https://bitbucket.org/eschnett/vecmathlib/wiki/Home
+
+PARSEC
+------
+http://parsec.cs.princeton.edu/
+
+Graph500 reference implementations
+----------------------------------
+https://github.com/graph500/graph500/tree/v2-spec
+
+NAS Parallel Benchmarks
+-----------------------
+https://www.nas.nasa.gov/publications/npb.html
+
+The official benchmark is written in Fortran, but an unofficial
+C-translation is available as well:
+https://github.com/benchmark-subsetting/NPB3.0-omp-C
+
+DARPA HPCS SSCA#2 C/OpenMP reference implementation
+---------------------------------------------------
+http://www.highproductivity.org/SSCABmks.htm
+
+This web site does not exist any more, but there seems to be a copy of
+some of the benchmarks
+https://github.com/gtcasl/hpc-benchmarks/tree/master/SSCA2v2.2
+
+Kokkos
+------
+https://github.com/kokkos/kokkos-kernels/tree/master/perf_test
+https://github.com/kokkos/kokkos/tree/master/benchmarks
+
+PolyMage
+--------
+https://github.com/bondhugula/polymage-benchmarks
+
+PolyBench
+---------
+https://sourceforge.net/projects/polybench/
+
+A modified version of PolyBench 3.2 is already present in
+SingleSource/Benchmarks/Polybench. A newer version 4.2.1 is available.
+
+High Performance Geometric Multigrid
+------------------------------------
+https://crd.lbl.gov/departments/computer-science/PAR/research/hpgmg/
+
+RAJA Performance Suite
+----------------------
+https://github.com/LLNL/RAJAPerf
+
+CORAL-2 Benchmarks
+------------------
+https://asc.llnl.gov/coral-2-benchmarks/
+
+Many of its programs have already been integrated in
+MultiSource/Benchmarks/DOE-ProxyApps-C and
+MultiSource/Benchmarks/DOE-ProxyApps-C++.
+
+ * Nekbone
+ * QMCPack
+ * LAMMPS
+ * Kripke
+ * Quicksilver
+ * PENNANT
+ * Big Data Analytic Suite
+ * Deep Learning Suite
+ * Stream
+ * Stride
+ * ML/DL micro-benchmark
+ * Pynamic
+ * ACME
+ * VPIC
+ * Laghos
+ * Parallel Integer Sort
+ * Havoq
+
+NWChem
+------
+http://www.nwchem-sw.org/index.php/Benchmarks
+
+TVM
+----
+https://github.com/dmlc/tvm/tree/master/apps/benchmark
+
+HydroBench
+----------
+https://github.com/HydroBench/Hydro
+
+ParRes
+------
+https://github.com/ParRes/Kernels/tree/master/Cxx11
+
+Applications/Libraries
+======================
+
+GnuPG
+-----
+https://gnupg.org/
+
+Blitz++
+-------
+https://sourceforge.net/projects/blitz/
+
+FFmpeg
+------
+https://ffmpeg.org/
+
+FreePOOMA
+---------
+http://www.nongnu.org/freepooma/
+
+FTensors
+--------
+http://www.wlandry.net/Projects/FTensor
+
+rawspeed
+--------
+https://github.com/darktable-org/rawspeed
+
+Its test dataset is 756 MB in size, which is too large to be included
+into the test-suite repository.
+
+C++ Performance Benchmarks
+--------------------------
+https://gitlab.com/chriscox/CppPerformanceBenchmarks
+
+Generic Algorithms
+==================
+
+Image processing
+----------------
+
+Resampling
+``````````
+
+ * Bilinear
+ * Bicubic
+ * Lanczos
+
+Dither
+``````
+
+ * Threshold
+ * Random
+ * Halftone
+ * Bayer
+ * Floyd-Steinberg
+ * Jarvis
+ * Stucki
+ * Burkes
+ * Sierra
+ * Atkinson
+ * Gradient-based
+
+Feature detection
+`````````````````
+
+ * Harris
+ * Histogram of Oriented Gradients
+
+Color conversion
+````````````````
+
+ * RGB to grayscale
+ * HSL to RGB
+
+Graph
+-----
+
+Search Algorithms
+`````````````````
+
+ * Breadth-First-Search
+ * Depth-First-Search
+ * Dijkstra's algorithm
+ * A-Star
+
+Spanning Tree
+`````````````
+
+ * Kruskal's algorithm
+ * Prim's algorithm

Added: www-releases/trunk/9.0.0/docs/_sources/Proposals/VariableNames.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Proposals/VariableNames.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Proposals/VariableNames.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Proposals/VariableNames.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,399 @@
+===================
+Variable Names Plan
+===================
+
+.. contents::
+   :local:
+
+This plan is *provisional*. It is not agreed upon. It is written with the
+intention of capturing the desires and concerns of the LLVM community, and
+forming them into a plan that can be agreed upon.
+The original author is somewhat naïve in the ways of LLVM so there will
+inevitably be some details that are flawed. You can help - you can edit this
+page (preferably with a Phabricator review for larger changes) or reply to the
+`Request For Comments thread
+<http://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html>`_.
+
+Too Long; Didn't Read
+=====================
+
+Improve the readability of LLVM code.
+
+Introduction
+============
+
+The current `variable naming rule
+<../CodingStandards.html#name-types-functions-variables-and-enumerators-properly>`_
+states:
+
+  Variable names should be nouns (as they represent state). The name should be
+  camel case, and start with an upper case letter (e.g. Leader or Boats).
+
+This rule is the same as that for type names. This is a problem because the
+type name cannot be reused for a variable name [*]_. LLVM developers tend to
+work around this by either prepending ``The`` to the type name::
+
+  Triple TheTriple;
+
+... or more commonly use an acronym, despite the coding standard stating "Avoid
+abbreviations unless they are well known"::
+
+  Triple T;
+
+The proliferation of acronyms leads to hard-to-read code such as `this
+<https://github.com/llvm/llvm-project/blob/0a8bc14ad7f3209fe702d18e250194cd90188596/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp#L7445>`_::
+
+  InnerLoopVectorizer LB(L, PSE, LI, DT, TLI, TTI, AC, ORE, VF.Width, IC,
+                         &LVL, &CM);
+
+Many other coding guidelines [LLDB]_ [Google]_ [WebKit]_ [Qt]_ [Rust]_ [Swift]_
+[Python]_ require that variable names begin with a lower case letter in contrast
+to class names which begin with a capital letter. This convention means that the
+most readable variable name also requires the least thought::
+
+  Triple triple;
+
+There is some agreement that the current rule is broken [LattnerAgree]_
+[ArsenaultAgree]_ [RobinsonAgree]_ and that acronyms are an obstacle to reading
+new code [MalyutinDistinguish]_ [CarruthAcronym]_ [PicusAcronym]_. There are
+some opposing views [ParzyszekAcronym2]_ [RicciAcronyms]_.
+
+This work-in-progress proposal is to change the coding standard for variable
+names to require that they start with a lower case letter.
+
+.. [*] In `some cases
+   <https://github.com/llvm/llvm-project/blob/8b72080d4d7b13072f371712eed333f987b7a18e/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp#L2727>`_
+   the type name *is* reused as a variable name, but this shadows the type name
+   and confuses many debuggers [DenisovCamelBack]_.
+
+Variable Names Coding Standard Options
+======================================
+
+There are two main options for variable names that begin with a lower case
+letter: ``camelBack`` and ``lower_case``. (These are also known by other names
+but here we use the terminology from clang-tidy).
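+
+As an illustration of that terminology, clang-tidy's
+``readability-identifier-naming`` check accepts these style names as
+option values.  The sketch below assumes a ``build`` directory that
+contains a ``compile_commands.json`` and uses
+``llvm/lib/Support/Triple.cpp`` purely as an example input; it reports
+variables whose names are not ``camelBack`` without modifying any
+files::
+
+  clang-tidy -p build \
+    -checks='-*,readability-identifier-naming' \
+    -config="{CheckOptions: [{key: readability-identifier-naming.VariableCase, value: camelBack}]}" \
+    llvm/lib/Support/Triple.cpp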
+
+``camelBack`` is consistent with [WebKit]_, [Qt]_ and [Swift]_ while
+``lower_case`` is consistent with [LLDB]_, [Google]_, [Rust]_ and [Python]_.
+
+``camelBack`` is already used for function names, which may be considered an
+advantage [LattnerFunction]_ or a disadvantage [CarruthFunction]_.
+
+Approval for ``camelBack`` was expressed by [DenisovCamelBack]_
+[LattnerFunction]_ [IvanovicDistinguish]_.
+Opposition to ``camelBack`` was expressed by [CarruthCamelBack]_
+[TurnerCamelBack]_.
+Approval for ``lower_case`` was expressed by [CarruthLower]_
+[CarruthCamelBack]_ [TurnerLLDB]_.
+Opposition to ``lower_case`` was expressed by [LattnerLower]_.
+
+Differentiating variable kinds
+------------------------------
+
+An additional requested change is to distinguish between different kinds of
+variables [RobinsonDistinguish]_ [RobinsonDistinguish2]_ [JonesDistinguish]_
+[IvanovicDistinguish]_ [CarruthDistinguish]_ [MalyutinDistinguish]_.
+
+Others oppose this idea [HähnleDistinguish]_ [GreeneDistinguish]_
+[HendersonPrefix]_.
+
+A possibility is for member variables to be prefixed with ``m_`` and for global
+variables to be prefixed with ``g_`` to distinguish them from local variables.
+This is consistent with [LLDB]_. The ``m_`` prefix is consistent with [WebKit]_.
+
+A variation is for member variables to be prefixed with ``m``
+[IvanovicDistinguish]_ [BeylsDistinguish]_. This is consistent with [Mozilla]_.
+
+Another option is for member variables to be suffixed with ``_`` which is
+consistent with [Google]_ and similar to [Python]_. Opposed by
+[ParzyszekDistinguish]_.
+
+Reducing the number of acronyms
+===============================
+
+While switching coding standard will make it easier to use non-acronym names for
+new code, it doesn't improve the existing large body of code that uses acronyms
+extensively to the detriment of its readability. Further, it is natural and
+generally encouraged that new code be written in the style of the surrounding
+code. Therefore it is likely that much newly written code will also use
+acronyms despite what the coding standard says, much as it is today.
+
+As well as changing the case of variable names, they could also be expanded to
+their non-acronym form e.g. ``Triple T`` → ``Triple triple``.
+
+There is support for expanding many acronyms [CarruthAcronym]_ [PicusAcronym]_
+but there is a preference that expanding acronyms be deferred
+[ParzyszekAcronym]_ [CarruthAcronym]_.
+
+The consensus within the community seems to be that at least some acronyms are
+valuable [ParzyszekAcronym]_ [LattnerAcronym]_. The most commonly cited acronym
+is ``TLI`` however that is used to refer to both ``TargetLowering`` and
+``TargetLibraryInfo`` [GreeneDistinguish]_.
+
+The following is a list of acronyms considered sufficiently useful that the
+benefit of using them outweighs the cost of learning them. Acronyms that are
+either not on the list or are used to refer to a different type should be
+expanded.
+
+============================ =============
+Class name                   Variable name
+============================ =============
+DeterministicFiniteAutomaton dfa
+DominatorTree                dt
+LoopInfo                     li
+MachineFunction              mf
+MachineInstr                 mi
+MachineRegisterInfo          mri
+ScalarEvolution              se
+TargetInstrInfo              tii
+TargetLibraryInfo            tli
+TargetRegisterInfo           tri
+============================ =============
+
+In some cases renaming acronyms to the full type name will result in overly
+verbose code. Unlike most classes, a variable's scope is limited and therefore
+some of its purpose can be implied from that scope, meaning that fewer words are
+necessary to give it a clear name. For example, in an optimization pass the reader
+can assume that a variable's purpose relates to optimization and therefore an
+``OptimizationRemarkEmitter`` variable could be given the name ``remarkEmitter``
+or even ``remarker``.
+
+The following is a list of longer class names and the associated shorter
+variable name.
+
+========================= =============
+Class name                Variable name
+========================= =============
+BasicBlock                block
+ConstantExpr              expr
+ExecutionEngine           engine
+MachineOperand            operand
+OptimizationRemarkEmitter remarker
+PreservedAnalyses         analyses
+PreservedAnalysesChecker  checker
+TargetLowering            lowering
+TargetMachine             machine
+========================= =============
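+
+As a minimal sketch only, the function below shows how names from both lists
+might read together. The struct definitions are hypothetical stand-ins invented
+so that the example compiles on its own; they are not the real LLVM classes, and
+``countLoopInstructions`` is likewise an invented name.
+
+.. code-block:: c++
+
+   #include <cstddef>
+   #include <vector>
+
+   // Hypothetical stand-ins; the real LLVM classes are far richer.
+   struct BasicBlock { std::size_t numInstructions = 0; };
+   struct DominatorTree { bool valid = true; };
+   struct LoopInfo { std::vector<BasicBlock *> loopBlocks; };
+   struct OptimizationRemarkEmitter { unsigned remarksEmitted = 0; };
+
+   // Acronyms from the first list stay short (dt, li); longer class names
+   // take the shortened form from the second list (block, remarker).
+   std::size_t countLoopInstructions(const DominatorTree &dt,
+                                     const LoopInfo &li,
+                                     OptimizationRemarkEmitter &remarker) {
+     if (!dt.valid)
+       return 0;
+     std::size_t total = 0;
+     for (const BasicBlock *block : li.loopBlocks)
+       total += block->numInstructions;
+     ++remarker.remarksEmitted; // Pretend one remark was emitted.
+     return total;
+   }
+
+Within a function this small, the parameter types already say what ``dt`` and
+``li`` are, which is the argument made above for keeping a short list of
+acronyms.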
+
+Transition Options
+==================
+
+There are three main options for transitioning:
+
+1. Keep the current coding standard
+2. Laissez faire
+3. Big bang
+
+Keep the current coding standard
+--------------------------------
+
+Proponents of keeping the current coding standard (i.e. not transitioning at
+all) question whether the cost of transition outweighs the benefit
+[EmersonConcern]_ [ReamesConcern]_ [BradburyConcern]_.
+The costs are that ``git blame`` will become less usable and that merging the
+changes will be costly for downstream maintainers. See `Big bang`_ for potential
+mitigations.
+
+Laissez faire
+-------------
+
+The coding standard could allow both ``CamelCase`` and ``camelBack`` styles for
+variable names [LattnerTransition]_.
+
+A code review to implement this is at https://reviews.llvm.org/D57896.
+
+Advantages
+**********
+
+ * Very easy to implement initially.
+
+Disadvantages
+*************
+
+ * Leads to inconsistency [BradburyConcern]_ [AminiInconsistent]_.
+ * Inconsistency means it will be hard to know at a guess what name a variable
+   will have [DasInconsistent]_ [CarruthInconsistent]_.
+ * Some large-scale renaming may happen anyway, leading to its disadvantages
+   without any mitigations.
+
+Big bang
+--------
+
+With this approach, variables will be renamed by an automated script in a series
+of large commits.
+
+The principal advantage of this approach is that it minimises the cost of
+inconsistency [BradburyTransition]_ [RobinsonTransition]_.
+
+It goes against a policy of avoiding large-scale reformatting of existing code
+[GreeneDistinguish]_.
+
+It has been suggested that LLD would be a good starter project for the renaming
+[Ueyama]_.
+
+Keeping git blame usable
+************************
+
+``git blame`` (or ``git annotate``) permits quickly identifying the commit that
+changed a given line in a file. After renaming variables, many lines will show
+as being changed by that one commit, requiring a further invocation of ``git
+blame`` to identify prior, more interesting commits [GreeneGitBlame]_
+[RicciAcronyms]_.
+
+**Mitigation**: `git-hyper-blame
+<https://commondatastorage.googleapis.com/chrome-infra-docs/flat/depot_tools/docs/html/git-hyper-blame.html>`_
+can ignore or "look through" a given set of commits.
+A ``.git-blame-ignore-revs`` file identifying the variable renaming commits
+could be added to the LLVM git repository root directory.
+It is being `investigated
+<https://public-inbox.org/git/20190324235020.49706-1-michael@platin.gs/>`_
+whether similar functionality could be added to ``git blame`` itself.
+
+Minimising cost of downstream merges
+************************************
+
+There are many forks of LLVM with downstream changes. Merging a large-scale
+renaming change could be difficult for the fork maintainers.
+
+**Mitigation**: A large-scale renaming would be automated. A fork maintainer can
+merge from the commit immediately before the renaming, then apply the renaming
+script to their own branch. They can then merge again from the renaming commit,
+resolving all conflicts by choosing their own version. This could be tested on
+the [SVE]_ fork.
+
+Provisional Plan
+================
+
+This is a provisional plan for the `Big bang`_ approach. It has not been agreed.
+
+#. Investigate improving ``git blame``. The extent to which it can be made to
+   "look through" commits may impact how big a change can be made.
+
+#. Write a script to expand acronyms.
+
+#. Experiment and perform dry runs of the various refactoring options.
+   Results can be published in forks of the LLVM Git repository.
+
+#. Consider the evidence and agree on the new policy.
+
+#. Agree & announce a date for the renaming of the starter project (LLD).
+
+#. Update the `policy page <../CodingStandards.html>`_. This will explain the
+   old and new rules and which projects each applies to.
+
+#. Refactor the starter project in two commits:
+
+   1. Add or change the project's ``.clang-tidy`` to reflect the agreed rules.
+      (This is in a separate commit to enable the merging process described in
+      `Minimising cost of downstream merges`_).
+      Also update the project list on the policy page.
+   2. Apply ``clang-tidy`` to the project's files, with only the
+      ``readability-identifier-naming`` rules enabled. ``clang-tidy`` will also
+      reformat the affected lines according to the rules in ``.clang-format``.
+      It is anticipated that this will be a good dog-fooding opportunity for
+      clang-tidy, and bugs should be fixed in the process, likely including:
+
+        * `readability-identifier-naming incorrectly fixes lambda capture
+          <https://bugs.llvm.org/show_bug.cgi?id=41119>`_.
+        * `readability-identifier-naming incorrectly fixes variables which
+          become keywords <https://bugs.llvm.org/show_bug.cgi?id=41120>`_.
+        * `readability-identifier-naming misses fixing member variables in
+          destructor <https://bugs.llvm.org/show_bug.cgi?id=41122>`_.
+
+#. Gather feedback and refine the process as appropriate.
+
+#. Apply the process to the following projects, with a suitable delay between
+   each (at least 4 weeks after the first change, at least 2 weeks subsequently)
+   to allow gathering further feedback.
+   This list should exclude projects that must adhere to an externally defined
+   standard e.g. libcxx.
+   The list is roughly in chronological order of renaming.
+   Some items may not make sense to rename individually - it is expected that
+   this list will change following experimentation:
+
+   * TableGen
+   * llvm/tools
+   * clang-tools-extra
+   * clang
+   * ARM backend
+   * AArch64 backend
+   * AMDGPU backend
+   * ARC backend
+   * AVR backend
+   * BPF backend
+   * Hexagon backend
+   * Lanai backend
+   * MIPS backend
+   * NVPTX backend
+   * PowerPC backend
+   * RISC-V backend
+   * Sparc backend
+   * SystemZ backend
+   * WebAssembly backend
+   * X86 backend
+   * XCore backend
+   * libLTO
+   * Debug Information
+   * Remainder of llvm
+   * compiler-rt
+   * libunwind
+   * openmp
+   * parallel-libs
+   * polly
+   * lldb
+
+#. Remove the old variable name rule from the policy page.
+
+#. Repeat many of the steps in the sequence, using a script to expand acronyms.
+
+References
+==========
+
+.. [LLDB] LLDB Coding Conventions https://llvm.org/svn/llvm-project/lldb/branches/release_39/www/lldb-coding-conventions.html
+.. [Google] Google C++ Style Guide https://google.github.io/styleguide/cppguide.html#Variable_Names
+.. [WebKit] WebKit Code Style Guidelines https://webkit.org/code-style-guidelines/#names
+.. [Qt] Qt Coding Style https://wiki.qt.io/Qt_Coding_Style#Declaring_variables
+.. [Rust] Rust naming conventions https://doc.rust-lang.org/1.0.0/style/style/naming/README.html
+.. [Swift] Swift API Design Guidelines https://swift.org/documentation/api-design-guidelines/#general-conventions
+.. [Python] Style Guide for Python Code https://www.python.org/dev/peps/pep-0008/#function-and-variable-names
+.. [Mozilla] Mozilla Coding style: Prefixes https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#Prefixes
+.. [SVE] LLVM with support for SVE https://github.com/ARM-software/LLVM-SVE
+.. [AminiInconsistent] Mehdi Amini, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130329.html
+.. [ArsenaultAgree] Matt Arsenault, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129934.html
+.. [BeylsDistinguish] Kristof Beyls, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130292.html
+.. [BradburyConcern] Alex Bradbury, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130266.html
+.. [BradburyTransition] Alex Bradbury, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130388.html
+.. [CarruthAcronym] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130313.html
+.. [CarruthCamelBack] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130214.html
+.. [CarruthDistinguish] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130310.html
+.. [CarruthFunction] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130309.html
+.. [CarruthInconsistent] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130312.html
+.. [CarruthLower] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130430.html
+.. [DasInconsistent] Sanjoy Das, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130304.html
+.. [DenisovCamelBack] Alex Denisov, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130179.html
+.. [EmersonConcern] Amara Emerson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129894.html
+.. [GreeneDistinguish] David Greene, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130425.html
+.. [GreeneGitBlame] David Greene, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130228.html
+.. [HendersonPrefix] James Henderson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130465.html
+.. [HähnleDistinguish] Nicolai Hähnle, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129923.html
+.. [IvanovicDistinguish] Nemanja Ivanovic, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130249.html
+.. [JonesDistinguish] JD Jones, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129926.html
+.. [LattnerAcronym] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130353.html
+.. [LattnerAgree] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129907.html
+.. [LattnerFunction] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130630.html
+.. [LattnerLower] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130629.html
+.. [LattnerTransition] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130355.html
+.. [MalyutinDistinguish] Danila Malyutin, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130320.html
+.. [ParzyszekAcronym] Krzysztof Parzyszek, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130306.html
+.. [ParzyszekAcronym2] Krzysztof Parzyszek, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130323.html
+.. [ParzyszekDistinguish] Krzysztof Parzyszek, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129941.html
+.. [PicusAcronym] Diana Picus, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130318.html
+.. [ReamesConcern] Philip Reames, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130181.html
+.. [RicciAcronyms] Bruno Ricci, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130328.html
+.. [RobinsonAgree] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130111.html
+.. [RobinsonDistinguish] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129920.html
+.. [RobinsonDistinguish2] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130229.html
+.. [RobinsonTransition] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130415.html
+.. [TurnerCamelBack] Zachary Turner, https://reviews.llvm.org/D57896#1402264
+.. [TurnerLLDB] Zachary Turner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130213.html
+.. [Ueyama] Rui Ueyama, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130435.html

Added: www-releases/trunk/9.0.0/docs/_sources/Proposals/VectorizationPlan.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Proposals/VectorizationPlan.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Proposals/VectorizationPlan.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Proposals/VectorizationPlan.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,247 @@
+==================
+Vectorization Plan
+==================
+
+.. contents::
+   :local:
+
+Abstract
+========
+The vectorization transformation can be rather complicated, involving several
+potential alternatives, especially for outer-loops [1]_ but also possibly for
+innermost loops. These alternatives may have significant performance impact,
+both positive and negative. A cost model is therefore employed to identify the
+best alternative, including the alternative of avoiding any transformation
+altogether.
+
+The Vectorization Plan is an explicit model for describing vectorization
+candidates. It serves for both optimizing candidates including estimating their
+cost reliably, and for performing their final translation into IR. This
+facilitates dealing with multiple vectorization candidates.
+
+High-level Design
+=================
+
+Vectorization Workflow
+----------------------
+VPlan-based vectorization involves three major steps, taking a "scenario-based
+approach" to vectorization planning:
+
+1. Legal Step: check if a loop can be legally vectorized; encode constraints and
+   artifacts if so.
+2. Plan Step:
+
+   a. Build initial VPlans following the constraints and decisions taken by
+      Legal Step 1, and compute their cost.
+   b. Apply optimizations to the VPlans, possibly forking additional VPlans.
+      Prune sub-optimal VPlans having relatively high cost.
+3. Execute Step: materialize the best VPlan. Note that this is the only step
+   that modifies the IR.
+
+Design Guidelines
+-----------------
+In what follows, the term "input IR" refers to code that is fed into the
+vectorizer whereas the term "output IR" refers to code that is generated by the
+vectorizer. The output IR contains code that has been vectorized or "widened"
+according to a loop Vectorization Factor (VF), and/or loop unroll-and-jammed
+according to an Unroll Factor (UF).
+The design of VPlan follows several high-level guidelines:
+
+1. Analysis-like: building and manipulating VPlans must not modify the input IR.
+   In particular, if the best option is not to vectorize at all, the
+   vectorization process terminates before reaching Step 3, and compilation
+   should proceed as if VPlans had not been built.
+
+2. Align Cost & Execute: each VPlan must support both estimating the cost and
+   generating the output IR code, such that the cost estimation evaluates the
+   to-be-generated code reliably.
+
+3. Support vectorizing additional constructs:
+
+   a. Outer-loop vectorization. In particular, VPlan must be able to model the
+      control-flow of the output IR which may include multiple basic-blocks and
+      nested loops.
+   b. SLP vectorization.
+   c. Combinations of the above, including nested vectorization: vectorizing
+      both an inner loop and an outer-loop at the same time (each with its own
+      VF and UF), mixed vectorization: vectorizing a loop with SLP patterns
+      inside [4]_, (re)vectorizing input IR containing vector code.
+   d. Function vectorization [2]_.
+
+4. Support multiple candidates efficiently. In particular, similar candidates
+   related to a range of possible VF's and UF's must be represented efficiently.
+   Potential versioning needs to be supported efficiently.
+
+5. Support vectorizing idioms, such as interleaved groups of strided loads or
+   stores. This is achieved by modeling a sequence of output instructions using
+   a "Recipe", which is responsible for computing its cost and generating its
+   code.
+
+6. Encapsulate Single-Entry Single-Exit regions (SESE). During vectorization
+   such regions may need to be, for example, predicated and linearized, or
+   replicated VF*UF times to handle scalarized and predicated instructions.
+   Inner loops are also modelled as SESE regions.
+
+7. Support instruction-level analysis and transformation, as part of Planning
+   Step 2.b: During vectorization instructions may need to be traversed, moved,
+   replaced by other instructions or be created. For example, vector idiom
+   detection and formation involves searching for and optimizing instruction
+   patterns.
+
+Definitions
+===========
+The low-level design of VPlan comprises the following classes.
+
+:LoopVectorizationPlanner:
+  A LoopVectorizationPlanner is designed to handle the vectorization of a loop
+  or a loop nest. It can construct, optimize and discard one or more VPlans,
+  each VPlan modelling a distinct way to vectorize the loop or the loop nest.
+  Once the best VPlan is determined, including the best VF and UF, this VPlan
+  drives the generation of output IR.
+
+:VPlan:
+  A model of a vectorized candidate for a given input IR loop or loop nest. This
+  candidate is represented using a Hierarchical CFG. VPlan supports estimating
+  the cost and driving the generation of the output IR code it represents.
+
+:Hierarchical CFG:
+  A control-flow graph whose nodes are basic-blocks or Hierarchical CFG's. The
+  Hierarchical CFG data structure is similar to the Tile Tree [5]_, where
+  cross-Tile edges are lifted to connect Tiles instead of the original
+  basic-blocks as in Sharir [6]_, promoting the Tile encapsulation. The terms
+  Region and Block are used rather than Tile [5]_ to avoid confusion with loop
+  tiling.
+
+:VPBlockBase:
+  The building block of the Hierarchical CFG. A pure-virtual base-class of
+  VPBasicBlock and VPRegionBlock, see below. VPBlockBase models the hierarchical
+  control-flow relations with other VPBlocks. Note that in contrast to the IR
+  BasicBlock, a VPBlockBase models its control-flow successors and predecessors
+  directly, rather than through a Terminator branch or through predecessor
+  branches that "use" the VPBlockBase.
+
+:VPBasicBlock:
+  VPBasicBlock is a subclass of VPBlockBase, and serves as the leaves of the
+  Hierarchical CFG. It represents a sequence of output IR instructions that will
+  appear consecutively in an output IR basic-block. The instructions of this
+  basic-block originate from one or more VPBasicBlocks. VPBasicBlock holds a
+  sequence of zero or more VPRecipes that model the cost and generation of the
+  output IR instructions.
+
+:VPRegionBlock:
+  VPRegionBlock is a subclass of VPBlockBase. It models a collection of
+  VPBasicBlocks and VPRegionBlocks which form a SESE subgraph of the output IR
+  CFG. A VPRegionBlock may indicate that its contents are to be replicated a
+  constant number of times when output IR is generated, effectively representing
+  a loop with constant trip-count that will be completely unrolled. This is used
+  to support scalarized and predicated instructions with a single model for
+  multiple candidate VF's and UF's.
+
+:VPRecipeBase:
+  A pure-virtual base class modeling a sequence of one or more output IR
+  instructions, possibly based on one or more input IR instructions. These
+  input IR instructions are referred to as "Ingredients" of the Recipe. A Recipe
+  may specify how its ingredients are to be transformed to produce the output IR
+  instructions; e.g., cloned once, replicated multiple times or widened
+  according to selected VF.
+
+:VPValue:
+  The base of VPlan's def-use relations class hierarchy. When instantiated, it
+  models a constant or a live-in Value in VPlan. It has users, which are of type
+  VPUser, but no operands.
+
+:VPUser:
+  A VPValue representing a general vertex in the def-use graph of VPlan. It has
+  operands which are of type VPValue. When instantiated, it represents a
+  live-out Instruction that exists outside VPlan. VPUser is similar in some
+  aspects to LLVM's User class.
+
+:VPInstruction:
+  A VPInstruction is both a VPRecipe and a VPUser. It models a single
+  VPlan-level instruction to be generated if the VPlan is executed, including
+  its opcode and possibly additional characteristics. It is the basis for
+  writing instruction-level analyses and optimizations in VPlan as creating,
+  replacing or moving VPInstructions record both def-use and scheduling
+  decisions. VPInstructions also extend LLVM IR's opcodes with idiomatic
+  operations that enrich the Vectorizer's semantics.
+
+:VPTransformState:
+  Stores information used for generating output IR, passed from
+  LoopVectorizationPlanner to its selected VPlan for execution, and used to pass
+  additional information down to VPBlocks and VPRecipes.
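+
+The relationships between the three block classes can be sketched as follows.
+This is a simplified illustration under stated assumptions, not the LLVM
+implementation: the ``sketch`` namespace, the ``isRegion`` query, the member
+names and the use of strings in place of VPRecipes are all invented for the
+example.
+
+.. code-block:: c++
+
+   #include <cstddef>
+   #include <string>
+   #include <utility>
+   #include <vector>
+
+   namespace sketch {
+
+   // Building block of the Hierarchical CFG. Successors and predecessors
+   // are stored on the block itself, not derived from a terminator.
+   class VPBlockBase {
+   public:
+     explicit VPBlockBase(std::string BlockName)
+         : Name(std::move(BlockName)) {}
+     virtual ~VPBlockBase() = default;
+     virtual bool isRegion() const = 0; // Hypothetical query.
+
+     // Record the edge in both directions.
+     void addSuccessor(VPBlockBase *Succ) {
+       Successors.push_back(Succ);
+       Succ->Predecessors.push_back(this);
+     }
+     const std::string &getName() const { return Name; }
+     const std::vector<VPBlockBase *> &getSuccessors() const {
+       return Successors;
+     }
+     const std::vector<VPBlockBase *> &getPredecessors() const {
+       return Predecessors;
+     }
+
+   private:
+     std::string Name;
+     std::vector<VPBlockBase *> Successors;
+     std::vector<VPBlockBase *> Predecessors;
+   };
+
+   // Leaf of the hierarchy: a sequence of zero or more recipes, reduced
+   // here to plain strings.
+   class VPBasicBlock : public VPBlockBase {
+   public:
+     using VPBlockBase::VPBlockBase;
+     bool isRegion() const override { return false; }
+     void appendRecipe(std::string Recipe) {
+       Recipes.push_back(std::move(Recipe));
+     }
+     std::size_t numRecipes() const { return Recipes.size(); }
+
+   private:
+     std::vector<std::string> Recipes;
+   };
+
+   // SESE subgraph identified by its entry and exit blocks. A replicating
+   // region stands for a constant trip-count loop that is fully unrolled.
+   class VPRegionBlock : public VPBlockBase {
+   public:
+     VPRegionBlock(std::string BlockName, VPBlockBase *EntryBlock,
+                   VPBlockBase *ExitBlock, bool Replicate)
+         : VPBlockBase(std::move(BlockName)), Entry(EntryBlock),
+           Exit(ExitBlock), IsReplicator(Replicate) {}
+     bool isRegion() const override { return true; }
+     VPBlockBase *getEntry() const { return Entry; }
+     VPBlockBase *getExit() const { return Exit; }
+     bool isReplicator() const { return IsReplicator; }
+
+   private:
+     VPBlockBase *Entry;
+     VPBlockBase *Exit;
+     bool IsReplicator;
+   };
+
+   } // namespace sketch
+
+Because a region is itself a ``VPBlockBase``, a preheader block can name a
+whole loop region as its successor, which is what makes the CFG hierarchical.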
+
+The Planning Process and VPlan Roadmap
+======================================
+
+Transforming the Loop Vectorizer to use VPlan follows a staged approach. First,
+VPlan is used to record the final vectorization decisions, and to execute them:
+the Hierarchical CFG models the planned control-flow, and Recipes capture
+decisions taken inside basic-blocks. Next, VPlan will be used also as the basis
+for taking these decisions, effectively turning them into a series of
+VPlan-to-VPlan algorithms. Finally, VPlan will support the planning process
+itself including cost-based analyses for making these decisions, to fully
+support compositional and iterative decision making.
+
+Some decisions are local to an instruction in the loop, such as whether to widen
+it into a vector instruction or replicate it, keeping the generated instructions
+in place. Other decisions, however, involve moving instructions, replacing them
+with other instructions, and/or introducing new instructions. For example, a
+cast may sink past a later instruction and be widened to handle first-order
+recurrence; an interleave group of strided gathers or scatters may effectively
+move to one place where they are replaced with shuffles and a common wide vector
+load or store; new instructions may be introduced to compute masks, shuffle the
+elements of vectors, and pack scalar values into vectors or vice-versa.
+
+In order for VPlan to support making instruction-level decisions and analyses,
+it needs to model the relevant instructions along with their def/use relations.
+This too follows a staged approach: first, the new instructions that compute
+masks are modeled as VPInstructions, along with their induced def/use subgraph.
+This effectively models masks in VPlan, facilitating VPlan-based predication.
+Next, the logic embedded within each Recipe for generating its instructions at
+VPlan execution time, will instead take part in the planning process by modeling
+them as VPInstructions. Finally, only logic that applies to instructions as a
+group will remain in Recipes, such as interleave groups and potentially other
+idiom groups having synergistic cost.
+
+Related LLVM components
+-----------------------
+1. SLP Vectorizer: one can compare the VPlan model with LLVM's existing SLP
+   tree, where TSLP [3]_ adds Plan Step 2.b.
+
+2. RegionInfo: one can compare VPlan's H-CFG with the Region Analysis as used by
+   Polly [7]_.
+
+3. Loop Vectorizer: the Vectorization Plan aims to upgrade the infrastructure of
+   the Loop Vectorizer and extend it to handle outer loops [8]_, [9]_.
+
+References
+----------
+.. [1] "Outer-loop vectorization: revisited for short SIMD architectures", Dorit
+    Nuzman and Ayal Zaks, PACT 2008.
+
+.. [2] "Proposal for function vectorization and loop vectorization with function
+    calls", Xinmin Tian, [`cfe-dev
+    <http://lists.llvm.org/pipermail/cfe-dev/2016-March/047732.html>`_],
+    March 2, 2016.
+    See also `review <https://reviews.llvm.org/D22792>`_.
+
+.. [3] "Throttling Automatic Vectorization: When Less is More", Vasileios
+    Porpodas and Tim Jones, PACT 2015 and LLVM Developers' Meeting 2015.
+
+.. [4] "Exploiting mixed SIMD parallelism by reducing data reorganization
+    overhead", Hao Zhou and Jingling Xue, CGO 2016.
+
+.. [5] "Register Allocation via Hierarchical Graph Coloring", David Callahan and
+    Brian Koblenz, PLDI 1991
+
+.. [6] "Structural analysis: A new approach to flow analysis in optimizing
+    compilers", M. Sharir, Journal of Computer Languages, Jan. 1980
+
+.. [7] "Enabling Polyhedral Optimizations in LLVM", Tobias Grosser, Diploma
+    thesis, 2011.
+
+.. [8] "Introducing VPlan to the Loop Vectorizer", Gil Rapaport and Ayal Zaks,
+    European LLVM Developers' Meeting 2017.
+
+.. [9] "Extending LoopVectorizer: OpenMP4.5 SIMD and Outer Loop
+    Auto-Vectorization", Intel Vectorizer Team, LLVM Developers' Meeting 2016.

Added: www-releases/trunk/9.0.0/docs/_sources/ReleaseNotes.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/ReleaseNotes.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/ReleaseNotes.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/ReleaseNotes.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,378 @@
+========================
+LLVM 9.0.0 Release Notes
+========================
+
+.. contents::
+    :local:
+
+Introduction
+============
+
+This document contains the release notes for the LLVM Compiler Infrastructure,
+release 9.0.0.  Here we describe the status of LLVM, including major improvements
+from the previous release, improvements in various subprojects of LLVM, and
+some of the current users of the code.  All LLVM releases may be downloaded
+from the `LLVM releases web site <https://llvm.org/releases/>`_.
+
+For more information about LLVM, including information about the latest
+release, please check out the `main LLVM web site <https://llvm.org/>`_.  If you
+have questions or comments, the `LLVM Developer's Mailing List
+<https://lists.llvm.org/mailman/listinfo/llvm-dev>`_ is a good place to send
+them.
+
+
+Known Issues
+============
+
+These are issues that couldn't be fixed before the release. See the bug reports
+for the latest status.
+
+* `PR40547 <https://llvm.org/pr40547>`_ Clang gets miscompiled by GCC 9.
+
+
+Non-comprehensive list of changes in this release
+=================================================
+
+* Two new extension points, namely ``EP_FullLinkTimeOptimizationEarly`` and
+  ``EP_FullLinkTimeOptimizationLast`` are available for plugins to specialize
+  the legacy pass manager full LTO pipeline.
+
+* ``llvm-objcopy/llvm-strip`` got support for COFF object files/executables,
+  supporting the most common copying/stripping options.
+
+* The CMake parameter ``CLANG_ANALYZER_ENABLE_Z3_SOLVER`` has been replaced by
+  ``LLVM_ENABLE_Z3_SOLVER``.
+
+* The RISCV target is no longer "experimental" (see
+  `Changes to the RISCV Target`_ below for more details).
+
+* The ORCv1 JIT API has been deprecated. Please see
+  `Transitioning from ORCv1 to ORCv2 <ORCv2.html#transitioning-from-orcv1-to-orcv2>`_.
+
+* Support for target-independent hardware loops in IR has been added, with
+  PowerPC and Arm implementations.
+
+
+Noteworthy optimizations
+------------------------
+
+* LLVM will now remove stores to constant memory (since this is a
+  contradiction) under the assumption the code in question must be dead.  This
+  has proven to be problematic for some C/C++ code bases which expect to be
+  able to cast away 'const'.  This is (and has always been) undefined
+  behavior, but up until now had not been actively utilized for optimization
+  purposes in this exact way.  For more information, please see:
+  `bug 42763 <https://bugs.llvm.org/show_bug.cgi?id=42763>`_ and
+  `post commit discussion <http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646945.html>`_.
+
+* The optimizer will now convert calls to ``memcmp`` into calls to ``bcmp`` in
+  some circumstances. Users who are building freestanding code (not depending on
+  the platform's libc) without specifying ``-ffreestanding`` may need to either
+  pass ``-fno-builtin-bcmp``, or provide a ``bcmp`` function.
+
+* LLVM will now pattern match wide scalar values stored by a succession of
+  narrow stores. For example, Clang will compile the following function that
+  writes a 32-bit value in big-endian order in a portable manner:
+
+  .. code-block:: c
+
+      void write32be(unsigned char *dst, uint32_t x) {
+        dst[0] = x >> 24;
+        dst[1] = x >> 16;
+        dst[2] = x >> 8;
+        dst[3] = x >> 0;
+      }
+
+  into the x86_64 code below:
+
+  .. code-block:: asm
+
+   write32be:
+           bswap   esi
+           mov     dword ptr [rdi], esi
+           ret
+
+  (The corresponding read patterns have been matched since LLVM 5.)
+
+* LLVM will now omit range checks for jump tables when lowering switches with
+  unreachable default destination. For example, the switch dispatch in the C++
+  code below
+
+  .. code-block:: c++
+
+     int g(int);
+     enum e { A, B, C, D, E };
+     int f(e x, int y, int z) {
+       switch(x) {
+         case A: return g(y);
+         case B: return g(z);
+         case C: return g(y+z);
+         case D: return g(x-z);
+         case E: return g(x+z);
+       }
+     }
+
+  will result in the following x86_64 machine code when compiled with Clang.
+  This is because falling off the end of a non-void function is undefined
+  behaviour in C++, so the end of the function is treated as unreachable:
+
+  .. code-block:: asm
+
+   _Z1f1eii:
+           mov     eax, edi
+           jmp     qword ptr [8*rax + .LJTI0_0]
+
+
+* LLVM can now sink similar instructions to a common successor block also when
+  the instructions have no uses, such as calls to void functions. This allows
+  code such as
+
+  .. code-block:: c
+
+   void g(int);
+   enum e { A, B, C, D };
+   void f(e x, int y, int z) {
+     switch(x) {
+       case A: g(6); break;
+       case B: g(3); break;
+       case C: g(9); break;
+       case D: g(2); break;
+     }
+   }
+
+  to be optimized to a single call to ``g``, with the argument loaded from a
+  lookup table.
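+
+For the ``memcmp`` to ``bcmp`` conversion noted above, a freestanding project
+that chooses to provide its own ``bcmp`` might start from something like the
+sketch below. The name ``sketch_bcmp`` is hypothetical and is used only so the
+example can be built on a hosted system without clashing with the C library;
+a real freestanding build would export the function as ``bcmp``.
+
+.. code-block:: c++
+
+   #include <cstddef>
+
+   // Hypothetical name; a freestanding build would call this bcmp.
+   extern "C" int sketch_bcmp(const void *lhs, const void *rhs,
+                              std::size_t size) {
+     const unsigned char *a = static_cast<const unsigned char *>(lhs);
+     const unsigned char *b = static_cast<const unsigned char *>(rhs);
+     for (std::size_t i = 0; i < size; ++i)
+       if (a[i] != b[i])
+         return 1; // Only zero versus non-zero matters, not ordering.
+     return 0;
+   }
+
+Unlike ``memcmp``, the result carries no ordering information, which is why
+the optimizer can only make the substitution when the result is compared
+against zero.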
+
+
+Changes to the LLVM IR
+----------------------
+
+* Added ``immarg`` parameter attribute. This indicates an intrinsic
+  parameter is required to be a simple constant. This annotation must
+  be accurate to avoid possible miscompiles.
+
+* The 2-field form of global variables ``@llvm.global_ctors`` and
+  ``@llvm.global_dtors`` has been deleted. The third field of their element
+  type is now mandatory. Specify ``i8* null`` to migrate from the obsoleted
+  2-field form.
+
+* The ``byval`` attribute can now take a type parameter:
+  ``byval(<ty>)``. If present it must be identical to the argument's
+  pointee type. In the next release we intend to make this parameter
+  mandatory in preparation for opaque pointer types.
+
+* ``atomicrmw xchg`` now allows floating point types
+
+* ``atomicrmw`` now supports ``fadd`` and ``fsub``
+
+Changes to building LLVM
+------------------------
+
+* Building LLVM with Visual Studio now requires version 2017 or later.
+
+
+Changes to the AArch64 Backend
+------------------------------
+
+* Assembly-level support was added for: Scalable Vector Extension 2 (SVE2) and
+  Memory Tagging Extensions (MTE).
+
+Changes to the ARM Backend
+--------------------------
+
+* Assembly-level support was added for the Armv8.1-M architecture, including
+  the M-Profile Vector Extension (MVE).
+
+* A pipeline model was added for Cortex-M4. This pipeline model is also used to
+  tune for cores where this gives a benefit too: Cortex-M3, SC300, Cortex-M33
+  and Cortex-M35P.
+
+* Code generation support for M-profile low-overhead loops.
+
+
+Changes to the MIPS Target
+--------------------------
+
+* Support for ``.cplocal`` assembler directive.
+
+* Support for ``sge``, ``sgeu``, ``sgt``, ``sgtu`` pseudo instructions.
+
+* Support for ``o`` inline asm constraint.
+
+* Improved support of GlobalISel instruction selection framework.
+  This feature is still in experimental state for MIPS targets though.
+
+* Various code-gen improvements related to improved and fixed instruction
+  selection, encoding, and floating-point register allocation.
+
+* Complete P5600 scheduling model.
+
+
+Changes to the PowerPC Target
+-----------------------------
+
+* Improved handling of TOC pointer spills for indirect calls
+
+* Improve precision of square root reciprocal estimate
+
+* Enabled MachinePipeliner support for P9 with ``-ppc-enable-pipeliner``.
+
+* MMX/SSE/SSE2 intrinsics headers have been ported to PowerPC using Altivec.
+
+* Machine verification failures were cleaned up; ``EXPENSIVE_CHECKS`` builds
+  now run machine verification by default.
+
+* PowerPC scheduling enhancements, with customized PPC specific scheduler
+  strategy.
+
+* Innermost loops are now always aligned to 32 bytes.
+
+* Enhancements of hardware loops interaction with LSR.
+
+* New builtins were added, e.g. ``__builtin_setrnd``.
+
+* Various codegen improvements for both scalar and vector code
+
+* Various new exploitations and bug fixes, e.g. exploited P9 ``maddld``.
+
+
+Changes to the SystemZ Target
+-----------------------------
+
+* Support for the arch13 architecture has been added.  When using the
+  ``-march=arch13`` option, the compiler will generate code making use of
+  new instructions introduced with the vector enhancement facility 2
+  and the miscellaneous instruction extension facility 2.
+  The ``-mtune=arch13`` option enables arch13 specific instruction
+  scheduling and tuning without making use of new instructions.
+
+* Builtins for the new vector instructions have been added and can be
+  enabled using the ``-mzvector`` option.  Support for these builtins
+  is indicated by the compiler predefining the ``__VEC__`` macro to
+  the value ``10303``.
+
+* The compiler now supports and automatically generates alignment hints
+  on vector load and store instructions.
+
+* Various code-gen improvements, in particular related to improved
+  instruction selection and register allocation.
+
+Changes to the X86 Target
+-------------------------
+
+* Fixed a bug in generating DWARF unwind information for 32 bit MinGW
+
+Changes to the AMDGPU Target
+----------------------------
+
+* Function call support is now enabled by default
+
+* Improved support for 96-bit loads and stores
+
+* DPP combiner pass is now enabled by default
+
+* Support for gfx10
+
+
+Changes to the RISCV Target
+---------------------------
+
+The RISCV target is no longer "experimental"! It's now built by default,
+rather than needing to be enabled with ``LLVM_EXPERIMENTAL_TARGETS_TO_BUILD``.
+
+The backend has full codegen support for the RV32I and RV64I base RISC-V
+instruction set variants, with the MAFDC standard extensions. We support the
+hard and soft-float ABIs for these targets. Testing has been performed with
+both Linux and bare-metal targets, including the compilation of a large corpus
+of Linux applications (through buildroot).
+
+
+Changes to LLDB
+===============
+
+* Backtraces are now color highlighted in the terminal.
+
+* DWARF4 (debug_types) and DWARF5 (debug_info) type units are now supported.
+
+* This release will be the last where ``lldb-mi`` is shipped as part of LLDB.
+  The tool will still be available in a `downstream repository on GitHub
+  <https://github.com/lldb-tools/lldb-mi>`_.
+
+External Open Source Projects Using LLVM 9
+==========================================
+
+Mull - Mutation Testing tool for C and C++
+------------------------------------------
+
+`Mull <https://github.com/mull-project/mull>`_ is an LLVM-based tool for
+mutation testing with a strong focus on C and C++ languages.
+
+Portable Computing Language (pocl)
+----------------------------------
+
+In addition to producing an easily portable open source OpenCL
+implementation, another major goal of `pocl <http://portablecl.org/>`_
+is improving performance portability of OpenCL programs with
+compiler optimizations, reducing the need for target-dependent manual
+optimizations. An important part of pocl is a set of LLVM passes used to
+statically parallelize multiple work-items with the kernel compiler, even in
+the presence of work-group barriers. This enables static parallelization of
+the fine-grained static concurrency in the work groups in multiple ways.
+
+TTA-based Co-design Environment (TCE)
+-------------------------------------
+
+`TCE <http://openasip.org/>`_ is an open source toolset for designing customized
+processors based on the Transport Triggered Architecture (TTA).
+The toolset provides a complete co-design flow from C/C++
+programs down to synthesizable VHDL/Verilog and parallel program binaries.
+Processor customization points include register files, function units,
+supported operations, and the interconnection network.
+
+TCE uses Clang and LLVM for C/C++/OpenCL C language support, target independent
+optimizations and also for parts of code generation. It generates new
+LLVM-based code generators "on the fly" for the designed TTA processors and
+loads them in to the compiler backend as runtime libraries to avoid
+per-target recompilation of larger parts of the compiler chain.
+
+
+Zig Programming Language
+------------------------
+
+`Zig <https://ziglang.org>`_  is a system programming language intended to be
+an alternative to C. It provides high level features such as generics, compile
+time function execution, and partial evaluation, while exposing low level LLVM
+IR features such as aliases and intrinsics. Zig uses Clang to provide automatic
+import of .h symbols, including inline functions and simple macros. Zig uses
+LLD combined with lazily building compiler-rt to provide out-of-the-box
+cross-compiling for all supported targets.
+
+
+LDC - the LLVM-based D compiler
+-------------------------------
+
+`D <http://dlang.org>`_ is a language with C-like syntax and static typing. It
+pragmatically combines efficiency, control, and modeling power, with safety and
+programmer productivity. D supports powerful concepts like Compile-Time Function
+Execution (CTFE) and Template Meta-Programming, provides an innovative approach
+to concurrency and offers many classical paradigms.
+
+`LDC <http://wiki.dlang.org/LDC>`_ uses the frontend from the reference compiler
+combined with LLVM as backend to produce efficient native code. LDC targets
+x86/x86_64 systems like Linux, OS X, FreeBSD and Windows and also Linux on ARM
+and PowerPC (32/64 bit). Ports to other architectures are underway.
+
+
+Additional Information
+======================
+
+A wide variety of additional information is available on the `LLVM web page
+<https://llvm.org/>`_, in particular in the `documentation
+<https://llvm.org/docs/>`_ section.  The web page also contains versions of the
+API documentation which is up-to-date with the Subversion version of the source
+code.  You can access versions of these documents specific to this release by
+going into the ``llvm/docs/`` directory in the LLVM tree.
+
+If you have any questions or comments about LLVM, please feel free to contact
+us via the `mailing lists <https://llvm.org/docs/#mailing-lists>`_.

Added: www-releases/trunk/9.0.0/docs/_sources/ReleaseProcess.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/ReleaseProcess.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/ReleaseProcess.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/ReleaseProcess.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,231 @@
+=============================
+How To Validate a New Release
+=============================
+
+.. contents::
+   :local:
+   :depth: 1
+
+Introduction
+============
+
+This document contains information about testing the release candidates that
+will ultimately be the next LLVM release. For more information on how to
+manage the actual release, please refer to :doc:`HowToReleaseLLVM`.
+
+Overview of the Release Process
+-------------------------------
+
+Once the release process starts, the Release Manager will ask for volunteers,
+and it'll be the role of each volunteer to:
+
+* Test and benchmark the previous release
+
+* Test and benchmark each release candidate, comparing to the previous release
+  and candidates
+
+* Identify, reduce and report every regression found during tests and benchmarks
+
+* Make sure the critical bugs get fixed and merged to the next release candidate
+
+Not all bugs or regressions are show-stoppers and it's a bit of a grey area what
+should be fixed before the next candidate and what can wait until the next
+release.
+
+It'll depend on:
+
+* The severity of the bug, how many people it affects and whether it's a
+  regression or a known bug. Known bugs are "unsupported features", and some
+  features can be disabled if they were implemented recently.
+
+* The stage in the release. Fixing less critical bugs is worth considering
+  between RC1 and RC2, but much less so near the end of the release process.
+
+* Whether it's a correctness or a performance regression. Performance
+  regressions tend to be taken more lightly than correctness regressions.
+
+.. _scripts:
+
+Scripts
+=======
+
+The scripts are in the ``utils/release`` directory.
+
+test-release.sh
+---------------
+
+This script will check-out, configure and compile LLVM+Clang (+ most add-ons,
+like ``compiler-rt``, ``libcxx``, ``libomp`` and ``clang-extra-tools``) in
+three stages, and will test the final stage.
+It installs the final binaries in the ``Phase3/Release(+Asserts)``
+directory, and that's the one you should use for the test-suite and other
+external tests.
+
+To run the script on a specific release candidate run::
+
+   ./test-release.sh \
+        -release 3.3 \
+        -rc 1 \
+        -no-64bit \
+        -test-asserts \
+        -no-compare-files
+
+Each system will require different options. For instance, x86_64 will
+obviously not need ``-no-64bit`` while 32-bit systems will, or the script will
+fail.
+
+The important flags to get right are:
+
+* On the pre-release, you should change ``-rc 1`` to ``-final``. On RC2,
+  change it to ``-rc 2`` and so on.
+
+* On non-release testing, you can use ``-final`` in conjunction with
+  ``-no-checkout``, but you'll have to create the ``final`` directory by hand
+  and link the correct source dir to ``final/llvm.src``.
+
+* For release candidates, you need ``-test-asserts``, or it won't create a
+  "Release+Asserts" directory, which is needed for release testing and
+  benchmarking. This will take twice as long.
+
+* On the final candidate you just need Release builds, and that's the binary
+  directory you'll have to pack.
+
+This script builds three phases of Clang+LLVM twice each (Release and
+Release+Asserts), so use screen or nohup to avoid headaches, since it'll take
+a long time.
+
+Use the ``--help`` option to see all the options and choose them according to
+your needs.
+
+
+findRegressions-nightly.py
+--------------------------
+
+TODO
+
+.. _test-suite:
+
+Test Suite
+==========
+
+.. contents::
+   :local:
+
+Follow the `LNT Quick Start Guide
+<http://llvm.org/docs/lnt/quickstart.html>`__ to set up the test-suite.
+
+The binary location you'll have to use for testing is inside the
+``rcN/Phase3/Release+Asserts/llvmCore-REL-RC.install``.
+Link that directory to an easier location and run the test-suite.
+
+An example of the run command line, assuming you created a link from the correct
+install directory to ``~/devel/llvm/install``::
+
+   ./sandbox/bin/python sandbox/bin/lnt runtest \
+       nt \
+       -j4 \
+       --sandbox sandbox \
+       --test-suite ~/devel/llvm/test/test-suite \
+       --cc ~/devel/llvm/install/bin/clang \
+       --cxx ~/devel/llvm/install/bin/clang++
+
+It should have no new regressions, compared to the previous release or release
+candidate. You don't need to fix all the bugs in the test-suite, since they're
+not necessarily meant to pass on all architectures all the time. This is
+due to the nature of the result checking, which relies on direct comparison,
+and most of the time, the failures are related to bad output checking, rather
+than bad code generation.
+
+If the errors are in LLVM itself, please report every regression found as a
+blocker, and all other bugs as important but not necessarily blocking the
+release. They can be marked as "known failures" to be fixed at a future date.
+
+.. _pre-release-process:
+
+Pre-Release Process
+===================
+
+.. contents::
+   :local:
+
+When the release process is announced on the mailing list, you should prepare
+for the testing, by applying the same testing you'll do on the release
+candidates, on the previous release.
+
+You should:
+
+* Download the previous release sources from
+  http://llvm.org/releases/download.html.
+
+* Run the test-release.sh script in ``final`` mode (change ``-rc 1`` to
+  ``-final``).
+
+* Once all three stages are done, it'll test the final stage.
+
+* Using the ``Phase3/Release+Asserts/llvmCore-MAJ.MIN-final.install`` base,
+  run the test-suite.
+
+If the final phase's ``make check-all`` failed, it's a good idea to also test
+the intermediate stages by going into the obj directory and running
+``make check-all`` to find out whether at least one stage passes (this helps
+when reducing the error for bug report purposes).
+
+.. _release-process:
+
+Release Process
+===============
+
+.. contents::
+   :local:
+
+When the Release Manager sends you the release candidate, download all sources,
+unzip them into the same directory (there will be sym-links from the appropriate
+places to them), and run the release test as above.
+
+You should:
+
+* Download the current candidate sources from where the release manager points
+  you (ex. http://llvm.org/pre-releases/3.3/rc1/).
+
+* Repeat the steps above with ``-rc 1``, ``-rc 2``, etc. and run the
+  test-suite the same way.
+
+* Compare the results, report all errors on Bugzilla and publish the binary blob
+  where the release manager can grab it.
+
+Once the release manager announces that the latest candidate is the good one,
+you have to pack the ``Release`` (no Asserts) install directory in ``Phase3``
+and that will be the official binary.
+
+* Rename (or link) ``clang+llvm-REL-ARCH-ENV`` to the .install directory
+
+* Tar that into the same name with ``.tar.gz`` extension from outside the
+  directory
+
+* Make it available for the release manager to download
+
+.. _bug-reporting:
+
+Bug Reporting Process
+=====================
+
+.. contents::
+   :local:
+
+If you found regressions or failures when comparing a release candidate with the
+previous release, follow the rules below:
+
+* Critical bugs on compilation should be fixed as soon as possible, possibly
+  before releasing the binary blobs.
+
+* Check-all tests should be fixed before the next release candidate, but can
+  wait until the test-suite run is finished.
+
+* Bugs in the test suite or unimportant check-all tests can be fixed in between
+  release candidates.
+
+* New features or recent big changes, when close to the release, should be
+  done in a way that makes them easy to disable. If they misbehave, prefer
+  disabling them to releasing an unstable (but untested) binary package.

Added: www-releases/trunk/9.0.0/docs/_sources/Remarks.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Remarks.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Remarks.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Remarks.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,306 @@
+=======
+Remarks
+=======
+
+.. contents::
+   :local:
+
+Introduction to the LLVM remark diagnostics
+===========================================
+
+LLVM is able to emit diagnostics from passes describing whether an optimization
+has been performed or missed for a particular reason, which should give more
+insight to users about what the compiler did during the compilation pipeline.
+
+There are three main remark types:
+
+``Passed``
+
+    Remarks that describe a successful optimization performed by the compiler.
+
+    :Example:
+
+    ::
+
+        foo inlined into bar with (cost=always): always inline attribute
+
+``Missed``
+
+    Remarks that describe an optimization that the compiler attempted but
+    could not perform.
+
+    :Example:
+
+    ::
+
+        foo not inlined into bar because it should never be inlined
+        (cost=never): noinline function attribute
+
+``Analysis``
+
+    Remarks that describe the result of an analysis, that can bring more
+    information to the user regarding the generated code.
+
+    :Example:
+
+    ::
+
+        16 stack bytes in function
+
+    ::
+
+        10 instructions in function
+
+Enabling optimization remarks
+=============================
+
+There are two modes that are supported for enabling optimization remarks in
+LLVM: through remark diagnostics, or through serialized remarks.
+
+Remark diagnostics
+------------------
+
+Optimization remarks can be emitted as diagnostics. These diagnostics will be
+propagated to front-ends if desired, or emitted by tools like :doc:`llc
+<CommandGuide/llc>` or :doc:`opt <CommandGuide/opt>`.
+
+.. option:: -pass-remarks=<regex>
+
+  Enables optimization remarks from passes whose names match the given (POSIX)
+  regular expression.
+
+.. option:: -pass-remarks-missed=<regex>
+
+  Enables missed optimization remarks from passes whose names match the given
+  (POSIX) regular expression.
+
+.. option:: -pass-remarks-analysis=<regex>
+
+  Enables optimization analysis remarks from passes whose names match the given
+  (POSIX) regular expression.
+
+Serialized remarks
+------------------
+
+While diagnostics are useful during development, it is often more useful to
+refer to optimization remarks post-compilation, typically during performance
+analysis.
+
+For that, LLVM can serialize the remarks produced for each compilation unit to
+a file that can be consumed later.
+
+By default, the format of the serialized remarks is :ref:`YAML
+<yamlremarks>`, and it can be accompanied by a :ref:`section <remarkssection>`
+in the object files to easily retrieve it.
+
+:doc:`llc <CommandGuide/llc>` and :doc:`opt <CommandGuide/opt>` support the
+following options:
+
+
+``Basic options``
+
+    .. option:: -pass-remarks-output=<filename>
+
+      Enables the serialization of remarks to a file specified in <filename>.
+
+      By default, the output is serialized to :ref:`YAML <yamlremarks>`.
+
+    .. option:: -pass-remarks-format=<format>
+
+      Specifies the output format of the serialized remarks.
+
+      Supported formats:
+
+      * :ref:`yaml <yamlremarks>` (default)
+
+``Content configuration``
+
+    .. option:: -pass-remarks-filter=<regex>
+
+      Only remarks from passes whose names match the given (POSIX) regular
+      expression will be serialized to the final output.
+
+    .. option:: -pass-remarks-with-hotness
+
+      With PGO, include profile count in optimization remarks.
+
+    .. option:: -pass-remarks-hotness-threshold
+
+      The minimum profile count required for an optimization remark to be
+      emitted.
+
+Other tools that support remarks:
+
+:program:`llvm-lto`
+
+    .. option:: -lto-pass-remarks-output=<filename>
+    .. option:: -lto-pass-remarks-filter=<regex>
+    .. option:: -lto-pass-remarks-format=<format>
+    .. option:: -lto-pass-remarks-with-hotness
+    .. option:: -lto-pass-remarks-hotness-threshold
+
+:program:`gold-plugin` and :program:`lld`
+
+    .. option:: -opt-remarks-filename=<filename>
+    .. option:: -opt-remarks-filter=<regex>
+    .. option:: -opt-remarks-format=<format>
+    .. option:: -opt-remarks-with-hotness
+
+.. _yamlremarks:
+
+YAML remarks
+============
+
+A typical remark serialized to YAML looks like this:
+
+.. code-block:: yaml
+
+    --- !<TYPE>
+    Pass: <pass>
+    Name: <name>
+    DebugLoc: { File: <file>, Line: <line>, Column: <column> }
+    Function: <function>
+    Hotness: <hotness>
+    Args:
+      - <key>: <value>
+        DebugLoc: { File: <arg-file>, Line: <arg-line>, Column: <arg-column> }
+
+The following entries are mandatory:
+
+* ``<TYPE>``: can be ``Passed``, ``Missed``, ``Analysis``,
+  ``AnalysisFPCommute``, ``AnalysisAliasing``, ``Failure``.
+* ``<pass>``: the name of the pass that emitted this remark.
+* ``<name>``: the name of the remark coming from ``<pass>``.
+* ``<function>``: the mangled name of the function.
+
+If a ``DebugLoc`` entry is specified, the following fields are required:
+
+* ``<file>``
+* ``<line>``
+* ``<column>``
+
+If an ``arg`` entry is specified, the following fields are required:
+
+* ``<key>``
+* ``<value>``
+
+If a ``DebugLoc`` entry is specified within an ``arg`` entry, the following
+fields are required:
+
+* ``<arg-file>``
+* ``<arg-line>``
+* ``<arg-column>``
+
+opt-viewer
+==========
+
+The ``opt-viewer`` directory contains a collection of tools that visualize and
+summarize serialized remarks.
+
+.. _optviewerpy:
+
+opt-viewer.py
+-------------
+
+Outputs an HTML page that gives visual feedback on compiler interactions with
+your program.
+
+    :Examples:
+
+    ::
+
+        $ opt-viewer.py my_yaml_file.opt.yaml
+
+    ::
+
+        $ opt-viewer.py my_build_dir/
+
+
+opt-stats.py
+------------
+
+Output statistics about the optimization remarks in the input set.
+
+    :Example:
+
+    ::
+
+        $ opt-stats.py my_yaml_file.opt.yaml
+
+        Total number of remarks           3
+
+
+        Top 10 remarks by pass:
+          inline                         33%
+          asm-printer                    33%
+          prologepilog                   33%
+
+        Top 10 remarks:
+          asm-printer/InstructionCount   33%
+          inline/NoDefinition            33%
+          prologepilog/StackSize         33%
+
+opt-diff.py
+-----------
+
+Produce a new YAML file which contains all of the changes in optimizations
+between two YAML files.
+
+Typically, this tool should be used to do diffs between:
+
+* new compiler + fixed source vs old compiler + fixed source
+* fixed compiler + new source vs fixed compiler + old source
+
+This diff file can be displayed using :ref:`opt-viewer.py <optviewerpy>`.
+
+    :Example:
+
+    ::
+
+        $ opt-diff.py my_opt_yaml1.opt.yaml my_opt_yaml2.opt.yaml -o my_opt_diff.opt.yaml
+        $ opt-viewer.py my_opt_diff.opt.yaml
+
+.. _remarkssection:
+
+Emitting remark diagnostics in the object file
+==============================================
+
+A section containing metadata on remark diagnostics will be emitted when
+``-remarks-section`` is passed. The section contains:
+
+* a magic number: "REMARKS\\0"
+* the version number: a little-endian uint64_t
+* the total size of the string table (the size itself excluded):
+  little-endian uint64_t
+* a list of null-terminated strings
+* the absolute file path to the serialized remark diagnostics: a
+  null-terminated string.
+
+The section is named:
+
+* ``__LLVM,__remarks`` (MachO)
+* ``.remarks`` (ELF)
+
+C API
+=====
+
+LLVM provides a library that can be used to parse remarks through a shared
+library named ``libRemarks``.
+
+The typical usage through the C API is like the following:
+
+.. code-block:: c
+
+    LLVMRemarkParserRef Parser = LLVMRemarkParserCreateYAML(Buf, Size);
+    LLVMRemarkEntryRef Remark = NULL;
+    while ((Remark = LLVMRemarkParserGetNext(Parser))) {
+       // use Remark
+       LLVMRemarkEntryDispose(Remark); // Release memory.
+    }
+    bool HasError = LLVMRemarkParserHasError(Parser);
+    LLVMRemarkParserDispose(Parser);
+
+.. FIXME: add documentation for llvm-opt-report.
+.. FIXME: add documentation for Passes supporting optimization remarks
+.. FIXME: add documentation for IR Passes
+.. FIXME: add documentation for CodeGen Passes

Added: www-releases/trunk/9.0.0/docs/_sources/ReportingGuide.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/ReportingGuide.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/ReportingGuide.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/ReportingGuide.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,143 @@
+===============
+Reporting Guide
+===============
+
+.. note::
+
+   This document is currently a **DRAFT** document while it is being discussed
+   by the community.
+
+If you believe someone is violating the :doc:`code of conduct <CodeOfConduct>`
+you can always report it to the LLVM Foundation Code of Conduct Advisory
+Committee by emailing conduct at llvm.org. **All reports will be kept
+confidential.** This isn't a public list and only `members`_ of the advisory
+committee will receive the report.
+
+If you believe anyone is in **physical danger**, please notify appropriate law
+enforcement first. If you are unsure what law enforcement agency is
+appropriate, please include this in your report and we will attempt to notify
+them.
+
+If the violation occurs at an event such as a Developer Meeting and requires
+immediate attention, you can also reach out to any of the event organizers or
+staff. Event organizers and staff will be prepared to handle the incident and
+able to help. If you cannot find one of the organizers, the venue staff can
+locate one for you. We will also post detailed contact information for specific
+events as part of each event's information. In person reports will still be
+kept confidential exactly as above, but also feel free to (anonymously if
+needed) email conduct at llvm.org.
+
+.. note::
+   The LLVM community has long handled inappropriate behavior on its own, using
+   both private communication and public responses. Nothing in this document is
+   intended to discourage this self enforcement of community norms. Instead,
+   the mechanisms described here are intended to supplement any self
+   enforcement within the community. They provide avenues for handling severe
+   cases or cases where the reporting party does not wish to respond directly
+   for any reason.
+
+Filing a report
+===============
+
+Reports can be as formal or informal as needed for the situation at hand. If
+possible, please include as much information as you can. If you feel
+comfortable, please consider including:
+
+* Your contact info (so we can get in touch with you if we need to follow up).
+* Names (real, nicknames, or pseudonyms) of any individuals involved. If there
+  were other witnesses besides you, please try to include them as well.
+* When and where the incident occurred. Please be as specific as possible.
+* Your account of what occurred. If there is a publicly available record (e.g.
+  a mailing list archive or a public IRC logger) please include a link.
+* Any extra context you believe existed for the incident.
+* If you believe this incident is ongoing.
+* Any other information you believe we should have.
+
+What happens after you file a report?
+=====================================
+
+You will receive an email from the advisory committee acknowledging receipt
+within 24 hours (and we will aim to respond much quicker than that).
+
+The advisory committee will immediately meet to review the incident and try to
+determine:
+
+* What happened and who was involved.
+* Whether this event constitutes a code of conduct violation.
+* Whether this is an ongoing situation, or if there is a threat to anyone's
+  physical safety.
+
+If this is determined to be an ongoing incident or a threat to physical safety,
+the working group's immediate priority will be to protect everyone involved.
+This means we may delay an "official" response until we believe that the
+situation has ended and that everyone is physically safe.
+
+The working group will try to contact other parties involved or witnessing the
+event to gain clarity on what happened and understand any different
+perspectives.
+
+Once the advisory committee has a complete account of the events they will make
+a decision as to how to respond. Responses may include:
+
+* Nothing, if we determine no violation occurred or it has already been
+  appropriately resolved.
+* Providing either moderation or mediation to ongoing interactions (where
+  appropriate, safe, and desired by both parties).
+* A private reprimand from the working group to the individuals involved.
+* An imposed vacation (i.e. asking someone to "take a week off" from a mailing
+  list or IRC).
+* A public reprimand.
+* A permanent or temporary ban from some or all LLVM spaces (mailing lists,
+  IRC, etc.)
+* Involvement of relevant law enforcement if appropriate.
+
+If the situation is not resolved within one week, we'll respond within one week
+to the original reporter with an update and explanation.
+
+Once we've determined our response, we will separately contact the original
+reporter and other individuals to let them know what actions (if any) we'll be
+taking. We will take into account feedback from the individuals involved on the
+appropriateness of our response, but we don't guarantee we'll act on it.
+
+After any incident, the advisory committee will make a report on the situation
+to the LLVM Foundation board. The board may choose to make a public statement
+about the incident. If that's the case, the identities of anyone involved will
+remain confidential unless instructed by those individuals otherwise.
+
+Appealing
+=========
+
+Only permanent resolutions (such as bans) or requests for public actions may be
+appealed. To appeal a decision of the working group, contact the LLVM
+Foundation board at board at llvm.org with your appeal and the board will review
+the case.
+
+In general, it is **not** appropriate to appeal a particular decision on
+a public mailing list. Doing so would involve disclosure of information which
+would be confidential. Disclosing this kind of information publicly may be
+considered a separate and (potentially) more serious violation of the Code of
+Conduct. This is not meant to limit discussion of the Code of Conduct, the
+advisory board itself, or the appropriateness of responses in general, but
+**please** refrain from mentioning specific facts about cases without the
+explicit permission of all parties involved.
+
+.. _members:
+
+Members of the Code of Conduct Advisory Committee
+=================================================
+
+The members serving on the advisory committee are listed here with contact
+information in case you are more comfortable talking directly to a specific
+member of the committee.
+
+.. note::
+
+   FIXME: When we form the initial advisory committee, the members names and private contact info need to be added here.
+
+
+
+(This text is based on the `Django Project`_ Code of Conduct, which is in turn
+based on wording from the `Speak Up! project`_.)
+
+.. _Django Project: https://www.djangoproject.com/conduct/
+.. _Speak Up! project: http://speakup.io/coc.html

Added: www-releases/trunk/9.0.0/docs/_sources/ScudoHardenedAllocator.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/ScudoHardenedAllocator.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/ScudoHardenedAllocator.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/ScudoHardenedAllocator.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,253 @@
+========================
+Scudo Hardened Allocator
+========================
+
+.. contents::
+   :local:
+   :depth: 1
+
+Introduction
+============
+
+The Scudo Hardened Allocator is a user-mode allocator based on LLVM Sanitizer's
+CombinedAllocator, which aims at providing additional mitigations against heap
+based vulnerabilities, while maintaining good performance.
+
+Currently, the allocator supports (was tested on) the following architectures:
+
+- i386 (& i686) (32-bit);
+- x86_64 (64-bit);
+- armhf (32-bit);
+- AArch64 (64-bit);
+- MIPS (32-bit & 64-bit).
+
+The name "Scudo" has been retained from the initial implementation (Escudo
+meaning Shield in Spanish and Portuguese).
+
+Design
+======
+
+Allocator
+---------
+Scudo can be considered a Frontend to the Sanitizers' common allocator (referred
+to below as the Backend). It is split between a Primary allocator, fast and
+efficient, that services smaller allocation sizes, and a Secondary allocator
+that services larger allocation sizes and is backed by the operating system
+memory mapping primitives.
+
+Scudo was designed with security in mind, but aims at striking a good balance
+between security and performance. It is highly tunable and configurable.
+
+Chunk Header
+------------
+Every chunk of heap memory will be preceded by a chunk header. This has two
+purposes, the first one being to store various information about the chunk,
+the second one being to detect potential heap overflows. In order to achieve
+this, the header will be checksummed, involving the pointer to the chunk itself
+and a global secret. Any corruption of the header will be detected when said
+header is accessed, and the process terminated.
+
+The following information is stored in the header:
+
+- the 16-bit checksum;
+- the class ID for that chunk, which is the "bucket" where the chunk resides
+  for Primary backed allocations, or 0 for Secondary backed allocations;
+- the size (Primary) or unused bytes amount (Secondary) for that chunk, which is
+  necessary for computing the size of the chunk;
+- the state of the chunk (available, allocated or quarantined);
+- the allocation type (malloc, new, new[] or memalign), to detect potential
+  mismatches in the allocation APIs used;
+- the offset of the chunk, which is the distance in bytes from the beginning of
+  the returned chunk to the beginning of the Backend allocation.
+
+This header fits within 8 bytes, on all platforms supported.
+
+The checksum is computed as a CRC32 (accelerated in hardware where available)
+of the global secret, the chunk pointer itself, and the 8 bytes of header with
+the checksum field zeroed out. It is not intended to be cryptographically
+strong.
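+
+The checksum computation described above can be sketched as follows. This is
+a minimal illustration, not Scudo's code: the bitwise CRC32 routine, its
+polynomial, the placement of the checksum in the low 16 bits of the packed
+header, the 16-bit folding step, and all function names are assumptions made
+for the example.
+

```cpp
#include <cstdint>

// Bitwise software CRC32 (reflected polynomial 0xEDB88320) over one 64-bit
// word. The polynomial is an assumption; a hardware-assisted path would
// replace this loop on platforms that provide one.
static std::uint32_t crc32Word(std::uint32_t Crc, std::uint64_t Data) {
  for (int Byte = 0; Byte < 8; ++Byte) {
    Crc ^= static_cast<std::uint8_t>(Data >> (8 * Byte));
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ (0xEDB88320u & (0u - (Crc & 1u)));
  }
  return Crc;
}

// Assumption: the checksum lives in the low 16 bits of the packed header.
constexpr std::uint64_t ChecksumMask = 0xFFFFu;

// CRC32 of the secret, the chunk pointer, and the header with its checksum
// field zeroed, folded down to 16 bits (the folding is an assumption).
std::uint16_t computeChecksum(std::uint32_t Secret, std::uintptr_t ChunkPtr,
                              std::uint64_t PackedHeader) {
  std::uint32_t Crc = crc32Word(Secret, static_cast<std::uint64_t>(ChunkPtr));
  Crc = crc32Word(Crc, PackedHeader & ~ChecksumMask);
  return static_cast<std::uint16_t>(Crc ^ (Crc >> 16));
}

// Stores the checksum into the header before it is written to memory.
std::uint64_t sealHeader(std::uint32_t Secret, std::uintptr_t ChunkPtr,
                         std::uint64_t PackedHeader) {
  return (PackedHeader & ~ChecksumMask) |
         computeChecksum(Secret, ChunkPtr, PackedHeader);
}

// Recomputes the checksum when the header is read back; a mismatch is what
// would be reported as a corrupted chunk header.
bool headerIsValid(std::uint32_t Secret, std::uintptr_t ChunkPtr,
                   std::uint64_t PackedHeader) {
  return static_cast<std::uint16_t>(PackedHeader & ChecksumMask) ==
         computeChecksum(Secret, ChunkPtr, PackedHeader);
}
```

+
+Because the chunk pointer and the secret both feed the checksum, a header
+copied to another address or forged without the secret is unlikely to
+validate.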
+
+The header is atomically loaded and stored to prevent races. This is important
+as two consecutive chunks could belong to different threads. We also want to
+avoid any type of double fetches of information located in the header, and use
+local copies of the header for this purpose.
+
+Delayed Freelist
+-----------------
+A delayed freelist allows us to not return a chunk directly to the Backend, but
+to keep it aside for a while. Once a criterion is met, the delayed freelist is
+emptied, and the quarantined chunks are returned to the Backend. This helps
+mitigate use-after-free vulnerabilities by reducing the determinism of the
+allocation and deallocation patterns.
+
+This feature is built on the Sanitizers' Quarantine, and the amount of memory
+that it can hold is configurable by the user (see the Options section below).
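+
+As a rough illustration of the idea, the sketch below holds freed chunks in a
+queue and only hands them back once a byte budget is exceeded. The class and
+field names are hypothetical, and the criterion (total held bytes above a
+limit, oldest chunks released first) is a simplification chosen for the
+example rather than the behavior of the Sanitizers' Quarantine.
+

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical record of a freed chunk; names are illustrative only.
struct FreedChunk {
  void *Ptr;
  std::size_t Size;
};

class DelayedFreelist {
public:
  explicit DelayedFreelist(std::size_t MaxBytes) : MaxBytes(MaxBytes) {}

  // Holds the chunk back and returns the chunks that should now be handed
  // to the Backend (possibly none).
  std::vector<FreedChunk> put(FreedChunk Chunk) {
    std::vector<FreedChunk> Released;
    Held.push_back(Chunk);
    HeldBytes += Chunk.Size;
    // Criterion assumed for this sketch: held bytes exceed the budget.
    while (HeldBytes > MaxBytes) {
      Released.push_back(Held.front());
      HeldBytes -= Held.front().Size;
      Held.pop_front();
    }
    return Released;
  }

  std::size_t heldBytes() const { return HeldBytes; }

private:
  std::size_t MaxBytes;
  std::size_t HeldBytes = 0;
  std::deque<FreedChunk> Held;
};
```

+
+The delay between a ``free`` and the actual reuse of the memory is what makes
+a dangling pointer less likely to alias a fresh allocation.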
+
+Randomness
+----------
+It is important for the allocator to not make use of fixed addresses. We use
+the dynamic base option for the SizeClassAllocator, allowing us to benefit
+from the randomness of the system memory mapping functions.
+
+Usage
+=====
+
+Library
+-------
+The allocator static library can be built from the LLVM build tree with the
+``scudo`` CMake rule. The associated tests can be run with the ``check-scudo``
+CMake rule.
+
+Linking the static library to your project can require the use of the
+``whole-archive`` linker flag (or equivalent), depending on your linker.
+Additional flags might also be necessary.
+
+Your linked binary should now make use of the Scudo allocation and deallocation
+functions.
+
+You may also build Scudo like this:
+
+.. code:: console
+
+  cd $LLVM/projects/compiler-rt/lib
+  clang++ -fPIC -std=c++11 -msse4.2 -O2 -I. scudo/*.cpp \
+    $(\ls sanitizer_common/*.{cc,S} | grep -v "sanitizer_termination\|sanitizer_common_nolibc\|sancov_\|sanitizer_unwind\|sanitizer_symbol") \
+    -shared -o libscudo.so -pthread
+
+and then use it with existing binaries as follows:
+
+.. code:: console
+
+  LD_PRELOAD=`pwd`/libscudo.so ./a.out
+
+Clang
+-----
+With a recent version of Clang (post rL317337), the allocator can be linked with
+a binary at compilation using the ``-fsanitize=scudo`` command-line argument, if
+the target platform is supported. Currently, UBSan is the only other Sanitizer
+that Scudo is compatible with (e.g. ``-fsanitize=scudo,undefined``). Compiling
+with Scudo will also enforce PIE for the output binary.
+
+Options
+-------
+Several aspects of the allocator can be configured on a per-process basis
+in the following ways:
+
+- at compile time, by defining ``SCUDO_DEFAULT_OPTIONS`` to the options string
+  you want set by default;
+
+- by defining a ``__scudo_default_options`` function in one's program that
+  returns the options string to be parsed. Said function must have the following
+  prototype: ``extern "C" const char* __scudo_default_options(void)``, with a
+  default visibility. This will override the compile time define;
+
+- through the environment variable ``SCUDO_OPTIONS``, containing the options
+  string to be parsed. Options defined this way will override any definition
+  made through ``__scudo_default_options``.
+
+The options string follows a syntax similar to ASan, where distinct options
+can be assigned in the same string, separated by colons.
+
+For example, using the environment variable:
+
+.. code:: console
+
+  SCUDO_OPTIONS="DeleteSizeMismatch=1:QuarantineSizeKb=64" ./a.out
+
+Or using the function:
+
+.. code:: cpp
+
+  extern "C" const char *__scudo_default_options() {
+    return "DeleteSizeMismatch=1:QuarantineSizeKb=64";
+  }
+
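+To make the string syntax concrete, here is a small sketch that splits a
+colon-separated ``Key=Value`` string and lets a later source override an
+earlier one. It is not the parser Scudo uses: ``parseOptions`` and
+``overlayOptions`` are hypothetical helpers, and the per-key override is one
+reading of the precedence rules above.
+

```cpp
#include <map>
#include <sstream>
#include <string>

// Splits "Key=Value:Key=Value" into a map. Simplified sketch: it neither
// validates option names nor handles quoting.
std::map<std::string, std::string> parseOptions(const std::string &Options) {
  std::map<std::string, std::string> Parsed;
  std::istringstream Stream(Options);
  std::string Item;
  while (std::getline(Stream, Item, ':')) {
    const std::string::size_type Eq = Item.find('=');
    if (Eq == std::string::npos || Eq == 0)
      continue; // skip malformed items in this sketch
    Parsed[Item.substr(0, Eq)] = Item.substr(Eq + 1);
  }
  return Parsed;
}

// Applies Overrides on top of Base, e.g. the environment variable on top of
// the string returned by the default-options function.
std::map<std::string, std::string>
overlayOptions(std::map<std::string, std::string> Base,
               const std::map<std::string, std::string> &Overrides) {
  for (const auto &KV : Overrides)
    Base[KV.first] = KV.second;
  return Base;
}
```

+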
+
+The following options are available:
+
++-----------------------------+----------------+----------------+------------------------------------------------+
+| Option                      | 64-bit default | 32-bit default | Description                                    |
++-----------------------------+----------------+----------------+------------------------------------------------+
+| QuarantineSizeKb            | 256            | 64             | The size (in Kb) of quarantine used to delay   |
+|                             |                |                | the actual deallocation of chunks. Lower value |
+|                             |                |                | may reduce memory usage but decrease the       |
+|                             |                |                | effectiveness of the mitigation; a negative    |
+|                             |                |                | value will fall back to the defaults. Setting  |
+|                             |                |                | *both* this and ThreadLocalQuarantineSizeKb to |
+|                             |                |                | zero will disable the quarantine entirely.     |
++-----------------------------+----------------+----------------+------------------------------------------------+
+| QuarantineChunksUpToSize    | 2048           | 512            | Size (in bytes) up to which chunks can be      |
+|                             |                |                | quarantined.                                   |
++-----------------------------+----------------+----------------+------------------------------------------------+
+| ThreadLocalQuarantineSizeKb | 1024           | 256            | The size (in Kb) of per-thread cache used to   |
+|                             |                |                | offload the global quarantine. Lower value may |
+|                             |                |                | reduce memory usage but might increase         |
+|                             |                |                | contention on the global quarantine. Setting   |
+|                             |                |                | *both* this and QuarantineSizeKb to zero will  |
+|                             |                |                | disable the quarantine entirely.               |
++-----------------------------+----------------+----------------+------------------------------------------------+
+| DeallocationTypeMismatch    | true           | true           | Whether or not we report errors on             |
+|                             |                |                | malloc/delete, new/free, new/delete[], etc.    |
++-----------------------------+----------------+----------------+------------------------------------------------+
+| DeleteSizeMismatch          | true           | true           | Whether or not we report errors on mismatch    |
+|                             |                |                | between sizes of new and delete.               |
++-----------------------------+----------------+----------------+------------------------------------------------+
+| ZeroContents                | false          | false          | Whether or not we zero chunk contents on       |
+|                             |                |                | allocation and deallocation.                   |
++-----------------------------+----------------+----------------+------------------------------------------------+
+
+Allocator-related common Sanitizer options can also be passed through Scudo
+options, such as ``allocator_may_return_null`` or ``abort_on_error``. A detailed
+list including those can be found here:
+https://github.com/google/sanitizers/wiki/SanitizerCommonFlags.
+
+Error Types
+===========
+
+The allocator will output an error message, and potentially terminate the
+process, when an unexpected behavior is detected. The output usually starts with
+``"Scudo ERROR:"`` followed by a short summary of the problem that occurred as
+well as the pointer(s) involved. Once again, Scudo is meant to be a mitigation,
+and might not be the most useful of tools to help you root-cause the issue;
+please consider `ASan <https://github.com/google/sanitizers/wiki/AddressSanitizer>`_
+for this purpose.
+
+Here is a list of the current error messages and their potential cause:
+
+- ``"corrupted chunk header"``: the checksum verification of the chunk header
+  has failed. This is likely due to one of two things: the header was
+  overwritten (partially or totally), or the pointer passed to the function is
+  not a chunk at all;
+
+- ``"race on chunk header"``: two different threads are attempting to manipulate
+  the same header at the same time. This is usually symptomatic of a
+  race-condition or general lack of locking when performing operations on that
+  chunk;
+
+- ``"invalid chunk state"``: the chunk is not in the expected state for a given
+  operation, e.g. it is not allocated when trying to free it, or it is not
+  quarantined when trying to recycle it. A double-free is the typical reason
+  this error would occur;
+
+- ``"misaligned pointer"``: we strongly enforce basic alignment requirements, 8
+  bytes on 32-bit platforms, 16 bytes on 64-bit platforms. If a pointer passed
+  to our functions does not fit those, something is definitely wrong.
+
+- ``"allocation type mismatch"``: when the optional deallocation type mismatch
+  check is enabled, a deallocation function called on a chunk has to match the
+  type of function that was called to allocate it. The security implications of
+  such a mismatch are not necessarily obvious and depend on the situation;
+
+- ``"invalid sized delete"``: when the C++14 sized delete operator is used, and
+  the optional check enabled, this indicates that the size passed when
+  deallocating a chunk is not congruent with the one requested when allocating
+  it. This is likely to be a `compiler issue <https://software.intel.com/en-us/forums/intel-c-compiler/topic/783942>`_,
+  as was the case with Intel C++ Compiler, or some type confusion on the object
+  being deallocated;
+
+- ``"RSS limit exhausted"``: the maximum RSS optionally specified has been
+  exceeded.
+
+Several other error messages relate to parameter checking on the libc allocation
+APIs and are fairly straightforward to understand.
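+
+The state check behind ``"invalid chunk state"`` can be pictured as a small
+state machine. The sketch below is illustrative only: the enumerator and
+function names are hypothetical, and it assumes the quarantine is enabled so
+that every freed chunk passes through the quarantined state.
+

```cpp
// States named in the chunk header description; identifiers are hypothetical.
enum class ChunkState { Available, Allocated, Quarantined };
enum class Operation { Allocate, Deallocate, Recycle };

// Advances State if the operation is legal for the current state and returns
// true. A false return models the "invalid chunk state" error.
bool applyOperation(ChunkState &State, Operation Op) {
  switch (Op) {
  case Operation::Allocate:
    if (State != ChunkState::Available)
      return false;
    State = ChunkState::Allocated;
    return true;
  case Operation::Deallocate:
    if (State != ChunkState::Allocated)
      return false; // e.g. a double-free
    State = ChunkState::Quarantined;
    return true;
  case Operation::Recycle:
    if (State != ChunkState::Quarantined)
      return false;
    State = ChunkState::Available;
    return true;
  }
  return false;
}
```
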

Added: www-releases/trunk/9.0.0/docs/_sources/SegmentedStacks.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/SegmentedStacks.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/SegmentedStacks.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/SegmentedStacks.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,77 @@
+========================
+Segmented Stacks in LLVM
+========================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+Segmented stacks allow stack space to be allocated incrementally rather than as
+a monolithic chunk (of some worst case size) at thread initialization. This is
+done by allocating stack blocks (henceforth called *stacklets*) and linking them
+into a doubly linked list. The function prologue is responsible for checking
+whether the current stacklet has enough space for the function to execute and,
+if not, calling into the libgcc runtime to allocate more stack space. Segmented
+stacks are enabled with the ``"split-stack"`` attribute on LLVM functions.
+
+The runtime functionality is `already there in libgcc
+<http://gcc.gnu.org/wiki/SplitStacks>`_.
+
+Implementation Details
+======================
+
+.. _allocating stacklets:
+
+Allocating Stacklets
+--------------------
+
+As mentioned above, the function prologue checks if the current stacklet has
+enough space. The current approach is to use a slot in the TCB to store the
+current stack limit (minus the amount of space needed to allocate a new block);
+this slot's offset is dictated by ``libgcc``. The generated assembly looks like
+this on x86-64:
+
+.. code-block:: text
+
+    leaq     -8(%rsp), %r10
+    cmpq     %fs:112,  %r10
+    jg       .LBB0_2
+
+    # More stack space needs to be allocated
+    movabsq  $8, %r10   # The amount of space needed
+    movabsq  $0, %r11   # The total size of arguments passed on stack
+    callq    __morestack
+    ret                 # The reason for this extra return is explained below
+  .LBB0_2:
+    # Usual prologue continues here
+
+The size of function arguments on the stack needs to be passed to
+``__morestack`` (this function is implemented in ``libgcc``) since that number
+of bytes has to be copied from the previous stacklet to the current one. This is
+so that SP (and FP) relative addressing of function arguments works as expected.
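+
+The comparison made by the prologue above can be modeled in a few lines. This
+is only a sketch of the decision, with hypothetical names; it assumes the
+subtraction does not wrap around and ignores how the limit is actually read
+from the TCB.
+

```cpp
#include <cstdint>

// Models: leaq -FrameSize(%rsp), %r10 ; cmpq <limit slot>, %r10 ; jg .LBB0_2
// Returns true when the prologue would fall through to the __morestack call.
bool needsMoreStack(std::uintptr_t StackPointer, std::uintptr_t FrameSize,
                    std::uintptr_t StackLimit) {
  const std::uintptr_t NewTop = StackPointer - FrameSize;
  return !(NewTop > StackLimit);
}
```
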
+
+The unusual ``ret`` is needed to have the function which made a call to
+``__morestack`` return correctly. ``__morestack``, instead of returning, calls
+into ``.LBB0_2``. This is possible since both the size of the ``ret``
+instruction and the PC of the call to ``__morestack`` are known. When the
+function body returns, control is transferred back to ``__morestack``.
+``__morestack`` then de-allocates the new stacklet, restores the correct SP
+value, and does a second return, which returns control to the correct caller.
+
+Variable Sized Allocas
+----------------------
+
+The section on `allocating stacklets`_ implicitly assumes that every stack
+frame will be of fixed size. However, LLVM allows the use of the ``alloca``
+instruction to allocate dynamically sized blocks of memory on the stack. When
+faced with such a variable-sized alloca, code is generated to:
+
+* Check if the current stacklet has enough space. If yes, just bump the SP, like
+  in the normal case.
+* If not, generate a call to ``libgcc``, which allocates the memory from the
+  heap.
+
+The memory allocated from the heap is linked into a list in the current
+stacklet, and freed along with it. This prevents a memory leak.
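+
+The two cases can be sketched as below. The ``Stacklet`` structure and both
+functions are hypothetical stand-ins for what the generated code and the
+``libgcc`` runtime do; the sketch assumes the stack grows downwards and that
+the stack pointer never falls below the limit.
+

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical model of a stacklet; not libgcc's data structure.
struct Stacklet {
  std::uintptr_t StackPointer;    // current SP
  std::uintptr_t Limit;           // lowest usable address
  std::vector<void *> HeapBlocks; // heap memory owned by this stacklet
};

// Serves a variable-sized alloca from the stacklet when it fits, otherwise
// from the heap, recording the block so it can be freed with the stacklet.
void *dynamicAlloca(Stacklet &S, std::size_t Size) {
  if (S.StackPointer - S.Limit >= Size) {
    S.StackPointer -= Size; // enough room: just bump the SP
    return reinterpret_cast<void *>(S.StackPointer);
  }
  void *Block = std::malloc(Size);
  if (Block)
    S.HeapBlocks.push_back(Block);
  return Block;
}

// Frees every heap block linked to the stacklet when it is torn down.
void releaseStacklet(Stacklet &S) {
  for (void *Block : S.HeapBlocks)
    std::free(Block);
  S.HeapBlocks.clear();
}
```
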

Added: www-releases/trunk/9.0.0/docs/_sources/SourceLevelDebugging.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/SourceLevelDebugging.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/SourceLevelDebugging.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/SourceLevelDebugging.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,2139 @@
+================================
+Source Level Debugging with LLVM
+================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+This document is the central repository for all information pertaining to debug
+information in LLVM.  It describes the :ref:`actual format that the LLVM debug
+information takes <format>`, which is useful for those interested in creating
+front-ends or dealing directly with the information.  Further, this document
+provides specific examples of what debug information for C/C++ looks like.
+
+Philosophy behind LLVM debugging information
+--------------------------------------------
+
+The idea of the LLVM debugging information is to capture how the important
+pieces of the source-language's Abstract Syntax Tree map onto LLVM code.
+Several design aspects have shaped the solution that appears here.  The
+important ones are:
+
+* Debugging information should have very little impact on the rest of the
+  compiler.  No transformations, analyses, or code generators should need to
+  be modified because of debugging information.
+
+* LLVM optimizations should interact in :ref:`well-defined and easily described
+  ways <intro_debugopt>` with the debugging information.
+
+* Because LLVM is designed to support arbitrary programming languages,
+  LLVM-to-LLVM tools should not need to know anything about the semantics of
+  the source-level-language.
+
+* Source-level languages are often **widely** different from one another.
+  LLVM should not put any restrictions on the flavor of the source-language,
+  and the debugging information should work with any language.
+
+* With code generator support, it should be possible to use an LLVM compiler
+  to compile a program to native machine code and standard debugging
+  formats.  This allows compatibility with traditional machine-code level
+  debuggers, like GDB or DBX.
+
+The approach used by the LLVM implementation is to use a small set of
+:ref:`intrinsic functions <format_common_intrinsics>` to define a mapping
+between LLVM program objects and the source-level objects.  The description of
+the source-level program is maintained in LLVM metadata in an
+:ref:`implementation-defined format <ccxx_frontend>` (the C/C++ front-end
+currently uses working draft 7 of the `DWARF 3 standard
+<http://www.eagercon.com/dwarf/dwarf3std.htm>`_).
+
+When a program is being debugged, a debugger interacts with the user and turns
+the stored debug information into source-language specific information.  As
+such, a debugger must be aware of the source-language, and is thus tied to a
+specific language or family of languages.
+
+Debug information consumers
+---------------------------
+
+The role of debug information is to provide meta information normally stripped
+away during the compilation process.  This meta information provides an LLVM
+user a relationship between generated code and the original program source
+code.
+
+Currently, there are two backend consumers of debug info: DwarfDebug and
+CodeViewDebug. DwarfDebug produces DWARF suitable for use with GDB, LLDB, and
+other DWARF-based debuggers. :ref:`CodeViewDebug <codeview>` produces CodeView,
+the Microsoft debug info format, which is usable with Microsoft debuggers such
+as Visual Studio and WinDBG. LLVM's debug information format is mostly derived
+from and inspired by DWARF, but it is feasible to translate into other target
+debug info formats such as STABS.
+
+It would also be reasonable to use debug information to feed profiling tools
+for analysis of generated code, or tools for reconstructing the original
+source from generated code.
+
+.. _intro_debugopt:
+
+Debug information and optimizations
+-----------------------------------
+
+An extremely high priority of LLVM debugging information is to make it interact
+well with optimizations and analysis.  In particular, the LLVM debug
+information provides the following guarantees:
+
+* LLVM debug information **always provides information to accurately read
+  the source-level state of the program**, regardless of which LLVM
+  optimizations have been run, and without any modification to the
+  optimizations themselves.  However, some optimizations may impact the
+  ability to modify the current state of the program with a debugger, such
+  as setting program variables, or calling functions that have been
+  deleted.
+
+* As desired, LLVM optimizations can be upgraded to be aware of debugging
+  information, allowing them to update the debugging information as they
+  perform aggressive optimizations.  This means that, with effort, the LLVM
+  optimizers could optimize debug code just as well as non-debug code.
+
+* LLVM debug information does not prevent optimizations from
+  happening (for example inlining, basic block reordering/merging/cleanup,
+  tail duplication, etc).
+
+* LLVM debug information is automatically optimized along with the rest of
+  the program, using existing facilities.  For example, duplicate
+  information is automatically merged by the linker, and unused information
+  is automatically removed.
+
+Basically, the debug information allows you to compile a program with
+"``-O0 -g``" and get full debug information, allowing you to arbitrarily modify
+the program as it executes from a debugger.  Compiling a program with
+"``-O3 -g``" gives you full debug information that is always available and
+accurate for reading (e.g., you get accurate stack traces despite tail call
+elimination and inlining), but you might lose the ability to modify the program
+and call functions which were optimized out of the program, or inlined away
+completely.
+
+The :doc:`LLVM test-suite <TestSuiteMakefileGuide>` provides a framework to
+test the optimizer's handling of debugging information.  It can be run like
+this:
+
+.. code-block:: bash
+
+  % cd llvm/projects/test-suite/MultiSource/Benchmarks  # or some other level
+  % make TEST=dbgopt
+
+This will test the impact of debugging information on optimization passes.  If
+debugging information influences optimization passes then it will be reported
+as a failure.  See :doc:`TestingGuide` for more information on LLVM test
+infrastructure and how to run various tests.
+
+.. _format:
+
+Debugging information format
+============================
+
+LLVM debugging information has been carefully designed to make it possible for
+the optimizer to optimize the program and debugging information without
+necessarily having to know anything about debugging information.  In
+particular, the use of metadata avoids duplicated debugging information from
+the beginning, and the global dead code elimination pass automatically deletes
+debugging information for a function if it decides to delete the function.
+
+To do this, most of the debugging information (descriptors for types,
+variables, functions, source files, etc) is inserted by the language front-end
+in the form of LLVM metadata.
+
+Debug information is designed to be agnostic about the target debugger and
+debugging information representation (e.g. DWARF/Stabs/etc).  It uses a generic
+pass to decode the information that represents variables, types, functions,
+namespaces, etc: this allows for arbitrary source-language semantics and
+type-systems to be used, as long as there is a module written for the target
+debugger to interpret the information.
+
+To provide basic functionality, the LLVM debugger does have to make some
+assumptions about the source-level language being debugged, though it keeps
+these to a minimum.  The only common features that the LLVM debugger assumes
+exist are `source files <LangRef.html#difile>`_, and `program objects
+<LangRef.html#diglobalvariable>`_.  These abstract objects are used by a
+debugger to form stack traces, show information about local variables, etc.
+
+This section of the documentation first describes the representation aspects
+common to any source-language.  :ref:`ccxx_frontend` describes the data layout
+conventions used by the C and C++ front-ends.
+
+Debug information descriptors are `specialized metadata nodes
+<LangRef.html#specialized-metadata>`_, first-class subclasses of ``Metadata``.
+
+.. _format_common_intrinsics:
+
+Debugger intrinsic functions
+----------------------------
+
+LLVM uses several intrinsic functions (name prefixed with "``llvm.dbg``") to
+track source local variables through optimization and code generation.
+
+``llvm.dbg.addr``
+^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: llvm
+
+  void @llvm.dbg.addr(metadata, metadata, metadata)
+
+This intrinsic provides information about a local element (e.g., variable).
+The first argument is metadata holding the address of the variable, typically a
+static alloca in the function entry block.  The second argument is a
+`local variable <LangRef.html#dilocalvariable>`_ containing a description of
+the variable.  The third argument is a `complex expression
+<LangRef.html#diexpression>`_.  An ``llvm.dbg.addr`` intrinsic describes the
+*address* of a source variable.
+
+.. code-block:: text
+
+    %i.addr = alloca i32, align 4
+    call void @llvm.dbg.addr(metadata i32* %i.addr, metadata !1,
+                             metadata !DIExpression()), !dbg !2
+    !1 = !DILocalVariable(name: "i", ...) ; int i
+    !2 = !DILocation(...)
+    ...
+    %buffer = alloca [256 x i8], align 8
+    ; The address of i is buffer+64.
+    call void @llvm.dbg.addr(metadata [256 x i8]* %buffer, metadata !3,
+                             metadata !DIExpression(DW_OP_plus_uconst, 64)), !dbg !4
+    !3 = !DILocalVariable(name: "i", ...) ; int i
+    !4 = !DILocation(...)
+
+A frontend should generate exactly one call to ``llvm.dbg.addr`` at the point
+of declaration of a source variable. Optimization passes that fully promote the
+variable from memory to SSA values will replace this call with possibly
+multiple calls to ``llvm.dbg.value``. Passes that delete stores effectively
+perform partial promotion, and they will insert a mix of calls to
+``llvm.dbg.value`` and ``llvm.dbg.addr`` to track the source variable value when
+it is available. After optimization, there may be multiple calls to
+``llvm.dbg.addr`` describing the program points where the variable lives in
+memory. All calls for the same concrete source variable must agree on the
+memory location.
+
+
+``llvm.dbg.declare``
+^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: llvm
+
+  void @llvm.dbg.declare(metadata, metadata, metadata)
+
+This intrinsic is identical to ``llvm.dbg.addr``, except that there can only be
+one call to ``llvm.dbg.declare`` for a given concrete `local variable
+<LangRef.html#dilocalvariable>`_. It is not control-dependent, meaning that if
+a call to ``llvm.dbg.declare`` exists and has a valid location argument, that
+address is considered to be the true home of the variable across its entire
+lifetime. This makes it hard for optimizations to preserve accurate debug info
+in the presence of ``llvm.dbg.declare``, so we are transitioning away from it,
+and we plan to deprecate it in future LLVM releases.
+
+
+``llvm.dbg.value``
+^^^^^^^^^^^^^^^^^^
+
+.. code-block:: llvm
+
+  void @llvm.dbg.value(metadata, metadata, metadata)
+
+This intrinsic provides information when a user source variable is set to a new
+value.  The first argument is the new value (wrapped as metadata).  The second
+argument is a `local variable <LangRef.html#dilocalvariable>`_ containing a
+description of the variable.  The third argument is a `complex expression
+<LangRef.html#diexpression>`_.
+
+An ``llvm.dbg.value`` intrinsic describes the *value* of a source variable
+directly, not its address.  Note that the value operand of this intrinsic may
+be indirect (i.e., a pointer to the source variable), provided that interpreting
+the complex expression derives the direct value.
+
+Object lifetimes and scoping
+============================
+
+In many languages, the local variables in functions can have their lifetimes or
+scopes limited to a subset of a function.  In the C family of languages, for
+example, variables are only live (readable and writable) within the source
+block that they are defined in.  In functional languages, values are only
+readable after they have been defined.  Though this is a very obvious concept,
+it is non-trivial to model in LLVM, because it has no notion of scoping in this
+sense, and does not want to be tied to a language's scoping rules.
+
+In order to handle this, the LLVM debug format uses the metadata attached to
+LLVM instructions to encode line number and scoping information.  Consider the
+following C fragment, for example:
+
+.. code-block:: c
+
+  1.  void foo() {
+  2.    int X = 21;
+  3.    int Y = 22;
+  4.    {
+  5.      int Z = 23;
+  6.      Z = X;
+  7.    }
+  8.    X = Y;
+  9.  }
+
+.. FIXME: Update the following example to use llvm.dbg.addr once that is the
+   default in clang.
+
+Compiled to LLVM, this function would be represented like this:
+
+.. code-block:: text
+
+  ; Function Attrs: nounwind ssp uwtable
+  define void @foo() #0 !dbg !4 {
+  entry:
+    %X = alloca i32, align 4
+    %Y = alloca i32, align 4
+    %Z = alloca i32, align 4
+    call void @llvm.dbg.declare(metadata i32* %X, metadata !11, metadata !13), !dbg !14
+    store i32 21, i32* %X, align 4, !dbg !14
+    call void @llvm.dbg.declare(metadata i32* %Y, metadata !15, metadata !13), !dbg !16
+    store i32 22, i32* %Y, align 4, !dbg !16
+    call void @llvm.dbg.declare(metadata i32* %Z, metadata !17, metadata !13), !dbg !19
+    store i32 23, i32* %Z, align 4, !dbg !19
+    %0 = load i32, i32* %X, align 4, !dbg !20
+    store i32 %0, i32* %Z, align 4, !dbg !21
+    %1 = load i32, i32* %Y, align 4, !dbg !22
+    store i32 %1, i32* %X, align 4, !dbg !23
+    ret void, !dbg !24
+  }
+
+  ; Function Attrs: nounwind readnone
+  declare void @llvm.dbg.declare(metadata, metadata, metadata) #1
+
+  attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
+  attributes #1 = { nounwind readnone }
+
+  !llvm.dbg.cu = !{!0}
+  !llvm.module.flags = !{!7, !8, !9}
+  !llvm.ident = !{!10}
+
+  !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 3.7.0 (trunk 231150) (llvm/trunk 231154)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !2, subprograms: !3, globals: !2, imports: !2)
+  !1 = !DIFile(filename: "/dev/stdin", directory: "/Users/dexonsmith/data/llvm/debug-info")
+  !2 = !{}
+  !3 = !{!4}
+  !4 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !5, isLocal: false, isDefinition: true, scopeLine: 1, isOptimized: false, variables: !2)
+  !5 = !DISubroutineType(types: !6)
+  !6 = !{null}
+  !7 = !{i32 2, !"Dwarf Version", i32 2}
+  !8 = !{i32 2, !"Debug Info Version", i32 3}
+  !9 = !{i32 1, !"PIC Level", i32 2}
+  !10 = !{!"clang version 3.7.0 (trunk 231150) (llvm/trunk 231154)"}
+  !11 = !DILocalVariable(name: "X", scope: !4, file: !1, line: 2, type: !12)
+  !12 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
+  !13 = !DIExpression()
+  !14 = !DILocation(line: 2, column: 9, scope: !4)
+  !15 = !DILocalVariable(name: "Y", scope: !4, file: !1, line: 3, type: !12)
+  !16 = !DILocation(line: 3, column: 9, scope: !4)
+  !17 = !DILocalVariable(name: "Z", scope: !18, file: !1, line: 5, type: !12)
+  !18 = distinct !DILexicalBlock(scope: !4, file: !1, line: 4, column: 5)
+  !19 = !DILocation(line: 5, column: 11, scope: !18)
+  !20 = !DILocation(line: 6, column: 11, scope: !18)
+  !21 = !DILocation(line: 6, column: 9, scope: !18)
+  !22 = !DILocation(line: 8, column: 9, scope: !4)
+  !23 = !DILocation(line: 8, column: 7, scope: !4)
+  !24 = !DILocation(line: 9, column: 3, scope: !4)
+
+
+This example illustrates a few important details about LLVM debugging
+information.  In particular, it shows how the ``llvm.dbg.declare`` intrinsic and
+location information, which are attached to an instruction, are applied
+together to allow a debugger to analyze the relationship between statements,
+variable definitions, and the code used to implement the function.
+
+.. code-block:: llvm
+
+  call void @llvm.dbg.declare(metadata i32* %X, metadata !11, metadata !13), !dbg !14
+    ; [debug line = 2:9] [debug variable = X]
+
+The first ``llvm.dbg.declare`` intrinsic encodes debugging information for the
+variable ``X``.  The metadata ``!dbg !14`` attached to the intrinsic provides
+scope information for the variable ``X``.
+
+.. code-block:: text
+
+  !14 = !DILocation(line: 2, column: 9, scope: !4)
+  !4 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !5,
+                              isLocal: false, isDefinition: true, scopeLine: 1,
+                              isOptimized: false, variables: !2)
+
+Here ``!14`` is metadata providing `location information
+<LangRef.html#dilocation>`_.  In this example, scope is encoded by ``!4``, a
+`subprogram descriptor <LangRef.html#disubprogram>`_.  This way the location
+information attached to the intrinsics indicates that the variable ``X`` is
+declared at line number 2 at a function level scope in function ``foo``.
+
+Now let's take another example.
+
+.. code-block:: llvm
+
+  call void @llvm.dbg.declare(metadata i32* %Z, metadata !17, metadata !13), !dbg !19
+    ; [debug line = 5:11] [debug variable = Z]
+
+The third ``llvm.dbg.declare`` intrinsic encodes debugging information for
+variable ``Z``.  The metadata ``!dbg !19`` attached to the intrinsic provides
+scope information for the variable ``Z``.
+
+.. code-block:: text
+
+  !18 = distinct !DILexicalBlock(scope: !4, file: !1, line: 4, column: 5)
+  !19 = !DILocation(line: 5, column: 11, scope: !18)
+
+Here ``!19`` indicates that ``Z`` is declared at line number 5 and column
+number 11 inside of lexical scope ``!18``.  The lexical scope itself resides
+inside of subprogram ``!4`` described above.
+
+The scope information attached to each instruction provides a straightforward
+way to find instructions covered by a scope.
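+
+As a rough illustration of that lookup, the sketch below models the scope and
+location metadata from this example as plain Python dictionaries and collects
+the locations covered by a scope. The ``SCOPES`` and ``LOCATIONS`` tables and
+both helper functions are hypothetical names invented for this illustration;
+they are not part of any LLVM API.
+
```python
# Hypothetical, hand-built model of the metadata from the example above.
# Each scope id maps to (kind, parent scope id or None).
SCOPES = {
    "!4": ("DISubprogram", None),      # function foo
    "!18": ("DILexicalBlock", "!4"),   # block nested inside foo
}

# Each !dbg attachment is a DILocation: (line, column, scope id).
LOCATIONS = {
    "!14": (2, 9, "!4"),
    "!16": (3, 9, "!4"),
    "!19": (5, 11, "!18"),
    "!20": (6, 11, "!18"),
}


def scope_chain(scope_id):
    """Return the scope and all of its ancestors, innermost first."""
    chain = []
    while scope_id is not None:
        chain.append(scope_id)
        scope_id = SCOPES[scope_id][1]
    return chain


def locations_covered_by(scope_id):
    """Return the location ids whose scope chain includes scope_id."""
    return sorted(
        loc for loc, (_, _, scope) in LOCATIONS.items()
        if scope_id in scope_chain(scope)
    )
```
+
+Under these assumptions, ``locations_covered_by("!18")`` yields ``!19`` and
+``!20``, while ``locations_covered_by("!4")`` also includes ``!14`` and ``!16``
+because the lexical block is nested inside the subprogram.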
+
+Object lifetime in optimized code
+=================================
+
+In the example above, every variable assignment uniquely corresponds to a
+memory store to the variable's position on the stack. However, in heavily
+optimized code LLVM promotes most variables into SSA values, which can
+eventually be placed in physical registers or memory locations. To track SSA
+values through compilation, when objects are promoted to SSA values an
+``llvm.dbg.value`` intrinsic is created for each assignment, recording the
+variable's new location. Compared with the ``llvm.dbg.declare`` intrinsic:
+
+* A dbg.value terminates the effect of any preceding dbg.values for (any
+  overlapping fragments of) the specified variable.
+* The dbg.value's position in the IR defines where in the instruction stream
+  the variable's value changes.
+* Operands can be constants, indicating the variable is assigned a
+  constant value.
+
+Care must be taken to update ``llvm.dbg.value`` intrinsics when optimization
+passes alter or move instructions and blocks -- the developer could observe such
+changes reflected in the value of variables when debugging the program. For any
+execution of the optimized program, the set of variable values presented to the
+developer by the debugger should not show a state that would never have existed
+in the execution of the unoptimized program, given the same input. Doing so
+risks misleading the developer by reporting a state that does not exist,
+damaging their understanding of the optimized program and undermining their
+trust in the debugger.
+
+Sometimes perfectly preserving variable locations is not possible, often when a
+redundant calculation is optimized out. In such cases, an ``llvm.dbg.value``
+with operand ``undef`` should be used, to terminate earlier variable locations
+and let the debugger present ``optimized out`` to the developer. Withholding
+these potentially stale variable values from the developer diminishes the
+amount of available debug information, but increases the reliability of the
+remaining information.
+ 
+To illustrate some potential issues, consider the following example:
+
+.. code-block:: llvm
+
+  define i32 @foo(i32 %bar, i1 %cond) {
+  entry:
+    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
+    br i1 %cond, label %truebr, label %falsebr
+  truebr:
+    %tval = add i32 %bar, 1
+    call void @llvm.dbg.value(metadata i32 %tval, metadata !1, metadata !2)
+    %g1 = call i32 @gazonk()
+    br label %exit
+  falsebr:
+    %fval = add i32 %bar, 2
+    call void @llvm.dbg.value(metadata i32 %fval, metadata !1, metadata !2)
+    %g2 = call i32 @gazonk()
+    br label %exit
+  exit:
+    %merge = phi i32 [ %tval, %truebr ], [ %fval, %falsebr ]
+    %g = phi i32 [ %g1, %truebr ], [ %g2, %falsebr ]
+    call void @llvm.dbg.value(metadata i32 %merge, metadata !1, metadata !2)
+    call void @llvm.dbg.value(metadata i32 %g, metadata !3, metadata !2)
+    %plusten = add i32 %merge, 10
+    %toret = add i32 %plusten, %g
+    call void @llvm.dbg.value(metadata i32 %toret, metadata !1, metadata !2)
+    ret i32 %toret
+  }
+
+This function contains two source-level variables, ``!1`` and ``!3``. It could,
+perhaps, be optimized into the following code:
+
+.. code-block:: llvm
+
+  define i32 @foo(i32 %bar, i1 %cond) {
+  entry:
+    %g = call i32 @gazonk()
+    %addoper = select i1 %cond, i32 11, i32 12
+    %plusten = add i32 %bar, %addoper
+    %toret = add i32 %plusten, %g
+    ret i32 %toret
+  }
+
+What ``llvm.dbg.value`` intrinsics should be placed to represent the original
+variable locations in this code? Unfortunately, the second, third, and fourth
+dbg.values for ``!1`` in the source function have had their operands
+(``%tval``, ``%fval``, ``%merge``) optimized out. Assuming we cannot recover
+them, we might consider this placement of dbg.values:
+
+.. code-block:: llvm
+
+  define i32 @foo(i32 %bar, i1 %cond) {
+  entry:
+    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
+    %g = call i32 @gazonk()
+    call void @llvm.dbg.value(metadata i32 %g, metadata !3, metadata !2)
+    %addoper = select i1 %cond, i32 11, i32 12
+    %plusten = add i32 %bar, %addoper
+    %toret = add i32 %plusten, %g
+    call void @llvm.dbg.value(metadata i32 %toret, metadata !1, metadata !2)
+    ret i32 %toret
+  }
+
+However, this will cause ``!3`` to have the return value of ``@gazonk()`` at
+the same time as ``!1`` has the constant value zero -- a pair of assignments
+that never occurred in the unoptimized program. To avoid this, we must terminate
+the range over which ``!1`` holds the constant value by inserting an undef
+dbg.value before the dbg.value for ``!3``:
+
+.. code-block:: llvm
+
+  define i32 @foo(i32 %bar, i1 %cond) {
+  entry:
+    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
+    %g = call i32 @gazonk()
+    call void @llvm.dbg.value(metadata i32 undef, metadata !1, metadata !2)
+    call void @llvm.dbg.value(metadata i32 %g, metadata !3, metadata !2)
+    %addoper = select i1 %cond, i32 11, i32 12
+    %plusten = add i32 %bar, %addoper
+    %toret = add i32 %plusten, %g
+    call void @llvm.dbg.value(metadata i32 %toret, metadata !1, metadata !2)
+    ret i32 %toret
+  }
+
+In general, if any dbg.value has its operand optimized out and cannot be
+recovered, then an undef dbg.value is necessary to terminate earlier variable
+locations. Additional undef dbg.values may be necessary when the debugger can
+observe re-ordering of assignments.
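+
+The effect of the extra undef dbg.value can be checked with a small replay of
+the dbg.value records from the two optimized functions above. This is only a
+sketch: it ignores the ordinary instructions between the records, and the
+``observable_states`` helper and the ``naive``, ``fixed`` and ``bogus`` names
+are hypothetical, introduced here purely for illustration.
+
```python
UNDEF = "undef"


def observable_states(dbg_values):
    """Replay (variable, value) dbg.value records in program order.

    Returns the variable-to-value map a debugger could show after each
    record takes effect.
    """
    state = {}
    snapshots = []
    for variable, value in dbg_values:
        state[variable] = value
        snapshots.append(dict(state))
    return snapshots


# dbg.value records from the two optimized functions above, in order.
naive = [("!1", "0"), ("!3", "%g"), ("!1", "%toret")]
fixed = [("!1", "0"), ("!1", UNDEF), ("!3", "%g"), ("!1", "%toret")]

# The pairing that never existed in the unoptimized program.
bogus = {"!1": "0", "!3": "%g"}

print(bogus in observable_states(naive))  # the naive placement exposes it
print(bogus in observable_states(fixed))  # the undef dbg.value hides it
```
+
+In this model the naive placement passes through the state where ``!1`` is
+zero while ``!3`` already holds ``%g``; with the undef dbg.value inserted, that
+state never appears.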
+
+How variable location metadata is transformed during CodeGen
+============================================================
+
+LLVM preserves debug information throughout mid-level and backend passes,
+ultimately producing a mapping between source-level information and
+instruction ranges. This is relatively straightforward for line number
+information, as mapping instructions to line numbers is a simple association.
+For variable locations, however, the story is more complex. As each
+``llvm.dbg.value`` intrinsic represents a source-level assignment of a value
+to a source variable, the variable location intrinsics effectively embed a
+small imperative program within the LLVM IR. By the end of CodeGen, this
+becomes a mapping from each variable to its machine locations over ranges of
+instructions. From IR to object emission, the major transformations which
+affect variable location fidelity are:
+
+1. Instruction Selection
+2. Register allocation
+3. Block layout
+
+Each of these is discussed below. In addition, instruction scheduling can
+significantly change the ordering of the program, and occurs in a number of
+different passes.
+
+Some variable locations are not transformed during CodeGen. Stack locations
+specified by ``llvm.dbg.declare`` are valid and unchanging for the entire
+duration of the function, and are recorded in a simple MachineFunction table.
+Location changes in the prologue and epilogue of a function are also ignored:
+frame setup and destruction may take several instructions, require a
+disproportionate amount of debugging information in the output binary to
+describe, and should be stepped over by debuggers anyway.
+
+Variable locations in Instruction Selection and MIR
+---------------------------------------------------
+
+Instruction selection creates a MIR function from an IR function, and just as
+it transforms ``intermediate`` instructions into machine instructions, so must
+``intermediate`` variable locations become machine variable locations.
+Within IR, variable locations are always identified by a Value, but in MIR
+there can be different types of variable locations. In addition, some IR
+locations become unavailable, for example if the operations of multiple IR
+instructions are combined into one machine instruction (such as
+multiply-and-accumulate) then intermediate Values are lost. To track variable
+locations through instruction selection, they are first separated into
+locations that do not depend on code generation (constants, stack locations,
+allocated virtual registers) and those that do. For those that do, debug
+metadata is attached to SDNodes in SelectionDAGs. After instruction selection
+has occurred and a MIR function is created, if the SDNode associated with debug
+metadata is allocated a virtual register, that virtual register is used as the
+variable location. If the SDNode is folded into a machine instruction or
+otherwise transformed into a non-register, the variable location becomes
+unavailable.
+
+Locations that are unavailable are treated as if they have been optimized out:
+in IR the location would be assigned ``undef`` by a debug intrinsic, and in MIR
+the equivalent location is used.
+
+After MIR locations are assigned to each variable, machine pseudo-instructions
+corresponding to each ``llvm.dbg.value`` and ``llvm.dbg.addr`` intrinsic are
+inserted. These ``DBG_VALUE`` instructions appear thus:
+
+.. code-block:: text
+
+  DBG_VALUE %1, $noreg, !123, !DIExpression()
+
+They have the following operands:
+
+* The first operand can record the variable location as a register,
+  a frame index, an immediate, or the base address register if the original
+  debug intrinsic referred to memory. ``$noreg`` indicates the variable
+  location is undefined, equivalent to an ``undef`` dbg.value operand.
+* The type of the second operand indicates whether the variable location is
+  directly referred to by the DBG_VALUE, or whether it is indirect. The
+  ``$noreg`` register signifies the former, an immediate operand (0) the
+  latter.
+* Operand 3 is the Variable field of the original debug intrinsic.
+* Operand 4 is the Expression field of the original debug intrinsic.
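+
+To make the operand roles concrete, here is a minimal sketch that splits the
+textual form of a DBG_VALUE into its four operands and applies the rules
+listed above. It is plain string handling, not an LLVM API: the
+``describe_dbg_value`` helper is hypothetical, and it assumes the line carries
+no trailing attributes such as ``debug-location``.
+
```python
def describe_dbg_value(line):
    """Split a textual DBG_VALUE into its four operands and classify it."""
    mnemonic, rest = line.strip().split(" ", 1)
    if mnemonic != "DBG_VALUE":
        raise ValueError("not a DBG_VALUE: " + line)
    # Only the first three commas separate operands; the DIExpression may
    # contain commas of its own.
    location, second, variable, expression = rest.split(", ", 3)
    return {
        "location": location,
        # A $noreg first operand means the location is undefined.
        "undefined": location == "$noreg",
        # A $noreg second operand means direct; an immediate means indirect.
        "indirect": second != "$noreg",
        "variable": variable,
        "expression": expression,
    }


info = describe_dbg_value("DBG_VALUE %1, $noreg, !123, !DIExpression()")
print(info["location"], info["undefined"], info["indirect"])
```
+
+For the instruction shown above, the sketch reports a direct, defined location
+in virtual register ``%1``.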
+
+The position at which the DBG_VALUEs are inserted should correspond to the
+positions of their matching ``llvm.dbg.value`` intrinsics in the IR block.  As
+with optimization, LLVM aims to preserve the order in which variable
+assignments occurred in the source program. However, SelectionDAG performs some
+instruction scheduling, which can reorder assignments (discussed below).
+Function parameter locations are moved to the beginning of the function if
+they're not already, to ensure they're immediately available on function entry.
+
+To demonstrate variable locations during instruction selection, consider
+the following example:
+
+.. code-block:: llvm
+
+  define i32 @foo(i32* %addr) {
+  entry:
+    call void @llvm.dbg.value(metadata i32 0, metadata !3, metadata !DIExpression()), !dbg !5
+    br label %bb1, !dbg !5
+
+  bb1:                                              ; preds = %bb1, %entry
+    %bar.0 = phi i32 [ 0, %entry ], [ %add, %bb1 ]
+    call void @llvm.dbg.value(metadata i32 %bar.0, metadata !3, metadata !DIExpression()), !dbg !5
+    %addr1 = getelementptr i32, i32 *%addr, i32 1, !dbg !5
+    call void @llvm.dbg.value(metadata i32 *%addr1, metadata !3, metadata !DIExpression()), !dbg !5
+    %loaded1 = load i32, i32* %addr1, !dbg !5
+    %addr2 = getelementptr i32, i32 *%addr, i32 %bar.0, !dbg !5
+    call void @llvm.dbg.value(metadata i32 *%addr2, metadata !3, metadata !DIExpression()), !dbg !5
+    %loaded2 = load i32, i32* %addr2, !dbg !5
+    %add = add i32 %bar.0, 1, !dbg !5
+    call void @llvm.dbg.value(metadata i32 %add, metadata !3, metadata !DIExpression()), !dbg !5
+    %added = add i32 %loaded1, %loaded2
+    %cond = icmp ult i32 %added, %bar.0, !dbg !5
+    br i1 %cond, label %bb1, label %bb2, !dbg !5
+
+  bb2:                                              ; preds = %bb1
+    ret i32 0, !dbg !5
+  }
+
+If one compiles this IR with ``llc -o - -start-after=codegen-prepare -stop-after=expand-isel-pseudos -mtriple=x86_64--``, the following MIR is produced:
+
+.. code-block:: text
+
+  bb.0.entry:
+    successors: %bb.1(0x80000000)
+    liveins: $rdi
+
+    %2:gr64 = COPY $rdi
+    %3:gr32 = MOV32r0 implicit-def dead $eflags
+    DBG_VALUE 0, $noreg, !3, !DIExpression(), debug-location !5
+
+  bb.1.bb1:
+    successors: %bb.1(0x7c000000), %bb.2(0x04000000)
+
+    %0:gr32 = PHI %3, %bb.0, %1, %bb.1
+    DBG_VALUE %0, $noreg, !3, !DIExpression(), debug-location !5
+    DBG_VALUE %2, $noreg, !3, !DIExpression(DW_OP_plus_uconst, 4, DW_OP_stack_value), debug-location !5
+    %4:gr32 = MOV32rm %2, 1, $noreg, 4, $noreg, debug-location !5 :: (load 4 from %ir.addr1)
+    %5:gr64_nosp = MOVSX64rr32 %0, debug-location !5
+    DBG_VALUE $noreg, $noreg, !3, !DIExpression(), debug-location !5
+    %1:gr32 = INC32r %0, implicit-def dead $eflags, debug-location !5
+    DBG_VALUE %1, $noreg, !3, !DIExpression(), debug-location !5
+    %6:gr32 = ADD32rm %4, %2, 4, killed %5, 0, $noreg, implicit-def dead $eflags :: (load 4 from %ir.addr2)
+    %7:gr32 = SUB32rr %6, %0, implicit-def $eflags, debug-location !5
+    JB_1 %bb.1, implicit $eflags, debug-location !5
+    JMP_1 %bb.2, debug-location !5
+
+  bb.2.bb2:
+    %8:gr32 = MOV32r0 implicit-def dead $eflags
+    $eax = COPY %8, debug-location !5
+    RET 0, $eax, debug-location !5
+
+Observe first that there is a DBG_VALUE instruction for every ``llvm.dbg.value``
+intrinsic in the source IR, ensuring no source level assignments go missing.
+Then consider the different ways in which variable locations have been recorded:
+
+* For the first dbg.value an immediate operand is used to record a zero value.
+* The dbg.value of the PHI instruction leads to a DBG_VALUE of virtual register
+  ``%0``.
+* The first GEP has its effect folded into the first load instruction
+  (as a 4-byte offset), but the variable location is salvaged by folding
+  the GEP's effect into the DIExpression.
+* The second GEP is also folded into the corresponding load. However, it is
+  insufficiently simple to be salvaged, and is emitted as a ``$noreg``
+  DBG_VALUE, indicating that the variable takes on an undefined location.
+* The final dbg.value has its Value placed in virtual register ``%1``.
+
+Instruction Scheduling
+----------------------
+
+A number of passes can reschedule instructions, notably instruction selection
+and the pre- and post-RA machine schedulers. Instruction scheduling can
+significantly change the nature of the program -- in the (very unlikely) worst
+case the instruction sequence could be completely reversed. In such
+circumstances LLVM follows the principle applied to optimizations, that it is
+better for the debugger not to display any state than a misleading state.
+Thus, whenever instructions are advanced in order of execution, any
+corresponding DBG_VALUE is kept in its original position, and if an instruction
+is delayed then the variable is given an undefined location for the duration
+of the delay. To illustrate, consider this pseudo-MIR:
+
+.. code-block:: text
+
+  %1:gr32 = MOV32rm %0, 1, $noreg, 4, $noreg, debug-location !5 :: (load 4 from %ir.addr1)
+  DBG_VALUE %1, $noreg, !1, !2
+  %4:gr32 = ADD32rr %3, %2, implicit-def dead $eflags
+  DBG_VALUE %4, $noreg, !3, !4
+  %7:gr32 = SUB32rr %6, %5, implicit-def dead $eflags
+  DBG_VALUE %7, $noreg, !5, !6
+
+Imagine that the SUB32rr were moved forward to give us the following MIR:
+
+.. code-block:: text
+
+  %7:gr32 = SUB32rr %6, %5, implicit-def dead $eflags
+  %1:gr32 = MOV32rm %0, 1, $noreg, 4, $noreg, debug-location !5 :: (load 4 from %ir.addr1)
+  DBG_VALUE %1, $noreg, !1, !2
+  %4:gr32 = ADD32rr %3, %2, implicit-def dead $eflags
+  DBG_VALUE %4, $noreg, !3, !4
+  DBG_VALUE %7, $noreg, !5, !6
+
+In this circumstance LLVM would leave the MIR as shown above. Were we to move
+the DBG_VALUE of virtual register ``%7`` upwards with the SUB32rr, we would
+re-order assignments and introduce a new state of the program. With the
+solution above, by contrast, the debugger will see one fewer combination of
+variable values, because ``!3`` and ``!5`` will change value at the same time.
+This is preferred over misrepresenting the original program.
+
+In comparison, if the MOV32rm were sunk, LLVM would produce the following:
+
+.. code-block:: text
+
+  DBG_VALUE $noreg, $noreg, !1, !2
+  %4:gr32 = ADD32rr %3, %2, implicit-def dead $eflags
+  DBG_VALUE %4, $noreg, !3, !4
+  %7:gr32 = SUB32rr %6, %5, implicit-def dead $eflags
+  DBG_VALUE %7, $noreg, !5, !6
+  %1:gr32 = MOV32rm %0, 1, $noreg, 4, $noreg, debug-location !5 :: (load 4 from %ir.addr1)
+  DBG_VALUE %1, $noreg, !1, !2
+
+Here, to avoid presenting a state in which the first assignment to ``!1``
+disappears, the DBG_VALUE at the top of the block assigns the variable the
+undefined location, until its value is available at the end of the block where
+an additional DBG_VALUE is added. Were any other DBG_VALUE for ``!1`` to occur
+in the instructions that the MOV32rm was sunk past, the DBG_VALUE for ``%1``
+would be dropped and the debugger would never observe it in the variable. This
+accurately reflects that the value is not available during the corresponding
+portion of the original program.
+
+Variable locations during Register Allocation
+---------------------------------------------
+
+To avoid debug instructions interfering with the register allocator, the
+LiveDebugVariables pass extracts variable locations from a MIR function and
+deletes the corresponding DBG_VALUE instructions. Some localized copy
+propagation is performed within blocks. After register allocation, the
+VirtRegRewriter pass re-inserts DBG_VALUE instructions in their original
+positions, translating virtual register references into their physical
+machine locations. To avoid encoding incorrect variable locations, in this
+pass any DBG_VALUE of a virtual register that is not live is replaced by
+the undefined location.
+
+LiveDebugValues expansion of variable locations
+-----------------------------------------------
+
+After all optimizations have run and shortly before emission, the
+LiveDebugValues pass runs to achieve two aims:
+
+* To propagate the location of variables through copies and register spills,
+* For every block, to record every valid variable location in that block.
+
+After this pass the DBG_VALUE instruction changes meaning: rather than
+corresponding to a source-level assignment where the variable may change value,
+it asserts the location of a variable in a block, and loses effect outside the
+block. Propagating variable locations through copies and spills is
+straightforward; determining the variable location in every basic block
+requires the consideration of control flow. Consider the following IR, which
+presents several difficulties:
+
+.. code-block:: text
+
+  define dso_local i32 @foo(i1 %cond, i32 %input) !dbg !12 {
+  entry:
+    br i1 %cond, label %truebr, label %falsebr
+
+  bb1: 
+    %value = phi i32 [ %value1, %truebr ], [ %value2, %falsebr ]
+    br label %exit, !dbg !26
+
+  truebr:
+    call void @llvm.dbg.value(metadata i32 %input, metadata !30, metadata !DIExpression()), !dbg !24
+    call void @llvm.dbg.value(metadata i32 1, metadata !23, metadata !DIExpression()), !dbg !24
+    %value1 = add i32 %input, 1
+    br label %bb1
+
+  falsebr:
+    call void @llvm.dbg.value(metadata i32 %input, metadata !30, metadata !DIExpression()), !dbg !24
+    call void @llvm.dbg.value(metadata i32 2, metadata !23, metadata !DIExpression()), !dbg !24
+    %value2 = add i32 %input, 2
+    br label %bb1
+
+  exit: 
+    ret i32 %value, !dbg !30
+  }
+
+Here the difficulties are:
+
+* The control flow is roughly the opposite of basic block order
+* The value of the ``!23`` variable merges into ``%bb1``, but there is no PHI
+  node
+
+As mentioned above, the ``llvm.dbg.value`` intrinsics essentially form an
+imperative program embedded in the IR, with each intrinsic defining a variable
+location. This *could* be converted to an SSA form by mem2reg, in the same way
+that it uses use-def chains to identify control flow merges and insert phi
+nodes for IR Values. However, because debug variable locations are defined for
+every machine instruction, in effect every IR instruction uses every variable
+location, which would lead to a large number of debugging intrinsics being
+generated.
+
+Examining the example above, variable ``!30`` is assigned ``%input`` on both
+conditional paths through the function, while ``!23`` is assigned differing
+constant values on either path. Where control flow merges in ``%bb1`` we would
+want ``!30`` to keep its location (``%input``), but ``!23`` to become undefined
+as we cannot determine at runtime what value it should have in ``%bb1`` without
+inserting a PHI node. mem2reg does not insert the PHI node to avoid changing
+codegen when debugging is enabled, and does not insert the other dbg.values
+to avoid adding very large numbers of intrinsics.
+
+Instead, LiveDebugValues determines variable locations when control
+flow merges. A dataflow analysis is used to propagate locations between blocks:
+when control flow merges, if a variable has the same location in all
+predecessors then that location is propagated into the successor. If the
+predecessor locations disagree, the location becomes undefined.
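+
+The merge rule can be sketched in a few lines. This is a simplified model of a
+single join, not the pass itself (a full dataflow analysis would also have to
+iterate over loops until the locations stop changing), and the
+``join_locations`` function and the dictionaries below are hypothetical names
+chosen for this illustration.
+
```python
def join_locations(predecessor_outs):
    """Merge the variable locations leaving each predecessor block.

    predecessor_outs is a list of dicts mapping a variable to its location
    at the end of one predecessor. A variable keeps its location only if
    every predecessor agrees on it; otherwise it is dropped (undefined).
    """
    if not predecessor_outs:
        return {}
    first, *others = predecessor_outs
    return {
        variable: location
        for variable, location in first.items()
        if all(out.get(variable) == location for out in others)
    }


# Locations at the end of %truebr and %falsebr in the IR above.
truebr_out = {"!30": "%input", "!23": "1"}
falsebr_out = {"!30": "%input", "!23": "2"}
bb1_in = join_locations([truebr_out, falsebr_out])
print(bb1_in)
```
+
+Applied to the example, ``!30`` keeps the location ``%input`` on entry to
+``%bb1``, while ``!23`` is dropped because the two predecessors disagree.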
+
+Once LiveDebugValues has run, every block should have all valid variable
+locations described by DBG_VALUE instructions within the block. Very little
+effort is then required by supporting classes (such as
+DbgEntityHistoryCalculator) to build a map of each instruction to every
+valid variable location, without the need to consider control flow. From
+the example above, it is otherwise difficult to determine that the location
+of variable ``!30`` should flow "up" into block ``%bb1``, but that the location
+of variable ``!23`` should not flow "down" into the ``%exit`` block.
+
+.. _ccxx_frontend:
+
+C/C++ front-end specific debug information
+==========================================
+
+The C and C++ front-ends represent information about the program in a format
+that is effectively identical to `DWARF 3.0
+<http://www.eagercon.com/dwarf/dwarf3std.htm>`_ in terms of information
+content.  This allows code generators to trivially support native debuggers by
+generating standard DWARF information, and contains enough information for
+non-DWARF targets to translate it as needed.
+
+This section describes the forms used to represent C and C++ programs.  Other
+languages could pattern themselves after this (which itself is tuned to
+representing programs in the same way that DWARF 3 does), or they could choose
+to provide completely different forms if they don't fit into the DWARF model.
+As support for debugging information gets added to the various LLVM
+source-language front-ends, the information used should be documented here.
+
+The following sections provide examples of a few C/C++ constructs and the debug
+information that would best describe those constructs.  The canonical
+references are the ``DIDescriptor`` classes defined in
+``include/llvm/IR/DebugInfo.h`` and the implementations of the helper functions
+in ``lib/IR/DIBuilder.cpp``.
+
+C/C++ source file information
+-----------------------------
+
+``llvm::Instruction`` provides easy access to metadata attached to an
+instruction.  One can extract line number information encoded in LLVM IR using
+``Instruction::getDebugLoc()`` and ``DILocation::getLine()``.
+
+.. code-block:: c++
+
+  if (DILocation *Loc = I->getDebugLoc()) { // Here I is an LLVM instruction
+    unsigned Line = Loc->getLine();
+    StringRef File = Loc->getFilename();
+    StringRef Dir = Loc->getDirectory();
+    bool ImplicitCode = Loc->isImplicitCode();
+  }
+
+When ``ImplicitCode`` is true, the instruction was added by the front-end but
+does not correspond to source code written by the user. For example:
+
+.. code-block:: c++
+
+  if (MyBoolean) {
+    MyObject MO;
+    ...
+  }
+
+At the end of the scope, the destructor of ``MyObject`` is called even though it
+is not written explicitly. This information is useful for avoiding counters on
+closing braces when generating code coverage data.
+
+C/C++ global variable information
+---------------------------------
+
+Given an integer global variable declared as follows:
+
+.. code-block:: c
+
+  _Alignas(8) int MyGlobal = 100;
+
+a C/C++ front-end would generate the following descriptors:
+
+.. code-block:: text
+
+  ;;
+  ;; Define the global itself.
+  ;;
+  @MyGlobal = global i32 100, align 8, !dbg !0
+
+  ;;
+  ;; List of debug info of globals
+  ;;
+  !llvm.dbg.cu = !{!1}
+
+  ;; Some unrelated metadata.
+  !llvm.module.flags = !{!6, !7}
+  !llvm.ident = !{!8}
+
+  ;; Define the global variable itself
+  !0 = distinct !DIGlobalVariable(name: "MyGlobal", scope: !1, file: !2, line: 1, type: !5, isLocal: false, isDefinition: true, align: 64)
+
+  ;; Define the compile unit.
+  !1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2,
+                               producer: "clang version 4.0.0",
+                               isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug,
+                               enums: !3, globals: !4)
+
+  ;;
+  ;; Define the file
+  ;;
+  !2 = !DIFile(filename: "/dev/stdin",
+               directory: "/Users/dexonsmith/data/llvm/debug-info")
+
+  ;; An empty array.
+  !3 = !{}
+
+  ;; The Array of Global Variables
+  !4 = !{!0}
+
+  ;;
+  ;; Define the type
+  ;;
+  !5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+
+  ;; Dwarf version to output.
+  !6 = !{i32 2, !"Dwarf Version", i32 4}
+
+  ;; Debug info schema version.
+  !7 = !{i32 2, !"Debug Info Version", i32 3}
+
+  ;; Compiler identification
+  !8 = !{!"clang version 4.0.0"}
+
+
+The ``align`` value in the DIGlobalVariable description specifies the variable's
+alignment when it was forced by the C11 ``_Alignas()`` or C++11 ``alignas()``
+keywords or by the compiler attribute ``__attribute__((aligned ()))``. Otherwise
+(when this field is missing), the alignment is considered default. This value is
+used when producing the DW_AT_alignment attribute in DWARF output.
+
+C/C++ function information
+--------------------------
+
+Given a function declared as follows:
+
+.. code-block:: c
+
+  int main(int argc, char *argv[]) {
+    return 0;
+  }
+
+a C/C++ front-end would generate the following descriptors:
+
+.. code-block:: text
+
+  ;;
+  ;; Define the anchor for subprograms.
+  ;;
+  !4 = !DISubprogram(name: "main", scope: !1, file: !1, line: 1, type: !5,
+                     isLocal: false, isDefinition: true, scopeLine: 1,
+                     flags: DIFlagPrototyped, isOptimized: false,
+                     variables: !2)
+
+  ;;
+  ;; Define the subprogram itself.
+  ;;
+  define i32 @main(i32 %argc, i8** %argv) !dbg !4 {
+  ...
+  }
+
+Fortran specific debug information
+==================================
+
+Fortran function information
+----------------------------
+
+There are a few DWARF attributes defined to support client debugging of Fortran
+programs.  LLVM can generate (or omit) the appropriate DWARF attributes for the
+prefix-specs of ELEMENTAL, PURE, IMPURE, RECURSIVE, and NON_RECURSIVE.  This is
+done by using the spFlags values: DISPFlagElemental, DISPFlagPure, and
+DISPFlagRecursive.
+
+Given a Fortran function declared as follows:
+
+.. code-block:: fortran
+
+  elemental function elem_func(a)
+
+a Fortran front-end would generate the following descriptors:
+
+.. code-block:: text
+
+  !11 = distinct !DISubprogram(name: "subroutine2", scope: !1, file: !1,
+          line: 5, type: !8, scopeLine: 6,
+          spFlags: DISPFlagDefinition | DISPFlagElemental, unit: !0,
+          retainedNodes: !2)
+
+and this will materialize an additional DWARF attribute as:
+
+.. code-block:: text
+
+  DW_TAG_subprogram [3]  
+     DW_AT_low_pc [DW_FORM_addr]     (0x0000000000000010 ".text")
+     DW_AT_high_pc [DW_FORM_data4]   (0x00000001)
+     ...
+     DW_AT_elemental [DW_FORM_flag_present]  (true)
+
+Debugging information format
+============================
+
+Debugging Information Extension for Objective C Properties
+----------------------------------------------------------
+
+Introduction
+^^^^^^^^^^^^
+
+Objective C provides a simpler way to declare and define accessor methods using
+declared properties.  The language provides features to declare a property and
+to let the compiler synthesize accessor methods.
+
+The debugger lets developers inspect Objective C interfaces and their instance
+variables and class variables.  However, the debugger does not know anything
+about the properties defined in Objective C interfaces.  The debugger consumes
+information generated by the compiler in DWARF format.  The format does not
+support encoding of Objective C properties.  This proposal describes DWARF
+extensions to encode Objective C properties, which the debugger can use to let
+developers inspect Objective C properties.
+
+Proposal
+^^^^^^^^
+
+Objective C properties exist separately from class members.  A property can be
+defined only by "setter" and "getter" selectors, and be calculated anew on each
+access.  Or a property can just be a direct access to some declared ivar.
+Finally it can have an ivar "automatically synthesized" for it by the compiler,
+in which case the property can be referred to in user code directly using the
+standard C dereference syntax as well as through the property "dot" syntax, but
+there is no entry in the ``@interface`` declaration corresponding to this ivar.
+
+To facilitate debugging these properties, we will add a new DWARF TAG into the
+``DW_TAG_structure_type`` definition for the class to hold the description of a
+given property, and a set of DWARF attributes that provide said description.
+The property tag will also contain the name and declared type of the property.
+
+If there is a related ivar, there will also be a DWARF property attribute placed
+in the ``DW_TAG_member`` DIE for that ivar referring back to the property TAG
+for that property.  And in the case where the compiler synthesizes the ivar
+directly, the compiler is expected to generate a ``DW_TAG_member`` for that
+ivar (with the ``DW_AT_artificial`` set to 1), whose name will be the name used
+to access this ivar directly in code, and with the property attribute pointing
+back to the property it is backing.
+
+The following examples will serve as illustration for our discussion:
+
+.. code-block:: objc
+
+  @interface I1 {
+    int n2;
+  }
+
+  @property int p1;
+  @property int p2;
+  @end
+
+  @implementation I1
+  @synthesize p1;
+  @synthesize p2 = n2;
+  @end
+
+This produces the following DWARF (this is a "pseudo dwarfdump" output):
+
+.. code-block:: none
+
+  0x00000100:  TAG_structure_type [7] *
+                 AT_APPLE_runtime_class( 0x10 )
+                 AT_name( "I1" )
+                 AT_decl_file( "Objc_Property.m" )
+                 AT_decl_line( 3 )
+
+  0x00000110:   TAG_APPLE_property
+                  AT_name ( "p1" )
+                  AT_type ( {0x00000150} ( int ) )
+
+  0x00000120:   TAG_APPLE_property
+                  AT_name ( "p2" )
+                  AT_type ( {0x00000150} ( int ) )
+
+  0x00000130:   TAG_member [8]
+                  AT_name( "_p1" )
+                  AT_APPLE_property ( {0x00000110} "p1" )
+                  AT_type( {0x00000150} ( int ) )
+                  AT_artificial ( 0x1 )
+
+  0x00000140:    TAG_member [8]
+                   AT_name( "n2" )
+                   AT_APPLE_property ( {0x00000120} "p2" )
+                   AT_type( {0x00000150} ( int ) )
+
+  0x00000150:  AT_type( ( int ) )
+
+Note, the current convention is that the name of the ivar for an
+auto-synthesized property is the name of the property from which it derives
+with an underscore prepended, as is shown in the example.  But we actually
+don't need to know this convention, since we are given the name of the ivar
+directly.
+
+Also, it is common practice in ObjC to have different property declarations in
+the ``@interface`` and ``@implementation`` - e.g. to provide a read-only
+property in the interface, and a read-write property in the implementation.  In
+that case, the compiler should emit whichever property declaration will be in
+force in the current translation unit.
+
+Developers can decorate a property with attributes which are encoded using
+``DW_AT_APPLE_property_attribute``.
+
+.. code-block:: objc
+
+  @property (readonly, nonatomic) int pr;
+
+.. code-block:: none
+
+  TAG_APPLE_property [8]
+    AT_name( "pr" )
+    AT_type ( {0x00000147} (int) )
+    AT_APPLE_property_attribute (DW_APPLE_PROPERTY_readonly, DW_APPLE_PROPERTY_nonatomic)
+
+The setter and getter method names are attached to the property using
+``DW_AT_APPLE_property_setter`` and ``DW_AT_APPLE_property_getter`` attributes.
+
+.. code-block:: objc
+
+  @interface I1
+  @property (setter=myOwnP3Setter:) int p3;
+  -(void)myOwnP3Setter:(int)a;
+  @end
+
+  @implementation I1
+  @synthesize p3;
+  -(void)myOwnP3Setter:(int)a{ }
+  @end
+
+The DWARF for this would be:
+
+.. code-block:: none
+
+  0x000003bd: TAG_structure_type [7] *
+                AT_APPLE_runtime_class( 0x10 )
+                AT_name( "I1" )
+                AT_decl_file( "Objc_Property.m" )
+                AT_decl_line( 3 )
+
+  0x000003cd:     TAG_APPLE_property
+                    AT_name ( "p3" )
+                    AT_APPLE_property_setter ( "myOwnP3Setter:" )
+                    AT_type( {0x00000147} ( int ) )
+
+  0x000003f3:     TAG_member [8]
+                    AT_name( "_p3" )
+                    AT_type ( {0x00000147} ( int ) )
+                    AT_APPLE_property ( {0x000003cd} )
+                    AT_artificial ( 0x1 )
+
+New DWARF Tags
+^^^^^^^^^^^^^^
+
++-----------------------+--------+
+| TAG                   | Value  |
++=======================+========+
+| DW_TAG_APPLE_property | 0x4200 |
++-----------------------+--------+
+
+New DWARF Attributes
+^^^^^^^^^^^^^^^^^^^^
+
++--------------------------------+--------+-----------+
+| Attribute                      | Value  | Classes   |
++================================+========+===========+
+| DW_AT_APPLE_property           | 0x3fed | Reference |
++--------------------------------+--------+-----------+
+| DW_AT_APPLE_property_getter    | 0x3fe9 | String    |
++--------------------------------+--------+-----------+
+| DW_AT_APPLE_property_setter    | 0x3fea | String    |
++--------------------------------+--------+-----------+
+| DW_AT_APPLE_property_attribute | 0x3feb | Constant  |
++--------------------------------+--------+-----------+
+
+New DWARF Constants
+^^^^^^^^^^^^^^^^^^^
+
++--------------------------------------+-------+
+| Name                                 | Value |
++======================================+=======+
+| DW_APPLE_PROPERTY_readonly           | 0x01  |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_getter             | 0x02  |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_assign             | 0x04  |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_readwrite          | 0x08  |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_retain             | 0x10  |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_copy               | 0x20  |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_nonatomic          | 0x40  |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_setter             | 0x80  |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_atomic             | 0x100 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_weak               | 0x200 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_strong             | 0x400 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_unsafe_unretained  | 0x800 |
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_nullability        | 0x1000|
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_null_resettable    | 0x2000|
++--------------------------------------+-------+
+| DW_APPLE_PROPERTY_class              | 0x4000|
++--------------------------------------+-------+
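+
+The constants above are powers of two, which suggests that a
+``DW_AT_APPLE_property_attribute`` value is the bitwise OR of the flags that
+apply to the property.  The following is a minimal sketch under that
+assumption.  Only the enumerator values come from the table; the helper names
+``property_has`` and ``encode_pr_attributes`` are hypothetical and exist only
+for illustration.
+
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values copied from the "New DWARF Constants" table. */
enum {
  DW_APPLE_PROPERTY_readonly  = 0x01,
  DW_APPLE_PROPERTY_readwrite = 0x08,
  DW_APPLE_PROPERTY_nonatomic = 0x40,
  DW_APPLE_PROPERTY_atomic    = 0x100
};

/* Hypothetical helper: true if every bit of `flag` is set in `attrs`. */
static bool property_has(uint32_t attrs, uint32_t flag)
{
  assert(flag != 0);
  return (attrs & flag) == flag;
}

/* Hypothetical helper: the attribute value assumed for
 * "@property (readonly, nonatomic) int pr;". */
static uint32_t encode_pr_attributes(void)
{
  return DW_APPLE_PROPERTY_readonly | DW_APPLE_PROPERTY_nonatomic;
}
```
+
+Under this assumption, the ``pr`` property from the earlier example would carry
+the value ``0x41``, and a consumer can test individual flags with
+``property_has``.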
+
+Name Accelerator Tables
+-----------------------
+
+Introduction
+^^^^^^^^^^^^
+
+The "``.debug_pubnames``" and "``.debug_pubtypes``" formats are not what a
+debugger needs.  The "``pub``" in the section name indicates that the entries
+in the table are publicly visible names only.  This means no static or hidden
+functions show up in the "``.debug_pubnames``".  No static variables or private
+class variables are in the "``.debug_pubtypes``".  Many compilers add different
+things to these tables, so we can't rely upon the contents between gcc, icc, or
+clang.
+
+The typical query given by users tends not to match up with the contents of
+these tables.  For example, the DWARF spec states that "In the case of the name
+of a function member or static data member of a C++ structure, class or union,
+the name presented in the "``.debug_pubnames``" section is not the simple name
+given by the ``DW_AT_name`` attribute of the referenced debugging information
+entry, but rather the fully qualified name of the data or function member."
+So the only names in these tables for complex C++ entries are fully
+qualified names.  Debugger users tend not to enter their search strings as
+"``a::b::c(int,const Foo&) const``", but rather as "``c``", "``b::c``", or
+"``a::b::c``".  So the name entered in the name table must be demangled in
+order to chop it up appropriately and additional names must be manually entered
+into the table to make it effective as a name lookup table for debuggers to
+use.
+
+All debuggers currently ignore the "``.debug_pubnames``" table as a result of
+its inconsistent and useless public-only name content making it a waste of
+space in the object file.  These tables, when they are written to disk, are not
+sorted in any way, leaving every debugger to do its own parsing and sorting.
+These tables also include an inlined copy of the string values in the table
+itself making the tables much larger than they need to be on disk, especially
+for large C++ programs.
+
+Can't we just fix the sections by adding all of the names we need to this
+table? No, because that is not what the tables are defined to contain and we
+won't know the difference between the old bad tables and the new good tables.
+At best we could make our own renamed sections that contain all of the data we
+need.
+
+These tables are also insufficient for what a debugger like LLDB needs.  LLDB
+uses clang for its expression parsing where LLDB acts as a PCH.  LLDB is then
+often asked to look for type "``foo``" or namespace "``bar``", or list items in
+namespace "``baz``".  Namespaces are not included in the pubnames or pubtypes
+tables.  Since clang asks a lot of questions when it is parsing an expression,
+we need to be very fast when looking up names, as it happens a lot.  Having new
+accelerator tables that are optimized for very quick lookups will benefit this
+type of debugging experience greatly.
+
+We would like to generate name lookup tables that can be mapped into memory
+from disk, and used as is, with little or no up-front parsing.  We would also
+like to control the exact content of these different tables so they contain
+exactly what we need.  The Name Accelerator Tables were designed to fix these
+issues.  To do so, the tables need to:
+
+* Have a format that can be mapped into memory from disk and used as is
+* Support very fast lookups
+* Have an extensible format so these tables can be made by many producers
+* Contain all of the names needed for typical lookups out of the box
+* Follow strict rules for their contents
+
+Table size is important and the accelerator table format should allow the reuse
+of strings from common string tables so the strings for the names are not
+duplicated.  We also want to make sure the table is ready to be used as-is by
+simply mapping the table into memory with minimal header parsing.
+
+The name lookups need to be fast and optimized for the kinds of lookups that
+debuggers tend to do.  Optimally we would like to touch as few parts of the
+mapped table as possible when doing a name lookup and be able to quickly find
+the name entry we are looking for, or discover there are no matches.  In the
+case of debuggers we optimized for lookups that fail most of the time.
+
+Each table that is defined should have strict, documented rules on exactly what
+it contains so clients can rely on the content.
+
+Hash Tables
+^^^^^^^^^^^
+
+Standard Hash Tables
+""""""""""""""""""""
+
+Typical hash tables have a header, buckets, and each bucket points to the
+bucket contents:
+
+.. code-block:: none
+
+  .------------.
+  |  HEADER    |
+  |------------|
+  |  BUCKETS   |
+  |------------|
+  |  DATA      |
+  `------------'
+
+The BUCKETS are an array of offsets to DATA for each hash:
+
+.. code-block:: none
+
+  .------------.
+  | 0x00001000 | BUCKETS[0]
+  | 0x00002000 | BUCKETS[1]
+  | 0x00002200 | BUCKETS[2]
+  | 0x000034f0 | BUCKETS[3]
+  |            | ...
+  | 0xXXXXXXXX | BUCKETS[n_buckets]
+  '------------'
+
+So for ``bucket[3]`` in the example above, we have an offset into the table
+0x000034f0 which points to a chain of entries for the bucket.  Each entry in
+the chain must contain a next pointer, full 32 bit hash value, the string
+itself, and the data for the current string value.
+
+.. code-block:: none
+
+              .------------.
+  0x000034f0: | 0x00003500 | next pointer
+              | 0x12345678 | 32 bit hash
+              | "erase"    | string value
+              | data[n]    | HashData for this bucket
+              |------------|
+  0x00003500: | 0x00003550 | next pointer
+              | 0x29273623 | 32 bit hash
+              | "dump"     | string value
+              | data[n]    | HashData for this bucket
+              |------------|
+  0x00003550: | 0x00000000 | next pointer
+              | 0x82638293 | 32 bit hash
+              | "main"     | string value
+              | data[n]    | HashData for this bucket
+              `------------'
+
+The problem with this layout for debuggers is that we need to optimize for the
+negative lookup case where the symbol we're searching for is not present.  So
+if we were to look up "``printf``" in the table above, we would make a 32-bit
+hash for "``printf``", which might match ``bucket[3]``.  We would need to go to
+the offset 0x000034f0 and start looking to see if our 32 bit hash matches.  To
+do so, we need to read the next pointer, then read the hash, compare it, and
+skip to the next entry in the chain.  Each time we are skipping many bytes in
+memory and touching new pages just to do the compare on the full 32 bit hash.
+All of these accesses then tell us that we didn't have a match.
+
+Name Hash Tables
+""""""""""""""""
+
+To solve the issues mentioned above we have structured the hash tables a bit
+differently: a header, buckets, an array of all unique 32 bit hash values,
+followed by an array of hash value data offsets, one for each hash value, then
+the data for all hash values:
+
+.. code-block:: none
+
+  .-------------.
+  |  HEADER     |
+  |-------------|
+  |  BUCKETS    |
+  |-------------|
+  |  HASHES     |
+  |-------------|
+  |  OFFSETS    |
+  |-------------|
+  |  DATA       |
+  `-------------'
+
+The ``BUCKETS`` in the name tables are an index into the ``HASHES`` array.  By
+making all of the full 32 bit hash values contiguous in memory, we allow
+ourselves to efficiently check for a match while touching as little memory as
+possible.  Most often checking the 32 bit hash values is as far as the lookup
+goes.  If it does match, it usually is a match with no collisions.  So for a
+table with "``n_buckets``" buckets, and "``n_hashes``" unique 32 bit hash
+values, we can clarify the contents of the ``BUCKETS``, ``HASHES`` and
+``OFFSETS`` as:
+
+.. code-block:: none
+
+  .-------------------------.
+  |  HEADER.magic           | uint32_t
+  |  HEADER.version         | uint16_t
+  |  HEADER.hash_function   | uint16_t
+  |  HEADER.bucket_count    | uint32_t
+  |  HEADER.hashes_count    | uint32_t
+  |  HEADER.header_data_len | uint32_t
+  |  HEADER_DATA            | HeaderData
+  |-------------------------|
+  |  BUCKETS                | uint32_t[n_buckets] // 32 bit hash indexes
+  |-------------------------|
+  |  HASHES                 | uint32_t[n_hashes] // 32 bit hash values
+  |-------------------------|
+  |  OFFSETS                | uint32_t[n_hashes] // 32 bit offsets to hash value data
+  |-------------------------|
+  |  ALL HASH DATA          |
+  `-------------------------'
+
+So taking the exact same data from the standard hash example above we end up
+with:
+
+.. code-block:: none
+
+              .------------.
+              | HEADER     |
+              |------------|
+              |          0 | BUCKETS[0]
+              |          2 | BUCKETS[1]
+              |          5 | BUCKETS[2]
+              |          6 | BUCKETS[3]
+              |            | ...
+              |        ... | BUCKETS[n_buckets]
+              |------------|
+              | 0x........ | HASHES[0]
+              | 0x........ | HASHES[1]
+              | 0x........ | HASHES[2]
+              | 0x........ | HASHES[3]
+              | 0x........ | HASHES[4]
+              | 0x........ | HASHES[5]
+              | 0x12345678 | HASHES[6]    hash for BUCKETS[3]
+              | 0x29273623 | HASHES[7]    hash for BUCKETS[3]
+              | 0x82638293 | HASHES[8]    hash for BUCKETS[3]
+              | 0x........ | HASHES[9]
+              | 0x........ | HASHES[10]
+              | 0x........ | HASHES[11]
+              | 0x........ | HASHES[12]
+              | 0x........ | HASHES[13]
+              | 0x........ | HASHES[n_hashes]
+              |------------|
+              | 0x........ | OFFSETS[0]
+              | 0x........ | OFFSETS[1]
+              | 0x........ | OFFSETS[2]
+              | 0x........ | OFFSETS[3]
+              | 0x........ | OFFSETS[4]
+              | 0x........ | OFFSETS[5]
+              | 0x000034f0 | OFFSETS[6]   offset for BUCKETS[3]
+              | 0x00003500 | OFFSETS[7]   offset for BUCKETS[3]
+              | 0x00003550 | OFFSETS[8]   offset for BUCKETS[3]
+              | 0x........ | OFFSETS[9]
+              | 0x........ | OFFSETS[10]
+              | 0x........ | OFFSETS[11]
+              | 0x........ | OFFSETS[12]
+              | 0x........ | OFFSETS[13]
+              | 0x........ | OFFSETS[n_hashes]
+              |------------|
+              |            |
+              |            |
+              |            |
+              |            |
+              |            |
+              |------------|
+  0x000034f0: | 0x00001203 | String offset into .debug_str ("erase")
+              | 0x00000004 | A 32 bit array count - number of HashData with name "erase"
+              | 0x........ | HashData[0]
+              | 0x........ | HashData[1]
+              | 0x........ | HashData[2]
+              | 0x........ | HashData[3]
+              | 0x00000000 | String offset into .debug_str (terminate data for hash)
+              |------------|
+  0x00003500: | 0x00001203 | String offset into .debug_str ("collision")
+              | 0x00000002 | A 32 bit array count - number of HashData with name "collision"
+              | 0x........ | HashData[0]
+              | 0x........ | HashData[1]
+              | 0x00001203 | String offset into .debug_str ("dump")
+              | 0x00000003 | A 32 bit array count - number of HashData with name "dump"
+              | 0x........ | HashData[0]
+              | 0x........ | HashData[1]
+              | 0x........ | HashData[2]
+              | 0x00000000 | String offset into .debug_str (terminate data for hash)
+              |------------|
+  0x00003550: | 0x00001203 | String offset into .debug_str ("main")
+              | 0x00000009 | A 32 bit array count - number of HashData with name "main"
+              | 0x........ | HashData[0]
+              | 0x........ | HashData[1]
+              | 0x........ | HashData[2]
+              | 0x........ | HashData[3]
+              | 0x........ | HashData[4]
+              | 0x........ | HashData[5]
+              | 0x........ | HashData[6]
+              | 0x........ | HashData[7]
+              | 0x........ | HashData[8]
+              | 0x00000000 | String offset into .debug_str (terminate data for hash)
+              `------------'
+
+So we still have all of the same data, we just organize it more efficiently for
+debugger lookup.  If we repeat the same "``printf``" lookup from above, we
+would hash "``printf``" and find it matches ``BUCKETS[3]`` by taking the 32 bit
+hash value modulo ``n_buckets``.  ``BUCKETS[3]`` contains "6" which is the
+index into the ``HASHES`` table.  We would then compare any consecutive 32 bit
+hash values in the ``HASHES`` array as long as the hashes would be in
+``BUCKETS[3]``.  We do this by verifying that each subsequent hash value modulo
+``n_buckets`` is still 3.  In the case of a failed lookup we would access the
+memory for ``BUCKETS[3]``, and then compare a few consecutive 32 bit hashes
+before we know that we have no match.  We don't end up marching through
+multiple words of memory and we really keep the number of processor data cache
+lines being accessed as small as possible.
+
+The string hash that is used for these lookup tables is the Daniel J.
+Bernstein hash which is also used in the ELF ``GNU_HASH`` sections.  It is a
+very good hash for all kinds of names in programs with very few hash
+collisions.
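+
+As a rough illustration, the hash can be sketched in C as below.  This assumes
+the commonly used formulation of the Bernstein hash (initial value 5381,
+multiply by 33 and add each byte); the function name ``djb_hash`` and the
+choice to treat each character as an unsigned byte are assumptions made for
+this sketch, not something this document specifies.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the DJB string hash: start at 5381, then h = h * 33 + c for
 * every byte of the NUL-terminated name.  Arithmetic wraps modulo 2^32. */
static uint32_t djb_hash(const char *name)
{
  assert(name != NULL);
  uint32_t h = 5381u;
  for (const unsigned char *p = (const unsigned char *)name; *p != '\0'; ++p)
    h = (h << 5) + h + *p;  /* (h << 5) + h is h * 33 */
  return h;
}
```
+
+A producer would store the result in the ``HASHES`` array, and a consumer would
+recompute it for the name being looked up.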
+
+Empty buckets are designated by using an invalid hash index of ``UINT32_MAX``.
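+
+The lookup described above can be sketched as follows.  This is a minimal
+illustration, assuming the ``BUCKETS`` and ``HASHES`` arrays have already been
+located in memory and are in the host byte order; the function name
+``find_hash_index`` and its signature are hypothetical.
+
```c
#include <assert.h>
#include <stdint.h>

/* Returns the index into hashes[] (and therefore offsets[]) whose value
 * equals `hash`, or -1 if the table holds no such hash. */
static int64_t find_hash_index(const uint32_t *buckets, uint32_t n_buckets,
                               const uint32_t *hashes, uint32_t n_hashes,
                               uint32_t hash)
{
  assert(n_buckets != 0);
  const uint32_t bucket = hash % n_buckets;
  const uint32_t start = buckets[bucket];
  if (start == UINT32_MAX)
    return -1;  /* empty bucket: the lookup fails after a single read */

  for (uint32_t i = start; i < n_hashes; ++i) {
    if (hashes[i] == hash)
      return (int64_t)i;
    if (hashes[i] % n_buckets != bucket)
      break;  /* we walked into the hashes of the next bucket */
  }
  return -1;
}
```
+
+A failed lookup reads one bucket entry and a handful of adjacent hash values,
+which is the access pattern the layout is designed for.  On success, the
+returned index selects the matching entry in ``OFFSETS``.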
+
+Details
+^^^^^^^
+
+These name hash tables are designed to be generic where specializations of the
+table get to define additional data that goes into the header ("``HeaderData``"),
+how the string value is stored ("``KeyType``") and the content of the data for each
+hash value.
+
+Header Layout
+"""""""""""""
+
+The header has a fixed part, and the specialized part.  The exact format of the
+header is:
+
+.. code-block:: c
+
+  struct Header
+  {
+    uint32_t   magic;           // 'HASH' magic value to allow endian detection
+    uint16_t   version;         // Version number
+    uint16_t   hash_function;   // The hash function enumeration that was used
+    uint32_t   bucket_count;    // The number of buckets in this hash table
+    uint32_t   hashes_count;    // The total number of unique hash values and hash data offsets in this table
+    uint32_t   header_data_len; // The bytes to skip to get to the hash indexes (buckets) for correct alignment
+                                // Specifically the length of the following HeaderData field - this does not
+                                // include the size of the preceding fields
+    HeaderData header_data;     // Implementation specific header data
+  };
+
+The header starts with a 32 bit "``magic``" value which must be ``'HASH'``
+encoded as an ASCII integer.  This allows the detection of the start of the
+hash table and also allows the table's byte order to be determined so the table
+can be correctly extracted.  The "``magic``" value is followed by a 16 bit
+``version`` number which allows the table to be revised and modified in the
+future.  The current version number is 1. ``hash_function`` is a ``uint16_t``
+enumeration that specifies which hash function was used to produce this table.
+The current values for the hash function enumerations include:
+
+.. code-block:: c
+
+  enum HashFunctionType
+  {
+    eHashFunctionDJB = 0u, // Daniel J Bernstein hash function
+  };
+
+``bucket_count`` is a 32 bit unsigned integer that represents how many buckets
+are in the ``BUCKETS`` array.  ``hashes_count`` is the number of unique 32 bit
+hash values that are in the ``HASHES`` array, and is also the number of offsets
+contained in the ``OFFSETS`` array.  ``header_data_len`` specifies the size
+in bytes of the ``HeaderData`` that is filled in by specialized versions of
+this table.
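+
+One way a consumer might use the "``magic``" value to detect byte order is
+sketched below.  The numeric constant is an assumption: it packs the characters
+``'H'``, ``'A'``, ``'S'``, ``'H'`` with the first character in the most
+significant byte.  The names ``HASH_MAGIC``, ``ByteOrder``, ``swap32`` and
+``classify_magic`` are hypothetical.
+
```c
#include <assert.h>
#include <stdint.h>

/* Assumed packing of 'H' 'A' 'S' 'H' into one 32 bit integer. */
#define HASH_MAGIC 0x48415348u

typedef enum { ORDER_NATIVE, ORDER_SWAPPED, ORDER_INVALID } ByteOrder;

/* Reverses the four bytes of a 32 bit value. */
static uint32_t swap32(uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
         ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Classifies the first 32 bit word of a table as read in host byte order. */
static ByteOrder classify_magic(uint32_t magic_as_read)
{
  /* The magic is not a byte palindrome, so both orders are distinguishable. */
  assert(swap32(HASH_MAGIC) != HASH_MAGIC);
  if (magic_as_read == HASH_MAGIC)
    return ORDER_NATIVE;
  if (swap32(magic_as_read) == HASH_MAGIC)
    return ORDER_SWAPPED;
  return ORDER_INVALID;
}
```
+
+If the result is ``ORDER_SWAPPED``, every remaining multi-byte field in the
+table would need the same byte swap before use.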
+
+Fixed Lookup
+""""""""""""
+
+The header is followed by the buckets, hashes, offsets, and hash value data.
+
+.. code-block:: c
+
+  struct FixedTable
+  {
+    uint32_t buckets[Header.bucket_count];  // An array of hash indexes into the "hashes[]" array below
+    uint32_t hashes [Header.hashes_count];  // Every unique 32 bit hash for the entire table is in this table
+    uint32_t offsets[Header.hashes_count];  // An offset that corresponds to each item in the "hashes[]" array above
+  };
+
+``buckets`` is an array of 32 bit indexes into the ``hashes`` array.  The
+``hashes`` array contains all of the 32 bit hash values for all names in the
+hash table.  Each hash in the ``hashes`` table has an offset in the ``offsets``
+array that points to the data for the hash value.
+
+This table setup makes it very easy to repurpose these tables to contain
+different data, while keeping the lookup mechanism the same for all tables.
+This layout also makes it possible to save the table to disk and map it in
+later and do very efficient name lookups with little or no parsing.
+
+DWARF lookup tables can be implemented in a variety of ways and can store a lot
+of information for each name.  We want to make the DWARF tables extensible and
+able to store the data efficiently so we have used some of the DWARF features
+that enable efficient data storage to define exactly what kind of data we store
+for each name.
+
+The ``HeaderData`` contains a definition of the contents of each HashData chunk.
+We might want to store an offset to all of the debug information entries (DIEs)
+for each name.  To keep things extensible, we create a list of items, or
+Atoms, that are contained in the data for each name.  First comes the type of
+the data in each atom:
+
+.. code-block:: c
+
+  enum AtomType
+  {
+    eAtomTypeNULL       = 0u,
+    eAtomTypeDIEOffset  = 1u,   // DIE offset, check form for encoding
+    eAtomTypeCUOffset   = 2u,   // DIE offset of the compiler unit header that contains the item in question
+    eAtomTypeTag        = 3u,   // DW_TAG_xxx value, should be encoded as DW_FORM_data1 (if no tags exceed 255) or DW_FORM_data2
+    eAtomTypeNameFlags  = 4u,   // Flags from enum NameFlags
+    eAtomTypeTypeFlags  = 5u,   // Flags from enum TypeFlags
+  };
+
+The enumeration values and their meanings are:
+
+.. code-block:: none
+
+  eAtomTypeNULL       - a termination atom that specifies the end of the atom list
+  eAtomTypeDIEOffset  - an offset into the .debug_info section for the DWARF DIE for this name
+  eAtomTypeCUOffset   - an offset into the .debug_info section for the CU that contains the DIE
+  eAtomTypeTag        - The DW_TAG_XXX enumeration value so you don't have to parse the DWARF to see what it is
+  eAtomTypeNameFlags  - Flags for functions and global variables (isFunction, isInlined, isExternal...)
+  eAtomTypeTypeFlags  - Flags for types (isCXXClass, isObjCClass, ...)
+
+Then we let each atom define its type and how the data for that atom is
+encoded:
+
+.. code-block:: c
+
+  struct Atom
+  {
+    uint16_t type;  // AtomType enum value
+    uint16_t form;  // DWARF DW_FORM_XXX defines
+  };
+
+The ``form`` type above is from the DWARF specification and defines the exact
+encoding of the data for the Atom type.  See the DWARF specification for the
+``DW_FORM_`` definitions.
+
+.. code-block:: c
+
+  struct HeaderData
+  {
+    uint32_t die_offset_base;
+    uint32_t atom_count;
+    Atom     atoms[atom_count];
+  };
+
+``HeaderData`` defines the base DIE offset that should be added to any atoms
+that are encoded using the ``DW_FORM_ref1``, ``DW_FORM_ref2``,
+``DW_FORM_ref4``, ``DW_FORM_ref8`` or ``DW_FORM_ref_udata``.  It also defines
+what is contained in each ``HashData`` object -- ``Atom.form`` tells us how large
+each field will be in the ``HashData`` and the ``Atom.type`` tells us how this data
+should be interpreted.
+
+For the current implementations of the "``.apple_names``" (all functions +
+globals), the "``.apple_types``" (names of all types that are defined), and
+the "``.apple_namespaces``" (all namespaces), we currently set the ``Atom``
+array to be:
+
+.. code-block:: c
+
+  HeaderData.atom_count = 1;
+  HeaderData.atoms[0].type = eAtomTypeDIEOffset;
+  HeaderData.atoms[0].form = DW_FORM_data4;
+
+This defines the contents to be the DIE offset (eAtomTypeDIEOffset) that is
+encoded as a 32 bit value (DW_FORM_data4).  This allows a single name to have
+multiple matching DIEs in a single file, which could come up with an inlined
+function for instance.  Future tables could include more information about the
+DIE such as flags indicating if the DIE is a function, method, block,
+or inlined.
+
+The KeyType for the DWARF table is a 32 bit string table offset into the
+"``.debug_str``" table.  The "``.debug_str``" is the string table for the DWARF which
+may already contain copies of all of the strings.  This helps make sure, with
+help from the compiler, that we reuse the strings between all of the DWARF
+sections and keeps the hash table size down.  Another benefit to having the
+compiler generate all strings as DW_FORM_strp in the debug info, is that
+DWARF parsing can be made much faster.
+
+After a lookup is made, we get an offset into the hash data.  The hash data
+needs to be able to deal with 32 bit hash collisions, so the chunk of data
+at the offset in the hash data consists of a triple:
+
+.. code-block:: c
+
+  uint32_t str_offset
+  uint32_t hash_data_count
+  HashData[hash_data_count]
+
+If "str_offset" is zero, then the bucket contents are done. 99.9% of the
+hash data chunks contain a single item (no 32 bit hash collision):
+
+.. code-block:: none
+
+  .------------.
+  | 0x00001023 | uint32_t KeyType (.debug_str[0x00001023] => "main")
+  | 0x00000004 | uint32_t HashData count
+  | 0x........ | uint32_t HashData[0] DIE offset
+  | 0x........ | uint32_t HashData[1] DIE offset
+  | 0x........ | uint32_t HashData[2] DIE offset
+  | 0x........ | uint32_t HashData[3] DIE offset
+  | 0x00000000 | uint32_t KeyType (end of hash chain)
+  `------------'
+
+If there are collisions, you will have multiple valid string offsets:
+
+.. code-block:: none
+
+  .------------.
+  | 0x00001023 | uint32_t KeyType (.debug_str[0x00001023] => "main")
+  | 0x00000004 | uint32_t HashData count
+  | 0x........ | uint32_t HashData[0] DIE offset
+  | 0x........ | uint32_t HashData[1] DIE offset
+  | 0x........ | uint32_t HashData[2] DIE offset
+  | 0x........ | uint32_t HashData[3] DIE offset
+  | 0x00002023 | uint32_t KeyType (.debug_str[0x00002023] => "print")
+  | 0x00000002 | uint32_t HashData count
+  | 0x........ | uint32_t HashData[0] DIE offset
+  | 0x........ | uint32_t HashData[1] DIE offset
+  | 0x00000000 | uint32_t KeyType (end of hash chain)
+  `------------'
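+
+Walking such a chunk can be sketched as below.  The sketch assumes the current
+atom layout, where each ``HashData`` is a single 32 bit DIE offset, and it
+compares string offsets directly for brevity; a real consumer would more likely
+compare the string stored at that offset against the name being searched for.
+The function name ``find_die_offsets`` is hypothetical.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scans one hash data chunk for `wanted_str_offset`.  On success returns a
 * pointer to that name's DIE offsets and stores their number in *count_out.
 * Returns NULL (and a count of 0) if the chain ends without a match. */
static const uint32_t *find_die_offsets(const uint32_t *chunk,
                                        uint32_t wanted_str_offset,
                                        uint32_t *count_out)
{
  assert(chunk != NULL && count_out != NULL);
  while (chunk[0] != 0) {          /* a str_offset of 0 ends the chain */
    const uint32_t str_offset = chunk[0];
    const uint32_t count = chunk[1];
    const uint32_t *die_offsets = chunk + 2;
    if (str_offset == wanted_str_offset) {
      *count_out = count;
      return die_offsets;
    }
    chunk = die_offsets + count;   /* skip this name's HashData entries */
  }
  *count_out = 0;
  return NULL;
}
```
+
+With the collision example above, searching for string offset ``0x00002023``
+skips the four entries for "main" and returns the two entries for "print".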
+
+Current testing with real world C++ binaries has shown that there is around one
+32 bit hash collision per 100,000 name entries.
+
+Contents
+^^^^^^^^
+
+As we said, we want to strictly define exactly what is included in the
+different tables.  For DWARF, we have 3 tables: "``.apple_names``",
+"``.apple_types``", and "``.apple_namespaces``".
+
+"``.apple_names``" sections should contain an entry for each DWARF DIE whose
+``DW_TAG`` is a ``DW_TAG_label``, ``DW_TAG_inlined_subroutine``, or
+``DW_TAG_subprogram`` that has address attributes: ``DW_AT_low_pc``,
+``DW_AT_high_pc``, ``DW_AT_ranges`` or ``DW_AT_entry_pc``.  It also contains
+``DW_TAG_variable`` DIEs that have a ``DW_OP_addr`` in the location (global and
+static variables).  All global and static variables should be included,
+including those scoped within functions and classes.  For example using the
+following code:
+
+.. code-block:: c
+
+  static int var = 0;
+
+  void f ()
+  {
+    static int var = 0;
+  }
+
+Both of the static ``var`` variables would be included in the table.  All
+functions should emit both their full names and their basenames.  For C or C++,
+the full name is the mangled name (if available) which is usually in the
+``DW_AT_MIPS_linkage_name`` attribute, and the ``DW_AT_name`` contains the
+function basename.  If global or static variables have a mangled name in a
+``DW_AT_MIPS_linkage_name`` attribute, this should be emitted along with the
+simple name found in the ``DW_AT_name`` attribute.
+
+"``.apple_types``" sections should contain an entry for each DWARF DIE whose
+tag is one of:
+
+* DW_TAG_array_type
+* DW_TAG_class_type
+* DW_TAG_enumeration_type
+* DW_TAG_pointer_type
+* DW_TAG_reference_type
+* DW_TAG_string_type
+* DW_TAG_structure_type
+* DW_TAG_subroutine_type
+* DW_TAG_typedef
+* DW_TAG_union_type
+* DW_TAG_ptr_to_member_type
+* DW_TAG_set_type
+* DW_TAG_subrange_type
+* DW_TAG_base_type
+* DW_TAG_const_type
+* DW_TAG_file_type
+* DW_TAG_namelist
+* DW_TAG_packed_type
+* DW_TAG_volatile_type
+* DW_TAG_restrict_type
+* DW_TAG_atomic_type
+* DW_TAG_interface_type
+* DW_TAG_unspecified_type
+* DW_TAG_shared_type
+
+Only entries with a ``DW_AT_name`` attribute are included, and the entry must
+not be a forward declaration (``DW_AT_declaration`` attribute with a non-zero
+value).  For example, using the following code:
+
+.. code-block:: c
+
+  int main ()
+  {
+    int *b = 0;
+    return *b;
+  }
+
+We get a few type DIEs:
+
+.. code-block:: none
+
+  0x00000067:     TAG_base_type [5]
+                  AT_encoding( DW_ATE_signed )
+                  AT_name( "int" )
+                  AT_byte_size( 0x04 )
+
+  0x0000006e:     TAG_pointer_type [6]
+                  AT_type( {0x00000067} ( int ) )
+                  AT_byte_size( 0x08 )
+
+The ``DW_TAG_pointer_type`` is not included because it does not have a ``DW_AT_name``.
+
+"``.apple_namespaces``" section should contain all ``DW_TAG_namespace`` DIEs.
+If a namespace has no name, it is an anonymous namespace, and
+the name should be output as "``(anonymous namespace)``" (without the quotes).
+This matches the output of ``abi::__cxa_demangle()``, the function in the
+standard C++ library that demangles mangled names.
+
+
+Language Extensions and File Format Changes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Objective-C Extensions
+""""""""""""""""""""""
+
+"``.apple_objc``" section should contain all ``DW_TAG_subprogram`` DIEs for an
+Objective-C class.  The name used in the hash table is the name of the
+Objective-C class itself.  If the Objective-C class has a category, then an
+entry is made for both the class name without the category, and for the class
+name with the category.  So if we have a DIE at offset 0x1234 for the
+method "``-[NSString(my_additions) stringWithSpecialString:]``", we would add
+an entry for "``NSString``" that points to DIE 0x1234, and an entry for
+"``NSString(my_additions)``" that points to 0x1234.  This allows us to quickly
+track down all Objective-C methods for an Objective-C class when doing
+expressions.  It is needed because of the dynamic nature of Objective-C where
+anyone can add methods to a class.  The DWARF for Objective-C methods is also
+emitted differently from that for C++ classes: the methods are not usually
+contained in the class definition but are scattered across one or more
+compile units.  Categories can also be defined in different shared libraries.
+So we need to be able to quickly find all of the methods and class functions
+given the Objective-C class name, or quickly find all methods and class
+functions for a class + category name.  This table does not contain any
+selector names; it just maps Objective-C class names (or class names +
+category) to all of the methods and class functions.  The selectors are added
+as function basenames in the "``.debug_names``" section.
+
+In the "``.apple_names``" section for Objective-C functions, the full name is
+the entire function name with the brackets ("``-[NSString
+stringWithCString:]``") and the basename is the selector only
+("``stringWithCString:``").
+
+Mach-O Changes
+""""""""""""""
+
+The section names for the apple hash tables are for non-mach-o files.  For
+mach-o files, the sections should be contained in the ``__DWARF`` segment with
+names as follows:
+
+* "``.apple_names``" -> "``__apple_names``"
+* "``.apple_types``" -> "``__apple_types``"
+* "``.apple_namespaces``" -> "``__apple_namespac``" (16 character limit)
+* "``.apple_objc``" -> "``__apple_objc``"
+
+.. _codeview:
+
+CodeView Debug Info Format
+==========================
+
+LLVM supports emitting CodeView, the Microsoft debug info format, and this
+section describes the design and implementation of that support.
+
+Format Background
+-----------------
+
+CodeView as a format is clearly oriented around C++ debugging, and in C++, the
+majority of debug information tends to be type information. Therefore, the
+overriding design constraint of CodeView is the separation of type information
+from other "symbol" information so that type information can be efficiently
+merged across translation units. Both type information and symbol information are
+generally stored as a sequence of records, where each record begins with a
+16-bit record size and a 16-bit record kind.
+
+Type information is usually stored in the ``.debug$T`` section of the object
+file.  All other debug info, such as line info, string table, symbol info, and
+inlinee info, is stored in one or more ``.debug$S`` sections. There may only be
+one ``.debug$T`` section per object file, since all other debug info refers to
+it. If a PDB (enabled by the ``/Zi`` MSVC option) was used during compilation,
+the ``.debug$T`` section will contain only an ``LF_TYPESERVER2`` record pointing
+to the PDB. When using PDBs, symbol information appears to remain in the object
+file ``.debug$S`` sections.
+
+Type records are referred to by their index, which is the number of records in
+the stream before a given record plus ``0x1000``. Many common basic types, such
+as the basic integral types and unqualified pointers to them, are represented
+using type indices less than ``0x1000``. Such basic types are built in to
+CodeView consumers and do not require type records.
+
+Each type record may only contain type indices that are less than its own type
+index. This ensures that the graph of type stream references is acyclic. While
+the source-level type graph may contain cycles through pointer types (consider a
+linked list struct), these cycles are removed from the type stream by always
+referring to the forward declaration record of user-defined record types. Only
+"symbol" records in the ``.debug$S`` streams may refer to complete,
+non-forward-declaration type records.
+
+Working with CodeView
+---------------------
+
+These are instructions for some common tasks for developers working to improve
+LLVM's CodeView support. Most of them revolve around using the CodeView dumper
+embedded in ``llvm-readobj``.
+
+* Testing MSVC's output::
+
+    $ cl -c -Z7 foo.cpp # Use /Z7 to keep types in the object file
+    $ llvm-readobj --codeview foo.obj
+
+* Getting LLVM IR debug info out of Clang::
+
+    $ clang -g -gcodeview --target=x86_64-windows-msvc foo.cpp -S -emit-llvm
+
+  Use this to generate LLVM IR for LLVM test cases.
+
+* Generate and dump CodeView from LLVM IR metadata::
+
+    $ llc foo.ll -filetype=obj -o foo.obj
+    $ llvm-readobj --codeview foo.obj > foo.txt
+
+  Use this pattern in lit test cases and FileCheck the output of ``llvm-readobj``.
+
+Improving LLVM's CodeView support is a process of finding interesting type
+records, constructing a C++ test case that makes MSVC emit those records,
+dumping the records, understanding them, and then generating equivalent records
+in LLVM's backend.
+
+Testing Debug Info Preservation in Optimizations
+================================================
+
+The following paragraphs are an introduction to the debugify utility
+and examples of how to use it in regression tests to check debug info
+preservation after optimizations.
+
+The ``debugify`` utility
+------------------------
+
+The ``debugify`` synthetic debug info testing utility consists of two
+main parts: the ``debugify`` pass and the ``check-debugify`` pass. They are
+meant to be used with ``opt`` for development purposes.
+
+The first applies synthetic debug information to every instruction of the module,
+while the latter checks that this DI is still available after an optimization
+has occurred, reporting any errors/warnings while doing so.
+
+The instructions are assigned sequentially increasing line locations,
+and are immediately used by debug value intrinsics when possible.
+
+For example, here is a module before:
+
+.. code-block:: llvm
+
+   define void @f(i32* %x) {
+   entry:
+     %x.addr = alloca i32*, align 8
+     store i32* %x, i32** %x.addr, align 8
+     %0 = load i32*, i32** %x.addr, align 8
+     store i32 10, i32* %0, align 4
+     ret void
+   }
+
+and after running ``opt -debugify``  on it we get:
+
+.. code-block:: text
+
+   define void @f(i32* %x) !dbg !6 {
+   entry:
+     %x.addr = alloca i32*, align 8, !dbg !12
+     call void @llvm.dbg.value(metadata i32** %x.addr, metadata !9, metadata !DIExpression()), !dbg !12
+     store i32* %x, i32** %x.addr, align 8, !dbg !13
+     %0 = load i32*, i32** %x.addr, align 8, !dbg !14
+     call void @llvm.dbg.value(metadata i32* %0, metadata !11, metadata !DIExpression()), !dbg !14
+     store i32 10, i32* %0, align 4, !dbg !15
+     ret void, !dbg !16
+   }
+
+   !llvm.dbg.cu = !{!0}
+   !llvm.debugify = !{!3, !4}
+   !llvm.module.flags = !{!5}
+
+   !0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
+   !1 = !DIFile(filename: "debugify-sample.ll", directory: "/")
+   !2 = !{}
+   !3 = !{i32 5}
+   !4 = !{i32 2}
+   !5 = !{i32 2, !"Debug Info Version", i32 3}
+   !6 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !1, line: 1, type: !7, isLocal: false, isDefinition: true, scopeLine: 1, isOptimized: true, unit: !0, retainedNodes: !8)
+   !7 = !DISubroutineType(types: !2)
+   !8 = !{!9, !11}
+   !9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10)
+   !10 = !DIBasicType(name: "ty64", size: 64, encoding: DW_ATE_unsigned)
+   !11 = !DILocalVariable(name: "2", scope: !6, file: !1, line: 3, type: !10)
+   !12 = !DILocation(line: 1, column: 1, scope: !6)
+   !13 = !DILocation(line: 2, column: 1, scope: !6)
+   !14 = !DILocation(line: 3, column: 1, scope: !6)
+   !15 = !DILocation(line: 4, column: 1, scope: !6)
+   !16 = !DILocation(line: 5, column: 1, scope: !6)
+
+The following is an example of the ``-check-debugify`` output:
+
+.. code-block:: none
+
+   $ opt -enable-debugify -loop-vectorize llvm/test/Transforms/LoopVectorize/i8-induction.ll -disable-output
+   ERROR: Instruction with empty DebugLoc in function f --  %index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]
+
+Errors/warnings can range from instructions with empty debug location to an
+instruction having a type that's incompatible with the source variable it describes,
+all the way to missing lines and missing debug value intrinsics.
+
+Fixing errors
+^^^^^^^^^^^^^
+
+Each of the errors above has a relevant API available to fix it.
+
+* In the case of a missing debug location, use ``Instruction::setDebugLoc``, or
+  possibly ``IRBuilder::SetCurrentDebugLocation`` when using a Builder and the
+  new location should be reused.
+
+* When a debug value has an incompatible type, ``llvm::replaceAllDbgUsesWith`` can be used.
+  After a RAUW call an incompatible type error can occur because RAUW does not handle
+  widening and narrowing of variables while ``llvm::replaceAllDbgUsesWith`` does. It is
+  also capable of changing the DWARF expression used by the debugger to describe the variable.
+  It also prevents use-before-def by salvaging or deleting invalid debug values.
+
+* When a debug value is missing, ``llvm::salvageDebugInfo`` can be used when no replacement
+  exists, or ``llvm::replaceAllDbgUsesWith`` when a replacement exists.
+
+Using ``debugify``
+------------------
+
+In order for ``check-debugify`` to work, the DI must be coming from
+``debugify``. Thus, modules with existing DI will be skipped.
+
+The most straightforward way to use ``debugify`` is as follows::
+
+  $ opt -debugify -pass-to-test -check-debugify sample.ll
+
+This will inject synthetic DI into ``sample.ll``, run the ``pass-to-test``,
+and then check for missing DI.
+
+Some other ways to run debugify are available:
+
+.. code-block:: bash
+
+   # Same as the above example.
+   $ opt -enable-debugify -pass-to-test sample.ll
+
+   # Suppresses verbose debugify output.
+   $ opt -enable-debugify -debugify-quiet -pass-to-test sample.ll
+
+   # Prepend -debugify before and append -check-debugify -strip after
+   # each pass on the pipeline (similar to -verify-each).
+   $ opt -debugify-each -O2 sample.ll
+
+``debugify`` can also be used to test a backend, e.g.:
+
+.. code-block:: bash
+
+   $ opt -debugify < sample.ll | llc -o -
+
+``debugify`` in regression tests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-debugify`` pass is especially helpful when it comes to testing that
+a given pass preserves DI while transforming the module. For this to work,
+the ``-debugify`` output must be stable enough to use in regression tests.
+Changes to this pass are not allowed to break existing tests.
+
+It allows us to test for DI loss in the same tests we check that the
+transformation is actually doing what it should.
+
+Here is an example from ``test/Transforms/InstCombine/cast-mul-select.ll``:
+
+.. code-block:: llvm
+
+   ; RUN: opt < %s -debugify -instcombine -S | FileCheck %s --check-prefix=DBGINFO
+
+   define i32 @mul(i32 %x, i32 %y) {
+   ; DBGINFO-LABEL: @mul(
+   ; DBGINFO-NEXT:    [[C:%.*]] = mul i32 {{.*}}
+   ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i32 [[C]]
+   ; DBGINFO-NEXT:    [[D:%.*]] = and i32 {{.*}}
+   ; DBGINFO-NEXT:    call void @llvm.dbg.value(metadata i32 [[D]]
+
+     %A = trunc i32 %x to i8
+     %B = trunc i32 %y to i8
+     %C = mul i8 %A, %B
+     %D = zext i8 %C to i32
+     ret i32 %D
+   }
+
+Here we test that the two ``dbg.value`` intrinsics are preserved and
+are correctly pointing to the ``[[C]]`` and ``[[D]]`` variables.
+
+.. note::
+
+   When writing this kind of regression test, it is important
+   to make it as robust as possible. That's why we should try to avoid
+   hardcoding line/variable numbers in check lines. If, for example, you test
+   for a ``DILocation`` to have a specific line number, and someone later adds
+   an instruction before the one we check, the test will fail. In the cases this
+   can't be avoided (say, if a test wouldn't be precise enough), moving the
+   test to its own file is preferred.

Added: www-releases/trunk/9.0.0/docs/_sources/SpeculativeLoadHardening.md.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/SpeculativeLoadHardening.md.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/SpeculativeLoadHardening.md.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/SpeculativeLoadHardening.md.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,1098 @@
+# Speculative Load Hardening
+
+### A Spectre Variant #1 Mitigation Technique
+
+Author: Chandler Carruth - [chandlerc at google.com](mailto:chandlerc at google.com)
+
+## Problem Statement
+
+Recently, Google Project Zero and other researchers have found information leak
+vulnerabilities by exploiting speculative execution in modern CPUs. These
+exploits are currently broken down into three variants:
+* GPZ Variant #1 (a.k.a. Spectre Variant #1): Bounds check (or predicate) bypass
+* GPZ Variant #2 (a.k.a. Spectre Variant #2): Branch target injection
+* GPZ Variant #3 (a.k.a. Meltdown): Rogue data cache load
+
+For more details, see the Google Project Zero blog post and the Spectre research
+paper:
+* https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
+* https://spectreattack.com/spectre.pdf
+
+The core problem of GPZ Variant #1 is that speculative execution uses branch
+prediction to select the path of instructions speculatively executed. This path
+is speculatively executed with the available data, and may load from memory and
+leak the loaded values through various side channels that survive even when the
+speculative execution is unwound due to being incorrect. Mispredicted paths can
+cause code to be executed with data inputs that never occur in correct
+executions, making checks against malicious inputs ineffective and allowing
+attackers to use malicious data inputs to leak secret data. Here is an example,
+extracted and simplified from the Project Zero paper:
+```
+struct array {
+  unsigned long length;
+  unsigned char data[];
+};
+struct array *arr1 = ...; // small array
+struct array *arr2 = ...; // array of size 0x400
+unsigned long untrusted_offset_from_caller = ...;
+if (untrusted_offset_from_caller < arr1->length) {
+  unsigned char value = arr1->data[untrusted_offset_from_caller];
+  unsigned long index2 = ((value&1)*0x100)+0x200;
+  unsigned char value2 = arr2->data[index2];
+}
+```
+
+The key to the attack is to call this with an `untrusted_offset_from_caller`
+that is far outside of the bounds when the branch predictor will predict that
+it will be in-bounds. In that case, the body of the `if` will be executed
+speculatively, and may read secret data into `value` and leak it via a
+cache-timing side channel when a dependent access is made to populate `value2`.
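+
+To make the leak concrete, the index arithmetic of the dependent access can be
+checked in isolation. The following C++ sketch is illustrative only: the names
+`probe_index` and `check_all_bytes` are hypothetical and not part of the
+original example, and nothing here executes speculatively.
+
```cpp
#include <cassert>

// Hypothetical helper: same arithmetic as `index2` in the example above.
constexpr unsigned long probe_index(unsigned char value) {
  return ((value & 1UL) * 0x100UL) + 0x200UL;
}

// Only the low bit of `value` matters, and it selects one of two offsets.
static_assert(probe_index(0x00) == 0x200UL, "low bit clear");
static_assert(probe_index(0x01) == 0x300UL, "low bit set");

// Runtime check of the same property for every possible byte value.
inline void check_all_bytes() {
  for (int v = 0; v < 256; ++v) {
    unsigned long expected = (v & 1) ? 0x300UL : 0x200UL;
    assert(probe_index(static_cast<unsigned char>(v)) == expected);
  }
}
```
+
+The two offsets are 0x100 bytes apart, so on hardware with cache lines smaller
+than that (an assumption about the target, for example 64-byte lines) they fall
+on different lines, and timing later accesses to each can reveal one bit of
+`value`.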
+
+## High Level Mitigation Approach
+
+While several approaches are being actively pursued to mitigate specific
+branches and/or loads inside especially risky software (most notably various OS
+kernels), these approaches require manual and/or static analysis aided auditing
+of code and explicit source changes to apply the mitigation. They are unlikely
+to scale well to large applications. We are proposing a comprehensive
+mitigation approach that would apply automatically across an entire program
+rather than through manual changes to the code. While this is likely to have a
+high performance cost, some applications may be in a good position to take this
+performance / security tradeoff.
+
+The specific technique we propose is to cause loads to be checked using
+branchless code to ensure that they are executing along a valid control flow
+path. Consider the following C-pseudo-code representing the core idea of a
+predicate guarding potentially invalid loads:
+```
+void leak(int data);
+void example(int* pointer1, int* pointer2) {
+  if (condition) {
+    // ... lots of code ...
+    leak(*pointer1);
+  } else {
+    // ... more code ...
+    leak(*pointer2);
+  }
+}
+```
+
+This would get transformed into something resembling the following:
+```
+uintptr_t all_ones_mask = std::numeric_limits<uintptr_t>::max();
+uintptr_t all_zeros_mask = 0;
+void leak(int data);
+void example(int* pointer1, int* pointer2) {
+  uintptr_t predicate_state = all_ones_mask;
+  if (condition) {
+    // Assuming ?: is implemented using branchless logic...
+    predicate_state = !condition ? all_zeros_mask : predicate_state;
+    // ... lots of code ...
+    //
+    // Harden the pointer so it can't be loaded
+    pointer1 &= predicate_state;
+    leak(*pointer1);
+  } else {
+    predicate_state = condition ? all_zeros_mask : predicate_state;
+    // ... more code ...
+    //
+    // Alternative: Harden the loaded value
+    int value2 = *pointer2 & predicate_state;
+    leak(value2);
+  }
+}
+```
+
+The result should be that if the `if (condition) {` branch is mis-predicted,
+there is a *data* dependency on the condition used to zero out any pointers
+prior to loading through them or to zero out all of the loaded bits. Even
+though this code pattern may still execute speculatively, *invalid* speculative
+executions are prevented from leaking secret data from memory (but note that
+this data might still be loaded in safe ways, and some regions of memory are
+required to not hold secrets, see below for detailed limitations). This
+approach only requires the underlying hardware have a way to implement a
+branchless and unpredicted conditional update of a register's value. All modern
+architectures have support for this, and in fact such support is necessary to
+correctly implement constant time cryptographic primitives.
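+
+The pseudo-code above applies `&=` directly to a pointer, which C and C++ do
+not accept. A compilable sketch of the same masking idea, assuming the
+all-ones / all-zeros convention used in this section, might look like the
+following. The names `update_state`, `harden_value`, `harden_pointer`, and
+`check_masking` are hypothetical, and a compiler is free to lower the `?:` to a
+branch, so this only models the data flow and is not a usable mitigation by
+itself.
+
```cpp
#include <cassert>
#include <cstdint>

constexpr std::uintptr_t kAllOnes = ~static_cast<std::uintptr_t>(0);
constexpr std::uintptr_t kAllZeros = 0;

// Keep the state if the guarding condition really holds, else poison it.
// Assumption: real hardening emits this as a branchless conditional move.
constexpr std::uintptr_t update_state(std::uintptr_t state, bool path_is_valid) {
  return path_is_valid ? state : kAllZeros;
}

// Harden a loaded value: with a poisoned state every bit is cleared.
constexpr std::uint32_t harden_value(std::uint32_t loaded, std::uintptr_t state) {
  return static_cast<std::uint32_t>(loaded & state);
}

// Harden an address: with a poisoned state the address becomes zero.
inline const std::uint32_t *harden_pointer(const std::uint32_t *p,
                                           std::uintptr_t state) {
  return reinterpret_cast<const std::uint32_t *>(
      reinterpret_cast<std::uintptr_t>(p) & state);
}

inline void check_masking() {
  std::uint32_t secret = 0xdeadbeefu;
  std::uintptr_t ok = update_state(kAllOnes, true);
  std::uintptr_t bad = update_state(kAllOnes, false);
  assert(harden_value(secret, ok) == secret);
  assert(harden_value(secret, bad) == 0u);
  assert(harden_pointer(&secret, ok) == &secret);
  assert(reinterpret_cast<std::uintptr_t>(harden_pointer(&secret, bad)) == 0);
  // Once poisoned, a later correctly predicted branch does not restore it.
  assert(update_state(bad, true) == kAllZeros);
}
```
+
+Calling `check_masking()` exercises both the valid and the poisoned state. The
+value-masking variant never touches the pointer, while the pointer-masking
+variant prevents the load from reaching the original address at all.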
+
+Crucial properties of this approach:
+* It is not preventing any particular side-channel from working. This is
+  important as there are an unknown number of potential side channels and we
+  expect to continue discovering more. Instead, it prevents the observation of
+  secret data in the first place.
+* It accumulates the predicate state, protecting even in the face of nested
+  *correctly* predicted control flows.
+* It passes this predicate state across function boundaries to provide
+  [interprocedural protection](#interprocedural-checking).
+* When hardening the address of a load, it uses a *destructive* or
+  *non-reversible* modification of the address to prevent an attacker from
+  reversing the check using attacker-controlled inputs.
+* It does not completely block speculative execution, and merely prevents
+  *mis*-speculated paths from leaking secrets from memory (and stalls
+  speculation until this can be determined).
+* It is completely general and makes no fundamental assumptions about the
+  underlying architecture other than the ability to do branchless conditional
+  data updates and a lack of value prediction.
+* It does not require programmers to identify all possible secret data using
+  static source code annotations or code vulnerable to a variant #1 style
+  attack.
+
+Limitations of this approach:
+* It requires re-compiling source code to insert hardening instruction
+  sequences. Only software compiled in this mode is protected.
+* The performance is heavily dependent on a particular architecture's
+  implementation strategy. We outline a potential x86 implementation below and
+  characterize its performance.
+* It does not defend against secret data already loaded from memory and
+  residing in registers or leaked through other side-channels in
+  non-speculative execution. Code dealing with this, e.g. cryptographic
+  routines, already uses constant-time algorithms and code to prevent
+  side-channels. Such code should also scrub registers of secret data following
+  [these
+  guidelines](https://github.com/HACS-workshop/spectre-mitigations/blob/master/crypto_guidelines.md).
+* To achieve reasonable performance, many loads may not be checked, such as
+  those with compile-time fixed addresses. This primarily consists of accesses
+  at compile-time constant offsets of global and local variables. Code which
+  needs this protection and intentionally stores secret data must ensure the
+  memory regions used for secret data are necessarily dynamic mappings or heap
+  allocations. This is an area which can be tuned to provide more comprehensive
+  protection at the cost of performance.
+* [Hardened loads](#hardening-the-address-of-the-load) may still load data from
+  _valid_ addresses if not _attacker-controlled_ addresses. To prevent these
+  from reading secret data, the low 2 GB of the address space and 2 GB above and
+  below any executable pages should be protected.
+
+Credit:
+* The core idea of tracing misspeculation through data and marking pointers to
+  block misspeculated loads was developed as part of a HACS 2018 discussion
+  between Chandler Carruth, Paul Kocher, Thomas Pornin, and several other
+  individuals.
+* Core idea of masking out loaded bits was part of the original mitigation
+  suggested by Jann Horn when these attacks were reported.
+
+
+### Indirect Branches, Calls, and Returns
+
+It is possible to attack control flow other than conditional branches with
+variant #1 style mispredictions.
+* A prediction towards a hot call target of a virtual method can lead to it
+  being speculatively executed when an unexpected type is used (often called
+  "type confusion").
+* A hot case may be speculatively executed due to prediction instead of the
+  correct case for a switch statement implemented as a jump table.
+* A hot common return address may be predicted incorrectly when returning from
+  a function.
+
+These code patterns are also vulnerable to Spectre variant #2, and as such are
+best mitigated with a
+[retpoline](https://support.google.com/faqs/answer/7625886) on x86 platforms.
+When a mitigation technique like retpoline is used, speculation simply cannot
+proceed through an indirect control flow edge (or it cannot be mispredicted in
+the case of a filled RSB) and so it is also protected from variant #1 style
+attacks. However, some architectures, micro-architectures, or vendors do not
+employ the retpoline mitigation, and on future x86 hardware (both Intel and
+AMD) it is expected to become unnecessary due to hardware-based mitigation.
+
+When not using a retpoline, these edges will need independent protection from
+variant #1 style attacks. The analogous approach to that used for conditional
+control flow should work:
+```
+uintptr_t all_ones_mask = std::numeric_limits<uintptr_t>::max();
+uintptr_t all_zeros_mask = 0;
+void leak(int data);
+void example(int* pointer1, int* pointer2) {
+  uintptr_t predicate_state = all_ones_mask;
+  switch (condition) {
+  case 0:
+    // Assuming ?: is implemented using branchless logic...
+    predicate_state = (condition != 0) ? all_zeros_mask : predicate_state;
+    // ... lots of code ...
+    //
+    // Harden the pointer so it can't be loaded
+    pointer1 &= predicate_state;
+    leak(*pointer1);
+    break;
+
+  case 1:
+    predicate_state = (condition != 1) ? all_zeros_mask : predicate_state;
+    // ... more code ...
+    //
+    // Alternative: Harden the loaded value
+    int value2 = *pointer2 & predicate_state;
+    leak(value2);
+    break;
+
+    // ...
+  }
+}
+```
+
+The core idea remains the same: validate the control flow using data-flow and
+use that validation to check that loads cannot leak information along
+misspeculated paths. Typically this involves passing the desired target of such
+control flow across the edge and checking that it is correct afterwards. Note
+that while it is tempting to think that this mitigates variant #2 attacks, it
+does not. Those attacks go to arbitrary gadgets that don't include the checks.
+
+
+### Variant #1.1 and #1.2 attacks: "Bounds Check Bypass Store"
+
+Beyond the core variant #1 attack, there are techniques to extend this attack.
+The primary technique is known as "Bounds Check Bypass Store" and is discussed
+in this research paper: https://people.csail.mit.edu/vlk/spectre11.pdf
+
+We will analyze these two variants independently. First, variant #1.1 works by
+speculatively storing over the return address after a bounds check bypass. This
+speculative store then ends up being used by the CPU during speculative
+execution of the return, potentially directing speculative execution to
+arbitrary gadgets in the binary. Let's look at an example.
+```
+unsigned char local_buffer[4];
+unsigned char *untrusted_data_from_caller = ...;
+unsigned long untrusted_size_from_caller = ...;
+if (untrusted_size_from_caller < sizeof(local_buffer)) {
+  // Speculative execution enters here with a too-large size.
+  memcpy(local_buffer, untrusted_data_from_caller,
+         untrusted_size_from_caller);
+  // The stack has now been smashed, writing an attacker-controlled
+  // address over the return address.
+  minor_processing(local_buffer);
+  return;
+  // Control will speculate to the attacker-written address.
+}
+```
+
+However, this can be mitigated by hardening the load of the return address just
+like any other load. This is sometimes complicated because x86 for example
+*implicitly* loads the return address off the stack. However, the
+implementation technique below is specifically designed to mitigate this
+implicit load by using the stack pointer to communicate misspeculation between
+functions. This additionally causes a misspeculation to have an invalid stack
+pointer and never be able to read the speculatively stored return address. See
+the detailed discussion below.
+
+For variant #1.2, the attacker speculatively stores into the vtable or jump
+table used to implement an indirect call or indirect jump. Because this is
+speculative, this will often be possible even when these are stored in
+read-only pages. For example:
+```
+class FancyObject : public BaseObject {
+public:
+  void DoSomething() override;
+};
+void f(unsigned long attacker_offset, unsigned long attacker_data) {
+  FancyObject object = getMyObject();
+  unsigned long *arr[4] = getFourDataPointers();
+  if (attacker_offset < 4) {
+    // We have bypassed the bounds check speculatively.
+    unsigned long *data = arr[attacker_offset];
+    // Now we have computed a pointer inside of `object`, the vptr.
+    *data = attacker_data;
+    // The vptr points to the virtual table and we speculatively clobber that.
+    g(object); // Hand the object to some other routine.
+  }
+}
+// In another file, we call a method on the object.
+void g(BaseObject &object) {
+  object.DoSomething();
+  // This speculatively calls the address stored over the vtable.
+}
+```
+
+Mitigating this requires hardening loads from these locations, or mitigating
+the indirect call or indirect jump. Any of these is sufficient to block the
+call or jump from using a speculatively stored value that has been read back.
+
+For both of these, using retpolines would be equally sufficient. One possible
+hybrid approach is to use retpolines for indirect call and jump, while relying
+on SLH to mitigate returns.
+
+Another approach that is sufficient for both of these is to harden all of the
+speculative stores. However, as most stores aren't interesting and don't
+inherently leak data, this is expected to be prohibitively expensive given the
+attack it is defending against.
+
+
+## Implementation Details
+
+There are a number of complex details impacting the implementation of this
+technique, both on a particular architecture and within a particular compiler.
+We discuss proposed implementation techniques for the x86 architecture and the
+LLVM compiler. These are primarily to serve as an example, as other
+implementation techniques are very possible.
+
+
+### x86 Implementation Details
+
+On the x86 platform we break down the implementation into three core
+components: accumulating the predicate state through the control flow graph,
+checking the loads, and checking control transfers between procedures.
+
+
+#### Accumulating Predicate State
+
+Consider baseline x86 instructions like the following, which test three
+conditions and, if all pass, load data from memory and potentially leak it
+through some side channel:
+```
+# %bb.0:                                # %entry
+        pushq   %rax
+        testl   %edi, %edi
+        jne     .LBB0_4
+# %bb.1:                                # %then1
+        testl   %esi, %esi
+        jne     .LBB0_4
+# %bb.2:                                # %then2
+        testl   %edx, %edx
+        je      .LBB0_3
+.LBB0_4:                                # %exit
+        popq    %rax
+        retq
+.LBB0_3:                                # %danger
+        movl    (%rcx), %edi
+        callq   leak
+        popq    %rax
+        retq
+```
+
+When we go to speculatively execute the load, we want to know whether any of
+the dynamically executed predicates have been misspeculated. To track that,
+along each conditional edge, we need to track the data which would allow that
+edge to be taken. On x86, this data is stored in the flags register used by the
+conditional jump instruction. Along both edges after this fork in control flow,
+the flags register remains alive and contains data that we can use to build up
+our accumulated predicate state. We accumulate it using the x86 conditional
+move instruction which also reads the flags register where the state resides.
+These conditional move instructions are known to not be predicted on any x86
+processors, making them immune to misprediction that could reintroduce the
+vulnerability. When we insert the conditional moves, the code ends up looking
+like the following:
+```
+# %bb.0:                                # %entry
+        pushq   %rax
+        xorl    %eax, %eax              # Zero out initial predicate state.
+        movq    $-1, %r8                # Put all-ones mask into a register.
+        testl   %edi, %edi
+        jne     .LBB0_1
+# %bb.2:                                # %then1
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        testl   %esi, %esi
+        jne     .LBB0_1
+# %bb.3:                                # %then2
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        testl   %edx, %edx
+        je      .LBB0_4
+.LBB0_1:
+        cmoveq  %r8, %rax               # Conditionally update predicate state.
+        popq    %rax
+        retq
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        ...
+```
+
+Here we create the "empty" or "correct execution" predicate state by zeroing
+`%rax`, and we create a constant "incorrect execution" predicate value by
+putting `-1` into `%r8`. Then, along each edge coming out of a conditional
+branch we do a conditional move that in a correct execution will be a no-op,
+but if misspeculated, will replace the `%rax` with the value of `%r8`.
+Misspeculating any one of the three predicates will cause `%rax` to hold the
+"incorrect execution" value from `%r8` as we preserve incoming values when
+execution is correct rather than overwriting them.
+
+We now have a value in `%rax` in each basic block that indicates if at some
+point previously a predicate was mispredicted. And we have arranged for that
+value to be particularly effective when used below to harden loads.
+
+
+##### Indirect Call, Branch, and Return Predicates
+
+There is no analogous flag to use when tracing indirect calls, branches, and
+returns. The predicate state must be accumulated through some other means.
+Fundamentally, this is the reverse of the problem posed in CFI: we need to
+check where we came from rather than where we are going. For function-local
+jump tables, this is easily arranged by testing the input to the jump table
+within each destination (not yet implemented, use retpolines):
+```
+        pushq   %rax
+        xorl    %eax, %eax              # Zero out initial predicate state.
+        movq    $-1, %r8                # Put all-ones mask into a register.
+        jmpq    *.LJTI0_0(,%rdi,8)      # Indirect jump through table.
+.LBB0_2:                                # %sw.bb
+        testq   $0, %rdi                # Validate index used for jump table.
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        ...
+        jmp     _Z4leaki                # TAILCALL
+
+.LBB0_3:                                # %sw.bb1
+        testq   $1, %rdi                # Validate index used for jump table.
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        ...
+        jmp     _Z4leaki                # TAILCALL
+
+.LBB0_5:                                # %sw.bb10
+        testq   $2, %rdi                # Validate index used for jump table.
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        ...
+        jmp     _Z4leaki                # TAILCALL
+        ...
+
+        .section        .rodata,"a",@progbits
+        .p2align        3
+.LJTI0_0:
+        .quad   .LBB0_2
+        .quad   .LBB0_3
+        .quad   .LBB0_5
+        ...
+```
+
+Returns have a simple mitigation technique on x86-64 (or other ABIs which have
+what is called a "red zone" region beyond the end of the stack). This region is
+guaranteed to be preserved across interrupts and context switches, making the
+return address used in returning to the current code remain on the stack and
+valid to read. We can emit code in the caller to verify that a return edge was
+not mispredicted:
+```
+        callq   other_function
+return_addr:
+        testq   -8(%rsp), return_addr   # Validate return address.
+        cmovneq %r8, %rax               # Update predicate state.
+```
+
+For an ABI without a "red zone" (and thus unable to read the return address
+from the stack), we can compute the expected return address prior to the call
+into a register preserved across the call and use that similarly to the above.
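+
+One possible shape for this, as a sketch only: the label `.Lreturn_addr` and the
+choice of `%rbx` and `%rcx` are assumptions made for illustration, and `cmpq`
+stands in for the schematic `testq` used above:
+```
+        leaq    .Lreturn_addr(%rip), %rbx   # Expected return address, callee-saved.
+        callq   other_function
+.Lreturn_addr:
+        leaq    .Lreturn_addr(%rip), %rcx   # Address we actually resumed at.
+        cmpq    %rcx, %rbx                  # Validate return address.
+        cmovneq %r8, %rax                   # Update predicate state.
+```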
+
+Indirect calls (and returns in the absence of a red zone ABI) pose the most
+significant challenge for propagating predicate state. The simplest technique
+would be to define a
+new ABI such that the intended call target is passed into the called function
+and checked in the entry. Unfortunately, new ABIs are quite expensive to deploy
+in C and C++. While the target function could be passed in TLS, we would still
+require complex logic to handle a mixture of functions compiled with and
+without this extra logic (essentially, making the ABI backwards compatible).
+Currently, we suggest using retpolines here and will continue to investigate
+ways of mitigating this.
+
+
+##### Optimizations, Alternatives, and Tradeoffs
+
+Merely accumulating predicate state involves significant cost. There are
+several key optimizations we employ to minimize this and various alternatives
+that present different tradeoffs in the generated code.
+
+First, we work to reduce the number of instructions used to track the state:
+* Rather than inserting a `cmovCC` instruction along every conditional edge in
+  the original program, we track each set of condition flags we need to capture
+  prior to entering each basic block and reuse a common `cmovCC` sequence for
+  those.
+  * We could further reuse suffixes when there are multiple `cmovCC`
+    instructions required to capture the set of flags. Currently this is
+    believed to not be worth the cost as paired flags are relatively rare and
+    suffixes of them are exceedingly rare.
+* A common pattern in x86 is to have multiple conditional jump instructions
+  that use the same flags but handle different conditions. Naively, we could
+  consider each fallthrough between them an "edge" but this causes a much more
+  complex control flow graph. Instead, we accumulate the set of conditions
+  necessary for fallthrough and use a sequence of `cmovCC` instructions in a
+  single fallthrough edge to track it.
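+
+As a rough illustration of the last point, consider a three-way compare where
+two jumps consume the same flags. The labels below are hypothetical and the
+code is a hand-written sketch rather than compiler output; `%rax` and `%r8`
+play the same roles as in the earlier examples:
+```
+        cmpl    %esi, %edi
+        jl      .LBB1_less              # Hypothetical label.
+        jg      .LBB1_greater           # Hypothetical label; reuses the same flags.
+# Single fallthrough edge: correct only if neither condition held.
+        cmovlq  %r8, %rax               # Poison if the first jump should have been taken.
+        cmovgq  %r8, %rax               # Poison if the second jump should have been taken.
+        ...
+.LBB1_less:
+        cmovgeq %r8, %rax               # Poison if "less" did not actually hold.
+        ...
+.LBB1_greater:
+        cmovleq %r8, %rax               # Poison if "greater" did not actually hold.
+        ...
+```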
+
+Second, we trade register pressure for simpler `cmovCC` instructions by
+allocating a register for the "bad" state. We could read that value from memory
+as part of the conditional move instruction, however, this creates more
+micro-ops and requires the load-store unit to be involved. Currently, we place
+the value into a virtual register and allow the register allocator to decide
+when the register pressure is sufficient to make it worth spilling to memory
+and reloading.
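+
+The two forms can be compared side by side. This is only a sketch: the constant
+`.Lall_ones` is a hypothetical 8-byte read-only value holding `-1`, not a
+symbol the compiler is known to emit:
+```
+        cmovneq %r8, %rax               # Register form: "bad" state kept in %r8.
+        cmovneq .Lall_ones(%rip), %rax  # Memory form: "bad" state loaded each time.
+```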
+
+
+#### Hardening Loads
+
+Once we have the predicate accumulated into a special value for correct vs.
+misspeculated, we need to apply this to loads in a way that ensures they do not
+leak secret data. There are two primary techniques for this: we can either
+harden the loaded value to prevent observation, or we can harden the address
+itself to prevent the load from occurring. These have significantly different
+performance tradeoffs.
+
+
+##### Hardening loaded values
+
+The most appealing way to harden loads is to mask out all of the bits loaded.
+The key requirement is that for each bit loaded, along the misspeculated path
+that bit is always fixed at either 0 or 1 regardless of the value of the bit
+loaded. The most obvious implementation uses either an `and` instruction with
+an all-zero mask along misspeculated paths and an all-one mask along correct
+paths, or an `or` instruction with an all-one mask along misspeculated paths
+and an all-zero mask along correct paths. Other options, such as multiplying by
+zero or using multiple shift instructions, are less appealing. For reasons we
+elaborate on below, we end up suggesting you use `or` with an all-ones mask,
+making the x86 instruction sequence look like the following:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        movl    (%rsi), %edi            # Load potentially secret data from %rsi.
+        orl     %eax, %edi
+```
+
+Another useful pattern may be to fold the load into the `or` instruction itself
+at the cost of a register-to-register copy.
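+
+A minimal sketch of that folded form, assuming the same register assignment as
+the example above (the copy is needed so that `%rax` keeps the predicate
+state):
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        movl    %eax, %edi              # Copy the predicate state (0 or -1).
+        orl     (%rsi), %edi            # Load folded into the or; all ones if misspeculating.
+```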
+
+There are some challenges with deploying this approach:
+1. Many loads on x86 are folded into other instructions. Separating them would
+   add very significant and costly register pressure with prohibitive
+   performance cost.
+1. Loads may not target a general purpose register requiring extra instructions
+   to map the state value into the correct register class, and potentially more
+   expensive instructions to mask the value in some way.
+1. The flags registers on x86 are very likely to be live, and challenging to
+   preserve cheaply.
+1. There are many more values loaded than pointers & indices used for loads. As
+   a consequence, hardening the result of a load requires substantially more
+   instructions than hardening the address of the load (see below).
+
+Despite these challenges, hardening the result of the load critically allows
+the load to proceed and thus has dramatically less impact on the total
+speculative / out-of-order potential of the execution. There are also several
+interesting techniques to try and mitigate these challenges and make hardening
+the results of loads viable in at least some cases. However, when hardening the
+loaded value is unprofitable, we generally expect to fall back to the next
+approach of hardening the address itself.
+
+
+###### Loads folded into data-invariant operations can be hardened after the operation
+
+The first key to making this feasible is to recognize that many operations on
+x86 are "data-invariant". That is, they have no (known) observable behavior
+differences due to the particular input data. These instructions are often used
+when implementing cryptographic primitives dealing with private key data
+because they are not believed to provide any side-channels. Similarly, we can
+defer hardening until after them as they will not in-and-of-themselves
+introduce a speculative execution side-channel. This results in code sequences
+that look like:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        addl    (%rsi), %edi            # Load and accumulate without leaking.
+        orl     %eax, %edi
+```
+
+While an addition happens to the loaded (potentially secret) value, that
+doesn't leak any data and we then immediately harden it.
+
+
+###### Hardening of loaded values deferred down the data-invariant expression graph
+
+We can generalize the previous idea and sink the hardening down the expression
+graph across as many data-invariant operations as desirable. This can use very
+conservative rules for whether something is data-invariant. The primary goal
+should be to handle multiple loads with a single hardening instruction:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        addl    (%rsi), %edi            # Load and accumulate without leaking.
+        addl    4(%rsi), %edi           # Continue without leaking.
+        addl    8(%rsi), %edi
+        orl     %eax, %edi              # Mask out bits from all three loads.
+```
+
+
+###### Preserving the flags while hardening loaded values on Haswell, Zen, and newer processors
+
+Sadly, there are no useful instructions on x86 that apply a mask to all 64 bits
+without touching the flag registers. However, we can harden loaded values that
+are narrower than a word (fewer than 32-bits on 32-bit systems and fewer than
+64-bits on 64-bit systems) by zero-extending the value to the full word size
+and then shifting right by at least the number of original bits using the BMI2
+`shrx` instruction:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        addl    (%rsi), %edi            # Load and accumulate 32 bits of data.
+        shrxq   %rax, %rdi, %rdi        # Shift out all 32 bits loaded.
+```
+
+Because on x86 the zero-extend is free, this can efficiently harden the loaded
+value.
+
+
+##### Hardening the address of the load
+
+When hardening the loaded value is inapplicable, most often because the
+instruction directly leaks information (like `cmp` or `jmpq`), we switch to
+hardening the _address_ of the load instead of the loaded value. This avoids
+increasing register pressure by unfolding the load or paying some other high
+cost.
+
+To understand how this works in practice, we need to examine the exact
+semantics of the x86 addressing modes which, in their fully general form, look
+like `offset(%base,%index,scale)`. Here `%base` and `%index` are 64-bit
+registers that can potentially be any value, and may be attacker controlled,
+and `scale` and `offset` are fixed immediate values. `scale` must be `1`, `2`,
+`4`, or `8`, and `offset` can be any 32-bit sign extended value. The exact
+computation performed to find the address is then: `%base + (scale * %index) +
+offset` under 64-bit 2's complement modular arithmetic.
+
+One issue with this approach is that, after hardening, the  `%base + (scale *
+%index)` subexpression will compute a value near zero (`-1 + (scale * -1)`) and
+then a large, positive `offset` will index into memory within the first two
+gigabytes of address space. While these offsets are not attacker controlled,
+the attacker could choose to attack a load which happens to have the desired
+offset and then successfully read memory in that region. This significantly
+raises the burden on the attacker and limits the scope of attack but does not
+eliminate it. To fully close the attack we must work with the operating system
+to preclude mapping memory in the low two gigabytes of address space.
+
+
+###### 64-bit load checking instructions
+
+We can use the following instruction sequences to check loads. We set up `%r8`
+in these examples to hold the special value of `-1` which will be `cmov`ed over
+`%rax` in misspeculated paths.
+
+Single register addressing mode:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        orq     %rax, %rsi              # Mask the pointer if misspeculating.
+        movl    (%rsi), %edi
+```
+
+Two register addressing mode:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        orq     %rax, %rsi              # Mask the pointer if misspeculating.
+        orq     %rax, %rcx              # Mask the index if misspeculating.
+        movl    (%rsi,%rcx), %edi
+```
+
+This will result in a negative address near zero or in `offset` wrapping the
+address space back to a small positive address. Small, negative addresses will
+fault in user-mode for most operating systems, but targets which need the high
+address space to be user accessible may need to adjust the exact sequence used
+above. Additionally, the low addresses will need to be marked unreadable by the
+OS to fully harden the load.
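+
+To make the arithmetic concrete, here is a sketch with a scale of `8` and an
+offset of `4096`, both chosen purely for illustration. Without the offset the
+masked address would be `-9`; with it, the address wraps to a small positive
+value:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # %rax is -1 when misspeculating.
+        orq     %rax, %rsi              # %rsi becomes -1.
+        orq     %rax, %rcx              # %rcx becomes -1.
+        movl    4096(%rsi,%rcx,8), %edi # Address: -1 + (8 * -1) + 4096 = 4087.
+```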
+
+
+###### RIP-relative addressing is even easier to break
+
+There is a common addressing mode idiom that is substantially harder to check:
+addressing relative to the instruction pointer. We cannot change the value of
+the instruction pointer register and so we have the harder problem of forcing
+`%base + scale * %index + offset` to be an invalid address, by *only* changing
+`%index`. The only advantage we have is that the attacker also cannot modify
+`%base`. If we use the fast instruction sequence above, but only apply it to
+the index, we will always access `%rip + (scale * -1) + offset`. If the
+attacker can find a load for which this address happens to point to secret
+data, then they can reach it. However, the loader and base libraries can also
+simply refuse to map the heap, data segments, or stack within 2gb of any of the
+text in the program, much like it can reserve the low 2gb of address space.
+
+
+###### The flag registers again make everything hard
+
+Unfortunately, the technique of using `orq`-instructions has a serious flaw on
+x86. The very thing that makes it easy to accumulate state, the flag registers
+containing predicates, causes serious problems here because they may be alive
+and used by the loading instruction or subsequent instructions. On x86, the
+`orq` instruction **sets** the flags and will override anything already there.
+This makes inserting them into the instruction stream very hazardous.
+Unfortunately, unlike when hardening the loaded value, we have no fallback here
+and so we must have a fully general approach available.
+
+The first thing we must do when generating these sequences is try to analyze
+the surrounding code to prove that the flags are not in fact alive or being
+used. Typically, they have been set by some other instruction which just happens
+to set the flags register (much like ours!) with no actual dependency. In those
+cases, it is safe to directly insert these instructions. Alternatively we may
+be able to move them earlier to avoid clobbering the used value.
+
+However, this may ultimately be impossible. In that case, we need to preserve
+the flags around these instructions:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        pushfq
+        orq     %rax, %rcx              # Mask the pointer if misspeculating.
+        orq     %rax, %rdx              # Mask the index if misspeculating.
+        popfq
+        movl    (%rcx,%rdx), %edi
+```
+
+Using the `pushf` and `popf` instructions saves the flags register around our
+inserted code, but comes at a high cost. First, we must store the flags to the
+stack and reload them. Second, this causes the stack pointer to be adjusted
+dynamically, requiring a frame pointer be used for referring to temporaries
+spilled to the stack, etc.
+
+On newer x86 processors we can use the `lahf` and `sahf` instructions to save
+all of the flags besides the overflow flag in a register rather than on the
+stack. We can then use `seto` and `add` to save and restore the overflow flag
+in a register. Combined, this will save and restore flags in the same manner as
+above but using two registers rather than the stack. That is still very
+expensive if slightly less expensive than `pushf` and `popf` in most cases.
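+
+A sketch of what such a sequence might look like. Because `lahf` writes `%ah`,
+this example assumes the predicate state lives in `%r10` rather than `%rax`;
+that register choice is ours for illustration, and `%rax` is clobbered:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %r10               # Predicate state kept in %r10 for this sketch.
+        seto    %al                     # Save OF as 0 or 1 in %al.
+        lahf                            # Save SF, ZF, AF, PF, and CF in %ah.
+        orq     %r10, %rcx              # Mask the pointer if misspeculating.
+        orq     %r10, %rdx              # Mask the index if misspeculating.
+        addb    $127, %al               # Restore OF: 1 + 127 overflows, 0 + 127 does not.
+        sahf                            # Restore SF, ZF, AF, PF, and CF from %ah.
+        movl    (%rcx,%rdx), %edi
+```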
+
+
+###### A flag-less alternative on Haswell, Zen and newer processors
+
+Starting with the BMI2 x86 instruction set extensions available on Haswell and
+Zen processors, there is an instruction for shifting that does not set any
+flags: `shrx`. We can use this and the `lea` instruction to implement analogous
+code sequences to the above ones. However, these are still very marginally
+slower, as there are fewer ports able to dispatch shift instructions in most
+modern x86 processors than there are for `or` instructions.
+
+Fast, single register addressing mode:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        shrxq   %rax, %rsi, %rsi        # Shift away bits if misspeculating.
+        movl    (%rsi), %edi
+```
+
+This will collapse the register to zero or one, and force everything but the
+offset in the addressing mode to be less than or equal to 9. This means the full
+address can only be guaranteed to be less than `(1 << 31) + 9`. The OS may wish
+to protect an extra page of the low address space to account for this.
+
+
+##### Optimizations
+
+A very large portion of the cost for this approach comes from checking loads in
+this way, so it is important to work to optimize this. However, beyond making
+the instruction sequences to *apply* the checks efficient (for example by
+avoiding `pushfq` and `popfq` sequences), the only significant optimization is
+to check fewer loads without introducing a vulnerability. We apply several
+techniques to accomplish that.
+
+
+###### Don't check loads from compile-time constant stack offsets
+
+We implement this optimization on x86 by skipping the checking of loads which
+use a fixed frame pointer offset.
+
+The result of this optimization is that patterns like reloading a spilled
+register or accessing a global field don't get checked. This is a very
+significant performance win.
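+
+For instance, in the following sketch the reload from the frame is left
+unchecked while the pointer it produces is still masked before being
+dereferenced. The frame offset `-16` is a hypothetical spill slot:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        movq    -16(%rbp), %rcx         # Reload from a fixed frame offset: not checked.
+        orq     %rax, %rcx              # Mask the reloaded pointer if misspeculating.
+        movl    (%rcx), %edi
+```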
+
+
+###### Don't check dependent loads
+
+A core part of why this mitigation strategy works is that it establishes a
+data-flow check on the loaded address. However, this means that if the address
+itself was already loaded using a checked load, there is no need to check a
+dependent load provided it is within the same basic block as the checked load,
+and therefore has no additional predicates guarding it. Consider code like the
+following:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        movq    (%rcx), %rdi
+        movl    (%rdi), %edx
+```
+
+This will get transformed into:
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        orq     %rax, %rcx              # Mask the pointer if misspeculating.
+        movq    (%rcx), %rdi            # Hardened load.
+        movl    (%rdi), %edx            # Unhardened load due to dependent addr.
+```
+
+This doesn't check the load through `%rdi` as that pointer is dependent on a
+checked load already.
+
+
+###### Protect large, load-heavy blocks with a single lfence
+
+It may be worth using a single `lfence` instruction at the start of a block
+which begins with a (very) large number of loads that require independent
+protection *and* which require hardening the address of the load. However, this
+is unlikely to be profitable in practice. The latency hit of the hardening
+would need to exceed that of an `lfence` when *correctly* speculatively
+executed. But in that case, the `lfence` cost is a complete loss of speculative
+execution (at a minimum). So far, the evidence we have of the performance cost
+of using `lfence` indicates few if any hot code patterns where this trade off
+would make sense.
+
+
+###### Tempting optimizations that break the security model
+
+Several optimizations were considered which didn't pan out due to failure to
+uphold the security model. One in particular is worth discussing as many others
+will reduce to it.
+
+We wondered whether only the *first* load in a basic block could be checked. If
+the check works as intended, it forms an invalid pointer that doesn't even
+virtual-address translate in the hardware. It should fault very early on in its
+processing. Maybe that would stop things in time for the misspeculated path to
+fail to leak any secrets. This doesn't end up working because the processor is
+fundamentally out-of-order, even in its speculative domain. As a consequence,
+the attacker could cause the initial address computation itself to stall and
+allow an arbitrary number of unrelated loads (including attacked loads of
+secret data) to pass through.
+
+
+#### Interprocedural Checking
+
+Modern x86 processors may speculate into called functions and out of functions
+to their return address. As a consequence, we need a way to check loads that
+occur after a misspeculated predicate but where the load and the misspeculated
+predicate are in different functions. In essence, we need some interprocedural
+generalization of the predicate state tracking. A primary challenge to passing
+the predicate state between functions is that we would like to not require a
+change to the ABI or calling convention in order to make this mitigation more
+deployable, and further would like code mitigated in this way to be easily
+mixed with code not mitigated in this way and without completely losing the
+value of the mitigation.
+
+
+##### Embed the predicate state into the high bit(s) of the stack pointer
+
+We can use the same technique that allows hardening pointers to pass the
+predicate state into and out of functions. The stack pointer is trivially
+passed between functions and we can test for it having the high bits set to
+detect when it has been marked due to misspeculation. The callsite instruction
+sequence looks like (assuming a misspeculated state value of `-1`):
+```
+        ...
+
+.LBB0_4:                                # %danger
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        shlq    $47, %rax
+        orq     %rax, %rsp
+        callq   other_function
+        movq    %rsp, %rax
+        sarq    $63, %rax               # Sign extend the high bit to all bits.
+```
+
+This first puts the predicate state into the high bits of `%rsp` before calling
+the function and then reads it back out of high bits of `%rsp` afterward. When
+correctly executing (speculatively or not), these are all no-ops. When
+misspeculating, the stack pointer will end up negative. We arrange for it to
+remain a canonical address, but otherwise leave the low bits alone to allow
+stack adjustments to proceed normally without disrupting this. Within the
+called function, we can extract this predicate state and then reset it on
+return:
+```
+other_function:
+        # prolog
+        movq    %rsp, %rax
+        sarq    $63, %rax               # Sign extend the high bit to all bits.
+        # ...
+
+.LBB0_N:
+        cmovneq %r8, %rax               # Conditionally update predicate state.
+        shlq    $47, %rax
+        orq     %rax, %rsp
+        retq
+```
+
+This approach is effective when all code is mitigated in this fashion, and can
+even survive very limited reaches into unmitigated code (the state will
+round-trip in and back out of an unmitigated function, it just won't be
+updated). But it does have some limitations. There is a cost to merging the
+state into `%rsp` and it doesn't insulate mitigated code from misspeculation in
+an unmitigated caller.
+
+There is also an advantage to using this form of interprocedural mitigation: by
+forming these invalid stack pointer addresses we can prevent speculative
+returns from successfully reading speculatively written values to the actual
+stack. This works first by forming a data-dependency between computing the
+address of the return address on the stack and our predicate state. And even
+when satisfied, if a misprediction causes the state to be poisoned the
+resulting stack pointer will be invalid.
+
+
+##### Rewrite API of internal functions to directly propagate predicate state
+
+(Not yet implemented.)
+
+We have the option with internal functions to directly adjust their API to
+accept the predicate as an argument and return it. This is likely to be
+marginally cheaper than embedding into `%rsp` for entering functions.
+
+
+##### Use `lfence` to guard function transitions
+
+An `lfence` instruction can be used to prevent subsequent loads from
+speculatively executing until all prior mispredicted predicates have resolved.
+We can use this broader barrier to speculative loads executing between
+functions. We emit it in the entry block to handle calls, and prior to each
+return. This approach also has the advantage of providing the strongest degree
+of mitigation when mixed with unmitigated code by halting all misspeculation
+entering a function which is mitigated, regardless of what occurred in the
+caller. However, such a mixture is inherently more risky. Whether this kind of
+mixture is a sufficient mitigation requires careful analysis.
+
+Unfortunately, experimental results indicate that the performance overhead of
+this approach is very high for certain patterns of code. A classic example is
+any form of recursive evaluation engine. The hot, rapid call and return
+sequences exhibit dramatic performance loss when mitigated with `lfence`. This
+component alone can regress performance by 2x or more, making it an unpleasant
+tradeoff even when only used in a mixture of code.
+
+
+##### Use an internal TLS location to pass predicate state
+
+We can define a special thread-local value to hold the predicate state between
+functions. This avoids direct ABI implications by using a side channel between
+callers and callees to communicate the predicate state. It also allows implicit
+zero-initialization of the state, which allows non-checked code to be the first
+code executed.
+
+However, this requires a load from TLS in the entry block, a store to TLS
+before every call and every ret, and a load from TLS after every call. As a
+consequence it is expected to be substantially more expensive even than using
+`%rsp` and potentially `lfence` within the function entry block.
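+
+The traffic this implies might look roughly as follows. This is a sketch
+assuming the local-exec TLS model on x86-64 ELF targets; the variable
+`__slh_state` and the callee `another_function` are hypothetical names
+introduced only for this example:
+```
+other_function:
+        movq    %fs:__slh_state@tpoff, %rax   # Load predicate state on entry.
+        ...
+        movq    %rax, %fs:__slh_state@tpoff   # Store predicate state before the call.
+        callq   another_function
+        movq    %fs:__slh_state@tpoff, %rax   # Reload predicate state after the call.
+        ...
+        movq    %rax, %fs:__slh_state@tpoff   # Store predicate state before returning.
+        retq
+```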
+
+
+##### Define a new ABI and/or calling convention
+
+We could define a new ABI and/or calling convention to explicitly pass the
+predicate state in and out of functions. This may be interesting if none of the
+alternatives have adequate performance, but it makes deployment and adoption
+dramatically more complex, and potentially infeasible.
+
+
+## High-Level Alternative Mitigation Strategies
+
+There are completely different alternative approaches to mitigating variant 1
+attacks. [Most](https://lwn.net/Articles/743265/)
+[discussion](https://lwn.net/Articles/744287/) so far focuses on mitigating
+specific known attackable components in the Linux kernel (or other kernels) by
+manually rewriting the code to contain an instruction sequence that is not
+vulnerable. For x86 systems this is done by either injecting an `lfence`
+instruction along the code path which would leak data if executed speculatively
+or by rewriting memory accesses to have branch-less masking to a known safe
+region. On Intel systems, `lfence` [will prevent the speculative load of secret
+data](https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analysis-of-Speculative-Execution-Side-Channels.pdf).
+On AMD systems `lfence` is currently a no-op, but can be made
+dispatch-serializing by setting an MSR, and thus preclude misspeculation of the
+code path ([mitigation G-2 +
+V1-1](https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf)).
+
+However, this relies on finding and enumerating all possible points in code
+which could be attacked to leak information. While in some cases static
+analysis is effective at doing this at scale, in many cases it still relies on
+human judgement to evaluate whether code might be vulnerable. Especially for
+software systems which receive less detailed scrutiny but remain sensitive to
+these attacks, this seems like an impractical security model. We need an
+automatic and systematic mitigation strategy.
+
+
+### Automatic `lfence` on Conditional Edges
+
+A natural way to scale up the existing hand-coded mitigations is simply to
+inject an `lfence` instruction into both the target and fallthrough
+destinations of every conditional branch. This ensures that no predicate or
+bounds check can be bypassed speculatively. However, the performance overhead
+of this approach is, simply put, catastrophic. Yet it remains the only truly
+"secure by default" approach known prior to this effort and serves as the
+baseline for performance.
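+
+Applied to the first branch of the earlier example, the transformation would
+look roughly like this (a hand-written sketch, not output from any compiler):
+```
+        testl   %edi, %edi
+        jne     .LBB0_1
+# %bb.2:                                # %then1
+        lfence                          # Fence the fallthrough destination.
+        ...
+.LBB0_1:
+        lfence                          # Fence the target destination.
+        ...
+```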
+
+One attempt to address the performance overhead of this and make it more
+realistic to deploy is [MSVC's /Qspectre
+switch](https://blogs.msdn.microsoft.com/vcblog/2018/01/15/spectre-mitigations-in-msvc/).
+Their technique is to use static analysis within the compiler to only insert
+`lfence` instructions into conditional edges at risk of attack. However,
+[initial](https://arstechnica.com/gadgets/2018/02/microsofts-compiler-level-spectre-fix-shows-how-hard-this-problem-will-be-to-solve/)
+[analysis](https://www.paulkocher.com/doc/MicrosoftCompilerSpectreMitigation.html)
+has shown that this approach is incomplete and only catches a small and limited
+subset of attackable patterns which happen to resemble very closely the initial
+proofs of concept. As such, while its performance is acceptable, it does not
+appear to be an adequate systematic mitigation.
+
+
+## Performance Overhead
+
+The performance overhead of this style of comprehensive mitigation is very
+high. However, it compares very favorably with previously recommended
+approaches such as the `lfence` instruction. Just as users can restrict the
+scope of `lfence` to control its performance impact, this mitigation technique
+could be restricted in scope as well.
+
+However, it is important to understand what it would cost to get a fully
+mitigated baseline. Here we assume targeting a Haswell (or newer) processor and
+using all of the tricks to improve performance (which leaves the low 2gb
+unprotected, as well as +/- 2gb surrounding any PC in the program). We ran both
+Google's microbenchmark suite and a large highly-tuned server built using
+ThinLTO and PGO. All were built with `-march=haswell` to give access to BMI2
+instructions, and benchmarks were run on large Haswell servers. We collected
+data both with an `lfence`-based mitigation and load hardening as presented
+here. The summary is that mitigating with load hardening is 1.77x faster than
+mitigating with `lfence`, and the overhead of load hardening compared to a
+normal program is likely between a 10% overhead and a 50% overhead with most
+large applications seeing a 30% overhead or less.
+
+| Benchmark                              | `lfence` | Load Hardening | Mitigated Speedup |
+| -------------------------------------- | -------: | -------------: | ----------------: |
+| Google microbenchmark suite            |   -74.8% |         -36.4% |          **2.5x** |
+| Large server QPS (using ThinLTO & PGO) |   -62%   |         -29%   |          **1.8x** |
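+
+As a rough cross-check of the last column, the sketch below derives a speedup
+ratio from two slowdown percentages. It assumes each percentage is a relative
+change in throughput versus the unmitigated build; `mitigated_speedup` is an
+illustrative helper and not part of any benchmark tooling.
+

```python
def mitigated_speedup(lfence_change: float, hardening_change: float) -> float:
    """Ratio of load-hardened throughput to lfence-mitigated throughput.

    Both arguments are relative throughput changes versus the unmitigated
    baseline, e.g. -0.748 for a 74.8% slowdown (an assumed interpretation).
    """
    lfence_throughput = 1.0 + lfence_change
    hardening_throughput = 1.0 + hardening_change
    return hardening_throughput / lfence_throughput


# Microbenchmark row: -74.8% with lfence, -36.4% with load hardening.
ratio = mitigated_speedup(-0.748, -0.364)
print(f"{ratio:.1f}x")
```

+
+Under that assumption the microbenchmark row works out to roughly 2.5x, which
+agrees with the table.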
+
+Below is a visualization of the microbenchmark suite results which helps show
+the distribution of results that is somewhat lost in the summary. The y-axis is
+a log-scale speedup ratio of load hardening relative to `lfence` (up -> faster
+-> better). Each box-and-whiskers represents one microbenchmark which may have
+many different metrics measured. The red line marks the median, the box marks
+the first and third quartiles, and the whiskers mark the min and max.
+
+![Microbenchmark result visualization](speculative_load_hardening_microbenchmarks.png)
+
+We don't yet have benchmark data on SPEC or the LLVM test suite, but we can
+work on getting that. Still, the above should give a pretty clear
+characterization of the performance, and specific benchmarks are unlikely to
+reveal especially interesting properties.
+
+
+### Future Work: Fine Grained Control and API-Integration
+
+The performance overhead of this technique is likely to be very significant and
+something users wish to control or reduce. There are interesting options here
+that impact the implementation strategy used.
+
+One particularly appealing option is to allow both opt-in and opt-out of this
+mitigation at reasonably fine granularity such as on a per-function basis,
+including intelligent handling of inlining decisions -- protected code can be
+prevented from inlining into unprotected code, and unprotected code will become
+protected when inlined into protected code. For systems where only a limited
+set of code is reachable by externally controlled inputs, it may be possible to
+limit the scope of mitigation through such mechanisms without compromising the
+application's overall security. The performance impact may also be focused in a
+few key functions that can be hand-mitigated in ways that have lower
+performance overhead while the remainder of the application receives automatic
+protection.
+
+For both limiting the scope of mitigation or manually mitigating hot functions,
+there needs to be some support for mixing mitigated and unmitigated code
+without completely defeating the mitigation. For the first use case, it would
+be particularly desirable that mitigated code remains safe when being called
+during misspeculation from unmitigated code.
+
+For the second use case, it may be important to connect the automatic
+mitigation technique to explicit mitigation APIs such as what is described in
+http://wg21.link/p0928 (or any other eventual API) so that there is a clean way
+to switch from automatic to manual mitigation without immediately exposing a
+hole. However, the design for how to do this is hard to come up with until the
+APIs are better established. We will revisit this as those APIs mature.

Added: www-releases/trunk/9.0.0/docs/_sources/SphinxQuickstartTemplate.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/SphinxQuickstartTemplate.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/SphinxQuickstartTemplate.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/SphinxQuickstartTemplate.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,160 @@
+==========================
+Sphinx Quickstart Template
+==========================
+
+Introduction and Quickstart
+===========================
+
+This document is meant to get you writing documentation as fast as possible
+even if you have no previous experience with Sphinx. The goal is to take
+someone in the state of "I want to write documentation and get it added to
+LLVM's docs" and turn that into useful documentation mailed to llvm-commits
+with as little nonsense as possible.
+
+You can find this document in ``docs/SphinxQuickstartTemplate.rst``. You
+should copy it, open the new file in your text editor, write your docs, and
+then send the new document to llvm-commits for review.
+
+Focus on *content*. It is easy to fix the Sphinx (reStructuredText) syntax
+later if necessary, although reStructuredText tries to imitate common
+plain-text conventions so it should be quite natural. A basic knowledge of
+reStructuredText syntax is useful when writing the document, so the last
+~half of this document (starting with `Example Section`_) gives examples
+which should cover 99% of use cases.
+
+Let me say that again: focus on *content*. But if you really need to verify
+Sphinx's output, see ``docs/README.txt`` for information.
+
+Once you have finished with the content, please send the ``.rst`` file to
+llvm-commits for review.
+
+Guidelines
+==========
+
+Try to answer the following questions in your first section:
+
+#. Why would I want to read this document?
+
+#. What should I know to be able to follow along with this document?
+
+#. What will I have learned by the end of this document?
+
+Common names for the first section are ``Introduction``, ``Overview``, or
+``Background``.
+
+If possible, make your document a "how to". Give it a name ``HowTo*.rst``
+like the other "how to" documents. This format is usually the easiest
+for another person to understand and also the most useful.
+
+You generally should not be writing documentation other than a "how to"
+unless there is already a "how to" about your topic. The reason for this
+is that without a "how to" document to read first, it is difficult for a
+person to understand a more advanced document.
+
+Focus on content (yes, I had to say it again).
+
+The rest of this document shows example reStructuredText markup constructs
+that are meant to be read by you in your text editor after you have copied
+this file into a new file for the documentation you are about to write.
+
+Example Section
+===============
+
+Your text can be *emphasized*, **bold**, or ``monospace``.
+
+Use blank lines to separate paragraphs.
+
+Headings (like ``Example Section`` just above) give your document its
+structure. Use the same kind of adornments (e.g. ``======`` vs. ``------``)
+as are used in this document. The adornment must be the same length as the
+text above it. For Vim users, variations of ``yypVr=`` might be handy.
+
+Example Subsection
+------------------
+
+Make a link `like this <http://llvm.org/>`_. There is also a more
+sophisticated syntax which `can be more readable`_ for longer links since
+it disrupts the flow less. You can put the ``.. _`link text`: <URL>`` block
+pretty much anywhere later in the document.
+
+.. _`can be more readable`: http://en.wikipedia.org/wiki/LLVM
+
+Lists can be made like this:
+
+#. A list starting with ``#.`` will be automatically numbered.
+
+#. This is a second list element.
+
+   #. Use indentation to create nested lists.
+
+You can also use unordered lists.
+
+* Stuff.
+
+  + Deeper stuff.
+
+* More stuff.
+
+Example Subsubsection
+^^^^^^^^^^^^^^^^^^^^^
+
+You can make blocks of code like this:
+
+.. code-block:: c++
+
+   int main() {
+     return 0;
+   }
+
+For a shell session, use a ``console`` code block (some existing docs use
+``bash``):
+
+.. code-block:: console
+
+   $ echo "Goodbye cruel world!"
+   $ rm -rf /
+
+If you need to show LLVM IR use the ``llvm`` code block.
+
+.. code-block:: llvm
+
+   define i32 @test1() {
+   entry:
+     ret i32 0
+   }
+
+Some other common code blocks you might need are ``c``, ``objc``, ``make``,
+and ``cmake``. If you need something beyond that, you can look at the `full
+list`_ of supported code blocks.
+
+.. _`full list`: http://pygments.org/docs/lexers/
+
+However, don't waste time fiddling with syntax highlighting when you could
+be adding meaningful content. When in doubt, show preformatted text
+without any syntax highlighting like this:
+
+::
+
+                          .
+                           +:.
+                       ..:: ::
+                    .++:+:: ::+:.:.
+                   .:+           :
+            ::.::..::            .+.
+          ..:+    ::              :
+    ......+:.                    ..
+          :++.    ..              :
+            .+:::+::              :
+            ..   . .+            ::
+                     +.:      .::+.
+                      ...+. .: .
+                         .++:..
+                          ...
+
+Hopefully you won't need to be this deep
+""""""""""""""""""""""""""""""""""""""""
+
+If you need to do fancier things than what has been shown in this document,
+you can mail the list or check Sphinx's `reStructuredText Primer`_.
+
+.. _`reStructuredText Primer`: http://sphinx.pocoo.org/rest.html

Added: www-releases/trunk/9.0.0/docs/_sources/StackMaps.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/StackMaps.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/StackMaps.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/StackMaps.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,517 @@
+===================================
+Stack maps and patch points in LLVM
+===================================
+
+.. contents::
+   :local:
+   :depth: 2
+
+Definitions
+===========
+
+In this document we refer to the "runtime" collectively as all
+components that serve as the LLVM client, including the LLVM IR
+generator, object code consumer, and code patcher.
+
+A stack map records the location of ``live values`` at a particular
+instruction address. These ``live values`` do not refer to all the
+LLVM values live across the stack map. Instead, they are only the
+values that the runtime requires to be live at this point. For
+example, they may be the values the runtime will need to resume
+program execution at that point independent of the compiled function
+containing the stack map.
+
+LLVM emits stack map data into the object code within a designated
+:ref:`stackmap-section`. This stack map data contains a record for
+each stack map. The record stores the stack map's instruction address
+and contains an entry for each mapped value. Each entry encodes a
+value's location as a register, stack offset, or constant.
+
+A patch point is an instruction address at which space is reserved for
+patching a new instruction sequence at run time. Patch points look
+much like calls to LLVM. They take arguments that follow a calling
+convention and may return a value. They also imply stack map
+generation, which allows the runtime to locate the patchpoint and
+find the location of ``live values`` at that point.
+
+Motivation
+==========
+
+This functionality is currently experimental but is potentially useful
+in a variety of settings, the most obvious being a runtime (JIT)
+compiler. Example applications of the patchpoint intrinsics are
+implementing an inline call cache for polymorphic method dispatch or
+optimizing the retrieval of properties in dynamically typed languages
+such as JavaScript.
+
+The intrinsics documented here are currently used by the JavaScript
+compiler within the open source WebKit project, see the `FTL JIT
+<https://trac.webkit.org/wiki/FTLJIT>`_, but they are designed to be
+used whenever stack maps or code patching are needed. Because the
+intrinsics have experimental status, compatibility across LLVM
+releases is not guaranteed.
+
+The stack map functionality described in this document is separate
+from the functionality described in
+:ref:`stack-map`. `GCFunctionMetadata` provides the location of
+pointers into a collected heap captured by the `GCRoot` intrinsic,
+which can also be considered a "stack map". Unlike the stack maps
+defined above, the `GCFunctionMetadata` stack map interface does not
+provide a way to associate live register values of arbitrary type with
+an instruction address, nor does it specify a format for the resulting
+stack map. The stack maps described here could potentially provide
+richer information to a garbage collecting runtime, but that usage
+will not be discussed in this document.
+
+Intrinsics
+==========
+
+The following two kinds of intrinsics can be used to implement stack
+maps and patch points: ``llvm.experimental.stackmap`` and
+``llvm.experimental.patchpoint``. Both kinds of intrinsics generate a
+stack map record, and they both allow some form of code patching. They
+can be used independently (i.e. ``llvm.experimental.patchpoint``
+implicitly generates a stack map without the need for an additional
+call to ``llvm.experimental.stackmap``). The choice of which to use
+depends on whether it is necessary to reserve space for code patching
+and whether any of the intrinsic arguments should be lowered according
+to calling conventions. ``llvm.experimental.stackmap`` does not
+reserve any space, nor does it expect any call arguments. If the
+runtime patches code at the stack map's address, it will destructively
+overwrite the program text. This is unlike
+``llvm.experimental.patchpoint``, which reserves space for in-place
+patching without overwriting surrounding code. The
+``llvm.experimental.patchpoint`` intrinsic also lowers a specified
+number of arguments according to its calling convention. This allows
+patched code to make in-place function calls without marshaling.
+
+Each instance of one of these intrinsics generates a stack map record
+in the :ref:`stackmap-section`. The record includes an ID, allowing
+the runtime to uniquely identify the stack map, and the offset within
+the code from the beginning of the enclosing function.
+
+'``llvm.experimental.stackmap``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare void
+        @llvm.experimental.stackmap(i64 <id>, i32 <numShadowBytes>, ...)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.stackmap``' intrinsic records the location of
+specified values in the stack map without generating any code.
+
+Operands:
+"""""""""
+
+The first operand is an ID to be encoded within the stack map. The
+second operand is the number of shadow bytes following the
+intrinsic. The variable number of operands that follow are the ``live
+values`` for which locations will be recorded in the stack map.
+
+To use this intrinsic as a bare-bones stack map, with no code patching
+support, the number of shadow bytes can be set to zero.
+
+Semantics:
+""""""""""
+
+The stack map intrinsic generates no code in place, unless nops are
+needed to cover its shadow (see below). However, its offset from
+function entry is stored in the stack map. This is the relative
+instruction address immediately following the instructions that
+precede the stack map.
+
+The stack map ID allows a runtime to locate the desired stack map
+record. LLVM passes this ID through directly to the stack map
+record without checking uniqueness.
+
+LLVM guarantees a shadow of instructions following the stack map's
+instruction offset during which neither the end of the basic block nor
+another call to ``llvm.experimental.stackmap`` or
+``llvm.experimental.patchpoint`` may occur. This allows the runtime to
+patch the code at this point in response to an event triggered from
+outside the code. The code for instructions following the stack map
+may be emitted in the stack map's shadow, and these instructions may
+be overwritten by destructive patching. Without shadow bytes, this
+destructive patching could overwrite program text or data outside the
+current function. We disallow overlapping stack map shadows so that
+the runtime does not need to consider this corner case.
+
+For example, a stack map with 8 byte shadow:
+
+.. code-block:: llvm
+
+  call void @runtime()
+  call void (i64, i32, ...) @llvm.experimental.stackmap(i64 77, i32 8,
+                                                        i64* %ptr)
+  %val = load i64, i64* %ptr
+  %add = add i64 %val, 3
+  ret i64 %add
+
+May require one byte of nop-padding:
+
+.. code-block:: none
+
+  0x00 callq _runtime
+  0x05 nop                <--- stack map address
+  0x06 movq (%rdi), %rax
+  0x07 addq $3, %rax
+  0x0a popq %rdx
+  0x0b ret                <---- end of 8-byte shadow
+
+Now, if the runtime needs to invalidate the compiled code, it may
+patch 8 bytes of code at the stack map's address as follows:
+
+.. code-block:: none
+
+  0x00 callq _runtime
+  0x05 movl  $0xffff, %rax <--- patched code at stack map address
+  0x0a callq *%rax         <---- end of 8-byte shadow
+
+This way, after the normal call to the runtime returns, the code will
+execute a patched call to a special entry point that can rebuild a
+stack frame from the values located by the stack map.
+
+'``llvm.experimental.patchpoint.*``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare void
+        @llvm.experimental.patchpoint.void(i64 <id>, i32 <numBytes>,
+                                           i8* <target>, i32 <numArgs>, ...)
+      declare i64
+        @llvm.experimental.patchpoint.i64(i64 <id>, i32 <numBytes>,
+                                          i8* <target>, i32 <numArgs>, ...)
+
+Overview:
+"""""""""
+
+The '``llvm.experimental.patchpoint.*``' intrinsics create a function
+call to the specified ``<target>`` and record the location of specified
+values in the stack map.
+
+Operands:
+"""""""""
+
+The first operand is an ID, the second operand is the number of bytes
+reserved for the patchable region, the third operand is the target
+address of a function (optionally null), and the fourth operand
+specifies how many of the following variable operands are considered
+function call arguments. The remaining variable number of operands are
+the ``live values`` for which locations will be recorded in the stack
+map.
+
+Semantics:
+""""""""""
+
+The patch point intrinsic generates a stack map. It also emits a
+function call to the address specified by ``<target>`` if the address
+is not a constant null. The function call and its arguments are
+lowered according to the calling convention specified at the
+intrinsic's callsite. Variants of the intrinsic with non-void return
+type also return a value according to calling convention.
+
+On PowerPC, note that ``<target>`` must be the ABI function pointer for the
+intended target of the indirect call. Specifically, when compiling for the
+ELF V1 ABI, ``<target>`` is the function-descriptor address normally used as
+the C/C++ function-pointer representation.
+
+Requesting zero patch point arguments is valid. In this case, all
+variable operands are handled just like
+``llvm.experimental.stackmap.*``. The difference is that space will
+still be reserved for patching, a call will be emitted, and a return
+value is allowed.
+
+The locations of the arguments are not normally recorded in the stack
+map because they are already fixed by the calling convention. The
+remaining ``live values`` will have their location recorded, which
+could be a register, stack location, or constant. A special calling
+convention has been introduced for use with stack maps, anyregcc,
+which forces the arguments to be loaded into registers but allows
+those registers to be dynamically allocated. These argument registers
+will have their register locations recorded in the stack map in
+addition to the remaining ``live values``.
+
+The patch point also emits nops to cover at least ``<numBytes>`` of
+instruction encoding space. Hence, the client must ensure that
+``<numBytes>`` is enough to encode a call to the target address on the
+supported targets. If the call target is constant null, then there is
+no minimum requirement. A zero-byte null target patchpoint is
+valid.
+
+The runtime may patch the code emitted for the patch point, including
+the call sequence and nops. However, the runtime may not assume
+anything about the code LLVM emits within the reserved space. Partial
+patching is not allowed. The runtime must patch all reserved bytes,
+padding with nops if necessary.
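+
+The "patch all reserved bytes" rule can be sketched from the runtime's side as
+follows. This is only an illustration in Python, assuming an x86 target where
+``0x90`` is the single-byte nop; ``pad_patch`` is a hypothetical helper, not an
+LLVM API.
+

```python
X86_NOP = b"\x90"  # single-byte x86 nop; other targets need their own encoding


def pad_patch(code: bytes, num_bytes: int) -> bytes:
    """Return exactly num_bytes of patch text: code followed by nop padding."""
    if len(code) > num_bytes:
        raise ValueError(
            f"patch needs {len(code)} bytes but only {num_bytes} are reserved")
    return code + X86_NOP * (num_bytes - len(code))


# A one-byte int3 (0xcc) stands in for real patched code in a 15-byte region.
patch = pad_patch(b"\xcc", 15)
print(len(patch))
```

+
+Rejecting oversized patches up front keeps the runtime from writing past the
+reserved region into code that LLVM did not promise to leave patchable.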
+
+This example shows a patch point reserving 15 bytes, with one argument
+in $rdi, and a return value in $rax per native calling convention:
+
+.. code-block:: llvm
+
+  %target = inttoptr i64 -281474976710654 to i8*
+  %val = call i64 (i64, i32, i8*, i32, ...)
+           @llvm.experimental.patchpoint.i64(i64 78, i32 15,
+                                             i8* %target, i32 1, i64* %ptr)
+  %add = add i64 %val, 3
+  ret i64 %add
+
+May generate:
+
+.. code-block:: none
+
+  0x00 movabsq $0xffff000000000002, %r11 <--- patch point address
+  0x0a callq   *%r11
+  0x0d nop
+  0x0e nop                               <--- end of reserved 15-bytes
+  0x0f addq    $0x3, %rax
+  0x10 movl    %rax, 8(%rsp)
+
+Note that no stack map locations will be recorded. If the patched code
+sequence does not need arguments fixed to specific calling convention
+registers, then the ``anyregcc`` convention may be used:
+
+.. code-block:: none
+
+  %val = call anyregcc @llvm.experimental.patchpoint(i64 78, i32 15,
+                                                     i8* %target, i32 1,
+                                                     i64* %ptr)
+
+The stack map now indicates the location of the %ptr argument and
+return value:
+
+.. code-block:: none
+
+  Stack Map: ID=78, Loc0=%r9 Loc1=%r8
+
+The patch code sequence may now use the argument that happened to be
+allocated in %r8 and return a value allocated in %r9:
+
+.. code-block:: none
+
+  0x00 movslq 4(%r8), %r9             <--- patched code at patch point address
+  0x03 nop
+  ...
+  0x0e nop                            <--- end of reserved 15-bytes
+  0x0f addq    $0x3, %r9
+  0x10 movl    %r9, 8(%rsp)
+
+.. _stackmap-format:
+
+Stack Map Format
+================
+
+The existence of a stack map or patch point intrinsic within an LLVM
+Module forces code emission to create a :ref:`stackmap-section`. The
+format of this section follows:
+
+.. code-block:: none
+
+  Header {
+    uint8  : Stack Map Version (current version is 3)
+    uint8  : Reserved (expected to be 0)
+    uint16 : Reserved (expected to be 0)
+  }
+  uint32 : NumFunctions
+  uint32 : NumConstants
+  uint32 : NumRecords
+  StkSizeRecord[NumFunctions] {
+    uint64 : Function Address
+    uint64 : Stack Size
+    uint64 : Record Count
+  }
+  Constants[NumConstants] {
+    uint64 : LargeConstant
+  }
+  StkMapRecord[NumRecords] {
+    uint64 : PatchPoint ID
+    uint32 : Instruction Offset
+    uint16 : Reserved (record flags)
+    uint16 : NumLocations
+    Location[NumLocations] {
+      uint8  : Register | Direct | Indirect | Constant | ConstantIndex
+      uint8  : Reserved (expected to be 0)
+      uint16 : Location Size
+      uint16 : Dwarf RegNum
+      uint16 : Reserved (expected to be 0)
+      int32  : Offset or SmallConstant
+    }
+    uint32 : Padding (only if required to align to 8 byte)
+    uint16 : Padding
+    uint16 : NumLiveOuts
+    LiveOuts[NumLiveOuts] {
+      uint16 : Dwarf RegNum
+      uint8  : Reserved
+      uint8  : Size in Bytes
+    }
+    uint32 : Padding (only if required to align to 8 byte)
+  }
+
+The first byte of each location encodes a type that indicates how to
+interpret the ``RegNum`` and ``Offset`` fields as follows:
+
+======== ========== =================== ===========================
+Encoding Type       Value               Description
+-------- ---------- ------------------- ---------------------------
+0x1      Register   Reg                 Value in a register
+0x2      Direct     Reg + Offset        Frame index value
+0x3      Indirect   [Reg + Offset]      Spilled value
+0x4      Constant   Offset              Small constant
+0x5      ConstIndex Constants[Offset]   Large constant
+======== ========== =================== ===========================
+
+In the common case, a value is available in a register, and the
+``Offset`` field will be zero. Values spilled to the stack are encoded
+as ``Indirect`` locations. The runtime must load those values from a
+stack address, typically in the form ``[BP + Offset]``. If an
+``alloca`` value is passed directly to a stack map intrinsic, then
+LLVM may fold the frame index into the stack map as an optimization to
+avoid allocating a register or stack slot. These frame indices will be
+encoded as ``Direct`` locations in the form ``BP + Offset``. LLVM may
+also optimize constants by emitting them directly in the stack map,
+either in the ``Offset`` of a ``Constant`` location or in the constant
+pool, referred to by ``ConstantIndex`` locations.
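+
+To make the layout concrete, here is a minimal parsing sketch in Python. It
+assumes a version 3 section from a little-endian target (for example X86_64)
+whose bytes start at an 8-byte-aligned offset. ``parse_stack_maps`` and the
+dictionary keys are hypothetical names chosen for illustration; they are not
+part of LLVM.
+

```python
import struct

LOCATION_KINDS = {1: "Register", 2: "Direct", 3: "Indirect",
                  4: "Constant", 5: "ConstantIndex"}


def align8(pos):
    """Round pos up to the next multiple of 8."""
    return (pos + 7) & ~7


def parse_stack_maps(data):
    """Parse the raw bytes of a version 3 stack map section."""
    version, _, _ = struct.unpack_from("<BBH", data, 0)
    if version != 3:
        raise ValueError(f"unsupported stack map version {version}")
    num_functions, num_constants, num_records = struct.unpack_from(
        "<III", data, 4)
    pos = 16

    functions = []
    for _ in range(num_functions):
        address, stack_size, record_count = struct.unpack_from(
            "<QQQ", data, pos)
        functions.append({"address": address, "stack_size": stack_size,
                          "record_count": record_count})
        pos += 24

    constants = list(struct.unpack_from(f"<{num_constants}Q", data, pos))
    pos += 8 * num_constants

    records = []
    for _ in range(num_records):
        rec_id, instr_offset, _, num_locations = struct.unpack_from(
            "<QIHH", data, pos)
        pos += 16
        locations = []
        for _ in range(num_locations):
            kind, _, size, reg, _, offset = struct.unpack_from(
                "<BBHHHi", data, pos)
            locations.append({"kind": LOCATION_KINDS[kind], "size": size,
                              "reg": reg, "offset": offset})
            pos += 12
        pos = align8(pos)  # optional uint32 padding after the locations
        _, num_live_outs = struct.unpack_from("<HH", data, pos)
        pos += 4
        live_outs = []
        for _ in range(num_live_outs):
            reg, _, size = struct.unpack_from("<HBB", data, pos)
            live_outs.append({"reg": reg, "size": size})
            pos += 4
        pos = align8(pos)  # optional uint32 padding after the live-outs
        records.append({"id": rec_id, "offset": instr_offset,
                        "locations": locations, "live_outs": live_outs})

    return {"functions": functions, "constants": constants,
            "records": records}
```

+
+A runtime would typically call such a function once on the section contents
+right after compilation and then re-encode the result in its own format.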
+
+At each callsite, a "liveout" register list is also recorded. These
+are the registers that are live across the stackmap and therefore must
+be saved by the runtime. This is an important optimization when the
+patchpoint intrinsic is used with a calling convention that by default
+preserves most registers as callee-save.
+
+Each entry in the liveout register list contains a DWARF register
+number and size in bytes. The stackmap format deliberately omits
+specific subregister information. Instead the runtime must interpret
+this information conservatively. For example, if the stackmap reports
+one byte at ``%rax``, then the value may be in either ``%al`` or
+``%ah``. It doesn't matter in practice, because the runtime will
+simply save ``%rax``. However, if the stackmap reports 16 bytes at
+``%ymm0``, then the runtime can safely optimize by saving only
+``%xmm0``.
+
+The stack map format is a contract between an LLVM SVN revision and
+the runtime. It is currently experimental and may change in the short
+term, but minimizing the need to update the runtime is
+important. Consequently, the stack map design is motivated by
+simplicity and extensibility. Compactness of the representation is
+secondary because the runtime is expected to parse the data
+immediately after compiling a module and encode the information in its
+own format. Since the runtime controls the allocation of sections, it
+can reuse the same stack map space for multiple modules.
+
+Stackmap support is currently only implemented for 64-bit
+platforms. However, a 32-bit implementation should be able to use the
+same format with an insignificant amount of wasted space.
+
+.. _stackmap-section:
+
+Stack Map Section
+^^^^^^^^^^^^^^^^^
+
+A JIT compiler can easily access this section by providing its own
+memory manager via the LLVM C API
+``LLVMCreateSimpleMCJITMemoryManager()``. When creating the memory
+manager, the JIT provides a callback:
+``LLVMMemoryManagerAllocateDataSectionCallback()``. When LLVM creates
+this section, it invokes the callback and passes the section name. The
+JIT can record the in-memory address of the section at this time and
+later parse it to recover the stack map data.
+
+For MachO (e.g. on Darwin), the stack map section name is
+"__llvm_stackmaps". The segment name is "__LLVM_STACKMAPS".
+
+For ELF (e.g. on Linux), the stack map section name is
+".llvm_stackmaps".  The segment name is "__LLVM_STACKMAPS".
+
+Stack Map Usage
+===============
+
+The stack map support described in this document can be used to
+precisely determine the location of values at a specific position in
+the code. LLVM does not maintain any mapping between those values and
+any higher-level entity. The runtime must be able to interpret the
+stack map record given only the ID, offset, and the order of the
+locations, records, and functions, which LLVM preserves.
+
+Note that this is quite different from the goal of debug information,
+which is a best-effort attempt to track the location of named
+variables at every instruction.
+
+An important motivation for this design is to allow a runtime to
+commandeer a stack frame when execution reaches an instruction address
+associated with a stack map. The runtime must be able to rebuild a
+stack frame and resume program execution using the information
+provided by the stack map. For example, execution may resume in an
+interpreter or a recompiled version of the same function.
+
+This usage restricts LLVM optimization. Clearly, LLVM must not move
+stores across a stack map. However, loads must also be handled
+conservatively. If the load may trigger an exception, hoisting it
+above a stack map could be invalid. For example, the runtime may
+determine that a load is safe to execute without a type check given
+the current state of the type system. If the type system changes while
+some activation of the load's function exists on the stack, the load
+becomes unsafe. The runtime can prevent subsequent execution of that
+load by immediately patching any stack map location that lies between
+the current call site and the load (typically, the runtime would
+simply patch all stack map locations to invalidate the function). If
+the compiler had hoisted the load above the stack map, then the
+program could crash before the runtime could take back control.
+
+To enforce these semantics, stackmap and patchpoint intrinsics are
+considered to potentially read and write all memory. This may limit
+optimization more than some clients desire. This limitation may be
+avoided by marking the call site as "readonly". In the future we may
+also allow meta-data to be added to the intrinsic call to express
+aliasing, thereby allowing optimizations to hoist certain loads above
+stack maps.
+
+Direct Stack Map Entries
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+As shown in :ref:`stackmap-section`, a Direct stack map location
+records the address of a frame index. This address is itself the value
+that the runtime requested. This differs from Indirect locations,
+which refer to stack locations from which the requested values must
+be loaded. Direct locations can communicate the address of an alloca,
+while Indirect locations handle register spills.
+
+For example:
+
+.. code-block:: none
+
+  entry:
+    %a = alloca i64...
+    llvm.experimental.stackmap(i64 <ID>, i32 <shadowBytes>, i64* %a)
+
+The runtime can determine this alloca's relative location on the
+stack immediately after compilation, or at any time thereafter. This
+differs from Register and Indirect locations, because the runtime can
+only read the values in those locations when execution reaches the
+instruction address of the stack map.
+
+This functionality requires LLVM to treat entry-block allocas
+specially when they are directly consumed by an intrinsic. (This is
+the same requirement imposed by the llvm.gcroot intrinsic.) LLVM
+transformations must not substitute the alloca with any intervening
+value. This can be verified by the runtime simply by checking that the
+stack map's location is a Direct location type.
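+
+The difference between the location types can be summarized by how a runtime
+might turn one location into a value. The sketch below is illustrative Python
+rather than LLVM code: ``resolve_location``, the ``registers`` mapping (keyed by
+DWARF register number), and the ``load`` callback are all assumptions made for
+the example.
+

```python
def resolve_location(kind, reg, offset, registers, load, constants):
    """Return the value described by one stack map location.

    registers: dict mapping DWARF register number to its current value
    load:      callable reading a value from a stack address
    constants: the large-constant pool of the stack map section
    """
    if kind == "Register":
        return registers[reg]                 # value lives in the register
    if kind == "Direct":
        return registers[reg] + offset        # the address itself is the value
    if kind == "Indirect":
        return load(registers[reg] + offset)  # spilled value, must be loaded
    if kind == "Constant":
        return offset                         # small constant stored inline
    if kind == "ConstantIndex":
        return constants[offset]              # index into the constant pool
    raise ValueError(f"unknown location kind {kind!r}")


# DWARF register 6 is %rbp on x86-64; the memory contents are made up.
registers = {6: 0x7000}
memory = {0x7000 - 8: 42}
print(resolve_location("Direct", 6, -8, registers, memory.get, []))
print(resolve_location("Indirect", 6, -8, registers, memory.get, []))
```

+
+The two calls use the same register and offset: the Direct location yields the
+slot's address, while the Indirect location yields the value stored there.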
+
+
+Supported Architectures
+=======================
+
+Support for StackMap generation and the related intrinsics requires
+some code for each backend.  Today, only a subset of LLVM's backends
+are supported.  The currently supported architectures are X86_64,
+PowerPC, and AArch64.

Added: www-releases/trunk/9.0.0/docs/_sources/StackSafetyAnalysis.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/StackSafetyAnalysis.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/StackSafetyAnalysis.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/StackSafetyAnalysis.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,56 @@
+==================================
+Stack Safety Analysis
+==================================
+
+
+Introduction
+============
+
+The Stack Safety Analysis determines if stack allocated variables can be
+considered 'safe' from memory access bugs.
+
+The primary purpose of the analysis is to be used by sanitizers to avoid
+unnecessary instrumentation of 'safe' variables. SafeStack is going to be the
+first user.
+
+'Safe' variables are those that cannot be used out of scope (e.g.
+use-after-return) or accessed out of bounds. In the future the analysis can be
+extended to track other variable properties. For example, we plan to add a
+check that a variable is always initialized before every read, which would
+allow use-of-uninitialized-memory checks to be optimized.
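+
+As a rough illustration of this definition, consider the three functions below.
+The function names are made up for this sketch, and the classifications in the
+comments are what the definition above suggests, not output captured from the
+analysis:
+
+.. code-block:: llvm
+
+  define i32 @safe_local() {
+  entry:
+    ; Only accessed in bounds and never escapes: a candidate for 'safe'.
+    %x = alloca i32
+    store i32 42, i32* %x
+    %v = load i32, i32* %x
+    ret i32 %v
+  }
+
+  define i32* @escapes_by_return() {
+  entry:
+    ; The address outlives the frame (use-after-return is possible).
+    %y = alloca i32
+    ret i32* %y
+  }
+
+  define void @out_of_bounds() {
+  entry:
+    ; Byte 8 lies outside the 4-byte allocation.
+    %z = alloca [4 x i8]
+    %p = getelementptr [4 x i8], [4 x i8]* %z, i64 0, i64 8
+    store i8 0, i8* %p
+    ret void
+  }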
+
+How it works
+============
+
+The analysis is implemented in two stages:
+
+The intra-procedural, or 'local', stage performs a depth-first search inside
+functions to collect all uses of each alloca, including loads/stores and uses as
+arguments to functions. After this stage we know which parts of the alloca are
+used by the function itself, but we don't know what happens after it is passed
+as an argument to another function.
+
+The inter-procedural, or 'global', stage resolves what happens to allocas after
+they are passed as function arguments. This stage performs a depth-first search
+on function calls inside a single module and propagates alloca usage through
+function calls.
+
+When used with ThinLTO, the global stage performs a whole program analysis over
+the Module Summary Index.
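+
+A small sketch of why the second stage is needed (function names are
+hypothetical): looking at ``@caller`` alone, the local stage only learns that
+``%x`` is passed to ``@callee``. Whether that use stays within the 4 bytes of
+``%x`` can only be decided by also looking at the body of ``@callee``.
+
+.. code-block:: llvm
+
+  define void @callee(i32* %p) {
+  entry:
+    ; Writes exactly 4 bytes at offset 0 of whatever %p points to.
+    store i32 0, i32* %p
+    ret void
+  }
+
+  define void @caller() {
+  entry:
+    %x = alloca i32
+    ; The local stage records this call as a use of %x; the global stage
+    ; combines it with what is known about @callee's parameter.
+    call void @callee(i32* %x)
+    ret void
+  }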
+
+Testing
+=======
+
+The analysis is covered with lit tests.
+
+We expect that users can tolerate false classification of variables as
+'unsafe' when in fact they are 'safe'. This may lead to inefficient code.
+However, we can't accept a false 'safe' classification, which may cause
+sanitizers to miss actual bugs in instrumented code. To avoid that, we want an
+additional validation tool.
+
+AddressSanitizer may help with this validation. We can instrument all variables
+as usual but additionally store stack-safe information in the
+``ASanStackVariableDescription``. Then, if AddressSanitizer detects a bug on
+a 'safe' variable, we can produce an additional report to let the user know that
+Stack Safety Analysis probably failed and we should check for a bug in the
+compiler.

Added: www-releases/trunk/9.0.0/docs/_sources/Statepoints.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Statepoints.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Statepoints.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Statepoints.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,983 @@
+=====================================
+Garbage Collection Safepoints in LLVM
+=====================================
+
+.. contents::
+   :local:
+   :depth: 2
+
+Status
+=======
+
+This document describes a set of extensions to LLVM to support garbage
+collection.  By now, these mechanisms are well proven: a commercial Java
+implementation with a fully relocating collector has shipped using them.
+There are a couple of places where bugs might still linger; these are called
+out below.
+
+They are still listed as "experimental" to indicate that no forward or backward
+compatibility guarantees are offered across versions.  If your use case is such 
+that you need some form of forward compatibility guarantee, please raise the 
+issue on the llvm-dev mailing list.  
+
+LLVM still supports an alternate mechanism for conservative garbage collection 
+support using the ``gcroot`` intrinsic.  The ``gcroot`` mechanism is mostly of
+historical interest at this point with one exception - its implementation of
+shadow stacks has been used successfully by a number of language frontends and
+is still supported.  
+
+Overview & Core Concepts
+========================
+
+To collect dead objects, garbage collectors must be able to identify
+any references to objects contained within executing code, and,
+depending on the collector, potentially update them.  The collector
+does not need this information at all points in code - that would make
+the problem much harder - but only at well-defined points in the
+execution known as 'safepoints'.  For most collectors, it is sufficient
+to track at least one copy of each unique pointer value.  However, for
+a collector which wishes to relocate objects directly reachable from
+running code, a higher standard is required.
+
+One additional challenge is that the compiler may compute intermediate
+results ("derived pointers") which point outside of the allocation or
+even into the middle of another allocation.  The eventual use of this
+intermediate value must yield an address within the bounds of the
+allocation, but such "exterior derived pointers" may be visible to the
+collector.  Given this, a garbage collector can not safely rely on the
+runtime value of an address to indicate the object it is associated
+with.  If the garbage collector wishes to move any object, the
+compiler must provide a mapping, for each pointer, to an indication of
+its allocation.
+
+To simplify the interaction between a collector and the compiled code,
+most garbage collectors are organized in terms of three abstractions:
+load barriers, store barriers, and safepoints.
+
+#. A load barrier is a bit of code executed immediately after the
+   machine load instruction, but before any use of the value loaded.
+   Depending on the collector, such a barrier may be needed for all
+   loads, merely loads of a particular type (in the original source
+   language), or none at all.
+
+#. Analogously, a store barrier is a code fragment that runs
+   immediately before the machine store instruction, but after the
+   computation of the value stored.  The most common use of a store
+   barrier is to update a 'card table' in a generational garbage
+   collector.
+
+#. A safepoint is a location at which pointers visible to the compiled
+   code (i.e. currently in registers or on the stack) are allowed to
+   change.  After the safepoint completes, the actual pointer value
+   may differ, but the 'object' (as seen by the source language)
+   pointed to will not.
+
+  Note that the term 'safepoint' is somewhat overloaded.  It refers to
+  both the location at which the machine state is parsable and the
+  coordination protocol involved in bringing application threads to a
+  point at which the collector can safely use that information.  The
+  term "statepoint" as used in this document refers exclusively to the
+  former.
+
+This document focuses on the last item - compiler support for
+safepoints in generated code.  We will assume that an outside
+mechanism has decided where to place safepoints.  From our
+perspective, all safepoints will be function calls.  To support
+relocation of objects directly reachable from values in compiled code,
+the collector must be able to:
+
+#. identify every copy of a pointer (including copies introduced by
+   the compiler itself) at the safepoint,
+#. identify which object each pointer relates to, and
+#. potentially update each of those copies.
+
+This document describes the mechanism by which an LLVM based compiler
+can provide this information to a language runtime/collector, and
+ensure that all pointers can be read and updated if desired.
+
+Abstract Machine Model
+^^^^^^^^^^^^^^^^^^^^^^^
+
+At a high level, LLVM has been extended to support compiling to an abstract
+machine which extends the actual target with a non-integral pointer type
+suitable for representing a garbage collected reference to an object.  In
+particular, such non-integral pointer types have no defined mapping to an
+integer representation.  This semantic quirk allows the runtime to pick an
+integer mapping for each point in the program, allowing relocations of objects
+without visible effects.
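+
+As a sketch of how a frontend might opt in to this model, the module below
+marks address space 1 as non-integral in the data layout and uses it for
+garbage collected references.  The choice of address space 1 and the minimal
+``"e-ni:1"`` layout string are assumptions made for illustration; a real
+frontend would use its target's full data layout.
+
+.. code-block:: llvm
+
+  ; "ni:1" declares pointers in address space 1 to be non-integral.
+  target datalayout = "e-ni:1"
+
+  declare void @foo()
+
+  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
+         gc "statepoint-example" {
+    ; No explicit relocation is written here; in the abstract model the
+    ; reference simply remains valid across the call.
+    call void @foo()
+    ret i8 addrspace(1)* %obj
+  }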
+
+This high level abstract machine model is used for most of the optimizer.  As
+a result, transform passes do not need to be extended to look through explicit
+relocation sequences.  Before starting code generation, we switch
+representations to an explicit form.  The exact location chosen for lowering
+is an implementation detail.
+
+Note that most of the value of the abstract machine model is for collectors
+which need to model potentially relocatable objects.  For a compiler which
+supports only a non-relocating collector, you may wish to consider starting
+with the fully explicit form.  
+
+Warning: There is one currently known semantic hole in the definition of
+non-integral pointers which has not been addressed upstream.  To work around
+this, you need to disable speculation of loads unless the memory type
+(non-integral pointer vs anything else) is known to be unchanged.  That is, it
+is not safe to speculate a load if doing so causes a non-integral pointer value
+to be loaded as any other type or vice versa.  In practice, this restriction is
+well isolated to isSafeToSpeculate in ValueTracking.cpp.
+
+Explicit Representation
+^^^^^^^^^^^^^^^^^^^^^^^
+
+A frontend could directly generate this low level explicit form, but 
+doing so may inhibit optimization.  Instead, it is recommended that
+compilers with relocating collectors target the abstract machine model just
+described.  
+
+The heart of the explicit approach is to construct (or rewrite) the IR in a 
+manner where the possible updates performed by the garbage collector are
+explicitly visible in the IR.  Doing so requires that we:
+
+#. create a new SSA value for each potentially relocated pointer, and
+   ensure that no use of the original (non relocated) value is
+   reachable after the safepoint,
+#. specify the relocation in a way which is opaque to the compiler to
+   ensure that the optimizer can not introduce new uses of an
+   unrelocated value after a statepoint (this prevents the optimizer
+   from performing unsound optimizations), and
+#. record a mapping of live pointers (and the allocation they're
+   associated with) for each statepoint.
+
+At the most abstract level, inserting a safepoint can be thought of as
+replacing a call instruction with a call to a multiple return value
+function which calls the original target of the call, returns
+its result, and returns updated values for any live pointers to
+garbage collected objects.
+
+  Note that the task of identifying all live pointers to garbage
+  collected values, transforming the IR to expose a pointer giving the
+  base object for every such live pointer, and inserting all the
+  intrinsics correctly is explicitly out of scope for this document.
+  The recommended approach is to use the :ref:`utility passes 
+  <statepoint-utilities>` described below. 
+
+This abstract function call is concretely represented by a sequence of
+intrinsic calls known collectively as a "statepoint relocation sequence".
+
+Let's consider a simple call in LLVM IR:
+
+.. code-block:: llvm
+
+  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
+         gc "statepoint-example" {
+    call void ()* @foo()
+    ret i8 addrspace(1)* %obj
+  }
+
+Depending on our language we may need to allow a safepoint during the execution 
+of ``foo``. If so, we need to let the collector update local values in the 
+current frame.  If we don't, we'll be accessing a potentially invalid reference
+once we eventually return from the call.
+
+In this example, we need to relocate the SSA value ``%obj``.  Since we can't 
+actually change the value in the SSA value ``%obj``, we need to introduce a new 
+SSA value ``%obj.relocated`` which represents the potentially changed value of
+``%obj`` after the safepoint and update any following uses appropriately.  The 
+resulting relocation sequence is:
+
+.. code-block:: llvm
+
+  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
+         gc "statepoint-example" {
+    %0 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
+    %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %0, i32 7, i32 7)
+    ret i8 addrspace(1)* %obj.relocated
+  }
+
+Ideally, this sequence would have been represented as an M argument, N
+return value function (where M is the number of values being
+relocated + the original call arguments and N is the original return
+value + each relocated value), but LLVM does not easily support such a
+representation.
+
+Instead, the statepoint intrinsic marks the actual site of the
+safepoint or statepoint.  The statepoint returns a token value (which
+exists only at compile time).  To get back the original return value
+of the call, we use the ``gc.result`` intrinsic.  To get the relocation
+of each pointer in turn, we use the ``gc.relocate`` intrinsic with the
+appropriate index.  Note that both the ``gc.relocate`` and ``gc.result`` are
+tied to the statepoint.  The combination forms a "statepoint relocation 
+sequence" and represents the entirety of a parseable call or 'statepoint'.
+
+When lowered, this example would generate the following x86 assembly:
+
+.. code-block:: gas
+  
+	  .globl	test1
+	  .align	16, 0x90
+	  pushq	%rax
+	  callq	foo
+  .Ltmp1:
+	  movq	(%rsp), %rax  # This load is redundant (oops!)
+	  popq	%rdx
+	  retq
+
+Each of the potentially relocated values has been spilled to the
+stack, and that location has been recorded in the
+:ref:`Stack Map section <stackmap-section>`.  If the garbage collector
+needs to update any of these pointers during the call, it knows
+exactly what to change.
+
+The relevant parts of the StackMap section for our example are:
+
+.. code-block:: gas
+  
+  # This describes the call site
+  # Stack Maps: callsite 2882400000
+	  .quad	2882400000
+	  .long	.Ltmp1-test1
+	  .short	0
+  # .. 8 entries skipped ..
+  # This entry describes the spill slot which is directly addressable
+  # off RSP with offset 0.  Given the value was spilled with a pushq, 
+  # that makes sense.
+  # Stack Maps:   Loc 8: Direct RSP     [encoding: .byte 2, .byte 8, .short 7, .int 0]
+	  .byte	2
+	  .byte	8
+	  .short	7
+	  .long	0
+
+This example was taken from the tests for the :ref:`RewriteStatepointsForGC`
+utility pass.  As such, its full StackMap can be easily examined with the
+following command.
+
+.. code-block:: bash
+
+  opt -rewrite-statepoints-for-gc test/Transforms/RewriteStatepointsForGC/basics.ll -S | llc -debug-only=stackmaps
+
+Simplifications for Non-Relocating GCs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Some of the complexity in the previous example is unnecessary for a
+non-relocating collector.  While a non-relocating collector still needs the
+information about which locations contain live references, it doesn't need to
+represent explicit relocations.  As such, the previously described explicit
+lowering can be simplified to remove all of the ``gc.relocate`` intrinsic
+calls and leave uses in terms of the original reference value.  
+
+Here's the explicit lowering for the previous example for a non-relocating
+collector:
+
+.. code-block:: llvm
+
+  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
+         gc "statepoint-example" {
+    call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
+    ret i8 addrspace(1)* %obj
+  }
+
+Recording On Stack Regions
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In addition to the explicit relocation form previously described, the
+statepoint infrastructure also allows the listing of allocas within the gc
+pointer list.  Allocas can be listed with or without additional explicit gc
+pointer values and relocations.
+
+An alloca in the gc region of the statepoint operand list will cause the
+address of the stack region to be listed in the stackmap for the statepoint.
+
+This mechanism can be used to describe explicit spill slots if desired.  It
+then becomes the generator's responsibility to ensure that values are
+spilled/filled to/from the alloca as needed on either side of the safepoint.
+Note that there is no way to indicate a corresponding base pointer for such
+an explicitly specified spill slot, so usage is restricted to values for
+which the associated collector can derive the object base from the pointer
+itself.
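+
+A minimal sketch of such an explicit spill slot follows.  The value names are
+illustrative, the call is written with the function-type form of the call
+syntax, and the frontend (not LLVM) is assumed to be responsible for the
+store before and the load after the statepoint:
+
+.. code-block:: llvm
+
+  declare void @foo()
+  declare token @llvm.experimental.gc.statepoint.p0f_isVoidf(i64, i32, void ()*, i32, i32, ...)
+
+  define i8 addrspace(1)* @test_alloca(i8 addrspace(1)* %obj)
+         gc "statepoint-example" {
+    %slot = alloca i8 addrspace(1)*
+    ; Spill the reference before the safepoint.
+    store i8 addrspace(1)* %obj, i8 addrspace(1)** %slot
+    ; The alloca itself is listed in the gc section of the operand list.
+    %token = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)** %slot)
+    ; Fill the (possibly updated) reference after the safepoint.
+    %obj.reloaded = load i8 addrspace(1)*, i8 addrspace(1)** %slot
+    ret i8 addrspace(1)* %obj.reloaded
+  }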
+
+This mechanism can be used to describe on stack objects containing
+references provided that the collector can map from the location on the
+stack to a heap map describing the internal layout of the references the
+collector needs to process.
+
+WARNING: At the moment, this alternate form is not well exercised.  It is
+recommended to use this with caution and expect to have to fix a few bugs.
+In particular, the RewriteStatepointsForGC utility pass does not do
+anything for allocas today.
+  
+Base & Derived Pointers
+^^^^^^^^^^^^^^^^^^^^^^^
+
+A "base pointer" is one which points to the starting address of an allocation
+(object).  A "derived pointer" is one which is offset from a base pointer by
+some amount.  When relocating objects, a garbage collector needs to be able 
+to relocate each derived pointer associated with an allocation to the same 
+offset from the new address.
+
+"Interior derived pointers" remain within the bounds of the allocation 
+they're associated with.  As a result, the base object can be found at 
+runtime provided the bounds of allocations are known to the runtime system.
+
+"Exterior derived pointers" are outside the bounds of the associated object;
+they may even fall within *another* allocation's address range.  As a result,
+there is no way for a garbage collector to determine which allocation they 
+are associated with at runtime and compiler support is needed.
+
+The ``gc.relocate`` intrinsic supports an explicit operand for describing the
+allocation associated with a derived pointer.  This operand is frequently
+referred to as the base operand.  Strictly speaking it does not have to be
+a base pointer, but it does need to lie within the bounds of the associated
+allocation.  Some collectors may require that the operand be an actual base
+pointer rather than merely an interior derived pointer.  Note that during
+lowering both the base and derived pointer operands are required to be live
+over the associated call safepoint even if the base is otherwise unused
+afterwards.
+
+If we extend our previous example to include a pointless derived pointer, 
+we get:
+
+.. code-block:: llvm
+
+  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
+         gc "statepoint-example" {
+    %gep = getelementptr i8, i8 addrspace(1)* %obj, i64 20000
+    %token = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj, i8 addrspace(1)* %gep)
+    %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %token, i32 7, i32 7)
+    %gep.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %token, i32 7, i32 8)
+    %p = getelementptr i8, i8 addrspace(1)* %gep.relocated, i64 -20000
+    ret i8 addrspace(1)* %p
+  }
+
+Note that in this example %p and %obj.relocated are the same address and we
+could replace one with the other, potentially removing the derived pointer
+from the live set at the safepoint entirely.
+
+.. _gc_transition_args:
+
+GC Transitions
+^^^^^^^^^^^^^^^^^^
+
+As a practical consideration, many garbage-collected systems allow code that is
+collector-aware ("managed code") to call code that is not collector-aware
+("unmanaged code"). It is common that such calls must also be safepoints, since
+it is desirable to allow the collector to run during the execution of
+unmanaged code. Furthermore, it is common that coordinating the transition from
+managed to unmanaged code requires extra code generation at the call site to
+inform the collector of the transition. In order to support these needs, a
+statepoint may be marked as a GC transition, and data that is necessary to
+perform the transition (if any) may be provided as additional arguments to the
+statepoint.
+
+  Note that although in many cases statepoints may be inferred to be GC
+  transitions based on the function symbols involved (e.g. a call from a
+  function with GC strategy "foo" to a function with GC strategy "bar"),
+  indirect calls that are also GC transitions must also be supported. This
+  requirement is the driving force behind the decision to require that GC
+  transitions are explicitly marked.
+
+Let's revisit the sample given above, this time treating the call to ``@foo``
+as a GC transition. Depending on our target, the transition code may need to
+access some extra state in order to inform the collector of the transition.
+Let's assume a hypothetical GC--somewhat unimaginatively named "hypothetical-gc"
+--that requires that a TLS variable must be written to before and after a call
+to unmanaged code. The resulting relocation sequence is:
+
+.. code-block:: llvm
+
+  @Flag = thread_local global i32 0, align 4
+
+  define i8 addrspace(1)* @test1(i8 addrspace(1) *%obj)
+         gc "hypothetical-gc" {
+
+    %0 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 1, i32 1, i32* @Flag, i32 0, i8 addrspace(1)* %obj)
+    %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %0, i32 8, i32 8)
+    ret i8 addrspace(1)* %obj.relocated
+  }
+
+During lowering, this will result in an instruction selection DAG that looks
+something like:
+
+::
+
+  CALLSEQ_START
+  ...
+  GC_TRANSITION_START (lowered i32 *@Flag), SRCVALUE i32* Flag
+  STATEPOINT
+  GC_TRANSITION_END (lowered i32 *@Flag), SRCVALUE i32 *Flag
+  ...
+  CALLSEQ_END
+
+In order to generate the necessary transition code, the backend for each target
+supported by "hypothetical-gc" must be modified to lower ``GC_TRANSITION_START``
+and ``GC_TRANSITION_END`` nodes appropriately when the "hypothetical-gc"
+strategy is in use for a particular function. Assuming that such lowering has
+been added for X86, the generated assembly would be:
+
+.. code-block:: gas
+
+	  .globl	test1
+	  .align	16, 0x90
+	  pushq	%rax
+	  movl $1, %fs:Flag@TPOFF
+	  callq	foo
+	  movl $0, %fs:Flag@TPOFF
+  .Ltmp1:
+	  movq	(%rsp), %rax  # This load is redundant (oops!)
+	  popq	%rdx
+	  retq
+
+Note that the design as presented above is not fully implemented: in particular,
+strategy-specific lowering is not present, and all GC transitions are emitted as
+a single no-op before and after the call instruction. These no-ops are often
+removed by the backend during dead machine instruction elimination.
+
+
+Intrinsics
+===========
+
+'llvm.experimental.gc.statepoint' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare token
+        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
+                       func_type <target>, 
+                       i64 <#call args>, i64 <flags>,
+                       ... (call parameters),
+                       i64 <# transition args>, ... (transition parameters),
+                       i64 <# deopt args>, ... (deopt parameters),
+                       ... (gc parameters))
+
+Overview:
+"""""""""
+
+The statepoint intrinsic represents a call which is parse-able by the
+runtime.
+
+Operands:
+"""""""""
+
+The 'id' operand is a constant integer that is reported as the ID
+field in the generated stackmap.  LLVM does not interpret this
+parameter in any way and its meaning is up to the statepoint user to
+decide.  Note that LLVM is free to duplicate code containing
+statepoint calls, and this may transform IR that had a unique 'id' per
+lexical call to statepoint to IR that does not.
+
+If 'num patch bytes' is non-zero then the call instruction
+corresponding to the statepoint is not emitted and LLVM emits 'num
+patch bytes' bytes of nops in its place.  LLVM will emit code to
+prepare the function arguments and retrieve the function return value
+in accordance with the calling convention; the former before the nop
+sequence and the latter after the nop sequence.  It is expected that
+the user will patch over the 'num patch bytes' bytes of nops with a
+calling sequence specific to their runtime before executing the
+generated machine code.  There are no guarantees with respect to the
+alignment of the nop sequence.  Unlike :doc:`StackMaps`, statepoints do
+not have a concept of shadow bytes.  Note that semantically the
+statepoint still represents a call or invoke to 'target', and the nop
+sequence after patching is expected to represent an operation
+equivalent to a call or invoke to 'target'.
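+
+For illustration, the call below asks for 16 bytes of nops in place of the call
+to ``@foo``.  The value 16 is arbitrary; how many bytes a runtime needs depends
+on the calling sequence it intends to patch in.
+
+.. code-block:: llvm
+
+  ; 'num patch bytes' is the second operand (i32 16 here).
+  %token = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 16, void ()* @foo, i32 0, i32 0, i32 0, i32 0)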
+
+The 'target' operand is the function actually being called.  The
+target can be specified as either a symbolic LLVM function, or as an
+arbitrary Value of appropriate function type.  Note that the function
+type must match the signature of the callee and the types of the 'call
+parameters' arguments.
+
+The '#call args' operand is the number of arguments to the actual
+call.  It must exactly match the number of arguments passed in the
+'call parameters' variable length section.
+
+The 'flags' operand is used to specify extra information about the
+statepoint. This is currently only used to mark certain statepoints
+as GC transitions. This operand is a 64-bit integer with the following
+layout, where bit 0 is the least significant bit:
+
+  +-------+---------------------------------------------------+
+  | Bit # | Usage                                             |
+  +=======+===================================================+
+  |     0 | Set if the statepoint is a GC transition, cleared |
+  |       | otherwise.                                        |
+  +-------+---------------------------------------------------+
+  |  1-63 | Reserved for future use; must be cleared.         |
+  +-------+---------------------------------------------------+
+
+The 'call parameters' arguments are simply the arguments which need to
+be passed to the call target.  They will be lowered according to the
+specified calling convention and otherwise handled like a normal call
+instruction.  The number of arguments must exactly match what is
+specified in '# call args'.  The types must match the signature of
+'target'.
+
+The 'transition parameters' arguments contain an arbitrary list of
+Values which need to be passed to GC transition code. They will be
+lowered and passed as operands to the appropriate GC_TRANSITION nodes
+in the selection DAG. It is assumed that these arguments must be
+available before and after (but not necessarily during) the execution
+of the callee. The '# transition args' field indicates how many operands
+are to be interpreted as 'transition parameters'.
+
+The 'deopt parameters' arguments contain an arbitrary list of Values
+which is meaningful to the runtime.  The runtime may read any of these
+values, but is assumed not to modify them.  If the garbage collector
+might need to modify one of these values, it must also be listed in
+the 'gc parameters' argument list.  The '# deopt args' field indicates
+how many operands are to be interpreted as 'deopt parameters'.
+
+The 'gc parameters' arguments contain every pointer to a garbage
+collected object which potentially needs to be updated by the garbage
+collector.  Note that the argument list must explicitly contain a base
+pointer for every derived pointer listed.  The order of arguments is
+unimportant.  Unlike the other variable length parameter sets, this
+list is not length prefixed.
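+
+To make the operand layout concrete, here is a sketch of a statepoint with no
+call or transition parameters, two deopt parameters, and one gc parameter.
+The values ``%a`` and ``%b`` and the function name are placeholders.  Operand
+indices count from zero, which is how the ``gc.relocate`` indices are derived:
+
+.. code-block:: llvm
+
+  declare void @foo()
+  declare token @llvm.experimental.gc.statepoint.p0f_isVoidf(i64, i32, void ()*, i32, i32, ...)
+  declare i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token, i32, i32)
+
+  define i8 addrspace(1)* @test_deopt(i32 %a, i32 %b, i8 addrspace(1)* %obj)
+         gc "statepoint-example" {
+    ; index: 0 = id, 1 = num patch bytes, 2 = target, 3 = #call args,
+    ;        4 = flags, 5 = # transition args, 6 = # deopt args,
+    ;        7 = %a, 8 = %b (deopt parameters), 9 = %obj (gc parameter)
+    %token = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 2, i32 %a, i32 %b, i8 addrspace(1)* %obj)
+    ; %obj is its own base, so both indices are 9.
+    %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %token, i32 9, i32 9)
+    ret i8 addrspace(1)* %obj.relocated
+  }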
+
+Semantics:
+""""""""""
+
+A statepoint is assumed to read and write all memory.  As a result,
+memory operations can not be reordered past a statepoint.  It is
+illegal to mark a statepoint as being either 'readonly' or 'readnone'.
+
+Note that legal IR can not perform any memory operation on a 'gc
+parameters' argument of the statepoint in a location statically reachable
+from the statepoint.  Instead, the explicitly relocated value (from a
+``gc.relocate``) must be used.
+
+'llvm.experimental.gc.result' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare type*
+        @llvm.experimental.gc.result(token %statepoint_token)
+
+Overview:
+"""""""""
+
+``gc.result`` extracts the result of the original call instruction
+which was replaced by the ``gc.statepoint``.  The ``gc.result``
+intrinsic is actually a family of three intrinsics due to an
+implementation limitation.  Other than the type of the return value,
+the semantics are the same.
+
+Operands:
+"""""""""
+
+The first and only argument is the ``gc.statepoint`` which starts
+the safepoint sequence of which this ``gc.result`` is a part.
+Despite the typing of this as a generic token, *only* the value defined 
+by a ``gc.statepoint`` is legal here.
+
+Semantics:
+""""""""""
+
+The ``gc.result`` represents the return value of the call target of
+the ``statepoint``.  The type of the ``gc.result`` must exactly match
+the return type of the target.  If the call target returns void, there will
+be no ``gc.result``.
+
+A ``gc.result`` is modeled as a 'readnone' pure function.  It has no
+side effects since it is just a projection of the return value of the
+previous call represented by the ``gc.statepoint``.
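+
+A short sketch of a statepoint whose target returns a value.  The callee
+``@bar`` is hypothetical, and the intrinsic name suffixes follow the
+type-mangling pattern used by the other examples in this document:
+
+.. code-block:: llvm
+
+  declare i32 @bar()
+  declare token @llvm.experimental.gc.statepoint.p0f_i32f(i64, i32, i32 ()*, i32, i32, ...)
+  declare i32 @llvm.experimental.gc.result.i32(token)
+
+  define i32 @test_result() gc "statepoint-example" {
+    %token = call token (i64, i32, i32 ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_i32f(i64 0, i32 0, i32 ()* @bar, i32 0, i32 0, i32 0, i32 0)
+    ; Project the original return value of @bar out of the token.
+    %ret = call i32 @llvm.experimental.gc.result.i32(token %token)
+    ret i32 %ret
+  }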
+
+'llvm.experimental.gc.relocate' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare <pointer type>
+        @llvm.experimental.gc.relocate(token %statepoint_token, 
+                                       i32 %base_offset, 
+                                       i32 %pointer_offset)
+
+Overview:
+"""""""""
+
+A ``gc.relocate`` returns the potentially relocated value of a pointer
+at the safepoint.
+
+Operands:
+"""""""""
+
+The first argument is the ``gc.statepoint`` which starts the
+safepoint sequence of which this ``gc.relocate`` is a part.
+Despite the typing of this as a generic token, *only* the value defined 
+by a ``gc.statepoint`` is legal here.
+
+The second argument is an index into the statepoint's list of arguments
+which specifies the allocation for the pointer being relocated.
+This index must land within the 'gc parameter' section of the
+statepoint's argument list.  The associated value must be within the
+object with which the pointer being relocated is associated. The optimizer
+is free to change *which* interior derived pointer is reported, provided that
+it does not replace an actual base pointer with another interior derived 
+pointer.  Collectors are allowed to rely on the base pointer operand 
+remaining an actual base pointer if so constructed.
+
+The third argument is an index into the statepoint's list of arguments
+which specifies the (potentially) derived pointer being relocated.  It
+is legal for this index to be the same as the second argument
+if-and-only-if a base pointer is being relocated. This index must land
+within the 'gc parameter' section of the statepoint's argument list.
+
+Semantics:
+""""""""""
+
+The return value of ``gc.relocate`` is the potentially relocated value
+of the pointer specified by its arguments.  It is unspecified how the
+value of the returned pointer relates to the argument to the
+``gc.statepoint`` other than that a) it points to the same source
+language object with the same offset, and b) the 'based-on'
+relationship of the newly relocated pointers is a projection of the
+unrelocated pointers.  In particular, the integer value of the pointer
+returned is unspecified.
+
+A ``gc.relocate`` is modeled as a ``readnone`` pure function.  It has no
+side effects since it is just a way to extract information about work
+done during the actual call modeled by the ``gc.statepoint``.
+
+.. _statepoint-stackmap-format:
+
+Stack Map Format
+================
+
+Locations for each pointer value which may need to be read and/or updated by
+the runtime or collector are provided in a separate section of the
+generated object file as specified in the PatchPoint documentation.
+This special section is encoded per the
+:ref:`Stack Map format <stackmap-format>`.
+
+The general expectation is that a JIT compiler will parse and discard this
+format; it is not particularly memory efficient.  If you need an alternate
+format (e.g. for an ahead of time compiler), see discussion under
+:ref:`open work items <OpenWork>` below.
+
+Each statepoint generates the following Locations:
+
+* Constant which describes the calling convention of the call target. This
+  constant is a valid :ref:`calling convention identifier <callingconv>` for
+  the version of LLVM used to generate the stackmap. No additional compatibility
+  guarantees are made for this constant over what LLVM provides elsewhere w.r.t.
+  these identifiers.
+* Constant which describes the flags passed to the statepoint intrinsic
+* Constant which describes number of following deopt *Locations* (not
+  operands)
+* Variable number of Locations, one for each deopt parameter listed in
+  the IR statepoint (same number as described by previous Constant).  At 
+  the moment, only deopt parameters with a bitwidth of 64 bits or less 
+  are supported.  Values of a type larger than 64 bits can be specified 
+  and reported only if a) the value is constant at the call site, and b) 
+  the constant can be represented with 64 bits or less (assuming zero 
+  extension to the original bitwidth).
+* Variable number of relocation records, each of which consists of 
+  exactly two Locations.  Relocation records are described in detail
+  below.
+
+Each relocation record provides sufficient information for a collector to 
+relocate one or more derived pointers.  Each record consists of a pair of 
+Locations.  The second element in the record represents the pointer (or 
+pointers) which need to be updated.  The first element in the record provides a 
+pointer to the base of the object with which the pointer(s) being relocated is
+associated.  This information is required for handling generalized derived 
+pointers since a pointer may be outside the bounds of the original allocation,
+but still needs to be relocated with the allocation.  Additionally:
+
+* The base pointer is guaranteed to also appear explicitly as a
+  relocation pair if it is used after the statepoint.
+* There may be fewer relocation records than gc parameters in the IR
+  statepoint. Each *unique* pair will occur at least once; duplicates
+  are possible.
+* The Locations within each record may either be of pointer size or a
+  multiple of pointer size.  In the latter case, the record must be
+  interpreted as describing a sequence of pointers and their corresponding 
+  base pointers. If the Location is of size N x sizeof(pointer), then
+  there will be N records of one pointer each contained within the Location.
+  Both Locations in a pair can be assumed to be of the same size.
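+
+As an illustration of the last point, the following is a minimal sketch of how
+a runtime might split one relocation record into per-pointer pairs.  The
+``ResolvedLocation`` type and the ``expandRelocationRecord`` helper are
+hypothetical names introduced for this example; they are not part of LLVM, and
+the sketch assumes each Location has already been resolved to an address in
+the frame.
+
+.. code-block:: c++
+
+  #include <cassert>
+  #include <cstddef>
+  #include <cstdint>
+  #include <utility>
+  #include <vector>
+
+  // Hypothetical, simplified view of a stack map Location after the runtime
+  // has resolved it to an address in the frame.  Not an LLVM type.
+  struct ResolvedLocation {
+    std::uintptr_t Address;  // address of the first pointer-sized slot
+    std::size_t SizeInBytes; // pointer size, or a multiple of it
+  };
+
+  // One (base slot, derived slot) pair; each slot holds a single pointer.
+  using SlotPair = std::pair<std::uintptr_t, std::uintptr_t>;
+
+  // Splits a record whose Locations are N x sizeof(pointer) bytes wide into
+  // N pairs of single-pointer slots.
+  std::vector<SlotPair> expandRelocationRecord(const ResolvedLocation &Base,
+                                               const ResolvedLocation &Derived,
+                                               std::size_t PointerSize) {
+    assert(Base.SizeInBytes == Derived.SizeInBytes && "pair sizes must match");
+    assert(Base.SizeInBytes % PointerSize == 0 && "size must be N x pointer");
+    std::vector<SlotPair> Pairs;
+    const std::size_t N = Base.SizeInBytes / PointerSize;
+    for (std::size_t I = 0; I != N; ++I)
+      Pairs.emplace_back(Base.Address + I * PointerSize,
+                         Derived.Address + I * PointerSize);
+    return Pairs;
+  }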
+
+Note that the Locations used in each section may describe the same
+physical location.  For example, a stack slot may appear as a deopt location,
+a gc base pointer, and a gc derived pointer.
+
+The LiveOut section of the StkMapRecord will be empty for a statepoint
+record.
+
+Safepoint Semantics & Verification
+==================================
+
+The fundamental correctness property for the compiled code
+w.r.t. the garbage collector is a dynamic one.  It must be
+the case that there is no dynamic trace such that an operation
+involving a potentially relocated pointer is observably-after a
+safepoint which could relocate it.  'observably-after' in this usage
+means that an outside observer could observe this sequence of events
+in a way which precludes the operation being performed before the
+safepoint.
+
+To understand why this 'observably-after' property is required,
+consider a null comparison performed on the original copy of a
+relocated pointer.  Assuming that control flow follows the safepoint,
+there is no way to observe externally whether the null comparison is
+performed before or after the safepoint.  (Remember, the original
+Value is unmodified by the safepoint.)  The compiler is free to make
+either scheduling choice.
+
+The actual correctness property implemented is slightly stronger than
+this.  We require that there be no *static path* on which a
+potentially relocated pointer is 'observably-after' it may have been
+relocated.  This is slightly stronger than is strictly necessary (and
+thus may disallow some otherwise valid programs), but greatly
+simplifies reasoning about correctness of the compiled code.
+
+By construction, this property will be upheld by the optimizer if
+correctly established in the source IR.  This is a key invariant of
+the design.
+
+The existing IR Verifier pass has been extended to check most of the
+local restrictions on the intrinsics mentioned in their respective
+documentation.  The current implementation in LLVM does not check the
+key relocation invariant, but there is ongoing work on developing such
+a verifier.  Please ask on llvm-dev if you're interested in
+experimenting with the current version.
+
+.. _statepoint-utilities:
+
+Utility Passes for Safepoint Insertion
+======================================
+
+.. _RewriteStatepointsForGC:
+
+RewriteStatepointsForGC
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The pass RewriteStatepointsForGC transforms a function's IR to lower from the
+abstract machine model described above to the explicit statepoint model of 
+relocations.  To do this, it replaces all calls or invokes of functions which
+might contain a safepoint poll with a ``gc.statepoint`` and associated full
+relocation sequence, including all required ``gc.relocates``.  
+
+Note that by default, this pass only runs for the "statepoint-example" or 
+"core-clr" gc strategies.  You will need to add your custom strategy to this 
+whitelist or use one of the predefined ones. 
+
+As an example, given this code:
+
+.. code-block:: llvm
+
+  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
+         gc "statepoint-example" {
+    call void @foo()
+    ret i8 addrspace(1)* %obj
+  }
+
+The pass would produce this IR:
+
+.. code-block:: llvm
+
+  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) 
+         gc "statepoint-example" {
+    %0 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
+    %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %0, i32 12, i32 12)
+    ret i8 addrspace(1)* %obj.relocated
+  }
+
+In the above examples, the addrspace(1) marker on the pointers is the mechanism
+that the ``statepoint-example`` GC strategy uses to distinguish references from
+non references.  The pass assumes that all addrspace(1) pointers are non-integral
+pointer types.  Address space 1 is not globally reserved for this purpose.
+
+This pass can be used as a utility function by a language frontend that doesn't
+want to manually reason about liveness, base pointers, or relocation when
+constructing IR.  As currently implemented, RewriteStatepointsForGC must be
+run after SSA construction (i.e. mem2reg).
+
+RewriteStatepointsForGC will ensure that appropriate base pointers are listed
+for every relocation created.  It will do so by duplicating code as needed to
+propagate the base pointer associated with each pointer being relocated to
+the appropriate safepoints.  The implementation assumes that the following 
+IR constructs produce base pointers: loads from the heap, addresses of global 
+variables, function arguments, function return values. Constant pointers (such
+as null) are also assumed to be base pointers.  In practice, this constraint
+can be relaxed to producing interior derived pointers provided the target 
+collector can find the associated allocation from an arbitrary interior 
+derived pointer.
+
+By default RewriteStatepointsForGC passes in ``0xABCDEF00`` as the statepoint
+ID and ``0`` as the number of patchable bytes to the newly constructed
+``gc.statepoint``.  These values can be configured on a per-callsite
+basis using the attributes ``"statepoint-id"`` and
+``"statepoint-num-patch-bytes"``.  If a call site is marked with a
+``"statepoint-id"`` function attribute and its value is a positive
+integer (represented as a string), then that value is used as the ID
+of the newly constructed ``gc.statepoint``.  If a call site is marked
+with a ``"statepoint-num-patch-bytes"`` function attribute and its
+value is a positive integer, then that value is used as the 'num patch
+bytes' parameter of the newly constructed ``gc.statepoint``.  The
+``"statepoint-id"`` and ``"statepoint-num-patch-bytes"`` attributes
+are not propagated to the ``gc.statepoint`` call or invoke if they
+could be successfully parsed.
+
+In practice, RewriteStatepointsForGC should be run much later in the pass 
+pipeline, after most optimization is already done.  This helps to improve 
+the quality of the generated code when compiled with garbage collection support.
+
+.. _PlaceSafepoints:
+
+PlaceSafepoints
+^^^^^^^^^^^^^^^^
+
+The pass PlaceSafepoints inserts safepoint polls sufficient to ensure running 
+code checks for a safepoint request in a timely manner. This pass is expected 
+to be run before RewriteStatepointsForGC and thus does not produce full 
+relocation sequences.  
+
+As an example, given input IR of the following:
+
+.. code-block:: llvm
+
+  define void @test() gc "statepoint-example" {
+    call void @foo()
+    ret void
+  }
+
+  declare void @do_safepoint()
+  define void @gc.safepoint_poll() {
+    call void @do_safepoint()
+    ret void
+  }
+
+
+This pass would produce the following IR:
+
+.. code-block:: llvm
+
+  define void @test() gc "statepoint-example" {
+    call void @do_safepoint()
+    call void @foo()
+    ret void
+  }
+
+In this case, we've added an (unconditional) entry safepoint poll.  Note that 
+despite appearances, the entry poll is not necessarily redundant.  We'd have to 
+know that ``foo`` and ``test`` were not mutually recursive for the poll to be 
+redundant.  In practice, you'd probably want your poll definition to contain 
+a conditional branch of some form.
+
+At the moment, PlaceSafepoints can insert safepoint polls at method entry and
+loop backedge locations.  Extending this to work with return polls would be
+straightforward if desired.
+
+PlaceSafepoints includes a number of optimizations to avoid placing safepoint 
+polls at particular sites unless needed to ensure timely execution of a poll 
+under normal conditions.  PlaceSafepoints does not attempt to ensure timely 
+execution of a poll under worst case conditions such as heavy system paging.
+
+The implementation of a safepoint poll action is specified by looking up a 
+function of the name ``gc.safepoint_poll`` in the containing Module.  The body
+of this function is inserted at each poll site desired.  While calls or invokes
+inside this method are transformed into ``gc.statepoints``, recursive poll 
+insertion is not performed.
+
+This pass is useful for any language frontend which only has to support
+garbage collection semantics at safepoints.  If you need other abstract
+frame information at safepoints (e.g. for deoptimization or introspection),
+you can insert safepoint polls in the frontend.  If you have the latter case,
+please ask on llvm-dev for suggestions.  There's been a good amount of work
+done on making such a scheme work well in practice which is not yet documented
+here.  
+
+
+Supported Architectures
+=======================
+
+Support for statepoint generation requires some code for each backend.
+Today, only X86_64 is supported.
+
+.. _OpenWork:
+
+Limitations and Half Baked Ideas
+================================
+
+Mixing References and Raw Pointers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Languages which allow unmanaged pointers to garbage collected
+objects (e.g. passing a pointer to an object to a C routine) are not yet
+supported in the abstract machine model.  At the moment, the best idea on how to approach this
+involves an intrinsic or opaque function which hides the connection between
+the reference value and the raw pointer.  The problem is that having a
+ptrtoint or inttoptr cast (which is common for such use cases) breaks the
+rules used for inferring base pointers for arbitrary references when
+lowering out of the abstract model to the explicit physical model.  Note
+that a frontend which lowers directly to the physical model doesn't have
+any problems here.
+
+Objects on the Stack
+^^^^^^^^^^^^^^^^^^^^
+
+As noted above, the explicit lowering supports objects allocated on the
+stack provided the collector can find a heap map given the stack address.
+
+The missing pieces are a) integration with rewriting (RS4GC) from the
+abstract machine model and b) support for optionally decomposing on stack
+objects so as not to require heap maps for them.  The latter is required
+for ease of integration with some collectors.  
+
+Lowering Quality and Representation Overhead
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The current statepoint lowering is known to be somewhat poor.  In the very
+long term, we'd like to integrate statepoints with the register allocator;
+in the near term this is unlikely to happen.  We've found the quality of
+lowering to be relatively unimportant as hot-statepoints are almost always
+inliner bugs.
+
+Concerns have been raised that the statepoint representation results in a
+large amount of IR being produced for some examples and that this
+contributes to higher than expected memory usage and compile times.  There are
+no immediate plans to make changes due to this, but alternate models may be
+explored in the future.
+
+Relocations Along Exceptional Edges
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Relocations along exceptional paths are currently broken in ToT.  In
+particular, there is currently no way to represent a rethrow on a path which
+also has relocations.  See `this llvm-dev discussion
+<https://groups.google.com/forum/#!topic/llvm-dev/AE417XjgxvI>`_ for more
+detail.
+
+Support for alternate stackmap formats
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+For some use cases, it is
+desirable to directly encode a final memory efficient stackmap format for
+use by the runtime.  This is particularly relevant for ahead of time
+compilers which wish to directly link object files without the need for
+post processing of each individual object file.  While not implemented
+today for statepoints, there is precedent for a GCStrategy to be able to
+select a custom GCMetadataPrinter for this purpose.  Patches to enable
+this functionality upstream are welcome.   
+
+Bugs and Enhancements
+=====================
+
+Currently known bugs and enhancements under consideration can be
+tracked by performing a `bugzilla search
+<https://bugs.llvm.org/buglist.cgi?cmdtype=runnamed&namedcmd=Statepoint%20Bugs&list_id=64342>`_
+for [Statepoint] in the summary field. When filing new bugs, please
+use this tag so that interested parties see the newly filed bug.  As
+with most LLVM features, design discussions take place on `llvm-dev
+<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_, and patches
+should be sent to `llvm-commits
+<http://lists.llvm.org/mailman/listinfo/llvm-commits>`_ for review.
+

Added: www-releases/trunk/9.0.0/docs/_sources/SupportLibrary.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/SupportLibrary.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/SupportLibrary.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/SupportLibrary.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,246 @@
+===============
+Support Library
+===============
+
+Abstract
+========
+
+This document provides some details on LLVM's Support Library, located in the
+source at ``lib/Support`` and ``include/llvm/Support``. The library's purpose
+is to shield LLVM from the differences between operating systems for the few
+services LLVM needs from the operating system. Much of LLVM is written using
+portability features of standard C++. However, in a few areas, system dependent
+facilities are needed and the Support Library is the wrapper around those
+system calls.
+
+By centralizing LLVM's use of operating system interfaces, we make it possible
+for the LLVM tool chain and runtime libraries to be more easily ported to new
+platforms since (theoretically) only ``lib/Support`` needs to be ported.  This
+library also unclutters the rest of LLVM from #ifdef use and special cases for
+specific operating systems. Such uses are replaced with simple calls to the
+interfaces provided in ``include/llvm/Support``.
+
+Note that the Support Library is not intended to be a complete operating system
+wrapper (such as the Adaptive Communications Environment (ACE) or Apache
+Portable Runtime (APR)), but only provides the functionality necessary to
+support LLVM.
+
+The Support Library was originally referred to as the System Library, written
+by Reid Spencer who formulated the design based on similar work originating
+from the eXtensible Programming System (XPS). Several people helped with the
+effort, especially Jeff Cohen and Henrik Bach on the Win32 port.
+
+Keeping LLVM Portable
+=====================
+
+In order to keep LLVM portable, LLVM developers should adhere to a set of
+portability rules associated with the Support Library. Adherence to these rules
+should help the Support Library achieve its goal of shielding LLVM from the
+variations in operating system interfaces and doing so efficiently.  The
+following sections define the rules needed to fulfill this objective.
+
+Don't Include System Headers
+----------------------------
+
+Except in ``lib/Support``, no LLVM source code should directly ``#include`` a
+system header. Care has been taken to remove all such ``#includes`` from LLVM
+while ``lib/Support`` was being developed.  Specifically this means that header
+files like "``unistd.h``", "``windows.h``", "``stdio.h``", and "``string.h``"
+are forbidden to be included by LLVM source code outside the implementation of
+``lib/Support``.
+
+To obtain system-dependent functionality, existing interfaces to the system
+found in ``include/llvm/Support`` should be used. If an appropriate interface is
+not available, it should be added to ``include/llvm/Support`` and implemented in
+``lib/Support`` for all supported platforms.
+
+Don't Expose System Headers
+---------------------------
+
+The Support Library must shield LLVM from **all** system headers. To obtain
+system level functionality, LLVM source must 
+``#include "llvm/Support/Thing.h"`` and nothing else. This means that
+``Thing.h`` cannot expose any system header files. This protects LLVM from
+accidentally using system specific functionality and only allows it via
+the ``lib/Support`` interface.
+
+Use Standard C Headers
+----------------------
+
+The **standard** C headers (the ones beginning with "c") are allowed to be
+exposed through the ``lib/Support`` interface. These headers and the things they
+declare are considered to be platform agnostic. LLVM source files may include
+them directly or obtain their inclusion through ``lib/Support`` interfaces.
+
+Use Standard C++ Headers
+------------------------
+
+The **standard** C++ headers from the standard C++ library and standard
+template library may be exposed through the ``lib/Support`` interface. These
+headers and the things they declare are considered to be platform agnostic.
+LLVM source files may include them or obtain their inclusion through
+``lib/Support`` interfaces.
+
+High Level Interface
+--------------------
+
+The entry points specified in the interface of ``lib/Support`` must be aimed at
+completing some reasonably high level task needed by LLVM. We do not want to
+simply wrap each operating system call. It would be preferable to wrap several
+operating system calls that are always used in conjunction with one another by
+LLVM.
+
+For example, consider what is needed to execute a program, wait for it to
+complete, and return its result code. On Unix, this involves the following
+operating system calls: ``getenv``, ``fork``, ``execve``, and ``wait``. The
+correct thing for ``lib/Support`` to provide is a function, say
+``ExecuteProgramAndWait``, that implements the functionality completely.  What
+we don't want is wrappers for the operating system calls involved.
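+
+A minimal Unix-only sketch of such an entry point is shown below.  The
+signature and the ``-1`` failure convention are assumptions made for this
+example, not the actual ``lib/Support`` interface.  It uses ``execvp`` (which
+performs the ``PATH`` lookup itself) and ``waitpid`` in place of the
+``getenv``, ``execve``, and ``wait`` calls named above.
+
+.. code-block:: c++
+
+  #include <string>
+  #include <vector>
+  #include <sys/types.h>
+  #include <sys/wait.h>
+  #include <unistd.h>
+
+  // Hypothetical high-level entry point (POSIX-only sketch): runs Program
+  // with Args, waits for it to finish, and returns its exit code.  Returns -1
+  // if the child could not be created or did not exit normally.
+  int ExecuteProgramAndWait(const std::string &Program,
+                            const std::vector<std::string> &Args) {
+    // Build a null-terminated argv; argv[0] is the program name.
+    std::vector<char *> Argv;
+    Argv.push_back(const_cast<char *>(Program.c_str()));
+    for (const std::string &A : Args)
+      Argv.push_back(const_cast<char *>(A.c_str()));
+    Argv.push_back(nullptr);
+
+    pid_t Child = fork();
+    if (Child < 0)
+      return -1;
+    if (Child == 0) {
+      execvp(Program.c_str(), Argv.data());
+      _exit(127); // only reached if execvp failed
+    }
+
+    int Status = 0;
+    if (waitpid(Child, &Status, 0) < 0)
+      return -1;
+    return WIFEXITED(Status) ? WEXITSTATUS(Status) : -1;
+  }
+
+Callers see one function and one result; none of the individual system calls
+appear in the interface.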
+
+There must **not** be a one-to-one relationship between operating system
+calls and the Support library's interface. Any such interface function will be
+suspicious.
+
+No Unused Functionality
+-----------------------
+
+There must be no functionality specified in the interface of ``lib/Support``
+that isn't actually used by LLVM. We're not writing a general purpose operating
+system wrapper here, just enough to satisfy LLVM's needs. And, LLVM doesn't
+need much. This design goal aims to keep the ``lib/Support`` interface small and
+understandable which should foster its actual use and adoption.
+
+No Duplicate Implementations
+----------------------------
+
+The implementation of a function for a given platform must be written exactly
+once. This implies that it must be possible to apply a function's
+implementation to multiple operating systems if those operating systems can
+share the same implementation. This rule applies to the set of operating
+systems supported for a given class of operating system (e.g. Unix, Win32).
+
+No Virtual Methods
+------------------
+
+The Support Library interfaces can be called quite frequently by LLVM. In order
+to make those calls as efficient as possible, we discourage the use of virtual
+methods. There is no need to use inheritance for implementation differences; it
+just adds complexity. The ``#include`` mechanism works just fine.
+
+No Exposed Functions
+--------------------
+
+Any functions defined by system libraries (i.e. not defined by ``lib/Support``)
+must not be exposed through the ``lib/Support`` interface, even if the header
+file for that function is not exposed. This prevents inadvertent use of system
+specific functionality.
+
+For example, the ``stat`` system call is notorious for having variations in the
+data it provides. ``lib/Support`` must not declare ``stat`` nor allow it to be
+declared. Instead it should provide its own interface to discovering
+information about files and directories. Those interfaces may be implemented in
+terms of ``stat`` but that is strictly an implementation detail. The interface
+provided by the Support Library must be implemented on all platforms (even
+those without ``stat``).
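+
+The following sketch illustrates the idea for Unix.  ``FileInfo`` and
+``getFileInfo`` are hypothetical names chosen for this example and are not the
+real ``lib/Support`` API.  The struct contains no system types, and ``stat``
+appears only in the implementation.
+
+.. code-block:: c++
+
+  #include <cstdint>
+  #include <string>
+  #include <sys/stat.h> // implementation detail; never exposed in the interface
+
+  // Hypothetical portable interface: only standard types are visible.
+  struct FileInfo {
+    bool Exists = false;
+    bool IsDirectory = false;
+    std::uint64_t SizeInBytes = 0;
+  };
+
+  // Unix implementation sketch.  A failing stat() is reported as "does not
+  // exist"; callers never see struct stat or errno.
+  FileInfo getFileInfo(const std::string &Path) {
+    FileInfo Info;
+    struct stat SB;
+    if (::stat(Path.c_str(), &SB) != 0)
+      return Info;
+    Info.Exists = true;
+    Info.IsDirectory = S_ISDIR(SB.st_mode);
+    Info.SizeInBytes = static_cast<std::uint64_t>(SB.st_size);
+    return Info;
+  }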
+
+No Exposed Data
+---------------
+
+Any data defined by system libraries (i.e. not defined by ``lib/Support``) must
+not be exposed through the ``lib/Support`` interface, even if the header file
+for that data is not exposed. As with functions, this prevents inadvertent
+use of data that might not exist on all platforms.
+
+Minimize Soft Errors
+--------------------
+
+Operating system interfaces will generally provide error results for every
+little thing that could go wrong. In almost all cases, you can divide these
+error results into two groups: normal/good/soft and abnormal/bad/hard. That is,
+some of the errors are simply information like "file not found", "insufficient
+privileges", etc. while other errors are much harder like "out of space", "bad
+disk sector", or "system call interrupted". We'll call the first group "*soft*"
+errors and the second group "*hard*" errors.
+
+``lib/Support`` must always attempt to minimize soft errors.  This is a design
+requirement because the minimization of soft errors can affect the granularity
+and the nature of the interface. In general, if you find that you're wanting to
+throw soft errors, you must review the granularity of the interface because it
+is likely you're trying to implement something that is too low level. The rule
+of thumb is to provide interface functions that **can't** fail, except when
+faced with hard errors.
+
+For a trivial example, suppose we wanted to add an "``OpenFileForWriting``"
+function. For many operating systems, if the file doesn't exist, attempting to
+open the file will produce an error.  However, ``lib/Support`` should not simply
+throw that error if it occurs because it's a soft error. The problem is that the
+interface function, ``OpenFileForWriting`` is too low level. It should be
+``OpenOrCreateFileForWriting``. In the case of the soft "doesn't exist" error,
+this function would just create it and then open it for writing.
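+
+On Unix, a sketch of this function might look like the following.  The
+signature, the file mode, and the choice to return ``-1`` for any remaining
+failure are assumptions made for the example.  How the remaining errors are
+classified and reported is outside the scope of the sketch.
+
+.. code-block:: c++
+
+  #include <string>
+  #include <fcntl.h>
+  #include <unistd.h>
+
+  // Hypothetical sketch: opens Path for writing, creating it first if it does
+  // not exist.  O_CREAT absorbs the soft "doesn't exist" error, so callers
+  // never have to handle it.  Returns a file descriptor, or -1 on any other
+  // failure.
+  int OpenOrCreateFileForWriting(const std::string &Path) {
+    return ::open(Path.c_str(), O_WRONLY | O_CREAT, 0666);
+  }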
+
+This design principle needs to be maintained in ``lib/Support`` because it
+avoids the propagation of soft error handling throughout the rest of LLVM.
+Hard errors will generally just cause a termination for an LLVM tool so don't
+be bashful about throwing them.
+
+Rules of thumb:
+
+#. Don't throw soft errors, only hard errors.
+
+#. If you're tempted to throw a soft error, re-think the interface.
+
+#. Handle internally the most common normal/good/soft error conditions
+   so the rest of LLVM doesn't have to.
+
+No throw Specifications
+-----------------------
+
+None of the ``lib/Support`` interface functions may be declared with C++
+``throw()`` specifications on them. This requirement makes sure that the
+compiler does not insert additional exception handling code into the interface
+functions. This is a performance consideration: ``lib/Support`` functions are
+at the bottom of many call chains and as such can be frequently called. We
+need them to be as efficient as possible.  However, no routines in the system
+library should actually throw exceptions.
+
+Code Organization
+-----------------
+
+Implementations of the Support Library interface are separated by their general
+class of operating system. Currently only Unix and Win32 classes are defined
+but more could be added for other operating system classifications.  To
+distinguish which implementation to compile, the code in ``lib/Support`` uses
+the ``LLVM_ON_UNIX`` and ``_WIN32`` ``#defines``.  Each source file in
+``lib/Support``, after implementing the generic (operating system independent)
+functionality needs to include the correct implementation using a set of
+``#if defined(LLVM_ON_XYZ)`` directives. For example, if we had
+``lib/Support/Path.cpp``, we'd expect to see in that file:
+
+.. code-block:: c++
+
+  #if defined(LLVM_ON_UNIX)
+  #include "Unix/Path.inc"
+  #endif
+  #if defined(_WIN32)
+  #include "Windows/Path.inc"
+  #endif
+
+The implementation in ``lib/Support/Unix/Path.inc`` should handle all Unix
+variants. The implementation in ``lib/Support/Windows/Path.inc`` should handle 
+all Windows variants.  What this does is quickly differentiate the basic class
+of operating system that will provide the implementation. The specific details
+for a given platform must still be determined through the use of ``#ifdef``.
+
+Consistent Semantics
+--------------------
+
+The implementation of a ``lib/Support`` interface can vary drastically between
+platforms. That's okay as long as the end result of the interface function is
+the same. For example, a function to create a directory is pretty
+straightforward on all operating systems. System V IPC on the other hand isn't even
+supported on all platforms. Instead of "supporting" System V IPC,
+``lib/Support`` should provide an interface to the basic concept of
+inter-process communications. The implementations might use System V IPC if
+that was available or named pipes, or whatever gets the job done effectively
+for a given operating system.  In all cases, the interface and the
+implementation must be semantically consistent.

Added: www-releases/trunk/9.0.0/docs/_sources/SystemLibrary.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/SystemLibrary.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/SystemLibrary.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/SystemLibrary.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,9 @@
+==============
+System Library
+==============
+
+Moved
+=====
+
+The System Library has been renamed to Support Library with documentation
+available at :doc:`SupportLibrary`. Please change your links to that page.

Added: www-releases/trunk/9.0.0/docs/_sources/TableGen/BackEnds.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TableGen/BackEnds.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TableGen/BackEnds.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TableGen/BackEnds.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,571 @@
+=================
+TableGen BackEnds
+=================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+TableGen backends are at the core of TableGen's functionality. The source files
+provide the semantics to a generated (in memory) structure, but it's up to the
+backend to print this out in a way that is meaningful to the user (normally a
+C program including a file or a textual list of warnings, options and error
+messages).
+
+TableGen is used by both LLVM and Clang with very different goals. LLVM uses it
+as a way to automate the generation of massive amounts of information regarding
+instructions, schedules, cores and architecture features. Some backends generate
+output that is consumed by more than one source file, so they need to be created
+in a way that makes it easy to use preprocessor tricks. Some backends can also print
+C code structures, so that they can be directly included as-is.
+
+Clang, on the other hand, uses it mainly for diagnostic messages (errors,
+warnings, tips) and attributes, so more on the textual end of the scale.
+
+LLVM BackEnds
+=============
+
+.. warning::
+   This document is raw. Each section below needs three sub-sections: description
+   of its purpose with a list of users, output generated from generic input, and
+   finally why it needed a new backend (in case there's something similar).
+
+Overall, each backend will take the same TableGen file type and transform it into
+similar output for different targets/uses. There is an implicit contract between
+the TableGen files, the back-ends and their users.
+
+For instance, a global contract is that each back-end produces macro-guarded
+sections. Based on whether the file is included by a header or a source file,
+or even in which context of each file the include is being used, you have
+to define a macro just before including it, to get the right output:
+
+.. code-block:: c++
+
+  #define GET_REGINFO_TARGET_DESC
+  #include "ARMGenRegisterInfo.inc"
+
+And just part of the generated file would be included. This is useful if
+you need the same information in multiple formats (instantiation, initialization,
+getter/setter functions, etc) from the same source TableGen file without having
+to re-compile the TableGen file multiple times.
+
+Sometimes, multiple macros might be defined before the same include file to
+output multiple blocks:
+
+.. code-block:: c++
+
+  #define GET_REGISTER_MATCHER
+  #define GET_SUBTARGET_FEATURE_NAME
+  #define GET_MATCHER_IMPLEMENTATION
+  #include "ARMGenAsmMatcher.inc"
+
+The macros will be undef'd automatically as they're used, in the include file.
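+
+To make the pattern concrete, here is a small single-file sketch of what the
+guarded sections might look like.  The ``GET_FOO_*`` macros, the ``FooReg``
+enum, and the table contents are all hypothetical and are not taken from any
+real generated file.  The region between the two marker comments stands in for
+the contents of the ``.inc`` file.
+
+.. code-block:: c++
+
+  // The user requests only the enum section.
+  #define GET_FOO_ENUM
+
+  // ---- begin contents of a hypothetical FooGenInfo.inc ----
+  #ifdef GET_FOO_ENUM
+  #undef GET_FOO_ENUM
+  enum FooReg { FOO_R0, FOO_R1, NUM_FOO_REGS };
+  #endif
+
+  #ifdef GET_FOO_NAMES
+  #undef GET_FOO_NAMES
+  static const char *const FooRegNames[] = {"r0", "r1"};
+  #endif
+  // ---- end contents ----
+
+  // The requested macro was consumed by the "include", so a later include of
+  // the same file would not emit the enum a second time.
+  #ifdef GET_FOO_ENUM
+  #error "GET_FOO_ENUM should have been undefined by the include"
+  #endif
+
+  // Only the enum section is visible here; FooRegNames was never emitted.
+  int numFooRegs() { return NUM_FOO_REGS; }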
+
+On all LLVM back-ends, the ``llvm-tblgen`` binary will be executed on the root
+TableGen file ``<Target>.td``, which should include all others. This guarantees
+that all information needed is accessible, and that no duplication is needed
+in the TableGen files.
+
+CodeEmitter
+-----------
+
+**Purpose**: CodeEmitterGen uses the descriptions of instructions and their fields to
+construct an automated code emitter: a function that, given a MachineInstr,
+returns the (currently, 32-bit unsigned) value of the instruction.
+
+**Output**: C++ code, implementing the target's CodeEmitter
+class by overriding the virtual functions as ``<Target>CodeEmitter::function()``.
+
+**Usage**: Included directly at the end of ``<Target>MCCodeEmitter.cpp``.
+
+RegisterInfo
+------------
+
+**Purpose**: This tablegen backend is responsible for emitting a description of a target
+register file for a code generator.  It uses instances of the Register,
+RegisterAliases, and RegisterClass classes to gather this information.
+
+**Output**: C++ code with enums and structures representing the register mappings,
+properties, masks, etc.
+
+**Usage**: Used in both ``<Target>BaseRegisterInfo`` and ``<Target>MCTargetDesc``
+(headers and source files), with macros selecting which sections are included
+so that declarations and initializations stay separate.
+
+InstrInfo
+---------
+
+**Purpose**: This tablegen backend is responsible for emitting a description of the target
+instruction set for the code generator. (what are the differences from CodeEmitter?)
+
+**Output**: C++ code with enums and structures representing the instruction mappings,
+properties, masks, etc.
+
+**Usage**: Used in both ``<Target>BaseInstrInfo`` and ``<Target>MCTargetDesc``
+(headers and source files), with macros selecting which sections are included
+so that declarations and initializations stay separate.
+
+AsmWriter
+---------
+
+**Purpose**: Emits an assembly printer for the current target.
+
+**Output**: Implementation of ``<Target>InstPrinter::printInstruction()``, among
+other things.
+
+**Usage**: Included directly into ``InstPrinter/<Target>InstPrinter.cpp``.
+
+AsmMatcher
+----------
+
+**Purpose**: Emits a target specifier matcher for
+converting parsed assembly operands into MCInst structures. It also
+emits a matcher for custom operand parsing. Extensive documentation is
+written in the ``AsmMatcherEmitter.cpp`` file.
+
+**Output**: Assembler parsers' matcher functions, declarations, etc.
+
+**Usage**: Used in back-ends' ``AsmParser/<Target>AsmParser.cpp`` for
+building the AsmParser class.
+
+Disassembler
+------------
+
+**Purpose**: Contains disassembler table emitters for various
+architectures. Extensive documentation is written in the
+``DisassemblerEmitter.cpp`` file.
+
+**Output**: Decoding tables, static decoding functions, etc.
+
+**Usage**: Directly included in ``Disassembler/<Target>Disassembler.cpp``
+to cater for all default decodings, after all hand-made ones.
+
+PseudoLowering
+--------------
+
+**Purpose**: Generate pseudo instruction lowering.
+
+**Output**: Implements ``<Target>AsmPrinter::emitPseudoExpansionLowering()``.
+
+**Usage**: Included directly into ``<Target>AsmPrinter.cpp``.
+
+CallingConv
+-----------
+
+**Purpose**: Responsible for emitting descriptions of the calling
+conventions supported by this target.
+
+**Output**: Implement static functions to deal with calling conventions
+chained by matching styles, returning false on no match.
+
+**Usage**: Used in ISelLowering and FastISel as function pointers to the
+implementation returned by a CC selection function.
+
+DAGISel
+-------
+
+**Purpose**: Generate a DAG instruction selector.
+
+**Output**: Creates huge functions for automating DAG selection.
+
+**Usage**: Included in ``<Target>ISelDAGToDAG.cpp`` inside the target's
+implementation of ``SelectionDAGISel``.
+
+DFAPacketizer
+-------------
+
+**Purpose**: This class parses the Schedule.td file and produces an API that
+can be used to reason about whether an instruction can be added to a packet
+on a VLIW architecture. The class internally generates a deterministic finite
+automaton (DFA) that models all possible mappings of machine instructions
+to functional units as instructions are added to a packet.
+
+**Output**: Scheduling tables for VLIW back-ends (Hexagon, AMD).
+
+**Usage**: Included directly in ``<Target>InstrInfo.cpp``.
+
+FastISel
+--------
+
+**Purpose**: This tablegen backend emits code for use by the "fast"
+instruction selection algorithm. See the comments at the top of
+lib/CodeGen/SelectionDAG/FastISel.cpp for background. This file
+scans through the target's tablegen instruction-info files
+and extracts instructions with obvious-looking patterns, and it emits
+code to look up these instructions by type and operator.
+
+**Output**: Generates ``Predicate`` and ``FastEmit`` methods.
+
+**Usage**: Implements private methods of the targets' implementation
+of ``FastISel`` class.
+
+Subtarget
+---------
+
+**Purpose**: Generate subtarget enumerations.
+
+**Output**: Enums, globals, local tables for sub-target information.
+
+**Usage**: Populates ``<Target>Subtarget`` and
+``MCTargetDesc/<Target>MCTargetDesc`` files (both headers and source).
+
+Intrinsic
+---------
+
+**Purpose**: Generate (target) intrinsic information.
+
+OptParserDefs
+-------------
+
+**Purpose**: Print enum values for a class.
+
+SearchableTables
+----------------
+
+**Purpose**: Generate custom searchable tables.
+
+**Output**: Enums, global tables and lookup helper functions.
+
+**Usage**: This backend allows generating free-form, target-specific tables
+from TableGen records. The ARM and AArch64 targets use this backend to generate
+tables of system registers; the AMDGPU target uses it to generate meta-data
+about complex image and memory buffer instructions.
+
+More documentation is available in ``include/llvm/TableGen/SearchableTable.td``,
+which also contains the definitions of TableGen classes which must be
+instantiated in order to define the enums and tables emitted by this backend.
+
+CTags
+-----
+
+**Purpose**: This tablegen backend emits an index of definitions in ctags(1)
+format. A helper script, utils/TableGen/tdtags, provides an easier-to-use
+interface; run 'tdtags -H' for documentation.
+
+X86EVEX2VEX
+-----------
+
+**Purpose**: This X86-specific tablegen backend emits tables that map
+EVEX-encoded instructions to their identical VEX-encoded instructions.
+
+Clang BackEnds
+==============
+
+ClangAttrClasses
+----------------
+
+**Purpose**: Creates Attrs.inc, which contains semantic attribute class
+declarations for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``.
+This file is included as part of ``Attr.h``.
+
+ClangAttrParserStringSwitches
+-----------------------------
+
+**Purpose**: Creates AttrParserStringSwitches.inc, which contains
+StringSwitch::Case statements for parser-related string switches. Each switch
+is given its own macro (such as ``CLANG_ATTR_ARG_CONTEXT_LIST``, or
+``CLANG_ATTR_IDENTIFIER_ARG_LIST``), which is expected to be defined before
+including AttrParserStringSwitches.inc, and undefined after.
+
+ClangAttrImpl
+-------------
+
+**Purpose**: Creates AttrImpl.inc, which contains semantic attribute class
+definitions for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``.
+This file is included as part of ``AttrImpl.cpp``.
+
+ClangAttrList
+-------------
+
+**Purpose**: Creates AttrList.inc, which is used when a list of semantic
+attribute identifiers is required. For instance, ``AttrKinds.h`` includes this
+file to generate the list of ``attr::Kind`` enumeration values. This list is
+separated out into multiple categories: attributes, inheritable attributes, and
+inheritable parameter attributes. This categorization happens automatically
+based on information in ``Attr.td`` and is used to implement the ``classof``
+functionality required for ``dyn_cast`` and similar APIs.
+
+ClangAttrPCHRead
+----------------
+
+**Purpose**: Creates AttrPCHRead.inc, which is used to deserialize attributes
+in the ``ASTReader::ReadAttributes`` function.
+
+ClangAttrPCHWrite
+-----------------
+
+**Purpose**: Creates AttrPCHWrite.inc, which is used to serialize attributes in
+the ``ASTWriter::WriteAttributes`` function.
+
+ClangAttrSpellings
+---------------------
+
+**Purpose**: Creates AttrSpellings.inc, which is used to implement the
+``__has_attribute`` feature test macro.
+
+ClangAttrSpellingListIndex
+--------------------------
+
+**Purpose**: Creates AttrSpellingListIndex.inc, which is used to map parsed
+attribute spellings (including which syntax or scope was used) to an attribute
+spelling list index. These spelling list index values are internal
+implementation details exposed via
+``AttributeList::getAttributeSpellingListIndex``.
+
+ClangAttrVisitor
+-------------------
+
+**Purpose**: Creates AttrVisitor.inc, which is used when implementing 
+recursive AST visitors.
+
+ClangAttrTemplateInstantiate
+----------------------------
+
+**Purpose**: Creates AttrTemplateInstantiate.inc, which implements the
+``instantiateTemplateAttribute`` function, used when instantiating a template
+that requires an attribute to be cloned.
+
+ClangAttrParsedAttrList
+-----------------------
+
+**Purpose**: Creates AttrParsedAttrList.inc, which is used to generate the
+``AttributeList::Kind`` parsed attribute enumeration.
+
+ClangAttrParsedAttrImpl
+-----------------------
+
+**Purpose**: Creates AttrParsedAttrImpl.inc, which is used by
+``AttributeList.cpp`` to implement several functions on the ``AttributeList``
+class. This functionality is implemented via the ``AttrInfoMap ParsedAttrInfo``
+array, which contains one element per parsed attribute object.
+
+ClangAttrParsedAttrKinds
+------------------------
+
+**Purpose**: Creates AttrParsedAttrKinds.inc, which is used to implement the
+``AttributeList::getKind`` function, mapping a string (and syntax) to a parsed
+attribute ``AttributeList::Kind`` enumeration.
+
+ClangAttrDump
+-------------
+
+**Purpose**: Creates AttrDump.inc, which dumps information about an attribute.
+It is used to implement ``ASTDumper::dumpAttr``.
+
+ClangDiagsDefs
+--------------
+
+Generate Clang diagnostics definitions.
+
+ClangDiagGroups
+---------------
+
+Generate Clang diagnostic groups.
+
+ClangDiagsIndexName
+-------------------
+
+Generate Clang diagnostic name index.
+
+ClangCommentNodes
+-----------------
+
+Generate Clang AST comment nodes.
+
+ClangDeclNodes
+--------------
+
+Generate Clang AST declaration nodes.
+
+ClangStmtNodes
+--------------
+
+Generate Clang AST statement nodes.
+
+ClangSACheckers
+---------------
+
+Generate Clang Static Analyzer checkers.
+
+ClangCommentHTMLTags
+--------------------
+
+Generate efficient matchers for HTML tag names that are used in documentation comments.
+
+ClangCommentHTMLTagsProperties
+------------------------------
+
+Generate efficient matchers for HTML tag properties.
+
+ClangCommentHTMLNamedCharacterReferences
+----------------------------------------
+
+Generate function to translate named character references to UTF-8 sequences.
+
+ClangCommentCommandInfo
+-----------------------
+
+Generate command properties for commands that are used in documentation comments.
+
+ClangCommentCommandList
+-----------------------
+
+Generate list of commands that are used in documentation comments.
+
+ArmNeon
+-------
+
+Generate arm_neon.h for clang.
+
+ArmNeonSema
+-----------
+
+Generate ARM NEON sema support for clang.
+
+ArmNeonTest
+-----------
+
+Generate ARM NEON tests for clang.
+
+AttrDocs
+--------
+
+**Purpose**: Creates ``AttributeReference.rst`` from ``AttrDocs.td``, and is
+used for documenting user-facing attributes.
+
+General BackEnds
+================
+
+JSON
+----
+
+**Purpose**: Output all the values in every ``def``, as a JSON data
+structure that can be easily parsed by a variety of languages. Useful
+for writing custom backends without having to modify TableGen itself,
+or for performing auxiliary analysis on the same TableGen data passed
+to a built-in backend.
+
+**Output**:
+
+The root of the output file is a JSON object (i.e. dictionary),
+containing the following fixed keys:
+
+* ``!tablegen_json_version``: a numeric version field that will
+  increase if an incompatible change is ever made to the structure of
+  this data. The format described here corresponds to version 1.
+
+* ``!instanceof``: a dictionary whose keys are the class names defined
+  in the TableGen input. For each key, the corresponding value is an
+  array of strings giving the names of ``def`` records that derive
+  from that class. So ``root["!instanceof"]["Instruction"]``, for
+  example, would list the names of all the records deriving from the
+  class ``Instruction``.
+
+For each ``def`` record, the root object also has a key for the record
+name. The corresponding value is a subsidiary object containing the
+following fixed keys:
+
+* ``!superclasses``: an array of strings giving the names of all the
+  classes that this record derives from.
+
+* ``!fields``: an array of strings giving the names of all the variables
+  in this record that were defined with the ``field`` keyword.
+
+* ``!name``: a string giving the name of the record. This is always
+  identical to the key in the JSON root object corresponding to this
+  record's dictionary. (If the record is anonymous, the name is
+  arbitrary.)
+
+* ``!anonymous``: a boolean indicating whether the record's name was
+  specified by the TableGen input (if it is ``false``), or invented by
+  TableGen itself (if ``true``).
+
+For each variable defined in a record, the ``def`` object for that
+record also has a key for the variable name. The corresponding value
+is a translation into JSON of the variable's value, using the
+conventions described below.
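+
+As a rough illustration of the keys described so far, the sketch below pairs a
+two-line TableGen input with the JSON these rules imply. The record names are
+hypothetical, and the JSON in the comments is derived from the description
+above rather than copied from real ``llvm-tblgen`` output, so key order and
+whitespace may differ:
+
+.. code-block:: text
+
+  class C { bit V = 1; }
+  def X : C;
+
+  // Implied JSON (sketch only):
+  //
+  //   "!instanceof": { "C": ["X"] }
+  //   "X": {
+  //     "!superclasses": ["C"],
+  //     "!fields": [],
+  //     "!name": "X",
+  //     "!anonymous": false,
+  //     "V": 1
+  //   }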
+
+Some TableGen data types are translated directly into the
+corresponding JSON type:
+
+* A completely undefined value (e.g. for a variable declared without
+  initializer in some superclass of this record, and never initialized
+  by the record itself or any other superclass) is emitted as the JSON
+  ``null`` value.
+
+* ``int`` and ``bit`` values are emitted as numbers. Note that
+  TableGen ``int`` values are capable of holding integers too large to
+  be exactly representable in IEEE double precision. The integer
+  literal in the JSON output will show the full exact integer value.
+  So if you need to retrieve large integers with full precision, you
+  should use a JSON reader capable of translating such literals back
+  into 64-bit integers without losing precision, such as Python's
+  standard ``json`` module.
+
+* ``string`` and ``code`` values are emitted as JSON strings.
+
+* ``list<T>`` values, for any element type ``T``, are emitted as JSON
+  arrays. Each element of the array is represented in turn using these
+  same conventions.
+
+* ``bits`` values are also emitted as arrays. A ``bits`` array is
+  ordered from least-significant bit to most-significant. So the
+  element with index ``i`` corresponds to the bit described as
+  ``x{i}`` in TableGen source. However, note that this means that
+  scripting languages are likely to *display* the array in the
+  opposite order from the way it appears in the TableGen source or in
+  the diagnostic ``-print-records`` output.
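+
+The direct translations listed above can be sketched with one hypothetical
+record. The JSON values in the comments follow from those rules and are not
+captured tool output; note in particular how the ``bits`` array reads in the
+opposite order from the binary literal:
+
+.. code-block:: text
+
+  def Demo {
+    int Big = 9007199254740993;   // 2^53 + 1, emitted as an exact JSON number
+    string Name = "demo";         // "demo"
+    list<int> Sizes = [8, 16];    // [8, 16]
+    bits<4> Enc = 0b0011;         // [1, 1, 0, 0]  (index 0 is Enc{0})
+    bit Flag;                     // never initialized: null
+  }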
+
+All other TableGen value types are emitted as a JSON object,
+containing two standard fields: ``kind`` is a discriminator describing
+which kind of value the object represents, and ``printable`` is a
+string giving the same representation of the value that would appear
+in ``-print-records``.
+
+* A reference to a ``def`` object has ``kind=="def"``, and has an
+  extra field ``def`` giving the name of the object referred to.
+
+* A reference to another variable in the same record has
+  ``kind=="var"``, and has an extra field ``var`` giving the name of
+  the variable referred to.
+
+* A reference to a specific bit of a ``bits``-typed variable in the
+  same record has ``kind=="varbit"``, and has two extra fields:
+  ``var`` gives the name of the variable referred to, and ``index``
+  gives the index of the bit.
+
+* A value of type ``dag`` has ``kind=="dag"``, and has two extra
+  fields. ``operator`` gives the initial value after the opening
+  parenthesis of the dag initializer; ``args`` is an array giving the
+  following arguments. The elements of ``args`` are arrays of length
+  2, giving the value of each argument followed by its colon-suffixed
+  name (if any). For example, in the JSON representation of the dag
+  value ``(Op 22, "hello":$foo)`` (assuming that ``Op`` is the name of
+  a record defined elsewhere with a ``def`` statement):
+
+  * ``operator`` will be an object in which ``kind=="def"`` and
+    ``def=="Op"``
+
+  * ``args`` will be the array ``[[22, null], ["hello", "foo"]]``.
+
+* If any other kind of value or complicated expression appears in the
+  output, it will have ``kind=="complex"``, and no additional fields.
+  These values are not expected to be needed by backends. The standard
+  ``printable`` field can be used to extract a representation of them
+  in TableGen source syntax if necessary.
+
+How to write a back-end
+=======================
+
+TODO.
+
+Until we get a step-by-step HowTo for writing TableGen backends, you can at
+least grab the boilerplate (build system, new files, etc.) from Clang's
+r173931.
+
+TODO: How they work, how to write one.  This section should not contain details
+about any particular backend, except maybe ``-print-enums`` as an example.  This
+should highlight the APIs in ``TableGen/Record.h``.
+

Added: www-releases/trunk/9.0.0/docs/_sources/TableGen/Deficiencies.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TableGen/Deficiencies.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TableGen/Deficiencies.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TableGen/Deficiencies.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,31 @@
+=====================
+TableGen Deficiencies
+=====================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+Despite being very generic, TableGen has some deficiencies that have been
+pointed out numerous times. The common theme is that, while TableGen allows
+you to build Domain-Specific-Languages, the final languages that you create
+lack the power of other DSLs, which in turn considerably increases the size
+and complexity of TableGen files.
+
+At the same time, TableGen allows you to create virtually any meaning of
+the basic concepts via custom-made back-ends, which can pervert the original
+design and make it very hard for newcomers to understand it.
+
+There are some in favour of extending the semantics even more, but making sure
+back-ends adhere to strict rules. Others suggest that we should move to more
+powerful DSLs designed with specific purposes, or even re-use existing
+DSLs.
+
+Known Problems
+==============
+
+TODO: Add here frequently asked questions about why TableGen doesn't do
+what you want, how it might, and how we could extend/restrict it to
+be more user friendly.

Added: www-releases/trunk/9.0.0/docs/_sources/TableGen/LangIntro.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TableGen/LangIntro.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TableGen/LangIntro.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TableGen/LangIntro.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,709 @@
+==============================
+TableGen Language Introduction
+==============================
+
+.. contents::
+   :local:
+
+.. warning::
+   This document is extremely rough. If you find something lacking, please
+   fix it, file a documentation bug, or ask about it on llvm-dev.
+
+Introduction
+============
+
+This document is not meant to be a normative spec about the TableGen language
+in and of itself (i.e. how to understand a given construct in terms of how
+it affects the final set of records represented by the TableGen file). For
+the formal language specification, see :doc:`LangRef`.
+
+TableGen syntax
+===============
+
+TableGen doesn't care about the meaning of data (that is up to the backend to
+define), but it does care about syntax, and it enforces a simple type system.
+This section describes the syntax and the constructs allowed in a TableGen file.
+
+TableGen primitives
+-------------------
+
+TableGen comments
+^^^^^^^^^^^^^^^^^
+
+TableGen supports C++ style "``//``" comments, which run to the end of the
+line, and it also supports **nestable** "``/* */``" comments.
+
+.. _TableGen type:
+
+The TableGen type system
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+TableGen files are strongly typed, in a simple (but complete) type-system.
+These types are used to perform automatic conversions, check for errors, and to
+help interface designers constrain the input that they allow.  Every `value
+definition`_ is required to have an associated type.
+
+TableGen supports a mixture of very low-level types (such as ``bit``) and very
+high-level types (such as ``dag``).  This flexibility is what allows it to
+describe a wide range of information conveniently and compactly.  The TableGen
+types are:
+
+``bit``
+    A 'bit' is a boolean value that can hold either 0 or 1.
+
+``int``
+    The 'int' type represents a simple 64-bit integer value, such as 5.
+
+``string``
+    The 'string' type represents an ordered sequence of characters of arbitrary
+    length.
+
+``code``
+    The 'code' type represents a code fragment, which can be a single- or
+    multi-line string literal.
+
+``bits<n>``
+    A 'bits' type is an arbitrary, but fixed, size integer that is broken up
+    into individual bits.  This type is useful because it can handle some bits
+    being defined while others are undefined.
+
+``list<ty>``
+    This type represents a list whose elements are some other type.  The
+    contained type is arbitrary: it can even be another list type.
+
+Class type
+    Specifying a class name in a type context means that the defined value must
+    be a subclass of the specified class.  This is useful in conjunction with
+    the ``list`` type, for example, to constrain the elements of the list to a
+    common base class (e.g., a ``list<Register>`` can only contain definitions
+    derived from the "``Register``" class).
+
+``dag``
+    This type represents a nestable directed graph of elements.
+
+To date, these types have been sufficient for describing things that TableGen
+has been used for, but it is straight-forward to extend this list if needed.
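+
+As a rough illustration, the sketch below declares one field of each type. All
+record and field names (``Register``, ``R0``, ``ops``, ``TypesDemo`` and so on)
+are hypothetical and are not taken from any real target:
+
+.. code-block:: text
+
+  class Register;                   // hypothetical class, used as a type below
+  def R0 : Register;
+  def R1 : Register;
+  def ops;                          // hypothetical record used as a dag operator
+
+  class TypesDemo {
+    bit IsLeaf = 1;
+    int Latency = 5;
+    string Name = "demo";
+    code Body = [{ return true; }];
+    bits<4> Enc;                    // all four bits start out undefined
+    list<int> Sizes = [8, 16, 32];
+    list<Register> Regs = [R0, R1]; // elements must derive from Register
+    dag Pattern = (ops R0, R1);
+  }
+
+  def Demo : TypesDemo {
+    let Enc{3-2} = 0b10;            // define two bits, leave Enc{1-0} undefined
+  }
+
+The ``Enc`` field shows why ``bits<n>`` is useful: a record can pin down some
+bits and leave the rest for a derived record to fill in.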
+
+.. _TableGen expressions:
+
+TableGen values and expressions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+TableGen allows for a pretty reasonable number of different expression forms
+when building up values.  These forms allow the TableGen file to be written in a
+natural syntax and flavor for the application.  The current expression forms
+supported include:
+
+``?``
+    uninitialized field
+
+``0b1001011``
+    binary integer value.
+    Note that this is sized by the number of bits given and will not be
+    silently extended/truncated.
+
+``7``
+    decimal integer value
+
+``0x7F``
+    hexadecimal integer value
+
+``"foo"``
+    a single-line string value, can be assigned to ``string`` or ``code`` variable.
+
+``[{ ... }]``
+    usually called a "code fragment", but is just a multiline string literal
+
+``[ X, Y, Z ]<type>``
+    list value.  <type> is the type of the list element and is usually optional.
+    In rare cases, TableGen is unable to deduce the element type in which case
+    the user must specify it explicitly.
+
+``{ a, b, 0b10 }``
+    initializer for a "bits<4>" value.
+    1-bit from "a", 1-bit from "b", 2-bits from 0b10.
+
+``value``
+    value reference
+
+``value{17}``
+    access to one bit of a value
+
+``value{15-17}``
+    access to an ordered sequence of bits of a value, in particular ``value{15-17}``
+    produces an order that is the reverse of ``value{17-15}``.
+
+``DEF``
+    reference to a record definition
+
+``CLASS<val list>``
+    reference to a new anonymous definition of CLASS with the specified template
+    arguments.
+
+``X.Y``
+    reference to the subfield of a value
+
+``list[4-7,17,2-3]``
+    A slice of the 'list' list, including elements 4,5,6,7,17,2, and 3 from it.
+    Elements may be included multiple times.
+
+``foreach <var> = [ <list> ] in { <body> }``
+
+``foreach <var> = [ <list> ] in <def>``
+    Replicate <body> or <def>, replacing instances of <var> with each value
+    in <list>.  <var> is scoped at the level of the ``foreach`` loop and must
+    not conflict with any other object introduced in <body> or <def>.  Only
+    ``def``\s and ``defm``\s are expanded within <body>.
+
+``foreach <var> = 0-15 in ...``
+
+``foreach <var> = {0-15,32-47} in ...``
+    Loop over ranges of integers. The braces are required for multiple ranges.
+
+``(DEF a, b)``
+    a dag value.  The first element is required to be a record definition, the
+    remaining elements in the list may be arbitrary other values, including
+    nested ``dag`` values.
+
+``!con(a, b, ...)``
+    Concatenate two or more DAG nodes. Their operators must be equal.
+
+    Example: !con((op a1:$name1, a2:$name2), (op b1:$name3)) results in
+    the DAG node (op a1:$name1, a2:$name2, b1:$name3).
+
+``!dag(op, children, names)``
+    Generate a DAG node programmatically. 'children' and 'names' must be lists
+    of equal length or unset ('?'). 'names' must be a 'list<string>'.
+
+    Due to limitations of the type system, 'children' must be a list of items
+    of a common type. In practice, this means that they should either have the
+    same type or be records with a common superclass. Mixing dag and non-dag
+    items is not possible. However, '?' can be used.
+
+    Example: !dag(op, [a1, a2, ?], ["name1", "name2", "name3"]) results in
+    (op a1:$name1, a2:$name2, ?:$name3).
+
+``!listconcat(a, b, ...)``
+    A list value that is the result of concatenating the 'a' and 'b' lists.
+    The lists must have the same element type.
+    More than two arguments are accepted with the result being the concatenation
+    of all the lists given.
+
+``!listsplat(a, size)``
+    A list value that contains the value ``a`` ``size`` times.
+    Example: ``!listsplat(0, 2)`` results in ``[0, 0]``.
+
+``!strconcat(a, b, ...)``
+    A string value that is the result of concatenating the 'a' and 'b' strings.
+    More than two arguments are accepted with the result being the concatenation
+    of all the strings given.
+
+``str1#str2``
+    "#" (paste) is a shorthand for !strconcat.  It may concatenate things that
+    are not quoted strings, in which case an implicit !cast<string> is done on
+    the operand of the paste.
+
+``!cast<type>(a)``
+    If 'a' is a string, a record of type *type* obtained by looking up the
+    string 'a' in the list of all records defined by the time that all template
+    arguments in 'a' are fully resolved.
+
+    For example, if !cast<type>(a) appears in a multiclass definition, or in a
+    class instantiated inside of a multiclass definition, and 'a' does not
+    reference any template arguments of the multiclass, then a record of name
+    'a' must be instantiated earlier in the source file. If 'a' does reference
+    a template argument, then the lookup is delayed until defm statements
+    instantiating the multiclass (or later, if the defm occurs in another
+    multiclass and template arguments of the inner multiclass that are
+    referenced by 'a' are substituted by values that themselves contain
+    references to template arguments of the outer multiclass).
+
+    If the type of 'a' does not match *type*, TableGen aborts with an error.
+
+    Otherwise, perform a normal type cast e.g. between an int and a bit, or
+    between record types. This allows casting a record to a subclass, though if
+    the types do not match, constant folding will be inhibited. !cast<string>
+    is a special case in that the argument can be an int or a record. In the
+    latter case, the record's name is returned.
+
+``!isa<type>(a)``
+    Returns an integer: 1 if 'a' is dynamically of the given type, 0 otherwise.
+
+``!subst(a, b, c)``
+    If 'a' and 'b' are of string type or are symbol references, substitute 'b'
+    for 'a' in 'c.'  This operation is analogous to $(subst) in GNU make.
+
+``!foreach(a, b, c)``
+    For each member of dag or list 'b' apply operator 'c'. 'a' is the name
+    of a variable that will be substituted by members of 'b' in 'c'.
+    This operation is analogous to $(foreach) in GNU make.
+
+``!foldl(start, lst, a, b, expr)``
+    Perform a left-fold over 'lst' with the given starting value. 'a' and 'b'
+    are variable names which will be substituted in 'expr'. If you think of
+    expr as a function f(a,b), the fold will compute
+    'f(...f(f(start, lst[0]), lst[1]), ...), lst[n-1])' for a list of length n.
+    As usual, 'a' will be of the type of 'start', and 'b' will be of the type
+    of elements of 'lst'. These types need not be the same, but 'expr' must be
+    of the same type as 'start'.
+
+``!head(a)``
+    The first element of list 'a.'
+
+``!tail(a)``
+    All elements of list 'a' except the first.
+
+``!empty(a)``
+    An integer {0,1} indicating whether list 'a' is empty.
+
+``!size(a)``
+    An integer indicating the number of elements in list 'a'.
+
+``!if(a,b,c)``
+    'b' if the result of 'int' or 'bit' operator 'a' is nonzero, 'c' otherwise.
+
+``!cond(condition_1 : val1, condition_2 : val2, ..., condition_n : valn)``
+    Instead of embedding !if inside !if which can get cumbersome,
+    one can use !cond. !cond returns 'val1' if the result of 'int' or 'bit'
+    operator 'condition_1' is nonzero. Otherwise, it checks 'condition_2'.
+    If 'condition_2' is nonzero, returns 'val2', and so on.
+    If all conditions are zero, it reports an error.
+
+    For example, to convert an integer 'x' into a string:
+      !cond(!lt(x,0) : "negative", !eq(x,0) : "zero", 1 : "positive")
+
+``!eq(a,b)``
+    'bit 1' if 'a' is equal to 'b', 0 otherwise.  This only operates
+    on string, int and bit objects.  Use !cast<string> to compare other types of
+    objects.
+
+``!ne(a,b)``
+    The negation of ``!eq(a,b)``.
+
+``!le(a,b), !lt(a,b), !ge(a,b), !gt(a,b)``
+    (Signed) comparison of integer values that returns bit 1 or 0 depending on
+    the result of the comparison.
+
+``!shl(a,b)`` ``!srl(a,b)`` ``!sra(a,b)``
+    The usual shift operators. Operations are on 64-bit integers, the result
+    is undefined for shift counts outside [0, 63].
+
+``!add(a,b,...)`` ``!mul(a,b,...)`` ``!and(a,b,...)`` ``!or(a,b,...)``
+    The usual arithmetic and bitwise operators.
+
+Note that all of the values have rules specifying how they convert to values
+for different types.  These rules allow you to assign a value like "``7``"
+to a "``bits<4>``" value, for example.
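+
+The following sketch combines several of the forms above in one place. All
+record and field names are hypothetical, and the values in the comments are
+what the descriptions above imply rather than captured ``llvm-tblgen`` output:
+
+.. code-block:: text
+
+  class Reg<string n> { string Name = n; }
+
+  // Range form of foreach plus '#' paste: defines R0, R1, R2 and R3.
+  foreach i = 0-3 in
+    def R#i : Reg<"r"#i>;
+
+  def ExprDemo {
+    list<int> Base   = [1, 2, 3];
+    list<int> Padded = !listconcat(Base, !listsplat(0, 2));  // [1, 2, 3, 0, 0]
+    list<int> Mid    = Base[1-2];                            // [2, 3]
+    int First        = !head(Base);                          // 1
+    list<int> Rest   = !tail(Base);                          // [2, 3]
+    int Count        = !size(Base);                          // 3
+
+    // Left fold: !add(!add(!add(0, 1), 2), 3)
+    int Sum = !foldl(0, Base, acc, x, !add(acc, x));         // 6
+
+    string Sign = !cond(!lt(First, 0) : "negative",
+                        !eq(First, 0) : "zero",
+                        1 : "positive");                     // "positive"
+
+    bits<8> Byte = 0b10110001;
+    bits<4> Low  = Byte{3-0};                                // bits 3..0 of Byte
+  }
+
+Running ``llvm-tblgen -print-records`` on a file like this is a quick way to
+check how each expression was folded.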
+
+Classes and definitions
+-----------------------
+
+As mentioned in the :doc:`introduction <index>`, classes and definitions (collectively known as
+'records') in TableGen are the main high-level unit of information that TableGen
+collects.  Records are defined with a ``def`` or ``class`` keyword, the record
+name, and an optional list of "`template arguments`_".  If the record has
+superclasses, they are specified as a comma separated list that starts with a
+colon character ("``:``").  If `value definitions`_ or `let expressions`_ are
+needed for the class, they are enclosed in curly braces ("``{}``"); otherwise,
+the record ends with a semicolon.
+
+Here is a simple TableGen file:
+
+.. code-block:: text
+
+  class C { bit V = 1; }
+  def X : C;
+  def Y : C {
+    string Greeting = "hello";
+  }
+
+This example defines two definitions, ``X`` and ``Y``, both of which derive from
+the ``C`` class.  Because of this, they both get the ``V`` bit value.  The ``Y``
+definition also gets the ``Greeting`` member.
+
+In general, classes are useful for collecting together the commonality between a
+group of records and isolating it in a single place.  Also, classes permit the
+specification of default values for their subclasses, allowing the subclasses to
+override them as they wish.
+
+.. _value definition:
+.. _value definitions:
+
+Value definitions
+^^^^^^^^^^^^^^^^^
+
+Value definitions define named entries in records.  A value must be defined
+before it can be referred to as the operand for another value definition or
+before the value is reset with a `let expression`_.  A value is defined by
+specifying a `TableGen type`_ and a name.  If an initial value is available, it
+may be specified after the name with an equal sign.  Value definitions require
+terminating semicolons.
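+
+For illustration, a sketch of value definitions with and without an initial
+value (the record ``Rec`` and its field names are hypothetical):
+
+.. code-block:: text
+
+  def Rec {
+    int Width;                    // no initial value, so the field stays unset
+    int Height = 4;               // initial value given after the name
+    int Twice = !mul(Height, 2);  // refers to the previously defined Height
+  }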
+
+.. _let expression:
+.. _let expressions:
+.. _"let" expressions within a record:
+
+'let' expressions
+^^^^^^^^^^^^^^^^^
+
+A record-level let expression is used to change the value of a value definition
+in a record.  This is primarily useful when a superclass defines a value that a
+derived class or definition wants to override.  Let expressions consist of the
+'``let``' keyword followed by a value name, an equal sign ("``=``"), and a new
+value.  For example, a new class could be added to the example above, redefining
+the ``V`` field for all of its subclasses:
+
+.. code-block:: text
+
+  class D : C { let V = 0; }
+  def Z : D;
+
+In this case, the ``Z`` definition will have a zero value for its ``V`` value,
+despite the fact that it derives (indirectly) from the ``C`` class, because the
+``D`` class overrode its value.
+
+References between variables in a record are substituted late, which gives
+``let`` expressions unusual power. Consider this admittedly silly example:
+
+.. code-block:: text
+
+  class A<int x> {
+    int Y = x;
+    int Yplus1 = !add(Y, 1);
+    int xplus1 = !add(x, 1);
+  }
+  def Z : A<5> {
+    let Y = 10;
+  }
+
+The value of ``Z.xplus1`` will be 6, but the value of ``Z.Yplus1`` is 11. Use
+this power wisely.
+
+.. _template arguments:
+
+Class template arguments
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+TableGen permits the definition of parameterized classes as well as normal
+concrete classes.  Parameterized TableGen classes specify a list of variable
+bindings (which may optionally have defaults) that are bound when used.  Here is
+a simple example:
+
+.. code-block:: text
+
+  class FPFormat<bits<3> val> {
+    bits<3> Value = val;
+  }
+  def NotFP      : FPFormat<0>;
+  def ZeroArgFP  : FPFormat<1>;
+  def OneArgFP   : FPFormat<2>;
+  def OneArgFPRW : FPFormat<3>;
+  def TwoArgFP   : FPFormat<4>;
+  def CompareFP  : FPFormat<5>;
+  def CondMovFP  : FPFormat<6>;
+  def SpecialFP  : FPFormat<7>;
+
+In this case, template arguments are used as a space efficient way to specify a
+list of "enumeration values", each with a "``Value``" field set to the specified
+integer.
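+
+Template arguments may also carry defaults, as mentioned above.  A minimal
+sketch, in which the ``Reg`` class and its fields are hypothetical:
+
+.. code-block:: text
+
+  class Reg<string n, bits<5> enc = 0> {
+    string Name = n;
+    bits<5> Encoding = enc;
+  }
+  def R0 : Reg<"r0">;      // enc falls back to its default, 0
+  def R1 : Reg<"r1", 1>;   // enc is given explicitly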
+
+The more esoteric forms of `TableGen expressions`_ are useful in conjunction
+with template arguments.  As an example:
+
+.. code-block:: text
+
+  class ModRefVal<bits<2> val> {
+    bits<2> Value = val;
+  }
+
+  def None   : ModRefVal<0>;
+  def Mod    : ModRefVal<1>;
+  def Ref    : ModRefVal<2>;
+  def ModRef : ModRefVal<3>;
+
+  class Value<ModRefVal MR> {
+    // Decode some information into a more convenient format, while providing
+    // a nice interface to the user of the "Value" class.
+    bit isMod = MR.Value{0};
+    bit isRef = MR.Value{1};
+
+    // other stuff...
+  }
+
+  // Example uses
+  def bork : Value<Mod>;
+  def zork : Value<Ref>;
+  def hork : Value<ModRef>;
+
+This is obviously a contrived example, but it shows how template arguments can
+be used to decouple the interface provided to the user of the class from the
+actual internal data representation expected by the class.  In this case,
+running ``llvm-tblgen`` on the example prints the following definitions:
+
+.. code-block:: text
+
+  def bork {      // Value
+    bit isMod = 1;
+    bit isRef = 0;
+  }
+  def hork {      // Value
+    bit isMod = 1;
+    bit isRef = 1;
+  }
+  def zork {      // Value
+    bit isMod = 0;
+    bit isRef = 1;
+  }
+
+This shows that TableGen was able to dig into the argument and extract a piece
+of information that was requested by the designer of the "Value" class.  For
+more realistic examples, please see existing users of TableGen, such as the X86
+backend.
+
+Multiclass definitions and instances
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+While classes with template arguments are a good way to factor commonality
+between two instances of a definition, multiclasses allow a convenient notation
+for defining multiple definitions at once (instances of implicitly constructed
+classes).  For example, consider a 3-address instruction set whose instructions
+come in two forms: "``reg = reg op reg``" and "``reg = reg op imm``"
+(e.g. SPARC). In this case, you'd like to specify in one place that this
+commonality exists, then in a separate place indicate what all the ops are.
+
+Here is an example TableGen fragment that shows this idea:
+
+.. code-block:: text
+
+  def ops;
+  def GPR;
+  def Imm;
+  class inst<int opc, string asmstr, dag operandlist>;
+
+  multiclass ri_inst<int opc, string asmstr> {
+    def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
+                   (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
+    def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
+                   (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
+  }
+
+  // Instantiations of the ri_inst multiclass.
+  defm ADD : ri_inst<0b111, "add">;
+  defm SUB : ri_inst<0b101, "sub">;
+  defm MUL : ri_inst<0b100, "mul">;
+  ...
+
+Each resulting definition is named by appending the name fragment of a ``def``
+in the multiclass to the ``defm`` name, so this defines ``ADD_rr``,
+``ADD_ri``, ``SUB_rr``, etc.  A ``defm`` may
+inherit from multiple multiclasses, instantiating definitions from each
+multiclass.  Using a multiclass this way is exactly equivalent to instantiating
+the classes multiple times yourself, e.g. by writing:
+
+.. code-block:: text
+
+  def ops;
+  def GPR;
+  def Imm;
+  class inst<int opc, string asmstr, dag operandlist>;
+
+  class rrinst<int opc, string asmstr>
+    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
+           (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
+
+  class riinst<int opc, string asmstr>
+    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
+           (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
+
+  // Instantiations of the rrinst and riinst classes.
+  def ADD_rr : rrinst<0b111, "add">;
+  def ADD_ri : riinst<0b111, "add">;
+  def SUB_rr : rrinst<0b101, "sub">;
+  def SUB_ri : riinst<0b101, "sub">;
+  def MUL_rr : rrinst<0b100, "mul">;
+  def MUL_ri : riinst<0b100, "mul">;
+  ...
+
+A ``defm`` can also be used inside a multiclass, providing several levels of
+multiclass instantiations.
+
+.. code-block:: text
+
+  class Instruction<bits<4> opc, string Name> {
+    bits<4> opcode = opc;
+    string name = Name;
+  }
+
+  multiclass basic_r<bits<4> opc> {
+    def rr : Instruction<opc, "rr">;
+    def rm : Instruction<opc, "rm">;
+  }
+
+  multiclass basic_s<bits<4> opc> {
+    defm SS : basic_r<opc>;
+    defm SD : basic_r<opc>;
+    def X : Instruction<opc, "x">;
+  }
+
+  multiclass basic_p<bits<4> opc> {
+    defm PS : basic_r<opc>;
+    defm PD : basic_r<opc>;
+    def Y : Instruction<opc, "y">;
+  }
+
+  defm ADD : basic_s<0xf>, basic_p<0xf>;
+  ...
+
+  // Results
+  def ADDPDrm { ...
+  def ADDPDrr { ...
+  def ADDPSrm { ...
+  def ADDPSrr { ...
+  def ADDSDrm { ...
+  def ADDSDrr { ...
+  def ADDY { ...
+  def ADDX { ...
+
+``defm`` declarations can inherit from classes too.  The rule to follow is that
+the class list must start after the last multiclass, and there must be at least
+one multiclass before it.
+
+.. code-block:: text
+
+  class XD { bits<4> Prefix = 11; }
+  class XS { bits<4> Prefix = 12; }
+
+  class I<bits<4> op> {
+    bits<4> opcode = op;
+  }
+
+  multiclass R {
+    def rr : I<4>;
+    def rm : I<2>;
+  }
+
+  multiclass Y {
+    defm SS : R, XD;
+    defm SD : R, XS;
+  }
+
+  defm Instr : Y;
+
+  // Results
+  def InstrSDrm {
+    bits<4> opcode = { 0, 0, 1, 0 };
+    bits<4> Prefix = { 1, 1, 0, 0 };
+  }
+  ...
+  def InstrSSrr {
+    bits<4> opcode = { 0, 1, 0, 0 };
+    bits<4> Prefix = { 1, 0, 1, 1 };
+  }
+
+File scope entities
+-------------------
+
+File inclusion
+^^^^^^^^^^^^^^
+
+TableGen supports the '``include``' token, which textually substitutes the
+specified file in place of the include directive.  The filename should be
+specified as a double quoted string immediately after the '``include``' keyword.
+Example:
+
+.. code-block:: text
+
+  include "foo.td"
+
+'let' expressions
+^^^^^^^^^^^^^^^^^
+
+"Let" expressions at file scope are similar to `"let" expressions within a
+record`_, except they can specify a value binding for multiple records at a
+time, and may be useful in certain other cases.  File-scope let expressions are
+really just another way that TableGen allows the end-user to factor out
+commonality from the records.
+
+File-scope "let" expressions take a comma-separated list of bindings to apply,
+and one or more records to bind the values in.  Here are some examples:
+
+.. code-block:: text
+
+  let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in
+    def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>;
+
+  let isCall = 1 in
+    // All calls clobber the non-callee saved registers...
+    let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
+                MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
+                XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in {
+      def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops),
+                             "call\t${dst:call}", []>;
+      def CALL32r     : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
+                          "call\t{*}$dst", [(X86call GR32:$dst)]>;
+      def CALL32m     : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
+                          "call\t{*}$dst", []>;
+    }
+
+File-scope "let" expressions are often useful when a couple of definitions need
+to be added to several records, and the records do not otherwise need to be
+opened, as in the case with the ``CALL*`` instructions above.
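+
+The X86 fragments above depend on classes defined elsewhere.  A self-contained
+sketch of the same mechanism, using a hypothetical ``Inst`` class, might look
+like this:
+
+.. code-block:: text
+
+  class Inst {
+    bit isCall = 0;
+    list<int> Defs = [];
+  }
+
+  // Both bindings apply to every record defined inside the braces.
+  let isCall = 1, Defs = [1, 2] in {
+    def CALLa : Inst;
+    def CALLb : Inst;
+  }
+
+  // Defined outside the "let", so it keeps the defaults from Inst.
+  def NOP : Inst;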
+
+It's also possible to use "let" expressions inside multiclasses, providing more
+ways to factor out commonality from the records, especially when using several
+levels of multiclass instantiations. This also avoids the need to use "let"
+expressions within subsequent records inside a multiclass.
+
+.. code-block:: text
+
+  multiclass basic_r<bits<4> opc> {
+    let Predicates = [HasSSE2] in {
+      def rr : Instruction<opc, "rr">;
+      def rm : Instruction<opc, "rm">;
+    }
+    let Predicates = [HasSSE3] in
+      def rx : Instruction<opc, "rx">;
+  }
+
+  multiclass basic_ss<bits<4> opc> {
+    let IsDouble = 0 in
+      defm SS : basic_r<opc>;
+
+    let IsDouble = 1 in
+      defm SD : basic_r<opc>;
+  }
+
+  defm ADD : basic_ss<0xf>;
+
+Looping
+^^^^^^^
+
+TableGen supports the '``foreach``' block, which textually replicates the loop
+body, substituting iterator values for iterator references in the body.
+Example:
+
+.. code-block:: text
+
+  foreach i = [0, 1, 2, 3] in {
+    def R#i : Register<...>;
+    def F#i : Register<...>;
+  }
+
+This will create objects ``R0``, ``R1``, ``R2`` and ``R3``, as well as ``F0``,
+``F1``, ``F2`` and ``F3``.  ``foreach`` blocks may be nested. If there is only
+one item in the body, the braces may be elided:
+
+.. code-block:: text
+
+  foreach i = [0, 1, 2, 3] in
+    def R#i : Register<...>;
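+
+The ``Register<...>`` arguments above are elided.  As a self-contained sketch,
+assuming a hypothetical ``Register`` class that records only an index, the
+iterator can also be passed as a template argument:
+
+.. code-block:: text
+
+  class Register<int idx> {
+    int Index = idx;
+  }
+
+  // Defines R0 through R3, with Index set to 0 through 3.
+  foreach i = [0, 1, 2, 3] in
+    def R#i : Register<i>;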
+
+Code Generator backend info
+===========================
+
+Expressions used by the code generator to describe instructions and isel patterns:
+
+``(implicit a)``
+    an implicitly defined physical register.  This tells the dag instruction
+    selection emitter that the input pattern's extra definitions match implicit
+    physical register definitions.
+

Added: www-releases/trunk/9.0.0/docs/_sources/TableGen/LangRef.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TableGen/LangRef.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TableGen/LangRef.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TableGen/LangRef.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,505 @@
+===========================
+TableGen Language Reference
+===========================
+
+.. contents::
+   :local:
+
+.. warning::
+   This document is extremely rough. If you find something lacking, please
+   fix it, file a documentation bug, or ask about it on llvm-dev.
+
+Introduction
+============
+
+This document is meant to be a normative spec about the TableGen language
+in and of itself (i.e. how to understand a given construct in terms of how
+it affects the final set of records represented by the TableGen file). If
+you are unsure if this document is really what you are looking for, please
+read the :doc:`introduction to TableGen <index>` first.
+
+Notation
+========
+
+The lexical and syntax notation used here is intended to imitate
+`Python's`_. In particular, for lexical definitions, the productions
+operate at the character level and there is no implied whitespace between
+elements. The syntax definitions operate at the token level, so there is
+implied whitespace between tokens.
+
+.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation
+
+Lexical Analysis
+================
+
+TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
+comments.  TableGen also provides simple `Preprocessing Support`_.
+
+The following is a listing of the basic punctuation tokens::
+
+   - + [ ] { } ( ) < > : ; .  = ? #
+
+Numeric literals take one of the following forms:
+
+.. TableGen actually will lex some pretty strange sequences and interpret
+   them as numbers. What is shown here is an attempt to approximate what it
+   "should" accept.
+
+.. productionlist::
+   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
+   DecimalInteger: ["+" | "-"] ("0"..."9")+
+   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
+   BinInteger: "0b" ("0" | "1")+
+
+One aspect to note is that the :token:`DecimalInteger` token *includes* the
+``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
+most languages do.
+
+Also note that :token:`BinInteger` creates a value of type ``bits<n>``
+(where ``n`` is the number of bits).  This will implicitly convert to
+integers when needed.
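+
+A short sketch of the three literal forms (the record ``Literals`` and its
+field names are hypothetical):
+
+.. code-block:: text
+
+  def Literals {
+    int Dec = -42;           // the "-" is part of the DecimalInteger token
+    int Hex = 0x2A;
+    bits<4> Bin = 0b1010;    // BinInteger with four bits
+    int FromBin = 0b0011;    // bits<4> value implicitly converted to int
+  }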
+
+TableGen has identifier-like tokens:
+
+.. productionlist::
+   ualpha: "a"..."z" | "A"..."Z" | "_"
+   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
+   TokVarName: "$" `ualpha` (`ualpha` |  "0"..."9")*
+
+Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
+begin with a number. In case of ambiguity, a token will be interpreted as a
+numeric literal rather than an identifier.
+
+TableGen also has two string-like literals:
+
+.. productionlist::
+   TokString: '"' <non-'"' characters and C-like escapes> '"'
+   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"
+
+:token:`TokCodeFragment` is essentially a multiline string literal
+delimited by ``[{`` and ``}]``.
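+
+For example, a sketch of both literal kinds in a hypothetical record named
+``Fragment``:
+
+.. code-block:: text
+
+  def Fragment {
+    string Greeting = "hello";
+    code Body = [{
+      // Text here is kept verbatim, including line breaks.
+      return x + 1;
+    }];
+  }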
+
+.. note::
+   The current implementation accepts the following C-like escapes::
+
+      \\ \' \" \t \n
+
+TableGen also has the following keywords::
+
+   bit   bits      class   code         dag
+   def   foreach   defm    field        in
+   int   let       list    multiclass   string
+
+TableGen also has "bang operators" which have a
+wide variety of meanings:
+
+.. productionlist::
+   BangOperator: one of
+               :!eq     !if      !head    !tail      !con
+               :!add    !shl     !sra     !srl       !and
+               :!or     !empty   !subst   !foreach   !strconcat
+               :!cast   !listconcat       !size      !foldl
+               :!isa    !dag     !le      !lt        !ge
+               :!gt     !ne      !mul     !listsplat
+
+TableGen also has the ``!cond`` operator, which needs a slightly different
+syntax compared to other "bang operators":
+
+.. productionlist::
+   CondOperator: !cond
+
+
+Syntax
+======
+
+TableGen has an ``include`` mechanism. It does not play a role in the
+syntax per se, since it is lexically replaced with the contents of the
+included file.
+
+.. productionlist::
+   IncludeDirective: "include" `TokString`
+
+TableGen's top-level production consists of "objects".
+
+.. productionlist::
+   TableGenFile: `Object`*
+   Object: `Class` | `Def` | `Defm` | `Defset` | `Let` | `MultiClass` |
+           `Foreach`
+
+``class``\es
+------------
+
+.. productionlist::
+   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`
+   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"
+
+A ``class`` declaration creates a record which other records can inherit
+from. A class can be parametrized by a list of "template arguments", whose
+values can be used in the class body.
+
+A given class can only be defined once. A ``class`` declaration is
+considered to define the class if any of the following is true:
+
+.. break ObjectBody into its constituents so that they are present here?
+
+#. The :token:`TemplateArgList` is present.
+#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
+#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.
+
+You can declare an empty class by giving an empty :token:`TemplateArgList`
+and an empty :token:`ObjectBody`. This can serve as a restricted form of
+forward declaration: note that records deriving from the forward-declared
+class will inherit no fields from it since the record expansion is done
+when the record is parsed.
+
+Every class has an implicit template argument called ``NAME``, which is set
+to the name of the instantiating ``def`` or ``defm``. The result is undefined
+if the class is instantiated by an anonymous record.
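+
+A minimal sketch of ``NAME`` in use (the class ``Named`` and the field
+``Label`` are hypothetical):
+
+.. code-block:: text
+
+  class Named {
+    string Label = NAME;   // resolved when a record instantiates the class
+  }
+  def Foo : Named;         // Label is expected to be "Foo"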
+
+Declarations
+------------
+
+.. Omitting mention of arcane "field" prefix to discourage its use.
+
+The declaration syntax is pretty much what you would expect as a C++
+programmer.
+
+.. productionlist::
+   Declaration: `Type` `TokIdentifier` ["=" `Value`]
+
+It assigns the value to the identifier.
+
+Types
+-----
+
+.. productionlist::
+   Type: "string" | "code" | "bit" | "int" | "dag"
+       :| "bits" "<" `TokInteger` ">"
+       :| "list" "<" `Type` ">"
+       :| `ClassID`
+   ClassID: `TokIdentifier`
+
+Both ``string`` and ``code`` correspond to the string type; the difference
+is purely to indicate programmer intention.
+
+The :token:`ClassID` must identify a class that has been previously
+declared or defined.
+
+Values
+------
+
+.. productionlist::
+   Value: `SimpleValue` `ValueSuffix`*
+   ValueSuffix: "{" `RangeList` "}"
+              :| "[" `RangeList` "]"
+              :| "." `TokIdentifier`
+   RangeList: `RangePiece` ("," `RangePiece`)*
+   RangePiece: `TokInteger`
+             :| `TokInteger` "-" `TokInteger`
+             :| `TokInteger` `TokInteger`
+
+The peculiar last form of :token:`RangePiece` is due to the fact that the
+"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
+two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
+instead of "1", "-", and "5".
+The :token:`RangeList` can be thought of as specifying a "list slice" in some
+contexts.
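+
+As a sketch of both suffix forms that take a :token:`RangeList` (the record
+``Slices`` and its fields are hypothetical):
+
+.. code-block:: text
+
+  def Slices {
+    bits<8> Byte = 0b10110100;
+    bits<4> High = Byte{7-4};           // bits 7 down to 4 of Byte
+    bit Low = Byte{0};                  // a single bit
+    list<int> Items = [10, 20, 30, 40];
+    list<int> Middle = Items[1-2];      // list slice: elements 1 and 2
+  }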
+
+
+:token:`SimpleValue` has a number of forms:
+
+
+.. productionlist::
+   SimpleValue: `TokIdentifier`
+
+The value will be the variable referenced by the identifier. It can be one
+of:
+
+.. The code for this is exceptionally abstruse. These examples are a
+   best-effort attempt.
+
+* name of a ``def``, such as the use of ``Bar`` in::
+
+     def Bar : SomeClass {
+       int X = 5;
+     }
+
+     def Foo {
+       SomeClass Baz = Bar;
+     }
+
+* value local to a ``def``, such as the use of ``Bar`` in::
+
+     def Foo {
+       int Bar = 5;
+       int Baz = Bar;
+     }
+
+  Values defined in superclasses can be accessed the same way.
+
+* a template arg of a ``class``, such as the use of ``Bar`` in::
+
+     class Foo<int Bar> {
+       int Baz = Bar;
+     }
+
+* value local to a ``class``, such as the use of ``Bar`` in::
+
+     class Foo {
+       int Bar = 5;
+       int Baz = Bar;
+     }
+
+* a template arg to a ``multiclass``, such as the use of ``Bar`` in::
+
+     multiclass Foo<int Bar> {
+       def : SomeClass<Bar>;
+     }
+
+* the iteration variable of a ``foreach``, such as the use of ``i`` in::
+
+     foreach i = 0-5 in
+     def Foo#i;
+
+* a variable defined by ``defset``
+
+* the implicit template argument ``NAME`` in a ``class`` or ``multiclass``
+
+.. productionlist::
+   SimpleValue: `TokInteger`
+
+This represents the numeric value of the integer.
+
+.. productionlist::
+   SimpleValue: `TokString`+
+
+Multiple adjacent string literals are concatenated like in C/C++. The value
+is the concatenation of the strings.
+
+.. productionlist::
+   SimpleValue: `TokCodeFragment`
+
+The value is the string value of the code fragment.
+
+.. productionlist::
+   SimpleValue: "?"
+
+``?`` represents an "unset" initializer.
+
+.. productionlist::
+   SimpleValue: "{" `ValueList` "}"
+   ValueList: [`ValueListNE`]
+   ValueListNE: `Value` ("," `Value`)*
+
+This represents a sequence of bits, as would be used to initialize a
+``bits<n>`` field (where ``n`` is the number of bits).
+
+.. productionlist::
+   SimpleValue: `ClassID` "<" `ValueListNE` ">"
+
+This generates a new anonymous record definition (as would be created by an
+unnamed ``def`` inheriting from the given class with the given template
+arguments) and the value is the value of that record definition.
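+
+A sketch of this form, using a hypothetical ``Pair`` class:
+
+.. code-block:: text
+
+  class Pair<int a, int b> {
+    int Sum = !add(a, b);
+  }
+
+  def Use {
+    Pair P = Pair<1, 2>;       // the value is an anonymous record of class Pair
+    int S = Pair<3, 4>.Sum;    // a field of the anonymous record
+  }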
+
+.. productionlist::
+   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]
+
+A list initializer. The optional :token:`Type` can be used to indicate a
+specific element type, otherwise the element type will be deduced from the
+given values.
+
+.. The initial `DagArg` of the dag must start with an identifier or
+   !cast, but this is more of an implementation detail and so for now just
+   leave it out.
+
+.. productionlist::
+   SimpleValue: "(" `DagArg` [`DagArgList`] ")"
+   DagArgList: `DagArg` ("," `DagArg`)*
+   DagArg: `Value` [":" `TokVarName`] | `TokVarName`
+
+The initial :token:`DagArg` is called the "operator" of the dag.
+
+.. productionlist::
+   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
+              :| `CondOperator` "(" `CondVal` ("," `CondVal`)* ")"
+   CondVal: `Value` ":" `Value`
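+
+As a sketch of the ``!cond`` syntax (the ``Sign`` class and its field are
+hypothetical), each :token:`CondVal` pairs a condition with a result, and a
+constant ``1`` condition can serve as a catch-all:
+
+.. code-block:: text
+
+  class Sign<int x> {
+    string Text = !cond(!lt(x, 0) : "negative",
+                        !eq(x, 0) : "zero",
+                        1         : "positive");
+  }
+  def Neg : Sign<-3>;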
+
+Bodies
+------
+
+.. productionlist::
+   ObjectBody: `BaseClassList` `Body`
+   BaseClassList: [":" `BaseClassListNE`]
+   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
+   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
+   DefmID: `TokIdentifier`
+
+The version with the :token:`MultiClassID` is only valid in the
+:token:`BaseClassList` of a ``defm``.
+The :token:`MultiClassID` should be the name of a ``multiclass``.
+
+.. put this somewhere else
+
+It is after parsing the base class list that the "let stack" is applied.
+
+.. productionlist::
+   Body: ";" | "{" BodyList "}"
+   BodyList: BodyItem*
+   BodyItem: `Declaration` ";"
+           :| "let" `TokIdentifier` [ "{" `RangeList` "}" ] "=" `Value` ";"
+
+The ``let`` form allows overriding the value of an inherited field.
+
+``def``
+-------
+
+.. productionlist::
+   Def: "def" [`Value`] `ObjectBody`
+
+Defines a record whose name is given by the optional :token:`Value`. The value
+is parsed in a special mode where global identifiers (records and variables
+defined by ``defset``) are not recognized, and all unrecognized identifiers
+are interpreted as strings.
+
+If no name is given, the record is anonymous. The final name of anonymous
+records is undefined, but globally unique.
+
+Special handling occurs if this ``def`` appears inside a ``multiclass`` or
+a ``foreach``.
+
+When a non-anonymous record is defined in a multiclass and the given name
+does not contain a reference to the implicit template argument ``NAME``, such
+a reference will automatically be prepended. That is, the following are
+equivalent inside a multiclass::
+
+    def Foo;
+    def NAME#Foo;
+
+``defm``
+--------
+
+.. productionlist::
+   Defm: "defm" [`Value`] ":" `BaseClassListNE` ";"
+
+The :token:`BaseClassList` is a list of at least one ``multiclass`` and any
+number of ``class``'s. The ``multiclass``'s must occur before any ``class``'s.
+
+Instantiates all records defined in all given ``multiclass``'s and adds the
+given ``class``'s as superclasses.
+
+The name is parsed in the same special mode used by ``def``. If the name is
+missing, a globally unique string is used instead (but instantiated records
+are not considered to be anonymous, unless they were originally defined by an
+anonymous ``def``). That is, the following have different semantics::
+
+    defm : SomeMultiClass<...>;    // some globally unique name
+    defm "" : SomeMultiClass<...>; // empty name string
+
+When it occurs inside a multiclass, the second variant is equivalent to
+``defm NAME : ...``. More generally, when ``defm`` occurs in a multiclass and
+its name does not contain a reference to the implicit template argument
+``NAME``, such a reference will automatically be prepended. That is, the
+following are equivalent inside a multiclass::
+
+    defm Foo : SomeMultiClass<...>;
+    defm NAME#Foo : SomeMultiClass<...>;
+
+``defset``
+----------
+.. productionlist::
+   Defset: "defset" `Type` `TokIdentifier` "=" "{" `Object`* "}"
+
+All records defined inside the braces via ``def`` and ``defm`` are collected
+in a globally accessible list of the given name (in addition to being added
+to the global collection of records as usual). Anonymous records created inside
+initializer expressions using the ``Class<args...>`` syntax are never collected
+in a defset.
+
+The given type must be ``list<A>``, where ``A`` is some class. It is an error
+to define a record (via ``def`` or ``defm``) inside the braces which doesn't
+derive from ``A``.
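+
+A minimal sketch (the class ``Reg`` and all record names are hypothetical):
+
+.. code-block:: text
+
+  class Reg {
+    int Size = 32;
+  }
+
+  defset list<Reg> AllRegs = {
+    def R0 : Reg;
+    def R1 : Reg;
+  }
+
+  // AllRegs can now be used like any other list value.
+  def Summary {
+    list<Reg> Regs = AllRegs;
+  }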
+
+``foreach``
+-----------
+
+.. productionlist::
+   Foreach: "foreach" `ForeachDeclaration` "in" "{" `Object`* "}"
+          :| "foreach" `ForeachDeclaration` "in" `Object`
+   ForeachDeclaration: ID "=" ( "{" `RangeList` "}" | `RangePiece` | `Value` )
+
+The value assigned to the variable in the declaration is iterated over and
+the object or object list is reevaluated with the variable set at each
+iterated value.
+
+Note that the productions involving RangeList and RangePiece have precedence
+over the more generic value parsing based on the first token.
+
+Top-Level ``let``
+-----------------
+
+.. productionlist::
+   Let:  "let" `LetList` "in" "{" `Object`* "}"
+      :| "let" `LetList` "in" `Object`
+   LetList: `LetItem` ("," `LetItem`)*
+   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`
+
+This is effectively equivalent to ``let`` inside the body of a record
+except that it applies to multiple records at a time. The bindings are
+applied at the end of parsing the base classes of a record.
+
+``multiclass``
+--------------
+
+.. productionlist::
+   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
+             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
+   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
+   MultiClassID: `TokIdentifier`
+   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
+
+Preprocessing Support
+=====================
+
+TableGen's embedded preprocessor is only intended for conditional compilation.
+It supports the following directives:
+
+.. productionlist::
+   LineBegin: ^
+   LineEnd: "\n" | "\r" | EOF
+   WhiteSpace: " " | "\t"
+   CStyleComment: "/*" (.* - "*/") "*/"
+   BCPLComment: "//" (.* - `LineEnd`) `LineEnd`
+   WhiteSpaceOrCStyleComment: `WhiteSpace` | `CStyleComment`
+   WhiteSpaceOrAnyComment: `WhiteSpace` | `CStyleComment` | `BCPLComment`
+   MacroName: `ualpha` (`ualpha` | "0"..."9")*
+   PrepDefine: `LineBegin` (`WhiteSpaceOrCStyleComment`)*
+             : "#define" (`WhiteSpace`)+ `MacroName`
+             : (`WhiteSpaceOrAnyComment`)* `LineEnd`
+   PrepIfdef: `LineBegin` (`WhiteSpaceOrCStyleComment`)*
+            : "#ifdef" (`WhiteSpace`)+ `MacroName`
+            : (`WhiteSpaceOrAnyComment`)* `LineEnd`
+   PrepElse: `LineBegin` (`WhiteSpaceOrCStyleComment`)*
+           : "#else" (`WhiteSpaceOrAnyComment`)* `LineEnd`
+   PrepEndif: `LineBegin` (`WhiteSpaceOrCStyleComment`)*
+            : "#endif" (`WhiteSpaceOrAnyComment`)* `LineEnd`
+   PrepRegContentException: `PrepIfdef` | `PrepElse` | `PrepEndif` | EOF
+   PrepRegion: .* - `PrepRegContentException`
+             :| `PrepIfdef`
+             :  (`PrepRegion`)*
+             :  [`PrepElse`]
+             :  (`PrepRegion`)*
+             :  `PrepEndif`
+
+:token:`PrepRegion` may occur anywhere in a TD file, as long as it matches
+the grammar specification.
+
+:token:`PrepDefine` defines a :token:`MacroName`.  Once the macro is defined,
+the part of any following preprocessing region that tests it, from
+:token:`PrepIfdef` to :token:`PrepElse` (or to :token:`PrepEndif` if there is
+no :token:`PrepElse`), is enabled for TableGen token parsing.
+
+A preprocessing region, starting (i.e. having its :token:`PrepIfdef`) in a file,
+must end (i.e. have its :token:`PrepEndif`) in the same file.
+
+A :token:`MacroName` may be defined externally by using the ``{ -D<NAME> }``
+option of TableGen.
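+
+A sketch of conditional compilation with these directives (the macro name
+``ENABLE_FOO`` and the record names are hypothetical):
+
+.. code-block:: text
+
+  #ifdef ENABLE_FOO
+  def Foo;
+  #else
+  def NoFoo;
+  #endif
+
+Under these assumptions, ``Foo`` would be defined only when ``ENABLE_FOO`` is
+set, either by an earlier ``#define ENABLE_FOO`` line or externally with
+``-DENABLE_FOO``.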

Added: www-releases/trunk/9.0.0/docs/_sources/TableGen/index.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TableGen/index.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TableGen/index.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TableGen/index.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,304 @@
+========
+TableGen
+========
+
+.. contents::
+   :local:
+
+.. toctree::
+   :hidden:
+
+   BackEnds
+   LangRef
+   LangIntro
+   Deficiencies
+
+Introduction
+============
+
+TableGen's purpose is to help a human develop and maintain records of
+domain-specific information.  Because there may be a large number of these
+records, it is specifically designed to allow writing flexible descriptions and
+for common features of these records to be factored out.  This reduces the
+amount of duplication in the description, reduces the chance of error, and makes
+it easier to structure domain specific information.
+
+The core part of TableGen parses a file, instantiates the declarations, and
+hands the result off to a domain-specific `backend`_ for processing.
+
+The current major users of TableGen are :doc:`../CodeGenerator`
+and the
+`Clang diagnostics and attributes <http://clang.llvm.org/docs/UsersManual.html#controlling-errors-and-warnings>`_.
+
+Note that if you work on TableGen much, and use emacs or vim, you can find
+an emacs "TableGen mode" and a vim language file in the ``llvm/utils/emacs`` and
+``llvm/utils/vim`` directories of your LLVM distribution, respectively.
+
+.. _intro:
+
+
+The TableGen program
+====================
+
+TableGen files are interpreted by the TableGen program, ``llvm-tblgen``, which
+is available in your build directory under ``bin``. It is not installed in the
+system (or where your sysroot is set to), since it has no use beyond LLVM's
+build process.
+
+Running TableGen
+----------------
+
+TableGen runs just like any other LLVM tool.  The first (optional) argument
+specifies the file to read.  If a filename is not specified, ``llvm-tblgen``
+reads from standard input.
+
+To be useful, one of the `backends`_ must be used.  These backends are
+selectable on the command line (type '``llvm-tblgen -help``' for a list).  For
+example, to get a list of all of the definitions that subclass a particular type
+(which can be useful for building up an enum list of these records), use the
+``-print-enums`` option:
+
+.. code-block:: bash
+
+  $ llvm-tblgen X86.td -print-enums -class=Register
+  AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX,
+  ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP,
+  MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D,
+  R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15,
+  R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI,
+  RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7,
+  XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5,
+  XMM6, XMM7, XMM8, XMM9,
+
+  $ llvm-tblgen X86.td -print-enums -class=Instruction 
+  ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri,
+  ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8,
+  ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm,
+  ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr,
+  ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ...
+
+The default backend prints out all of the records. There is also a general
+backend which outputs all the records as a JSON data structure, enabled using
+the ``-dump-json`` option.
+
+If you plan to use TableGen, you will most likely have to write a `backend`_
+that extracts the information specific to what you need and formats it in the
+appropriate way. You can do this by extending TableGen itself in C++, or by
+writing a script in any language that can consume the JSON output.
+
+Example
+-------
+
+With no other arguments, ``llvm-tblgen`` parses the specified file and prints out all
+of the classes, then all of the definitions.  This is a good way to see what the
+various definitions expand to fully.  Running this on the ``X86.td`` file prints
+this (at the time of this writing):
+
+.. code-block:: text
+
+  ...
+  def ADD32rr {   // Instruction X86Inst I
+    string Namespace = "X86";
+    dag OutOperandList = (outs GR32:$dst);
+    dag InOperandList = (ins GR32:$src1, GR32:$src2);
+    string AsmString = "add{l}\t{$src2, $dst|$dst, $src2}";
+    list<dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))];
+    list<Register> Uses = [];
+    list<Register> Defs = [EFLAGS];
+    list<Predicate> Predicates = [];
+    int CodeSize = 3;
+    int AddedComplexity = 0;
+    bit isReturn = 0;
+    bit isBranch = 0;
+    bit isIndirectBranch = 0;
+    bit isBarrier = 0;
+    bit isCall = 0;
+    bit canFoldAsLoad = 0;
+    bit mayLoad = 0;
+    bit mayStore = 0;
+    bit isImplicitDef = 0;
+    bit isConvertibleToThreeAddress = 1;
+    bit isCommutable = 1;
+    bit isTerminator = 0;
+    bit isReMaterializable = 0;
+    bit isPredicable = 0;
+    bit hasDelaySlot = 0;
+    bit usesCustomInserter = 0;
+    bit hasCtrlDep = 0;
+    bit isNotDuplicable = 0;
+    bit hasSideEffects = 0;
+    InstrItinClass Itinerary = NoItinerary;
+    string Constraints = "";
+    string DisableEncoding = "";
+    bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 };
+    Format Form = MRMDestReg;
+    bits<6> FormBits = { 0, 0, 0, 0, 1, 1 };
+    ImmType ImmT = NoImm;
+    bits<3> ImmTypeBits = { 0, 0, 0 };
+    bit hasOpSizePrefix = 0;
+    bit hasAdSizePrefix = 0;
+    bits<4> Prefix = { 0, 0, 0, 0 };
+    bit hasREX_WPrefix = 0;
+    FPFormat FPForm = ?;
+    bits<3> FPFormBits = { 0, 0, 0 };
+  }
+  ...
+
+This definition corresponds to the 32-bit register-register ``add`` instruction
+of the x86 architecture.  ``def ADD32rr`` defines a record named
+``ADD32rr``, and the comment at the end of the line indicates the superclasses
+of the definition.  The body of the record contains all of the data that
+TableGen assembled for the record, indicating that the instruction is part of
+the "X86" namespace, the pattern indicating how the instruction is selected by
+the code generator, that it is a two-address instruction, has a particular
+encoding, etc.  The contents and semantics of the information in the record are
+specific to the needs of the X86 backend, and are only shown as an example.
+
+As you can see, a lot of information is needed for every instruction supported
+by the code generator, and specifying it all manually would be unmaintainable,
+prone to bugs, and tiring to do in the first place.  Because we are using
+TableGen, all of the information was derived from the following definition:
+
+.. code-block:: text
+
+  let Defs = [EFLAGS],
+      isCommutable = 1,                  // X = ADD Y,Z --> X = ADD Z,Y
+      isConvertibleToThreeAddress = 1 in // Can transform into LEA.
+  def ADD32rr  : I<0x01, MRMDestReg, (outs GR32:$dst),
+                                     (ins GR32:$src1, GR32:$src2),
+                   "add{l}\t{$src2, $dst|$dst, $src2}",
+                   [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
+
+This definition makes use of the custom class ``I`` (extended from the custom
+class ``X86Inst``), which is defined in the X86-specific TableGen file, to
+factor out the common features that instructions of its class share.  A key
+feature of TableGen is that it allows the end-user to define the abstractions
+they prefer to use when describing their information.
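+
+The ``let ... in`` form used in the definition above overrides field values for
+every definition it encloses. The following is only a minimal, self-contained
+sketch of that mechanism: the ``Inst`` class and the record names are
+hypothetical and are not taken from the X86 backend.
+
+.. code-block:: text
+
+  // Hypothetical class with a default value for one field.
+  class Inst {
+    bit isCommutable = 0;
+  }
+
+  // Every definition inside the braces gets isCommutable = 1.
+  let isCommutable = 1 in {
+    def ADD : Inst;
+    def MUL : Inst;
+  }
+
+  // Outside the 'let', the class default (0) still applies.
+  def SUB : Inst;
+
+Running ``llvm-tblgen`` on such a file with no backend option should print the
+expanded records, which is a convenient way to check which values a ``let``
+actually changed.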
+
+Syntax
+======
+
+TableGen has a syntax that is loosely based on C++ templates, with built-in
+types and specification. In addition, TableGen's syntax introduces some
+automation concepts like multiclass, foreach, let, etc.
+
+Basic concepts
+--------------
+
+TableGen files consist of two key parts: 'classes' and 'definitions', both of
+which are considered 'records'.
+
+**TableGen records** have a unique name, a list of values, and a list of
+superclasses.  The list of values is the main data that TableGen builds for each
+record; it is this that holds the domain specific information for the
+application.  The interpretation of this data is left to a specific `backend`_,
+but the structure and format rules are taken care of and are fixed by
+TableGen.
+
+**TableGen definitions** are the concrete form of 'records'.  These generally do
+not have any undefined values, and are marked with the '``def``' keyword.
+
+.. code-block:: text
+
+  def FeatureFPARMv8 : SubtargetFeature<"fp-armv8", "HasFPARMv8", "true",
+                                        "Enable ARMv8 FP">;
+
+In this example, ``FeatureFPARMv8`` is a ``SubtargetFeature`` record
+initialised with some values. Classes are defined with the ``class`` keyword,
+either in the same file or in an included one. Most target
+TableGen files include the generic ones in ``include/llvm/Target``.
+
+**TableGen classes** are abstract records that are used to build and describe
+other records.  These classes allow the end-user to build abstractions for
+either the domain they are targeting (such as "Register", "RegisterClass", and
+"Instruction" in the LLVM code generator) or for the implementor to help factor
+out common properties of records (such as "FPInst", which is used to represent
+floating point instructions in the X86 backend).  TableGen keeps track of all of
+the classes that are used to build up a definition, so the backend can find all
+definitions of a particular class, such as "Instruction".
+
+.. code-block:: text
+
+ class ProcNoItin<string Name, list<SubtargetFeature> Features>
+       : Processor<Name, NoItineraries, Features>;
+
+Here, the class ``ProcNoItin`` takes a parameter ``Name`` of type ``string`` and
+a list of target features, and specializes the class ``Processor`` by passing
+both arguments down while hard-coding ``NoItineraries``.
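+
+To make the relationship between classes and definitions concrete, here is a
+minimal sketch that does not depend on any LLVM include file. The ``Reg``
+class, its fields, and the record names are assumptions made up for
+illustration only.
+
+.. code-block:: text
+
+  // Hypothetical abstract record: two template arguments fill two fields.
+  class Reg<string n, int enc> {
+    string AsmName = n;
+    int Encoding = enc;
+  }
+
+  // Concrete records: each 'def' supplies the template arguments.
+  def R0 : Reg<"r0", 0>;
+  def R1 : Reg<"r1", 1>;
+
+Because TableGen tracks superclasses, a backend can then ask for all
+definitions derived from ``Reg``, in the same way the ``-print-enums`` example
+earlier lists everything derived from ``Register``.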
+
+**TableGen multiclasses** are groups of abstract records that are instantiated
+all at once.  Each instantiation can result in multiple TableGen definitions.
+If a multiclass inherits from another multiclass, the definitions in the
+sub-multiclass become part of the current multiclass, as if they were declared
+in the current multiclass.
+
+.. code-block:: text
+
+  multiclass ro_signed_pats<string T, string Rm, dag Base, dag Offset, dag Extend,
+                          dag address, ValueType sty> {
+  def : Pat<(i32 (!cast<SDNode>("sextload" # sty) address)),
+            (!cast<Instruction>("LDRS" # T # "w_" # Rm # "_RegOffset")
+              Base, Offset, Extend)>;
+
+  def : Pat<(i64 (!cast<SDNode>("sextload" # sty) address)),
+            (!cast<Instruction>("LDRS" # T # "x_" # Rm # "_RegOffset")
+              Base, Offset, Extend)>;
+  }
+
+  defm : ro_signed_pats<"B", Rm, Base, Offset, Extend,
+                        !foreach(decls.pattern, address,
+                                 !subst(SHIFT, imm_eq0, decls.pattern)),
+                        i8>;
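+
+The example above is an excerpt that relies on names defined elsewhere. As a
+smaller, self-contained sketch of the same mechanism (the ``Inst`` class, the
+``RegImm`` multiclass, and all record names here are hypothetical):
+
+.. code-block:: text
+
+  // Hypothetical class holding an assembly string.
+  class Inst<string asm> {
+    string AsmString = asm;
+  }
+
+  // Each instantiation of this multiclass creates two definitions.
+  multiclass RegImm<string mnemonic> {
+    def rr : Inst<!strconcat(mnemonic, " $dst, $src")>;
+    def ri : Inst<!strconcat(mnemonic, " $dst, $imm")>;
+  }
+
+  // Defines ADDrr and ADDri, then SUBrr and SUBri.
+  defm ADD : RegImm<"add">;
+  defm SUB : RegImm<"sub">;
+
+The name given to ``defm`` is prepended to each ``def`` name inside the
+multiclass, so two ``defm`` lines yield four records.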
+
+
+
+See the :doc:`TableGen Language Introduction <LangIntro>` for more general
+information on the usage of the language, and the
+:doc:`TableGen Language Reference <LangRef>` for a more in-depth description
+of the formal language specification.
+
+.. _backend:
+.. _backends:
+
+TableGen backends
+=================
+
+TableGen files have no real meaning without a back-end. The default operation
+of running ``llvm-tblgen`` is to print the information in a textual format, but
+that's only useful for debugging the TableGen files themselves. The real power
+of TableGen, however, is that it parses the source files into an internal
+representation from which a back-end can generate anything you want.
+
+TableGen is currently used to create huge include files with tables that you
+can either include directly (if the output is in the language you're coding in)
+or use in pre-processing via macros surrounding the include of the file.
+
+Direct output can be used if the back-end already prints a table in C format
+or if the output is just a list of strings (for error and warning messages).
+Pre-processed output should be used if the same information needs to be used
+in different contexts (like Instruction names), so your back-end should print
+a meta-information list that can be shaped into different compile-time formats.
+
+See the `TableGen BackEnds <BackEnds.html>`_ for more information.
+
+TableGen Deficiencies
+=====================
+
+Despite being very generic, TableGen has some deficiencies that have been
+pointed out numerous times. The common theme is that, while TableGen allows
+you to build Domain-Specific-Languages, the final languages that you create
+lack the power of other DSLs, which in turn considerably increases the size
+and complexity of TableGen files.
+
+At the same time, TableGen allows you to give virtually any meaning to
+the basic concepts via custom-made back-ends, which can pervert the original
+design and make it very hard for newcomers to understand the resulting
+TableGen file.
+
+There are some in favour of extending the semantics even more, but making sure
+back-ends adhere to strict rules. Others suggest we should move to fewer,
+more powerful DSLs designed with specific purposes, or even re-use existing
+DSLs.
+
+Either way, this is a discussion that will likely span across several years,
+if not decades. You can read more in the `TableGen Deficiencies <Deficiencies.html>`_
+document.

Added: www-releases/trunk/9.0.0/docs/_sources/TableGenFundamentals.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TableGenFundamentals.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TableGenFundamentals.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TableGenFundamentals.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,10 @@
+=====================
+TableGen Fundamentals
+=====================
+
+Moved
+=====
+
+The TableGen fundamentals documentation has moved to a directory of its own
+and is now available at :doc:`TableGen/index`. Please change your links to
+that page.

Added: www-releases/trunk/9.0.0/docs/_sources/TestSuiteGuide.md.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TestSuiteGuide.md.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TestSuiteGuide.md.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TestSuiteGuide.md.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,403 @@
+test-suite Guide
+================
+
+Quickstart
+----------
+
+1. The lit test runner is required to run the tests. You can either use one
+   from an LLVM build:
+
+   ```bash
+   % <path to llvm build>/bin/llvm-lit --version
+   lit 0.8.0dev
+   ```
+
+   An alternative is installing it as a python package in a python virtual
+   environment:
+
+   ```bash
+   % mkdir venv
+   % virtualenv venv
+   % . venv/bin/activate
+   % pip install svn+http://llvm.org/svn/llvm-project/llvm/trunk/utils/lit
+   % lit --version
+   lit 0.8.0dev
+   ```
+
+2. Check out the `test-suite` module with:
+
+   ```bash
+   % git clone https://github.com/llvm/llvm-test-suite.git test-suite
+   ```
+
+3. Create a build directory and use CMake to configure the suite. Use the
+   `CMAKE_C_COMPILER` option to specify the compiler to test. Use a cache file
+   to choose a typical build configuration:
+
+   ```bash
+   % mkdir test-suite-build
+   % cd test-suite-build
+   % cmake -DCMAKE_C_COMPILER=<path to llvm build>/bin/clang \
+           -C../test-suite/cmake/caches/O3.cmake \
+           ../test-suite
+   ```
+
+4. Build the benchmarks:
+
+   ```text
+   % make
+   Scanning dependencies of target timeit-target
+   [  0%] Building C object tools/CMakeFiles/timeit-target.dir/timeit.c.o
+   [  0%] Linking C executable timeit-target
+   ...
+   ```
+
+5. Run the tests with lit:
+
+   ```text
+   % llvm-lit -v -j 1 -o results.json .
+   -- Testing: 474 tests, 1 threads --
+   PASS: test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test (1 of 474)
+   ********** TEST 'test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test' RESULTS **********
+   compile_time: 0.2192
+   exec_time: 0.0462
+   hash: "59620e187c6ac38b36382685ccd2b63b"
+   size: 83348
+   **********
+   PASS: test-suite :: MultiSource/Applications/ALAC/encode/alacconvert-encode.test (2 of 474)
+   ...
+   ```
+
+6. Show and compare result files (optional):
+
+   ```bash
+   # Make sure pandas is installed. Prepend `sudo` if necessary.
+   % pip install pandas
+   # Show a single result file:
+   % test-suite/utils/compare.py results.json
+   # Compare two result files:
+   % test-suite/utils/compare.py results_a.json results_b.json
+   ```
+
+
+Structure
+---------
+
+The test-suite contains benchmark and test programs.  The programs come with
+reference outputs so that their correctness can be checked.  The suite comes
+with tools to collect metrics such as benchmark runtime, compilation time and
+code size.
+
+The test-suite is divided into several directories:
+
+-  `SingleSource/`
+
+   Contains test programs that are only a single source file in size.  A
+   subdirectory may contain several programs.
+
+-  `MultiSource/`
+
+   Contains subdirectories which contain entire programs with multiple source files.
+   Large benchmarks and whole applications go here.
+
+-  `MicroBenchmarks/`
+
+   Programs using the [google-benchmark](https://github.com/google/benchmark)
+   library. The programs define functions that are run multiple times until the
+   measurement results are statistically significant.
+
+-  `External/`
+
+   Contains descriptions and test data for code that cannot be directly
+   distributed with the test-suite. The most prominent members of this
+   directory are the SPEC CPU benchmark suites.
+   See [External Suites](#external-suites).
+
+-  `Bitcode/`
+
+   These tests are mostly written in LLVM bitcode.
+
+-  `CTMark/`
+
+   Contains symbolic links to other benchmarks forming a representative sample
+   for compilation performance measurements.
+
+### Benchmarks
+
+Every program can work as a correctness test. Some programs are unsuitable for
+performance measurements. Setting the `TEST_SUITE_BENCHMARKING_ONLY` CMake
+option to `ON` will disable them.
+
+
+Configuration
+-------------
+
+The test-suite has configuration options to customize building and running the
+benchmarks. CMake can print a list of them:
+
+```bash
+% cd test-suite-build
+# Print basic options:
+% cmake -LH
+# Print all options:
+% cmake -LAH
+```
+
+### Common Configuration Options
+
+- `CMAKE_C_FLAGS`
+
+  Specify extra flags to be passed to C compiler invocations.  The flags are
+  also passed to the C++ compiler and linker invocations.  See
+  [https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html)
+
+- `CMAKE_C_COMPILER`
+
+  Select the C compiler executable to be used. Note that the C++ compiler is
+  inferred automatically i.e. when specifying `path/to/clang` CMake will
+  automatically use `path/to/clang++` as the C++ compiler.  See
+  [https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_COMPILER.html](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_COMPILER.html)
+
+- `CMAKE_BUILD_TYPE`
+
+  Select a build type like `OPTIMIZE` or `DEBUG` selecting a set of predefined
+  compiler flags. These flags are applied regardless of the `CMAKE_C_FLAGS`
+  option and may be changed by modifying `CMAKE_C_FLAGS_OPTIMIZE` etc.  See
+  [https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html](https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html)
+
+- `TEST_SUITE_RUN_UNDER`
+
+  Prefix test invocations with the given tool. This is typically used to run
+  cross-compiled tests within a simulator tool.
+
+- `TEST_SUITE_BENCHMARKING_ONLY`
+
+  Disable tests that are unsuitable for performance measurements. The disabled
+  tests either run for a very short time or are dominated by I/O performance
+  making them unsuitable as compiler performance tests.
+
+- `TEST_SUITE_SUBDIRS`
+
+  Semicolon-separated list of directories to include. This can be used to only
+  build parts of the test-suite or to include external suites.  This option
+  does not work reliably with deeper subdirectories as it skips intermediate
+  `CMakeLists.txt` files which may be required.
+
+- `TEST_SUITE_COLLECT_STATS`
+
+  Collect internal LLVM statistics. Appends `-save-stats=obj` when invoking the
+  compiler and makes the lit runner collect and merge the statistics files.
+
+- `TEST_SUITE_RUN_BENCHMARKS`
+
+  If this is set to `OFF` then lit will not actually run the tests but just
+  collect build statistics like compile time and code size.
+
+- `TEST_SUITE_USE_PERF`
+
+  Use the `perf` tool for time measurement instead of the `timeit` tool that
+  comes with the test-suite.  The `perf` tool is usually available on Linux
+  systems.
+
+- `TEST_SUITE_SPEC2000_ROOT`, `TEST_SUITE_SPEC2006_ROOT`, `TEST_SUITE_SPEC2017_ROOT`, ...
+
+  Specify installation directories of external benchmark suites. You can find
+  more information about expected versions or usage in the README files in the
+  `External` directory (such as `External/SPEC/README`).
+
+### Common CMake Flags
+
+- `-GNinja`
+
+  Generate build files for the ninja build tool.
+
+- `-Ctest-suite/cmake/caches/<cachefile.cmake>`
+
+  Use a CMake cache.  The test-suite comes with several CMake caches which
+  predefine common or tricky build configurations.
+
+
+Displaying and Analyzing Results
+--------------------------------
+
+The `compare.py` script displays and compares result files.  A result file is
+produced when invoking lit with the `-o filename.json` flag.
+
+Example usage:
+
+- Basic Usage:
+
+  ```text
+  % test-suite/utils/compare.py baseline.json
+  Warning: 'test-suite :: External/SPEC/CINT2006/403.gcc/403.gcc.test' has No metrics!
+  Tests: 508
+  Metric: exec_time
+
+  Program                                         baseline
+
+  INT2006/456.hmmer/456.hmmer                   1222.90
+  INT2006/464.h264ref/464.h264ref               928.70
+  ...
+               baseline
+  count  506.000000
+  mean   20.563098
+  std    111.423325
+  min    0.003400
+  25%    0.011200
+  50%    0.339450
+  75%    4.067200
+  max    1222.896800
+  ```
+
+- Show compile_time or text segment size metrics:
+
+  ```bash
+  % test-suite/utils/compare.py -m compile_time baseline.json
+  % test-suite/utils/compare.py -m size.__text baseline.json
+  ```
+
+- Compare two result files and filter short running tests:
+
+  ```bash
+  % test-suite/utils/compare.py --filter-short baseline.json experiment.json
+  ...
+  Program                                         baseline  experiment  diff
+
+  SingleSour.../Benchmarks/Linpack/linpack-pc     5.16      4.30        -16.5%
+  MultiSourc...erolling-dbl/LoopRerolling-dbl     7.01      7.86         12.2%
+  SingleSour...UnitTests/Vectorizer/gcc-loops     3.89      3.54        -9.0%
+  ...
+  ```
+
+- Merge multiple baseline and experiment result files by taking the minimum
+  runtime each:
+
+  ```bash
+  % test-suite/utils/compare.py base0.json base1.json base2.json vs exp0.json exp1.json exp2.json
+  ```
+
+### Continuous Tracking with LNT
+
+LNT is a set of client and server tools for continuously monitoring
+performance. You can find more information at
+[http://llvm.org/docs/lnt](http://llvm.org/docs/lnt). The official LNT instance
+of the LLVM project is hosted at [http://lnt.llvm.org](http://lnt.llvm.org).
+
+
+External Suites
+---------------
+
+External suites such as SPEC can be enabled by either
+
+- placing (or linking) them into the `test-suite/test-suite-externals/xxx` directory (example: `test-suite/test-suite-externals/speccpu2000`)
+- using a configuration option such as `-D TEST_SUITE_SPEC2000_ROOT=path/to/speccpu2000`
+
+You can find further information in the respective README files such as
+`test-suite/External/SPEC/README`.
+
+For the SPEC benchmarks you can switch between the `test`, `train` and
+`ref` input datasets via the `TEST_SUITE_RUN_TYPE` configuration option.
+The `train` dataset is used by default.
+
+
+Custom Suites
+-------------
+
+You can build custom suites using the test-suite infrastructure. A custom suite
+has a `CMakeLists.txt` file at the top directory. The `CMakeLists.txt` will be
+picked up automatically if placed into a subdirectory of the test-suite or when
+setting the `TEST_SUITE_SUBDIRS` variable:
+
+```bash
+% cmake -DTEST_SUITE_SUBDIRS=path/to/my/benchmark-suite ../test-suite
+```
+
+
+Profile Guided Optimization
+---------------------------
+
+Profile guided optimization requires compiling and running the benchmarks
+twice. First, the benchmarks should be compiled with profile generation
+instrumentation enabled and set up to run on the training data. The lit runner
+will merge the profile files using `llvm-profdata` so they can be used by the
+second compilation run.
+
+Example:
+```bash
+# Profile generation run:
+% cmake -DTEST_SUITE_PROFILE_GENERATE=ON \
+        -DTEST_SUITE_RUN_TYPE=train \
+        ../test-suite
+% make
+% llvm-lit .
+# Use the profile data for compilation and actual benchmark run:
+% cmake -DTEST_SUITE_PROFILE_GENERATE=OFF \
+        -DTEST_SUITE_PROFILE_USE=ON \
+        -DTEST_SUITE_RUN_TYPE=ref \
+        .
+% make
+% llvm-lit -o result.json .
+```
+
+The `TEST_SUITE_RUN_TYPE` setting only affects the SPEC benchmark suites.
+
+
+Cross Compilation and External Devices
+--------------------------------------
+
+### Compilation
+
+CMake allows cross compiling to a different target via toolchain files. More
+information can be found here:
+
+- [http://llvm.org/docs/lnt/tests.html#cross-compiling](http://llvm.org/docs/lnt/tests.html#cross-compiling)
+
+- [https://cmake.org/cmake/help/latest/manual/cmake-toolchains.7.html](https://cmake.org/cmake/help/latest/manual/cmake-toolchains.7.html)
+
+Cross compilation from macOS to iOS is possible with the
+`test-suite/cmake/caches/target-*-iphoneos-internal.cmake` CMake cache
+files; this requires an internal iOS SDK.
+
+### Running
+
+There are two ways to run the tests in a cross compilation setting:
+
+- Via SSH connection to an external device: The `TEST_SUITE_REMOTE_HOST` option
+  should be set to the SSH hostname.  The executables and data files need to be
+  transferred to the device after compilation.  This is typically done via the
+  `rsync` make target.  After this, the lit runner can be used on the host
+  machine. It will prefix the benchmark and verification command lines with an
+  `ssh` command.
+
+  Example:
+
+  ```bash
+  % cmake -G Ninja -D CMAKE_C_COMPILER=path/to/clang \
+          -C ../test-suite/cmake/caches/target-arm64-iphoneos-internal.cmake \
+          -D TEST_SUITE_REMOTE_HOST=mydevice \
+          ../test-suite
+  % ninja
+  % ninja rsync
+  % llvm-lit -j1 -o result.json .
+  ```
+
+- You can specify a simulator for the target machine with the
+  `TEST_SUITE_RUN_UNDER` setting. The lit runner will prefix all benchmark
+  invocations with it.
+
+
+Running the test-suite via LNT
+------------------------------
+
+The LNT tool can run the test-suite. Use this when submitting test results to
+an LNT instance.  See
+[http://llvm.org/docs/lnt/tests.html#llvm-cmake-test-suite](http://llvm.org/docs/lnt/tests.html#llvm-cmake-test-suite)
+for details.
+
+Running the test-suite via Makefiles (deprecated)
+-------------------------------------------------
+
+**Note**: The test-suite comes with a set of Makefiles that are considered
+deprecated.  They do not support newer testing modes like `Bitcode` or
+`Microbenchmarks` and are harder to use.
+
+Old documentation is available in the
+[test-suite Makefile Guide](TestSuiteMakefileGuide).

Added: www-releases/trunk/9.0.0/docs/_sources/TestSuiteMakefileGuide.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TestSuiteMakefileGuide.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TestSuiteMakefileGuide.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TestSuiteMakefileGuide.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,198 @@
+======================================
+test-suite Makefile Guide (deprecated)
+======================================
+
+.. contents::
+    :local:
+
+Overview
+========
+
+First, all tests are executed within the LLVM object directory tree.
+They *are not* executed inside of the LLVM source tree. This is because
+the test suite creates temporary files during execution.
+
+To run the test suite, you need to use the following steps:
+
+#. Check out the ``test-suite`` module with:
+
+   .. code-block:: bash
+
+       % git clone https://github.com/llvm/llvm-test-suite.git test-suite
+
+#. FIXME: these directions are outdated and won't work. Figure out
+   what the correct thing to do is, and write it down here.
+
+#. Configure and build ``llvm``.
+
+#. Configure and build ``llvm-gcc``.
+
+#. Install ``llvm-gcc`` somewhere.
+
+#. *Re-configure* ``llvm`` from the top level of each build tree (LLVM
+   object directory tree) in which you want to run the test suite, just
+   as you do before building LLVM.
+
+   During the *re-configuration*, you must either: (1) have ``llvm-gcc``
+   you just built in your path, or (2) specify the directory where your
+   just-built ``llvm-gcc`` is installed using
+   ``--with-llvmgccdir=$LLVM_GCC_DIR``.
+
+   You must also tell the configure machinery that the test suite is
+   available so it can be configured for your build tree:
+
+   .. code-block:: bash
+
+       % cd $LLVM_OBJ_ROOT ; $LLVM_SRC_ROOT/configure [--with-llvmgccdir=$LLVM_GCC_DIR]
+
+   [Remember that ``$LLVM_GCC_DIR`` is the directory where you
+   *installed* llvm-gcc, not its src or obj directory.]
+
+#. You can now run the test suite from your build tree as follows:
+
+   .. code-block:: bash
+
+       % cd $LLVM_OBJ_ROOT/projects/test-suite
+       % make
+
+Note that the second and third steps only need to be done once. After
+you have the suite checked out and configured, you don't need to do it
+again (unless the test code or configure script changes).
+
+Configuring External Tests
+==========================
+
+In order to run the External tests in the ``test-suite`` module, you
+must specify *--with-externals*. This must be done during the
+*re-configuration* step (see above), and the ``llvm`` re-configuration
+must recognize the previously-built ``llvm-gcc``. If any of these is
+missing or neglected, the External tests won't work.
+
+* *--with-externals*
+
+* *--with-externals=<directory>*
+
+This tells LLVM where to find any external tests. They are expected to
+be in specifically named subdirectories of <``directory``>. If
+``directory`` is left unspecified, ``configure`` uses the default value
+``/home/vadve/shared/benchmarks/speccpu2000/benchspec``. Subdirectory
+names known to LLVM include:
+
+* spec95
+
+* speccpu2000
+
+* speccpu2006
+
+* povray31
+
+Others are added from time to time, and can be determined from
+``configure``.
+
+Running Different Tests
+=======================
+
+In addition to the regular "whole program" tests, the ``test-suite``
+module also provides a mechanism for compiling the programs in different
+ways. If the variable TEST is defined on the ``gmake`` command line, the
+test system will include a Makefile named
+``TEST.<value of TEST variable>.Makefile``. This Makefile can modify
+build rules to yield different results.
+
+For example, the LLVM nightly tester uses ``TEST.nightly.Makefile`` to
+create the nightly test reports. To run the nightly tests, run
+``gmake TEST=nightly``.
+
+There are several TEST Makefiles available in the tree. Some of them are
+designed for internal LLVM research and will not work outside of the
+LLVM research group. They may still be valuable, however, as a guide to
+writing your own TEST Makefile for any optimization or analysis passes
+that you develop with LLVM.
+
+Generating Test Output
+======================
+
+There are a number of ways to run the tests and generate output. The
+simplest is running ``gmake`` with no arguments. This will
+compile and run all programs in the tree using a number of different
+methods and compare results. Any failures are reported in the output,
+but are likely to be drowned out by the other output. Passes are not reported
+explicitly.
+
+Somewhat better is running ``gmake TEST=sometest test``, which runs the
+specified test and usually adds per-program summaries to the output
+(depending on which sometest you use). For example, the ``nightly`` test
+explicitly outputs TEST-PASS or TEST-FAIL for every test after each
+program. Though these lines are still drowned in the output, it's easy
+to grep the output logs in the Output directories.
+
+Even better are the ``report`` and ``report.format`` targets (where
+``format`` is one of ``html``, ``csv``, ``text`` or ``graphs``). The
+exact contents of the report are dependent on which ``TEST`` you are
+running, but the text results are always shown at the end of the run and
+the results are always stored in the ``report.<type>.format`` file (when
+running with ``TEST=<type>``). The ``report`` target also generates a file
+called ``report.<type>.raw.out`` containing the output of the entire
+test run.
+
+Writing Custom Tests for the test-suite
+=======================================
+
+Assuming you can run the test suite (e.g.
+"``gmake TEST=nightly report``" should work), it is really easy to run
+optimizations or code generator components against every program in the
+tree, collecting statistics or running custom checks for correctness. At
+base, this is how the nightly tester works; it's just one example of a
+general framework.
+
+Let's say that you have an LLVM optimization pass, and you want to see
+how many times it triggers. The first thing you should do is add an LLVM
+`statistic <ProgrammersManual.html#Statistic>`_ to your pass, which will
+tally counts of things you care about.
+
+Following this, you can set up a test and a report that collects these
+and formats them for easy viewing. This consists of two files, a
+"``test-suite/TEST.XXX.Makefile``" fragment (where XXX is the name of
+your test) and a "``test-suite/TEST.XXX.report``" file that indicates
+how to format the output into a table. There are many example reports of
+various levels of sophistication included with the test suite, and the
+framework is very general.
+
+If you are interested in testing an optimization pass, check out the
+"libcalls" test as an example. It can be run like this:
+
+.. code-block:: bash
+
+    % cd llvm/projects/test-suite/MultiSource/Benchmarks  # or some other level
+    % make TEST=libcalls report
+
+This will do a bunch of stuff, then eventually print a table like this:
+
+::
+
+    Name                                  | total | #exit |
+    ...
+    FreeBench/analyzer/analyzer           | 51    | 6     |
+    FreeBench/fourinarow/fourinarow       | 1     | 1     |
+    FreeBench/neural/neural               | 19    | 9     |
+    FreeBench/pifft/pifft                 | 5     | 3     |
+    MallocBench/cfrac/cfrac               | 1     | *     |
+    MallocBench/espresso/espresso         | 52    | 12    |
+    MallocBench/gs/gs                     | 4     | *     |
+    Prolangs-C/TimberWolfMC/timberwolfmc  | 302   | *     |
+    Prolangs-C/agrep/agrep                | 33    | 12    |
+    Prolangs-C/allroots/allroots          | *     | *     |
+    Prolangs-C/assembler/assembler        | 47    | *     |
+    Prolangs-C/bison/mybison              | 74    | *     |
+    ...
+
+This is essentially the ``-stats`` output, grepped and displayed in a
+table. You can also use the "TEST=libcalls report.html" target to get
+the table in HTML form, similarly for report.csv and report.tex.
+
+The source for this is in ``test-suite/TEST.libcalls.*``. The format is
+pretty simple: the Makefile indicates how to run the test (in this case,
+"``opt -simplify-libcalls -stats``"), and the report contains one line
+for each column of the output. The first value is the header for the
+column and the second is the regex to grep the output of the command
+for. There are lots of example reports that can do fancy stuff.

Added: www-releases/trunk/9.0.0/docs/_sources/TestingGuide.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TestingGuide.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TestingGuide.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TestingGuide.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,601 @@
+=================================
+LLVM Testing Infrastructure Guide
+=================================
+
+.. contents::
+   :local:
+
+.. toctree::
+   :hidden:
+
+   TestSuiteGuide
+   TestSuiteMakefileGuide
+
+Overview
+========
+
+This document is the reference manual for the LLVM testing
+infrastructure. It documents the structure of the LLVM testing
+infrastructure, the tools needed to use it, and how to add and run
+tests.
+
+Requirements
+============
+
+In order to use the LLVM testing infrastructure, you will need all of the
+software required to build LLVM, as well as `Python <http://python.org>`_ 2.7 or
+later.
+
+LLVM Testing Infrastructure Organization
+========================================
+
+The LLVM testing infrastructure contains three major categories of tests:
+unit tests, regression tests and whole programs. The unit tests and regression
+tests are contained inside the LLVM repository itself under ``llvm/unittests``
+and ``llvm/test`` respectively and are expected to always pass -- they should be
+run before every commit.
+
+The whole program tests are referred to as the "LLVM test suite" (or
+"test-suite") and are in the ``test-suite`` module in subversion. For
+historical reasons, these tests are also referred to as the "nightly
+tests" in places, which is less ambiguous than "test-suite" and remains
+in use although we run them much more often than nightly.
+
+Unit tests
+----------
+
+Unit tests are written using `Google Test <https://github.com/google/googletest/blob/master/googletest/docs/primer.md>`_
+and `Google Mock <https://github.com/google/googletest/blob/master/googlemock/docs/ForDummies.md>`_
+and are located in the ``llvm/unittests`` directory.
+
+Regression tests
+----------------
+
+The regression tests are small pieces of code that test a specific
+feature of LLVM or trigger a specific bug in LLVM. The language they are
+written in depends on the part of LLVM being tested. These tests are driven by
+the :doc:`Lit <CommandGuide/lit>` testing tool (which is part of LLVM), and
+are located in the ``llvm/test`` directory.
+
+Typically when a bug is found in LLVM, a regression test containing just
+enough code to reproduce the problem should be written and placed
+somewhere underneath this directory. For example, it can be a small
+piece of LLVM IR distilled from an actual application or benchmark.
+
+``test-suite``
+--------------
+
+The test suite contains whole programs, which are pieces of code which
+can be compiled and linked into a stand-alone program that can be
+executed. These programs are generally written in high level languages
+such as C or C++.
+
+These programs are compiled using a user-specified compiler and set of
+flags, and then executed to capture the program output and timing
+information. The output of these programs is compared to a reference
+output to ensure that the program is being compiled correctly.
+
+In addition to compiling and executing programs, whole program tests
+serve as a way of benchmarking LLVM performance, both in terms of the
+efficiency of the programs generated as well as the speed with which
+LLVM compiles, optimizes, and generates code.
+
+The test-suite is located in the ``test-suite`` Subversion module.
+
+See the :doc:`TestSuiteGuide` for details.
+
+Debugging Information tests
+---------------------------
+
+The test suite contains tests to check the quality of debugging information.
+The tests are written in C-based languages or in LLVM assembly language.
+
+These tests are compiled and run under a debugger. The debugger output
+is checked to validate the debugging information. See README.txt in the
+test suite for more information. This test suite is located in the
+``debuginfo-tests`` Subversion module.
+
+Quick start
+===========
+
+The tests are located in two separate Subversion modules. The unit and
+regression tests are in the main "llvm" module under the directories
+``llvm/unittests`` and ``llvm/test`` (so you get these tests for free with the
+main LLVM tree). Use ``make check-all`` to run the unit and regression tests
+after building LLVM.
+
+The ``test-suite`` module contains more comprehensive tests including whole C
+and C++ programs. See the :doc:`TestSuiteGuide` for details.
+
+Unit and Regression tests
+-------------------------
+
+To run all of the LLVM unit tests use the check-llvm-unit target:
+
+.. code-block:: bash
+
+    % make check-llvm-unit
+
+To run all of the LLVM regression tests use the check-llvm target:
+
+.. code-block:: bash
+
+    % make check-llvm
+
+In order to get reasonable testing performance, build LLVM and subprojects
+in release mode, i.e.
+
+.. code-block:: bash
+
+    % cmake -DCMAKE_BUILD_TYPE="Release" -DLLVM_ENABLE_ASSERTIONS=On
+
+If you have `Clang <http://clang.llvm.org/>`_ checked out and built, you
+can run the LLVM and Clang tests simultaneously using:
+
+.. code-block:: bash
+
+    % make check-all
+
+To run the tests with Valgrind (Memcheck by default), use the ``LIT_ARGS`` make
+variable to pass the required options to lit. For example, you can use:
+
+.. code-block:: bash
+
+    % make check LIT_ARGS="-v --vg --vg-leak"
+
+to enable testing with valgrind and with leak checking enabled.
+
+To run individual tests or subsets of tests, you can use the ``llvm-lit``
+script which is built as part of LLVM. For example, to run the
+``Integer/BitPacked.ll`` test by itself you can run:
+
+.. code-block:: bash
+
+    % llvm-lit ~/llvm/test/Integer/BitPacked.ll 
+
+or to run all of the ARM CodeGen tests:
+
+.. code-block:: bash
+
+    % llvm-lit ~/llvm/test/CodeGen/ARM
+
+For more information on using the :program:`lit` tool, see ``llvm-lit --help``
+or the :doc:`lit man page <CommandGuide/lit>`.
+
+Debugging Information tests
+---------------------------
+
+To run debugging information tests simply add the ``debuginfo-tests``
+project to your ``LLVM_ENABLE_PROJECTS`` define on the cmake
+command-line.
+
+Regression test structure
+=========================
+
+The LLVM regression tests are driven by :program:`lit` and are located in the
+``llvm/test`` directory.
+
+This directory contains a large array of small tests that exercise
+various features of LLVM and ensure that regressions do not occur.
+The directory is broken into several sub-directories, each focused on a
+particular area of LLVM.
+
+Writing new regression tests
+----------------------------
+
+The regression test structure is very simple, but does require some
+information to be set. This information is gathered via ``configure``
+and is written to a file, ``test/lit.site.cfg`` in the build directory.
+The ``llvm/test`` Makefile does this work for you.
+
+In order for the regression tests to work, each directory of tests must
+have a ``lit.local.cfg`` file. :program:`lit` looks for this file to determine
+how to run the tests. This file is just Python code and thus is very
+flexible, but we've standardized it for the LLVM regression tests. If
+you're adding a directory of tests, just copy ``lit.local.cfg`` from
+another directory to get running. The standard ``lit.local.cfg`` simply
+specifies which files to look in for tests. Any directory that contains
+only directories does not need the ``lit.local.cfg`` file. Read the :doc:`Lit
+documentation <CommandGuide/lit>` for more information.
+
+Each test file must contain lines starting with "RUN:" that tell :program:`lit`
+how to run it. If there are no RUN lines, :program:`lit` will issue an error
+while running a test.
+
+RUN lines are specified in the comments of the test program using the
+keyword ``RUN`` followed by a colon, and lastly the command (pipeline)
+to execute. Together, these lines form the "script" that :program:`lit`
+executes to run the test case. The syntax of the RUN lines is similar to a
+shell's syntax for pipelines including I/O redirection and variable
+substitution. However, even though these lines may *look* like a shell
+script, they are not. RUN lines are interpreted by :program:`lit`.
+Consequently, the syntax differs from shell in a few ways. You can specify
+as many RUN lines as needed.
+
+:program:`lit` performs substitution on each RUN line to replace LLVM tool names
+with the full paths to the executable built for each tool (in
+``$(LLVM_OBJ_ROOT)/$(BuildMode)/bin``). This ensures that :program:`lit` does
+not invoke any stray LLVM tools in the user's path during testing.
+
+Each RUN line is executed on its own, distinct from other lines unless
+its last character is ``\``. This continuation character causes the RUN
+line to be concatenated with the next one. In this way you can build up
+long pipelines of commands without making huge line lengths. The lines
+ending in ``\`` are concatenated until a RUN line that doesn't end in
+``\`` is found. This concatenated set of RUN lines then constitutes one
+execution. :program:`lit` will substitute variables and arrange for the pipeline
+to be executed. If any process in the pipeline fails, the entire line (and
+test case) fails too.
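+
+As a minimal sketch of the continuation rule (the function ``@f`` and the
+choice of tools are only illustrative, not taken from an existing test), the
+two physical RUN lines below form a single command:
+
+.. code-block:: llvm
+
+    ; RUN: llvm-as < %s | llvm-dis \
+    ; RUN:   | FileCheck %s
+
+    ; CHECK: define void @f()
+    define void @f() {
+      ret void
+    }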
+
+Below is an example of legal RUN lines in a ``.ll`` file:
+
+.. code-block:: llvm
+
+    ; RUN: llvm-as < %s | llvm-dis > %t1
+    ; RUN: llvm-dis < %s.bc-13 > %t2
+    ; RUN: diff %t1 %t2
+
+As with a Unix shell, the RUN lines permit pipelines and I/O
+redirection to be used.
+
+There are some quoting rules that you must pay attention to when writing
+your RUN lines. In general nothing needs to be quoted. :program:`lit` won't
+strip off any quote characters so they will get passed to the invoked program.
+To avoid this use curly braces to tell :program:`lit` that it should treat
+everything enclosed as one value.
+
+In general, you should strive to keep your RUN lines as simple as possible,
+using them only to run tools that generate textual output you can then examine.
+The recommended way to examine output to figure out if the test passes is using
+the :doc:`FileCheck tool <CommandGuide/FileCheck>`. *[The usage of grep in RUN
+lines is deprecated - please do not send or commit patches that use it.]*
+
+Put related tests into a single file rather than having a separate file per
+test. Check if there are files already covering your feature and consider
+adding your code there instead of creating a new file.
+
+Extra files
+-----------
+
+If your test requires extra files besides the file containing the ``RUN:``
+lines, the idiomatic place to put them is in a subdirectory ``Inputs``.
+You can then refer to the extra files as ``%S/Inputs/foo.bar``.
+
+For example, consider ``test/Linker/ident.ll``. The directory structure is
+as follows::
+
+  test/
+    Linker/
+      ident.ll
+      Inputs/
+        ident.a.ll
+        ident.b.ll
+
+For convenience, these are the contents:
+
+.. code-block:: llvm
+
+  ;;;;; ident.ll:
+
+  ; RUN: llvm-link %S/Inputs/ident.a.ll %S/Inputs/ident.b.ll -S | FileCheck %s
+
+  ; Verify that multiple input llvm.ident metadata are linked together.
+
+  ; CHECK-DAG: !llvm.ident = !{!0, !1, !2}
+  ; CHECK-DAG: "Compiler V1"
+  ; CHECK-DAG: "Compiler V2"
+  ; CHECK-DAG: "Compiler V3"
+
+  ;;;;; Inputs/ident.a.ll:
+
+  !llvm.ident = !{!0, !1}
+  !0 = metadata !{metadata !"Compiler V1"}
+  !1 = metadata !{metadata !"Compiler V2"}
+
+  ;;;;; Inputs/ident.b.ll:
+
+  !llvm.ident = !{!0}
+  !0 = metadata !{metadata !"Compiler V3"}
+
+For symmetry reasons, ``ident.ll`` is just a dummy file that doesn't
+actually participate in the test besides holding the ``RUN:`` lines.
+
+.. note::
+
+  Some existing tests use ``RUN: true`` in extra files instead of just
+  putting the extra files in an ``Inputs/`` directory. This pattern is
+  deprecated.
+
+Fragile tests
+-------------
+
+It is easy to write a fragile test that would fail spuriously if the tool being
+tested outputs a full path to the input file.  For example, :program:`opt` by
+default outputs a ``ModuleID``:
+
+.. code-block:: console
+
+  $ cat example.ll
+  define i32 @main() nounwind {
+      ret i32 0
+  }
+
+  $ opt -S /path/to/example.ll
+  ; ModuleID = '/path/to/example.ll'
+
+  define i32 @main() nounwind {
+      ret i32 0
+  }
+
+``ModuleID`` can unexpectedly match against ``CHECK`` lines.  For example:
+
+.. code-block:: llvm
+
+  ; RUN: opt -S %s | FileCheck %s
+
+  define i32 @main() nounwind {
+      ; CHECK-NOT: load
+      ret i32 0
+  }
+
+This test will fail if placed into a ``download`` directory, because the path
+in the ``ModuleID`` line then contains the string ``load``.
+
+To make your tests robust, always use ``opt ... < %s`` in the RUN line.
+When input comes from stdin, the ``ModuleID`` printed by :program:`opt` is
+``<stdin>`` rather than the path to the input file.
+
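+A sketch of the robust form of the test above, which reads its input from
+stdin so that the output does not embed the path of the test file:
+
+.. code-block:: llvm
+
+  ; RUN: opt -S < %s | FileCheck %s
+
+  define i32 @main() nounwind {
+      ; CHECK-NOT: load
+      ret i32 0
+  }
+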
+Platform-Specific Tests
+-----------------------
+
+Whenever adding tests that require the knowledge of a specific platform,
+either related to code generated, specific output or back-end features,
+you must make sure to isolate the features, so that buildbots that
+run on different architectures (and don't even compile all back-ends),
+don't fail.
+
+The first problem is to check for target-specific output, such as sizes
+of structures, paths and architecture names. For example:
+
+* Tests containing Windows paths will fail on Linux and vice-versa.
+* Tests that check for ``x86_64`` somewhere in the text will fail anywhere else.
+* Tests where the debug information calculates the size of types and structures.
+
+Also, if the test relies on any behaviour that is coded in any back-end, it must
+go in its own directory. So, for instance, code generator tests for ARM go
+into ``test/CodeGen/ARM`` and so on. Those directories contain a special
+``lit`` configuration file that ensures all tests in that directory will
+only run if a specific back-end is compiled and available.
+
+For instance, in ``test/CodeGen/ARM``, the ``lit.local.cfg`` is:
+
+.. code-block:: python
+
+  config.suffixes = ['.ll', '.c', '.cpp', '.test']
+  if not 'ARM' in config.root.targets:
+    config.unsupported = True
+
+Other platform-specific tests are those that depend on a specific feature
+of a specific sub-architecture, for example, tests that apply only to Intel
+chips that support ``AVX2``.
+
+For instance, ``test/CodeGen/X86/psubus.ll`` tests three sub-architecture
+variants:
+
+.. code-block:: llvm
+
+  ; RUN: llc -mcpu=core2 < %s | FileCheck %s -check-prefix=SSE2
+  ; RUN: llc -mcpu=corei7-avx < %s | FileCheck %s -check-prefix=AVX1
+  ; RUN: llc -mcpu=core-avx2 < %s | FileCheck %s -check-prefix=AVX2
+
+And the checks are different:
+
+.. code-block:: llvm
+
+  ; SSE2: @test1
+  ; SSE2: psubusw LCPI0_0(%rip), %xmm0
+  ; AVX1: @test1
+  ; AVX1: vpsubusw LCPI0_0(%rip), %xmm0, %xmm0
+  ; AVX2: @test1
+  ; AVX2: vpsubusw LCPI0_0(%rip), %xmm0, %xmm0
+
+So, if you're testing for a behaviour that you know is platform-specific or
+depends on special features of sub-architectures, you must add the specific
+triple, test with the specific FileCheck and put it into the specific
+directory that will filter out all other architectures.
+
+
+Constraining test execution
+---------------------------
+
+Some tests can be run only in specific configurations, such as
+with debug builds or on particular platforms. Use ``REQUIRES``
+and ``UNSUPPORTED`` to control when the test is enabled.
+
+Some tests are expected to fail. For example, there may be a known bug
+that the test detects. Use ``XFAIL`` to mark a test as an expected failure.
+An ``XFAIL`` test will be successful if its execution fails, and
+will be a failure if its execution succeeds.
+
+.. code-block:: llvm
+
+    ; This test will be enabled only in builds with asserts.
+    ; REQUIRES: asserts
+    ; This test is disabled on Linux.
+    ; UNSUPPORTED: -linux-
+    ; This test is expected to fail on PowerPC.
+    ; XFAIL: powerpc
+
+``REQUIRES``, ``UNSUPPORTED`` and ``XFAIL`` all accept a comma-separated
+list of boolean expressions. The values in each expression may be:
+
+- Features added to ``config.available_features`` by 
+  configuration files such as ``lit.cfg``.
+- Substrings of the target triple (``UNSUPPORTED`` and ``XFAIL`` only).
+
+| ``REQUIRES`` enables the test if all expressions are true.
+| ``UNSUPPORTED`` disables the test if any expression is true.
+| ``XFAIL`` expects the test to fail if any expression is true.
+
+As a special case, ``XFAIL: *`` is expected to fail everywhere.
+
+.. code-block:: llvm
+
+    ; This test is disabled on Windows,
+    ; and is disabled on Linux, except for Android Linux.
+    ; UNSUPPORTED: windows, linux && !android
+    ; This test is expected to fail on both PowerPC and ARM.
+    ; XFAIL: powerpc || arm
+
+
+Substitutions
+-------------
+
+Besides replacing LLVM tool names, the following substitutions are performed in
+RUN lines:
+
+``%%``
+   Replaced by a single ``%``. This allows escaping other substitutions.
+
+``%s``
+   File path to the test case's source. This is suitable for passing on the
+   command line as the input to an LLVM tool.
+
+   Example: ``/home/user/llvm/test/MC/ELF/foo_test.s``
+
+``%S``
+   Directory path to the test case's source.
+
+   Example: ``/home/user/llvm/test/MC/ELF``
+
+``%t``
+   File path to a temporary file name that could be used for this test case.
+   The file name won't conflict with other test cases. You can append to it
+   if you need multiple temporaries. This is useful as the destination of
+   some redirected output.
+
+   Example: ``/home/user/llvm.build/test/MC/ELF/Output/foo_test.s.tmp``
+
+``%T``
+   Directory of ``%t``. Deprecated. Shouldn't be used, because it can be easily
+   misused and cause race conditions between tests.
+
+   Use ``rm -rf %t && mkdir %t`` instead if a temporary directory is necessary.
+
+   Example: ``/home/user/llvm.build/test/MC/ELF/Output``
+
+``%{pathsep}``
+   Expands to the path separator, i.e. ``:`` (or ``;`` on Windows).
+
+``%/s, %/S, %/t, %/T:``
+   Act like the corresponding substitution above but replace any ``\``
+   character with a ``/``. This is useful to normalize path separators.
+
+   Example: ``%s:  C:\Desktop Files/foo_test.s.tmp``
+
+   Example: ``%/s: C:/Desktop Files/foo_test.s.tmp``
+
+``%:s, %:S, %:t, %:T:``
+   Act like the corresponding substitution above but remove colons at
+   the beginning of Windows paths. This is useful to allow concatenation
+   of absolute paths on Windows to produce a legal path.
+
+   Example: ``%s:  C:\Desktop Files\foo_test.s.tmp``
+
+   Example: ``%:s: C\Desktop Files\foo_test.s.tmp``
+
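+As a sketch of how ``%s`` and ``%t`` are typically combined (the file name
+``out.bc`` and the function ``@f`` are arbitrary choices for illustration), a
+test that needs a temporary directory can create one from ``%t`` instead of
+using ``%T``:
+
+.. code-block:: llvm
+
+    ; RUN: rm -rf %t && mkdir %t
+    ; RUN: llvm-as < %s -o %t/out.bc
+    ; RUN: llvm-dis < %t/out.bc | FileCheck %s
+
+    ; CHECK: define void @f()
+    define void @f() {
+      ret void
+    }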
+
+**LLVM-specific substitutions:**
+
+``%shlibext``
+   The suffix for the host platform's shared library files. This includes the
+   period as the first character.
+
+   Example: ``.so`` (Linux), ``.dylib`` (macOS), ``.dll`` (Windows)
+
+``%exeext``
+   The suffix for the host platform's executable files. This includes the
+   period as the first character.
+
+   Example: ``.exe`` (Windows), empty on Linux.
+
+``%(line)``, ``%(line+<number>)``, ``%(line-<number>)``
+   The number of the line where this substitution is used, with an optional
+   integer offset. This can be used in tests with multiple RUN lines that
+   reference the test file's line numbers.
+
+
+**Clang-specific substitutions:**
+
+``%clang``
+   Invokes the Clang driver.
+
+``%clang_cpp``
+   Invokes the Clang driver for C++.
+
+``%clang_cl``
+   Invokes the CL-compatible Clang driver.
+
+``%clangxx``
+   Invokes the G++-compatible Clang driver.
+
+``%clang_cc1``
+   Invokes the Clang frontend.
+
+``%itanium_abi_triple``, ``%ms_abi_triple``
+   These substitutions can be used to get the current target triple adjusted to
+   the desired ABI. For example, if the test suite is running with the
+   ``i686-pc-win32`` target, ``%itanium_abi_triple`` will expand to
+   ``i686-pc-mingw32``. This allows a test to run with a specific ABI without
+   constraining it to a specific triple.
+
+To add more substitutions, look at ``test/lit.cfg`` or ``lit.local.cfg``.
+
+
+Options
+-------
+
+The LLVM lit configuration lets you customize some things with user options:
+
+``llc``, ``opt``, ...
+    Substitute the respective LLVM tool name with a custom command line. This
+    lets you specify custom paths and default arguments for these tools.
+    Example::
+
+        % llvm-lit "-Dllc=llc -verify-machineinstrs"
+
+``run_long_tests``
+    Enable the execution of long running tests.
+
+``llvm_site_config``
+    Load the specified lit configuration instead of the default one.
+
+
+Other Features
+--------------
+
+To make RUN line writing easier, there are several helper programs. These
+helpers are in the PATH when running tests, so you can just call them using
+their name. For example:
+
+``not``
+   This program runs the command given as its arguments and inverts its
+   exit code. A zero exit code becomes 1; a non-zero exit code becomes 0.
+
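+As a minimal sketch of a typical use (the IR below is deliberately invalid and
+only illustrative), ``not`` lets a test pass when a tool is expected to reject
+its input:
+
+.. code-block:: llvm
+
+    ; RUN: not llvm-as < %s 2>&1 | FileCheck %s
+
+    ; The function returns void, so returning an i32 is expected to be rejected.
+    ; CHECK: error:
+    define void @f() {
+      ret i32 0
+    }
+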
+To make the output more useful, :program:`lit` will scan
+the lines of the test case for ones that contain a pattern that matches
+``PR[0-9]+``. This is the syntax for specifying a PR (Problem Report) number
+that is related to the test case. The number after "PR" specifies the
+LLVM bugzilla number. When a PR number is specified, it will be used in
+the pass/fail reporting. This is useful to quickly get some context when
+a test fails.
+
+Finally, any line that contains "END." will cause the special
+interpretation of lines to terminate. This is generally done right after
+the last RUN: line. This has two side effects:
+
+(a) it prevents special interpretation of lines that are part of the test
+    program, not the instructions to the test case, and
+
+(b) it speeds things up for really big test cases by avoiding
+    interpretation of the remainder of the file.