[www-releases] r306451 - Add 4.0.1 docs for LLVM

Tue Jun 27 12:30:49 PDT 2017

Added: www-releases/trunk/4.0.1/docs/YamlIO.html
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/YamlIO.html?rev=306451&view=auto
==============================================================================

--- www-releases/trunk/4.0.1/docs/YamlIO.html (added)
+++ www-releases/trunk/4.0.1/docs/YamlIO.html Tue Jun 27 12:30:47 2017
@@ -0,0 +1,1031 @@
+
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+
+<html xmlns="http://www.w3.org/1999/xhtml">
+  <head>
+    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+    
+    <title>YAML I/O — LLVM 4 documentation</title>
+    
+    <link rel="stylesheet" href="_static/llvm-theme.css" type="text/css" />
+    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
+    
+    <script type="text/javascript">
+      var DOCUMENTATION_OPTIONS = {
+        URL_ROOT:    '',
+        VERSION:     '4',
+        COLLAPSE_INDEX: false,
+        FILE_SUFFIX: '.html',
+        HAS_SOURCE:  true
+      };
+    </script>
+    <script type="text/javascript" src="_static/jquery.js"></script>
+    <script type="text/javascript" src="_static/underscore.js"></script>
+    <script type="text/javascript" src="_static/doctools.js"></script>
+    <link rel="top" title="LLVM 4 documentation" href="index.html" />
+    <link rel="next" title="The Often Misunderstood GEP Instruction" href="GetElementPtr.html" />
+    <link rel="prev" title="LLVMâs Analysis and Transform Passes" href="Passes.html" />
+<style type="text/css">
+  table.right { float: right; margin-left: 20px; }
+  table.right td { border: 1px solid #ccc; }
+</style>
+
+  </head>
+  <body>
+<div class="logo">
+  <a href="index.html">
+    <img src="_static/logo.png"
+         alt="LLVM Logo" width="250" height="88"/></a>
+</div>
+
+    <div class="related">
+      <h3>Navigation</h3>
+      <ul>
+        <li class="right" style="margin-right: 10px">
+          <a href="genindex.html" title="General Index"
+             accesskey="I">index</a></li>
+        <li class="right" >
+          <a href="GetElementPtr.html" title="The Often Misunderstood GEP Instruction"
+             accesskey="N">next</a> |</li>
+        <li class="right" >
+          <a href="Passes.html" title="LLVMâs Analysis and Transform Passes"
+             accesskey="P">previous</a> |</li>
+  <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+  <li><a href="index.html">Documentation</a>»</li>
+ 
+      </ul>
+    </div>
+
+
+    <div class="document">
+      <div class="documentwrapper">
+          <div class="body">
+            
+  <div class="section" id="yaml-i-o">
+<h1>YAML I/O<a class="headerlink" href="#yaml-i-o" title="Permalink to this headline">Â¶</a></h1>
+<div class="contents local topic" id="contents">
+<ul class="simple">
+<li><a class="reference internal" href="#introduction-to-yaml" id="id1">Introduction to YAML</a></li>
+<li><a class="reference internal" href="#introduction-to-yaml-i-o" id="id2">Introduction to YAML I/O</a></li>
+<li><a class="reference internal" href="#error-handling" id="id3">Error Handling</a></li>
+<li><a class="reference internal" href="#scalars" id="id4">Scalars</a><ul>
+<li><a class="reference internal" href="#built-in-types" id="id5">Built-in types</a></li>
+<li><a class="reference internal" href="#unique-types" id="id6">Unique types</a></li>
+<li><a class="reference internal" href="#hex-types" id="id7">Hex types</a></li>
+<li><a class="reference internal" href="#scalarenumerationtraits" id="id8">ScalarEnumerationTraits</a></li>
+<li><a class="reference internal" href="#bitvalue" id="id9">BitValue</a></li>
+<li><a class="reference internal" href="#custom-scalar" id="id10">Custom Scalar</a></li>
+<li><a class="reference internal" href="#block-scalars" id="id11">Block Scalars</a></li>
+</ul>
+</li>
+<li><a class="reference internal" href="#mappings" id="id12">Mappings</a><ul>
+<li><a class="reference internal" href="#no-normalization" id="id13">No Normalization</a></li>
+<li><a class="reference internal" href="#normalization" id="id14">Normalization</a></li>
+<li><a class="reference internal" href="#default-values" id="id15">Default values</a></li>
+<li><a class="reference internal" href="#order-of-keys" id="id16">Order of Keys</a></li>
+<li><a class="reference internal" href="#tags" id="id17">Tags</a></li>
+<li><a class="reference internal" href="#validation" id="id18">Validation</a></li>
+<li><a class="reference internal" href="#flow-mapping" id="id19">Flow Mapping</a></li>
+</ul>
+</li>
+<li><a class="reference internal" href="#sequence" id="id20">Sequence</a><ul>
+<li><a class="reference internal" href="#flow-sequence" id="id21">Flow Sequence</a></li>
+<li><a class="reference internal" href="#utility-macros" id="id22">Utility Macros</a></li>
+</ul>
+</li>
+<li><a class="reference internal" href="#document-list" id="id23">Document List</a></li>
+<li><a class="reference internal" href="#user-context-data" id="id24">User Context Data</a></li>
+<li><a class="reference internal" href="#output" id="id25">Output</a></li>
+<li><a class="reference internal" href="#input" id="id26">Input</a></li>
+</ul>
+</div>
+<div class="section" id="introduction-to-yaml">
+<h2><a class="toc-backref" href="#id1">Introduction to YAML</a><a class="headerlink" href="#introduction-to-yaml" title="Permalink to this headline">Â¶</a></h2>
+<p>YAML is a human readable data serialization language.  The full YAML language
+spec can be read at <a class="reference external" href="http://www.yaml.org/spec/1.2/spec.html#Introduction">yaml.org</a>.  The simplest form of
+yaml is just “scalars”, “mappings”, and “sequences”.  A scalar is any number
+or string.  The pound/hash symbol (#) begins a comment line.   A mapping is
+a set of key-value pairs where the key ends with a colon.  For example:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="c1"># a mapping</span>
+<span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Tom</span>
+<span class="l-Scalar-Plain">hat-size</span><span class="p-Indicator">:</span>  <span class="l-Scalar-Plain">7</span>
+</pre></div>
+</div>
+<p>A sequence is a list of items where each item starts with a leading dash (‘-‘).
+For example:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="c1"># a sequence</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">x86</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">x86_64</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">PowerPC</span>
+</pre></div>
+</div>
+<p>You can combine mappings and sequences by indenting.  For example a sequence
+of mappings in which one of the mapping values is itself a sequence:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="c1"># a sequence of mappings with one key's value being a sequence</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Tom</span>
+  <span class="l-Scalar-Plain">cpus</span><span class="p-Indicator">:</span>
+   <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">x86</span>
+   <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">x86_64</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Bob</span>
+  <span class="l-Scalar-Plain">cpus</span><span class="p-Indicator">:</span>
+   <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">x86</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Dan</span>
+  <span class="l-Scalar-Plain">cpus</span><span class="p-Indicator">:</span>
+   <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">PowerPC</span>
+   <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">x86</span>
+</pre></div>
+</div>
+<p>Sometime sequences are known to be short and the one entry per line is too
+verbose, so YAML offers an alternate syntax for sequences called a “Flow
+Sequence” in which you put comma separated sequence elements into square
+brackets.  The above example could then be simplified to :</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="c1"># a sequence of mappings with one key's value being a flow sequence</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Tom</span>
+  <span class="l-Scalar-Plain">cpus</span><span class="p-Indicator">:</span>      <span class="p-Indicator">[</span> <span class="nv">x86</span><span class="p-Indicator">,</span> <span class="nv">x86_64</span> <span class="p-Indicator">]</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Bob</span>
+  <span class="l-Scalar-Plain">cpus</span><span class="p-Indicator">:</span>      <span class="p-Indicator">[</span> <span class="nv">x86</span> <span class="p-Indicator">]</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Dan</span>
+  <span class="l-Scalar-Plain">cpus</span><span class="p-Indicator">:</span>      <span class="p-Indicator">[</span> <span class="nv">PowerPC</span><span class="p-Indicator">,</span> <span class="nv">x86</span> <span class="p-Indicator">]</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="introduction-to-yaml-i-o">
+<h2><a class="toc-backref" href="#id2">Introduction to YAML I/O</a><a class="headerlink" href="#introduction-to-yaml-i-o" title="Permalink to this headline">Â¶</a></h2>
+<p>The use of indenting makes the YAML easy for a human to read and understand,
+but having a program read and write YAML involves a lot of tedious details.
+The YAML I/O library structures and simplifies reading and writing YAML
+documents.</p>
+<p>YAML I/O assumes you have some “native” data structures which you want to be
+able to dump as YAML and recreate from YAML.  The first step is to try
+writing example YAML for your data structures. You may find after looking at
+possible YAML representations that a direct mapping of your data structures
+to YAML is not very readable.  Often the fields are not in the order that
+a human would find readable.  Or the same information is replicated in multiple
+locations, making it hard for a human to write such YAML correctly.</p>
+<p>In relational database theory there is a design step called normalization in
+which you reorganize fields and tables.  The same considerations need to
+go into the design of your YAML encoding.  But, you may not want to change
+your existing native data structures.  Therefore, when writing out YAML
+there may be a normalization step, and when reading YAML there would be a
+corresponding denormalization step.</p>
+<p>YAML I/O uses a non-invasive, traits based design.  YAML I/O defines some
+abstract base templates.  You specialize those templates on your data types.
+For instance, if you have an enumerated type FooBar you could specialize
+ScalarEnumerationTraits on that type and define the enumeration() method:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">ScalarEnumerationTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">ScalarEnumerationTraits</span><span class="o"><</span><span class="n">FooBar</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">enumeration</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">FooBar</span> <span class="o">&</span><span class="n">value</span><span class="p">)</span> <span class="p">{</span>
+  <span class="p">...</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>As with all YAML I/O template specializations, the ScalarEnumerationTraits is used for
+both reading and writing YAML. That is, the mapping between in-memory enum
+values and the YAML string representation is only in one place.
+This assures that the code for writing and parsing of YAML stays in sync.</p>
+<p>To specify a YAML mappings, you define a specialization on
+llvm::yaml::MappingTraits.
+If your native data structure happens to be a struct that is already normalized,
+then the specialization is simple.  For example:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">MappingTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Person</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Person</span> <span class="o">&</span><span class="n">info</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"name"</span><span class="p">,</span>         <span class="n">info</span><span class="p">.</span><span class="n">name</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapOptional</span><span class="p">(</span><span class="s">"hat-size"</span><span class="p">,</span>     <span class="n">info</span><span class="p">.</span><span class="n">hatSize</span><span class="p">);</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>A YAML sequence is automatically inferred if you data type has begin()/end()
+iterators and a push_back() method.  Therefore any of the STL containers
+(such as std::vector<>) will automatically translate to YAML sequences.</p>
+<p>Once you have defined specializations for your data types, you can
+programmatically use YAML I/O to write a YAML document:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">Output</span><span class="p">;</span>
+
+<span class="n">Person</span> <span class="n">tom</span><span class="p">;</span>
+<span class="n">tom</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="s">"Tom"</span><span class="p">;</span>
+<span class="n">tom</span><span class="p">.</span><span class="n">hatSize</span> <span class="o">=</span> <span class="mi">8</span><span class="p">;</span>
+<span class="n">Person</span> <span class="n">dan</span><span class="p">;</span>
+<span class="n">dan</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="s">"Dan"</span><span class="p">;</span>
+<span class="n">dan</span><span class="p">.</span><span class="n">hatSize</span> <span class="o">=</span> <span class="mi">7</span><span class="p">;</span>
+<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">Person</span><span class="o">></span> <span class="n">persons</span><span class="p">;</span>
+<span class="n">persons</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">tom</span><span class="p">);</span>
+<span class="n">persons</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">dan</span><span class="p">);</span>
+
+<span class="n">Output</span> <span class="n">yout</span><span class="p">(</span><span class="n">llvm</span><span class="o">::</span><span class="n">outs</span><span class="p">());</span>
+<span class="n">yout</span> <span class="o"><<</span> <span class="n">persons</span><span class="p">;</span>
+</pre></div>
+</div>
+<p>This would write the following:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Tom</span>
+  <span class="l-Scalar-Plain">hat-size</span><span class="p-Indicator">:</span>  <span class="l-Scalar-Plain">8</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Dan</span>
+  <span class="l-Scalar-Plain">hat-size</span><span class="p-Indicator">:</span>  <span class="l-Scalar-Plain">7</span>
+</pre></div>
+</div>
+<p>And you can also read such YAML documents with the following code:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">Input</span><span class="p">;</span>
+
+<span class="k">typedef</span> <span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">Person</span><span class="o">></span> <span class="n">PersonList</span><span class="p">;</span>
+<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">PersonList</span><span class="o">></span> <span class="n">docs</span><span class="p">;</span>
+
+<span class="n">Input</span> <span class="n">yin</span><span class="p">(</span><span class="n">document</span><span class="p">.</span><span class="n">getBuffer</span><span class="p">());</span>
+<span class="n">yin</span> <span class="o">>></span> <span class="n">docs</span><span class="p">;</span>
+
+<span class="k">if</span> <span class="p">(</span> <span class="n">yin</span><span class="p">.</span><span class="n">error</span><span class="p">()</span> <span class="p">)</span>
+  <span class="k">return</span><span class="p">;</span>
+
+<span class="c1">// Process read document</span>
+<span class="k">for</span> <span class="p">(</span> <span class="n">PersonList</span> <span class="o">&</span><span class="n">pl</span> <span class="o">:</span> <span class="n">docs</span> <span class="p">)</span> <span class="p">{</span>
+  <span class="k">for</span> <span class="p">(</span> <span class="n">Person</span> <span class="o">&</span><span class="n">person</span> <span class="o">:</span> <span class="n">pl</span> <span class="p">)</span> <span class="p">{</span>
+    <span class="n">cout</span> <span class="o"><<</span> <span class="s">"name="</span> <span class="o"><<</span> <span class="n">person</span><span class="p">.</span><span class="n">name</span><span class="p">;</span>
+  <span class="p">}</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>One other feature of YAML is the ability to define multiple documents in a
+single file.  That is why reading YAML produces a vector of your document type.</p>
+</div>
+<div class="section" id="error-handling">
+<h2><a class="toc-backref" href="#id3">Error Handling</a><a class="headerlink" href="#error-handling" title="Permalink to this headline">Â¶</a></h2>
+<p>When parsing a YAML document, if the input does not match your schema (as
+expressed in your XxxTraits<> specializations).  YAML I/O
+will print out an error message and your Input object’s error() method will
+return true. For instance the following document:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Tom</span>
+  <span class="l-Scalar-Plain">shoe-size</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">12</span>
+<span class="p-Indicator">-</span> <span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Dan</span>
+  <span class="l-Scalar-Plain">hat-size</span><span class="p-Indicator">:</span>  <span class="l-Scalar-Plain">7</span>
+</pre></div>
+</div>
+<p>Has a key (shoe-size) that is not defined in the schema.  YAML I/O will
+automatically generate this error:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="l-Scalar-Plain">YAML:2:2</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">error</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">unknown key 'shoe-size'</span>
+  <span class="l-Scalar-Plain">shoe-size</span><span class="p-Indicator">:</span>       <span class="l-Scalar-Plain">12</span>
+  <span class="l-Scalar-Plain">^~~~~~~~~</span>
+</pre></div>
+</div>
+<p>Similar errors are produced for other input not conforming to the schema.</p>
+</div>
+<div class="section" id="scalars">
+<h2><a class="toc-backref" href="#id4">Scalars</a><a class="headerlink" href="#scalars" title="Permalink to this headline">Â¶</a></h2>
+<p>YAML scalars are just strings (i.e. not a sequence or mapping).  The YAML I/O
+library provides support for translating between YAML scalars and specific
+C++ types.</p>
+<div class="section" id="built-in-types">
+<h3><a class="toc-backref" href="#id5">Built-in types</a><a class="headerlink" href="#built-in-types" title="Permalink to this headline">Â¶</a></h3>
+<p>The following types have built-in support in YAML I/O:</p>
+<ul class="simple">
+<li>bool</li>
+<li>float</li>
+<li>double</li>
+<li>StringRef</li>
+<li>std::string</li>
+<li>int64_t</li>
+<li>int32_t</li>
+<li>int16_t</li>
+<li>int8_t</li>
+<li>uint64_t</li>
+<li>uint32_t</li>
+<li>uint16_t</li>
+<li>uint8_t</li>
+</ul>
+<p>That is, you can use those types in fields of MappingTraits or as element type
+in sequence.  When reading, YAML I/O will validate that the string found
+is convertible to that type and error out if not.</p>
+</div>
+<div class="section" id="unique-types">
+<h3><a class="toc-backref" href="#id6">Unique types</a><a class="headerlink" href="#unique-types" title="Permalink to this headline">Â¶</a></h3>
+<p>Given that YAML I/O is trait based, the selection of how to convert your data
+to YAML is based on the type of your data.  But in C++ type matching, typedefs
+do not generate unique type names.  That means if you have two typedefs of
+unsigned int, to YAML I/O both types look exactly like unsigned int.  To
+facilitate make unique type names, YAML I/O provides a macro which is used
+like a typedef on built-in types, but expands to create a class with conversion
+operators to and from the base type.  For example:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="n">LLVM_YAML_STRONG_TYPEDEF</span><span class="p">(</span><span class="n">uint32_t</span><span class="p">,</span> <span class="n">MyFooFlags</span><span class="p">)</span>
+<span class="n">LLVM_YAML_STRONG_TYPEDEF</span><span class="p">(</span><span class="n">uint32_t</span><span class="p">,</span> <span class="n">MyBarFlags</span><span class="p">)</span>
+</pre></div>
+</div>
+<p>This generates two classes MyFooFlags and MyBarFlags which you can use in your
+native data structures instead of uint32_t. They are implicitly
+converted to and from uint32_t.  The point of creating these unique types
+is that you can now specify traits on them to get different YAML conversions.</p>
+</div>
+<div class="section" id="hex-types">
+<h3><a class="toc-backref" href="#id7">Hex types</a><a class="headerlink" href="#hex-types" title="Permalink to this headline">Â¶</a></h3>
+<p>An example use of a unique type is that YAML I/O provides fixed sized unsigned
+integers that are written with YAML I/O as hexadecimal instead of the decimal
+format used by the built-in integer types:</p>
+<ul class="simple">
+<li>Hex64</li>
+<li>Hex32</li>
+<li>Hex16</li>
+<li>Hex8</li>
+</ul>
+<p>You can use llvm::yaml::Hex32 instead of uint32_t and the only different will
+be that when YAML I/O writes out that type it will be formatted in hexadecimal.</p>
+</div>
+<div class="section" id="scalarenumerationtraits">
+<h3><a class="toc-backref" href="#id8">ScalarEnumerationTraits</a><a class="headerlink" href="#scalarenumerationtraits" title="Permalink to this headline">Â¶</a></h3>
+<p>YAML I/O supports translating between in-memory enumerations and a set of string
+values in YAML documents. This is done by specializing ScalarEnumerationTraits<>
+on your enumeration type and define a enumeration() method.
+For instance, suppose you had an enumeration of CPUs and a struct with it as
+a field:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">enum</span> <span class="n">CPUs</span> <span class="p">{</span>
+  <span class="n">cpu_x86_64</span>  <span class="o">=</span> <span class="mi">5</span><span class="p">,</span>
+  <span class="n">cpu_x86</span>     <span class="o">=</span> <span class="mi">7</span><span class="p">,</span>
+  <span class="n">cpu_PowerPC</span> <span class="o">=</span> <span class="mi">8</span>
+<span class="p">};</span>
+
+<span class="k">struct</span> <span class="n">Info</span> <span class="p">{</span>
+  <span class="n">CPUs</span>      <span class="n">cpu</span><span class="p">;</span>
+  <span class="n">uint32_t</span>  <span class="n">flags</span><span class="p">;</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>To support reading and writing of this enumeration, you can define a
+ScalarEnumerationTraits specialization on CPUs, which can then be used
+as a field type:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">ScalarEnumerationTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">MappingTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">ScalarEnumerationTraits</span><span class="o"><</span><span class="n">CPUs</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">enumeration</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">CPUs</span> <span class="o">&</span><span class="n">value</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">enumCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"x86_64"</span><span class="p">,</span>  <span class="n">cpu_x86_64</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">enumCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"x86"</span><span class="p">,</span>     <span class="n">cpu_x86</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">enumCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"PowerPC"</span><span class="p">,</span> <span class="n">cpu_PowerPC</span><span class="p">);</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Info</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Info</span> <span class="o">&</span><span class="n">info</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"cpu"</span><span class="p">,</span>       <span class="n">info</span><span class="p">.</span><span class="n">cpu</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapOptional</span><span class="p">(</span><span class="s">"flags"</span><span class="p">,</span>     <span class="n">info</span><span class="p">.</span><span class="n">flags</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>When reading YAML, if the string found does not match any of the strings
+specified by enumCase() methods, an error is automatically generated.
+When writing YAML, if the value being written does not match any of the values
+specified by the enumCase() methods, a runtime assertion is triggered.</p>
+</div>
+<div class="section" id="bitvalue">
+<h3><a class="toc-backref" href="#id9">BitValue</a><a class="headerlink" href="#bitvalue" title="Permalink to this headline">Â¶</a></h3>
+<p>Another common data structure in C++ is a field where each bit has a unique
+meaning.  This is often used in a “flags” field.  YAML I/O has support for
+converting such fields to a flow sequence.   For instance suppose you
+had the following bit flags defined:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">enum</span> <span class="p">{</span>
+  <span class="n">flagsPointy</span> <span class="o">=</span> <span class="mi">1</span>
+  <span class="n">flagsHollow</span> <span class="o">=</span> <span class="mi">2</span>
+  <span class="n">flagsFlat</span>   <span class="o">=</span> <span class="mi">4</span>
+  <span class="n">flagsRound</span>  <span class="o">=</span> <span class="mi">8</span>
+<span class="p">};</span>
+
+<span class="n">LLVM_YAML_STRONG_TYPEDEF</span><span class="p">(</span><span class="n">uint32_t</span><span class="p">,</span> <span class="n">MyFlags</span><span class="p">)</span>
+</pre></div>
+</div>
+<p>To support reading and writing of MyFlags, you specialize ScalarBitSetTraits<>
+on MyFlags and provide the bit values and their names.</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">ScalarBitSetTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">MappingTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">ScalarBitSetTraits</span><span class="o"><</span><span class="n">MyFlags</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">bitset</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">MyFlags</span> <span class="o">&</span><span class="n">value</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">bitSetCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"hollow"</span><span class="p">,</span>  <span class="n">flagHollow</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">bitSetCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"flat"</span><span class="p">,</span>    <span class="n">flagFlat</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">bitSetCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"round"</span><span class="p">,</span>   <span class="n">flagRound</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">bitSetCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"pointy"</span><span class="p">,</span>  <span class="n">flagPointy</span><span class="p">);</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+
+<span class="k">struct</span> <span class="n">Info</span> <span class="p">{</span>
+  <span class="n">StringRef</span>   <span class="n">name</span><span class="p">;</span>
+  <span class="n">MyFlags</span>     <span class="n">flags</span><span class="p">;</span>
+<span class="p">};</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Info</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Info</span><span class="o">&</span> <span class="n">info</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"name"</span><span class="p">,</span>  <span class="n">info</span><span class="p">.</span><span class="n">name</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"flags"</span><span class="p">,</span> <span class="n">info</span><span class="p">.</span><span class="n">flags</span><span class="p">);</span>
+   <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>With the above, YAML I/O (when writing) will test mask each value in the
+bitset trait against the flags field, and each that matches will
+cause the corresponding string to be added to the flow sequence.  The opposite
+is done when reading and any unknown string values will result in a error. With
+the above schema, a same valid YAML document is:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>    <span class="l-Scalar-Plain">Tom</span>
+<span class="l-Scalar-Plain">flags</span><span class="p-Indicator">:</span>   <span class="p-Indicator">[</span> <span class="nv">pointy</span><span class="p-Indicator">,</span> <span class="nv">flat</span> <span class="p-Indicator">]</span>
+</pre></div>
+</div>
+<p>Sometimes a “flags” field might contains an enumeration part
+defined by a bit-mask.</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">enum</span> <span class="p">{</span>
+  <span class="n">flagsFeatureA</span> <span class="o">=</span> <span class="mi">1</span><span class="p">,</span>
+  <span class="n">flagsFeatureB</span> <span class="o">=</span> <span class="mi">2</span><span class="p">,</span>
+  <span class="n">flagsFeatureC</span> <span class="o">=</span> <span class="mi">4</span><span class="p">,</span>
+
+  <span class="n">flagsCPUMask</span> <span class="o">=</span> <span class="mi">24</span><span class="p">,</span>
+
+  <span class="n">flagsCPU1</span> <span class="o">=</span> <span class="mi">8</span><span class="p">,</span>
+  <span class="n">flagsCPU2</span> <span class="o">=</span> <span class="mi">16</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>To support reading and writing such fields, you need to use the maskedBitSet()
+method and provide the bit values, their names and the enumeration mask.</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">ScalarBitSetTraits</span><span class="o"><</span><span class="n">MyFlags</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">bitset</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">MyFlags</span> <span class="o">&</span><span class="n">value</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">bitSetCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"featureA"</span><span class="p">,</span>  <span class="n">flagsFeatureA</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">bitSetCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"featureB"</span><span class="p">,</span>  <span class="n">flagsFeatureB</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">bitSetCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"featureC"</span><span class="p">,</span>  <span class="n">flagsFeatureC</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">maskedBitSetCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"CPU1"</span><span class="p">,</span>  <span class="n">flagsCPU1</span><span class="p">,</span> <span class="n">flagsCPUMask</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">maskedBitSetCase</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="s">"CPU2"</span><span class="p">,</span>  <span class="n">flagsCPU2</span><span class="p">,</span> <span class="n">flagsCPUMask</span><span class="p">);</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>YAML I/O (when writing) will apply the enumeration mask to the flags field,
+and compare the result and values from the bitset. As in case of a regular
+bitset, each that matches will cause the corresponding string to be added
+to the flow sequence.</p>
+</div>
+<div class="section" id="custom-scalar">
+<h3><a class="toc-backref" href="#id10">Custom Scalar</a><a class="headerlink" href="#custom-scalar" title="Permalink to this headline">Â¶</a></h3>
+<p>Sometimes for readability a scalar needs to be formatted in a custom way. For
+instance your internal data structure may use a integer for time (seconds since
+some epoch), but in YAML it would be much nicer to express that integer in
+some time format (e.g. 4-May-2012 10:30pm).  YAML I/O has a way to support
+custom formatting and parsing of scalar types by specializing ScalarTraits<> on
+your data type.  When writing, YAML I/O will provide the native type and
+your specialization must create a temporary llvm::StringRef.  When reading,
+YAML I/O will provide an llvm::StringRef of scalar and your specialization
+must convert that to your native data type.  An outline of a custom scalar type
+looks like:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">ScalarTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">ScalarTraits</span><span class="o"><</span><span class="n">MyCustomType</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">output</span><span class="p">(</span><span class="k">const</span> <span class="n">MyCustomType</span> <span class="o">&</span><span class="n">value</span><span class="p">,</span> <span class="kt">void</span><span class="o">*</span><span class="p">,</span>
+                     <span class="n">llvm</span><span class="o">::</span><span class="n">raw_ostream</span> <span class="o">&</span><span class="n">out</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">out</span> <span class="o"><<</span> <span class="n">value</span><span class="p">;</span>  <span class="c1">// do custom formatting here</span>
+  <span class="p">}</span>
+  <span class="k">static</span> <span class="n">StringRef</span> <span class="n">input</span><span class="p">(</span><span class="n">StringRef</span> <span class="n">scalar</span><span class="p">,</span> <span class="kt">void</span><span class="o">*</span><span class="p">,</span> <span class="n">MyCustomType</span> <span class="o">&</span><span class="n">value</span><span class="p">)</span> <span class="p">{</span>
+    <span class="c1">// do custom parsing here.  Return the empty string on success,</span>
+    <span class="c1">// or an error message on failure.</span>
+    <span class="k">return</span> <span class="n">StringRef</span><span class="p">();</span>
+  <span class="p">}</span>
+  <span class="c1">// Determine if this scalar needs quotes.</span>
+  <span class="k">static</span> <span class="kt">bool</span> <span class="n">mustQuote</span><span class="p">(</span><span class="n">StringRef</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="kc">true</span><span class="p">;</span> <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="block-scalars">
+<h3><a class="toc-backref" href="#id11">Block Scalars</a><a class="headerlink" href="#block-scalars" title="Permalink to this headline">Â¶</a></h3>
+<p>YAML block scalars are string literals that are represented in YAML using the
+literal block notation, just like the example shown below:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="l-Scalar-Plain">text</span><span class="p-Indicator">:</span> <span class="p-Indicator">|</span>
+  <span class="no">First line</span>
+  <span class="no">Second line</span>
+</pre></div>
+</div>
+<p>The YAML I/O library provides support for translating between YAML block scalars
+and specific C++ types by allowing you to specialize BlockScalarTraits<> on
+your data type. The library doesn’t provide any built-in support for block
+scalar I/O for types like std::string and llvm::StringRef as they are already
+supported by YAML I/O and use the ordinary scalar notation by default.</p>
+<p>BlockScalarTraits specializations are very similar to the
+ScalarTraits specialization - YAML I/O will provide the native type and your
+specialization must create a temporary llvm::StringRef when writing, and
+it will also provide an llvm::StringRef that has the value of that block scalar
+and your specialization must convert that to your native data type when reading.
+An example of a custom type with an appropriate specialization of
+BlockScalarTraits is shown below:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">BlockScalarTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">struct</span> <span class="n">MyStringType</span> <span class="p">{</span>
+  <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">Str</span><span class="p">;</span>
+<span class="p">};</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">BlockScalarTraits</span><span class="o"><</span><span class="n">MyStringType</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">output</span><span class="p">(</span><span class="k">const</span> <span class="n">MyStringType</span> <span class="o">&</span><span class="n">Value</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">Ctxt</span><span class="p">,</span>
+                     <span class="n">llvm</span><span class="o">::</span><span class="n">raw_ostream</span> <span class="o">&</span><span class="n">OS</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">OS</span> <span class="o"><<</span> <span class="n">Value</span><span class="p">.</span><span class="n">Str</span><span class="p">;</span>
+  <span class="p">}</span>
+
+  <span class="k">static</span> <span class="n">StringRef</span> <span class="n">input</span><span class="p">(</span><span class="n">StringRef</span> <span class="n">Scalar</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">Ctxt</span><span class="p">,</span>
+                         <span class="n">MyStringType</span> <span class="o">&</span><span class="n">Value</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">Value</span><span class="p">.</span><span class="n">Str</span> <span class="o">=</span> <span class="n">Scalar</span><span class="p">.</span><span class="n">str</span><span class="p">();</span>
+    <span class="k">return</span> <span class="n">StringRef</span><span class="p">();</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+</div>
+</div>
+<div class="section" id="mappings">
+<h2><a class="toc-backref" href="#id12">Mappings</a><a class="headerlink" href="#mappings" title="Permalink to this headline">Â¶</a></h2>
+<p>To be translated to or from a YAML mapping for your type T you must specialize
+llvm::yaml::MappingTraits on T and implement the “void mapping(IO &io, T&)”
+method. If your native data structures use pointers to a class everywhere,
+you can specialize on the class pointer.  Examples:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">MappingTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="c1">// Example of struct Foo which is used by value</span>
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Foo</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Foo</span> <span class="o">&</span><span class="n">foo</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapOptional</span><span class="p">(</span><span class="s">"size"</span><span class="p">,</span>      <span class="n">foo</span><span class="p">.</span><span class="n">size</span><span class="p">);</span>
+  <span class="p">...</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+
+<span class="c1">// Example of struct Bar which is natively always a pointer</span>
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Bar</span><span class="o">*></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Bar</span> <span class="o">*&</span><span class="n">bar</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapOptional</span><span class="p">(</span><span class="s">"size"</span><span class="p">,</span>    <span class="n">bar</span><span class="o">-></span><span class="n">size</span><span class="p">);</span>
+  <span class="p">...</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<div class="section" id="no-normalization">
+<h3><a class="toc-backref" href="#id13">No Normalization</a><a class="headerlink" href="#no-normalization" title="Permalink to this headline">Â¶</a></h3>
+<p>The mapping() method is responsible, if needed, for normalizing and
+denormalizing. In a simple case where the native data structure requires no
+normalization, the mapping method just uses mapOptional() or mapRequired() to
+bind the struct’s fields to YAML key names.  For example:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">MappingTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Person</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Person</span> <span class="o">&</span><span class="n">info</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"name"</span><span class="p">,</span>         <span class="n">info</span><span class="p">.</span><span class="n">name</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapOptional</span><span class="p">(</span><span class="s">"hat-size"</span><span class="p">,</span>     <span class="n">info</span><span class="p">.</span><span class="n">hatSize</span><span class="p">);</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="normalization">
+<h3><a class="toc-backref" href="#id14">Normalization</a><a class="headerlink" href="#normalization" title="Permalink to this headline">Â¶</a></h3>
+<p>When [de]normalization is required, the mapping() method needs a way to access
+normalized values as fields. To help with this, there is
+a template MappingNormalization<> which you can then use to automatically
+do the normalization and denormalization.  The template is used to create
+a local variable in your mapping() method which contains the normalized keys.</p>
+<p>Suppose you have native data type
+Polar which specifies a position in polar coordinates (distance, angle):</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">struct</span> <span class="n">Polar</span> <span class="p">{</span>
+  <span class="kt">float</span> <span class="n">distance</span><span class="p">;</span>
+  <span class="kt">float</span> <span class="n">angle</span><span class="p">;</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>but you’ve decided the normalized YAML for should be in x,y coordinates. That
+is, you want the yaml to look like:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="l-Scalar-Plain">x</span><span class="p-Indicator">:</span>   <span class="l-Scalar-Plain">10.3</span>
+<span class="l-Scalar-Plain">y</span><span class="p-Indicator">:</span>   <span class="l-Scalar-Plain">-4.7</span>
+</pre></div>
+</div>
+<p>You can support this by defining a MappingTraits that normalizes the polar
+coordinates to x,y coordinates when writing YAML and denormalizes x,y
+coordinates into polar when reading YAML.</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">MappingTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Polar</span><span class="o">></span> <span class="p">{</span>
+
+  <span class="k">class</span> <span class="nc">NormalizedPolar</span> <span class="p">{</span>
+  <span class="k">public</span><span class="o">:</span>
+    <span class="n">NormalizedPolar</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">)</span>
+      <span class="o">:</span> <span class="n">x</span><span class="p">(</span><span class="mf">0.0</span><span class="p">),</span> <span class="n">y</span><span class="p">(</span><span class="mf">0.0</span><span class="p">)</span> <span class="p">{</span>
+    <span class="p">}</span>
+    <span class="n">NormalizedPolar</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="p">,</span> <span class="n">Polar</span> <span class="o">&</span><span class="n">polar</span><span class="p">)</span>
+      <span class="o">:</span> <span class="n">x</span><span class="p">(</span><span class="n">polar</span><span class="p">.</span><span class="n">distance</span> <span class="o">*</span> <span class="n">cos</span><span class="p">(</span><span class="n">polar</span><span class="p">.</span><span class="n">angle</span><span class="p">)),</span>
+        <span class="n">y</span><span class="p">(</span><span class="n">polar</span><span class="p">.</span><span class="n">distance</span> <span class="o">*</span> <span class="n">sin</span><span class="p">(</span><span class="n">polar</span><span class="p">.</span><span class="n">angle</span><span class="p">))</span> <span class="p">{</span>
+    <span class="p">}</span>
+    <span class="n">Polar</span> <span class="n">denormalize</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="p">)</span> <span class="p">{</span>
+      <span class="k">return</span> <span class="n">Polar</span><span class="p">(</span><span class="n">sqrt</span><span class="p">(</span><span class="n">x</span><span class="o">*</span><span class="n">x</span><span class="o">+</span><span class="n">y</span><span class="o">*</span><span class="n">y</span><span class="p">),</span> <span class="n">arctan</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">));</span>
+    <span class="p">}</span>
+
+    <span class="kt">float</span>        <span class="n">x</span><span class="p">;</span>
+    <span class="kt">float</span>        <span class="n">y</span><span class="p">;</span>
+  <span class="p">};</span>
+
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Polar</span> <span class="o">&</span><span class="n">polar</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">MappingNormalization</span><span class="o"><</span><span class="n">NormalizedPolar</span><span class="p">,</span> <span class="n">Polar</span><span class="o">></span> <span class="n">keys</span><span class="p">(</span><span class="n">io</span><span class="p">,</span> <span class="n">polar</span><span class="p">);</span>
+
+    <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"x"</span><span class="p">,</span>    <span class="n">keys</span><span class="o">-></span><span class="n">x</span><span class="p">);</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"y"</span><span class="p">,</span>    <span class="n">keys</span><span class="o">-></span><span class="n">y</span><span class="p">);</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>When writing YAML, the local variable “keys” will be a stack allocated
+instance of NormalizedPolar, constructed from the supplied polar object which
+initializes it x and y fields.  The mapRequired() methods then write out the x
+and y values as key/value pairs.</p>
+<p>When reading YAML, the local variable “keys” will be a stack allocated instance
+of NormalizedPolar, constructed by the empty constructor.  The mapRequired
+methods will find the matching key in the YAML document and fill in the x and y
+fields of the NormalizedPolar object keys. At the end of the mapping() method
+when the local keys variable goes out of scope, the denormalize() method will
+automatically be called to convert the read values back to polar coordinates,
+and then assigned back to the second parameter to mapping().</p>
+<p>In some cases, the normalized class may be a subclass of the native type and
+could be returned by the denormalize() method, except that the temporary
+normalized instance is stack allocated.  In these cases, the utility template
+MappingNormalizationHeap<> can be used instead.  It just like
+MappingNormalization<> except that it heap allocates the normalized object
+when reading YAML.  It never destroys the normalized object.  The denormalize()
+method can this return “this”.</p>
+</div>
+<div class="section" id="default-values">
+<h3><a class="toc-backref" href="#id15">Default values</a><a class="headerlink" href="#default-values" title="Permalink to this headline">Â¶</a></h3>
+<p>Within a mapping() method, calls to io.mapRequired() mean that that key is
+required to exist when parsing YAML documents, otherwise YAML I/O will issue an
+error.</p>
+<p>On the other hand, keys registered with io.mapOptional() are allowed to not
+exist in the YAML document being read.  So what value is put in the field
+for those optional keys?
+There are two steps to how those optional fields are filled in. First, the
+second parameter to the mapping() method is a reference to a native class.  That
+native class must have a default constructor.  Whatever value the default
+constructor initially sets for an optional field will be that field’s value.
+Second, the mapOptional() method has an optional third parameter.  If provided
+it is the value that mapOptional() should set that field to if the YAML document
+does not have that key.</p>
+<p>There is one important difference between those two ways (default constructor
+and third parameter to mapOptional). When YAML I/O generates a YAML document,
+if the mapOptional() third parameter is used, if the actual value being written
+is the same as (using ==) the default value, then that key/value is not written.</p>
+</div>
+<div class="section" id="order-of-keys">
+<h3><a class="toc-backref" href="#id16">Order of Keys</a><a class="headerlink" href="#order-of-keys" title="Permalink to this headline">Â¶</a></h3>
+<p>When writing out a YAML document, the keys are written in the order that the
+calls to mapRequired()/mapOptional() are made in the mapping() method. This
+gives you a chance to write the fields in an order that a human reader of
+the YAML document would find natural.  This may be different that the order
+of the fields in the native class.</p>
+<p>When reading in a YAML document, the keys in the document can be in any order,
+but they are processed in the order that the calls to mapRequired()/mapOptional()
+are made in the mapping() method.  That enables some interesting
+functionality.  For instance, if the first field bound is the cpu and the second
+field bound is flags, and the flags are cpu specific, you can programmatically
+switch how the flags are converted to and from YAML based on the cpu.
+This works for both reading and writing. For example:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">MappingTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">struct</span> <span class="n">Info</span> <span class="p">{</span>
+  <span class="n">CPUs</span>        <span class="n">cpu</span><span class="p">;</span>
+  <span class="n">uint32_t</span>    <span class="n">flags</span><span class="p">;</span>
+<span class="p">};</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Info</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Info</span> <span class="o">&</span><span class="n">info</span><span class="p">)</span> <span class="p">{</span>
+    <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"cpu"</span><span class="p">,</span>       <span class="n">info</span><span class="p">.</span><span class="n">cpu</span><span class="p">);</span>
+    <span class="c1">// flags must come after cpu for this to work when reading yaml</span>
+    <span class="k">if</span> <span class="p">(</span> <span class="n">info</span><span class="p">.</span><span class="n">cpu</span> <span class="o">==</span> <span class="n">cpu_x86_64</span> <span class="p">)</span>
+      <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"flags"</span><span class="p">,</span>  <span class="o">*</span><span class="p">(</span><span class="n">My86_64Flags</span><span class="o">*</span><span class="p">)</span><span class="n">info</span><span class="p">.</span><span class="n">flags</span><span class="p">);</span>
+    <span class="k">else</span>
+      <span class="n">io</span><span class="p">.</span><span class="n">mapRequired</span><span class="p">(</span><span class="s">"flags"</span><span class="p">,</span>  <span class="o">*</span><span class="p">(</span><span class="n">My86Flags</span><span class="o">*</span><span class="p">)</span><span class="n">info</span><span class="p">.</span><span class="n">flags</span><span class="p">);</span>
+ <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="tags">
+<h3><a class="toc-backref" href="#id17">Tags</a><a class="headerlink" href="#tags" title="Permalink to this headline">Â¶</a></h3>
+<p>The YAML syntax supports tags as a way to specify the type of a node before
+it is parsed. This allows dynamic types of nodes.  But the YAML I/O model uses
+static typing, so there are limits to how you can use tags with the YAML I/O
+model. Recently, we added support to YAML I/O for checking/setting the optional
+tag on a map. Using this functionality it is even possbile to support different
+mappings, as long as they are convertable.</p>
+<p>To check a tag, inside your mapping() method you can use io.mapTag() to specify
+what the tag should be.  This will also add that tag when writing yaml.</p>
+</div>
+<div class="section" id="validation">
+<h3><a class="toc-backref" href="#id18">Validation</a><a class="headerlink" href="#validation" title="Permalink to this headline">Â¶</a></h3>
+<p>Sometimes in a yaml map, each key/value pair is valid, but the combination is
+not.  This is similar to something having no syntax errors, but still having
+semantic errors.  To support semantic level checking, YAML I/O allows
+an optional <tt class="docutils literal"><span class="pre">validate()</span></tt> method in a MappingTraits template specialization.</p>
+<p>When parsing yaml, the <tt class="docutils literal"><span class="pre">validate()</span></tt> method is call <em>after</em> all key/values in
+the map have been processed. Any error message returned by the <tt class="docutils literal"><span class="pre">validate()</span></tt>
+method during input will be printed just a like a syntax error would be printed.
+When writing yaml, the <tt class="docutils literal"><span class="pre">validate()</span></tt> method is called <em>before</em> the yaml
+key/values  are written.  Any error during output will trigger an <tt class="docutils literal"><span class="pre">assert()</span></tt>
+because it is a programming error to have invalid struct values.</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">MappingTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">struct</span> <span class="n">Stuff</span> <span class="p">{</span>
+  <span class="p">...</span>
+<span class="p">};</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Stuff</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Stuff</span> <span class="o">&</span><span class="n">stuff</span><span class="p">)</span> <span class="p">{</span>
+  <span class="p">...</span>
+  <span class="p">}</span>
+  <span class="k">static</span> <span class="n">StringRef</span> <span class="n">validate</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Stuff</span> <span class="o">&</span><span class="n">stuff</span><span class="p">)</span> <span class="p">{</span>
+    <span class="c1">// Look at all fields in 'stuff' and if there</span>
+    <span class="c1">// are any bad values return a string describing</span>
+    <span class="c1">// the error.  Otherwise return an empty string.</span>
+    <span class="k">return</span> <span class="n">StringRef</span><span class="p">();</span>
+  <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="flow-mapping">
+<h3><a class="toc-backref" href="#id19">Flow Mapping</a><a class="headerlink" href="#flow-mapping" title="Permalink to this headline">Â¶</a></h3>
+<p>A YAML “flow mapping” is a mapping that uses the inline notation
+(e.g { x: 1, y: 0 } ) when written to YAML. To specify that a type should be
+written in YAML using flow mapping, your MappingTraits specialization should
+add “static const bool flow = true;”. For instance:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">MappingTraits</span><span class="p">;</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">IO</span><span class="p">;</span>
+
+<span class="k">struct</span> <span class="n">Stuff</span> <span class="p">{</span>
+  <span class="p">...</span>
+<span class="p">};</span>
+
+<span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">MappingTraits</span><span class="o"><</span><span class="n">Stuff</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="kt">void</span> <span class="n">mapping</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">Stuff</span> <span class="o">&</span><span class="n">stuff</span><span class="p">)</span> <span class="p">{</span>
+    <span class="p">...</span>
+  <span class="p">}</span>
+
+  <span class="k">static</span> <span class="k">const</span> <span class="kt">bool</span> <span class="n">flow</span> <span class="o">=</span> <span class="kc">true</span><span class="p">;</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>Flow mappings are subject to line wrapping according to the Output object
+configuration.</p>
+</div>
+</div>
+<div class="section" id="sequence">
+<h2><a class="toc-backref" href="#id20">Sequence</a><a class="headerlink" href="#sequence" title="Permalink to this headline">Â¶</a></h2>
+<p>To be translated to or from a YAML sequence for your type T you must specialize
+llvm::yaml::SequenceTraits on T and implement two methods:
+<tt class="docutils literal"><span class="pre">size_t</span> <span class="pre">size(IO</span> <span class="pre">&io,</span> <span class="pre">T&)</span></tt> and
+<tt class="docutils literal"><span class="pre">T::value_type&</span> <span class="pre">element(IO</span> <span class="pre">&io,</span> <span class="pre">T&,</span> <span class="pre">size_t</span> <span class="pre">indx)</span></tt>.  For example:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">SequenceTraits</span><span class="o"><</span><span class="n">MySeq</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="n">size_t</span> <span class="n">size</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">MySeq</span> <span class="o">&</span><span class="n">list</span><span class="p">)</span> <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
+  <span class="k">static</span> <span class="n">MySeqEl</span> <span class="o">&</span><span class="n">element</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">MySeq</span> <span class="o">&</span><span class="n">list</span><span class="p">,</span> <span class="n">size_t</span> <span class="n">index</span><span class="p">)</span> <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>The size() method returns how many elements are currently in your sequence.
+The element() method returns a reference to the i’th element in the sequence.
+When parsing YAML, the element() method may be called with an index one bigger
+than the current size.  Your element() method should allocate space for one
+more element (using default constructor if element is a C++ object) and returns
+a reference to that new allocated space.</p>
+<div class="section" id="flow-sequence">
+<h3><a class="toc-backref" href="#id21">Flow Sequence</a><a class="headerlink" href="#flow-sequence" title="Permalink to this headline">Â¶</a></h3>
+<p>A YAML “flow sequence” is a sequence that when written to YAML it uses the
+inline notation (e.g [ foo, bar ] ).  To specify that a sequence type should
+be written in YAML as a flow sequence, your SequenceTraits specialization should
+add “static const bool flow = true;”.  For instance:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">SequenceTraits</span><span class="o"><</span><span class="n">MyList</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="n">size_t</span> <span class="n">size</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">MyList</span> <span class="o">&</span><span class="n">list</span><span class="p">)</span> <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
+  <span class="k">static</span> <span class="n">MyListEl</span> <span class="o">&</span><span class="n">element</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">MyList</span> <span class="o">&</span><span class="n">list</span><span class="p">,</span> <span class="n">size_t</span> <span class="n">index</span><span class="p">)</span> <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
+
+  <span class="c1">// The existence of this member causes YAML I/O to use a flow sequence</span>
+  <span class="k">static</span> <span class="k">const</span> <span class="kt">bool</span> <span class="n">flow</span> <span class="o">=</span> <span class="kc">true</span><span class="p">;</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+<p>With the above, if you used MyList as the data type in your native data
+structures, then when converted to YAML, a flow sequence of integers
+will be used (e.g. [ 10, -3, 4 ]).</p>
+<p>Flow sequences are subject to line wrapping according to the Output object
+configuration.</p>
+</div>
+<div class="section" id="utility-macros">
+<h3><a class="toc-backref" href="#id22">Utility Macros</a><a class="headerlink" href="#utility-macros" title="Permalink to this headline">Â¶</a></h3>
+<p>Since a common source of sequences is std::vector<>, YAML I/O provides macros:
+LLVM_YAML_IS_SEQUENCE_VECTOR() and LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR() which
+can be used to easily specify SequenceTraits<> on a std::vector type.  YAML
+I/O does not partial specialize SequenceTraits on std::vector<> because that
+would force all vectors to be sequences.  An example use of the macros:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">MyType1</span><span class="o">></span><span class="p">;</span>
+<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">MyType2</span><span class="o">></span><span class="p">;</span>
+<span class="n">LLVM_YAML_IS_SEQUENCE_VECTOR</span><span class="p">(</span><span class="n">MyType1</span><span class="p">)</span>
+<span class="n">LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR</span><span class="p">(</span><span class="n">MyType2</span><span class="p">)</span>
+</pre></div>
+</div>
+</div>
+</div>
+<div class="section" id="document-list">
+<h2><a class="toc-backref" href="#id23">Document List</a><a class="headerlink" href="#document-list" title="Permalink to this headline">Â¶</a></h2>
+<p>YAML allows you to define multiple “documents” in a single YAML file.  Each
+new document starts with a left aligned “—” token.  The end of all documents
+is denoted with a left aligned ”...” token.  Many users of YAML will never
+have need for multiple documents.  The top level node in their YAML schema
+will be a mapping or sequence. For those cases, the following is not needed.
+But for cases where you do want multiple documents, you can specify a
+trait for you document list type.  The trait has the same methods as
+SequenceTraits but is named DocumentListTraits.  For example:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">template</span> <span class="o"><></span>
+<span class="k">struct</span> <span class="n">DocumentListTraits</span><span class="o"><</span><span class="n">MyDocList</span><span class="o">></span> <span class="p">{</span>
+  <span class="k">static</span> <span class="n">size_t</span> <span class="n">size</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">MyDocList</span> <span class="o">&</span><span class="n">list</span><span class="p">)</span> <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
+  <span class="k">static</span> <span class="n">MyDocType</span> <span class="n">element</span><span class="p">(</span><span class="n">IO</span> <span class="o">&</span><span class="n">io</span><span class="p">,</span> <span class="n">MyDocList</span> <span class="o">&</span><span class="n">list</span><span class="p">,</span> <span class="n">size_t</span> <span class="n">index</span><span class="p">)</span> <span class="p">{</span> <span class="p">...</span> <span class="p">}</span>
+<span class="p">};</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="user-context-data">
+<h2><a class="toc-backref" href="#id24">User Context Data</a><a class="headerlink" href="#user-context-data" title="Permalink to this headline">Â¶</a></h2>
+<p>When an llvm::yaml::Input or llvm::yaml::Output object is created their
+constructors take an optional “context” parameter.  This is a pointer to
+whatever state information you might need.</p>
+<p>For instance, in a previous example we showed how the conversion type for a
+flags field could be determined at runtime based on the value of another field
+in the mapping. But what if an inner mapping needs to know some field value
+of an outer mapping?  That is where the “context” parameter comes in. You
+can set values in the context in the outer map’s mapping() method and
+retrieve those values in the inner map’s mapping() method.</p>
+<p>The context value is just a void*.  All your traits which use the context
+and operate on your native data types, need to agree what the context value
+actually is.  It could be a pointer to an object or struct which your various
+traits use to shared context sensitive information.</p>
+</div>
+<div class="section" id="output">
+<h2><a class="toc-backref" href="#id25">Output</a><a class="headerlink" href="#output" title="Permalink to this headline">Â¶</a></h2>
+<p>The llvm::yaml::Output class is used to generate a YAML document from your
+in-memory data structures, using traits defined on your data types.
+To instantiate an Output object you need an llvm::raw_ostream, an optional
+context pointer and an optional wrapping column:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">class</span> <span class="nc">Output</span> <span class="o">:</span> <span class="k">public</span> <span class="n">IO</span> <span class="p">{</span>
+<span class="k">public</span><span class="o">:</span>
+  <span class="n">Output</span><span class="p">(</span><span class="n">llvm</span><span class="o">::</span><span class="n">raw_ostream</span> <span class="o">&</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">context</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">,</span> <span class="kt">int</span> <span class="n">WrapColumn</span> <span class="o">=</span> <span class="mi">70</span><span class="p">);</span>
+</pre></div>
+</div>
+<p>Once you have an Output object, you can use the C++ stream operator on it
+to write your native data as YAML. One thing to recall is that a YAML file
+can contain multiple “documents”.  If the top level data structure you are
+streaming as YAML is a mapping, scalar, or sequence, then Output assumes you
+are generating one document and wraps the mapping output
+with  “<tt class="docutils literal"><span class="pre">---</span></tt>” and trailing “<tt class="docutils literal"><span class="pre">...</span></tt>”.</p>
+<p>The WrapColumn parameter will cause the flow mappings and sequences to
+line-wrap when they go over the supplied column. Pass 0 to completely
+suppress the wrapping.</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">Output</span><span class="p">;</span>
+
+<span class="kt">void</span> <span class="n">dumpMyMapDoc</span><span class="p">(</span><span class="k">const</span> <span class="n">MyMapType</span> <span class="o">&</span><span class="n">info</span><span class="p">)</span> <span class="p">{</span>
+  <span class="n">Output</span> <span class="n">yout</span><span class="p">(</span><span class="n">llvm</span><span class="o">::</span><span class="n">outs</span><span class="p">());</span>
+  <span class="n">yout</span> <span class="o"><<</span> <span class="n">info</span><span class="p">;</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>The above could produce output like:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="nn">---</span>
+<span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Tom</span>
+<span class="l-Scalar-Plain">hat-size</span><span class="p-Indicator">:</span>  <span class="l-Scalar-Plain">7</span>
+<span class="nn">...</span>
+</pre></div>
+</div>
+<p>On the other hand, if the top level data structure you are streaming as YAML
+has a DocumentListTraits specialization, then Output walks through each element
+of your DocumentList and generates a “—” before the start of each element
+and ends with a ”...”.</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">Output</span><span class="p">;</span>
+
+<span class="kt">void</span> <span class="n">dumpMyMapDoc</span><span class="p">(</span><span class="k">const</span> <span class="n">MyDocListType</span> <span class="o">&</span><span class="n">docList</span><span class="p">)</span> <span class="p">{</span>
+  <span class="n">Output</span> <span class="n">yout</span><span class="p">(</span><span class="n">llvm</span><span class="o">::</span><span class="n">outs</span><span class="p">());</span>
+  <span class="n">yout</span> <span class="o"><<</span> <span class="n">docList</span><span class="p">;</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>The above could produce output like:</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="nn">---</span>
+<span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Tom</span>
+<span class="l-Scalar-Plain">hat-size</span><span class="p-Indicator">:</span>  <span class="l-Scalar-Plain">7</span>
+<span class="nn">---</span>
+<span class="l-Scalar-Plain">name</span><span class="p-Indicator">:</span>      <span class="l-Scalar-Plain">Tom</span>
+<span class="l-Scalar-Plain">shoe-size</span><span class="p-Indicator">:</span>  <span class="l-Scalar-Plain">11</span>
+<span class="nn">...</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="input">
+<h2><a class="toc-backref" href="#id26">Input</a><a class="headerlink" href="#input" title="Permalink to this headline">Â¶</a></h2>
+<p>The llvm::yaml::Input class is used to parse YAML document(s) into your native
+data structures. To instantiate an Input
+object you need a StringRef to the entire YAML file, and optionally a context
+pointer:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="k">class</span> <span class="nc">Input</span> <span class="o">:</span> <span class="k">public</span> <span class="n">IO</span> <span class="p">{</span>
+<span class="k">public</span><span class="o">:</span>
+  <span class="n">Input</span><span class="p">(</span><span class="n">StringRef</span> <span class="n">inputContent</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">context</span><span class="o">=</span><span class="nb">NULL</span><span class="p">);</span>
+</pre></div>
+</div>
+<p>Once you have an Input object, you can use the C++ stream operator to read
+the document(s).  If you expect there might be multiple YAML documents in
+one file, you’ll need to specialize DocumentListTraits on a list of your
+document type and stream in that document list type.  Otherwise you can
+just stream in the document type.  Also, you can check if there was
+any syntax errors in the YAML be calling the error() method on the Input
+object.  For example:</p>
+<div class="highlight-c++"><div class="highlight"><pre><span class="c1">// Reading a single document</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">Input</span><span class="p">;</span>
+
+<span class="n">Input</span> <span class="n">yin</span><span class="p">(</span><span class="n">mb</span><span class="p">.</span><span class="n">getBuffer</span><span class="p">());</span>
+
+<span class="c1">// Parse the YAML file</span>
+<span class="n">MyDocType</span> <span class="n">theDoc</span><span class="p">;</span>
+<span class="n">yin</span> <span class="o">>></span> <span class="n">theDoc</span><span class="p">;</span>
+
+<span class="c1">// Check for error</span>
+<span class="k">if</span> <span class="p">(</span> <span class="n">yin</span><span class="p">.</span><span class="n">error</span><span class="p">()</span> <span class="p">)</span>
+  <span class="k">return</span><span class="p">;</span>
+</pre></div>
+</div>
+<div class="highlight-c++"><div class="highlight"><pre><span class="c1">// Reading multiple documents in one file</span>
+<span class="k">using</span> <span class="n">llvm</span><span class="o">::</span><span class="n">yaml</span><span class="o">::</span><span class="n">Input</span><span class="p">;</span>
+
+<span class="n">LLVM_YAML_IS_DOCUMENT_LIST_VECTOR</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">MyDocType</span><span class="o">></span><span class="p">)</span>
+
+<span class="n">Input</span> <span class="n">yin</span><span class="p">(</span><span class="n">mb</span><span class="p">.</span><span class="n">getBuffer</span><span class="p">());</span>
+
+<span class="c1">// Parse the YAML file</span>
+<span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o"><</span><span class="n">MyDocType</span><span class="o">></span> <span class="n">theDocList</span><span class="p">;</span>
+<span class="n">yin</span> <span class="o">>></span> <span class="n">theDocList</span><span class="p">;</span>
+
+<span class="c1">// Check for error</span>
+<span class="k">if</span> <span class="p">(</span> <span class="n">yin</span><span class="p">.</span><span class="n">error</span><span class="p">()</span> <span class="p">)</span>
+  <span class="k">return</span><span class="p">;</span>
+</pre></div>
+</div>
+</div>
+</div>
+
+
+          </div>
+      </div>
+      <div class="clearer"></div>
+    </div>
+    <div class="related">
+      <h3>Navigation</h3>
+      <ul>
+        <li class="right" style="margin-right: 10px">
+          <a href="genindex.html" title="General Index"
+             >index</a></li>
+        <li class="right" >
+          <a href="GetElementPtr.html" title="The Often Misunderstood GEP Instruction"
+             >next</a> |</li>
+        <li class="right" >
+          <a href="Passes.html" title="LLVMâs Analysis and Transform Passes"
+             >previous</a> |</li>
+  <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+  <li><a href="index.html">Documentation</a>»</li>
+ 
+      </ul>
+    </div>
+    <div class="footer">
+        © Copyright 2003-2017, LLVM Project.
+      Last updated on 2017-06-27.
+      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
+    </div>
+  </body>
+</html>
\ No newline at end of file

Added: www-releases/trunk/4.0.1/docs/_images/ARM-BE-bitcastfail.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/ARM-BE-bitcastfail.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/ARM-BE-bitcastfail.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/ARM-BE-bitcastsuccess.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/ARM-BE-bitcastsuccess.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/ARM-BE-bitcastsuccess.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/ARM-BE-ld1.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/ARM-BE-ld1.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/ARM-BE-ld1.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/ARM-BE-ldr.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/ARM-BE-ldr.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/ARM-BE-ldr.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/LangImpl05-cfg.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/LangImpl05-cfg.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/LangImpl05-cfg.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/MCJIT-creation.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/MCJIT-creation.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/MCJIT-creation.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/MCJIT-dyld-load.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/MCJIT-dyld-load.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/MCJIT-dyld-load.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/MCJIT-engine-builder.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/MCJIT-engine-builder.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/MCJIT-engine-builder.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/MCJIT-load-object.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/MCJIT-load-object.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/MCJIT-load-object.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/MCJIT-load.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/MCJIT-load.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/MCJIT-load.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/MCJIT-resolve-relocations.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/MCJIT-resolve-relocations.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/MCJIT-resolve-relocations.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/gcc-loops.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/gcc-loops.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/gcc-loops.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_images/linpack-pc.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_images/linpack-pc.png?rev=306451&view=auto
==============================================================================
Binary file - no diff available.

Propchange: www-releases/trunk/4.0.1/docs/_images/linpack-pc.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: www-releases/trunk/4.0.1/docs/_sources/AMDGPUUsage.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/AMDGPUUsage.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/AMDGPUUsage.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/AMDGPUUsage.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,353 @@
+==============================
+User Guide for AMDGPU Back-end
+==============================
+
+Introduction
+============
+
+The AMDGPU back-end provides ISA code generation for AMD GPUs, starting with
+the R600 family up until the current Volcanic Islands (GCN Gen 3).
+
+Refer to `AMDGPU section in Architecture & Platform Information for Compiler Writers <CompilerWriterInfo.html#amdgpu>`_
+for additional documentation.
+
+Conventions
+===========
+
+Address Spaces
+--------------
+
+The AMDGPU back-end uses the following address space mapping:
+
+   ============= ============================================
+   Address Space Memory Space
+   ============= ============================================
+   0             Private
+   1             Global
+   2             Constant
+   3             Local
+   4             Generic (Flat)
+   5             Region
+   ============= ============================================
+
+The terminology in the table, aside from the region memory space, is from the
+OpenCL standard.
+
+
+Assembler
+=========
+
+AMDGPU backend has LLVM-MC based assembler which is currently in development.
+It supports Southern Islands ISA, Sea Islands and Volcanic Islands.
+
+This document describes general syntax for instructions and operands. For more
+information about instructions, their semantics and supported combinations
+of operands, refer to one of Instruction Set Architecture manuals.
+
+An instruction has the following syntax (register operands are
+normally comma-separated while extra operands are space-separated):
+
+*<opcode> <register_operand0>, ... <extra_operand0> ...*
+
+
+Operands
+--------
+
+The following syntax for register operands is supported:
+
+* SGPR registers: s0, ... or s[0], ...
+* VGPR registers: v0, ... or v[0], ...
+* TTMP registers: ttmp0, ... or ttmp[0], ...
+* Special registers: exec (exec_lo, exec_hi), vcc (vcc_lo, vcc_hi), flat_scratch (flat_scratch_lo, flat_scratch_hi)
+* Special trap registers: tba (tba_lo, tba_hi), tma (tma_lo, tma_hi)
+* Register pairs, quads, etc: s[2:3], v[10:11], ttmp[5:6], s[4:7], v[12:15], ttmp[4:7], s[8:15], ...
+* Register lists: [s0, s1], [ttmp0, ttmp1, ttmp2, ttmp3]
+* Register index expressions: v[2*2], s[1-1:2-1]
+* 'off' indicates that an operand is not enabled
+
+The following extra operands are supported:
+
+* offset, offset0, offset1
+* idxen, offen bits
+* glc, slc, tfe bits
+* waitcnt: integer or combination of counter values
+* VOP3 modifiers:
+
+  - abs (\| \|), neg (\-)
+
+* DPP modifiers:
+
+  - row_shl, row_shr, row_ror, row_rol
+  - row_mirror, row_half_mirror, row_bcast
+  - wave_shl, wave_shr, wave_ror, wave_rol, quad_perm
+  - row_mask, bank_mask, bound_ctrl
+
+* SDWA modifiers:
+
+  - dst_sel, src0_sel, src1_sel (BYTE_N, WORD_M, DWORD)
+  - dst_unused (UNUSED_PAD, UNUSED_SEXT, UNUSED_PRESERVE)
+  - abs, neg, sext
+
+DS Instructions Examples
+------------------------
+
+.. code-block:: nasm
+
+  ds_add_u32 v2, v4 offset:16
+  ds_write_src2_b64 v2 offset0:4 offset1:8
+  ds_cmpst_f32 v2, v4, v6
+  ds_min_rtn_f64 v[8:9], v2, v[4:5]
+
+
+For full list of supported instructions, refer to "LDS/GDS instructions" in ISA Manual.
+
+FLAT Instruction Examples
+--------------------------
+
+.. code-block:: nasm
+
+  flat_load_dword v1, v[3:4]
+  flat_store_dwordx3 v[3:4], v[5:7]
+  flat_atomic_swap v1, v[3:4], v5 glc
+  flat_atomic_cmpswap v1, v[3:4], v[5:6] glc slc
+  flat_atomic_fmax_x2 v[1:2], v[3:4], v[5:6] glc
+
+For full list of supported instructions, refer to "FLAT instructions" in ISA Manual.
+
+MUBUF Instruction Examples
+---------------------------
+
+.. code-block:: nasm
+
+  buffer_load_dword v1, off, s[4:7], s1
+  buffer_store_dwordx4 v[1:4], v2, ttmp[4:7], s1 offen offset:4 glc tfe
+  buffer_store_format_xy v[1:2], off, s[4:7], s1
+  buffer_wbinvl1
+  buffer_atomic_inc v1, v2, s[8:11], s4 idxen offset:4 slc
+
+For full list of supported instructions, refer to "MUBUF Instructions" in ISA Manual.
+
+SMRD/SMEM Instruction Examples
+-------------------------------
+
+.. code-block:: nasm
+
+  s_load_dword s1, s[2:3], 0xfc
+  s_load_dwordx8 s[8:15], s[2:3], s4
+  s_load_dwordx16 s[88:103], s[2:3], s4
+  s_dcache_inv_vol
+  s_memtime s[4:5]
+
+For full list of supported instructions, refer to "Scalar Memory Operations" in ISA Manual.
+
+SOP1 Instruction Examples
+--------------------------
+
+.. code-block:: nasm
+
+  s_mov_b32 s1, s2
+  s_mov_b64 s[0:1], 0x80000000
+  s_cmov_b32 s1, 200
+  s_wqm_b64 s[2:3], s[4:5]
+  s_bcnt0_i32_b64 s1, s[2:3]
+  s_swappc_b64 s[2:3], s[4:5]
+  s_cbranch_join s[4:5]
+
+For full list of supported instructions, refer to "SOP1 Instructions" in ISA Manual.
+
+SOP2 Instruction Examples
+-------------------------
+
+.. code-block:: nasm
+
+  s_add_u32 s1, s2, s3
+  s_and_b64 s[2:3], s[4:5], s[6:7]
+  s_cselect_b32 s1, s2, s3
+  s_andn2_b32 s2, s4, s6
+  s_lshr_b64 s[2:3], s[4:5], s6
+  s_ashr_i32 s2, s4, s6
+  s_bfm_b64 s[2:3], s4, s6
+  s_bfe_i64 s[2:3], s[4:5], s6
+  s_cbranch_g_fork s[4:5], s[6:7]
+
+For full list of supported instructions, refer to "SOP2 Instructions" in ISA Manual.
+
+SOPC Instruction Examples
+--------------------------
+
+.. code-block:: nasm
+
+  s_cmp_eq_i32 s1, s2
+  s_bitcmp1_b32 s1, s2
+  s_bitcmp0_b64 s[2:3], s4
+  s_setvskip s3, s5
+
+For full list of supported instructions, refer to "SOPC Instructions" in ISA Manual.
+
+SOPP Instruction Examples
+--------------------------
+
+.. code-block:: nasm
+
+  s_barrier
+  s_nop 2
+  s_endpgm
+  s_waitcnt 0 ; Wait for all counters to be 0
+  s_waitcnt vmcnt(0) & expcnt(0) & lgkmcnt(0) ; Equivalent to above
+  s_waitcnt vmcnt(1) ; Wait for vmcnt counter to be 1.
+  s_sethalt 9
+  s_sleep 10
+  s_sendmsg 0x1
+  s_sendmsg sendmsg(MSG_INTERRUPT)
+  s_trap 1
+
+For full list of supported instructions, refer to "SOPP Instructions" in ISA Manual.
+
+Unless otherwise mentioned, little verification is performed on the operands
+of SOPP Instrucitons, so it is up to the programmer to be familiar with the
+range or acceptable values.
+
+Vector ALU Instruction Examples
+-------------------------------
+
+For vector ALU instruction opcodes (VOP1, VOP2, VOP3, VOPC, VOP_DPP, VOP_SDWA),
+the assembler will automatically use optimal encoding based on its operands.
+To force specific encoding, one can add a suffix to the opcode of the instruction:
+
+* _e32 for 32-bit VOP1/VOP2/VOPC
+* _e64 for 64-bit VOP3
+* _dpp for VOP_DPP
+* _sdwa for VOP_SDWA
+
+VOP1/VOP2/VOP3/VOPC examples:
+
+.. code-block:: nasm
+
+  v_mov_b32 v1, v2
+  v_mov_b32_e32 v1, v2
+  v_nop
+  v_cvt_f64_i32_e32 v[1:2], v2
+  v_floor_f32_e32 v1, v2
+  v_bfrev_b32_e32 v1, v2
+  v_add_f32_e32 v1, v2, v3
+  v_mul_i32_i24_e64 v1, v2, 3
+  v_mul_i32_i24_e32 v1, -3, v3
+  v_mul_i32_i24_e32 v1, -100, v3
+  v_addc_u32 v1, s[0:1], v2, v3, s[2:3]
+  v_max_f16_e32 v1, v2, v3
+
+VOP_DPP examples:
+
+.. code-block:: nasm
+
+  v_mov_b32 v0, v0 quad_perm:[0,2,1,1]
+  v_sin_f32 v0, v0 row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0
+  v_mov_b32 v0, v0 wave_shl:1
+  v_mov_b32 v0, v0 row_mirror
+  v_mov_b32 v0, v0 row_bcast:31
+  v_mov_b32 v0, v0 quad_perm:[1,3,0,1] row_mask:0xa bank_mask:0x1 bound_ctrl:0
+  v_add_f32 v0, v0, |v0| row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0
+  v_max_f16 v1, v2, v3 row_shl:1 row_mask:0xa bank_mask:0x1 bound_ctrl:0
+
+VOP_SDWA examples:
+
+.. code-block:: nasm
+
+  v_mov_b32 v1, v2 dst_sel:BYTE_0 dst_unused:UNUSED_PRESERVE src0_sel:DWORD
+  v_min_u32 v200, v200, v1 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:BYTE_1 src1_sel:DWORD
+  v_sin_f32 v0, v0 dst_unused:UNUSED_PAD src0_sel:WORD_1
+  v_fract_f32 v0, |v0| dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_1
+  v_cmpx_le_u32 vcc, v1, v2 src0_sel:BYTE_2 src1_sel:WORD_0
+
+For full list of supported instructions, refer to "Vector ALU instructions".
+
+HSA Code Object Directives
+--------------------------
+
+AMDGPU ABI defines auxiliary data in output code object. In assembly source,
+one can specify them with assembler directives.
+
+.hsa_code_object_version major, minor
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+*major* and *minor* are integers that specify the version of the HSA code
+object that will be generated by the assembler.
+
+.hsa_code_object_isa [major, minor, stepping, vendor, arch]
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+*major*, *minor*, and *stepping* are all integers that describe the instruction
+set architecture (ISA) version of the assembly program.
+
+*vendor* and *arch* are quoted strings.  *vendor* should always be equal to
+"AMD" and *arch* should always be equal to "AMDGPU".
+
+By default, the assembler will derive the ISA version, *vendor*, and *arch*
+from the value of the -mcpu option that is passed to the assembler.
+
+.amdgpu_hsa_kernel (name)
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This directives specifies that the symbol with given name is a kernel entry point
+(label) and the object should contain corresponding symbol of type STT_AMDGPU_HSA_KERNEL.
+
+.amd_kernel_code_t
+^^^^^^^^^^^^^^^^^^
+
+This directive marks the beginning of a list of key / value pairs that are used
+to specify the amd_kernel_code_t object that will be emitted by the assembler.
+The list must be terminated by the *.end_amd_kernel_code_t* directive.  For
+any amd_kernel_code_t values that are unspecified a default value will be
+used.  The default value for all keys is 0, with the following exceptions:
+
+- *kernel_code_version_major* defaults to 1.
+- *machine_kind* defaults to 1.
+- *machine_version_major*, *machine_version_minor*, and
+  *machine_version_stepping* are derived from the value of the -mcpu option
+  that is passed to the assembler.
+- *kernel_code_entry_byte_offset* defaults to 256.
+- *wavefront_size* defaults to 6.
+- *kernarg_segment_alignment*, *group_segment_alignment*, and
+  *private_segment_alignment* default to 4.  Note that alignments are specified
+  as a power of two, so a value of **n** means an alignment of 2^ **n**.
+
+The *.amd_kernel_code_t* directive must be placed immediately after the
+function label and before any instructions.
+
+For a full list of amd_kernel_code_t keys, refer to AMDGPU ABI document,
+comments in lib/Target/AMDGPU/AmdKernelCodeT.h and test/CodeGen/AMDGPU/hsa.s.
+
+Here is an example of a minimal amd_kernel_code_t specification:
+
+.. code-block:: none
+
+   .hsa_code_object_version 1,0
+   .hsa_code_object_isa
+
+   .hsatext
+   .globl  hello_world
+   .p2align 8
+   .amdgpu_hsa_kernel hello_world
+
+   hello_world:
+
+      .amd_kernel_code_t
+         enable_sgpr_kernarg_segment_ptr = 1
+         is_ptr64 = 1
+         compute_pgm_rsrc1_vgprs = 0
+         compute_pgm_rsrc1_sgprs = 0
+         compute_pgm_rsrc2_user_sgpr = 2
+         kernarg_segment_byte_size = 8
+         wavefront_sgpr_count = 2
+         workitem_vgpr_count = 3
+     .end_amd_kernel_code_t
+
+     s_load_dwordx2 s[0:1], s[0:1] 0x0
+     v_mov_b32 v0, 3.14159
+     s_waitcnt lgkmcnt(0)
+     v_mov_b32 v1, s0
+     v_mov_b32 v2, s1
+     flat_store_dword v[1:2], v0
+     s_endpgm
+   .Lfunc_end0:
+        .size   hello_world, .Lfunc_end0-hello_world

Added: www-releases/trunk/4.0.1/docs/_sources/AdvancedBuilds.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/AdvancedBuilds.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/AdvancedBuilds.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/AdvancedBuilds.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,174 @@
+=============================
+Advanced Build Configurations
+=============================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+`CMake <http://www.cmake.org/>`_ is a cross-platform build-generator tool. CMake
+does not build the project, it generates the files needed by your build tool
+(GNU make, Visual Studio, etc.) for building LLVM.
+
+If **you are a new contributor**, please start with the :doc:`GettingStarted` or
+:doc:`CMake` pages. This page is intended for users doing more complex builds.
+
+Many of the examples below are written assuming specific CMake Generators.
+Unless otherwise explicitly called out these commands should work with any CMake
+generator.
+
+Bootstrap Builds
+================
+
+The Clang CMake build system supports bootstrap (aka multi-stage) builds. At a
+high level a multi-stage build is a chain of builds that pass data from one
+stage into the next. The most common and simple version of this is a traditional
+bootstrap build.
+
+In a simple two-stage bootstrap build, we build clang using the system compiler,
+then use that just-built clang to build clang again. In CMake this simplest form
+of a bootstrap build can be configured with a single option,
+CLANG_ENABLE_BOOTSTRAP.
+
+.. code-block:: console
+
+  $ cmake -G Ninja -DCLANG_ENABLE_BOOTSTRAP=On <path to source>
+  $ ninja stage2
+
+This command itself isn't terribly useful because it assumes default
+configurations for each stage. The next series of examples utilize CMake cache
+scripts to provide more complex options.
+
+The clang build system refers to builds as stages. A stage1 build is a standard
+build using the compiler installed on the host, and a stage2 build is built
+using the stage1 compiler. This nomenclature holds up to more stages too. In
+general a stage*n* build is built using the output from stage*n-1*.
+
+Apple Clang Builds (A More Complex Bootstrap)
+=============================================
+
+Apple's Clang builds are a slightly more complicated example of the simple
+bootstrapping scenario. Apple Clang is built using a 2-stage build.
+
+The stage1 compiler is a host-only compiler with some options set. The stage1
+compiler is a balance of optimization vs build time because it is a throwaway.
+The stage2 compiler is the fully optimized compiler intended to ship to users.
+
+Setting up these compilers requires a lot of options. To simplify the
+configuration the Apple Clang build settings are contained in CMake Cache files.
+You can build an Apple Clang compiler using the following commands:
+
+.. code-block:: console
+
+  $ cmake -G Ninja -C <path to clang>/cmake/caches/Apple-stage1.cmake <path to source>
+  $ ninja stage2-distribution
+
+This CMake invocation configures the stage1 host compiler, and sets
+CLANG_BOOTSTRAP_CMAKE_ARGS to pass the Apple-stage2.cmake cache script to the
+stage2 configuration step.
+
+When you build the stage2-distribution target it builds the minimal stage1
+compiler and required tools, then configures and builds the stage2 compiler
+based on the settings in Apple-stage2.cmake.
+
+This pattern of using cache scripts to set complex settings, and specifically to
+make later stage builds include cache scripts is common in our more advanced
+build configurations.
+
+Multi-stage PGO
+===============
+
+Profile-Guided Optimizations (PGO) is a really great way to optimize the code
+clang generates. Our multi-stage PGO builds are a workflow for generating PGO
+profiles that can be used to optimize clang.
+
+At a high level, the way PGO works is that you build an instrumented compiler,
+then you run the instrumented compiler against sample source files. While the
+instrumented compiler runs it will output a bunch of files containing
+performance counters (.profraw files). After generating all the profraw files
+you use llvm-profdata to merge the files into a single profdata file that you
+can feed into the LLVM_PROFDATA_FILE option.
+
+Our PGO.cmake cache script automates that whole process. You can use it by
+running:
+
+.. code-block:: console
+
+  $ cmake -G Ninja -C <path_to_clang>/cmake/caches/PGO.cmake <source dir>
+  $ ninja stage2-instrumented-generate-profdata
+
+If you let that run for a few hours or so, it will place a profdata file in your
+build directory. This takes a really long time because it builds clang twice,
+and you *must* have compiler-rt in your build tree.
+
+This process uses any source files under the perf-training directory as training
+data as long as the source files are marked up with LIT-style RUN lines.
+
+After it finishes you can use âfind . -name clang.profdataâ to find it, but it
+should be at a path something like:
+
+.. code-block:: console
+
+  <build dir>/tools/clang/stage2-instrumented-bins/utils/perf-training/clang.profdata
+
+You can feed that file into the LLVM_PROFDATA_FILE option when you build your
+optimized compiler.
+
+The PGO came cache has a slightly different stage naming scheme than other
+multi-stage builds. It generates three stages; stage1, stage2-instrumented, and
+stage2. Both of the stage2 builds are built using the stage1 compiler.
+
+The PGO came cache generates the following additional targets:
+
+**stage2-instrumented**
+  Builds a stage1 x86 compiler, runtime, and required tools (llvm-config,
+  llvm-profdata) then uses that compiler to build an instrumented stage2 compiler.
+
+**stage2-instrumented-generate-profdata**
+  Depends on "stage2-instrumented" and will use the instrumented compiler to
+  generate profdata based on the training files in <clang>/utils/perf-training
+
+**stage2**
+  Depends of "stage2-instrumented-generate-profdata" and will use the stage1
+  compiler with the stage2 profdata to build a PGO-optimized compiler.
+
+**stage2-check-llvm**
+  Depends on stage2 and runs check-llvm using the stage2 compiler.
+
+**stage2-check-clang**
+  Depends on stage2 and runs check-clang using the stage2 compiler.
+
+**stage2-check-all**
+  Depends on stage2 and runs check-all using the stage2 compiler.
+
+**stage2-test-suite**
+  Depends on stage2 and runs the test-suite using the stage3 compiler (requires
+  in-tree test-suite).
+
+3-Stage Non-Determinism
+=======================
+
+In the ancient lore of compilers non-determinism is like the multi-headed hydra.
+Whenever it's head pops up, terror and chaos ensue.
+
+Historically one of the tests to verify that a compiler was deterministic would
+be a three stage build. The idea of a three stage build is you take your sources
+and build a compiler (stage1), then use that compiler to rebuild the sources
+(stage2), then you use that compiler to rebuild the sources a third time
+(stage3) with an identical configuration to the stage2 build. At the end of
+this, you have a stage2 and stage3 compiler that should be bit-for-bit
+identical.
+
+You can perform one of these 3-stage builds with LLVM & clang using the
+following commands:
+
+.. code-block:: console
+
+  $ cmake -G Ninja -C <path_to_clang>/cmake/caches/3-stage.cmake <source dir>
+  $ ninja stage3
+
+After the build you can compare the stage2 & stage3 compilers. We have a bot
+setup `here <http://lab.llvm.org:8011/builders/clang-3stage-ubuntu>`_ that runs
+this build and compare configuration.

Added: www-releases/trunk/4.0.1/docs/_sources/AliasAnalysis.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/AliasAnalysis.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/AliasAnalysis.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/AliasAnalysis.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,717 @@
+==================================
+LLVM Alias Analysis Infrastructure
+==================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+Alias Analysis (aka Pointer Analysis) is a class of techniques which attempt to
+determine whether or not two pointers ever can point to the same object in
+memory.  There are many different algorithms for alias analysis and many
+different ways of classifying them: flow-sensitive vs. flow-insensitive,
+context-sensitive vs. context-insensitive, field-sensitive
+vs. field-insensitive, unification-based vs. subset-based, etc.  Traditionally,
+alias analyses respond to a query with a `Must, May, or No`_ alias response,
+indicating that two pointers always point to the same object, might point to the
+same object, or are known to never point to the same object.
+
+The LLVM `AliasAnalysis
+<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ class is the
+primary interface used by clients and implementations of alias analyses in the
+LLVM system.  This class is the common interface between clients of alias
+analysis information and the implementations providing it, and is designed to
+support a wide range of implementations and clients (but currently all clients
+are assumed to be flow-insensitive).  In addition to simple alias analysis
+information, this class exposes Mod/Ref information from those implementations
+which can provide it, allowing for powerful analyses and transformations to work
+well together.
+
+This document contains information necessary to successfully implement this
+interface, use it, and to test both sides.  It also explains some of the finer
+points about what exactly results mean.  
+
+``AliasAnalysis`` Class Overview
+================================
+
+The `AliasAnalysis <http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__
+class defines the interface that the various alias analysis implementations
+should support.  This class exports two important enums: ``AliasResult`` and
+``ModRefResult`` which represent the result of an alias query or a mod/ref
+query, respectively.
+
+The ``AliasAnalysis`` interface exposes information about memory, represented in
+several different ways.  In particular, memory objects are represented as a
+starting address and size, and function calls are represented as the actual
+``call`` or ``invoke`` instructions that performs the call.  The
+``AliasAnalysis`` interface also exposes some helper methods which allow you to
+get mod/ref information for arbitrary instructions.
+
+All ``AliasAnalysis`` interfaces require that in queries involving multiple
+values, values which are not :ref:`constants <constants>` are all
+defined within the same function.
+
+Representation of Pointers
+--------------------------
+
+Most importantly, the ``AliasAnalysis`` class provides several methods which are
+used to query whether or not two memory objects alias, whether function calls
+can modify or read a memory object, etc.  For all of these queries, memory
+objects are represented as a pair of their starting address (a symbolic LLVM
+``Value*``) and a static size.
+
+Representing memory objects as a starting address and a size is critically
+important for correct Alias Analyses.  For example, consider this (silly, but
+possible) C code:
+
+.. code-block:: c++
+
+  int i;
+  char C[2];
+  char A[10]; 
+  /* ... */
+  for (i = 0; i != 10; ++i) {
+    C[0] = A[i];          /* One byte store */
+    C[1] = A[9-i];        /* One byte store */
+  }
+
+In this case, the ``basicaa`` pass will disambiguate the stores to ``C[0]`` and
+``C[1]`` because they are accesses to two distinct locations one byte apart, and
+the accesses are each one byte.  In this case, the Loop Invariant Code Motion
+(LICM) pass can use store motion to remove the stores from the loop.  In
+constrast, the following code:
+
+.. code-block:: c++
+
+  int i;
+  char C[2];
+  char A[10]; 
+  /* ... */
+  for (i = 0; i != 10; ++i) {
+    ((short*)C)[0] = A[i];  /* Two byte store! */
+    C[1] = A[9-i];          /* One byte store */
+  }
+
+In this case, the two stores to C do alias each other, because the access to the
+``&C[0]`` element is a two byte access.  If size information wasn't available in
+the query, even the first case would have to conservatively assume that the
+accesses alias.
+
+.. _alias:
+
+The ``alias`` method
+--------------------
+  
+The ``alias`` method is the primary interface used to determine whether or not
+two memory objects alias each other.  It takes two memory objects as input and
+returns MustAlias, PartialAlias, MayAlias, or NoAlias as appropriate.
+
+Like all ``AliasAnalysis`` interfaces, the ``alias`` method requires that either
+the two pointer values be defined within the same function, or at least one of
+the values is a :ref:`constant <constants>`.
+
+.. _Must, May, or No:
+
+Must, May, and No Alias Responses
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``NoAlias`` response may be used when there is never an immediate dependence
+between any memory reference *based* on one pointer and any memory reference
+*based* the other. The most obvious example is when the two pointers point to
+non-overlapping memory ranges. Another is when the two pointers are only ever
+used for reading memory. Another is when the memory is freed and reallocated
+between accesses through one pointer and accesses through the other --- in this
+case, there is a dependence, but it's mediated by the free and reallocation.
+
+As an exception to this is with the :ref:`noalias <noalias>` keyword;
+the "irrelevant" dependencies are ignored.
+
+The ``MayAlias`` response is used whenever the two pointers might refer to the
+same object.
+
+The ``PartialAlias`` response is used when the two memory objects are known to
+be overlapping in some way, but do not start at the same address.
+
+The ``MustAlias`` response may only be returned if the two memory objects are
+guaranteed to always start at exactly the same location. A ``MustAlias``
+response implies that the pointers compare equal.
+
+The ``getModRefInfo`` methods
+-----------------------------
+
+The ``getModRefInfo`` methods return information about whether the execution of
+an instruction can read or modify a memory location.  Mod/Ref information is
+always conservative: if an instruction **might** read or write a location,
+``ModRef`` is returned.
+
+The ``AliasAnalysis`` class also provides a ``getModRefInfo`` method for testing
+dependencies between function calls.  This method takes two call sites (``CS1``
+& ``CS2``), returns ``NoModRef`` if neither call writes to memory read or
+written by the other, ``Ref`` if ``CS1`` reads memory written by ``CS2``,
+``Mod`` if ``CS1`` writes to memory read or written by ``CS2``, or ``ModRef`` if
+``CS1`` might read or write memory written to by ``CS2``.  Note that this
+relation is not commutative.
+
+Other useful ``AliasAnalysis`` methods
+--------------------------------------
+
+Several other tidbits of information are often collected by various alias
+analysis implementations and can be put to good use by various clients.
+
+The ``pointsToConstantMemory`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``pointsToConstantMemory`` method returns true if and only if the analysis
+can prove that the pointer only points to unchanging memory locations
+(functions, constant global variables, and the null pointer).  This information
+can be used to refine mod/ref information: it is impossible for an unchanging
+memory location to be modified.
+
+.. _never access memory or only read memory:
+
+The ``doesNotAccessMemory`` and  ``onlyReadsMemory`` methods
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These methods are used to provide very simple mod/ref information for function
+calls.  The ``doesNotAccessMemory`` method returns true for a function if the
+analysis can prove that the function never reads or writes to memory, or if the
+function only reads from constant memory.  Functions with this property are
+side-effect free and only depend on their input arguments, allowing them to be
+eliminated if they form common subexpressions or be hoisted out of loops.  Many
+common functions behave this way (e.g., ``sin`` and ``cos``) but many others do
+not (e.g., ``acos``, which modifies the ``errno`` variable).
+
+The ``onlyReadsMemory`` method returns true for a function if analysis can prove
+that (at most) the function only reads from non-volatile memory.  Functions with
+this property are side-effect free, only depending on their input arguments and
+the state of memory when they are called.  This property allows calls to these
+functions to be eliminated and moved around, as long as there is no store
+instruction that changes the contents of memory.  Note that all functions that
+satisfy the ``doesNotAccessMemory`` method also satisfy ``onlyReadsMemory``.
+
+Writing a new ``AliasAnalysis`` Implementation
+==============================================
+
+Writing a new alias analysis implementation for LLVM is quite straight-forward.
+There are already several implementations that you can use for examples, and the
+following information should help fill in any details.  For a examples, take a
+look at the `various alias analysis implementations`_ included with LLVM.
+
+Different Pass styles
+---------------------
+
+The first step to determining what type of :doc:`LLVM pass <WritingAnLLVMPass>`
+you need to use for your Alias Analysis.  As is the case with most other
+analyses and transformations, the answer should be fairly obvious from what type
+of problem you are trying to solve:
+
+#. If you require interprocedural analysis, it should be a ``Pass``.
+#. If you are a function-local analysis, subclass ``FunctionPass``.
+#. If you don't need to look at the program at all, subclass ``ImmutablePass``.
+
+In addition to the pass that you subclass, you should also inherit from the
+``AliasAnalysis`` interface, of course, and use the ``RegisterAnalysisGroup``
+template to register as an implementation of ``AliasAnalysis``.
+
+Required initialization calls
+-----------------------------
+
+Your subclass of ``AliasAnalysis`` is required to invoke two methods on the
+``AliasAnalysis`` base class: ``getAnalysisUsage`` and
+``InitializeAliasAnalysis``.  In particular, your implementation of
+``getAnalysisUsage`` should explicitly call into the
+``AliasAnalysis::getAnalysisUsage`` method in addition to doing any declaring
+any pass dependencies your pass has.  Thus you should have something like this:
+
+.. code-block:: c++
+
+  void getAnalysisUsage(AnalysisUsage &AU) const {
+    AliasAnalysis::getAnalysisUsage(AU);
+    // declare your dependencies here.
+  }
+
+Additionally, your must invoke the ``InitializeAliasAnalysis`` method from your
+analysis run method (``run`` for a ``Pass``, ``runOnFunction`` for a
+``FunctionPass``, or ``InitializePass`` for an ``ImmutablePass``).  For example
+(as part of a ``Pass``):
+
+.. code-block:: c++
+
+  bool run(Module &M) {
+    InitializeAliasAnalysis(this);
+    // Perform analysis here...
+    return false;
+  }
+
+Required methods to override
+----------------------------
+
+You must override the ``getAdjustedAnalysisPointer`` method on all subclasses
+of ``AliasAnalysis``. An example implementation of this method would look like:
+
+.. code-block:: c++
+
+  void *getAdjustedAnalysisPointer(const void* ID) override {
+    if (ID == &AliasAnalysis::ID)
+      return (AliasAnalysis*)this;
+    return this;
+  }
+
+Interfaces which may be specified
+---------------------------------
+
+All of the `AliasAnalysis
+<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`__ virtual methods
+default to providing :ref:`chaining <aliasanalysis-chaining>` to another alias
+analysis implementation, which ends up returning conservatively correct
+information (returning "May" Alias and "Mod/Ref" for alias and mod/ref queries
+respectively).  Depending on the capabilities of the analysis you are
+implementing, you just override the interfaces you can improve.
+
+.. _aliasanalysis-chaining:
+
+``AliasAnalysis`` chaining behavior
+-----------------------------------
+
+With only one special exception (the :ref:`-no-aa <aliasanalysis-no-aa>` pass)
+every alias analysis pass chains to another alias analysis implementation (for
+example, the user can specify "``-basicaa -ds-aa -licm``" to get the maximum
+benefit from both alias analyses).  The alias analysis class automatically
+takes care of most of this for methods that you don't override.  For methods
+that you do override, in code paths that return a conservative MayAlias or
+Mod/Ref result, simply return whatever the superclass computes.  For example:
+
+.. code-block:: c++
+
+  AliasResult alias(const Value *V1, unsigned V1Size,
+                    const Value *V2, unsigned V2Size) {
+    if (...)
+      return NoAlias;
+    ...
+
+    // Couldn't determine a must or no-alias result.
+    return AliasAnalysis::alias(V1, V1Size, V2, V2Size);
+  }
+
+In addition to analysis queries, you must make sure to unconditionally pass LLVM
+`update notification`_ methods to the superclass as well if you override them,
+which allows all alias analyses in a change to be updated.
+
+.. _update notification:
+
+Updating analysis results for transformations
+---------------------------------------------
+
+Alias analysis information is initially computed for a static snapshot of the
+program, but clients will use this information to make transformations to the
+code.  All but the most trivial forms of alias analysis will need to have their
+analysis results updated to reflect the changes made by these transformations.
+
+The ``AliasAnalysis`` interface exposes four methods which are used to
+communicate program changes from the clients to the analysis implementations.
+Various alias analysis implementations should use these methods to ensure that
+their internal data structures are kept up-to-date as the program changes (for
+example, when an instruction is deleted), and clients of alias analysis must be
+sure to call these interfaces appropriately.
+
+The ``deleteValue`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``deleteValue`` method is called by transformations when they remove an
+instruction or any other value from the program (including values that do not
+use pointers).  Typically alias analyses keep data structures that have entries
+for each value in the program.  When this method is called, they should remove
+any entries for the specified value, if they exist.
+
+The ``copyValue`` method
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``copyValue`` method is used when a new value is introduced into the
+program.  There is no way to introduce a value into the program that did not
+exist before (this doesn't make sense for a safe compiler transformation), so
+this is the only way to introduce a new value.  This method indicates that the
+new value has exactly the same properties as the value being copied.
+
+The ``replaceWithNewValue`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This method is a simple helper method that is provided to make clients easier to
+use.  It is implemented by copying the old analysis information to the new
+value, then deleting the old value.  This method cannot be overridden by alias
+analysis implementations.
+
+The ``addEscapingUse`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``addEscapingUse`` method is used when the uses of a pointer value have
+changed in ways that may invalidate precomputed analysis information.
+Implementations may either use this callback to provide conservative responses
+for points whose uses have change since analysis time, or may recompute some or
+all of their internal state to continue providing accurate responses.
+
+In general, any new use of a pointer value is considered an escaping use, and
+must be reported through this callback, *except* for the uses below:
+
+* A ``bitcast`` or ``getelementptr`` of the pointer
+* A ``store`` through the pointer (but not a ``store`` *of* the pointer)
+* A ``load`` through the pointer
+
+Efficiency Issues
+-----------------
+
+From the LLVM perspective, the only thing you need to do to provide an efficient
+alias analysis is to make sure that alias analysis **queries** are serviced
+quickly.  The actual calculation of the alias analysis results (the "run"
+method) is only performed once, but many (perhaps duplicate) queries may be
+performed.  Because of this, try to move as much computation to the run method
+as possible (within reason).
+
+Limitations
+-----------
+
+The AliasAnalysis infrastructure has several limitations which make writing a
+new ``AliasAnalysis`` implementation difficult.
+
+There is no way to override the default alias analysis. It would be very useful
+to be able to do something like "``opt -my-aa -O2``" and have it use ``-my-aa``
+for all passes which need AliasAnalysis, but there is currently no support for
+that, short of changing the source code and recompiling. Similarly, there is
+also no way of setting a chain of analyses as the default.
+
+There is no way for transform passes to declare that they preserve
+``AliasAnalysis`` implementations. The ``AliasAnalysis`` interface includes
+``deleteValue`` and ``copyValue`` methods which are intended to allow a pass to
+keep an AliasAnalysis consistent, however there's no way for a pass to declare
+in its ``getAnalysisUsage`` that it does so. Some passes attempt to use
+``AU.addPreserved<AliasAnalysis>``, however this doesn't actually have any
+effect.
+
+``AliasAnalysisCounter`` (``-count-aa``) are implemented as ``ModulePass``
+classes, so if your alias analysis uses ``FunctionPass``, it won't be able to
+use these utilities. If you try to use them, the pass manager will silently
+route alias analysis queries directly to ``BasicAliasAnalysis`` instead.
+
+Similarly, the ``opt -p`` option introduces ``ModulePass`` passes between each
+pass, which prevents the use of ``FunctionPass`` alias analysis passes.
+
+The ``AliasAnalysis`` API does have functions for notifying implementations when
+values are deleted or copied, however these aren't sufficient. There are many
+other ways that LLVM IR can be modified which could be relevant to
+``AliasAnalysis`` implementations which can not be expressed.
+
+The ``AliasAnalysisDebugger`` utility seems to suggest that ``AliasAnalysis``
+implementations can expect that they will be informed of any relevant ``Value``
+before it appears in an alias query. However, popular clients such as ``GVN``
+don't support this, and are known to trigger errors when run with the
+``AliasAnalysisDebugger``.
+
+Due to several of the above limitations, the most obvious use for the
+``AliasAnalysisCounter`` utility, collecting stats on all alias queries in a
+compilation, doesn't work, even if the ``AliasAnalysis`` implementations don't
+use ``FunctionPass``.  There's no way to set a default, much less a default
+sequence, and there's no way to preserve it.
+
+The ``AliasSetTracker`` class (which is used by ``LICM``) makes a
+non-deterministic number of alias queries. This can cause stats collected by
+``AliasAnalysisCounter`` to have fluctuations among identical runs, for
+example. Another consequence is that debugging techniques involving pausing
+execution after a predetermined number of queries can be unreliable.
+
+Many alias queries can be reformulated in terms of other alias queries. When
+multiple ``AliasAnalysis`` queries are chained together, it would make sense to
+start those queries from the beginning of the chain, with care taken to avoid
+infinite looping, however currently an implementation which wants to do this can
+only start such queries from itself.
+
+Using alias analysis results
+============================
+
+There are several different ways to use alias analysis results.  In order of
+preference, these are:
+
+Using the ``MemoryDependenceAnalysis`` Pass
+-------------------------------------------
+
+The ``memdep`` pass uses alias analysis to provide high-level dependence
+information about memory-using instructions.  This will tell you which store
+feeds into a load, for example.  It uses caching and other techniques to be
+efficient, and is used by Dead Store Elimination, GVN, and memcpy optimizations.
+
+.. _AliasSetTracker:
+
+Using the ``AliasSetTracker`` class
+-----------------------------------
+
+Many transformations need information about alias **sets** that are active in
+some scope, rather than information about pairwise aliasing.  The
+`AliasSetTracker <http://llvm.org/doxygen/classllvm_1_1AliasSetTracker.html>`__
+class is used to efficiently build these Alias Sets from the pairwise alias
+analysis information provided by the ``AliasAnalysis`` interface.
+
+First you initialize the AliasSetTracker by using the "``add``" methods to add
+information about various potentially aliasing instructions in the scope you are
+interested in.  Once all of the alias sets are completed, your pass should
+simply iterate through the constructed alias sets, using the ``AliasSetTracker``
+``begin()``/``end()`` methods.
+
+The ``AliasSet``\s formed by the ``AliasSetTracker`` are guaranteed to be
+disjoint, calculate mod/ref information and volatility for the set, and keep
+track of whether or not all of the pointers in the set are Must aliases.  The
+AliasSetTracker also makes sure that sets are properly folded due to call
+instructions, and can provide a list of pointers in each set.
+
+As an example user of this, the `Loop Invariant Code Motion
+<doxygen/structLICM.html>`_ pass uses ``AliasSetTracker``\s to calculate alias
+sets for each loop nest.  If an ``AliasSet`` in a loop is not modified, then all
+load instructions from that set may be hoisted out of the loop.  If any alias
+sets are stored to **and** are must alias sets, then the stores may be sunk
+to outside of the loop, promoting the memory location to a register for the
+duration of the loop nest.  Both of these transformations only apply if the
+pointer argument is loop-invariant.
+
+The AliasSetTracker implementation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The AliasSetTracker class is implemented to be as efficient as possible.  It
+uses the union-find algorithm to efficiently merge AliasSets when a pointer is
+inserted into the AliasSetTracker that aliases multiple sets.  The primary data
+structure is a hash table mapping pointers to the AliasSet they are in.
+
+The AliasSetTracker class must maintain a list of all of the LLVM ``Value*``\s
+that are in each AliasSet.  Since the hash table already has entries for each
+LLVM ``Value*`` of interest, the AliasesSets thread the linked list through
+these hash-table nodes to avoid having to allocate memory unnecessarily, and to
+make merging alias sets extremely efficient (the linked list merge is constant
+time).
+
+You shouldn't need to understand these details if you are just a client of the
+AliasSetTracker, but if you look at the code, hopefully this brief description
+will help make sense of why things are designed the way they are.
+
+Using the ``AliasAnalysis`` interface directly
+----------------------------------------------
+
+If neither of these utility class are what your pass needs, you should use the
+interfaces exposed by the ``AliasAnalysis`` class directly.  Try to use the
+higher-level methods when possible (e.g., use mod/ref information instead of the
+`alias`_ method directly if possible) to get the best precision and efficiency.
+
+Existing alias analysis implementations and clients
+===================================================
+
+If you're going to be working with the LLVM alias analysis infrastructure, you
+should know what clients and implementations of alias analysis are available.
+In particular, if you are implementing an alias analysis, you should be aware of
+the `the clients`_ that are useful for monitoring and evaluating different
+implementations.
+
+.. _various alias analysis implementations:
+
+Available ``AliasAnalysis`` implementations
+-------------------------------------------
+
+This section lists the various implementations of the ``AliasAnalysis``
+interface.  With the exception of the :ref:`-no-aa <aliasanalysis-no-aa>`
+implementation, all of these :ref:`chain <aliasanalysis-chaining>` to other
+alias analysis implementations.
+
+.. _aliasanalysis-no-aa:
+
+The ``-no-aa`` pass
+^^^^^^^^^^^^^^^^^^^
+
+The ``-no-aa`` pass is just like what it sounds: an alias analysis that never
+returns any useful information.  This pass can be useful if you think that alias
+analysis is doing something wrong and are trying to narrow down a problem.
+
+The ``-basicaa`` pass
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``-basicaa`` pass is an aggressive local analysis that *knows* many
+important facts:
+
+* Distinct globals, stack allocations, and heap allocations can never alias.
+* Globals, stack allocations, and heap allocations never alias the null pointer.
+* Different fields of a structure do not alias.
+* Indexes into arrays with statically differing subscripts cannot alias.
+* Many common standard C library functions `never access memory or only read
+  memory`_.
+* Pointers that obviously point to constant globals "``pointToConstantMemory``".
+* Function calls can not modify or references stack allocations if they never
+  escape from the function that allocates them (a common case for automatic
+  arrays).
+
+The ``-globalsmodref-aa`` pass
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This pass implements a simple context-sensitive mod/ref and alias analysis for
+internal global variables that don't "have their address taken".  If a global
+does not have its address taken, the pass knows that no pointers alias the
+global.  This pass also keeps track of functions that it knows never access
+memory or never read memory.  This allows certain optimizations (e.g. GVN) to
+eliminate call instructions entirely.
+
+The real power of this pass is that it provides context-sensitive mod/ref
+information for call instructions.  This allows the optimizer to know that calls
+to a function do not clobber or read the value of the global, allowing loads and
+stores to be eliminated.
+
+.. note::
+
+  This pass is somewhat limited in its scope (only support non-address taken
+  globals), but is very quick analysis.
+
+The ``-steens-aa`` pass
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-steens-aa`` pass implements a variation on the well-known "Steensgaard's
+algorithm" for interprocedural alias analysis.  Steensgaard's algorithm is a
+unification-based, flow-insensitive, context-insensitive, and field-insensitive
+alias analysis that is also very scalable (effectively linear time).
+
+The LLVM ``-steens-aa`` pass implements a "speculatively field-**sensitive**"
+version of Steensgaard's algorithm using the Data Structure Analysis framework.
+This gives it substantially more precision than the standard algorithm while
+maintaining excellent analysis scalability.
+
+.. note::
+
+  ``-steens-aa`` is available in the optional "poolalloc" module. It is not part
+  of the LLVM core.
+
+The ``-ds-aa`` pass
+^^^^^^^^^^^^^^^^^^^
+
+The ``-ds-aa`` pass implements the full Data Structure Analysis algorithm.  Data
+Structure Analysis is a modular unification-based, flow-insensitive,
+context-**sensitive**, and speculatively field-**sensitive** alias
+analysis that is also quite scalable, usually at ``O(n * log(n))``.
+
+This algorithm is capable of responding to a full variety of alias analysis
+queries, and can provide context-sensitive mod/ref information as well.  The
+only major facility not implemented so far is support for must-alias
+information.
+
+.. note::
+
+  ``-ds-aa`` is available in the optional "poolalloc" module. It is not part of
+  the LLVM core.
+
+The ``-scev-aa`` pass
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``-scev-aa`` pass implements AliasAnalysis queries by translating them into
+ScalarEvolution queries. This gives it a more complete understanding of
+``getelementptr`` instructions and loop induction variables than other alias
+analyses have.
+
+Alias analysis driven transformations
+-------------------------------------
+
+LLVM includes several alias-analysis driven transformations which can be used
+with any of the implementations above.
+
+The ``-adce`` pass
+^^^^^^^^^^^^^^^^^^
+
+The ``-adce`` pass, which implements Aggressive Dead Code Elimination uses the
+``AliasAnalysis`` interface to delete calls to functions that do not have
+side-effects and are not used.
+
+The ``-licm`` pass
+^^^^^^^^^^^^^^^^^^
+
+The ``-licm`` pass implements various Loop Invariant Code Motion related
+transformations.  It uses the ``AliasAnalysis`` interface for several different
+transformations:
+
+* It uses mod/ref information to hoist or sink load instructions out of loops if
+  there are no instructions in the loop that modifies the memory loaded.
+
+* It uses mod/ref information to hoist function calls out of loops that do not
+  write to memory and are loop-invariant.
+
+* It uses alias information to promote memory objects that are loaded and stored
+  to in loops to live in a register instead.  It can do this if there are no may
+  aliases to the loaded/stored memory location.
+
+The ``-argpromotion`` pass
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-argpromotion`` pass promotes by-reference arguments to be passed in
+by-value instead.  In particular, if pointer arguments are only loaded from it
+passes in the value loaded instead of the address to the function.  This pass
+uses alias information to make sure that the value loaded from the argument
+pointer is not modified between the entry of the function and any load of the
+pointer.
+
+The ``-gvn``, ``-memcpyopt``, and ``-dse`` passes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These passes use AliasAnalysis information to reason about loads and stores.
+
+.. _the clients:
+
+Clients for debugging and evaluation of implementations
+-------------------------------------------------------
+
+These passes are useful for evaluating the various alias analysis
+implementations.  You can use them with commands like:
+
+.. code-block:: bash
+
+  % opt -ds-aa -aa-eval foo.bc -disable-output -stats
+
+The ``-print-alias-sets`` pass
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-print-alias-sets`` pass is exposed as part of the ``opt`` tool to print
+out the Alias Sets formed by the `AliasSetTracker`_ class.  This is useful if
+you're using the ``AliasSetTracker`` class.  To use it, use something like:
+
+.. code-block:: bash
+
+  % opt -ds-aa -print-alias-sets -disable-output
+
+The ``-count-aa`` pass
+^^^^^^^^^^^^^^^^^^^^^^
+
+The ``-count-aa`` pass is useful to see how many queries a particular pass is
+making and what responses are returned by the alias analysis.  As an example:
+
+.. code-block:: bash
+
+  % opt -basicaa -count-aa -ds-aa -count-aa -licm
+
+will print out how many queries (and what responses are returned) by the
+``-licm`` pass (of the ``-ds-aa`` pass) and how many queries are made of the
+``-basicaa`` pass by the ``-ds-aa`` pass.  This can be useful when debugging a
+transformation or an alias analysis implementation.
+
+The ``-aa-eval`` pass
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``-aa-eval`` pass simply iterates through all pairs of pointers in a
+function and asks an alias analysis whether or not the pointers alias.  This
+gives an indication of the precision of the alias analysis.  Statistics are
+printed indicating the percent of no/may/must aliases found (a more precise
+algorithm will have a lower number of may aliases).
+
+Memory Dependence Analysis
+==========================
+
+.. note::
+
+  We are currently in the process of migrating things from
+  ``MemoryDependenceAnalysis`` to :doc:`MemorySSA`. Please try to use
+  that instead.
+
+If you're just looking to be a client of alias analysis information, consider
+using the Memory Dependence Analysis interface instead.  MemDep is a lazy,
+caching layer on top of alias analysis that is able to answer the question of
+what preceding memory operations a given instruction depends on, either at an
+intra- or inter-block level.  Because of its laziness and caching policy, using
+MemDep can be a significant performance win over accessing alias analysis
+directly.

Added: www-releases/trunk/4.0.1/docs/_sources/Atomics.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/Atomics.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/Atomics.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/Atomics.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,605 @@
+==============================================
+LLVM Atomic Instructions and Concurrency Guide
+==============================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+LLVM supports instructions which are well-defined in the presence of threads and
+asynchronous signals.
+
+The atomic instructions are designed specifically to provide readable IR and
+optimized code generation for the following:
+
+* The C++11 ``<atomic>`` header.  (`C++11 draft available here
+  <http://www.open-std.org/jtc1/sc22/wg21/>`_.) (`C11 draft available here
+  <http://www.open-std.org/jtc1/sc22/wg14/>`_.)
+
+* Proper semantics for Java-style memory, for both ``volatile`` and regular
+  shared variables. (`Java Specification
+  <http://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html>`_)
+
+* gcc-compatible ``__sync_*`` builtins. (`Description
+  <https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html>`_)
+
+* Other scenarios with atomic semantics, including ``static`` variables with
+  non-trivial constructors in C++.
+
+Atomic and volatile in the IR are orthogonal; "volatile" is the C/C++ volatile,
+which ensures that every volatile load and store happens and is performed in the
+stated order.  A couple examples: if a SequentiallyConsistent store is
+immediately followed by another SequentiallyConsistent store to the same
+address, the first store can be erased. This transformation is not allowed for a
+pair of volatile stores. On the other hand, a non-volatile non-atomic load can
+be moved across a volatile load freely, but not an Acquire load.
+
+This document is intended to provide a guide to anyone either writing a frontend
+for LLVM or working on optimization passes for LLVM with a guide for how to deal
+with instructions with special semantics in the presence of concurrency.  This
+is not intended to be a precise guide to the semantics; the details can get
+extremely complicated and unreadable, and are not usually necessary.
+
+.. _Optimization outside atomic:
+
+Optimization outside atomic
+===========================
+
+The basic ``'load'`` and ``'store'`` allow a variety of optimizations, but can
+lead to undefined results in a concurrent environment; see `NotAtomic`_. This
+section specifically goes into the one optimizer restriction which applies in
+concurrent environments, which gets a bit more of an extended description
+because any optimization dealing with stores needs to be aware of it.
+
+From the optimizer's point of view, the rule is that if there are not any
+instructions with atomic ordering involved, concurrency does not matter, with
+one exception: if a variable might be visible to another thread or signal
+handler, a store cannot be inserted along a path where it might not execute
+otherwise.  Take the following example:
+
+.. code-block:: c
+
+ /* C code, for readability; run through clang -O2 -S -emit-llvm to get
+     equivalent IR */
+  int x;
+  void f(int* a) {
+    for (int i = 0; i < 100; i++) {
+      if (a[i])
+        x += 1;
+    }
+  }
+
+The following is equivalent in non-concurrent situations:
+
+.. code-block:: c
+
+  int x;
+  void f(int* a) {
+    int xtemp = x;
+    for (int i = 0; i < 100; i++) {
+      if (a[i])
+        xtemp += 1;
+    }
+    x = xtemp;
+  }
+
+However, LLVM is not allowed to transform the former to the latter: it could
+indirectly introduce undefined behavior if another thread can access ``x`` at
+the same time. (This example is particularly of interest because before the
+concurrency model was implemented, LLVM would perform this transformation.)
+
+Note that speculative loads are allowed; a load which is part of a race returns
+``undef``, but does not have undefined behavior.
+
+Atomic instructions
+===================
+
+For cases where simple loads and stores are not sufficient, LLVM provides
+various atomic instructions. The exact guarantees provided depend on the
+ordering; see `Atomic orderings`_.
+
+``load atomic`` and ``store atomic`` provide the same basic functionality as
+non-atomic loads and stores, but provide additional guarantees in situations
+where threads and signals are involved.
+
+``cmpxchg`` and ``atomicrmw`` are essentially like an atomic load followed by an
+atomic store (where the store is conditional for ``cmpxchg``), but no other
+memory operation can happen on any thread between the load and store.
+
+A ``fence`` provides Acquire and/or Release ordering which is not part of
+another operation; it is normally used along with Monotonic memory operations.
+A Monotonic load followed by an Acquire fence is roughly equivalent to an
+Acquire load, and a Monotonic store following a Release fence is roughly
+equivalent to a Release store. SequentiallyConsistent fences behave as both
+an Acquire and a Release fence, and offer some additional complicated
+guarantees, see the C++11 standard for details.
+
+Frontends generating atomic instructions generally need to be aware of the
+target to some degree; atomic instructions are guaranteed to be lock-free, and
+therefore an instruction which is wider than the target natively supports can be
+impossible to generate.
+
+.. _Atomic orderings:
+
+Atomic orderings
+================
+
+In order to achieve a balance between performance and necessary guarantees,
+there are six levels of atomicity. They are listed in order of strength; each
+level includes all the guarantees of the previous level except for
+Acquire/Release. (See also `LangRef Ordering <LangRef.html#ordering>`_.)
+
+.. _NotAtomic:
+
+NotAtomic
+---------
+
+NotAtomic is the obvious, a load or store which is not atomic. (This isn't
+really a level of atomicity, but is listed here for comparison.) This is
+essentially a regular load or store. If there is a race on a given memory
+location, loads from that location return undef.
+
+Relevant standard
+  This is intended to match shared variables in C/C++, and to be used in any
+  other context where memory access is necessary, and a race is impossible. (The
+  precise definition is in `LangRef Memory Model <LangRef.html#memmodel>`_.)
+
+Notes for frontends
+  The rule is essentially that all memory accessed with basic loads and stores
+  by multiple threads should be protected by a lock or other synchronization;
+  otherwise, you are likely to run into undefined behavior. If your frontend is
+  for a "safe" language like Java, use Unordered to load and store any shared
+  variable.  Note that NotAtomic volatile loads and stores are not properly
+  atomic; do not try to use them as a substitute. (Per the C/C++ standards,
+  volatile does provide some limited guarantees around asynchronous signals, but
+  atomics are generally a better solution.)
+
+Notes for optimizers
+  Introducing loads to shared variables along a codepath where they would not
+  otherwise exist is allowed; introducing stores to shared variables is not. See
+  `Optimization outside atomic`_.
+
+Notes for code generation
+  The one interesting restriction here is that it is not allowed to write to
+  bytes outside of the bytes relevant to a store.  This is mostly relevant to
+  unaligned stores: it is not allowed in general to convert an unaligned store
+  into two aligned stores of the same width as the unaligned store. Backends are
+  also expected to generate an i8 store as an i8 store, and not an instruction
+  which writes to surrounding bytes.  (If you are writing a backend for an
+  architecture which cannot satisfy these restrictions and cares about
+  concurrency, please send an email to llvm-dev.)
+
+Unordered
+---------
+
+Unordered is the lowest level of atomicity. It essentially guarantees that races
+produce somewhat sane results instead of having undefined behavior.  It also
+guarantees the operation to be lock-free, so it does not depend on the data
+being part of a special atomic structure or depend on a separate per-process
+global lock.  Note that code generation will fail for unsupported atomic
+operations; if you need such an operation, use explicit locking.
+
+Relevant standard
+  This is intended to match the Java memory model for shared variables.
+
+Notes for frontends
+  This cannot be used for synchronization, but is useful for Java and other
+  "safe" languages which need to guarantee that the generated code never
+  exhibits undefined behavior. Note that this guarantee is cheap on common
+  platforms for loads of a native width, but can be expensive or unavailable for
+  wider loads, like a 64-bit store on ARM. (A frontend for Java or other "safe"
+  languages would normally split a 64-bit store on ARM into two 32-bit unordered
+  stores.)
+
+Notes for optimizers
+  In terms of the optimizer, this prohibits any transformation that transforms a
+  single load into multiple loads, transforms a store into multiple stores,
+  narrows a store, or stores a value which would not be stored otherwise.  Some
+  examples of unsafe optimizations are narrowing an assignment into a bitfield,
+  rematerializing a load, and turning loads and stores into a memcpy
+  call. Reordering unordered operations is safe, though, and optimizers should
+  take advantage of that because unordered operations are common in languages
+  that need them.
+
+Notes for code generation
+  These operations are required to be atomic in the sense that if you use
+  unordered loads and unordered stores, a load cannot see a value which was
+  never stored.  A normal load or store instruction is usually sufficient, but
+  note that an unordered load or store cannot be split into multiple
+  instructions (or an instruction which does multiple memory operations, like
+  ``LDRD`` on ARM without LPAE, or not naturally-aligned ``LDRD`` on LPAE ARM).
+
+Monotonic
+---------
+
+Monotonic is the weakest level of atomicity that can be used in synchronization
+primitives, although it does not provide any general synchronization. It
+essentially guarantees that if you take all the operations affecting a specific
+address, a consistent ordering exists.
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_relaxed``; see those
+  standards for the exact definition.
+
+Notes for frontends
+  If you are writing a frontend which uses this directly, use with caution.  The
+  guarantees in terms of synchronization are very weak, so make sure these are
+  only used in a pattern which you know is correct.  Generally, these would
+  either be used for atomic operations which do not protect other memory (like
+  an atomic counter), or along with a ``fence``.
+
+Notes for optimizers
+  In terms of the optimizer, this can be treated as a read+write on the relevant
+  memory location (and alias analysis will take advantage of that). In addition,
+  it is legal to reorder non-atomic and Unordered loads around Monotonic
+  loads. CSE/DSE and a few other optimizations are allowed, but Monotonic
+  operations are unlikely to be used in ways which would make those
+  optimizations useful.
+
+Notes for code generation
+  Code generation is essentially the same as that for unordered for loads and
+  stores.  No fences are required.  ``cmpxchg`` and ``atomicrmw`` are required
+  to appear as a single operation.
+
+Acquire
+-------
+
+Acquire provides a barrier of the sort necessary to acquire a lock to access
+other memory with normal loads and stores.
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_acquire``. It should also be
+  used for C++11/C11 ``memory_order_consume``.
+
+Notes for frontends
+  If you are writing a frontend which uses this directly, use with caution.
+  Acquire only provides a semantic guarantee when paired with a Release
+  operation.
+
+Notes for optimizers
+  Optimizers not aware of atomics can treat this like a nothrow call.  It is
+  also possible to move stores from before an Acquire load or read-modify-write
+  operation to after it, and move non-Acquire loads from before an Acquire
+  operation to after it.
+
+Notes for code generation
+  Architectures with weak memory ordering (essentially everything relevant today
+  except x86 and SPARC) require some sort of fence to maintain the Acquire
+  semantics.  The precise fences required varies widely by architecture, but for
+  a simple implementation, most architectures provide a barrier which is strong
+  enough for everything (``dmb`` on ARM, ``sync`` on PowerPC, etc.).  Putting
+  such a fence after the equivalent Monotonic operation is sufficient to
+  maintain Acquire semantics for a memory operation.
+
+Release
+-------
+
+Release is similar to Acquire, but with a barrier of the sort necessary to
+release a lock.
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_release``.
+
+Notes for frontends
+  If you are writing a frontend which uses this directly, use with caution.
+  Release only provides a semantic guarantee when paired with a Acquire
+  operation.
+
+Notes for optimizers
+  Optimizers not aware of atomics can treat this like a nothrow call.  It is
+  also possible to move loads from after a Release store or read-modify-write
+  operation to before it, and move non-Release stores from after an Release
+  operation to before it.
+
+Notes for code generation
+  See the section on Acquire; a fence before the relevant operation is usually
+  sufficient for Release. Note that a store-store fence is not sufficient to
+  implement Release semantics; store-store fences are generally not exposed to
+  IR because they are extremely difficult to use correctly.
+
+AcquireRelease
+--------------
+
+AcquireRelease (``acq_rel`` in IR) provides both an Acquire and a Release
+barrier (for fences and operations which both read and write memory).
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_acq_rel``.
+
+Notes for frontends
+  If you are writing a frontend which uses this directly, use with caution.
+  Acquire only provides a semantic guarantee when paired with a Release
+  operation, and vice versa.
+
+Notes for optimizers
+  In general, optimizers should treat this like a nothrow call; the possible
+  optimizations are usually not interesting.
+
+Notes for code generation
+  This operation has Acquire and Release semantics; see the sections on Acquire
+  and Release.
+
+SequentiallyConsistent
+----------------------
+
+SequentiallyConsistent (``seq_cst`` in IR) provides Acquire semantics for loads
+and Release semantics for stores. Additionally, it guarantees that a total
+ordering exists between all SequentiallyConsistent operations.
+
+Relevant standard
+  This corresponds to the C++11/C11 ``memory_order_seq_cst``, Java volatile, and
+  the gcc-compatible ``__sync_*`` builtins which do not specify otherwise.
+
+Notes for frontends
+  If a frontend is exposing atomic operations, these are much easier to reason
+  about for the programmer than other kinds of operations, and using them is
+  generally a practical performance tradeoff.
+
+Notes for optimizers
+  Optimizers not aware of atomics can treat this like a nothrow call.  For
+  SequentiallyConsistent loads and stores, the same reorderings are allowed as
+  for Acquire loads and Release stores, except that SequentiallyConsistent
+  operations may not be reordered.
+
+Notes for code generation
+  SequentiallyConsistent loads minimally require the same barriers as Acquire
+  operations and SequentiallyConsistent stores require Release
+  barriers. Additionally, the code generator must enforce ordering between
+  SequentiallyConsistent stores followed by SequentiallyConsistent loads. This
+  is usually done by emitting either a full fence before the loads or a full
+  fence after the stores; which is preferred varies by architecture.
+
+Atomics and IR optimization
+===========================
+
+Predicates for optimizer writers to query:
+
+* ``isSimple()``: A load or store which is not volatile or atomic.  This is
+  what, for example, memcpyopt would check for operations it might transform.
+
+* ``isUnordered()``: A load or store which is not volatile and at most
+  Unordered. This would be checked, for example, by LICM before hoisting an
+  operation.
+
+* ``mayReadFromMemory()``/``mayWriteToMemory()``: Existing predicate, but note
+  that they return true for any operation which is volatile or at least
+  Monotonic.
+
+* ``isStrongerThan`` / ``isAtLeastOrStrongerThan``: These are predicates on
+  orderings. They can be useful for passes that are aware of atomics, for
+  example to do DSE across a single atomic access, but not across a
+  release-acquire pair (see MemoryDependencyAnalysis for an example of this)
+
+* Alias analysis: Note that AA will return ModRef for anything Acquire or
+  Release, and for the address accessed by any Monotonic operation.
+
+To support optimizing around atomic operations, make sure you are using the
+right predicates; everything should work if that is done.  If your pass should
+optimize some atomic operations (Unordered operations in particular), make sure
+it doesn't replace an atomic load or store with a non-atomic operation.
+
+Some examples of how optimizations interact with various kinds of atomic
+operations:
+
+* ``memcpyopt``: An atomic operation cannot be optimized into part of a
+  memcpy/memset, including unordered loads/stores.  It can pull operations
+  across some atomic operations.
+
+* LICM: Unordered loads/stores can be moved out of a loop.  It just treats
+  monotonic operations like a read+write to a memory location, and anything
+  stricter than that like a nothrow call.
+
+* DSE: Unordered stores can be DSE'ed like normal stores.  Monotonic stores can
+  be DSE'ed in some cases, but it's tricky to reason about, and not especially
+  important. It is possible in some case for DSE to operate across a stronger
+  atomic operation, but it is fairly tricky. DSE delegates this reasoning to
+  MemoryDependencyAnalysis (which is also used by other passes like GVN).
+
+* Folding a load: Any atomic load from a constant global can be constant-folded,
+  because it cannot be observed.  Similar reasoning allows sroa with
+  atomic loads and stores.
+
+Atomics and Codegen
+===================
+
+Atomic operations are represented in the SelectionDAG with ``ATOMIC_*`` opcodes.
+On architectures which use barrier instructions for all atomic ordering (like
+ARM), appropriate fences can be emitted by the AtomicExpand Codegen pass if
+``setInsertFencesForAtomic()`` was used.
+
+The MachineMemOperand for all atomic operations is currently marked as volatile;
+this is not correct in the IR sense of volatile, but CodeGen handles anything
+marked volatile very conservatively.  This should get fixed at some point.
+
+One very important property of the atomic operations is that if your backend
+supports any inline lock-free atomic operations of a given size, you should
+support *ALL* operations of that size in a lock-free manner.
+
+When the target implements atomic ``cmpxchg`` or LL/SC instructions (as most do)
+this is trivial: all the other operations can be implemented on top of those
+primitives. However, on many older CPUs (e.g. ARMv5, SparcV8, Intel 80386) there
+are atomic load and store instructions, but no ``cmpxchg`` or LL/SC. As it is
+invalid to implement ``atomic load`` using the native instruction, but
+``cmpxchg`` using a library call to a function that uses a mutex, ``atomic
+load`` must *also* expand to a library call on such architectures, so that it
+can remain atomic with regards to a simultaneous ``cmpxchg``, by using the same
+mutex.
+
+AtomicExpandPass can help with that: it will expand all atomic operations to the
+proper ``__atomic_*`` libcalls for any size above the maximum set by
+``setMaxAtomicSizeInBitsSupported`` (which defaults to 0).
+
+On x86, all atomic loads generate a ``MOV``. SequentiallyConsistent stores
+generate an ``XCHG``, other stores generate a ``MOV``. SequentiallyConsistent
+fences generate an ``MFENCE``, other fences do not cause any code to be
+generated.  ``cmpxchg`` uses the ``LOCK CMPXCHG`` instruction.  ``atomicrmw xchg``
+uses ``XCHG``, ``atomicrmw add`` and ``atomicrmw sub`` use ``XADD``, and all
+other ``atomicrmw`` operations generate a loop with ``LOCK CMPXCHG``.  Depending
+on the users of the result, some ``atomicrmw`` operations can be translated into
+operations like ``LOCK AND``, but that does not work in general.
+
+On ARM (before v8), MIPS, and many other RISC architectures, Acquire, Release,
+and SequentiallyConsistent semantics require barrier instructions for every such
+operation. Loads and stores generate normal instructions.  ``cmpxchg`` and
+``atomicrmw`` can be represented using a loop with LL/SC-style instructions
+which take some sort of exclusive lock on a cache line (``LDREX`` and ``STREX``
+on ARM, etc.).
+
+It is often easiest for backends to use AtomicExpandPass to lower some of the
+atomic constructs. Here are some lowerings it can do:
+
+* cmpxchg -> loop with load-linked/store-conditional
+  by overriding ``shouldExpandAtomicCmpXchgInIR()``, ``emitLoadLinked()``,
+  ``emitStoreConditional()``
+* large loads/stores -> ll-sc/cmpxchg
+  by overriding ``shouldExpandAtomicStoreInIR()``/``shouldExpandAtomicLoadInIR()``
+* strong atomic accesses -> monotonic accesses + fences by overriding
+  ``shouldInsertFencesForAtomic()``, ``emitLeadingFence()``, and
+  ``emitTrailingFence()``
+* atomic rmw -> loop with cmpxchg or load-linked/store-conditional
+  by overriding ``expandAtomicRMWInIR()``
+* expansion to __atomic_* libcalls for unsupported sizes.
+
+For an example of all of these, look at the ARM backend.
+
+Libcalls: __atomic_*
+====================
+
+There are two kinds of atomic library calls that are generated by LLVM. Please
+note that both sets of library functions somewhat confusingly share the names of
+builtin functions defined by clang. Despite this, the library functions are
+not directly related to the builtins: it is *not* the case that ``__atomic_*``
+builtins lower to ``__atomic_*`` library calls and ``__sync_*`` builtins lower
+to ``__sync_*`` library calls.
+
+The first set of library functions are named ``__atomic_*``. This set has been
+"standardized" by GCC, and is described below. (See also `GCC's documentation
+<https://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary>`_)
+
+LLVM's AtomicExpandPass will translate atomic operations on data sizes above
+``MaxAtomicSizeInBitsSupported`` into calls to these functions.
+
+There are four generic functions, which can be called with data of any size or
+alignment::
+
+   void __atomic_load(size_t size, void *ptr, void *ret, int ordering)
+   void __atomic_store(size_t size, void *ptr, void *val, int ordering)
+   void __atomic_exchange(size_t size, void *ptr, void *val, void *ret, int ordering)
+   bool __atomic_compare_exchange(size_t size, void *ptr, void *expected, void *desired, int success_order, int failure_order)
+
+There are also size-specialized versions of the above functions, which can only
+be used with *naturally-aligned* pointers of the appropriate size. In the
+signatures below, "N" is one of 1, 2, 4, 8, and 16, and "iN" is the appropriate
+integer type of that size; if no such integer type exists, the specialization
+cannot be used::
+
+   iN __atomic_load_N(iN *ptr, iN val, int ordering)
+   void __atomic_store_N(iN *ptr, iN val, int ordering)
+   iN __atomic_exchange_N(iN *ptr, iN val, int ordering)
+   bool __atomic_compare_exchange_N(iN *ptr, iN *expected, iN desired, int success_order, int failure_order)
+
+Finally there are some read-modify-write functions, which are only available in
+the size-specific variants (any other sizes use a ``__atomic_compare_exchange``
+loop)::
+
+   iN __atomic_fetch_add_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_sub_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_and_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_or_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_xor_N(iN *ptr, iN val, int ordering)
+   iN __atomic_fetch_nand_N(iN *ptr, iN val, int ordering)
+
+This set of library functions have some interesting implementation requirements
+to take note of:
+
+- They support all sizes and alignments -- including those which cannot be
+  implemented natively on any existing hardware. Therefore, they will certainly
+  use mutexes in for some sizes/alignments.
+
+- As a consequence, they cannot be shipped in a statically linked
+  compiler-support library, as they have state which must be shared amongst all
+  DSOs loaded in the program. They must be provided in a shared library used by
+  all objects.
+
+- The set of atomic sizes supported lock-free must be a superset of the sizes
+  any compiler can emit. That is: if a new compiler introduces support for
+  inline-lock-free atomics of size N, the ``__atomic_*`` functions must also have a
+  lock-free implementation for size N. This is a requirement so that code
+  produced by an old compiler (which will have called the ``__atomic_*`` function)
+  interoperates with code produced by the new compiler (which will use native
+  the atomic instruction).
+
+Note that it's possible to write an entirely target-independent implementation
+of these library functions by using the compiler atomic builtins themselves to
+implement the operations on naturally-aligned pointers of supported sizes, and a
+generic mutex implementation otherwise.
+
+Libcalls: __sync_*
+==================
+
+Some targets or OS/target combinations can support lock-free atomics, but for
+various reasons, it is not practical to emit the instructions inline.
+
+There's two typical examples of this.
+
+Some CPUs support multiple instruction sets which can be swiched back and forth
+on function-call boundaries. For example, MIPS supports the MIPS16 ISA, which
+has a smaller instruction encoding than the usual MIPS32 ISA. ARM, similarly,
+has the Thumb ISA. In MIPS16 and earlier versions of Thumb, the atomic
+instructions are not encodable. However, those instructions are available via a
+function call to a function with the longer encoding.
+
+Additionally, a few OS/target pairs provide kernel-supported lock-free
+atomics. ARM/Linux is an example of this: the kernel `provides
+<https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt>`_ a
+function which on older CPUs contains a "magically-restartable" atomic sequence
+(which looks atomic so long as there's only one CPU), and contains actual atomic
+instructions on newer multicore models. This sort of functionality can typically
+be provided on any architecture, if all CPUs which are missing atomic
+compare-and-swap support are uniprocessor (no SMP). This is almost always the
+case. The only common architecture without that property is SPARC -- SPARCV8 SMP
+systems were common, yet it doesn't support any sort of compare-and-swap
+operation.
+
+In either of these cases, the Target in LLVM can claim support for atomics of an
+appropriate size, and then implement some subset of the operations via libcalls
+to a ``__sync_*`` function. Such functions *must* not use locks in their
+implementation, because unlike the ``__atomic_*`` routines used by
+AtomicExpandPass, these may be mixed-and-matched with native instructions by the
+target lowering.
+
+Further, these routines do not need to be shared, as they are stateless. So,
+there is no issue with having multiple copies included in one binary. Thus,
+typically these routines are implemented by the statically-linked compiler
+runtime support library.
+
+LLVM will emit a call to an appropriate ``__sync_*`` routine if the target
+ISelLowering code has set the corresponding ``ATOMIC_CMPXCHG``, ``ATOMIC_SWAP``,
+or ``ATOMIC_LOAD_*`` operation to "Expand", and if it has opted-into the
+availability of those library functions via a call to ``initSyncLibcalls()``.
+
+The full set of functions that may be called by LLVM is (for ``N`` being 1, 2,
+4, 8, or 16)::
+
+  iN __sync_val_compare_and_swap_N(iN *ptr, iN expected, iN desired)
+  iN __sync_lock_test_and_set_N(iN *ptr, iN val)
+  iN __sync_fetch_and_add_N(iN *ptr, iN val)
+  iN __sync_fetch_and_sub_N(iN *ptr, iN val)
+  iN __sync_fetch_and_and_N(iN *ptr, iN val)
+  iN __sync_fetch_and_or_N(iN *ptr, iN val)
+  iN __sync_fetch_and_xor_N(iN *ptr, iN val)
+  iN __sync_fetch_and_nand_N(iN *ptr, iN val)
+  iN __sync_fetch_and_max_N(iN *ptr, iN val)
+  iN __sync_fetch_and_umax_N(iN *ptr, iN val)
+  iN __sync_fetch_and_min_N(iN *ptr, iN val)
+  iN __sync_fetch_and_umin_N(iN *ptr, iN val)
+
+This list doesn't include any function for atomic load or store; all known
+architectures support atomic loads and stores directly (possibly by emitting a
+fence on either side of a normal load or store.)
+
+There's also, somewhat separately, the possibility to lower ``ATOMIC_FENCE`` to
+``__sync_synchronize()``. This may happen or not happen independent of all the
+above, controlled purely by ``setOperationAction(ISD::ATOMIC_FENCE, ...)``.

Added: www-releases/trunk/4.0.1/docs/_sources/BigEndianNEON.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/BigEndianNEON.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/BigEndianNEON.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/BigEndianNEON.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,205 @@
+==============================================
+Using ARM NEON instructions in big endian mode
+==============================================
+
+.. contents::
+    :local:
+
+Introduction
+============
+
+Generating code for big endian ARM processors is for the most part straightforward. NEON loads and stores however have some interesting properties that make code generation decisions less obvious in big endian mode.
+
+The aim of this document is to explain the problem with NEON loads and stores, and the solution that has been implemented in LLVM.
+
+In this document the term "vector" refers to what the ARM ABI calls a "short vector", which is a sequence of items that can fit in a NEON register. This sequence can be 64 or 128 bits in length, and can constitute 8, 16, 32 or 64 bit items. This document refers to A64 instructions throughout, but is almost applicable to the A32/ARMv7 instruction sets also. The ABI format for passing vectors in A32 is sligtly different to A64. Apart from that, the same concepts apply.
+
+Example: C-level intrinsics -> assembly
+---------------------------------------
+
+It may be helpful first to illustrate how C-level ARM NEON intrinsics are lowered to instructions.
+
+This trivial C function takes a vector of four ints and sets the zero'th lane to the value "42"::
+
+    #include <arm_neon.h>
+    int32x4_t f(int32x4_t p) {
+        return vsetq_lane_s32(42, p, 0);
+    }
+
+arm_neon.h intrinsics generate "generic" IR where possible (that is, normal IR instructions not ``llvm.arm.neon.*`` intrinsic calls). The above generates::
+
+    define <4 x i32> @f(<4 x i32> %p) {
+      %vset_lane = insertelement <4 x i32> %p, i32 42, i32 0
+      ret <4 x i32> %vset_lane
+    }
+
+Which then becomes the following trivial assembly::
+
+    f:                                      // @f
+            movz	w8, #0x2a
+            ins 	v0.s[0], w8
+            ret
+
+Problem
+=======
+
+The main problem is how vectors are represented in memory and in registers.
+
+First, a recap. The "endianness" of an item affects its representation in memory only. In a register, a number is just a sequence of bits - 64 bits in the case of AArch64 general purpose registers. Memory, however, is a sequence of addressable units of 8 bits in size. Any number greater than 8 bits must therefore be split up into 8-bit chunks, and endianness describes the order in which these chunks are laid out in memory.
+
+A "little endian" layout has the least significant byte first (lowest in memory address). A "big endian" layout has the *most* significant byte first. This means that when loading an item from big endian memory, the lowest 8-bits in memory must go in the most significant 8-bits, and so forth.
+
+``LDR`` and ``LD1``
+===================
+
+.. figure:: ARM-BE-ldr.png
+    :align: right
+    
+    Big endian vector load using ``LDR``.
+
+
+A vector is a consecutive sequence of items that are operated on simultaneously. To load a 64-bit vector, 64 bits need to be read from memory. In little endian mode, we can do this by just performing a 64-bit load - ``LDR q0, [foo]``. However if we try this in big endian mode, because of the byte swapping the lane indices end up being swapped! The zero'th item as laid out in memory becomes the n'th lane in the vector.
+
+.. figure:: ARM-BE-ld1.png
+    :align: right
+
+    Big endian vector load using ``LD1``. Note that the lanes retain the correct ordering.
+
+
+Because of this, the instruction ``LD1`` performs a vector load but performs byte swapping not on the entire 64 bits, but on the individual items within the vector. This means that the register content is the same as it would have been on a little endian system.
+
+It may seem that ``LD1`` should suffice to peform vector loads on a big endian machine. However there are pros and cons to the two approaches that make it less than simple which register format to pick.
+
+There are two options:
+
+    1. The content of a vector register is the same *as if* it had been loaded with an ``LDR`` instruction.
+    2. The content of a vector register is the same *as if* it had been loaded with an ``LD1`` instruction.
+
+Because ``LD1 == LDR + REV`` and similarly ``LDR == LD1 + REV`` (on a big endian system), we can simulate either type of load with the other type of load plus a ``REV`` instruction. So we're not deciding which instructions to use, but which format to use (which will then influence which instruction is best to use).
+
+.. The 'clearer' container is required to make the following section header come after the floated
+   images above.
+.. container:: clearer
+
+    Note that throughout this section we only mention loads. Stores have exactly the same problems as their associated loads, so have been skipped for brevity.
+ 
+
+Considerations
+==============
+
+LLVM IR Lane ordering
+---------------------
+
+LLVM IR has first class vector types. In LLVM IR, the zero'th element of a vector resides at the lowest memory address. The optimizer relies on this property in certain areas, for example when concatenating vectors together. The intention is for arrays and vectors to have identical memory layouts - ``[4 x i8]`` and ``<4 x i8>`` should be represented the same in memory. Without this property there would be many special cases that the optimizer would have to cleverly handle.
+
+Use of ``LDR`` would break this lane ordering property. This doesn't preclude the use of ``LDR``, but we would have to do one of two things:
+
+   1. Insert a ``REV`` instruction to reverse the lane order after every ``LDR``.
+   2. Disable all optimizations that rely on lane layout, and for every access to an individual lane (``insertelement``/``extractelement``/``shufflevector``) reverse the lane index.
+
+AAPCS
+-----
+
+The ARM procedure call standard (AAPCS) defines the ABI for passing vectors between functions in registers. It states:
+
+    When a short vector is transferred between registers and memory it is treated as an opaque object. That is a short vector is stored in memory as if it were stored with a single ``STR`` of the entire register; a short vector is loaded from memory using the corresponding ``LDR`` instruction. On a little-endian system this means that element 0 will always contain the lowest addressed element of a short vector; on a big-endian system element 0 will contain the highest-addressed element of a short vector.
+
+    -- Procedure Call Standard for the ARM 64-bit Architecture (AArch64), 4.1.2 Short Vectors
+
+The use of ``LDR`` and ``STR`` as the ABI defines has at least one advantage over ``LD1`` and ``ST1``. ``LDR`` and ``STR`` are oblivious to the size of the individual lanes of a vector. ``LD1`` and ``ST1`` are not - the lane size is encoded within them. This is important across an ABI boundary, because it would become necessary to know the lane width the callee expects. Consider the following code:
+
+.. code-block:: c
+
+    <callee.c>
+    void callee(uint32x2_t v) {
+      ...
+    }
+
+    <caller.c>
+    extern void callee(uint32x2_t);
+    void caller() {
+      callee(...);
+    }
+
+If ``callee`` changed its signature to ``uint16x4_t``, which is equivalent in register content, if we passed as ``LD1`` we'd break this code until ``caller`` was updated and recompiled.
+
+There is an argument that if the signatures of the two functions are different then the behaviour should be undefined. But there may be functions that are agnostic to the lane layout of the vector, and treating the vector as an opaque value (just loading it and storing it) would be impossible without a common format across ABI boundaries.
+
+So to preserve ABI compatibility, we need to use the ``LDR`` lane layout across function calls.
+
+Alignment
+---------
+
+In strict alignment mode, ``LDR qX`` requires its address to be 128-bit aligned, whereas ``LD1`` only requires it to be as aligned as the lane size. If we canonicalised on using ``LDR``, we'd still need to use ``LD1`` in some places to avoid alignment faults (the result of the ``LD1`` would then need to be reversed with ``REV``).
+
+Most operating systems however do not run with alignment faults enabled, so this is often not an issue.
+
+Summary
+-------
+
+The following table summarises the instructions that are required to be emitted for each property mentioned above for each of the two solutions.
+
++-------------------------------+-------------------------------+---------------------+
+|                               | ``LDR`` layout                | ``LD1`` layout      |
++===============================+===============================+=====================+
+| Lane ordering                 |   ``LDR + REV``               |    ``LD1``          |
++-------------------------------+-------------------------------+---------------------+
+| AAPCS                         |   ``LDR``                     |    ``LD1 + REV``    |
++-------------------------------+-------------------------------+---------------------+
+| Alignment for strict mode     |   ``LDR`` / ``LD1 + REV``     |    ``LD1``          |
++-------------------------------+-------------------------------+---------------------+
+
+Neither approach is perfect, and choosing one boils down to choosing the lesser of two evils. The issue with lane ordering, it was decided, would have to change target-agnostic compiler passes and would result in a strange IR in which lane indices were reversed. It was decided that this was worse than the changes that would have to be made to support ``LD1``, so ``LD1`` was chosen as the canonical vector load instruction (and by inference, ``ST1`` for vector stores).
+
+Implementation
+==============
+
+There are 3 parts to the implementation:
+
+    1. Predicate ``LDR`` and ``STR`` instructions so that they are never allowed to be selected to generate vector loads and stores. The exception is one-lane vectors [1]_ - these by definition cannot have lane ordering problems so are fine to use ``LDR``/``STR``. 
+
+    2. Create code generation patterns for bitconverts that create ``REV`` instructions.
+
+    3. Make sure appropriate bitconverts are created so that vector values get passed over call boundaries as 1-element vectors (which is the same as if they were loaded with ``LDR``).
+
+Bitconverts
+-----------
+
+.. image:: ARM-BE-bitcastfail.png
+    :align: right
+
+The main problem with the ``LD1`` solution is dealing with bitconverts (or bitcasts, or reinterpret casts). These are pseudo instructions that only change the compiler's interpretation of data, not the underlying data itself. A requirement is that if data is loaded and then saved again (called a "round trip"), the memory contents should be the same after the store as before the load. If a vector is loaded and is then bitconverted to a different vector type before storing, the round trip will currently be broken.
+
+Take for example this code sequence::
+
+    %0 = load <4 x i32> %x
+    %1 = bitcast <4 x i32> %0 to <2 x i64>
+         store <2 x i64> %1, <2 x i64>* %y
+
+This would produce a code sequence such as that in the figure on the right. The mismatched ``LD1`` and ``ST1`` cause the stored data to differ from the loaded data.
+
+.. container:: clearer
+
+    When we see a bitcast from type ``X`` to type ``Y``, what we need to do is to change the in-register representation of the data to be *as if* it had just been loaded by a ``LD1`` of type ``Y``.
+
+.. image:: ARM-BE-bitcastsuccess.png
+    :align: right
+
+Conceptually this is simple - we can insert a ``REV`` undoing the ``LD1`` of type ``X`` (converting the in-register representation to the same as if it had been loaded by ``LDR``) and then insert another ``REV`` to change the representation to be as if it had been loaded by an ``LD1`` of type ``Y``.
+
+For the previous example, this would be::
+
+    LD1   v0.4s, [x]
+
+    REV64 v0.4s, v0.4s                  // There is no REV128 instruction, so it must be synthesizedcd 
+    EXT   v0.16b, v0.16b, v0.16b, #8    // with a REV64 then an EXT to swap the two 64-bit elements.
+
+    REV64 v0.2d, v0.2d
+    EXT   v0.16b, v0.16b, v0.16b, #8
+
+    ST1   v0.2d, [y]
+
+It turns out that these ``REV`` pairs can, in almost all cases, be squashed together into a single ``REV``. For the example above, a ``REV128 4s`` + ``REV128 2d`` is actually a ``REV64 4s``, as shown in the figure on the right.
+
+.. [1] One lane vectors may seem useless as a concept but they serve to distinguish between values held in general purpose registers and values held in NEON/VFP registers. For example, an ``i64`` would live in an ``x`` register, but ``<1 x i64>`` would live in a ``d`` register.
+

Added: www-releases/trunk/4.0.1/docs/_sources/BitCodeFormat.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/BitCodeFormat.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/BitCodeFormat.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/BitCodeFormat.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,1312 @@
+.. role:: raw-html(raw)
+   :format: html
+
+========================
+LLVM Bitcode File Format
+========================
+
+.. contents::
+   :local:
+
+Abstract
+========
+
+This document describes the LLVM bitstream file format and the encoding of the
+LLVM IR into it.
+
+Overview
+========
+
+What is commonly known as the LLVM bitcode file format (also, sometimes
+anachronistically known as bytecode) is actually two things: a `bitstream
+container format`_ and an `encoding of LLVM IR`_ into the container format.
+
+The bitstream format is an abstract encoding of structured data, very similar to
+XML in some ways.  Like XML, bitstream files contain tags, and nested
+structures, and you can parse the file without having to understand the tags.
+Unlike XML, the bitstream format is a binary encoding, and unlike XML it
+provides a mechanism for the file to self-describe "abbreviations", which are
+effectively size optimizations for the content.
+
+LLVM IR files may be optionally embedded into a `wrapper`_ structure, or in a
+`native object file`_. Both of these mechanisms make it easy to embed extra
+data along with LLVM IR files.
+
+This document first describes the LLVM bitstream format, describes the wrapper
+format, then describes the record structure used by LLVM IR files.
+
+.. _bitstream container format:
+
+Bitstream Format
+================
+
+The bitstream format is literally a stream of bits, with a very simple
+structure.  This structure consists of the following concepts:
+
+* A "`magic number`_" that identifies the contents of the stream.
+
+* Encoding `primitives`_ like variable bit-rate integers.
+
+* `Blocks`_, which define nested content.
+
+* `Data Records`_, which describe entities within the file.
+
+* Abbreviations, which specify compression optimizations for the file.
+
+Note that the :doc:`llvm-bcanalyzer <CommandGuide/llvm-bcanalyzer>` tool can be
+used to dump and inspect arbitrary bitstreams, which is very useful for
+understanding the encoding.
+
+.. _magic number:
+
+Magic Numbers
+-------------
+
+The first two bytes of a bitcode file are 'BC' (``0x42``, ``0x43``).  The second
+two bytes are an application-specific magic number.  Generic bitcode tools can
+look at only the first two bytes to verify the file is bitcode, while
+application-specific programs will want to look at all four.
+
+.. _primitives:
+
+Primitives
+----------
+
+A bitstream literally consists of a stream of bits, which are read in order
+starting with the least significant bit of each byte.  The stream is made up of
+a number of primitive values that encode a stream of unsigned integer values.
+These integers are encoded in two ways: either as `Fixed Width Integers`_ or as
+`Variable Width Integers`_.
+
+.. _Fixed Width Integers:
+.. _fixed-width value:
+
+Fixed Width Integers
+^^^^^^^^^^^^^^^^^^^^
+
+Fixed-width integer values have their low bits emitted directly to the file.
+For example, a 3-bit integer value encodes 1 as 001.  Fixed width integers are
+used when there are a well-known number of options for a field.  For example,
+boolean values are usually encoded with a 1-bit wide integer.
+
+.. _Variable Width Integers:
+.. _Variable Width Integer:
+.. _variable-width value:
+
+Variable Width Integers
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Variable-width integer (VBR) values encode values of arbitrary size, optimizing
+for the case where the values are small.  Given a 4-bit VBR field, any 3-bit
+value (0 through 7) is encoded directly, with the high bit set to zero.  Values
+larger than N-1 bits emit their bits in a series of N-1 bit chunks, where all
+but the last set the high bit.
+
+For example, the value 27 (0x1B) is encoded as 1011 0011 when emitted as a vbr4
+value.  The first set of four bits indicates the value 3 (011) with a
+continuation piece (indicated by a high bit of 1).  The next word indicates a
+value of 24 (011 << 3) with no continuation.  The sum (3+24) yields the value
+27.
+
+.. _char6-encoded value:
+
+6-bit characters
+^^^^^^^^^^^^^^^^
+
+6-bit characters encode common characters into a fixed 6-bit field.  They
+represent the following characters with the following 6-bit values:
+
+::
+
+  'a' .. 'z' ---  0 .. 25
+  'A' .. 'Z' --- 26 .. 51
+  '0' .. '9' --- 52 .. 61
+         '.' --- 62
+         '_' --- 63
+
+This encoding is only suitable for encoding characters and strings that consist
+only of the above characters.  It is completely incapable of encoding characters
+not in the set.
+
+Word Alignment
+^^^^^^^^^^^^^^
+
+Occasionally, it is useful to emit zero bits until the bitstream is a multiple
+of 32 bits.  This ensures that the bit position in the stream can be represented
+as a multiple of 32-bit words.
+
+Abbreviation IDs
+----------------
+
+A bitstream is a sequential series of `Blocks`_ and `Data Records`_.  Both of
+these start with an abbreviation ID encoded as a fixed-bitwidth field.  The
+width is specified by the current block, as described below.  The value of the
+abbreviation ID specifies either a builtin ID (which have special meanings,
+defined below) or one of the abbreviation IDs defined for the current block by
+the stream itself.
+
+The set of builtin abbrev IDs is:
+
+* 0 - `END_BLOCK`_ --- This abbrev ID marks the end of the current block.
+
+* 1 - `ENTER_SUBBLOCK`_ --- This abbrev ID marks the beginning of a new
+  block.
+
+* 2 - `DEFINE_ABBREV`_ --- This defines a new abbreviation.
+
+* 3 - `UNABBREV_RECORD`_ --- This ID specifies the definition of an
+  unabbreviated record.
+
+Abbreviation IDs 4 and above are defined by the stream itself, and specify an
+`abbreviated record encoding`_.
+
+.. _Blocks:
+
+Blocks
+------
+
+Blocks in a bitstream denote nested regions of the stream, and are identified by
+a content-specific id number (for example, LLVM IR uses an ID of 12 to represent
+function bodies).  Block IDs 0-7 are reserved for `standard blocks`_ whose
+meaning is defined by Bitcode; block IDs 8 and greater are application
+specific. Nested blocks capture the hierarchical structure of the data encoded
+in it, and various properties are associated with blocks as the file is parsed.
+Block definitions allow the reader to efficiently skip blocks in constant time
+if the reader wants a summary of blocks, or if it wants to efficiently skip data
+it does not understand.  The LLVM IR reader uses this mechanism to skip function
+bodies, lazily reading them on demand.
+
+When reading and encoding the stream, several properties are maintained for the
+block.  In particular, each block maintains:
+
+#. A current abbrev id width.  This value starts at 2 at the beginning of the
+   stream, and is set every time a block record is entered.  The block entry
+   specifies the abbrev id width for the body of the block.
+
+#. A set of abbreviations.  Abbreviations may be defined within a block, in
+   which case they are only defined in that block (neither subblocks nor
+   enclosing blocks see the abbreviation).  Abbreviations can also be defined
+   inside a `BLOCKINFO`_ block, in which case they are defined in all blocks
+   that match the ID that the ``BLOCKINFO`` block is describing.
+
+As sub blocks are entered, these properties are saved and the new sub-block has
+its own set of abbreviations, and its own abbrev id width.  When a sub-block is
+popped, the saved values are restored.
+
+.. _ENTER_SUBBLOCK:
+
+ENTER_SUBBLOCK Encoding
+^^^^^^^^^^^^^^^^^^^^^^^
+
+:raw-html:`<tt>`
+[ENTER_SUBBLOCK, blockid\ :sub:`vbr8`, newabbrevlen\ :sub:`vbr4`, <align32bits>, blocklen_32]
+:raw-html:`</tt>`
+
+The ``ENTER_SUBBLOCK`` abbreviation ID specifies the start of a new block
+record.  The ``blockid`` value is encoded as an 8-bit VBR identifier, and
+indicates the type of block being entered, which can be a `standard block`_ or
+an application-specific block.  The ``newabbrevlen`` value is a 4-bit VBR, which
+specifies the abbrev id width for the sub-block.  The ``blocklen`` value is a
+32-bit aligned value that specifies the size of the subblock in 32-bit
+words. This value allows the reader to skip over the entire block in one jump.
+
+.. _END_BLOCK:
+
+END_BLOCK Encoding
+^^^^^^^^^^^^^^^^^^
+
+``[END_BLOCK, <align32bits>]``
+
+The ``END_BLOCK`` abbreviation ID specifies the end of the current block record.
+Its end is aligned to 32-bits to ensure that the size of the block is an even
+multiple of 32-bits.
+
+.. _Data Records:
+
+Data Records
+------------
+
+Data records consist of a record code and a number of (up to) 64-bit integer
+values.  The interpretation of the code and values is application specific and
+may vary between different block types.  Records can be encoded either using an
+unabbrev record, or with an abbreviation.  In the LLVM IR format, for example,
+there is a record which encodes the target triple of a module.  The code is
+``MODULE_CODE_TRIPLE``, and the values of the record are the ASCII codes for the
+characters in the string.
+
+.. _UNABBREV_RECORD:
+
+UNABBREV_RECORD Encoding
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+:raw-html:`<tt>`
+[UNABBREV_RECORD, code\ :sub:`vbr6`, numops\ :sub:`vbr6`, op0\ :sub:`vbr6`, op1\ :sub:`vbr6`, ...]
+:raw-html:`</tt>`
+
+An ``UNABBREV_RECORD`` provides a default fallback encoding, which is both
+completely general and extremely inefficient.  It can describe an arbitrary
+record by emitting the code and operands as VBRs.
+
+For example, emitting an LLVM IR target triple as an unabbreviated record
+requires emitting the ``UNABBREV_RECORD`` abbrevid, a vbr6 for the
+``MODULE_CODE_TRIPLE`` code, a vbr6 for the length of the string, which is equal
+to the number of operands, and a vbr6 for each character.  Because there are no
+letters with values less than 32, each letter would need to be emitted as at
+least a two-part VBR, which means that each letter would require at least 12
+bits.  This is not an efficient encoding, but it is fully general.
+
+.. _abbreviated record encoding:
+
+Abbreviated Record Encoding
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[<abbrevid>, fields...]``
+
+An abbreviated record is a abbreviation id followed by a set of fields that are
+encoded according to the `abbreviation definition`_.  This allows records to be
+encoded significantly more densely than records encoded with the
+`UNABBREV_RECORD`_ type, and allows the abbreviation types to be specified in
+the stream itself, which allows the files to be completely self describing.  The
+actual encoding of abbreviations is defined below.
+
+The record code, which is the first field of an abbreviated record, may be
+encoded in the abbreviation definition (as a literal operand) or supplied in the
+abbreviated record (as a Fixed or VBR operand value).
+
+.. _abbreviation definition:
+
+Abbreviations
+-------------
+
+Abbreviations are an important form of compression for bitstreams.  The idea is
+to specify a dense encoding for a class of records once, then use that encoding
+to emit many records.  It takes space to emit the encoding into the file, but
+the space is recouped (hopefully plus some) when the records that use it are
+emitted.
+
+Abbreviations can be determined dynamically per client, per file. Because the
+abbreviations are stored in the bitstream itself, different streams of the same
+format can contain different sets of abbreviations according to the needs of the
+specific stream.  As a concrete example, LLVM IR files usually emit an
+abbreviation for binary operators.  If a specific LLVM module contained no or
+few binary operators, the abbreviation does not need to be emitted.
+
+.. _DEFINE_ABBREV:
+
+DEFINE_ABBREV Encoding
+^^^^^^^^^^^^^^^^^^^^^^
+
+:raw-html:`<tt>`
+[DEFINE_ABBREV, numabbrevops\ :sub:`vbr5`, abbrevop0, abbrevop1, ...]
+:raw-html:`</tt>`
+
+A ``DEFINE_ABBREV`` record adds an abbreviation to the list of currently defined
+abbreviations in the scope of this block.  This definition only exists inside
+this immediate block --- it is not visible in subblocks or enclosing blocks.
+Abbreviations are implicitly assigned IDs sequentially starting from 4 (the
+first application-defined abbreviation ID).  Any abbreviations defined in a
+``BLOCKINFO`` record for the particular block type receive IDs first, in order,
+followed by any abbreviations defined within the block itself.  Abbreviated data
+records reference this ID to indicate what abbreviation they are invoking.
+
+An abbreviation definition consists of the ``DEFINE_ABBREV`` abbrevid followed
+by a VBR that specifies the number of abbrev operands, then the abbrev operands
+themselves.  Abbreviation operands come in three forms.  They all start with a
+single bit that indicates whether the abbrev operand is a literal operand (when
+the bit is 1) or an encoding operand (when the bit is 0).
+
+#. Literal operands --- :raw-html:`<tt>` [1\ :sub:`1`, litvalue\
+   :sub:`vbr8`] :raw-html:`</tt>` --- Literal operands specify that the value in
+   the result is always a single specific value.  This specific value is emitted
+   as a vbr8 after the bit indicating that it is a literal operand.
+
+#. Encoding info without data --- :raw-html:`<tt>` [0\ :sub:`1`, encoding\
+   :sub:`3`] :raw-html:`</tt>` --- Operand encodings that do not have extra data
+   are just emitted as their code.
+
+#. Encoding info with data --- :raw-html:`<tt>` [0\ :sub:`1`, encoding\
+   :sub:`3`, value\ :sub:`vbr5`] :raw-html:`</tt>` --- Operand encodings that do
+   have extra data are emitted as their code, followed by the extra data.
+
+The possible operand encodings are:
+
+* Fixed (code 1): The field should be emitted as a `fixed-width value`_, whose
+  width is specified by the operand's extra data.
+
+* VBR (code 2): The field should be emitted as a `variable-width value`_, whose
+  width is specified by the operand's extra data.
+
+* Array (code 3): This field is an array of values.  The array operand has no
+  extra data, but expects another operand to follow it, indicating the element
+  type of the array.  When reading an array in an abbreviated record, the first
+  integer is a vbr6 that indicates the array length, followed by the encoded
+  elements of the array.  An array may only occur as the last operand of an
+  abbreviation (except for the one final operand that gives the array's
+  type).
+
+* Char6 (code 4): This field should be emitted as a `char6-encoded value`_.
+  This operand type takes no extra data. Char6 encoding is normally used as an
+  array element type.
+
+* Blob (code 5): This field is emitted as a vbr6, followed by padding to a
+  32-bit boundary (for alignment) and an array of 8-bit objects.  The array of
+  bytes is further followed by tail padding to ensure that its total length is a
+  multiple of 4 bytes.  This makes it very efficient for the reader to decode
+  the data without having to make a copy of it: it can use a pointer to the data
+  in the mapped in file and poke directly at it.  A blob may only occur as the
+  last operand of an abbreviation.
+
+For example, target triples in LLVM modules are encoded as a record of the form
+``[TRIPLE, 'a', 'b', 'c', 'd']``.  Consider if the bitstream emitted the
+following abbrev entry:
+
+::
+
+  [0, Fixed, 4]
+  [0, Array]
+  [0, Char6]
+
+When emitting a record with this abbreviation, the above entry would be emitted
+as:
+
+:raw-html:`<tt><blockquote>`
+[4\ :sub:`abbrevwidth`, 2\ :sub:`4`, 4\ :sub:`vbr6`, 0\ :sub:`6`, 1\ :sub:`6`, 2\ :sub:`6`, 3\ :sub:`6`]
+:raw-html:`</blockquote></tt>`
+
+These values are:
+
+#. The first value, 4, is the abbreviation ID for this abbreviation.
+
+#. The second value, 2, is the record code for ``TRIPLE`` records within LLVM IR
+   file ``MODULE_BLOCK`` blocks.
+
+#. The third value, 4, is the length of the array.
+
+#. The rest of the values are the char6 encoded values for ``"abcd"``.
+
+With this abbreviation, the triple is emitted with only 37 bits (assuming a
+abbrev id width of 3).  Without the abbreviation, significantly more space would
+be required to emit the target triple.  Also, because the ``TRIPLE`` value is
+not emitted as a literal in the abbreviation, the abbreviation can also be used
+for any other string value.
+
+.. _standard blocks:
+.. _standard block:
+
+Standard Blocks
+---------------
+
+In addition to the basic block structure and record encodings, the bitstream
+also defines specific built-in block types.  These block types specify how the
+stream is to be decoded or other metadata.  In the future, new standard blocks
+may be added.  Block IDs 0-7 are reserved for standard blocks.
+
+.. _BLOCKINFO:
+
+#0 - BLOCKINFO Block
+^^^^^^^^^^^^^^^^^^^^
+
+The ``BLOCKINFO`` block allows the description of metadata for other blocks.
+The currently specified records are:
+
+::
+
+  [SETBID (#1), blockid]
+  [DEFINE_ABBREV, ...]
+  [BLOCKNAME, ...name...]
+  [SETRECORDNAME, RecordID, ...name...]
+
+The ``SETBID`` record (code 1) indicates which block ID is being described.
+``SETBID`` records can occur multiple times throughout the block to change which
+block ID is being described.  There must be a ``SETBID`` record prior to any
+other records.
+
+Standard ``DEFINE_ABBREV`` records can occur inside ``BLOCKINFO`` blocks, but
+unlike their occurrence in normal blocks, the abbreviation is defined for blocks
+matching the block ID we are describing, *not* the ``BLOCKINFO`` block
+itself.  The abbreviations defined in ``BLOCKINFO`` blocks receive abbreviation
+IDs as described in `DEFINE_ABBREV`_.
+
+The ``BLOCKNAME`` record (code 2) can optionally occur in this block.  The
+elements of the record are the bytes of the string name of the block.
+llvm-bcanalyzer can use this to dump out bitcode files symbolically.
+
+The ``SETRECORDNAME`` record (code 3) can also optionally occur in this block.
+The first operand value is a record ID number, and the rest of the elements of
+the record are the bytes for the string name of the record.  llvm-bcanalyzer can
+use this to dump out bitcode files symbolically.
+
+Note that although the data in ``BLOCKINFO`` blocks is described as "metadata,"
+the abbreviations they contain are essential for parsing records from the
+corresponding blocks.  It is not safe to skip them.
+
+.. _wrapper:
+
+Bitcode Wrapper Format
+======================
+
+Bitcode files for LLVM IR may optionally be wrapped in a simple wrapper
+structure.  This structure contains a simple header that indicates the offset
+and size of the embedded BC file.  This allows additional information to be
+stored alongside the BC file.  The structure of this file header is:
+
+:raw-html:`<tt><blockquote>`
+[Magic\ :sub:`32`, Version\ :sub:`32`, Offset\ :sub:`32`, Size\ :sub:`32`, CPUType\ :sub:`32`]
+:raw-html:`</blockquote></tt>`
+
+Each of the fields are 32-bit fields stored in little endian form (as with the
+rest of the bitcode file fields).  The Magic number is always ``0x0B17C0DE`` and
+the version is currently always ``0``.  The Offset field is the offset in bytes
+to the start of the bitcode stream in the file, and the Size field is the size
+in bytes of the stream. CPUType is a target-specific value that can be used to
+encode the CPU of the target.
+
+.. _native object file:
+
+Native Object File Wrapper Format
+=================================
+
+Bitcode files for LLVM IR may also be wrapped in a native object file
+(i.e. ELF, COFF, Mach-O).  The bitcode must be stored in a section of the object
+file named ``__LLVM,__bitcode`` for MachO and ``.llvmbc`` for the other object
+formats.  This wrapper format is useful for accommodating LTO in compilation
+pipelines where intermediate objects must be native object files which contain
+metadata in other sections.
+
+Not all tools support this format.
+
+.. _encoding of LLVM IR:
+
+LLVM IR Encoding
+================
+
+LLVM IR is encoded into a bitstream by defining blocks and records.  It uses
+blocks for things like constant pools, functions, symbol tables, etc.  It uses
+records for things like instructions, global variable descriptors, type
+descriptions, etc.  This document does not describe the set of abbreviations
+that the writer uses, as these are fully self-described in the file, and the
+reader is not allowed to build in any knowledge of this.
+
+Basics
+------
+
+LLVM IR Magic Number
+^^^^^^^^^^^^^^^^^^^^
+
+The magic number for LLVM IR files is:
+
+:raw-html:`<tt><blockquote>`
+[0x0\ :sub:`4`, 0xC\ :sub:`4`, 0xE\ :sub:`4`, 0xD\ :sub:`4`]
+:raw-html:`</blockquote></tt>`
+
+When combined with the bitcode magic number and viewed as bytes, this is
+``"BC 0xC0DE"``.
+
+.. _Signed VBRs:
+
+Signed VBRs
+^^^^^^^^^^^
+
+`Variable Width Integer`_ encoding is an efficient way to encode arbitrary sized
+unsigned values, but is an extremely inefficient for encoding signed values, as
+signed values are otherwise treated as maximally large unsigned values.
+
+As such, signed VBR values of a specific width are emitted as follows:
+
+* Positive values are emitted as VBRs of the specified width, but with their
+  value shifted left by one.
+
+* Negative values are emitted as VBRs of the specified width, but the negated
+  value is shifted left by one, and the low bit is set.
+
+With this encoding, small positive and small negative values can both be emitted
+efficiently. Signed VBR encoding is used in ``CST_CODE_INTEGER`` and
+``CST_CODE_WIDE_INTEGER`` records within ``CONSTANTS_BLOCK`` blocks.
+It is also used for phi instruction operands in `MODULE_CODE_VERSION`_ 1.
+
+LLVM IR Blocks
+^^^^^^^^^^^^^^
+
+LLVM IR is defined with the following blocks:
+
+* 8 --- `MODULE_BLOCK`_ --- This is the top-level block that contains the entire
+  module, and describes a variety of per-module information.
+
+* 9 --- `PARAMATTR_BLOCK`_ --- This enumerates the parameter attributes.
+
+* 10 --- `PARAMATTR_GROUP_BLOCK`_ --- This describes the attribute group table.
+
+* 11 --- `CONSTANTS_BLOCK`_ --- This describes constants for a module or
+  function.
+
+* 12 --- `FUNCTION_BLOCK`_ --- This describes a function body.
+
+* 14 --- `VALUE_SYMTAB_BLOCK`_ --- This describes a value symbol table.
+
+* 15 --- `METADATA_BLOCK`_ --- This describes metadata items.
+
+* 16 --- `METADATA_ATTACHMENT`_ --- This contains records associating metadata
+  with function instruction values.
+
+* 17 --- `TYPE_BLOCK`_ --- This describes all of the types in the module.
+
+.. _MODULE_BLOCK:
+
+MODULE_BLOCK Contents
+---------------------
+
+The ``MODULE_BLOCK`` block (id 8) is the top-level block for LLVM bitcode files,
+and each bitcode file must contain exactly one. In addition to records
+(described below) containing information about the module, a ``MODULE_BLOCK``
+block may contain the following sub-blocks:
+
+* `BLOCKINFO`_
+* `PARAMATTR_BLOCK`_
+* `PARAMATTR_GROUP_BLOCK`_
+* `TYPE_BLOCK`_
+* `VALUE_SYMTAB_BLOCK`_
+* `CONSTANTS_BLOCK`_
+* `FUNCTION_BLOCK`_
+* `METADATA_BLOCK`_
+
+.. _MODULE_CODE_VERSION:
+
+MODULE_CODE_VERSION Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[VERSION, version#]``
+
+The ``VERSION`` record (code 1) contains a single value indicating the format
+version. Versions 0 and 1 are supported at this time. The difference between
+version 0 and 1 is in the encoding of instruction operands in
+each `FUNCTION_BLOCK`_.
+
+In version 0, each value defined by an instruction is assigned an ID
+unique to the function. Function-level value IDs are assigned starting from
+``NumModuleValues`` since they share the same namespace as module-level
+values. The value enumerator resets after each function. When a value is
+an operand of an instruction, the value ID is used to represent the operand.
+For large functions or large modules, these operand values can be large.
+
+The encoding in version 1 attempts to avoid large operand values
+in common cases. Instead of using the value ID directly, operands are
+encoded as relative to the current instruction. Thus, if an operand
+is the value defined by the previous instruction, the operand
+will be encoded as 1.
+
+For example, instead of
+
+.. code-block:: none
+
+  #n = load #n-1
+  #n+1 = icmp eq #n, #const0
+  br #n+1, label #(bb1), label #(bb2)
+
+version 1 will encode the instructions as
+
+.. code-block:: none
+
+  #n = load #1
+  #n+1 = icmp eq #1, (#n+1)-#const0
+  br #1, label #(bb1), label #(bb2)
+
+Note in the example that operands which are constants also use
+the relative encoding, while operands like basic block labels
+do not use the relative encoding.
+
+Forward references will result in a negative value.
+This can be inefficient, as operands are normally encoded
+as unsigned VBRs. However, forward references are rare, except in the
+case of phi instructions. For phi instructions, operands are encoded as
+`Signed VBRs`_ to deal with forward references.
+
+
+MODULE_CODE_TRIPLE Record
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[TRIPLE, ...string...]``
+
+The ``TRIPLE`` record (code 2) contains a variable number of values representing
+the bytes of the ``target triple`` specification string.
+
+MODULE_CODE_DATALAYOUT Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[DATALAYOUT, ...string...]``
+
+The ``DATALAYOUT`` record (code 3) contains a variable number of values
+representing the bytes of the ``target datalayout`` specification string.
+
+MODULE_CODE_ASM Record
+^^^^^^^^^^^^^^^^^^^^^^
+
+``[ASM, ...string...]``
+
+The ``ASM`` record (code 4) contains a variable number of values representing
+the bytes of ``module asm`` strings, with individual assembly blocks separated
+by newline (ASCII 10) characters.
+
+.. _MODULE_CODE_SECTIONNAME:
+
+MODULE_CODE_SECTIONNAME Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[SECTIONNAME, ...string...]``
+
+The ``SECTIONNAME`` record (code 5) contains a variable number of values
+representing the bytes of a single section name string. There should be one
+``SECTIONNAME`` record for each section name referenced (e.g., in global
+variable or function ``section`` attributes) within the module. These records
+can be referenced by the 1-based index in the *section* fields of ``GLOBALVAR``
+or ``FUNCTION`` records.
+
+MODULE_CODE_DEPLIB Record
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[DEPLIB, ...string...]``
+
+The ``DEPLIB`` record (code 6) contains a variable number of values representing
+the bytes of a single dependent library name string, one of the libraries
+mentioned in a ``deplibs`` declaration.  There should be one ``DEPLIB`` record
+for each library name referenced.
+
+MODULE_CODE_GLOBALVAR Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[GLOBALVAR, pointer type, isconst, initid, linkage, alignment, section, visibility, threadlocal, unnamed_addr, externally_initialized, dllstorageclass, comdat]``
+
+The ``GLOBALVAR`` record (code 7) marks the declaration or definition of a
+global variable. The operand fields are:
+
+* *pointer type*: The type index of the pointer type used to point to this
+  global variable
+
+* *isconst*: Non-zero if the variable is treated as constant within the module,
+  or zero if it is not
+
+* *initid*: If non-zero, the value index of the initializer for this variable,
+  plus 1.
+
+.. _linkage type:
+
+* *linkage*: An encoding of the linkage type for this variable:
+
+  * ``external``: code 0
+  * ``weak``: code 1
+  * ``appending``: code 2
+  * ``internal``: code 3
+  * ``linkonce``: code 4
+  * ``dllimport``: code 5
+  * ``dllexport``: code 6
+  * ``extern_weak``: code 7
+  * ``common``: code 8
+  * ``private``: code 9
+  * ``weak_odr``: code 10
+  * ``linkonce_odr``: code 11
+  * ``available_externally``: code 12
+  * deprecated : code 13
+  * deprecated : code 14
+
+* alignment*: The logarithm base 2 of the variable's requested alignment, plus 1
+
+* *section*: If non-zero, the 1-based section index in the table of
+  `MODULE_CODE_SECTIONNAME`_ entries.
+
+.. _visibility:
+
+* *visibility*: If present, an encoding of the visibility of this variable:
+
+  * ``default``: code 0
+  * ``hidden``: code 1
+  * ``protected``: code 2
+
+.. _bcthreadlocal:
+
+* *threadlocal*: If present, an encoding of the thread local storage mode of the
+  variable:
+
+  * ``not thread local``: code 0
+  * ``thread local; default TLS model``: code 1
+  * ``localdynamic``: code 2
+  * ``initialexec``: code 3
+  * ``localexec``: code 4
+
+.. _bcunnamedaddr:
+
+* *unnamed_addr*: If present, an encoding of the ``unnamed_addr`` attribute of this
+  variable:
+
+  * not ``unnamed_addr``: code 0
+  * ``unnamed_addr``: code 1
+  * ``local_unnamed_addr``: code 2
+
+.. _bcdllstorageclass:
+
+* *dllstorageclass*: If present, an encoding of the DLL storage class of this variable:
+
+  * ``default``: code 0
+  * ``dllimport``: code 1
+  * ``dllexport``: code 2
+
+* *comdat*: An encoding of the COMDAT of this function
+
+.. _FUNCTION:
+
+MODULE_CODE_FUNCTION Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[FUNCTION, type, callingconv, isproto, linkage, paramattr, alignment, section, visibility, gc, prologuedata, dllstorageclass, comdat, prefixdata, personalityfn]``
+
+The ``FUNCTION`` record (code 8) marks the declaration or definition of a
+function. The operand fields are:
+
+* *type*: The type index of the function type describing this function
+
+* *callingconv*: The calling convention number:
+  * ``ccc``: code 0
+  * ``fastcc``: code 8
+  * ``coldcc``: code 9
+  * ``webkit_jscc``: code 12
+  * ``anyregcc``: code 13
+  * ``preserve_mostcc``: code 14
+  * ``preserve_allcc``: code 15
+  * ``swiftcc`` : code 16
+  * ``cxx_fast_tlscc``: code 17
+  * ``x86_stdcallcc``: code 64
+  * ``x86_fastcallcc``: code 65
+  * ``arm_apcscc``: code 66
+  * ``arm_aapcscc``: code 67
+  * ``arm_aapcs_vfpcc``: code 68
+
+* isproto*: Non-zero if this entry represents a declaration rather than a
+  definition
+
+* *linkage*: An encoding of the `linkage type`_ for this function
+
+* *paramattr*: If nonzero, the 1-based parameter attribute index into the table
+  of `PARAMATTR_CODE_ENTRY`_ entries.
+
+* *alignment*: The logarithm base 2 of the function's requested alignment, plus
+  1
+
+* *section*: If non-zero, the 1-based section index in the table of
+  `MODULE_CODE_SECTIONNAME`_ entries.
+
+* *visibility*: An encoding of the `visibility`_ of this function
+
+* *gc*: If present and nonzero, the 1-based garbage collector index in the table
+  of `MODULE_CODE_GCNAME`_ entries.
+
+* *unnamed_addr*: If present, an encoding of the
+  :ref:`unnamed_addr<bcunnamedaddr>` attribute of this function
+
+* *prologuedata*: If non-zero, the value index of the prologue data for this function,
+  plus 1.
+
+* *dllstorageclass*: An encoding of the
+  :ref:`dllstorageclass<bcdllstorageclass>` of this function
+
+* *comdat*: An encoding of the COMDAT of this function
+
+* *prefixdata*: If non-zero, the value index of the prefix data for this function,
+  plus 1.
+
+* *personalityfn*: If non-zero, the value index of the personality function for this function,
+  plus 1.
+
+MODULE_CODE_ALIAS Record
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[ALIAS, alias type, aliasee val#, linkage, visibility, dllstorageclass, threadlocal, unnamed_addr]``
+
+The ``ALIAS`` record (code 9) marks the definition of an alias. The operand
+fields are
+
+* *alias type*: The type index of the alias
+
+* *aliasee val#*: The value index of the aliased value
+
+* *linkage*: An encoding of the `linkage type`_ for this alias
+
+* *visibility*: If present, an encoding of the `visibility`_ of the alias
+
+* *dllstorageclass*: If present, an encoding of the
+  :ref:`dllstorageclass<bcdllstorageclass>` of the alias
+
+* *threadlocal*: If present, an encoding of the
+  :ref:`thread local property<bcthreadlocal>` of the alias
+
+* *unnamed_addr*: If present, an encoding of the
+  :ref:`unnamed_addr<bcunnamedaddr>` attribute of this alias
+
+MODULE_CODE_PURGEVALS Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[PURGEVALS, numvals]``
+
+The ``PURGEVALS`` record (code 10) resets the module-level value list to the
+size given by the single operand value. Module-level value list items are added
+by ``GLOBALVAR``, ``FUNCTION``, and ``ALIAS`` records.  After a ``PURGEVALS``
+record is seen, new value indices will start from the given *numvals* value.
+
+.. _MODULE_CODE_GCNAME:
+
+MODULE_CODE_GCNAME Record
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[GCNAME, ...string...]``
+
+The ``GCNAME`` record (code 11) contains a variable number of values
+representing the bytes of a single garbage collector name string. There should
+be one ``GCNAME`` record for each garbage collector name referenced in function
+``gc`` attributes within the module. These records can be referenced by 1-based
+index in the *gc* fields of ``FUNCTION`` records.
+
+.. _PARAMATTR_BLOCK:
+
+PARAMATTR_BLOCK Contents
+------------------------
+
+The ``PARAMATTR_BLOCK`` block (id 9) contains a table of entries describing the
+attributes of function parameters. These entries are referenced by 1-based index
+in the *paramattr* field of module block `FUNCTION`_ records, or within the
+*attr* field of function block ``INST_INVOKE`` and ``INST_CALL`` records.
+
+Entries within ``PARAMATTR_BLOCK`` are constructed to ensure that each is unique
+(i.e., no two indices represent equivalent attribute lists).
+
+.. _PARAMATTR_CODE_ENTRY:
+
+PARAMATTR_CODE_ENTRY Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[ENTRY, attrgrp0, attrgrp1, ...]``
+
+The ``ENTRY`` record (code 2) contains a variable number of values describing a
+unique set of function parameter attributes. Each *attrgrp* value is used as a
+key with which to look up an entry in the the attribute group table described
+in the ``PARAMATTR_GROUP_BLOCK`` block.
+
+.. _PARAMATTR_CODE_ENTRY_OLD:
+
+PARAMATTR_CODE_ENTRY_OLD Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. note::
+  This is a legacy encoding for attributes, produced by LLVM versions 3.2 and
+  earlier. It is guaranteed to be understood by the current LLVM version, as
+  specified in the :ref:`IR backwards compatibility` policy.
+
+``[ENTRY, paramidx0, attr0, paramidx1, attr1...]``
+
+The ``ENTRY`` record (code 1) contains an even number of values describing a
+unique set of function parameter attributes. Each *paramidx* value indicates
+which set of attributes is represented, with 0 representing the return value
+attributes, 0xFFFFFFFF representing function attributes, and other values
+representing 1-based function parameters. Each *attr* value is a bitmap with the
+following interpretation:
+
+* bit 0: ``zeroext``
+* bit 1: ``signext``
+* bit 2: ``noreturn``
+* bit 3: ``inreg``
+* bit 4: ``sret``
+* bit 5: ``nounwind``
+* bit 6: ``noalias``
+* bit 7: ``byval``
+* bit 8: ``nest``
+* bit 9: ``readnone``
+* bit 10: ``readonly``
+* bit 11: ``noinline``
+* bit 12: ``alwaysinline``
+* bit 13: ``optsize``
+* bit 14: ``ssp``
+* bit 15: ``sspreq``
+* bits 16-31: ``align n``
+* bit 32: ``nocapture``
+* bit 33: ``noredzone``
+* bit 34: ``noimplicitfloat``
+* bit 35: ``naked``
+* bit 36: ``inlinehint``
+* bits 37-39: ``alignstack n``, represented as the logarithm
+  base 2 of the requested alignment, plus 1
+
+.. _PARAMATTR_GROUP_BLOCK:
+
+PARAMATTR_GROUP_BLOCK Contents
+------------------------------
+
+The ``PARAMATTR_GROUP_BLOCK`` block (id 10) contains a table of entries
+describing the attribute groups present in the module. These entries can be
+referenced within ``PARAMATTR_CODE_ENTRY`` entries.
+
+.. _PARAMATTR_GRP_CODE_ENTRY:
+
+PARAMATTR_GRP_CODE_ENTRY Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[ENTRY, grpid, paramidx, attr0, attr1, ...]``
+
+The ``ENTRY`` record (code 3) contains *grpid* and *paramidx* values, followed
+by a variable number of values describing a unique group of attributes. The
+*grpid* value is a unique key for the attribute group, which can be referenced
+within ``PARAMATTR_CODE_ENTRY`` entries. The *paramidx* value indicates which
+set of attributes is represented, with 0 representing the return value
+attributes, 0xFFFFFFFF representing function attributes, and other values
+representing 1-based function parameters.
+
+Each *attr* is itself represented as a variable number of values:
+
+``kind, key [, ...], [value [, ...]]``
+
+Each attribute is either a well-known LLVM attribute (possibly with an integer
+value associated with it), or an arbitrary string (possibly with an arbitrary
+string value associated with it). The *kind* value is an integer code
+distinguishing between these possibilities:
+
+* code 0: well-known attribute
+* code 1: well-known attribute with an integer value
+* code 3: string attribute
+* code 4: string attribute with a string value
+
+For well-known attributes (code 0 or 1), the *key* value is an integer code
+identifying the attribute. For attributes with an integer argument (code 1),
+the *value* value indicates the argument.
+
+For string attributes (code 3 or 4), the *key* value is actually a variable
+number of values representing the bytes of a null-terminated string. For
+attributes with a string argument (code 4), the *value* value is similarly a
+variable number of values representing the bytes of a null-terminated string.
+
+The integer codes are mapped to well-known attributes as follows.
+
+* code 1: ``align(<n>)``
+* code 2: ``alwaysinline``
+* code 3: ``byval``
+* code 4: ``inlinehint``
+* code 5: ``inreg``
+* code 6: ``minsize``
+* code 7: ``naked``
+* code 8: ``nest``
+* code 9: ``noalias``
+* code 10: ``nobuiltin``
+* code 11: ``nocapture``
+* code 12: ``noduplicates``
+* code 13: ``noimplicitfloat``
+* code 14: ``noinline``
+* code 15: ``nonlazybind``
+* code 16: ``noredzone``
+* code 17: ``noreturn``
+* code 18: ``nounwind``
+* code 19: ``optsize``
+* code 20: ``readnone``
+* code 21: ``readonly``
+* code 22: ``returned``
+* code 23: ``returns_twice``
+* code 24: ``signext``
+* code 25: ``alignstack(<n>)``
+* code 26: ``ssp``
+* code 27: ``sspreq``
+* code 28: ``sspstrong``
+* code 29: ``sret``
+* code 30: ``sanitize_address``
+* code 31: ``sanitize_thread``
+* code 32: ``sanitize_memory``
+* code 33: ``uwtable``
+* code 34: ``zeroext``
+* code 35: ``builtin``
+* code 36: ``cold``
+* code 37: ``optnone``
+* code 38: ``inalloca``
+* code 39: ``nonnull``
+* code 40: ``jumptable``
+* code 41: ``dereferenceable(<n>)``
+* code 42: ``dereferenceable_or_null(<n>)``
+* code 43: ``convergent``
+* code 44: ``safestack``
+* code 45: ``argmemonly``
+* code 46: ``swiftself``
+* code 47: ``swifterror``
+* code 48: ``norecurse``
+* code 49: ``inaccessiblememonly``
+* code 50: ``inaccessiblememonly_or_argmemonly``
+* code 51: ``allocsize(<EltSizeParam>[, <NumEltsParam>])``
+* code 52: ``writeonly``
+
+.. note::
+  The ``allocsize`` attribute has a special encoding for its arguments. Its two
+  arguments, which are 32-bit integers, are packed into one 64-bit integer value
+  (i.e. ``(EltSizeParam << 32) | NumEltsParam``), with ``NumEltsParam`` taking on
+  the sentinel value -1 if it is not specified.
+
+.. _TYPE_BLOCK:
+
+TYPE_BLOCK Contents
+-------------------
+
+The ``TYPE_BLOCK`` block (id 17) contains records which constitute a table of
+type operator entries used to represent types referenced within an LLVM
+module. Each record (with the exception of `NUMENTRY`_) generates a single type
+table entry, which may be referenced by 0-based index from instructions,
+constants, metadata, type symbol table entries, or other type operator records.
+
+Entries within ``TYPE_BLOCK`` are constructed to ensure that each entry is
+unique (i.e., no two indices represent structurally equivalent types).
+
+.. _TYPE_CODE_NUMENTRY:
+.. _NUMENTRY:
+
+TYPE_CODE_NUMENTRY Record
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[NUMENTRY, numentries]``
+
+The ``NUMENTRY`` record (code 1) contains a single value which indicates the
+total number of type code entries in the type table of the module. If present,
+``NUMENTRY`` should be the first record in the block.
+
+TYPE_CODE_VOID Record
+^^^^^^^^^^^^^^^^^^^^^
+
+``[VOID]``
+
+The ``VOID`` record (code 2) adds a ``void`` type to the type table.
+
+TYPE_CODE_HALF Record
+^^^^^^^^^^^^^^^^^^^^^
+
+``[HALF]``
+
+The ``HALF`` record (code 10) adds a ``half`` (16-bit floating point) type to
+the type table.
+
+TYPE_CODE_FLOAT Record
+^^^^^^^^^^^^^^^^^^^^^^
+
+``[FLOAT]``
+
+The ``FLOAT`` record (code 3) adds a ``float`` (32-bit floating point) type to
+the type table.
+
+TYPE_CODE_DOUBLE Record
+^^^^^^^^^^^^^^^^^^^^^^^
+
+``[DOUBLE]``
+
+The ``DOUBLE`` record (code 4) adds a ``double`` (64-bit floating point) type to
+the type table.
+
+TYPE_CODE_LABEL Record
+^^^^^^^^^^^^^^^^^^^^^^
+
+``[LABEL]``
+
+The ``LABEL`` record (code 5) adds a ``label`` type to the type table.
+
+TYPE_CODE_OPAQUE Record
+^^^^^^^^^^^^^^^^^^^^^^^
+
+``[OPAQUE]``
+
+The ``OPAQUE`` record (code 6) adds an ``opaque`` type to the type table, with
+a name defined by a previously encountered ``STRUCT_NAME`` record. Note that
+distinct ``opaque`` types are not unified.
+
+TYPE_CODE_INTEGER Record
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[INTEGER, width]``
+
+The ``INTEGER`` record (code 7) adds an integer type to the type table. The
+single *width* field indicates the width of the integer type.
+
+TYPE_CODE_POINTER Record
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[POINTER, pointee type, address space]``
+
+The ``POINTER`` record (code 8) adds a pointer type to the type table. The
+operand fields are
+
+* *pointee type*: The type index of the pointed-to type
+
+* *address space*: If supplied, the target-specific numbered address space where
+  the pointed-to object resides. Otherwise, the default address space is zero.
+
+TYPE_CODE_FUNCTION_OLD Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. note::
+  This is a legacy encoding for functions, produced by LLVM versions 3.0 and
+  earlier. It is guaranteed to be understood by the current LLVM version, as
+  specified in the :ref:`IR backwards compatibility` policy.
+
+``[FUNCTION_OLD, vararg, ignored, retty, ...paramty... ]``
+
+The ``FUNCTION_OLD`` record (code 9) adds a function type to the type table.
+The operand fields are
+
+* *vararg*: Non-zero if the type represents a varargs function
+
+* *ignored*: This value field is present for backward compatibility only, and is
+  ignored
+
+* *retty*: The type index of the function's return type
+
+* *paramty*: Zero or more type indices representing the parameter types of the
+  function
+
+TYPE_CODE_ARRAY Record
+^^^^^^^^^^^^^^^^^^^^^^
+
+``[ARRAY, numelts, eltty]``
+
+The ``ARRAY`` record (code 11) adds an array type to the type table.  The
+operand fields are
+
+* *numelts*: The number of elements in arrays of this type
+
+* *eltty*: The type index of the array element type
+
+TYPE_CODE_VECTOR Record
+^^^^^^^^^^^^^^^^^^^^^^^
+
+``[VECTOR, numelts, eltty]``
+
+The ``VECTOR`` record (code 12) adds a vector type to the type table.  The
+operand fields are
+
+* *numelts*: The number of elements in vectors of this type
+
+* *eltty*: The type index of the vector element type
+
+TYPE_CODE_X86_FP80 Record
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[X86_FP80]``
+
+The ``X86_FP80`` record (code 13) adds an ``x86_fp80`` (80-bit floating point)
+type to the type table.
+
+TYPE_CODE_FP128 Record
+^^^^^^^^^^^^^^^^^^^^^^
+
+``[FP128]``
+
+The ``FP128`` record (code 14) adds an ``fp128`` (128-bit floating point) type
+to the type table.
+
+TYPE_CODE_PPC_FP128 Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[PPC_FP128]``
+
+The ``PPC_FP128`` record (code 15) adds a ``ppc_fp128`` (128-bit floating point)
+type to the type table.
+
+TYPE_CODE_METADATA Record
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[METADATA]``
+
+The ``METADATA`` record (code 16) adds a ``metadata`` type to the type table.
+
+TYPE_CODE_X86_MMX Record
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[X86_MMX]``
+
+The ``X86_MMX`` record (code 17) adds an ``x86_mmx`` type to the type table.
+
+TYPE_CODE_STRUCT_ANON Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[STRUCT_ANON, ispacked, ...eltty...]``
+
+The ``STRUCT_ANON`` record (code 18) adds a literal struct type to the type
+table. The operand fields are
+
+* *ispacked*: Non-zero if the type represents a packed structure
+
+* *eltty*: Zero or more type indices representing the element types of the
+  structure
+
+TYPE_CODE_STRUCT_NAME Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[STRUCT_NAME, ...string...]``
+
+The ``STRUCT_NAME`` record (code 19) contains a variable number of values
+representing the bytes of a struct name. The next ``OPAQUE`` or
+``STRUCT_NAMED`` record will use this name.
+
+TYPE_CODE_STRUCT_NAMED Record
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[STRUCT_NAMED, ispacked, ...eltty...]``
+
+The ``STRUCT_NAMED`` record (code 20) adds an identified struct type to the
+type table, with a name defined by a previously encountered ``STRUCT_NAME``
+record. The operand fields are
+
+* *ispacked*: Non-zero if the type represents a packed structure
+
+* *eltty*: Zero or more type indices representing the element types of the
+  structure
+
+TYPE_CODE_FUNCTION Record
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``[FUNCTION, vararg, retty, ...paramty... ]``
+
+The ``FUNCTION`` record (code 21) adds a function type to the type table. The
+operand fields are
+
+* *vararg*: Non-zero if the type represents a varargs function
+
+* *retty*: The type index of the function's return type
+
+* *paramty*: Zero or more type indices representing the parameter types of the
+  function
+
+.. _CONSTANTS_BLOCK:
+
+CONSTANTS_BLOCK Contents
+------------------------
+
+The ``CONSTANTS_BLOCK`` block (id 11) ...
+
+.. _FUNCTION_BLOCK:
+
+FUNCTION_BLOCK Contents
+-----------------------
+
+The ``FUNCTION_BLOCK`` block (id 12) ...
+
+In addition to the record types described below, a ``FUNCTION_BLOCK`` block may
+contain the following sub-blocks:
+
+* `CONSTANTS_BLOCK`_
+* `VALUE_SYMTAB_BLOCK`_
+* `METADATA_ATTACHMENT`_
+
+.. _VALUE_SYMTAB_BLOCK:
+
+VALUE_SYMTAB_BLOCK Contents
+---------------------------
+
+The ``VALUE_SYMTAB_BLOCK`` block (id 14) ...
+
+.. _METADATA_BLOCK:
+
+METADATA_BLOCK Contents
+-----------------------
+
+The ``METADATA_BLOCK`` block (id 15) ...
+
+.. _METADATA_ATTACHMENT:
+
+METADATA_ATTACHMENT Contents
+----------------------------
+
+The ``METADATA_ATTACHMENT`` block (id 16) ...

Added: www-releases/trunk/4.0.1/docs/_sources/BlockFrequencyTerminology.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/BlockFrequencyTerminology.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/BlockFrequencyTerminology.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/BlockFrequencyTerminology.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,130 @@
+================================
+LLVM Block Frequency Terminology
+================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+Block Frequency is a metric for estimating the relative frequency of different
+basic blocks.  This document describes the terminology that the
+``BlockFrequencyInfo`` and ``MachineBlockFrequencyInfo`` analysis passes use.
+
+Branch Probability
+==================
+
+Blocks with multiple successors have probabilities associated with each
+outgoing edge.  These are called branch probabilities.  For a given block, the
+sum of its outgoing branch probabilities should be 1.0.
+
+Branch Weight
+=============
+
+Rather than storing fractions on each edge, we store an integer weight.
+Weights are relative to the other edges of a given predecessor block.  The
+branch probability associated with a given edge is its own weight divided by
+the sum of the weights on the predecessor's outgoing edges.
+
+For example, consider this IR:
+
+.. code-block:: llvm
+
+   define void @foo() {
+       ; ...
+       A:
+           br i1 %cond, label %B, label %C, !prof !0
+       ; ...
+   }
+   !0 = metadata !{metadata !"branch_weights", i32 7, i32 8}
+
+and this simple graph representation::
+
+   A -> B  (edge-weight: 7)
+   A -> C  (edge-weight: 8)
+
+The probability of branching from block A to block B is 7/15, and the
+probability of branching from block A to block C is 8/15.
+
+See :doc:`BranchWeightMetadata` for details about the branch weight IR
+representation.
+
+Block Frequency
+===============
+
+Block frequency is a relative metric that represents the number of times a
+block executes.  The ratio of a block frequency to the entry block frequency is
+the expected number of times the block will execute per entry to the function.
+
+Block frequency is the main output of the ``BlockFrequencyInfo`` and
+``MachineBlockFrequencyInfo`` analysis passes.
+
+Implementation: a series of DAGs
+================================
+
+The implementation of the block frequency calculation analyses each loop,
+bottom-up, ignoring backedges; i.e., as a DAG.  After each loop is processed,
+it's packaged up to act as a pseudo-node in its parent loop's (or the
+function's) DAG analysis.
+
+Block Mass
+==========
+
+For each DAG, the entry node is assigned a mass of ``UINT64_MAX`` and mass is
+distributed to successors according to branch weights.  Block Mass uses a
+fixed-point representation where ``UINT64_MAX`` represents ``1.0`` and ``0``
+represents a number just above ``0.0``.
+
+After mass is fully distributed, in any cut of the DAG that separates the exit
+nodes from the entry node, the sum of the block masses of the nodes succeeded
+by a cut edge should equal ``UINT64_MAX``.  In other words, mass is conserved
+as it "falls" through the DAG.
+
+If a function's basic block graph is a DAG, then block masses are valid block
+frequencies.  This works poorly in practise though, since downstream users rely
+on adding block frequencies together without hitting the maximum.
+
+Loop Scale
+==========
+
+Loop scale is a metric that indicates how many times a loop iterates per entry.
+As mass is distributed through the loop's DAG, the (otherwise ignored) backedge
+mass is collected.  This backedge mass is used to compute the exit frequency,
+and thus the loop scale.
+
+Implementation: Getting from mass and scale to frequency
+========================================================
+
+After analysing the complete series of DAGs, each block has a mass (local to
+its containing loop, if any), and each loop pseudo-node has a loop scale and
+its own mass (from its parent's DAG).
+
+We can get an initial frequency assignment (with entry frequency of 1.0) by
+multiplying these masses and loop scales together.  A given block's frequency
+is the product of its mass, the mass of containing loops' pseudo nodes, and the
+containing loops' loop scales.
+
+Since downstream users need integers (not floating point), this initial
+frequency assignment is shifted as necessary into the range of ``uint64_t``.
+
+Block Bias
+==========
+
+Block bias is a proposed *absolute* metric to indicate a bias toward or away
+from a given block during a function's execution.  The idea is that bias can be
+used in isolation to indicate whether a block is relatively hot or cold, or to
+compare two blocks to indicate whether one is hotter or colder than the other.
+
+The proposed calculation involves calculating a *reference* block frequency,
+where:
+
+* every branch weight is assumed to be 1 (i.e., every branch probability
+  distribution is even) and
+
+* loop scales are ignored.
+
+This reference frequency represents what the block frequency would be in an
+unbiased graph.
+
+The bias is the ratio of the block frequency to this reference block frequency.

Added: www-releases/trunk/4.0.1/docs/_sources/BranchWeightMetadata.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/BranchWeightMetadata.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/BranchWeightMetadata.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/BranchWeightMetadata.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,140 @@
+===========================
+LLVM Branch Weight Metadata
+===========================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+Branch Weight Metadata represents branch weights as its likeliness to be taken
+(see :doc:`BlockFrequencyTerminology`). Metadata is assigned to the
+``TerminatorInst`` as a ``MDNode`` of the ``MD_prof`` kind. The first operator
+is always a ``MDString`` node with the string "branch_weights".  Number of
+operators depends on the terminator type.
+
+Branch weights might be fetch from the profiling file, or generated based on
+`__builtin_expect`_ instruction.
+
+All weights are represented as an unsigned 32-bit values, where higher value
+indicates greater chance to be taken.
+
+Supported Instructions
+======================
+
+``BranchInst``
+^^^^^^^^^^^^^^
+
+Metadata is only assigned to the conditional branches. There are two extra
+operands for the true and the false branch.
+
+.. code-block:: none
+
+  !0 = metadata !{
+    metadata !"branch_weights",
+    i32 <TRUE_BRANCH_WEIGHT>,
+    i32 <FALSE_BRANCH_WEIGHT>
+  }
+
+``SwitchInst``
+^^^^^^^^^^^^^^
+
+Branch weights are assigned to every case (including the ``default`` case which
+is always case #0).
+
+.. code-block:: none
+
+  !0 = metadata !{
+    metadata !"branch_weights",
+    i32 <DEFAULT_BRANCH_WEIGHT>
+    [ , i32 <CASE_BRANCH_WEIGHT> ... ]
+  }
+
+``IndirectBrInst``
+^^^^^^^^^^^^^^^^^^
+
+Branch weights are assigned to every destination.
+
+.. code-block:: none
+
+  !0 = metadata !{
+    metadata !"branch_weights",
+    i32 <LABEL_BRANCH_WEIGHT>
+    [ , i32 <LABEL_BRANCH_WEIGHT> ... ]
+  }
+
+Other
+^^^^^
+
+Other terminator instructions are not allowed to contain Branch Weight Metadata.
+
+.. _\__builtin_expect:
+
+Built-in ``expect`` Instructions
+================================
+
+``__builtin_expect(long exp, long c)`` instruction provides branch prediction
+information. The return value is the value of ``exp``.
+
+It is especially useful in conditional statements. Currently Clang supports two
+conditional statements:
+
+``if`` statement
+^^^^^^^^^^^^^^^^
+
+The ``exp`` parameter is the condition. The ``c`` parameter is the expected
+comparison value. If it is equal to 1 (true), the condition is likely to be
+true, in other case condition is likely to be false. For example:
+
+.. code-block:: c++
+
+  if (__builtin_expect(x > 0, 1)) {
+    // This block is likely to be taken.
+  }
+
+``switch`` statement
+^^^^^^^^^^^^^^^^^^^^
+
+The ``exp`` parameter is the value. The ``c`` parameter is the expected
+value. If the expected value doesn't show on the cases list, the ``default``
+case is assumed to be likely taken.
+
+.. code-block:: c++
+
+  switch (__builtin_expect(x, 5)) {
+  default: break;
+  case 0:  // ...
+  case 3:  // ...
+  case 5:  // This case is likely to be taken.
+  }
+
+CFG Modifications
+=================
+
+Branch Weight Metatada is not proof against CFG changes. If terminator operands'
+are changed some action should be taken. In other case some misoptimizations may
+occur due to incorrect branch prediction information.
+
+Function Entry Counts
+=====================
+
+To allow comparing different functions during inter-procedural analysis and
+optimization, ``MD_prof`` nodes can also be assigned to a function definition.
+The first operand is a string indicating the name of the associated counter.
+
+Currently, one counter is supported: "function_entry_count". This is a 64-bit
+counter that indicates the number of times that this function was invoked (in
+the case of instrumentation-based profiles). In the case of sampling-based
+profiles, this counter is an approximation of how many times the function was
+invoked.
+
+For example, in the code below, the instrumentation for function foo()
+indicates that it was called 2,590 times at runtime.
+
+.. code-block:: llvm
+
+  define i32 @foo() !prof !1 {
+    ret i32 0
+  }
+  !1 = !{!"function_entry_count", i64 2590}

Added: www-releases/trunk/4.0.1/docs/_sources/Bugpoint.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/Bugpoint.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/Bugpoint.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/Bugpoint.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,216 @@
+====================================
+LLVM bugpoint tool: design and usage
+====================================
+
+.. contents::
+   :local:
+
+Description
+===========
+
+``bugpoint`` narrows down the source of problems in LLVM tools and passes.  It
+can be used to debug three types of failures: optimizer crashes, miscompilations
+by optimizers, or bad native code generation (including problems in the static
+and JIT compilers).  It aims to reduce large test cases to small, useful ones.
+For example, if ``opt`` crashes while optimizing a file, it will identify the
+optimization (or combination of optimizations) that causes the crash, and reduce
+the file down to a small example which triggers the crash.
+
+For detailed case scenarios, such as debugging ``opt``, or one of the LLVM code
+generators, see :doc:`HowToSubmitABug`.
+
+Design Philosophy
+=================
+
+``bugpoint`` is designed to be a useful tool without requiring any hooks into
+the LLVM infrastructure at all.  It works with any and all LLVM passes and code
+generators, and does not need to "know" how they work.  Because of this, it may
+appear to do stupid things or miss obvious simplifications.  ``bugpoint`` is
+also designed to trade off programmer time for computer time in the
+compiler-debugging process; consequently, it may take a long period of
+(unattended) time to reduce a test case, but we feel it is still worth it. Note
+that ``bugpoint`` is generally very quick unless debugging a miscompilation
+where each test of the program (which requires executing it) takes a long time.
+
+Automatic Debugger Selection
+----------------------------
+
+``bugpoint`` reads each ``.bc`` or ``.ll`` file specified on the command line
+and links them together into a single module, called the test program.  If any
+LLVM passes are specified on the command line, it runs these passes on the test
+program.  If any of the passes crash, or if they produce malformed output (which
+causes the verifier to abort), ``bugpoint`` starts the `crash debugger`_.
+
+Otherwise, if the ``-output`` option was not specified, ``bugpoint`` runs the
+test program with the "safe" backend (which is assumed to generate good code) to
+generate a reference output.  Once ``bugpoint`` has a reference output for the
+test program, it tries executing it with the selected code generator.  If the
+selected code generator crashes, ``bugpoint`` starts the `crash debugger`_ on
+the code generator.  Otherwise, if the resulting output differs from the
+reference output, it assumes the difference resulted from a code generator
+failure, and starts the `code generator debugger`_.
+
+Finally, if the output of the selected code generator matches the reference
+output, ``bugpoint`` runs the test program after all of the LLVM passes have
+been applied to it.  If its output differs from the reference output, it assumes
+the difference resulted from a failure in one of the LLVM passes, and enters the
+`miscompilation debugger`_.  Otherwise, there is no problem ``bugpoint`` can
+debug.
+
+.. _crash debugger:
+
+Crash debugger
+--------------
+
+If an optimizer or code generator crashes, ``bugpoint`` will try as hard as it
+can to reduce the list of passes (for optimizer crashes) and the size of the
+test program.  First, ``bugpoint`` figures out which combination of optimizer
+passes triggers the bug. This is useful when debugging a problem exposed by
+``opt``, for example, because it runs over 38 passes.
+
+Next, ``bugpoint`` tries removing functions from the test program, to reduce its
+size.  Usually it is able to reduce a test program to a single function, when
+debugging intraprocedural optimizations.  Once the number of functions has been
+reduced, it attempts to delete various edges in the control flow graph, to
+reduce the size of the function as much as possible.  Finally, ``bugpoint``
+deletes any individual LLVM instructions whose absence does not eliminate the
+failure.  At the end, ``bugpoint`` should tell you what passes crash, give you a
+bitcode file, and give you instructions on how to reproduce the failure with
+``opt`` or ``llc``.
+
+.. _code generator debugger:
+
+Code generator debugger
+-----------------------
+
+The code generator debugger attempts to narrow down the amount of code that is
+being miscompiled by the selected code generator.  To do this, it takes the test
+program and partitions it into two pieces: one piece which it compiles with the
+"safe" backend (into a shared object), and one piece which it runs with either
+the JIT or the static LLC compiler.  It uses several techniques to reduce the
+amount of code pushed through the LLVM code generator, to reduce the potential
+scope of the problem.  After it is finished, it emits two bitcode files (called
+"test" [to be compiled with the code generator] and "safe" [to be compiled with
+the "safe" backend], respectively), and instructions for reproducing the
+problem.  The code generator debugger assumes that the "safe" backend produces
+good code.
+
+.. _miscompilation debugger:
+
+Miscompilation debugger
+-----------------------
+
+The miscompilation debugger works similarly to the code generator debugger.  It
+works by splitting the test program into two pieces, running the optimizations
+specified on one piece, linking the two pieces back together, and then executing
+the result.  It attempts to narrow down the list of passes to the one (or few)
+which are causing the miscompilation, then reduce the portion of the test
+program which is being miscompiled.  The miscompilation debugger assumes that
+the selected code generator is working properly.
+
+Advice for using bugpoint
+=========================
+
+``bugpoint`` can be a remarkably useful tool, but it sometimes works in
+non-obvious ways.  Here are some hints and tips:
+
+* In the code generator and miscompilation debuggers, ``bugpoint`` only works
+  with programs that have deterministic output.  Thus, if the program outputs
+  ``argv[0]``, the date, time, or any other "random" data, ``bugpoint`` may
+  misinterpret differences in these data, when output, as the result of a
+  miscompilation.  Programs should be temporarily modified to disable outputs
+  that are likely to vary from run to run.
+
+* In the code generator and miscompilation debuggers, debugging will go faster
+  if you manually modify the program or its inputs to reduce the runtime, but
+  still exhibit the problem.
+
+* ``bugpoint`` is extremely useful when working on a new optimization: it helps
+  track down regressions quickly.  To avoid having to relink ``bugpoint`` every
+  time you change your optimization however, have ``bugpoint`` dynamically load
+  your optimization with the ``-load`` option.
+
+* ``bugpoint`` can generate a lot of output and run for a long period of time.
+  It is often useful to capture the output of the program to file.  For example,
+  in the C shell, you can run:
+
+  .. code-block:: console
+
+    $ bugpoint  ... |& tee bugpoint.log
+
+  to get a copy of ``bugpoint``'s output in the file ``bugpoint.log``, as well
+  as on your terminal.
+
+* ``bugpoint`` cannot debug problems with the LLVM linker. If ``bugpoint``
+  crashes before you see its "All input ok" message, you might try ``llvm-link
+  -v`` on the same set of input files. If that also crashes, you may be
+  experiencing a linker bug.
+
+* ``bugpoint`` is useful for proactively finding bugs in LLVM.  Invoking
+  ``bugpoint`` with the ``-find-bugs`` option will cause the list of specified
+  optimizations to be randomized and applied to the program. This process will
+  repeat until a bug is found or the user kills ``bugpoint``.
+
+What to do when bugpoint isn't enough
+=====================================
+	
+Sometimes, ``bugpoint`` is not enough. In particular, InstCombine and
+TargetLowering both have visitor structured code with lots of potential
+transformations.  If the process of using bugpoint has left you with still too
+much code to figure out and the problem seems to be in instcombine, the
+following steps may help.  These same techniques are useful with TargetLowering
+as well.
+
+Turn on ``-debug-only=instcombine`` and see which transformations within
+instcombine are firing by selecting out lines with "``IC``" in them.
+
+At this point, you have a decision to make.  Is the number of transformations
+small enough to step through them using a debugger?  If so, then try that.
+
+If there are too many transformations, then a source modification approach may
+be helpful.  In this approach, you can modify the source code of instcombine to
+disable just those transformations that are being performed on your test input
+and perform a binary search over the set of transformations.  One set of places
+to modify are the "``visit*``" methods of ``InstCombiner`` (*e.g.*
+``visitICmpInst``) by adding a "``return false``" as the first line of the
+method.
+
+If that still doesn't remove enough, then change the caller of
+``InstCombiner::DoOneIteration``, ``InstCombiner::runOnFunction`` to limit the
+number of iterations.
+
+You may also find it useful to use "``-stats``" now to see what parts of
+instcombine are firing.  This can guide where to put additional reporting code.
+
+At this point, if the amount of transformations is still too large, then
+inserting code to limit whether or not to execute the body of the code in the
+visit function can be helpful.  Add a static counter which is incremented on
+every invocation of the function.  Then add code which simply returns false on
+desired ranges.  For example:
+
+.. code-block:: c++
+
+
+  static int calledCount = 0;
+  calledCount++;
+  DEBUG(if (calledCount < 212) return false);
+  DEBUG(if (calledCount > 217) return false);
+  DEBUG(if (calledCount == 213) return false);
+  DEBUG(if (calledCount == 214) return false);
+  DEBUG(if (calledCount == 215) return false);
+  DEBUG(if (calledCount == 216) return false);
+  DEBUG(dbgs() << "visitXOR calledCount: " << calledCount << "\n");
+  DEBUG(dbgs() << "I: "; I->dump());
+
+could be added to ``visitXOR`` to limit ``visitXor`` to being applied only to
+calls 212 and 217. This is from an actual test case and raises an important
+point---a simple binary search may not be sufficient, as transformations that
+interact may require isolating more than one call.  In TargetLowering, use
+``return SDNode();`` instead of ``return false;``.
+
+Now that the number of transformations is down to a manageable number, try
+examining the output to see if you can figure out which transformations are
+being done.  If that can be figured out, then do the usual debugging.  If which
+code corresponds to the transformation being performed isn't obvious, set a
+breakpoint after the call count based disabling and step through the code.
+Alternatively, you can use "``printf``" style debugging to report waypoints.

Added: www-releases/trunk/4.0.1/docs/_sources/CMake.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CMake.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CMake.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CMake.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,763 @@
+========================
+Building LLVM with CMake
+========================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+`CMake <http://www.cmake.org/>`_ is a cross-platform build-generator tool. CMake
+does not build the project, it generates the files needed by your build tool
+(GNU make, Visual Studio, etc.) for building LLVM.
+
+If **you are a new contributor**, please start with the :doc:`GettingStarted` 
+page.  This page is geared for existing contributors moving from the 
+legacy configure/make system.
+
+If you are really anxious about getting a functional LLVM build, go to the
+`Quick start`_ section. If you are a CMake novice, start with `Basic CMake usage`_
+and then go back to the `Quick start`_ section once you know what you are doing. The
+`Options and variables`_ section is a reference for customizing your build. If
+you already have experience with CMake, this is the recommended starting point.
+
+This page is geared towards users of the LLVM CMake build. If you're looking for
+information about modifying the LLVM CMake build system you may want to see the
+:doc:`CMakePrimer` page. It has a basic overview of the CMake language.
+
+.. _Quick start:
+
+Quick start
+===========
+
+We use here the command-line, non-interactive CMake interface.
+
+#. `Download <http://www.cmake.org/cmake/resources/software.html>`_ and install
+   CMake. Version 3.4.3 is the minimum required.
+
+#. Open a shell. Your development tools must be reachable from this shell
+   through the PATH environment variable.
+
+#. Create a build directory. Building LLVM in the source
+   directory is not supported. cd to this directory:
+
+   .. code-block:: console
+
+     $ mkdir mybuilddir
+     $ cd mybuilddir
+
+#. Execute this command in the shell replacing `path/to/llvm/source/root` with
+   the path to the root of your LLVM source tree:
+
+   .. code-block:: console
+
+     $ cmake path/to/llvm/source/root
+
+   CMake will detect your development environment, perform a series of tests, and
+   generate the files required for building LLVM. CMake will use default values
+   for all build parameters. See the `Options and variables`_ section for
+   a list of build parameters that you can modify.
+
+   This can fail if CMake can't detect your toolset, or if it thinks that the
+   environment is not sane enough. In this case, make sure that the toolset that
+   you intend to use is the only one reachable from the shell, and that the shell
+   itself is the correct one for your development environment. CMake will refuse
+   to build MinGW makefiles if you have a POSIX shell reachable through the PATH
+   environment variable, for instance. You can force CMake to use a given build
+   tool; for instructions, see the `Usage`_ section, below.
+
+#. After CMake has finished running, proceed to use IDE project files, or start
+   the build from the build directory:
+
+   .. code-block:: console
+
+     $ cmake --build .
+
+   The ``--build`` option tells ``cmake`` to invoke the underlying build
+   tool (``make``, ``ninja``, ``xcodebuild``, ``msbuild``, etc.)
+
+   The underlying build tool can be invoked directly, of course, but
+   the ``--build`` option is portable.
+
+#. After LLVM has finished building, install it from the build directory:
+
+   .. code-block:: console
+
+     $ cmake --build . --target install
+
+   The ``--target`` option with ``install`` parameter in addition to
+   the ``--build`` option tells ``cmake`` to build the ``install`` target.
+
+   It is possible to set a different install prefix at installation time
+   by invoking the ``cmake_install.cmake`` script generated in the
+   build directory:
+
+   .. code-block:: console
+
+     $ cmake -DCMAKE_INSTALL_PREFIX=/tmp/llvm -P cmake_install.cmake
+
+.. _Basic CMake usage:
+.. _Usage:
+
+Basic CMake usage
+=================
+
+This section explains basic aspects of CMake
+which you may need in your day-to-day usage.
+
+CMake comes with extensive documentation, in the form of html files, and as
+online help accessible via the ``cmake`` executable itself. Execute ``cmake
+--help`` for further help options.
+
+CMake allows you to specify a build tool (e.g., GNU make, Visual Studio,
+or Xcode). If not specified on the command line, CMake tries to guess which
+build tool to use, based on your environment. Once it has identified your
+build tool, CMake uses the corresponding *Generator* to create files for your
+build tool (e.g., Makefiles or Visual Studio or Xcode project files). You can
+explicitly specify the generator with the command line option ``-G "Name of the
+generator"``. To see a list of the available generators on your system, execute
+
+.. code-block:: console
+
+  $ cmake --help
+
+This will list the generator names at the end of the help text.
+
+Generators' names are case-sensitive, and may contain spaces. For this reason,
+you should enter them exactly as they are listed in the ``cmake --help``
+output, in quotes. For example, to generate project files specifically for
+Visual Studio 12, you can execute:
+
+.. code-block:: console
+
+  $ cmake -G "Visual Studio 12" path/to/llvm/source/root
+
+For a given development platform there can be more than one adequate
+generator. If you use Visual Studio, "NMake Makefiles" is a generator you can use
+for building with NMake. By default, CMake chooses the most specific generator
+supported by your development environment. If you want an alternative generator,
+you must tell this to CMake with the ``-G`` option.
+
+.. todo::
+
+  Explain variables and cache. Move explanation here from #options section.
+
+.. _Options and variables:
+
+Options and variables
+=====================
+
+Variables customize how the build will be generated. Options are boolean
+variables, with possible values ON/OFF. Options and variables are defined on the
+CMake command line like this:
+
+.. code-block:: console
+
+  $ cmake -DVARIABLE=value path/to/llvm/source
+
+You can set a variable after the initial CMake invocation to change its
+value. You can also undefine a variable:
+
+.. code-block:: console
+
+  $ cmake -UVARIABLE path/to/llvm/source
+
+Variables are stored in the CMake cache. This is a file named ``CMakeCache.txt``
+stored at the root of your build directory that is generated by ``cmake``.
+Editing it yourself is not recommended.
+
+Variables are listed in the CMake cache and later in this document with
+the variable name and type separated by a colon. You can also specify the
+variable and type on the CMake command line:
+
+.. code-block:: console
+
+  $ cmake -DVARIABLE:TYPE=value path/to/llvm/source
+
+Frequently-used CMake variables
+-------------------------------
+
+Here are some of the CMake variables that are used often, along with a
+brief explanation and LLVM-specific notes. For full documentation, consult the
+CMake manual, or execute ``cmake --help-variable VARIABLE_NAME``.
+
+**CMAKE_BUILD_TYPE**:STRING
+  Sets the build type for ``make``-based generators. Possible values are
+  Release, Debug, RelWithDebInfo and MinSizeRel. If you are using an IDE such as
+  Visual Studio, you should use the IDE settings to set the build type.
+  Be aware that Release and RelWithDebInfo are not using the same optimization
+  level on most platform.
+
+**CMAKE_INSTALL_PREFIX**:PATH
+  Path where LLVM will be installed if "make install" is invoked or the
+  "install" target is built.
+
+**LLVM_LIBDIR_SUFFIX**:STRING
+  Extra suffix to append to the directory where libraries are to be
+  installed. On a 64-bit architecture, one could use ``-DLLVM_LIBDIR_SUFFIX=64``
+  to install libraries to ``/usr/lib64``.
+
+**CMAKE_C_FLAGS**:STRING
+  Extra flags to use when compiling C source files.
+
+**CMAKE_CXX_FLAGS**:STRING
+  Extra flags to use when compiling C++ source files.
+
+.. _LLVM-specific variables:
+
+LLVM-specific variables
+-----------------------
+
+**LLVM_TARGETS_TO_BUILD**:STRING
+  Semicolon-separated list of targets to build, or *all* for building all
+  targets. Case-sensitive. Defaults to *all*. Example:
+  ``-DLLVM_TARGETS_TO_BUILD="X86;PowerPC"``.
+
+**LLVM_BUILD_TOOLS**:BOOL
+  Build LLVM tools. Defaults to ON. Targets for building each tool are generated
+  in any case. You can build a tool separately by invoking its target. For
+  example, you can build *llvm-as* with a Makefile-based system by executing *make
+  llvm-as* at the root of your build directory.
+
+**LLVM_INCLUDE_TOOLS**:BOOL
+  Generate build targets for the LLVM tools. Defaults to ON. You can use this
+  option to disable the generation of build targets for the LLVM tools.
+
+**LLVM_BUILD_EXAMPLES**:BOOL
+  Build LLVM examples. Defaults to OFF. Targets for building each example are
+  generated in any case. See documentation for *LLVM_BUILD_TOOLS* above for more
+  details.
+
+**LLVM_INCLUDE_EXAMPLES**:BOOL
+  Generate build targets for the LLVM examples. Defaults to ON. You can use this
+  option to disable the generation of build targets for the LLVM examples.
+
+**LLVM_BUILD_TESTS**:BOOL
+  Build LLVM unit tests. Defaults to OFF. Targets for building each unit test
+  are generated in any case. You can build a specific unit test using the
+  targets defined under *unittests*, such as ADTTests, IRTests, SupportTests,
+  etc. (Search for ``add_llvm_unittest`` in the subdirectories of *unittests*
+  for a complete list of unit tests.) It is possible to build all unit tests
+  with the target *UnitTests*.
+
+**LLVM_INCLUDE_TESTS**:BOOL
+  Generate build targets for the LLVM unit tests. Defaults to ON. You can use
+  this option to disable the generation of build targets for the LLVM unit
+  tests.
+
+**LLVM_APPEND_VC_REV**:BOOL
+  Append version control revision info (svn revision number or Git revision id)
+  to LLVM version string (stored in the PACKAGE_VERSION macro). For this to work
+  cmake must be invoked before the build. Defaults to OFF.
+
+**LLVM_ENABLE_THREADS**:BOOL
+  Build with threads support, if available. Defaults to ON.
+
+**LLVM_ENABLE_CXX1Y**:BOOL
+  Build in C++1y mode, if available. Defaults to OFF.
+
+**LLVM_ENABLE_ASSERTIONS**:BOOL
+  Enables code assertions. Defaults to ON if and only if ``CMAKE_BUILD_TYPE``
+  is *Debug*.
+
+**LLVM_ENABLE_EH**:BOOL
+  Build LLVM with exception-handling support. This is necessary if you wish to
+  link against LLVM libraries and make use of C++ exceptions in your own code
+  that need to propagate through LLVM code. Defaults to OFF.
+
+**LLVM_ENABLE_EXPENSIVE_CHECKS**:BOOL
+  Enable additional time/memory expensive checking. Defaults to OFF.
+
+**LLVM_ENABLE_PIC**:BOOL
+  Add the ``-fPIC`` flag to the compiler command-line, if the compiler supports
+  this flag. Some systems, like Windows, do not need this flag. Defaults to ON.
+
+**LLVM_ENABLE_RTTI**:BOOL
+  Build LLVM with run-time type information. Defaults to OFF.
+
+**LLVM_ENABLE_WARNINGS**:BOOL
+  Enable all compiler warnings. Defaults to ON.
+
+**LLVM_ENABLE_PEDANTIC**:BOOL
+  Enable pedantic mode. This disables compiler-specific extensions, if
+  possible. Defaults to ON.
+
+**LLVM_ENABLE_WERROR**:BOOL
+  Stop and fail the build, if a compiler warning is triggered. Defaults to OFF.
+
+**LLVM_ABI_BREAKING_CHECKS**:STRING
+  Used to decide if LLVM should be built with ABI breaking checks or
+  not.  Allowed values are `WITH_ASSERTS` (default), `FORCE_ON` and
+  `FORCE_OFF`.  `WITH_ASSERTS` turns on ABI breaking checks in an
+  assertion enabled build.  `FORCE_ON` (`FORCE_OFF`) turns them on
+  (off) irrespective of whether normal (`NDEBUG`-based) assertions are
+  enabled or not.  A version of LLVM built with ABI breaking checks
+  is not ABI compatible with a version built without it.
+
+**LLVM_BUILD_32_BITS**:BOOL
+  Build 32-bit executables and libraries on 64-bit systems. This option is
+  available only on some 64-bit Unix systems. Defaults to OFF.
+
+**LLVM_TARGET_ARCH**:STRING
+  LLVM target to use for native code generation. This is required for JIT
+  generation. It defaults to "host", meaning that it shall pick the architecture
+  of the machine where LLVM is being built. If you are cross-compiling, set it
+  to the target architecture name.
+
+**LLVM_TABLEGEN**:STRING
+  Full path to a native TableGen executable (usually named ``llvm-tblgen``). This is
+  intended for cross-compiling: if the user sets this variable, no native
+  TableGen will be created.
+
+**LLVM_LIT_ARGS**:STRING
+  Arguments given to lit.  ``make check`` and ``make clang-test`` are affected.
+  By default, ``'-sv --no-progress-bar'`` on Visual C++ and Xcode, ``'-sv'`` on
+  others.
+
+**LLVM_LIT_TOOLS_DIR**:PATH
+  The path to GnuWin32 tools for tests. Valid on Windows host.  Defaults to
+  the empty string, in which case lit will look for tools needed for tests
+  (e.g. ``grep``, ``sort``, etc.) in your %PATH%. If GnuWin32 is not in your
+  %PATH%, then you can set this variable to the GnuWin32 directory so that
+  lit can find tools needed for tests in that directory.
+
+**LLVM_ENABLE_FFI**:BOOL
+  Indicates whether the LLVM Interpreter will be linked with the Foreign Function
+  Interface library (libffi) in order to enable calling external functions.
+  If the library or its headers are installed in a custom
+  location, you can also set the variables FFI_INCLUDE_DIR and
+  FFI_LIBRARY_DIR to the directories where ffi.h and libffi.so can be found,
+  respectively. Defaults to OFF.
+
+**LLVM_EXTERNAL_{CLANG,LLD,POLLY}_SOURCE_DIR**:PATH
+  These variables specify the path to the source directory for the external
+  LLVM projects Clang, lld, and Polly, respectively, relative to the top-level
+  source directory.  If the in-tree subdirectory for an external project
+  exists (e.g., llvm/tools/clang for Clang), then the corresponding variable
+  will not be used.  If the variable for an external project does not point
+  to a valid path, then that project will not be built.
+
+**LLVM_ENABLE_PROJECTS**:STRING
+  Semicolon-separated list of projects to build, or *all* for building all
+  (clang, libcxx, libcxxabi, lldb, compiler-rt, lld, polly) projects.
+  This flag assumes that projects are checked out side-by-side and not nested,
+  i.e. clang needs to be in parallel of llvm instead of nested in `llvm/tools`.
+  This feature allows to have one build for only LLVM and another for clang+llvm
+  using the same source checkout.
+
+**LLVM_EXTERNAL_PROJECTS**:STRING
+  Semicolon-separated list of additional external projects to build as part of
+  llvm. For each project LLVM_EXTERNAL_<NAME>_SOURCE_DIR have to be specified
+  with the path for the source code of the project. Example:
+  ``-DLLVM_EXTERNAL_PROJECTS="Foo;Bar"
+  -DLLVM_EXTERNAL_FOO_SOURCE_DIR=/src/foo
+  -DLLVM_EXTERNAL_BAR_SOURCE_DIR=/src/bar``.
+
+**LLVM_USE_OPROFILE**:BOOL
+  Enable building OProfile JIT support. Defaults to OFF.
+
+**LLVM_PROFDATA_FILE**:PATH
+  Path to a profdata file to pass into clang's -fprofile-instr-use flag. This
+  can only be specified if you're building with clang.
+
+**LLVM_USE_INTEL_JITEVENTS**:BOOL
+  Enable building support for Intel JIT Events API. Defaults to OFF.
+
+**LLVM_ENABLE_ZLIB**:BOOL
+  Enable building with zlib to support compression/uncompression in LLVM tools.
+  Defaults to ON.
+
+**LLVM_ENABLE_DIA_SDK**:BOOL
+  Enable building with MSVC DIA SDK for PDB debugging support. Available
+  only with MSVC. Defaults to ON.
+
+**LLVM_USE_SANITIZER**:STRING
+  Define the sanitizer used to build LLVM binaries and tests. Possible values
+  are ``Address``, ``Memory``, ``MemoryWithOrigins``, ``Undefined``, ``Thread``,
+  and ``Address;Undefined``. Defaults to empty string.
+
+**LLVM_ENABLE_LTO**:STRING
+  Add ``-flto`` or ``-flto=`` flags to the compile and link command
+  lines, enabling link-time optimization. Possible values are ``Off``,
+  ``On``, ``Thin`` and ``Full``. Defaults to OFF.
+
+**LLVM_PARALLEL_COMPILE_JOBS**:STRING
+  Define the maximum number of concurrent compilation jobs.
+
+**LLVM_PARALLEL_LINK_JOBS**:STRING
+  Define the maximum number of concurrent link jobs.
+
+**LLVM_BUILD_DOCS**:BOOL
+  Adds all *enabled* documentation targets (i.e. Doxgyen and Sphinx targets) as
+  dependencies of the default build targets.  This results in all of the (enabled)
+  documentation targets being as part of a normal build.  If the ``install`` 
+  target is run then this also enables all built documentation targets to be 
+  installed. Defaults to OFF.  To enable a particular documentation target, see 
+  see LLVM_ENABLE_SPHINX and LLVM_ENABLE_DOXYGEN.  
+
+**LLVM_ENABLE_DOXYGEN**:BOOL
+  Enables the generation of browsable HTML documentation using doxygen.
+  Defaults to OFF.
+
+**LLVM_ENABLE_DOXYGEN_QT_HELP**:BOOL
+  Enables the generation of a Qt Compressed Help file. Defaults to OFF.
+  This affects the make target ``doxygen-llvm``. When enabled, apart from
+  the normal HTML output generated by doxygen, this will produce a QCH file
+  named ``org.llvm.qch``. You can then load this file into Qt Creator.
+  This option is only useful in combination with ``-DLLVM_ENABLE_DOXYGEN=ON``;
+  otherwise this has no effect.
+
+**LLVM_DOXYGEN_QCH_FILENAME**:STRING
+  The filename of the Qt Compressed Help file that will be generated when
+  ``-DLLVM_ENABLE_DOXYGEN=ON`` and
+  ``-DLLVM_ENABLE_DOXYGEN_QT_HELP=ON`` are given. Defaults to
+  ``org.llvm.qch``.
+  This option is only useful in combination with
+  ``-DLLVM_ENABLE_DOXYGEN_QT_HELP=ON``;
+  otherwise it has no effect.
+
+**LLVM_DOXYGEN_QHP_NAMESPACE**:STRING
+  Namespace under which the intermediate Qt Help Project file lives. See `Qt
+  Help Project`_
+  for more information. Defaults to "org.llvm". This option is only useful in
+  combination with ``-DLLVM_ENABLE_DOXYGEN_QT_HELP=ON``; otherwise
+  it has no effect.
+
+**LLVM_DOXYGEN_QHP_CUST_FILTER_NAME**:STRING
+  See `Qt Help Project`_ for
+  more information. Defaults to the CMake variable ``${PACKAGE_STRING}`` which
+  is a combination of the package name and version string. This filter can then
+  be used in Qt Creator to select only documentation from LLVM when browsing
+  through all the help files that you might have loaded. This option is only
+  useful in combination with ``-DLLVM_ENABLE_DOXYGEN_QT_HELP=ON``;
+  otherwise it has no effect.
+
+.. _Qt Help Project: http://qt-project.org/doc/qt-4.8/qthelpproject.html#custom-filters
+
+**LLVM_DOXYGEN_QHELPGENERATOR_PATH**:STRING
+  The path to the ``qhelpgenerator`` executable. Defaults to whatever CMake's
+  ``find_program()`` can find. This option is only useful in combination with
+  ``-DLLVM_ENABLE_DOXYGEN_QT_HELP=ON``; otherwise it has no
+  effect.
+
+**LLVM_DOXYGEN_SVG**:BOOL
+  Uses .svg files instead of .png files for graphs in the Doxygen output.
+  Defaults to OFF.
+
+**LLVM_INSTALL_DOXYGEN_HTML_DIR**:STRING
+  The path to install Doxygen-generated HTML documentation to. This path can
+  either be absolute or relative to the CMAKE_INSTALL_PREFIX. Defaults to
+  `share/doc/llvm/doxygen-html`.
+
+**LLVM_ENABLE_SPHINX**:BOOL
+  If specified, CMake will search for the ``sphinx-build`` executable and will make
+  the ``SPHINX_OUTPUT_HTML`` and ``SPHINX_OUTPUT_MAN`` CMake options available.
+  Defaults to OFF.
+
+**SPHINX_EXECUTABLE**:STRING
+  The path to the ``sphinx-build`` executable detected by CMake.
+
+**SPHINX_OUTPUT_HTML**:BOOL
+  If enabled (and ``LLVM_ENABLE_SPHINX`` is enabled) then the targets for
+  building the documentation as html are added (but not built by default unless
+  ``LLVM_BUILD_DOCS`` is enabled). There is a target for each project in the
+  source tree that uses sphinx (e.g.  ``docs-llvm-html``, ``docs-clang-html``
+  and ``docs-lld-html``). Defaults to ON.
+
+**SPHINX_OUTPUT_MAN**:BOOL
+  If enabled (and ``LLVM_ENABLE_SPHINX`` is enabled) the targets for building
+  the man pages are added (but not built by default unless ``LLVM_BUILD_DOCS``
+  is enabled). Currently the only target added is ``docs-llvm-man``. Defaults
+  to ON.
+
+**SPHINX_WARNINGS_AS_ERRORS**:BOOL
+  If enabled then sphinx documentation warnings will be treated as
+  errors. Defaults to ON.
+
+**LLVM_INSTALL_SPHINX_HTML_DIR**:STRING
+  The path to install Sphinx-generated HTML documentation to. This path can
+  either be absolute or relative to the CMAKE_INSTALL_PREFIX. Defaults to
+  `share/doc/llvm/html`.
+
+**LLVM_INSTALL_OCAMLDOC_HTML_DIR**:STRING
+  The path to install OCamldoc-generated HTML documentation to. This path can
+  either be absolute or relative to the CMAKE_INSTALL_PREFIX. Defaults to
+  `share/doc/llvm/ocaml-html`.
+
+**LLVM_CREATE_XCODE_TOOLCHAIN**:BOOL
+  OS X Only: If enabled CMake will generate a target named
+  'install-xcode-toolchain'. This target will create a directory at
+  $CMAKE_INSTALL_PREFIX/Toolchains containing an xctoolchain directory which can
+  be used to override the default system tools. 
+
+**LLVM_BUILD_LLVM_DYLIB**:BOOL
+  If enabled, the target for building the libLLVM shared library is added.
+  This library contains all of LLVM's components in a single shared library.
+  Defaults to OFF. This cannot be used in conjunction with BUILD_SHARED_LIBS.
+  Tools will only be linked to the libLLVM shared library if LLVM_LINK_LLVM_DYLIB
+  is also ON.
+  The components in the library can be customised by setting LLVM_DYLIB_COMPONENTS
+  to a list of the desired components.
+
+**LLVM_LINK_LLVM_DYLIB**:BOOL
+  If enabled, tools will be linked with the libLLVM shared library. Defaults
+  to OFF. Setting LLVM_LINK_LLVM_DYLIB to ON also sets LLVM_BUILD_LLVM_DYLIB
+  to ON.
+
+**BUILD_SHARED_LIBS**:BOOL
+  Flag indicating if each LLVM component (e.g. Support) is built as a shared
+  library (ON) or as a static library (OFF). Its default value is OFF. On
+  Windows, shared libraries may be used when building with MinGW, including
+  mingw-w64, but not when building with the Microsoft toolchain.
+ 
+  .. note:: BUILD_SHARED_LIBS is only recommended for use by LLVM developers.
+            If you want to build LLVM as a shared library, you should use the
+            ``LLVM_BUILD_LLVM_DYLIB`` option.
+
+**LLVM_OPTIMIZED_TABLEGEN**:BOOL
+  If enabled and building a debug or asserts build the CMake build system will
+  generate a Release build tree to build a fully optimized tablegen for use
+  during the build. Enabling this option can significantly speed up build times
+  especially when building LLVM in Debug configurations.
+
+CMake Caches
+============
+
+Recently LLVM and Clang have been adding some more complicated build system
+features. Utilizing these new features often involves a complicated chain of
+CMake variables passed on the command line. Clang provides a collection of CMake
+cache scripts to make these features more approachable.
+
+CMake cache files are utilized using CMake's -C flag:
+
+.. code-block:: console
+
+  $ cmake -C <path to cache file> <path to sources>
+
+CMake cache scripts are processed in an isolated scope, only cached variables
+remain set when the main configuration runs. CMake cached variables do not reset
+variables that are already set unless the FORCE option is specified.
+
+A few notes about CMake Caches:
+
+- Order of command line arguments is important
+
+  - -D arguments specified before -C are set before the cache is processed and
+    can be read inside the cache file
+  - -D arguments specified after -C are set after the cache is processed and
+    are unset inside the cache file
+
+- All -D arguments will override cache file settings
+- CMAKE_TOOLCHAIN_FILE is evaluated after both the cache file and the command
+  line arguments
+- It is recommended that all -D options should be specified *before* -C
+
+For more information about some of the advanced build configurations supported
+via Cache files see :doc:`AdvancedBuilds`.
+
+Executing the test suite
+========================
+
+Testing is performed when the *check-all* target is built. For instance, if you are
+using Makefiles, execute this command in the root of your build directory:
+
+.. code-block:: console
+
+  $ make check-all
+
+On Visual Studio, you may run tests by building the project "check-all".
+For more information about testing, see the :doc:`TestingGuide`.
+
+Cross compiling
+===============
+
+See `this wiki page <http://www.vtk.org/Wiki/CMake_Cross_Compiling>`_ for
+generic instructions on how to cross-compile with CMake. It goes into detailed
+explanations and may seem daunting, but it is not. On the wiki page there are
+several examples including toolchain files. Go directly to `this section
+<http://www.vtk.org/Wiki/CMake_Cross_Compiling#Information_how_to_set_up_various_cross_compiling_toolchains>`_
+for a quick solution.
+
+Also see the `LLVM-specific variables`_ section for variables used when
+cross-compiling.
+
+Embedding LLVM in your project
+==============================
+
+From LLVM 3.5 onwards both the CMake and autoconf/Makefile build systems export
+LLVM libraries as importable CMake targets. This means that clients of LLVM can
+now reliably use CMake to develop their own LLVM-based projects against an
+installed version of LLVM regardless of how it was built.
+
+Here is a simple example of a CMakeLists.txt file that imports the LLVM libraries
+and uses them to build a simple application ``simple-tool``.
+
+.. code-block:: cmake
+
+  cmake_minimum_required(VERSION 3.4.3)
+  project(SimpleProject)
+
+  find_package(LLVM REQUIRED CONFIG)
+
+  message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")
+  message(STATUS "Using LLVMConfig.cmake in: ${LLVM_DIR}")
+
+  # Set your project compile flags.
+  # E.g. if using the C++ header files
+  # you will need to enable C++11 support
+  # for your compiler.
+
+  include_directories(${LLVM_INCLUDE_DIRS})
+  add_definitions(${LLVM_DEFINITIONS})
+
+  # Now build our tools
+  add_executable(simple-tool tool.cpp)
+
+  # Find the libraries that correspond to the LLVM components
+  # that we wish to use
+  llvm_map_components_to_libnames(llvm_libs support core irreader)
+
+  # Link against LLVM libraries
+  target_link_libraries(simple-tool ${llvm_libs})
+
+The ``find_package(...)`` directive when used in CONFIG mode (as in the above
+example) will look for the ``LLVMConfig.cmake`` file in various locations (see
+cmake manual for details).  It creates a ``LLVM_DIR`` cache entry to save the
+directory where ``LLVMConfig.cmake`` is found or allows the user to specify the
+directory (e.g. by passing ``-DLLVM_DIR=/usr/lib/cmake/llvm`` to
+the ``cmake`` command or by setting it directly in ``ccmake`` or ``cmake-gui``).
+
+This file is available in two different locations.
+
+* ``<INSTALL_PREFIX>/lib/cmake/llvm/LLVMConfig.cmake`` where
+  ``<INSTALL_PREFIX>`` is the install prefix of an installed version of LLVM.
+  On Linux typically this is ``/usr/lib/cmake/llvm/LLVMConfig.cmake``.
+
+* ``<LLVM_BUILD_ROOT>/lib/cmake/llvm/LLVMConfig.cmake`` where
+  ``<LLVM_BUILD_ROOT>`` is the root of the LLVM build tree. **Note: this is only
+  available when building LLVM with CMake.**
+
+If LLVM is installed in your operating system's normal installation prefix (e.g.
+on Linux this is usually ``/usr/``) ``find_package(LLVM ...)`` will
+automatically find LLVM if it is installed correctly. If LLVM is not installed
+or you wish to build directly against the LLVM build tree you can use
+``LLVM_DIR`` as previously mentioned.
+
+The ``LLVMConfig.cmake`` file sets various useful variables. Notable variables
+include
+
+``LLVM_CMAKE_DIR``
+  The path to the LLVM CMake directory (i.e. the directory containing
+  LLVMConfig.cmake).
+
+``LLVM_DEFINITIONS``
+  A list of preprocessor defines that should be used when building against LLVM.
+
+``LLVM_ENABLE_ASSERTIONS``
+  This is set to ON if LLVM was built with assertions, otherwise OFF.
+
+``LLVM_ENABLE_EH``
+  This is set to ON if LLVM was built with exception handling (EH) enabled,
+  otherwise OFF.
+
+``LLVM_ENABLE_RTTI``
+  This is set to ON if LLVM was built with run time type information (RTTI),
+  otherwise OFF.
+
+``LLVM_INCLUDE_DIRS``
+  A list of include paths to directories containing LLVM header files.
+
+``LLVM_PACKAGE_VERSION``
+  The LLVM version. This string can be used with CMake conditionals, e.g., ``if
+  (${LLVM_PACKAGE_VERSION} VERSION_LESS "3.5")``.
+
+``LLVM_TOOLS_BINARY_DIR``
+  The path to the directory containing the LLVM tools (e.g. ``llvm-as``).
+
+Notice that in the above example we link ``simple-tool`` against several LLVM
+libraries. The list of libraries is determined by using the
+``llvm_map_components_to_libnames()`` CMake function. For a list of available
+components look at the output of running ``llvm-config --components``.
+
+Note that for LLVM < 3.5 ``llvm_map_components_to_libraries()`` was
+used instead of ``llvm_map_components_to_libnames()``. This is now deprecated
+and will be removed in a future version of LLVM.
+
+.. _cmake-out-of-source-pass:
+
+Developing LLVM passes out of source
+------------------------------------
+
+It is possible to develop LLVM passes out of LLVM's source tree (i.e. against an
+installed or built LLVM). An example of a project layout is provided below.
+
+.. code-block:: none
+
+  <project dir>/
+      |
+      CMakeLists.txt
+      <pass name>/
+          |
+          CMakeLists.txt
+          Pass.cpp
+          ...
+
+Contents of ``<project dir>/CMakeLists.txt``:
+
+.. code-block:: cmake
+
+  find_package(LLVM REQUIRED CONFIG)
+
+  add_definitions(${LLVM_DEFINITIONS})
+  include_directories(${LLVM_INCLUDE_DIRS})
+
+  add_subdirectory(<pass name>)
+
+Contents of ``<project dir>/<pass name>/CMakeLists.txt``:
+
+.. code-block:: cmake
+
+  add_library(LLVMPassname MODULE Pass.cpp)
+
+Note if you intend for this pass to be merged into the LLVM source tree at some
+point in the future it might make more sense to use LLVM's internal
+``add_llvm_loadable_module`` function instead by...
+
+
+Adding the following to ``<project dir>/CMakeLists.txt`` (after
+``find_package(LLVM ...)``)
+
+.. code-block:: cmake
+
+  list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
+  include(AddLLVM)
+
+And then changing ``<project dir>/<pass name>/CMakeLists.txt`` to
+
+.. code-block:: cmake
+
+  add_llvm_loadable_module(LLVMPassname
+    Pass.cpp
+    )
+
+When you are done developing your pass, you may wish to integrate it
+into the LLVM source tree. You can achieve it in two easy steps:
+
+#. Copying ``<pass name>`` folder into ``<LLVM root>/lib/Transform`` directory.
+
+#. Adding ``add_subdirectory(<pass name>)`` line into
+   ``<LLVM root>/lib/Transform/CMakeLists.txt``.
+
+Compiler/Platform-specific topics
+=================================
+
+Notes for specific compilers and/or platforms.
+
+Microsoft Visual C++
+--------------------
+
+**LLVM_COMPILER_JOBS**:STRING
+  Specifies the maximum number of parallel compiler jobs to use per project
+  when building with msbuild or Visual Studio. Only supported for the Visual
+  Studio 2010 CMake generator. 0 means use all processors. Default is 0.

Added: www-releases/trunk/4.0.1/docs/_sources/CMakePrimer.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CMakePrimer.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CMakePrimer.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CMakePrimer.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,465 @@
+============
+CMake Primer
+============
+
+.. contents::
+   :local:
+
+.. warning::
+   Disclaimer: This documentation is written by LLVM project contributors `not`
+   anyone affiliated with the CMake project. This document may contain
+   inaccurate terminology, phrasing, or technical details. It is provided with
+   the best intentions.
+
+
+Introduction
+============
+
+The LLVM project and many of the core projects built on LLVM build using CMake.
+This document aims to provide a brief overview of CMake for developers modifying
+LLVM projects or building their own projects on top of LLVM.
+
+The official CMake language references is available in the cmake-language
+manpage and `cmake-language online documentation
+<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_.
+
+10,000 ft View
+==============
+
+CMake is a tool that reads script files in its own language that describe how a
+software project builds. As CMake evaluates the scripts it constructs an
+internal representation of the software project. Once the scripts have been
+fully processed, if there are no errors, CMake will generate build files to
+actually build the project. CMake supports generating build files for a variety
+of command line build tools as well as for popular IDEs.
+
+When a user runs CMake it performs a variety of checks similar to how autoconf
+worked historically. During the checks and the evaluation of the build
+description scripts CMake caches values into the CMakeCache. This is useful
+because it allows the build system to skip long-running checks during
+incremental development. CMake caching also has some drawbacks, but that will be
+discussed later.
+
+Scripting Overview
+==================
+
+CMake's scripting language has a very simple grammar. Every language construct
+is a command that matches the pattern _name_(_args_). Commands come in three
+primary types: language-defined (commands implemented in C++ in CMake), defined
+functions, and defined macros. The CMake distribution also contains a suite of
+CMake modules that contain definitions for useful functionality.
+
+The example below is the full CMake build for building a C++ "Hello World"
+program. The example uses only CMake language-defined functions.
+
+.. code-block:: cmake
+
+   cmake_minimum_required(VERSION 3.2)
+   project(HelloWorld)
+   add_executable(HelloWorld HelloWorld.cpp)
+
+The CMake language provides control flow constructs in the form of foreach loops
+and if blocks. To make the example above more complicated you could add an if
+block to define "APPLE" when targeting Apple platforms:
+
+.. code-block:: cmake
+
+   cmake_minimum_required(VERSION 3.2)
+   project(HelloWorld)
+   add_executable(HelloWorld HelloWorld.cpp)
+   if(APPLE)
+     target_compile_definitions(HelloWorld PUBLIC APPLE)
+   endif()
+   
+Variables, Types, and Scope
+===========================
+
+Dereferencing
+-------------
+
+In CMake variables are "stringly" typed. All variables are represented as
+strings throughout evaluation. Wrapping a variable in ``${}`` dereferences it
+and results in a literal substitution of the name for the value. CMake refers to
+this as "variable evaluation" in their documentation. Dereferences are performed
+*before* the command being called receives the arguments. This means
+dereferencing a list results in multiple separate arguments being passed to the
+command.
+
+Variable dereferences can be nested and be used to model complex data. For
+example:
+
+.. code-block:: cmake
+
+   set(var_name var1)
+   set(${var_name} foo) # same as "set(var1 foo)"
+   set(${${var_name}}_var bar) # same as "set(foo_var bar)"
+   
+Dereferencing an unset variable results in an empty expansion. It is a common
+pattern in CMake to conditionally set variables knowing that it will be used in
+code paths that the variable isn't set. There are examples of this throughout
+the LLVM CMake build system.
+
+An example of variable empty expansion is:
+
+.. code-block:: cmake
+
+   if(APPLE)
+     set(extra_sources Apple.cpp)
+   endif()
+   add_executable(HelloWorld HelloWorld.cpp ${extra_sources})
+   
+In this example the ``extra_sources`` variable is only defined if you're
+targeting an Apple platform. For all other targets the ``extra_sources`` will be
+evaluated as empty before add_executable is given its arguments.
+
+One big "Gotcha" with variable dereferencing is that ``if`` commands implicitly
+dereference values. This has some unexpected results. For example:
+
+.. code-block:: cmake
+
+   if("${SOME_VAR}" STREQUAL "MSVC")
+
+In this code sample MSVC will be implicitly dereferenced, which will result in
+the if command comparing the value of the dereferenced variables ``SOME_VAR``
+and ``MSVC``. A common workaround to this solution is to prepend strings being
+compared with an ``x``.
+
+.. code-block:: cmake
+
+   if("x${SOME_VAR}" STREQUAL "xMSVC")
+
+This works because while ``MSVC`` is a defined variable, ``xMSVC`` is not. This
+pattern is uncommon, but it does occur in LLVM's CMake scripts.
+
+.. note::
+   
+   Once the LLVM project upgrades its minimum CMake version to 3.1 or later we
+   can prevent this behavior by setting CMP0054 to new. For more information on
+   CMake policies please see the cmake-policies manpage or the `cmake-policies
+   online documentation
+   <https://cmake.org/cmake/help/v3.4/manual/cmake-policies.7.html>`_.
+
+Lists
+-----
+
+In CMake lists are semi-colon delimited strings, and it is strongly advised that
+you avoid using semi-colons in lists; it doesn't go smoothly. A few examples of
+defining lists:
+
+.. code-block:: cmake
+
+   # Creates a list with members a, b, c, and d
+   set(my_list a b c d)
+   set(my_list "a;b;c;d")
+   
+   # Creates a string "a b c d"
+   set(my_string "a b c d")
+
+Lists of Lists
+--------------
+
+One of the more complicated patterns in CMake is lists of lists. Because a list
+cannot contain an element with a semi-colon to construct a list of lists you
+make a list of variable names that refer to other lists. For example:
+
+.. code-block:: cmake
+
+   set(list_of_lists a b c)
+   set(a 1 2 3)
+   set(b 4 5 6)
+   set(c 7 8 9)
+   
+With this layout you can iterate through the list of lists printing each value
+with the following code:
+
+.. code-block:: cmake
+
+   foreach(list_name IN LISTS list_of_lists)
+     foreach(value IN LISTS ${list_name})
+       message(${value})
+     endforeach()
+   endforeach()
+   
+You'll notice that the inner foreach loop's list is doubly dereferenced. This is
+because the first dereference turns ``list_name`` into the name of the sub-list
+(a, b, or c in the example), then the second dereference is to get the value of
+the list.
+
+This pattern is used throughout CMake, the most common example is the compiler
+flags options, which CMake refers to using the following variable expansions:
+CMAKE_${LANGUAGE}_FLAGS and CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}.
+
+Other Types
+-----------
+
+Variables that are cached or specified on the command line can have types
+associated with them. The variable's type is used by CMake's UI tool to display
+the right input field. The variable's type generally doesn't impact evaluation.
+One of the few examples is PATH variables, which CMake does have some special
+handling for. You can read more about the special handling in `CMake's set
+documentation
+<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_.
+
+Scope
+-----
+
+CMake inherently has a directory-based scoping. Setting a variable in a
+CMakeLists file, will set the variable for that file, and all subdirectories.
+Variables set in a CMake module that is included in a CMakeLists file will be
+set in the scope they are included from, and all subdirectories.
+
+When a variable that is already set is set again in a subdirectory it overrides
+the value in that scope and any deeper subdirectories.
+
+The CMake set command provides two scope-related options. PARENT_SCOPE sets a
+variable into the parent scope, and not the current scope. The CACHE option sets
+the variable in the CMakeCache, which results in it being set in all scopes. The
+CACHE option will not set a variable that already exists in the CACHE unless the
+FORCE option is specified.
+
+In addition to directory-based scope, CMake functions also have their own scope.
+This means variables set inside functions do not bleed into the parent scope.
+This is not true of macros, and it is for this reason LLVM prefers functions
+over macros whenever reasonable.
+
+.. note::
+  Unlike C-based languages, CMake's loop and control flow blocks do not have
+  their own scopes.
+
+Control Flow
+============
+
+CMake features the same basic control flow constructs you would expect in any
+scripting language, but there are a few quarks because, as with everything in
+CMake, control flow constructs are commands.
+
+If, ElseIf, Else
+----------------
+
+.. note::
+  For the full documentation on the CMake if command go
+  `here <https://cmake.org/cmake/help/v3.4/command/if.html>`_. That resource is
+  far more complete.
+
+In general CMake if blocks work the way you'd expect:
+
+.. code-block:: cmake
+
+  if(<condition>)
+    message("do stuff")
+  elseif(<condition>)
+    message("do other stuff")
+  else()
+    message("do other other stuff")
+  endif()
+
+The single most important thing to know about CMake's if blocks coming from a C
+background is that they do not have their own scope. Variables set inside
+conditional blocks persist after the ``endif()``.
+
+Loops
+-----
+
+The most common form of the CMake ``foreach`` block is:
+
+.. code-block:: cmake
+
+  foreach(var ...)
+    message("do stuff")
+  endforeach()
+
+The variable argument portion of the ``foreach`` block can contain dereferenced
+lists, values to iterate, or a mix of both:
+
+.. code-block:: cmake
+
+  foreach(var foo bar baz)
+    message(${var})
+  endforeach()
+  # prints:
+  #  foo
+  #  bar
+  #  baz
+
+  set(my_list 1 2 3)
+  foreach(var ${my_list})
+    message(${var})
+  endforeach()
+  # prints:
+  #  1
+  #  2
+  #  3
+
+  foreach(var ${my_list} out_of_bounds)
+    message(${var})
+  endforeach()
+  # prints:
+  #  1
+  #  2
+  #  3
+  #  out_of_bounds
+
+There is also a more modern CMake foreach syntax. The code below is equivalent
+to the code above:
+
+.. code-block:: cmake
+
+  foreach(var IN ITEMS foo bar baz)
+    message(${var})
+  endforeach()
+  # prints:
+  #  foo
+  #  bar
+  #  baz
+
+  set(my_list 1 2 3)
+  foreach(var IN LISTS my_list)
+    message(${var})
+  endforeach()
+  # prints:
+  #  1
+  #  2
+  #  3
+
+  foreach(var IN LISTS my_list ITEMS out_of_bounds)
+    message(${var})
+  endforeach()
+  # prints:
+  #  1
+  #  2
+  #  3
+  #  out_of_bounds
+
+Similar to the conditional statements, these generally behave how you would
+expect, and they do not have their own scope.
+
+CMake also supports ``while`` loops, although they are not widely used in LLVM.
+
+Modules, Functions and Macros
+=============================
+
+Modules
+-------
+
+Modules are CMake's vehicle for enabling code reuse. CMake modules are just
+CMake script files. They can contain code to execute on include as well as
+definitions for commands.
+
+In CMake macros and functions are universally referred to as commands, and they
+are the primary method of defining code that can be called multiple times.
+
+In LLVM we have several CMake modules that are included as part of our
+distribution for developers who don't build our project from source. Those
+modules are the fundamental pieces needed to build LLVM-based projects with
+CMake. We also rely on modules as a way of organizing the build system's
+functionality for maintainability and re-use within LLVM projects.
+
+Argument Handling
+-----------------
+
+When defining a CMake command handling arguments is very useful. The examples
+in this section will all use the CMake ``function`` block, but this all applies
+to the ``macro`` block as well.
+
+CMake commands can have named arguments, but all commands are implicitly
+variable argument. If the command has named arguments they are required and must
+be specified at every call site. Below is a trivial example of providing a
+wrapper function for CMake's built in function ``add_dependencies``.
+
+.. code-block:: cmake
+
+   function(add_deps target)
+     add_dependencies(${target} ${ARGV})
+   endfunction()
+
+This example defines a new macro named ``add_deps`` which takes a required first
+argument, and just calls another function passing through the first argument and
+all trailing arguments. When variable arguments are present CMake defines them
+in a list named ``ARGV``, and the count of the arguments is defined in ``ARGN``.
+
+CMake provides a module ``CMakeParseArguments`` which provides an implementation
+of advanced argument parsing. We use this all over LLVM, and it is recommended
+for any function that has complex argument-based behaviors or optional
+arguments. CMake's official documentation for the module is in the
+``cmake-modules`` manpage, and is also available at the
+`cmake-modules online documentation
+<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_.
+
+.. note::
+  As of CMake 3.5 the cmake_parse_arguments command has become a native command
+  and the CMakeParseArguments module is empty and only left around for
+  compatibility.
+
+Functions Vs Macros
+-------------------
+
+Functions and Macros look very similar in how they are used, but there is one
+fundamental difference between the two. Functions have their own scope, and
+macros don't. This means variables set in macros will bleed out into the calling
+scope. That makes macros suitable for defining very small bits of functionality
+only.
+
+The other difference between CMake functions and macros is how arguments are
+passed. Arguments to macros are not set as variables, instead dereferences to
+the parameters are resolved across the macro before executing it. This can
+result in some unexpected behavior if using unreferenced variables. For example:
+
+.. code-block:: cmake
+
+   macro(print_list my_list)
+     foreach(var IN LISTS my_list)
+       message("${var}")
+     endforeach()
+   endmacro()
+   
+   set(my_list a b c d)
+   set(my_list_of_numbers 1 2 3 4)
+   print_list(my_list_of_numbers)
+   # prints:
+   # a
+   # b
+   # c
+   # d
+
+Generally speaking this issue is uncommon because it requires using
+non-dereferenced variables with names that overlap in the parent scope, but it
+is important to be aware of because it can lead to subtle bugs.
+
+LLVM Project Wrappers
+=====================
+
+LLVM projects provide lots of wrappers around critical CMake built-in commands.
+We use these wrappers to provide consistent behaviors across LLVM components
+and to reduce code duplication.
+
+We generally (but not always) follow the convention that commands prefaced with
+``llvm_`` are intended to be used only as building blocks for other commands.
+Wrapper commands that are intended for direct use are generally named following
+with the project in the middle of the command name (i.e. ``add_llvm_executable``
+is the wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are
+all defined in ``AddLLVM.cmake`` which is installed as part of the LLVM
+distribution. It can be included and used by any LLVM sub-project that requires
+LLVM.
+
+.. note::
+
+   Not all LLVM projects require LLVM for all use cases. For example compiler-rt
+   can be built without LLVM, and the compiler-rt sanitizer libraries are used
+   with GCC.
+
+Useful Built-in Commands
+========================
+
+CMake has a bunch of useful built-in commands. This document isn't going to
+go into details about them because The CMake project has excellent
+documentation. To highlight a few useful functions see:
+
+* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
+* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
+* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
+* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
+* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
+* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_
+
+The full documentation for CMake commands is in the ``cmake-commands`` manpage
+and available on `CMake's website <https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_

Added: www-releases/trunk/4.0.1/docs/_sources/CodeGenerator.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CodeGenerator.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CodeGenerator.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CodeGenerator.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,2700 @@
+==========================================
+The LLVM Target-Independent Code Generator
+==========================================
+
+.. role:: raw-html(raw)
+   :format: html
+
+.. raw:: html
+
+  <style>
+    .unknown { background-color: #C0C0C0; text-align: center; }
+    .unknown:before { content: "?" }
+    .no { background-color: #C11B17 }
+    .no:before { content: "N" }
+    .partial { background-color: #F88017 }
+    .yes { background-color: #0F0; }
+    .yes:before { content: "Y" }
+    .na { background-color: #6666FF; }
+    .na:before { content: "N/A" }
+  </style>
+
+.. contents::
+   :local:
+
+.. warning::
+  This is a work in progress.
+
+Introduction
+============
+
+The LLVM target-independent code generator is a framework that provides a suite
+of reusable components for translating the LLVM internal representation to the
+machine code for a specified target---either in assembly form (suitable for a
+static compiler) or in binary machine code format (usable for a JIT
+compiler). The LLVM target-independent code generator consists of six main
+components:
+
+1. `Abstract target description`_ interfaces which capture important properties
+   about various aspects of the machine, independently of how they will be used.
+   These interfaces are defined in ``include/llvm/Target/``.
+
+2. Classes used to represent the `code being generated`_ for a target.  These
+   classes are intended to be abstract enough to represent the machine code for
+   *any* target machine.  These classes are defined in
+   ``include/llvm/CodeGen/``. At this level, concepts like "constant pool
+   entries" and "jump tables" are explicitly exposed.
+
+3. Classes and algorithms used to represent code at the object file level, the
+   `MC Layer`_.  These classes represent assembly level constructs like labels,
+   sections, and instructions.  At this level, concepts like "constant pool
+   entries" and "jump tables" don't exist.
+
+4. `Target-independent algorithms`_ used to implement various phases of native
+   code generation (register allocation, scheduling, stack frame representation,
+   etc).  This code lives in ``lib/CodeGen/``.
+
+5. `Implementations of the abstract target description interfaces`_ for
+   particular targets.  These machine descriptions make use of the components
+   provided by LLVM, and can optionally provide custom target-specific passes,
+   to build complete code generators for a specific target.  Target descriptions
+   live in ``lib/Target/``.
+
+6. The target-independent JIT components.  The LLVM JIT is completely target
+   independent (it uses the ``TargetJITInfo`` structure to interface for
+   target-specific issues.  The code for the target-independent JIT lives in
+   ``lib/ExecutionEngine/JIT``.
+
+Depending on which part of the code generator you are interested in working on,
+different pieces of this will be useful to you.  In any case, you should be
+familiar with the `target description`_ and `machine code representation`_
+classes.  If you want to add a backend for a new target, you will need to
+`implement the target description`_ classes for your new target and understand
+the :doc:`LLVM code representation <LangRef>`.  If you are interested in
+implementing a new `code generation algorithm`_, it should only depend on the
+target-description and machine code representation classes, ensuring that it is
+portable.
+
+Required components in the code generator
+-----------------------------------------
+
+The two pieces of the LLVM code generator are the high-level interface to the
+code generator and the set of reusable components that can be used to build
+target-specific backends.  The two most important interfaces (:raw-html:`<tt>`
+`TargetMachine`_ :raw-html:`</tt>` and :raw-html:`<tt>` `DataLayout`_
+:raw-html:`</tt>`) are the only ones that are required to be defined for a
+backend to fit into the LLVM system, but the others must be defined if the
+reusable code generator components are going to be used.
+
+This design has two important implications.  The first is that LLVM can support
+completely non-traditional code generation targets.  For example, the C backend
+does not require register allocation, instruction selection, or any of the other
+standard components provided by the system.  As such, it only implements these
+two interfaces, and does its own thing. Note that C backend was removed from the
+trunk since LLVM 3.1 release. Another example of a code generator like this is a
+(purely hypothetical) backend that converts LLVM to the GCC RTL form and uses
+GCC to emit machine code for a target.
+
+This design also implies that it is possible to design and implement radically
+different code generators in the LLVM system that do not make use of any of the
+built-in components.  Doing so is not recommended at all, but could be required
+for radically different targets that do not fit into the LLVM machine
+description model: FPGAs for example.
+
+.. _high-level design of the code generator:
+
+The high-level design of the code generator
+-------------------------------------------
+
+The LLVM target-independent code generator is designed to support efficient and
+quality code generation for standard register-based microprocessors.  Code
+generation in this model is divided into the following stages:
+
+1. `Instruction Selection`_ --- This phase determines an efficient way to
+   express the input LLVM code in the target instruction set.  This stage
+   produces the initial code for the program in the target instruction set, then
+   makes use of virtual registers in SSA form and physical registers that
+   represent any required register assignments due to target constraints or
+   calling conventions.  This step turns the LLVM code into a DAG of target
+   instructions.
+
+2. `Scheduling and Formation`_ --- This phase takes the DAG of target
+   instructions produced by the instruction selection phase, determines an
+   ordering of the instructions, then emits the instructions as :raw-html:`<tt>`
+   `MachineInstr`_\s :raw-html:`</tt>` with that ordering.  Note that we
+   describe this in the `instruction selection section`_ because it operates on
+   a `SelectionDAG`_.
+
+3. `SSA-based Machine Code Optimizations`_ --- This optional stage consists of a
+   series of machine-code optimizations that operate on the SSA-form produced by
+   the instruction selector.  Optimizations like modulo-scheduling or peephole
+   optimization work here.
+
+4. `Register Allocation`_ --- The target code is transformed from an infinite
+   virtual register file in SSA form to the concrete register file used by the
+   target.  This phase introduces spill code and eliminates all virtual register
+   references from the program.
+
+5. `Prolog/Epilog Code Insertion`_ --- Once the machine code has been generated
+   for the function and the amount of stack space required is known (used for
+   LLVM alloca's and spill slots), the prolog and epilog code for the function
+   can be inserted and "abstract stack location references" can be eliminated.
+   This stage is responsible for implementing optimizations like frame-pointer
+   elimination and stack packing.
+
+6. `Late Machine Code Optimizations`_ --- Optimizations that operate on "final"
+   machine code can go here, such as spill code scheduling and peephole
+   optimizations.
+
+7. `Code Emission`_ --- The final stage actually puts out the code for the
+   current function, either in the target assembler format or in machine
+   code.
+
+The code generator is based on the assumption that the instruction selector will
+use an optimal pattern matching selector to create high-quality sequences of
+native instructions.  Alternative code generator designs based on pattern
+expansion and aggressive iterative peephole optimization are much slower.  This
+design permits efficient compilation (important for JIT environments) and
+aggressive optimization (used when generating code offline) by allowing
+components of varying levels of sophistication to be used for any step of
+compilation.
+
+In addition to these stages, target implementations can insert arbitrary
+target-specific passes into the flow.  For example, the X86 target uses a
+special pass to handle the 80x87 floating point stack architecture.  Other
+targets with unusual requirements can be supported with custom passes as needed.
+
+Using TableGen for target description
+-------------------------------------
+
+The target description classes require a detailed description of the target
+architecture.  These target descriptions often have a large amount of common
+information (e.g., an ``add`` instruction is almost identical to a ``sub``
+instruction).  In order to allow the maximum amount of commonality to be
+factored out, the LLVM code generator uses the
+:doc:`TableGen/index` tool to describe big chunks of the
+target machine, which allows the use of domain-specific and target-specific
+abstractions to reduce the amount of repetition.
+
+As LLVM continues to be developed and refined, we plan to move more and more of
+the target description to the ``.td`` form.  Doing so gives us a number of
+advantages.  The most important is that it makes it easier to port LLVM because
+it reduces the amount of C++ code that has to be written, and the surface area
+of the code generator that needs to be understood before someone can get
+something working.  Second, it makes it easier to change things. In particular,
+if tables and other things are all emitted by ``tblgen``, we only need a change
+in one place (``tblgen``) to update all of the targets to a new interface.
+
+.. _Abstract target description:
+.. _target description:
+
+Target description classes
+==========================
+
+The LLVM target description classes (located in the ``include/llvm/Target``
+directory) provide an abstract description of the target machine independent of
+any particular client.  These classes are designed to capture the *abstract*
+properties of the target (such as the instructions and registers it has), and do
+not incorporate any particular pieces of code generation algorithms.
+
+All of the target description classes (except the :raw-html:`<tt>` `DataLayout`_
+:raw-html:`</tt>` class) are designed to be subclassed by the concrete target
+implementation, and have virtual methods implemented.  To get to these
+implementations, the :raw-html:`<tt>` `TargetMachine`_ :raw-html:`</tt>` class
+provides accessors that should be implemented by the target.
+
+.. _TargetMachine:
+
+The ``TargetMachine`` class
+---------------------------
+
+The ``TargetMachine`` class provides virtual methods that are used to access the
+target-specific implementations of the various target description classes via
+the ``get*Info`` methods (``getInstrInfo``, ``getRegisterInfo``,
+``getFrameInfo``, etc.).  This class is designed to be specialized by a concrete
+target implementation (e.g., ``X86TargetMachine``) which implements the various
+virtual methods.  The only required target description class is the
+:raw-html:`<tt>` `DataLayout`_ :raw-html:`</tt>` class, but if the code
+generator components are to be used, the other interfaces should be implemented
+as well.
+
+.. _DataLayout:
+
+The ``DataLayout`` class
+------------------------
+
+The ``DataLayout`` class is the only required target description class, and it
+is the only class that is not extensible (you cannot derive a new class from
+it).  ``DataLayout`` specifies information about how the target lays out memory
+for structures, the alignment requirements for various data types, the size of
+pointers in the target, and whether the target is little-endian or
+big-endian.
+
+.. _TargetLowering:
+
+The ``TargetLowering`` class
+----------------------------
+
+The ``TargetLowering`` class is used by SelectionDAG based instruction selectors
+primarily to describe how LLVM code should be lowered to SelectionDAG
+operations.  Among other things, this class indicates:
+
+* an initial register class to use for various ``ValueType``\s,
+
+* which operations are natively supported by the target machine,
+
+* the return type of ``setcc`` operations,
+
+* the type to use for shift amounts, and
+
+* various high-level characteristics, like whether it is profitable to turn
+  division by a constant into a multiplication sequence.
+
+.. _TargetRegisterInfo:
+
+The ``TargetRegisterInfo`` class
+--------------------------------
+
+The ``TargetRegisterInfo`` class is used to describe the register file of the
+target and any interactions between the registers.
+
+Registers are represented in the code generator by unsigned integers.  Physical
+registers (those that actually exist in the target description) are unique
+small numbers, and virtual registers are generally large.  Note that
+register ``#0`` is reserved as a flag value.
+
+Each register in the processor description has an associated
+``TargetRegisterDesc`` entry, which provides a textual name for the register
+(used for assembly output and debugging dumps) and a set of aliases (used to
+indicate whether one register overlaps with another).
+
+In addition to the per-register description, the ``TargetRegisterInfo`` class
+exposes a set of processor specific register classes (instances of the
+``TargetRegisterClass`` class).  Each register class contains sets of registers
+that have the same properties (for example, they are all 32-bit integer
+registers).  Each SSA virtual register created by the instruction selector has
+an associated register class.  When the register allocator runs, it replaces
+virtual registers with a physical register in the set.
+
+The target-specific implementations of these classes is auto-generated from a
+:doc:`TableGen/index` description of the register file.
+
+.. _TargetInstrInfo:
+
+The ``TargetInstrInfo`` class
+-----------------------------
+
+The ``TargetInstrInfo`` class is used to describe the machine instructions
+supported by the target.  Descriptions define things like the mnemonic for
+the opcode, the number of operands, the list of implicit register uses and defs,
+whether the instruction has certain target-independent properties (accesses
+memory, is commutable, etc), and holds any target-specific flags.
+
+The ``TargetFrameLowering`` class
+---------------------------------
+
+The ``TargetFrameLowering`` class is used to provide information about the stack
+frame layout of the target. It holds the direction of stack growth, the known
+stack alignment on entry to each function, and the offset to the local area.
+The offset to the local area is the offset from the stack pointer on function
+entry to the first location where function data (local variables, spill
+locations) can be stored.
+
+The ``TargetSubtarget`` class
+-----------------------------
+
+The ``TargetSubtarget`` class is used to provide information about the specific
+chip set being targeted.  A sub-target informs code generation of which
+instructions are supported, instruction latencies and instruction execution
+itinerary; i.e., which processing units are used, in what order, and for how
+long.
+
+The ``TargetJITInfo`` class
+---------------------------
+
+The ``TargetJITInfo`` class exposes an abstract interface used by the
+Just-In-Time code generator to perform target-specific activities, such as
+emitting stubs.  If a ``TargetMachine`` supports JIT code generation, it should
+provide one of these objects through the ``getJITInfo`` method.
+
+.. _code being generated:
+.. _machine code representation:
+
+Machine code description classes
+================================
+
+At the high-level, LLVM code is translated to a machine specific representation
+formed out of :raw-html:`<tt>` `MachineFunction`_ :raw-html:`</tt>`,
+:raw-html:`<tt>` `MachineBasicBlock`_ :raw-html:`</tt>`, and :raw-html:`<tt>`
+`MachineInstr`_ :raw-html:`</tt>` instances (defined in
+``include/llvm/CodeGen``).  This representation is completely target agnostic,
+representing instructions in their most abstract form: an opcode and a series of
+operands.  This representation is designed to support both an SSA representation
+for machine code, as well as a register allocated, non-SSA form.
+
+.. _MachineInstr:
+
+The ``MachineInstr`` class
+--------------------------
+
+Target machine instructions are represented as instances of the ``MachineInstr``
+class.  This class is an extremely abstract way of representing machine
+instructions.  In particular, it only keeps track of an opcode number and a set
+of operands.
+
+The opcode number is a simple unsigned integer that only has meaning to a
+specific backend.  All of the instructions for a target should be defined in the
+``*InstrInfo.td`` file for the target. The opcode enum values are auto-generated
+from this description.  The ``MachineInstr`` class does not have any information
+about how to interpret the instruction (i.e., what the semantics of the
+instruction are); for that you must refer to the :raw-html:`<tt>`
+`TargetInstrInfo`_ :raw-html:`</tt>` class.
+
+The operands of a machine instruction can be of several different types: a
+register reference, a constant integer, a basic block reference, etc.  In
+addition, a machine operand should be marked as a def or a use of the value
+(though only registers are allowed to be defs).
+
+By convention, the LLVM code generator orders instruction operands so that all
+register definitions come before the register uses, even on architectures that
+are normally printed in other orders.  For example, the SPARC add instruction:
+"``add %i1, %i2, %i3``" adds the "%i1", and "%i2" registers and stores the
+result into the "%i3" register.  In the LLVM code generator, the operands should
+be stored as "``%i3, %i1, %i2``": with the destination first.
+
+Keeping destination (definition) operands at the beginning of the operand list
+has several advantages.  In particular, the debugging printer will print the
+instruction like this:
+
+.. code-block:: llvm
+
+  %r3 = add %i1, %i2
+
+Also if the first operand is a def, it is easier to `create instructions`_ whose
+only def is the first operand.
+
+.. _create instructions:
+
+Using the ``MachineInstrBuilder.h`` functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Machine instructions are created by using the ``BuildMI`` functions, located in
+the ``include/llvm/CodeGen/MachineInstrBuilder.h`` file.  The ``BuildMI``
+functions make it easy to build arbitrary machine instructions.  Usage of the
+``BuildMI`` functions look like this:
+
+.. code-block:: c++
+
+  // Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42')
+  // instruction and insert it at the end of the given MachineBasicBlock.
+  const TargetInstrInfo &TII = ...
+  MachineBasicBlock &MBB = ...
+  DebugLoc DL;
+  MachineInstr *MI = BuildMI(MBB, DL, TII.get(X86::MOV32ri), DestReg).addImm(42);
+
+  // Create the same instr, but insert it before a specified iterator point.
+  MachineBasicBlock::iterator MBBI = ...
+  BuildMI(MBB, MBBI, DL, TII.get(X86::MOV32ri), DestReg).addImm(42);
+
+  // Create a 'cmp Reg, 0' instruction, no destination reg.
+  MI = BuildMI(MBB, DL, TII.get(X86::CMP32ri8)).addReg(Reg).addImm(42);
+
+  // Create an 'sahf' instruction which takes no operands and stores nothing.
+  MI = BuildMI(MBB, DL, TII.get(X86::SAHF));
+
+  // Create a self looping branch instruction.
+  BuildMI(MBB, DL, TII.get(X86::JNE)).addMBB(&MBB);
+
+If you need to add a definition operand (other than the optional destination
+register), you must explicitly mark it as such:
+
+.. code-block:: c++
+
+  MI.addReg(Reg, RegState::Define);
+
+Fixed (preassigned) registers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+One important issue that the code generator needs to be aware of is the presence
+of fixed registers.  In particular, there are often places in the instruction
+stream where the register allocator *must* arrange for a particular value to be
+in a particular register.  This can occur due to limitations of the instruction
+set (e.g., the X86 can only do a 32-bit divide with the ``EAX``/``EDX``
+registers), or external factors like calling conventions.  In any case, the
+instruction selector should emit code that copies a virtual register into or out
+of a physical register when needed.
+
+For example, consider this simple LLVM example:
+
+.. code-block:: llvm
+
+  define i32 @test(i32 %X, i32 %Y) {
+    %Z = sdiv i32 %X, %Y
+    ret i32 %Z
+  }
+
+The X86 instruction selector might produce this machine code for the ``div`` and
+``ret``:
+
+.. code-block:: text
+
+  ;; Start of div
+  %EAX = mov %reg1024           ;; Copy X (in reg1024) into EAX
+  %reg1027 = sar %reg1024, 31
+  %EDX = mov %reg1027           ;; Sign extend X into EDX
+  idiv %reg1025                 ;; Divide by Y (in reg1025)
+  %reg1026 = mov %EAX           ;; Read the result (Z) out of EAX
+
+  ;; Start of ret
+  %EAX = mov %reg1026           ;; 32-bit return value goes in EAX
+  ret
+
+By the end of code generation, the register allocator would coalesce the
+registers and delete the resultant identity moves producing the following
+code:
+
+.. code-block:: text
+
+  ;; X is in EAX, Y is in ECX
+  mov %EAX, %EDX
+  sar %EDX, 31
+  idiv %ECX
+  ret
+
+This approach is extremely general (if it can handle the X86 architecture, it
+can handle anything!) and allows all of the target specific knowledge about the
+instruction stream to be isolated in the instruction selector.  Note that
+physical registers should have a short lifetime for good code generation, and
+all physical registers are assumed dead on entry to and exit from basic blocks
+(before register allocation).  Thus, if you need a value to be live across basic
+block boundaries, it *must* live in a virtual register.
+
+Call-clobbered registers
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Some machine instructions, like calls, clobber a large number of physical
+registers.  Rather than adding ``<def,dead>`` operands for all of them, it is
+possible to use an ``MO_RegisterMask`` operand instead.  The register mask
+operand holds a bit mask of preserved registers, and everything else is
+considered to be clobbered by the instruction.
+
+Machine code in SSA form
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+``MachineInstr``'s are initially selected in SSA-form, and are maintained in
+SSA-form until register allocation happens.  For the most part, this is
+trivially simple since LLVM is already in SSA form; LLVM PHI nodes become
+machine code PHI nodes, and virtual registers are only allowed to have a single
+definition.
+
+After register allocation, machine code is no longer in SSA-form because there
+are no virtual registers left in the code.
+
+.. _MachineBasicBlock:
+
+The ``MachineBasicBlock`` class
+-------------------------------
+
+The ``MachineBasicBlock`` class contains a list of machine instructions
+(:raw-html:`<tt>` `MachineInstr`_ :raw-html:`</tt>` instances).  It roughly
+corresponds to the LLVM code input to the instruction selector, but there can be
+a one-to-many mapping (i.e. one LLVM basic block can map to multiple machine
+basic blocks). The ``MachineBasicBlock`` class has a "``getBasicBlock``" method,
+which returns the LLVM basic block that it comes from.
+
+.. _MachineFunction:
+
+The ``MachineFunction`` class
+-----------------------------
+
+The ``MachineFunction`` class contains a list of machine basic blocks
+(:raw-html:`<tt>` `MachineBasicBlock`_ :raw-html:`</tt>` instances).  It
+corresponds one-to-one with the LLVM function input to the instruction selector.
+In addition to a list of basic blocks, the ``MachineFunction`` contains a a
+``MachineConstantPool``, a ``MachineFrameInfo``, a ``MachineFunctionInfo``, and
+a ``MachineRegisterInfo``.  See ``include/llvm/CodeGen/MachineFunction.h`` for
+more information.
+
+``MachineInstr Bundles``
+------------------------
+
+LLVM code generator can model sequences of instructions as MachineInstr
+bundles. A MI bundle can model a VLIW group / pack which contains an arbitrary
+number of parallel instructions. It can also be used to model a sequential list
+of instructions (potentially with data dependencies) that cannot be legally
+separated (e.g. ARM Thumb2 IT blocks).
+
+Conceptually a MI bundle is a MI with a number of other MIs nested within:
+
+::
+
+  --------------
+  |   Bundle   | ---------
+  --------------          \
+         |           ----------------
+         |           |      MI      |
+         |           ----------------
+         |                   |
+         |           ----------------
+         |           |      MI      |
+         |           ----------------
+         |                   |
+         |           ----------------
+         |           |      MI      |
+         |           ----------------
+         |
+  --------------
+  |   Bundle   | --------
+  --------------         \
+         |           ----------------
+         |           |      MI      |
+         |           ----------------
+         |                   |
+         |           ----------------
+         |           |      MI      |
+         |           ----------------
+         |                   |
+         |                  ...
+         |
+  --------------
+  |   Bundle   | --------
+  --------------         \
+         |
+        ...
+
+MI bundle support does not change the physical representations of
+MachineBasicBlock and MachineInstr. All the MIs (including top level and nested
+ones) are stored as sequential list of MIs. The "bundled" MIs are marked with
+the 'InsideBundle' flag. A top level MI with the special BUNDLE opcode is used
+to represent the start of a bundle. It's legal to mix BUNDLE MIs with indiviual
+MIs that are not inside bundles nor represent bundles.
+
+MachineInstr passes should operate on a MI bundle as a single unit. Member
+methods have been taught to correctly handle bundles and MIs inside bundles.
+The MachineBasicBlock iterator has been modified to skip over bundled MIs to
+enforce the bundle-as-a-single-unit concept. An alternative iterator
+instr_iterator has been added to MachineBasicBlock to allow passes to iterate
+over all of the MIs in a MachineBasicBlock, including those which are nested
+inside bundles. The top level BUNDLE instruction must have the correct set of
+register MachineOperand's that represent the cumulative inputs and outputs of
+the bundled MIs.
+
+Packing / bundling of MachineInstr's should be done as part of the register
+allocation super-pass. More specifically, the pass which determines what MIs
+should be bundled together must be done after code generator exits SSA form
+(i.e. after two-address pass, PHI elimination, and copy coalescing).  Bundles
+should only be finalized (i.e. adding BUNDLE MIs and input and output register
+MachineOperands) after virtual registers have been rewritten into physical
+registers. This requirement eliminates the need to add virtual register operands
+to BUNDLE instructions which would effectively double the virtual register def
+and use lists.
+
+.. _MC Layer:
+
+The "MC" Layer
+==============
+
+The MC Layer is used to represent and process code at the raw machine code
+level, devoid of "high level" information like "constant pools", "jump tables",
+"global variables" or anything like that.  At this level, LLVM handles things
+like label names, machine instructions, and sections in the object file.  The
+code in this layer is used for a number of important purposes: the tail end of
+the code generator uses it to write a .s or .o file, and it is also used by the
+llvm-mc tool to implement standalone machine code assemblers and disassemblers.
+
+This section describes some of the important classes.  There are also a number
+of important subsystems that interact at this layer, they are described later in
+this manual.
+
+.. _MCStreamer:
+
+The ``MCStreamer`` API
+----------------------
+
+MCStreamer is best thought of as an assembler API.  It is an abstract API which
+is *implemented* in different ways (e.g. to output a .s file, output an ELF .o
+file, etc) but whose API correspond directly to what you see in a .s file.
+MCStreamer has one method per directive, such as EmitLabel, EmitSymbolAttribute,
+SwitchSection, EmitValue (for .byte, .word), etc, which directly correspond to
+assembly level directives.  It also has an EmitInstruction method, which is used
+to output an MCInst to the streamer.
+
+This API is most important for two clients: the llvm-mc stand-alone assembler is
+effectively a parser that parses a line, then invokes a method on MCStreamer. In
+the code generator, the `Code Emission`_ phase of the code generator lowers
+higher level LLVM IR and Machine* constructs down to the MC layer, emitting
+directives through MCStreamer.
+
+On the implementation side of MCStreamer, there are two major implementations:
+one for writing out a .s file (MCAsmStreamer), and one for writing out a .o
+file (MCObjectStreamer).  MCAsmStreamer is a straightforward implementation
+that prints out a directive for each method (e.g. ``EmitValue -> .byte``), but
+MCObjectStreamer implements a full assembler.
+
+For target specific directives, the MCStreamer has a MCTargetStreamer instance.
+Each target that needs it defines a class that inherits from it and is a lot
+like MCStreamer itself: It has one method per directive and two classes that
+inherit from it, a target object streamer and a target asm streamer. The target
+asm streamer just prints it (``emitFnStart -> .fnstart``), and the object
+streamer implement the assembler logic for it.
+
+To make llvm use these classes, the target initialization must call
+TargetRegistry::RegisterAsmStreamer and TargetRegistry::RegisterMCObjectStreamer
+passing callbacks that allocate the corresponding target streamer and pass it
+to createAsmStreamer or to the appropriate object streamer constructor.
+
+The ``MCContext`` class
+-----------------------
+
+The MCContext class is the owner of a variety of uniqued data structures at the
+MC layer, including symbols, sections, etc.  As such, this is the class that you
+interact with to create symbols and sections.  This class can not be subclassed.
+
+The ``MCSymbol`` class
+----------------------
+
+The MCSymbol class represents a symbol (aka label) in the assembly file.  There
+are two interesting kinds of symbols: assembler temporary symbols, and normal
+symbols.  Assembler temporary symbols are used and processed by the assembler
+but are discarded when the object file is produced.  The distinction is usually
+represented by adding a prefix to the label, for example "L" labels are
+assembler temporary labels in MachO.
+
+MCSymbols are created by MCContext and uniqued there.  This means that MCSymbols
+can be compared for pointer equivalence to find out if they are the same symbol.
+Note that pointer inequality does not guarantee the labels will end up at
+different addresses though.  It's perfectly legal to output something like this
+to the .s file:
+
+::
+
+  foo:
+  bar:
+    .byte 4
+
+In this case, both the foo and bar symbols will have the same address.
+
+The ``MCSection`` class
+-----------------------
+
+The ``MCSection`` class represents an object-file specific section. It is
+subclassed by object file specific implementations (e.g. ``MCSectionMachO``,
+``MCSectionCOFF``, ``MCSectionELF``) and these are created and uniqued by
+MCContext.  The MCStreamer has a notion of the current section, which can be
+changed with the SwitchToSection method (which corresponds to a ".section"
+directive in a .s file).
+
+.. _MCInst:
+
+The ``MCInst`` class
+--------------------
+
+The ``MCInst`` class is a target-independent representation of an instruction.
+It is a simple class (much more so than `MachineInstr`_) that holds a
+target-specific opcode and a vector of MCOperands.  MCOperand, in turn, is a
+simple discriminated union of three cases: 1) a simple immediate, 2) a target
+register ID, 3) a symbolic expression (e.g. "``Lfoo-Lbar+42``") as an MCExpr.
+
+MCInst is the common currency used to represent machine instructions at the MC
+layer.  It is the type used by the instruction encoder, the instruction printer,
+and the type generated by the assembly parser and disassembler.
+
+.. _Target-independent algorithms:
+.. _code generation algorithm:
+
+Target-independent code generation algorithms
+=============================================
+
+This section documents the phases described in the `high-level design of the
+code generator`_.  It explains how they work and some of the rationale behind
+their design.
+
+.. _Instruction Selection:
+.. _instruction selection section:
+
+Instruction Selection
+---------------------
+
+Instruction Selection is the process of translating LLVM code presented to the
+code generator into target-specific machine instructions.  There are several
+well-known ways to do this in the literature.  LLVM uses a SelectionDAG based
+instruction selector.
+
+Portions of the DAG instruction selector are generated from the target
+description (``*.td``) files.  Our goal is for the entire instruction selector
+to be generated from these ``.td`` files, though currently there are still
+things that require custom C++ code.
+
+.. _SelectionDAG:
+
+Introduction to SelectionDAGs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The SelectionDAG provides an abstraction for code representation in a way that
+is amenable to instruction selection using automatic techniques
+(e.g. dynamic-programming based optimal pattern matching selectors). It is also
+well-suited to other phases of code generation; in particular, instruction
+scheduling (SelectionDAG's are very close to scheduling DAGs post-selection).
+Additionally, the SelectionDAG provides a host representation where a large
+variety of very-low-level (but target-independent) `optimizations`_ may be
+performed; ones which require extensive information about the instructions
+efficiently supported by the target.
+
+The SelectionDAG is a Directed-Acyclic-Graph whose nodes are instances of the
+``SDNode`` class.  The primary payload of the ``SDNode`` is its operation code
+(Opcode) that indicates what operation the node performs and the operands to the
+operation.  The various operation node types are described at the top of the
+``include/llvm/CodeGen/ISDOpcodes.h`` file.
+
+Although most operations define a single value, each node in the graph may
+define multiple values.  For example, a combined div/rem operation will define
+both the dividend and the remainder. Many other situations require multiple
+values as well.  Each node also has some number of operands, which are edges to
+the node defining the used value.  Because nodes may define multiple values,
+edges are represented by instances of the ``SDValue`` class, which is a
+``<SDNode, unsigned>`` pair, indicating the node and result value being used,
+respectively.  Each value produced by an ``SDNode`` has an associated ``MVT``
+(Machine Value Type) indicating what the type of the value is.
+
+SelectionDAGs contain two different kinds of values: those that represent data
+flow and those that represent control flow dependencies.  Data values are simple
+edges with an integer or floating point value type.  Control edges are
+represented as "chain" edges which are of type ``MVT::Other``.  These edges
+provide an ordering between nodes that have side effects (such as loads, stores,
+calls, returns, etc).  All nodes that have side effects should take a token
+chain as input and produce a new one as output.  By convention, token chain
+inputs are always operand #0, and chain results are always the last value
+produced by an operation. However, after instruction selection, the
+machine nodes have their chain after the instruction's operands, and
+may be followed by glue nodes.
+
+A SelectionDAG has designated "Entry" and "Root" nodes.  The Entry node is
+always a marker node with an Opcode of ``ISD::EntryToken``.  The Root node is
+the final side-effecting node in the token chain. For example, in a single basic
+block function it would be the return node.
+
+One important concept for SelectionDAGs is the notion of a "legal" vs.
+"illegal" DAG.  A legal DAG for a target is one that only uses supported
+operations and supported types.  On a 32-bit PowerPC, for example, a DAG with a
+value of type i1, i8, i16, or i64 would be illegal, as would a DAG that uses a
+SREM or UREM operation.  The `legalize types`_ and `legalize operations`_ phases
+are responsible for turning an illegal DAG into a legal DAG.
+
+.. _SelectionDAG-Process:
+
+SelectionDAG Instruction Selection Process
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+SelectionDAG-based instruction selection consists of the following steps:
+
+#. `Build initial DAG`_ --- This stage performs a simple translation from the
+   input LLVM code to an illegal SelectionDAG.
+
+#. `Optimize SelectionDAG`_ --- This stage performs simple optimizations on the
+   SelectionDAG to simplify it, and recognize meta instructions (like rotates
+   and ``div``/``rem`` pairs) for targets that support these meta operations.
+   This makes the resultant code more efficient and the `select instructions
+   from DAG`_ phase (below) simpler.
+
+#. `Legalize SelectionDAG Types`_ --- This stage transforms SelectionDAG nodes
+   to eliminate any types that are unsupported on the target.
+
+#. `Optimize SelectionDAG`_ --- The SelectionDAG optimizer is run to clean up
+   redundancies exposed by type legalization.
+
+#. `Legalize SelectionDAG Ops`_ --- This stage transforms SelectionDAG nodes to
+   eliminate any operations that are unsupported on the target.
+
+#. `Optimize SelectionDAG`_ --- The SelectionDAG optimizer is run to eliminate
+   inefficiencies introduced by operation legalization.
+
+#. `Select instructions from DAG`_ --- Finally, the target instruction selector
+   matches the DAG operations to target instructions.  This process translates
+   the target-independent input DAG into another DAG of target instructions.
+
+#. `SelectionDAG Scheduling and Formation`_ --- The last phase assigns a linear
+   order to the instructions in the target-instruction DAG and emits them into
+   the MachineFunction being compiled.  This step uses traditional prepass
+   scheduling techniques.
+
+After all of these steps are complete, the SelectionDAG is destroyed and the
+rest of the code generation passes are run.
+
+One great way to visualize what is going on here is to take advantage of a few
+LLC command line options.  The following options pop up a window displaying the
+SelectionDAG at specific times (if you only get errors printed to the console
+while using this, you probably `need to configure your
+system <ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ to add support for it).
+
+* ``-view-dag-combine1-dags`` displays the DAG after being built, before the
+  first optimization pass.
+
+* ``-view-legalize-dags`` displays the DAG before Legalization.
+
+* ``-view-dag-combine2-dags`` displays the DAG before the second optimization
+  pass.
+
+* ``-view-isel-dags`` displays the DAG before the Select phase.
+
+* ``-view-sched-dags`` displays the DAG before Scheduling.
+
+The ``-view-sunit-dags`` displays the Scheduler's dependency graph.  This graph
+is based on the final SelectionDAG, with nodes that must be scheduled together
+bundled into a single scheduling-unit node, and with immediate operands and
+other nodes that aren't relevant for scheduling omitted.
+
+The option ``-filter-view-dags`` allows to select the name of the basic block
+that you are interested to visualize and filters all the previous
+``view-*-dags`` options.
+
+.. _Build initial DAG:
+
+Initial SelectionDAG Construction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The initial SelectionDAG is na\ :raw-html:`ï`\ vely peephole expanded from
+the LLVM input by the ``SelectionDAGBuilder`` class.  The intent of this pass
+is to expose as much low-level, target-specific details to the SelectionDAG as
+possible.  This pass is mostly hard-coded (e.g. an LLVM ``add`` turns into an
+``SDNode add`` while a ``getelementptr`` is expanded into the obvious
+arithmetic). This pass requires target-specific hooks to lower calls, returns,
+varargs, etc.  For these features, the :raw-html:`<tt>` `TargetLowering`_
+:raw-html:`</tt>` interface is used.
+
+.. _legalize types:
+.. _Legalize SelectionDAG Types:
+.. _Legalize SelectionDAG Ops:
+
+SelectionDAG LegalizeTypes Phase
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The Legalize phase is in charge of converting a DAG to only use the types that
+are natively supported by the target.
+
+There are two main ways of converting values of unsupported scalar types to
+values of supported types: converting small types to larger types ("promoting"),
+and breaking up large integer types into smaller ones ("expanding").  For
+example, a target might require that all f32 values are promoted to f64 and that
+all i1/i8/i16 values are promoted to i32.  The same target might require that
+all i64 values be expanded into pairs of i32 values.  These changes can insert
+sign and zero extensions as needed to make sure that the final code has the same
+behavior as the input.
+
+There are two main ways of converting values of unsupported vector types to
+value of supported types: splitting vector types, multiple times if necessary,
+until a legal type is found, and extending vector types by adding elements to
+the end to round them out to legal types ("widening").  If a vector gets split
+all the way down to single-element parts with no supported vector type being
+found, the elements are converted to scalars ("scalarizing").
+
+A target implementation tells the legalizer which types are supported (and which
+register class to use for them) by calling the ``addRegisterClass`` method in
+its ``TargetLowering`` constructor.
+
+.. _legalize operations:
+.. _Legalizer:
+
+SelectionDAG Legalize Phase
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The Legalize phase is in charge of converting a DAG to only use the operations
+that are natively supported by the target.
+
+Targets often have weird constraints, such as not supporting every operation on
+every supported datatype (e.g. X86 does not support byte conditional moves and
+PowerPC does not support sign-extending loads from a 16-bit memory location).
+Legalize takes care of this by open-coding another sequence of operations to
+emulate the operation ("expansion"), by promoting one type to a larger type that
+supports the operation ("promotion"), or by using a target-specific hook to
+implement the legalization ("custom").
+
+A target implementation tells the legalizer which operations are not supported
+(and which of the above three actions to take) by calling the
+``setOperationAction`` method in its ``TargetLowering`` constructor.
+
+Prior to the existence of the Legalize passes, we required that every target
+`selector`_ supported and handled every operator and type even if they are not
+natively supported.  The introduction of the Legalize phases allows all of the
+canonicalization patterns to be shared across targets, and makes it very easy to
+optimize the canonicalized code because it is still in the form of a DAG.
+
+.. _optimizations:
+.. _Optimize SelectionDAG:
+.. _selector:
+
+SelectionDAG Optimization Phase: the DAG Combiner
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The SelectionDAG optimization phase is run multiple times for code generation,
+immediately after the DAG is built and once after each legalization.  The first
+run of the pass allows the initial code to be cleaned up (e.g. performing
+optimizations that depend on knowing that the operators have restricted type
+inputs).  Subsequent runs of the pass clean up the messy code generated by the
+Legalize passes, which allows Legalize to be very simple (it can focus on making
+code legal instead of focusing on generating *good* and legal code).
+
+One important class of optimizations performed is optimizing inserted sign and
+zero extension instructions.  We currently use ad-hoc techniques, but could move
+to more rigorous techniques in the future.  Here are some good papers on the
+subject:
+
+"`Widening integer arithmetic <http://www.eecs.harvard.edu/~nr/pubs/widen-abstract.html>`_" :raw-html:`<br>`
+Kevin Redwine and Norman Ramsey :raw-html:`<br>`
+International Conference on Compiler Construction (CC) 2004
+
+"`Effective sign extension elimination <http://portal.acm.org/citation.cfm?doid=512529.512552>`_"  :raw-html:`<br>`
+Motohiro Kawahito, Hideaki Komatsu, and Toshio Nakatani :raw-html:`<br>`
+Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design
+and Implementation.
+
+.. _Select instructions from DAG:
+
+SelectionDAG Select Phase
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The Select phase is the bulk of the target-specific code for instruction
+selection.  This phase takes a legal SelectionDAG as input, pattern matches the
+instructions supported by the target to this DAG, and produces a new DAG of
+target code.  For example, consider the following LLVM fragment:
+
+.. code-block:: llvm
+
+  %t1 = fadd float %W, %X
+  %t2 = fmul float %t1, %Y
+  %t3 = fadd float %t2, %Z
+
+This LLVM code corresponds to a SelectionDAG that looks basically like this:
+
+.. code-block:: text
+
+  (fadd:f32 (fmul:f32 (fadd:f32 W, X), Y), Z)
+
+If a target supports floating point multiply-and-add (FMA) operations, one of
+the adds can be merged with the multiply.  On the PowerPC, for example, the
+output of the instruction selector might look like this DAG:
+
+::
+
+  (FMADDS (FADDS W, X), Y, Z)
+
+The ``FMADDS`` instruction is a ternary instruction that multiplies its first
+two operands and adds the third (as single-precision floating-point numbers).
+The ``FADDS`` instruction is a simple binary single-precision add instruction.
+To perform this pattern match, the PowerPC backend includes the following
+instruction definitions:
+
+.. code-block:: text
+  :emphasize-lines: 4-5,9
+
+  def FMADDS : AForm_1<59, 29,
+                      (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRC, F4RC:$FRB),
+                      "fmadds $FRT, $FRA, $FRC, $FRB",
+                      [(set F4RC:$FRT, (fadd (fmul F4RC:$FRA, F4RC:$FRC),
+                                             F4RC:$FRB))]>;
+  def FADDS : AForm_2<59, 21,
+                      (ops F4RC:$FRT, F4RC:$FRA, F4RC:$FRB),
+                      "fadds $FRT, $FRA, $FRB",
+                      [(set F4RC:$FRT, (fadd F4RC:$FRA, F4RC:$FRB))]>;
+
+The highlighted portion of the instruction definitions indicates the pattern
+used to match the instructions. The DAG operators (like ``fmul``/``fadd``)
+are defined in the ``include/llvm/Target/TargetSelectionDAG.td`` file.
+"``F4RC``" is the register class of the input and result values.
+
+The TableGen DAG instruction selector generator reads the instruction patterns
+in the ``.td`` file and automatically builds parts of the pattern matching code
+for your target.  It has the following strengths:
+
+* At compiler-compiler time, it analyzes your instruction patterns and tells you
+  if your patterns make sense or not.
+
+* It can handle arbitrary constraints on operands for the pattern match.  In
+  particular, it is straight-forward to say things like "match any immediate
+  that is a 13-bit sign-extended value".  For examples, see the ``immSExt16``
+  and related ``tblgen`` classes in the PowerPC backend.
+
+* It knows several important identities for the patterns defined.  For example,
+  it knows that addition is commutative, so it allows the ``FMADDS`` pattern
+  above to match "``(fadd X, (fmul Y, Z))``" as well as "``(fadd (fmul X, Y),
+  Z)``", without the target author having to specially handle this case.
+
+* It has a full-featured type-inferencing system.  In particular, you should
+  rarely have to explicitly tell the system what type parts of your patterns
+  are.  In the ``FMADDS`` case above, we didn't have to tell ``tblgen`` that all
+  of the nodes in the pattern are of type 'f32'.  It was able to infer and
+  propagate this knowledge from the fact that ``F4RC`` has type 'f32'.
+
+* Targets can define their own (and rely on built-in) "pattern fragments".
+  Pattern fragments are chunks of reusable patterns that get inlined into your
+  patterns during compiler-compiler time.  For example, the integer "``(not
+  x)``" operation is actually defined as a pattern fragment that expands as
+  "``(xor x, -1)``", since the SelectionDAG does not have a native '``not``'
+  operation.  Targets can define their own short-hand fragments as they see fit.
+  See the definition of '``not``' and '``ineg``' for examples.
+
+* In addition to instructions, targets can specify arbitrary patterns that map
+  to one or more instructions using the 'Pat' class.  For example, the PowerPC
+  has no way to load an arbitrary integer immediate into a register in one
+  instruction. To tell tblgen how to do this, it defines:
+
+  ::
+
+    // Arbitrary immediate support.  Implement in terms of LIS/ORI.
+    def : Pat<(i32 imm:$imm),
+              (ORI (LIS (HI16 imm:$imm)), (LO16 imm:$imm))>;
+
+  If none of the single-instruction patterns for loading an immediate into a
+  register match, this will be used.  This rule says "match an arbitrary i32
+  immediate, turning it into an ``ORI`` ('or a 16-bit immediate') and an ``LIS``
+  ('load 16-bit immediate, where the immediate is shifted to the left 16 bits')
+  instruction".  To make this work, the ``LO16``/``HI16`` node transformations
+  are used to manipulate the input immediate (in this case, take the high or low
+  16-bits of the immediate).
+
+* When using the 'Pat' class to map a pattern to an instruction that has one
+  or more complex operands (like e.g. `X86 addressing mode`_), the pattern may
+  either specify the operand as a whole using a ``ComplexPattern``, or else it
+  may specify the components of the complex operand separately.  The latter is
+  done e.g. for pre-increment instructions by the PowerPC back end:
+
+  ::
+
+    def STWU  : DForm_1<37, (outs ptr_rc:$ea_res), (ins GPRC:$rS, memri:$dst),
+                    "stwu $rS, $dst", LdStStoreUpd, []>,
+                    RegConstraint<"$dst.reg = $ea_res">, NoEncode<"$ea_res">;
+
+    def : Pat<(pre_store GPRC:$rS, ptr_rc:$ptrreg, iaddroff:$ptroff),
+              (STWU GPRC:$rS, iaddroff:$ptroff, ptr_rc:$ptrreg)>;
+
+  Here, the pair of ``ptroff`` and ``ptrreg`` operands is matched onto the
+  complex operand ``dst`` of class ``memri`` in the ``STWU`` instruction.
+
+* While the system does automate a lot, it still allows you to write custom C++
+  code to match special cases if there is something that is hard to
+  express.
+
+While it has many strengths, the system currently has some limitations,
+primarily because it is a work in progress and is not yet finished:
+
+* Overall, there is no way to define or match SelectionDAG nodes that define
+  multiple values (e.g. ``SMUL_LOHI``, ``LOAD``, ``CALL``, etc).  This is the
+  biggest reason that you currently still *have to* write custom C++ code
+  for your instruction selector.
+
+* There is no great way to support matching complex addressing modes yet.  In
+  the future, we will extend pattern fragments to allow them to define multiple
+  values (e.g. the four operands of the `X86 addressing mode`_, which are
+  currently matched with custom C++ code).  In addition, we'll extend fragments
+  so that a fragment can match multiple different patterns.
+
+* We don't automatically infer flags like ``isStore``/``isLoad`` yet.
+
+* We don't automatically generate the set of supported registers and operations
+  for the `Legalizer`_ yet.
+
+* We don't have a way of tying in custom legalized nodes yet.
+
+Despite these limitations, the instruction selector generator is still quite
+useful for most of the binary and logical operations in typical instruction
+sets.  If you run into any problems or can't figure out how to do something,
+please let Chris know!
+
+.. _Scheduling and Formation:
+.. _SelectionDAG Scheduling and Formation:
+
+SelectionDAG Scheduling and Formation Phase
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The scheduling phase takes the DAG of target instructions from the selection
+phase and assigns an order.  The scheduler can pick an order depending on
+various constraints of the machines (i.e. order for minimal register pressure or
+try to cover instruction latencies).  Once an order is established, the DAG is
+converted to a list of :raw-html:`<tt>` `MachineInstr`_\s :raw-html:`</tt>` and
+the SelectionDAG is destroyed.
+
+Note that this phase is logically separate from the instruction selection phase,
+but is tied to it closely in the code because it operates on SelectionDAGs.
+
+Future directions for the SelectionDAG
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. Optional function-at-a-time selection.
+
+#. Auto-generate entire selector from ``.td`` file.
+
+.. _SSA-based Machine Code Optimizations:
+
+SSA-based Machine Code Optimizations
+------------------------------------
+
+To Be Written
+
+Live Intervals
+--------------
+
+Live Intervals are the ranges (intervals) where a variable is *live*.  They are
+used by some `register allocator`_ passes to determine if two or more virtual
+registers which require the same physical register are live at the same point in
+the program (i.e., they conflict).  When this situation occurs, one virtual
+register must be *spilled*.
+
+Live Variable Analysis
+^^^^^^^^^^^^^^^^^^^^^^
+
+The first step in determining the live intervals of variables is to calculate
+the set of registers that are immediately dead after the instruction (i.e., the
+instruction calculates the value, but it is never used) and the set of registers
+that are used by the instruction, but are never used after the instruction
+(i.e., they are killed). Live variable information is computed for
+each *virtual* register and *register allocatable* physical register
+in the function.  This is done in a very efficient manner because it uses SSA to
+sparsely compute lifetime information for virtual registers (which are in SSA
+form) and only has to track physical registers within a block.  Before register
+allocation, LLVM can assume that physical registers are only live within a
+single basic block.  This allows it to do a single, local analysis to resolve
+physical register lifetimes within each basic block. If a physical register is
+not register allocatable (e.g., a stack pointer or condition codes), it is not
+tracked.
+
+Physical registers may be live in to or out of a function. Live in values are
+typically arguments in registers. Live out values are typically return values in
+registers. Live in values are marked as such, and are given a dummy "defining"
+instruction during live intervals analysis. If the last basic block of a
+function is a ``return``, then it's marked as using all live out values in the
+function.
+
+``PHI`` nodes need to be handled specially, because the calculation of the live
+variable information from a depth first traversal of the CFG of the function
+won't guarantee that a virtual register used by the ``PHI`` node is defined
+before it's used. When a ``PHI`` node is encountered, only the definition is
+handled, because the uses will be handled in other basic blocks.
+
+For each ``PHI`` node of the current basic block, we simulate an assignment at
+the end of the current basic block and traverse the successor basic blocks. If a
+successor basic block has a ``PHI`` node and one of the ``PHI`` node's operands
+is coming from the current basic block, then the variable is marked as *alive*
+within the current basic block and all of its predecessor basic blocks, until
+the basic block with the defining instruction is encountered.
+
+Live Intervals Analysis
+^^^^^^^^^^^^^^^^^^^^^^^
+
+We now have the information available to perform the live intervals analysis and
+build the live intervals themselves.  We start off by numbering the basic blocks
+and machine instructions.  We then handle the "live-in" values.  These are in
+physical registers, so the physical register is assumed to be killed by the end
+of the basic block.  Live intervals for virtual registers are computed for some
+ordering of the machine instructions ``[1, N]``.  A live interval is an interval
+``[i, j)``, where ``1 >= i >= j > N``, for which a variable is live.
+
+.. note::
+  More to come...
+
+.. _Register Allocation:
+.. _register allocator:
+
+Register Allocation
+-------------------
+
+The *Register Allocation problem* consists in mapping a program
+:raw-html:`<b><tt>` P\ :sub:`v`\ :raw-html:`</tt></b>`, that can use an unbounded
+number of virtual registers, to a program :raw-html:`<b><tt>` P\ :sub:`p`\
+:raw-html:`</tt></b>` that contains a finite (possibly small) number of physical
+registers. Each target architecture has a different number of physical
+registers. If the number of physical registers is not enough to accommodate all
+the virtual registers, some of them will have to be mapped into memory. These
+virtuals are called *spilled virtuals*.
+
+How registers are represented in LLVM
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In LLVM, physical registers are denoted by integer numbers that normally range
+from 1 to 1023. To see how this numbering is defined for a particular
+architecture, you can read the ``GenRegisterNames.inc`` file for that
+architecture. For instance, by inspecting
+``lib/Target/X86/X86GenRegisterInfo.inc`` we see that the 32-bit register
+``EAX`` is denoted by 43, and the MMX register ``MM0`` is mapped to 65.
+
+Some architectures contain registers that share the same physical location. A
+notable example is the X86 platform. For instance, in the X86 architecture, the
+registers ``EAX``, ``AX`` and ``AL`` share the first eight bits. These physical
+registers are marked as *aliased* in LLVM. Given a particular architecture, you
+can check which registers are aliased by inspecting its ``RegisterInfo.td``
+file. Moreover, the class ``MCRegAliasIterator`` enumerates all the physical
+registers aliased to a register.
+
+Physical registers, in LLVM, are grouped in *Register Classes*.  Elements in the
+same register class are functionally equivalent, and can be interchangeably
+used. Each virtual register can only be mapped to physical registers of a
+particular class. For instance, in the X86 architecture, some virtuals can only
+be allocated to 8 bit registers.  A register class is described by
+``TargetRegisterClass`` objects.  To discover if a virtual register is
+compatible with a given physical, this code can be used:
+
+.. code-block:: c++
+
+  bool RegMapping_Fer::compatible_class(MachineFunction &mf,
+                                        unsigned v_reg,
+                                        unsigned p_reg) {
+    assert(TargetRegisterInfo::isPhysicalRegister(p_reg) &&
+           "Target register must be physical");
+    const TargetRegisterClass *trc = mf.getRegInfo().getRegClass(v_reg);
+    return trc->contains(p_reg);
+  }
+
+Sometimes, mostly for debugging purposes, it is useful to change the number of
+physical registers available in the target architecture. This must be done
+statically, inside the ``TargetRegsterInfo.td`` file. Just ``grep`` for
+``RegisterClass``, the last parameter of which is a list of registers. Just
+commenting some out is one simple way to avoid them being used. A more polite
+way is to explicitly exclude some registers from the *allocation order*. See the
+definition of the ``GR8`` register class in
+``lib/Target/X86/X86RegisterInfo.td`` for an example of this.
+
+Virtual registers are also denoted by integer numbers. Contrary to physical
+registers, different virtual registers never share the same number. Whereas
+physical registers are statically defined in a ``TargetRegisterInfo.td`` file
+and cannot be created by the application developer, that is not the case with
+virtual registers. In order to create new virtual registers, use the method
+``MachineRegisterInfo::createVirtualRegister()``. This method will return a new
+virtual register. Use an ``IndexedMap<Foo, VirtReg2IndexFunctor>`` to hold
+information per virtual register. If you need to enumerate all virtual
+registers, use the function ``TargetRegisterInfo::index2VirtReg()`` to find the
+virtual register numbers:
+
+.. code-block:: c++
+
+    for (unsigned i = 0, e = MRI->getNumVirtRegs(); i != e; ++i) {
+      unsigned VirtReg = TargetRegisterInfo::index2VirtReg(i);
+      stuff(VirtReg);
+    }
+
+Before register allocation, the operands of an instruction are mostly virtual
+registers, although physical registers may also be used. In order to check if a
+given machine operand is a register, use the boolean function
+``MachineOperand::isRegister()``. To obtain the integer code of a register, use
+``MachineOperand::getReg()``. An instruction may define or use a register. For
+instance, ``ADD reg:1026 := reg:1025 reg:1024`` defines the registers 1024, and
+uses registers 1025 and 1026. Given a register operand, the method
+``MachineOperand::isUse()`` informs if that register is being used by the
+instruction. The method ``MachineOperand::isDef()`` informs if that registers is
+being defined.
+
+We will call physical registers present in the LLVM bitcode before register
+allocation *pre-colored registers*. Pre-colored registers are used in many
+different situations, for instance, to pass parameters of functions calls, and
+to store results of particular instructions. There are two types of pre-colored
+registers: the ones *implicitly* defined, and those *explicitly*
+defined. Explicitly defined registers are normal operands, and can be accessed
+with ``MachineInstr::getOperand(int)::getReg()``.  In order to check which
+registers are implicitly defined by an instruction, use the
+``TargetInstrInfo::get(opcode)::ImplicitDefs``, where ``opcode`` is the opcode
+of the target instruction. One important difference between explicit and
+implicit physical registers is that the latter are defined statically for each
+instruction, whereas the former may vary depending on the program being
+compiled. For example, an instruction that represents a function call will
+always implicitly define or use the same set of physical registers. To read the
+registers implicitly used by an instruction, use
+``TargetInstrInfo::get(opcode)::ImplicitUses``. Pre-colored registers impose
+constraints on any register allocation algorithm. The register allocator must
+make sure that none of them are overwritten by the values of virtual registers
+while still alive.
+
+Mapping virtual registers to physical registers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There are two ways to map virtual registers to physical registers (or to memory
+slots). The first way, that we will call *direct mapping*, is based on the use
+of methods of the classes ``TargetRegisterInfo``, and ``MachineOperand``. The
+second way, that we will call *indirect mapping*, relies on the ``VirtRegMap``
+class in order to insert loads and stores sending and getting values to and from
+memory.
+
+The direct mapping provides more flexibility to the developer of the register
+allocator; however, it is more error prone, and demands more implementation
+work.  Basically, the programmer will have to specify where load and store
+instructions should be inserted in the target function being compiled in order
+to get and store values in memory. To assign a physical register to a virtual
+register present in a given operand, use ``MachineOperand::setReg(p_reg)``. To
+insert a store instruction, use ``TargetInstrInfo::storeRegToStackSlot(...)``,
+and to insert a load instruction, use ``TargetInstrInfo::loadRegFromStackSlot``.
+
+The indirect mapping shields the application developer from the complexities of
+inserting load and store instructions. In order to map a virtual register to a
+physical one, use ``VirtRegMap::assignVirt2Phys(vreg, preg)``.  In order to map
+a certain virtual register to memory, use
+``VirtRegMap::assignVirt2StackSlot(vreg)``. This method will return the stack
+slot where ``vreg``'s value will be located.  If it is necessary to map another
+virtual register to the same stack slot, use
+``VirtRegMap::assignVirt2StackSlot(vreg, stack_location)``. One important point
+to consider when using the indirect mapping, is that even if a virtual register
+is mapped to memory, it still needs to be mapped to a physical register. This
+physical register is the location where the virtual register is supposed to be
+found before being stored or after being reloaded.
+
+If the indirect strategy is used, after all the virtual registers have been
+mapped to physical registers or stack slots, it is necessary to use a spiller
+object to place load and store instructions in the code. Every virtual that has
+been mapped to a stack slot will be stored to memory after being defined and will
+be loaded before being used. The implementation of the spiller tries to recycle
+load/store instructions, avoiding unnecessary instructions. For an example of
+how to invoke the spiller, see ``RegAllocLinearScan::runOnMachineFunction`` in
+``lib/CodeGen/RegAllocLinearScan.cpp``.
+
+Handling two address instructions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+With very rare exceptions (e.g., function calls), the LLVM machine code
+instructions are three address instructions. That is, each instruction is
+expected to define at most one register, and to use at most two registers.
+However, some architectures use two address instructions. In this case, the
+defined register is also one of the used registers. For instance, an instruction
+such as ``ADD %EAX, %EBX``, in X86 is actually equivalent to ``%EAX = %EAX +
+%EBX``.
+
+In order to produce correct code, LLVM must convert three address instructions
+that represent two address instructions into true two address instructions. LLVM
+provides the pass ``TwoAddressInstructionPass`` for this specific purpose. It
+must be run before register allocation takes place. After its execution, the
+resulting code may no longer be in SSA form. This happens, for instance, in
+situations where an instruction such as ``%a = ADD %b %c`` is converted to two
+instructions such as:
+
+::
+
+  %a = MOVE %b
+  %a = ADD %a %c
+
+Notice that, internally, the second instruction is represented as ``ADD
+%a[def/use] %c``. I.e., the register operand ``%a`` is both used and defined by
+the instruction.
+
+The SSA deconstruction phase
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+An important transformation that happens during register allocation is called
+the *SSA Deconstruction Phase*. The SSA form simplifies many analyses that are
+performed on the control flow graph of programs. However, traditional
+instruction sets do not implement PHI instructions. Thus, in order to generate
+executable code, compilers must replace PHI instructions with other instructions
+that preserve their semantics.
+
+There are many ways in which PHI instructions can safely be removed from the
+target code. The most traditional PHI deconstruction algorithm replaces PHI
+instructions with copy instructions. That is the strategy adopted by LLVM. The
+SSA deconstruction algorithm is implemented in
+``lib/CodeGen/PHIElimination.cpp``. In order to invoke this pass, the identifier
+``PHIEliminationID`` must be marked as required in the code of the register
+allocator.
+
+Instruction folding
+^^^^^^^^^^^^^^^^^^^
+
+*Instruction folding* is an optimization performed during register allocation
+that removes unnecessary copy instructions. For instance, a sequence of
+instructions such as:
+
+::
+
+  %EBX = LOAD %mem_address
+  %EAX = COPY %EBX
+
+can be safely substituted by the single instruction:
+
+::
+
+  %EAX = LOAD %mem_address
+
+Instructions can be folded with the
+``TargetRegisterInfo::foldMemoryOperand(...)`` method. Care must be taken when
+folding instructions; a folded instruction can be quite different from the
+original instruction. See ``LiveIntervals::addIntervalsForSpills`` in
+``lib/CodeGen/LiveIntervalAnalysis.cpp`` for an example of its use.
+
+Built in register allocators
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The LLVM infrastructure provides the application developer with three different
+register allocators:
+
+* *Fast* --- This register allocator is the default for debug builds. It
+  allocates registers on a basic block level, attempting to keep values in
+  registers and reusing registers as appropriate.
+
+* *Basic* --- This is an incremental approach to register allocation. Live
+  ranges are assigned to registers one at a time in an order that is driven by
+  heuristics. Since code can be rewritten on-the-fly during allocation, this
+  framework allows interesting allocators to be developed as extensions. It is
+  not itself a production register allocator but is a potentially useful
+  stand-alone mode for triaging bugs and as a performance baseline.
+
+* *Greedy* --- *The default allocator*. This is a highly tuned implementation of
+  the *Basic* allocator that incorporates global live range splitting. This
+  allocator works hard to minimize the cost of spill code.
+
+* *PBQP* --- A Partitioned Boolean Quadratic Programming (PBQP) based register
+  allocator. This allocator works by constructing a PBQP problem representing
+  the register allocation problem under consideration, solving this using a PBQP
+  solver, and mapping the solution back to a register assignment.
+
+The type of register allocator used in ``llc`` can be chosen with the command
+line option ``-regalloc=...``:
+
+.. code-block:: bash
+
+  $ llc -regalloc=linearscan file.bc -o ln.s
+  $ llc -regalloc=fast file.bc -o fa.s
+  $ llc -regalloc=pbqp file.bc -o pbqp.s
+
+.. _Prolog/Epilog Code Insertion:
+
+Prolog/Epilog Code Insertion
+----------------------------
+
+Compact Unwind
+
+Throwing an exception requires *unwinding* out of a function. The information on
+how to unwind a given function is traditionally expressed in DWARF unwind
+(a.k.a. frame) info. But that format was originally developed for debuggers to
+backtrace, and each Frame Description Entry (FDE) requires ~20-30 bytes per
+function. There is also the cost of mapping from an address in a function to the
+corresponding FDE at runtime. An alternative unwind encoding is called *compact
+unwind* and requires just 4-bytes per function.
+
+The compact unwind encoding is a 32-bit value, which is encoded in an
+architecture-specific way. It specifies which registers to restore and from
+where, and how to unwind out of the function. When the linker creates a final
+linked image, it will create a ``__TEXT,__unwind_info`` section. This section is
+a small and fast way for the runtime to access unwind info for any given
+function. If we emit compact unwind info for the function, that compact unwind
+info will be encoded in the ``__TEXT,__unwind_info`` section. If we emit DWARF
+unwind info, the ``__TEXT,__unwind_info`` section will contain the offset of the
+FDE in the ``__TEXT,__eh_frame`` section in the final linked image.
+
+For X86, there are three modes for the compact unwind encoding:
+
+*Function with a Frame Pointer (``EBP`` or ``RBP``)*
+  ``EBP/RBP``-based frame, where ``EBP/RBP`` is pushed onto the stack
+  immediately after the return address, then ``ESP/RSP`` is moved to
+  ``EBP/RBP``. Thus to unwind, ``ESP/RSP`` is restored with the current
+  ``EBP/RBP`` value, then ``EBP/RBP`` is restored by popping the stack, and the
+  return is done by popping the stack once more into the PC. All non-volatile
+  registers that need to be restored must have been saved in a small range on
+  the stack that starts ``EBP-4`` to ``EBP-1020`` (``RBP-8`` to
+  ``RBP-1020``). The offset (divided by 4 in 32-bit mode and 8 in 64-bit mode)
+  is encoded in bits 16-23 (mask: ``0x00FF0000``).  The registers saved are
+  encoded in bits 0-14 (mask: ``0x00007FFF``) as five 3-bit entries from the
+  following table:
+
+    ==============  =============  ===============
+    Compact Number  i386 Register  x86-64 Register
+    ==============  =============  ===============
+    1               ``EBX``        ``RBX``
+    2               ``ECX``        ``R12``
+    3               ``EDX``        ``R13``
+    4               ``EDI``        ``R14``
+    5               ``ESI``        ``R15``
+    6               ``EBP``        ``RBP``
+    ==============  =============  ===============
+
+*Frameless with a Small Constant Stack Size (``EBP`` or ``RBP`` is not used as a frame pointer)*
+  To return, a constant (encoded in the compact unwind encoding) is added to the
+  ``ESP/RSP``.  Then the return is done by popping the stack into the PC. All
+  non-volatile registers that need to be restored must have been saved on the
+  stack immediately after the return address. The stack size (divided by 4 in
+  32-bit mode and 8 in 64-bit mode) is encoded in bits 16-23 (mask:
+  ``0x00FF0000``). There is a maximum stack size of 1024 bytes in 32-bit mode
+  and 2048 in 64-bit mode. The number of registers saved is encoded in bits 9-12
+  (mask: ``0x00001C00``). Bits 0-9 (mask: ``0x000003FF``) contain which
+  registers were saved and their order. (See the
+  ``encodeCompactUnwindRegistersWithoutFrame()`` function in
+  ``lib/Target/X86FrameLowering.cpp`` for the encoding algorithm.)
+
+*Frameless with a Large Constant Stack Size (``EBP`` or ``RBP`` is not used as a frame pointer)*
+  This case is like the "Frameless with a Small Constant Stack Size" case, but
+  the stack size is too large to encode in the compact unwind encoding. Instead
+  it requires that the function contains "``subl $nnnnnn, %esp``" in its
+  prolog. The compact encoding contains the offset to the ``$nnnnnn`` value in
+  the function in bits 9-12 (mask: ``0x00001C00``).
+
+.. _Late Machine Code Optimizations:
+
+Late Machine Code Optimizations
+-------------------------------
+
+.. note::
+
+  To Be Written
+
+.. _Code Emission:
+
+Code Emission
+-------------
+
+The code emission step of code generation is responsible for lowering from the
+code generator abstractions (like `MachineFunction`_, `MachineInstr`_, etc) down
+to the abstractions used by the MC layer (`MCInst`_, `MCStreamer`_, etc).  This
+is done with a combination of several different classes: the (misnamed)
+target-independent AsmPrinter class, target-specific subclasses of AsmPrinter
+(such as SparcAsmPrinter), and the TargetLoweringObjectFile class.
+
+Since the MC layer works at the level of abstraction of object files, it doesn't
+have a notion of functions, global variables etc.  Instead, it thinks about
+labels, directives, and instructions.  A key class used at this time is the
+MCStreamer class.  This is an abstract API that is implemented in different ways
+(e.g. to output a .s file, output an ELF .o file, etc) that is effectively an
+"assembler API".  MCStreamer has one method per directive, such as EmitLabel,
+EmitSymbolAttribute, SwitchSection, etc, which directly correspond to assembly
+level directives.
+
+If you are interested in implementing a code generator for a target, there are
+three important things that you have to implement for your target:
+
+#. First, you need a subclass of AsmPrinter for your target.  This class
+   implements the general lowering process converting MachineFunction's into MC
+   label constructs.  The AsmPrinter base class provides a number of useful
+   methods and routines, and also allows you to override the lowering process in
+   some important ways.  You should get much of the lowering for free if you are
+   implementing an ELF, COFF, or MachO target, because the
+   TargetLoweringObjectFile class implements much of the common logic.
+
+#. Second, you need to implement an instruction printer for your target.  The
+   instruction printer takes an `MCInst`_ and renders it to a raw_ostream as
+   text.  Most of this is automatically generated from the .td file (when you
+   specify something like "``add $dst, $src1, $src2``" in the instructions), but
+   you need to implement routines to print operands.
+
+#. Third, you need to implement code that lowers a `MachineInstr`_ to an MCInst,
+   usually implemented in "<target>MCInstLower.cpp".  This lowering process is
+   often target specific, and is responsible for turning jump table entries,
+   constant pool indices, global variable addresses, etc into MCLabels as
+   appropriate.  This translation layer is also responsible for expanding pseudo
+   ops used by the code generator into the actual machine instructions they
+   correspond to. The MCInsts that are generated by this are fed into the
+   instruction printer or the encoder.
+
+Finally, at your choosing, you can also implement a subclass of MCCodeEmitter
+which lowers MCInst's into machine code bytes and relocations.  This is
+important if you want to support direct .o file emission, or would like to
+implement an assembler for your target.
+
+VLIW Packetizer
+---------------
+
+In a Very Long Instruction Word (VLIW) architecture, the compiler is responsible
+for mapping instructions to functional-units available on the architecture. To
+that end, the compiler creates groups of instructions called *packets* or
+*bundles*. The VLIW packetizer in LLVM is a target-independent mechanism to
+enable the packetization of machine instructions.
+
+Mapping from instructions to functional units
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Instructions in a VLIW target can typically be mapped to multiple functional
+units. During the process of packetizing, the compiler must be able to reason
+about whether an instruction can be added to a packet. This decision can be
+complex since the compiler has to examine all possible mappings of instructions
+to functional units. Therefore to alleviate compilation-time complexity, the
+VLIW packetizer parses the instruction classes of a target and generates tables
+at compiler build time. These tables can then be queried by the provided
+machine-independent API to determine if an instruction can be accommodated in a
+packet.
+
+How the packetization tables are generated and used
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The packetizer reads instruction classes from a target's itineraries and creates
+a deterministic finite automaton (DFA) to represent the state of a packet. A DFA
+consists of three major elements: inputs, states, and transitions. The set of
+inputs for the generated DFA represents the instruction being added to a
+packet. The states represent the possible consumption of functional units by
+instructions in a packet. In the DFA, transitions from one state to another
+occur on the addition of an instruction to an existing packet. If there is a
+legal mapping of functional units to instructions, then the DFA contains a
+corresponding transition. The absence of a transition indicates that a legal
+mapping does not exist and that the instruction cannot be added to the packet.
+
+To generate tables for a VLIW target, add *Target*\ GenDFAPacketizer.inc as a
+target to the Makefile in the target directory. The exported API provides three
+functions: ``DFAPacketizer::clearResources()``,
+``DFAPacketizer::reserveResources(MachineInstr *MI)``, and
+``DFAPacketizer::canReserveResources(MachineInstr *MI)``. These functions allow
+a target packetizer to add an instruction to an existing packet and to check
+whether an instruction can be added to a packet. See
+``llvm/CodeGen/DFAPacketizer.h`` for more information.
+
+Implementing a Native Assembler
+===============================
+
+Though you're probably reading this because you want to write or maintain a
+compiler backend, LLVM also fully supports building a native assembler.
+We've tried hard to automate the generation of the assembler from the .td files
+(in particular the instruction syntax and encodings), which means that a large
+part of the manual and repetitive data entry can be factored and shared with the
+compiler.
+
+Instruction Parsing
+-------------------
+
+.. note::
+
+  To Be Written
+
+
+Instruction Alias Processing
+----------------------------
+
+Once the instruction is parsed, it enters the MatchInstructionImpl function.
+The MatchInstructionImpl function performs alias processing and then does actual
+matching.
+
+Alias processing is the phase that canonicalizes different lexical forms of the
+same instructions down to one representation.  There are several different kinds
+of alias that are possible to implement and they are listed below in the order
+that they are processed (which is in order from simplest/weakest to most
+complex/powerful).  Generally you want to use the first alias mechanism that
+meets the needs of your instruction, because it will allow a more concise
+description.
+
+Mnemonic Aliases
+^^^^^^^^^^^^^^^^
+
+The first phase of alias processing is simple instruction mnemonic remapping for
+classes of instructions which are allowed with two different mnemonics.  This
+phase is a simple and unconditionally remapping from one input mnemonic to one
+output mnemonic.  It isn't possible for this form of alias to look at the
+operands at all, so the remapping must apply for all forms of a given mnemonic.
+Mnemonic aliases are defined simply, for example X86 has:
+
+::
+
+  def : MnemonicAlias<"cbw",     "cbtw">;
+  def : MnemonicAlias<"smovq",   "movsq">;
+  def : MnemonicAlias<"fldcww",  "fldcw">;
+  def : MnemonicAlias<"fucompi", "fucomip">;
+  def : MnemonicAlias<"ud2a",    "ud2">;
+
+... and many others.  With a MnemonicAlias definition, the mnemonic is remapped
+simply and directly.  Though MnemonicAlias's can't look at any aspect of the
+instruction (such as the operands) they can depend on global modes (the same
+ones supported by the matcher), through a Requires clause:
+
+::
+
+  def : MnemonicAlias<"pushf", "pushfq">, Requires<[In64BitMode]>;
+  def : MnemonicAlias<"pushf", "pushfl">, Requires<[In32BitMode]>;
+
+In this example, the mnemonic gets mapped into a different one depending on
+the current instruction set.
+
+Instruction Aliases
+^^^^^^^^^^^^^^^^^^^
+
+The most general phase of alias processing occurs while matching is happening:
+it provides new forms for the matcher to match along with a specific instruction
+to generate.  An instruction alias has two parts: the string to match and the
+instruction to generate.  For example:
+
+::
+
+  def : InstAlias<"movsx $src, $dst", (MOVSX16rr8W GR16:$dst, GR8  :$src)>;
+  def : InstAlias<"movsx $src, $dst", (MOVSX16rm8W GR16:$dst, i8mem:$src)>;
+  def : InstAlias<"movsx $src, $dst", (MOVSX32rr8  GR32:$dst, GR8  :$src)>;
+  def : InstAlias<"movsx $src, $dst", (MOVSX32rr16 GR32:$dst, GR16 :$src)>;
+  def : InstAlias<"movsx $src, $dst", (MOVSX64rr8  GR64:$dst, GR8  :$src)>;
+  def : InstAlias<"movsx $src, $dst", (MOVSX64rr16 GR64:$dst, GR16 :$src)>;
+  def : InstAlias<"movsx $src, $dst", (MOVSX64rr32 GR64:$dst, GR32 :$src)>;
+
+This shows a powerful example of the instruction aliases, matching the same
+mnemonic in multiple different ways depending on what operands are present in
+the assembly.  The result of instruction aliases can include operands in a
+different order than the destination instruction, and can use an input multiple
+times, for example:
+
+::
+
+  def : InstAlias<"clrb $reg", (XOR8rr  GR8 :$reg, GR8 :$reg)>;
+  def : InstAlias<"clrw $reg", (XOR16rr GR16:$reg, GR16:$reg)>;
+  def : InstAlias<"clrl $reg", (XOR32rr GR32:$reg, GR32:$reg)>;
+  def : InstAlias<"clrq $reg", (XOR64rr GR64:$reg, GR64:$reg)>;
+
+This example also shows that tied operands are only listed once.  In the X86
+backend, XOR8rr has two input GR8's and one output GR8 (where an input is tied
+to the output).  InstAliases take a flattened operand list without duplicates
+for tied operands.  The result of an instruction alias can also use immediates
+and fixed physical registers which are added as simple immediate operands in the
+result, for example:
+
+::
+
+  // Fixed Immediate operand.
+  def : InstAlias<"aad", (AAD8i8 10)>;
+
+  // Fixed register operand.
+  def : InstAlias<"fcomi", (COM_FIr ST1)>;
+
+  // Simple alias.
+  def : InstAlias<"fcomi $reg", (COM_FIr RST:$reg)>;
+
+Instruction aliases can also have a Requires clause to make them subtarget
+specific.
+
+If the back-end supports it, the instruction printer can automatically emit the
+alias rather than what's being aliased. It typically leads to better, more
+readable code. If it's better to print out what's being aliased, then pass a '0'
+as the third parameter to the InstAlias definition.
+
+Instruction Matching
+--------------------
+
+.. note::
+
+  To Be Written
+
+.. _Implementations of the abstract target description interfaces:
+.. _implement the target description:
+
+Target-specific Implementation Notes
+====================================
+
+This section of the document explains features or design decisions that are
+specific to the code generator for a particular target.  First we start with a
+table that summarizes what features are supported by each target.
+
+.. _target-feature-matrix:
+
+Target Feature Matrix
+---------------------
+
+Note that this table does not list features that are not supported fully by any
+target yet.  It considers a feature to be supported if at least one subtarget
+supports it.  A feature being supported means that it is useful and works for
+most cases, it does not indicate that there are zero known bugs in the
+implementation.  Here is the key:
+
+:raw-html:`<table border="1" cellspacing="0">`
+:raw-html:`<tr>`
+:raw-html:`<th>Unknown</th>`
+:raw-html:`<th>Not Applicable</th>`
+:raw-html:`<th>No support</th>`
+:raw-html:`<th>Partial Support</th>`
+:raw-html:`<th>Complete Support</th>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td class="unknown"></td>`
+:raw-html:`<td class="na"></td>`
+:raw-html:`<td class="no"></td>`
+:raw-html:`<td class="partial"></td>`
+:raw-html:`<td class="yes"></td>`
+:raw-html:`</tr>`
+:raw-html:`</table>`
+
+Here is the table:
+
+:raw-html:`<table width="689" border="1" cellspacing="0">`
+:raw-html:`<tr><td></td>`
+:raw-html:`<td colspan="13" align="center" style="background-color:#ffc">Target</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<th>Feature</th>`
+:raw-html:`<th>ARM</th>`
+:raw-html:`<th>Hexagon</th>`
+:raw-html:`<th>MSP430</th>`
+:raw-html:`<th>Mips</th>`
+:raw-html:`<th>NVPTX</th>`
+:raw-html:`<th>PowerPC</th>`
+:raw-html:`<th>Sparc</th>`
+:raw-html:`<th>SystemZ</th>`
+:raw-html:`<th>X86</th>`
+:raw-html:`<th>XCore</th>`
+:raw-html:`<th>eBPF</th>`
+:raw-html:`</tr>`
+
+:raw-html:`<tr>`
+:raw-html:`<td><a href="#feat_reliable">is generally reliable</a></td>`
+:raw-html:`<td class="yes"></td> <!-- ARM -->`
+:raw-html:`<td class="yes"></td> <!-- Hexagon -->`
+:raw-html:`<td class="unknown"></td> <!-- MSP430 -->`
+:raw-html:`<td class="yes"></td> <!-- Mips -->`
+:raw-html:`<td class="yes"></td> <!-- NVPTX -->`
+:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
+:raw-html:`<td class="yes"></td> <!-- Sparc -->`
+:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
+:raw-html:`<td class="yes"></td> <!-- X86 -->`
+:raw-html:`<td class="yes"></td> <!-- XCore -->`
+:raw-html:`<td class="yes"></td> <!-- eBPF -->`
+:raw-html:`</tr>`
+
+:raw-html:`<tr>`
+:raw-html:`<td><a href="#feat_asmparser">assembly parser</a></td>`
+:raw-html:`<td class="no"></td> <!-- ARM -->`
+:raw-html:`<td class="no"></td> <!-- Hexagon -->`
+:raw-html:`<td class="no"></td> <!-- MSP430 -->`
+:raw-html:`<td class="no"></td> <!-- Mips -->`
+:raw-html:`<td class="no"></td> <!-- NVPTX -->`
+:raw-html:`<td class="no"></td> <!-- PowerPC -->`
+:raw-html:`<td class="no"></td> <!-- Sparc -->`
+:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
+:raw-html:`<td class="yes"></td> <!-- X86 -->`
+:raw-html:`<td class="no"></td> <!-- XCore -->`
+:raw-html:`<td class="no"></td> <!-- eBPF -->`
+:raw-html:`</tr>`
+
+:raw-html:`<tr>`
+:raw-html:`<td><a href="#feat_disassembler">disassembler</a></td>`
+:raw-html:`<td class="yes"></td> <!-- ARM -->`
+:raw-html:`<td class="no"></td> <!-- Hexagon -->`
+:raw-html:`<td class="no"></td> <!-- MSP430 -->`
+:raw-html:`<td class="no"></td> <!-- Mips -->`
+:raw-html:`<td class="na"></td> <!-- NVPTX -->`
+:raw-html:`<td class="no"></td> <!-- PowerPC -->`
+:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
+:raw-html:`<td class="no"></td> <!-- Sparc -->`
+:raw-html:`<td class="yes"></td> <!-- X86 -->`
+:raw-html:`<td class="yes"></td> <!-- XCore -->`
+:raw-html:`<td class="yes"></td> <!-- eBPF -->`
+:raw-html:`</tr>`
+
+:raw-html:`<tr>`
+:raw-html:`<td><a href="#feat_inlineasm">inline asm</a></td>`
+:raw-html:`<td class="yes"></td> <!-- ARM -->`
+:raw-html:`<td class="yes"></td> <!-- Hexagon -->`
+:raw-html:`<td class="unknown"></td> <!-- MSP430 -->`
+:raw-html:`<td class="no"></td> <!-- Mips -->`
+:raw-html:`<td class="yes"></td> <!-- NVPTX -->`
+:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
+:raw-html:`<td class="unknown"></td> <!-- Sparc -->`
+:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
+:raw-html:`<td class="yes"></td> <!-- X86 -->`
+:raw-html:`<td class="yes"></td> <!-- XCore -->`
+:raw-html:`<td class="no"></td> <!-- eBPF -->`
+:raw-html:`</tr>`
+
+:raw-html:`<tr>`
+:raw-html:`<td><a href="#feat_jit">jit</a></td>`
+:raw-html:`<td class="partial"><a href="#feat_jit_arm">*</a></td> <!-- ARM -->`
+:raw-html:`<td class="no"></td> <!-- Hexagon -->`
+:raw-html:`<td class="unknown"></td> <!-- MSP430 -->`
+:raw-html:`<td class="yes"></td> <!-- Mips -->`
+:raw-html:`<td class="na"></td> <!-- NVPTX -->`
+:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
+:raw-html:`<td class="unknown"></td> <!-- Sparc -->`
+:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
+:raw-html:`<td class="yes"></td> <!-- X86 -->`
+:raw-html:`<td class="no"></td> <!-- XCore -->`
+:raw-html:`<td class="yes"></td> <!-- eBPF -->`
+:raw-html:`</tr>`
+
+:raw-html:`<tr>`
+:raw-html:`<td><a href="#feat_objectwrite">.o file writing</a></td>`
+:raw-html:`<td class="no"></td> <!-- ARM -->`
+:raw-html:`<td class="no"></td> <!-- Hexagon -->`
+:raw-html:`<td class="no"></td> <!-- MSP430 -->`
+:raw-html:`<td class="no"></td> <!-- Mips -->`
+:raw-html:`<td class="na"></td> <!-- NVPTX -->`
+:raw-html:`<td class="no"></td> <!-- PowerPC -->`
+:raw-html:`<td class="no"></td> <!-- Sparc -->`
+:raw-html:`<td class="yes"></td> <!-- SystemZ -->`
+:raw-html:`<td class="yes"></td> <!-- X86 -->`
+:raw-html:`<td class="no"></td> <!-- XCore -->`
+:raw-html:`<td class="yes"></td> <!-- eBPF -->`
+:raw-html:`</tr>`
+
+:raw-html:`<tr>`
+:raw-html:`<td><a hr:raw-html:`ef="#feat_tailcall">tail calls</a></td>`
+:raw-html:`<td class="yes"></td> <!-- ARM -->`
+:raw-html:`<td class="yes"></td> <!-- Hexagon -->`
+:raw-html:`<td class="unknown"></td> <!-- MSP430 -->`
+:raw-html:`<td class="no"></td> <!-- Mips -->`
+:raw-html:`<td class="no"></td> <!-- NVPTX -->`
+:raw-html:`<td class="yes"></td> <!-- PowerPC -->`
+:raw-html:`<td class="unknown"></td> <!-- Sparc -->`
+:raw-html:`<td class="no"></td> <!-- SystemZ -->`
+:raw-html:`<td class="yes"></td> <!-- X86 -->`
+:raw-html:`<td class="no"></td> <!-- XCore -->`
+:raw-html:`<td class="no"></td> <!-- eBPF -->`
+:raw-html:`</tr>`
+
+:raw-html:`<tr>`
+:raw-html:`<td><a href="#feat_segstacks">segmented stacks</a></td>`
+:raw-html:`<td class="no"></td> <!-- ARM -->`
+:raw-html:`<td class="no"></td> <!-- Hexagon -->`
+:raw-html:`<td class="no"></td> <!-- MSP430 -->`
+:raw-html:`<td class="no"></td> <!-- Mips -->`
+:raw-html:`<td class="no"></td> <!-- NVPTX -->`
+:raw-html:`<td class="no"></td> <!-- PowerPC -->`
+:raw-html:`<td class="no"></td> <!-- Sparc -->`
+:raw-html:`<td class="no"></td> <!-- SystemZ -->`
+:raw-html:`<td class="partial"><a href="#feat_segstacks_x86">*</a></td> <!-- X86 -->`
+:raw-html:`<td class="no"></td> <!-- XCore -->`
+:raw-html:`<td class="no"></td> <!-- eBPF -->`
+:raw-html:`</tr>`
+
+:raw-html:`</table>`
+
+.. _feat_reliable:
+
+Is Generally Reliable
+^^^^^^^^^^^^^^^^^^^^^
+
+This box indicates whether the target is considered to be production quality.
+This indicates that the target has been used as a static compiler to compile
+large amounts of code by a variety of different people and is in continuous use.
+
+.. _feat_asmparser:
+
+Assembly Parser
+^^^^^^^^^^^^^^^
+
+This box indicates whether the target supports parsing target specific .s files
+by implementing the MCAsmParser interface.  This is required for llvm-mc to be
+able to act as a native assembler and is required for inline assembly support in
+the native .o file writer.
+
+.. _feat_disassembler:
+
+Disassembler
+^^^^^^^^^^^^
+
+This box indicates whether the target supports the MCDisassembler API for
+disassembling machine opcode bytes into MCInst's.
+
+.. _feat_inlineasm:
+
+Inline Asm
+^^^^^^^^^^
+
+This box indicates whether the target supports most popular inline assembly
+constraints and modifiers.
+
+.. _feat_jit:
+
+JIT Support
+^^^^^^^^^^^
+
+This box indicates whether the target supports the JIT compiler through the
+ExecutionEngine interface.
+
+.. _feat_jit_arm:
+
+The ARM backend has basic support for integer code in ARM codegen mode, but
+lacks NEON and full Thumb support.
+
+.. _feat_objectwrite:
+
+.o File Writing
+^^^^^^^^^^^^^^^
+
+This box indicates whether the target supports writing .o files (e.g. MachO,
+ELF, and/or COFF) files directly from the target.  Note that the target also
+must include an assembly parser and general inline assembly support for full
+inline assembly support in the .o writer.
+
+Targets that don't support this feature can obviously still write out .o files,
+they just rely on having an external assembler to translate from a .s file to a
+.o file (as is the case for many C compilers).
+
+.. _feat_tailcall:
+
+Tail Calls
+^^^^^^^^^^
+
+This box indicates whether the target supports guaranteed tail calls.  These are
+calls marked "`tail <LangRef.html#i_call>`_" and use the fastcc calling
+convention.  Please see the `tail call section`_ for more details.
+
+.. _feat_segstacks:
+
+Segmented Stacks
+^^^^^^^^^^^^^^^^
+
+This box indicates whether the target supports segmented stacks. This replaces
+the traditional large C stack with many linked segments. It is compatible with
+the `gcc implementation <http://gcc.gnu.org/wiki/SplitStacks>`_ used by the Go
+front end.
+
+.. _feat_segstacks_x86:
+
+Basic support exists on the X86 backend. Currently vararg doesn't work and the
+object files are not marked the way the gold linker expects, but simple Go
+programs can be built by dragonegg.
+
+.. _tail call section:
+
+Tail call optimization
+----------------------
+
+Tail call optimization, callee reusing the stack of the caller, is currently
+supported on x86/x86-64 and PowerPC. It is performed if:
+
+* Caller and callee have the calling convention ``fastcc``, ``cc 10`` (GHC
+  calling convention) or ``cc 11`` (HiPE calling convention).
+
+* The call is a tail call - in tail position (ret immediately follows call and
+  ret uses value of call or is void).
+
+* Option ``-tailcallopt`` is enabled.
+
+* Platform-specific constraints are met.
+
+x86/x86-64 constraints:
+
+* No variable argument lists are used.
+
+* On x86-64 when generating GOT/PIC code only module-local calls (visibility =
+  hidden or protected) are supported.
+
+PowerPC constraints:
+
+* No variable argument lists are used.
+
+* No byval parameters are used.
+
+* On ppc32/64 GOT/PIC only module-local calls (visibility = hidden or protected)
+  are supported.
+
+Example:
+
+Call as ``llc -tailcallopt test.ll``.
+
+.. code-block:: llvm
+
+  declare fastcc i32 @tailcallee(i32 inreg %a1, i32 inreg %a2, i32 %a3, i32 %a4)
+
+  define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
+    %l1 = add i32 %in1, %in2
+    %tmp = tail call fastcc i32 @tailcallee(i32 %in1 inreg, i32 %in2 inreg, i32 %in1, i32 %l1)
+    ret i32 %tmp
+  }
+
+Implications of ``-tailcallopt``:
+
+To support tail call optimization in situations where the callee has more
+arguments than the caller a 'callee pops arguments' convention is used. This
+currently causes each ``fastcc`` call that is not tail call optimized (because
+one or more of above constraints are not met) to be followed by a readjustment
+of the stack. So performance might be worse in such cases.
+
+Sibling call optimization
+-------------------------
+
+Sibling call optimization is a restricted form of tail call optimization.
+Unlike tail call optimization described in the previous section, it can be
+performed automatically on any tail calls when ``-tailcallopt`` option is not
+specified.
+
+Sibling call optimization is currently performed on x86/x86-64 when the
+following constraints are met:
+
+* Caller and callee have the same calling convention. It can be either ``c`` or
+  ``fastcc``.
+
+* The call is a tail call - in tail position (ret immediately follows call and
+  ret uses value of call or is void).
+
+* Caller and callee have matching return type or the callee result is not used.
+
+* If any of the callee arguments are being passed in stack, they must be
+  available in caller's own incoming argument stack and the frame offsets must
+  be the same.
+
+Example:
+
+.. code-block:: llvm
+
+  declare i32 @bar(i32, i32)
+
+  define i32 @foo(i32 %a, i32 %b, i32 %c) {
+  entry:
+    %0 = tail call i32 @bar(i32 %a, i32 %b)
+    ret i32 %0
+  }
+
+The X86 backend
+---------------
+
+The X86 code generator lives in the ``lib/Target/X86`` directory.  This code
+generator is capable of targeting a variety of x86-32 and x86-64 processors, and
+includes support for ISA extensions such as MMX and SSE.
+
+X86 Target Triples supported
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following are the known target triples that are supported by the X86
+backend.  This is not an exhaustive list, and it would be useful to add those
+that people test.
+
+* **i686-pc-linux-gnu** --- Linux
+
+* **i386-unknown-freebsd5.3** --- FreeBSD 5.3
+
+* **i686-pc-cygwin** --- Cygwin on Win32
+
+* **i686-pc-mingw32** --- MingW on Win32
+
+* **i386-pc-mingw32msvc** --- MingW crosscompiler on Linux
+
+* **i686-apple-darwin*** --- Apple Darwin on X86
+
+* **x86_64-unknown-linux-gnu** --- Linux
+
+X86 Calling Conventions supported
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following target-specific calling conventions are known to backend:
+
+* **x86_StdCall** --- stdcall calling convention seen on Microsoft Windows
+  platform (CC ID = 64).
+
+* **x86_FastCall** --- fastcall calling convention seen on Microsoft Windows
+  platform (CC ID = 65).
+
+* **x86_ThisCall** --- Similar to X86_StdCall. Passes first argument in ECX,
+  others via stack. Callee is responsible for stack cleaning. This convention is
+  used by MSVC by default for methods in its ABI (CC ID = 70).
+
+.. _X86 addressing mode:
+
+Representing X86 addressing modes in MachineInstrs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The x86 has a very flexible way of accessing memory.  It is capable of forming
+memory addresses of the following expression directly in integer instructions
+(which use ModR/M addressing):
+
+::
+
+  SegmentReg: Base + [1,2,4,8] * IndexReg + Disp32
+
+In order to represent this, LLVM tracks no less than 5 operands for each memory
+operand of this form.  This means that the "load" form of '``mov``' has the
+following ``MachineOperand``\s in this order:
+
+::
+
+  Index:        0     |    1        2       3           4          5
+  Meaning:   DestReg, | BaseReg,  Scale, IndexReg, Displacement Segment
+  OperandTy: VirtReg, | VirtReg, UnsImm, VirtReg,   SignExtImm  PhysReg
+
+Stores, and all other instructions, treat the four memory operands in the same
+way and in the same order.  If the segment register is unspecified (regno = 0),
+then no segment override is generated.  "Lea" operations do not have a segment
+register specified, so they only have 4 operands for their memory reference.
+
+X86 address spaces supported
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+x86 has a feature which provides the ability to perform loads and stores to
+different address spaces via the x86 segment registers.  A segment override
+prefix byte on an instruction causes the instruction's memory access to go to
+the specified segment.  LLVM address space 0 is the default address space, which
+includes the stack, and any unqualified memory accesses in a program.  Address
+spaces 1-255 are currently reserved for user-defined code.  The GS-segment is
+represented by address space 256, the FS-segment is represented by address space
+257, and the SS-segment is represented by address space 258. Other x86 segments
+have yet to be allocated address space numbers.
+
+While these address spaces may seem similar to TLS via the ``thread_local``
+keyword, and often use the same underlying hardware, there are some fundamental
+differences.
+
+The ``thread_local`` keyword applies to global variables and specifies that they
+are to be allocated in thread-local memory. There are no type qualifiers
+involved, and these variables can be pointed to with normal pointers and
+accessed with normal loads and stores.  The ``thread_local`` keyword is
+target-independent at the LLVM IR level (though LLVM doesn't yet have
+implementations of it for some configurations)
+
+Special address spaces, in contrast, apply to static types. Every load and store
+has a particular address space in its address operand type, and this is what
+determines which address space is accessed.  LLVM ignores these special address
+space qualifiers on global variables, and does not provide a way to directly
+allocate storage in them.  At the LLVM IR level, the behavior of these special
+address spaces depends in part on the underlying OS or runtime environment, and
+they are specific to x86 (and LLVM doesn't yet handle them correctly in some
+cases).
+
+Some operating systems and runtime environments use (or may in the future use)
+the FS/GS-segment registers for various low-level purposes, so care should be
+taken when considering them.
+
+Instruction naming
+^^^^^^^^^^^^^^^^^^
+
+An instruction name consists of the base name, a default operand size, and a a
+character per operand with an optional special size. For example:
+
+::
+
+  ADD8rr      -> add, 8-bit register, 8-bit register
+  IMUL16rmi   -> imul, 16-bit register, 16-bit memory, 16-bit immediate
+  IMUL16rmi8  -> imul, 16-bit register, 16-bit memory, 8-bit immediate
+  MOVSX32rm16 -> movsx, 32-bit register, 16-bit memory
+
+The PowerPC backend
+-------------------
+
+The PowerPC code generator lives in the lib/Target/PowerPC directory.  The code
+generation is retargetable to several variations or *subtargets* of the PowerPC
+ISA; including ppc32, ppc64 and altivec.
+
+LLVM PowerPC ABI
+^^^^^^^^^^^^^^^^
+
+LLVM follows the AIX PowerPC ABI, with two deviations. LLVM uses a PC relative
+(PIC) or static addressing for accessing global values, so no TOC (r2) is
+used. Second, r31 is used as a frame pointer to allow dynamic growth of a stack
+frame.  LLVM takes advantage of having no TOC to provide space to save the frame
+pointer in the PowerPC linkage area of the caller frame.  Other details of
+PowerPC ABI can be found at `PowerPC ABI
+<http://developer.apple.com/documentation/DeveloperTools/Conceptual/LowLevelABI/Articles/32bitPowerPC.html>`_\
+. Note: This link describes the 32 bit ABI.  The 64 bit ABI is similar except
+space for GPRs are 8 bytes wide (not 4) and r13 is reserved for system use.
+
+Frame Layout
+^^^^^^^^^^^^
+
+The size of a PowerPC frame is usually fixed for the duration of a function's
+invocation.  Since the frame is fixed size, all references into the frame can be
+accessed via fixed offsets from the stack pointer.  The exception to this is
+when dynamic alloca or variable sized arrays are present, then a base pointer
+(r31) is used as a proxy for the stack pointer and stack pointer is free to grow
+or shrink.  A base pointer is also used if llvm-gcc is not passed the
+-fomit-frame-pointer flag. The stack pointer is always aligned to 16 bytes, so
+that space allocated for altivec vectors will be properly aligned.
+
+An invocation frame is laid out as follows (low memory at top):
+
+:raw-html:`<table border="1" cellspacing="0">`
+:raw-html:`<tr>`
+:raw-html:`<td>Linkage<br><br></td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>Parameter area<br><br></td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>Dynamic area<br><br></td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>Locals area<br><br></td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>Saved registers area<br><br></td>`
+:raw-html:`</tr>`
+:raw-html:`<tr style="border-style: none hidden none hidden;">`
+:raw-html:`<td><br></td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>Previous Frame<br><br></td>`
+:raw-html:`</tr>`
+:raw-html:`</table>`
+
+The *linkage* area is used by a callee to save special registers prior to
+allocating its own frame.  Only three entries are relevant to LLVM. The first
+entry is the previous stack pointer (sp), aka link.  This allows probing tools
+like gdb or exception handlers to quickly scan the frames in the stack.  A
+function epilog can also use the link to pop the frame from the stack.  The
+third entry in the linkage area is used to save the return address from the lr
+register. Finally, as mentioned above, the last entry is used to save the
+previous frame pointer (r31.)  The entries in the linkage area are the size of a
+GPR, thus the linkage area is 24 bytes long in 32 bit mode and 48 bytes in 64
+bit mode.
+
+32 bit linkage area:
+
+:raw-html:`<table  border="1" cellspacing="0">`
+:raw-html:`<tr>`
+:raw-html:`<td>0</td>`
+:raw-html:`<td>Saved SP (r1)</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>4</td>`
+:raw-html:`<td>Saved CR</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>8</td>`
+:raw-html:`<td>Saved LR</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>12</td>`
+:raw-html:`<td>Reserved</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>16</td>`
+:raw-html:`<td>Reserved</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>20</td>`
+:raw-html:`<td>Saved FP (r31)</td>`
+:raw-html:`</tr>`
+:raw-html:`</table>`
+
+64 bit linkage area:
+
+:raw-html:`<table border="1" cellspacing="0">`
+:raw-html:`<tr>`
+:raw-html:`<td>0</td>`
+:raw-html:`<td>Saved SP (r1)</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>8</td>`
+:raw-html:`<td>Saved CR</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>16</td>`
+:raw-html:`<td>Saved LR</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>24</td>`
+:raw-html:`<td>Reserved</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>32</td>`
+:raw-html:`<td>Reserved</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>40</td>`
+:raw-html:`<td>Saved FP (r31)</td>`
+:raw-html:`</tr>`
+:raw-html:`</table>`
+
+The *parameter area* is used to store arguments being passed to a callee
+function.  Following the PowerPC ABI, the first few arguments are actually
+passed in registers, with the space in the parameter area unused.  However, if
+there are not enough registers or the callee is a thunk or vararg function,
+these register arguments can be spilled into the parameter area.  Thus, the
+parameter area must be large enough to store all the parameters for the largest
+call sequence made by the caller.  The size must also be minimally large enough
+to spill registers r3-r10.  This allows callees blind to the call signature,
+such as thunks and vararg functions, enough space to cache the argument
+registers.  Therefore, the parameter area is minimally 32 bytes (64 bytes in 64
+bit mode.)  Also note that since the parameter area is a fixed offset from the
+top of the frame, that a callee can access its spilt arguments using fixed
+offsets from the stack pointer (or base pointer.)
+
+Combining the information about the linkage, parameter areas and alignment. A
+stack frame is minimally 64 bytes in 32 bit mode and 128 bytes in 64 bit mode.
+
+The *dynamic area* starts out as size zero.  If a function uses dynamic alloca
+then space is added to the stack, the linkage and parameter areas are shifted to
+top of stack, and the new space is available immediately below the linkage and
+parameter areas.  The cost of shifting the linkage and parameter areas is minor
+since only the link value needs to be copied.  The link value can be easily
+fetched by adding the original frame size to the base pointer.  Note that
+allocations in the dynamic space need to observe 16 byte alignment.
+
+The *locals area* is where the llvm compiler reserves space for local variables.
+
+The *saved registers area* is where the llvm compiler spills callee saved
+registers on entry to the callee.
+
+Prolog/Epilog
+^^^^^^^^^^^^^
+
+The llvm prolog and epilog are the same as described in the PowerPC ABI, with
+the following exceptions.  Callee saved registers are spilled after the frame is
+created.  This allows the llvm epilog/prolog support to be common with other
+targets.  The base pointer callee saved register r31 is saved in the TOC slot of
+linkage area.  This simplifies allocation of space for the base pointer and
+makes it convenient to locate programmatically and during debugging.
+
+Dynamic Allocation
+^^^^^^^^^^^^^^^^^^
+
+.. note::
+
+  TODO - More to come.
+
+The NVPTX backend
+-----------------
+
+The NVPTX code generator under lib/Target/NVPTX is an open-source version of
+the NVIDIA NVPTX code generator for LLVM.  It is contributed by NVIDIA and is
+a port of the code generator used in the CUDA compiler (nvcc).  It targets the
+PTX 3.0/3.1 ISA and can target any compute capability greater than or equal to
+2.0 (Fermi).
+
+This target is of production quality and should be completely compatible with
+the official NVIDIA toolchain.
+
+Code Generator Options:
+
+:raw-html:`<table border="1" cellspacing="0">`
+:raw-html:`<tr>`
+:raw-html:`<th>Option</th>`
+:raw-html:`<th>Description</th>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>sm_20</td>`
+:raw-html:`<td align="left">Set shader model/compute capability to 2.0</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>sm_21</td>`
+:raw-html:`<td align="left">Set shader model/compute capability to 2.1</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>sm_30</td>`
+:raw-html:`<td align="left">Set shader model/compute capability to 3.0</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>sm_35</td>`
+:raw-html:`<td align="left">Set shader model/compute capability to 3.5</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>ptx30</td>`
+:raw-html:`<td align="left">Target PTX 3.0</td>`
+:raw-html:`</tr>`
+:raw-html:`<tr>`
+:raw-html:`<td>ptx31</td>`
+:raw-html:`<td align="left">Target PTX 3.1</td>`
+:raw-html:`</tr>`
+:raw-html:`</table>`
+
+The extended Berkeley Packet Filter (eBPF) backend
+--------------------------------------------------
+
+Extended BPF (or eBPF) is similar to the original ("classic") BPF (cBPF) used
+to filter network packets.  The
+`bpf() system call <http://man7.org/linux/man-pages/man2/bpf.2.html>`_
+performs a range of operations related to eBPF.  For both cBPF and eBPF
+programs, the Linux kernel statically analyzes the programs before loading
+them, in order to ensure that they cannot harm the running system.  eBPF is
+a 64-bit RISC instruction set designed for one to one mapping to 64-bit CPUs.
+Opcodes are 8-bit encoded, and 87 instructions are defined.  There are 10
+registers, grouped by function as outlined below.
+
+::
+
+  R0        return value from in-kernel functions; exit value for eBPF program
+  R1 - R5   function call arguments to in-kernel functions
+  R6 - R9   callee-saved registers preserved by in-kernel functions
+  R10       stack frame pointer (read only)
+
+Instruction encoding (arithmetic and jump)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+eBPF is reusing most of the opcode encoding from classic to simplify conversion
+of classic BPF to eBPF.  For arithmetic and jump instructions the 8-bit 'code'
+field is divided into three parts:
+
+::
+
+  +----------------+--------+--------------------+
+  |   4 bits       |  1 bit |   3 bits           |
+  | operation code | source | instruction class  |
+  +----------------+--------+--------------------+
+  (MSB)                                      (LSB)
+
+Three LSB bits store instruction class which is one of:
+
+::
+
+  BPF_LD     0x0
+  BPF_LDX    0x1
+  BPF_ST     0x2
+  BPF_STX    0x3
+  BPF_ALU    0x4
+  BPF_JMP    0x5
+  (unused)   0x6
+  BPF_ALU64  0x7
+
+When BPF_CLASS(code) == BPF_ALU or BPF_ALU64 or BPF_JMP,
+4th bit encodes source operand
+
+::
+
+  BPF_X     0x0  use src_reg register as source operand
+  BPF_K     0x1  use 32 bit immediate as source operand
+
+and four MSB bits store operation code
+
+::
+
+  BPF_ADD   0x0  add
+  BPF_SUB   0x1  subtract
+  BPF_MUL   0x2  multiply
+  BPF_DIV   0x3  divide
+  BPF_OR    0x4  bitwise logical OR
+  BPF_AND   0x5  bitwise logical AND
+  BPF_LSH   0x6  left shift
+  BPF_RSH   0x7  right shift (zero extended)
+  BPF_NEG   0x8  arithmetic negation
+  BPF_MOD   0x9  modulo
+  BPF_XOR   0xa  bitwise logical XOR
+  BPF_MOV   0xb  move register to register
+  BPF_ARSH  0xc  right shift (sign extended)
+  BPF_END   0xd  endianness conversion
+
+If BPF_CLASS(code) == BPF_JMP, BPF_OP(code) is one of
+
+::
+
+  BPF_JA    0x0  unconditional jump
+  BPF_JEQ   0x1  jump ==
+  BPF_JGT   0x2  jump >
+  BPF_JGE   0x3  jump >=
+  BPF_JSET  0x4  jump if (DST & SRC)
+  BPF_JNE   0x5  jump !=
+  BPF_JSGT  0x6  jump signed >
+  BPF_JSGE  0x7  jump signed >=
+  BPF_CALL  0x8  function call
+  BPF_EXIT  0x9  function return
+
+Instruction encoding (load, store)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+For load and store instructions the 8-bit 'code' field is divided as:
+
+::
+
+  +--------+--------+-------------------+
+  | 3 bits | 2 bits |   3 bits          |
+  |  mode  |  size  | instruction class |
+  +--------+--------+-------------------+
+  (MSB)                             (LSB)
+
+Size modifier is one of
+
+::
+
+  BPF_W       0x0  word
+  BPF_H       0x1  half word
+  BPF_B       0x2  byte
+  BPF_DW      0x3  double word
+
+Mode modifier is one of
+
+::
+
+  BPF_IMM     0x0  immediate
+  BPF_ABS     0x1  used to access packet data
+  BPF_IND     0x2  used to access packet data
+  BPF_MEM     0x3  memory
+  (reserved)  0x4
+  (reserved)  0x5
+  BPF_XADD    0x6  exclusive add
+
+
+Packet data access (BPF_ABS, BPF_IND)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
+(BPF_IND | <size> | BPF_LD) which are used to access packet data.
+Register R6 is an implicit input that must contain pointer to sk_buff.
+Register R0 is an implicit output which contains the data fetched
+from the packet.  Registers R1-R5 are scratch registers and must not
+be used to store the data across BPF_ABS | BPF_LD or BPF_IND | BPF_LD
+instructions.  These instructions have implicit program exit condition
+as well.  When eBPF program is trying to access the data beyond
+the packet boundary, the interpreter will abort the execution of the program.
+
+BPF_IND | BPF_W | BPF_LD is equivalent to:
+  R0 = ntohl(\*(u32 \*) (((struct sk_buff \*) R6)->data + src_reg + imm32))
+
+eBPF maps
+^^^^^^^^^
+
+eBPF maps are provided for sharing data between kernel and user-space.
+Currently implemented types are hash and array, with potential extension to
+support bloom filters, radix trees, etc.  A map is defined by its type,
+maximum number of elements, key size and value size in bytes.  eBPF syscall
+supports create, update, find and delete functions on maps.
+
+Function calls
+^^^^^^^^^^^^^^
+
+Function call arguments are passed using up to five registers (R1 - R5).
+The return value is passed in a dedicated register (R0).  Four additional
+registers (R6 - R9) are callee-saved, and the values in these registers
+are preserved within kernel functions.  R0 - R5 are scratch registers within
+kernel functions, and eBPF programs must therefor store/restore values in
+these registers if needed across function calls.  The stack can be accessed
+using the read-only frame pointer R10.  eBPF registers map 1:1 to hardware
+registers on x86_64 and other 64-bit architectures.  For example, x86_64
+in-kernel JIT maps them as
+
+::
+
+  R0 - rax
+  R1 - rdi
+  R2 - rsi
+  R3 - rdx
+  R4 - rcx
+  R5 - r8
+  R6 - rbx
+  R7 - r13
+  R8 - r14
+  R9 - r15
+  R10 - rbp
+
+since x86_64 ABI mandates rdi, rsi, rdx, rcx, r8, r9 for argument passing
+and rbx, r12 - r15 are callee saved.
+
+Program start
+^^^^^^^^^^^^^
+
+An eBPF program receives a single argument and contains
+a single eBPF main routine; the program does not contain eBPF functions.
+Function calls are limited to a predefined set of kernel functions.  The size
+of a program is limited to 4K instructions:  this ensures fast termination and
+a limited number of kernel function calls.  Prior to running an eBPF program,
+a verifier performs static analysis to prevent loops in the code and
+to ensure valid register usage and operand types.
+
+The AMDGPU backend
+------------------
+
+The AMDGPU code generator lives in the lib/Target/AMDGPU directory, and is an
+open source native AMD GCN ISA code generator.
+
+Target triples supported
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following are the known target triples that are supported by the AMDGPU
+backend.
+
+* **amdgcn--** --- AMD GCN GPUs (AMDGPU.7.0.0+)
+* **amdgcn--amdhsa** --- AMD GCN GPUs (AMDGPU.7.0.0+) with HSA support
+* **r600--** --- AMD GPUs HD2XXX-HD6XXX
+
+Relocations
+^^^^^^^^^^^
+
+Supported relocatable fields are:
+
+* **word32** --- This specifies a 32-bit field occupying 4 bytes with arbitrary
+  byte alignment. These values use the same byte order as other word values in
+  the AMD GPU architecture
+* **word64** --- This specifies a 64-bit field occupying 8 bytes with arbitrary
+  byte alignment. These values use the same byte order as other word values in
+  the AMD GPU architecture
+
+Following notations are used for specifying relocation calculations:
+
+* **A** --- Represents the addend used to compute the value of the relocatable
+  field
+* **G** --- Represents the offset into the global offset table at which the
+  relocation entryâs symbol will reside during execution.
+* **GOT** --- Represents the address of the global offset table.
+* **P** --- Represents the place (section offset or address) of the storage unit
+  being relocated (computed using ``r_offset``)
+* **S** --- Represents the value of the symbol whose index resides in the
+  relocation entry
+
+AMDGPU Backend generates *Elf64_Rela* relocation records with the following
+supported relocation types:
+
+  ==========================  =====  ==========  ==============================
+  Relocation type             Value  Field       Calculation
+  ==========================  =====  ==========  ==============================
+  ``R_AMDGPU_NONE``           0      ``none``    ``none``
+  ``R_AMDGPU_ABS32_LO``       1      ``word32``  (S + A) & 0xFFFFFFFF
+  ``R_AMDGPU_ABS32_HI``       2      ``word32``  (S + A) >> 32
+  ``R_AMDGPU_ABS64``          3      ``word64``  S + A
+  ``R_AMDGPU_REL32``          4      ``word32``  S + A - P
+  ``R_AMDGPU_REL64``          5      ``word64``  S + A - P
+  ``R_AMDGPU_ABS32``          6      ``word32``  S + A
+  ``R_AMDGPU_GOTPCREL``       7      ``word32``  G + GOT + A - P
+  ``R_AMDGPU_GOTPCREL32_LO``  8      ``word32``  (G + GOT + A - P) & 0xFFFFFFFF
+  ``R_AMDGPU_GOTPCREL32_HI``  9      ``word32``  (G + GOT + A - P) >> 32
+  ``R_AMDGPU_REL32_LO``       10     ``word32``  (S + A - P) & 0xFFFFFFFF
+  ``R_AMDGPU_REL32_HI``       11     ``word32``  (S + A - P) >> 32
+  ==========================  =====  ==========  ==============================

Added: www-releases/trunk/4.0.1/docs/_sources/CodeOfConduct.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CodeOfConduct.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CodeOfConduct.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CodeOfConduct.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,112 @@
+==============================
+LLVM Community Code of Conduct
+==============================
+
+.. note::
+
+   This document is currently a **DRAFT** document while it is being discussed
+   by the community.
+
+The LLVM community has always worked to be a welcoming and respectful
+community, and we want to ensure that doesn't change as we grow and evolve. To
+that end, we have a few ground rules that we ask people to adhere to:
+
+* `be friendly and patient`_,
+* `be welcoming`_,
+* `be considerate`_,
+* `be respectful`_,
+* `be careful in the words that you choose and be kind to others`_, and
+* `when we disagree, try to understand why`_.
+
+This isn't an exhaustive list of things that you can't do. Rather, take it in
+the spirit in which it's intended - a guide to make it easier to communicate
+and participate in the community.
+
+This code of conduct applies to all spaces managed by the LLVM project or The
+LLVM Foundation. This includes IRC channels, mailing lists, bug trackers, LLVM
+events such as the developer meetings and socials, and any other forums created
+by the project that the community uses for communication. It applies to all of
+your communication and conduct in these spaces, including emails, chats, things
+you say, slides, videos, posters, signs, or even t-shirts you display in these
+spaces. In addition, violations of this code outside these spaces may, in rare
+cases, affect a person's ability to participate within them, when the conduct
+amounts to an egregious violation of this code.
+
+If you believe someone is violating the code of conduct, we ask that you report
+it by emailing conduct at llvm.org. For more details please see our
+:doc:`Reporting Guide <ReportingGuide>`.
+
+.. _be friendly and patient:
+
+* **Be friendly and patient.**
+
+.. _be welcoming:
+
+* **Be welcoming.** We strive to be a community that welcomes and supports
+  people of all backgrounds and identities. This includes, but is not limited
+  to members of any race, ethnicity, culture, national origin, colour,
+  immigration status, social and economic class, educational level, sex, sexual
+  orientation, gender identity and expression, age, size, family status,
+  political belief, religion or lack thereof, and mental and physical ability.
+
+.. _be considerate:
+
+* **Be considerate.** Your work will be used by other people, and you in turn
+  will depend on the work of others. Any decision you take will affect users
+  and colleagues, and you should take those consequences into account. Remember
+  that we're a world-wide community, so you might not be communicating in
+  someone else's primary language.
+
+.. _be respectful:
+
+* **Be respectful.** Not all of us will agree all the time, but disagreement is
+  no excuse for poor behavior and poor manners. We might all experience some
+  frustration now and then, but we cannot allow that frustration to turn into
+  a personal attack. It's important to remember that a community where people
+  feel uncomfortable or threatened is not a productive one. Members of the LLVM
+  community should be respectful when dealing with other members as well as
+  with people outside the LLVM community.
+
+.. _be careful in the words that you choose and be kind to others:
+
+* **Be careful in the words that you choose and be kind to others.** Do not
+  insult or put down other participants. Harassment and other exclusionary
+  behavior aren't acceptable. This includes, but is not limited to:
+
+  * Violent threats or language directed against another person.
+  * Discriminatory jokes and language.
+  * Posting sexually explicit or violent material.
+  * Posting (or threatening to post) other people's personally identifying
+    information ("doxing").
+  * Personal insults, especially those using racist or sexist terms.
+  * Unwelcome sexual attention.
+  * Advocating for, or encouraging, any of the above behavior.
+
+  In general, if someone asks you to stop, then stop. Persisting in such
+  behavior after being asked to stop is considered harassment.
+
+.. _when we disagree, try to understand why:
+
+* **When we disagree, try to understand why.** Disagreements, both social and
+  technical, happen all the time and LLVM is no exception. It is important that
+  we resolve disagreements and differing views constructively. Remember that
+  we're different. The strength of LLVM comes from its varied community, people
+  from a wide range of backgrounds. Different people have different
+  perspectives on issues. Being unable to understand why someone holds
+  a viewpoint doesn't mean that they're wrong. Don't forget that it is human to
+  err and blaming each other doesn't get us anywhere. Instead, focus on helping
+  to resolve issues and learning from mistakes.
+
+Questions?
+==========
+
+If you have questions, please feel free to contact the LLVM Foundation Code of
+Conduct Advisory Committee by emailing conduct at llvm.org.
+
+
+(This text is based on the `Django Project`_ Code of Conduct, which is in turn
+based on wording from the `Speak Up! project`_.)
+
+.. _Django Project: https://www.djangoproject.com/conduct/
+.. _Speak Up! project: http://speakup.io/coc.html
+

Added: www-releases/trunk/4.0.1/docs/_sources/CodingStandards.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CodingStandards.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CodingStandards.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CodingStandards.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,1638 @@
+=====================
+LLVM Coding Standards
+=====================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+This document attempts to describe a few coding standards that are being used in
+the LLVM source tree.  Although no coding standards should be regarded as
+absolute requirements to be followed in all instances, coding standards are
+particularly important for large-scale code bases that follow a library-based
+design (like LLVM).
+
+While this document may provide guidance for some mechanical formatting issues,
+whitespace, or other "microscopic details", these are not fixed standards.
+Always follow the golden rule:
+
+.. _Golden Rule:
+
+    **If you are extending, enhancing, or bug fixing already implemented code,
+    use the style that is already being used so that the source is uniform and
+    easy to follow.**
+
+Note that some code bases (e.g. ``libc++``) have really good reasons to deviate
+from the coding standards.  In the case of ``libc++``, this is because the
+naming and other conventions are dictated by the C++ standard.  If you think
+there is a specific good reason to deviate from the standards here, please bring
+it up on the LLVM-dev mailing list.
+
+There are some conventions that are not uniformly followed in the code base
+(e.g. the naming convention).  This is because they are relatively new, and a
+lot of code was written before they were put in place.  Our long term goal is
+for the entire codebase to follow the convention, but we explicitly *do not*
+want patches that do large-scale reformating of existing code.  On the other
+hand, it is reasonable to rename the methods of a class if you're about to
+change it in some other way.  Just do the reformating as a separate commit from
+the functionality change.
+  
+The ultimate goal of these guidelines is to increase the readability and
+maintainability of our common source base. If you have suggestions for topics to
+be included, please mail them to `Chris <mailto:sabre at nondot.org>`_.
+
+Languages, Libraries, and Standards
+===================================
+
+Most source code in LLVM and other LLVM projects using these coding standards
+is C++ code. There are some places where C code is used either due to
+environment restrictions, historical restrictions, or due to third-party source
+code imported into the tree. Generally, our preference is for standards
+conforming, modern, and portable C++ code as the implementation language of
+choice.
+
+C++ Standard Versions
+---------------------
+
+LLVM, Clang, and LLD are currently written using C++11 conforming code,
+although we restrict ourselves to features which are available in the major
+toolchains supported as host compilers. The LLDB project is even more
+aggressive in the set of host compilers supported and thus uses still more
+features. Regardless of the supported features, code is expected to (when
+reasonable) be standard, portable, and modern C++11 code. We avoid unnecessary
+vendor-specific extensions, etc.
+
+C++ Standard Library
+--------------------
+
+Use the C++ standard library facilities whenever they are available for
+a particular task. LLVM and related projects emphasize and rely on the standard
+library facilities for as much as possible. Common support libraries providing
+functionality missing from the standard library for which there are standard
+interfaces or active work on adding standard interfaces will often be
+implemented in the LLVM namespace following the expected standard interface.
+
+There are some exceptions such as the standard I/O streams library which are
+avoided. Also, there is much more detailed information on these subjects in the
+:doc:`ProgrammersManual`.
+
+Supported C++11 Language and Library Features
+---------------------------------------------
+
+While LLVM, Clang, and LLD use C++11, not all features are available in all of
+the toolchains which we support. The set of features supported for use in LLVM
+is the intersection of those supported in the minimum requirements described
+in the :doc:`GettingStarted` page, section `Software`.
+The ultimate definition of this set is what build bots with those respective
+toolchains accept. Don't argue with the build bots. However, we have some
+guidance below to help you know what to expect.
+
+Each toolchain provides a good reference for what it accepts:
+
+* Clang: http://clang.llvm.org/cxx_status.html
+* GCC: http://gcc.gnu.org/projects/cxx0x.html
+* MSVC: http://msdn.microsoft.com/en-us/library/hh567368.aspx
+
+In most cases, the MSVC list will be the dominating factor. Here is a summary
+of the features that are expected to work. Features not on this list are
+unlikely to be supported by our host compilers.
+
+* Rvalue references: N2118_
+
+  * But *not* Rvalue references for ``*this`` or member qualifiers (N2439_)
+
+* Static assert: N1720_
+* ``auto`` type deduction: N1984_, N1737_
+* Trailing return types: N2541_
+* Lambdas: N2927_
+
+  * But *not* lambdas with default arguments.
+
+* ``decltype``: N2343_
+* Nested closing right angle brackets: N1757_
+* Extern templates: N1987_
+* ``nullptr``: N2431_
+* Strongly-typed and forward declarable enums: N2347_, N2764_
+* Local and unnamed types as template arguments: N2657_
+* Range-based for-loop: N2930_
+
+  * But ``{}`` are required around inner ``do {} while()`` loops.  As a result,
+    ``{}`` are required around function-like macros inside range-based for
+    loops.
+
+* ``override`` and ``final``: N2928_, N3206_, N3272_
+* Atomic operations and the C++11 memory model: N2429_
+* Variadic templates: N2242_
+* Explicit conversion operators: N2437_
+* Defaulted and deleted functions: N2346_
+* Initializer lists: N2627_
+* Delegating constructors: N1986_
+* Default member initializers (non-static data member initializers): N2756_
+
+  * Feel free to use these wherever they make sense and where the `=`
+    syntax is allowed. Don't use braced initialization syntax.
+
+.. _N2118: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2118.html
+.. _N2439: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm
+.. _N1720: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1720.html
+.. _N1984: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1984.pdf
+.. _N1737: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1737.pdf
+.. _N2541: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm
+.. _N2927: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2927.pdf
+.. _N2343: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2343.pdf
+.. _N1757: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1757.html
+.. _N1987: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1987.htm
+.. _N2431: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2431.pdf
+.. _N2347: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2347.pdf
+.. _N2764: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2764.pdf
+.. _N2657: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2657.htm
+.. _N2930: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2930.html
+.. _N2928: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2928.htm
+.. _N3206: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3206.htm
+.. _N3272: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3272.htm
+.. _N2429: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2429.htm
+.. _N2242: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2242.pdf
+.. _N2437: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2437.pdf
+.. _N2346: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2346.htm
+.. _N2627: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2672.htm
+.. _N1986: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1986.pdf
+.. _N2756: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2756.htm
+
+The supported features in the C++11 standard libraries are less well tracked,
+but also much greater. Most of the standard libraries implement most of C++11's
+library. The most likely lowest common denominator is Linux support. For
+libc++, the support is just poorly tested and undocumented but expected to be
+largely complete. YMMV. For libstdc++, the support is documented in detail in
+`the libstdc++ manual`_. There are some very minor missing facilities that are
+unlikely to be common problems, and there are a few larger gaps that are worth
+being aware of:
+
+* Not all of the type traits are implemented
+* No regular expression library.
+* While most of the atomics library is well implemented, the fences are
+  missing. Fortunately, they are rarely needed.
+* The locale support is incomplete.
+
+Other than these areas you should assume the standard library is available and
+working as expected until some build bot tells you otherwise. If you're in an
+uncertain area of one of the above points, but you cannot test on a Linux
+system, your best approach is to minimize your use of these features, and watch
+the Linux build bots to find out if your usage triggered a bug. For example, if
+you hit a type trait which doesn't work we can then add support to LLVM's
+traits header to emulate it.
+
+.. _the libstdc++ manual:
+  http://gcc.gnu.org/onlinedocs/gcc-4.8.0/libstdc++/manual/manual/status.html#status.iso.2011
+
+Other Languages
+---------------
+
+Any code written in the Go programming language is not subject to the
+formatting rules below. Instead, we adopt the formatting rules enforced by
+the `gofmt`_ tool.
+
+Go code should strive to be idiomatic. Two good sets of guidelines for what
+this means are `Effective Go`_ and `Go Code Review Comments`_.
+
+.. _gofmt:
+  https://golang.org/cmd/gofmt/
+
+.. _Effective Go:
+  https://golang.org/doc/effective_go.html
+
+.. _Go Code Review Comments:
+  https://code.google.com/p/go-wiki/wiki/CodeReviewComments
+
+Mechanical Source Issues
+========================
+
+Source Code Formatting
+----------------------
+
+Commenting
+^^^^^^^^^^
+
+Comments are one critical part of readability and maintainability.  Everyone
+knows they should comment their code, and so should you.  When writing comments,
+write them as English prose, which means they should use proper capitalization,
+punctuation, etc.  Aim to describe what the code is trying to do and why, not
+*how* it does it at a micro level. Here are a few critical things to document:
+
+.. _header file comment:
+
+File Headers
+""""""""""""
+
+Every source file should have a header on it that describes the basic purpose of
+the file.  If a file does not have a header, it should not be checked into the
+tree.  The standard header looks like this:
+
+.. code-block:: c++
+
+  //===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===//
+  //
+  //                     The LLVM Compiler Infrastructure
+  //
+  // This file is distributed under the University of Illinois Open Source
+  // License. See LICENSE.TXT for details.
+  //
+  //===----------------------------------------------------------------------===//
+  ///
+  /// \file
+  /// This file contains the declaration of the Instruction class, which is the
+  /// base class for all of the VM instructions.
+  ///
+  //===----------------------------------------------------------------------===//
+
+A few things to note about this particular format: The "``-*- C++ -*-``" string
+on the first line is there to tell Emacs that the source file is a C++ file, not
+a C file (Emacs assumes ``.h`` files are C files by default).
+
+.. note::
+
+    This tag is not necessary in ``.cpp`` files.  The name of the file is also
+    on the first line, along with a very short description of the purpose of the
+    file.  This is important when printing out code and flipping though lots of
+    pages.
+
+The next section in the file is a concise note that defines the license that the
+file is released under.  This makes it perfectly clear what terms the source
+code can be distributed under and should not be modified in any way.
+
+The main body is a ``doxygen`` comment (identified by the ``///`` comment
+marker instead of the usual ``//``) describing the purpose of the file.  The
+first sentence (or a passage beginning with ``\brief``) is used as an abstract.
+Any additional information should be separated by a blank line.  If an
+algorithm is being implemented or something tricky is going on, a reference
+to the paper where it is published should be included, as well as any notes or
+*gotchas* in the code to watch out for.
+
+Class overviews
+"""""""""""""""
+
+Classes are one fundamental part of a good object oriented design.  As such, a
+class definition should have a comment block that explains what the class is
+used for and how it works.  Every non-trivial class is expected to have a
+``doxygen`` comment block.
+
+Method information
+""""""""""""""""""
+
+Methods defined in a class (as well as any global functions) should also be
+documented properly.  A quick note about what it does and a description of the
+borderline behaviour is all that is necessary here (unless something
+particularly tricky or insidious is going on).  The hope is that people can
+figure out how to use your interfaces without reading the code itself.
+
+Good things to talk about here are what happens when something unexpected
+happens: does the method return null?  Abort?  Format your hard disk?
+
+Comment Formatting
+^^^^^^^^^^^^^^^^^^
+
+In general, prefer C++ style comments (``//`` for normal comments, ``///`` for
+``doxygen`` documentation comments).  They take less space, require
+less typing, don't have nesting problems, etc.  There are a few cases when it is
+useful to use C style (``/* */``) comments however:
+
+#. When writing C code: Obviously if you are writing C code, use C style
+   comments.
+
+#. When writing a header file that may be ``#include``\d by a C source file.
+
+#. When writing a source file that is used by a tool that only accepts C style
+   comments.
+
+Commenting out large blocks of code is discouraged, but if you really have to do
+this (for documentation purposes or as a suggestion for debug printing), use
+``#if 0`` and ``#endif``. These nest properly and are better behaved in general
+than C style comments.
+
+Doxygen Use in Documentation Comments
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use the ``\file`` command to turn the standard file header into a file-level
+comment.
+
+Include descriptive paragraphs for all public interfaces (public classes,
+member and non-member functions).  Don't just restate the information that can
+be inferred from the API name.  The first sentence (or a paragraph beginning
+with ``\brief``) is used as an abstract. Try to use a single sentence as the
+``\brief`` adds visual clutter.  Put detailed discussion into separate
+paragraphs.
+
+To refer to parameter names inside a paragraph, use the ``\p name`` command.
+Don't use the ``\arg name`` command since it starts a new paragraph that
+contains documentation for the parameter.
+
+Wrap non-inline code examples in ``\code ... \endcode``.
+
+To document a function parameter, start a new paragraph with the
+``\param name`` command.  If the parameter is used as an out or an in/out
+parameter, use the ``\param [out] name`` or ``\param [in,out] name`` command,
+respectively.
+
+To describe function return value, start a new paragraph with the ``\returns``
+command.
+
+A minimal documentation comment:
+
+.. code-block:: c++
+
+  /// Sets the xyzzy property to \p Baz.
+  void setXyzzy(bool Baz);
+
+A documentation comment that uses all Doxygen features in a preferred way:
+
+.. code-block:: c++
+
+  /// Does foo and bar.
+  ///
+  /// Does not do foo the usual way if \p Baz is true.
+  ///
+  /// Typical usage:
+  /// \code
+  ///   fooBar(false, "quux", Res);
+  /// \endcode
+  ///
+  /// \param Quux kind of foo to do.
+  /// \param [out] Result filled with bar sequence on foo success.
+  ///
+  /// \returns true on success.
+  bool fooBar(bool Baz, StringRef Quux, std::vector<int> &Result);
+
+Don't duplicate the documentation comment in the header file and in the
+implementation file.  Put the documentation comments for public APIs into the
+header file.  Documentation comments for private APIs can go to the
+implementation file.  In any case, implementation files can include additional
+comments (not necessarily in Doxygen markup) to explain implementation details
+as needed.
+
+Don't duplicate function or class name at the beginning of the comment.
+For humans it is obvious which function or class is being documented;
+automatic documentation processing tools are smart enough to bind the comment
+to the correct declaration.
+
+Wrong:
+
+.. code-block:: c++
+
+  // In Something.h:
+
+  /// Something - An abstraction for some complicated thing.
+  class Something {
+  public:
+    /// fooBar - Does foo and bar.
+    void fooBar();
+  };
+
+  // In Something.cpp:
+
+  /// fooBar - Does foo and bar.
+  void Something::fooBar() { ... }
+
+Correct:
+
+.. code-block:: c++
+
+  // In Something.h:
+
+  /// An abstraction for some complicated thing.
+  class Something {
+  public:
+    /// Does foo and bar.
+    void fooBar();
+  };
+
+  // In Something.cpp:
+
+  // Builds a B-tree in order to do foo.  See paper by...
+  void Something::fooBar() { ... }
+
+It is not required to use additional Doxygen features, but sometimes it might
+be a good idea to do so.
+
+Consider:
+
+* adding comments to any narrow namespace containing a collection of
+  related functions or types;
+
+* using top-level groups to organize a collection of related functions at
+  namespace scope where the grouping is smaller than the namespace;
+
+* using member groups and additional comments attached to member
+  groups to organize within a class.
+
+For example:
+
+.. code-block:: c++
+
+  class Something {
+    /// \name Functions that do Foo.
+    /// @{
+    void fooBar();
+    void fooBaz();
+    /// @}
+    ...
+  };
+
+``#include`` Style
+^^^^^^^^^^^^^^^^^^
+
+Immediately after the `header file comment`_ (and include guards if working on a
+header file), the `minimal list of #includes`_ required by the file should be
+listed.  We prefer these ``#include``\s to be listed in this order:
+
+.. _Main Module Header:
+.. _Local/Private Headers:
+
+#. Main Module Header
+#. Local/Private Headers
+#. LLVM project/subproject headers (``clang/...``, ``lldb/...``, ``llvm/...``, etc)
+#. System ``#include``\s
+
+and each category should be sorted lexicographically by the full path.
+
+The `Main Module Header`_ file applies to ``.cpp`` files which implement an
+interface defined by a ``.h`` file.  This ``#include`` should always be included
+**first** regardless of where it lives on the file system.  By including a
+header file first in the ``.cpp`` files that implement the interfaces, we ensure
+that the header does not have any hidden dependencies which are not explicitly
+``#include``\d in the header, but should be. It is also a form of documentation
+in the ``.cpp`` file to indicate where the interfaces it implements are defined.
+
+LLVM project and subproject headers should be grouped from most specific to least
+specific, for the same reasons described above.  For example, LLDB depends on
+both clang and LLVM, and clang depends on LLVM.  So an LLDB source file should
+include ``lldb`` headers first, followed by ``clang`` headers, followed by
+``llvm`` headers, to reduce the possibility (for example) of an LLDB header
+accidentally picking up a missing include due to the previous inclusion of that
+header in the main source file or some earlier header file.  clang should
+similarly include its own headers before including llvm headers.  This rule
+applies to all LLVM subprojects.
+
+.. _fit into 80 columns:
+
+Source Code Width
+^^^^^^^^^^^^^^^^^
+
+Write your code to fit within 80 columns of text.  This helps those of us who
+like to print out code and look at your code in an ``xterm`` without resizing
+it.
+
+The longer answer is that there must be some limit to the width of the code in
+order to reasonably allow developers to have multiple files side-by-side in
+windows on a modest display.  If you are going to pick a width limit, it is
+somewhat arbitrary but you might as well pick something standard.  Going with 90
+columns (for example) instead of 80 columns wouldn't add any significant value
+and would be detrimental to printing out code.  Also many other projects have
+standardized on 80 columns, so some people have already configured their editors
+for it (vs something else, like 90 columns).
+
+This is one of many contentious issues in coding standards, but it is not up for
+debate.
+
+Use Spaces Instead of Tabs
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In all cases, prefer spaces to tabs in source files.  People have different
+preferred indentation levels, and different styles of indentation that they
+like; this is fine.  What isn't fine is that different editors/viewers expand
+tabs out to different tab stops.  This can cause your code to look completely
+unreadable, and it is not worth dealing with.
+
+As always, follow the `Golden Rule`_ above: follow the style of
+existing code if you are modifying and extending it.  If you like four spaces of
+indentation, **DO NOT** do that in the middle of a chunk of code with two spaces
+of indentation.  Also, do not reindent a whole source file: it makes for
+incredible diffs that are absolutely worthless.
+
+Indent Code Consistently
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Okay, in your first year of programming you were told that indentation is
+important. If you didn't believe and internalize this then, now is the time.
+Just do it. With the introduction of C++11, there are some new formatting
+challenges that merit some suggestions to help have consistent, maintainable,
+and tool-friendly formatting and indentation.
+
+Format Lambdas Like Blocks Of Code
+""""""""""""""""""""""""""""""""""
+
+When formatting a multi-line lambda, format it like a block of code, that's
+what it is. If there is only one multi-line lambda in a statement, and there
+are no expressions lexically after it in the statement, drop the indent to the
+standard two space indent for a block of code, as if it were an if-block opened
+by the preceding part of the statement:
+
+.. code-block:: c++
+
+  std::sort(foo.begin(), foo.end(), [&](Foo a, Foo b) -> bool {
+    if (a.blah < b.blah)
+      return true;
+    if (a.baz < b.baz)
+      return true;
+    return a.bam < b.bam;
+  });
+
+To take best advantage of this formatting, if you are designing an API which
+accepts a continuation or single callable argument (be it a functor, or
+a ``std::function``), it should be the last argument if at all possible.
+
+If there are multiple multi-line lambdas in a statement, or there is anything
+interesting after the lambda in the statement, indent the block two spaces from
+the indent of the ``[]``:
+
+.. code-block:: c++
+
+  dyn_switch(V->stripPointerCasts(),
+             [] (PHINode *PN) {
+               // process phis...
+             },
+             [] (SelectInst *SI) {
+               // process selects...
+             },
+             [] (LoadInst *LI) {
+               // process loads...
+             },
+             [] (AllocaInst *AI) {
+               // process allocas...
+             });
+
+Braced Initializer Lists
+""""""""""""""""""""""""
+
+With C++11, there are significantly more uses of braced lists to perform
+initialization. These allow you to easily construct aggregate temporaries in
+expressions among other niceness. They now have a natural way of ending up
+nested within each other and within function calls in order to build up
+aggregates (such as option structs) from local variables. To make matters
+worse, we also have many more uses of braces in an expression context that are
+*not* performing initialization.
+
+The historically common formatting of braced initialization of aggregate
+variables does not mix cleanly with deep nesting, general expression contexts,
+function arguments, and lambdas. We suggest new code use a simple rule for
+formatting braced initialization lists: act as-if the braces were parentheses
+in a function call. The formatting rules exactly match those already well
+understood for formatting nested function calls. Examples:
+
+.. code-block:: c++
+
+  foo({a, b, c}, {1, 2, 3});
+
+  llvm::Constant *Mask[] = {
+      llvm::ConstantInt::get(llvm::Type::getInt32Ty(getLLVMContext()), 0),
+      llvm::ConstantInt::get(llvm::Type::getInt32Ty(getLLVMContext()), 1),
+      llvm::ConstantInt::get(llvm::Type::getInt32Ty(getLLVMContext()), 2)};
+
+This formatting scheme also makes it particularly easy to get predictable,
+consistent, and automatic formatting with tools like `Clang Format`_.
+
+.. _Clang Format: http://clang.llvm.org/docs/ClangFormat.html
+
+Language and Compiler Issues
+----------------------------
+
+Treat Compiler Warnings Like Errors
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If your code has compiler warnings in it, something is wrong --- you aren't
+casting values correctly, you have "questionable" constructs in your code, or
+you are doing something legitimately wrong.  Compiler warnings can cover up
+legitimate errors in output and make dealing with a translation unit difficult.
+
+It is not possible to prevent all warnings from all compilers, nor is it
+desirable.  Instead, pick a standard compiler (like ``gcc``) that provides a
+good thorough set of warnings, and stick to it.  At least in the case of
+``gcc``, it is possible to work around any spurious errors by changing the
+syntax of the code slightly.  For example, a warning that annoys me occurs when
+I write code like this:
+
+.. code-block:: c++
+
+  if (V = getValue()) {
+    ...
+  }
+
+``gcc`` will warn me that I probably want to use the ``==`` operator, and that I
+probably mistyped it.  In most cases, I haven't, and I really don't want the
+spurious errors.  To fix this particular problem, I rewrite the code like
+this:
+
+.. code-block:: c++
+
+  if ((V = getValue())) {
+    ...
+  }
+
+which shuts ``gcc`` up.  Any ``gcc`` warning that annoys you can be fixed by
+massaging the code appropriately.
+
+Write Portable Code
+^^^^^^^^^^^^^^^^^^^
+
+In almost all cases, it is possible and within reason to write completely
+portable code.  If there are cases where it isn't possible to write portable
+code, isolate it behind a well defined (and well documented) interface.
+
+In practice, this means that you shouldn't assume much about the host compiler
+(and Visual Studio tends to be the lowest common denominator).  If advanced
+features are used, they should only be an implementation detail of a library
+which has a simple exposed API, and preferably be buried in ``libSystem``.
+
+Do not use RTTI or Exceptions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In an effort to reduce code and executable size, LLVM does not use RTTI
+(e.g. ``dynamic_cast<>;``) or exceptions.  These two language features violate
+the general C++ principle of *"you only pay for what you use"*, causing
+executable bloat even if exceptions are never used in the code base, or if RTTI
+is never used for a class.  Because of this, we turn them off globally in the
+code.
+
+That said, LLVM does make extensive use of a hand-rolled form of RTTI that use
+templates like :ref:`isa\<>, cast\<>, and dyn_cast\<> <isa>`.
+This form of RTTI is opt-in and can be
+:doc:`added to any class <HowToSetUpLLVMStyleRTTI>`. It is also
+substantially more efficient than ``dynamic_cast<>``.
+
+.. _static constructor:
+
+Do not use Static Constructors
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Static constructors and destructors (e.g. global variables whose types have a
+constructor or destructor) should not be added to the code base, and should be
+removed wherever possible.  Besides `well known problems
+<http://yosefk.com/c++fqa/ctors.html#fqa-10.12>`_ where the order of
+initialization is undefined between globals in different source files, the
+entire concept of static constructors is at odds with the common use case of
+LLVM as a library linked into a larger application.
+  
+Consider the use of LLVM as a JIT linked into another application (perhaps for
+`OpenGL, custom languages <http://llvm.org/Users.html>`_, `shaders in movies
+<http://llvm.org/devmtg/2010-11/Gritz-OpenShadingLang.pdf>`_, etc). Due to the
+design of static constructors, they must be executed at startup time of the
+entire application, regardless of whether or how LLVM is used in that larger
+application.  There are two problems with this:
+
+* The time to run the static constructors impacts startup time of applications
+  --- a critical time for GUI apps, among others.
+  
+* The static constructors cause the app to pull many extra pages of memory off
+  the disk: both the code for the constructor in each ``.o`` file and the small
+  amount of data that gets touched. In addition, touched/dirty pages put more
+  pressure on the VM system on low-memory machines.
+
+We would really like for there to be zero cost for linking in an additional LLVM
+target or other library into an application, but static constructors violate
+this goal.
+  
+That said, LLVM unfortunately does contain static constructors.  It would be a
+`great project <http://llvm.org/PR11944>`_ for someone to purge all static
+constructors from LLVM, and then enable the ``-Wglobal-constructors`` warning
+flag (when building with Clang) to ensure we do not regress in the future.
+
+Use of ``class`` and ``struct`` Keywords
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In C++, the ``class`` and ``struct`` keywords can be used almost
+interchangeably. The only difference is when they are used to declare a class:
+``class`` makes all members private by default while ``struct`` makes all
+members public by default.
+
+Unfortunately, not all compilers follow the rules and some will generate
+different symbols based on whether ``class`` or ``struct`` was used to declare
+the symbol (e.g., MSVC).  This can lead to problems at link time.
+
+* All declarations and definitions of a given ``class`` or ``struct`` must use
+  the same keyword.  For example:
+
+.. code-block:: c++
+
+  class Foo;
+
+  // Breaks mangling in MSVC.
+  struct Foo { int Data; };
+
+* As a rule of thumb, ``struct`` should be kept to structures where *all*
+  members are declared public.
+
+.. code-block:: c++
+
+  // Foo feels like a class... this is strange.
+  struct Foo {
+  private:
+    int Data;
+  public:
+    Foo() : Data(0) { }
+    int getData() const { return Data; }
+    void setData(int D) { Data = D; }
+  };
+
+  // Bar isn't POD, but it does look like a struct.
+  struct Bar {
+    int Data;
+    Bar() : Data(0) { }
+  };
+
+Do not use Braced Initializer Lists to Call a Constructor
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In C++11 there is a "generalized initialization syntax" which allows calling
+constructors using braced initializer lists. Do not use these to call
+constructors with any interesting logic or if you care that you're calling some
+*particular* constructor. Those should look like function calls using
+parentheses rather than like aggregate initialization. Similarly, if you need
+to explicitly name the type and call its constructor to create a temporary,
+don't use a braced initializer list. Instead, use a braced initializer list
+(without any type for temporaries) when doing aggregate initialization or
+something notionally equivalent. Examples:
+
+.. code-block:: c++
+
+  class Foo {
+  public:
+    // Construct a Foo by reading data from the disk in the whizbang format, ...
+    Foo(std::string filename);
+
+    // Construct a Foo by looking up the Nth element of some global data ...
+    Foo(int N);
+
+    // ...
+  };
+
+  // The Foo constructor call is very deliberate, no braces.
+  std::fill(foo.begin(), foo.end(), Foo("name"));
+
+  // The pair is just being constructed like an aggregate, use braces.
+  bar_map.insert({my_key, my_value});
+
+If you use a braced initializer list when initializing a variable, use an equals before the open curly brace:
+
+.. code-block:: c++
+
+  int data[] = {0, 1, 2, 3};
+
+Use ``auto`` Type Deduction to Make Code More Readable
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Some are advocating a policy of "almost always ``auto``" in C++11, however LLVM
+uses a more moderate stance. Use ``auto`` if and only if it makes the code more
+readable or easier to maintain. Don't "almost always" use ``auto``, but do use
+``auto`` with initializers like ``cast<Foo>(...)`` or other places where the
+type is already obvious from the context. Another time when ``auto`` works well
+for these purposes is when the type would have been abstracted away anyways,
+often behind a container's typedef such as ``std::vector<T>::iterator``.
+
+Beware unnecessary copies with ``auto``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The convenience of ``auto`` makes it easy to forget that its default behavior
+is a copy.  Particularly in range-based ``for`` loops, careless copies are
+expensive.
+
+As a rule of thumb, use ``auto &`` unless you need to copy the result, and use
+``auto *`` when copying pointers.
+
+.. code-block:: c++
+
+  // Typically there's no reason to copy.
+  for (const auto &Val : Container) { observe(Val); }
+  for (auto &Val : Container) { Val.change(); }
+
+  // Remove the reference if you really want a new copy.
+  for (auto Val : Container) { Val.change(); saveSomewhere(Val); }
+
+  // Copy pointers, but make it clear that they're pointers.
+  for (const auto *Ptr : Container) { observe(*Ptr); }
+  for (auto *Ptr : Container) { Ptr->change(); }
+
+Style Issues
+============
+
+The High-Level Issues
+---------------------
+
+A Public Header File **is** a Module
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+C++ doesn't do too well in the modularity department.  There is no real
+encapsulation or data hiding (unless you use expensive protocol classes), but it
+is what we have to work with.  When you write a public header file (in the LLVM
+source tree, they live in the top level "``include``" directory), you are
+defining a module of functionality.
+
+Ideally, modules should be completely independent of each other, and their
+header files should only ``#include`` the absolute minimum number of headers
+possible. A module is not just a class, a function, or a namespace: it's a
+collection of these that defines an interface.  This interface may be several
+functions, classes, or data structures, but the important issue is how they work
+together.
+
+In general, a module should be implemented by one or more ``.cpp`` files.  Each
+of these ``.cpp`` files should include the header that defines their interface
+first.  This ensures that all of the dependences of the module header have been
+properly added to the module header itself, and are not implicit.  System
+headers should be included after user headers for a translation unit.
+
+.. _minimal list of #includes:
+
+``#include`` as Little as Possible
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``#include`` hurts compile time performance.  Don't do it unless you have to,
+especially in header files.
+
+But wait! Sometimes you need to have the definition of a class to use it, or to
+inherit from it.  In these cases go ahead and ``#include`` that header file.  Be
+aware however that there are many cases where you don't need to have the full
+definition of a class.  If you are using a pointer or reference to a class, you
+don't need the header file.  If you are simply returning a class instance from a
+prototyped function or method, you don't need it.  In fact, for most cases, you
+simply don't need the definition of a class. And not ``#include``\ing speeds up
+compilation.
+
+It is easy to try to go too overboard on this recommendation, however.  You
+**must** include all of the header files that you are using --- you can include
+them either directly or indirectly through another header file.  To make sure
+that you don't accidentally forget to include a header file in your module
+header, make sure to include your module header **first** in the implementation
+file (as mentioned above).  This way there won't be any hidden dependencies that
+you'll find out about later.
+
+Keep "Internal" Headers Private
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Many modules have a complex implementation that causes them to use more than one
+implementation (``.cpp``) file.  It is often tempting to put the internal
+communication interface (helper classes, extra functions, etc) in the public
+module header file.  Don't do this!
+
+If you really need to do something like this, put a private header file in the
+same directory as the source files, and include it locally.  This ensures that
+your private interface remains private and undisturbed by outsiders.
+
+.. note::
+
+    It's okay to put extra implementation methods in a public class itself. Just
+    make them private (or protected) and all is well.
+
+.. _early exits:
+
+Use Early Exits and ``continue`` to Simplify Code
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When reading code, keep in mind how much state and how many previous decisions
+have to be remembered by the reader to understand a block of code.  Aim to
+reduce indentation where possible when it doesn't make it more difficult to
+understand the code.  One great way to do this is by making use of early exits
+and the ``continue`` keyword in long loops.  As an example of using an early
+exit from a function, consider this "bad" code:
+
+.. code-block:: c++
+
+  Value *doSomething(Instruction *I) {
+    if (!isa<TerminatorInst>(I) &&
+        I->hasOneUse() && doOtherThing(I)) {
+      ... some long code ....
+    }
+
+    return 0;
+  }
+
+This code has several problems if the body of the ``'if'`` is large.  When
+you're looking at the top of the function, it isn't immediately clear that this
+*only* does interesting things with non-terminator instructions, and only
+applies to things with the other predicates.  Second, it is relatively difficult
+to describe (in comments) why these predicates are important because the ``if``
+statement makes it difficult to lay out the comments.  Third, when you're deep
+within the body of the code, it is indented an extra level.  Finally, when
+reading the top of the function, it isn't clear what the result is if the
+predicate isn't true; you have to read to the end of the function to know that
+it returns null.
+
+It is much preferred to format the code like this:
+
+.. code-block:: c++
+
+  Value *doSomething(Instruction *I) {
+    // Terminators never need 'something' done to them because ... 
+    if (isa<TerminatorInst>(I))
+      return 0;
+
+    // We conservatively avoid transforming instructions with multiple uses
+    // because goats like cheese.
+    if (!I->hasOneUse())
+      return 0;
+
+    // This is really just here for example.
+    if (!doOtherThing(I))
+      return 0;
+    
+    ... some long code ....
+  }
+
+This fixes these problems.  A similar problem frequently happens in ``for``
+loops.  A silly example is something like this:
+
+.. code-block:: c++
+
+  for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
+    if (BinaryOperator *BO = dyn_cast<BinaryOperator>(II)) {
+      Value *LHS = BO->getOperand(0);
+      Value *RHS = BO->getOperand(1);
+      if (LHS != RHS) {
+        ...
+      }
+    }
+  }
+
+When you have very, very small loops, this sort of structure is fine. But if it
+exceeds more than 10-15 lines, it becomes difficult for people to read and
+understand at a glance. The problem with this sort of code is that it gets very
+nested very quickly. Meaning that the reader of the code has to keep a lot of
+context in their brain to remember what is going immediately on in the loop,
+because they don't know if/when the ``if`` conditions will have ``else``\s etc.
+It is strongly preferred to structure the loop like this:
+
+.. code-block:: c++
+
+  for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
+    BinaryOperator *BO = dyn_cast<BinaryOperator>(II);
+    if (!BO) continue;
+
+    Value *LHS = BO->getOperand(0);
+    Value *RHS = BO->getOperand(1);
+    if (LHS == RHS) continue;
+
+    ...
+  }
+
+This has all the benefits of using early exits for functions: it reduces nesting
+of the loop, it makes it easier to describe why the conditions are true, and it
+makes it obvious to the reader that there is no ``else`` coming up that they
+have to push context into their brain for.  If a loop is large, this can be a
+big understandability win.
+
+Don't use ``else`` after a ``return``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+For similar reasons above (reduction of indentation and easier reading), please
+do not use ``'else'`` or ``'else if'`` after something that interrupts control
+flow --- like ``return``, ``break``, ``continue``, ``goto``, etc. For
+example, this is *bad*:
+
+.. code-block:: c++
+
+  case 'J': {
+    if (Signed) {
+      Type = Context.getsigjmp_bufType();
+      if (Type.isNull()) {
+        Error = ASTContext::GE_Missing_sigjmp_buf;
+        return QualType();
+      } else {
+        break;
+      }
+    } else {
+      Type = Context.getjmp_bufType();
+      if (Type.isNull()) {
+        Error = ASTContext::GE_Missing_jmp_buf;
+        return QualType();
+      } else {
+        break;
+      }
+    }
+  }
+
+It is better to write it like this:
+
+.. code-block:: c++
+
+  case 'J':
+    if (Signed) {
+      Type = Context.getsigjmp_bufType();
+      if (Type.isNull()) {
+        Error = ASTContext::GE_Missing_sigjmp_buf;
+        return QualType();
+      }
+    } else {
+      Type = Context.getjmp_bufType();
+      if (Type.isNull()) {
+        Error = ASTContext::GE_Missing_jmp_buf;
+        return QualType();
+      }
+    }
+    break;
+
+Or better yet (in this case) as:
+
+.. code-block:: c++
+
+  case 'J':
+    if (Signed)
+      Type = Context.getsigjmp_bufType();
+    else
+      Type = Context.getjmp_bufType();
+    
+    if (Type.isNull()) {
+      Error = Signed ? ASTContext::GE_Missing_sigjmp_buf :
+                       ASTContext::GE_Missing_jmp_buf;
+      return QualType();
+    }
+    break;
+
+The idea is to reduce indentation and the amount of code you have to keep track
+of when reading the code.
+              
+Turn Predicate Loops into Predicate Functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+It is very common to write small loops that just compute a boolean value.  There
+are a number of ways that people commonly write these, but an example of this
+sort of thing is:
+
+.. code-block:: c++
+
+  bool FoundFoo = false;
+  for (unsigned I = 0, E = BarList.size(); I != E; ++I)
+    if (BarList[I]->isFoo()) {
+      FoundFoo = true;
+      break;
+    }
+
+  if (FoundFoo) {
+    ...
+  }
+
+This sort of code is awkward to write, and is almost always a bad sign.  Instead
+of this sort of loop, we strongly prefer to use a predicate function (which may
+be `static`_) that uses `early exits`_ to compute the predicate.  We prefer the
+code to be structured like this:
+
+.. code-block:: c++
+
+  /// \returns true if the specified list has an element that is a foo.
+  static bool containsFoo(const std::vector<Bar*> &List) {
+    for (unsigned I = 0, E = List.size(); I != E; ++I)
+      if (List[I]->isFoo())
+        return true;
+    return false;
+  }
+  ...
+
+  if (containsFoo(BarList)) {
+    ...
+  }
+
+There are many reasons for doing this: it reduces indentation and factors out
+code which can often be shared by other code that checks for the same predicate.
+More importantly, it *forces you to pick a name* for the function, and forces
+you to write a comment for it.  In this silly example, this doesn't add much
+value.  However, if the condition is complex, this can make it a lot easier for
+the reader to understand the code that queries for this predicate.  Instead of
+being faced with the in-line details of how we check to see if the BarList
+contains a foo, we can trust the function name and continue reading with better
+locality.
+
+The Low-Level Issues
+--------------------
+
+Name Types, Functions, Variables, and Enumerators Properly
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Poorly-chosen names can mislead the reader and cause bugs. We cannot stress
+enough how important it is to use *descriptive* names.  Pick names that match
+the semantics and role of the underlying entities, within reason.  Avoid
+abbreviations unless they are well known.  After picking a good name, make sure
+to use consistent capitalization for the name, as inconsistency requires clients
+to either memorize the APIs or to look it up to find the exact spelling.
+
+In general, names should be in camel case (e.g. ``TextFileReader`` and
+``isLValue()``).  Different kinds of declarations have different rules:
+
+* **Type names** (including classes, structs, enums, typedefs, etc) should be
+  nouns and start with an upper-case letter (e.g. ``TextFileReader``).
+
+* **Variable names** should be nouns (as they represent state).  The name should
+  be camel case, and start with an upper case letter (e.g. ``Leader`` or
+  ``Boats``).
+  
+* **Function names** should be verb phrases (as they represent actions), and
+  command-like function should be imperative.  The name should be camel case,
+  and start with a lower case letter (e.g. ``openFile()`` or ``isFoo()``).
+
+* **Enum declarations** (e.g. ``enum Foo {...}``) are types, so they should
+  follow the naming conventions for types.  A common use for enums is as a
+  discriminator for a union, or an indicator of a subclass.  When an enum is
+  used for something like this, it should have a ``Kind`` suffix
+  (e.g. ``ValueKind``).
+  
+* **Enumerators** (e.g. ``enum { Foo, Bar }``) and **public member variables**
+  should start with an upper-case letter, just like types.  Unless the
+  enumerators are defined in their own small namespace or inside a class,
+  enumerators should have a prefix corresponding to the enum declaration name.
+  For example, ``enum ValueKind { ... };`` may contain enumerators like
+  ``VK_Argument``, ``VK_BasicBlock``, etc.  Enumerators that are just
+  convenience constants are exempt from the requirement for a prefix.  For
+  instance:
+
+  .. code-block:: c++
+
+      enum {
+        MaxSize = 42,
+        Density = 12
+      };
+  
+As an exception, classes that mimic STL classes can have member names in STL's
+style of lower-case words separated by underscores (e.g. ``begin()``,
+``push_back()``, and ``empty()``). Classes that provide multiple
+iterators should add a singular prefix to ``begin()`` and ``end()``
+(e.g. ``global_begin()`` and ``use_begin()``).
+
+Here are some examples of good and bad names:
+
+.. code-block:: c++
+
+  class VehicleMaker {
+    ...
+    Factory<Tire> F;            // Bad -- abbreviation and non-descriptive.
+    Factory<Tire> Factory;      // Better.
+    Factory<Tire> TireFactory;  // Even better -- if VehicleMaker has more than one
+                                // kind of factories.
+  };
+
+  Vehicle makeVehicle(VehicleType Type) {
+    VehicleMaker M;                         // Might be OK if having a short life-span.
+    Tire Tmp1 = M.makeTire();               // Bad -- 'Tmp1' provides no information.
+    Light Headlight = M.makeLight("head");  // Good -- descriptive.
+    ...
+  }
+
+Assert Liberally
+^^^^^^^^^^^^^^^^
+
+Use the "``assert``" macro to its fullest.  Check all of your preconditions and
+assumptions, you never know when a bug (not necessarily even yours) might be
+caught early by an assertion, which reduces debugging time dramatically.  The
+"``<cassert>``" header file is probably already included by the header files you
+are using, so it doesn't cost anything to use it.
+
+To further assist with debugging, make sure to put some kind of error message in
+the assertion statement, which is printed if the assertion is tripped. This
+helps the poor debugger make sense of why an assertion is being made and
+enforced, and hopefully what to do about it.  Here is one complete example:
+
+.. code-block:: c++
+
+  inline Value *getOperand(unsigned I) {
+    assert(I < Operands.size() && "getOperand() out of range!");
+    return Operands[I];
+  }
+
+Here are more examples:
+
+.. code-block:: c++
+
+  assert(Ty->isPointerType() && "Can't allocate a non-pointer type!");
+
+  assert((Opcode == Shl || Opcode == Shr) && "ShiftInst Opcode invalid!");
+
+  assert(idx < getNumSuccessors() && "Successor # out of range!");
+
+  assert(V1.getType() == V2.getType() && "Constant types must be identical!");
+
+  assert(isa<PHINode>(Succ->front()) && "Only works on PHId BBs!");
+
+You get the idea.
+
+In the past, asserts were used to indicate a piece of code that should not be
+reached.  These were typically of the form:
+
+.. code-block:: c++
+
+  assert(0 && "Invalid radix for integer literal");
+
+This has a few issues, the main one being that some compilers might not
+understand the assertion, or warn about a missing return in builds where
+assertions are compiled out.
+
+Today, we have something much better: ``llvm_unreachable``:
+
+.. code-block:: c++
+
+  llvm_unreachable("Invalid radix for integer literal");
+
+When assertions are enabled, this will print the message if it's ever reached
+and then exit the program. When assertions are disabled (i.e. in release
+builds), ``llvm_unreachable`` becomes a hint to compilers to skip generating
+code for this branch. If the compiler does not support this, it will fall back
+to the "abort" implementation.
+
+Another issue is that values used only by assertions will produce an "unused
+value" warning when assertions are disabled.  For example, this code will warn:
+
+.. code-block:: c++
+
+  unsigned Size = V.size();
+  assert(Size > 42 && "Vector smaller than it should be");
+
+  bool NewToSet = Myset.insert(Value);
+  assert(NewToSet && "The value shouldn't be in the set yet");
+
+These are two interesting different cases. In the first case, the call to
+``V.size()`` is only useful for the assert, and we don't want it executed when
+assertions are disabled.  Code like this should move the call into the assert
+itself.  In the second case, the side effects of the call must happen whether
+the assert is enabled or not.  In this case, the value should be cast to void to
+disable the warning.  To be specific, it is preferred to write the code like
+this:
+
+.. code-block:: c++
+
+  assert(V.size() > 42 && "Vector smaller than it should be");
+
+  bool NewToSet = Myset.insert(Value); (void)NewToSet;
+  assert(NewToSet && "The value shouldn't be in the set yet");
+
+Do Not Use ``using namespace std``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In LLVM, we prefer to explicitly prefix all identifiers from the standard
+namespace with an "``std::``" prefix, rather than rely on "``using namespace
+std;``".
+
+In header files, adding a ``'using namespace XXX'`` directive pollutes the
+namespace of any source file that ``#include``\s the header.  This is clearly a
+bad thing.
+
+In implementation files (e.g. ``.cpp`` files), the rule is more of a stylistic
+rule, but is still important.  Basically, using explicit namespace prefixes
+makes the code **clearer**, because it is immediately obvious what facilities
+are being used and where they are coming from. And **more portable**, because
+namespace clashes cannot occur between LLVM code and other namespaces.  The
+portability rule is important because different standard library implementations
+expose different symbols (potentially ones they shouldn't), and future revisions
+to the C++ standard will add more symbols to the ``std`` namespace.  As such, we
+never use ``'using namespace std;'`` in LLVM.
+
+The exception to the general rule (i.e. it's not an exception for the ``std``
+namespace) is for implementation files.  For example, all of the code in the
+LLVM project implements code that lives in the 'llvm' namespace.  As such, it is
+ok, and actually clearer, for the ``.cpp`` files to have a ``'using namespace
+llvm;'`` directive at the top, after the ``#include``\s.  This reduces
+indentation in the body of the file for source editors that indent based on
+braces, and keeps the conceptual context cleaner.  The general form of this rule
+is that any ``.cpp`` file that implements code in any namespace may use that
+namespace (and its parents'), but should not use any others.
+
+Provide a Virtual Method Anchor for Classes in Headers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If a class is defined in a header file and has a vtable (either it has virtual
+methods or it derives from classes with virtual methods), it must always have at
+least one out-of-line virtual method in the class.  Without this, the compiler
+will copy the vtable and RTTI into every ``.o`` file that ``#include``\s the
+header, bloating ``.o`` file sizes and increasing link times.
+
+Don't use default labels in fully covered switches over enumerations
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``-Wswitch`` warns if a switch, without a default label, over an enumeration
+does not cover every enumeration value. If you write a default label on a fully
+covered switch over an enumeration then the ``-Wswitch`` warning won't fire
+when new elements are added to that enumeration. To help avoid adding these
+kinds of defaults, Clang has the warning ``-Wcovered-switch-default`` which is
+off by default but turned on when building LLVM with a version of Clang that
+supports the warning.
+
+A knock-on effect of this stylistic requirement is that when building LLVM with
+GCC you may get warnings related to "control may reach end of non-void function"
+if you return from each case of a covered switch-over-enum because GCC assumes
+that the enum expression may take any representable value, not just those of
+individual enumerators. To suppress this warning, use ``llvm_unreachable`` after
+the switch.
+
+Don't evaluate ``end()`` every time through a loop
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Because C++ doesn't have a standard "``foreach``" loop (though it can be
+emulated with macros and may be coming in C++'0x) we end up writing a lot of
+loops that manually iterate from begin to end on a variety of containers or
+through other data structures.  One common mistake is to write a loop in this
+style:
+
+.. code-block:: c++
+
+  BasicBlock *BB = ...
+  for (BasicBlock::iterator I = BB->begin(); I != BB->end(); ++I)
+    ... use I ...
+
+The problem with this construct is that it evaluates "``BB->end()``" every time
+through the loop.  Instead of writing the loop like this, we strongly prefer
+loops to be written so that they evaluate it once before the loop starts.  A
+convenient way to do this is like so:
+
+.. code-block:: c++
+
+  BasicBlock *BB = ...
+  for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I)
+    ... use I ...
+
+The observant may quickly point out that these two loops may have different
+semantics: if the container (a basic block in this case) is being mutated, then
+"``BB->end()``" may change its value every time through the loop and the second
+loop may not in fact be correct.  If you actually do depend on this behavior,
+please write the loop in the first form and add a comment indicating that you
+did it intentionally.
+
+Why do we prefer the second form (when correct)?  Writing the loop in the first
+form has two problems. First it may be less efficient than evaluating it at the
+start of the loop.  In this case, the cost is probably minor --- a few extra
+loads every time through the loop.  However, if the base expression is more
+complex, then the cost can rise quickly.  I've seen loops where the end
+expression was actually something like: "``SomeMap[X]->end()``" and map lookups
+really aren't cheap.  By writing it in the second form consistently, you
+eliminate the issue entirely and don't even have to think about it.
+
+The second (even bigger) issue is that writing the loop in the first form hints
+to the reader that the loop is mutating the container (a fact that a comment
+would handily confirm!).  If you write the loop in the second form, it is
+immediately obvious without even looking at the body of the loop that the
+container isn't being modified, which makes it easier to read the code and
+understand what it does.
+
+While the second form of the loop is a few extra keystrokes, we do strongly
+prefer it.
+
+``#include <iostream>`` is Forbidden
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The use of ``#include <iostream>`` in library files is hereby **forbidden**,
+because many common implementations transparently inject a `static constructor`_
+into every translation unit that includes it.
+  
+Note that using the other stream headers (``<sstream>`` for example) is not
+problematic in this regard --- just ``<iostream>``. However, ``raw_ostream``
+provides various APIs that are better performing for almost every use than
+``std::ostream`` style APIs.
+
+.. note::
+
+  New code should always use `raw_ostream`_ for writing, or the
+  ``llvm::MemoryBuffer`` API for reading files.
+
+.. _raw_ostream:
+
+Use ``raw_ostream``
+^^^^^^^^^^^^^^^^^^^
+
+LLVM includes a lightweight, simple, and efficient stream implementation in
+``llvm/Support/raw_ostream.h``, which provides all of the common features of
+``std::ostream``.  All new code should use ``raw_ostream`` instead of
+``ostream``.
+
+Unlike ``std::ostream``, ``raw_ostream`` is not a template and can be forward
+declared as ``class raw_ostream``.  Public headers should generally not include
+the ``raw_ostream`` header, but use forward declarations and constant references
+to ``raw_ostream`` instances.
+
+Avoid ``std::endl``
+^^^^^^^^^^^^^^^^^^^
+
+The ``std::endl`` modifier, when used with ``iostreams`` outputs a newline to
+the output stream specified.  In addition to doing this, however, it also
+flushes the output stream.  In other words, these are equivalent:
+
+.. code-block:: c++
+
+  std::cout << std::endl;
+  std::cout << '\n' << std::flush;
+
+Most of the time, you probably have no reason to flush the output stream, so
+it's better to use a literal ``'\n'``.
+
+Don't use ``inline`` when defining a function in a class definition
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A member function defined in a class definition is implicitly inline, so don't
+put the ``inline`` keyword in this case.
+
+Don't:
+
+.. code-block:: c++
+
+  class Foo {
+  public:
+    inline void bar() {
+      // ...
+    }
+  };
+
+Do:
+
+.. code-block:: c++
+
+  class Foo {
+  public:
+    void bar() {
+      // ...
+    }
+  };
+
+Microscopic Details
+-------------------
+
+This section describes preferred low-level formatting guidelines along with
+reasoning on why we prefer them.
+
+Spaces Before Parentheses
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+We prefer to put a space before an open parenthesis only in control flow
+statements, but not in normal function call expressions and function-like
+macros.  For example, this is good:
+
+.. code-block:: c++
+
+  if (X) ...
+  for (I = 0; I != 100; ++I) ...
+  while (LLVMRocks) ...
+
+  somefunc(42);
+  assert(3 != 4 && "laws of math are failing me");
+  
+  A = foo(42, 92) + bar(X);
+
+and this is bad:
+
+.. code-block:: c++
+
+  if(X) ...
+  for(I = 0; I != 100; ++I) ...
+  while(LLVMRocks) ...
+
+  somefunc (42);
+  assert (3 != 4 && "laws of math are failing me");
+  
+  A = foo (42, 92) + bar (X);
+
+The reason for doing this is not completely arbitrary.  This style makes control
+flow operators stand out more, and makes expressions flow better. The function
+call operator binds very tightly as a postfix operator.  Putting a space after a
+function name (as in the last example) makes it appear that the code might bind
+the arguments of the left-hand-side of a binary operator with the argument list
+of a function and the name of the right side.  More specifically, it is easy to
+misread the "``A``" example as:
+
+.. code-block:: c++
+
+  A = foo ((42, 92) + bar) (X);
+
+when skimming through the code.  By avoiding a space in a function, we avoid
+this misinterpretation.
+
+Prefer Preincrement
+^^^^^^^^^^^^^^^^^^^
+
+Hard fast rule: Preincrement (``++X``) may be no slower than postincrement
+(``X++``) and could very well be a lot faster than it.  Use preincrementation
+whenever possible.
+
+The semantics of postincrement include making a copy of the value being
+incremented, returning it, and then preincrementing the "work value".  For
+primitive types, this isn't a big deal. But for iterators, it can be a huge
+issue (for example, some iterators contains stack and set objects in them...
+copying an iterator could invoke the copy ctor's of these as well).  In general,
+get in the habit of always using preincrement, and you won't have a problem.
+
+
+Namespace Indentation
+^^^^^^^^^^^^^^^^^^^^^
+
+In general, we strive to reduce indentation wherever possible.  This is useful
+because we want code to `fit into 80 columns`_ without wrapping horribly, but
+also because it makes it easier to understand the code. To facilitate this and
+avoid some insanely deep nesting on occasion, don't indent namespaces. If it
+helps readability, feel free to add a comment indicating what namespace is
+being closed by a ``}``.  For example:
+
+.. code-block:: c++
+
+  namespace llvm {
+  namespace knowledge {
+
+  /// This class represents things that Smith can have an intimate
+  /// understanding of and contains the data associated with it.
+  class Grokable {
+  ...
+  public:
+    explicit Grokable() { ... }
+    virtual ~Grokable() = 0;
+  
+    ...
+
+  };
+
+  } // end namespace knowledge
+  } // end namespace llvm
+
+
+Feel free to skip the closing comment when the namespace being closed is
+obvious for any reason. For example, the outer-most namespace in a header file
+is rarely a source of confusion. But namespaces both anonymous and named in
+source files that are being closed half way through the file probably could use
+clarification.
+
+.. _static:
+
+Anonymous Namespaces
+^^^^^^^^^^^^^^^^^^^^
+
+After talking about namespaces in general, you may be wondering about anonymous
+namespaces in particular.  Anonymous namespaces are a great language feature
+that tells the C++ compiler that the contents of the namespace are only visible
+within the current translation unit, allowing more aggressive optimization and
+eliminating the possibility of symbol name collisions.  Anonymous namespaces are
+to C++ as "static" is to C functions and global variables.  While "``static``"
+is available in C++, anonymous namespaces are more general: they can make entire
+classes private to a file.
+
+The problem with anonymous namespaces is that they naturally want to encourage
+indentation of their body, and they reduce locality of reference: if you see a
+random function definition in a C++ file, it is easy to see if it is marked
+static, but seeing if it is in an anonymous namespace requires scanning a big
+chunk of the file.
+
+Because of this, we have a simple guideline: make anonymous namespaces as small
+as possible, and only use them for class declarations.  For example, this is
+good:
+
+.. code-block:: c++
+
+  namespace {
+  class StringSort {
+  ...
+  public:
+    StringSort(...)
+    bool operator<(const char *RHS) const;
+  };
+  } // end anonymous namespace
+
+  static void runHelper() { 
+    ... 
+  }
+
+  bool StringSort::operator<(const char *RHS) const {
+    ...
+  }
+
+This is bad:
+
+.. code-block:: c++
+
+  namespace {
+
+  class StringSort {
+  ...
+  public:
+    StringSort(...)
+    bool operator<(const char *RHS) const;
+  };
+
+  void runHelper() { 
+    ... 
+  }
+
+  bool StringSort::operator<(const char *RHS) const {
+    ...
+  }
+
+  } // end anonymous namespace
+
+This is bad specifically because if you're looking at "``runHelper``" in the middle
+of a large C++ file, that you have no immediate way to tell if it is local to
+the file.  When it is marked static explicitly, this is immediately obvious.
+Also, there is no reason to enclose the definition of "``operator<``" in the
+namespace just because it was declared there.
+
+See Also
+========
+
+A lot of these comments and recommendations have been culled from other sources.
+Two particularly important books for our work are:
+
+#. `Effective C++
+   <http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876>`_
+   by Scott Meyers.  Also interesting and useful are "More Effective C++" and
+   "Effective STL" by the same author.
+
+#. `Large-Scale C++ Software Design
+   <http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1>`_
+   by John Lakos
+
+If you get some free time, and you haven't read them: do so, you might learn
+something.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/FileCheck.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/FileCheck.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/FileCheck.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/FileCheck.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,482 @@
+FileCheck - Flexible pattern matching file verifier
+===================================================
+
+SYNOPSIS
+--------
+
+:program:`FileCheck` *match-filename* [*--check-prefix=XXX*] [*--strict-whitespace*]
+
+DESCRIPTION
+-----------
+
+:program:`FileCheck` reads two files (one from standard input, and one
+specified on the command line) and uses one to verify the other.  This
+behavior is particularly useful for the testsuite, which wants to verify that
+the output of some tool (e.g. :program:`llc`) contains the expected information
+(for example, a movsd from esp or whatever is interesting).  This is similar to
+using :program:`grep`, but it is optimized for matching multiple different
+inputs in one file in a specific order.
+
+The ``match-filename`` file specifies the file that contains the patterns to
+match.  The file to verify is read from standard input unless the
+:option:`--input-file` option is used.
+
+OPTIONS
+-------
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: --check-prefix prefix
+
+ FileCheck searches the contents of ``match-filename`` for patterns to
+ match.  By default, these patterns are prefixed with "``CHECK:``".
+ If you'd like to use a different prefix (e.g. because the same input
+ file is checking multiple different tool or options), the
+ :option:`--check-prefix` argument allows you to specify one or more
+ prefixes to match. Multiple prefixes are useful for tests which might
+ change for different run options, but most lines remain the same.
+
+.. option:: --check-prefixes prefix1,prefix2,...
+
+ An alias of :option:`--check-prefix` that allows multiple prefixes to be
+ specified as a comma separated list.
+
+.. option:: --input-file filename
+
+  File to check (defaults to stdin).
+
+.. option:: --match-full-lines
+
+ By default, FileCheck allows matches of anywhere on a line. This
+ option will require all positive matches to cover an entire
+ line. Leading and trailing whitespace is ignored, unless
+ :option:`--strict-whitespace` is also specified. (Note: negative
+ matches from ``CHECK-NOT`` are not affected by this option!)
+
+ Passing this option is equivalent to inserting ``{{^ *}}`` or
+ ``{{^}}`` before, and ``{{ *$}}`` or ``{{$}}`` after every positive
+ check pattern.
+
+.. option:: --strict-whitespace
+
+ By default, FileCheck canonicalizes input horizontal whitespace (spaces and
+ tabs) which causes it to ignore these differences (a space will match a tab).
+ The :option:`--strict-whitespace` argument disables this behavior. End-of-line
+ sequences are canonicalized to UNIX-style ``\n`` in all modes.
+
+.. option:: --implicit-check-not check-pattern
+
+  Adds implicit negative checks for the specified patterns between positive
+  checks. The option allows writing stricter tests without stuffing them with
+  ``CHECK-NOT``\ s.
+
+  For example, "``--implicit-check-not warning:``" can be useful when testing
+  diagnostic messages from tools that don't have an option similar to ``clang
+  -verify``. With this option FileCheck will verify that input does not contain
+  warnings not covered by any ``CHECK:`` patterns.
+
+.. option:: -version
+
+ Show the version number of this program.
+
+EXIT STATUS
+-----------
+
+If :program:`FileCheck` verifies that the file matches the expected contents,
+it exits with 0.  Otherwise, if not, or if an error occurs, it will exit with a
+non-zero value.
+
+TUTORIAL
+--------
+
+FileCheck is typically used from LLVM regression tests, being invoked on the RUN
+line of the test.  A simple example of using FileCheck from a RUN line looks
+like this:
+
+.. code-block:: llvm
+
+   ; RUN: llvm-as < %s | llc -march=x86-64 | FileCheck %s
+
+This syntax says to pipe the current file ("``%s``") into ``llvm-as``, pipe
+that into ``llc``, then pipe the output of ``llc`` into ``FileCheck``.  This
+means that FileCheck will be verifying its standard input (the llc output)
+against the filename argument specified (the original ``.ll`` file specified by
+"``%s``").  To see how this works, let's look at the rest of the ``.ll`` file
+(after the RUN line):
+
+.. code-block:: llvm
+
+   define void @sub1(i32* %p, i32 %v) {
+   entry:
+   ; CHECK: sub1:
+   ; CHECK: subl
+           %0 = tail call i32 @llvm.atomic.load.sub.i32.p0i32(i32* %p, i32 %v)
+           ret void
+   }
+
+   define void @inc4(i64* %p) {
+   entry:
+   ; CHECK: inc4:
+   ; CHECK: incq
+           %0 = tail call i64 @llvm.atomic.load.add.i64.p0i64(i64* %p, i64 1)
+           ret void
+   }
+
+Here you can see some "``CHECK:``" lines specified in comments.  Now you can
+see how the file is piped into ``llvm-as``, then ``llc``, and the machine code
+output is what we are verifying.  FileCheck checks the machine code output to
+verify that it matches what the "``CHECK:``" lines specify.
+
+The syntax of the "``CHECK:``" lines is very simple: they are fixed strings that
+must occur in order.  FileCheck defaults to ignoring horizontal whitespace
+differences (e.g. a space is allowed to match a tab) but otherwise, the contents
+of the "``CHECK:``" line is required to match some thing in the test file exactly.
+
+One nice thing about FileCheck (compared to grep) is that it allows merging
+test cases together into logical groups.  For example, because the test above
+is checking for the "``sub1:``" and "``inc4:``" labels, it will not match
+unless there is a "``subl``" in between those labels.  If it existed somewhere
+else in the file, that would not count: "``grep subl``" matches if "``subl``"
+exists anywhere in the file.
+
+The FileCheck -check-prefix option
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The FileCheck `-check-prefix` option allows multiple test
+configurations to be driven from one `.ll` file.  This is useful in many
+circumstances, for example, testing different architectural variants with
+:program:`llc`.  Here's a simple example:
+
+.. code-block:: llvm
+
+   ; RUN: llvm-as < %s | llc -mtriple=i686-apple-darwin9 -mattr=sse41 \
+   ; RUN:              | FileCheck %s -check-prefix=X32
+   ; RUN: llvm-as < %s | llc -mtriple=x86_64-apple-darwin9 -mattr=sse41 \
+   ; RUN:              | FileCheck %s -check-prefix=X64
+
+   define <4 x i32> @pinsrd_1(i32 %s, <4 x i32> %tmp) nounwind {
+           %tmp1 = insertelement <4 x i32>; %tmp, i32 %s, i32 1
+           ret <4 x i32> %tmp1
+   ; X32: pinsrd_1:
+   ; X32:    pinsrd $1, 4(%esp), %xmm0
+
+   ; X64: pinsrd_1:
+   ; X64:    pinsrd $1, %edi, %xmm0
+   }
+
+In this case, we're testing that we get the expected code generation with
+both 32-bit and 64-bit code generation.
+
+The "CHECK-NEXT:" directive
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes you want to match lines and would like to verify that matches
+happen on exactly consecutive lines with no other lines in between them.  In
+this case, you can use "``CHECK:``" and "``CHECK-NEXT:``" directives to specify
+this.  If you specified a custom check prefix, just use "``<PREFIX>-NEXT:``".
+For example, something like this works as you'd expect:
+
+.. code-block:: llvm
+
+   define void @t2(<2 x double>* %r, <2 x double>* %A, double %B) {
+ 	%tmp3 = load <2 x double>* %A, align 16
+ 	%tmp7 = insertelement <2 x double> undef, double %B, i32 0
+ 	%tmp9 = shufflevector <2 x double> %tmp3,
+                               <2 x double> %tmp7,
+                               <2 x i32> < i32 0, i32 2 >
+ 	store <2 x double> %tmp9, <2 x double>* %r, align 16
+ 	ret void
+
+   ; CHECK:          t2:
+   ; CHECK: 	        movl	8(%esp), %eax
+   ; CHECK-NEXT: 	movapd	(%eax), %xmm0
+   ; CHECK-NEXT: 	movhpd	12(%esp), %xmm0
+   ; CHECK-NEXT: 	movl	4(%esp), %eax
+   ; CHECK-NEXT: 	movapd	%xmm0, (%eax)
+   ; CHECK-NEXT: 	ret
+   }
+
+"``CHECK-NEXT:``" directives reject the input unless there is exactly one
+newline between it and the previous directive.  A "``CHECK-NEXT:``" cannot be
+the first directive in a file.
+
+The "CHECK-SAME:" directive
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes you want to match lines and would like to verify that matches happen
+on the same line as the previous match.  In this case, you can use "``CHECK:``"
+and "``CHECK-SAME:``" directives to specify this.  If you specified a custom
+check prefix, just use "``<PREFIX>-SAME:``".
+
+"``CHECK-SAME:``" is particularly powerful in conjunction with "``CHECK-NOT:``"
+(described below).
+
+For example, the following works like you'd expect:
+
+.. code-block:: llvm
+
+   !0 = !DILocation(line: 5, scope: !1, inlinedAt: !2)
+
+   ; CHECK:       !DILocation(line: 5,
+   ; CHECK-NOT:               column:
+   ; CHECK-SAME:              scope: ![[SCOPE:[0-9]+]]
+
+"``CHECK-SAME:``" directives reject the input if there are any newlines between
+it and the previous directive.  A "``CHECK-SAME:``" cannot be the first
+directive in a file.
+
+The "CHECK-NOT:" directive
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The "``CHECK-NOT:``" directive is used to verify that a string doesn't occur
+between two matches (or before the first match, or after the last match).  For
+example, to verify that a load is removed by a transformation, a test like this
+can be used:
+
+.. code-block:: llvm
+
+   define i8 @coerce_offset0(i32 %V, i32* %P) {
+     store i32 %V, i32* %P
+
+     %P2 = bitcast i32* %P to i8*
+     %P3 = getelementptr i8* %P2, i32 2
+
+     %A = load i8* %P3
+     ret i8 %A
+   ; CHECK: @coerce_offset0
+   ; CHECK-NOT: load
+   ; CHECK: ret i8
+   }
+
+The "CHECK-DAG:" directive
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If it's necessary to match strings that don't occur in a strictly sequential
+order, "``CHECK-DAG:``" could be used to verify them between two matches (or
+before the first match, or after the last match). For example, clang emits
+vtable globals in reverse order. Using ``CHECK-DAG:``, we can keep the checks
+in the natural order:
+
+.. code-block:: c++
+
+    // RUN: %clang_cc1 %s -emit-llvm -o - | FileCheck %s
+
+    struct Foo { virtual void method(); };
+    Foo f;  // emit vtable
+    // CHECK-DAG: @_ZTV3Foo =
+
+    struct Bar { virtual void method(); };
+    Bar b;
+    // CHECK-DAG: @_ZTV3Bar =
+
+``CHECK-NOT:`` directives could be mixed with ``CHECK-DAG:`` directives to
+exclude strings between the surrounding ``CHECK-DAG:`` directives. As a result,
+the surrounding ``CHECK-DAG:`` directives cannot be reordered, i.e. all
+occurrences matching ``CHECK-DAG:`` before ``CHECK-NOT:`` must not fall behind
+occurrences matching ``CHECK-DAG:`` after ``CHECK-NOT:``. For example,
+
+.. code-block:: llvm
+
+   ; CHECK-DAG: BEFORE
+   ; CHECK-NOT: NOT
+   ; CHECK-DAG: AFTER
+
+This case will reject input strings where ``BEFORE`` occurs after ``AFTER``.
+
+With captured variables, ``CHECK-DAG:`` is able to match valid topological
+orderings of a DAG with edges from the definition of a variable to its use.
+It's useful, e.g., when your test cases need to match different output
+sequences from the instruction scheduler. For example,
+
+.. code-block:: llvm
+
+   ; CHECK-DAG: add [[REG1:r[0-9]+]], r1, r2
+   ; CHECK-DAG: add [[REG2:r[0-9]+]], r3, r4
+   ; CHECK:     mul r5, [[REG1]], [[REG2]]
+
+In this case, any order of that two ``add`` instructions will be allowed.
+
+If you are defining `and` using variables in the same ``CHECK-DAG:`` block,
+be aware that the definition rule can match `after` its use.
+
+So, for instance, the code below will pass:
+
+.. code-block:: text
+
+  ; CHECK-DAG: vmov.32 [[REG2:d[0-9]+]][0]
+  ; CHECK-DAG: vmov.32 [[REG2]][1]
+  vmov.32 d0[1]
+  vmov.32 d0[0]
+
+While this other code, will not:
+
+.. code-block:: text
+
+  ; CHECK-DAG: vmov.32 [[REG2:d[0-9]+]][0]
+  ; CHECK-DAG: vmov.32 [[REG2]][1]
+  vmov.32 d1[1]
+  vmov.32 d0[0]
+
+While this can be very useful, it's also dangerous, because in the case of
+register sequence, you must have a strong order (read before write, copy before
+use, etc). If the definition your test is looking for doesn't match (because
+of a bug in the compiler), it may match further away from the use, and mask
+real bugs away.
+
+In those cases, to enforce the order, use a non-DAG directive between DAG-blocks.
+
+The "CHECK-LABEL:" directive
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes in a file containing multiple tests divided into logical blocks, one
+or more ``CHECK:`` directives may inadvertently succeed by matching lines in a
+later block. While an error will usually eventually be generated, the check
+flagged as causing the error may not actually bear any relationship to the
+actual source of the problem.
+
+In order to produce better error messages in these cases, the "``CHECK-LABEL:``"
+directive can be used. It is treated identically to a normal ``CHECK``
+directive except that FileCheck makes an additional assumption that a line
+matched by the directive cannot also be matched by any other check present in
+``match-filename``; this is intended to be used for lines containing labels or
+other unique identifiers. Conceptually, the presence of ``CHECK-LABEL`` divides
+the input stream into separate blocks, each of which is processed independently,
+preventing a ``CHECK:`` directive in one block matching a line in another block.
+For example,
+
+.. code-block:: llvm
+
+  define %struct.C* @C_ctor_base(%struct.C* %this, i32 %x) {
+  entry:
+  ; CHECK-LABEL: C_ctor_base:
+  ; CHECK: mov [[SAVETHIS:r[0-9]+]], r0
+  ; CHECK: bl A_ctor_base
+  ; CHECK: mov r0, [[SAVETHIS]]
+    %0 = bitcast %struct.C* %this to %struct.A*
+    %call = tail call %struct.A* @A_ctor_base(%struct.A* %0)
+    %1 = bitcast %struct.C* %this to %struct.B*
+    %call2 = tail call %struct.B* @B_ctor_base(%struct.B* %1, i32 %x)
+    ret %struct.C* %this
+  }
+
+  define %struct.D* @D_ctor_base(%struct.D* %this, i32 %x) {
+  entry:
+  ; CHECK-LABEL: D_ctor_base:
+
+The use of ``CHECK-LABEL:`` directives in this case ensures that the three
+``CHECK:`` directives only accept lines corresponding to the body of the
+``@C_ctor_base`` function, even if the patterns match lines found later in
+the file. Furthermore, if one of these three ``CHECK:`` directives fail,
+FileCheck will recover by continuing to the next block, allowing multiple test
+failures to be detected in a single invocation.
+
+There is no requirement that ``CHECK-LABEL:`` directives contain strings that
+correspond to actual syntactic labels in a source or output language: they must
+simply uniquely match a single line in the file being verified.
+
+``CHECK-LABEL:`` directives cannot contain variable definitions or uses.
+
+FileCheck Pattern Matching Syntax
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+All FileCheck directives take a pattern to match.
+For most uses of FileCheck, fixed string matching is perfectly sufficient.  For
+some things, a more flexible form of matching is desired.  To support this,
+FileCheck allows you to specify regular expressions in matching strings,
+surrounded by double braces: ``{{yourregex}}``.  Because we want to use fixed
+string matching for a majority of what we do, FileCheck has been designed to
+support mixing and matching fixed string matching with regular expressions.
+This allows you to write things like this:
+
+.. code-block:: llvm
+
+   ; CHECK: movhpd	{{[0-9]+}}(%esp), {{%xmm[0-7]}}
+
+In this case, any offset from the ESP register will be allowed, and any xmm
+register will be allowed.
+
+Because regular expressions are enclosed with double braces, they are
+visually distinct, and you don't need to use escape characters within the double
+braces like you would in C.  In the rare case that you want to match double
+braces explicitly from the input, you can use something ugly like
+``{{[{][{]}}`` as your pattern.
+
+FileCheck Variables
+~~~~~~~~~~~~~~~~~~~
+
+It is often useful to match a pattern and then verify that it occurs again
+later in the file.  For codegen tests, this can be useful to allow any register,
+but verify that that register is used consistently later.  To do this,
+:program:`FileCheck` allows named variables to be defined and substituted into
+patterns.  Here is a simple example:
+
+.. code-block:: llvm
+
+   ; CHECK: test5:
+   ; CHECK:    notw	[[REGISTER:%[a-z]+]]
+   ; CHECK:    andw	{{.*}}[[REGISTER]]
+
+The first check line matches a regex ``%[a-z]+`` and captures it into the
+variable ``REGISTER``.  The second line verifies that whatever is in
+``REGISTER`` occurs later in the file after an "``andw``".  :program:`FileCheck`
+variable references are always contained in ``[[ ]]`` pairs, and their names can
+be formed with the regex ``[a-zA-Z][a-zA-Z0-9]*``.  If a colon follows the name,
+then it is a definition of the variable; otherwise, it is a use.
+
+:program:`FileCheck` variables can be defined multiple times, and uses always
+get the latest value.  Variables can also be used later on the same line they
+were defined on. For example:
+
+.. code-block:: llvm
+
+    ; CHECK: op [[REG:r[0-9]+]], [[REG]]
+
+Can be useful if you want the operands of ``op`` to be the same register,
+and don't care exactly which register it is.
+
+FileCheck Expressions
+~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes there's a need to verify output which refers line numbers of the
+match file, e.g. when testing compiler diagnostics.  This introduces a certain
+fragility of the match file structure, as "``CHECK:``" lines contain absolute
+line numbers in the same file, which have to be updated whenever line numbers
+change due to text addition or deletion.
+
+To support this case, FileCheck allows using ``[[@LINE]]``,
+``[[@LINE+<offset>]]``, ``[[@LINE-<offset>]]`` expressions in patterns. These
+expressions expand to a number of the line where a pattern is located (with an
+optional integer offset).
+
+This way match patterns can be put near the relevant test lines and include
+relative line number references, for example:
+
+.. code-block:: c++
+
+   // CHECK: test.cpp:[[@LINE+4]]:6: error: expected ';' after top level declarator
+   // CHECK-NEXT: {{^int a}}
+   // CHECK-NEXT: {{^     \^}}
+   // CHECK-NEXT: {{^     ;}}
+   int a
+
+Matching Newline Characters
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To match newline characters in regular expressions the character class
+``[[:space:]]`` can be used. For example, the following pattern:
+
+.. code-block:: c++
+
+   // CHECK: DW_AT_location [DW_FORM_sec_offset] ([[DLOC:0x[0-9a-f]+]]){{[[:space:]].*}}"intd"
+
+matches output of the form (from llvm-dwarfdump):
+
+.. code-block:: text
+
+       DW_AT_location [DW_FORM_sec_offset]   (0x00000233)
+       DW_AT_name [DW_FORM_strp]  ( .debug_str[0x000000c9] = "intd")
+
+letting us set the :program:`FileCheck` variable ``DLOC`` to the desired value 
+``0x00000233``, extracted from the line immediately preceding "``intd``".

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/bugpoint.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/bugpoint.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/bugpoint.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/bugpoint.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,196 @@
+bugpoint - automatic test case reduction tool
+=============================================
+
+SYNOPSIS
+--------
+
+**bugpoint** [*options*] [*input LLVM ll/bc files*] [*LLVM passes*] **--args**
+*program arguments*
+
+DESCRIPTION
+-----------
+
+**bugpoint** narrows down the source of problems in LLVM tools and passes.  It
+can be used to debug three types of failures: optimizer crashes, miscompilations
+by optimizers, or bad native code generation (including problems in the static
+and JIT compilers).  It aims to reduce large test cases to small, useful ones.
+For more information on the design and inner workings of **bugpoint**, as well as
+advice for using bugpoint, see :doc:`/Bugpoint` in the LLVM
+distribution.
+
+OPTIONS
+-------
+
+**--additional-so** *library*
+
+ Load the dynamic shared object *library* into the test program whenever it is
+ run.  This is useful if you are debugging programs which depend on non-LLVM
+ libraries (such as the X or curses libraries) to run.
+
+**--append-exit-code**\ =\ *{true,false}*
+
+ Append the test programs exit code to the output file so that a change in exit
+ code is considered a test failure. Defaults to false.
+
+**--args** *program args*
+
+ Pass all arguments specified after **--args** to the test program whenever it runs.
+ Note that if any of the *program args* start with a "``-``", you should use:
+
+ .. code-block:: bash
+
+      bugpoint [bugpoint args] --args -- [program args]
+
+ The "``--``" right after the **--args** option tells **bugpoint** to consider
+ any options starting with "``-``" to be part of the **--args** option, not as
+ options to **bugpoint** itself.
+
+**--tool-args** *tool args*
+
+ Pass all arguments specified after **--tool-args** to the LLVM tool under test
+ (**llc**, **lli**, etc.) whenever it runs.  You should use this option in the
+ following way:
+
+ .. code-block:: bash
+
+      bugpoint [bugpoint args] --tool-args -- [tool args]
+
+ The "``--``" right after the **--tool-args** option tells **bugpoint** to
+ consider any options starting with "``-``" to be part of the **--tool-args**
+ option, not as options to **bugpoint** itself. (See **--args**, above.)
+
+**--safe-tool-args** *tool args*
+
+ Pass all arguments specified after **--safe-tool-args** to the "safe" execution
+ tool.
+
+**--gcc-tool-args** *gcc tool args*
+
+ Pass all arguments specified after **--gcc-tool-args** to the invocation of
+ **gcc**.
+
+**--opt-args** *opt args*
+
+ Pass all arguments specified after **--opt-args** to the invocation of **opt**.
+
+**--disable-{dce,simplifycfg}**
+
+ Do not run the specified passes to clean up and reduce the size of the test
+ program. By default, **bugpoint** uses these passes internally when attempting to
+ reduce test programs.  If you're trying to find a bug in one of these passes,
+ **bugpoint** may crash.
+
+**--enable-valgrind**
+
+ Use valgrind to find faults in the optimization phase. This will allow
+ bugpoint to find otherwise asymptomatic problems caused by memory
+ mis-management.
+
+**-find-bugs**
+
+ Continually randomize the specified passes and run them on the test program
+ until a bug is found or the user kills **bugpoint**.
+
+**-help**
+
+ Print a summary of command line options.
+
+**--input** *filename*
+
+ Open *filename* and redirect the standard input of the test program, whenever
+ it runs, to come from that file.
+
+**--load** *plugin*
+
+ Load the dynamic object *plugin* into **bugpoint** itself.  This object should
+ register new optimization passes.  Once loaded, the object will add new command
+ line options to enable various optimizations.  To see the new complete list of
+ optimizations, use the **-help** and **--load** options together; for example:
+
+
+ .. code-block:: bash
+
+      bugpoint --load myNewPass.so -help
+
+**--mlimit** *megabytes*
+
+ Specifies an upper limit on memory usage of the optimization and codegen. Set
+ to zero to disable the limit.
+
+**--output** *filename*
+
+ Whenever the test program produces output on its standard output stream, it
+ should match the contents of *filename* (the "reference output"). If you
+ do not use this option, **bugpoint** will attempt to generate a reference output
+ by compiling the program with the "safe" backend and running it.
+
+**--run-{int,jit,llc,custom}**
+
+ Whenever the test program is compiled, **bugpoint** should generate code for it
+ using the specified code generator.  These options allow you to choose the
+ interpreter, the JIT compiler, the static native code compiler, or a
+ custom command (see **--exec-command**) respectively.
+
+**--safe-{llc,custom}**
+
+ When debugging a code generator, **bugpoint** should use the specified code
+ generator as the "safe" code generator. This is a known-good code generator
+ used to generate the "reference output" if it has not been provided, and to
+ compile portions of the program that as they are excluded from the testcase.
+ These options allow you to choose the
+ static native code compiler, or a custom command, (see **--exec-command**)
+ respectively. The interpreter and the JIT backends cannot currently
+ be used as the "safe" backends.
+
+**--exec-command** *command*
+
+ This option defines the command to use with the **--run-custom** and
+ **--safe-custom** options to execute the bitcode testcase. This can
+ be useful for cross-compilation.
+
+**--compile-command** *command*
+
+ This option defines the command to use with the **--compile-custom**
+ option to compile the bitcode testcase. The command should exit with a
+ failure exit code if the file is "interesting" and should exit with a
+ success exit code (i.e. 0) otherwise (this is the same as if it crashed on
+ "interesting" inputs).
+
+ This can be useful for
+ testing compiler output without running any link or execute stages. To
+ generate a reduced unit test, you may add CHECK directives to the
+ testcase and pass the name of an executable compile-command script in this form:
+
+ .. code-block:: sh
+
+      #!/bin/sh
+      llc "$@"
+      not FileCheck [bugpoint input file].ll < bugpoint-test-program.s
+
+ This script will "fail" as long as FileCheck passes. So the result
+ will be the minimum bitcode that passes FileCheck.
+
+**--safe-path** *path*
+
+ This option defines the path to the command to execute with the
+ **--safe-{int,jit,llc,custom}**
+ option.
+
+**--verbose-errors**\ =\ *{true,false}*
+
+ The default behavior of bugpoint is to print "<crash>" when it finds a reduced
+ test that crashes compilation. This flag prints the output of the crashing
+ program to stderr. This is useful to make sure it is the same error being
+ tracked down and not a different error that happens to crash the compiler as
+ well. Defaults to false.
+
+EXIT STATUS
+-----------
+
+If **bugpoint** succeeds in finding a problem, it will exit with 0.  Otherwise,
+if an error occurs, it will exit with a non-zero value.
+
+SEE ALSO
+--------
+
+opt|opt

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/index.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/index.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/index.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/index.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,54 @@
+LLVM Command Guide
+------------------
+
+The following documents are command descriptions for all of the LLVM tools.
+These pages describe how to use the LLVM commands and what their options are.
+Note that these pages do not describe all of the options available for all
+tools. To get a complete listing, pass the ``--help`` (general options) or
+``--help-hidden`` (general and debugging options) arguments to the tool you are
+interested in.
+
+Basic Commands
+~~~~~~~~~~~~~~
+
+.. toctree::
+   :maxdepth: 1
+
+   llvm-as
+   llvm-dis
+   opt
+   llc
+   lli
+   llvm-link
+   llvm-ar
+   llvm-lib
+   llvm-nm
+   llvm-config
+   llvm-diff
+   llvm-cov
+   llvm-profdata
+   llvm-stress
+   llvm-symbolizer
+   llvm-dwarfdump
+
+Debugging Tools
+~~~~~~~~~~~~~~~
+
+.. toctree::
+   :maxdepth: 1
+
+   bugpoint
+   llvm-extract
+   llvm-bcanalyzer
+
+Developer Tools
+~~~~~~~~~~~~~~~
+
+.. toctree::
+   :maxdepth: 1
+
+   FileCheck
+   tblgen
+   lit
+   llvm-build
+   llvm-readobj

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/lit.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/lit.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/lit.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/lit.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,451 @@
+lit - LLVM Integrated Tester
+============================
+
+SYNOPSIS
+--------
+
+:program:`lit` [*options*] [*tests*]
+
+DESCRIPTION
+-----------
+
+:program:`lit` is a portable tool for executing LLVM and Clang style test
+suites, summarizing their results, and providing indication of failures.
+:program:`lit` is designed to be a lightweight testing tool with as simple a
+user interface as possible.
+
+:program:`lit` should be run with one or more *tests* to run specified on the
+command line.  Tests can be either individual test files or directories to
+search for tests (see :ref:`test-discovery`).
+
+Each specified test will be executed (potentially in parallel) and once all
+tests have been run :program:`lit` will print summary information on the number
+of tests which passed or failed (see :ref:`test-status-results`).  The
+:program:`lit` program will execute with a non-zero exit code if any tests
+fail.
+
+By default :program:`lit` will use a succinct progress display and will only
+print summary information for test failures.  See :ref:`output-options` for
+options controlling the :program:`lit` progress display and output.
+
+:program:`lit` also includes a number of options for controlling how tests are
+executed (specific features may depend on the particular test format).  See
+:ref:`execution-options` for more information.
+
+Finally, :program:`lit` also supports additional options for only running a
+subset of the options specified on the command line, see
+:ref:`selection-options` for more information.
+
+Users interested in the :program:`lit` architecture or designing a
+:program:`lit` testing implementation should see :ref:`lit-infrastructure`.
+
+GENERAL OPTIONS
+---------------
+
+.. option:: -h, --help
+
+ Show the :program:`lit` help message.
+
+.. option:: -j N, --threads=N
+
+ Run ``N`` tests in parallel.  By default, this is automatically chosen to
+ match the number of detected available CPUs.
+
+.. option:: --config-prefix=NAME
+
+ Search for :file:`{NAME}.cfg` and :file:`{NAME}.site.cfg` when searching for
+ test suites, instead of :file:`lit.cfg` and :file:`lit.site.cfg`.
+
+.. option:: -D NAME, -D NAME=VALUE, --param NAME, --param NAME=VALUE
+
+ Add a user defined parameter ``NAME`` with the given ``VALUE`` (or the empty
+ string if not given).  The meaning and use of these parameters is test suite
+ dependent.
+
+.. _output-options:
+
+OUTPUT OPTIONS
+--------------
+
+.. option:: -q, --quiet
+
+ Suppress any output except for test failures.
+
+.. option:: -s, --succinct
+
+ Show less output, for example don't show information on tests that pass.
+
+.. option:: -v, --verbose
+
+ Show more information on test failures, for example the entire test output
+ instead of just the test result.
+
+.. option:: -a, --show-all
+
+ Show more information about all tests, for example the entire test
+ commandline and output.
+
+.. option:: --no-progress-bar
+
+ Do not use curses based progress bar.
+
+.. option:: --show-unsupported
+
+ Show the names of unsupported tests.
+
+.. option:: --show-xfail
+
+ Show the names of tests that were expected to fail.
+
+.. _execution-options:
+
+EXECUTION OPTIONS
+-----------------
+
+.. option:: --path=PATH
+
+ Specify an additional ``PATH`` to use when searching for executables in tests.
+
+.. option:: --vg
+
+ Run individual tests under valgrind (using the memcheck tool).  The
+ ``--error-exitcode`` argument for valgrind is used so that valgrind failures
+ will cause the program to exit with a non-zero status.
+
+ When this option is enabled, :program:`lit` will also automatically provide a
+ "``valgrind``" feature that can be used to conditionally disable (or expect
+ failure in) certain tests.
+
+.. option:: --vg-arg=ARG
+
+ When :option:`--vg` is used, specify an additional argument to pass to
+ :program:`valgrind` itself.
+
+.. option:: --vg-leak
+
+ When :option:`--vg` is used, enable memory leak checks.  When this option is
+ enabled, :program:`lit` will also automatically provide a "``vg_leak``"
+ feature that can be used to conditionally disable (or expect failure in)
+ certain tests.
+
+.. option:: --time-tests
+
+ Track the wall time individual tests take to execute and includes the results
+ in the summary output.  This is useful for determining which tests in a test
+ suite take the most time to execute.  Note that this option is most useful
+ with ``-j 1``.
+
+.. _selection-options:
+
+SELECTION OPTIONS
+-----------------
+
+.. option:: --max-tests=N
+
+ Run at most ``N`` tests and then terminate.
+
+.. option:: --max-time=N
+
+ Spend at most ``N`` seconds (approximately) running tests and then terminate.
+
+.. option:: --shuffle
+
+ Run the tests in a random order.
+
+ADDITIONAL OPTIONS
+------------------
+
+.. option:: --debug
+
+ Run :program:`lit` in debug mode, for debugging configuration issues and
+ :program:`lit` itself.
+
+.. option:: --show-suites
+
+ List the discovered test suites and exit.
+
+.. option:: --show-tests
+
+ List all of the discovered tests and exit.
+
+EXIT STATUS
+-----------
+
+:program:`lit` will exit with an exit code of 1 if there are any FAIL or XPASS
+results.  Otherwise, it will exit with the status 0.  Other exit codes are used
+for non-test related failures (for example a user error or an internal program
+error).
+
+.. _test-discovery:
+
+TEST DISCOVERY
+--------------
+
+The inputs passed to :program:`lit` can be either individual tests, or entire
+directories or hierarchies of tests to run.  When :program:`lit` starts up, the
+first thing it does is convert the inputs into a complete list of tests to run
+as part of *test discovery*.
+
+In the :program:`lit` model, every test must exist inside some *test suite*.
+:program:`lit` resolves the inputs specified on the command line to test suites
+by searching upwards from the input path until it finds a :file:`lit.cfg` or
+:file:`lit.site.cfg` file.  These files serve as both a marker of test suites
+and as configuration files which :program:`lit` loads in order to understand
+how to find and run the tests inside the test suite.
+
+Once :program:`lit` has mapped the inputs into test suites it traverses the
+list of inputs adding tests for individual files and recursively searching for
+tests in directories.
+
+This behavior makes it easy to specify a subset of tests to run, while still
+allowing the test suite configuration to control exactly how tests are
+interpreted.  In addition, :program:`lit` always identifies tests by the test
+suite they are in, and their relative path inside the test suite.  For
+appropriately configured projects, this allows :program:`lit` to provide
+convenient and flexible support for out-of-tree builds.
+
+.. _test-status-results:
+
+TEST STATUS RESULTS
+-------------------
+
+Each test ultimately produces one of the following six results:
+
+**PASS**
+
+ The test succeeded.
+
+**XFAIL**
+
+ The test failed, but that is expected.  This is used for test formats which allow
+ specifying that a test does not currently work, but wish to leave it in the test
+ suite.
+
+**XPASS**
+
+ The test succeeded, but it was expected to fail.  This is used for tests which
+ were specified as expected to fail, but are now succeeding (generally because
+ the feature they test was broken and has been fixed).
+
+**FAIL**
+
+ The test failed.
+
+**UNRESOLVED**
+
+ The test result could not be determined.  For example, this occurs when the test
+ could not be run, the test itself is invalid, or the test was interrupted.
+
+**UNSUPPORTED**
+
+ The test is not supported in this environment.  This is used by test formats
+ which can report unsupported tests.
+
+Depending on the test format tests may produce additional information about
+their status (generally only for failures).  See the :ref:`output-options`
+section for more information.
+
+.. _lit-infrastructure:
+
+LIT INFRASTRUCTURE
+------------------
+
+This section describes the :program:`lit` testing architecture for users interested in
+creating a new :program:`lit` testing implementation, or extending an existing one.
+
+:program:`lit` proper is primarily an infrastructure for discovering and running
+arbitrary tests, and to expose a single convenient interface to these
+tests. :program:`lit` itself doesn't know how to run tests, rather this logic is
+defined by *test suites*.
+
+TEST SUITES
+~~~~~~~~~~~
+
+As described in :ref:`test-discovery`, tests are always located inside a *test
+suite*.  Test suites serve to define the format of the tests they contain, the
+logic for finding those tests, and any additional information to run the tests.
+
+:program:`lit` identifies test suites as directories containing ``lit.cfg`` or
+``lit.site.cfg`` files (see also :option:`--config-prefix`).  Test suites are
+initially discovered by recursively searching up the directory hierarchy for
+all the input files passed on the command line.  You can use
+:option:`--show-suites` to display the discovered test suites at startup.
+
+Once a test suite is discovered, its config file is loaded.  Config files
+themselves are Python modules which will be executed.  When the config file is
+executed, two important global variables are predefined:
+
+**lit_config**
+
+ The global **lit** configuration object (a *LitConfig* instance), which defines
+ the builtin test formats, global configuration parameters, and other helper
+ routines for implementing test configurations.
+
+**config**
+
+ This is the config object (a *TestingConfig* instance) for the test suite,
+ which the config file is expected to populate.  The following variables are also
+ available on the *config* object, some of which must be set by the config and
+ others are optional or predefined:
+
+ **name** *[required]* The name of the test suite, for use in reports and
+ diagnostics.
+
+ **test_format** *[required]* The test format object which will be used to
+ discover and run tests in the test suite.  Generally this will be a builtin test
+ format available from the *lit.formats* module.
+
+ **test_source_root** The filesystem path to the test suite root.  For out-of-dir
+ builds this is the directory that will be scanned for tests.
+
+ **test_exec_root** For out-of-dir builds, the path to the test suite root inside
+ the object directory.  This is where tests will be run and temporary output files
+ placed.
+
+ **environment** A dictionary representing the environment to use when executing
+ tests in the suite.
+
+ **suffixes** For **lit** test formats which scan directories for tests, this
+ variable is a list of suffixes to identify test files.  Used by: *ShTest*.
+
+ **substitutions** For **lit** test formats which substitute variables into a test
+ script, the list of substitutions to perform.  Used by: *ShTest*.
+
+ **unsupported** Mark an unsupported directory, all tests within it will be
+ reported as unsupported.  Used by: *ShTest*.
+
+ **parent** The parent configuration, this is the config object for the directory
+ containing the test suite, or None.
+
+ **root** The root configuration.  This is the top-most :program:`lit` configuration in
+ the project.
+
+ **pipefail** Normally a test using a shell pipe fails if any of the commands
+ on the pipe fail. If this is not desired, setting this variable to false
+ makes the test fail only if the last command in the pipe fails.
+
+ **available_features** A set of features that can be used in `XFAIL`,
+ `REQUIRES`, and `UNSUPPORTED` directives.
+
+TEST DISCOVERY
+~~~~~~~~~~~~~~
+
+Once test suites are located, :program:`lit` recursively traverses the source
+directory (following *test_source_root*) looking for tests.  When :program:`lit`
+enters a sub-directory, it first checks to see if a nested test suite is
+defined in that directory.  If so, it loads that test suite recursively,
+otherwise it instantiates a local test config for the directory (see
+:ref:`local-configuration-files`).
+
+Tests are identified by the test suite they are contained within, and the
+relative path inside that suite.  Note that the relative path may not refer to
+an actual file on disk; some test formats (such as *GoogleTest*) define
+"virtual tests" which have a path that contains both the path to the actual
+test file and a subpath to identify the virtual test.
+
+.. _local-configuration-files:
+
+LOCAL CONFIGURATION FILES
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When :program:`lit` loads a subdirectory in a test suite, it instantiates a
+local test configuration by cloning the configuration for the parent directory
+--- the root of this configuration chain will always be a test suite.  Once the
+test configuration is cloned :program:`lit` checks for a *lit.local.cfg* file
+in the subdirectory.  If present, this file will be loaded and can be used to
+specialize the configuration for each individual directory.  This facility can
+be used to define subdirectories of optional tests, or to change other
+configuration parameters --- for example, to change the test format, or the
+suffixes which identify test files.
+
+PRE-DEFINED SUBSTITUTIONS
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+:program:`lit` provides various patterns that can be used with the RUN command.
+These are defined in TestRunner.py.
+
+ ========== ==============
+  Macro      Substitution
+ ========== ==============
+ %s         source path (path to the file currently being run)
+ %S         source dir (directory of the file currently being run)
+ %p         same as %S
+ %{pathsep} path separator
+ %t         temporary file name unique to the test
+ %T         temporary directory unique to the test
+ %%         %
+ %/s        same as %s but replace all / with \\
+ %/S        same as %S but replace all / with \\
+ %/p        same as %p but replace all / with \\
+ %/t        same as %t but replace all / with \\
+ %/T        same as %T but replace all / with \\
+ ========== ==============
+
+Further substitution patterns might be defined by each test module.
+See the modules :ref:`local-configuration-files`.
+
+More information on the testing infrastucture can be found in the
+:doc:`../TestingGuide`.
+
+TEST RUN OUTPUT FORMAT
+~~~~~~~~~~~~~~~~~~~~~~
+
+The :program:`lit` output for a test run conforms to the following schema, in
+both short and verbose modes (although in short mode no PASS lines will be
+shown).  This schema has been chosen to be relatively easy to reliably parse by
+a machine (for example in buildbot log scraping), and for other tools to
+generate.
+
+Each test result is expected to appear on a line that matches:
+
+.. code-block:: none
+
+  <result code>: <test name> (<progress info>)
+
+where ``<result-code>`` is a standard test result such as PASS, FAIL, XFAIL,
+XPASS, UNRESOLVED, or UNSUPPORTED.  The performance result codes of IMPROVED and
+REGRESSED are also allowed.
+
+The ``<test name>`` field can consist of an arbitrary string containing no
+newline.
+
+The ``<progress info>`` field can be used to report progress information such
+as (1/300) or can be empty, but even when empty the parentheses are required.
+
+Each test result may include additional (multiline) log information in the
+following format:
+
+.. code-block:: none
+
+  <log delineator> TEST '(<test name>)' <trailing delineator>
+  ... log message ...
+  <log delineator>
+
+where ``<test name>`` should be the name of a preceding reported test, ``<log
+delineator>`` is a string of "*" characters *at least* four characters long
+(the recommended length is 20), and ``<trailing delineator>`` is an arbitrary
+(unparsed) string.
+
+The following is an example of a test run output which consists of four tests A,
+B, C, and D, and a log message for the failing test C:
+
+.. code-block:: none
+
+  PASS: A (1 of 4)
+  PASS: B (2 of 4)
+  FAIL: C (3 of 4)
+  ******************** TEST 'C' FAILED ********************
+  Test 'C' failed as a result of exit code 1.
+  ********************
+  PASS: D (4 of 4)
+
+LIT EXAMPLE TESTS
+~~~~~~~~~~~~~~~~~
+
+The :program:`lit` distribution contains several example implementations of
+test suites in the *ExampleTests* directory.
+
+SEE ALSO
+--------
+
+valgrind(1)

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llc.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llc.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llc.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llc.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,196 @@
+llc - LLVM static compiler
+==========================
+
+SYNOPSIS
+--------
+
+:program:`llc` [*options*] [*filename*]
+
+DESCRIPTION
+-----------
+
+The :program:`llc` command compiles LLVM source inputs into assembly language
+for a specified architecture.  The assembly language output can then be passed
+through a native assembler and linker to generate a native executable.
+
+The choice of architecture for the output assembly code is automatically
+determined from the input file, unless the :option:`-march` option is used to
+override the default.
+
+OPTIONS
+-------
+
+If ``filename`` is "``-``" or omitted, :program:`llc` reads from standard input.
+Otherwise, it will from ``filename``.  Inputs can be in either the LLVM assembly
+language format (``.ll``) or the LLVM bitcode format (``.bc``).
+
+If the :option:`-o` option is omitted, then :program:`llc` will send its output
+to standard output if the input is from standard input.  If the :option:`-o`
+option specifies "``-``", then the output will also be sent to standard output.
+
+If no :option:`-o` option is specified and an input file other than "``-``" is
+specified, then :program:`llc` creates the output filename by taking the input
+filename, removing any existing ``.bc`` extension, and adding a ``.s`` suffix.
+
+Other :program:`llc` options are described below.
+
+End-user Options
+~~~~~~~~~~~~~~~~
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -O=uint
+
+ Generate code at different optimization levels.  These correspond to the
+ ``-O0``, ``-O1``, ``-O2``, and ``-O3`` optimization levels used by
+ :program:`clang`.
+
+.. option:: -mtriple=<target triple>
+
+ Override the target triple specified in the input file with the specified
+ string.
+
+.. option:: -march=<arch>
+
+ Specify the architecture for which to generate assembly, overriding the target
+ encoded in the input file.  See the output of ``llc -help`` for a list of
+ valid architectures.  By default this is inferred from the target triple or
+ autodetected to the current architecture.
+
+.. option:: -mcpu=<cpuname>
+
+ Specify a specific chip in the current architecture to generate code for.
+ By default this is inferred from the target triple and autodetected to
+ the current architecture.  For a list of available CPUs, use:
+
+ .. code-block:: none
+
+   llvm-as < /dev/null | llc -march=xyz -mcpu=help
+
+.. option:: -filetype=<output file type>
+
+ Specify what kind of output ``llc`` should generated.  Options are: ``asm``
+ for textual assembly ( ``'.s'``), ``obj`` for native object files (``'.o'``)
+ and ``null`` for not emitting anything (for performance testing).
+
+ Note that not all targets support all options.
+
+.. option:: -mattr=a1,+a2,-a3,...
+
+ Override or control specific attributes of the target, such as whether SIMD
+ operations are enabled or not.  The default set of attributes is set by the
+ current CPU.  For a list of available attributes, use:
+
+ .. code-block:: none
+
+   llvm-as < /dev/null | llc -march=xyz -mattr=help
+
+.. option:: --disable-fp-elim
+
+ Disable frame pointer elimination optimization.
+
+.. option:: --disable-excess-fp-precision
+
+ Disable optimizations that may produce excess precision for floating point.
+ Note that this option can dramatically slow down code on some systems
+ (e.g. X86).
+
+.. option:: --enable-no-infs-fp-math
+
+ Enable optimizations that assume no Inf values.
+
+.. option:: --enable-no-nans-fp-math
+
+ Enable optimizations that assume no NAN values.
+
+.. option:: --enable-unsafe-fp-math
+
+ Enable optimizations that make unsafe assumptions about IEEE math (e.g. that
+ addition is associative) or may not work for all input ranges.  These
+ optimizations allow the code generator to make use of some instructions which
+ would otherwise not be usable (such as ``fsin`` on X86).
+
+.. option:: --stats
+
+ Print statistics recorded by code-generation passes.
+
+.. option:: --time-passes
+
+ Record the amount of time needed for each pass and print a report to standard
+ error.
+
+.. option:: --load=<dso_path>
+
+ Dynamically load ``dso_path`` (a path to a dynamically shared object) that
+ implements an LLVM target.  This will permit the target name to be used with
+ the :option:`-march` option so that code can be generated for that target.
+
+.. option:: -meabi=[default|gnu|4|5]
+
+ Specify which EABI version should conform to.  Valid EABI versions are *gnu*,
+ *4* and *5*.  Default value (*default*) depends on the triple.
+
+
+Tuning/Configuration Options
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. option:: --print-machineinstrs
+
+ Print generated machine code between compilation phases (useful for debugging).
+
+.. option:: --regalloc=<allocator>
+
+ Specify the register allocator to use.
+ Valid register allocators are:
+
+ *basic*
+
+  Basic register allocator.
+
+ *fast*
+
+  Fast register allocator. It is the default for unoptimized code.
+
+ *greedy*
+
+  Greedy register allocator. It is the default for optimized code.
+
+ *pbqp*
+
+  Register allocator based on 'Partitioned Boolean Quadratic Programming'.
+
+.. option:: --spiller=<spiller>
+
+ Specify the spiller to use for register allocators that support it.  Currently
+ this option is used only by the linear scan register allocator.  The default
+ ``spiller`` is *local*.  Valid spillers are:
+
+ *simple*
+
+  Simple spiller
+
+ *local*
+
+  Local spiller
+
+Intel IA-32-specific Options
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. option:: --x86-asm-syntax=[att|intel]
+
+ Specify whether to emit assembly code in AT&T syntax (the default) or Intel
+ syntax.
+
+EXIT STATUS
+-----------
+
+If :program:`llc` succeeds, it will exit with 0.  Otherwise, if an error
+occurs, it will exit with a non-zero value.
+
+SEE ALSO
+--------
+
+lli
+

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/lli.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/lli.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/lli.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/lli.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,214 @@
+lli - directly execute programs from LLVM bitcode
+=================================================
+
+SYNOPSIS
+--------
+
+:program:`lli` [*options*] [*filename*] [*program args*]
+
+DESCRIPTION
+-----------
+
+:program:`lli` directly executes programs in LLVM bitcode format.  It takes a program
+in LLVM bitcode format and executes it using a just-in-time compiler or an
+interpreter.
+
+:program:`lli` is *not* an emulator. It will not execute IR of different architectures
+and it can only interpret (or JIT-compile) for the host architecture.
+
+The JIT compiler takes the same arguments as other tools, like :program:`llc`,
+but they don't necessarily work for the interpreter.
+
+If `filename` is not specified, then :program:`lli` reads the LLVM bitcode for the
+program from standard input.
+
+The optional *args* specified on the command line are passed to the program as
+arguments.
+
+GENERAL OPTIONS
+---------------
+
+.. option:: -fake-argv0=executable
+
+ Override the ``argv[0]`` value passed into the executing program.
+
+.. option:: -force-interpreter={false,true}
+
+ If set to true, use the interpreter even if a just-in-time compiler is available
+ for this architecture. Defaults to false.
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -load=pluginfilename
+
+ Causes :program:`lli` to load the plugin (shared object) named *pluginfilename* and use
+ it for optimization.
+
+.. option:: -stats
+
+ Print statistics from the code-generation passes. This is only meaningful for
+ the just-in-time compiler, at present.
+
+.. option:: -time-passes
+
+ Record the amount of time needed for each code-generation pass and print it to
+ standard error.
+
+.. option:: -version
+
+ Print out the version of :program:`lli` and exit without doing anything else.
+
+TARGET OPTIONS
+--------------
+
+.. option:: -mtriple=target triple
+
+ Override the target triple specified in the input bitcode file with the
+ specified string.  This may result in a crash if you pick an
+ architecture which is not compatible with the current system.
+
+.. option:: -march=arch
+
+ Specify the architecture for which to generate assembly, overriding the target
+ encoded in the bitcode file.  See the output of **llc -help** for a list of
+ valid architectures.  By default this is inferred from the target triple or
+ autodetected to the current architecture.
+
+.. option:: -mcpu=cpuname
+
+ Specify a specific chip in the current architecture to generate code for.
+ By default this is inferred from the target triple and autodetected to
+ the current architecture.  For a list of available CPUs, use:
+ **llvm-as < /dev/null | llc -march=xyz -mcpu=help**
+
+.. option:: -mattr=a1,+a2,-a3,...
+
+ Override or control specific attributes of the target, such as whether SIMD
+ operations are enabled or not.  The default set of attributes is set by the
+ current CPU.  For a list of available attributes, use:
+ **llvm-as < /dev/null | llc -march=xyz -mattr=help**
+
+FLOATING POINT OPTIONS
+----------------------
+
+.. option:: -disable-excess-fp-precision
+
+ Disable optimizations that may increase floating point precision.
+
+.. option:: -enable-no-infs-fp-math
+
+ Enable optimizations that assume no Inf values.
+
+.. option:: -enable-no-nans-fp-math
+
+ Enable optimizations that assume no NAN values.
+
+.. option:: -enable-unsafe-fp-math
+
+ Causes :program:`lli` to enable optimizations that may decrease floating point
+ precision.
+
+.. option:: -soft-float
+
+ Causes :program:`lli` to generate software floating point library calls instead of
+ equivalent hardware instructions.
+
+CODE GENERATION OPTIONS
+-----------------------
+
+.. option:: -code-model=model
+
+ Choose the code model from:
+
+ .. code-block:: perl
+
+      default: Target default code model
+      small: Small code model
+      kernel: Kernel code model
+      medium: Medium code model
+      large: Large code model
+
+.. option:: -disable-post-RA-scheduler
+
+ Disable scheduling after register allocation.
+
+.. option:: -disable-spill-fusing
+
+ Disable fusing of spill code into instructions.
+
+.. option:: -jit-enable-eh
+
+ Exception handling should be enabled in the just-in-time compiler.
+
+.. option:: -join-liveintervals
+
+ Coalesce copies (default=true).
+
+.. option:: -nozero-initialized-in-bss
+
+  Don't place zero-initialized symbols into the BSS section.
+
+.. option:: -pre-RA-sched=scheduler
+
+ Instruction schedulers available (before register allocation):
+
+ .. code-block:: perl
+
+      =default: Best scheduler for the target
+      =none: No scheduling: breadth first sequencing
+      =simple: Simple two pass scheduling: minimize critical path and maximize processor utilization
+      =simple-noitin: Simple two pass scheduling: Same as simple except using generic latency
+      =list-burr: Bottom-up register reduction list scheduling
+      =list-tdrr: Top-down register reduction list scheduling
+      =list-td: Top-down list scheduler -print-machineinstrs - Print generated machine code
+
+.. option:: -regalloc=allocator
+
+ Register allocator to use (default=linearscan)
+
+ .. code-block:: perl
+
+      =bigblock: Big-block register allocator
+      =linearscan: linear scan register allocator =local -   local register allocator
+      =simple: simple register allocator
+
+.. option:: -relocation-model=model
+
+ Choose relocation model from:
+
+ .. code-block:: perl
+
+      =default: Target default relocation model
+      =static: Non-relocatable code =pic -   Fully relocatable, position independent code
+      =dynamic-no-pic: Relocatable external references, non-relocatable code
+
+.. option:: -spiller
+
+ Spiller to use (default=local)
+
+ .. code-block:: perl
+
+      =simple: simple spiller
+      =local: local spiller
+
+.. option:: -x86-asm-syntax=syntax
+
+ Choose style of code to emit from X86 backend:
+
+ .. code-block:: perl
+
+      =att: Emit AT&T-style assembly
+      =intel: Emit Intel-style assembly
+
+EXIT STATUS
+-----------
+
+If :program:`lli` fails to load the program, it will exit with an exit code of 1.
+Otherwise, it will return the exit code of the program it executes.
+
+SEE ALSO
+--------
+
+:program:`llc`

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-ar.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-ar.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-ar.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-ar.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,367 @@
+llvm-ar - LLVM archiver
+=======================
+
+
+SYNOPSIS
+--------
+
+
+**llvm-ar** [-]{dmpqrtx}[Rabfikou] [relpos] [count] <archive> [files...]
+
+
+DESCRIPTION
+-----------
+
+
+The **llvm-ar** command is similar to the common Unix utility, ``ar``. It
+archives several files together into a single file. The intent for this is
+to produce archive libraries by LLVM bitcode that can be linked into an
+LLVM program. However, the archive can contain any kind of file. By default,
+**llvm-ar** generates a symbol table that makes linking faster because
+only the symbol table needs to be consulted, not each individual file member
+of the archive.
+
+The **llvm-ar** command can be used to *read* SVR4, GNU and BSD style archive
+files. However, right now it can only write in the GNU format. If an
+SVR4 or BSD style archive is used with the ``r`` (replace) or ``q`` (quick
+update) operations, the archive will be reconstructed in GNU format.
+
+Here's where **llvm-ar** departs from previous ``ar`` implementations:
+
+
+*Symbol Table*
+
+ Since **llvm-ar** supports bitcode files. The symbol table it creates
+ is in GNU format and includes both native and bitcode files.
+
+
+*Long Paths*
+
+ Currently **llvm-ar** can read GNU and BSD long file names, but only writes
+ archives with the GNU format.
+
+
+
+OPTIONS
+-------
+
+
+The options to **llvm-ar** are compatible with other ``ar`` implementations.
+However, there are a few modifiers (*R*) that are not found in other ``ar``
+implementations. The options to **llvm-ar** specify a single basic operation to
+perform on the archive, a variety of modifiers for that operation, the name of
+the archive file, and an optional list of file names. These options are used to
+determine how **llvm-ar** should process the archive file.
+
+The Operations and Modifiers are explained in the sections below. The minimal
+set of options is at least one operator and the name of the archive. Typically
+archive files end with a ``.a`` suffix, but this is not required. Following
+the *archive-name* comes a list of *files* that indicate the specific members
+of the archive to operate on. If the *files* option is not specified, it
+generally means either "none" or "all" members, depending on the operation.
+
+Operations
+~~~~~~~~~~
+
+
+
+d
+
+ Delete files from the archive. No modifiers are applicable to this operation.
+ The *files* options specify which members should be removed from the
+ archive. It is not an error if a specified file does not appear in the archive.
+ If no *files* are specified, the archive is not modified.
+
+
+
+m[abi]
+
+ Move files from one location in the archive to another. The *a*, *b*, and
+ *i* modifiers apply to this operation. The *files* will all be moved
+ to the location given by the modifiers. If no modifiers are used, the files
+ will be moved to the end of the archive. If no *files* are specified, the
+ archive is not modified.
+
+
+
+p
+
+ Print files to the standard output. This operation simply prints the
+ *files* indicated to the standard output. If no *files* are
+ specified, the entire  archive is printed.  Printing bitcode files is
+ ill-advised as they might confuse your terminal settings. The *p*
+ operation never modifies the archive.
+
+
+
+q
+
+ Quickly append files to the end of the archive.  This operation quickly adds the
+ *files* to the archive without checking for duplicates that should be
+ removed first. If no *files* are specified, the archive is not modified.
+ Because of the way that **llvm-ar** constructs the archive file, its dubious
+ whether the *q* operation is any faster than the *r* operation.
+
+
+
+r[abu]
+
+ Replace or insert file members. The *a*, *b*,  and *u*
+ modifiers apply to this operation. This operation will replace existing
+ *files* or insert them at the end of the archive if they do not exist. If no
+ *files* are specified, the archive is not modified.
+
+
+
+t[v]
+
+ Print the table of contents. Without any modifiers, this operation just prints
+ the names of the members to the standard output. With the *v* modifier,
+ **llvm-ar** also prints out the file type (B=bitcode, S=symbol
+ table, blank=regular file), the permission mode, the owner and group, the
+ size, and the date. If any *files* are specified, the listing is only for
+ those files. If no *files* are specified, the table of contents for the
+ whole archive is printed.
+
+
+
+x[oP]
+
+ Extract archive members back to files. The *o* modifier applies to this
+ operation. This operation retrieves the indicated *files* from the archive
+ and writes them back to the operating system's file system. If no
+ *files* are specified, the entire archive is extract.
+
+
+
+
+Modifiers (operation specific)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+
+The modifiers below are specific to certain operations. See the Operations
+section (above) to determine which modifiers are applicable to which operations.
+
+
+[a]
+
+ When inserting or moving member files, this option specifies the destination of
+ the new files as being after the *relpos* member. If *relpos* is not found,
+ the files are placed at the end of the archive.
+
+
+
+[b]
+
+ When inserting or moving member files, this option specifies the destination of
+ the new files as being before the *relpos* member. If *relpos* is not
+ found, the files are placed at the end of the archive. This modifier is
+ identical to the *i* modifier.
+
+
+
+[i]
+
+ A synonym for the *b* option.
+
+
+
+[o]
+
+ When extracting files, this option will cause **llvm-ar** to preserve the
+ original modification times of the files it writes.
+
+
+
+[u]
+
+ When replacing existing files in the archive, only replace those files that have
+ a time stamp than the time stamp of the member in the archive.
+
+
+
+
+Modifiers (generic)
+~~~~~~~~~~~~~~~~~~~
+
+
+The modifiers below may be applied to any operation.
+
+
+[c]
+
+ For all operations, **llvm-ar** will always create the archive if it doesn't
+ exist. Normally, **llvm-ar** will print a warning message indicating that the
+ archive is being created. Using this modifier turns off that warning.
+
+
+
+[s]
+
+ This modifier requests that an archive index (or symbol table) be added to the
+ archive. This is the default mode of operation. The symbol table will contain
+ all the externally visible functions and global variables defined by all the
+ bitcode files in the archive.
+
+
+
+[S]
+
+ This modifier is the opposite of the *s* modifier. It instructs **llvm-ar** to
+ not build the symbol table. If both *s* and *S* are used, the last modifier to
+ occur in the options will prevail.
+
+
+
+[v]
+
+ This modifier instructs **llvm-ar** to be verbose about what it is doing. Each
+ editing operation taken against the archive will produce a line of output saying
+ what is being done.
+
+
+
+
+
+STANDARDS
+---------
+
+
+The **llvm-ar** utility is intended to provide a superset of the IEEE Std 1003.2
+(POSIX.2) functionality for ``ar``. **llvm-ar** can read both SVR4 and BSD4.4 (or
+Mac OS X) archives. If the ``f`` modifier is given to the ``x`` or ``r`` operations
+then **llvm-ar** will write SVR4 compatible archives. Without this modifier,
+**llvm-ar** will write BSD4.4 compatible archives that have long names
+immediately after the header and indicated using the "#1/ddd" notation for the
+name in the header.
+
+
+FILE FORMAT
+-----------
+
+
+The file format for LLVM Archive files is similar to that of BSD 4.4 or Mac OSX
+archive files. In fact, except for the symbol table, the ``ar`` commands on those
+operating systems should be able to read LLVM archive files. The details of the
+file format follow.
+
+Each archive begins with the archive magic number which is the eight printable
+characters "!<arch>\n" where \n represents the newline character (0x0A).
+Following the magic number, the file is composed of even length members that
+begin with an archive header and end with a \n padding character if necessary
+(to make the length even). Each file member is composed of a header (defined
+below), an optional newline-terminated "long file name" and the contents of
+the file.
+
+The fields of the header are described in the items below. All fields of the
+header contain only ASCII characters, are left justified and are right padded
+with space characters.
+
+
+name - char[16]
+
+ This field of the header provides the name of the archive member. If the name is
+ longer than 15 characters or contains a slash (/) character, then this field
+ contains ``#1/nnn`` where ``nnn`` provides the length of the name and the ``#1/``
+ is literal.  In this case, the actual name of the file is provided in the ``nnn``
+ bytes immediately following the header. If the name is 15 characters or less, it
+ is contained directly in this field and terminated with a slash (/) character.
+
+
+
+date - char[12]
+
+ This field provides the date of modification of the file in the form of a
+ decimal encoded number that provides the number of seconds since the epoch
+ (since 00:00:00 Jan 1, 1970) per Posix specifications.
+
+
+
+uid - char[6]
+
+ This field provides the user id of the file encoded as a decimal ASCII string.
+ This field might not make much sense on non-Unix systems. On Unix, it is the
+ same value as the st_uid field of the stat structure returned by the stat(2)
+ operating system call.
+
+
+
+gid - char[6]
+
+ This field provides the group id of the file encoded as a decimal ASCII string.
+ This field might not make much sense on non-Unix systems. On Unix, it is the
+ same value as the st_gid field of the stat structure returned by the stat(2)
+ operating system call.
+
+
+
+mode - char[8]
+
+ This field provides the access mode of the file encoded as an octal ASCII
+ string. This field might not make much sense on non-Unix systems. On Unix, it
+ is the same value as the st_mode field of the stat structure returned by the
+ stat(2) operating system call.
+
+
+
+size - char[10]
+
+ This field provides the size of the file, in bytes, encoded as a decimal ASCII
+ string.
+
+
+
+fmag - char[2]
+
+ This field is the archive file member magic number. Its content is always the
+ two characters back tick (0x60) and newline (0x0A). This provides some measure
+ utility in identifying archive files that have been corrupted.
+
+
+offset - vbr encoded 32-bit integer
+
+ The offset item provides the offset into the archive file where the bitcode
+ member is stored that is associated with the symbol. The offset value is 0
+ based at the start of the first "normal" file member. To derive the actual
+ file offset of the member, you must add the number of bytes occupied by the file
+ signature (8 bytes) and the symbol tables. The value of this item is encoded
+ using variable bit rate encoding to reduce the size of the symbol table.
+ Variable bit rate encoding uses the high bit (0x80) of each byte to indicate
+ if there are more bytes to follow. The remaining 7 bits in each byte carry bits
+ from the value. The final byte does not have the high bit set.
+
+
+
+length - vbr encoded 32-bit integer
+
+ The length item provides the length of the symbol that follows. Like this
+ *offset* item, the length is variable bit rate encoded.
+
+
+
+symbol - character array
+
+ The symbol item provides the text of the symbol that is associated with the
+ *offset*. The symbol is not terminated by any character. Its length is provided
+ by the *length* field. Note that is allowed (but unwise) to use non-printing
+ characters (even 0x00) in the symbol. This allows for multiple encodings of
+ symbol names.
+
+
+
+
+EXIT STATUS
+-----------
+
+
+If **llvm-ar** succeeds, it will exit with 0.  A usage error, results
+in an exit code of 1. A hard (file system typically) error results in an
+exit code of 2. Miscellaneous or unknown errors result in an
+exit code of 3.
+
+
+SEE ALSO
+--------
+
+
+ar(1)

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-as.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-as.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-as.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-as.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,56 @@
+llvm-as - LLVM assembler
+========================
+
+SYNOPSIS
+--------
+
+**llvm-as** [*options*] [*filename*]
+
+DESCRIPTION
+-----------
+
+**llvm-as** is the LLVM assembler.  It reads a file containing human-readable
+LLVM assembly language, translates it to LLVM bitcode, and writes the result
+into a file or to standard output.
+
+If *filename* is omitted or is ``-``, then **llvm-as** reads its input from
+standard input.
+
+If an output file is not specified with the **-o** option, then
+**llvm-as** sends its output to a file or standard output by following
+these rules:
+
+* If the input is standard input, then the output is standard output.
+
+* If the input is a file that ends with ``.ll``, then the output file is of the
+  same name, except that the suffix is changed to ``.bc``.
+
+* If the input is a file that does not end with the ``.ll`` suffix, then the
+  output file has the same name as the input file, except that the ``.bc``
+  suffix is appended.
+
+OPTIONS
+-------
+
+**-f**
+ Enable binary output on terminals.  Normally, **llvm-as** will refuse to
+ write raw bitcode output if the output stream is a terminal. With this option,
+ **llvm-as** will write raw bitcode regardless of the output device.
+
+**-help**
+ Print a summary of command line options.
+
+**-o** *filename*
+ Specify the output file name.  If *filename* is ``-``, then **llvm-as**
+ sends its output to standard output.
+
+EXIT STATUS
+-----------
+
+If **llvm-as** succeeds, it will exit with 0.  Otherwise, if an error occurs, it
+will exit with a non-zero value.
+
+SEE ALSO
+--------
+
+llvm-dis|llvm-dis, gccas|gccas

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-bcanalyzer.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-bcanalyzer.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-bcanalyzer.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-bcanalyzer.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,305 @@
+llvm-bcanalyzer - LLVM bitcode analyzer
+=======================================
+
+SYNOPSIS
+--------
+
+:program:`llvm-bcanalyzer` [*options*] [*filename*]
+
+DESCRIPTION
+-----------
+
+The :program:`llvm-bcanalyzer` command is a small utility for analyzing bitcode
+files.  The tool reads a bitcode file (such as generated with the
+:program:`llvm-as` tool) and produces a statistical report on the contents of
+the bitcode file.  The tool can also dump a low level but human readable
+version of the bitcode file.  This tool is probably not of much interest or
+utility except for those working directly with the bitcode file format.  Most
+LLVM users can just ignore this tool.
+
+If *filename* is omitted or is ``-``, then :program:`llvm-bcanalyzer` reads its
+input from standard input.  This is useful for combining the tool into a
+pipeline.  Output is written to the standard output.
+
+OPTIONS
+-------
+
+.. program:: llvm-bcanalyzer
+
+.. option:: -nodetails
+
+ Causes :program:`llvm-bcanalyzer` to abbreviate its output by writing out only
+ a module level summary.  The details for individual functions are not
+ displayed.
+
+.. option:: -dump
+
+ Causes :program:`llvm-bcanalyzer` to dump the bitcode in a human readable
+ format.  This format is significantly different from LLVM assembly and
+ provides details about the encoding of the bitcode file.
+
+.. option:: -verify
+
+ Causes :program:`llvm-bcanalyzer` to verify the module produced by reading the
+ bitcode.  This ensures that the statistics generated are based on a consistent
+ module.
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+EXIT STATUS
+-----------
+
+If :program:`llvm-bcanalyzer` succeeds, it will exit with 0.  Otherwise, if an
+error occurs, it will exit with a non-zero value, usually 1.
+
+SUMMARY OUTPUT DEFINITIONS
+--------------------------
+
+The following items are always printed by llvm-bcanalyzer.  They comprize the
+summary output.
+
+**Bitcode Analysis Of Module**
+
+ This just provides the name of the module for which bitcode analysis is being
+ generated.
+
+**Bitcode Version Number**
+
+ The bitcode version (not LLVM version) of the file read by the analyzer.
+
+**File Size**
+
+ The size, in bytes, of the entire bitcode file.
+
+**Module Bytes**
+
+ The size, in bytes, of the module block.  Percentage is relative to File Size.
+
+**Function Bytes**
+
+ The size, in bytes, of all the function blocks.  Percentage is relative to File
+ Size.
+
+**Global Types Bytes**
+
+ The size, in bytes, of the Global Types Pool.  Percentage is relative to File
+ Size.  This is the size of the definitions of all types in the bitcode file.
+
+**Constant Pool Bytes**
+
+ The size, in bytes, of the Constant Pool Blocks Percentage is relative to File
+ Size.
+
+**Module Globals Bytes**
+
+ Ths size, in bytes, of the Global Variable Definitions and their initializers.
+ Percentage is relative to File Size.
+
+**Instruction List Bytes**
+
+ The size, in bytes, of all the instruction lists in all the functions.
+ Percentage is relative to File Size.  Note that this value is also included in
+ the Function Bytes.
+
+**Compaction Table Bytes**
+
+ The size, in bytes, of all the compaction tables in all the functions.
+ Percentage is relative to File Size.  Note that this value is also included in
+ the Function Bytes.
+
+**Symbol Table Bytes**
+
+ The size, in bytes, of all the symbol tables in all the functions.  Percentage is
+ relative to File Size.  Note that this value is also included in the Function
+ Bytes.
+
+**Dependent Libraries Bytes**
+
+ The size, in bytes, of the list of dependent libraries in the module.  Percentage
+ is relative to File Size.  Note that this value is also included in the Module
+ Global Bytes.
+
+**Number Of Bitcode Blocks**
+
+ The total number of blocks of any kind in the bitcode file.
+
+**Number Of Functions**
+
+ The total number of function definitions in the bitcode file.
+
+**Number Of Types**
+
+ The total number of types defined in the Global Types Pool.
+
+**Number Of Constants**
+
+ The total number of constants (of any type) defined in the Constant Pool.
+
+**Number Of Basic Blocks**
+
+ The total number of basic blocks defined in all functions in the bitcode file.
+
+**Number Of Instructions**
+
+ The total number of instructions defined in all functions in the bitcode file.
+
+**Number Of Long Instructions**
+
+ The total number of long instructions defined in all functions in the bitcode
+ file.  Long instructions are those taking greater than 4 bytes.  Typically long
+ instructions are GetElementPtr with several indices, PHI nodes, and calls to
+ functions with large numbers of arguments.
+
+**Number Of Operands**
+
+ The total number of operands used in all instructions in the bitcode file.
+
+**Number Of Compaction Tables**
+
+ The total number of compaction tables in all functions in the bitcode file.
+
+**Number Of Symbol Tables**
+
+ The total number of symbol tables in all functions in the bitcode file.
+
+**Number Of Dependent Libs**
+
+ The total number of dependent libraries found in the bitcode file.
+
+**Total Instruction Size**
+
+ The total size of the instructions in all functions in the bitcode file.
+
+**Average Instruction Size**
+
+ The average number of bytes per instruction across all functions in the bitcode
+ file.  This value is computed by dividing Total Instruction Size by Number Of
+ Instructions.
+
+**Maximum Type Slot Number**
+
+ The maximum value used for a type's slot number.  Larger slot number values take
+ more bytes to encode.
+
+**Maximum Value Slot Number**
+
+ The maximum value used for a value's slot number.  Larger slot number values take
+ more bytes to encode.
+
+**Bytes Per Value**
+
+ The average size of a Value definition (of any type).  This is computed by
+ dividing File Size by the total number of values of any type.
+
+**Bytes Per Global**
+
+ The average size of a global definition (constants and global variables).
+
+**Bytes Per Function**
+
+ The average number of bytes per function definition.  This is computed by
+ dividing Function Bytes by Number Of Functions.
+
+**# of VBR 32-bit Integers**
+
+ The total number of 32-bit integers encoded using the Variable Bit Rate
+ encoding scheme.
+
+**# of VBR 64-bit Integers**
+
+ The total number of 64-bit integers encoded using the Variable Bit Rate encoding
+ scheme.
+
+**# of VBR Compressed Bytes**
+
+ The total number of bytes consumed by the 32-bit and 64-bit integers that use
+ the Variable Bit Rate encoding scheme.
+
+**# of VBR Expanded Bytes**
+
+ The total number of bytes that would have been consumed by the 32-bit and 64-bit
+ integers had they not been compressed with the Variable Bit Rage encoding
+ scheme.
+
+**Bytes Saved With VBR**
+
+ The total number of bytes saved by using the Variable Bit Rate encoding scheme.
+ The percentage is relative to # of VBR Expanded Bytes.
+
+DETAILED OUTPUT DEFINITIONS
+---------------------------
+
+The following definitions occur only if the -nodetails option was not given.
+The detailed output provides additional information on a per-function basis.
+
+**Type**
+
+ The type signature of the function.
+
+**Byte Size**
+
+ The total number of bytes in the function's block.
+
+**Basic Blocks**
+
+ The number of basic blocks defined by the function.
+
+**Instructions**
+
+ The number of instructions defined by the function.
+
+**Long Instructions**
+
+ The number of instructions using the long instruction format in the function.
+
+**Operands**
+
+ The number of operands used by all instructions in the function.
+
+**Instruction Size**
+
+ The number of bytes consumed by instructions in the function.
+
+**Average Instruction Size**
+
+ The average number of bytes consumed by the instructions in the function.
+ This value is computed by dividing Instruction Size by Instructions.
+
+**Bytes Per Instruction**
+
+ The average number of bytes used by the function per instruction.  This value
+ is computed by dividing Byte Size by Instructions.  Note that this is not the
+ same as Average Instruction Size.  It computes a number relative to the total
+ function size not just the size of the instruction list.
+
+**Number of VBR 32-bit Integers**
+
+ The total number of 32-bit integers found in this function (for any use).
+
+**Number of VBR 64-bit Integers**
+
+ The total number of 64-bit integers found in this function (for any use).
+
+**Number of VBR Compressed Bytes**
+
+ The total number of bytes in this function consumed by the 32-bit and 64-bit
+ integers that use the Variable Bit Rate encoding scheme.
+
+**Number of VBR Expanded Bytes**
+
+ The total number of bytes in this function that would have been consumed by
+ the 32-bit and 64-bit integers had they not been compressed with the Variable
+ Bit Rate encoding scheme.
+
+**Bytes Saved With VBR**
+
+ The total number of bytes saved in this function by using the Variable Bit
+ Rate encoding scheme.  The percentage is relative to # of VBR Expanded Bytes.
+
+SEE ALSO
+--------
+
+:doc:`/CommandGuide/llvm-dis`, :doc:`/BitCodeFormat`
+

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-build.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-build.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-build.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-build.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,102 @@
+llvm-build - LLVM Project Build Utility
+=======================================
+
+
+SYNOPSIS
+--------
+
+
+**llvm-build** [*options*]
+
+
+DESCRIPTION
+-----------
+
+
+**llvm-build** is a tool for working with LLVM projects that use the LLVMBuild
+system for describing their components.
+
+At heart, **llvm-build** is responsible for loading, verifying, and manipulating
+the project's component data. The tool is primarily designed for use in
+implementing build systems and tools which need access to the project structure
+information.
+
+
+OPTIONS
+-------
+
+
+
+**-h**, **--help**
+
+ Print the builtin program help.
+
+
+
+**--source-root**\ =\ *PATH*
+
+ If given, load the project at the given source root path. If this option is not
+ given, the location of the project sources will be inferred from the location of
+ the **llvm-build** script itself.
+
+
+
+**--print-tree**
+
+ Print the component tree for the project.
+
+
+
+**--write-library-table**
+
+ Write out the C++ fragment which defines the components, library names, and
+ required libraries. This C++ fragment is built into llvm-config|llvm-config
+ in order to provide clients with the list of required libraries for arbitrary
+ component combinations.
+
+
+
+**--write-llvmbuild**
+
+ Write out new *LLVMBuild.txt* files based on the loaded components. This is
+ useful for auto-upgrading the schema of the files. **llvm-build** will try to a
+ limited extent to preserve the comments which were written in the original
+ source file, although at this time it only preserves block comments that precede
+ the section names in the *LLVMBuild* files.
+
+
+
+**--write-cmake-fragment**
+
+ Write out the LLVMBuild in the form of a CMake fragment, so it can easily be
+ consumed by the CMake based build system. The exact contents and format of this
+ file are closely tied to how LLVMBuild is integrated with CMake, see LLVM's
+ top-level CMakeLists.txt.
+
+
+
+**--write-make-fragment**
+
+ Write out the LLVMBuild in the form of a Makefile fragment, so it can easily be
+ consumed by a Make based build system. The exact contents and format of this
+ file are closely tied to how LLVMBuild is integrated with the Makefiles, see
+ LLVM's Makefile.rules.
+
+
+
+**--llvmbuild-source-root**\ =\ *PATH*
+
+ If given, expect the *LLVMBuild* files for the project to be rooted at the
+ given path, instead of inside the source tree itself. This option is primarily
+ designed for use in conjunction with **--write-llvmbuild** to test changes to
+ *LLVMBuild* schema.
+
+
+
+
+EXIT STATUS
+-----------
+
+
+**llvm-build** exits with 0 if operation was successful. Otherwise, it will exist
+with a non-zero value.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-config.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-config.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-config.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-config.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,176 @@
+llvm-config - Print LLVM compilation options
+============================================
+
+
+SYNOPSIS
+--------
+
+
+**llvm-config** *option* [*components*...]
+
+
+DESCRIPTION
+-----------
+
+
+**llvm-config** makes it easier to build applications that use LLVM.  It can
+print the compiler flags, linker flags and object libraries needed to link
+against LLVM.
+
+
+EXAMPLES
+--------
+
+
+To link against the JIT:
+
+
+.. code-block:: sh
+
+   g++ `llvm-config --cxxflags` -o HowToUseJIT.o -c HowToUseJIT.cpp
+   g++ `llvm-config --ldflags` -o HowToUseJIT HowToUseJIT.o \
+       `llvm-config --libs engine bcreader scalaropts`
+
+
+
+OPTIONS
+-------
+
+
+
+**--version**
+
+ Print the version number of LLVM.
+
+
+
+**-help**
+
+ Print a summary of **llvm-config** arguments.
+
+
+
+**--prefix**
+
+ Print the installation prefix for LLVM.
+
+
+
+**--src-root**
+
+ Print the source root from which LLVM was built.
+
+
+
+**--obj-root**
+
+ Print the object root used to build LLVM.
+
+
+
+**--bindir**
+
+ Print the installation directory for LLVM binaries.
+
+
+
+**--includedir**
+
+ Print the installation directory for LLVM headers.
+
+
+
+**--libdir**
+
+ Print the installation directory for LLVM libraries.
+
+
+
+**--cxxflags**
+
+ Print the C++ compiler flags needed to use LLVM headers.
+
+
+
+**--ldflags**
+
+ Print the flags needed to link against LLVM libraries.
+
+
+
+**--libs**
+
+ Print all the libraries needed to link against the specified LLVM
+ *components*, including any dependencies.
+
+
+
+**--libnames**
+
+ Similar to **--libs**, but prints the bare filenames of the libraries
+ without **-l** or pathnames.  Useful for linking against a not-yet-installed
+ copy of LLVM.
+
+
+
+**--libfiles**
+
+ Similar to **--libs**, but print the full path to each library file.  This is
+ useful when creating makefile dependencies, to ensure that a tool is relinked if
+ any library it uses changes.
+
+
+
+**--components**
+
+ Print all valid component names.
+
+
+
+**--targets-built**
+
+ Print the component names for all targets supported by this copy of LLVM.
+
+
+
+**--build-mode**
+
+ Print the build mode used when LLVM was built (e.g. Debug or Release)
+
+
+
+
+COMPONENTS
+----------
+
+
+To print a list of all available components, run **llvm-config
+--components**.  In most cases, components correspond directly to LLVM
+libraries.  Useful "virtual" components include:
+
+
+**all**
+
+ Includes all LLVM libraries.  The default if no components are specified.
+
+
+
+**backend**
+
+ Includes either a native backend or the C backend.
+
+
+
+**engine**
+
+ Includes either a native JIT or the bitcode interpreter.
+
+
+
+
+EXIT STATUS
+-----------
+
+
+If **llvm-config** succeeds, it will exit with 0.  Otherwise, if an error
+occurs, it will exit with a non-zero value.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-cov.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-cov.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-cov.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-cov.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,355 @@
+llvm-cov - emit coverage information
+====================================
+
+SYNOPSIS
+--------
+
+:program:`llvm-cov` *command* [*args...*]
+
+DESCRIPTION
+-----------
+
+The :program:`llvm-cov` tool shows code coverage information for
+programs that are instrumented to emit profile data. It can be used to
+work with ``gcov``\-style coverage or with ``clang``\'s instrumentation
+based profiling.
+
+If the program is invoked with a base name of ``gcov``, it will behave as if
+the :program:`llvm-cov gcov` command were called. Otherwise, a command should
+be provided.
+
+COMMANDS
+--------
+
+* :ref:`gcov <llvm-cov-gcov>`
+* :ref:`show <llvm-cov-show>`
+* :ref:`report <llvm-cov-report>`
+* :ref:`export <llvm-cov-export>`
+
+.. program:: llvm-cov gcov
+
+.. _llvm-cov-gcov:
+
+GCOV COMMAND
+------------
+
+SYNOPSIS
+^^^^^^^^
+
+:program:`llvm-cov gcov` [*options*] *SOURCEFILE*
+
+DESCRIPTION
+^^^^^^^^^^^
+
+The :program:`llvm-cov gcov` tool reads code coverage data files and displays
+the coverage information for a specified source file. It is compatible with the
+``gcov`` tool from version 4.2 of ``GCC`` and may also be compatible with some
+later versions of ``gcov``.
+
+To use :program:`llvm-cov gcov`, you must first build an instrumented version
+of your application that collects coverage data as it runs. Compile with the
+``-fprofile-arcs`` and ``-ftest-coverage`` options to add the
+instrumentation. (Alternatively, you can use the ``--coverage`` option, which
+includes both of those other options.) You should compile with debugging
+information (``-g``) and without optimization (``-O0``); otherwise, the
+coverage data cannot be accurately mapped back to the source code.
+
+At the time you compile the instrumented code, a ``.gcno`` data file will be
+generated for each object file. These ``.gcno`` files contain half of the
+coverage data. The other half of the data comes from ``.gcda`` files that are
+generated when you run the instrumented program, with a separate ``.gcda``
+file for each object file. Each time you run the program, the execution counts
+are summed into any existing ``.gcda`` files, so be sure to remove any old
+files if you do not want their contents to be included.
+
+By default, the ``.gcda`` files are written into the same directory as the
+object files, but you can override that by setting the ``GCOV_PREFIX`` and
+``GCOV_PREFIX_STRIP`` environment variables. The ``GCOV_PREFIX_STRIP``
+variable specifies a number of directory components to be removed from the
+start of the absolute path to the object file directory. After stripping those
+directories, the prefix from the ``GCOV_PREFIX`` variable is added. These
+environment variables allow you to run the instrumented program on a machine
+where the original object file directories are not accessible, but you will
+then need to copy the ``.gcda`` files back to the object file directories
+where :program:`llvm-cov gcov` expects to find them.
+
+Once you have generated the coverage data files, run :program:`llvm-cov gcov`
+for each main source file where you want to examine the coverage results. This
+should be run from the same directory where you previously ran the
+compiler. The results for the specified source file are written to a file named
+by appending a ``.gcov`` suffix. A separate output file is also created for
+each file included by the main source file, also with a ``.gcov`` suffix added.
+
+The basic content of an ``.gcov`` output file is a copy of the source file with
+an execution count and line number prepended to every line. The execution
+count is shown as ``-`` if a line does not contain any executable code. If
+a line contains code but that code was never executed, the count is displayed
+as ``#####``.
+
+OPTIONS
+^^^^^^^
+
+.. option:: -a, --all-blocks
+
+ Display all basic blocks. If there are multiple blocks for a single line of
+ source code, this option causes llvm-cov to show the count for each block
+ instead of just one count for the entire line.
+
+.. option:: -b, --branch-probabilities
+
+ Display conditional branch probabilities and a summary of branch information.
+
+.. option:: -c, --branch-counts
+
+ Display branch counts instead of probabilities (requires -b).
+
+.. option:: -f, --function-summaries
+
+ Show a summary of coverage for each function instead of just one summary for
+ an entire source file.
+
+.. option:: --help
+
+ Display available options (--help-hidden for more).
+
+.. option:: -l, --long-file-names
+
+ For coverage output of files included from the main source file, add the
+ main file name followed by ``##`` as a prefix to the output file names. This
+ can be combined with the --preserve-paths option to use complete paths for
+ both the main file and the included file.
+
+.. option:: -n, --no-output
+
+ Do not output any ``.gcov`` files. Summary information is still
+ displayed.
+
+.. option:: -o=<DIR|FILE>, --object-directory=<DIR>, --object-file=<FILE>
+
+ Find objects in DIR or based on FILE's path. If you specify a particular
+ object file, the coverage data files are expected to have the same base name
+ with ``.gcno`` and ``.gcda`` extensions. If you specify a directory, the
+ files are expected in that directory with the same base name as the source
+ file.
+
+.. option:: -p, --preserve-paths
+
+ Preserve path components when naming the coverage output files. In addition
+ to the source file name, include the directories from the path to that
+ file. The directories are separate by ``#`` characters, with ``.`` directories
+ removed and ``..`` directories replaced by ``^`` characters. When used with
+ the --long-file-names option, this applies to both the main file name and the
+ included file name.
+
+.. option:: -u, --unconditional-branches
+
+ Include unconditional branches in the output for the --branch-probabilities
+ option.
+
+.. option:: -version
+
+ Display the version of llvm-cov.
+
+EXIT STATUS
+^^^^^^^^^^^
+
+:program:`llvm-cov gcov` returns 1 if it cannot read input files.  Otherwise,
+it exits with zero.
+
+
+.. program:: llvm-cov show
+
+.. _llvm-cov-show:
+
+SHOW COMMAND
+------------
+
+SYNOPSIS
+^^^^^^^^
+
+:program:`llvm-cov show` [*options*] -instr-profile *PROFILE* *BIN* [*-object BIN,...*] [[*-object BIN*]] [*SOURCES*]
+
+DESCRIPTION
+^^^^^^^^^^^
+
+The :program:`llvm-cov show` command shows line by line coverage of the
+binaries *BIN*,...  using the profile data *PROFILE*. It can optionally be
+filtered to only show the coverage for the files listed in *SOURCES*.
+
+To use :program:`llvm-cov show`, you need a program that is compiled with
+instrumentation to emit profile and coverage data. To build such a program with
+``clang`` use the ``-fprofile-instr-generate`` and ``-fcoverage-mapping``
+flags. If linking with the ``clang`` driver, pass ``-fprofile-instr-generate``
+to the link stage to make sure the necessary runtime libraries are linked in.
+
+The coverage information is stored in the built executable or library itself,
+and this is what you should pass to :program:`llvm-cov show` as a *BIN*
+argument. The profile data is generated by running this instrumented program
+normally. When the program exits it will write out a raw profile file,
+typically called ``default.profraw``, which can be converted to a format that
+is suitable for the *PROFILE* argument using the :program:`llvm-profdata merge`
+tool.
+
+OPTIONS
+^^^^^^^
+
+.. option:: -show-line-counts
+
+ Show the execution counts for each line. This is enabled by default, unless
+ another ``-show`` option is used.
+
+.. option:: -show-expansions
+
+ Expand inclusions, such as preprocessor macros or textual inclusions, inline
+ in the display of the source file.
+
+.. option:: -show-instantiations
+
+ For source regions that are instantiated multiple times, such as templates in
+ ``C++``, show each instantiation separately as well as the combined summary.
+
+.. option:: -show-regions
+
+ Show the execution counts for each region by displaying a caret that points to
+ the character where the region starts.
+
+.. option:: -show-line-counts-or-regions
+
+ Show the execution counts for each line if there is only one region on the
+ line, but show the individual regions if there are multiple on the line.
+
+.. option:: -use-color[=VALUE]
+
+ Enable or disable color output. By default this is autodetected.
+
+.. option:: -arch=<name>
+
+ If the covered binary is a universal binary, select the architecture to use.
+ It is an error to specify an architecture that is not included in the
+ universal binary or to use an architecture that does not match a
+ non-universal binary.
+
+.. option:: -name=<NAME>
+
+ Show code coverage only for functions with the given name.
+
+.. option:: -name-regex=<PATTERN>
+
+ Show code coverage only for functions that match the given regular expression.
+
+.. option:: -format=<FORMAT>
+
+ Use the specified output format. The supported formats are: "text", "html".
+
+.. option:: -tab-size=<TABSIZE>
+
+ Replace tabs with <TABSIZE> spaces when preparing reports. Currently, this is
+ only supported for the html format.
+
+.. option:: -output-dir=PATH
+
+ Specify a directory to write coverage reports into. If the directory does not
+ exist, it is created. When used in function view mode (i.e when -name or
+ -name-regex are used to select specific functions), the report is written to
+ PATH/functions.EXTENSION. When used in file view mode, a report for each file
+ is written to PATH/REL_PATH_TO_FILE.EXTENSION.
+
+.. option:: -Xdemangler=<TOOL>|<TOOL-OPTION>
+
+ Specify a symbol demangler. This can be used to make reports more
+ human-readable. This option can be specified multiple times to supply
+ arguments to the demangler (e.g `-Xdemangler c++filt -Xdemangler -n` for C++).
+ The demangler is expected to read a newline-separated list of symbols from
+ stdin and write a newline-separated list of the same length to stdout.
+
+.. option:: -line-coverage-gt=<N>
+
+ Show code coverage only for functions with line coverage greater than the
+ given threshold.
+
+.. option:: -line-coverage-lt=<N>
+
+ Show code coverage only for functions with line coverage less than the given
+ threshold.
+
+.. option:: -region-coverage-gt=<N>
+
+ Show code coverage only for functions with region coverage greater than the
+ given threshold.
+
+.. option:: -region-coverage-lt=<N>
+
+ Show code coverage only for functions with region coverage less than the given
+ threshold.
+
+.. program:: llvm-cov report
+
+.. _llvm-cov-report:
+
+REPORT COMMAND
+--------------
+
+SYNOPSIS
+^^^^^^^^
+
+:program:`llvm-cov report` [*options*] -instr-profile *PROFILE* *BIN* [*-object BIN,...*] [[*-object BIN*]] [*SOURCES*]
+
+DESCRIPTION
+^^^^^^^^^^^
+
+The :program:`llvm-cov report` command displays a summary of the coverage of
+the binaries *BIN*,... using the profile data *PROFILE*. It can optionally be
+filtered to only show the coverage for the files listed in *SOURCES*.
+
+If no source files are provided, a summary line is printed for each file in the
+coverage data. If any files are provided, summaries are shown for each function
+in the listed files instead.
+
+For information on compiling programs for coverage and generating profile data,
+see :ref:`llvm-cov-show`.
+
+OPTIONS
+^^^^^^^
+
+.. option:: -use-color[=VALUE]
+
+ Enable or disable color output. By default this is autodetected.
+
+.. option:: -arch=<name>
+
+ If the covered binary is a universal binary, select the architecture to use.
+ It is an error to specify an architecture that is not included in the
+ universal binary or to use an architecture that does not match a
+ non-universal binary.
+
+.. program:: llvm-cov export
+
+.. _llvm-cov-export:
+
+EXPORT COMMAND
+--------------
+
+SYNOPSIS
+^^^^^^^^
+
+:program:`llvm-cov export` [*options*] -instr-profile *PROFILE* *BIN* [*-object BIN,...*] [[*-object BIN*]]
+
+DESCRIPTION
+^^^^^^^^^^^
+
+The :program:`llvm-cov export` command exports regions, functions, expansions,
+and summaries of the coverage of the binaries *BIN*,... using the profile data
+*PROFILE* as JSON.
+
+For information on compiling programs for coverage and generating profile data,
+see :ref:`llvm-cov-show`.
+
+OPTIONS
+^^^^^^^
+
+.. option:: -arch=<name>
+
+ If the covered binary is a universal binary, select the architecture to use.
+ It is an error to specify an architecture that is not included in the
+ universal binary or to use an architecture that does not match a
+ non-universal binary.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-diff.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-diff.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-diff.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-diff.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,56 @@
+llvm-diff - LLVM structural 'diff'
+==================================
+
+
+SYNOPSIS
+--------
+
+
+**llvm-diff** [*options*] *module 1* *module 2* [*global name ...*]
+
+
+DESCRIPTION
+-----------
+
+
+**llvm-diff** compares the structure of two LLVM modules, primarily
+focusing on differences in function definitions.  Insignificant
+differences, such as changes in the ordering of globals or in the
+names of local values, are ignored.
+
+An input module will be interpreted as an assembly file if its name
+ends in '.ll';  otherwise it will be read in as a bitcode file.
+
+If a list of global names is given, just the values with those names
+are compared; otherwise, all global values are compared, and
+diagnostics are produced for globals which only appear in one module
+or the other.
+
+**llvm-diff** compares two functions by comparing their basic blocks,
+beginning with the entry blocks.  If the terminators seem to match,
+then the corresponding successors are compared; otherwise they are
+ignored.  This algorithm is very sensitive to changes in control flow,
+which tend to stop any downstream changes from being detected.
+
+**llvm-diff** is intended as a debugging tool for writers of LLVM
+passes and frontends.  It does not have a stable output format.
+
+
+EXIT STATUS
+-----------
+
+
+If **llvm-diff** finds no differences between the modules, it will exit
+with 0 and produce no output.  Otherwise it will exit with a non-zero
+value.
+
+
+BUGS
+----
+
+
+Many important differences, like changes in linkage or function
+attributes, are not diagnosed.
+
+Changes in memory behavior (for example, coalescing loads) can cause
+massive detected differences in blocks.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-dis.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-dis.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-dis.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-dis.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,69 @@
+llvm-dis - LLVM disassembler
+============================
+
+
+SYNOPSIS
+--------
+
+
+**llvm-dis** [*options*] [*filename*]
+
+
+DESCRIPTION
+-----------
+
+
+The **llvm-dis** command is the LLVM disassembler.  It takes an LLVM
+bitcode file and converts it into human-readable LLVM assembly language.
+
+If filename is omitted or specified as ``-``, **llvm-dis** reads its
+input from standard input.
+
+If the input is being read from standard input, then **llvm-dis**
+will send its output to standard output by default.  Otherwise, the
+output will be written to a file named after the input file, with
+a ``.ll`` suffix added (any existing ``.bc`` suffix will first be
+removed).  You can override the choice of output file using the
+**-o** option.
+
+
+OPTIONS
+-------
+
+
+
+**-f**
+
+ Enable binary output on terminals.  Normally, **llvm-dis** will refuse to
+ write raw bitcode output if the output stream is a terminal. With this option,
+ **llvm-dis** will write raw bitcode regardless of the output device.
+
+
+
+**-help**
+
+ Print a summary of command line options.
+
+
+
+**-o** *filename*
+
+ Specify the output file name.  If *filename* is -, then the output is sent
+ to standard output.
+
+
+
+
+EXIT STATUS
+-----------
+
+
+If **llvm-dis** succeeds, it will exit with 0.  Otherwise, if an error
+occurs, it will exit with a non-zero value.
+
+
+SEE ALSO
+--------
+
+
+llvm-as|llvm-as

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-dwarfdump.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-dwarfdump.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-dwarfdump.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-dwarfdump.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,30 @@
+llvm-dwarfdump - print contents of DWARF sections
+=================================================
+
+SYNOPSIS
+--------
+
+:program:`llvm-dwarfdump` [*options*] [*filenames...*]
+
+DESCRIPTION
+-----------
+
+:program:`llvm-dwarfdump` parses DWARF sections in the object files
+and prints their contents in human-readable form.
+
+OPTIONS
+-------
+
+.. option:: -debug-dump=section
+
+  Specify the DWARF section to dump.
+  For example, use ``abbrev`` to dump the contents of ``.debug_abbrev`` section,
+  ``loc.dwo`` to dump the contents of ``.debug_loc.dwo`` etc.
+  See ``llvm-dwarfdump --help`` for the complete list of supported sections.
+  Use ``all`` to dump all DWARF sections. It is the default.
+
+EXIT STATUS
+-----------
+
+:program:`llvm-dwarfdump` returns 0 if the input files were parsed and dumped
+successfully. Otherwise, it returns 1.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-extract.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-extract.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-extract.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-extract.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,79 @@
+llvm-extract - extract a function from an LLVM module
+=====================================================
+
+SYNOPSIS
+--------
+
+:program:`llvm-extract` [*options*] **--func** *function-name* [*filename*]
+
+DESCRIPTION
+-----------
+
+The :program:`llvm-extract` command takes the name of a function and extracts
+it from the specified LLVM bitcode file.  It is primarily used as a debugging
+tool to reduce test cases from larger programs that are triggering a bug.
+
+In addition to extracting the bitcode of the specified function,
+:program:`llvm-extract` will also remove unreachable global variables,
+prototypes, and unused types.
+
+The :program:`llvm-extract` command reads its input from standard input if
+filename is omitted or if filename is ``-``.  The output is always written to
+standard output, unless the **-o** option is specified (see below).
+
+OPTIONS
+-------
+
+**-f**
+
+ Enable binary output on terminals.  Normally, :program:`llvm-extract` will
+ refuse to write raw bitcode output if the output stream is a terminal.  With
+ this option, :program:`llvm-extract` will write raw bitcode regardless of the
+ output device.
+
+**--func** *function-name*
+
+ Extract the function named *function-name* from the LLVM bitcode.  May be
+ specified multiple times to extract multiple functions at once.
+
+**--rfunc** *function-regular-expr*
+
+ Extract the function(s) matching *function-regular-expr* from the LLVM bitcode.
+ All functions matching the regular expression will be extracted.  May be
+ specified multiple times.
+
+**--glob** *global-name*
+
+ Extract the global variable named *global-name* from the LLVM bitcode.  May be
+ specified multiple times to extract multiple global variables at once.
+
+**--rglob** *glob-regular-expr*
+
+ Extract the global variable(s) matching *global-regular-expr* from the LLVM
+ bitcode.  All global variables matching the regular expression will be
+ extracted.  May be specified multiple times.
+
+**-help**
+
+ Print a summary of command line options.
+
+**-o** *filename*
+
+ Specify the output filename.  If filename is "-" (the default), then
+ :program:`llvm-extract` sends its output to standard output.
+
+**-S**
+
+ Write output in LLVM intermediate language (instead of bitcode).
+
+EXIT STATUS
+-----------
+
+If :program:`llvm-extract` succeeds, it will exit with 0.  Otherwise, if an error
+occurs, it will exit with a non-zero value.
+
+SEE ALSO
+--------
+
+bugpoint
+

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-lib.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-lib.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-lib.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-lib.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,31 @@
+llvm-lib - LLVM lib.exe compatible library tool
+===============================================
+
+
+SYNOPSIS
+--------
+
+
+**llvm-lib** [/libpath:<path>] [/out:<output>] [/llvmlibthin]
+[/ignore] [/machine] [/nologo] [files...]
+
+
+DESCRIPTION
+-----------
+
+
+The **llvm-lib** command is intended to be a ``lib.exe`` compatible
+tool. See https://msdn.microsoft.com/en-us/library/7ykb2k5f for the
+general description.
+
+**llvm-lib** has the following extensions:
+
+* Bitcode files in symbol tables.
+  **llvm-lib** includes symbols from both bitcode files and regular
+  object files in the symbol table.
+
+* Creating thin archives.
+  The /llvmlibthin option causes **llvm-lib** to create thin archive
+  that contain only the symbol table and the header for the various
+  members. These files are much smaller, but are not compatible with
+  link.exe (lld can handle them).

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-link.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-link.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-link.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-link.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,56 @@
+llvm-link - LLVM bitcode linker
+===============================
+
+SYNOPSIS
+--------
+
+:program:`llvm-link` [*options*] *filename ...*
+
+DESCRIPTION
+-----------
+
+:program:`llvm-link` takes several LLVM bitcode files and links them together
+into a single LLVM bitcode file.  It writes the output file to standard output,
+unless the :option:`-o` option is used to specify a filename.
+
+OPTIONS
+-------
+
+.. option:: -f
+
+ Enable binary output on terminals.  Normally, :program:`llvm-link` will refuse
+ to write raw bitcode output if the output stream is a terminal. With this
+ option, :program:`llvm-link` will write raw bitcode regardless of the output
+ device.
+
+.. option:: -o filename
+
+ Specify the output file name.  If ``filename`` is "``-``", then
+ :program:`llvm-link` will write its output to standard output.
+
+.. option:: -S
+
+ Write output in LLVM intermediate language (instead of bitcode).
+
+.. option:: -d
+
+ If specified, :program:`llvm-link` prints a human-readable version of the
+ output bitcode file to standard error.
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -v
+
+ Verbose mode.  Print information about what :program:`llvm-link` is doing.
+ This typically includes a message for each bitcode file linked in and for each
+ library found.
+
+EXIT STATUS
+-----------
+
+If :program:`llvm-link` succeeds, it will exit with 0.  Otherwise, if an error
+occurs, it will exit with a non-zero value.
+
+

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-nm.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-nm.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-nm.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-nm.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,151 @@
+llvm-nm - list LLVM bitcode and object file's symbol table
+==========================================================
+
+SYNOPSIS
+--------
+
+:program:`llvm-nm` [*options*] [*filenames...*]
+
+DESCRIPTION
+-----------
+
+The :program:`llvm-nm` utility lists the names of symbols from the LLVM bitcode
+files, object files, or :program:`ar` archives containing them, named on the
+command line.  Each symbol is listed along with some simple information about
+its provenance.  If no file name is specified, or *-* is used as a file name,
+:program:`llvm-nm` will process a file on its standard input stream.
+
+:program:`llvm-nm`'s default output format is the traditional BSD :program:`nm`
+output format.  Each such output record consists of an (optional) 8-digit
+hexadecimal address, followed by a type code character, followed by a name, for
+each symbol.  One record is printed per line; fields are separated by spaces.
+When the address is omitted, it is replaced by 8 spaces.
+
+Type code characters currently supported, and their meanings, are as follows:
+
+U
+
+ Named object is referenced but undefined in this bitcode file
+
+C
+
+ Common (multiple definitions link together into one def)
+
+W
+
+ Weak reference (multiple definitions link together into zero or one definitions)
+
+t
+
+ Local function (text) object
+
+T
+
+ Global function (text) object
+
+d
+
+ Local data object
+
+D
+
+ Global data object
+
+?
+
+ Something unrecognizable
+
+Because LLVM bitcode files typically contain objects that are not considered to
+have addresses until they are linked into an executable image or dynamically
+compiled "just-in-time", :program:`llvm-nm` does not print an address for any
+symbol in an LLVM bitcode file, even symbols which are defined in the bitcode
+file.
+
+OPTIONS
+-------
+
+.. program:: llvm-nm
+
+.. option:: -B    (default)
+
+ Use BSD output format.  Alias for `--format=bsd`.
+
+.. option:: -P
+
+ Use POSIX.2 output format.  Alias for `--format=posix`.
+
+.. option:: --debug-syms, -a
+
+ Show all symbols, even debugger only.
+
+.. option:: --defined-only
+
+ Print only symbols defined in this file (as opposed to
+ symbols which may be referenced by objects in this file, but not
+ defined in this file.)
+
+.. option:: --dynamic, -D
+
+ Display dynamic symbols instead of normal symbols.
+
+.. option:: --extern-only, -g
+
+ Print only symbols whose definitions are external; that is, accessible
+ from other files.
+
+.. option:: --format=format, -f format
+
+ Select an output format; *format* may be *sysv*, *posix*, or *bsd*.  The default
+ is *bsd*.
+
+.. option:: -help
+
+ Print a summary of command-line options and their meanings.
+
+.. option:: --no-sort, -p
+
+ Shows symbols in order encountered.
+
+.. option:: --numeric-sort, -n, -v
+
+ Sort symbols by address.
+
+.. option:: --print-file-name, -A, -o
+
+ Precede each symbol with the file it came from.
+
+.. option:: --print-size, -S
+
+ Show symbol size instead of address.
+
+.. option:: --size-sort
+
+ Sort symbols by size.
+
+.. option:: --undefined-only, -u
+
+ Print only symbols referenced but not defined in this file.
+
+.. option:: --radix=RADIX, -t
+
+ Specify the radix of the symbol address(es). Values accepted d(decimal),
+ x(hexadecomal) and o(octal).
+
+BUGS
+----
+
+ * :program:`llvm-nm` cannot demangle C++ mangled names, like GNU :program:`nm`
+   can.
+
+ * :program:`llvm-nm` does not support the full set of arguments that GNU
+   :program:`nm` does.
+
+EXIT STATUS
+-----------
+
+:program:`llvm-nm` exits with an exit code of zero.
+
+SEE ALSO
+--------
+
+llvm-dis, ar(1), nm(1)

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-profdata.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-profdata.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-profdata.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-profdata.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,203 @@
+llvm-profdata - Profile data tool
+=================================
+
+SYNOPSIS
+--------
+
+:program:`llvm-profdata` *command* [*args...*]
+
+DESCRIPTION
+-----------
+
+The :program:`llvm-profdata` tool is a small utility for working with profile
+data files.
+
+COMMANDS
+--------
+
+* :ref:`merge <profdata-merge>`
+* :ref:`show <profdata-show>`
+
+.. program:: llvm-profdata merge
+
+.. _profdata-merge:
+
+MERGE
+-----
+
+SYNOPSIS
+^^^^^^^^
+
+:program:`llvm-profdata merge` [*options*] [*filename...*]
+
+DESCRIPTION
+^^^^^^^^^^^
+
+:program:`llvm-profdata merge` takes several profile data files
+generated by PGO instrumentation and merges them together into a single
+indexed profile data file.
+
+By default profile data is merged without modification. This means that the
+relative importance of each input file is proportional to the number of samples
+or counts it contains. In general, the input from a longer training run will be
+interpreted as relatively more important than a shorter run. Depending on the
+nature of the training runs it may be useful to adjust the weight given to each
+input file by using the ``-weighted-input`` option.
+
+Profiles passed in via ``-weighted-input``, ``-input-files``, or via positional
+arguments are processed once for each time they are seen.
+
+
+OPTIONS
+^^^^^^^
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -output=output, -o=output
+
+ Specify the output file name.  *Output* cannot be ``-`` as the resulting
+ indexed profile data can't be written to standard output.
+
+.. option:: -weighted-input=weight,filename
+
+ Specify an input file name along with a weight. The profile counts of the
+ supplied ``filename`` will be scaled (multiplied) by the supplied
+ ``weight``, where where ``weight`` is a decimal integer >= 1.
+ Input files specified without using this option are assigned a default
+ weight of 1. Examples are shown below.
+
+.. option:: -input-files=path, -f=path
+
+  Specify a file which contains a list of files to merge. The entries in this
+  file are newline-separated. Lines starting with '#' are skipped. Entries may
+  be of the form <filename> or <weight>,<filename>.
+
+.. option:: -instr (default)
+
+ Specify that the input profile is an instrumentation-based profile.
+
+.. option:: -sample
+
+ Specify that the input profile is a sample-based profile.
+ 
+ The format of the generated file can be generated in one of three ways:
+
+ .. option:: -binary (default)
+
+ Emit the profile using a binary encoding. For instrumentation-based profile
+ the output format is the indexed binary format. 
+
+ .. option:: -text
+
+ Emit the profile in text mode. This option can also be used with both
+ sample-based and instrumentation-based profile. When this option is used
+ the profile will be dumped in the text format that is parsable by the profile
+ reader.
+
+ .. option:: -gcc
+
+ Emit the profile using GCC's gcov format (Not yet supported).
+
+.. option:: -sparse[=true|false]
+
+ Do not emit function records with 0 execution count. Can only be used in
+ conjunction with -instr. Defaults to false, since it can inhibit compiler
+ optimization during PGO.
+
+.. option:: -num-threads=N, -j=N
+
+ Use N threads to perform profile merging. When N=0, llvm-profdata auto-detects
+ an appropriate number of threads to use. This is the default.
+
+EXAMPLES
+^^^^^^^^
+Basic Usage
++++++++++++
+Merge three profiles:
+
+::
+
+    llvm-profdata merge foo.profdata bar.profdata baz.profdata -output merged.profdata
+
+Weighted Input
+++++++++++++++
+The input file `foo.profdata` is especially important, multiply its counts by 10:
+
+::
+
+    llvm-profdata merge -weighted-input=10,foo.profdata bar.profdata baz.profdata -output merged.profdata
+
+Exactly equivalent to the previous invocation (explicit form; useful for programmatic invocation):
+
+::
+
+    llvm-profdata merge -weighted-input=10,foo.profdata -weighted-input=1,bar.profdata -weighted-input=1,baz.profdata -output merged.profdata
+
+.. program:: llvm-profdata show
+
+.. _profdata-show:
+
+SHOW
+----
+
+SYNOPSIS
+^^^^^^^^
+
+:program:`llvm-profdata show` [*options*] [*filename*]
+
+DESCRIPTION
+^^^^^^^^^^^
+
+:program:`llvm-profdata show` takes a profile data file and displays the
+information about the profile counters for this file and
+for any of the specified function(s).
+
+If *filename* is omitted or is ``-``, then **llvm-profdata show** reads its
+input from standard input.
+
+OPTIONS
+^^^^^^^
+
+.. option:: -all-functions
+
+ Print details for every function.
+
+.. option:: -counts
+
+ Print the counter values for the displayed functions.
+
+.. option:: -function=string
+
+ Print details for a function if the function's name contains the given string.
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -output=output, -o=output
+
+ Specify the output file name.  If *output* is ``-`` or it isn't specified,
+ then the output is sent to standard output.
+
+.. option:: -instr (default)
+
+ Specify that the input profile is an instrumentation-based profile.
+
+.. option:: -text
+
+ Instruct the profile dumper to show profile counts in the text format of the
+ instrumentation-based profile data representation. By default, the profile
+ information is dumped in a more human readable form (also in text) with
+ annotations.
+
+.. option:: -sample
+
+ Specify that the input profile is a sample-based profile.
+
+EXIT STATUS
+-----------
+
+:program:`llvm-profdata` returns 1 if the command is omitted or is invalid,
+if it cannot read input files, or if there is a mismatch between their data.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-readobj.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-readobj.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-readobj.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-readobj.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,90 @@
+llvm-readobj - LLVM Object Reader
+=================================
+
+SYNOPSIS
+--------
+
+:program:`llvm-readobj` [*options*] [*input...*]
+
+DESCRIPTION
+-----------
+
+The :program:`llvm-readobj` tool displays low-level format-specific information
+about one or more object files. The tool and its output is primarily designed
+for use in FileCheck-based tests.
+
+OPTIONS
+-------
+
+If ``input`` is "``-``" or omitted, :program:`llvm-readobj` reads from standard
+input. Otherwise, it will read from the specified ``filenames``.
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -version
+
+ Display the version of this program
+
+.. option:: -file-headers, -h
+
+ Display file headers.
+
+.. option:: -sections, -s
+
+ Display all sections.
+
+.. option:: -section-data, -sd
+
+ When used with ``-sections``, display section data for each section shown.
+
+.. option:: -section-relocations, -sr
+
+ When used with ``-sections``, display relocations for each section shown.
+
+.. option:: -section-symbols, -st
+
+ When used with ``-sections``, display symbols for each section shown.
+
+.. option:: -relocations, -r
+
+ Display the relocation entries in the file.
+
+.. option:: -symbols, -t
+
+ Display the symbol table.
+
+.. option:: -dyn-symbols
+
+ Display the dynamic symbol table (only for ELF object files).
+
+.. option:: -unwind, -u
+
+ Display unwind information.
+
+.. option:: -expand-relocs
+
+ When used with ``-relocations``, display each relocation in an expanded
+ multi-line format.
+
+.. option:: -dynamic-table
+
+ Display the ELF .dynamic section table (only for ELF object files).
+
+.. option:: -needed-libs
+
+ Display the needed libraries (only for ELF object files).
+
+.. option:: -program-headers
+
+ Display the ELF program headers (only for ELF object files).
+
+.. option:: -elf-section-groups, -g
+
+ Display section groups (only for ELF object files).
+
+EXIT STATUS
+-----------
+
+:program:`llvm-readobj` returns 0.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-stress.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-stress.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-stress.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-stress.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,34 @@
+llvm-stress - generate random .ll files
+=======================================
+
+SYNOPSIS
+--------
+
+:program:`llvm-stress` [-size=filesize] [-seed=initialseed] [-o=outfile]
+
+DESCRIPTION
+-----------
+
+The :program:`llvm-stress` tool is used to generate random ``.ll`` files that
+can be used to test different components of LLVM.
+
+OPTIONS
+-------
+
+.. option:: -o filename
+
+ Specify the output filename.
+
+.. option:: -size size
+
+ Specify the size of the generated ``.ll`` file.
+
+.. option:: -seed seed
+
+ Specify the seed to be used for the randomly generated instructions.
+
+EXIT STATUS
+-----------
+
+:program:`llvm-stress` returns 0.
+

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-symbolizer.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-symbolizer.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-symbolizer.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/llvm-symbolizer.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,121 @@
+llvm-symbolizer - convert addresses into source code locations
+==============================================================
+
+SYNOPSIS
+--------
+
+:program:`llvm-symbolizer` [options]
+
+DESCRIPTION
+-----------
+
+:program:`llvm-symbolizer` reads object file names and addresses from standard
+input and prints corresponding source code locations to standard output.
+If object file is specified in command line, :program:`llvm-symbolizer` 
+processes only addresses from standard input, the rest is output verbatim.
+This program uses debug info sections and symbol table in the object files.
+
+EXAMPLE
+--------
+
+.. code-block:: console
+
+  $ cat addr.txt
+  a.out 0x4004f4
+  /tmp/b.out 0x400528
+  /tmp/c.so 0x710
+  /tmp/mach_universal_binary:i386 0x1f84
+  /tmp/mach_universal_binary:x86_64 0x100000f24
+  $ llvm-symbolizer < addr.txt
+  main
+  /tmp/a.cc:4
+  
+  f(int, int)
+  /tmp/b.cc:11
+
+  h_inlined_into_g
+  /tmp/header.h:2
+  g_inlined_into_f
+  /tmp/header.h:7
+  f_inlined_into_main
+  /tmp/source.cc:3
+  main
+  /tmp/source.cc:8
+
+  _main
+  /tmp/source_i386.cc:8
+
+  _main
+  /tmp/source_x86_64.cc:8
+  $ cat addr2.txt
+  0x4004f4
+  0x401000
+  $ llvm-symbolizer -obj=a.out < addr2.txt
+  main
+  /tmp/a.cc:4
+
+  foo(int)
+  /tmp/a.cc:12
+  $cat addr.txt
+  0x40054d
+  $llvm-symbolizer -inlining -print-address -pretty-print -obj=addr.exe < addr.txt
+  0x40054d: inc at /tmp/x.c:3:3
+   (inlined by) main at /tmp/x.c:9:0
+  $llvm-symbolizer -inlining -pretty-print -obj=addr.exe < addr.txt
+  inc at /tmp/x.c:3:3
+   (inlined by) main at /tmp/x.c:9:0
+
+OPTIONS
+-------
+
+.. option:: -obj
+
+  Path to object file to be symbolized.
+
+.. option:: -functions=[none|short|linkage]
+
+  Specify the way function names are printed (omit function name,
+  print short function name, or print full linkage name, respectively).
+  Defaults to ``linkage``.
+
+.. option:: -use-symbol-table
+
+ Prefer function names stored in symbol table to function names
+ in debug info sections. Defaults to true.
+
+.. option:: -demangle
+
+ Print demangled function names. Defaults to true.
+
+.. option:: -inlining 
+
+ If a source code location is in an inlined function, prints all the
+ inlnied frames. Defaults to true.
+
+.. option:: -default-arch
+
+ If a binary contains object files for multiple architectures (e.g. it is a
+ Mach-O universal binary), symbolize the object file for a given architecture.
+ You can also specify architecture by writing ``binary_name:arch_name`` in the
+ input (see example above). If architecture is not specified in either way,
+ address will not be symbolized. Defaults to empty string.
+
+.. option:: -dsym-hint=<path/to/file.dSYM>
+
+ (Darwin-only flag). If the debug info for a binary isn't present in the default
+ location, look for the debug info at the .dSYM path provided via the
+ ``-dsym-hint`` flag. This flag can be used multiple times.
+
+.. option:: -print-address
+
+ Print address before the source code location. Defaults to false.
+
+.. option:: -pretty-print
+
+ Print human readable output. If ``-inlining`` is specified, enclosing scope is
+ prefixed by (inlined by). Refer to listed examples.
+
+EXIT STATUS
+-----------
+
+:program:`llvm-symbolizer` returns 0. Other exit codes imply internal program error.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/opt.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/opt.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/opt.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/opt.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,123 @@
+opt - LLVM optimizer
+====================
+
+SYNOPSIS
+--------
+
+:program:`opt` [*options*] [*filename*]
+
+DESCRIPTION
+-----------
+
+The :program:`opt` command is the modular LLVM optimizer and analyzer.  It
+takes LLVM source files as input, runs the specified optimizations or analyses
+on it, and then outputs the optimized file or the analysis results.  The
+function of :program:`opt` depends on whether the `-analyze` option is
+given.
+
+When `-analyze` is specified, :program:`opt` performs various analyses
+of the input source.  It will usually print the results on standard output, but
+in a few cases, it will print output to standard error or generate a file with
+the analysis output, which is usually done when the output is meant for another
+program.
+
+While `-analyze` is *not* given, :program:`opt` attempts to produce an
+optimized output file.  The optimizations available via :program:`opt` depend
+upon what libraries were linked into it as well as any additional libraries
+that have been loaded with the :option:`-load` option.  Use the :option:`-help`
+option to determine what optimizations you can use.
+
+If ``filename`` is omitted from the command line or is "``-``", :program:`opt`
+reads its input from standard input.  Inputs can be in either the LLVM assembly
+language format (``.ll``) or the LLVM bitcode format (``.bc``).
+
+If an output filename is not specified with the :option:`-o` option,
+:program:`opt` writes its output to the standard output.
+
+OPTIONS
+-------
+
+.. option:: -f
+
+ Enable binary output on terminals.  Normally, :program:`opt` will refuse to
+ write raw bitcode output if the output stream is a terminal.  With this option,
+ :program:`opt` will write raw bitcode regardless of the output device.
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -o <filename>
+
+ Specify the output filename.
+
+.. option:: -S
+
+ Write output in LLVM intermediate language (instead of bitcode).
+
+.. option:: -{passname}
+
+ :program:`opt` provides the ability to run any of LLVM's optimization or
+ analysis passes in any order.  The :option:`-help` option lists all the passes
+ available.  The order in which the options occur on the command line are the
+ order in which they are executed (within pass constraints).
+
+.. option:: -disable-inlining
+
+ This option simply removes the inlining pass from the standard list.
+
+.. option:: -disable-opt
+
+ This option is only meaningful when `-std-link-opts` is given.  It
+ disables most passes.
+
+.. option:: -strip-debug
+
+ This option causes opt to strip debug information from the module before
+ applying other optimizations.  It is essentially the same as `-strip`
+ but it ensures that stripping of debug information is done first.
+
+.. option:: -verify-each
+
+ This option causes opt to add a verify pass after every pass otherwise
+ specified on the command line (including `-verify`).  This is useful
+ for cases where it is suspected that a pass is creating an invalid module but
+ it is not clear which pass is doing it.
+
+.. option:: -stats
+
+ Print statistics.
+
+.. option:: -time-passes
+
+ Record the amount of time needed for each pass and print it to standard
+ error.
+
+.. option:: -debug
+
+ If this is a debug build, this option will enable debug printouts from passes
+ which use the ``DEBUG()`` macro.  See the `LLVM Programmer's Manual
+ <../ProgrammersManual.html>`_, section ``#DEBUG`` for more information.
+
+.. option:: -load=<plugin>
+
+ Load the dynamic object ``plugin``.  This object should register new
+ optimization or analysis passes.  Once loaded, the object will add new command
+ line options to enable various optimizations or analyses.  To see the new
+ complete list of optimizations, use the :option:`-help` and :option:`-load`
+ options together.  For example:
+
+ .. code-block:: sh
+
+     opt -load=plugin.so -help
+
+.. option:: -p
+
+ Print module after each transformation.
+
+EXIT STATUS
+-----------
+
+If :program:`opt` succeeds, it will exit with 0.  Otherwise, if an error
+occurs, it will exit with a non-zero value.
+

Added: www-releases/trunk/4.0.1/docs/_sources/CommandGuide/tblgen.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandGuide/tblgen.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandGuide/tblgen.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandGuide/tblgen.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,132 @@
+tblgen - Target Description To C++ Code Generator
+=================================================
+
+SYNOPSIS
+--------
+
+:program:`tblgen` [*options*] [*filename*]
+
+DESCRIPTION
+-----------
+
+:program:`tblgen` translates from target description (``.td``) files into C++
+code that can be included in the definition of an LLVM target library.  Most
+users of LLVM will not need to use this program.  It is only for assisting with
+writing an LLVM target backend.
+
+The input and output of :program:`tblgen` is beyond the scope of this short
+introduction; please see the :doc:`introduction to TableGen
+<../TableGen/index>`.
+
+The *filename* argument specifies the name of a Target Description (``.td``)
+file to read as input.
+
+OPTIONS
+-------
+
+.. program:: tblgen
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -o filename
+
+ Specify the output file name.  If ``filename`` is ``-``, then
+ :program:`tblgen` sends its output to standard output.
+
+.. option:: -I directory
+
+ Specify where to find other target description files for inclusion.  The
+ ``directory`` value should be a full or partial path to a directory that
+ contains target description files.
+
+.. option:: -asmparsernum N
+
+ Make -gen-asm-parser emit assembly writer number ``N``.
+
+.. option:: -asmwriternum N
+
+ Make -gen-asm-writer emit assembly writer number ``N``.
+
+.. option:: -class className
+
+ Print the enumeration list for this class.
+
+.. option:: -print-records
+
+ Print all records to standard output (default).
+
+.. option:: -print-enums
+
+ Print enumeration values for a class.
+
+.. option:: -print-sets
+
+ Print expanded sets for testing DAG exprs.
+
+.. option:: -gen-emitter
+
+ Generate machine code emitter.
+
+.. option:: -gen-register-info
+
+ Generate registers and register classes info.
+
+.. option:: -gen-instr-info
+
+ Generate instruction descriptions.
+
+.. option:: -gen-asm-writer
+
+ Generate the assembly writer.
+
+.. option:: -gen-disassembler
+
+ Generate disassembler.
+
+.. option:: -gen-pseudo-lowering
+
+ Generate pseudo instruction lowering.
+
+.. option:: -gen-dag-isel
+
+ Generate a DAG (Directed Acycle Graph) instruction selector.
+
+.. option:: -gen-asm-matcher
+
+ Generate assembly instruction matcher.
+
+.. option:: -gen-dfa-packetizer
+
+ Generate DFA Packetizer for VLIW targets.
+
+.. option:: -gen-fast-isel
+
+ Generate a "fast" instruction selector.
+
+.. option:: -gen-subtarget
+
+ Generate subtarget enumerations.
+
+.. option:: -gen-intrinsic
+
+ Generate intrinsic information.
+
+.. option:: -gen-tgt-intrinsic
+
+ Generate target intrinsic information.
+
+.. option:: -gen-enhanced-disassembly-info
+
+ Generate enhanced disassembly info.
+
+.. option:: -version
+
+ Show the version number of this program.
+
+EXIT STATUS
+-----------
+
+If :program:`tblgen` succeeds, it will exit with 0.  Otherwise, if an error
+occurs, it will exit with a non-zero value.

Added: www-releases/trunk/4.0.1/docs/_sources/CommandLine.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CommandLine.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CommandLine.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CommandLine.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,1736 @@
+==============================
+CommandLine 2.0 Library Manual
+==============================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+This document describes the CommandLine argument processing library.  It will
+show you how to use it, and what it can do.  The CommandLine library uses a
+declarative approach to specifying the command line options that your program
+takes.  By default, these options declarations implicitly hold the value parsed
+for the option declared (of course this `can be changed`_).
+
+Although there are a **lot** of command line argument parsing libraries out
+there in many different languages, none of them fit well with what I needed.  By
+looking at the features and problems of other libraries, I designed the
+CommandLine library to have the following features:
+
+#. Speed: The CommandLine library is very quick and uses little resources.  The
+   parsing time of the library is directly proportional to the number of
+   arguments parsed, not the number of options recognized.  Additionally,
+   command line argument values are captured transparently into user defined
+   global variables, which can be accessed like any other variable (and with the
+   same performance).
+
+#. Type Safe: As a user of CommandLine, you don't have to worry about
+   remembering the type of arguments that you want (is it an int?  a string? a
+   bool? an enum?) and keep casting it around.  Not only does this help prevent
+   error prone constructs, it also leads to dramatically cleaner source code.
+
+#. No subclasses required: To use CommandLine, you instantiate variables that
+   correspond to the arguments that you would like to capture, you don't
+   subclass a parser.  This means that you don't have to write **any**
+   boilerplate code.
+
+#. Globally accessible: Libraries can specify command line arguments that are
+   automatically enabled in any tool that links to the library.  This is
+   possible because the application doesn't have to keep a list of arguments to
+   pass to the parser.  This also makes supporting `dynamically loaded options`_
+   trivial.
+
+#. Cleaner: CommandLine supports enum and other types directly, meaning that
+   there is less error and more security built into the library.  You don't have
+   to worry about whether your integral command line argument accidentally got
+   assigned a value that is not valid for your enum type.
+
+#. Powerful: The CommandLine library supports many different types of arguments,
+   from simple `boolean flags`_ to `scalars arguments`_ (`strings`_,
+   `integers`_, `enums`_, `doubles`_), to `lists of arguments`_.  This is
+   possible because CommandLine is...
+
+#. Extensible: It is very simple to add a new argument type to CommandLine.
+   Simply specify the parser that you want to use with the command line option
+   when you declare it. `Custom parsers`_ are no problem.
+
+#. Labor Saving: The CommandLine library cuts down on the amount of grunt work
+   that you, the user, have to do.  For example, it automatically provides a
+   ``-help`` option that shows the available command line options for your tool.
+   Additionally, it does most of the basic correctness checking for you.
+
+#. Capable: The CommandLine library can handle lots of different forms of
+   options often found in real programs.  For example, `positional`_ arguments,
+   ``ls`` style `grouping`_ options (to allow processing '``ls -lad``'
+   naturally), ``ld`` style `prefix`_ options (to parse '``-lmalloc
+   -L/usr/lib``'), and interpreter style options.
+
+This document will hopefully let you jump in and start using CommandLine in your
+utility quickly and painlessly.  Additionally it should be a simple reference
+manual to figure out how stuff works.
+
+Quick Start Guide
+=================
+
+This section of the manual runs through a simple CommandLine'ification of a
+basic compiler tool.  This is intended to show you how to jump into using the
+CommandLine library in your own program, and show you some of the cool things it
+can do.
+
+To start out, you need to include the CommandLine header file into your program:
+
+.. code-block:: c++
+
+  #include "llvm/Support/CommandLine.h"
+
+Additionally, you need to add this as the first line of your main program:
+
+.. code-block:: c++
+
+  int main(int argc, char **argv) {
+    cl::ParseCommandLineOptions(argc, argv);
+    ...
+  }
+
+... which actually parses the arguments and fills in the variable declarations.
+
+Now that you are ready to support command line arguments, we need to tell the
+system which ones we want, and what type of arguments they are.  The CommandLine
+library uses a declarative syntax to model command line arguments with the
+global variable declarations that capture the parsed values.  This means that
+for every command line option that you would like to support, there should be a
+global variable declaration to capture the result.  For example, in a compiler,
+we would like to support the Unix-standard '``-o <filename>``' option to specify
+where to put the output.  With the CommandLine library, this is represented like
+this:
+
+.. _scalars arguments:
+.. _here:
+
+.. code-block:: c++
+
+  cl::opt<string> OutputFilename("o", cl::desc("Specify output filename"), cl::value_desc("filename"));
+
+This declares a global variable "``OutputFilename``" that is used to capture the
+result of the "``o``" argument (first parameter).  We specify that this is a
+simple scalar option by using the "``cl::opt``" template (as opposed to the
+"``cl::list``" template), and tell the CommandLine library that the data
+type that we are parsing is a string.
+
+The second and third parameters (which are optional) are used to specify what to
+output for the "``-help``" option.  In this case, we get a line that looks like
+this:
+
+::
+
+  USAGE: compiler [options]
+
+  OPTIONS:
+    -help             - display available options (-help-hidden for more)
+    -o <filename>     - Specify output filename
+
+Because we specified that the command line option should parse using the
+``string`` data type, the variable declared is automatically usable as a real
+string in all contexts that a normal C++ string object may be used.  For
+example:
+
+.. code-block:: c++
+
+  ...
+  std::ofstream Output(OutputFilename.c_str());
+  if (Output.good()) ...
+  ...
+
+There are many different options that you can use to customize the command line
+option handling library, but the above example shows the general interface to
+these options.  The options can be specified in any order, and are specified
+with helper functions like `cl::desc(...)`_, so there are no positional
+dependencies to remember.  The available options are discussed in detail in the
+`Reference Guide`_.
+
+Continuing the example, we would like to have our compiler take an input
+filename as well as an output filename, but we do not want the input filename to
+be specified with a hyphen (ie, not ``-filename.c``).  To support this style of
+argument, the CommandLine library allows for `positional`_ arguments to be
+specified for the program.  These positional arguments are filled with command
+line parameters that are not in option form.  We use this feature like this:
+
+.. code-block:: c++
+
+
+  cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::init("-"));
+
+This declaration indicates that the first positional argument should be treated
+as the input filename.  Here we use the `cl::init`_ option to specify an initial
+value for the command line option, which is used if the option is not specified
+(if you do not specify a `cl::init`_ modifier for an option, then the default
+constructor for the data type is used to initialize the value).  Command line
+options default to being optional, so if we would like to require that the user
+always specify an input filename, we would add the `cl::Required`_ flag, and we
+could eliminate the `cl::init`_ modifier, like this:
+
+.. code-block:: c++
+
+  cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::Required);
+
+Again, the CommandLine library does not require the options to be specified in
+any particular order, so the above declaration is equivalent to:
+
+.. code-block:: c++
+
+  cl::opt<string> InputFilename(cl::Positional, cl::Required, cl::desc("<input file>"));
+
+By simply adding the `cl::Required`_ flag, the CommandLine library will
+automatically issue an error if the argument is not specified, which shifts all
+of the command line option verification code out of your application into the
+library.  This is just one example of how using flags can alter the default
+behaviour of the library, on a per-option basis.  By adding one of the
+declarations above, the ``-help`` option synopsis is now extended to:
+
+::
+
+  USAGE: compiler [options] <input file>
+
+  OPTIONS:
+    -help             - display available options (-help-hidden for more)
+    -o <filename>     - Specify output filename
+
+... indicating that an input filename is expected.
+
+Boolean Arguments
+-----------------
+
+In addition to input and output filenames, we would like the compiler example to
+support three boolean flags: "``-f``" to force writing binary output to a
+terminal, "``--quiet``" to enable quiet mode, and "``-q``" for backwards
+compatibility with some of our users.  We can support these by declaring options
+of boolean type like this:
+
+.. code-block:: c++
+
+  cl::opt<bool> Force ("f", cl::desc("Enable binary output on terminals"));
+  cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages"));
+  cl::opt<bool> Quiet2("q", cl::desc("Don't print informational messages"), cl::Hidden);
+
+This does what you would expect: it declares three boolean variables
+("``Force``", "``Quiet``", and "``Quiet2``") to recognize these options.  Note
+that the "``-q``" option is specified with the "`cl::Hidden`_" flag.  This
+modifier prevents it from being shown by the standard "``-help``" output (note
+that it is still shown in the "``-help-hidden``" output).
+
+The CommandLine library uses a `different parser`_ for different data types.
+For example, in the string case, the argument passed to the option is copied
+literally into the content of the string variable... we obviously cannot do that
+in the boolean case, however, so we must use a smarter parser.  In the case of
+the boolean parser, it allows no options (in which case it assigns the value of
+true to the variable), or it allows the values "``true``" or "``false``" to be
+specified, allowing any of the following inputs:
+
+::
+
+  compiler -f          # No value, 'Force' == true
+  compiler -f=true     # Value specified, 'Force' == true
+  compiler -f=TRUE     # Value specified, 'Force' == true
+  compiler -f=FALSE    # Value specified, 'Force' == false
+
+... you get the idea.  The `bool parser`_ just turns the string values into
+boolean values, and rejects things like '``compiler -f=foo``'.  Similarly, the
+`float`_, `double`_, and `int`_ parsers work like you would expect, using the
+'``strtol``' and '``strtod``' C library calls to parse the string value into the
+specified data type.
+
+With the declarations above, "``compiler -help``" emits this:
+
+::
+
+  USAGE: compiler [options] <input file>
+
+  OPTIONS:
+    -f     - Enable binary output on terminals
+    -o     - Override output filename
+    -quiet - Don't print informational messages
+    -help  - display available options (-help-hidden for more)
+
+and "``compiler -help-hidden``" prints this:
+
+::
+
+  USAGE: compiler [options] <input file>
+
+  OPTIONS:
+    -f     - Enable binary output on terminals
+    -o     - Override output filename
+    -q     - Don't print informational messages
+    -quiet - Don't print informational messages
+    -help  - display available options (-help-hidden for more)
+
+This brief example has shown you how to use the '`cl::opt`_' class to parse
+simple scalar command line arguments.  In addition to simple scalar arguments,
+the CommandLine library also provides primitives to support CommandLine option
+`aliases`_, and `lists`_ of options.
+
+.. _aliases:
+
+Argument Aliases
+----------------
+
+So far, the example works well, except for the fact that we need to check the
+quiet condition like this now:
+
+.. code-block:: c++
+
+  ...
+    if (!Quiet && !Quiet2) printInformationalMessage(...);
+  ...
+
+... which is a real pain!  Instead of defining two values for the same
+condition, we can use the "`cl::alias`_" class to make the "``-q``" option an
+**alias** for the "``-quiet``" option, instead of providing a value itself:
+
+.. code-block:: c++
+
+  cl::opt<bool> Force ("f", cl::desc("Overwrite output files"));
+  cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages"));
+  cl::alias     QuietA("q", cl::desc("Alias for -quiet"), cl::aliasopt(Quiet));
+
+The third line (which is the only one we modified from above) defines a "``-q``"
+alias that updates the "``Quiet``" variable (as specified by the `cl::aliasopt`_
+modifier) whenever it is specified.  Because aliases do not hold state, the only
+thing the program has to query is the ``Quiet`` variable now.  Another nice
+feature of aliases is that they automatically hide themselves from the ``-help``
+output (although, again, they are still visible in the ``-help-hidden output``).
+
+Now the application code can simply use:
+
+.. code-block:: c++
+
+  ...
+    if (!Quiet) printInformationalMessage(...);
+  ...
+
+... which is much nicer!  The "`cl::alias`_" can be used to specify an
+alternative name for any variable type, and has many uses.
+
+.. _unnamed alternatives using the generic parser:
+
+Selecting an alternative from a set of possibilities
+----------------------------------------------------
+
+So far we have seen how the CommandLine library handles builtin types like
+``std::string``, ``bool`` and ``int``, but how does it handle things it doesn't
+know about, like enums or '``int*``'s?
+
+The answer is that it uses a table-driven generic parser (unless you specify
+your own parser, as described in the `Extension Guide`_).  This parser maps
+literal strings to whatever type is required, and requires you to tell it what
+this mapping should be.
+
+Let's say that we would like to add four optimization levels to our optimizer,
+using the standard flags "``-g``", "``-O0``", "``-O1``", and "``-O2``".  We
+could easily implement this with boolean options like above, but there are
+several problems with this strategy:
+
+#. A user could specify more than one of the options at a time, for example,
+   "``compiler -O3 -O2``".  The CommandLine library would not be able to catch
+   this erroneous input for us.
+
+#. We would have to test 4 different variables to see which ones are set.
+
+#. This doesn't map to the numeric levels that we want... so we cannot easily
+   see if some level >= "``-O1``" is enabled.
+
+To cope with these problems, we can use an enum value, and have the CommandLine
+library fill it in with the appropriate level directly, which is used like this:
+
+.. code-block:: c++
+
+  enum OptLevel {
+    g, O1, O2, O3
+  };
+
+  cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"),
+    cl::values(
+      clEnumVal(g , "No optimizations, enable debugging"),
+      clEnumVal(O1, "Enable trivial optimizations"),
+      clEnumVal(O2, "Enable default optimizations"),
+      clEnumVal(O3, "Enable expensive optimizations")));
+
+  ...
+    if (OptimizationLevel >= O2) doPartialRedundancyElimination(...);
+  ...
+
+This declaration defines a variable "``OptimizationLevel``" of the
+"``OptLevel``" enum type.  This variable can be assigned any of the values that
+are listed in the declaration.  The CommandLine library enforces that
+the user can only specify one of the options, and it ensure that only valid enum
+values can be specified.  The "``clEnumVal``" macros ensure that the command
+line arguments matched the enum values.  With this option added, our help output
+now is:
+
+::
+
+  USAGE: compiler [options] <input file>
+
+  OPTIONS:
+    Choose optimization level:
+      -g          - No optimizations, enable debugging
+      -O1         - Enable trivial optimizations
+      -O2         - Enable default optimizations
+      -O3         - Enable expensive optimizations
+    -f            - Enable binary output on terminals
+    -help         - display available options (-help-hidden for more)
+    -o <filename> - Specify output filename
+    -quiet        - Don't print informational messages
+
+In this case, it is sort of awkward that flag names correspond directly to enum
+names, because we probably don't want a enum definition named "``g``" in our
+program.  Because of this, we can alternatively write this example like this:
+
+.. code-block:: c++
+
+  enum OptLevel {
+    Debug, O1, O2, O3
+  };
+
+  cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"),
+    cl::values(
+     clEnumValN(Debug, "g", "No optimizations, enable debugging"),
+      clEnumVal(O1        , "Enable trivial optimizations"),
+      clEnumVal(O2        , "Enable default optimizations"),
+      clEnumVal(O3        , "Enable expensive optimizations")));
+
+  ...
+    if (OptimizationLevel == Debug) outputDebugInfo(...);
+  ...
+
+By using the "``clEnumValN``" macro instead of "``clEnumVal``", we can directly
+specify the name that the flag should get.  In general a direct mapping is nice,
+but sometimes you can't or don't want to preserve the mapping, which is when you
+would use it.
+
+Named Alternatives
+------------------
+
+Another useful argument form is a named alternative style.  We shall use this
+style in our compiler to specify different debug levels that can be used.
+Instead of each debug level being its own switch, we want to support the
+following options, of which only one can be specified at a time:
+"``--debug-level=none``", "``--debug-level=quick``",
+"``--debug-level=detailed``".  To do this, we use the exact same format as our
+optimization level flags, but we also specify an option name.  For this case,
+the code looks like this:
+
+.. code-block:: c++
+
+  enum DebugLev {
+    nodebuginfo, quick, detailed
+  };
+
+  // Enable Debug Options to be specified on the command line
+  cl::opt<DebugLev> DebugLevel("debug_level", cl::desc("Set the debugging level:"),
+    cl::values(
+      clEnumValN(nodebuginfo, "none", "disable debug information"),
+       clEnumVal(quick,               "enable quick debug information"),
+       clEnumVal(detailed,            "enable detailed debug information")));
+
+This definition defines an enumerated command line variable of type "``enum
+DebugLev``", which works exactly the same way as before.  The difference here is
+just the interface exposed to the user of your program and the help output by
+the "``-help``" option:
+
+::
+
+  USAGE: compiler [options] <input file>
+
+  OPTIONS:
+    Choose optimization level:
+      -g          - No optimizations, enable debugging
+      -O1         - Enable trivial optimizations
+      -O2         - Enable default optimizations
+      -O3         - Enable expensive optimizations
+    -debug_level  - Set the debugging level:
+      =none       - disable debug information
+      =quick      - enable quick debug information
+      =detailed   - enable detailed debug information
+    -f            - Enable binary output on terminals
+    -help         - display available options (-help-hidden for more)
+    -o <filename> - Specify output filename
+    -quiet        - Don't print informational messages
+
+Again, the only structural difference between the debug level declaration and
+the optimization level declaration is that the debug level declaration includes
+an option name (``"debug_level"``), which automatically changes how the library
+processes the argument.  The CommandLine library supports both forms so that you
+can choose the form most appropriate for your application.
+
+.. _lists:
+
+Parsing a list of options
+-------------------------
+
+Now that we have the standard run-of-the-mill argument types out of the way,
+lets get a little wild and crazy.  Lets say that we want our optimizer to accept
+a **list** of optimizations to perform, allowing duplicates.  For example, we
+might want to run: "``compiler -dce -constprop -inline -dce -strip``".  In this
+case, the order of the arguments and the number of appearances is very
+important.  This is what the "``cl::list``" template is for.  First, start by
+defining an enum of the optimizations that you would like to perform:
+
+.. code-block:: c++
+
+  enum Opts {
+    // 'inline' is a C++ keyword, so name it 'inlining'
+    dce, constprop, inlining, strip
+  };
+
+Then define your "``cl::list``" variable:
+
+.. code-block:: c++
+
+  cl::list<Opts> OptimizationList(cl::desc("Available Optimizations:"),
+    cl::values(
+      clEnumVal(dce               , "Dead Code Elimination"),
+      clEnumVal(constprop         , "Constant Propagation"),
+     clEnumValN(inlining, "inline", "Procedure Integration"),
+      clEnumVal(strip             , "Strip Symbols")));
+
+This defines a variable that is conceptually of the type
+"``std::vector<enum Opts>``".  Thus, you can access it with standard vector
+methods:
+
+.. code-block:: c++
+
+  for (unsigned i = 0; i != OptimizationList.size(); ++i)
+    switch (OptimizationList[i])
+       ...
+
+... to iterate through the list of options specified.
+
+Note that the "``cl::list``" template is completely general and may be used with
+any data types or other arguments that you can use with the "``cl::opt``"
+template.  One especially useful way to use a list is to capture all of the
+positional arguments together if there may be more than one specified.  In the
+case of a linker, for example, the linker takes several '``.o``' files, and
+needs to capture them into a list.  This is naturally specified as:
+
+.. code-block:: c++
+
+  ...
+  cl::list<std::string> InputFilenames(cl::Positional, cl::desc("<Input files>"), cl::OneOrMore);
+  ...
+
+This variable works just like a "``vector<string>``" object.  As such, accessing
+the list is simple, just like above.  In this example, we used the
+`cl::OneOrMore`_ modifier to inform the CommandLine library that it is an error
+if the user does not specify any ``.o`` files on our command line.  Again, this
+just reduces the amount of checking we have to do.
+
+Collecting options as a set of flags
+------------------------------------
+
+Instead of collecting sets of options in a list, it is also possible to gather
+information for enum values in a **bit vector**.  The representation used by the
+`cl::bits`_ class is an ``unsigned`` integer.  An enum value is represented by a
+0/1 in the enum's ordinal value bit position. 1 indicating that the enum was
+specified, 0 otherwise.  As each specified value is parsed, the resulting enum's
+bit is set in the option's bit vector:
+
+.. code-block:: c++
+
+  bits |= 1 << (unsigned)enum;
+
+Options that are specified multiple times are redundant.  Any instances after
+the first are discarded.
+
+Reworking the above list example, we could replace `cl::list`_ with `cl::bits`_:
+
+.. code-block:: c++
+
+  cl::bits<Opts> OptimizationBits(cl::desc("Available Optimizations:"),
+    cl::values(
+      clEnumVal(dce               , "Dead Code Elimination"),
+      clEnumVal(constprop         , "Constant Propagation"),
+     clEnumValN(inlining, "inline", "Procedure Integration"),
+      clEnumVal(strip             , "Strip Symbols")));
+
+To test to see if ``constprop`` was specified, we can use the ``cl:bits::isSet``
+function:
+
+.. code-block:: c++
+
+  if (OptimizationBits.isSet(constprop)) {
+    ...
+  }
+
+It's also possible to get the raw bit vector using the ``cl::bits::getBits``
+function:
+
+.. code-block:: c++
+
+  unsigned bits = OptimizationBits.getBits();
+
+Finally, if external storage is used, then the location specified must be of
+**type** ``unsigned``. In all other ways a `cl::bits`_ option is equivalent to a
+`cl::list`_ option.
+
+.. _additional extra text:
+
+Adding freeform text to help output
+-----------------------------------
+
+As our program grows and becomes more mature, we may decide to put summary
+information about what it does into the help output.  The help output is styled
+to look similar to a Unix ``man`` page, providing concise information about a
+program.  Unix ``man`` pages, however often have a description about what the
+program does.  To add this to your CommandLine program, simply pass a third
+argument to the `cl::ParseCommandLineOptions`_ call in main.  This additional
+argument is then printed as the overview information for your program, allowing
+you to include any additional information that you want.  For example:
+
+.. code-block:: c++
+
+  int main(int argc, char **argv) {
+    cl::ParseCommandLineOptions(argc, argv, " CommandLine compiler example\n\n"
+                                "  This program blah blah blah...\n");
+    ...
+  }
+
+would yield the help output:
+
+::
+
+  **OVERVIEW: CommandLine compiler example
+
+    This program blah blah blah...**
+
+  USAGE: compiler [options] <input file>
+
+  OPTIONS:
+    ...
+    -help             - display available options (-help-hidden for more)
+    -o <filename>     - Specify output filename
+
+.. _grouping options into categories:
+
+Grouping options into categories
+--------------------------------
+
+If our program has a large number of options it may become difficult for users
+of our tool to navigate the output of ``-help``. To alleviate this problem we
+can put our options into categories. This can be done by declaring option
+categories (`cl::OptionCategory`_ objects) and then placing our options into
+these categories using the `cl::cat`_ option attribute. For example:
+
+.. code-block:: c++
+
+  cl::OptionCategory StageSelectionCat("Stage Selection Options",
+                                       "These control which stages are run.");
+
+  cl::opt<bool> Preprocessor("E",cl::desc("Run preprocessor stage."),
+                             cl::cat(StageSelectionCat));
+
+  cl::opt<bool> NoLink("c",cl::desc("Run all stages except linking."),
+                       cl::cat(StageSelectionCat));
+
+The output of ``-help`` will become categorized if an option category is
+declared. The output looks something like ::
+
+  OVERVIEW: This is a small program to demo the LLVM CommandLine API
+  USAGE: Sample [options]
+
+  OPTIONS:
+
+    General options:
+
+      -help              - Display available options (-help-hidden for more)
+      -help-list         - Display list of available options (-help-list-hidden for more)
+
+
+    Stage Selection Options:
+    These control which stages are run.
+
+      -E                 - Run preprocessor stage.
+      -c                 - Run all stages except linking.
+
+In addition to the behaviour of ``-help`` changing when an option category is
+declared, the command line option ``-help-list`` becomes visible which will
+print the command line options as uncategorized list.
+
+Note that Options that are not explicitly categorized will be placed in the
+``cl::GeneralCategory`` category.
+
+.. _Reference Guide:
+
+Reference Guide
+===============
+
+Now that you know the basics of how to use the CommandLine library, this section
+will give you the detailed information you need to tune how command line options
+work, as well as information on more "advanced" command line option processing
+capabilities.
+
+.. _positional:
+.. _positional argument:
+.. _Positional Arguments:
+.. _Positional arguments section:
+.. _positional options:
+
+Positional Arguments
+--------------------
+
+Positional arguments are those arguments that are not named, and are not
+specified with a hyphen.  Positional arguments should be used when an option is
+specified by its position alone.  For example, the standard Unix ``grep`` tool
+takes a regular expression argument, and an optional filename to search through
+(which defaults to standard input if a filename is not specified).  Using the
+CommandLine library, this would be specified as:
+
+.. code-block:: c++
+
+  cl::opt<string> Regex   (cl::Positional, cl::desc("<regular expression>"), cl::Required);
+  cl::opt<string> Filename(cl::Positional, cl::desc("<input file>"), cl::init("-"));
+
+Given these two option declarations, the ``-help`` output for our grep
+replacement would look like this:
+
+::
+
+  USAGE: spiffygrep [options] <regular expression> <input file>
+
+  OPTIONS:
+    -help - display available options (-help-hidden for more)
+
+... and the resultant program could be used just like the standard ``grep``
+tool.
+
+Positional arguments are sorted by their order of construction.  This means that
+command line options will be ordered according to how they are listed in a .cpp
+file, but will not have an ordering defined if the positional arguments are
+defined in multiple .cpp files.  The fix for this problem is simply to define
+all of your positional arguments in one .cpp file.
+
+Specifying positional options with hyphens
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Sometimes you may want to specify a value to your positional argument that
+starts with a hyphen (for example, searching for '``-foo``' in a file).  At
+first, you will have trouble doing this, because it will try to find an argument
+named '``-foo``', and will fail (and single quotes will not save you).  Note
+that the system ``grep`` has the same problem:
+
+::
+
+  $ spiffygrep '-foo' test.txt
+  Unknown command line argument '-foo'.  Try: spiffygrep -help'
+
+  $ grep '-foo' test.txt
+  grep: illegal option -- f
+  grep: illegal option -- o
+  grep: illegal option -- o
+  Usage: grep -hblcnsviw pattern file . . .
+
+The solution for this problem is the same for both your tool and the system
+version: use the '``--``' marker.  When the user specifies '``--``' on the
+command line, it is telling the program that all options after the '``--``'
+should be treated as positional arguments, not options.  Thus, we can use it
+like this:
+
+::
+
+  $ spiffygrep -- -foo test.txt
+    ...output...
+
+Determining absolute position with getPosition()
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Sometimes an option can affect or modify the meaning of another option. For
+example, consider ``gcc``'s ``-x LANG`` option. This tells ``gcc`` to ignore the
+suffix of subsequent positional arguments and force the file to be interpreted
+as if it contained source code in language ``LANG``. In order to handle this
+properly, you need to know the absolute position of each argument, especially
+those in lists, so their interaction(s) can be applied correctly. This is also
+useful for options like ``-llibname`` which is actually a positional argument
+that starts with a dash.
+
+So, generally, the problem is that you have two ``cl::list`` variables that
+interact in some way. To ensure the correct interaction, you can use the
+``cl::list::getPosition(optnum)`` method. This method returns the absolute
+position (as found on the command line) of the ``optnum`` item in the
+``cl::list``.
+
+The idiom for usage is like this:
+
+.. code-block:: c++
+
+  static cl::list<std::string> Files(cl::Positional, cl::OneOrMore);
+  static cl::list<std::string> Libraries("l", cl::ZeroOrMore);
+
+  int main(int argc, char**argv) {
+    // ...
+    std::vector<std::string>::iterator fileIt = Files.begin();
+    std::vector<std::string>::iterator libIt  = Libraries.begin();
+    unsigned libPos = 0, filePos = 0;
+    while ( 1 ) {
+      if ( libIt != Libraries.end() )
+        libPos = Libraries.getPosition( libIt - Libraries.begin() );
+      else
+        libPos = 0;
+      if ( fileIt != Files.end() )
+        filePos = Files.getPosition( fileIt - Files.begin() );
+      else
+        filePos = 0;
+
+      if ( filePos != 0 && (libPos == 0 || filePos < libPos) ) {
+        // Source File Is next
+        ++fileIt;
+      }
+      else if ( libPos != 0 && (filePos == 0 || libPos < filePos) ) {
+        // Library is next
+        ++libIt;
+      }
+      else
+        break; // we're done with the list
+    }
+  }
+
+Note that, for compatibility reasons, the ``cl::opt`` also supports an
+``unsigned getPosition()`` option that will provide the absolute position of
+that option. You can apply the same approach as above with a ``cl::opt`` and a
+``cl::list`` option as you can with two lists.
+
+.. _interpreter style options:
+.. _cl::ConsumeAfter:
+.. _this section for more information:
+
+The ``cl::ConsumeAfter`` modifier
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::ConsumeAfter`` `formatting option`_ is used to construct programs that
+use "interpreter style" option processing.  With this style of option
+processing, all arguments specified after the last positional argument are
+treated as special interpreter arguments that are not interpreted by the command
+line argument.
+
+As a concrete example, lets say we are developing a replacement for the standard
+Unix Bourne shell (``/bin/sh``).  To run ``/bin/sh``, first you specify options
+to the shell itself (like ``-x`` which turns on trace output), then you specify
+the name of the script to run, then you specify arguments to the script.  These
+arguments to the script are parsed by the Bourne shell command line option
+processor, but are not interpreted as options to the shell itself.  Using the
+CommandLine library, we would specify this as:
+
+.. code-block:: c++
+
+  cl::opt<string> Script(cl::Positional, cl::desc("<input script>"), cl::init("-"));
+  cl::list<string>  Argv(cl::ConsumeAfter, cl::desc("<program arguments>..."));
+  cl::opt<bool>    Trace("x", cl::desc("Enable trace output"));
+
+which automatically provides the help output:
+
+::
+
+  USAGE: spiffysh [options] <input script> <program arguments>...
+
+  OPTIONS:
+    -help - display available options (-help-hidden for more)
+    -x    - Enable trace output
+
+At runtime, if we run our new shell replacement as ```spiffysh -x test.sh -a -x
+-y bar``', the ``Trace`` variable will be set to true, the ``Script`` variable
+will be set to "``test.sh``", and the ``Argv`` list will contain ``["-a", "-x",
+"-y", "bar"]``, because they were specified after the last positional argument
+(which is the script name).
+
+There are several limitations to when ``cl::ConsumeAfter`` options can be
+specified.  For example, only one ``cl::ConsumeAfter`` can be specified per
+program, there must be at least one `positional argument`_ specified, there must
+not be any `cl::list`_ positional arguments, and the ``cl::ConsumeAfter`` option
+should be a `cl::list`_ option.
+
+.. _can be changed:
+.. _Internal vs External Storage:
+
+Internal vs External Storage
+----------------------------
+
+By default, all command line options automatically hold the value that they
+parse from the command line.  This is very convenient in the common case,
+especially when combined with the ability to define command line options in the
+files that use them.  This is called the internal storage model.
+
+Sometimes, however, it is nice to separate the command line option processing
+code from the storage of the value parsed.  For example, lets say that we have a
+'``-debug``' option that we would like to use to enable debug information across
+the entire body of our program.  In this case, the boolean value controlling the
+debug code should be globally accessible (in a header file, for example) yet the
+command line option processing code should not be exposed to all of these
+clients (requiring lots of .cpp files to ``#include CommandLine.h``).
+
+To do this, set up your .h file with your option, like this for example:
+
+.. code-block:: c++
+
+  // DebugFlag.h - Get access to the '-debug' command line option
+  //
+
+  // DebugFlag - This boolean is set to true if the '-debug' command line option
+  // is specified.  This should probably not be referenced directly, instead, use
+  // the DEBUG macro below.
+  //
+  extern bool DebugFlag;
+
+  // DEBUG macro - This macro should be used by code to emit debug information.
+  // In the '-debug' option is specified on the command line, and if this is a
+  // debug build, then the code specified as the option to the macro will be
+  // executed.  Otherwise it will not be.
+  #ifdef NDEBUG
+  #define DEBUG(X)
+  #else
+  #define DEBUG(X) do { if (DebugFlag) { X; } } while (0)
+  #endif
+
+This allows clients to blissfully use the ``DEBUG()`` macro, or the
+``DebugFlag`` explicitly if they want to.  Now we just need to be able to set
+the ``DebugFlag`` boolean when the option is set.  To do this, we pass an
+additional argument to our command line argument processor, and we specify where
+to fill in with the `cl::location`_ attribute:
+
+.. code-block:: c++
+
+  bool DebugFlag;                  // the actual value
+  static cl::opt<bool, true>       // The parser
+  Debug("debug", cl::desc("Enable debug output"), cl::Hidden, cl::location(DebugFlag));
+
+In the above example, we specify "``true``" as the second argument to the
+`cl::opt`_ template, indicating that the template should not maintain a copy of
+the value itself.  In addition to this, we specify the `cl::location`_
+attribute, so that ``DebugFlag`` is automatically set.
+
+Option Attributes
+-----------------
+
+This section describes the basic attributes that you can specify on options.
+
+* The option name attribute (which is required for all options, except
+  `positional options`_) specifies what the option name is.  This option is
+  specified in simple double quotes:
+
+  .. code-block:: c++
+
+    cl::opt<bool> Quiet("quiet");
+
+.. _cl::desc(...):
+
+* The **cl::desc** attribute specifies a description for the option to be
+  shown in the ``-help`` output for the program. This attribute supports
+  multi-line descriptions with lines separated by '\n'.
+
+.. _cl::value_desc:
+
+* The **cl::value_desc** attribute specifies a string that can be used to
+  fine tune the ``-help`` output for a command line option.  Look `here`_ for an
+  example.
+
+.. _cl::init:
+
+* The **cl::init** attribute specifies an initial value for a `scalar`_
+  option.  If this attribute is not specified then the command line option value
+  defaults to the value created by the default constructor for the
+  type.
+
+  .. warning::
+
+    If you specify both **cl::init** and **cl::location** for an option, you
+    must specify **cl::location** first, so that when the command-line parser
+    sees **cl::init**, it knows where to put the initial value. (You will get an
+    error at runtime if you don't put them in the right order.)
+
+.. _cl::location:
+
+* The **cl::location** attribute where to store the value for a parsed command
+  line option if using external storage.  See the section on `Internal vs
+  External Storage`_ for more information.
+
+.. _cl::aliasopt:
+
+* The **cl::aliasopt** attribute specifies which option a `cl::alias`_ option is
+  an alias for.
+
+.. _cl::values:
+
+* The **cl::values** attribute specifies the string-to-value mapping to be used
+  by the generic parser.  It takes a list of (option, value, description)
+  triplets that specify the option name, the value mapped to, and the
+  description shown in the ``-help`` for the tool.  Because the generic parser
+  is used most frequently with enum values, two macros are often useful:
+
+  #. The **clEnumVal** macro is used as a nice simple way to specify a triplet
+     for an enum.  This macro automatically makes the option name be the same as
+     the enum name.  The first option to the macro is the enum, the second is
+     the description for the command line option.
+
+  #. The **clEnumValN** macro is used to specify macro options where the option
+     name doesn't equal the enum name.  For this macro, the first argument is
+     the enum value, the second is the flag name, and the second is the
+     description.
+
+  You will get a compile time error if you try to use cl::values with a parser
+  that does not support it.
+
+.. _cl::multi_val:
+
+* The **cl::multi_val** attribute specifies that this option takes has multiple
+  values (example: ``-sectalign segname sectname sectvalue``). This attribute
+  takes one unsigned argument - the number of values for the option. This
+  attribute is valid only on ``cl::list`` options (and will fail with compile
+  error if you try to use it with other option types). It is allowed to use all
+  of the usual modifiers on multi-valued options (besides
+  ``cl::ValueDisallowed``, obviously).
+
+.. _cl::cat:
+
+* The **cl::cat** attribute specifies the option category that the option
+  belongs to. The category should be a `cl::OptionCategory`_ object.
+
+Option Modifiers
+----------------
+
+Option modifiers are the flags and expressions that you pass into the
+constructors for `cl::opt`_ and `cl::list`_.  These modifiers give you the
+ability to tweak how options are parsed and how ``-help`` output is generated to
+fit your application well.
+
+These options fall into five main categories:
+
+#. Hiding an option from ``-help`` output
+
+#. Controlling the number of occurrences required and allowed
+
+#. Controlling whether or not a value must be specified
+
+#. Controlling other formatting options
+
+#. Miscellaneous option modifiers
+
+It is not possible to specify two options from the same category (you'll get a
+runtime error) to a single option, except for options in the miscellaneous
+category.  The CommandLine library specifies defaults for all of these settings
+that are the most useful in practice and the most common, which mean that you
+usually shouldn't have to worry about these.
+
+Hiding an option from ``-help`` output
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::NotHidden``, ``cl::Hidden``, and ``cl::ReallyHidden`` modifiers are
+used to control whether or not an option appears in the ``-help`` and
+``-help-hidden`` output for the compiled program:
+
+.. _cl::NotHidden:
+
+* The **cl::NotHidden** modifier (which is the default for `cl::opt`_ and
+  `cl::list`_ options) indicates the option is to appear in both help
+  listings.
+
+.. _cl::Hidden:
+
+* The **cl::Hidden** modifier (which is the default for `cl::alias`_ options)
+  indicates that the option should not appear in the ``-help`` output, but
+  should appear in the ``-help-hidden`` output.
+
+.. _cl::ReallyHidden:
+
+* The **cl::ReallyHidden** modifier indicates that the option should not appear
+  in any help output.
+
+Controlling the number of occurrences required and allowed
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This group of options is used to control how many time an option is allowed (or
+required) to be specified on the command line of your program.  Specifying a
+value for this setting allows the CommandLine library to do error checking for
+you.
+
+The allowed values for this option group are:
+
+.. _cl::Optional:
+
+* The **cl::Optional** modifier (which is the default for the `cl::opt`_ and
+  `cl::alias`_ classes) indicates that your program will allow either zero or
+  one occurrence of the option to be specified.
+
+.. _cl::ZeroOrMore:
+
+* The **cl::ZeroOrMore** modifier (which is the default for the `cl::list`_
+  class) indicates that your program will allow the option to be specified zero
+  or more times.
+
+.. _cl::Required:
+
+* The **cl::Required** modifier indicates that the specified option must be
+  specified exactly one time.
+
+.. _cl::OneOrMore:
+
+* The **cl::OneOrMore** modifier indicates that the option must be specified at
+  least one time.
+
+* The **cl::ConsumeAfter** modifier is described in the `Positional arguments
+  section`_.
+
+If an option is not specified, then the value of the option is equal to the
+value specified by the `cl::init`_ attribute.  If the ``cl::init`` attribute is
+not specified, the option value is initialized with the default constructor for
+the data type.
+
+If an option is specified multiple times for an option of the `cl::opt`_ class,
+only the last value will be retained.
+
+Controlling whether or not a value must be specified
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This group of options is used to control whether or not the option allows a
+value to be present.  In the case of the CommandLine library, a value is either
+specified with an equal sign (e.g. '``-index-depth=17``') or as a trailing
+string (e.g. '``-o a.out``').
+
+The allowed values for this option group are:
+
+.. _cl::ValueOptional:
+
+* The **cl::ValueOptional** modifier (which is the default for ``bool`` typed
+  options) specifies that it is acceptable to have a value, or not.  A boolean
+  argument can be enabled just by appearing on the command line, or it can have
+  an explicit '``-foo=true``'.  If an option is specified with this mode, it is
+  illegal for the value to be provided without the equal sign.  Therefore
+  '``-foo true``' is illegal.  To get this behavior, you must use
+  the `cl::ValueRequired`_ modifier.
+
+.. _cl::ValueRequired:
+
+* The **cl::ValueRequired** modifier (which is the default for all other types
+  except for `unnamed alternatives using the generic parser`_) specifies that a
+  value must be provided.  This mode informs the command line library that if an
+  option is not provides with an equal sign, that the next argument provided
+  must be the value.  This allows things like '``-o a.out``' to work.
+
+.. _cl::ValueDisallowed:
+
+* The **cl::ValueDisallowed** modifier (which is the default for `unnamed
+  alternatives using the generic parser`_) indicates that it is a runtime error
+  for the user to specify a value.  This can be provided to disallow users from
+  providing options to boolean options (like '``-foo=true``').
+
+In general, the default values for this option group work just like you would
+want them to.  As mentioned above, you can specify the `cl::ValueDisallowed`_
+modifier to a boolean argument to restrict your command line parser.  These
+options are mostly useful when `extending the library`_.
+
+.. _formatting option:
+
+Controlling other formatting options
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The formatting option group is used to specify that the command line option has
+special abilities and is otherwise different from other command line arguments.
+As usual, you can only specify one of these arguments at most.
+
+.. _cl::NormalFormatting:
+
+* The **cl::NormalFormatting** modifier (which is the default all options)
+  specifies that this option is "normal".
+
+.. _cl::Positional:
+
+* The **cl::Positional** modifier specifies that this is a positional argument
+  that does not have a command line option associated with it.  See the
+  `Positional Arguments`_ section for more information.
+
+* The **cl::ConsumeAfter** modifier specifies that this option is used to
+  capture "interpreter style" arguments.  See `this section for more
+  information`_.
+
+.. _prefix:
+.. _cl::Prefix:
+
+* The **cl::Prefix** modifier specifies that this option prefixes its value.
+  With 'Prefix' options, the equal sign does not separate the value from the
+  option name specified. Instead, the value is everything after the prefix,
+  including any equal sign if present. This is useful for processing odd
+  arguments like ``-lmalloc`` and ``-L/usr/lib`` in a linker tool or
+  ``-DNAME=value`` in a compiler tool.  Here, the '``l``', '``D``' and '``L``'
+  options are normal string (or list) options, that have the **cl::Prefix**
+  modifier added to allow the CommandLine library to recognize them.  Note that
+  **cl::Prefix** options must not have the **cl::ValueDisallowed** modifier
+  specified.
+
+.. _grouping:
+.. _cl::Grouping:
+
+* The **cl::Grouping** modifier is used to implement Unix-style tools (like
+  ``ls``) that have lots of single letter arguments, but only require a single
+  dash.  For example, the '``ls -labF``' command actually enables four different
+  options, all of which are single letters.  Note that **cl::Grouping** options
+  cannot have values.
+
+The CommandLine library does not restrict how you use the **cl::Prefix** or
+**cl::Grouping** modifiers, but it is possible to specify ambiguous argument
+settings.  Thus, it is possible to have multiple letter options that are prefix
+or grouping options, and they will still work as designed.
+
+To do this, the CommandLine library uses a greedy algorithm to parse the input
+option into (potentially multiple) prefix and grouping options.  The strategy
+basically looks like this:
+
+::
+
+  parse(string OrigInput) {
+
+  1. string input = OrigInput;
+  2. if (isOption(input)) return getOption(input).parse();  // Normal option
+  3. while (!isOption(input) && !input.empty()) input.pop_back();  // Remove the last letter
+  4. if (input.empty()) return error();  // No matching option
+  5. if (getOption(input).isPrefix())
+       return getOption(input).parse(input);
+  6. while (!input.empty()) {  // Must be grouping options
+       getOption(input).parse();
+       OrigInput.erase(OrigInput.begin(), OrigInput.begin()+input.length());
+       input = OrigInput;
+       while (!isOption(input) && !input.empty()) input.pop_back();
+     }
+  7. if (!OrigInput.empty()) error();
+
+  }
+
+Miscellaneous option modifiers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The miscellaneous option modifiers are the only flags where you can specify more
+than one flag from the set: they are not mutually exclusive.  These flags
+specify boolean properties that modify the option.
+
+.. _cl::CommaSeparated:
+
+* The **cl::CommaSeparated** modifier indicates that any commas specified for an
+  option's value should be used to split the value up into multiple values for
+  the option.  For example, these two options are equivalent when
+  ``cl::CommaSeparated`` is specified: "``-foo=a -foo=b -foo=c``" and
+  "``-foo=a,b,c``".  This option only makes sense to be used in a case where the
+  option is allowed to accept one or more values (i.e. it is a `cl::list`_
+  option).
+
+.. _cl::PositionalEatsArgs:
+
+* The **cl::PositionalEatsArgs** modifier (which only applies to positional
+  arguments, and only makes sense for lists) indicates that positional argument
+  should consume any strings after it (including strings that start with a "-")
+  up until another recognized positional argument.  For example, if you have two
+  "eating" positional arguments, "``pos1``" and "``pos2``", the string "``-pos1
+  -foo -bar baz -pos2 -bork``" would cause the "``-foo -bar -baz``" strings to
+  be applied to the "``-pos1``" option and the "``-bork``" string to be applied
+  to the "``-pos2``" option.
+
+.. _cl::Sink:
+
+* The **cl::Sink** modifier is used to handle unknown options. If there is at
+  least one option with ``cl::Sink`` modifier specified, the parser passes
+  unrecognized option strings to it as values instead of signaling an error. As
+  with ``cl::CommaSeparated``, this modifier only makes sense with a `cl::list`_
+  option.
+
+So far, these are the only three miscellaneous option modifiers.
+
+.. _response files:
+
+Response files
+^^^^^^^^^^^^^^
+
+Some systems, such as certain variants of Microsoft Windows and some older
+Unices have a relatively low limit on command-line length. It is therefore
+customary to use the so-called 'response files' to circumvent this
+restriction. These files are mentioned on the command-line (using the "@file")
+syntax. The program reads these files and inserts the contents into argv,
+thereby working around the command-line length limits. Response files are
+enabled by an optional fourth argument to `cl::ParseEnvironmentOptions`_ and
+`cl::ParseCommandLineOptions`_.
+
+Top-Level Classes and Functions
+-------------------------------
+
+Despite all of the built-in flexibility, the CommandLine option library really
+only consists of one function `cl::ParseCommandLineOptions`_) and three main
+classes: `cl::opt`_, `cl::list`_, and `cl::alias`_.  This section describes
+these three classes in detail.
+
+.. _cl::getRegisteredOptions:
+
+The ``cl::getRegisteredOptions`` function
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::getRegisteredOptions`` function is designed to give a programmer
+access to declared non-positional command line options so that how they appear
+in ``-help`` can be modified prior to calling `cl::ParseCommandLineOptions`_.
+Note this method should not be called during any static initialisation because
+it cannot be guaranteed that all options will have been initialised. Hence it
+should be called from ``main``.
+
+This function can be used to gain access to options declared in libraries that
+the tool writter may not have direct access to.
+
+The function retrieves a :ref:`StringMap <dss_stringmap>` that maps the option
+string (e.g. ``-help``) to an ``Option*``.
+
+Here is an example of how the function could be used:
+
+.. code-block:: c++
+
+  using namespace llvm;
+  int main(int argc, char **argv) {
+    cl::OptionCategory AnotherCategory("Some options");
+
+    StringMap<cl::Option*> &Map = cl::getRegisteredOptions();
+
+    //Unhide useful option and put it in a different category
+    assert(Map.count("print-all-options") > 0);
+    Map["print-all-options"]->setHiddenFlag(cl::NotHidden);
+    Map["print-all-options"]->setCategory(AnotherCategory);
+
+    //Hide an option we don't want to see
+    assert(Map.count("enable-no-infs-fp-math") > 0);
+    Map["enable-no-infs-fp-math"]->setHiddenFlag(cl::Hidden);
+
+    //Change --version to --show-version
+    assert(Map.count("version") > 0);
+    Map["version"]->setArgStr("show-version");
+
+    //Change --help description
+    assert(Map.count("help") > 0);
+    Map["help"]->setDescription("Shows help");
+
+    cl::ParseCommandLineOptions(argc, argv, "This is a small program to demo the LLVM CommandLine API");
+    ...
+  }
+
+
+.. _cl::ParseCommandLineOptions:
+
+The ``cl::ParseCommandLineOptions`` function
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::ParseCommandLineOptions`` function is designed to be called directly
+from ``main``, and is used to fill in the values of all of the command line
+option variables once ``argc`` and ``argv`` are available.
+
+The ``cl::ParseCommandLineOptions`` function requires two parameters (``argc``
+and ``argv``), but may also take an optional third parameter which holds
+`additional extra text`_ to emit when the ``-help`` option is invoked, and a
+fourth boolean parameter that enables `response files`_.
+
+.. _cl::ParseEnvironmentOptions:
+
+The ``cl::ParseEnvironmentOptions`` function
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::ParseEnvironmentOptions`` function has mostly the same effects as
+`cl::ParseCommandLineOptions`_, except that it is designed to take values for
+options from an environment variable, for those cases in which reading the
+command line is not convenient or desired. It fills in the values of all the
+command line option variables just like `cl::ParseCommandLineOptions`_ does.
+
+It takes four parameters: the name of the program (since ``argv`` may not be
+available, it can't just look in ``argv[0]``), the name of the environment
+variable to examine, the optional `additional extra text`_ to emit when the
+``-help`` option is invoked, and the boolean switch that controls whether
+`response files`_ should be read.
+
+``cl::ParseEnvironmentOptions`` will break the environment variable's value up
+into words and then process them using `cl::ParseCommandLineOptions`_.
+**Note:** Currently ``cl::ParseEnvironmentOptions`` does not support quoting, so
+an environment variable containing ``-option "foo bar"`` will be parsed as three
+words, ``-option``, ``"foo``, and ``bar"``, which is different from what you
+would get from the shell with the same input.
+
+The ``cl::SetVersionPrinter`` function
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::SetVersionPrinter`` function is designed to be called directly from
+``main`` and *before* ``cl::ParseCommandLineOptions``. Its use is optional. It
+simply arranges for a function to be called in response to the ``--version``
+option instead of having the ``CommandLine`` library print out the usual version
+string for LLVM. This is useful for programs that are not part of LLVM but wish
+to use the ``CommandLine`` facilities. Such programs should just define a small
+function that takes no arguments and returns ``void`` and that prints out
+whatever version information is appropriate for the program. Pass the address of
+that function to ``cl::SetVersionPrinter`` to arrange for it to be called when
+the ``--version`` option is given by the user.
+
+.. _cl::opt:
+.. _scalar:
+
+The ``cl::opt`` class
+^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::opt`` class is the class used to represent scalar command line
+options, and is the one used most of the time.  It is a templated class which
+can take up to three arguments (all except for the first have default values
+though):
+
+.. code-block:: c++
+
+  namespace cl {
+    template <class DataType, bool ExternalStorage = false,
+              class ParserClass = parser<DataType> >
+    class opt;
+  }
+
+The first template argument specifies what underlying data type the command line
+argument is, and is used to select a default parser implementation.  The second
+template argument is used to specify whether the option should contain the
+storage for the option (the default) or whether external storage should be used
+to contain the value parsed for the option (see `Internal vs External Storage`_
+for more information).
+
+The third template argument specifies which parser to use.  The default value
+selects an instantiation of the ``parser`` class based on the underlying data
+type of the option.  In general, this default works well for most applications,
+so this option is only used when using a `custom parser`_.
+
+.. _lists of arguments:
+.. _cl::list:
+
+The ``cl::list`` class
+^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::list`` class is the class used to represent a list of command line
+options.  It too is a templated class which can take up to three arguments:
+
+.. code-block:: c++
+
+  namespace cl {
+    template <class DataType, class Storage = bool,
+              class ParserClass = parser<DataType> >
+    class list;
+  }
+
+This class works the exact same as the `cl::opt`_ class, except that the second
+argument is the **type** of the external storage, not a boolean value.  For this
+class, the marker type '``bool``' is used to indicate that internal storage
+should be used.
+
+.. _cl::bits:
+
+The ``cl::bits`` class
+^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::bits`` class is the class used to represent a list of command line
+options in the form of a bit vector.  It is also a templated class which can
+take up to three arguments:
+
+.. code-block:: c++
+
+  namespace cl {
+    template <class DataType, class Storage = bool,
+              class ParserClass = parser<DataType> >
+    class bits;
+  }
+
+This class works the exact same as the `cl::list`_ class, except that the second
+argument must be of **type** ``unsigned`` if external storage is used.
+
+.. _cl::alias:
+
+The ``cl::alias`` class
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::alias`` class is a nontemplated class that is used to form aliases for
+other arguments.
+
+.. code-block:: c++
+
+  namespace cl {
+    class alias;
+  }
+
+The `cl::aliasopt`_ attribute should be used to specify which option this is an
+alias for.  Alias arguments default to being `cl::Hidden`_, and use the aliased
+options parser to do the conversion from string to data.
+
+.. _cl::extrahelp:
+
+The ``cl::extrahelp`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::extrahelp`` class is a nontemplated class that allows extra help text
+to be printed out for the ``-help`` option.
+
+.. code-block:: c++
+
+  namespace cl {
+    struct extrahelp;
+  }
+
+To use the extrahelp, simply construct one with a ``const char*`` parameter to
+the constructor. The text passed to the constructor will be printed at the
+bottom of the help message, verbatim. Note that multiple ``cl::extrahelp``
+**can** be used, but this practice is discouraged. If your tool needs to print
+additional help information, put all that help into a single ``cl::extrahelp``
+instance.
+
+For example:
+
+.. code-block:: c++
+
+  cl::extrahelp("\nADDITIONAL HELP:\n\n  This is the extra help\n");
+
+.. _cl::OptionCategory:
+
+The ``cl::OptionCategory`` class
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``cl::OptionCategory`` class is a simple class for declaring
+option categories.
+
+.. code-block:: c++
+
+  namespace cl {
+    class OptionCategory;
+  }
+
+An option category must have a name and optionally a description which are
+passed to the constructor as ``const char*``.
+
+Note that declaring an option category and associating it with an option before
+parsing options (e.g. statically) will change the output of ``-help`` from
+uncategorized to categorized. If an option category is declared but not
+associated with an option then it will be hidden from the output of ``-help``
+but will be shown in the output of ``-help-hidden``.
+
+.. _different parser:
+.. _discussed previously:
+
+Builtin parsers
+---------------
+
+Parsers control how the string value taken from the command line is translated
+into a typed value, suitable for use in a C++ program.  By default, the
+CommandLine library uses an instance of ``parser<type>`` if the command line
+option specifies that it uses values of type '``type``'.  Because of this,
+custom option processing is specified with specializations of the '``parser``'
+class.
+
+The CommandLine library provides the following builtin parser specializations,
+which are sufficient for most applications. It can, however, also be extended to
+work with new data types and new ways of interpreting the same data.  See the
+`Writing a Custom Parser`_ for more details on this type of library extension.
+
+.. _enums:
+.. _cl::parser:
+
+* The generic ``parser<t>`` parser can be used to map strings values to any data
+  type, through the use of the `cl::values`_ property, which specifies the
+  mapping information.  The most common use of this parser is for parsing enum
+  values, which allows you to use the CommandLine library for all of the error
+  checking to make sure that only valid enum values are specified (as opposed to
+  accepting arbitrary strings).  Despite this, however, the generic parser class
+  can be used for any data type.
+
+.. _boolean flags:
+.. _bool parser:
+
+* The **parser<bool> specialization** is used to convert boolean strings to a
+  boolean value.  Currently accepted strings are "``true``", "``TRUE``",
+  "``True``", "``1``", "``false``", "``FALSE``", "``False``", and "``0``".
+
+* The **parser<boolOrDefault> specialization** is used for cases where the value
+  is boolean, but we also need to know whether the option was specified at all.
+  boolOrDefault is an enum with 3 values, BOU_UNSET, BOU_TRUE and BOU_FALSE.
+  This parser accepts the same strings as **``parser<bool>``**.
+
+.. _strings:
+
+* The **parser<string> specialization** simply stores the parsed string into the
+  string value specified.  No conversion or modification of the data is
+  performed.
+
+.. _integers:
+.. _int:
+
+* The **parser<int> specialization** uses the C ``strtol`` function to parse the
+  string input.  As such, it will accept a decimal number (with an optional '+'
+  or '-' prefix) which must start with a non-zero digit.  It accepts octal
+  numbers, which are identified with a '``0``' prefix digit, and hexadecimal
+  numbers with a prefix of '``0x``' or '``0X``'.
+
+.. _doubles:
+.. _float:
+.. _double:
+
+* The **parser<double>** and **parser<float> specializations** use the standard
+  C ``strtod`` function to convert floating point strings into floating point
+  values.  As such, a broad range of string formats is supported, including
+  exponential notation (ex: ``1.7e15``) and properly supports locales.
+
+.. _Extension Guide:
+.. _extending the library:
+
+Extension Guide
+===============
+
+Although the CommandLine library has a lot of functionality built into it
+already (as discussed previously), one of its true strengths lie in its
+extensibility.  This section discusses how the CommandLine library works under
+the covers and illustrates how to do some simple, common, extensions.
+
+.. _Custom parsers:
+.. _custom parser:
+.. _Writing a Custom Parser:
+
+Writing a custom parser
+-----------------------
+
+One of the simplest and most common extensions is the use of a custom parser.
+As `discussed previously`_, parsers are the portion of the CommandLine library
+that turns string input from the user into a particular parsed data type,
+validating the input in the process.
+
+There are two ways to use a new parser:
+
+#. Specialize the `cl::parser`_ template for your custom data type.
+
+   This approach has the advantage that users of your custom data type will
+   automatically use your custom parser whenever they define an option with a
+   value type of your data type.  The disadvantage of this approach is that it
+   doesn't work if your fundamental data type is something that is already
+   supported.
+
+#. Write an independent class, using it explicitly from options that need it.
+
+   This approach works well in situations where you would line to parse an
+   option using special syntax for a not-very-special data-type.  The drawback
+   of this approach is that users of your parser have to be aware that they are
+   using your parser instead of the builtin ones.
+
+To guide the discussion, we will discuss a custom parser that accepts file
+sizes, specified with an optional unit after the numeric size.  For example, we
+would like to parse "102kb", "41M", "1G" into the appropriate integer value.  In
+this case, the underlying data type we want to parse into is '``unsigned``'.  We
+choose approach #2 above because we don't want to make this the default for all
+``unsigned`` options.
+
+To start out, we declare our new ``FileSizeParser`` class:
+
+.. code-block:: c++
+
+  struct FileSizeParser : public cl::parser<unsigned> {
+    // parse - Return true on error.
+    bool parse(cl::Option &O, StringRef ArgName, const std::string &ArgValue,
+               unsigned &Val);
+  };
+
+Our new class inherits from the ``cl::parser`` template class to fill in
+the default, boiler plate code for us.  We give it the data type that we parse
+into, the last argument to the ``parse`` method, so that clients of our custom
+parser know what object type to pass in to the parse method.  (Here we declare
+that we parse into '``unsigned``' variables.)
+
+For most purposes, the only method that must be implemented in a custom parser
+is the ``parse`` method.  The ``parse`` method is called whenever the option is
+invoked, passing in the option itself, the option name, the string to parse, and
+a reference to a return value.  If the string to parse is not well-formed, the
+parser should output an error message and return true.  Otherwise it should
+return false and set '``Val``' to the parsed value.  In our example, we
+implement ``parse`` as:
+
+.. code-block:: c++
+
+  bool FileSizeParser::parse(cl::Option &O, StringRef ArgName,
+                             const std::string &Arg, unsigned &Val) {
+    const char *ArgStart = Arg.c_str();
+    char *End;
+
+    // Parse integer part, leaving 'End' pointing to the first non-integer char
+    Val = (unsigned)strtol(ArgStart, &End, 0);
+
+    while (1) {
+      switch (*End++) {
+      case 0: return false;   // No error
+      case 'i':               // Ignore the 'i' in KiB if people use that
+      case 'b': case 'B':     // Ignore B suffix
+        break;
+
+      case 'g': case 'G': Val *= 1024*1024*1024; break;
+      case 'm': case 'M': Val *= 1024*1024;      break;
+      case 'k': case 'K': Val *= 1024;           break;
+
+      default:
+        // Print an error message if unrecognized character!
+        return O.error("'" + Arg + "' value invalid for file size argument!");
+      }
+    }
+  }
+
+This function implements a very simple parser for the kinds of strings we are
+interested in.  Although it has some holes (it allows "``123KKK``" for example),
+it is good enough for this example.  Note that we use the option itself to print
+out the error message (the ``error`` method always returns true) in order to get
+a nice error message (shown below).  Now that we have our parser class, we can
+use it like this:
+
+.. code-block:: c++
+
+  static cl::opt<unsigned, false, FileSizeParser>
+  MFS("max-file-size", cl::desc("Maximum file size to accept"),
+      cl::value_desc("size"));
+
+Which adds this to the output of our program:
+
+::
+
+  OPTIONS:
+    -help                 - display available options (-help-hidden for more)
+    ...
+    -max-file-size=<size> - Maximum file size to accept
+
+And we can test that our parse works correctly now (the test program just prints
+out the max-file-size argument value):
+
+::
+
+  $ ./test
+  MFS: 0
+  $ ./test -max-file-size=123MB
+  MFS: 128974848
+  $ ./test -max-file-size=3G
+  MFS: 3221225472
+  $ ./test -max-file-size=dog
+  -max-file-size option: 'dog' value invalid for file size argument!
+
+It looks like it works.  The error message that we get is nice and helpful, and
+we seem to accept reasonable file sizes.  This wraps up the "custom parser"
+tutorial.
+
+Exploiting external storage
+---------------------------
+
+Several of the LLVM libraries define static ``cl::opt`` instances that will
+automatically be included in any program that links with that library.  This is
+a feature. However, sometimes it is necessary to know the value of the command
+line option outside of the library. In these cases the library does or should
+provide an external storage location that is accessible to users of the
+library. Examples of this include the ``llvm::DebugFlag`` exported by the
+``lib/Support/Debug.cpp`` file and the ``llvm::TimePassesIsEnabled`` flag
+exported by the ``lib/VMCore/PassManager.cpp`` file.
+
+.. todo::
+
+  TODO: complete this section
+
+.. _dynamically loaded options:
+
+Dynamically adding command line options
+---------------------------------------
+
+.. todo::
+
+  TODO: fill in this section

Added: www-releases/trunk/4.0.1/docs/_sources/CompileCudaWithLLVM.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CompileCudaWithLLVM.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CompileCudaWithLLVM.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CompileCudaWithLLVM.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,559 @@
+=========================
+Compiling CUDA with clang
+=========================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+This document describes how to compile CUDA code with clang, and gives some
+details about LLVM and clang's CUDA implementations.
+
+This document assumes a basic familiarity with CUDA. Information about CUDA
+programming can be found in the
+`CUDA programming guide
+<http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html>`_.
+
+Compiling CUDA Code
+===================
+
+Prerequisites
+-------------
+
+CUDA is supported in llvm 3.9, but it's still in active development, so we
+recommend you `compile clang/LLVM from HEAD
+<http://llvm.org/docs/GettingStarted.html>`_.
+
+Before you build CUDA code, you'll need to have installed the appropriate
+driver for your nvidia GPU and the CUDA SDK.  See `NVIDIA's CUDA installation
+guide <https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html>`_
+for details.  Note that clang `does not support
+<https://llvm.org/bugs/show_bug.cgi?id=26966>`_ the CUDA toolkit as installed
+by many Linux package managers; you probably need to install nvidia's package.
+
+You will need CUDA 7.0, 7.5, or 8.0 to compile with clang.
+
+CUDA compilation is supported on Linux, on MacOS as of 2016-11-18, and on
+Windows as of 2017-01-05.
+
+Invoking clang
+--------------
+
+Invoking clang for CUDA compilation works similarly to compiling regular C++.
+You just need to be aware of a few additional flags.
+
+You can use `this <https://gist.github.com/855e277884eb6b388cd2f00d956c2fd4>`_
+program as a toy example.  Save it as ``axpy.cu``.  (Clang detects that you're
+compiling CUDA code by noticing that your filename ends with ``.cu``.
+Alternatively, you can pass ``-x cuda``.)
+
+To build and run, run the following commands, filling in the parts in angle
+brackets as described below:
+
+.. code-block:: console
+
+  $ clang++ axpy.cu -o axpy --cuda-gpu-arch=<GPU arch> \
+      -L<CUDA install path>/<lib64 or lib>             \
+      -lcudart_static -ldl -lrt -pthread
+  $ ./axpy
+  y[0] = 2
+  y[1] = 4
+  y[2] = 6
+  y[3] = 8
+
+On MacOS, replace `-lcudart_static` with `-lcudart`; otherwise, you may get
+"CUDA driver version is insufficient for CUDA runtime version" errors when you
+run your program.
+
+* ``<CUDA install path>`` -- the directory where you installed CUDA SDK.
+  Typically, ``/usr/local/cuda``.
+
+  Pass e.g. ``-L/usr/local/cuda/lib64`` if compiling in 64-bit mode; otherwise,
+  pass e.g. ``-L/usr/local/cuda/lib``.  (In CUDA, the device code and host code
+  always have the same pointer widths, so if you're compiling 64-bit code for
+  the host, you're also compiling 64-bit code for the device.)
+
+* ``<GPU arch>`` -- the `compute capability
+  <https://developer.nvidia.com/cuda-gpus>`_ of your GPU. For example, if you
+  want to run your program on a GPU with compute capability of 3.5, specify
+  ``--cuda-gpu-arch=sm_35``.
+
+  Note: You cannot pass ``compute_XX`` as an argument to ``--cuda-gpu-arch``;
+  only ``sm_XX`` is currently supported.  However, clang always includes PTX in
+  its binaries, so e.g. a binary compiled with ``--cuda-gpu-arch=sm_30`` would be
+  forwards-compatible with e.g. ``sm_35`` GPUs.
+
+  You can pass ``--cuda-gpu-arch`` multiple times to compile for multiple archs.
+
+The `-L` and `-l` flags only need to be passed when linking.  When compiling,
+you may also need to pass ``--cuda-path=/path/to/cuda`` if you didn't install
+the CUDA SDK into ``/usr/local/cuda``, ``/usr/local/cuda-7.0``, or
+``/usr/local/cuda-7.5``.
+
+Flags that control numerical code
+---------------------------------
+
+If you're using GPUs, you probably care about making numerical code run fast.
+GPU hardware allows for more control over numerical operations than most CPUs,
+but this results in more compiler options for you to juggle.
+
+Flags you may wish to tweak include:
+
+* ``-ffp-contract={on,off,fast}`` (defaults to ``fast`` on host and device when
+  compiling CUDA) Controls whether the compiler emits fused multiply-add
+  operations.
+
+  * ``off``: never emit fma operations, and prevent ptxas from fusing multiply
+    and add instructions.
+  * ``on``: fuse multiplies and adds within a single statement, but never
+    across statements (C11 semantics).  Prevent ptxas from fusing other
+    multiplies and adds.
+  * ``fast``: fuse multiplies and adds wherever profitable, even across
+    statements.  Doesn't prevent ptxas from fusing additional multiplies and
+    adds.
+
+  Fused multiply-add instructions can be much faster than the unfused
+  equivalents, but because the intermediate result in an fma is not rounded,
+  this flag can affect numerical code.
+
+* ``-fcuda-flush-denormals-to-zero`` (default: off) When this is enabled,
+  floating point operations may flush `denormal
+  <https://en.wikipedia.org/wiki/Denormal_number>`_ inputs and/or outputs to 0.
+  Operations on denormal numbers are often much slower than the same operations
+  on normal numbers.
+
+* ``-fcuda-approx-transcendentals`` (default: off) When this is enabled, the
+  compiler may emit calls to faster, approximate versions of transcendental
+  functions, instead of using the slower, fully IEEE-compliant versions.  For
+  example, this flag allows clang to emit the ptx ``sin.approx.f32``
+  instruction.
+
+  This is implied by ``-ffast-math``.
+
+Standard library support
+========================
+
+In clang and nvcc, most of the C++ standard library is not supported on the
+device side.
+
+``<math.h>`` and ``<cmath>``
+----------------------------
+
+In clang, ``math.h`` and ``cmath`` are available and `pass
+<https://github.com/llvm-mirror/test-suite/blob/master/External/CUDA/math_h.cu>`_
+`tests
+<https://github.com/llvm-mirror/test-suite/blob/master/External/CUDA/cmath.cu>`_
+adapted from libc++'s test suite.
+
+In nvcc ``math.h`` and ``cmath`` are mostly available.  Versions of ``::foof``
+in namespace std (e.g. ``std::sinf``) are not available, and where the standard
+calls for overloads that take integral arguments, these are usually not
+available.
+
+.. code-block:: c++
+
+  #include <math.h>
+  #include <cmath.h>
+
+  // clang is OK with everything in this function.
+  __device__ void test() {
+    std::sin(0.); // nvcc - ok
+    std::sin(0);  // nvcc - error, because no std::sin(int) override is available.
+    sin(0);       // nvcc - same as above.
+
+    sinf(0.);       // nvcc - ok
+    std::sinf(0.);  // nvcc - no such function
+  }
+
+``<std::complex>``
+------------------
+
+nvcc does not officially support ``std::complex``.  It's an error to use
+``std::complex`` in ``__device__`` code, but it often works in ``__host__
+__device__`` code due to nvcc's interpretation of the "wrong-side rule" (see
+below).  However, we have heard from implementers that it's possible to get
+into situations where nvcc will omit a call to an ``std::complex`` function,
+especially when compiling without optimizations.
+
+As of 2016-11-16, clang supports ``std::complex`` without these caveats.  It is
+tested with libstdc++ 4.8.5 and newer, but is known to work only with libc++
+newer than 2016-11-16.
+
+``<algorithm>``
+---------------
+
+In C++14, many useful functions from ``<algorithm>`` (notably, ``std::min`` and
+``std::max``) become constexpr.  You can therefore use these in device code,
+when compiling with clang.
+
+Detecting clang vs NVCC from code
+=================================
+
+Although clang's CUDA implementation is largely compatible with NVCC's, you may
+still want to detect when you're compiling CUDA code specifically with clang.
+
+This is tricky, because NVCC may invoke clang as part of its own compilation
+process!  For example, NVCC uses the host compiler's preprocessor when
+compiling for device code, and that host compiler may in fact be clang.
+
+When clang is actually compiling CUDA code -- rather than being used as a
+subtool of NVCC's -- it defines the ``__CUDA__`` macro.  ``__CUDA_ARCH__`` is
+defined only in device mode (but will be defined if NVCC is using clang as a
+preprocessor).  So you can use the following incantations to detect clang CUDA
+compilation, in host and device modes:
+
+.. code-block:: c++
+
+  #if defined(__clang__) && defined(__CUDA__) && !defined(__CUDA_ARCH__)
+  // clang compiling CUDA code, host mode.
+  #endif
+
+  #if defined(__clang__) && defined(__CUDA__) && defined(__CUDA_ARCH__)
+  // clang compiling CUDA code, device mode.
+  #endif
+
+Both clang and nvcc define ``__CUDACC__`` during CUDA compilation.  You can
+detect NVCC specifically by looking for ``__NVCC__``.
+
+Dialect Differences Between clang and nvcc
+==========================================
+
+There is no formal CUDA spec, and clang and nvcc speak slightly different
+dialects of the language.  Below, we describe some of the differences.
+
+This section is painful; hopefully you can skip this section and live your life
+blissfully unaware.
+
+Compilation Models
+------------------
+
+Most of the differences between clang and nvcc stem from the different
+compilation models used by clang and nvcc.  nvcc uses *split compilation*,
+which works roughly as follows:
+
+ * Run a preprocessor over the input ``.cu`` file to split it into two source
+   files: ``H``, containing source code for the host, and ``D``, containing
+   source code for the device.
+
+ * For each GPU architecture ``arch`` that we're compiling for, do:
+
+   * Compile ``D`` using nvcc proper.  The result of this is a ``ptx`` file for
+     ``P_arch``.
+
+   * Optionally, invoke ``ptxas``, the PTX assembler, to generate a file,
+     ``S_arch``, containing GPU machine code (SASS) for ``arch``.
+
+ * Invoke ``fatbin`` to combine all ``P_arch`` and ``S_arch`` files into a
+   single "fat binary" file, ``F``.
+
+ * Compile ``H`` using an external host compiler (gcc, clang, or whatever you
+   like).  ``F`` is packaged up into a header file which is force-included into
+   ``H``; nvcc generates code that calls into this header to e.g. launch
+   kernels.
+
+clang uses *merged parsing*.  This is similar to split compilation, except all
+of the host and device code is present and must be semantically-correct in both
+compilation steps.
+
+  * For each GPU architecture ``arch`` that we're compiling for, do:
+
+    * Compile the input ``.cu`` file for device, using clang.  ``__host__`` code
+      is parsed and must be semantically correct, even though we're not
+      generating code for the host at this time.
+
+      The output of this step is a ``ptx`` file ``P_arch``.
+
+    * Invoke ``ptxas`` to generate a SASS file, ``S_arch``.  Note that, unlike
+      nvcc, clang always generates SASS code.
+
+  * Invoke ``fatbin`` to combine all ``P_arch`` and ``S_arch`` files into a
+    single fat binary file, ``F``.
+
+  * Compile ``H`` using clang.  ``__device__`` code is parsed and must be
+    semantically correct, even though we're not generating code for the device
+    at this time.
+
+    ``F`` is passed to this compilation, and clang includes it in a special ELF
+    section, where it can be found by tools like ``cuobjdump``.
+
+(You may ask at this point, why does clang need to parse the input file
+multiple times?  Why not parse it just once, and then use the AST to generate
+code for the host and each device architecture?
+
+Unfortunately this can't work because we have to define different macros during
+host compilation and during device compilation for each GPU architecture.)
+
+clang's approach allows it to be highly robust to C++ edge cases, as it doesn't
+need to decide at an early stage which declarations to keep and which to throw
+away.  But it has some consequences you should be aware of.
+
+Overloading Based on ``__host__`` and ``__device__`` Attributes
+---------------------------------------------------------------
+
+Let "H", "D", and "HD" stand for "``__host__`` functions", "``__device__``
+functions", and "``__host__ __device__`` functions", respectively.  Functions
+with no attributes behave the same as H.
+
+nvcc does not allow you to create H and D functions with the same signature:
+
+.. code-block:: c++
+
+  // nvcc: error - function "foo" has already been defined
+  __host__ void foo() {}
+  __device__ void foo() {}
+
+However, nvcc allows you to "overload" H and D functions with different
+signatures:
+
+.. code-block:: c++
+
+  // nvcc: no error
+  __host__ void foo(int) {}
+  __device__ void foo() {}
+
+In clang, the ``__host__`` and ``__device__`` attributes are part of a
+function's signature, and so it's legal to have H and D functions with
+(otherwise) the same signature:
+
+.. code-block:: c++
+
+  // clang: no error
+  __host__ void foo() {}
+  __device__ void foo() {}
+
+HD functions cannot be overloaded by H or D functions with the same signature:
+
+.. code-block:: c++
+
+  // nvcc: error - function "foo" has already been defined
+  // clang: error - redefinition of 'foo'
+  __host__ __device__ void foo() {}
+  __device__ void foo() {}
+
+  // nvcc: no error
+  // clang: no error
+  __host__ __device__ void bar(int) {}
+  __device__ void bar() {}
+
+When resolving an overloaded function, clang considers the host/device
+attributes of the caller and callee.  These are used as a tiebreaker during
+overload resolution.  See `IdentifyCUDAPreference
+<http://clang.llvm.org/doxygen/SemaCUDA_8cpp.html>`_ for the full set of rules,
+but at a high level they are:
+
+ * D functions prefer to call other Ds.  HDs are given lower priority.
+
+ * Similarly, H functions prefer to call other Hs, or ``__global__`` functions
+   (with equal priority).  HDs are given lower priority.
+
+ * HD functions prefer to call other HDs.
+
+   When compiling for device, HDs will call Ds with lower priority than HD, and
+   will call Hs with still lower priority.  If it's forced to call an H, the
+   program is malformed if we emit code for this HD function.  We call this the
+   "wrong-side rule", see example below.
+
+   The rules are symmetrical when compiling for host.
+
+Some examples:
+
+.. code-block:: c++
+
+   __host__ void foo();
+   __device__ void foo();
+
+   __host__ void bar();
+   __host__ __device__ void bar();
+
+   __host__ void test_host() {
+     foo();  // calls H overload
+     bar();  // calls H overload
+   }
+
+   __device__ void test_device() {
+     foo();  // calls D overload
+     bar();  // calls HD overload
+   }
+
+   __host__ __device__ void test_hd() {
+     foo();  // calls H overload when compiling for host, otherwise D overload
+     bar();  // always calls HD overload
+   }
+
+Wrong-side rule example:
+
+.. code-block:: c++
+
+  __host__ void host_only();
+
+  // We don't codegen inline functions unless they're referenced by a
+  // non-inline function.  inline_hd1() is called only from the host side, so
+  // does not generate an error.  inline_hd2() is called from the device side,
+  // so it generates an error.
+  inline __host__ __device__ void inline_hd1() { host_only(); }  // no error
+  inline __host__ __device__ void inline_hd2() { host_only(); }  // error
+
+  __host__ void host_fn() { inline_hd1(); }
+  __device__ void device_fn() { inline_hd2(); }
+
+  // This function is not inline, so it's always codegen'ed on both the host
+  // and the device.  Therefore, it generates an error.
+  __host__ __device__ void not_inline_hd() { host_only(); }
+
+For the purposes of the wrong-side rule, templated functions also behave like
+``inline`` functions: They aren't codegen'ed unless they're instantiated
+(usually as part of the process of invoking them).
+
+clang's behavior with respect to the wrong-side rule matches nvcc's, except
+nvcc only emits a warning for ``not_inline_hd``; device code is allowed to call
+``not_inline_hd``.  In its generated code, nvcc may omit ``not_inline_hd``'s
+call to ``host_only`` entirely, or it may try to generate code for
+``host_only`` on the device.  What you get seems to depend on whether or not
+the compiler chooses to inline ``host_only``.
+
+Member functions, including constructors, may be overloaded using H and D
+attributes.  However, destructors cannot be overloaded.
+
+Using a Different Class on Host/Device
+--------------------------------------
+
+Occasionally you may want to have a class with different host/device versions.
+
+If all of the class's members are the same on the host and device, you can just
+provide overloads for the class's member functions.
+
+However, if you want your class to have different members on host/device, you
+won't be able to provide working H and D overloads in both classes. In this
+case, clang is likely to be unhappy with you.
+
+.. code-block:: c++
+
+  #ifdef __CUDA_ARCH__
+  struct S {
+    __device__ void foo() { /* use device_only */ }
+    int device_only;
+  };
+  #else
+  struct S {
+    __host__ void foo() { /* use host_only */ }
+    double host_only;
+  };
+
+  __device__ void test() {
+    S s;
+    // clang generates an error here, because during host compilation, we
+    // have ifdef'ed away the __device__ overload of S::foo().  The __device__
+    // overload must be present *even during host compilation*.
+    S.foo();
+  }
+  #endif
+
+We posit that you don't really want to have classes with different members on H
+and D.  For example, if you were to pass one of these as a parameter to a
+kernel, it would have a different layout on H and D, so would not work
+properly.
+
+To make code like this compatible with clang, we recommend you separate it out
+into two classes.  If you need to write code that works on both host and
+device, consider writing an overloaded wrapper function that returns different
+types on host and device.
+
+.. code-block:: c++
+
+  struct HostS { ... };
+  struct DeviceS { ... };
+
+  __host__ HostS MakeStruct() { return HostS(); }
+  __device__ DeviceS MakeStruct() { return DeviceS(); }
+
+  // Now host and device code can call MakeStruct().
+
+Unfortunately, this idiom isn't compatible with nvcc, because it doesn't allow
+you to overload based on the H/D attributes.  Here's an idiom that works with
+both clang and nvcc:
+
+.. code-block:: c++
+
+  struct HostS { ... };
+  struct DeviceS { ... };
+
+  #ifdef __NVCC__
+    #ifndef __CUDA_ARCH__
+      __host__ HostS MakeStruct() { return HostS(); }
+    #else
+      __device__ DeviceS MakeStruct() { return DeviceS(); }
+    #endif
+  #else
+    __host__ HostS MakeStruct() { return HostS(); }
+    __device__ DeviceS MakeStruct() { return DeviceS(); }
+  #endif
+
+  // Now host and device code can call MakeStruct().
+
+Hopefully you don't have to do this sort of thing often.
+
+Optimizations
+=============
+
+Modern CPUs and GPUs are architecturally quite different, so code that's fast
+on a CPU isn't necessarily fast on a GPU.  We've made a number of changes to
+LLVM to make it generate good GPU code.  Among these changes are:
+
+* `Straight-line scalar optimizations <https://goo.gl/4Rb9As>`_ -- These
+  reduce redundancy within straight-line code.
+
+* `Aggressive speculative execution
+  <http://llvm.org/docs/doxygen/html/SpeculativeExecution_8cpp_source.html>`_
+  -- This is mainly for promoting straight-line scalar optimizations, which are
+  most effective on code along dominator paths.
+
+* `Memory space inference
+  <http://llvm.org/doxygen/NVPTXInferAddressSpaces_8cpp_source.html>`_ --
+  In PTX, we can operate on pointers that are in a paricular "address space"
+  (global, shared, constant, or local), or we can operate on pointers in the
+  "generic" address space, which can point to anything.  Operations in a
+  non-generic address space are faster, but pointers in CUDA are not explicitly
+  annotated with their address space, so it's up to LLVM to infer it where
+  possible.
+
+* `Bypassing 64-bit divides
+  <http://llvm.org/docs/doxygen/html/BypassSlowDivision_8cpp_source.html>`_ --
+  This was an existing optimization that we enabled for the PTX backend.
+
+  64-bit integer divides are much slower than 32-bit ones on NVIDIA GPUs.
+  Many of the 64-bit divides in our benchmarks have a divisor and dividend
+  which fit in 32-bits at runtime. This optimization provides a fast path for
+  this common case.
+
+* Aggressive loop unrooling and function inlining -- Loop unrolling and
+  function inlining need to be more aggressive for GPUs than for CPUs because
+  control flow transfer in GPU is more expensive. More aggressive unrolling and
+  inlining also promote other optimizations, such as constant propagation and
+  SROA, which sometimes speed up code by over 10x.
+
+  (Programmers can force unrolling and inline using clang's `loop unrolling pragmas
+  <http://clang.llvm.org/docs/AttributeReference.html#pragma-unroll-pragma-nounroll>`_
+  and ``__attribute__((always_inline))``.)
+
+Publication
+===========
+
+The team at Google published a paper in CGO 2016 detailing the optimizations
+they'd made to clang/LLVM.  Note that "gpucc" is no longer a meaningful name:
+The relevant tools are now just vanilla clang/LLVM.
+
+| `gpucc: An Open-Source GPGPU Compiler <http://dl.acm.org/citation.cfm?id=2854041>`_
+| Jingyue Wu, Artem Belevich, Eli Bendersky, Mark Heffernan, Chris Leary, Jacques Pienaar, Bjarke Roune, Rob Springer, Xuetian Weng, Robert Hundt
+| *Proceedings of the 2016 International Symposium on Code Generation and Optimization (CGO 2016)*
+|
+| `Slides from the CGO talk <http://wujingyue.com/docs/gpucc-talk.pdf>`_
+|
+| `Tutorial given at CGO <http://wujingyue.com/docs/gpucc-tutorial.pdf>`_
+
+Obtaining Help
+==============
+
+To obtain help on LLVM in general and its CUDA support, see `the LLVM
+community <http://llvm.org/docs/#mailing-lists>`_.

Added: www-releases/trunk/4.0.1/docs/_sources/CompilerWriterInfo.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CompilerWriterInfo.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CompilerWriterInfo.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CompilerWriterInfo.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,167 @@
+========================================================
+Architecture & Platform Information for Compiler Writers
+========================================================
+
+.. contents::
+   :local:
+
+.. note::
+
+  This document is a work-in-progress.  Additions and clarifications are
+  welcome.
+
+Hardware
+========
+
+AArch64 & ARM
+-------------
+
+* `ARMv8-A Architecture Reference Manual <http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.h/index.html>`_ (authentication required, free sign-up). This document covers both AArch64 and ARM instructions
+
+* `ARMv7-M Architecture Reference Manual <http://infocenter.arm.com/help/topic/com.arm.doc.ddi0403e.b/index.html>`_ (authentication required, free sign-up). This covers the Thumb2-only microcontrollers
+
+* `ARMv6-M Architecture Reference Manual <http://infocenter.arm.com/help/topic/com.arm.doc.ddi0419c/index.html>`_ (authentication required, free sign-up). This covers the Thumb1-only microcontrollers
+
+* `ARM C Language Extensions <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf>`_
+
+* AArch32 `ABI Addenda and Errata <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0045d/IHI0045D_ABI_addenda.pdf>`_
+
+Itanium (ia64)
+--------------
+
+* `Itanium documentation <http://developer.intel.com/design/itanium2/documentation.htm>`_
+
+Lanai
+-----
+
+* `Lanai Instruction Set Architecture <http://g.co/lanai/isa>`_
+
+
+MIPS
+----
+
+* `MIPS Processor Architecture <https://imgtec.com/mips/architectures/>`_
+
+* `MIPS 64-bit ELF Object File Specification <http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.pdf>`_
+
+PowerPC
+-------
+
+IBM - Official manuals and docs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* `Power Instruction Set Architecture, Versions 2.03 through 2.06 (authentication required, free sign-up) <https://www.power.org/technology-introduction/standards-specifications>`_
+
+* `PowerPC Compiler Writer's Guide <http://www.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF7785256996007558C6>`_
+
+* `Intro to PowerPC Architecture <http://www.ibm.com/developerworks/linux/library/l-powarch/>`_
+
+* `PowerPC Processor Manuals (embedded) <http://www.ibm.com/chips/techlib/techlib.nsf/products/PowerPC>`_
+
+* `Various IBM specifications and white papers <https://www.power.org/documentation/?document_company=105&document_category=all&publish_year=all&grid_order=DESC&grid_sort=title>`_
+
+* `IBM AIX/5L for POWER Assembly Reference <http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixassem/alangref/alangreftfrm.htm>`_
+
+Other documents, collections, notes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* `PowerPC ABI documents <http://penguinppc.org/dev/#library>`_
+* `PowerPC64 alignment of long doubles (from GCC) <http://gcc.gnu.org/ml/gcc-patches/2003-09/msg00997.html>`_
+* `Long branch stubs for powerpc64-linux (from binutils) <http://sources.redhat.com/ml/binutils/2002-04/msg00573.html>`_
+
+AMDGPU
+------
+
+* `AMD R6xx shader ISA <http://developer.amd.com/wordpress/media/2012/10/R600_Instruction_Set_Architecture.pdf>`_
+* `AMD R7xx shader ISA <http://developer.amd.com/wordpress/media/2012/10/R700-Family_Instruction_Set_Architecture.pdf>`_
+* `AMD Evergreen shader ISA <http://developer.amd.com/wordpress/media/2012/10/AMD_Evergreen-Family_Instruction_Set_Architecture.pdf>`_
+* `AMD Cayman/Trinity shader ISA <http://developer.amd.com/wordpress/media/2012/10/AMD_HD_6900_Series_Instruction_Set_Architecture.pdf>`_
+* `AMD Southern Islands Series ISA <http://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf>`_
+* `AMD Sea Islands Series ISA <http://developer.amd.com/wordpress/media/2013/07/AMD_Sea_Islands_Instruction_Set_Architecture.pdf>`_
+* `AMD GCN3 Instruction Set Architecture <http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf>`__
+* `AMD GPU Programming Guide <http://developer.amd.com/download/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide.pdf>`_
+* `AMD Compute Resources <http://developer.amd.com/tools/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/documentation/>`_
+* `AMDGPU Compute Application Binary Interface <https://github.com/RadeonOpenCompute/ROCm-ComputeABI-Doc/blob/master/AMDGPU-ABI.md>`__
+
+RISC-V
+------
+* `RISC-V User-Level ISA Specification <https://riscv.org/specifications/>`_
+
+SPARC
+-----
+
+* `SPARC standards <http://sparc.org/standards>`_
+* `SPARC V9 ABI <http://sparc.org/standards/64.psabi.1.35.ps.Z>`_
+* `SPARC V8 ABI <http://sparc.org/standards/psABI3rd.pdf>`_
+
+SystemZ
+-------
+
+* `z/Architecture Principles of Operation (registration required, free sign-up) <http://www-01.ibm.com/support/docview.wss?uid=isg2b9de5f05a9d57819852571c500428f9a>`_
+
+X86
+---
+
+* `AMD processor manuals <http://developer.amd.com/resources/developer-guides-manuals/>`_
+* `Intel 64 and IA-32 manuals <http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html>`_
+* `Intel Itanium documentation <http://www.intel.com/design/itanium/documentation.htm?iid=ipp_srvr_proc_itanium2+techdocs>`_
+* `X86 and X86-64 SysV psABI <https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI>`_
+* `Calling conventions for different C++ compilers and operating systems  <http://www.agner.org/optimize/calling_conventions.pdf>`_
+
+XCore
+-----
+
+* `The XMOS XS1 Architecture (ISA) <https://www.xmos.com/en/download/public/The-XMOS-XS1-Architecture%28X7879A%29.pdf>`_
+* `Tools Development Guide (includes ABI) <https://www.xmos.com/download/public/Tools-Development-Guide%28X9114A%29.pdf>`_
+
+Hexagon
+-------
+
+* `Hexagon Programmer's Reference Manuals and Hexagon ABI Specification (registration required, free sign-up) <https://developer.qualcomm.com/software/hexagon-dsp-sdk/tools>`_
+
+Other relevant lists
+--------------------
+
+* `GCC reading list <http://gcc.gnu.org/readings.html>`_
+
+ABI
+===
+
+* `System V Application Binary Interface <http://www.sco.com/developers/gabi/latest/contents.html>`_
+* `Itanium C++ ABI <http://mentorembedded.github.io/cxx-abi/>`_
+
+Linux
+-----
+
+* `Linux extensions to gabi <https://github.com/hjl-tools/linux-abi/wiki/Linux-Extensions-to-gABI>`_
+* `PowerPC 64-bit ELF ABI Supplement <http://www.linuxbase.org/spec/ELF/ppc64/>`_
+* `Procedure Call Standard for the AArch64 Architecture <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055a/IHI0055A_aapcs64.pdf>`_
+* `ELF for the ARM Architecture <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044e/IHI0044E_aaelf.pdf>`_
+* `ELF for the ARM 64-bit Architecture (AArch64) <http://infocenter.arm.com/help/topic/com.arm.doc.ihi0056a/IHI0056A_aaelf64.pdf>`_
+* `System z ELF ABI Supplement <http://legacy.redhat.com/pub/redhat/linux/7.1/es/os/s390x/doc/lzsabi0.pdf>`_
+
+OS X
+----
+
+* `Mach-O Runtime Architecture <http://developer.apple.com/documentation/Darwin/RuntimeArchitecture-date.html>`_
+* `Notes on Mach-O ABI <http://www.unsanity.org/archives/000044.php>`_
+
+Windows
+-------
+
+* `Microsoft PE/COFF Specification <http://www.microsoft.com/whdc/system/platform/firmware/pecoff.mspx>`_
+
+NVPTX
+=====
+
+* `CUDA Documentation <http://docs.nvidia.com/cuda/index.html>`_ includes the PTX
+  ISA and Driver API documentation
+
+Miscellaneous Resources
+=======================
+
+* `Executable File Format library <http://www.nondot.org/sabre/os/articles/ExecutableFileFormats/>`_
+
+* `GCC prefetch project <http://gcc.gnu.org/projects/prefetch.html>`_ page has a
+  good survey of the prefetching capabilities of a variety of modern
+  processors.

Added: www-releases/trunk/4.0.1/docs/_sources/Coroutines.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/Coroutines.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/Coroutines.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/Coroutines.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,1244 @@
+=====================================
+Coroutines in LLVM
+=====================================
+
+.. contents::
+   :local:
+   :depth: 3
+
+.. warning::
+  This is a work in progress. Compatibility across LLVM releases is not 
+  guaranteed.
+
+Introduction
+============
+
+.. _coroutine handle:
+
+LLVM coroutines are functions that have one or more `suspend points`_. 
+When a suspend point is reached, the execution of a coroutine is suspended and
+control is returned back to its caller. A suspended coroutine can be resumed 
+to continue execution from the last suspend point or it can be destroyed. 
+
+In the following example, we call function `f` (which may or may not be a 
+coroutine itself) that returns a handle to a suspended coroutine 
+(**coroutine handle**) that is used by `main` to resume the coroutine twice and
+then destroy it:
+
+.. code-block:: llvm
+
+  define i32 @main() {
+  entry:
+    %hdl = call i8* @f(i32 4)
+    call void @llvm.coro.resume(i8* %hdl)
+    call void @llvm.coro.resume(i8* %hdl)
+    call void @llvm.coro.destroy(i8* %hdl)
+    ret i32 0
+  }
+
+.. _coroutine frame:
+
+In addition to the function stack frame which exists when a coroutine is 
+executing, there is an additional region of storage that contains objects that 
+keep the coroutine state when a coroutine is suspended. This region of storage
+is called **coroutine frame**. It is created when a coroutine is called and 
+destroyed when a coroutine runs to completion or destroyed by a call to 
+the `coro.destroy`_ intrinsic. 
+
+An LLVM coroutine is represented as an LLVM function that has calls to
+`coroutine intrinsics`_ defining the structure of the coroutine.
+After lowering, a coroutine is split into several
+functions that represent three different ways of how control can enter the 
+coroutine: 
+
+1. a ramp function, which represents an initial invocation of the coroutine that
+   creates the coroutine frame and executes the coroutine code until it 
+   encounters a suspend point or reaches the end of the function;
+
+2. a coroutine resume function that is invoked when the coroutine is resumed;
+
+3. a coroutine destroy function that is invoked when the coroutine is destroyed.
+
+.. note:: Splitting out resume and destroy functions are just one of the 
+   possible ways of lowering the coroutine. We chose it for initial 
+   implementation as it matches closely the mental model and results in 
+   reasonably nice code.
+
+Coroutines by Example
+=====================
+
+Coroutine Representation
+------------------------
+
+Let's look at an example of an LLVM coroutine with the behavior sketched
+by the following pseudo-code.
+
+.. code-block:: c++
+
+  void *f(int n) {
+     for(;;) {
+       print(n++);
+       <suspend> // returns a coroutine handle on first suspend
+     }     
+  } 
+
+This coroutine calls some function `print` with value `n` as an argument and
+suspends execution. Every time this coroutine resumes, it calls `print` again with an argument one bigger than the last time. This coroutine never completes by itself and must be destroyed explicitly. If we use this coroutine with 
+a `main` shown in the previous section. It will call `print` with values 4, 5 
+and 6 after which the coroutine will be destroyed.
+
+The LLVM IR for this coroutine looks like this:
+
+.. code-block:: none
+
+  define i8* @f(i32 %n) {
+  entry:
+    %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
+    %size = call i32 @llvm.coro.size.i32()
+    %alloc = call i8* @malloc(i32 %size)
+    %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
+    br label %loop
+  loop:
+    %n.val = phi i32 [ %n, %entry ], [ %inc, %loop ]
+    %inc = add nsw i32 %n.val, 1
+    call void @print(i32 %n.val)
+    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
+    switch i8 %0, label %suspend [i8 0, label %loop
+                                  i8 1, label %cleanup]
+  cleanup:
+    %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
+    call void @free(i8* %mem)
+    br label %suspend
+  suspend:
+    call void @llvm.coro.end(i8* %hdl, i1 false)
+    ret i8* %hdl
+  }
+
+The `entry` block establishes the coroutine frame. The `coro.size`_ intrinsic is
+lowered to a constant representing the size required for the coroutine frame. 
+The `coro.begin`_ intrinsic initializes the coroutine frame and returns the 
+coroutine handle. The second parameter of `coro.begin` is given a block of memory 
+to be used if the coroutine frame needs to be allocated dynamically.
+The `coro.id`_ intrinsic serves as coroutine identity useful in cases when the
+`coro.begin`_ intrinsic get duplicated by optimization passes such as 
+jump-threading.
+
+The `cleanup` block destroys the coroutine frame. The `coro.free`_ intrinsic, 
+given the coroutine handle, returns a pointer of the memory block to be freed or
+`null` if the coroutine frame was not allocated dynamically. The `cleanup` 
+block is entered when coroutine runs to completion by itself or destroyed via
+call to the `coro.destroy`_ intrinsic.
+
+The `suspend` block contains code to be executed when coroutine runs to 
+completion or suspended. The `coro.end`_ intrinsic marks the point where 
+a coroutine needs to return control back to the caller if it is not an initial 
+invocation of the coroutine. 
+
+The `loop` blocks represents the body of the coroutine. The `coro.suspend`_ 
+intrinsic in combination with the following switch indicates what happens to 
+control flow when a coroutine is suspended (default case), resumed (case 0) or 
+destroyed (case 1).
+
+Coroutine Transformation
+------------------------
+
+One of the steps of coroutine lowering is building the coroutine frame. The
+def-use chains are analyzed to determine which objects need be kept alive across
+suspend points. In the coroutine shown in the previous section, use of virtual register 
+`%n.val` is separated from the definition by a suspend point, therefore, it 
+cannot reside on the stack frame since the latter goes away once the coroutine 
+is suspended and control is returned back to the caller. An i32 slot is 
+allocated in the coroutine frame and `%n.val` is spilled and reloaded from that
+slot as needed.
+
+We also store addresses of the resume and destroy functions so that the 
+`coro.resume` and `coro.destroy` intrinsics can resume and destroy the coroutine
+when its identity cannot be determined statically at compile time. For our 
+example, the coroutine frame will be:
+
+.. code-block:: text
+
+  %f.frame = type { void (%f.frame*)*, void (%f.frame*)*, i32 }
+
+After resume and destroy parts are outlined, function `f` will contain only the 
+code responsible for creation and initialization of the coroutine frame and 
+execution of the coroutine until a suspend point is reached:
+
+.. code-block:: none
+
+  define i8* @f(i32 %n) {
+  entry:
+    %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
+    %alloc = call noalias i8* @malloc(i32 24)
+    %0 = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
+    %frame = bitcast i8* %0 to %f.frame*
+    %1 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 0
+    store void (%f.frame*)* @f.resume, void (%f.frame*)** %1
+    %2 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 1
+    store void (%f.frame*)* @f.destroy, void (%f.frame*)** %2
+   
+    %inc = add nsw i32 %n, 1
+    %inc.spill.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 2
+    store i32 %inc, i32* %inc.spill.addr
+    call void @print(i32 %n)
+   
+    ret i8* %frame
+  }
+
+Outlined resume part of the coroutine will reside in function `f.resume`:
+
+.. code-block:: llvm
+
+  define internal fastcc void @f.resume(%f.frame* %frame.ptr.resume) {
+  entry:
+    %inc.spill.addr = getelementptr %f.frame, %f.frame* %frame.ptr.resume, i64 0, i32 2
+    %inc.spill = load i32, i32* %inc.spill.addr, align 4
+    %inc = add i32 %n.val, 1
+    store i32 %inc, i32* %inc.spill.addr, align 4
+    tail call void @print(i32 %inc)
+    ret void
+  }
+
+Whereas function `f.destroy` will contain the cleanup code for the coroutine:
+
+.. code-block:: llvm
+
+  define internal fastcc void @f.destroy(%f.frame* %frame.ptr.destroy) {
+  entry:
+    %0 = bitcast %f.frame* %frame.ptr.destroy to i8*
+    tail call void @free(i8* %0)
+    ret void
+  }
+
+Avoiding Heap Allocations
+-------------------------
+ 
+A particular coroutine usage pattern, which is illustrated by the `main` 
+function in the overview section, where a coroutine is created, manipulated and 
+destroyed by the same calling function, is common for coroutines implementing
+RAII idiom and is suitable for allocation elision optimization which avoid 
+dynamic allocation by storing the coroutine frame as a static `alloca` in its 
+caller.
+
+In the entry block, we will call `coro.alloc`_ intrinsic that will return `true`
+when dynamic allocation is required, and `false` if dynamic allocation is 
+elided.
+
+.. code-block:: none
+
+  entry:
+    %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
+    %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
+    br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
+  dyn.alloc:
+    %size = call i32 @llvm.coro.size.i32()
+    %alloc = call i8* @CustomAlloc(i32 %size)
+    br label %coro.begin
+  coro.begin:
+    %phi = phi i8* [ null, %entry ], [ %alloc, %dyn.alloc ]
+    %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %phi)
+
+In the cleanup block, we will make freeing the coroutine frame conditional on
+`coro.free`_ intrinsic. If allocation is elided, `coro.free`_ returns `null`
+thus skipping the deallocation code:
+
+.. code-block:: text
+
+  cleanup:
+    %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
+    %need.dyn.free = icmp ne i8* %mem, null
+    br i1 %need.dyn.free, label %dyn.free, label %if.end
+  dyn.free:
+    call void @CustomFree(i8* %mem)
+    br label %if.end
+  if.end:
+    ...
+
+With allocations and deallocations represented as described as above, after
+coroutine heap allocation elision optimization, the resulting main will be:
+
+.. code-block:: llvm
+
+  define i32 @main() {
+  entry:
+    call void @print(i32 4)
+    call void @print(i32 5)
+    call void @print(i32 6)
+    ret i32 0
+  }
+
+Multiple Suspend Points
+-----------------------
+
+Let's consider the coroutine that has more than one suspend point:
+
+.. code-block:: c++
+
+  void *f(int n) {
+     for(;;) {
+       print(n++);
+       <suspend>
+       print(-n);
+       <suspend>
+     }
+  }
+
+Matching LLVM code would look like (with the rest of the code remaining the same
+as the code in the previous section):
+
+.. code-block:: text
+
+  loop:
+    %n.addr = phi i32 [ %n, %entry ], [ %inc, %loop.resume ]
+    call void @print(i32 %n.addr) #4
+    %2 = call i8 @llvm.coro.suspend(token none, i1 false)
+    switch i8 %2, label %suspend [i8 0, label %loop.resume
+                                  i8 1, label %cleanup]
+  loop.resume:
+    %inc = add nsw i32 %n.addr, 1
+    %sub = xor i32 %n.addr, -1
+    call void @print(i32 %sub)
+    %3 = call i8 @llvm.coro.suspend(token none, i1 false)
+    switch i8 %3, label %suspend [i8 0, label %loop
+                                  i8 1, label %cleanup]
+
+In this case, the coroutine frame would include a suspend index that will 
+indicate at which suspend point the coroutine needs to resume. The resume 
+function will use an index to jump to an appropriate basic block and will look 
+as follows:
+
+.. code-block:: llvm
+
+  define internal fastcc void @f.Resume(%f.Frame* %FramePtr) {
+  entry.Resume:
+    %index.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i64 0, i32 2
+    %index = load i8, i8* %index.addr, align 1
+    %switch = icmp eq i8 %index, 0
+    %n.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i64 0, i32 3
+    %n = load i32, i32* %n.addr, align 4
+    br i1 %switch, label %loop.resume, label %loop
+
+  loop.resume:
+    %sub = xor i32 %n, -1
+    call void @print(i32 %sub)
+    br label %suspend
+  loop:
+    %inc = add nsw i32 %n, 1
+    store i32 %inc, i32* %n.addr, align 4
+    tail call void @print(i32 %inc)
+    br label %suspend
+
+  suspend:
+    %storemerge = phi i8 [ 0, %loop ], [ 1, %loop.resume ]
+    store i8 %storemerge, i8* %index.addr, align 1
+    ret void
+  }
+
+If different cleanup code needs to get executed for different suspend points, 
+a similar switch will be in the `f.destroy` function.
+
+.. note ::
+
+  Using suspend index in a coroutine state and having a switch in `f.resume` and
+  `f.destroy` is one of the possible implementation strategies. We explored 
+  another option where a distinct `f.resume1`, `f.resume2`, etc. are created for
+  every suspend point, and instead of storing an index, the resume and destroy 
+  function pointers are updated at every suspend. Early testing showed that the
+  current approach is easier on the optimizer than the latter so it is a 
+  lowering strategy implemented at the moment.
+
+Distinct Save and Suspend
+-------------------------
+
+In the previous example, setting a resume index (or some other state change that 
+needs to happen to prepare a coroutine for resumption) happens at the same time as
+a suspension of a coroutine. However, in certain cases, it is necessary to control 
+when coroutine is prepared for resumption and when it is suspended.
+
+In the following example, a coroutine represents some activity that is driven
+by completions of asynchronous operations `async_op1` and `async_op2` which get
+a coroutine handle as a parameter and resume the coroutine once async
+operation is finished.
+
+.. code-block:: text
+
+  void g() {
+     for (;;)
+       if (cond()) {
+          async_op1(<coroutine-handle>); // will resume once async_op1 completes
+          <suspend>
+          do_one();
+       }
+       else {
+          async_op2(<coroutine-handle>); // will resume once async_op2 completes
+          <suspend>
+          do_two();
+       }
+     }
+  }
+
+In this case, coroutine should be ready for resumption prior to a call to 
+`async_op1` and `async_op2`. The `coro.save`_ intrinsic is used to indicate a
+point when coroutine should be ready for resumption (namely, when a resume index
+should be stored in the coroutine frame, so that it can be resumed at the 
+correct resume point):
+
+.. code-block:: text
+
+  if.true:
+    %save1 = call token @llvm.coro.save(i8* %hdl)
+    call void async_op1(i8* %hdl)
+    %suspend1 = call i1 @llvm.coro.suspend(token %save1, i1 false)
+    switch i8 %suspend1, label %suspend [i8 0, label %resume1
+                                         i8 1, label %cleanup]
+  if.false:
+    %save2 = call token @llvm.coro.save(i8* %hdl)
+    call void async_op2(i8* %hdl)
+    %suspend2 = call i1 @llvm.coro.suspend(token %save2, i1 false)
+    switch i8 %suspend1, label %suspend [i8 0, label %resume2
+                                         i8 1, label %cleanup]
+
+.. _coroutine promise:
+
+Coroutine Promise
+-----------------
+
+A coroutine author or a frontend may designate a distinguished `alloca` that can
+be used to communicate with the coroutine. This distinguished alloca is called
+**coroutine promise** and is provided as the second parameter to the 
+`coro.id`_ intrinsic.
+
+The following coroutine designates a 32 bit integer `promise` and uses it to
+store the current value produced by a coroutine.
+
+.. code-block:: text
+
+  define i8* @f(i32 %n) {
+  entry:
+    %promise = alloca i32
+    %pv = bitcast i32* %promise to i8*
+    %id = call token @llvm.coro.id(i32 0, i8* %pv, i8* null, i8* null)
+    %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
+    br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
+  dyn.alloc:
+    %size = call i32 @llvm.coro.size.i32()
+    %alloc = call i8* @malloc(i32 %size)
+    br label %coro.begin
+  coro.begin:
+    %phi = phi i8* [ null, %entry ], [ %alloc, %dyn.alloc ]
+    %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %phi)
+    br label %loop
+  loop:
+    %n.val = phi i32 [ %n, %coro.begin ], [ %inc, %loop ]
+    %inc = add nsw i32 %n.val, 1
+    store i32 %n.val, i32* %promise
+    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
+    switch i8 %0, label %suspend [i8 0, label %loop
+                                  i8 1, label %cleanup]
+  cleanup:
+    %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
+    call void @free(i8* %mem)
+    br label %suspend
+  suspend:
+    call void @llvm.coro.end(i8* %hdl, i1 false)
+    ret i8* %hdl
+  }
+
+A coroutine consumer can rely on the `coro.promise`_ intrinsic to access the
+coroutine promise.
+
+.. code-block:: llvm
+
+  define i32 @main() {
+  entry:
+    %hdl = call i8* @f(i32 4)
+    %promise.addr.raw = call i8* @llvm.coro.promise(i8* %hdl, i32 4, i1 false)
+    %promise.addr = bitcast i8* %promise.addr.raw to i32*
+    %val0 = load i32, i32* %promise.addr
+    call void @print(i32 %val0)
+    call void @llvm.coro.resume(i8* %hdl)
+    %val1 = load i32, i32* %promise.addr
+    call void @print(i32 %val1)
+    call void @llvm.coro.resume(i8* %hdl)
+    %val2 = load i32, i32* %promise.addr
+    call void @print(i32 %val2)
+    call void @llvm.coro.destroy(i8* %hdl)
+    ret i32 0
+  }
+
+After example in this section is compiled, result of the compilation will be:
+
+.. code-block:: llvm
+
+  define i32 @main() {
+  entry:
+    tail call void @print(i32 4)
+    tail call void @print(i32 5)
+    tail call void @print(i32 6)
+    ret i32 0
+  }
+
+.. _final:
+.. _final suspend:
+
+Final Suspend
+-------------
+
+A coroutine author or a frontend may designate a particular suspend to be final,
+by setting the second argument of the `coro.suspend`_ intrinsic to `true`.
+Such a suspend point has two properties:
+
+* it is possible to check whether a suspended coroutine is at the final suspend
+  point via `coro.done`_ intrinsic;
+
+* a resumption of a coroutine stopped at the final suspend point leads to 
+  undefined behavior. The only possible action for a coroutine at a final
+  suspend point is destroying it via `coro.destroy`_ intrinsic.
+
+From the user perspective, the final suspend point represents an idea of a 
+coroutine reaching the end. From the compiler perspective, it is an optimization
+opportunity for reducing number of resume points (and therefore switch cases) in
+the resume function.
+
+The following is an example of a function that keeps resuming the coroutine
+until the final suspend point is reached after which point the coroutine is 
+destroyed:
+
+.. code-block:: llvm
+
+  define i32 @main() {
+  entry:
+    %hdl = call i8* @f(i32 4)
+    br label %while
+  while:
+    call void @llvm.coro.resume(i8* %hdl)
+    %done = call i1 @llvm.coro.done(i8* %hdl)
+    br i1 %done, label %end, label %while
+  end:
+    call void @llvm.coro.destroy(i8* %hdl)
+    ret i32 0
+  }
+
+Usually, final suspend point is a frontend injected suspend point that does not
+correspond to any explicitly authored suspend point of the high level language.
+For example, for a Python generator that has only one suspend point:
+
+.. code-block:: python
+
+  def coroutine(n):
+    for i in range(n):
+      yield i
+
+Python frontend would inject two more suspend points, so that the actual code
+looks like this:
+
+.. code-block:: c
+
+  void* coroutine(int n) {
+    int current_value; 
+    <designate current_value to be coroutine promise>
+    <SUSPEND> // injected suspend point, so that the coroutine starts suspended
+    for (int i = 0; i < n; ++i) {
+      current_value = i; <SUSPEND>; // corresponds to "yield i"
+    }
+    <SUSPEND final=true> // injected final suspend point
+  }
+
+and python iterator `__next__` would look like:
+
+.. code-block:: c++
+
+  int __next__(void* hdl) {
+    coro.resume(hdl);
+    if (coro.done(hdl)) throw StopIteration();
+    return *(int*)coro.promise(hdl, 4, false);
+  }
+
+Intrinsics
+==========
+
+Coroutine Manipulation Intrinsics
+---------------------------------
+
+Intrinsics described in this section are used to manipulate an existing
+coroutine. They can be used in any function which happen to have a pointer
+to a `coroutine frame`_ or a pointer to a `coroutine promise`_.
+
+.. _coro.destroy:
+
+'llvm.coro.destroy' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      declare void @llvm.coro.destroy(i8* <handle>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.destroy``' intrinsic destroys a suspended
+coroutine.
+
+Arguments:
+""""""""""
+
+The argument is a coroutine handle to a suspended coroutine.
+
+Semantics:
+""""""""""
+
+When possible, the `coro.destroy` intrinsic is replaced with a direct call to 
+the coroutine destroy function. Otherwise it is replaced with an indirect call 
+based on the function pointer for the destroy function stored in the coroutine
+frame. Destroying a coroutine that is not suspended leads to undefined behavior.
+
+.. _coro.resume:
+
+'llvm.coro.resume' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+::
+
+      declare void @llvm.coro.resume(i8* <handle>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.resume``' intrinsic resumes a suspended coroutine.
+
+Arguments:
+""""""""""
+
+The argument is a handle to a suspended coroutine.
+
+Semantics:
+""""""""""
+
+When possible, the `coro.resume` intrinsic is replaced with a direct call to the
+coroutine resume function. Otherwise it is replaced with an indirect call based 
+on the function pointer for the resume function stored in the coroutine frame. 
+Resuming a coroutine that is not suspended leads to undefined behavior.
+
+.. _coro.done:
+
+'llvm.coro.done' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+::
+
+      declare i1 @llvm.coro.done(i8* <handle>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.done``' intrinsic checks whether a suspended coroutine is at 
+the final suspend point or not.
+
+Arguments:
+""""""""""
+
+The argument is a handle to a suspended coroutine.
+
+Semantics:
+""""""""""
+
+Using this intrinsic on a coroutine that does not have a `final suspend`_ point 
+or on a coroutine that is not suspended leads to undefined behavior.
+
+.. _coro.promise:
+
+'llvm.coro.promise' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+::
+
+      declare i8* @llvm.coro.promise(i8* <ptr>, i32 <alignment>, i1 <from>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.promise``' intrinsic obtains a pointer to a 
+`coroutine promise`_ given a coroutine handle and vice versa.
+
+Arguments:
+""""""""""
+
+The first argument is a handle to a coroutine if `from` is false. Otherwise, 
+it is a pointer to a coroutine promise.
+
+The second argument is an alignment requirements of the promise. 
+If a frontend designated `%promise = alloca i32` as a promise, the alignment 
+argument to `coro.promise` should be the alignment of `i32` on the target 
+platform. If a frontend designated `%promise = alloca i32, align 16` as a 
+promise, the alignment argument should be 16.
+This argument only accepts constants.
+
+The third argument is a boolean indicating a direction of the transformation.
+If `from` is true, the intrinsic returns a coroutine handle given a pointer 
+to a promise. If `from` is false, the intrinsics return a pointer to a promise 
+from a coroutine handle. This argument only accepts constants.
+
+Semantics:
+""""""""""
+
+Using this intrinsic on a coroutine that does not have a coroutine promise
+leads to undefined behavior. It is possible to read and modify coroutine
+promise of the coroutine which is currently executing. The coroutine author and
+a coroutine user are responsible to makes sure there is no data races.
+
+Example:
+""""""""
+
+.. code-block:: text
+
+  define i8* @f(i32 %n) {
+  entry:
+    %promise = alloca i32
+    %pv = bitcast i32* %promise to i8*
+    ; the second argument to coro.id points to the coroutine promise.
+    %id = call token @llvm.coro.id(i32 0, i8* %pv, i8* null, i8* null)
+    ...
+    %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
+    ...
+    store i32 42, i32* %promise ; store something into the promise
+    ...
+    ret i8* %hdl
+  }
+
+  define i32 @main() {
+  entry:
+    %hdl = call i8* @f(i32 4) ; starts the coroutine and returns its handle
+    %promise.addr.raw = call i8* @llvm.coro.promise(i8* %hdl, i32 4, i1 false)
+    %promise.addr = bitcast i8* %promise.addr.raw to i32*    
+    %val = load i32, i32* %promise.addr ; load a value from the promise
+    call void @print(i32 %val)
+    call void @llvm.coro.destroy(i8* %hdl)
+    ret i32 0
+  }
+
+.. _coroutine intrinsics:
+
+Coroutine Structure Intrinsics
+------------------------------
+Intrinsics described in this section are used within a coroutine to describe
+the coroutine structure. They should not be used outside of a coroutine.
+
+.. _coro.size:
+
+'llvm.coro.size' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+    declare i32 @llvm.coro.size.i32()
+    declare i64 @llvm.coro.size.i64()
+
+Overview:
+"""""""""
+
+The '``llvm.coro.size``' intrinsic returns the number of bytes
+required to store a `coroutine frame`_.
+
+Arguments:
+""""""""""
+
+None
+
+Semantics:
+""""""""""
+
+The `coro.size` intrinsic is lowered to a constant representing the size of
+the coroutine frame. 
+
+.. _coro.begin:
+
+'llvm.coro.begin' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare i8* @llvm.coro.begin(token <id>, i8* <mem>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.begin``' intrinsic returns an address of the coroutine frame.
+
+Arguments:
+""""""""""
+
+The first argument is a token returned by a call to '``llvm.coro.id``' 
+identifying the coroutine.
+
+The second argument is a pointer to a block of memory where coroutine frame
+will be stored if it is allocated dynamically.
+
+Semantics:
+""""""""""
+
+Depending on the alignment requirements of the objects in the coroutine frame
+and/or on the codegen compactness reasons the pointer returned from `coro.begin` 
+may be at offset to the `%mem` argument. (This could be beneficial if 
+instructions that express relative access to data can be more compactly encoded 
+with small positive and negative offsets).
+
+A frontend should emit exactly one `coro.begin` intrinsic per coroutine.
+
+.. _coro.free:
+
+'llvm.coro.free' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare i8* @llvm.coro.free(token %id, i8* <frame>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.free``' intrinsic returns a pointer to a block of memory where 
+coroutine frame is stored or `null` if this instance of a coroutine did not use
+dynamically allocated memory for its coroutine frame.
+
+Arguments:
+""""""""""
+
+The first argument is a token returned by a call to '``llvm.coro.id``' 
+identifying the coroutine.
+
+The second argument is a pointer to the coroutine frame. This should be the same
+pointer that was returned by prior `coro.begin` call.
+
+Example (custom deallocation function):
+"""""""""""""""""""""""""""""""""""""""
+
+.. code-block:: text
+
+  cleanup:
+    %mem = call i8* @llvm.coro.free(token %id, i8* %frame)
+    %mem_not_null = icmp ne i8* %mem, null
+    br i1 %mem_not_null, label %if.then, label %if.end
+  if.then:
+    call void @CustomFree(i8* %mem)
+    br label %if.end
+  if.end:
+    ret void
+
+Example (standard deallocation functions):
+""""""""""""""""""""""""""""""""""""""""""
+
+.. code-block:: text
+
+  cleanup:
+    %mem = call i8* @llvm.coro.free(token %id, i8* %frame)
+    call void @free(i8* %mem)
+    ret void
+
+.. _coro.alloc:
+
+'llvm.coro.alloc' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare i1 @llvm.coro.alloc(token <id>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.alloc``' intrinsic returns `true` if dynamic allocation is
+required to obtain a memory for the corutine frame and `false` otherwise.
+
+Arguments:
+""""""""""
+
+The first argument is a token returned by a call to '``llvm.coro.id``' 
+identifying the coroutine.
+
+Semantics:
+""""""""""
+
+A frontend should emit at most one `coro.alloc` intrinsic per coroutine.
+The intrinsic is used to suppress dynamic allocation of the coroutine frame
+when possible.
+
+Example:
+""""""""
+
+.. code-block:: text
+
+  entry:
+    %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
+    %dyn.alloc.required = call i1 @llvm.coro.alloc(token %id)
+    br i1 %dyn.alloc.required, label %coro.alloc, label %coro.begin
+
+  coro.alloc:
+    %frame.size = call i32 @llvm.coro.size()
+    %alloc = call i8* @MyAlloc(i32 %frame.size)
+    br label %coro.begin
+
+  coro.begin:
+    %phi = phi i8* [ null, %entry ], [ %alloc, %coro.alloc ]
+    %frame = call i8* @llvm.coro.begin(token %id, i8* %phi)
+
+.. _coro.frame:
+
+'llvm.coro.frame' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare i8* @llvm.coro.frame()
+
+Overview:
+"""""""""
+
+The '``llvm.coro.frame``' intrinsic returns an address of the coroutine frame of
+the enclosing coroutine.
+
+Arguments:
+""""""""""
+
+None
+
+Semantics:
+""""""""""
+
+This intrinsic is lowered to refer to the `coro.begin`_ instruction. This is
+a frontend convenience intrinsic that makes it easier to refer to the
+coroutine frame.
+
+.. _coro.id:
+
+'llvm.coro.id' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare token @llvm.coro.id(i32 <align>, i8* <promise>, i8* <coroaddr>, 
+                                                          i8* <fnaddrs>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.id``' intrinsic returns a token identifying a coroutine.
+
+Arguments:
+""""""""""
+
+The first argument provides information on the alignment of the memory returned 
+by the allocation function and given to `coro.begin` by the first argument. If 
+this argument is 0, the memory is assumed to be aligned to 2 * sizeof(i8*).
+This argument only accepts constants.
+
+The second argument, if not `null`, designates a particular alloca instruction
+to be a `coroutine promise`_.
+
+The third argument is `null` coming out of the frontend. The CoroEarly pass sets
+this argument to point to the function this coro.id belongs to. 
+
+The fourth argument is `null` before coroutine is split, and later is replaced 
+to point to a private global constant array containing function pointers to 
+outlined resume and destroy parts of the coroutine.
+
+
+Semantics:
+""""""""""
+
+The purpose of this intrinsic is to tie together `coro.id`, `coro.alloc` and
+`coro.begin` belonging to the same coroutine to prevent optimization passes from
+duplicating any of these instructions unless entire body of the coroutine is
+duplicated.
+
+A frontend should emit exactly one `coro.id` intrinsic per coroutine.
+
+.. _coro.end:
+
+'llvm.coro.end' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare void @llvm.coro.end(i8* <handle>, i1 <unwind>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.end``' marks the point where execution of the resume part of 
+the coroutine should end and control returns back to the caller.
+
+
+Arguments:
+""""""""""
+
+The first argument should refer to the coroutine handle of the enclosing coroutine.
+
+The second argument should be `true` if this coro.end is in the block that is 
+part of the unwind sequence leaving the coroutine body due to exception prior to
+the first reaching any suspend points, and `false` otherwise.
+
+Semantics:
+""""""""""
+The `coro.end`_ intrinsic is a no-op during an initial invocation of the 
+coroutine. When the coroutine resumes, the intrinsic marks the point when 
+coroutine need to return control back to the caller.
+
+This intrinsic is removed by the CoroSplit pass when a coroutine is split into
+the start, resume and destroy parts. In start part, the intrinsic is removed,
+in resume and destroy parts, it is replaced with `ret void` instructions and
+the rest of the block containing `coro.end` instruction is discarded.
+
+In landing pads it is replaced with an appropriate instruction to unwind to 
+caller.
+
+A frontend is allowed to supply null as the first parameter, in this case 
+`coro-early` pass will replace the null with an appropriate coroutine handle
+value.
+
+.. _coro.suspend:
+.. _suspend points:
+
+'llvm.coro.suspend' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare i8 @llvm.coro.suspend(token <save>, i1 <final>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.suspend``' marks the point where execution of the coroutine 
+need to get suspended and control returned back to the caller.
+Conditional branches consuming the result of this intrinsic lead to basic blocks
+where coroutine should proceed when suspended (-1), resumed (0) or destroyed 
+(1).
+
+Arguments:
+""""""""""
+
+The first argument refers to a token of `coro.save` intrinsic that marks the 
+point when coroutine state is prepared for suspension. If `none` token is passed,
+the intrinsic behaves as if there were a `coro.save` immediately preceding
+the `coro.suspend` intrinsic.
+
+The second argument indicates whether this suspension point is `final`_.
+The second argument only accepts constants. If more than one suspend point is
+designated as final, the resume and destroy branches should lead to the same
+basic blocks.
+
+Example (normal suspend point):
+"""""""""""""""""""""""""""""""
+
+.. code-block:: text
+
+    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
+    switch i8 %0, label %suspend [i8 0, label %resume
+                                  i8 1, label %cleanup]
+
+Example (final suspend point):
+""""""""""""""""""""""""""""""
+
+.. code-block:: text
+
+  while.end:
+    %s.final = call i8 @llvm.coro.suspend(token none, i1 true)
+    switch i8 %s.final, label %suspend [i8 0, label %trap
+                                        i8 1, label %cleanup]
+  trap: 
+    call void @llvm.trap()
+    unreachable
+
+Semantics:
+""""""""""
+
+If a coroutine that was suspended at the suspend point marked by this intrinsic
+is resumed via `coro.resume`_ the control will transfer to the basic block
+of the 0-case. If it is resumed via `coro.destroy`_, it will proceed to the
+basic block indicated by the 1-case. To suspend, coroutine proceed to the 
+default label.
+
+If suspend intrinsic is marked as final, it can consider the `true` branch
+unreachable and can perform optimizations that can take advantage of that fact.
+
+.. _coro.save:
+
+'llvm.coro.save' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare token @llvm.coro.save(i8* <handle>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.save``' marks the point where a coroutine need to update its 
+state to prepare for resumption to be considered suspended (and thus eligible 
+for resumption). 
+
+Arguments:
+""""""""""
+
+The first argument points to a coroutine handle of the enclosing coroutine.
+
+Semantics:
+""""""""""
+
+Whatever coroutine state changes are required to enable resumption of
+the coroutine from the corresponding suspend point should be done at the point 
+of `coro.save` intrinsic.
+
+Example:
+""""""""
+
+Separate save and suspend points are necessary when a coroutine is used to 
+represent an asynchronous control flow driven by callbacks representing
+completions of asynchronous operations.
+
+In such a case, a coroutine should be ready for resumption prior to a call to 
+`async_op` function that may trigger resumption of a coroutine from the same or
+a different thread possibly prior to `async_op` call returning control back
+to the coroutine:
+
+.. code-block:: text
+
+    %save1 = call token @llvm.coro.save(i8* %hdl)
+    call void async_op1(i8* %hdl)
+    %suspend1 = call i1 @llvm.coro.suspend(token %save1, i1 false)
+    switch i8 %suspend1, label %suspend [i8 0, label %resume1
+                                         i8 1, label %cleanup]
+
+.. _coro.param:
+
+'llvm.coro.param' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare i1 @llvm.coro.param(i8* <original>, i8* <copy>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.param``' is used by a frontend to mark up the code used to
+construct and destruct copies of the parameters. If the optimizer discovers that
+a particular parameter copy is not used after any suspends, it can remove the
+construction and destruction of the copy by replacing corresponding coro.param
+with `i1 false` and replacing any use of the `copy` with the `original`.
+
+Arguments:
+""""""""""
+
+The first argument points to an `alloca` storing the value of a parameter to a 
+coroutine. 
+
+The second argument points to an `alloca` storing the value of the copy of that
+parameter.
+
+Semantics:
+""""""""""
+
+The optimizer is free to always replace this intrinsic with `i1 true`.
+
+The optimizer is also allowed to replace it with `i1 false` provided that the 
+parameter copy is only used prior to control flow reaching any of the suspend
+points. The code that would be DCE'd if the `coro.param` is replaced with 
+`i1 false` is not considered to be a use of the parameter copy.
+
+The frontend can emit this intrinsic if its language rules allow for this 
+optimization.
+
+Example:
+""""""""
+Consider the following example. A coroutine takes two parameters `a` and `b`
+that has a destructor and a move constructor.
+
+.. code-block:: c++
+
+  struct A { ~A(); A(A&&); bool foo(); void bar(); };
+
+  task<int> f(A a, A b) {
+    if (a.foo())
+      return 42;
+
+    a.bar();
+    co_await read_async(); // introduces suspend point
+    b.bar();
+  }
+
+Note that, uses of `b` is used after a suspend point and thus must be copied
+into a coroutine frame, whereas `a` does not have to, since it never used 
+after suspend.
+
+A frontend can create parameter copies for `a` and `b` as follows:
+
+.. code-block:: text
+
+  task<int> f(A a', A b') {
+    a = alloca A;
+    b = alloca A;
+    // move parameters to its copies
+    if (coro.param(a', a)) A::A(a, A&& a');
+    if (coro.param(b', b)) A::A(b, A&& b');
+    ...
+    // destroy parameters copies
+    if (coro.param(a', a)) A::~A(a);
+    if (coro.param(b', b)) A::~A(b);
+  }
+
+The optimizer can replace coro.param(a',a) with `i1 false` and replace all uses
+of `a` with `a'`, since it is not used after suspend.
+
+The optimizer must replace coro.param(b', b) with `i1 true`, since `b` is used
+after suspend and therefore, it has to reside in the coroutine frame.
+
+Coroutine Transformation Passes
+===============================
+CoroEarly
+---------
+The pass CoroEarly lowers coroutine intrinsics that hide the details of the
+structure of the coroutine frame, but, otherwise not needed to be preserved to
+help later coroutine passes. This pass lowers `coro.frame`_, `coro.done`_, 
+and `coro.promise`_ intrinsics.
+
+.. _CoroSplit:
+
+CoroSplit
+---------
+The pass CoroSplit buides coroutine frame and outlines resume and destroy parts 
+into separate functions.
+
+CoroElide
+---------
+The pass CoroElide examines if the inlined coroutine is eligible for heap 
+allocation elision optimization. If so, it replaces 
+`coro.begin` intrinsic with an address of a coroutine frame placed on its caller
+and replaces `coro.alloc` and `coro.free` intrinsics with `false` and `null`
+respectively to remove the deallocation code. 
+This pass also replaces `coro.resume` and `coro.destroy` intrinsics with direct 
+calls to resume and destroy functions for a particular coroutine where possible.
+
+CoroCleanup
+-----------
+This pass runs late to lower all coroutine related intrinsics not replaced by
+earlier passes.
+
+Areas Requiring Attention
+=========================
+#. A coroutine frame is bigger than it could be. Adding stack packing and stack 
+   coloring like optimization on the coroutine frame will result in tighter
+   coroutine frames.
+
+#. Take advantage of the lifetime intrinsics for the data that goes into the
+   coroutine frame. Leave lifetime intrinsics as is for the data that stays in
+   allocas.
+
+#. The CoroElide optimization pass relies on coroutine ramp function to be
+   inlined. It would be beneficial to split the ramp function further to 
+   increase the chance that it will get inlined into its caller.
+
+#. Design a convention that would make it possible to apply coroutine heap
+   elision optimization across ABI boundaries.
+
+#. Cannot handle coroutines with `inalloca` parameters (used in x86 on Windows).
+
+#. Alignment is ignored by coro.begin and coro.free intrinsics.
+
+#. Make required changes to make sure that coroutine optimizations work with
+   LTO.
+
+#. More tests, more tests, more tests

Added: www-releases/trunk/4.0.1/docs/_sources/CoverageMappingFormat.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/CoverageMappingFormat.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/CoverageMappingFormat.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/CoverageMappingFormat.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,605 @@
+.. role:: raw-html(raw)
+   :format: html
+
+=================================
+LLVM Code Coverage Mapping Format
+=================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+LLVM's code coverage mapping format is used to provide code coverage
+analysis using LLVM's and Clang's instrumenation based profiling
+(Clang's ``-fprofile-instr-generate`` option).
+
+This document is aimed at those who use LLVM's code coverage mapping to provide
+code coverage analysis for their own programs, and for those who would like
+to know how it works under the hood. A prior knowledge of how Clang's profile
+guided optimization works is useful, but not required.
+
+We start by showing how to use LLVM and Clang for code coverage analysis,
+then we briefly desribe LLVM's code coverage mapping format and the
+way that Clang and LLVM's code coverage tool work with this format. After
+the basics are down, more advanced features of the coverage mapping format
+are discussed - such as the data structures, LLVM IR representation and
+the binary encoding.
+
+Quick Start
+===========
+
+Here's a short story that describes how to generate code coverage overview
+for a sample source file called *test.c*.
+
+* First, compile an instrumented version of your program using Clang's
+  ``-fprofile-instr-generate`` option with the additional ``-fcoverage-mapping``
+  option:
+
+  ``clang -o test -fprofile-instr-generate -fcoverage-mapping test.c``
+* Then, run the instrumented binary. The runtime will produce a file called
+  *default.profraw* containing the raw profile instrumentation data:
+
+  ``./test``
+* After that, merge the profile data using the *llvm-profdata* tool:
+
+  ``llvm-profdata merge -o test.profdata default.profraw``
+* Finally, run LLVM's code coverage tool (*llvm-cov*) to produce the code
+  coverage overview for the sample source file:
+
+  ``llvm-cov show ./test -instr-profile=test.profdata test.c``
+
+High Level Overview
+===================
+
+LLVM's code coverage mapping format is designed to be a self contained
+data format, that can be embedded into the LLVM IR and object files.
+It's described in this document as a **mapping** format because its goal is
+to store the data that is required for a code coverage tool to map between
+the specific source ranges in a file and the execution counts obtained
+after running the instrumented version of the program.
+
+The mapping data is used in two places in the code coverage process:
+
+1. When clang compiles a source file with ``-fcoverage-mapping``, it
+   generates the mapping information that describes the mapping between the
+   source ranges and the profiling instrumentation counters.
+   This information gets embedded into the LLVM IR and conveniently
+   ends up in the final executable file when the program is linked.
+
+2. It is also used by *llvm-cov* - the mapping information is extracted from an
+   object file and is used to associate the execution counts (the values of the
+   profile instrumentation counters), and the source ranges in a file.
+   After that, the tool is able to generate various code coverage reports
+   for the program.
+
+The coverage mapping format aims to be a "universal format" that would be
+suitable for usage by any frontend, and not just by Clang. It also aims to
+provide the frontend the possibility of generating the minimal coverage mapping
+data in order to reduce the size of the IR and object files - for example,
+instead of emitting mapping information for each statement in a function, the
+frontend is allowed to group the statements with the same execution count into
+regions of code, and emit the mapping information only for those regions.
+
+Advanced Concepts
+=================
+
+The remainder of this guide is meant to give you insight into the way the
+coverage mapping format works.
+
+The coverage mapping format operates on a per-function level as the
+profile instrumentation counters are associated with a specific function.
+For each function that requires code coverage, the frontend has to create
+coverage mapping data that can map between the source code ranges and
+the profile instrumentation counters for that function.
+
+Mapping Region
+--------------
+
+The function's coverage mapping data contains an array of mapping regions.
+A mapping region stores the `source code range`_ that is covered by this region,
+the `file id <coverage file id_>`_, the `coverage mapping counter`_ and
+the region's kind.
+There are several kinds of mapping regions:
+
+* Code regions associate portions of source code and `coverage mapping
+  counters`_. They make up the majority of the mapping regions. They are used
+  by the code coverage tool to compute the execution counts for lines,
+  highlight the regions of code that were never executed, and to obtain
+  the various code coverage statistics for a function.
+  For example:
+
+  :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main(int argc, const char *argv[]) </span><span style='background-color:#4A789C'>{    </span> <span class='c1'>// Code Region from 1:40 to 9:2</span>
+  <span style='background-color:#4A789C'>                                            </span>
+  <span style='background-color:#4A789C'>  if (argc > 1) </span><span style='background-color:#85C1F5'>{                         </span>   <span class='c1'>// Code Region from 3:17 to 5:4</span>
+  <span style='background-color:#85C1F5'>    printf("%s\n", argv[1]);              </span>
+  <span style='background-color:#85C1F5'>  }</span><span style='background-color:#4A789C'> else </span><span style='background-color:#F6D55D'>{                                </span>   <span class='c1'>// Code Region from 5:10 to 7:4</span>
+  <span style='background-color:#F6D55D'>    printf("\n");                         </span>
+  <span style='background-color:#F6D55D'>  }</span><span style='background-color:#4A789C'>                                         </span>
+  <span style='background-color:#4A789C'>  return 0;                                 </span>
+  <span style='background-color:#4A789C'>}</span>
+  </pre>`
+* Skipped regions are used to represent source ranges that were skipped
+  by Clang's preprocessor. They don't associate with
+  `coverage mapping counters`_, as the frontend knows that they are never
+  executed. They are used by the code coverage tool to mark the skipped lines
+  inside a function as non-code lines that don't have execution counts.
+  For example:
+
+  :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main() </span><span style='background-color:#4A789C'>{               </span> <span class='c1'>// Code Region from 1:12 to 6:2</span>
+  <span style='background-color:#85C1F5'>#ifdef DEBUG             </span>   <span class='c1'>// Skipped Region from 2:1 to 4:2</span>
+  <span style='background-color:#85C1F5'>  printf("Hello world"); </span>
+  <span style='background-color:#85C1F5'>#</span><span style='background-color:#4A789C'>endif                     </span>
+  <span style='background-color:#4A789C'>  return 0;                </span>
+  <span style='background-color:#4A789C'>}</span>
+  </pre>`
+* Expansion regions are used to represent Clang's macro expansions. They
+  have an additional property - *expanded file id*. This property can be
+  used by the code coverage tool to find the mapping regions that are created
+  as a result of this macro expansion, by checking if their file id matches the
+  expanded file id. They don't associate with `coverage mapping counters`_,
+  as the code coverage tool can determine the execution count for this region
+  by looking up the execution count of the first region with a corresponding
+  file id.
+  For example:
+
+  :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int func(int x) </span><span style='background-color:#4A789C'>{                             </span>
+  <span style='background-color:#4A789C'>  #define MAX(x,y) </span><span style='background-color:#85C1F5'>((x) > (y)? </span><span style='background-color:#F6D55D'>(x)</span><span style='background-color:#85C1F5'> : </span><span style='background-color:#F4BA70'>(y)</span><span style='background-color:#85C1F5'>)</span><span style='background-color:#4A789C'>     </span>
+  <span style='background-color:#4A789C'>  return </span><span style='background-color:#7FCA9F'>MAX</span><span style='background-color:#4A789C'>(x, 42);                          </span> <span class='c1'>// Expansion Region from 3:10 to 3:13</span>
+  <span style='background-color:#4A789C'>}</span>
+  </pre>`
+
+.. _source code range:
+
+Source Range:
+^^^^^^^^^^^^^
+
+The source range record contains the starting and ending location of a certain
+mapping region. Both locations include the line and the column numbers.
+
+.. _coverage file id:
+
+File ID:
+^^^^^^^^
+
+The file id an integer value that tells us
+in which source file or macro expansion is this region located.
+It enables Clang to produce mapping information for the code
+defined inside macros, like this example demonstrates:
+
+:raw-html:`<pre class='highlight' style='line-height:initial;'><span>void func(const char *str) </span><span style='background-color:#4A789C'>{        </span> <span class='c1'>// Code Region from 1:28 to 6:2 with file id 0</span>
+<span style='background-color:#4A789C'>  #define PUT </span><span style='background-color:#85C1F5'>printf("%s\n", str)</span><span style='background-color:#4A789C'>   </span> <span class='c1'>// 2 Code Regions from 2:15 to 2:34 with file ids 1 and 2</span>
+<span style='background-color:#4A789C'>  if(*str)                          </span>
+<span style='background-color:#4A789C'>    </span><span style='background-color:#F6D55D'>PUT</span><span style='background-color:#4A789C'>;                            </span> <span class='c1'>// Expansion Region from 4:5 to 4:8 with file id 0 that expands a macro with file id 1</span>
+<span style='background-color:#4A789C'>  </span><span style='background-color:#F6D55D'>PUT</span><span style='background-color:#4A789C'>;                              </span> <span class='c1'>// Expansion Region from 5:3 to 5:6 with file id 0 that expands a macro with file id 2</span>
+<span style='background-color:#4A789C'>}</span>
+</pre>`
+
+.. _coverage mapping counter:
+.. _coverage mapping counters:
+
+Counter:
+^^^^^^^^
+
+A coverage mapping counter can represents a reference to the profile
+instrumentation counter. The execution count for a region with such counter
+is determined by looking up the value of the corresponding profile
+instrumentation counter.
+
+It can also represent a binary arithmetical expression that operates on
+coverage mapping counters or other expressions.
+The execution count for a region with an expression counter is determined by
+evaluating the expression's arguments and then adding them together or
+subtracting them from one another.
+In the example below, a subtraction expression is used to compute the execution
+count for the compound statement that follows the *else* keyword:
+
+:raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main(int argc, const char *argv[]) </span><span style='background-color:#4A789C'>{   </span> <span class='c1'>// Region's counter is a reference to the profile counter #0</span>
+<span style='background-color:#4A789C'>                                           </span>
+<span style='background-color:#4A789C'>  if (argc > 1) </span><span style='background-color:#85C1F5'>{                        </span>   <span class='c1'>// Region's counter is a reference to the profile counter #1</span>
+<span style='background-color:#85C1F5'>    printf("%s\n", argv[1]);             </span><span>   </span>
+<span style='background-color:#85C1F5'>  }</span><span style='background-color:#4A789C'> else </span><span style='background-color:#F6D55D'>{                               </span>   <span class='c1'>// Region's counter is an expression (reference to the profile counter #0 - reference to the profile counter #1)</span>
+<span style='background-color:#F6D55D'>    printf("\n");                        </span>
+<span style='background-color:#F6D55D'>  }</span><span style='background-color:#4A789C'>                                        </span>
+<span style='background-color:#4A789C'>  return 0;                                </span>
+<span style='background-color:#4A789C'>}</span>
+</pre>`
+
+Finally, a coverage mapping counter can also represent an execution count of
+of zero. The zero counter is used to provide coverage mapping for
+unreachable statements and expressions, like in the example below:
+
+:raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main() </span><span style='background-color:#4A789C'>{                  </span>
+<span style='background-color:#4A789C'>  return 0;                   </span>
+<span style='background-color:#4A789C'>  </span><span style='background-color:#85C1F5'>printf("Hello world!\n")</span><span style='background-color:#4A789C'>;   </span> <span class='c1'>// Unreachable region's counter is zero</span>
+<span style='background-color:#4A789C'>}</span>
+</pre>`
+
+The zero counters allow the code coverage tool to display proper line execution
+counts for the unreachable lines and highlight the unreachable code.
+Without them, the tool would think that those lines and regions were still
+executed, as it doesn't possess the frontend's knowledge.
+
+LLVM IR Representation
+======================
+
+The coverage mapping data is stored in the LLVM IR using a single global
+constant structure variable called *__llvm_coverage_mapping*
+with the *__llvm_covmap* section specifier.
+
+For example, letâs consider a C file and how it gets compiled to LLVM:
+
+.. _coverage mapping sample:
+
+.. code-block:: c
+
+  int foo() {
+    return 42;
+  }
+  int bar() {
+    return 13;
+  }
+
+The coverage mapping variable generated by Clang has 3 fields:
+
+* Coverage mapping header.
+
+* An array of function records.
+
+* Coverage mapping data which is an array of bytes. Zero paddings are added at the end to force 8 byte alignment.
+
+.. code-block:: llvm
+
+  @__llvm_coverage_mapping = internal constant { { i32, i32, i32, i32 }, [2 x { i64, i32, i64 }], [40 x i8] }
+  { 
+    { i32, i32, i32, i32 } ; Coverage map header
+    {
+      i32 2,  ; The number of function records
+      i32 20, ; The length of the string that contains the encoded translation unit filenames
+      i32 20, ; The length of the string that contains the encoded coverage mapping data
+      i32 1,  ; Coverage mapping format version
+    },
+    [2 x { i64, i32, i64 }] [ ; Function records
+     { i64, i32, i64 } {
+       i64 0x5cf8c24cdb18bdac, ; Function's name MD5
+       i32 9, ; Function's encoded coverage mapping data string length
+       i64 0  ; Function's structural hash
+     },
+     { i64, i32, i64 } { 
+       i64 0xe413754a191db537, ; Function's name MD5
+       i32 9, ; Function's encoded coverage mapping data string length
+       i64 0  ; Function's structural hash
+     }],
+   [40 x i8] c"..." ; Encoded data (dissected later)
+  }, section "__llvm_covmap", align 8
+
+The function record layout has evolved since version 1. In version 1, the function record for *foo* is defined as follows:
+
+.. code-block:: llvm
+
+     { i8*, i32, i32, i64 } { i8* getelementptr inbounds ([3 x i8]* @__profn_foo, i32 0, i32 0), ; Function's name
+       i32 3, ; Function's name length
+       i32 9, ; Function's encoded coverage mapping data string length
+       i64 0  ; Function's structural hash
+     }
+
+
+Coverage Mapping Header:
+------------------------
+
+The coverage mapping header has the following fields:
+
+* The number of function records.
+
+* The length of the string in the third field of *__llvm_coverage_mapping* that contains the encoded translation unit filenames.
+
+* The length of the string in the third field of *__llvm_coverage_mapping* that contains the encoded coverage mapping data.
+
+* The format version. The current version is 2 (encoded as a 1).
+
+.. _function records:
+
+Function record:
+----------------
+
+A function record is a structure of the following type:
+
+.. code-block:: llvm
+
+  { i64, i32, i64 }
+
+It contains function name's MD5, the length of the encoded mapping data for that function, and function's 
+structural hash value.
+
+Encoded data:
+-------------
+
+The encoded data is stored in a single string that contains
+the encoded filenames used by this translation unit and the encoded coverage
+mapping data for each function in this translation unit.
+
+The encoded data has the following structure:
+
+``[filenames, coverageMappingDataForFunctionRecord0, coverageMappingDataForFunctionRecord1, ..., padding]``
+
+If necessary, the encoded data is padded with zeroes so that the size
+of the data string is rounded up to the nearest multiple of 8 bytes.
+
+Dissecting the sample:
+^^^^^^^^^^^^^^^^^^^^^^
+
+Here's an overview of the encoded data that was stored in the
+IR for the `coverage mapping sample`_ that was shown earlier:
+
+* The IR contains the following string constant that represents the encoded
+  coverage mapping data for the sample translation unit:
+
+  .. code-block:: llvm
+
+    c"\01\12/Users/alex/test.c\01\00\00\01\01\01\0C\02\02\01\00\00\01\01\04\0C\02\02\00\00"
+
+* The string contains values that are encoded in the LEB128 format, which is
+  used throughout for storing integers. It also contains a string value.
+
+* The length of the substring that contains the encoded translation unit
+  filenames is the value of the second field in the *__llvm_coverage_mapping*
+  structure, which is 20, thus the filenames are encoded in this string:
+
+  .. code-block:: llvm
+
+    c"\01\12/Users/alex/test.c"
+
+  This string contains the following data:
+
+  * Its first byte has a value of ``0x01``. It stores the number of filenames
+    contained in this string.
+  * Its second byte stores the length of the first filename in this string.
+  * The remaining 18 bytes are used to store the first filename.
+
+* The length of the substring that contains the encoded coverage mapping data
+  for the first function is the value of the third field in the first
+  structure in an array of `function records`_ stored in the
+  third field of the *__llvm_coverage_mapping* structure, which is the 9.
+  Therefore, the coverage mapping for the first function record is encoded
+  in this string:
+
+  .. code-block:: llvm
+
+    c"\01\00\00\01\01\01\0C\02\02"
+
+  This string consists of the following bytes:
+
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+  | ``0x01`` | The number of file ids used by this function. There is only one file id used by the mapping data in this function.      |
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+  | ``0x00`` | An index into the filenames array which corresponds to the file "/Users/alex/test.c".                                   |
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+  | ``0x00`` | The number of counter expressions used by this function. This function doesn't use any expressions.                     |
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+  | ``0x01`` | The number of mapping regions that are stored in an array for the function's file id #0.                                |
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+  | ``0x01`` | The coverage mapping counter for the first region in this function. The value of 1 tells us that it's a coverage        |
+  |          | mapping counter that is a reference to the profile instrumentation counter with an index of 0.                          |
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+  | ``0x01`` | The starting line of the first mapping region in this function.                                                         |
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+  | ``0x0C`` | The starting column of the first mapping region in this function.                                                       |
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+  | ``0x02`` | The ending line of the first mapping region in this function.                                                           |
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+  | ``0x02`` | The ending column of the first mapping region in this function.                                                         |
+  +----------+-------------------------------------------------------------------------------------------------------------------------+
+
+* The length of the substring that contains the encoded coverage mapping data
+  for the second function record is also 9. It's structured like the mapping data
+  for the first function record.
+
+* The two trailing bytes are zeroes and are used to pad the coverage mapping
+  data to give it the 8 byte alignment.
+
+Encoding
+========
+
+The per-function coverage mapping data is encoded as a stream of bytes,
+with a simple structure. The structure consists of the encoding
+`types <cvmtypes_>`_ like variable-length unsigned integers, that
+are used to encode `File ID Mapping`_, `Counter Expressions`_ and
+the `Mapping Regions`_.
+
+The format of the structure follows:
+
+  ``[file id mapping, counter expressions, mapping regions]``
+
+The translation unit filenames are encoded using the same encoding
+`types <cvmtypes_>`_ as the per-function coverage mapping data, with the
+following structure:
+
+  ``[numFilenames : LEB128, filename0 : string, filename1 : string, ...]``
+
+.. _cvmtypes:
+
+Types
+-----
+
+This section describes the basic types that are used by the encoding format
+and can appear after ``:`` in the ``[foo : type]`` description.
+
+.. _LEB128:
+
+LEB128
+^^^^^^
+
+LEB128 is an unsigned integer value that is encoded using DWARF's LEB128
+encoding, optimizing for the case where values are small
+(1 byte for values less than 128).
+
+.. _CoverageStrings:
+
+Strings
+^^^^^^^
+
+``[length : LEB128, characters...]``
+
+String values are encoded with a `LEB value <LEB128_>`_ for the length
+of the string and a sequence of bytes for its characters.
+
+.. _file id mapping:
+
+File ID Mapping
+---------------
+
+``[numIndices : LEB128, filenameIndex0 : LEB128, filenameIndex1 : LEB128, ...]``
+
+File id mapping in a function's coverage mapping stream
+contains the indices into the translation unit's filenames array.
+
+Counter
+-------
+
+``[value : LEB128]``
+
+A `coverage mapping counter`_ is stored in a single `LEB value <LEB128_>`_.
+It is composed of two things --- the `tag <counter-tag_>`_
+which is stored in the lowest 2 bits, and the `counter data`_ which is stored
+in the remaining bits.
+
+.. _counter-tag:
+
+Tag:
+^^^^
+
+The counter's tag encodes the counter's kind
+and, if the counter is an expression, the expression's kind.
+The possible tag values are:
+
+* 0 - The counter is zero.
+
+* 1 - The counter is a reference to the profile instrumentation counter.
+
+* 2 - The counter is a subtraction expression.
+
+* 3 - The counter is an addition expression.
+
+.. _counter data:
+
+Data:
+^^^^^
+
+The counter's data is interpreted in the following manner:
+
+* When the counter is a reference to the profile instrumentation counter,
+  then the counter's data is the id of the profile counter.
+* When the counter is an expression, then the counter's data
+  is the index into the array of counter expressions.
+
+.. _Counter Expressions:
+
+Counter Expressions
+-------------------
+
+``[numExpressions : LEB128, expr0LHS : LEB128, expr0RHS : LEB128, expr1LHS : LEB128, expr1RHS : LEB128, ...]``
+
+Counter expressions consist of two counters as they
+represent binary arithmetic operations.
+The expression's kind is determined from the `tag <counter-tag_>`_ of the
+counter that references this expression.
+
+.. _Mapping Regions:
+
+Mapping Regions
+---------------
+
+``[numRegionArrays : LEB128, regionsForFile0, regionsForFile1, ...]``
+
+The mapping regions are stored in an array of sub-arrays where every
+region in a particular sub-array has the same file id.
+
+The file id for a sub-array of regions is the index of that
+sub-array in the main array e.g. The first sub-array will have the file id
+of 0.
+
+Sub-Array of Regions
+^^^^^^^^^^^^^^^^^^^^
+
+``[numRegions : LEB128, region0, region1, ...]``
+
+The mapping regions for a specific file id are stored in an array that is
+sorted in an ascending order by the region's starting location.
+
+Mapping Region
+^^^^^^^^^^^^^^
+
+``[header, source range]``
+
+The mapping region record contains two sub-records ---
+the `header`_, which stores the counter and/or the region's kind,
+and the `source range`_ that contains the starting and ending
+location of this region.
+
+.. _header:
+
+Header
+^^^^^^
+
+``[counter]``
+
+or
+
+``[pseudo-counter]``
+
+The header encodes the region's counter and the region's kind.
+
+The value of the counter's tag distinguishes between the counters and
+pseudo-counters --- if the tag is zero, than this header contains a
+pseudo-counter, otherwise this header contains an ordinary counter.
+
+Counter:
+""""""""
+
+A mapping region whose header has a counter with a non-zero tag is
+a code region.
+
+Pseudo-Counter:
+"""""""""""""""
+
+``[value : LEB128]``
+
+A pseudo-counter is stored in a single `LEB value <LEB128_>`_, just like
+the ordinary counter. It has the following interpretation:
+
+* bits 0-1: tag, which is always 0.
+
+* bit 2: expansionRegionTag. If this bit is set, then this mapping region
+  is an expansion region.
+
+* remaining bits: data. If this region is an expansion region, then the data
+  contains the expanded file id of that region.
+
+  Otherwise, the data contains the region's kind. The possible region
+  kind values are:
+
+  * 0 - This mapping region is a code region with a counter of zero.
+  * 2 - This mapping region is a skipped region.
+
+.. _source range:
+
+Source Range
+^^^^^^^^^^^^
+
+``[deltaLineStart : LEB128, columnStart : LEB128, numLines : LEB128, columnEnd : LEB128]``
+
+The source range record contains the following fields:
+
+* *deltaLineStart*: The difference between the starting line of the
+  current mapping region and the starting line of the previous mapping region.
+
+  If the current mapping region is the first region in the current
+  sub-array, then it stores the starting line of that region.
+
+* *columnStart*: The starting column of the mapping region.
+
+* *numLines*: The difference between the ending line and the starting line
+  of the current mapping region.
+
+* *columnEnd*: The ending column of the mapping region.

Added: www-releases/trunk/4.0.1/docs/_sources/DebuggingJITedCode.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/DebuggingJITedCode.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/DebuggingJITedCode.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/DebuggingJITedCode.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,143 @@
+==============================
+Debugging JIT-ed Code With GDB
+==============================
+
+Background
+==========
+
+Without special runtime support, debugging dynamically generated code with
+GDB (as well as most debuggers) can be quite painful.  Debuggers generally
+read debug information from the object file of the code, but for JITed
+code, there is no such file to look for.
+
+In order to communicate the necessary debug info to GDB, an interface for
+registering JITed code with debuggers has been designed and implemented for
+GDB and LLVM MCJIT.  At a high level, whenever MCJIT generates new machine code,
+it does so in an in-memory object file that contains the debug information in
+DWARF format.  MCJIT then adds this in-memory object file to a global list of
+dynamically generated object files and calls a special function
+(``__jit_debug_register_code``) marked noinline that GDB knows about.  When
+GDB attaches to a process, it puts a breakpoint in this function and loads all
+of the object files in the global list.  When MCJIT calls the registration
+function, GDB catches the breakpoint signal, loads the new object file from
+the inferior's memory, and resumes the execution.  In this way, GDB can get the
+necessary debug information.
+
+GDB Version
+===========
+
+In order to debug code JIT-ed by LLVM, you need GDB 7.0 or newer, which is
+available on most modern distributions of Linux.  The version of GDB that
+Apple ships with Xcode has been frozen at 6.3 for a while.  LLDB may be a
+better option for debugging JIT-ed code on Mac OS X.
+
+
+Debugging MCJIT-ed code
+=======================
+
+The emerging MCJIT component of LLVM allows full debugging of JIT-ed code with
+GDB.  This is due to MCJIT's ability to use the MC emitter to provide full
+DWARF debugging information to GDB.
+
+Note that lli has to be passed the ``-use-mcjit`` flag to JIT the code with
+MCJIT instead of the old JIT.
+
+Example
+-------
+
+Consider the following C code (with line numbers added to make the example
+easier to follow):
+
+..
+   FIXME:
+   Sphinx has the ability to automatically number these lines by adding
+   :linenos: on the line immediately following the `.. code-block:: c`, but
+   it looks like garbage; the line numbers don't even line up with the
+   lines. Is this a Sphinx bug, or is it a CSS problem?
+
+.. code-block:: c
+
+   1   int compute_factorial(int n)
+   2   {
+   3       if (n <= 1)
+   4           return 1;
+   5
+   6       int f = n;
+   7       while (--n > 1)
+   8           f *= n;
+   9       return f;
+   10  }
+   11
+   12
+   13  int main(int argc, char** argv)
+   14  {
+   15      if (argc < 2)
+   16          return -1;
+   17      char firstletter = argv[1][0];
+   18      int result = compute_factorial(firstletter - '0');
+   19
+   20      // Returned result is clipped at 255...
+   21      return result;
+   22  }
+
+Here is a sample command line session that shows how to build and run this
+code via ``lli`` inside GDB:
+
+.. code-block:: bash
+
+   $ $BINPATH/clang -cc1 -O0 -g -emit-llvm showdebug.c
+   $ gdb --quiet --args $BINPATH/lli -use-mcjit showdebug.ll 5
+   Reading symbols from $BINPATH/lli...done.
+   (gdb) b showdebug.c:6
+   No source file named showdebug.c.
+   Make breakpoint pending on future shared library load? (y or [n]) y
+   Breakpoint 1 (showdebug.c:6) pending.
+   (gdb) r
+   Starting program: $BINPATH/lli -use-mcjit showdebug.ll 5
+   [Thread debugging using libthread_db enabled]
+
+   Breakpoint 1, compute_factorial (n=5) at showdebug.c:6
+   6	    int f = n;
+   (gdb) p n
+   $1 = 5
+   (gdb) p f
+   $2 = 0
+   (gdb) n
+   7	    while (--n > 1)
+   (gdb) p f
+   $3 = 5
+   (gdb) b showdebug.c:9
+   Breakpoint 2 at 0x7ffff7ed404c: file showdebug.c, line 9.
+   (gdb) c
+   Continuing.
+
+   Breakpoint 2, compute_factorial (n=1) at showdebug.c:9
+   9	    return f;
+   (gdb) p f
+   $4 = 120
+   (gdb) bt
+   #0  compute_factorial (n=1) at showdebug.c:9
+   #1  0x00007ffff7ed40a9 in main (argc=2, argv=0x16677e0) at showdebug.c:18
+   #2  0x3500000001652748 in ?? ()
+   #3  0x00000000016677e0 in ?? ()
+   #4  0x0000000000000002 in ?? ()
+   #5  0x0000000000d953b3 in llvm::MCJIT::runFunction (this=0x16151f0, F=0x1603020, ArgValues=...) at /home/ebenders_test/llvm_svn_rw/lib/ExecutionEngine/MCJIT/MCJIT.cpp:161
+   #6  0x0000000000dc8872 in llvm::ExecutionEngine::runFunctionAsMain (this=0x16151f0, Fn=0x1603020, argv=..., envp=0x7fffffffe040)
+       at /home/ebenders_test/llvm_svn_rw/lib/ExecutionEngine/ExecutionEngine.cpp:397
+   #7  0x000000000059c583 in main (argc=4, argv=0x7fffffffe018, envp=0x7fffffffe040) at /home/ebenders_test/llvm_svn_rw/tools/lli/lli.cpp:324
+   (gdb) finish
+   Run till exit from #0  compute_factorial (n=1) at showdebug.c:9
+   0x00007ffff7ed40a9 in main (argc=2, argv=0x16677e0) at showdebug.c:18
+   18	    int result = compute_factorial(firstletter - '0');
+   Value returned is $5 = 120
+   (gdb) p result
+   $6 = 23406408
+   (gdb) n
+   21	    return result;
+   (gdb) p result
+   $7 = 120
+   (gdb) c
+   Continuing.
+
+   Program exited with code 0170.
+   (gdb)

Added: www-releases/trunk/4.0.1/docs/_sources/DeveloperPolicy.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/DeveloperPolicy.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/DeveloperPolicy.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/DeveloperPolicy.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,746 @@
+=====================
+LLVM Developer Policy
+=====================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+This document contains the LLVM Developer Policy which defines the project's
+policy towards developers and their contributions. The intent of this policy is
+to eliminate miscommunication, rework, and confusion that might arise from the
+distributed nature of LLVM's development.  By stating the policy in clear terms,
+we hope each developer can know ahead of time what to expect when making LLVM
+contributions.  This policy covers all llvm.org subprojects, including Clang,
+LLDB, libc++, etc.
+
+This policy is also designed to accomplish the following objectives:
+
+#. Attract both users and developers to the LLVM project.
+
+#. Make life as simple and easy for contributors as possible.
+
+#. Keep the top of Subversion trees as stable as possible.
+
+#. Establish awareness of the project's :ref:`copyright, license, and patent
+   policies <copyright-license-patents>` with contributors to the project.
+
+This policy is aimed at frequent contributors to LLVM. People interested in
+contributing one-off patches can do so in an informal way by sending them to the
+`llvm-commits mailing list
+<http://lists.llvm.org/mailman/listinfo/llvm-commits>`_ and engaging another
+developer to see it through the process.
+
+Developer Policies
+==================
+
+This section contains policies that pertain to frequent LLVM developers.  We
+always welcome `one-off patches`_ from people who do not routinely contribute to
+LLVM, but we expect more from frequent contributors to keep the system as
+efficient as possible for everyone.  Frequent LLVM contributors are expected to
+meet the following requirements in order for LLVM to maintain a high standard of
+quality.
+
+Stay Informed
+-------------
+
+Developers should stay informed by reading at least the "dev" mailing list for
+the projects you are interested in, such as `llvm-dev
+<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ for LLVM, `cfe-dev
+<http://lists.llvm.org/mailman/listinfo/cfe-dev>`_ for Clang, or `lldb-dev
+<http://lists.llvm.org/mailman/listinfo/lldb-dev>`_ for LLDB.  If you are
+doing anything more than just casual work on LLVM, it is suggested that you also
+subscribe to the "commits" mailing list for the subproject you're interested in,
+such as `llvm-commits
+<http://lists.llvm.org/mailman/listinfo/llvm-commits>`_, `cfe-commits
+<http://lists.llvm.org/mailman/listinfo/cfe-commits>`_, or `lldb-commits
+<http://lists.llvm.org/mailman/listinfo/lldb-commits>`_.  Reading the
+"commits" list and paying attention to changes being made by others is a good
+way to see what other people are interested in and watching the flow of the
+project as a whole.
+
+We recommend that active developers register an email account with `LLVM
+Bugzilla <http://llvm.org/bugs/>`_ and preferably subscribe to the `llvm-bugs
+<http://lists.llvm.org/mailman/listinfo/llvm-bugs>`_ email list to keep track
+of bugs and enhancements occurring in LLVM.  We really appreciate people who are
+proactive at catching incoming bugs in their components and dealing with them
+promptly.
+
+Please be aware that all public LLVM mailing lists are public and archived, and
+that notices of confidentiality or non-disclosure cannot be respected.
+
+.. _patch:
+.. _one-off patches:
+
+Making and Submitting a Patch
+-----------------------------
+
+When making a patch for review, the goal is to make it as easy for the reviewer
+to read it as possible.  As such, we recommend that you:
+
+#. Make your patch against the Subversion trunk, not a branch, and not an old
+   version of LLVM.  This makes it easy to apply the patch.  For information on
+   how to check out SVN trunk, please see the `Getting Started
+   Guide <GettingStarted.html#checkout>`_.
+
+#. Similarly, patches should be submitted soon after they are generated.  Old
+   patches may not apply correctly if the underlying code changes between the
+   time the patch was created and the time it is applied.
+
+#. Patches should be made with ``svn diff``, or similar. If you use a
+   different tool, make sure it uses the ``diff -u`` format and that it
+   doesn't contain clutter which makes it hard to read.
+
+#. If you are modifying generated files, such as the top-level ``configure``
+   script, please separate out those changes into a separate patch from the rest
+   of your changes.
+
+Once your patch is ready, submit it by emailing it to the appropriate project's
+commit mailing list (or commit it directly if applicable). Alternatively, some
+patches get sent to the project's development list or component of the LLVM bug
+tracker, but the commit list is the primary place for reviews and should
+generally be preferred.
+
+When sending a patch to a mailing list, it is a good idea to send it as an
+*attachment* to the message, not embedded into the text of the message.  This
+ensures that your mailer will not mangle the patch when it sends it (e.g. by
+making whitespace changes or by wrapping lines).
+
+*For Thunderbird users:* Before submitting a patch, please open *Preferences >
+Advanced > General > Config Editor*, find the key
+``mail.content_disposition_type``, and set its value to ``1``. Without this
+setting, Thunderbird sends your attachment using ``Content-Disposition: inline``
+rather than ``Content-Disposition: attachment``. Apple Mail gamely displays such
+a file inline, making it difficult to work with for reviewers using that
+program.
+
+When submitting patches, please do not add confidentiality or non-disclosure
+notices to the patches themselves.  These notices conflict with the `LLVM
+License`_ and may result in your contribution being excluded.
+
+.. _code review:
+
+Code Reviews
+------------
+
+LLVM has a code review policy. Code review is one way to increase the quality of
+software. We generally follow these policies:
+
+#. All developers are required to have significant changes reviewed before they
+   are committed to the repository.
+
+#. Code reviews are conducted by email on the relevant project's commit mailing
+   list, or alternatively on the project's development list or bug tracker.
+
+#. Code can be reviewed either before it is committed or after.  We expect major
+   changes to be reviewed before being committed, but smaller changes (or
+   changes where the developer owns the component) can be reviewed after commit.
+
+#. The developer responsible for a code change is also responsible for making
+   all necessary review-related changes.
+
+#. Code review can be an iterative process, which continues until the patch is
+   ready to be committed. Specifically, once a patch is sent out for review, it
+   needs an explicit "looks good" before it is submitted. Do not assume silent
+   approval, or request active objections to the patch with a deadline.
+
+Sometimes code reviews will take longer than you would hope for, especially for
+larger features. Accepted ways to speed up review times for your patches are:
+
+* Review other people's patches. If you help out, everybody will be more
+  willing to do the same for you; goodwill is our currency.
+* Ping the patch. If it is urgent, provide reasons why it is important to you to
+  get this patch landed and ping it every couple of days. If it is
+  not urgent, the common courtesy ping rate is one week. Remember that you're
+  asking for valuable time from other professional developers.
+* Ask for help on IRC. Developers on IRC will be able to either help you
+  directly, or tell you who might be a good reviewer.
+* Split your patch into multiple smaller patches that build on each other. The
+  smaller your patch, the higher the probability that somebody will take a quick
+  look at it.
+
+Developers should participate in code reviews as both reviewers and
+reviewees. If someone is kind enough to review your code, you should return the
+favor for someone else.  Note that anyone is welcome to review and give feedback
+on a patch, but only people with Subversion write access can approve it.
+
+There is a web based code review tool that can optionally be used
+for code reviews. See :doc:`Phabricator`.
+
+.. _code owners:
+
+Code Owners
+-----------
+
+The LLVM Project relies on two features of its process to maintain rapid
+development in addition to the high quality of its source base: the combination
+of code review plus post-commit review for trusted maintainers.  Having both is
+a great way for the project to take advantage of the fact that most people do
+the right thing most of the time, and only commit patches without pre-commit
+review when they are confident they are right.
+
+The trick to this is that the project has to guarantee that all patches that are
+committed are reviewed after they go in: you don't want everyone to assume
+someone else will review it, allowing the patch to go unreviewed.  To solve this
+problem, we have a notion of an 'owner' for a piece of the code.  The sole
+responsibility of a code owner is to ensure that a commit to their area of the
+code is appropriately reviewed, either by themself or by someone else.  The list
+of current code owners can be found in the file
+`CODE_OWNERS.TXT <http://llvm.org/klaus/llvm/blob/master/CODE_OWNERS.TXT>`_
+in the root of the LLVM source tree.
+
+Note that code ownership is completely different than reviewers: anyone can
+review a piece of code, and we welcome code review from anyone who is
+interested.  Code owners are the "last line of defense" to guarantee that all
+patches that are committed are actually reviewed.
+
+Being a code owner is a somewhat unglamorous position, but it is incredibly
+important for the ongoing success of the project.  Because people get busy,
+interests change, and unexpected things happen, code ownership is purely opt-in,
+and anyone can choose to resign their "title" at any time. For now, we do not
+have an official policy on how one gets elected to be a code owner.
+
+.. _include a testcase:
+
+Test Cases
+----------
+
+Developers are required to create test cases for any bugs fixed and any new
+features added.  Some tips for getting your testcase approved:
+
+* All feature and regression test cases are added to the ``llvm/test``
+  directory. The appropriate sub-directory should be selected (see the
+  :doc:`Testing Guide <TestingGuide>` for details).
+
+* Test cases should be written in :doc:`LLVM assembly language <LangRef>`.
+
+* Test cases, especially for regressions, should be reduced as much as possible,
+  by :doc:`bugpoint <Bugpoint>` or manually. It is unacceptable to place an
+  entire failing program into ``llvm/test`` as this creates a *time-to-test*
+  burden on all developers. Please keep them short.
+
+Note that llvm/test and clang/test are designed for regression and small feature
+tests only. More extensive test cases (e.g., entire applications, benchmarks,
+etc) should be added to the ``llvm-test`` test suite.  The llvm-test suite is
+for coverage (correctness, performance, etc) testing, not feature or regression
+testing.
+
+Quality
+-------
+
+The minimum quality standards that any change must satisfy before being
+committed to the main development branch are:
+
+#. Code must adhere to the `LLVM Coding Standards <CodingStandards.html>`_.
+
+#. Code must compile cleanly (no errors, no warnings) on at least one platform.
+
+#. Bug fixes and new features should `include a testcase`_ so we know if the
+   fix/feature ever regresses in the future.
+
+#. Code must pass the ``llvm/test`` test suite.
+
+#. The code must not cause regressions on a reasonable subset of llvm-test,
+   where "reasonable" depends on the contributor's judgement and the scope of
+   the change (more invasive changes require more testing). A reasonable subset
+   might be something like "``llvm-test/MultiSource/Benchmarks``".
+
+Additionally, the committer is responsible for addressing any problems found in
+the future that the change is responsible for.  For example:
+
+* The code should compile cleanly on all supported platforms.
+
+* The changes should not cause any correctness regressions in the ``llvm-test``
+  suite and must not cause any major performance regressions.
+
+* The change set should not cause performance or correctness regressions for the
+  LLVM tools.
+
+* The changes should not cause performance or correctness regressions in code
+  compiled by LLVM on all applicable targets.
+
+* You are expected to address any `Bugzilla bugs <http://llvm.org/bugs/>`_ that
+  result from your change.
+
+We prefer for this to be handled before submission but understand that it isn't
+possible to test all of this for every submission.  Our build bots and nightly
+testing infrastructure normally finds these problems.  A good rule of thumb is
+to check the nightly testers for regressions the day after your change.  Build
+bots will directly email you if a group of commits that included yours caused a
+failure.  You are expected to check the build bot messages to see if they are
+your fault and, if so, fix the breakage.
+
+Commits that violate these quality standards (e.g. are very broken) may be
+reverted. This is necessary when the change blocks other developers from making
+progress. The developer is welcome to re-commit the change after the problem has
+been fixed.
+
+.. _commit messages:
+
+Commit messages
+---------------
+
+Although we don't enforce the format of commit messages, we prefer that
+you follow these guidelines to help review, search in logs, email formatting
+and so on. These guidelines are very similar to rules used by other open source
+projects.
+
+Most importantly, the contents of the message should be carefully written to
+convey the rationale of the change (without delving too much in detail). It
+also should avoid being vague or overly specific. For example, "bits were not
+set right" will leave the reviewer wondering about which bits, and why they
+weren't right, while "Correctly set overflow bits in TargetInfo" conveys almost
+all there is to the change.
+
+Below are some guidelines about the format of the message itself:
+
+* Separate the commit message into title, body and, if you're not the original
+  author, a "Patch by" attribution line (see below).
+
+* The title should be concise. Because all commits are emailed to the list with
+  the first line as the subject, long titles are frowned upon.  Short titles
+  also look better in `git log`.
+
+* When the changes are restricted to a specific part of the code (e.g. a
+  back-end or optimization pass), it is customary to add a tag to the
+  beginning of the line in square brackets.  For example, "[SCEV] ..."
+  or "[OpenMP] ...". This helps email filters and searches for post-commit
+  reviews.
+
+* The body, if it exists, should be separated from the title by an empty line.
+
+* The body should be concise, but explanatory, including a complete
+  reasoning.  Unless it is required to understand the change, examples,
+  code snippets and gory details should be left to bug comments, web
+  review or the mailing list.
+
+* If the patch fixes a bug in bugzilla, please include the PR# in the message.
+
+* `Attribution of Changes`_ should be in a separate line, after the end of
+  the body, as simple as "Patch by John Doe.". This is how we officially
+  handle attribution, and there are automated processes that rely on this
+  format.
+
+* Text formatting and spelling should follow the same rules as documentation
+  and in-code comments, ex. capitalization, full stop, etc.
+
+* If the commit is a bug fix on top of another recently committed patch, or a
+  revert or reapply of a patch, include the svn revision number of the prior
+  related commit. This could be as simple as "Revert rNNNN because it caused
+  PR#".
+
+For minor violations of these recommendations, the community normally favors
+reminding the contributor of this policy over reverting. Minor corrections and
+omissions can be handled by sending a reply to the commits mailing list.
+
+Obtaining Commit Access
+-----------------------
+
+We grant commit access to contributors with a track record of submitting high
+quality patches.  If you would like commit access, please send an email to
+`Chris <mailto:clattner at llvm.org>`_ with the following information:
+
+#. The user name you want to commit with, e.g. "hacker".
+
+#. The full name and email address you want message to llvm-commits to come
+   from, e.g. "J. Random Hacker <hacker at yoyodyne.com>".
+
+#. A "password hash" of the password you want to use, e.g. "``2ACR96qjUqsyM``".
+   Note that you don't ever tell us what your password is; you just give it to
+   us in an encrypted form.  To get this, run "``htpasswd``" (a utility that
+   comes with apache) in *crypt* mode (often enabled with "``-d``"), or find a web
+   page that will do it for you.  Note that our system does not work with MD5
+   hashes.  These are significantly longer than a crypt hash - e.g.
+   "``$apr1$vea6bBV2$Z8IFx.AfeD8LhqlZFqJer0``", we only accept the shorter crypt hash.
+
+Once you've been granted commit access, you should be able to check out an LLVM
+tree with an SVN URL of "https://username@llvm.org/..." instead of the normal
+anonymous URL of "http://llvm.org/...".  The first time you commit you'll have
+to type in your password.  Note that you may get a warning from SVN about an
+untrusted key; you can ignore this.  To verify that your commit access works,
+please do a test commit (e.g. change a comment or add a blank line).  Your first
+commit to a repository may require the autogenerated email to be approved by a
+mailing list.  This is normal and will be done when the mailing list owner has
+time.
+
+If you have recently been granted commit access, these policies apply:
+
+#. You are granted *commit-after-approval* to all parts of LLVM.  To get
+   approval, submit a `patch`_ to `llvm-commits
+   <http://lists.llvm.org/mailman/listinfo/llvm-commits>`_. When approved,
+   you may commit it yourself.
+
+#. You are allowed to commit patches without approval which you think are
+   obvious. This is clearly a subjective decision --- we simply expect you to
+   use good judgement.  Examples include: fixing build breakage, reverting
+   obviously broken patches, documentation/comment changes, any other minor
+   changes.
+
+#. You are allowed to commit patches without approval to those portions of LLVM
+   that you have contributed or maintain (i.e., have been assigned
+   responsibility for), with the proviso that such commits must not break the
+   build.  This is a "trust but verify" policy, and commits of this nature are
+   reviewed after they are committed.
+
+#. Multiple violations of these policies or a single egregious violation may
+   cause commit access to be revoked.
+
+In any case, your changes are still subject to `code review`_ (either before or
+after they are committed, depending on the nature of the change).  You are
+encouraged to review other peoples' patches as well, but you aren't required
+to do so.
+
+.. _discuss the change/gather consensus:
+
+Making a Major Change
+---------------------
+
+When a developer begins a major new project with the aim of contributing it back
+to LLVM, they should inform the community with an email to the `llvm-dev
+<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ email list, to the extent
+possible. The reason for this is to:
+
+#. keep the community informed about future changes to LLVM,
+
+#. avoid duplication of effort by preventing multiple parties working on the
+   same thing and not knowing about it, and
+
+#. ensure that any technical issues around the proposed work are discussed and
+   resolved before any significant work is done.
+
+The design of LLVM is carefully controlled to ensure that all the pieces fit
+together well and are as consistent as possible. If you plan to make a major
+change to the way LLVM works or want to add a major new extension, it is a good
+idea to get consensus with the development community before you start working on
+it.
+
+Once the design of the new feature is finalized, the work itself should be done
+as a series of `incremental changes`_, not as a long-term development branch.
+
+.. _incremental changes:
+
+Incremental Development
+-----------------------
+
+In the LLVM project, we do all significant changes as a series of incremental
+patches.  We have a strong dislike for huge changes or long-term development
+branches.  Long-term development branches have a number of drawbacks:
+
+#. Branches must have mainline merged into them periodically.  If the branch
+   development and mainline development occur in the same pieces of code,
+   resolving merge conflicts can take a lot of time.
+
+#. Other people in the community tend to ignore work on branches.
+
+#. Huge changes (produced when a branch is merged back onto mainline) are
+   extremely difficult to `code review`_.
+
+#. Branches are not routinely tested by our nightly tester infrastructure.
+
+#. Changes developed as monolithic large changes often don't work until the
+   entire set of changes is done.  Breaking it down into a set of smaller
+   changes increases the odds that any of the work will be committed to the main
+   repository.
+
+To address these problems, LLVM uses an incremental development style and we
+require contributors to follow this practice when making a large/invasive
+change.  Some tips:
+
+* Large/invasive changes usually have a number of secondary changes that are
+  required before the big change can be made (e.g. API cleanup, etc).  These
+  sorts of changes can often be done before the major change is done,
+  independently of that work.
+
+* The remaining inter-related work should be decomposed into unrelated sets of
+  changes if possible.  Once this is done, define the first increment and get
+  consensus on what the end goal of the change is.
+
+* Each change in the set can be stand alone (e.g. to fix a bug), or part of a
+  planned series of changes that works towards the development goal.
+
+* Each change should be kept as small as possible. This simplifies your work
+  (into a logical progression), simplifies code review and reduces the chance
+  that you will get negative feedback on the change. Small increments also
+  facilitate the maintenance of a high quality code base.
+
+* Often, an independent precursor to a big change is to add a new API and slowly
+  migrate clients to use the new API.  Each change to use the new API is often
+  "obvious" and can be committed without review.  Once the new API is in place
+  and used, it is much easier to replace the underlying implementation of the
+  API.  This implementation change is logically separate from the API
+  change.
+
+If you are interested in making a large change, and this scares you, please make
+sure to first `discuss the change/gather consensus`_ then ask about the best way
+to go about making the change.
+
+Attribution of Changes
+----------------------
+
+When contributors submit a patch to an LLVM project, other developers with
+commit access may commit it for the author once appropriate (based on the
+progression of code review, etc.). When doing so, it is important to retain
+correct attribution of contributions to their contributors. However, we do not
+want the source code to be littered with random attributions "this code written
+by J. Random Hacker" (this is noisy and distracting). In practice, the revision
+control system keeps a perfect history of who changed what, and the CREDITS.txt
+file describes higher-level contributions. If you commit a patch for someone
+else, please follow the attribution of changes in the simple manner as outlined
+by the `commit messages`_ section. Overall, please do not add contributor names
+to the source code.
+
+Also, don't commit patches authored by others unless they have submitted the
+patch to the project or you have been authorized to submit them on their behalf
+(you work together and your company authorized you to contribute the patches,
+etc.). The author should first submit them to the relevant project's commit
+list, development list, or LLVM bug tracker component. If someone sends you
+a patch privately, encourage them to submit it to the appropriate list first.
+
+
+.. _IR backwards compatibility:
+
+IR Backwards Compatibility
+--------------------------
+
+When the IR format has to be changed, keep in mind that we try to maintain some
+backwards compatibility. The rules are intended as a balance between convenience
+for llvm users and not imposing a big burden on llvm developers:
+
+* The textual format is not backwards compatible. We don't change it too often,
+  but there are no specific promises.
+
+* Additions and changes to the IR should be reflected in
+  ``test/Bitcode/compatibility.ll``.
+
+* The current LLVM version supports loading any bitcode since version 3.0.
+
+* After each X.Y release, ``compatibility.ll`` must be copied to
+  ``compatibility-X.Y.ll``. The corresponding bitcode file should be assembled
+  using the X.Y build and committed as ``compatibility-X.Y.ll.bc``.
+
+* Newer releases can ignore features from older releases, but they cannot
+  miscompile them. For example, if nsw is ever replaced with something else,
+  dropping it would be a valid way to upgrade the IR.
+
+* Debug metadata is special in that it is currently dropped during upgrades.
+
+* Non-debug metadata is defined to be safe to drop, so a valid way to upgrade
+  it is to drop it. That is not very user friendly and a bit more effort is
+  expected, but no promises are made.
+
+C API Changes
+----------------
+
+* Stability Guarantees: The C API is, in general, a "best effort" for stability.
+  This means that we make every attempt to keep the C API stable, but that
+  stability will be limited by the abstractness of the interface and the
+  stability of the C++ API that it wraps. In practice, this means that things
+  like "create debug info" or "create this type of instruction" are likely to be
+  less stable than "take this IR file and JIT it for my current machine".
+
+* Release stability: We won't break the C API on the release branch with patches
+  that go on that branch, with the exception that we will fix an unintentional
+  C API break that will keep the release consistent with both the previous and
+  next release.
+
+* Testing: Patches to the C API are expected to come with tests just like any
+  other patch.
+
+* Including new things into the API: If an LLVM subcomponent has a C API already
+  included, then expanding that C API is acceptable. Adding C API for
+  subcomponents that don't currently have one needs to be discussed on the
+  mailing list for design and maintainability feedback prior to implementation.
+
+* Documentation: Any changes to the C API are required to be documented in the
+  release notes so that it's clear to external users who do not follow the
+  project how the C API is changing and evolving.
+
+New Targets
+-----------
+
+LLVM is very receptive to new targets, even experimental ones, but a number of
+problems can appear when adding new large portions of code, and back-ends are
+normally added in bulk.  We have found that landing large pieces of new code 
+and then trying to fix emergent problems in-tree is problematic for a variety 
+of reasons.
+
+For these reasons, new targets are *always* added as *experimental* until
+they can be proven stable, and later moved to non-experimental. The difference
+between both classes is that experimental targets are not built by default
+(need to be added to -DLLVM_TARGETS_TO_BUILD at CMake time).
+
+The basic rules for a back-end to be upstreamed in **experimental** mode are:
+
+* Every target must have a :ref:`code owner<code owners>`. The `CODE_OWNERS.TXT`
+  file has to be updated as part of the first merge. The code owner makes sure
+  that changes to the target get reviewed and steers the overall effort.
+
+* There must be an active community behind the target. This community
+  will help maintain the target by providing buildbots, fixing
+  bugs, answering the LLVM community's questions and making sure the new
+  target doesn't break any of the other targets, or generic code. This
+  behavior is expected to continue throughout the lifetime of the
+  target's code.
+
+* The code must be free of contentious issues, for example, large
+  changes in how the IR behaves or should be formed by the front-ends,
+  unless agreed by the majority of the community via refactoring of the
+  (:doc:`IR standard<LangRef>`) **before** the merge of the new target changes,
+  following the :ref:`IR backwards compatibility`.
+
+* The code conforms to all of the policies laid out in this developer policy
+  document, including license, patent, and coding standards.
+
+* The target should have either reasonable documentation on how it
+  works (ISA, ABI, etc.) or a publicly available simulator/hardware
+  (either free or cheap enough) - preferably both.  This allows
+  developers to validate assumptions, understand constraints and review code 
+  that can affect the target. 
+
+In addition, the rules for a back-end to be promoted to **official** are:
+
+* The target must have addressed every other minimum requirement and
+  have been stable in tree for at least 3 months. This cool down
+  period is to make sure that the back-end and the target community can
+  endure continuous upstream development for the foreseeable future.
+
+* The target's code must have been completely adapted to this policy
+  as well as the :doc:`coding standards<CodingStandards>`. Any exceptions that
+  were made to move into experimental mode must have been fixed **before**
+  becoming official.
+
+* The test coverage needs to be broad and well written (small tests,
+  well documented). The build target ``check-all`` must pass with the
+  new target built, and where applicable, the ``test-suite`` must also
+  pass without errors, in at least one configuration (publicly
+  demonstrated, for example, via buildbots).
+
+* Public buildbots need to be created and actively maintained, unless
+  the target requires no additional buildbots (ex. ``check-all`` covers
+  all tests). The more relevant and public the new target's CI infrastructure
+  is, the more the LLVM community will embrace it.
+
+To **continue** as a supported and official target:
+
+* The maintainer(s) must continue following these rules throughout the lifetime
+  of the target. Continuous violations of aforementioned rules and policies
+  could lead to complete removal of the target from the code base.
+
+* Degradation in support, documentation or test coverage will make the target as
+  nuisance to other targets and be considered a candidate for deprecation and
+  ultimately removed.
+
+In essences, these rules are necessary for targets to gain and retain their
+status, but also markers to define bit-rot, and will be used to clean up the
+tree from unmaintained targets.
+
+.. _copyright-license-patents:
+
+Copyright, License, and Patents
+===============================
+
+.. note::
+
+   This section deals with legal matters but does not provide legal advice.  We
+   are not lawyers --- please seek legal counsel from an attorney.
+
+This section addresses the issues of copyright, license and patents for the LLVM
+project.  The copyright for the code is held by the individual contributors of
+the code and the terms of its license to LLVM users and developers is the
+`University of Illinois/NCSA Open Source License
+<http://www.opensource.org/licenses/UoI-NCSA.php>`_ (with portions dual licensed
+under the `MIT License <http://www.opensource.org/licenses/mit-license.php>`_,
+see below).  As contributor to the LLVM project, you agree to allow any
+contributions to the project to licensed under these terms.
+
+Copyright
+---------
+
+The LLVM project does not require copyright assignments, which means that the
+copyright for the code in the project is held by its respective contributors who
+have each agreed to release their contributed code under the terms of the `LLVM
+License`_.
+
+An implication of this is that the LLVM license is unlikely to ever change:
+changing it would require tracking down all the contributors to LLVM and getting
+them to agree that a license change is acceptable for their contribution.  Since
+there are no plans to change the license, this is not a cause for concern.
+
+As a contributor to the project, this means that you (or your company) retain
+ownership of the code you contribute, that it cannot be used in a way that
+contradicts the license (which is a liberal BSD-style license), and that the
+license for your contributions won't change without your approval in the
+future.
+
+.. _LLVM License:
+
+License
+-------
+
+We intend to keep LLVM perpetually open source and to use a liberal open source
+license. **As a contributor to the project, you agree that any contributions be
+licensed under the terms of the corresponding subproject.** All of the code in
+LLVM is available under the `University of Illinois/NCSA Open Source License
+<http://www.opensource.org/licenses/UoI-NCSA.php>`_, which boils down to
+this:
+
+* You can freely distribute LLVM.
+* You must retain the copyright notice if you redistribute LLVM.
+* Binaries derived from LLVM must reproduce the copyright notice (e.g. in an
+  included readme file).
+* You can't use our names to promote your LLVM derived products.
+* There's no warranty on LLVM at all.
+
+We believe this fosters the widest adoption of LLVM because it **allows
+commercial products to be derived from LLVM** with few restrictions and without
+a requirement for making any derived works also open source (i.e.  LLVM's
+license is not a "copyleft" license like the GPL). We suggest that you read the
+`License <http://www.opensource.org/licenses/UoI-NCSA.php>`_ if further
+clarification is needed.
+
+In addition to the UIUC license, the runtime library components of LLVM
+(**compiler_rt, libc++, and libclc**) are also licensed under the `MIT License
+<http://www.opensource.org/licenses/mit-license.php>`_, which does not contain
+the binary redistribution clause.  As a user of these runtime libraries, it
+means that you can choose to use the code under either license (and thus don't
+need the binary redistribution clause), and as a contributor to the code that
+you agree that any contributions to these libraries be licensed under both
+licenses.  We feel that this is important for runtime libraries, because they
+are implicitly linked into applications and therefore should not subject those
+applications to the binary redistribution clause. This also means that it is ok
+to move code from (e.g.)  libc++ to the LLVM core without concern, but that code
+cannot be moved from the LLVM core to libc++ without the copyright owner's
+permission.
+
+Note that the LLVM Project does distribute dragonegg, **which is
+GPL.** This means that anything "linked" into dragonegg must itself be compatible
+with the GPL, and must be releasable under the terms of the GPL.  This implies
+that **any code linked into dragonegg and distributed to others may be subject to
+the viral aspects of the GPL** (for example, a proprietary code generator linked
+into dragonegg must be made available under the GPL).  This is not a problem for
+code already distributed under a more liberal license (like the UIUC license),
+and GPL-containing subprojects are kept in separate SVN repositories whose
+LICENSE.txt files specifically indicate that they contain GPL code.
+
+We have no plans to change the license of LLVM.  If you have questions or
+comments about the license, please contact the `LLVM Developer's Mailing
+List <mailto:llvm-dev at lists.llvm.org>`_.
+
+Patents
+-------
+
+To the best of our knowledge, LLVM does not infringe on any patents (we have
+actually removed code from LLVM in the past that was found to infringe).  Having
+code in LLVM that infringes on patents would violate an important goal of the
+project by making it hard or impossible to reuse the code for arbitrary purposes
+(including commercial use).
+
+When contributing code, we expect contributors to notify us of any potential for
+patent-related trouble with their changes (including from third parties).  If
+you or your employer own the rights to a patent and would like to contribute
+code to LLVM that relies on it, we require that the copyright owner sign an
+agreement that allows any other user of LLVM to freely use your patent.  Please
+contact the `LLVM Foundation Board of Directors <mailto:board at llvm.org>`_ for more
+details.

Added: www-releases/trunk/4.0.1/docs/_sources/ExceptionHandling.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/ExceptionHandling.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/ExceptionHandling.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/ExceptionHandling.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,841 @@
+==========================
+Exception Handling in LLVM
+==========================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+This document is the central repository for all information pertaining to
+exception handling in LLVM.  It describes the format that LLVM exception
+handling information takes, which is useful for those interested in creating
+front-ends or dealing directly with the information.  Further, this document
+provides specific examples of what exception handling information is used for in
+C and C++.
+
+Itanium ABI Zero-cost Exception Handling
+----------------------------------------
+
+Exception handling for most programming languages is designed to recover from
+conditions that rarely occur during general use of an application.  To that end,
+exception handling should not interfere with the main flow of an application's
+algorithm by performing checkpointing tasks, such as saving the current pc or
+register state.
+
+The Itanium ABI Exception Handling Specification defines a methodology for
+providing outlying data in the form of exception tables without inlining
+speculative exception handling code in the flow of an application's main
+algorithm.  Thus, the specification is said to add "zero-cost" to the normal
+execution of an application.
+
+A more complete description of the Itanium ABI exception handling runtime
+support of can be found at `Itanium C++ ABI: Exception Handling
+<http://mentorembedded.github.com/cxx-abi/abi-eh.html>`_. A description of the
+exception frame format can be found at `Exception Frames
+<http://refspecs.linuxfoundation.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html>`_,
+with details of the DWARF 4 specification at `DWARF 4 Standard
+<http://dwarfstd.org/Dwarf4Std.php>`_.  A description for the C++ exception
+table formats can be found at `Exception Handling Tables
+<http://mentorembedded.github.com/cxx-abi/exceptions.pdf>`_.
+
+Setjmp/Longjmp Exception Handling
+---------------------------------
+
+Setjmp/Longjmp (SJLJ) based exception handling uses LLVM intrinsics
+`llvm.eh.sjlj.setjmp`_ and `llvm.eh.sjlj.longjmp`_ to handle control flow for
+exception handling.
+
+For each function which does exception processing --- be it ``try``/``catch``
+blocks or cleanups --- that function registers itself on a global frame
+list. When exceptions are unwinding, the runtime uses this list to identify
+which functions need processing.
+
+Landing pad selection is encoded in the call site entry of the function
+context. The runtime returns to the function via `llvm.eh.sjlj.longjmp`_, where
+a switch table transfers control to the appropriate landing pad based on the
+index stored in the function context.
+
+In contrast to DWARF exception handling, which encodes exception regions and
+frame information in out-of-line tables, SJLJ exception handling builds and
+removes the unwind frame context at runtime. This results in faster exception
+handling at the expense of slower execution when no exceptions are thrown. As
+exceptions are, by their nature, intended for uncommon code paths, DWARF
+exception handling is generally preferred to SJLJ.
+
+Windows Runtime Exception Handling
+-----------------------------------
+
+LLVM supports handling exceptions produced by the Windows runtime, but it
+requires a very different intermediate representation. It is not based on the
+":ref:`landingpad <i_landingpad>`" instruction like the other two models, and is
+described later in this document under :ref:`wineh`.
+
+Overview
+--------
+
+When an exception is thrown in LLVM code, the runtime does its best to find a
+handler suited to processing the circumstance.
+
+The runtime first attempts to find an *exception frame* corresponding to the
+function where the exception was thrown.  If the programming language supports
+exception handling (e.g. C++), the exception frame contains a reference to an
+exception table describing how to process the exception.  If the language does
+not support exception handling (e.g. C), or if the exception needs to be
+forwarded to a prior activation, the exception frame contains information about
+how to unwind the current activation and restore the state of the prior
+activation.  This process is repeated until the exception is handled. If the
+exception is not handled and no activations remain, then the application is
+terminated with an appropriate error message.
+
+Because different programming languages have different behaviors when handling
+exceptions, the exception handling ABI provides a mechanism for
+supplying *personalities*. An exception handling personality is defined by
+way of a *personality function* (e.g. ``__gxx_personality_v0`` in C++),
+which receives the context of the exception, an *exception structure*
+containing the exception object type and value, and a reference to the exception
+table for the current function.  The personality function for the current
+compile unit is specified in a *common exception frame*.
+
+The organization of an exception table is language dependent. For C++, an
+exception table is organized as a series of code ranges defining what to do if
+an exception occurs in that range. Typically, the information associated with a
+range defines which types of exception objects (using C++ *type info*) that are
+handled in that range, and an associated action that should take place. Actions
+typically pass control to a *landing pad*.
+
+A landing pad corresponds roughly to the code found in the ``catch`` portion of
+a ``try``/``catch`` sequence. When execution resumes at a landing pad, it
+receives an *exception structure* and a *selector value* corresponding to the
+*type* of exception thrown. The selector is then used to determine which *catch*
+should actually process the exception.
+
+LLVM Code Generation
+====================
+
+From a C++ developer's perspective, exceptions are defined in terms of the
+``throw`` and ``try``/``catch`` statements. In this section we will describe the
+implementation of LLVM exception handling in terms of C++ examples.
+
+Throw
+-----
+
+Languages that support exception handling typically provide a ``throw``
+operation to initiate the exception process. Internally, a ``throw`` operation
+breaks down into two steps.
+
+#. A request is made to allocate exception space for an exception structure.
+   This structure needs to survive beyond the current activation. This structure
+   will contain the type and value of the object being thrown.
+
+#. A call is made to the runtime to raise the exception, passing the exception
+   structure as an argument.
+
+In C++, the allocation of the exception structure is done by the
+``__cxa_allocate_exception`` runtime function. The exception raising is handled
+by ``__cxa_throw``. The type of the exception is represented using a C++ RTTI
+structure.
+
+Try/Catch
+---------
+
+A call within the scope of a *try* statement can potentially raise an
+exception. In those circumstances, the LLVM C++ front-end replaces the call with
+an ``invoke`` instruction. Unlike a call, the ``invoke`` has two potential
+continuation points:
+
+#. where to continue when the call succeeds as per normal, and
+
+#. where to continue if the call raises an exception, either by a throw or the
+   unwinding of a throw
+
+The term used to define the place where an ``invoke`` continues after an
+exception is called a *landing pad*. LLVM landing pads are conceptually
+alternative function entry points where an exception structure reference and a
+type info index are passed in as arguments. The landing pad saves the exception
+structure reference and then proceeds to select the catch block that corresponds
+to the type info of the exception object.
+
+The LLVM :ref:`i_landingpad` is used to convey information about the landing
+pad to the back end. For C++, the ``landingpad`` instruction returns a pointer
+and integer pair corresponding to the pointer to the *exception structure* and
+the *selector value* respectively.
+
+The ``landingpad`` instruction looks for a reference to the personality
+function to be used for this ``try``/``catch`` sequence in the parent
+function's attribute list. The instruction contains a list of *cleanup*,
+*catch*, and *filter* clauses. The exception is tested against the clauses
+sequentially from first to last. The clauses have the following meanings:
+
+-  ``catch <type> @ExcType``
+
+   - This clause means that the landingpad block should be entered if the
+     exception being thrown is of type ``@ExcType`` or a subtype of
+     ``@ExcType``. For C++, ``@ExcType`` is a pointer to the ``std::type_info``
+     object (an RTTI object) representing the C++ exception type.
+
+   - If ``@ExcType`` is ``null``, any exception matches, so the landingpad
+     should always be entered. This is used for C++ catch-all blocks ("``catch
+     (...)``").
+
+   - When this clause is matched, the selector value will be equal to the value
+     returned by "``@llvm.eh.typeid.for(i8* @ExcType)``". This will always be a
+     positive value.
+
+-  ``filter <type> [<type> @ExcType1, ..., <type> @ExcTypeN]``
+
+   - This clause means that the landingpad should be entered if the exception
+     being thrown does *not* match any of the types in the list (which, for C++,
+     are again specified as ``std::type_info`` pointers).
+
+   - C++ front-ends use this to implement C++ exception specifications, such as
+     "``void foo() throw (ExcType1, ..., ExcTypeN) { ... }``".
+
+   - When this clause is matched, the selector value will be negative.
+
+   - The array argument to ``filter`` may be empty; for example, "``[0 x i8**]
+     undef``". This means that the landingpad should always be entered. (Note
+     that such a ``filter`` would not be equivalent to "``catch i8* null``",
+     because ``filter`` and ``catch`` produce negative and positive selector
+     values respectively.)
+
+-  ``cleanup``
+
+   - This clause means that the landingpad should always be entered.
+
+   - C++ front-ends use this for calling objects' destructors.
+
+   - When this clause is matched, the selector value will be zero.
+
+   - The runtime may treat "``cleanup``" differently from "``catch <type>
+     null``".
+
+     In C++, if an unhandled exception occurs, the language runtime will call
+     ``std::terminate()``, but it is implementation-defined whether the runtime
+     unwinds the stack and calls object destructors first. For example, the GNU
+     C++ unwinder does not call object destructors when an unhandled exception
+     occurs. The reason for this is to improve debuggability: it ensures that
+     ``std::terminate()`` is called from the context of the ``throw``, so that
+     this context is not lost by unwinding the stack. A runtime will typically
+     implement this by searching for a matching non-``cleanup`` clause, and
+     aborting if it does not find one, before entering any landingpad blocks.
+
+Once the landing pad has the type info selector, the code branches to the code
+for the first catch. The catch then checks the value of the type info selector
+against the index of type info for that catch.  Since the type info index is not
+known until all the type infos have been gathered in the backend, the catch code
+must call the `llvm.eh.typeid.for`_ intrinsic to determine the index for a given
+type info. If the catch fails to match the selector then control is passed on to
+the next catch.
+
+Finally, the entry and exit of catch code is bracketed with calls to
+``__cxa_begin_catch`` and ``__cxa_end_catch``.
+
+* ``__cxa_begin_catch`` takes an exception structure reference as an argument
+  and returns the value of the exception object.
+
+* ``__cxa_end_catch`` takes no arguments. This function:
+
+  #. Locates the most recently caught exception and decrements its handler
+     count,
+
+  #. Removes the exception from the *caught* stack if the handler count goes to
+     zero, and
+
+  #. Destroys the exception if the handler count goes to zero and the exception
+     was not re-thrown by throw.
+
+  .. note::
+
+    a rethrow from within the catch may replace this call with a
+    ``__cxa_rethrow``.
+
+Cleanups
+--------
+
+A cleanup is extra code which needs to be run as part of unwinding a scope.  C++
+destructors are a typical example, but other languages and language extensions
+provide a variety of different kinds of cleanups. In general, a landing pad may
+need to run arbitrary amounts of cleanup code before actually entering a catch
+block. To indicate the presence of cleanups, a :ref:`i_landingpad` should have
+a *cleanup* clause.  Otherwise, the unwinder will not stop at the landing pad if
+there are no catches or filters that require it to.
+
+.. note::
+
+  Do not allow a new exception to propagate out of the execution of a
+  cleanup. This can corrupt the internal state of the unwinder.  Different
+  languages describe different high-level semantics for these situations: for
+  example, C++ requires that the process be terminated, whereas Ada cancels both
+  exceptions and throws a third.
+
+When all cleanups are finished, if the exception is not handled by the current
+function, resume unwinding by calling the :ref:`resume instruction <i_resume>`,
+passing in the result of the ``landingpad`` instruction for the original
+landing pad.
+
+Throw Filters
+-------------
+
+C++ allows the specification of which exception types may be thrown from a
+function. To represent this, a top level landing pad may exist to filter out
+invalid types. To express this in LLVM code the :ref:`i_landingpad` will have a
+filter clause. The clause consists of an array of type infos.
+``landingpad`` will return a negative value
+if the exception does not match any of the type infos. If no match is found then
+a call to ``__cxa_call_unexpected`` should be made, otherwise
+``_Unwind_Resume``.  Each of these functions requires a reference to the
+exception structure.  Note that the most general form of a ``landingpad``
+instruction can have any number of catch, cleanup, and filter clauses (though
+having more than one cleanup is pointless). The LLVM C++ front-end can generate
+such ``landingpad`` instructions due to inlining creating nested exception
+handling scopes.
+
+.. _undefined:
+
+Restrictions
+------------
+
+The unwinder delegates the decision of whether to stop in a call frame to that
+call frame's language-specific personality function. Not all unwinders guarantee
+that they will stop to perform cleanups. For example, the GNU C++ unwinder
+doesn't do so unless the exception is actually caught somewhere further up the
+stack.
+
+In order for inlining to behave correctly, landing pads must be prepared to
+handle selector results that they did not originally advertise. Suppose that a
+function catches exceptions of type ``A``, and it's inlined into a function that
+catches exceptions of type ``B``. The inliner will update the ``landingpad``
+instruction for the inlined landing pad to include the fact that ``B`` is also
+caught. If that landing pad assumes that it will only be entered to catch an
+``A``, it's in for a rude awakening.  Consequently, landing pads must test for
+the selector results they understand and then resume exception propagation with
+the `resume instruction <LangRef.html#i_resume>`_ if none of the conditions
+match.
+
+Exception Handling Intrinsics
+=============================
+
+In addition to the ``landingpad`` and ``resume`` instructions, LLVM uses several
+intrinsic functions (name prefixed with ``llvm.eh``) to provide exception
+handling information at various points in generated code.
+
+.. _llvm.eh.typeid.for:
+
+``llvm.eh.typeid.for``
+----------------------
+
+.. code-block:: llvm
+
+  i32 @llvm.eh.typeid.for(i8* %type_info)
+
+
+This intrinsic returns the type info index in the exception table of the current
+function.  This value can be used to compare against the result of
+``landingpad`` instruction.  The single argument is a reference to a type info.
+
+Uses of this intrinsic are generated by the C++ front-end.
+
+.. _llvm.eh.begincatch:
+
+``llvm.eh.begincatch``
+----------------------
+
+.. code-block:: llvm
+
+  void @llvm.eh.begincatch(i8* %ehptr, i8* %ehobj)
+
+
+This intrinsic marks the beginning of catch handling code within the blocks
+following a ``landingpad`` instruction.  The exact behavior of this function
+depends on the compilation target and the personality function associated
+with the ``landingpad`` instruction.
+
+The first argument to this intrinsic is a pointer that was previously extracted
+from the aggregate return value of the ``landingpad`` instruction.  The second
+argument to the intrinsic is a pointer to stack space where the exception object
+should be stored. The runtime handles the details of copying the exception
+object into the slot. If the second parameter is null, no copy occurs.
+
+Uses of this intrinsic are generated by the C++ front-end.  Many targets will
+use implementation-specific functions (such as ``__cxa_begin_catch``) instead
+of this intrinsic.  The intrinsic is provided for targets that require a more
+abstract interface.
+
+When used in the native Windows C++ exception handling implementation, this
+intrinsic serves as a placeholder to delimit code before a catch handler is
+outlined.  When the handler is is outlined, this intrinsic will be replaced
+by instructions that retrieve the exception object pointer from the frame
+allocation block.
+
+
+.. _llvm.eh.endcatch:
+
+``llvm.eh.endcatch``
+----------------------
+
+.. code-block:: llvm
+
+  void @llvm.eh.endcatch()
+
+
+This intrinsic marks the end of catch handling code within the current block,
+which will be a successor of a block which called ``llvm.eh.begincatch''.
+The exact behavior of this function depends on the compilation target and the
+personality function associated with the corresponding ``landingpad``
+instruction.
+
+There may be more than one call to ``llvm.eh.endcatch`` for any given call to
+``llvm.eh.begincatch`` with each ``llvm.eh.endcatch`` call corresponding to the
+end of a different control path.  All control paths following a call to
+``llvm.eh.begincatch`` must reach a call to ``llvm.eh.endcatch``.
+
+Uses of this intrinsic are generated by the C++ front-end.  Many targets will
+use implementation-specific functions (such as ``__cxa_begin_catch``) instead
+of this intrinsic.  The intrinsic is provided for targets that require a more
+abstract interface.
+
+When used in the native Windows C++ exception handling implementation, this
+intrinsic serves as a placeholder to delimit code before a catch handler is
+outlined.  After the handler is outlined, this intrinsic is simply removed.
+
+
+.. _llvm.eh.exceptionpointer:
+
+``llvm.eh.exceptionpointer``
+----------------------------
+
+.. code-block:: text
+
+  i8 addrspace(N)* @llvm.eh.padparam.pNi8(token %catchpad)
+
+
+This intrinsic retrieves a pointer to the exception caught by the given
+``catchpad``.
+
+
+SJLJ Intrinsics
+---------------
+
+The ``llvm.eh.sjlj`` intrinsics are used internally within LLVM's
+backend.  Uses of them are generated by the backend's
+``SjLjEHPrepare`` pass.
+
+.. _llvm.eh.sjlj.setjmp:
+
+``llvm.eh.sjlj.setjmp``
+~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: text
+
+  i32 @llvm.eh.sjlj.setjmp(i8* %setjmp_buf)
+
+For SJLJ based exception handling, this intrinsic forces register saving for the
+current function and stores the address of the following instruction for use as
+a destination address by `llvm.eh.sjlj.longjmp`_. The buffer format and the
+overall functioning of this intrinsic is compatible with the GCC
+``__builtin_setjmp`` implementation allowing code built with the clang and GCC
+to interoperate.
+
+The single parameter is a pointer to a five word buffer in which the calling
+context is saved. The front end places the frame pointer in the first word, and
+the target implementation of this intrinsic should place the destination address
+for a `llvm.eh.sjlj.longjmp`_ in the second word. The following three words are
+available for use in a target-specific manner.
+
+.. _llvm.eh.sjlj.longjmp:
+
+``llvm.eh.sjlj.longjmp``
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: llvm
+
+  void @llvm.eh.sjlj.longjmp(i8* %setjmp_buf)
+
+For SJLJ based exception handling, the ``llvm.eh.sjlj.longjmp`` intrinsic is
+used to implement ``__builtin_longjmp()``. The single parameter is a pointer to
+a buffer populated by `llvm.eh.sjlj.setjmp`_. The frame pointer and stack
+pointer are restored from the buffer, then control is transferred to the
+destination address.
+
+``llvm.eh.sjlj.lsda``
+~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: llvm
+
+  i8* @llvm.eh.sjlj.lsda()
+
+For SJLJ based exception handling, the ``llvm.eh.sjlj.lsda`` intrinsic returns
+the address of the Language Specific Data Area (LSDA) for the current
+function. The SJLJ front-end code stores this address in the exception handling
+function context for use by the runtime.
+
+``llvm.eh.sjlj.callsite``
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: llvm
+
+  void @llvm.eh.sjlj.callsite(i32 %call_site_num)
+
+For SJLJ based exception handling, the ``llvm.eh.sjlj.callsite`` intrinsic
+identifies the callsite value associated with the following ``invoke``
+instruction. This is used to ensure that landing pad entries in the LSDA are
+generated in matching order.
+
+Asm Table Formats
+=================
+
+There are two tables that are used by the exception handling runtime to
+determine which actions should be taken when an exception is thrown.
+
+Exception Handling Frame
+------------------------
+
+An exception handling frame ``eh_frame`` is very similar to the unwind frame
+used by DWARF debug info. The frame contains all the information necessary to
+tear down the current frame and restore the state of the prior frame. There is
+an exception handling frame for each function in a compile unit, plus a common
+exception handling frame that defines information common to all functions in the
+unit.
+
+The format of this call frame information (CFI) is often platform-dependent,
+however. ARM, for example, defines their own format. Apple has their own compact
+unwind info format.  On Windows, another format is used for all architectures
+since 32-bit x86.  LLVM will emit whatever information is required by the
+target.
+
+Exception Tables
+----------------
+
+An exception table contains information about what actions to take when an
+exception is thrown in a particular part of a function's code. This is typically
+referred to as the language-specific data area (LSDA). The format of the LSDA
+table is specific to the personality function, but the majority of personalities
+out there use a variation of the tables consumed by ``__gxx_personality_v0``.
+There is one exception table per function, except leaf functions and functions
+that have calls only to non-throwing functions. They do not need an exception
+table.
+
+.. _wineh:
+
+Exception Handling using the Windows Runtime
+=================================================
+
+Background on Windows exceptions
+---------------------------------
+
+Interacting with exceptions on Windows is significantly more complicated than
+on Itanium C++ ABI platforms. The fundamental difference between the two models
+is that Itanium EH is designed around the idea of "successive unwinding," while
+Windows EH is not.
+
+Under Itanium, throwing an exception typically involes allocating thread local
+memory to hold the exception, and calling into the EH runtime. The runtime
+identifies frames with appropriate exception handling actions, and successively
+resets the register context of the current thread to the most recently active
+frame with actions to run. In LLVM, execution resumes at a ``landingpad``
+instruction, which produces register values provided by the runtime. If a
+function is only cleaning up allocated resources, the function is responsible
+for calling ``_Unwind_Resume`` to transition to the next most recently active
+frame after it is finished cleaning up. Eventually, the frame responsible for
+handling the exception calls ``__cxa_end_catch`` to destroy the exception,
+release its memory, and resume normal control flow.
+
+The Windows EH model does not use these successive register context resets.
+Instead, the active exception is typically described by a frame on the stack.
+In the case of C++ exceptions, the exception object is allocated in stack memory
+and its address is passed to ``__CxxThrowException``. General purpose structured
+exceptions (SEH) are more analogous to Linux signals, and they are dispatched by
+userspace DLLs provided with Windows. Each frame on the stack has an assigned EH
+personality routine, which decides what actions to take to handle the exception.
+There are a few major personalities for C and C++ code: the C++ personality
+(``__CxxFrameHandler3``) and the SEH personalities (``_except_handler3``,
+``_except_handler4``, and ``__C_specific_handler``). All of them implement
+cleanups by calling back into a "funclet" contained in the parent function.
+
+Funclets, in this context, are regions of the parent function that can be called
+as though they were a function pointer with a very special calling convention.
+The frame pointer of the parent frame is passed into the funclet either using
+the standard EBP register or as the first parameter register, depending on the
+architecture. The funclet implements the EH action by accessing local variables
+in memory through the frame pointer, and returning some appropriate value,
+continuing the EH process.  No variables live in to or out of the funclet can be
+allocated in registers.
+
+The C++ personality also uses funclets to contain the code for catch blocks
+(i.e. all user code between the braces in ``catch (Type obj) { ... }``). The
+runtime must use funclets for catch bodies because the C++ exception object is
+allocated in a child stack frame of the function handling the exception. If the
+runtime rewound the stack back to frame of the catch, the memory holding the
+exception would be overwritten quickly by subsequent function calls.  The use of
+funclets also allows ``__CxxFrameHandler3`` to implement rethrow without
+resorting to TLS. Instead, the runtime throws a special exception, and then uses
+SEH (``__try / __except``) to resume execution with new information in the child
+frame.
+
+In other words, the successive unwinding approach is incompatible with Visual
+C++ exceptions and general purpose Windows exception handling. Because the C++
+exception object lives in stack memory, LLVM cannot provide a custom personality
+function that uses landingpads.  Similarly, SEH does not provide any mechanism
+to rethrow an exception or continue unwinding.  Therefore, LLVM must use the IR
+constructs described later in this document to implement compatible exception
+handling.
+
+SEH filter expressions
+-----------------------
+
+The SEH personality functions also use funclets to implement filter expressions,
+which allow executing arbitrary user code to decide which exceptions to catch.
+Filter expressions should not be confused with the ``filter`` clause of the LLVM
+``landingpad`` instruction.  Typically filter expressions are used to determine
+if the exception came from a particular DLL or code region, or if code faulted
+while accessing a particular memory address range. LLVM does not currently have
+IR to represent filter expressions because it is difficult to represent their
+control dependencies.  Filter expressions run during the first phase of EH,
+before cleanups run, making it very difficult to build a faithful control flow
+graph.  For now, the new EH instructions cannot represent SEH filter
+expressions, and frontends must outline them ahead of time. Local variables of
+the parent function can be escaped and accessed using the ``llvm.localescape``
+and ``llvm.localrecover`` intrinsics.
+
+New exception handling instructions
+------------------------------------
+
+The primary design goal of the new EH instructions is to support funclet
+generation while preserving information about the CFG so that SSA formation
+still works.  As a secondary goal, they are designed to be generic across MSVC
+and Itanium C++ exceptions. They make very few assumptions about the data
+required by the personality, so long as it uses the familiar core EH actions:
+catch, cleanup, and terminate.  However, the new instructions are hard to modify
+without knowing details of the EH personality. While they can be used to
+represent Itanium EH, the landingpad model is strictly better for optimization
+purposes.
+
+The following new instructions are considered "exception handling pads", in that
+they must be the first non-phi instruction of a basic block that may be the
+unwind destination of an EH flow edge:
+``catchswitch``, ``catchpad``, and ``cleanuppad``.
+As with landingpads, when entering a try scope, if the
+frontend encounters a call site that may throw an exception, it should emit an
+invoke that unwinds to a ``catchswitch`` block. Similarly, inside the scope of a
+C++ object with a destructor, invokes should unwind to a ``cleanuppad``.
+
+New instructions are also used to mark the points where control is transferred
+out of a catch/cleanup handler (which will correspond to exits from the
+generated funclet).  A catch handler which reaches its end by normal execution
+executes a ``catchret`` instruction, which is a terminator indicating where in
+the function control is returned to.  A cleanup handler which reaches its end
+by normal execution executes a ``cleanupret`` instruction, which is a terminator
+indicating where the active exception will unwind to next.
+
+Each of these new EH pad instructions has a way to identify which action should
+be considered after this action. The ``catchswitch`` instruction is a terminator
+and has an unwind destination operand analogous to the unwind destination of an
+invoke.  The ``cleanuppad`` instruction is not
+a terminator, so the unwind destination is stored on the ``cleanupret``
+instruction instead. Successfully executing a catch handler should resume
+normal control flow, so neither ``catchpad`` nor ``catchret`` instructions can
+unwind. All of these "unwind edges" may refer to a basic block that contains an
+EH pad instruction, or they may unwind to the caller.  Unwinding to the caller
+has roughly the same semantics as the ``resume`` instruction in the landingpad
+model. When inlining through an invoke, instructions that unwind to the caller
+are hooked up to unwind to the unwind destination of the call site.
+
+Putting things together, here is a hypothetical lowering of some C++ that uses
+all of the new IR instructions:
+
+.. code-block:: c
+
+  struct Cleanup {
+    Cleanup();
+    ~Cleanup();
+    int m;
+  };
+  void may_throw();
+  int f() noexcept {
+    try {
+      Cleanup obj;
+      may_throw();
+    } catch (int e) {
+      may_throw();
+      return e;
+    }
+    return 0;
+  }
+
+.. code-block:: text
+
+  define i32 @f() nounwind personality i32 (...)* @__CxxFrameHandler3 {
+  entry:
+    %obj = alloca %struct.Cleanup, align 4
+    %e = alloca i32, align 4
+    %call = invoke %struct.Cleanup* @"\01??0Cleanup@@QEAA at XZ"(%struct.Cleanup* nonnull %obj)
+            to label %invoke.cont unwind label %lpad.catch
+
+  invoke.cont:                                      ; preds = %entry
+    invoke void @"\01?may_throw@@YAXXZ"()
+            to label %invoke.cont.2 unwind label %lpad.cleanup
+
+  invoke.cont.2:                                    ; preds = %invoke.cont
+    call void @"\01??_DCleanup@@QEAA at XZ"(%struct.Cleanup* nonnull %obj) nounwind
+    br label %return
+
+  return:                                           ; preds = %invoke.cont.3, %invoke.cont.2
+    %retval.0 = phi i32 [ 0, %invoke.cont.2 ], [ %3, %invoke.cont.3 ]
+    ret i32 %retval.0
+
+  lpad.cleanup:                                     ; preds = %invoke.cont.2
+    %0 = cleanuppad within none []
+    call void @"\01??1Cleanup@@QEAA at XZ"(%struct.Cleanup* nonnull %obj) nounwind
+    cleanupret %0 unwind label %lpad.catch
+
+  lpad.catch:                                       ; preds = %lpad.cleanup, %entry
+    %1 = catchswitch within none [label %catch.body] unwind label %lpad.terminate
+
+  catch.body:                                       ; preds = %lpad.catch
+    %catch = catchpad within %1 [%rtti.TypeDescriptor2* @"\01??_R0H at 8", i32 0, i32* %e]
+    invoke void @"\01?may_throw@@YAXXZ"()
+            to label %invoke.cont.3 unwind label %lpad.terminate
+
+  invoke.cont.3:                                    ; preds = %catch.body
+    %3 = load i32, i32* %e, align 4
+    catchret from %catch to label %return
+
+  lpad.terminate:                                   ; preds = %catch.body, %lpad.catch
+    cleanuppad within none []
+    call void @"\01?terminate@@YAXXZ"
+    unreachable
+  }
+
+Funclet parent tokens
+-----------------------
+
+In order to produce tables for EH personalities that use funclets, it is
+necessary to recover the nesting that was present in the source. This funclet
+parent relationship is encoded in the IR using tokens produced by the new "pad"
+instructions. The token operand of a "pad" or "ret" instruction indicates which
+funclet it is in, or "none" if it is not nested within another funclet.
+
+The ``catchpad`` and ``cleanuppad`` instructions establish new funclets, and
+their tokens are consumed by other "pad" instructions to establish membership.
+The ``catchswitch`` instruction does not create a funclet, but it produces a
+token that is always consumed by its immediate successor ``catchpad``
+instructions. This ensures that every catch handler modelled by a ``catchpad``
+belongs to exactly one ``catchswitch``, which models the dispatch point after a
+C++ try.
+
+Here is an example of what this nesting looks like using some hypothetical
+C++ code:
+
+.. code-block:: c
+
+  void f() {
+    try {
+      throw;
+    } catch (...) {
+      try {
+        throw;
+      } catch (...) {
+      }
+    }
+  }
+
+.. code-block:: text
+
+  define void @f() #0 personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
+  entry:
+    invoke void @_CxxThrowException(i8* null, %eh.ThrowInfo* null) #1
+            to label %unreachable unwind label %catch.dispatch
+
+  catch.dispatch:                                   ; preds = %entry
+    %0 = catchswitch within none [label %catch] unwind to caller
+
+  catch:                                            ; preds = %catch.dispatch
+    %1 = catchpad within %0 [i8* null, i32 64, i8* null]
+    invoke void @_CxxThrowException(i8* null, %eh.ThrowInfo* null) #1
+            to label %unreachable unwind label %catch.dispatch2
+
+  catch.dispatch2:                                  ; preds = %catch
+    %2 = catchswitch within %1 [label %catch3] unwind to caller
+
+  catch3:                                           ; preds = %catch.dispatch2
+    %3 = catchpad within %2 [i8* null, i32 64, i8* null]
+    catchret from %3 to label %try.cont
+
+  try.cont:                                         ; preds = %catch3
+    catchret from %1 to label %try.cont6
+
+  try.cont6:                                        ; preds = %try.cont
+    ret void
+
+  unreachable:                                      ; preds = %catch, %entry
+    unreachable
+  }
+
+The "inner" ``catchswitch`` consumes ``%1`` which is produced by the outer
+catchswitch.
+
+.. _wineh-constraints:
+
+Funclet transitions
+-----------------------
+
+The EH tables for personalities that use funclets make implicit use of the
+funclet nesting relationship to encode unwind destinations, and so are
+constrained in the set of funclet transitions they can represent.  The related
+LLVM IR instructions accordingly have constraints that ensure encodability of
+the EH edges in the flow graph.
+
+A ``catchswitch``, ``catchpad``, or ``cleanuppad`` is said to be "entered"
+when it executes.  It may subsequently be "exited" by any of the following
+means:
+
+* A ``catchswitch`` is immediately exited when none of its constituent
+  ``catchpad``\ s are appropriate for the in-flight exception and it unwinds
+  to its unwind destination or the caller.
+* A ``catchpad`` and its parent ``catchswitch`` are both exited when a
+  ``catchret`` from the ``catchpad`` is executed.
+* A ``cleanuppad`` is exited when a ``cleanupret`` from it is executed.
+* Any of these pads is exited when control unwinds to the function's caller,
+  either by a ``call`` which unwinds all the way to the function's caller,
+  a nested ``catchswitch`` marked "``unwinds to caller``", or a nested
+  ``cleanuppad``\ 's ``cleanupret`` marked "``unwinds to caller"``.
+* Any of these pads is exited when an unwind edge (from an ``invoke``,
+  nested ``catchswitch``, or nested ``cleanuppad``\ 's ``cleanupret``)
+  unwinds to a destination pad that is not a descendant of the given pad.
+
+Note that the ``ret`` instruction is *not* a valid way to exit a funclet pad;
+it is undefined behavior to execute a ``ret`` when a pad has been entered but
+not exited.
+
+A single unwind edge may exit any number of pads (with the restrictions that
+the edge from a ``catchswitch`` must exit at least itself, and the edge from
+a ``cleanupret`` must exit at least its ``cleanuppad``), and then must enter
+exactly one pad, which must be distinct from all the exited pads.  The parent
+of the pad that an unwind edge enters must be the most-recently-entered
+not-yet-exited pad (after exiting from any pads that the unwind edge exits),
+or "none" if there is no such pad.  This ensures that the stack of executing
+funclets at run-time always corresponds to some path in the funclet pad tree
+that the parent tokens encode.
+
+All unwind edges which exit any given funclet pad (including ``cleanupret``
+edges exiting their ``cleanuppad`` and ``catchswitch`` edges exiting their
+``catchswitch``) must share the same unwind destination.  Similarly, any
+funclet pad which may be exited by unwind to caller must not be exited by
+any exception edges which unwind anywhere other than the caller.  This
+ensures that each funclet as a whole has only one unwind destination, which
+EH tables for funclet personalities may require.  Note that any unwind edge
+which exits a ``catchpad`` also exits its parent ``catchswitch``, so this
+implies that for any given ``catchswitch``, its unwind destination must also
+be the unwind destination of any unwind edge that exits any of its constituent
+``catchpad``\s.  Because ``catchswitch`` has no ``nounwind`` variant, and
+because IR producers are not *required* to annotate calls which will not
+unwind as ``nounwind``, it is legal to nest a ``call`` or an "``unwind to
+caller``\ " ``catchswitch`` within a funclet pad that has an unwind
+destination other than caller; it is undefined behavior for such a ``call``
+or ``catchswitch`` to unwind.
+
+Finally, the funclet pads' unwind destinations cannot form a cycle.  This
+ensures that EH lowering can construct "try regions" with a tree-like
+structure, which funclet-based personalities may require.

Added: www-releases/trunk/4.0.1/docs/_sources/ExtendingLLVM.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/ExtendingLLVM.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/ExtendingLLVM.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/ExtendingLLVM.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,327 @@
+============================================================
+Extending LLVM: Adding instructions, intrinsics, types, etc.
+============================================================
+
+Introduction and Warning
+========================
+
+
+During the course of using LLVM, you may wish to customize it for your research
+project or for experimentation. At this point, you may realize that you need to
+add something to LLVM, whether it be a new fundamental type, a new intrinsic
+function, or a whole new instruction.
+
+When you come to this realization, stop and think. Do you really need to extend
+LLVM? Is it a new fundamental capability that LLVM does not support at its
+current incarnation or can it be synthesized from already pre-existing LLVM
+elements? If you are not sure, ask on the `LLVM-dev
+<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ list. The reason is that
+extending LLVM will get involved as you need to update all the different passes
+that you intend to use with your extension, and there are ``many`` LLVM analyses
+and transformations, so it may be quite a bit of work.
+
+Adding an `intrinsic function`_ is far easier than adding an
+instruction, and is transparent to optimization passes.  If your added
+functionality can be expressed as a function call, an intrinsic function is the
+method of choice for LLVM extension.
+
+Before you invest a significant amount of effort into a non-trivial extension,
+**ask on the list** if what you are looking to do can be done with
+already-existing infrastructure, or if maybe someone else is already working on
+it. You will save yourself a lot of time and effort by doing so.
+
+.. _intrinsic function:
+
+Adding a new intrinsic function
+===============================
+
+Adding a new intrinsic function to LLVM is much easier than adding a new
+instruction.  Almost all extensions to LLVM should start as an intrinsic
+function and then be turned into an instruction if warranted.
+
+#. ``llvm/docs/LangRef.html``:
+
+   Document the intrinsic.  Decide whether it is code generator specific and
+   what the restrictions are.  Talk to other people about it so that you are
+   sure it's a good idea.
+
+#. ``llvm/include/llvm/IR/Intrinsics*.td``:
+
+   Add an entry for your intrinsic.  Describe its memory access characteristics
+   for optimization (this controls whether it will be DCE'd, CSE'd, etc). Note
+   that any intrinsic using one of the ``llvm_any*_ty`` types for an argument or
+   return type will be deemed by ``tblgen`` as overloaded and the corresponding
+   suffix will be required on the intrinsic's name.
+
+#. ``llvm/lib/Analysis/ConstantFolding.cpp``:
+
+   If it is possible to constant fold your intrinsic, add support to it in the
+   ``canConstantFoldCallTo`` and ``ConstantFoldCall`` functions.
+
+#. ``llvm/test/*``:
+
+   Add test cases for your test cases to the test suite
+
+Once the intrinsic has been added to the system, you must add code generator
+support for it.  Generally you must do the following steps:
+
+Add support to the .td file for the target(s) of your choice in
+``lib/Target/*/*.td``.
+
+  This is usually a matter of adding a pattern to the .td file that matches the
+  intrinsic, though it may obviously require adding the instructions you want to
+  generate as well.  There are lots of examples in the PowerPC and X86 backend
+  to follow.
+
+Adding a new SelectionDAG node
+==============================
+
+As with intrinsics, adding a new SelectionDAG node to LLVM is much easier than
+adding a new instruction.  New nodes are often added to help represent
+instructions common to many targets.  These nodes often map to an LLVM
+instruction (add, sub) or intrinsic (byteswap, population count).  In other
+cases, new nodes have been added to allow many targets to perform a common task
+(converting between floating point and integer representation) or capture more
+complicated behavior in a single node (rotate).
+
+#. ``include/llvm/CodeGen/ISDOpcodes.h``:
+
+   Add an enum value for the new SelectionDAG node.
+
+#. ``lib/CodeGen/SelectionDAG/SelectionDAG.cpp``:
+
+   Add code to print the node to ``getOperationName``.  If your new node can be
+    evaluated at compile time when given constant arguments (such as an add of a
+    constant with another constant), find the ``getNode`` method that takes the
+    appropriate number of arguments, and add a case for your node to the switch
+    statement that performs constant folding for nodes that take the same number
+    of arguments as your new node.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+   Add code to `legalize, promote, and expand
+   <CodeGenerator.html#selectiondag_legalize>`_ the node as necessary.  At a
+   minimum, you will need to add a case statement for your node in
+   ``LegalizeOp`` which calls LegalizeOp on the node's operands, and returns a
+   new node if any of the operands changed as a result of being legalized.  It
+   is likely that not all targets supported by the SelectionDAG framework will
+   natively support the new node.  In this case, you must also add code in your
+   node's case statement in ``LegalizeOp`` to Expand your node into simpler,
+   legal operations.  The case for ``ISD::UREM`` for expanding a remainder into
+   a divide, multiply, and a subtract is a good example.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+   If targets may support the new node being added only at certain sizes, you
+    will also need to add code to your node's case statement in ``LegalizeOp``
+    to Promote your node's operands to a larger size, and perform the correct
+    operation.  You will also need to add code to ``PromoteOp`` to do this as
+    well.  For a good example, see ``ISD::BSWAP``, which promotes its operand to
+    a wider size, performs the byteswap, and then shifts the correct bytes right
+    to emulate the narrower byteswap in the wider type.
+
+#. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
+
+   Add a case for your node in ``ExpandOp`` to teach the legalizer how to
+   perform the action represented by the new node on a value that has been split
+   into high and low halves.  This case will be used to support your node with a
+   64 bit operand on a 32 bit target.
+
+#. ``lib/CodeGen/SelectionDAG/DAGCombiner.cpp``:
+
+   If your node can be combined with itself, or other existing nodes in a
+   peephole-like fashion, add a visit function for it, and call that function
+   from. There are several good examples for simple combines you can do;
+   ``visitFABS`` and ``visitSRL`` are good starting places.
+
+#. ``lib/Target/PowerPC/PPCISelLowering.cpp``:
+
+   Each target has an implementation of the ``TargetLowering`` class, usually in
+   its own file (although some targets include it in the same file as the
+   DAGToDAGISel).  The default behavior for a target is to assume that your new
+   node is legal for all types that are legal for that target.  If this target
+   does not natively support your node, then tell the target to either Promote
+   it (if it is supported at a larger type) or Expand it.  This will cause the
+   code you wrote in ``LegalizeOp`` above to decompose your new node into other
+   legal nodes for this target.
+
+#. ``lib/Target/TargetSelectionDAG.td``:
+
+   Most current targets supported by LLVM generate code using the DAGToDAG
+   method, where SelectionDAG nodes are pattern matched to target-specific
+   nodes, which represent individual instructions.  In order for the targets to
+   match an instruction to your new node, you must add a def for that node to
+   the list in this file, with the appropriate type constraints. Look at
+   ``add``, ``bswap``, and ``fadd`` for examples.
+
+#. ``lib/Target/PowerPC/PPCInstrInfo.td``:
+
+   Each target has a tablegen file that describes the target's instruction set.
+   For targets that use the DAGToDAG instruction selection framework, add a
+   pattern for your new node that uses one or more target nodes.  Documentation
+   for this is a bit sparse right now, but there are several decent examples.
+   See the patterns for ``rotl`` in ``PPCInstrInfo.td``.
+
+#. TODO: document complex patterns.
+
+#. ``llvm/test/CodeGen/*``:
+
+   Add test cases for your new node to the test suite.
+   ``llvm/test/CodeGen/X86/bswap.ll`` is a good example.
+
+Adding a new instruction
+========================
+
+.. warning::
+
+  Adding instructions changes the bitcode format, and it will take some effort
+  to maintain compatibility with the previous version. Only add an instruction
+  if it is absolutely necessary.
+
+#. ``llvm/include/llvm/IR/Instruction.def``:
+
+   add a number for your instruction and an enum name
+
+#. ``llvm/include/llvm/IR/Instructions.h``:
+
+   add a definition for the class that will represent your instruction
+
+#. ``llvm/include/llvm/IR/InstVisitor.h``:
+
+   add a prototype for a visitor to your new instruction type
+
+#. ``llvm/lib/AsmParser/LLLexer.cpp``:
+
+   add a new token to parse your instruction from assembly text file
+
+#. ``llvm/lib/AsmParser/LLParser.cpp``:
+
+   add the grammar on how your instruction can be read and what it will
+   construct as a result
+
+#. ``llvm/lib/Bitcode/Reader/BitcodeReader.cpp``:
+
+   add a case for your instruction and how it will be parsed from bitcode
+
+#. ``llvm/lib/Bitcode/Writer/BitcodeWriter.cpp``:
+
+   add a case for your instruction and how it will be parsed from bitcode
+
+#. ``llvm/lib/IR/Instruction.cpp``:
+
+   add a case for how your instruction will be printed out to assembly
+
+#. ``llvm/lib/IR/Instructions.cpp``:
+
+   implement the class you defined in ``llvm/include/llvm/Instructions.h``
+
+#. Test your instruction
+
+#. ``llvm/lib/Target/*``:
+
+   add support for your instruction to code generators, or add a lowering pass.
+
+#. ``llvm/test/*``:
+
+   add your test cases to the test suite.
+
+Also, you need to implement (or modify) any analyses or passes that you want to
+understand this new instruction.
+
+Adding a new type
+=================
+
+.. warning::
+
+  Adding new types changes the bitcode format, and will break compatibility with
+  currently-existing LLVM installations. Only add new types if it is absolutely
+  necessary.
+
+Adding a fundamental type
+-------------------------
+
+#. ``llvm/include/llvm/IR/Type.h``:
+
+   add enum for the new type; add static ``Type*`` for this type
+
+#. ``llvm/lib/IR/Type.cpp`` and ``llvm/lib/IR/ValueTypes.cpp``:
+
+   add mapping from ``TypeID`` => ``Type*``; initialize the static ``Type*``
+
+#. ``llvm/llvm/llvm-c/Core.cpp``:
+
+   add enum ``LLVMTypeKind`` and modify
+   ``LLVMTypeKind LLVMGetTypeKind(LLVMTypeRef Ty)`` for the new type
+
+#. ``llvm/include/llvm/IR/TypeBuilder.h``:
+
+   add new class to represent new type in the hierarchy
+
+#. ``llvm/lib/AsmParser/LLLexer.cpp``:
+
+   add ability to parse in the type from text assembly
+
+#. ``llvm/lib/AsmParser/LLParser.cpp``:
+
+   add a token for that type
+
+#. ``llvm/lib/Bitcode/Writer/BitcodeWriter.cpp``:
+
+   modify ``static void WriteTypeTable(const ValueEnumerator &VE,
+   BitstreamWriter &Stream)`` to serialize your type
+
+#. ``llvm/lib/Bitcode/Reader/BitcodeReader.cpp``:
+
+   modify ``bool BitcodeReader::ParseTypeType()`` to read your data type
+
+#. ``include/llvm/Bitcode/LLVMBitCodes.h``:
+
+   add enum ``TypeCodes`` for the new type
+
+Adding a derived type
+---------------------
+
+#. ``llvm/include/llvm/IR/Type.h``:
+
+   add enum for the new type; add a forward declaration of the type also
+
+#. ``llvm/include/llvm/IR/DerivedTypes.h``:
+
+   add new class to represent new class in the hierarchy; add forward
+   declaration to the TypeMap value type
+
+#. ``llvm/lib/IR/Type.cpp`` and ``llvm/lib/IR/ValueTypes.cpp``:
+
+   add support for derived type, notably `enum TypeID` and `is`, `get` methods.
+
+#. ``llvm/llvm/llvm-c/Core.cpp``:
+
+   add enum ``LLVMTypeKind`` and modify
+   `LLVMTypeKind LLVMGetTypeKind(LLVMTypeRef Ty)` for the new type
+
+#. ``llvm/include/llvm/IR/TypeBuilder.h``:
+
+   add new class to represent new class in the hierarchy
+
+#. ``llvm/lib/AsmParser/LLLexer.cpp``:
+
+   modify ``lltok::Kind LLLexer::LexIdentifier()`` to add ability to
+   parse in the type from text assembly
+
+#. ``llvm/lib/Bitcode/Writer/BitcodeWriter.cpp``:
+
+   modify ``static void WriteTypeTable(const ValueEnumerator &VE,
+   BitstreamWriter &Stream)`` to serialize your type
+
+#. ``llvm/lib/Bitcode/Reader/BitcodeReader.cpp``:
+
+   modify ``bool BitcodeReader::ParseTypeType()`` to read your data type
+
+#. ``include/llvm/Bitcode/LLVMBitCodes.h``:
+
+   add enum ``TypeCodes`` for the new type
+
+#. ``llvm/lib/IR/AsmWriter.cpp``:
+
+   modify ``void TypePrinting::print(Type *Ty, raw_ostream &OS)``
+   to output the new derived type

Added: www-releases/trunk/4.0.1/docs/_sources/Extensions.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/Extensions.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/Extensions.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/Extensions.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,250 @@
+===============
+LLVM Extensions
+===============
+
+.. contents::
+   :local:
+
+.. toctree::
+   :hidden:
+
+Introduction
+============
+
+This document describes extensions to tools and formats LLVM seeks compatibility
+with.
+
+General Assembly Syntax
+===========================
+
+C99-style Hexadecimal Floating-point Constants
+----------------------------------------------
+
+LLVM's assemblers allow floating-point constants to be written in C99's
+hexadecimal format instead of decimal if desired.
+
+.. code-block:: gas
+
+  .section .data
+  .float 0x1c2.2ap3
+
+Machine-specific Assembly Syntax
+================================
+
+X86/COFF-Dependent
+------------------
+
+Relocations
+^^^^^^^^^^^
+
+The following additional relocation types are supported:
+
+**@IMGREL** (AT&T syntax only) generates an image-relative relocation that
+corresponds to the COFF relocation types ``IMAGE_REL_I386_DIR32NB`` (32-bit) or
+``IMAGE_REL_AMD64_ADDR32NB`` (64-bit).
+
+.. code-block:: text
+
+  .text
+  fun:
+    mov foo at IMGREL(%ebx, %ecx, 4), %eax
+
+  .section .pdata
+    .long fun at IMGREL
+    .long (fun at imgrel + 0x3F)
+    .long $unwind$fun at imgrel
+
+**.secrel32** generates a relocation that corresponds to the COFF relocation
+types ``IMAGE_REL_I386_SECREL`` (32-bit) or ``IMAGE_REL_AMD64_SECREL`` (64-bit).
+
+**.secidx** relocation generates an index of the section that contains
+the target.  It corresponds to the COFF relocation types
+``IMAGE_REL_I386_SECTION`` (32-bit) or ``IMAGE_REL_AMD64_SECTION`` (64-bit).
+
+.. code-block:: none
+
+  .section .debug$S,"rn"
+    .long 4
+    .long 242
+    .long 40
+    .secrel32 _function_name + 0
+    .secidx   _function_name
+    ...
+
+``.linkonce`` Directive
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+
+   ``.linkonce [ comdat type ]``
+
+Supported COMDAT types:
+
+``discard``
+   Discards duplicate sections with the same COMDAT symbol. This is the default
+   if no type is specified.
+
+``one_only``
+   If the symbol is defined multiple times, the linker issues an error.
+
+``same_size``
+   Duplicates are discarded, but the linker issues an error if any have
+   different sizes.
+
+``same_contents``
+   Duplicates are discarded, but the linker issues an error if any duplicates
+   do not have exactly the same content.
+
+``largest``
+   Links the largest section from among the duplicates.
+
+``newest``
+   Links the newest section from among the duplicates.
+
+
+.. code-block:: gas
+
+  .section .text$foo
+  .linkonce
+    ...
+
+``.section`` Directive
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+MC supports passing the information in ``.linkonce`` at the end of
+``.section``. For example,  these two codes are equivalent
+
+.. code-block:: gas
+
+  .section secName, "dr", discard, "Symbol1"
+  .globl Symbol1
+  Symbol1:
+  .long 1
+
+.. code-block:: gas
+
+  .section secName, "dr"
+  .linkonce discard
+  .globl Symbol1
+  Symbol1:
+  .long 1
+
+Note that in the combined form the COMDAT symbol is explicit. This
+extension exists to support multiple sections with the same name in
+different COMDATs:
+
+
+.. code-block:: gas
+
+  .section secName, "dr", discard, "Symbol1"
+  .globl Symbol1
+  Symbol1:
+  .long 1
+
+  .section secName, "dr", discard, "Symbol2"
+  .globl Symbol2
+  Symbol2:
+  .long 1
+
+In addition to the types allowed with ``.linkonce``, ``.section`` also accepts
+``associative``. The meaning is that the section is linked  if a certain other
+COMDAT section is linked. This other section is indicated by the comdat symbol
+in this directive. It can be any symbol defined in the associated section, but
+is usually the associated section's comdat.
+
+   The following restrictions apply to the associated section:
+
+   1. It must be a COMDAT section.
+   2. It cannot be another associative COMDAT section.
+
+In the following example the symobl ``sym`` is the comdat symbol of ``.foo``
+and ``.bar`` is associated to ``.foo``.
+
+.. code-block:: gas
+
+	.section	.foo,"bw",discard, "sym"
+	.section	.bar,"rd",associative, "sym"
+
+MC supports these flags in the COFF ``.section`` directive:
+
+  - ``b``: BSS section (``IMAGE_SCN_CNT_INITIALIZED_DATA``)
+  - ``d``: Data section (``IMAGE_SCN_CNT_UNINITIALIZED_DATA``)
+  - ``n``: Section is not loaded (``IMAGE_SCN_LNK_REMOVE``)
+  - ``r``: Read-only
+  - ``s``: Shared section
+  - ``w``: Writable
+  - ``x``: Executable section
+  - ``y``: Not readable
+  - ``D``: Discardable (``IMAGE_SCN_MEM_DISCARDABLE``)
+
+These flags are all compatible with gas, with the exception of the ``D`` flag,
+which gnu as does not support. For gas compatibility, sections with a name
+starting with ".debug" are implicitly discardable.
+
+
+ELF-Dependent
+-------------
+
+``.section`` Directive
+^^^^^^^^^^^^^^^^^^^^^^
+
+In order to support creating multiple sections with the same name and comdat,
+it is possible to add an unique number at the end of the ``.seciton`` directive.
+For example, the following code creates two sections named ``.text``.
+
+.. code-block:: gas
+
+	.section	.text,"ax", at progbits,unique,1
+        nop
+
+	.section	.text,"ax", at progbits,unique,2
+        nop
+
+
+The unique number is not present in the resulting object at all. It is just used
+in the assembler to differentiate the sections.
+
+Target Specific Behaviour
+=========================
+
+Windows on ARM
+--------------
+
+Stack Probe Emission
+^^^^^^^^^^^^^^^^^^^^
+
+The reference implementation (Microsoft Visual Studio 2012) emits stack probes
+in the following fashion:
+
+.. code-block:: gas
+
+  movw r4, #constant
+  bl __chkstk
+  sub.w sp, sp, r4
+
+However, this has the limitation of 32 MiB (Â±16MiB).  In order to accommodate
+larger binaries, LLVM supports the use of ``-mcode-model=large`` to allow a 4GiB
+range via a slight deviation.  It will generate an indirect jump as follows:
+
+.. code-block:: gas
+
+  movw r4, #constant
+  movw r12, :lower16:__chkstk
+  movt r12, :upper16:__chkstk
+  blx r12
+  sub.w sp, sp, r4
+
+Variable Length Arrays
+^^^^^^^^^^^^^^^^^^^^^^
+
+The reference implementation (Microsoft Visual Studio 2012) does not permit the
+emission of Variable Length Arrays (VLAs).
+
+The Windows ARM Itanium ABI extends the base ABI by adding support for emitting
+a dynamic stack allocation.  When emitting a variable stack allocation, a call
+to ``__chkstk`` is emitted unconditionally to ensure that guard pages are setup
+properly.  The emission of this stack probe emission is handled similar to the
+standard stack probe emission.
+
+The MSVC environment does not emit code for VLAs currently.
+

Added: www-releases/trunk/4.0.1/docs/_sources/FAQ.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/FAQ.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/FAQ.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/FAQ.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,345 @@
+================================
+Frequently Asked Questions (FAQ)
+================================
+
+.. contents::
+   :local:
+
+
+License
+=======
+
+Does the University of Illinois Open Source License really qualify as an "open source" license?
+-----------------------------------------------------------------------------------------------
+Yes, the license is `certified
+<http://www.opensource.org/licenses/UoI-NCSA.php>`_ by the Open Source
+Initiative (OSI).
+
+
+Can I modify LLVM source code and redistribute the modified source?
+-------------------------------------------------------------------
+Yes.  The modified source distribution must retain the copyright notice and
+follow the three bulleted conditions listed in the `LLVM license
+<http://llvm.org/svn/llvm-project/llvm/trunk/LICENSE.TXT>`_.
+
+
+Can I modify the LLVM source code and redistribute binaries or other tools based on it, without redistributing the source?
+--------------------------------------------------------------------------------------------------------------------------
+Yes. This is why we distribute LLVM under a less restrictive license than GPL,
+as explained in the first question above.
+
+
+Source Code
+===========
+
+In what language is LLVM written?
+---------------------------------
+All of the LLVM tools and libraries are written in C++ with extensive use of
+the STL.
+
+
+How portable is the LLVM source code?
+-------------------------------------
+The LLVM source code should be portable to most modern Unix-like operating
+systems.  Most of the code is written in standard C++ with operating system
+services abstracted to a support library.  The tools required to build and
+test LLVM have been ported to a plethora of platforms.
+
+Some porting problems may exist in the following areas:
+
+* The autoconf/makefile build system relies heavily on UNIX shell tools,
+  like the Bourne Shell and sed.  Porting to systems without these tools
+  (MacOS 9, Plan 9) will require more effort.
+
+What API do I use to store a value to one of the virtual registers in LLVM IR's SSA representation?
+---------------------------------------------------------------------------------------------------
+
+In short: you can't. It's actually kind of a silly question once you grok
+what's going on. Basically, in code like:
+
+.. code-block:: llvm
+
+    %result = add i32 %foo, %bar
+
+, ``%result`` is just a name given to the ``Value`` of the ``add``
+instruction. In other words, ``%result`` *is* the add instruction. The
+"assignment" doesn't explicitly "store" anything to any "virtual register";
+the "``=``" is more like the mathematical sense of equality.
+
+Longer explanation: In order to generate a textual representation of the
+IR, some kind of name has to be given to each instruction so that other
+instructions can textually reference it. However, the isomorphic in-memory
+representation that you manipulate from C++ has no such restriction since
+instructions can simply keep pointers to any other ``Value``'s that they
+reference. In fact, the names of dummy numbered temporaries like ``%1`` are
+not explicitly represented in the in-memory representation at all (see
+``Value::getName()``).
+
+
+Source Languages
+================
+
+What source languages are supported?
+------------------------------------
+
+LLVM currently has full support for C and C++ source languages through
+`Clang <http://clang.llvm.org/>`_. Many other language frontends have
+been written using LLVM, and an incomplete list is available at
+`projects with LLVM <http://llvm.org/ProjectsWithLLVM/>`_.
+
+
+I'd like to write a self-hosting LLVM compiler. How should I interface with the LLVM middle-end optimizers and back-end code generators?
+----------------------------------------------------------------------------------------------------------------------------------------
+Your compiler front-end will communicate with LLVM by creating a module in the
+LLVM intermediate representation (IR) format. Assuming you want to write your
+language's compiler in the language itself (rather than C++), there are 3
+major ways to tackle generating LLVM IR from a front-end:
+
+1. **Call into the LLVM libraries code using your language's FFI (foreign
+   function interface).**
+
+  * *for:* best tracks changes to the LLVM IR, .ll syntax, and .bc format
+
+  * *for:* enables running LLVM optimization passes without a emit/parse
+    overhead
+
+  * *for:* adapts well to a JIT context
+
+  * *against:* lots of ugly glue code to write
+
+2. **Emit LLVM assembly from your compiler's native language.**
+
+  * *for:* very straightforward to get started
+
+  * *against:* the .ll parser is slower than the bitcode reader when
+    interfacing to the middle end
+
+  * *against:* it may be harder to track changes to the IR
+
+3. **Emit LLVM bitcode from your compiler's native language.**
+
+  * *for:* can use the more-efficient bitcode reader when interfacing to the
+    middle end
+
+  * *against:* you'll have to re-engineer the LLVM IR object model and bitcode
+    writer in your language
+
+  * *against:* it may be harder to track changes to the IR
+
+If you go with the first option, the C bindings in include/llvm-c should help
+a lot, since most languages have strong support for interfacing with C. The
+most common hurdle with calling C from managed code is interfacing with the
+garbage collector. The C interface was designed to require very little memory
+management, and so is straightforward in this regard.
+
+What support is there for a higher level source language constructs for building a compiler?
+--------------------------------------------------------------------------------------------
+Currently, there isn't much. LLVM supports an intermediate representation
+which is useful for code representation but will not support the high level
+(abstract syntax tree) representation needed by most compilers. There are no
+facilities for lexical nor semantic analysis.
+
+
+I don't understand the ``GetElementPtr`` instruction. Help!
+-----------------------------------------------------------
+See `The Often Misunderstood GEP Instruction <GetElementPtr.html>`_.
+
+
+Using the C and C++ Front Ends
+==============================
+
+Can I compile C or C++ code to platform-independent LLVM bitcode?
+-----------------------------------------------------------------
+No. C and C++ are inherently platform-dependent languages. The most obvious
+example of this is the preprocessor. A very common way that C code is made
+portable is by using the preprocessor to include platform-specific code. In
+practice, information about other platforms is lost after preprocessing, so
+the result is inherently dependent on the platform that the preprocessing was
+targeting.
+
+Another example is ``sizeof``. It's common for ``sizeof(long)`` to vary
+between platforms. In most C front-ends, ``sizeof`` is expanded to a
+constant immediately, thus hard-wiring a platform-specific detail.
+
+Also, since many platforms define their ABIs in terms of C, and since LLVM is
+lower-level than C, front-ends currently must emit platform-specific IR in
+order to have the result conform to the platform ABI.
+
+
+Questions about code generated by the demo page
+===============================================
+
+What is this ``llvm.global_ctors`` and ``_GLOBAL__I_a...`` stuff that happens when I ``#include <iostream>``?
+-------------------------------------------------------------------------------------------------------------
+If you ``#include`` the ``<iostream>`` header into a C++ translation unit,
+the file will probably use the ``std::cin``/``std::cout``/... global objects.
+However, C++ does not guarantee an order of initialization between static
+objects in different translation units, so if a static ctor/dtor in your .cpp
+file used ``std::cout``, for example, the object would not necessarily be
+automatically initialized before your use.
+
+To make ``std::cout`` and friends work correctly in these scenarios, the STL
+that we use declares a static object that gets created in every translation
+unit that includes ``<iostream>``.  This object has a static constructor
+and destructor that initializes and destroys the global iostream objects
+before they could possibly be used in the file.  The code that you see in the
+``.ll`` file corresponds to the constructor and destructor registration code.
+
+If you would like to make it easier to *understand* the LLVM code generated
+by the compiler in the demo page, consider using ``printf()`` instead of
+``iostream``\s to print values.
+
+
+Where did all of my code go??
+-----------------------------
+If you are using the LLVM demo page, you may often wonder what happened to
+all of the code that you typed in.  Remember that the demo script is running
+the code through the LLVM optimizers, so if your code doesn't actually do
+anything useful, it might all be deleted.
+
+To prevent this, make sure that the code is actually needed.  For example, if
+you are computing some expression, return the value from the function instead
+of leaving it in a local variable.  If you really want to constrain the
+optimizer, you can read from and assign to ``volatile`` global variables.
+
+
+What is this "``undef``" thing that shows up in my code?
+--------------------------------------------------------
+``undef`` is the LLVM way of representing a value that is not defined.  You
+can get these if you do not initialize a variable before you use it.  For
+example, the C function:
+
+.. code-block:: c
+
+   int X() { int i; return i; }
+
+Is compiled to "``ret i32 undef``" because "``i``" never has a value specified
+for it.
+
+
+Why does instcombine + simplifycfg turn a call to a function with a mismatched calling convention into "unreachable"? Why not make the verifier reject it?
+----------------------------------------------------------------------------------------------------------------------------------------------------------
+This is a common problem run into by authors of front-ends that are using
+custom calling conventions: you need to make sure to set the right calling
+convention on both the function and on each call to the function.  For
+example, this code:
+
+.. code-block:: llvm
+
+   define fastcc void @foo() {
+       ret void
+   }
+   define void @bar() {
+       call void @foo()
+       ret void
+   }
+
+Is optimized to:
+
+.. code-block:: llvm
+
+   define fastcc void @foo() {
+       ret void
+   }
+   define void @bar() {
+       unreachable
+   }
+
+... with "``opt -instcombine -simplifycfg``".  This often bites people because
+"all their code disappears".  Setting the calling convention on the caller and
+callee is required for indirect calls to work, so people often ask why not
+make the verifier reject this sort of thing.
+
+The answer is that this code has undefined behavior, but it is not illegal.
+If we made it illegal, then every transformation that could potentially create
+this would have to ensure that it doesn't, and there is valid code that can
+create this sort of construct (in dead code).  The sorts of things that can
+cause this to happen are fairly contrived, but we still need to accept them.
+Here's an example:
+
+.. code-block:: llvm
+
+   define fastcc void @foo() {
+       ret void
+   }
+   define internal void @bar(void()* %FP, i1 %cond) {
+       br i1 %cond, label %T, label %F
+   T:
+       call void %FP()
+       ret void
+   F:
+       call fastcc void %FP()
+       ret void
+   }
+   define void @test() {
+       %X = or i1 false, false
+       call void @bar(void()* @foo, i1 %X)
+       ret void
+   }
+
+In this example, "test" always passes ``@foo``/``false`` into ``bar``, which
+ensures that it is dynamically called with the right calling conv (thus, the
+code is perfectly well defined).  If you run this through the inliner, you
+get this (the explicit "or" is there so that the inliner doesn't dead code
+eliminate a bunch of stuff):
+
+.. code-block:: llvm
+
+   define fastcc void @foo() {
+       ret void
+   }
+   define void @test() {
+       %X = or i1 false, false
+       br i1 %X, label %T.i, label %F.i
+   T.i:
+       call void @foo()
+       br label %bar.exit
+   F.i:
+       call fastcc void @foo()
+       br label %bar.exit
+   bar.exit:
+       ret void
+   }
+
+Here you can see that the inlining pass made an undefined call to ``@foo``
+with the wrong calling convention.  We really don't want to make the inliner
+have to know about this sort of thing, so it needs to be valid code.  In this
+case, dead code elimination can trivially remove the undefined code.  However,
+if ``%X`` was an input argument to ``@test``, the inliner would produce this:
+
+.. code-block:: llvm
+
+   define fastcc void @foo() {
+       ret void
+   }
+
+   define void @test(i1 %X) {
+       br i1 %X, label %T.i, label %F.i
+   T.i:
+       call void @foo()
+       br label %bar.exit
+   F.i:
+       call fastcc void @foo()
+       br label %bar.exit
+   bar.exit:
+       ret void
+   }
+
+The interesting thing about this is that ``%X`` *must* be false for the
+code to be well-defined, but no amount of dead code elimination will be able
+to delete the broken call as unreachable.  However, since
+``instcombine``/``simplifycfg`` turns the undefined call into unreachable, we
+end up with a branch on a condition that goes to unreachable: a branch to
+unreachable can never happen, so "``-inline -instcombine -simplifycfg``" is
+able to produce:
+
+.. code-block:: llvm
+
+   define fastcc void @foo() {
+      ret void
+   }
+   define void @test(i1 %X) {
+   F.i:
+      call fastcc void @foo()
+      ret void
+   }

Added: www-releases/trunk/4.0.1/docs/_sources/FaultMaps.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/FaultMaps.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/FaultMaps.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/FaultMaps.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,127 @@
+==============================
+FaultMaps and implicit checks
+==============================
+
+.. contents::
+   :local:
+   :depth: 2
+
+Motivation
+==========
+
+Code generated by managed language runtimes tend to have checks that
+are required for safety but never fail in practice.  In such cases, it
+is profitable to make the non-failing case cheaper even if it makes
+the failing case significantly more expensive.  This asymmetry can be
+exploited by folding such safety checks into operations that can be
+made to fault reliably if the check would have failed, and recovering
+from such a fault by using a signal handler.
+
+For example, Java requires null checks on objects before they are read
+from or written to.  If the object is ``null`` then a
+``NullPointerException`` has to be thrown, interrupting normal
+execution.  In practice, however, dereferencing a ``null`` pointer is
+extremely rare in well-behaved Java programs, and typically the null
+check can be folded into a nearby memory operation that operates on
+the same memory location.
+
+The Fault Map Section
+=====================
+
+Information about implicit checks generated by LLVM are put in a
+special "fault map" section.  On Darwin this section is named
+``__llvm_faultmaps``.
+
+The format of this section is
+
+.. code-block:: none
+
+  Header {
+    uint8  : Fault Map Version (current version is 1)
+    uint8  : Reserved (expected to be 0)
+    uint16 : Reserved (expected to be 0)
+  }
+  uint32 : NumFunctions
+  FunctionInfo[NumFunctions] {
+    uint64 : FunctionAddress
+    uint32 : NumFaultingPCs
+    uint32 : Reserved (expected to be 0)
+    FunctionFaultInfo[NumFaultingPCs] {
+      uint32  : FaultKind = FaultMaps::FaultingLoad (only legal value currently)
+      uint32  : FaultingPCOffset
+      uint32  : HandlerPCOffset
+    }
+  }
+
+
+The ``ImplicitNullChecks`` pass
+===============================
+
+The ``ImplicitNullChecks`` pass transforms explicit control flow for
+checking if a pointer is ``null``, like:
+
+.. code-block:: llvm
+
+    %ptr = call i32* @get_ptr()
+    %ptr_is_null = icmp i32* %ptr, null
+    br i1 %ptr_is_null, label %is_null, label %not_null, !make.implicit !0
+  
+  not_null:
+    %t = load i32, i32* %ptr
+    br label %do_something_with_t
+    
+  is_null:
+    call void @HFC()
+    unreachable
+  
+  !0 = !{}
+
+to control flow implicit in the instruction loading or storing through
+the pointer being null checked:
+
+.. code-block:: llvm
+
+    %ptr = call i32* @get_ptr()
+    %t = load i32, i32* %ptr  ;; handler-pc = label %is_null
+    br label %do_something_with_t
+    
+  is_null:
+    call void @HFC()
+    unreachable
+
+This transform happens at the ``MachineInstr`` level, not the LLVM IR
+level (so the above example is only representative, not literal).  The
+``ImplicitNullChecks`` pass runs during codegen, if
+``-enable-implicit-null-checks`` is passed to ``llc``.
+
+The ``ImplicitNullChecks`` pass adds entries to the
+``__llvm_faultmaps`` section described above as needed.
+
+``make.implicit`` metadata
+--------------------------
+
+Making null checks implicit is an aggressive optimization, and it can
+be a net performance pessimization if too many memory operations end
+up faulting because of it.  A language runtime typically needs to
+ensure that only a negligible number of implicit null checks actually
+fault once the application has reached a steady state.  A standard way
+of doing this is by healing failed implicit null checks into explicit
+null checks via code patching or recompilation.  It follows that there
+are two requirements an explicit null check needs to satisfy for it to
+be profitable to convert it to an implicit null check:
+
+  1. The case where the pointer is actually null (i.e. the "failing"
+     case) is extremely rare.
+
+  2. The failing path heals the implicit null check into an explicit
+     null check so that the application does not repeatedly page
+     fault.
+
+The frontend is expected to mark branches that satisfy (1) and (2)
+using a ``!make.implicit`` metadata node (the actual content of the
+metadata node is ignored).  Only branches that are marked with
+``!make.implicit`` metadata are considered as candidates for
+conversion into implicit null checks.
+
+(Note that while we could deal with (1) using profiling data, dealing
+with (2) requires some information not present in branch profiles.)

Added: www-releases/trunk/4.0.1/docs/_sources/Frontend/PerformanceTips.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/Frontend/PerformanceTips.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/Frontend/PerformanceTips.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/Frontend/PerformanceTips.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,296 @@
+=====================================
+Performance Tips for Frontend Authors
+=====================================
+
+.. contents::
+   :local:
+   :depth: 2
+
+Abstract
+========
+
+The intended audience of this document is developers of language frontends 
+targeting LLVM IR. This document is home to a collection of tips on how to 
+generate IR that optimizes well.  
+
+IR Best Practices
+=================
+
+As with any optimizer, LLVM has its strengths and weaknesses.  In some cases, 
+surprisingly small changes in the source IR can have a large effect on the 
+generated code.  
+
+Beyond the specific items on the list below, it's worth noting that the most 
+mature frontend for LLVM is Clang.  As a result, the further your IR gets from what Clang might emit, the less likely it is to be effectively optimized.  It 
+can often be useful to write a quick C program with the semantics you're trying
+to model and see what decisions Clang's IRGen makes about what IR to emit.  
+Studying Clang's CodeGen directory can also be a good source of ideas.  Note 
+that Clang and LLVM are explicitly version locked so you'll need to make sure 
+you're using a Clang built from the same svn revision or release as the LLVM 
+library you're using.  As always, it's *strongly* recommended that you track 
+tip of tree development, particularly during bring up of a new project.
+
+The Basics
+^^^^^^^^^^^
+
+#. Make sure that your Modules contain both a data layout specification and 
+   target triple. Without these pieces, non of the target specific optimization
+   will be enabled.  This can have a major effect on the generated code quality.
+
+#. For each function or global emitted, use the most private linkage type
+   possible (private, internal or linkonce_odr preferably).  Doing so will 
+   make LLVM's inter-procedural optimizations much more effective.
+
+#. Avoid high in-degree basic blocks (e.g. basic blocks with dozens or hundreds
+   of predecessors).  Among other issues, the register allocator is known to 
+   perform badly with confronted with such structures.  The only exception to 
+   this guidance is that a unified return block with high in-degree is fine.
+
+Use of allocas
+^^^^^^^^^^^^^^
+
+An alloca instruction can be used to represent a function scoped stack slot, 
+but can also represent dynamic frame expansion.  When representing function 
+scoped variables or locations, placing alloca instructions at the beginning of 
+the entry block should be preferred.   In particular, place them before any 
+call instructions. Call instructions might get inlined and replaced with 
+multiple basic blocks. The end result is that a following alloca instruction 
+would no longer be in the entry basic block afterward.
+
+The SROA (Scalar Replacement Of Aggregates) and Mem2Reg passes only attempt
+to eliminate alloca instructions that are in the entry basic block.  Given 
+SSA is the canonical form expected by much of the optimizer; if allocas can 
+not be eliminated by Mem2Reg or SROA, the optimizer is likely to be less 
+effective than it could be.
+
+Avoid loads and stores of large aggregate type
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+LLVM currently does not optimize well loads and stores of large :ref:`aggregate
+types <t_aggregate>` (i.e. structs and arrays).  As an alternative, consider 
+loading individual fields from memory.
+
+Aggregates that are smaller than the largest (performant) load or store 
+instruction supported by the targeted hardware are well supported.  These can 
+be an effective way to represent collections of small packed fields.  
+
+Prefer zext over sext when legal
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+On some architectures (X86_64 is one), sign extension can involve an extra 
+instruction whereas zero extension can be folded into a load.  LLVM will try to
+replace a sext with a zext when it can be proven safe, but if you have 
+information in your source language about the range of a integer value, it can 
+be profitable to use a zext rather than a sext.  
+
+Alternatively, you can :ref:`specify the range of the value using metadata 
+<range-metadata>` and LLVM can do the sext to zext conversion for you.
+
+Zext GEP indices to machine register width
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Internally, LLVM often promotes the width of GEP indices to machine register
+width.  When it does so, it will default to using sign extension (sext) 
+operations for safety.  If your source language provides information about 
+the range of the index, you may wish to manually extend indices to machine 
+register width using a zext instruction.
+
+When to specify alignment
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+LLVM will always generate correct code if you donât specify alignment, but may
+generate inefficient code.  For example, if you are targeting MIPS (or older 
+ARM ISAs) then the hardware does not handle unaligned loads and stores, and 
+so you will enter a trap-and-emulate path if you do a load or store with 
+lower-than-natural alignment.  To avoid this, LLVM will emit a slower 
+sequence of loads, shifts and masks (or load-right + load-left on MIPS) for 
+all cases where the load / store does not have a sufficiently high alignment 
+in the IR.
+
+The alignment is used to guarantee the alignment on allocas and globals, 
+though in most cases this is unnecessary (most targets have a sufficiently 
+high default alignment that theyâll be fine).  It is also used to provide a 
+contract to the back end saying âeither this load/store has this alignment, or
+it is undefined behaviorâ.  This means that the back end is free to emit 
+instructions that rely on that alignment (and mid-level optimizers are free to 
+perform transforms that require that alignment).  For x86, it doesnât make 
+much difference, as almost all instructions are alignment-independent.  For 
+MIPS, it can make a big difference.
+
+Note that if your loads and stores are atomic, the backend will be unable to 
+lower an under aligned access into a sequence of natively aligned accesses.  
+As a result, alignment is mandatory for atomic loads and stores.
+
+Other Things to Consider
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. Use ptrtoint/inttoptr sparingly (they interfere with pointer aliasing 
+   analysis), prefer GEPs
+
+#. Prefer globals over inttoptr of a constant address - this gives you 
+   dereferencability information.  In MCJIT, use getSymbolAddress to provide 
+   actual address.
+
+#. Be wary of ordered and atomic memory operations.  They are hard to optimize 
+   and may not be well optimized by the current optimizer.  Depending on your
+   source language, you may consider using fences instead.
+
+#. If calling a function which is known to throw an exception (unwind), use 
+   an invoke with a normal destination which contains an unreachable 
+   instruction.  This form conveys to the optimizer that the call returns 
+   abnormally.  For an invoke which neither returns normally or requires unwind
+   code in the current function, you can use a noreturn call instruction if 
+   desired.  This is generally not required because the optimizer will convert
+   an invoke with an unreachable unwind destination to a call instruction.
+
+#. Use profile metadata to indicate statically known cold paths, even if 
+   dynamic profiling information is not available.  This can make a large 
+   difference in code placement and thus the performance of tight loops.
+
+#. When generating code for loops, try to avoid terminating the header block of
+   the loop earlier than necessary.  If the terminator of the loop header 
+   block is a loop exiting conditional branch, the effectiveness of LICM will
+   be limited for loads not in the header.  (This is due to the fact that LLVM 
+   may not know such a load is safe to speculatively execute and thus can't 
+   lift an otherwise loop invariant load unless it can prove the exiting 
+   condition is not taken.)  It can be profitable, in some cases, to emit such 
+   instructions into the header even if they are not used along a rarely 
+   executed path that exits the loop.  This guidance specifically does not 
+   apply if the condition which terminates the loop header is itself invariant,
+   or can be easily discharged by inspecting the loop index variables.
+
+#. In hot loops, consider duplicating instructions from small basic blocks 
+   which end in highly predictable terminators into their successor blocks.  
+   If a hot successor block contains instructions which can be vectorized 
+   with the duplicated ones, this can provide a noticeable throughput
+   improvement.  Note that this is not always profitable and does involve a 
+   potentially large increase in code size.
+
+#. When checking a value against a constant, emit the check using a consistent
+   comparison type.  The GVN pass *will* optimize redundant equalities even if
+   the type of comparison is inverted, but GVN only runs late in the pipeline.
+   As a result, you may miss the opportunity to run other important 
+   optimizations.  Improvements to EarlyCSE to remove this issue are tracked in 
+   Bug 23333.
+
+#. Avoid using arithmetic intrinsics unless you are *required* by your source 
+   language specification to emit a particular code sequence.  The optimizer 
+   is quite good at reasoning about general control flow and arithmetic, it is
+   not anywhere near as strong at reasoning about the various intrinsics.  If 
+   profitable for code generation purposes, the optimizer will likely form the 
+   intrinsics itself late in the optimization pipeline.  It is *very* rarely 
+   profitable to emit these directly in the language frontend.  This item
+   explicitly includes the use of the :ref:`overflow intrinsics <int_overflow>`.
+
+#. Avoid using the :ref:`assume intrinsic <int_assume>` until you've 
+   established that a) there's no other way to express the given fact and b) 
+   that fact is critical for optimization purposes.  Assumes are a great 
+   prototyping mechanism, but they can have negative effects on both compile 
+   time and optimization effectiveness.  The former is fixable with enough 
+   effort, but the later is fairly fundamental to their designed purpose.
+
+
+Describing Language Specific Properties
+=======================================
+
+When translating a source language to LLVM, finding ways to express concepts 
+and guarantees available in your source language which are not natively 
+provided by LLVM IR will greatly improve LLVM's ability to optimize your code. 
+As an example, C/C++'s ability to mark every add as "no signed wrap (nsw)" goes
+a long way to assisting the optimizer in reasoning about loop induction 
+variables and thus generating more optimal code for loops.  
+
+The LLVM LangRef includes a number of mechanisms for annotating the IR with 
+additional semantic information.  It is *strongly* recommended that you become 
+highly familiar with this document.  The list below is intended to highlight a 
+couple of items of particular interest, but is by no means exhaustive.
+
+Restricted Operation Semantics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+#. Add nsw/nuw flags as appropriate.  Reasoning about overflow is 
+   generally hard for an optimizer so providing these facts from the frontend 
+   can be very impactful.  
+
+#. Use fast-math flags on floating point operations if legal.  If you don't 
+   need strict IEEE floating point semantics, there are a number of additional 
+   optimizations that can be performed.  This can be highly impactful for 
+   floating point intensive computations.
+
+Describing Aliasing Properties
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. Add noalias/align/dereferenceable/nonnull to function arguments and return 
+   values as appropriate
+
+#. Use pointer aliasing metadata, especially tbaa metadata, to communicate 
+   otherwise-non-deducible pointer aliasing facts
+
+#. Use inbounds on geps.  This can help to disambiguate some aliasing queries.
+
+
+Modeling Memory Effects
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. Mark functions as readnone/readonly/argmemonly or noreturn/nounwind when
+   known.  The optimizer will try to infer these flags, but may not always be
+   able to.  Manual annotations are particularly important for external 
+   functions that the optimizer can not analyze.
+
+#. Use the lifetime.start/lifetime.end and invariant.start/invariant.end 
+   intrinsics where possible.  Common profitable uses are for stack like data 
+   structures (thus allowing dead store elimination) and for describing 
+   life times of allocas (thus allowing smaller stack sizes).  
+
+#. Mark invariant locations using !invariant.load and TBAA's constant flags
+
+Pass Ordering
+^^^^^^^^^^^^^
+
+One of the most common mistakes made by new language frontend projects is to 
+use the existing -O2 or -O3 pass pipelines as is.  These pass pipelines make a
+good starting point for an optimizing compiler for any language, but they have 
+been carefully tuned for C and C++, not your target language.  You will almost 
+certainly need to use a custom pass order to achieve optimal performance.  A 
+couple specific suggestions:
+
+#. For languages with numerous rarely executed guard conditions (e.g. null 
+   checks, type checks, range checks) consider adding an extra execution or 
+   two of LoopUnswith and LICM to your pass order.  The standard pass order, 
+   which is tuned for C and C++ applications, may not be sufficient to remove 
+   all dischargeable checks from loops.
+
+#. If you language uses range checks, consider using the IRCE pass.  It is not 
+   currently part of the standard pass order.
+
+#. A useful sanity check to run is to run your optimized IR back through the 
+   -O2 pipeline again.  If you see noticeable improvement in the resulting IR, 
+   you likely need to adjust your pass order.
+
+
+I Still Can't Find What I'm Looking For
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you didn't find what you were looking for above, consider proposing an piece
+of metadata which provides the optimization hint you need.  Such extensions are
+relatively common and are generally well received by the community.  You will 
+need to ensure that your proposal is sufficiently general so that it benefits 
+others if you wish to contribute it upstream.
+
+You should also consider describing the problem you're facing on `llvm-dev 
+<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ and asking for advice.  
+It's entirely possible someone has encountered your problem before and can 
+give good advice.  If there are multiple interested parties, that also 
+increases the chances that a metadata extension would be well received by the
+community as a whole.  
+
+Adding to this document
+=======================
+
+If you run across a case that you feel deserves to be covered here, please send
+a patch to `llvm-commits
+<http://lists.llvm.org/mailman/listinfo/llvm-commits>`_ for review.
+
+If you have questions on these items, please direct them to `llvm-dev 
+<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_.  The more relevant 
+context you are able to give to your question, the more likely it is to be 
+answered.
+

Added: www-releases/trunk/4.0.1/docs/_sources/GarbageCollection.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/4.0.1/docs/_sources/GarbageCollection.txt?rev=306451&view=auto
==============================================================================
--- www-releases/trunk/4.0.1/docs/_sources/GarbageCollection.txt (added)
+++ www-releases/trunk/4.0.1/docs/_sources/GarbageCollection.txt Tue Jun 27 12:30:47 2017
@@ -0,0 +1,1098 @@
+=====================================
+Garbage Collection with LLVM
+=====================================
+
+.. contents::
+   :local:
+
+Abstract
+========
+
+This document covers how to integrate LLVM into a compiler for a language which
+supports garbage collection.  **Note that LLVM itself does not provide a 
+garbage collector.**  You must provide your own.  
+
+Quick Start
+============
+
+First, you should pick a collector strategy.  LLVM includes a number of built 
+in ones, but you can also implement a loadable plugin with a custom definition.
+Note that the collector strategy is a description of how LLVM should generate 
+code such that it interacts with your collector and runtime, not a description
+of the collector itself.
+
+Next, mark your generated functions as using your chosen collector strategy.  
+From c++, you can call: 
+
+.. code-block:: c++
+
+  F.setGC(<collector description name>);
+
+
+This will produce IR like the following fragment:
+
+.. code-block:: llvm
+
+  define void @foo() gc "<collector description name>" { ... }
+
+
+When generating LLVM IR for your functions, you will need to:
+
+* Use ``@llvm.gcread`` and/or ``@llvm.gcwrite`` in place of standard load and 
+  store instructions.  These intrinsics are used to represent load and store 
+  barriers.  If you collector does not require such barriers, you can skip 
+  this step.  
+
+* Use the memory allocation routines provided by your garbage collector's 
+  runtime library.
+
+* If your collector requires them, generate type maps according to your 
+  runtime's binary interface.  LLVM is not involved in the process.  In 
+  particular, the LLVM type system is not suitable for conveying such 
+  information though the compiler.
+
+* Insert any coordination code required for interacting with your collector.  
+  Many collectors require running application code to periodically check a
+  flag and conditionally call a runtime function.  This is often referred to 
+  as a safepoint poll.  
+
+You will need to identify roots (i.e. references to heap objects your collector 
+needs to know about) in your generated IR, so that LLVM can encode them into 
+your final stack maps.  Depending on the collector strategy chosen, this is 
+accomplished by using either the ``@llvm.gcroot`` intrinsics or an 
+``gc.statepoint`` relocation sequence. 
+
+Don't forget to create a root for each intermediate value that is generated when
+evaluating an expression.  In ``h(f(), g())``, the result of ``f()`` could 
+easily be collected if evaluating ``g()`` triggers a collection.
+
+Finally, you need to link your runtime library with the generated program 
+executable (for a static compiler) or ensure the appropriate symbols are 
+available for the runtime linker (for a JIT compiler).  
+
+
+Introduction
+============
+
+What is Garbage Collection?
+---------------------------
+
+Garbage collection is a widely used technique that frees the programmer from
+having to know the lifetimes of heap objects, making software easier to produce
+and maintain.  Many programming languages rely on garbage collection for
+automatic memory management.  There are two primary forms of garbage collection:
+conservative and accurate.
+
+Conservative garbage collection often does not require any special support from
+either the language or the compiler: it can handle non-type-safe programming
+languages (such as C/C++) and does not require any special information from the
+compiler.  The `Boehm collector
+<http://www.hpl.hp.com/personal/Hans_Boehm/gc/>`__ is an example of a
+state-of-the-art conservative collector.
+
+Accurate garbage collection requires the ability to identify all pointers in the
+program at run-time (which requires that the source-language be type-safe in
+most cases).  Identifying pointers at run-time requires compiler support to
+locate all places that hold live pointer variables at run-time, including the
+:ref:`processor stack and registers <gcroot>`.
+
+Conservative garbage collection is attractive because it does not require any
+special compiler support, but it does have problems.  In particular, because the
+conservative garbage collector cannot *know* that a particular word in the
+machine is a pointer, it cannot move live objects in the heap (preventing the
+use of compacting and generational GC algorithms) and it can occasionally suffer
+from memory leaks due to integer values that happen to point to objects in the
+program.  In addition, some aggressive compiler transformations can break
+conservative garbage collectors (though these seem rare in practice).
+
+Accurate garbage collectors do not suffer from any of these problems, but they
+can suffer from degraded scalar optimization of the program.  In particular,
+because the runtime must be able to identify and update all pointers active in
+the program, some optimizations are less effective.  In practice, however, the
+locality and performance benefits of using aggressive garbage collection
+techniques dominates any low-level losses.
+
+This document describes the mechanisms and interfaces provided by LLVM to
+support accurate garbage collection.
+
+Goals and non-goals
+-------------------
+
+LLVM's intermediate representation provides :ref:`garbage collection intrinsics
+<gc_intrinsics>` that offer support for a broad class of collector models.  For
+instance, the intrinsics permit:
+
+* semi-space collectors
+
+* mark-sweep collectors
+
+* generational collectors
+
+* incremental collectors
+
+* concurrent collectors
+
+* cooperative collectors
+
+* reference counting
+
+We hope that the support built into the LLVM IR is sufficient to support a 
+broad class of garbage collected languages including Scheme, ML, Java, C#, 
+Perl, Python, Lua, Ruby, other scripting languages, and more.
+
+Note that LLVM **does not itself provide a garbage collector** --- this should
+be part of your language's runtime library.  LLVM provides a framework for
+describing the garbage collectors requirements to the compiler.  In particular,
+LLVM provides support for generating stack maps at call sites, polling for a 
+safepoint, and emitting load and store barriers.  You can also extend LLVM - 
+possibly through a loadable :ref:`code generation plugins <plugin>` - to
+generate code and data structures which conforms to the *binary interface*
+specified by the *runtime library*.  This is similar to the relationship between
+LLVM and DWARF debugging info, for example.  The difference primarily lies in
+the lack of an established standard in the domain of garbage collection --- thus
+the need for a flexible extension mechanism.
+
+The aspects of the binary interface with which LLVM's GC support is
+concerned are:
+
+* Creation of GC safepoints within code where collection is allowed to execute
+  safely.
+
+* Computation of the stack map.  For each safe point in the code, object
+  references within the stack frame must be identified so that the collector may
+  traverse and perhaps update them.
+
+* Write barriers when storing object references to the heap.  These are commonly
+  used to optimize incremental scans in generational collectors.
+
+* Emission of read barriers when loading object references.  These are useful
+  for interoperating with concurrent collectors.
+
+There are additional areas that LLVM does not directly address:
+
+* Registration of global roots with the runtime.
+
+* Registration of stack map entries with the runtime.
+
+* The functions used by the program to allocate memory, trigger a collection,
+  etc.
+
+* Computation or compilation of type maps, or registration of them with the
+  runtime.  These are used to crawl the heap for object references.
+
+In general, LLVM's support for GC does not include features which can be
+adequately addressed with other features of the IR and does not specify a
+particular binary interface.  On the plus side, this means that you should be
+able to integrate LLVM with an existing runtime.  On the other hand, it can 
+have the effect of leaving a lot of work for the developer of a novel 
+language.  We try to mitigate this by providing built in collector strategy 
+descriptions that can work with many common collector designs and easy 
+extension points.  If you don't already have a specific binary interface 
+you need to support, we recommend trying to use one of these built in collector 
+strategies.
+
+.. _gc_intrinsics:
+
+LLVM IR Features
+================
+
+This section describes the garbage collection facilities provided by the
+:doc:`LLVM intermediate representation <LangRef>`.  The exact behavior of these
+IR features is specified by the selected :ref:`GC strategy description 
+<plugin>`. 
+
+Specifying GC code generation: ``gc "..."``
+-------------------------------------------
+
+.. code-block:: text
+
+  define <returntype> @name(...) gc "name" { ... }
+
+The ``gc`` function attribute is used to specify the desired GC strategy to the
+compiler.  Its programmatic equivalent is the ``setGC`` method of ``Function``.
+
+Setting ``gc "name"`` on a function triggers a search for a matching subclass
+of GCStrategy.  Some collector strategies are built in.  You can add others 
+using either the loadable plugin mechanism, or by patching your copy of LLVM.
+It is the selected GC strategy which defines the exact nature of the code 
+generated to support GC.  If none is found, the compiler will raise an error.
+
+Specifying the GC style on a per-function basis allows LLVM to link together
+programs that use different garbage collection algorithms (or none at all).
+
+.. _gcroot:
+
+Identifying GC roots on the stack
+----------------------------------
+
+LLVM currently supports two different mechanisms for describing references in
+compiled code at safepoints.  ``llvm.gcroot`` is the older mechanism; 
+``gc.statepoint`` has been added more recently.  At the moment, you can choose 
+either implementation (on a per :ref:`GC strategy <plugin>` basis).  Longer 
+term, we will probably either migrate away from ``llvm.gcroot`` entirely, or 
+substantially merge their implementations. Note that most new development 
+work is focused on ``gc.statepoint``.  
+
+Using ``gc.statepoint``
+^^^^^^^^^^^^^^^^^^^^^^^^
+:doc:`This page <Statepoints>` contains detailed documentation for 
+``gc.statepoint``. 
+
+Using ``llvm.gcwrite``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: llvm
+
+  void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
+
+The ``llvm.gcroot`` intrinsic is used to inform LLVM that a stack variable
+references an object on the heap and is to be tracked for garbage collection.
+The exact impact on generated code is specified by the Function's selected 
+:ref:`GC strategy <plugin>`.  All calls to ``llvm.gcroot`` **must** reside 
+inside the first basic block.
+
+The first argument **must** be a value referring to an alloca instruction or a
+bitcast of an alloca.  The second contains a pointer to metadata that should be
+associated with the pointer, and **must** be a constant or global value
+address.  If your target collector uses tags, use a null pointer for metadata.
+
+A compiler which performs manual SSA construction **must** ensure that SSA 
+values representing GC references are stored in to the alloca passed to the
+respective ``gcroot`` before every call site and reloaded after every call.  
+A compiler which uses mem2reg to raise imperative code using ``alloca`` into 
+SSA form need only add a call to ``@llvm.gcroot`` for those variables which 
+are pointers into the GC heap.  
+
+It is also important to mark intermediate values with ``llvm.gcroot``.  For
+example, consider ``h(f(), g())``.  Beware leaking the result of ``f()`` in the
+case that ``g()`` triggers a collection.  Note, that stack variables must be
+initialized and marked with ``llvm.gcroot`` in function's prologue.
+
+The ``%metadata`` argument can be used to avoid requiring heap objects to have
+'isa' pointers or tag bits. [Appel89_, Goldberg91_, Tolmach94_] If specified,
+its value will be tracked along with the location of the pointer in the stack
+frame.
+
+Consider the following fragment of Java code:
+
+.. code-block:: java
+
+   {
+     Object X;   // A null-initialized reference to an object
+     ...
+   }
+
+This block (which may be located in the middle of a function or in a loop nest),
+could be compiled to this LLVM code:
+
+.. code-block:: llvm
+
+  Entry:
+     ;; In the entry block for the function, allocate the
+     ;; stack space for X, which is an LLVM pointer.
+     %X = alloca %Object*
+
+     ;; Tell LLVM that the stack space is a stack root.
+     ;; Java has type-tags on objects, so we pass null as metadata.
+     %tmp = bitcast %Object** %X to i8**
+     call void @llvm.gcroot(i8** %tmp, i8* null)
+     ...
+
+     ;; "CodeBlock" is the block corresponding to the start
+     ;;  of the scope above.
+  CodeBlock:
+     ;; Java null-initializes pointers.
+     store %Object* null, %Object** %X
+
+     ...
+
+     ;; As the pointer goes out of scope, store a null value into
+     ;; it, to indicate that the value is no longer live.
+     store %Object* null, %Object** %X
+     ...
+
+Reading and writing references in the heap
+------------------------------------------
+
+Some collectors need to be informed when the mutator (the program that needs
+garbage collection) either reads a pointer from or writes a pointer to a field
+of a heap object.  The code fragments inserted at these points are called *read
+barriers* and *write barriers*, respectively.  The amount of code that needs to
+be executed is usually quite small and not on the critical path of any
+computation, so the overall performance impact of the barrier is tolerable.
+
+Barriers often require access to the *object pointer* rather than the *derived
+pointer* (which is a pointer to the field within the object).  Accordingly,
+these intrinsics take both pointers as separate arguments for completeness.  In
+this snippet, ``%object`` is the object pointer, and ``%derived`` is the derived
+pointer:
+
+.. code-block:: llvm
+
+  ;; An array type.
+  %class.Array = type { %class.Object, i32, [0 x %class.Object*] }
+  ...
+
+  ;; Load the object pointer from a gcroot.
+  %object = load %class.Array** %object_addr
+
+  ;; Compute the derived pointer.
+  %derived = getelementptr %object, i32 0, i32 2, i32 %n
+
+LLVM does not enforce this relationship between the object and derived pointer
+(although a particular :ref:`collector strategy <plugin>` might).  However, it
+would be an unusual collector that violated it.
+
+The use of these intrinsics is naturally optional if the target GC does not 
+require the corresponding barrier.  The GC strategy used with such a collector 
+should replace the intrinsic calls with the corresponding ``load`` or 
+``store`` instruction if they are used.
+
+One known deficiency with the current design is that the barrier intrinsics do 
+not include the size or alignment of the underlying operation performed.  It is 
+currently assumed that the operation is of pointer size and the alignment is
+assumed to be the target machine's default alignment.
+
+Write barrier: ``llvm.gcwrite``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: llvm
+
+  void @llvm.gcwrite(i8* %value, i8* %object, i8** %derived)
+
+For write barriers, LLVM provides the ``llvm.gcwrite`` intrinsic function.  It
+has exactly the same semantics as a non-volatile ``store`` to the derived
+pointer (the third argument).  The exact code generated is specified by the
+Function's selected :ref:`GC strategy <plugin>`.
+
+Many important algorithms require write barriers, including generational and
+concurrent collectors.  Additionally, write barriers could be used to implement
+reference counting.
+
+Read barrier: ``llvm.gcread``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: llvm
+
+  i8* @llvm.gcread(i8* %object, i8** %derived)
+
+For read barriers, LLVM provides the ``llvm.gcread`` intrinsic function.  It has
+exactly the same semantics as a non-volatile ``load`` from the derived pointer
+(the second argument).  The exact code generated is specified by the Function's
+selected :ref:`GC strategy <plugin>`.
+
+Read barriers are needed by fewer algorithms than write barriers, and may have a
+greater performance impact since pointer reads are more frequent than writes.
+
+.. _plugin:
+
+.. _builtin-gc-strategies:
+
+Built In GC Strategies
+======================
+
+LLVM includes built in support for several varieties of garbage collectors.  
+
+The Shadow Stack GC
+----------------------
+
+To use this collector strategy, mark your functions with:
+
+.. code-block:: c++
+
+  F.setGC("shadow-stack");
+
+Unlike many GC algorithms which rely on a cooperative code generator to compile
+stack maps, this algorithm carefully maintains a linked list of stack roots
+[:ref:`Henderson2002 <henderson02>`].  This so-called "shadow stack" mirrors the
+machine stack.  Maintaining this data structure is slower than using a stack map
+compiled into the executable as constant data, but has a significant portability
+advantage because it requires no special support from the target code generator,
+and does not require tricky platform-specific code to crawl the machine stack.
+
+The tradeoff for this simplicity and portability is:
+
+* High overhead per function call.
+
+* Not thread-safe.
+
+Still, it's an easy way to get started.  After your compiler and runtime are up
+and running, writing a :ref:`plugin <plugin>` will allow you to take advantage
+of :ref:`more advanced GC features <collector-algos>` of LLVM in order to
+improve performance.
+
+
+The shadow stack doesn't imply a memory allocation algorithm.  A semispace
+collector or building atop ``malloc`` are great places to start, and can be
+implemented with very little code.
+
+When it comes time to collect, however, your runtime needs to traverse the stack
+roots, and for this it needs to integrate with the shadow stack.  Luckily, doing
+so is very simple. (This code is heavily commented to help you understand the
+data structure, but there are only 20 lines of meaningful code.)
+
+.. code-block:: c++
+
+  /// @brief The map for a single function's stack frame.  One of these is
+  ///        compiled as constant data into the executable for each function.
+  ///
+  /// Storage of metadata values is elided if the %metadata parameter to
+  /// @llvm.gcroot is null.
+  struct FrameMap {
+    int32_t NumRoots;    //< Number of roots in stack frame.
+    int32_t NumMeta;     //< Number of metadata entries.  May be < NumRoots.
+    const void *Meta[0]; //< Metadata for each root.
+  };
+
+  /// @brief A link in the dynamic shadow stack.  One of these is embedded in
+  ///        the stack frame of each function on the call stack.
+  struct StackEntry {
+    StackEntry *Next;    //< Link to next stack entry (the caller's).
+    const FrameMap *Map; //< Pointer to constant FrameMap.
+    void *Roots[0];      //< Stack roots (in-place array).
+  };
+
+  /// @brief The head of the singly-linked list of StackEntries.  Functions push
+  ///        and pop onto this in their prologue and epilogue.
+  ///
+  /// Since there is only a global list, this technique is not threadsafe.
+  StackEntry *llvm_gc_root_chain;
+
+  /// @brief Calls Visitor(root, meta) for each GC root on the stack.
+  ///        root and meta are exactly the values passed to
+  ///        @llvm.gcroot.
+  ///
+  /// Visitor could be a function to recursively mark live objects.  Or it
+  /// might copy them to another heap or generation.
+  ///
+  /// @param Visitor A function to invoke for every GC root on the stack.
+  void visitGCRoots(void (*Visitor)(void **Root, const void *Meta)) {
+    for (StackEntry *R = llvm_gc_root_chain; R; R = R->Next) {
+      unsigned i = 0;
+
+      // For roots [0, NumMeta), the metadata pointer is in the FrameMap.
+      for (unsigned e = R->Map->NumMeta; i != e; ++i)
+        Visitor(&R->Roots[i], R->Map->Meta[i]);
+
+      // For roots [NumMeta, NumRoots), the metadata pointer is null.
+      for (unsigned e = R->Map->NumRoots; i != e; ++i)
+        Visitor(&R->Roots[i], NULL);
+    }
+  }
+
+
+The 'Erlang' and 'Ocaml' GCs
+-----------------------------
+
+LLVM ships with two example collectors which leverage the ``gcroot`` 
+mechanisms.  To our knowledge, these are not actually used by any language 
+runtime, but they do provide a reasonable starting point for someone interested 
+in writing an ``gcroot`` compatible GC plugin.  In particular, these are the 
+only in tree examples of how to produce a custom binary stack map format using 
+a ``gcroot`` strategy.
+
+As there names imply, the binary format produced is intended to model that 
+used by the Erlang and OCaml compilers respectively.  
+
+.. _statepoint_example_gc:
+
+The Statepoint Example GC
+-------------------------
+
+.. code-block:: c++
+
+  F.setGC("statepoint-example");
+
+This GC provides an example of how one might use the infrastructure provided 
+by ``gc.statepoint``. This example GC is compatible with the 
+:ref:`PlaceSafepoints` and :ref:`RewriteStatepointsForGC` utility passes 
+which simplify ``gc.statepoint`` sequence insertion. If you need to build a 
+custom GC strategy around the ``gc.statepoints`` mechanisms, it is recommended
+that you use this one as a starting point.
+
+This GC strategy does not support read or write barriers.  As a result, these 
+intrinsics are lowered to normal loads and stores.
+
+The stack map format generated by this GC strategy can be found in the 
+:ref:`stackmap-section` using a format documented :ref:`here 
+<statepoint-stackmap-format>`. This format is intended to be the standard 
+format supported by LLVM going forward.
+
+The CoreCLR GC
+-------------------------
+
+.. code-block:: c++
+
+  F.setGC("coreclr");
+
+This GC leverages the ``gc.statepoint`` mechanism to support the 
+`CoreCLR <https://github.com/dotnet/coreclr>`__ runtime.
+
+Support for this GC strategy is a work in progress. This strategy will 
+differ from 
+:ref:`statepoint-example GC<statepoint_example_gc>` strategy in 
+certain aspects like:
+
+* Base-pointers of interior pointers are not explicitly 
+  tracked and reported.
+
+* A different format is used for encoding stack maps.
+
+* Safe-point polls are only needed before loop-back edges
+  and before tail-calls (not needed at function-entry).
+
+Custom GC Strategies
+====================
+
+If none of the built in GC strategy descriptions met your needs above, you will
+need to define a custom GCStrategy and possibly, a custom LLVM pass to perform 
+lowering.  Your best example of where to start defining a custom GCStrategy 
+would be to look at one of the built in strategies.
+
+You may be able to structure this additional code as a loadable plugin library.
+Loadable plugins are sufficient if all you need is to enable a different 
+combination of built in functionality, but if you need to provide a custom 
+lowering pass, you will need to build a patched version of LLVM.  If you think 
+you need a patched build, please ask for advice on llvm-dev.  There may be an 
+easy way we can extend the support to make it work for your use case without 
+requiring a custom build.  
+
+Collector Requirements
+----------------------
+
+You should be able to leverage any existing collector library that includes the following elements:
+
+#. A memory allocator which exposes an allocation function your compiled 
+   code can call.
+
+#. A binary format for the stack map.  A stack map describes the location
+   of references at a safepoint and is used by precise collectors to identify
+   references within a stack frame on the machine stack. Note that collectors
+   which conservatively scan the stack don't require such a structure.
+
+#. A stack crawler to discover functions on the call stack, and enumerate the
+   references listed in the stack map for each call site.  
+
+#. A mechanism for identifying references in global locations (e.g. global 
+   variables).
+
+#. If you collector requires them, an LLVM IR implementation of your collectors
+   load and store barriers.  Note that since many collectors don't require 
+   barriers at all, LLVM defaults to lowering such barriers to normal loads 
+   and stores unless you arrange otherwise.
+
+
+Implementing a collector plugin
+-------------------------------
+
+User code specifies which GC code generation to use with the ``gc`` function
+attribute or, equivalently, with the ``setGC`` method of ``Function``.
+
+To implement a GC plugin, it is necessary to subclass ``llvm::GCStrategy``,
+which can be accomplished in a few lines of boilerplate code.  LLVM's
+infrastructure provides access to several important algorithms.  For an
+uncontroversial collector, all that remains may be to compile LLVM's computed
+stack map to assembly code (using the binary representation expected by the
+runtime library).  This can be accomplished in about 100 lines of code.
+
+This is not the appropriate place to implement a garbage collected heap or a
+garbage collector itself.  That code should exist in the language's runtime
+library.  The compiler plugin is responsible for generating code which conforms
+to the binary interface defined by library, most essentially the :ref:`stack map
+<stack-map>`.
+
+To subclass ``llvm::GCStrategy`` and register it with the compiler:
+
+.. code-block:: c++
+
+  // lib/MyGC/MyGC.cpp - Example LLVM GC plugin
+
+  #include "llvm/CodeGen/GCStrategy.h"
+  #include "llvm/CodeGen/GCMetadata.h"
+  #include "llvm/Support/Compiler.h"
+
+  using namespace llvm;
+
+  namespace {
+    class LLVM_LIBRARY_VISIBILITY MyGC : public GCStrategy {
+    public:
+      MyGC() {}
+    };
+
+    GCRegistry::Add<MyGC>
+    X("mygc", "My bespoke garbage collector.");
+  }
+
+This boilerplate collector does nothing.  More specifically:
+
+* ``llvm.gcread`` calls are replaced with the corresponding ``load``
+  instruction.
+
+* ``llvm.gcwrite`` calls are replaced with the corresponding ``store``
+  instruction.
+
+* No safe points are added to the code.
+
+* The stack map is not compiled into the executable.
+
+Using the LLVM makefiles, this code
+can be compiled as a plugin using a simple makefile:
+
+.. code-block:: make
+
+  # lib/MyGC/Makefile
+
+  LEVEL := ../..
+  LIBRARYNAME = MyGC
+  LOADABLE_MODULE = 1
+
+  include $(LEVEL)/Makefile.common
+
+Once the plugin is compiled, code using it may be compiled using ``llc
+-load=MyGC.so`` (though MyGC.so may have some other platform-specific
+extension):
+
+::
+
+  $ cat sample.ll
+  define void @f() gc "mygc" {
+  entry:
+    ret void
+  }
+  $ llvm-as < sample.ll | llc -load=MyGC.so
+
+It is also possible to statically link the collector plugin into tools, such as
+a language-specific compiler front-end.
+
+.. _collector-algos:
+
+Overview of available features
+------------------------------
+
+``GCStrategy`` provides a range of features through which a plugin may do useful
+work.  Some of these are callbacks, some are algorithms that can be enabled,
+disabled, or customized.  This matrix summarizes the supported (and planned)
+features and correlates them with the collection techniques which typically
+require them.
+
+.. |v| unicode:: 0x2714
+   :trim:
+
+.. |x| unicode:: 0x2718
+   :trim:
+
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| Algorithm  | Done | Shadow | refcount | mark- | copying | incremental | threaded | concurrent |
+|            |      | stack  |          | sweep |         |             |          |            |
++============+======+========+==========+=======+=========+=============+==========+============+
+| stack map  | |v|  |        |          | |x|   | |x|     | |x|         | |x|      | |x|        |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| initialize | |v|  | |x|    | |x|      | |x|   | |x|     | |x|         | |x|      | |x|        |
+| roots      |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| derived    | NO   |        |          |       |         |             | **N**\*  | **N**\*    |
+| pointers   |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| **custom   | |v|  |        |          |       |         |             |          |            |
+| lowering** |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *gcroot*   | |v|  | |x|    | |x|      |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *gcwrite*  | |v|  |        | |x|      |       |         | |x|         |          | |x|        |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *gcread*   | |v|  |        |          |       |         |             |          | |x|        |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| **safe     |      |        |          |       |         |             |          |            |
+| points**   |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *in        | |v|  |        |          | |x|   | |x|     | |x|         | |x|      | |x|        |
+| calls*     |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *before    | |v|  |        |          |       |         |             | |x|      | |x|        |
+| calls*     |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *for       | NO   |        |          |       |         |             | **N**    | **N**      |
+| loops*     |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *before    | |v|  |        |          |       |         |             | |x|      | |x|        |
+| escape*    |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| emit code  | NO   |        |          |       |         |             | **N**    | **N**      |
+| at safe    |      |        |          |       |         |             |          |            |
+| points     |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| **output** |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *assembly* | |v|  |        |          | |x|   | |x|     | |x|         | |x|      | |x|        |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *JIT*      | NO   |        |          | **?** | **?**   | **?**       | **?**    | **?**      |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| *obj*      | NO   |        |          | **?** | **?**   | **?**       | **?**    | **?**      |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| live       | NO   |        |          | **?** | **?**   | **?**       | **?**    | **?**      |
+| analysis   |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| register   | NO   |        |          | **?** | **?**   | **?**       | **?**    | **?**      |
+| map        |      |        |          |       |         |             |          |            |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| \* Derived pointers only pose a hasard to copying collections.                                |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+| **?** denotes a feature which could be utilized if available.                                 |
++------------+------+--------+----------+-------+---------+-------------+----------+------------+
+
+To be clear, the collection techniques above are defined as:
+
+Shadow Stack
+  The mutator carefully maintains a linked list of stack roots.
+
+Reference Counting
+  The mutator maintains a reference count for each object and frees an object
+  when its count falls to zero.
+
+Mark-Sweep
+  When the heap is exhausted, the collector marks reachable objects starting
+  from the roots, then deallocates unreachable objects in a sweep phase.
+
+Copying
+  As reachability analysis proceeds, the collector copies objects from one heap
+  area to another, compacting them in the process.  Copying collectors enable
+  highly efficient "bump pointer" allocation and can improve locality of
+  reference.
+
+Incremental
+  (Including generational collectors.) Incremental collectors generally have all
+  the properties of a copying collector (regardless of whether the mature heap
+  is compacting), but bring the added complexity of requiring write barriers.
+
+Threaded
+  Denotes a multithreaded mutator; the collector must still stop the mutator
+  ("stop the world") before beginning reachability analysis.  Stopping a
+  multithreaded mutator is a complicated problem.  It generally requires highly
+  platform-specific code in the runtime, and the production of carefully
+  designed machine code at safe points.
+
+Concurrent
+  In this technique, the mutator and the collector run concurrently, with the
+  goal of eliminating pause times.  In a *cooperative* collector, the mutator
+  further aids with collection should a pause occur, allowing collection to take
+  advantage of multiprocessor hosts.  The "stop the world" problem of threaded
+  collectors is generally still present to a limited extent.  Sophisticated
+  marking algorithms are necessary.  Read barriers may be necessary.
+
+As the matrix indicates, LLVM's garbage collection infrastructure is already
+suitable for a wide variety of collectors, but does not currently extend to
+multithreaded programs.  This will be added in the future as there is
+interest.
+
+.. _stack-map:
+
+Computing stack maps
+--------------------
+
+LLVM automatically computes a stack map.  One of the most important features
+of a ``GCStrategy`` is to compile this information into the executable in
+the binary representation expected by the runtime library.
+
+The stack map consists of the location and identity of each GC root in the
+each function in the module.  For each root:
+
+* ``RootNum``: The index of the root.
+
+* ``StackOffset``: The offset of the object relative to the frame pointer.
+
+* ``RootMetadata``: The value passed as the ``%metadata`` parameter to the
+  ``@llvm.gcroot`` intrinsic.
+
+Also, for the function as a whole:
+
+* ``getFrameSize()``: The overall size of the function's initial stack frame,
+   not accounting for any dynamic allocation.
+
+* ``roots_size()``: The count of roots in the function.
+
+To access the stack map, use ``GCFunctionMetadata::roots_begin()`` and
+-``end()`` from the :ref:`GCMetadataPrinter <assembly>`:
+
+.. code-block:: c++
+
+  for (iterator I = begin(), E = end(); I != E; ++I) {
+    GCFunctionInfo *FI = *I;
+    unsigned FrameSize = FI->getFrameSize();
+    size_t RootCount = FI->roots_size();
+
+    for (GCFunctionInfo::roots_iterator RI = FI->roots_begin(),
+                                        RE = FI->roots_end();
+                                        RI != RE; ++RI) {
+      int RootNum = RI->Num;
+      int RootStackOffset = RI->StackOffset;
+      Constant *RootMetadata = RI->Metadata;
+    }
+  }
+
+If the ``llvm.gcroot`` intrinsic is eliminated before code generation by a
+custom lowering pass, LLVM will compute an empty stack map.  This may be useful
+for collector plugins which implement reference counting or a shadow stack.
+
+.. _init-roots:
+
+Initializing roots to null: ``InitRoots``
+-----------------------------------------
+
+.. code-block:: c++
+
+  MyGC::MyGC() {
+    InitRoots = true;
+  }
+
+When set, LLVM will automatically initialize each root to ``null`` upon entry to
+the function.  This prevents the GC's sweep phase from visiting uninitialized
+pointers, which will almost certainly cause it to crash.  This initialization
+occurs before custom lowering, so the two may be used together.
+
+Since LLVM does not yet compute liveness information, there is no means of
+distinguishing an uninitialized stack root from an initialized one.  Therefore,
+this feature should be used by all GC plugins.  It is enabled by default.
+
+Custom lowering of intrinsics: ``CustomRoots``, ``CustomReadBarriers``, and ``CustomWriteBarriers``
+---------------------------------------------------------------------------------------------------
+
+For GCs which use barriers or unusual treatment of stack roots, these 
+flags allow the collector to perform arbitrary transformations of the
+LLVM IR:
+
+.. code-block:: c++
+
+  class MyGC : public GCStrategy {
+  public:
+    MyGC() {
+      CustomRoots = true;
+      CustomReadBarriers = true;
+      CustomWriteBarriers = true;
+    }
+  };
+
+If any of these flags are set, LLVM suppresses its default lowering for
+the corresponding intrinsics.  Instead, you must provide a custom Pass
+which lowers the intrinsics as desired.  If you have opted in to custom
+lowering of a particular intrinsic your pass **must** eliminate all 
+instances of the corresponding intrinsic in functions which opt in to
+your GC.  The best example of such a pass is the ShadowStackGC and it's 
+ShadowStackGCLowering pass.  
+
+There is currently no way to register such a custom lowering pass 
+without building a custom copy of LLVM.
+
+.. _safe-points:
+
+Generating safe points: ``NeededSafePoints``
+--------------------------------------------
+
+LLVM can compute four kinds of safe points:
+
+.. code-block:: c++
+
+  namespace GC {
+    /// PointKind - The type of a collector-safe point.
+    ///
+    enum PointKind {
+      Loop,    //< Instr is a loop (backwards branch).
+      Return,  //< Instr is a return instruction.
+      PreCall, //< Instr is a call instruction.
+      PostCall //< Instr is the return address of a call.
+    };
+  }
+
+A collector can request any combination of the four by setting the
+``NeededSafePoints`` mask:
+
+.. code-block:: c++
+
+  MyGC::MyGC()  {
+    NeededSafePoints = 1 << GC::Loop
+                     | 1 << GC::Return
+                     | 1 << GC::PreCall
+                     | 1 << GC::PostCall;
+  }
+
+It can then use the following routines to access safe points.
+
+.. code-block:: c++
+
+  for (iterator I = begin(), E = end(); I != E; ++I) {
+    GCFunctionInfo *MD = *I;
+    size_t PointCount = MD->size();
+
+    for (GCFunctionInfo::iterator PI = MD->begin(),
+                                  PE = MD->end(); PI != PE; ++PI) {
+      GC::PointKind PointKind = PI->Kind;
+      unsigned PointNum = PI->Num;
+    }
+  }
+
+Almost every collector requires ``PostCall`` safe points, since these correspond
+to the moments when the function is suspended during a call to a subroutine.
+
+Threaded programs generally require ``Loop`` safe points to guarantee that the
+application will reach a safe point within a bounded amount of time, even if it
+is executing a long-running loop which contains no function calls.
+
+Threaded collectors may also require ``Return`` and ``PreCall`` safe points to
+implement "stop the world" techniques using self-modifying code, where it is
+important that the program not exit the function without reaching a safe point
+(because only the topmost function has been patched).
+
+.. _assembly:
+
+Emitting assembly code: ``GCMetadataPrinter``
+---------------------------------------------
+
+LLVM allows a plugin to print arbitrary assembly code before and after the rest
+of a module's assembly code.  At the end of the module, the GC can compile the
+LLVM stack map into assembly code. (At the beginning, this information is not
+yet computed.)
+
+Since AsmWriter and CodeGen are separate components of LLVM, a separate abstract
+base class and registry is provided for printing assembly code, the
+``GCMetadaPrinter`` and ``GCMetadataPrinterRegistry``.  The AsmWriter will look
+for such a subclass if the ``GCStrategy`` sets ``UsesMetadata``:
+
+.. code-block:: c++
+
+  MyGC::MyGC() {
+    UsesMetadata = true;
+  }
+
+This separation allows JIT-only clients to be smaller.
+
+Note that LLVM does not currently have analogous APIs to support code generation
+in the JIT, nor using the object writers.
+
+.. code-block:: c++
+
+  // lib/MyGC/MyGCPrinter.cpp - Example LLVM GC printer
+
+  #include "llvm/CodeGen/GCMetadataPrinter.h"
+  #include "llvm/Support/Compiler.h"
+
+  using namespace llvm;
+
+  namespace {
+    class LLVM_LIBRARY_VISIBILITY MyGCPrinter : public GCMetadataPrinter {
+    public:
+      virtual void beginAssembly(AsmPrinter &AP);
+
+      virtual void finishAssembly(AsmPrinter &AP);
+    };
+
+    GCMetadataPrinterRegistry::Add<MyGCPrinter>
+    X("mygc", "My bespoke garbage collector.");
+  }
+
+The collector should use ``AsmPrinter`` to print portable assembly code.  The
+collector itself contains the stack map for the entire module, and may access
+the ``GCFunctionInfo`` using its own ``begin()`` and ``end()`` methods.  Here's
+a realistic example:
+
+.. code-block:: c++
+
+  #include "llvm/CodeGen/AsmPrinter.h"
+  #include "llvm/IR/Function.h"
+  #include "llvm/IR/DataLayout.h"
+  #include "llvm/Target/TargetAsmInfo.h"
+  #include "llvm/Target/TargetMachine.h"
+
+  void MyGCPrinter::beginAssembly(AsmPrinter &AP) {
+    // Nothing to do.
+  }
+
+  void MyGCPrinter::finishAssembly(AsmPrinter &AP) {
+    MCStreamer &OS = AP.OutStreamer;
+    unsigned IntPtrSize = AP.getPointerSize();
+
+    // Put this in the data section.
+    OS.SwitchSection(AP.getObjFileLowering().getDataSection());
+
+    // For each function...
+    for (iterator FI = begin(), FE = end(); FI != FE; ++FI) {
+      GCFunctionInfo &MD = **FI;
+
+      // A compact GC layout. Emit this data structure:
+      //
+      // struct {
+      //   int32_t PointCount;
+      //   void *SafePointAddress[PointCount];
+      //   int32_t StackFrameSize; // in words
+      //   int32_t StackArity;
+      //   int32_t LiveCount;
+      //   int32_t LiveOffsets[LiveCount];
+      // } __gcmap_<FUNCTIONNAME>;
+
+      // Align to address width.
+      AP.EmitAlignment(IntPtrSize == 4 ? 2 : 3);
+
+      // Emit PointCount.
+      OS.AddComment("safe point count");
+      AP.EmitInt32(MD.size());
+
+      // And each safe point...
+      for (GCFunctionInfo::iterator PI = MD.begin(),
+                                    PE = MD.end(); PI != PE; ++PI) {
+        // Emit the address of the safe point.
+        OS.AddComment("safe point address");
+        MCSymbol *Label = PI->Label;
+        AP.EmitLabelPlusOffset(Label/*Hi*/, 0/*Offset*/, 4/*Size*/);
+      }
+
+      // Stack information never change in safe points! Only print info from the
+      // first call-site.
+      GCFunctionInfo::iterator PI = MD.begin();
+
+      // Emit the stack frame size.
+      OS.AddComment("stack frame size (in words)");
+      AP.EmitInt32(MD.getFrameSize() / IntPtrSize);
+
+      // Emit stack arity, i.e. the number of stacked arguments.
+      unsigned RegisteredArgs = IntPtrSize == 4 ? 5 : 6;
+      unsigned StackArity = MD.getFunction().arg_size() > RegisteredArgs ?
+                            MD.getFunction().arg_size() - RegisteredArgs : 0;
+      OS.AddComment("stack arity");
+      AP.EmitInt32(StackArity);
+
+      // Emit the number of live roots in the function.
+      OS.AddComment("live root count");
+      AP.EmitInt32(MD.live_size(PI));
+
+      // And for each live root...
+      for (GCFunctionInfo::live_iterator LI = MD.live_begin(PI),
+                                         LE = MD.live_end(PI);
+                                         LI != LE; ++LI) {
+        // Emit live root's offset within the stack frame.
+        OS.AddComment("stack index (offset / wordsize)");
+        AP.EmitInt32(LI->StackOffset);
+      }
+    }
+  }
+
+References
+==========
+
+.. _appel89:
+
+[Appel89] Runtime Tags Aren't Necessary. Andrew W. Appel. Lisp and Symbolic
+Computation 19(7):703-705, July 1989.
+
+.. _goldberg91:
+
+[Goldberg91] Tag-free garbage collection for strongly typed programming
+languages. Benjamin Goldberg. ACM SIGPLAN PLDI'91.
+
+.. _tolmach94:
+
+[Tolmach94] Tag-free garbage collection using explicit type parameters. Andrew
+Tolmach. Proceedings of the 1994 ACM conference on LISP and functional
+programming.
+
+.. _henderson02:
+
+[Henderson2002] `Accurate Garbage Collection in an Uncooperative Environment
+<http://citeseer.ist.psu.edu/henderson02accurate.html>`__