[www-releases] r225843 - Add documentation for 3.5.1
Tom Stellard
thomas.stellard at amd.com
Tue Jan 13 14:55:45 PST 2015
Added: www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl6.html
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl6.html?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl6.html (added)
+++ www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl6.html Tue Jan 13 16:55:20 2015
@@ -0,0 +1,1485 @@
+
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+
+ <title>6. Kaleidoscope: Extending the Language: User-defined Operators — LLVM 3.5 documentation</title>
+
+ <link rel="stylesheet" href="../_static/llvm-theme.css" type="text/css" />
+ <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT: '../',
+ VERSION: '3.5',
+ COLLAPSE_INDEX: false,
+ FILE_SUFFIX: '.html',
+ HAS_SOURCE: true
+ };
+ </script>
+ <script type="text/javascript" src="../_static/jquery.js"></script>
+ <script type="text/javascript" src="../_static/underscore.js"></script>
+ <script type="text/javascript" src="../_static/doctools.js"></script>
+ <link rel="top" title="LLVM 3.5 documentation" href="../index.html" />
+ <link rel="up" title="LLVM Tutorial: Table of Contents" href="index.html" />
+ <link rel="next" title="7. Kaleidoscope: Extending the Language: Mutable Variables" href="OCamlLangImpl7.html" />
+ <link rel="prev" title="5. Kaleidoscope: Extending the Language: Control Flow" href="OCamlLangImpl5.html" />
+<style type="text/css">
+ table.right { float: right; margin-left: 20px; }
+ table.right td { border: 1px solid #ccc; }
+</style>
+
+ </head>
+ <body>
+<div class="logo">
+ <a href="../index.html">
+ <img src="../_static/logo.png"
+ alt="LLVM Logo" width="250" height="88"/></a>
+</div>
+
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="../genindex.html" title="General Index"
+ accesskey="I">index</a></li>
+ <li class="right" >
+ <a href="OCamlLangImpl7.html" title="7. Kaleidoscope: Extending the Language: Mutable Variables"
+ accesskey="N">next</a> |</li>
+ <li class="right" >
+ <a href="OCamlLangImpl5.html" title="5. Kaleidoscope: Extending the Language: Control Flow"
+ accesskey="P">previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="../index.html">Documentation</a>»</li>
+
+ <li><a href="index.html" accesskey="U">LLVM Tutorial: Table of Contents</a> »</li>
+ </ul>
+ </div>
+
+
+ <div class="document">
+ <div class="documentwrapper">
+ <div class="body">
+
+ <div class="section" id="kaleidoscope-extending-the-language-user-defined-operators">
+<h1>6. Kaleidoscope: Extending the Language: User-defined Operators<a class="headerlink" href="#kaleidoscope-extending-the-language-user-defined-operators" title="Permalink to this headline">¶</a></h1>
+<div class="contents local topic" id="contents">
+<ul class="simple">
+<li><a class="reference internal" href="#chapter-6-introduction" id="id1">Chapter 6 Introduction</a></li>
+<li><a class="reference internal" href="#user-defined-operators-the-idea" id="id2">User-defined Operators: the Idea</a></li>
+<li><a class="reference internal" href="#user-defined-binary-operators" id="id3">User-defined Binary Operators</a></li>
+<li><a class="reference internal" href="#user-defined-unary-operators" id="id4">User-defined Unary Operators</a></li>
+<li><a class="reference internal" href="#kicking-the-tires" id="id5">Kicking the Tires</a></li>
+<li><a class="reference internal" href="#full-code-listing" id="id6">Full Code Listing</a></li>
+</ul>
+</div>
+<div class="section" id="chapter-6-introduction">
+<h2><a class="toc-backref" href="#id1">6.1. Chapter 6 Introduction</a><a class="headerlink" href="#chapter-6-introduction" title="Permalink to this headline">¶</a></h2>
+<p>Welcome to Chapter 6 of the “<a class="reference external" href="index.html">Implementing a language with
+LLVM</a>” tutorial. At this point in our tutorial, we
+have a fully functional language that is fairly minimal, but also
+useful. There is still one big problem with it, however: our language
+doesn’t have many useful operators (like division, logical negation, or
+even any comparisons besides less-than).</p>
+<p>This chapter of the tutorial takes a wild digression into adding
+user-defined operators to the simple and beautiful Kaleidoscope
+language. This digression leaves us with a language that is simple and,
+in some ways, ugly, but also more powerful. One of the great
+things about creating your own language is that you get to decide what
+is good or bad. In this tutorial we’ll assume that it is okay to use
+this as a way to show some interesting parsing techniques.</p>
+<p>At the end of this chapter, we’ll run through an example Kaleidoscope
+application that <a class="reference external" href="#example">renders the Mandelbrot set</a>. This gives an
+example of what you can build with Kaleidoscope and its feature set.</p>
+</div>
+<div class="section" id="user-defined-operators-the-idea">
+<h2><a class="toc-backref" href="#id2">6.2. User-defined Operators: the Idea</a><a class="headerlink" href="#user-defined-operators-the-idea" title="Permalink to this headline">¶</a></h2>
+<p>The “operator overloading” that we will add to Kaleidoscope is more
+general than what languages like C++ offer. In C++, you are only allowed to
+redefine existing operators: you can’t programmatically change the
+grammar, introduce new operators, change precedence levels, etc. In this
+chapter, we will add this capability to Kaleidoscope, which will let the
+user round out the set of operators that are supported.</p>
+<p>The point of going into user-defined operators in a tutorial like this
+is to show the power and flexibility of using a hand-written parser.
+Thus far, the parser we have been implementing uses recursive descent
+for most parts of the grammar and operator precedence parsing for the
+expressions. See <a class="reference external" href="OCamlLangImpl2.html">Chapter 2</a> for details. Without
+using operator precedence parsing, it would be very difficult to allow
+the programmer to introduce new operators into the grammar: the grammar
+is dynamically extensible as the JIT runs.</p>
+<p>The two specific features we’ll add are programmable unary operators
+(right now, Kaleidoscope has no unary operators at all) as well as
+binary operators. An example of this is:</p>
+<div class="highlight-python"><pre># Logical unary not.
+def unary!(v)
+ if v then
+ 0
+ else
+ 1;
+
+# Define > with the same precedence as <.
+def binary> 10 (LHS RHS)
+ RHS < LHS;
+
+# Binary "logical or", (note that it does not "short circuit")
+def binary| 5 (LHS RHS)
+ if LHS then
+ 1
+ else if RHS then
+ 1
+ else
+ 0;
+
+# Define = with slightly lower precedence than relationals.
+def binary= 9 (LHS RHS)
+ !(LHS < RHS | LHS > RHS);</pre>
+</div>
+<p>Many languages aspire to being able to implement their standard runtime
+library in the language itself. In Kaleidoscope, we can implement
+significant parts of the language in the library!</p>
+<p>We will break down implementation of these features into two parts:
+implementing support for user-defined binary operators and adding unary
+operators.</p>
+</div>
+<div class="section" id="user-defined-binary-operators">
+<h2><a class="toc-backref" href="#id3">6.3. User-defined Binary Operators</a><a class="headerlink" href="#user-defined-binary-operators" title="Permalink to this headline">¶</a></h2>
+<p>Adding support for user-defined binary operators is pretty simple with
+our current framework. We’ll first add support for the unary/binary
+keywords:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">type</span> <span class="n">token</span> <span class="o">=</span>
+ <span class="o">...</span>
+ <span class="c">(* operators *)</span>
+ <span class="o">|</span> <span class="nc">Binary</span> <span class="o">|</span> <span class="nn">Unary</span>
+
+<span class="p">...</span>
+
+<span class="n">and</span> <span class="n">lex_ident</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">...</span>
+ <span class="o">|</span> <span class="s2">"for"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">For</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"in"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">In</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"binary"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Binary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"unary"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Unary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+</pre></div>
+</div>
+<p>This just adds lexer support for the unary and binary keywords, like we
+did in <a class="reference external" href="OCamlLangImpl5.html#iflexer">previous chapters</a>. One nice
+thing about our current AST is that it represents binary operators in
+full generality by using their ASCII code as the opcode. For our
+extended operators, we’ll use this same representation, so we don’t need
+any new AST or parser support.</p>
+<p>On the other hand, we have to be able to represent the definitions of
+these new operators, in the “def binary| 5” part of the function
+definition. In our grammar so far, the “name” for the function
+definition is parsed as the “prototype” production and into the
+<tt class="docutils literal"><span class="pre">Ast.Prototype</span></tt> AST node. To represent our new user-defined operators
+as prototypes, we have to extend the <tt class="docutils literal"><span class="pre">Ast.Prototype</span></tt> AST node like
+this:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* proto - This type represents the "prototype" for a function, which captures</span>
+<span class="c"> * its name, and its argument names (thus implicitly the number of arguments the</span>
+<span class="c"> * function takes). *)</span>
+<span class="k">type</span> <span class="n">proto</span> <span class="o">=</span>
+ <span class="o">|</span> <span class="nc">Prototype</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="kt">string</span> <span class="kt">array</span>
+ <span class="o">|</span> <span class="nc">BinOpPrototype</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="kt">string</span> <span class="kt">array</span> <span class="o">*</span> <span class="kt">int</span>
+</pre></div>
+</div>
+<p>Basically, in addition to knowing a name for the prototype, we now keep
+track of whether it was an operator, and if it was, what precedence
+level the operator is at. The precedence is only used for binary
+operators (as you’ll see below, it just doesn’t apply for unary
+operators). Now that we have a way to represent the prototype for a
+user-defined operator, we need to parse it:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* prototype</span>
+<span class="c"> * ::= id '(' id* ')'</span>
+<span class="c"> * ::= binary LETTER number? '(' id id ')'</span>
+<span class="c"> * ::= unary LETTER number? '(' id ')' *)</span>
+<span class="k">let</span> <span class="n">parse_prototype</span> <span class="o">=</span>
+ <span class="k">let</span> <span class="k">rec</span> <span class="n">parse_args</span> <span class="n">accumulator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_args</span> <span class="o">(</span><span class="n">id</span><span class="o">::</span><span class="n">accumulator</span><span class="o">)</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">accumulator</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">parse_operator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">>]</span> <span class="o">-></span> <span class="s2">"unary"</span><span class="o">,</span> <span class="mi">1</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">>]</span> <span class="o">-></span> <span class="s2">"binary"</span><span class="o">,</span> <span class="mi">2</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">parse_binary_precedence</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">int_of_float</span> <span class="n">n</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="mi">30</span>
+ <span class="k">in</span>
+ <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span> <span class="o">??</span> <span class="s2">"expected '(' in prototype"</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')' in prototype"</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="c">(* success. *)</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">id</span><span class="o">,</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">))</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">(</span><span class="n">prefix</span><span class="o">,</span> <span class="n">kind</span><span class="o">)=</span><span class="n">parse_operator</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">op</span> <span class="o">??</span> <span class="s2">"expected an operator"</span><span class="o">;</span>
+ <span class="c">(* Read the precedence if present. *)</span>
+ <span class="n">binary_precedence</span><span class="o">=</span><span class="n">parse_binary_precedence</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span> <span class="o">??</span> <span class="s2">"expected '(' in prototype"</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')' in prototype"</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">name</span> <span class="o">=</span> <span class="n">prefix</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">args</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Verify right number of arguments for operator. *)</span>
+ <span class="k">if</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">args</span> <span class="o">!=</span> <span class="n">kind</span>
+ <span class="k">then</span> <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"invalid number of operands for operator"</span><span class="o">)</span>
+ <span class="k">else</span>
+ <span class="k">if</span> <span class="n">kind</span> <span class="o">==</span> <span class="mi">1</span> <span class="k">then</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">)</span>
+ <span class="k">else</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">,</span> <span class="n">binary_precedence</span><span class="o">)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"expected function name in prototype"</span><span class="o">)</span>
+</pre></div>
+</div>
+<p>This is all fairly straightforward parsing code, and we have already
+seen a lot of similar code in the past. One interesting part about the
+code above is the couple of lines that set up <tt class="docutils literal"><span class="pre">name</span></tt> for
+operators. This builds names like “binary@” for a newly defined “@”
+operator. It takes advantage of the fact that symbol names in the
+LLVM symbol table are allowed to have any character in them, including
+embedded nul characters.</p>
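+<p>The naming scheme above can be tried out in plain OCaml, without LLVM.
+The following is a minimal sketch: the helper names <tt class="docutils literal"><span class="pre">operator_name</span></tt> and
+<tt class="docutils literal"><span class="pre">operator_char</span></tt> are hypothetical and do not appear in the tutorial code,
+which builds and takes apart the name inline.</p>

```ocaml
(* Hypothetical helper: build the symbol name for a user-defined operator,
 * e.g. "binary@", the same way parse_prototype does inline. *)
let operator_name (prefix : string) (op : char) : string =
  prefix ^ String.make 1 op

(* Hypothetical helper: recover the operator character, which is always the
 * last character of the mangled name. *)
let operator_char (name : string) : char =
  name.[String.length name - 1]

let () =
  let name = operator_name "binary" '@' in
  Printf.printf "%s -> %c\n" name (operator_char name)
```

+<p>Because the operator is a single character appended to a fixed prefix,
+taking the last character of the name is enough to get it back.</p>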
+<p>The next interesting thing to add is codegen support for these binary
+operators. Given our current structure, this is a simple addition of a
+default case for our existing binary operator node:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">let</span> <span class="n">codegen_expr</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">...</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">lhs</span><span class="o">,</span> <span class="n">rhs</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">lhs_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">lhs</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">rhs_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">rhs</span> <span class="k">in</span>
+ <span class="k">begin</span>
+ <span class="k">match</span> <span class="n">op</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="sc">'+'</span> <span class="o">-></span> <span class="n">build_fadd</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"addtmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="sc">'-'</span> <span class="o">-></span> <span class="n">build_fsub</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"subtmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="sc">'*'</span> <span class="o">-></span> <span class="n">build_fmul</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"multmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="sc">'<'</span> <span class="o">-></span>
+ <span class="c">(* Convert bool 0/1 to double 0.0 or 1.0 *)</span>
+ <span class="k">let</span> <span class="n">i</span> <span class="o">=</span> <span class="n">build_fcmp</span> <span class="nn">Fcmp</span><span class="p">.</span><span class="nc">Ult</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"cmptmp"</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="n">build_uitofp</span> <span class="n">i</span> <span class="n">double_type</span> <span class="s2">"booltmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span>
+ <span class="c">(* If it wasn't a builtin binary operator, it must be a user defined</span>
+<span class="c"> * one. Emit a call to it. *)</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span> <span class="s2">"binary"</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">callee</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">callee</span> <span class="o">-></span> <span class="n">callee</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"binary operator not found!"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="n">build_call</span> <span class="n">callee</span> <span class="o">[|</span><span class="n">lhs_val</span><span class="o">;</span> <span class="n">rhs_val</span><span class="o">|]</span> <span class="s2">"binop"</span> <span class="n">builder</span>
+ <span class="k">end</span>
+</pre></div>
+</div>
+<p>As you can see above, the new code is actually really simple. It just
+does a lookup for the appropriate operator in the symbol table and
+generates a function call to it. Since user-defined operators are just
+built as normal functions (because the “prototype” boils down to a
+function with the right name) everything falls into place.</p>
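+<p>The same dispatch idea can be illustrated without LLVM. The sketch below
+is an interpreter analogue, not the tutorial's code generator: the
+<tt class="docutils literal"><span class="pre">functions</span></tt> table is a hypothetical stand-in for the module's symbol
+table, and <tt class="docutils literal"><span class="pre">eval_binary</span></tt> is a hypothetical name. Built-in operators are
+handled directly, and anything else falls through to a lookup by mangled
+name.</p>

```ocaml
exception Error of string

(* Hypothetical stand-in for the LLVM module's symbol table: maps a mangled
 * name such as "binary>" to an OCaml function on doubles. *)
let functions : (string, float -> float -> float) Hashtbl.t =
  Hashtbl.create 10

let eval_binary (op : char) (lhs : float) (rhs : float) : float =
  match op with
  | '+' -> lhs +. rhs
  | '-' -> lhs -. rhs
  | '*' -> lhs *. rhs
  | '<' -> if lhs < rhs then 1.0 else 0.0
  | _ ->
      (* Not a builtin, so it must be user defined: call "binary<op>". *)
      let callee = "binary" ^ String.make 1 op in
      begin match Hashtbl.find_opt functions callee with
      | Some f -> f lhs rhs
      | None -> raise (Error "binary operator not found!")
      end

let () =
  (* Analogue of: def binary> 10 (LHS RHS) RHS < LHS; *)
  Hashtbl.add functions "binary>" (fun lhs rhs -> eval_binary '<' rhs lhs);
  Printf.printf "3 > 2 evaluates to %g\n" (eval_binary '>' 3.0 2.0)
```

+<p>As in the code generator, nothing about the default case is specific to
+any one operator: defining a new operator only means adding a function
+under the right name.</p>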
+<p>The final piece of code we are missing is a bit of top-level magic:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">let</span> <span class="n">codegen_func</span> <span class="n">the_fpm</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Function</span> <span class="o">(</span><span class="n">proto</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span> <span class="o">-></span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">clear</span> <span class="n">named_values</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">codegen_proto</span> <span class="n">proto</span> <span class="k">in</span>
+
+ <span class="c">(* If this is an operator, install it. *)</span>
+ <span class="k">begin</span> <span class="k">match</span> <span class="n">proto</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">,</span> <span class="n">prec</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">op</span> <span class="o">=</span> <span class="n">name</span><span class="o">.[</span><span class="nn">String</span><span class="p">.</span><span class="n">length</span> <span class="n">name</span> <span class="o">-</span> <span class="mi">1</span><span class="o">]</span> <span class="k">in</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="n">op</span> <span class="n">prec</span><span class="o">;</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span> <span class="bp">()</span>
+ <span class="k">end</span><span class="o">;</span>
+
+ <span class="c">(* Create a new basic block to start insertion into. *)</span>
+ <span class="k">let</span> <span class="n">bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"entry"</span> <span class="n">the_function</span> <span class="k">in</span>
+ <span class="n">position_at_end</span> <span class="n">bb</span> <span class="n">builder</span><span class="o">;</span>
+ <span class="o">...</span>
+</pre></div>
+</div>
+<p>Basically, before generating code for a function, if it is a user-defined
+operator, we register it in the precedence table. This allows the binary
+operator parsing logic we already have in place to handle it. Since we
+are working on a fully general operator precedence parser, this is all
+we need to do to “extend the grammar”.</p>
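+<p>The effect of that registration can be seen in isolation with a small
+sketch. It assumes the precedence table is a <tt class="docutils literal"><span class="pre">Hashtbl</span></tt> keyed by
+operator character, as the <tt class="docutils literal"><span class="pre">Hashtbl.add</span></tt> call above suggests. The
+<tt class="docutils literal"><span class="pre">precedence</span></tt> helper and the built-in precedence values are assumptions
+modelled on earlier chapters, not code taken from this one.</p>

```ocaml
(* Precedence of each known binary operator; higher binds tighter. *)
let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10

(* Assumed helper: an operator's precedence, or -1 if the character is not
 * (yet) a known binary operator. *)
let precedence (c : char) : int =
  try Hashtbl.find binop_precedence c with Not_found -> -1

let () =
  (* Built-in operators (values assumed from earlier chapters). *)
  Hashtbl.add binop_precedence '<' 10;
  Hashtbl.add binop_precedence '+' 20;
  Hashtbl.add binop_precedence '-' 20;
  Hashtbl.add binop_precedence '*' 40;

  (* Before "def binary| 5 (LHS RHS) ..." is processed, '|' is unknown. *)
  assert (precedence '|' = -1);

  (* Installing the operator is a single table insertion. *)
  Hashtbl.add binop_precedence '|' 5;
  assert (precedence '|' = 5);
  print_endline "'|' is now a binary operator"
```

+<p>Once the entry exists, a precedence-climbing loop that consults this
+table treats the new character like any other binary operator.</p>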
+<p>Now we have useful user-defined binary operators. This builds a lot on
+the previous framework we built for other operators. Adding unary
+operators is a bit more challenging, because we don’t have any framework
+for it yet. Let’s see what it takes.</p>
+</div>
+<div class="section" id="user-defined-unary-operators">
+<h2><a class="toc-backref" href="#id4">6.4. User-defined Unary Operators</a><a class="headerlink" href="#user-defined-unary-operators" title="Permalink to this headline">¶</a></h2>
+<p>Since we don’t currently support unary operators in the Kaleidoscope
+language, we’ll need to add everything to support them. Above, we added
+simple support for the ‘unary’ keyword to the lexer. In addition to
+that, we need an AST node:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">type</span> <span class="n">expr</span> <span class="o">=</span>
+ <span class="o">...</span>
+ <span class="c">(* variant for a unary operator. *)</span>
+ <span class="o">|</span> <span class="nc">Unary</span> <span class="k">of</span> <span class="kt">char</span> <span class="o">*</span> <span class="n">expr</span>
+ <span class="o">...</span>
+</pre></div>
+</div>
+<p>This AST node is very simple and obvious by now. It directly mirrors the
+binary operator AST node, except that it only has one child. With this,
+we need to add the parsing logic. Parsing a unary operator is pretty
+simple, so we’ll add a new function to do it:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* unary</span>
+<span class="c"> * ::= primary</span>
+<span class="c"> * ::= '!' unary *)</span>
+<span class="ow">and</span> <span class="n">parse_unary</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* If this is a unary operator, read it. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">op</span> <span class="k">when</span> <span class="n">op</span> <span class="o">!=</span> <span class="sc">'('</span> <span class="o">&&</span> <span class="n">op</span> <span class="o">!=</span> <span class="sc">')'</span><span class="o">;</span> <span class="n">operand</span><span class="o">=</span><span class="n">parse_unary</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">operand</span><span class="o">)</span>
+
+ <span class="c">(* If the current token is not an operator, it must be a primary expr. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">parse_primary</span> <span class="n">stream</span>
+</pre></div>
+</div>
+<p>The grammar we add is pretty straightforward here. If we see a unary
+operator when parsing a primary expression, we eat the operator as a
+prefix and parse the remaining piece as another unary operator. This
+allows us to handle multiple unary operators (e.g. “!!x”). Note that
+unary operators can’t have ambiguous parses like binary operators can,
+so there is no need for precedence information.</p>
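+<p>The grammar rule can be exercised on its own with a small sketch that
+works on a plain token list instead of a camlp4 stream. The simplified
+<tt class="docutils literal"><span class="pre">token</span></tt> and <tt class="docutils literal"><span class="pre">expr</span></tt> types below are assumptions made for the
+example and carry only the cases it needs.</p>

```ocaml
(* Simplified stand-ins for Token.token and Ast.expr. *)
type token = Kwd of char | Ident of string
type expr = Variable of string | Unary of char * expr

(* unary ::= primary | op unary
 * Returns the parsed expression together with the unread tokens. *)
let rec parse_unary (tokens : token list) : expr * token list =
  match tokens with
  | Kwd op :: rest when op <> '(' && op <> ')' ->
      (* Eat the operator, then parse the rest as another unary expression. *)
      let operand, rest = parse_unary rest in
      (Unary (op, operand), rest)
  | Ident name :: rest ->
      (* The only "primary" this sketch knows about is a variable. *)
      (Variable name, rest)
  | _ -> failwith "expected an expression"

let () =
  match parse_unary [Kwd '!'; Kwd '!'; Ident "x"] with
  | Unary ('!', Unary ('!', Variable "x")), [] -> print_endline "parsed !!x"
  | _ -> print_endline "unexpected parse"
```

+<p>Each prefix operator wraps whatever follows it, so a chain of operators
+nests from the outside in with no precedence lookup involved.</p>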
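+<p>The problem with this function is that we need to call <tt class="docutils literal"><span class="pre">parse_unary</span></tt> from
+somewhere. To do this, we change previous callers of <tt class="docutils literal"><span class="pre">parse_primary</span></tt> to
+call <tt class="docutils literal"><span class="pre">parse_unary</span></tt> instead:</p>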
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* binoprhs</span>
+<span class="c"> * ::= ('+' unary)* *)</span>
+<span class="ow">and</span> <span class="n">parse_bin_rhs</span> <span class="n">expr_prec</span> <span class="n">lhs</span> <span class="n">stream</span> <span class="o">=</span>
+ <span class="o">...</span>
+ <span class="c">(* Parse the unary expression after the binary operator. *)</span>
+ <span class="k">let</span> <span class="n">rhs</span> <span class="o">=</span> <span class="n">parse_unary</span> <span class="n">stream</span> <span class="k">in</span>
+ <span class="o">...</span>
+
+<span class="o">...</span>
+
+<span class="c">(* expression</span>
+<span class="c"> * ::= unary binoprhs *)</span>
+<span class="ow">and</span> <span class="n">parse_expr</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">lhs</span><span class="o">=</span><span class="n">parse_unary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">parse_bin_rhs</span> <span class="mi">0</span> <span class="n">lhs</span> <span class="n">stream</span>
+</pre></div>
+</div>
+<p>With these two simple changes, we are now able to parse unary operators
+and build the AST for them. Next up, we need to add parser support for
+prototypes, to parse the unary operator prototype. We extend the binary
+operator code above with:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* prototype</span>
+<span class="c"> * ::= id '(' id* ')'</span>
+<span class="c"> * ::= binary LETTER number? (id, id)</span>
+<span class="c"> * ::= unary LETTER number? (id) *)</span>
+<span class="k">let</span> <span class="n">parse_prototype</span> <span class="o">=</span>
+ <span class="k">let</span> <span class="k">rec</span> <span class="n">parse_args</span> <span class="n">accumulator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_args</span> <span class="o">(</span><span class="n">id</span><span class="o">::</span><span class="n">accumulator</span><span class="o">)</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">accumulator</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">parse_operator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">>]</span> <span class="o">-></span> <span class="s2">"unary"</span><span class="o">,</span> <span class="mi">1</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">>]</span> <span class="o">-></span> <span class="s2">"binary"</span><span class="o">,</span> <span class="mi">2</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">parse_binary_precedence</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">int_of_float</span> <span class="n">n</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="mi">30</span>
+ <span class="k">in</span>
+ <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span> <span class="o">??</span> <span class="s2">"expected '(' in prototype"</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')' in prototype"</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="c">(* success. *)</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">id</span><span class="o">,</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">))</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">(</span><span class="n">prefix</span><span class="o">,</span> <span class="n">kind</span><span class="o">)=</span><span class="n">parse_operator</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">op</span> <span class="o">??</span> <span class="s2">"expected an operator"</span><span class="o">;</span>
+ <span class="c">(* Read the precedence if present. *)</span>
+ <span class="n">binary_precedence</span><span class="o">=</span><span class="n">parse_binary_precedence</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span> <span class="o">??</span> <span class="s2">"expected '(' in prototype"</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')' in prototype"</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">name</span> <span class="o">=</span> <span class="n">prefix</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">args</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Verify right number of arguments for operator. *)</span>
+ <span class="k">if</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">args</span> <span class="o">!=</span> <span class="n">kind</span>
+ <span class="k">then</span> <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"invalid number of operands for operator"</span><span class="o">)</span>
+ <span class="k">else</span>
+ <span class="k">if</span> <span class="n">kind</span> <span class="o">==</span> <span class="mi">1</span> <span class="k">then</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">)</span>
+ <span class="k">else</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">,</span> <span class="n">binary_precedence</span><span class="o">)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"expected function name in prototype"</span><span class="o">)</span>
+</pre></div>
+</div>
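+<p>The operator branch of the prototype parser can be sketched without camlp4
+streams. The following is a minimal, hypothetical illustration (the names
+<tt class="docutils literal"><span class="pre">operator_kind</span></tt>,
+<tt class="docutils literal"><span class="pre">operator_name</span></tt>,
+<tt class="docutils literal"><span class="pre">precedence_or_default</span></tt> and
+<tt class="docutils literal"><span class="pre">check_arity</span></tt> are not part of
+the tutorial code). It builds the function name from the keyword and the
+operator character, falls back to a precedence of 30, and rejects a wrong
+operand count:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>(* Hypothetical stand-ins for the tutorial's tokens; not part of toy.ml. *)
+type operator_kind = Unary_op | Binary_op
+
+(* Keyword prefix and required operand count, as in parse_operator. *)
+let describe = function
+  | Unary_op -> ("unary", 1)
+  | Binary_op -> ("binary", 2)
+
+(* "unary" ^ "!" gives "unary!", the name that codegen later looks up. *)
+let operator_name kind op =
+  let (prefix, _) = describe kind in
+  prefix ^ String.make 1 op
+
+(* The precedence is optional in the source; 30 is the parser's default. *)
+let precedence_or_default = function
+  | Some n -> int_of_float n
+  | None -> 30
+
+(* Mirrors the "invalid number of operands" check; None signals the error. *)
+let check_arity kind args =
+  let (_, expected) = describe kind in
+  if List.length args = expected then Some (Array.of_list args) else None
+</pre></div>
+</div>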
+<p>As with binary operators, we name unary operators with a name that
+includes the operator character. This assists us at code generation
+time. Speaking of, the final piece we need to add is codegen support for
+unary operators. It looks like this:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">let</span> <span class="k">rec</span> <span class="n">codegen_expr</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">...</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">operand</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">operand</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">operand</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span> <span class="s2">"unary"</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">callee</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">callee</span> <span class="o">-></span> <span class="n">callee</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown unary operator"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="n">build_call</span> <span class="n">callee</span> <span class="o">[|</span><span class="n">operand</span><span class="o">|]</span> <span class="s2">"unop"</span> <span class="n">builder</span>
+</pre></div>
+</div>
+<p>This code is similar to, but simpler than, the code for binary
+operators. It is simpler primarily because it doesn’t need to handle any
+predefined operators.</p>
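+<p>To see the parsing and lookup steps working together, here is a small
+self-contained sketch. It is not the tutorial's implementation: it swaps
+camlp4 streams for a plain character list, the AST for direct evaluation,
+and the LLVM module for a hash table (all three are assumptions made for
+illustration, as are the names
+<tt class="docutils literal"><span class="pre">operators</span></tt>,
+<tt class="docutils literal"><span class="pre">eval_unary</span></tt> and
+<tt class="docutils literal"><span class="pre">chars</span></tt>). It consumes prefix
+operators recursively, as
+<tt class="docutils literal"><span class="pre">parse_unary</span></tt> does, and
+resolves each one by looking up “unary” followed by the operator character:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>(* Hypothetical table standing in for the_module: operator functions are
+ * registered under "unary" ^ op, the same naming scheme the parser uses. *)
+let operators : (string, float -> float) Hashtbl.t = Hashtbl.create 8
+
+let () =
+  Hashtbl.add operators "unary!" (fun v -> if v = 0.0 then 1.0 else 0.0);
+  Hashtbl.add operators "unary-" (fun v -> 0.0 -. v)
+
+(* Evaluate a chain of prefix operators followed by a single-digit operand.
+ * Each operator recurses on the rest of the input, so "!!5" nests as
+ * unary!(unary!(5)). *)
+let rec eval_unary = function
+  | [] -> failwith "unexpected end of input"
+  | ('0' .. '9' as digit) :: _ ->
+      float_of_int (Char.code digit - Char.code '0')
+  | op :: rest ->
+      let operand = eval_unary rest in
+      let callee = "unary" ^ String.make 1 op in
+      (match Hashtbl.find_opt operators callee with
+       | Some f -> f operand
+       | None -> failwith "unknown unary operator")
+
+(* Convert a string to a character list. *)
+let chars s = List.init (String.length s) (String.get s)
+</pre></div>
+</div>
+<p>As in the real code generator, an operator with no registered function is
+reported as an error rather than silently ignored.</p>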
+</div>
+<div class="section" id="kicking-the-tires">
+<h2><a class="toc-backref" href="#id5">6.5. Kicking the Tires</a><a class="headerlink" href="#kicking-the-tires" title="Permalink to this headline">¶</a></h2>
+<p>It is somewhat hard to believe, but with a few simple extensions we’ve
+covered in the last chapters, we have grown a real-ish language. With
+this, we can do a lot of interesting things, including I/O, math, and a
+bunch of other things. For example, we can now add a nice sequencing
+operator (printd is defined to print out the specified value and a
+newline):</p>
+<div class="highlight-python"><pre>ready> extern printd(x);
+Read extern: declare double @printd(double)
+ready> def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.
+..
+ready> printd(123) : printd(456) : printd(789);
+123.000000
+456.000000
+789.000000
+Evaluated to 0.000000</pre>
+</div>
+<p>We can also define a bunch of other “primitive” operations, such as:</p>
+<div class="highlight-python"><pre># Logical unary not.
+def unary!(v)
+ if v then
+ 0
+ else
+ 1;
+
+# Unary negate.
+def unary-(v)
+ 0-v;
+
+# Define > with the same precedence as <.
+def binary> 10 (LHS RHS)
+ RHS < LHS;
+
+# Binary logical or, which does not short circuit.
+def binary| 5 (LHS RHS)
+ if LHS then
+ 1
+ else if RHS then
+ 1
+ else
+ 0;
+
+# Binary logical and, which does not short circuit.
+def binary& 6 (LHS RHS)
+ if !LHS then
+ 0
+ else
+ !!RHS;
+
+# Define = with slightly lower precedence than relationals.
+def binary = 9 (LHS RHS)
+ !(LHS < RHS | LHS > RHS);</pre>
+</div>
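+<p>These definitions rely on Kaleidoscope treating every value as a double,
+with 0.0 meaning false. As a rough cross-check, the same operators can be
+written as ordinary OCaml functions. This is a hypothetical transcription,
+not code from the tutorial, and all of its names are made up for the
+example; it makes it easy to confirm that the definition of “=” behaves like
+equality:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>(* Kaleidoscope booleans are doubles: 0.0 is false, anything else is true. *)
+let truthy v = not (v = 0.0)
+let of_bool b = if b then 1.0 else 0.0
+
+(* The built-in less-than comparison, the only relational we start with. *)
+let lt lhs rhs = of_bool (rhs > lhs)
+
+(* Logical unary not. *)
+let unary_not v = if truthy v then 0.0 else 1.0
+
+(* Greater-than, defined by swapping the operands of less-than. *)
+let binary_gt lhs rhs = lt rhs lhs
+
+(* Logical or; both operands are already evaluated, so no short circuit. *)
+let binary_or lhs rhs =
+  if truthy lhs then 1.0 else if truthy rhs then 1.0 else 0.0
+
+(* Logical and; the double negation normalizes the result to 0.0 or 1.0. *)
+let binary_and lhs rhs =
+  if truthy (unary_not lhs) then 0.0 else unary_not (unary_not rhs)
+
+(* Equality: true when neither operand is less than the other. *)
+let binary_eq lhs rhs =
+  unary_not (binary_or (lt lhs rhs) (binary_gt lhs rhs))
+</pre></div>
+</div>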
+<p>Given the previous if/then/else support, we can also define interesting
+functions for I/O. For example, the following prints out a character
+whose “density” reflects the value passed in: the lower the value, the
+denser the character:</p>
+<div class="highlight-python"><pre>ready>
+
+extern putchard(char)
+def printdensity(d)
+ if d > 8 then
+ putchard(32) # ' '
+ else if d > 4 then
+ putchard(46) # '.'
+ else if d > 2 then
+ putchard(43) # '+'
+ else
+ putchard(42); # '*'
+...
+ready> printdensity(1): printdensity(2): printdensity(3) :
+ printdensity(4): printdensity(5): printdensity(9): putchard(10);
+**++.
+Evaluated to 0.000000</pre>
+</div>
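+<p>The threshold chain is easy to check outside the REPL. The sketch below is
+a hypothetical OCaml transcription of
+<tt class="docutils literal"><span class="pre">printdensity</span></tt> (the names
+<tt class="docutils literal"><span class="pre">density_char</span></tt> and
+<tt class="docutils literal"><span class="pre">density_row</span></tt> are
+assumptions). It returns the character instead of calling
+<tt class="docutils literal"><span class="pre">putchard</span></tt>, so a whole row
+can be collected into a string:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>(* Map a value to a character: the lower the value, the denser the glyph.
+ * The comparisons are strict, so a value equal to a threshold falls
+ * through to the next, denser character. *)
+let density_char d =
+  if d > 8.0 then ' '
+  else if d > 4.0 then '.'
+  else if d > 2.0 then '+'
+  else '*'
+
+(* Render a list of values as one row of output. *)
+let density_row values =
+  String.concat "" (List.map (fun d -> String.make 1 (density_char d)) values)
+</pre></div>
+</div>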
+<p>Based on these simple primitive operations, we can start to define more
+interesting things. For example, here’s a little function that determines
+the number of iterations it takes for a certain function in the complex
+plane to diverge:</p>
+<div class="highlight-python"><pre># determine whether the specific location diverges.
+# Solve for z = z^2 + c in the complex plane.
+def mandleconverger(real imag iters creal cimag)
+ if iters > 255 | (real*real + imag*imag > 4) then
+ iters
+ else
+ mandleconverger(real*real - imag*imag + creal,
+ 2*real*imag + cimag,
+ iters+1, creal, cimag);
+
+# return the number of iterations required for the iteration to escape
+def mandleconverge(real imag)
+ mandleconverger(real, imag, 0, real, imag);</pre>
+</div>
+<p>This “z = z<sup>2</sup> + c” function is a beautiful little creature
+that is the basis for computation of the <a class="reference external" href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot
+Set</a>. Our
+<tt class="docutils literal"><span class="pre">mandleconverge</span></tt> function returns the number of iterations that it
+takes for a complex orbit to escape, saturating once the count exceeds 255. This is not a
+very useful function by itself, but if you plot its value over a
+two-dimensional plane, you can see the Mandelbrot set. Given that we are
+limited to using putchard here, our amazing graphical output is limited,
+but we can whip together something using the density plotter above:</p>
+<div class="highlight-python"><pre># Compute and plot the Mandelbrot set with the specified two-dimensional range
+# info.
+def mandelhelp(xmin xmax xstep ymin ymax ystep)
+ for y = ymin, y < ymax, ystep in (
+ (for x = xmin, x < xmax, xstep in
+ printdensity(mandleconverge(x,y)))
+ : putchard(10)
+ )
+
+# mandel - This is a convenient helper function for plotting the Mandelbrot set
+# from the specified position with the specified magnification.
+def mandel(realstart imagstart realmag imagmag)
+ mandelhelp(realstart, realstart+realmag*78, realmag,
+ imagstart, imagstart+imagmag*40, imagmag);</pre>
+</div>
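+<p>For readers who want to cross-check the arithmetic, the same computation
+can be sketched directly in OCaml. This is a hypothetical port, not part of
+the tutorial sources: the function names are made up, the iteration count is
+an <tt class="docutils literal"><span class="pre">int</span></tt> rather than a
+double, and <tt class="docutils literal"><span class="pre">mandel_row</span></tt>
+takes an explicit column count instead of an upper bound.</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>(* Count iterations of z = z^2 + c until the orbit leaves the circle of
+ * radius 2 or the count exceeds 255, as mandleconverger does. *)
+let rec mandel_converger real imag iters creal cimag =
+  if iters > 255 || real *. real +. imag *. imag > 4.0 then iters
+  else
+    mandel_converger
+      (real *. real -. imag *. imag +. creal)
+      (2.0 *. real *. imag +. cimag)
+      (iters + 1) creal cimag
+
+let mandel_converge real imag = mandel_converger real imag 0 real imag
+
+(* Same thresholds as printdensity, returning the character. *)
+let density_char iters =
+  if iters > 8 then ' '
+  else if iters > 4 then '.'
+  else if iters > 2 then '+'
+  else '*'
+
+(* Render one row of the plot: cols samples starting at xmin, xstep apart.
+ * cols is a hypothetical parameter; the Kaleidoscope version derives the
+ * extent from xmax instead. *)
+let mandel_row xmin xstep cols y =
+  String.init cols (fun i ->
+    density_char (mandel_converge (xmin +. float_of_int i *. xstep) y))
+</pre></div>
+</div>
+<p>Points that escape after only a few iterations map to the dense
+characters, while points that never escape within the limit map to a
+space.</p>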
+<p>Given this, we can try plotting out the Mandelbrot set! Let’s try it out:</p>
+<div class="highlight-python"><pre>ready> mandel(-2.3, -1.3, 0.05, 0.07);
+*******************************+++++++++++*************************************
+*************************+++++++++++++++++++++++*******************************
+**********************+++++++++++++++++++++++++++++****************************
+*******************+++++++++++++++++++++.. ...++++++++*************************
+*****************++++++++++++++++++++++.... ...+++++++++***********************
+***************+++++++++++++++++++++++..... ...+++++++++*********************
+**************+++++++++++++++++++++++.... ....+++++++++********************
+*************++++++++++++++++++++++...... .....++++++++*******************
+************+++++++++++++++++++++....... .......+++++++******************
+***********+++++++++++++++++++.... ... .+++++++*****************
+**********+++++++++++++++++....... .+++++++****************
+*********++++++++++++++........... ...+++++++***************
+********++++++++++++............ ...++++++++**************
+********++++++++++... .......... .++++++++**************
+*******+++++++++..... .+++++++++*************
+*******++++++++...... ..+++++++++*************
+*******++++++....... ..+++++++++*************
+*******+++++...... ..+++++++++*************
+*******.... .... ...+++++++++*************
+*******.... . ...+++++++++*************
+*******+++++...... ...+++++++++*************
+*******++++++....... ..+++++++++*************
+*******++++++++...... .+++++++++*************
+*******+++++++++..... ..+++++++++*************
+********++++++++++... .......... .++++++++**************
+********++++++++++++............ ...++++++++**************
+*********++++++++++++++.......... ...+++++++***************
+**********++++++++++++++++........ .+++++++****************
+**********++++++++++++++++++++.... ... ..+++++++****************
+***********++++++++++++++++++++++....... .......++++++++*****************
+************+++++++++++++++++++++++...... ......++++++++******************
+**************+++++++++++++++++++++++.... ....++++++++********************
+***************+++++++++++++++++++++++..... ...+++++++++*********************
+*****************++++++++++++++++++++++.... ...++++++++***********************
+*******************+++++++++++++++++++++......++++++++*************************
+*********************++++++++++++++++++++++.++++++++***************************
+*************************+++++++++++++++++++++++*******************************
+******************************+++++++++++++************************************
+*******************************************************************************
+*******************************************************************************
+*******************************************************************************
+Evaluated to 0.000000
+ready> mandel(-2, -1, 0.02, 0.04);
+**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
+***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
+*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
+*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
+***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
+**************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
+************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
+***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
+**********++++++++++++++++++++++++++++++++++++++++++++++.............
+********+++++++++++++++++++++++++++++++++++++++++++..................
+*******+++++++++++++++++++++++++++++++++++++++.......................
+******+++++++++++++++++++++++++++++++++++...........................
+*****++++++++++++++++++++++++++++++++............................
+*****++++++++++++++++++++++++++++...............................
+****++++++++++++++++++++++++++...... .........................
+***++++++++++++++++++++++++......... ...... ...........
+***++++++++++++++++++++++............
+**+++++++++++++++++++++..............
+**+++++++++++++++++++................
+*++++++++++++++++++.................
+*++++++++++++++++............ ...
+*++++++++++++++..............
+*+++....++++................
+*.......... ...........
+*
+*.......... ...........
+*+++....++++................
+*++++++++++++++..............
+*++++++++++++++++............ ...
+*++++++++++++++++++.................
+**+++++++++++++++++++................
+**+++++++++++++++++++++..............
+***++++++++++++++++++++++............
+***++++++++++++++++++++++++......... ...... ...........
+****++++++++++++++++++++++++++...... .........................
+*****++++++++++++++++++++++++++++...............................
+*****++++++++++++++++++++++++++++++++............................
+******+++++++++++++++++++++++++++++++++++...........................
+*******+++++++++++++++++++++++++++++++++++++++.......................
+********+++++++++++++++++++++++++++++++++++++++++++..................
+Evaluated to 0.000000
+ready> mandel(-0.9, -1.4, 0.02, 0.03);
+*******************************************************************************
+*******************************************************************************
+*******************************************************************************
+**********+++++++++++++++++++++************************************************
+*+++++++++++++++++++++++++++++++++++++++***************************************
++++++++++++++++++++++++++++++++++++++++++++++**********************************
+++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
+++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
++++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
++++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
++++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
+++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
++++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
+++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
+++++++++++++++++++++++++............. .......++++++++++++++++++++++******
++++++++++++++++++++++++............. ........+++++++++++++++++++++++****
+++++++++++++++++++++++........... ..........++++++++++++++++++++++***
+++++++++++++++++++++........... .........++++++++++++++++++++++*
+++++++++++++++++++............ ...........++++++++++++++++++++
+++++++++++++++++............... .............++++++++++++++++++
+++++++++++++++................. ...............++++++++++++++++
+++++++++++++.................. .................++++++++++++++
++++++++++.................. .................+++++++++++++
+++++++........ . ......... ..++++++++++++
+++............ ...... ....++++++++++
+.............. ...++++++++++
+.............. ....+++++++++
+.............. .....++++++++
+............. ......++++++++
+........... .......++++++++
+......... ........+++++++
+......... ........+++++++
+......... ....+++++++
+........ ...+++++++
+....... ...+++++++
+ ....+++++++
+ .....+++++++
+ ....+++++++
+ ....+++++++
+ ....+++++++
+Evaluated to 0.000000
+ready> ^D</pre>
+</div>
+<p>At this point, you may be starting to realize that Kaleidoscope is a
+real and powerful language. It may not be self-similar :), but it can be
+used to plot things that are!</p>
+<p>With this, we conclude the “adding user-defined operators” chapter of
+the tutorial. We have successfully augmented our language, adding the
+ability to extend the language in the library, and we have shown how
+this can be used to build a simple but interesting end-user application
+in Kaleidoscope. At this point, Kaleidoscope can build a variety of
+applications that are functional and can call functions with
+side-effects, but it can’t actually define and mutate a variable itself.</p>
+<p>Strikingly, variable mutation is an important feature of some languages,
+and it is not at all obvious how to <a class="reference external" href="OCamlLangImpl7.html">add support for mutable
+variables</a> without having to add an “SSA
+construction” phase to your front-end. In the next chapter, we will
+describe how you can add variable mutation without building SSA in your
+front-end.</p>
+</div>
+<div class="section" id="full-code-listing">
+<h2><a class="toc-backref" href="#id6">6.6. Full Code Listing</a><a class="headerlink" href="#full-code-listing" title="Permalink to this headline">¶</a></h2>
+<p>Here is the complete code listing for our running example, enhanced with
+support for user-defined operators. To build this example, use:</p>
+<div class="highlight-bash"><div class="highlight"><pre><span class="c"># Compile</span>
+ocamlbuild toy.byte
+<span class="c"># Run</span>
+./toy.byte
+</pre></div>
+</div>
+<p>Here is the code:</p>
+<dl class="docutils">
+<dt>_tags:</dt>
+<dd><div class="first last highlight-python"><pre><{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
+<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
+<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
+<*.{byte,native}>: use_llvm_scalar_opts, use_bindings</pre>
+</div>
+</dd>
+<dt>myocamlbuild.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="k">open</span> <span class="nc">Ocamlbuild_plugin</span><span class="o">;;</span>
+
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm"</span><span class="o">;;</span>
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm_analysis"</span><span class="o">;;</span>
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm_executionengine"</span><span class="o">;;</span>
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm_target"</span><span class="o">;;</span>
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm_scalar_opts"</span><span class="o">;;</span>
+
+<span class="n">flag</span> <span class="o">[</span><span class="s2">"link"</span><span class="o">;</span> <span class="s2">"ocaml"</span><span class="o">;</span> <span class="s2">"g++"</span><span class="o">]</span> <span class="o">(</span><span class="nc">S</span><span class="o">[</span><span class="nc">A</span><span class="s2">"-cc"</span><span class="o">;</span> <span class="nc">A</span><span class="s2">"g++"</span><span class="o">;</span> <span class="nc">A</span><span class="s2">"-cclib"</span><span class="o">;</span> <span class="nc">A</span><span class="s2">"-rdynamic"</span><span class="o">]);;</span>
+<span class="n">dep</span> <span class="o">[</span><span class="s2">"link"</span><span class="o">;</span> <span class="s2">"ocaml"</span><span class="o">;</span> <span class="s2">"use_bindings"</span><span class="o">]</span> <span class="o">[</span><span class="s2">"bindings.o"</span><span class="o">];;</span>
+</pre></div>
+</div>
+</dd>
+<dt>token.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Lexer Tokens</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="c">(* The lexer returns 'Kwd' if it is an unknown character, otherwise one of</span>
+<span class="c"> * the other tokens for known things. *)</span>
+<span class="k">type</span> <span class="n">token</span> <span class="o">=</span>
+ <span class="c">(* commands *)</span>
+ <span class="o">|</span> <span class="nc">Def</span> <span class="o">|</span> <span class="nc">Extern</span>
+
+ <span class="c">(* primary *)</span>
+ <span class="o">|</span> <span class="nc">Ident</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">|</span> <span class="nc">Number</span> <span class="k">of</span> <span class="kt">float</span>
+
+ <span class="c">(* unknown *)</span>
+ <span class="o">|</span> <span class="nc">Kwd</span> <span class="k">of</span> <span class="kt">char</span>
+
+ <span class="c">(* control *)</span>
+ <span class="o">|</span> <span class="nc">If</span> <span class="o">|</span> <span class="nc">Then</span> <span class="o">|</span> <span class="nc">Else</span>
+ <span class="o">|</span> <span class="nc">For</span> <span class="o">|</span> <span class="nc">In</span>
+
+ <span class="c">(* operators *)</span>
+ <span class="o">|</span> <span class="nc">Binary</span> <span class="o">|</span> <span class="nc">Unary</span>
+</pre></div>
+</div>
+</dd>
+<dt>lexer.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Lexer</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="k">let</span> <span class="k">rec</span> <span class="n">lex</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* Skip any whitespace. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">' '</span> <span class="o">|</span> <span class="sc">'\n'</span> <span class="o">|</span> <span class="sc">'\r'</span> <span class="o">|</span> <span class="sc">'\t'</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">lex</span> <span class="n">stream</span>
+
+ <span class="c">(* identifier: [a-zA-Z][a-zA-Z0-9]* *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'A'</span> <span class="o">..</span> <span class="sc">'Z'</span> <span class="o">|</span> <span class="sc">'a'</span> <span class="o">..</span> <span class="sc">'z'</span> <span class="k">as</span> <span class="n">c</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">buffer</span> <span class="o">=</span> <span class="nn">Buffer</span><span class="p">.</span><span class="n">create</span> <span class="mi">1</span> <span class="k">in</span>
+ <span class="nn">Buffer</span><span class="p">.</span><span class="n">add_char</span> <span class="n">buffer</span> <span class="n">c</span><span class="o">;</span>
+ <span class="n">lex_ident</span> <span class="n">buffer</span> <span class="n">stream</span>
+
+ <span class="c">(* number: [0-9.]+ *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'0'</span> <span class="o">..</span> <span class="sc">'9'</span> <span class="k">as</span> <span class="n">c</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">buffer</span> <span class="o">=</span> <span class="nn">Buffer</span><span class="p">.</span><span class="n">create</span> <span class="mi">1</span> <span class="k">in</span>
+ <span class="nn">Buffer</span><span class="p">.</span><span class="n">add_char</span> <span class="n">buffer</span> <span class="n">c</span><span class="o">;</span>
+ <span class="n">lex_number</span> <span class="n">buffer</span> <span class="n">stream</span>
+
+ <span class="c">(* Comment until end of line. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'#'</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="n">lex_comment</span> <span class="n">stream</span>
+
+ <span class="c">(* Otherwise, just return the character as its ascii value. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="n">c</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">c</span><span class="o">;</span> <span class="n">lex</span> <span class="n">stream</span> <span class="o">>]</span>
+
+ <span class="c">(* end of stream. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="o">[<</span> <span class="o">>]</span>
+
+<span class="ow">and</span> <span class="n">lex_number</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'0'</span> <span class="o">..</span> <span class="sc">'9'</span> <span class="o">|</span> <span class="sc">'.'</span> <span class="k">as</span> <span class="n">c</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Buffer</span><span class="p">.</span><span class="n">add_char</span> <span class="n">buffer</span> <span class="n">c</span><span class="o">;</span>
+ <span class="n">lex_number</span> <span class="n">buffer</span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">stream</span><span class="o">=</span><span class="n">lex</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Number</span> <span class="o">(</span><span class="n">float_of_string</span> <span class="o">(</span><span class="nn">Buffer</span><span class="p">.</span><span class="n">contents</span> <span class="n">buffer</span><span class="o">));</span> <span class="n">stream</span> <span class="o">>]</span>
+
+<span class="ow">and</span> <span class="n">lex_ident</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'A'</span> <span class="o">..</span> <span class="sc">'Z'</span> <span class="o">|</span> <span class="sc">'a'</span> <span class="o">..</span> <span class="sc">'z'</span> <span class="o">|</span> <span class="sc">'0'</span> <span class="o">..</span> <span class="sc">'9'</span> <span class="k">as</span> <span class="n">c</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Buffer</span><span class="p">.</span><span class="n">add_char</span> <span class="n">buffer</span> <span class="n">c</span><span class="o">;</span>
+ <span class="n">lex_ident</span> <span class="n">buffer</span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">stream</span><span class="o">=</span><span class="n">lex</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">match</span> <span class="nn">Buffer</span><span class="p">.</span><span class="n">contents</span> <span class="n">buffer</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="s2">"def"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Def</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"extern"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Extern</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"if"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">If</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"then"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Then</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"else"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Else</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"for"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">For</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"in"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">In</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"binary"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Binary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"unary"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Unary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="n">id</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+
+<span class="ow">and</span> <span class="n">lex_comment</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'\n'</span><span class="o">);</span> <span class="n">stream</span><span class="o">=</span><span class="n">lex</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="n">c</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">lex_comment</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="o">[<</span> <span class="o">>]</span>
+</pre></div>
+</div>
+</dd>
+<dt>ast.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Abstract Syntax Tree (aka Parse Tree)</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="c">(* expr - Base type for all expression nodes. *)</span>
+<span class="k">type</span> <span class="n">expr</span> <span class="o">=</span>
+ <span class="c">(* variant for numeric literals like "1.0". *)</span>
+ <span class="o">|</span> <span class="nc">Number</span> <span class="k">of</span> <span class="kt">float</span>
+
+ <span class="c">(* variant for referencing a variable, like "a". *)</span>
+ <span class="o">|</span> <span class="nc">Variable</span> <span class="k">of</span> <span class="kt">string</span>
+
+ <span class="c">(* variant for a unary operator. *)</span>
+ <span class="o">|</span> <span class="nc">Unary</span> <span class="k">of</span> <span class="kt">char</span> <span class="o">*</span> <span class="n">expr</span>
+
+ <span class="c">(* variant for a binary operator. *)</span>
+ <span class="o">|</span> <span class="nc">Binary</span> <span class="k">of</span> <span class="kt">char</span> <span class="o">*</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span>
+
+ <span class="c">(* variant for function calls. *)</span>
+ <span class="o">|</span> <span class="nc">Call</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="n">expr</span> <span class="kt">array</span>
+
+ <span class="c">(* variant for if/then/else. *)</span>
+ <span class="o">|</span> <span class="nc">If</span> <span class="k">of</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span>
+
+ <span class="c">(* variant for for/in. *)</span>
+ <span class="o">|</span> <span class="nc">For</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span> <span class="n">option</span> <span class="o">*</span> <span class="n">expr</span>
+
+<span class="c">(* proto - This type represents the "prototype" for a function, which captures</span>
+<span class="c"> * its name, and its argument names (thus implicitly the number of arguments the</span>
+<span class="c"> * function takes). *)</span>
+<span class="k">type</span> <span class="n">proto</span> <span class="o">=</span>
+ <span class="o">|</span> <span class="nc">Prototype</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="kt">string</span> <span class="kt">array</span>
+ <span class="o">|</span> <span class="nc">BinOpPrototype</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="kt">string</span> <span class="kt">array</span> <span class="o">*</span> <span class="kt">int</span>
+
+<span class="c">(* func - This type represents a function definition itself. *)</span>
+<span class="k">type</span> <span class="n">func</span> <span class="o">=</span> <span class="nc">Function</span> <span class="k">of</span> <span class="n">proto</span> <span class="o">*</span> <span class="n">expr</span>
+</pre></div>
+</div>
+</dd>
+<dt>parser.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===---------------------------------------------------------------------===</span>
+<span class="c"> * Parser</span>
+<span class="c"> *===---------------------------------------------------------------------===*)</span>
+
+<span class="c">(* binop_precedence - This holds the precedence for each binary operator that is</span>
+<span class="c"> * defined *)</span>
+<span class="k">let</span> <span class="n">binop_precedence</span><span class="o">:(</span><span class="kt">char</span><span class="o">,</span> <span class="kt">int</span><span class="o">)</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">t</span> <span class="o">=</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">create</span> <span class="mi">10</span>
+
+<span class="c">(* precedence - Get the precedence of the pending binary operator token. *)</span>
+<span class="k">let</span> <span class="n">precedence</span> <span class="n">c</span> <span class="o">=</span> <span class="k">try</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">binop_precedence</span> <span class="n">c</span> <span class="k">with</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="o">-</span><span class="mi">1</span>
+
+<span class="c">(* primary</span>
+<span class="c"> * ::= identifierexpr</span>
+<span class="c"> * ::= numberexpr</span>
+<span class="c"> * ::= parenexpr</span>
+<span class="c"> * ::= ifexpr</span>
+<span class="c"> * ::= forexpr *)</span>
+<span class="k">let</span> <span class="k">rec</span> <span class="n">parse_primary</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* numberexpr ::= number *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span> <span class="o">>]</span> <span class="o">-></span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span>
+
+ <span class="c">(* parenexpr ::= '(' expression ')' *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')'"</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+
+ <span class="c">(* identifierexpr</span>
+<span class="c"> * ::= identifier</span>
+<span class="c"> * ::= identifier '(' argumentexpr ')' *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="k">rec</span> <span class="n">parse_args</span> <span class="n">accumulator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">begin</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">','</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_args</span> <span class="o">(</span><span class="n">e</span> <span class="o">::</span> <span class="n">accumulator</span><span class="o">)</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span> <span class="o">::</span> <span class="n">accumulator</span>
+ <span class="k">end</span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">accumulator</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="k">rec</span> <span class="n">parse_ident</span> <span class="n">id</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* Call. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')'"</span><span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Call</span> <span class="o">(</span><span class="n">id</span><span class="o">,</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">))</span>
+
+ <span class="c">(* Simple variable ref. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Variable</span> <span class="n">id</span>
+ <span class="k">in</span>
+ <span class="n">parse_ident</span> <span class="n">id</span> <span class="n">stream</span>
+
+ <span class="c">(* ifexpr ::= 'if' expr 'then' expr 'else' expr *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">If</span><span class="o">;</span> <span class="n">c</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Then</span> <span class="o">??</span> <span class="s2">"expected 'then'"</span><span class="o">;</span> <span class="n">t</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Else</span> <span class="o">??</span> <span class="s2">"expected 'else'"</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">If</span> <span class="o">(</span><span class="n">c</span><span class="o">,</span> <span class="n">t</span><span class="o">,</span> <span class="n">e</span><span class="o">)</span>
+
+ <span class="c">(* forexpr</span>
+<span class="c"> ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">For</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span> <span class="o">??</span> <span class="s2">"expected identifier after for"</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'='</span> <span class="o">??</span> <span class="s2">"expected '=' after for"</span><span class="o">;</span>
+ <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">begin</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span>
+ <span class="n">start</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">','</span> <span class="o">??</span> <span class="s2">"expected ',' after for"</span><span class="o">;</span>
+ <span class="n">end_</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span>
+ <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">step</span> <span class="o">=</span>
+ <span class="k">begin</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">','</span><span class="o">;</span> <span class="n">step</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span> <span class="nc">Some</span> <span class="n">step</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="nc">None</span>
+ <span class="k">end</span> <span class="n">stream</span>
+ <span class="k">in</span>
+ <span class="k">begin</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">In</span><span class="o">;</span> <span class="n">body</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">For</span> <span class="o">(</span><span class="n">id</span><span class="o">,</span> <span class="n">start</span><span class="o">,</span> <span class="n">end_</span><span class="o">,</span> <span class="n">step</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"expected 'in' after for"</span><span class="o">)</span>
+ <span class="k">end</span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"expected '=' after for"</span><span class="o">)</span>
+ <span class="k">end</span> <span class="n">stream</span>
+
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"unknown token when expecting an expression."</span><span class="o">)</span>
+
+<span class="c">(* unary</span>
+<span class="c"> * ::= primary</span>
+<span class="c"> * ::= '!' unary *)</span>
+<span class="ow">and</span> <span class="n">parse_unary</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* If this is a unary operator, read it. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">op</span> <span class="k">when</span> <span class="n">op</span> <span class="o">!=</span> <span class="sc">'('</span> <span class="o">&&</span> <span class="n">op</span> <span class="o">!=</span> <span class="sc">')'</span><span class="o">;</span> <span class="n">operand</span><span class="o">=</span><span class="n">parse_unary</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">operand</span><span class="o">)</span>
+
+ <span class="c">(* If the current token is not an operator, it must be a primary expr. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">parse_primary</span> <span class="n">stream</span>
+
+<span class="c">(* binoprhs</span>
+<span class="c"> * ::= ('+' unary)* *)</span>
+<span class="ow">and</span> <span class="n">parse_bin_rhs</span> <span class="n">expr_prec</span> <span class="n">lhs</span> <span class="n">stream</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="nn">Stream</span><span class="p">.</span><span class="n">peek</span> <span class="n">stream</span> <span class="k">with</span>
+ <span class="c">(* If this is a binop, find its precedence. *)</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="o">(</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">c</span><span class="o">)</span> <span class="k">when</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">mem</span> <span class="n">binop_precedence</span> <span class="n">c</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">token_prec</span> <span class="o">=</span> <span class="n">precedence</span> <span class="n">c</span> <span class="k">in</span>
+
+ <span class="c">(* If this is a binop that binds at least as tightly as the current binop,</span>
+<span class="c"> * consume it, otherwise we are done. *)</span>
+ <span class="k">if</span> <span class="n">token_prec</span> <span class="o"><</span> <span class="n">expr_prec</span> <span class="k">then</span> <span class="n">lhs</span> <span class="k">else</span> <span class="k">begin</span>
+ <span class="c">(* Eat the binop. *)</span>
+ <span class="nn">Stream</span><span class="p">.</span><span class="n">junk</span> <span class="n">stream</span><span class="o">;</span>
+
+ <span class="c">(* Parse the unary expression after the binary operator. *)</span>
+ <span class="k">let</span> <span class="n">rhs</span> <span class="o">=</span> <span class="n">parse_unary</span> <span class="n">stream</span> <span class="k">in</span>
+
+ <span class="c">(* Okay, we know this is a binop. *)</span>
+ <span class="k">let</span> <span class="n">rhs</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="nn">Stream</span><span class="p">.</span><span class="n">peek</span> <span class="n">stream</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="o">(</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">c2</span><span class="o">)</span> <span class="o">-></span>
+ <span class="c">(* If BinOp binds less tightly with rhs than the operator after</span>
+<span class="c"> * rhs, let the pending operator take rhs as its lhs. *)</span>
+ <span class="k">let</span> <span class="n">next_prec</span> <span class="o">=</span> <span class="n">precedence</span> <span class="n">c2</span> <span class="k">in</span>
+ <span class="k">if</span> <span class="n">token_prec</span> <span class="o"><</span> <span class="n">next_prec</span>
+ <span class="k">then</span> <span class="n">parse_bin_rhs</span> <span class="o">(</span><span class="n">token_prec</span> <span class="o">+</span> <span class="mi">1</span><span class="o">)</span> <span class="n">rhs</span> <span class="n">stream</span>
+ <span class="k">else</span> <span class="n">rhs</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span> <span class="n">rhs</span>
+ <span class="k">in</span>
+
+ <span class="c">(* Merge lhs/rhs. *)</span>
+ <span class="k">let</span> <span class="n">lhs</span> <span class="o">=</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">(</span><span class="n">c</span><span class="o">,</span> <span class="n">lhs</span><span class="o">,</span> <span class="n">rhs</span><span class="o">)</span> <span class="k">in</span>
+ <span class="n">parse_bin_rhs</span> <span class="n">expr_prec</span> <span class="n">lhs</span> <span class="n">stream</span>
+ <span class="k">end</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span> <span class="n">lhs</span>
+
+<span class="c">(* expression</span>
+<span class="c"> * ::= unary binoprhs *)</span>
+<span class="ow">and</span> <span class="n">parse_expr</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">lhs</span><span class="o">=</span><span class="n">parse_unary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">parse_bin_rhs</span> <span class="mi">0</span> <span class="n">lhs</span> <span class="n">stream</span>
+
+<span class="c">(* prototype</span>
+<span class="c"> * ::= id '(' id* ')'</span>
+<span class="c"> * ::= binary LETTER number? (id id)</span>
+<span class="c"> * ::= unary LETTER number? (id) *)</span>
+<span class="k">let</span> <span class="n">parse_prototype</span> <span class="o">=</span>
+ <span class="k">let</span> <span class="k">rec</span> <span class="n">parse_args</span> <span class="n">accumulator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_args</span> <span class="o">(</span><span class="n">id</span><span class="o">::</span><span class="n">accumulator</span><span class="o">)</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">accumulator</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">parse_operator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">>]</span> <span class="o">-></span> <span class="s2">"unary"</span><span class="o">,</span> <span class="mi">1</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">>]</span> <span class="o">-></span> <span class="s2">"binary"</span><span class="o">,</span> <span class="mi">2</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">parse_binary_precedence</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">int_of_float</span> <span class="n">n</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="mi">30</span>
+ <span class="k">in</span>
+ <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span> <span class="o">??</span> <span class="s2">"expected '(' in prototype"</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')' in prototype"</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="c">(* success. *)</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">id</span><span class="o">,</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">))</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">(</span><span class="n">prefix</span><span class="o">,</span> <span class="n">kind</span><span class="o">)=</span><span class="n">parse_operator</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">op</span> <span class="o">??</span> <span class="s2">"expected an operator"</span><span class="o">;</span>
+ <span class="c">(* Read the precedence if present. *)</span>
+ <span class="n">binary_precedence</span><span class="o">=</span><span class="n">parse_binary_precedence</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span> <span class="o">??</span> <span class="s2">"expected '(' in prototype"</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')' in prototype"</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">name</span> <span class="o">=</span> <span class="n">prefix</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">args</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Verify the right number of arguments for the operator. *)</span>
+ <span class="k">if</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">args</span> <span class="o"><></span> <span class="n">kind</span>
+ <span class="k">then</span> <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"invalid number of operands for operator"</span><span class="o">)</span>
+ <span class="k">else</span>
+ <span class="k">if</span> <span class="n">kind</span> <span class="o">=</span> <span class="mi">1</span> <span class="k">then</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">)</span>
+ <span class="k">else</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">,</span> <span class="n">binary_precedence</span><span class="o">)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"expected function name in prototype"</span><span class="o">)</span>
+
+<span class="c">(* definition ::= 'def' prototype expression *)</span>
+<span class="k">let</span> <span class="n">parse_definition</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Def</span><span class="o">;</span> <span class="n">p</span><span class="o">=</span><span class="n">parse_prototype</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Function</span> <span class="o">(</span><span class="n">p</span><span class="o">,</span> <span class="n">e</span><span class="o">)</span>
+
+<span class="c">(* toplevelexpr ::= expression *)</span>
+<span class="k">let</span> <span class="n">parse_toplevel</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="c">(* Make an anonymous proto. *)</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Function</span> <span class="o">(</span><span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="s2">""</span><span class="o">,</span> <span class="o">[||]),</span> <span class="n">e</span><span class="o">)</span>
+
+<span class="c">(* external ::= 'extern' prototype *)</span>
+<span class="k">let</span> <span class="n">parse_extern</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Extern</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_prototype</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+</pre></div>
+</div>
+</dd>
+<dt>codegen.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Code Generation</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="k">open</span> <span class="nc">Llvm</span>
+
+<span class="k">exception</span> <span class="nc">Error</span> <span class="k">of</span> <span class="kt">string</span>
+
+<span class="k">let</span> <span class="n">context</span> <span class="o">=</span> <span class="n">global_context</span> <span class="bp">()</span>
+<span class="k">let</span> <span class="n">the_module</span> <span class="o">=</span> <span class="n">create_module</span> <span class="n">context</span> <span class="s2">"my cool jit"</span>
+<span class="k">let</span> <span class="n">builder</span> <span class="o">=</span> <span class="n">builder</span> <span class="n">context</span>
+<span class="k">let</span> <span class="n">named_values</span><span class="o">:(</span><span class="kt">string</span><span class="o">,</span> <span class="n">llvalue</span><span class="o">)</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">t</span> <span class="o">=</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">create</span> <span class="mi">10</span>
+<span class="k">let</span> <span class="n">double_type</span> <span class="o">=</span> <span class="n">double_type</span> <span class="n">context</span>
+
+<span class="k">let</span> <span class="k">rec</span> <span class="n">codegen_expr</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span> <span class="o">-></span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="n">n</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Variable</span> <span class="n">name</span> <span class="o">-></span>
+ <span class="o">(</span><span class="k">try</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">name</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown variable name"</span><span class="o">))</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">operand</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">operand</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">operand</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span> <span class="s2">"unary"</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">callee</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">callee</span> <span class="o">-></span> <span class="n">callee</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown unary operator"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="n">build_call</span> <span class="n">callee</span> <span class="o">[|</span><span class="n">operand</span><span class="o">|]</span> <span class="s2">"unop"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">lhs</span><span class="o">,</span> <span class="n">rhs</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">lhs_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">lhs</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">rhs_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">rhs</span> <span class="k">in</span>
+ <span class="k">begin</span>
+ <span class="k">match</span> <span class="n">op</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="sc">'+'</span> <span class="o">-></span> <span class="n">build_fadd</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"addtmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="sc">'-'</span> <span class="o">-></span> <span class="n">build_fsub</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"subtmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="sc">'*'</span> <span class="o">-></span> <span class="n">build_fmul</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"multmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="sc">'<'</span> <span class="o">-></span>
+ <span class="c">(* Convert bool 0/1 to double 0.0 or 1.0 *)</span>
+ <span class="k">let</span> <span class="n">i</span> <span class="o">=</span> <span class="n">build_fcmp</span> <span class="nn">Fcmp</span><span class="p">.</span><span class="nc">Ult</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"cmptmp"</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="n">build_uitofp</span> <span class="n">i</span> <span class="n">double_type</span> <span class="s2">"booltmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span>
+ <span class="c">(* If it wasn't a builtin binary operator, it must be a user defined</span>
+<span class="c"> * one. Emit a call to it. *)</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span> <span class="s2">"binary"</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">callee</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">callee</span> <span class="o">-></span> <span class="n">callee</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"binary operator not found!"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="n">build_call</span> <span class="n">callee</span> <span class="o">[|</span><span class="n">lhs_val</span><span class="o">;</span> <span class="n">rhs_val</span><span class="o">|]</span> <span class="s2">"binop"</span> <span class="n">builder</span>
+ <span class="k">end</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Call</span> <span class="o">(</span><span class="n">callee</span><span class="o">,</span> <span class="n">args</span><span class="o">)</span> <span class="o">-></span>
+ <span class="c">(* Look up the name in the module table. *)</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">callee</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">callee</span> <span class="o">-></span> <span class="n">callee</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown function referenced"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">params</span> <span class="o">=</span> <span class="n">params</span> <span class="n">callee</span> <span class="k">in</span>
+
+ <span class="c">(* Reject the call if the argument count does not match. *)</span>
+ <span class="k">if</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">params</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">args</span> <span class="k">then</span> <span class="bp">()</span> <span class="k">else</span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"incorrect # arguments passed"</span><span class="o">);</span>
+ <span class="k">let</span> <span class="n">args</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">map</span> <span class="n">codegen_expr</span> <span class="n">args</span> <span class="k">in</span>
+ <span class="n">build_call</span> <span class="n">callee</span> <span class="n">args</span> <span class="s2">"calltmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">If</span> <span class="o">(</span><span class="n">cond</span><span class="o">,</span> <span class="n">then_</span><span class="o">,</span> <span class="n">else_</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">cond</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">cond</span> <span class="k">in</span>
+
+ <span class="c">(* Convert condition to a bool by comparing not equal to 0.0 *)</span>
+ <span class="k">let</span> <span class="n">zero</span> <span class="o">=</span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">cond_val</span> <span class="o">=</span> <span class="n">build_fcmp</span> <span class="nn">Fcmp</span><span class="p">.</span><span class="nc">One</span> <span class="n">cond</span> <span class="n">zero</span> <span class="s2">"ifcond"</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Grab the first block so that we might later add the conditional branch</span>
+<span class="c"> * to it at the end of the function. *)</span>
+ <span class="k">let</span> <span class="n">start_bb</span> <span class="o">=</span> <span class="n">insertion_block</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">block_parent</span> <span class="n">start_bb</span> <span class="k">in</span>
+
+ <span class="k">let</span> <span class="n">then_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"then"</span> <span class="n">the_function</span> <span class="k">in</span>
+
+ <span class="c">(* Emit 'then' value. *)</span>
+ <span class="n">position_at_end</span> <span class="n">then_bb</span> <span class="n">builder</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">then_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">then_</span> <span class="k">in</span>
+
+ <span class="c">(* Codegen of 'then' can change the current block, update then_bb for the</span>
+<span class="c"> * phi. We create a new name because one is used for the phi node, and the</span>
+<span class="c"> * other is used for the conditional branch. *)</span>
+ <span class="k">let</span> <span class="n">new_then_bb</span> <span class="o">=</span> <span class="n">insertion_block</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Emit 'else' value. *)</span>
+ <span class="k">let</span> <span class="n">else_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"else"</span> <span class="n">the_function</span> <span class="k">in</span>
+ <span class="n">position_at_end</span> <span class="n">else_bb</span> <span class="n">builder</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">else_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">else_</span> <span class="k">in</span>
+
+ <span class="c">(* Codegen of 'else' can change the current block, update else_bb for the</span>
+<span class="c"> * phi. *)</span>
+ <span class="k">let</span> <span class="n">new_else_bb</span> <span class="o">=</span> <span class="n">insertion_block</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Emit merge block. *)</span>
+ <span class="k">let</span> <span class="n">merge_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"ifcont"</span> <span class="n">the_function</span> <span class="k">in</span>
+ <span class="n">position_at_end</span> <span class="n">merge_bb</span> <span class="n">builder</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">incoming</span> <span class="o">=</span> <span class="o">[(</span><span class="n">then_val</span><span class="o">,</span> <span class="n">new_then_bb</span><span class="o">);</span> <span class="o">(</span><span class="n">else_val</span><span class="o">,</span> <span class="n">new_else_bb</span><span class="o">)]</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">phi</span> <span class="o">=</span> <span class="n">build_phi</span> <span class="n">incoming</span> <span class="s2">"iftmp"</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Return to the start block to add the conditional branch. *)</span>
+ <span class="n">position_at_end</span> <span class="n">start_bb</span> <span class="n">builder</span><span class="o">;</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">build_cond_br</span> <span class="n">cond_val</span> <span class="n">then_bb</span> <span class="n">else_bb</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Set an unconditional branch at the end of the 'then' block and the</span>
+<span class="c"> * 'else' block to the 'merge' block. *)</span>
+ <span class="n">position_at_end</span> <span class="n">new_then_bb</span> <span class="n">builder</span><span class="o">;</span> <span class="n">ignore</span> <span class="o">(</span><span class="n">build_br</span> <span class="n">merge_bb</span> <span class="n">builder</span><span class="o">);</span>
+ <span class="n">position_at_end</span> <span class="n">new_else_bb</span> <span class="n">builder</span><span class="o">;</span> <span class="n">ignore</span> <span class="o">(</span><span class="n">build_br</span> <span class="n">merge_bb</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Finally, set the builder to the end of the merge block. *)</span>
+ <span class="n">position_at_end</span> <span class="n">merge_bb</span> <span class="n">builder</span><span class="o">;</span>
+
+ <span class="n">phi</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">For</span> <span class="o">(</span><span class="n">var_name</span><span class="o">,</span> <span class="n">start</span><span class="o">,</span> <span class="n">end_</span><span class="o">,</span> <span class="n">step</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span> <span class="o">-></span>
+ <span class="c">(* Emit the start code first, without 'variable' in scope. *)</span>
+ <span class="k">let</span> <span class="n">start_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">start</span> <span class="k">in</span>
+
+ <span class="c">(* Make the new basic block for the loop header, inserting after current</span>
+<span class="c"> * block. *)</span>
+ <span class="k">let</span> <span class="n">preheader_bb</span> <span class="o">=</span> <span class="n">insertion_block</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">block_parent</span> <span class="n">preheader_bb</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">loop_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"loop"</span> <span class="n">the_function</span> <span class="k">in</span>
+
+ <span class="c">(* Insert an explicit fall through from the current block to the</span>
+<span class="c"> * loop_bb. *)</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">build_br</span> <span class="n">loop_bb</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Start insertion in loop_bb. *)</span>
+ <span class="n">position_at_end</span> <span class="n">loop_bb</span> <span class="n">builder</span><span class="o">;</span>
+
+ <span class="c">(* Start the PHI node with an entry for start. *)</span>
+ <span class="k">let</span> <span class="n">variable</span> <span class="o">=</span> <span class="n">build_phi</span> <span class="o">[(</span><span class="n">start_val</span><span class="o">,</span> <span class="n">preheader_bb</span><span class="o">)]</span> <span class="n">var_name</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Within the loop, the variable is defined equal to the PHI node. If it</span>
+<span class="c"> * shadows an existing variable, we have to restore it, so save it</span>
+<span class="c"> * now. *)</span>
+ <span class="k">let</span> <span class="n">old_val</span> <span class="o">=</span>
+ <span class="k">try</span> <span class="nc">Some</span> <span class="o">(</span><span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">var_name</span><span class="o">)</span> <span class="k">with</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="nc">None</span>
+ <span class="k">in</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">variable</span><span class="o">;</span>
+
+ <span class="c">(* Emit the body of the loop. This, like any other expr, can change the</span>
+<span class="c"> * current BB. Note that we ignore the value computed by the body, but</span>
+<span class="c"> * an Error raised while generating it still propagates. *)</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">codegen_expr</span> <span class="n">body</span><span class="o">);</span>
+
+ <span class="c">(* Emit the step value. *)</span>
+ <span class="k">let</span> <span class="n">step_val</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">step</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">step</span> <span class="o">-></span> <span class="n">codegen_expr</span> <span class="n">step</span>
+ <span class="c">(* If not specified, use 1.0. *)</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="mi">1</span><span class="o">.</span><span class="mi">0</span>
+ <span class="k">in</span>
+
+ <span class="k">let</span> <span class="n">next_var</span> <span class="o">=</span> <span class="n">build_fadd</span> <span class="n">variable</span> <span class="n">step_val</span> <span class="s2">"nextvar"</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Compute the end condition. *)</span>
+ <span class="k">let</span> <span class="n">end_cond</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">end_</span> <span class="k">in</span>
+
+ <span class="c">(* Convert condition to a bool by comparing not equal to 0.0. *)</span>
+ <span class="k">let</span> <span class="n">zero</span> <span class="o">=</span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">end_cond</span> <span class="o">=</span> <span class="n">build_fcmp</span> <span class="nn">Fcmp</span><span class="p">.</span><span class="nc">One</span> <span class="n">end_cond</span> <span class="n">zero</span> <span class="s2">"loopcond"</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Create the "after loop" block and insert it. *)</span>
+ <span class="k">let</span> <span class="n">loop_end_bb</span> <span class="o">=</span> <span class="n">insertion_block</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">after_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"afterloop"</span> <span class="n">the_function</span> <span class="k">in</span>
+
+ <span class="c">(* Insert the conditional branch into the end of loop_end_bb. *)</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">build_cond_br</span> <span class="n">end_cond</span> <span class="n">loop_bb</span> <span class="n">after_bb</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Any new code will be inserted in after_bb. *)</span>
+ <span class="n">position_at_end</span> <span class="n">after_bb</span> <span class="n">builder</span><span class="o">;</span>
+
+ <span class="c">(* Add a new entry to the PHI node for the backedge. *)</span>
+ <span class="n">add_incoming</span> <span class="o">(</span><span class="n">next_var</span><span class="o">,</span> <span class="n">loop_end_bb</span><span class="o">)</span> <span class="n">variable</span><span class="o">;</span>
+
+ <span class="c">(* Restore the unshadowed variable. *)</span>
+ <span class="k">begin</span> <span class="k">match</span> <span class="n">old_val</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">old_val</span> <span class="o">-></span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">old_val</span>
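+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">remove</span> <span class="n">named_values</span> <span class="n">var_name</span>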
+ <span class="k">end</span><span class="o">;</span>
+
+ <span class="c">(* for expr always returns 0.0. *)</span>
+ <span class="n">const_null</span> <span class="n">double_type</span>
+
+<span class="k">let</span> <span class="n">codegen_proto</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">)</span> <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">,</span> <span class="o">_)</span> <span class="o">-></span>
+ <span class="c">(* Make the function type: double(double,double) etc. *)</span>
+ <span class="k">let</span> <span class="n">doubles</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">make</span> <span class="o">(</span><span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">args</span><span class="o">)</span> <span class="n">double_type</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">ft</span> <span class="o">=</span> <span class="n">function_type</span> <span class="n">double_type</span> <span class="n">doubles</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">f</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">name</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="n">declare_function</span> <span class="n">name</span> <span class="n">ft</span> <span class="n">the_module</span>
+
+ <span class="c">(* If 'f' conflicted, there was already something named 'name'. If it</span>
+<span class="c"> * has a body, don't allow redefinition or reextern. *)</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">f</span> <span class="o">-></span>
+ <span class="c">(* If 'f' already has a body, reject this. *)</span>
+ <span class="k">if</span> <span class="n">block_begin</span> <span class="n">f</span> <span class="o"><></span> <span class="nc">At_end</span> <span class="n">f</span> <span class="k">then</span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"redefinition of function"</span><span class="o">);</span>
+
+ <span class="c">(* If 'f' took a different number of arguments, reject. *)</span>
+ <span class="k">if</span> <span class="n">element_type</span> <span class="o">(</span><span class="n">type_of</span> <span class="n">f</span><span class="o">)</span> <span class="o"><></span> <span class="n">ft</span> <span class="k">then</span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"redefinition of function with different # args"</span><span class="o">);</span>
+ <span class="n">f</span>
+ <span class="k">in</span>
+
+ <span class="c">(* Set names for all arguments. *)</span>
+ <span class="nn">Array</span><span class="p">.</span><span class="n">iteri</span> <span class="o">(</span><span class="k">fun</span> <span class="n">i</span> <span class="n">a</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">n</span> <span class="o">=</span> <span class="n">args</span><span class="o">.(</span><span class="n">i</span><span class="o">)</span> <span class="k">in</span>
+ <span class="n">set_value_name</span> <span class="n">n</span> <span class="n">a</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">n</span> <span class="n">a</span><span class="o">;</span>
+ <span class="o">)</span> <span class="o">(</span><span class="n">params</span> <span class="n">f</span><span class="o">);</span>
+ <span class="n">f</span>
+
+<span class="k">let</span> <span class="n">codegen_func</span> <span class="n">the_fpm</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Function</span> <span class="o">(</span><span class="n">proto</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span> <span class="o">-></span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">clear</span> <span class="n">named_values</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">codegen_proto</span> <span class="n">proto</span> <span class="k">in</span>
+
+ <span class="c">(* If this is an operator, install it. *)</span>
+ <span class="k">begin</span> <span class="k">match</span> <span class="n">proto</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">,</span> <span class="n">prec</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">op</span> <span class="o">=</span> <span class="n">name</span><span class="o">.[</span><span class="nn">String</span><span class="p">.</span><span class="n">length</span> <span class="n">name</span> <span class="o">-</span> <span class="mi">1</span><span class="o">]</span> <span class="k">in</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="n">op</span> <span class="n">prec</span><span class="o">;</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span> <span class="bp">()</span>
+ <span class="k">end</span><span class="o">;</span>
+
+ <span class="c">(* Create a new basic block to start insertion into. *)</span>
+ <span class="k">let</span> <span class="n">bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"entry"</span> <span class="n">the_function</span> <span class="k">in</span>
+ <span class="n">position_at_end</span> <span class="n">bb</span> <span class="n">builder</span><span class="o">;</span>
+
+ <span class="k">try</span>
+ <span class="k">let</span> <span class="n">ret_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">body</span> <span class="k">in</span>
+
+ <span class="c">(* Finish off the function. *)</span>
+ <span class="k">let</span> <span class="o">_</span> <span class="o">=</span> <span class="n">build_ret</span> <span class="n">ret_val</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Validate the generated code, checking for consistency. *)</span>
+ <span class="nn">Llvm_analysis</span><span class="p">.</span><span class="n">assert_valid_function</span> <span class="n">the_function</span><span class="o">;</span>
+
+ <span class="c">(* Optimize the function. *)</span>
+ <span class="k">let</span> <span class="o">_</span> <span class="o">=</span> <span class="nn">PassManager</span><span class="p">.</span><span class="n">run_function</span> <span class="n">the_function</span> <span class="n">the_fpm</span> <span class="k">in</span>
+
+ <span class="n">the_function</span>
+ <span class="k">with</span> <span class="n">e</span> <span class="o">-></span>
+ <span class="n">delete_function</span> <span class="n">the_function</span><span class="o">;</span>
+ <span class="k">raise</span> <span class="n">e</span>
+</pre></div>
+</div>
+</dd>
+<dt>toplevel.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Top-Level parsing and JIT Driver</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="k">open</span> <span class="nc">Llvm</span>
+<span class="k">open</span> <span class="nc">Llvm_executionengine</span>
+
+<span class="c">(* top ::= definition | external | expression | ';' *)</span>
+<span class="k">let</span> <span class="k">rec</span> <span class="n">main_loop</span> <span class="n">the_fpm</span> <span class="n">the_execution_engine</span> <span class="n">stream</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="nn">Stream</span><span class="p">.</span><span class="n">peek</span> <span class="n">stream</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="bp">()</span>
+
+ <span class="c">(* ignore top-level semicolons. *)</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="o">(</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">';'</span><span class="o">)</span> <span class="o">-></span>
+ <span class="nn">Stream</span><span class="p">.</span><span class="n">junk</span> <span class="n">stream</span><span class="o">;</span>
+ <span class="n">main_loop</span> <span class="n">the_fpm</span> <span class="n">the_execution_engine</span> <span class="n">stream</span>
+
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">token</span> <span class="o">-></span>
+ <span class="k">begin</span>
+ <span class="k">try</span> <span class="k">match</span> <span class="n">token</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nn">Token</span><span class="p">.</span><span class="nc">Def</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">e</span> <span class="o">=</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">parse_definition</span> <span class="n">stream</span> <span class="k">in</span>
+ <span class="n">print_endline</span> <span class="s2">"parsed a function definition."</span><span class="o">;</span>
+ <span class="n">dump_value</span> <span class="o">(</span><span class="nn">Codegen</span><span class="p">.</span><span class="n">codegen_func</span> <span class="n">the_fpm</span> <span class="n">e</span><span class="o">);</span>
+ <span class="o">|</span> <span class="nn">Token</span><span class="p">.</span><span class="nc">Extern</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">e</span> <span class="o">=</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">parse_extern</span> <span class="n">stream</span> <span class="k">in</span>
+ <span class="n">print_endline</span> <span class="s2">"parsed an extern."</span><span class="o">;</span>
+ <span class="n">dump_value</span> <span class="o">(</span><span class="nn">Codegen</span><span class="p">.</span><span class="n">codegen_proto</span> <span class="n">e</span><span class="o">);</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span>
+ <span class="c">(* Evaluate a top-level expression into an anonymous function. *)</span>
+ <span class="k">let</span> <span class="n">e</span> <span class="o">=</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">parse_toplevel</span> <span class="n">stream</span> <span class="k">in</span>
+ <span class="n">print_endline</span> <span class="s2">"parsed a top-level expr"</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">codegen_func</span> <span class="n">the_fpm</span> <span class="n">e</span> <span class="k">in</span>
+ <span class="n">dump_value</span> <span class="n">the_function</span><span class="o">;</span>
+
+ <span class="c">(* JIT the function and run it, returning the result as a generic value. *)</span>
+ <span class="k">let</span> <span class="n">result</span> <span class="o">=</span> <span class="nn">ExecutionEngine</span><span class="p">.</span><span class="n">run_function</span> <span class="n">the_function</span> <span class="o">[||]</span>
+ <span class="n">the_execution_engine</span> <span class="k">in</span>
+
+ <span class="n">print_string</span> <span class="s2">"Evaluated to "</span><span class="o">;</span>
+ <span class="n">print_float</span> <span class="o">(</span><span class="nn">GenericValue</span><span class="p">.</span><span class="n">as_float</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">double_type</span> <span class="n">result</span><span class="o">);</span>
+ <span class="n">print_newline</span> <span class="bp">()</span><span class="o">;</span>
+ <span class="k">with</span> <span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="n">s</span> <span class="o">|</span> <span class="nn">Codegen</span><span class="p">.</span><span class="nc">Error</span> <span class="n">s</span> <span class="o">-></span>
+ <span class="c">(* Skip token for error recovery. *)</span>
+ <span class="nn">Stream</span><span class="p">.</span><span class="n">junk</span> <span class="n">stream</span><span class="o">;</span>
+ <span class="n">print_endline</span> <span class="n">s</span><span class="o">;</span>
+ <span class="k">end</span><span class="o">;</span>
+ <span class="n">print_string</span> <span class="s2">"ready> "</span><span class="o">;</span> <span class="n">flush</span> <span class="n">stdout</span><span class="o">;</span>
+ <span class="n">main_loop</span> <span class="n">the_fpm</span> <span class="n">the_execution_engine</span> <span class="n">stream</span>
+</pre></div>
+</div>
+</dd>
+<dt>toy.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Main driver code.</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="k">open</span> <span class="nc">Llvm</span>
+<span class="k">open</span> <span class="nc">Llvm_executionengine</span>
+<span class="k">open</span> <span class="nc">Llvm_target</span>
+<span class="k">open</span> <span class="nc">Llvm_scalar_opts</span>
+
+<span class="k">let</span> <span class="n">main</span> <span class="bp">()</span> <span class="o">=</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">initialize_native_target</span> <span class="bp">()</span><span class="o">);</span>
+
+ <span class="c">(* Install standard binary operators.</span>
+<span class="c"> * 1 is the lowest precedence. *)</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'<'</span> <span class="mi">10</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'+'</span> <span class="mi">20</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'-'</span> <span class="mi">20</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'*'</span> <span class="mi">40</span><span class="o">;</span> <span class="c">(* highest. *)</span>
+
+ <span class="c">(* Prime the first token. *)</span>
+ <span class="n">print_string</span> <span class="s2">"ready> "</span><span class="o">;</span> <span class="n">flush</span> <span class="n">stdout</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">stream</span> <span class="o">=</span> <span class="nn">Lexer</span><span class="p">.</span><span class="n">lex</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="n">of_channel</span> <span class="n">stdin</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Create the JIT. *)</span>
+ <span class="k">let</span> <span class="n">the_execution_engine</span> <span class="o">=</span> <span class="nn">ExecutionEngine</span><span class="p">.</span><span class="n">create</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">the_module</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">the_fpm</span> <span class="o">=</span> <span class="nn">PassManager</span><span class="p">.</span><span class="n">create_function</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">the_module</span> <span class="k">in</span>
+
+ <span class="c">(* Set up the optimizer pipeline. Start with registering info about how the</span>
+<span class="c"> * target lays out data structures. *)</span>
+ <span class="nn">DataLayout</span><span class="p">.</span><span class="n">add</span> <span class="o">(</span><span class="nn">ExecutionEngine</span><span class="p">.</span><span class="n">target_data</span> <span class="n">the_execution_engine</span><span class="o">)</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Do simple "peephole" optimizations and bit-twiddling optimizations. *)</span>
+ <span class="n">add_instruction_combination</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* reassociate expressions. *)</span>
+ <span class="n">add_reassociation</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Eliminate Common SubExpressions. *)</span>
+ <span class="n">add_gvn</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Simplify the control flow graph (deleting unreachable blocks, etc). *)</span>
+ <span class="n">add_cfg_simplification</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="n">ignore</span> <span class="o">(</span><span class="nn">PassManager</span><span class="p">.</span><span class="n">initialize</span> <span class="n">the_fpm</span><span class="o">);</span>
+
+ <span class="c">(* Run the main "interpreter loop" now. *)</span>
+ <span class="nn">Toplevel</span><span class="p">.</span><span class="n">main_loop</span> <span class="n">the_fpm</span> <span class="n">the_execution_engine</span> <span class="n">stream</span><span class="o">;</span>
+
+ <span class="c">(* Print out all the generated code. *)</span>
+ <span class="n">dump_module</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">the_module</span>
+<span class="o">;;</span>
+
+<span class="n">main</span> <span class="bp">()</span>
+</pre></div>
+</div>
+</dd>
+<dt>bindings.c:</dt>
+<dd><div class="first last highlight-c"><div class="highlight"><pre><span class="cp">#include <stdio.h></span>
+
+<span class="cm">/* putchard - putchar that takes a double and returns 0. */</span>
+<span class="k">extern</span> <span class="kt">double</span> <span class="nf">putchard</span><span class="p">(</span><span class="kt">double</span> <span class="n">X</span><span class="p">)</span> <span class="p">{</span>
+ <span class="n">putchar</span><span class="p">((</span><span class="kt">char</span><span class="p">)</span><span class="n">X</span><span class="p">);</span>
+ <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
+<span class="p">}</span>
+
+<span class="cm">/* printd - printf that takes a double and prints it as "%f\n", returning 0. */</span>
+<span class="k">extern</span> <span class="kt">double</span> <span class="nf">printd</span><span class="p">(</span><span class="kt">double</span> <span class="n">X</span><span class="p">)</span> <span class="p">{</span>
+ <span class="n">printf</span><span class="p">(</span><span class="s">"%f</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">X</span><span class="p">);</span>
+ <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+</dd>
+</dl>
+<p><a class="reference external" href="OCamlLangImpl7.html">Next: Extending the language: mutable variables / SSA
+construction</a></p>
+</div>
+</div>
+
+
+ </div>
+ </div>
+ <div class="clearer"></div>
+ </div>
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="../genindex.html" title="General Index"
+ >index</a></li>
+ <li class="right" >
+ <a href="OCamlLangImpl7.html" title="7. Kaleidoscope: Extending the Language: Mutable Variables"
+ >next</a> |</li>
+ <li class="right" >
+ <a href="OCamlLangImpl5.html" title="5. Kaleidoscope: Extending the Language: Control Flow"
+ >previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="../index.html">Documentation</a>»</li>
+
+ <li><a href="index.html" >LLVM Tutorial: Table of Contents</a> »</li>
+ </ul>
+ </div>
+ <div class="footer">
+ © Copyright 2003-2014, LLVM Project.
+ Last updated on 2015-01-13.
+ Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
+ </div>
+ </body>
+</html>
\ No newline at end of file
Added: www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl7.html
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl7.html?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl7.html (added)
+++ www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl7.html Tue Jan 13 16:55:20 2015
@@ -0,0 +1,1749 @@
+
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+
+ <title>7. Kaleidoscope: Extending the Language: Mutable Variables — LLVM 3.5 documentation</title>
+
+ <link rel="stylesheet" href="../_static/llvm-theme.css" type="text/css" />
+ <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT: '../',
+ VERSION: '3.5',
+ COLLAPSE_INDEX: false,
+ FILE_SUFFIX: '.html',
+ HAS_SOURCE: true
+ };
+ </script>
+ <script type="text/javascript" src="../_static/jquery.js"></script>
+ <script type="text/javascript" src="../_static/underscore.js"></script>
+ <script type="text/javascript" src="../_static/doctools.js"></script>
+ <link rel="top" title="LLVM 3.5 documentation" href="../index.html" />
+ <link rel="up" title="LLVM Tutorial: Table of Contents" href="index.html" />
+ <link rel="next" title="8. Kaleidoscope: Conclusion and other useful LLVM tidbits" href="OCamlLangImpl8.html" />
+ <link rel="prev" title="6. Kaleidoscope: Extending the Language: User-defined Operators" href="OCamlLangImpl6.html" />
+<style type="text/css">
+ table.right { float: right; margin-left: 20px; }
+ table.right td { border: 1px solid #ccc; }
+</style>
+
+ </head>
+ <body>
+<div class="logo">
+ <a href="../index.html">
+ <img src="../_static/logo.png"
+ alt="LLVM Logo" width="250" height="88"/></a>
+</div>
+
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="../genindex.html" title="General Index"
+ accesskey="I">index</a></li>
+ <li class="right" >
+ <a href="OCamlLangImpl8.html" title="8. Kaleidoscope: Conclusion and other useful LLVM tidbits"
+ accesskey="N">next</a> |</li>
+ <li class="right" >
+ <a href="OCamlLangImpl6.html" title="6. Kaleidoscope: Extending the Language: User-defined Operators"
+ accesskey="P">previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="../index.html">Documentation</a>»</li>
+
+ <li><a href="index.html" accesskey="U">LLVM Tutorial: Table of Contents</a> »</li>
+ </ul>
+ </div>
+
+
+ <div class="document">
+ <div class="documentwrapper">
+ <div class="body">
+
+ <div class="section" id="kaleidoscope-extending-the-language-mutable-variables">
+<h1>7. Kaleidoscope: Extending the Language: Mutable Variables<a class="headerlink" href="#kaleidoscope-extending-the-language-mutable-variables" title="Permalink to this headline">¶</a></h1>
+<div class="contents local topic" id="contents">
+<ul class="simple">
+<li><a class="reference internal" href="#chapter-7-introduction" id="id2">Chapter 7 Introduction</a></li>
+<li><a class="reference internal" href="#why-is-this-a-hard-problem" id="id3">Why is this a hard problem?</a></li>
+<li><a class="reference internal" href="#memory-in-llvm" id="id4">Memory in LLVM</a></li>
+<li><a class="reference internal" href="#mutable-variables-in-kaleidoscope" id="id5">Mutable Variables in Kaleidoscope</a></li>
+<li><a class="reference internal" href="#adjusting-existing-variables-for-mutation" id="id6">Adjusting Existing Variables for Mutation</a></li>
+<li><a class="reference internal" href="#new-assignment-operator" id="id7">New Assignment Operator</a></li>
+<li><a class="reference internal" href="#user-defined-local-variables" id="id8">User-defined Local Variables</a></li>
+<li><a class="reference internal" href="#id1" id="id9">Full Code Listing</a></li>
+</ul>
+</div>
+<div class="section" id="chapter-7-introduction">
+<h2><a class="toc-backref" href="#id2">7.1. Chapter 7 Introduction</a><a class="headerlink" href="#chapter-7-introduction" title="Permalink to this headline">¶</a></h2>
+<p>Welcome to Chapter 7 of the “<a class="reference external" href="index.html">Implementing a language with
+LLVM</a>” tutorial. In chapters 1 through 6, we’ve built a
+very respectable, albeit simple, <a class="reference external" href="http://en.wikipedia.org/wiki/Functional_programming">functional programming
+language</a>. In our
+journey, we learned some parsing techniques, how to build and represent
+an AST, how to build LLVM IR, and how to optimize the resultant code as
+well as JIT compile it.</p>
+<p>While Kaleidoscope is interesting as a functional language, the fact
+that it is functional makes it “too easy” to generate LLVM IR for it. In
+particular, a functional language makes it very easy to build LLVM IR
+directly in <a class="reference external" href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA
+form</a>.
+Since LLVM requires that the input code be in SSA form, this is a very
+nice property. However, it is often unclear to newcomers how to generate
+code for an imperative language with mutable variables.</p>
+<p>The short (and happy) summary of this chapter is that there is no need
+for your front-end to build SSA form: LLVM provides highly tuned and
+well tested support for this, though the way it works is a bit
+unexpected for some.</p>
+</div>
+<div class="section" id="why-is-this-a-hard-problem">
+<h2><a class="toc-backref" href="#id3">7.2. Why is this a hard problem?</a><a class="headerlink" href="#why-is-this-a-hard-problem" title="Permalink to this headline">¶</a></h2>
+<p>To understand why mutable variables cause complexities in SSA
+construction, consider this extremely simple C example:</p>
+<div class="highlight-c"><div class="highlight"><pre><span class="kt">int</span> <span class="n">G</span><span class="p">,</span> <span class="n">H</span><span class="p">;</span>
+<span class="kt">int</span> <span class="nf">test</span><span class="p">(</span><span class="kt">_Bool</span> <span class="n">Condition</span><span class="p">)</span> <span class="p">{</span>
+ <span class="kt">int</span> <span class="n">X</span><span class="p">;</span>
+ <span class="k">if</span> <span class="p">(</span><span class="n">Condition</span><span class="p">)</span>
+ <span class="n">X</span> <span class="o">=</span> <span class="n">G</span><span class="p">;</span>
+ <span class="k">else</span>
+ <span class="n">X</span> <span class="o">=</span> <span class="n">H</span><span class="p">;</span>
+ <span class="k">return</span> <span class="n">X</span><span class="p">;</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>In this case, we have the variable “X”, whose value depends on the path
+executed in the program. Because there are two different possible values
+for X before the return instruction, a PHI node is inserted to merge the
+two values. The LLVM IR that we want for this example looks like this:</p>
+<div class="highlight-llvm"><div class="highlight"><pre><span class="vg">@G</span> <span class="p">=</span> <span class="k">weak</span> <span class="k">global</span> <span class="k">i32</span> <span class="m">0</span> <span class="c">; type of @G is i32*</span>
+<span class="vg">@H</span> <span class="p">=</span> <span class="k">weak</span> <span class="k">global</span> <span class="k">i32</span> <span class="m">0</span> <span class="c">; type of @H is i32*</span>
+
+<span class="k">define</span> <span class="k">i32</span> <span class="vg">@test</span><span class="p">(</span><span class="k">i1</span> <span class="nv">%Condition</span><span class="p">)</span> <span class="p">{</span>
+<span class="nl">entry:</span>
+ <span class="k">br</span> <span class="k">i1</span> <span class="nv">%Condition</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%cond_true</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%cond_false</span>
+
+<span class="nl">cond_true:</span>
+ <span class="nv">%X.0</span> <span class="p">=</span> <span class="k">load</span> <span class="k">i32</span><span class="p">*</span> <span class="vg">@G</span>
+ <span class="k">br</span> <span class="kt">label</span> <span class="nv">%cond_next</span>
+
+<span class="nl">cond_false:</span>
+ <span class="nv">%X.1</span> <span class="p">=</span> <span class="k">load</span> <span class="k">i32</span><span class="p">*</span> <span class="vg">@H</span>
+ <span class="k">br</span> <span class="kt">label</span> <span class="nv">%cond_next</span>
+
+<span class="nl">cond_next:</span>
+ <span class="nv">%X.2</span> <span class="p">=</span> <span class="k">phi</span> <span class="k">i32</span> <span class="p">[</span> <span class="nv">%X.1</span><span class="p">,</span> <span class="nv">%cond_false</span> <span class="p">],</span> <span class="p">[</span> <span class="nv">%X.0</span><span class="p">,</span> <span class="nv">%cond_true</span> <span class="p">]</span>
+ <span class="k">ret</span> <span class="k">i32</span> <span class="nv">%X.2</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>In this example, the loads from the G and H global variables are
+explicit in the LLVM IR, and they live in the then/else branches of the
+if statement (cond_true/cond_false). In order to merge the incoming
+values, the X.2 phi node in the cond_next block selects the right value
+to use based on where control flow is coming from: if control flow comes
+from the cond_false block, X.2 gets the value of X.1. Alternatively, if
+control flow comes from cond_true, it gets the value of X.0. The intent
+of this chapter is not to explain the details of SSA form. For more
+information, see one of the many <a class="reference external" href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online
+references</a>.</p>
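+<p>As a rough illustration of what the phi node does, here is a small C
+model of the IR above. It is only a sketch under stated assumptions: the
+names <tt class="docutils literal"><span class="pre">test_phi_model</span></tt>,
+<tt class="docutils literal"><span class="pre">pred_block</span></tt> and
+<tt class="docutils literal"><span class="pre">came_from</span></tt> are made up
+for this example, G and H are given nonzero values so the two paths can be
+told apart, and LLVM itself knows the predecessor block implicitly rather
+than recording it in a variable.</p>
+<div class="highlight-c"><div class="highlight"><pre>/* Hypothetical tag naming the two blocks that branch to cond_next. */
+enum pred_block { COND_TRUE, COND_FALSE };
+
+static int G = 10;  /* illustrative values; the IR above initializes both to 0 */
+static int H = 20;
+
+int test_phi_model(int condition) {
+  int x0 = 0, x1 = 0;        /* stand-ins for %X.0 and %X.1 (zeroed to keep C well defined) */
+  enum pred_block came_from; /* which block control flow arrived from */
+
+  if (condition) {           /* cond_true */
+    x0 = G;
+    came_from = COND_TRUE;
+  } else {                   /* cond_false */
+    x1 = H;
+    came_from = COND_FALSE;
+  }
+
+  /* cond_next: %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ] */
+  int x2 = (came_from == COND_FALSE) ? x1 : x0;
+  return x2;
+}
+</pre></div>
+</div>
+<p>In this model, calling
+<tt class="docutils literal"><span class="pre">test_phi_model(1)</span></tt>
+takes the cond_true path and yields the value of G, while
+<tt class="docutils literal"><span class="pre">test_phi_model(0)</span></tt>
+yields the value of H.</p>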
+<p>The question for this chapter is “who places the phi nodes when lowering
+assignments to mutable variables?”. The issue here is that LLVM
+<em>requires</em> that its IR be in SSA form: there is no “non-SSA” mode for
+it. However, SSA construction requires non-trivial algorithms and data
+structures, so it is inconvenient and wasteful for every front-end to
+have to reproduce this logic.</p>
+</div>
+<div class="section" id="memory-in-llvm">
+<h2><a class="toc-backref" href="#id4">7.3. Memory in LLVM</a><a class="headerlink" href="#memory-in-llvm" title="Permalink to this headline">¶</a></h2>
+<p>The ‘trick’ here is that while LLVM does require all register values to
+be in SSA form, it does not require (or permit) memory objects to be in
+SSA form. In the example above, note that the loads from G and H are
+direct accesses to G and H: they are not renamed or versioned. This
+differs from some other compiler systems, which do try to version memory
+objects. In LLVM, instead of encoding dataflow analysis of memory into
+the LLVM IR, it is handled with <a class="reference external" href="../WritingAnLLVMPass.html">Analysis
+Passes</a> which are computed on demand.</p>
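+<p>The register/memory distinction can be sketched in C. This is a model
+for intuition, not a description of how LLVM is implemented:
+<tt class="docutils literal"><span class="pre">load_i32</span></tt>,
+<tt class="docutils literal"><span class="pre">store_i32</span></tt> and
+<tt class="docutils literal"><span class="pre">test_memory_model</span></tt> are
+hypothetical names, and G and H are given illustrative nonzero values. The
+point is that X is a single memory object that is stored to on both paths
+and never renamed, so no merge of two values is needed.</p>
+<div class="highlight-c"><div class="highlight"><pre>static int G = 10;  /* illustrative values, not the zero initializers from the IR */
+static int H = 20;
+
+/* Hypothetical helpers that make every memory access explicit. */
+static int load_i32(const int *addr) { return *addr; }
+static void store_i32(int value, int *addr) { *addr = value; }
+
+int test_memory_model(int condition) {
+  int X;                          /* one stack slot for the mutable variable */
+  if (condition)
+    store_i32(load_i32(&G), &X);  /* X = G */
+  else
+    store_i32(load_i32(&H), &X);  /* X = H */
+  return load_i32(&X);            /* same address on both paths */
+}
+</pre></div>
+</div>
+<p>Both branches write through the same address, and the final load reads
+whichever value was stored last.</p>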
+<p>With this in mind, the high-level idea is that we want to make a stack
+variable (which lives in memory, because it is on the stack) for each
+mutable object in a function. To take advantage of this trick, we need
+to talk about how LLVM represents stack variables.</p>
+<p>In LLVM, all memory accesses are explicit with load/store instructions,
+and it is carefully designed not to have (or need) an “address-of”
+operator. Notice how the type of the @G/@H global variables is actually
+“i32*” even though the variable is defined as “i32”. What this means is
+that @G defines <em>space</em> for an i32 in the global data area, but its
+<em>name</em> actually refers to the address for that space. Stack variables
+work the same way, except that instead of being declared with global
+variable definitions, they are declared with the <a class="reference external" href="../LangRef.html#i_alloca">LLVM alloca
+instruction</a>:</p>
+<div class="highlight-llvm"><div class="highlight"><pre><span class="k">define</span> <span class="k">i32</span> <span class="vg">@example</span><span class="p">()</span> <span class="p">{</span>
+<span class="nl">entry:</span>
+ <span class="nv">%X</span> <span class="p">=</span> <span class="k">alloca</span> <span class="k">i32</span> <span class="c">; type of %X is i32*.</span>
+ <span class="p">...</span>
+ <span class="nv">%tmp</span> <span class="p">=</span> <span class="k">load</span> <span class="k">i32</span><span class="p">*</span> <span class="nv">%X</span> <span class="c">; load the value of X from its stack slot.</span>
+ <span class="nv">%tmp2</span> <span class="p">=</span> <span class="k">add</span> <span class="k">i32</span> <span class="nv">%tmp</span><span class="p">,</span> <span class="m">1</span> <span class="c">; increment it</span>
+ <span class="k">store</span> <span class="k">i32</span> <span class="nv">%tmp2</span><span class="p">,</span> <span class="k">i32</span><span class="p">*</span> <span class="nv">%X</span> <span class="c">; store it back</span>
+ <span class="p">...</span>
+</pre></div>
+</div>
+<p>This code shows an example of how you can declare and manipulate a stack
+variable in the LLVM IR. Stack memory allocated with the alloca
+instruction is fully general: you can pass the address of the stack slot
+to functions, you can store it in other variables, etc. In our example
+above, we could rewrite the example to use the alloca technique to avoid
+using a PHI node:</p>
+<div class="highlight-llvm"><div class="highlight"><pre><span class="vg">@G</span> <span class="p">=</span> <span class="k">weak</span> <span class="k">global</span> <span class="k">i32</span> <span class="m">0</span> <span class="c">; type of @G is i32*</span>
+<span class="vg">@H</span> <span class="p">=</span> <span class="k">weak</span> <span class="k">global</span> <span class="k">i32</span> <span class="m">0</span> <span class="c">; type of @H is i32*</span>
+
+<span class="k">define</span> <span class="k">i32</span> <span class="vg">@test</span><span class="p">(</span><span class="k">i1</span> <span class="nv">%Condition</span><span class="p">)</span> <span class="p">{</span>
+<span class="nl">entry:</span>
+ <span class="nv">%X</span> <span class="p">=</span> <span class="k">alloca</span> <span class="k">i32</span> <span class="c">; type of %X is i32*.</span>
+ <span class="k">br</span> <span class="k">i1</span> <span class="nv">%Condition</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%cond_true</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%cond_false</span>
+
+<span class="nl">cond_true:</span>
+ <span class="nv">%X.0</span> <span class="p">=</span> <span class="k">load</span> <span class="k">i32</span><span class="p">*</span> <span class="vg">@G</span>
+ <span class="k">store</span> <span class="k">i32</span> <span class="nv">%X.0</span><span class="p">,</span> <span class="k">i32</span><span class="p">*</span> <span class="nv">%X</span> <span class="c">; Update X</span>
+ <span class="k">br</span> <span class="kt">label</span> <span class="nv">%cond_next</span>
+
+<span class="nl">cond_false:</span>
+ <span class="nv">%X.1</span> <span class="p">=</span> <span class="k">load</span> <span class="k">i32</span><span class="p">*</span> <span class="vg">@H</span>
+ <span class="k">store</span> <span class="k">i32</span> <span class="nv">%X.1</span><span class="p">,</span> <span class="k">i32</span><span class="p">*</span> <span class="nv">%X</span> <span class="c">; Update X</span>
+ <span class="k">br</span> <span class="kt">label</span> <span class="nv">%cond_next</span>
+
+<span class="nl">cond_next:</span>
+ <span class="nv">%X.2</span> <span class="p">=</span> <span class="k">load</span> <span class="k">i32</span><span class="p">*</span> <span class="nv">%X</span> <span class="c">; Read X</span>
+ <span class="k">ret</span> <span class="k">i32</span> <span class="nv">%X.2</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>With this, we have discovered a way to handle arbitrary mutable
+variables without the need to create Phi nodes at all:</p>
+<ol class="arabic simple">
+<li>Each mutable variable becomes a stack allocation.</li>
+<li>Each read of the variable becomes a load from the stack.</li>
+<li>Each update of the variable becomes a store to the stack.</li>
+<li>Taking the address of a variable just uses the stack address
+directly.</li>
+</ol>
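+<p>As a rough illustration of the rules above, here is a minimal OCaml sketch that lowers a toy expression type to textual, LLVM-like instructions. The <tt class="docutils literal"><span class="pre">expr</span></tt> type and the <tt class="docutils literal"><span class="pre">lower</span></tt>, <tt class="docutils literal"><span class="pre">declare</span></tt> and <tt class="docutils literal"><span class="pre">lower_program</span></tt> functions are invented for this sketch and use only the OCaml standard library; they are not part of Kaleidoscope or of the LLVM bindings. Rule 4 is not modeled, since the toy language has no way to take an address.</p>

```ocaml
(* Toy model only: a tiny expression language lowered to textual,
   LLVM-like instructions. None of these names come from the tutorial. *)
type expr =
  | Num of int
  | Var of string                 (* read of a mutable variable *)
  | Add of expr * expr
  | Assign of string * expr       (* update of a mutable variable *)

let counter = ref 0
let fresh () = incr counter; Printf.sprintf "%%t%d" !counter

(* Returns (instructions, operand that holds the result). *)
let rec lower = function
  | Num n -> ([], string_of_int n)
  | Var x ->
      (* Rule 2: each read becomes a load from the stack slot. *)
      let t = fresh () in
      ([Printf.sprintf "%s = load i32* %%%s" t x], t)
  | Add (a, b) ->
      let ia, va = lower a in
      let ib, vb = lower b in
      let t = fresh () in
      (ia @ ib @ [Printf.sprintf "%s = add i32 %s, %s" t va vb], t)
  | Assign (x, e) ->
      (* Rule 3: each update becomes a store to the stack slot. *)
      let ie, ve = lower e in
      (ie @ [Printf.sprintf "store i32 %s, i32* %%%s" ve x], ve)

(* Rule 1: each mutable variable becomes a stack allocation. *)
let declare x = Printf.sprintf "%%%s = alloca i32" x

let lower_program vars body =
  counter := 0;
  List.map declare vars @ fst (lower body)

let () =
  (* X = X + 1 *)
  lower_program ["X"] (Assign ("X", Add (Var "X", Num 1)))
  |> List.iter print_endline
```

+<p>For <tt class="docutils literal"><span class="pre">X</span> <span class="pre">=</span> <span class="pre">X</span> <span class="pre">+</span> <span class="pre">1</span></tt> the sketch prints an alloca, a load, an add and a store, which mirrors the <tt class="docutils literal"><span class="pre">@example</span></tt> function shown earlier. No Phi node is ever needed, because every value flows through the stack slot.</p>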
+<p>While this solution has solved our immediate problem, it introduced
+another one: we have now apparently introduced a lot of stack traffic
+for very simple and common operations, a major performance problem.
+Fortunately for us, the LLVM optimizer has a highly-tuned optimization
+pass named “mem2reg” that handles this case, promoting allocas like this
+into SSA registers, inserting Phi nodes as appropriate. If you run this
+example through the pass, you’ll get:</p>
+<div class="highlight-bash"><div class="highlight"><pre><span class="nv">$ </span>llvm-as < example.ll | opt -mem2reg | llvm-dis
+@G <span class="o">=</span> weak global i32 0
+@H <span class="o">=</span> weak global i32 0
+
+define i32 @test<span class="o">(</span>i1 %Condition<span class="o">)</span> <span class="o">{</span>
+entry:
+ br i1 %Condition, label %cond_true, label %cond_false
+
+cond_true:
+ %X.0 <span class="o">=</span> load i32* @G
+ br label %cond_next
+
+cond_false:
+ %X.1 <span class="o">=</span> load i32* @H
+ br label %cond_next
+
+cond_next:
+ %X.01 <span class="o">=</span> phi i32 <span class="o">[</span> %X.1, %cond_false <span class="o">]</span>, <span class="o">[</span> %X.0, %cond_true <span class="o">]</span>
+ ret i32 %X.01
+<span class="o">}</span>
+</pre></div>
+</div>
+<p>The mem2reg pass implements the standard “iterated dominance frontier”
+algorithm for constructing SSA form and has a number of optimizations
+that speed up (very common) degenerate cases. The mem2reg optimization
+pass is the answer to dealing with mutable variables, and we highly
+recommend that you depend on it. Note that mem2reg only works on
+variables in certain circumstances:</p>
+<ol class="arabic simple">
+<li>mem2reg is alloca-driven: it looks for allocas and if it can handle
+them, it promotes them. It does not apply to global variables or heap
+allocations.</li>
+<li>mem2reg only looks for alloca instructions in the entry block of the
+function. Being in the entry block guarantees that the alloca is only
+executed once, which makes analysis simpler.</li>
+<li>mem2reg only promotes allocas whose uses are direct loads and stores.
+If the address of the stack object is passed to a function, or if any
+funny pointer arithmetic is involved, the alloca will not be
+promoted.</li>
+<li>mem2reg only works on allocas of <a class="reference external" href="../LangRef.html#t_classifications">first
+class</a> values (such as pointers,
+scalars and vectors), and only if the array size of the allocation is
+1 (or missing in the .ll file). mem2reg is not capable of promoting
+structs or arrays to registers. Note that the “scalarrepl” pass is
+more powerful and can promote structs, “unions”, and arrays in many
+cases.</li>
+</ol>
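+<p>The conditions above can be summarized as a predicate over a simplified description of an alloca. The sketch below is a toy model under stated assumptions: the <tt class="docutils literal"><span class="pre">use</span></tt>, <tt class="docutils literal"><span class="pre">ty</span></tt> and <tt class="docutils literal"><span class="pre">alloca</span></tt> types and the <tt class="docutils literal"><span class="pre">is_promotable</span></tt> function are hypothetical names, and the code does not reflect how mem2reg is actually implemented inside LLVM. The first condition is implicit, because the model only describes allocas.</p>

```ocaml
(* Toy model only: these types are hypothetical and are not LLVM's API. *)
type use =
  | Load                    (* direct load from the slot *)
  | Store                   (* direct store to the slot *)
  | PassedToCall            (* the address escapes into a function call *)
  | PointerArithmetic       (* the address feeds an address computation *)

type ty = Scalar | Pointer | Vector | Aggregate  (* Aggregate: struct or array *)

type alloca = {
  in_entry_block : bool;
  elem_type : ty;
  array_size : int;          (* 1 when omitted in the .ll file *)
  uses : use list;
}

(* Condition 4: only first class values can be promoted. *)
let is_first_class = function
  | Scalar | Pointer | Vector -> true
  | Aggregate -> false

(* Condition 3: every use must be a direct load or store. *)
let is_direct = function
  | Load | Store -> true
  | PassedToCall | PointerArithmetic -> false

let is_promotable a =
  a.in_entry_block                (* condition 2 *)
  && is_first_class a.elem_type
  && a.array_size = 1
  && List.for_all is_direct a.uses
```

+<p>A front-end that only ever emits scalar allocas in the entry block and accesses them with plain loads and stores satisfies this predicate by construction.</p>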
+<p>All of these properties are easy to satisfy for most imperative
+languages, and we’ll illustrate it below with Kaleidoscope. The final
+question you may be asking is: should I bother with this nonsense for my
+front-end? Wouldn’t it be better if I just did SSA construction
+directly, avoiding use of the mem2reg optimization pass? In short, we
+strongly recommend that you use this technique for building SSA form,
+unless there is an extremely good reason not to. Using this technique
+is:</p>
+<ul class="simple">
+<li>Proven and well tested: clang uses this technique
+for local mutable variables. As such, the most common clients of LLVM
+are using this to handle the bulk of their variables. You can be sure
+that bugs are found fast and fixed early.</li>
+<li>Extremely Fast: mem2reg has a number of special cases that make it
+fast in common cases as well as fully general. For example, it has
+fast-paths for variables that are only used in a single block,
+variables that only have one assignment point, good heuristics to
+avoid insertion of unneeded phi nodes, etc.</li>
+<li>Needed for debug info generation: <a class="reference external" href="../SourceLevelDebugging.html">Debug information in
+LLVM</a> relies on having the address of
+the variable exposed so that debug info can be attached to it. This
+technique dovetails very naturally with this style of debug info.</li>
+</ul>
+<p>If nothing else, this makes it much easier to get your front-end up and
+running, and is very simple to implement. Let’s extend Kaleidoscope with
+mutable variables now!</p>
+</div>
+<div class="section" id="mutable-variables-in-kaleidoscope">
+<h2><a class="toc-backref" href="#id5">7.4. Mutable Variables in Kaleidoscope</a><a class="headerlink" href="#mutable-variables-in-kaleidoscope" title="Permalink to this headline">¶</a></h2>
+<p>Now that we know the sort of problem we want to tackle, let’s see what
+this looks like in the context of our little Kaleidoscope language.
+We’re going to add two features:</p>
+<ol class="arabic simple">
+<li>The ability to mutate variables with the ‘=’ operator.</li>
+<li>The ability to define new variables.</li>
+</ol>
+<p>While the first item is really what this is about, we only have
+variables for incoming arguments as well as for induction variables, and
+redefining those only goes so far :). Also, the ability to define new
+variables is a useful thing regardless of whether you will be mutating
+them. Here’s a motivating example that shows how we could use these:</p>
+<div class="highlight-python"><pre># Define ':' for sequencing: as a low-precedence operator that ignores operands
+# and just returns the RHS.
+def binary : 1 (x y) y;
+
+# Recursive fib, we could do this before.
+def fib(x)
+ if (x < 3) then
+ 1
+ else
+ fib(x-1)+fib(x-2);
+
+# Iterative fib.
+def fibi(x)
+ var a = 1, b = 1, c in
+ (for i = 3, i < x in
+ c = a + b :
+ a = b :
+ b = c) :
+ b;
+
+# Call it.
+fibi(10);</pre>
+</div>
+<p>In order to mutate variables, we have to change our existing variables
+to use the “alloca trick”. Once we have that, we’ll add our new
+operator, then extend Kaleidoscope to support new variable definitions.</p>
+</div>
+<div class="section" id="adjusting-existing-variables-for-mutation">
+<h2><a class="toc-backref" href="#id6">7.5. Adjusting Existing Variables for Mutation</a><a class="headerlink" href="#adjusting-existing-variables-for-mutation" title="Permalink to this headline">¶</a></h2>
+<p>The symbol table in Kaleidoscope is managed at code generation time by
+the ‘<tt class="docutils literal"><span class="pre">named_values</span></tt>’ map. This map currently keeps track of the LLVM
+“Value*” that holds the double value for the named variable. In order
+to support mutation, we need to change this slightly, so that
+<tt class="docutils literal"><span class="pre">named_values</span></tt> holds the <em>memory location</em> of the variable in
+question. Note that this change is a refactoring: it changes the
+structure of the code, but does not (by itself) change the behavior of
+the compiler. All of these changes are isolated in the Kaleidoscope code
+generator.</p>
+<p>At this point in Kaleidoscope’s development, it only supports variables
+for two things: incoming arguments to functions and the induction
+variable of ‘for’ loops. For consistency, we’ll allow mutation of these
+variables in addition to other user-defined variables. This means that
+these will both need memory locations.</p>
+<p>To start our transformation of Kaleidoscope, we’ll change the
+<tt class="docutils literal"><span class="pre">named_values</span></tt> map so that it holds the alloca for each variable (an AllocaInst* in C++ terms) instead of its Value*.
+Unlike in the C++ version of this tutorial, the compiler will not point out the code
+that needs updating, because the declared type of the map stays the same:</p>
+<p><strong>Note:</strong> the OCaml bindings currently model both <tt class="docutils literal"><span class="pre">Value*</span></tt> and
+<tt class="docutils literal"><span class="pre">AllocaInst*</span></tt> as <tt class="docutils literal"><span class="pre">Llvm.llvalue</span></tt>, but this may change in the future
+to be more type safe.</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">let</span> <span class="n">named_values</span><span class="o">:(</span><span class="kt">string</span><span class="o">,</span> <span class="n">llvalue</span><span class="o">)</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">t</span> <span class="o">=</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">create</span> <span class="mi">10</span>
+</pre></div>
+</div>
+<p>Also, since we will need to create these allocas, we’ll use a helper
+function that ensures that the allocas are created in the entry block of
+the function:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* Create an alloca instruction in the entry block of the function. This</span>
+<span class="c"> * is used for mutable variables etc. *)</span>
+<span class="k">let</span> <span class="n">create_entry_block_alloca</span> <span class="n">the_function</span> <span class="n">var_name</span> <span class="o">=</span>
+ <span class="k">let</span> <span class="n">builder</span> <span class="o">=</span> <span class="n">builder_at</span> <span class="o">(</span><span class="n">instr_begin</span> <span class="o">(</span><span class="n">entry_block</span> <span class="n">the_function</span><span class="o">))</span> <span class="k">in</span>
+ <span class="n">build_alloca</span> <span class="n">double_type</span> <span class="n">var_name</span> <span class="n">builder</span>
+</pre></div>
+</div>
+<p>This funny looking code creates an <tt class="docutils literal"><span class="pre">Llvm.llbuilder</span></tt> object that is
+pointing at the first instruction of the entry block. It then creates an
+alloca with the expected name and returns it. Because all values in
+Kaleidoscope are doubles, there is no need to pass in a type to use.</p>
+<p>With this in place, the first functionality change we want to make is to
+variable references. In our new scheme, variables live on the stack, so
+code generating a reference to them actually needs to produce a load
+from the stack slot:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">let</span> <span class="k">rec</span> <span class="n">codegen_expr</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">...</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Variable</span> <span class="n">name</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">v</span> <span class="o">=</span> <span class="k">try</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">name</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown variable name"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="c">(* Load the value. *)</span>
+ <span class="n">build_load</span> <span class="n">v</span> <span class="n">name</span> <span class="n">builder</span>
+</pre></div>
+</div>
+<p>As you can see, this is pretty straightforward. Now we need to update
+the things that define the variables to set up the alloca. We’ll start
+with <tt class="docutils literal"><span class="pre">codegen_expr</span> <span class="pre">Ast.For</span> <span class="pre">...</span></tt> (see the <a class="reference external" href="#code">full code listing</a>
+for the unabridged code):</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">For</span> <span class="o">(</span><span class="n">var_name</span><span class="o">,</span> <span class="n">start</span><span class="o">,</span> <span class="n">end_</span><span class="o">,</span> <span class="n">step</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">block_parent</span> <span class="o">(</span><span class="n">insertion_block</span> <span class="n">builder</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Create an alloca for the variable in the entry block. *)</span>
+ <span class="k">let</span> <span class="n">alloca</span> <span class="o">=</span> <span class="n">create_entry_block_alloca</span> <span class="n">the_function</span> <span class="n">var_name</span> <span class="k">in</span>
+
+ <span class="c">(* Emit the start code first, without 'variable' in scope. *)</span>
+ <span class="k">let</span> <span class="n">start_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">start</span> <span class="k">in</span>
+
+ <span class="c">(* Store the value into the alloca. *)</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">start_val</span> <span class="n">alloca</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="o">...</span>
+
+ <span class="c">(* Within the loop, the variable refers to the alloca. If it</span>
+<span class="c"> * shadows an existing variable, we have to restore it, so save it</span>
+<span class="c"> * now. *)</span>
+ <span class="k">let</span> <span class="n">old_val</span> <span class="o">=</span>
+ <span class="k">try</span> <span class="nc">Some</span> <span class="o">(</span><span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">var_name</span><span class="o">)</span> <span class="k">with</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="nc">None</span>
+ <span class="k">in</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">alloca</span><span class="o">;</span>
+
+ <span class="o">...</span>
+
+ <span class="c">(* Compute the end condition. *)</span>
+ <span class="k">let</span> <span class="n">end_cond</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">end_</span> <span class="k">in</span>
+
+ <span class="c">(* Reload, increment, and restore the alloca. This handles the case where</span>
+<span class="c"> * the body of the loop mutates the variable. *)</span>
+ <span class="k">let</span> <span class="n">cur_var</span> <span class="o">=</span> <span class="n">build_load</span> <span class="n">alloca</span> <span class="n">var_name</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">next_var</span> <span class="o">=</span> <span class="n">build_fadd</span> <span class="n">cur_var</span> <span class="n">step_val</span> <span class="s2">"nextvar"</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">next_var</span> <span class="n">alloca</span> <span class="n">builder</span><span class="o">);</span>
+ <span class="o">...</span>
+</pre></div>
+</div>
+<p>This code is virtually identical to the code <a class="reference external" href="OCamlLangImpl5.html#forcodegen">before we allowed mutable
+variables</a>. The big difference is that
+we no longer have to construct a PHI node, and we use load/store to
+access the variable as needed.</p>
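+<p>The save-and-restore handling of a shadowed loop variable can be tried out in isolation with a plain <tt class="docutils literal"><span class="pre">Hashtbl</span></tt>. The sketch below is an assumption-laden stand-in rather than the tutorial’s code: <tt class="docutils literal"><span class="pre">with_binding</span></tt> is a hypothetical helper, and strings take the place of the <tt class="docutils literal"><span class="pre">llvalue</span></tt> of each alloca so that the example runs without the LLVM bindings.</p>

```ocaml
(* Hypothetical helper, not part of the tutorial's code: bind [name] to
   [slot] while [body] runs, then put back whatever was bound before. *)
let with_binding tbl name slot body =
  let old_val =
    try Some (Hashtbl.find tbl name) with Not_found -> None
  in
  Hashtbl.replace tbl name slot;
  let result = body () in
  (match old_val with
   | Some v -> Hashtbl.replace tbl name v   (* restore the outer binding *)
   | None -> Hashtbl.remove tbl name);      (* the name was not in scope *)
  result

let () =
  (* Strings stand in for the llvalue of each alloca. *)
  let named_values : (string, string) Hashtbl.t = Hashtbl.create 10 in
  Hashtbl.replace named_values "i" "%i.outer";
  let seen =
    with_binding named_values "i" "%i.loop"
      (fun () -> Hashtbl.find named_values "i")
  in
  Printf.printf "inside: %s, after: %s\n" seen (Hashtbl.find named_values "i")
```

+<p>Inside the callback the name resolves to the loop’s slot, and afterwards it resolves to the outer slot again. Note that this simple version does not restore the binding if the callback raises an exception.</p>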
+<p>To support mutable argument variables, we need to also make allocas for
+them. The code for this is also pretty simple:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* Create an alloca for each argument and register the argument in the symbol</span>
+<span class="c"> * table so that references to it will succeed. *)</span>
+<span class="k">let</span> <span class="n">create_argument_allocas</span> <span class="n">the_function</span> <span class="n">proto</span> <span class="o">=</span>
+ <span class="k">let</span> <span class="n">args</span> <span class="o">=</span> <span class="k">match</span> <span class="n">proto</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(_,</span> <span class="n">args</span><span class="o">)</span> <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(_,</span> <span class="n">args</span><span class="o">,</span> <span class="o">_)</span> <span class="o">-></span> <span class="n">args</span>
+ <span class="k">in</span>
+ <span class="nn">Array</span><span class="p">.</span><span class="n">iteri</span> <span class="o">(</span><span class="k">fun</span> <span class="n">i</span> <span class="n">ai</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">var_name</span> <span class="o">=</span> <span class="n">args</span><span class="o">.(</span><span class="n">i</span><span class="o">)</span> <span class="k">in</span>
+ <span class="c">(* Create an alloca for this variable. *)</span>
+ <span class="k">let</span> <span class="n">alloca</span> <span class="o">=</span> <span class="n">create_entry_block_alloca</span> <span class="n">the_function</span> <span class="n">var_name</span> <span class="k">in</span>
+
+ <span class="c">(* Store the initial value into the alloca. *)</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">ai</span> <span class="n">alloca</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Add arguments to variable symbol table. *)</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">alloca</span><span class="o">;</span>
+ <span class="o">)</span> <span class="o">(</span><span class="n">params</span> <span class="n">the_function</span><span class="o">)</span>
+</pre></div>
+</div>
+<p>For each argument, we make an alloca, store the input value to the
+function into the alloca, and register the alloca as the memory location
+for the argument. This method gets invoked by <tt class="docutils literal"><span class="pre">Codegen.codegen_func</span></tt>
+right after it sets up the entry block for the function.</p>
+<p>The final missing piece is adding the mem2reg pass, which allows us to
+get good codegen once again:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">let</span> <span class="n">main</span> <span class="bp">()</span> <span class="o">=</span>
+ <span class="o">...</span>
+ <span class="k">let</span> <span class="n">the_fpm</span> <span class="o">=</span> <span class="nn">PassManager</span><span class="p">.</span><span class="n">create_function</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">the_module</span> <span class="k">in</span>
+
+ <span class="c">(* Set up the optimizer pipeline. Start with registering info about how the</span>
+<span class="c"> * target lays out data structures. *)</span>
+ <span class="nn">DataLayout</span><span class="p">.</span><span class="n">add</span> <span class="o">(</span><span class="nn">ExecutionEngine</span><span class="p">.</span><span class="n">target_data</span> <span class="n">the_execution_engine</span><span class="o">)</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Promote allocas to registers. *)</span>
+ <span class="n">add_memory_to_register_promotion</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Do simple "peephole" optimizations and bit-twiddling optzn. *)</span>
+ <span class="n">add_instruction_combining</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* reassociate expressions. *)</span>
+ <span class="n">add_reassociation</span> <span class="n">the_fpm</span><span class="o">;</span>
+</pre></div>
+</div>
+<p>It is interesting to see what the code looks like before and after the
+mem2reg optimization runs. For example, this is the before/after code
+for our recursive fib function. Before the optimization:</p>
+<div class="highlight-llvm"><div class="highlight"><pre><span class="k">define</span> <span class="kt">double</span> <span class="vg">@fib</span><span class="p">(</span><span class="kt">double</span> <span class="nv">%x</span><span class="p">)</span> <span class="p">{</span>
+<span class="nl">entry:</span>
+ <span class="nv">%x1</span> <span class="p">=</span> <span class="k">alloca</span> <span class="kt">double</span>
+ <span class="k">store</span> <span class="kt">double</span> <span class="nv">%x</span><span class="p">,</span> <span class="kt">double</span><span class="p">*</span> <span class="nv">%x1</span>
+ <span class="nv">%x2</span> <span class="p">=</span> <span class="k">load</span> <span class="kt">double</span><span class="p">*</span> <span class="nv">%x1</span>
+ <span class="nv">%cmptmp</span> <span class="p">=</span> <span class="k">fcmp</span> <span class="k">ult</span> <span class="kt">double</span> <span class="nv">%x2</span><span class="p">,</span> <span class="m">3.000000e+00</span>
+ <span class="nv">%booltmp</span> <span class="p">=</span> <span class="k">uitofp</span> <span class="k">i1</span> <span class="nv">%cmptmp</span> <span class="k">to</span> <span class="kt">double</span>
+ <span class="nv">%ifcond</span> <span class="p">=</span> <span class="k">fcmp</span> <span class="k">one</span> <span class="kt">double</span> <span class="nv">%booltmp</span><span class="p">,</span> <span class="m">0.000000e+00</span>
+ <span class="k">br</span> <span class="k">i1</span> <span class="nv">%ifcond</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%then</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%else</span>
+
+<span class="nl">then:</span> <span class="c">; preds = %entry</span>
+ <span class="k">br</span> <span class="kt">label</span> <span class="nv">%ifcont</span>
+
+<span class="nl">else:</span> <span class="c">; preds = %entry</span>
+ <span class="nv">%x3</span> <span class="p">=</span> <span class="k">load</span> <span class="kt">double</span><span class="p">*</span> <span class="nv">%x1</span>
+ <span class="nv">%subtmp</span> <span class="p">=</span> <span class="k">fsub</span> <span class="kt">double</span> <span class="nv">%x3</span><span class="p">,</span> <span class="m">1.000000e+00</span>
+ <span class="nv">%calltmp</span> <span class="p">=</span> <span class="k">call</span> <span class="kt">double</span> <span class="vg">@fib</span><span class="p">(</span><span class="kt">double</span> <span class="nv">%subtmp</span><span class="p">)</span>
+ <span class="nv">%x4</span> <span class="p">=</span> <span class="k">load</span> <span class="kt">double</span><span class="p">*</span> <span class="nv">%x1</span>
+ <span class="nv">%subtmp5</span> <span class="p">=</span> <span class="k">fsub</span> <span class="kt">double</span> <span class="nv">%x4</span><span class="p">,</span> <span class="m">2.000000e+00</span>
+ <span class="nv">%calltmp6</span> <span class="p">=</span> <span class="k">call</span> <span class="kt">double</span> <span class="vg">@fib</span><span class="p">(</span><span class="kt">double</span> <span class="nv">%subtmp5</span><span class="p">)</span>
+ <span class="nv">%addtmp</span> <span class="p">=</span> <span class="k">fadd</span> <span class="kt">double</span> <span class="nv">%calltmp</span><span class="p">,</span> <span class="nv">%calltmp6</span>
+ <span class="k">br</span> <span class="kt">label</span> <span class="nv">%ifcont</span>
+
+<span class="nl">ifcont:</span> <span class="c">; preds = %else, %then</span>
+ <span class="nv">%iftmp</span> <span class="p">=</span> <span class="k">phi</span> <span class="kt">double</span> <span class="p">[</span> <span class="m">1.000000e+00</span><span class="p">,</span> <span class="nv">%then</span> <span class="p">],</span> <span class="p">[</span> <span class="nv">%addtmp</span><span class="p">,</span> <span class="nv">%else</span> <span class="p">]</span>
+ <span class="k">ret</span> <span class="kt">double</span> <span class="nv">%iftmp</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>Here there is only one variable (x, the input argument) but you can
+still see the extremely simple-minded code generation strategy we are
+using. In the entry block, an alloca is created, and the initial input
+value is stored into it. Each reference to the variable does a reload
+from the stack. Also, note that we didn’t modify the if/then/else
+expression, so it still inserts a PHI node. While we could make an
+alloca for it, it is actually easier to create a PHI node for it, so we
+still just make the PHI.</p>
+<p>Here is the code after the mem2reg pass runs:</p>
+<div class="highlight-llvm"><div class="highlight"><pre><span class="k">define</span> <span class="kt">double</span> <span class="vg">@fib</span><span class="p">(</span><span class="kt">double</span> <span class="nv">%x</span><span class="p">)</span> <span class="p">{</span>
+<span class="nl">entry:</span>
+ <span class="nv">%cmptmp</span> <span class="p">=</span> <span class="k">fcmp</span> <span class="k">ult</span> <span class="kt">double</span> <span class="nv">%x</span><span class="p">,</span> <span class="m">3.000000e+00</span>
+ <span class="nv">%booltmp</span> <span class="p">=</span> <span class="k">uitofp</span> <span class="k">i1</span> <span class="nv">%cmptmp</span> <span class="k">to</span> <span class="kt">double</span>
+ <span class="nv">%ifcond</span> <span class="p">=</span> <span class="k">fcmp</span> <span class="k">one</span> <span class="kt">double</span> <span class="nv">%booltmp</span><span class="p">,</span> <span class="m">0.000000e+00</span>
+ <span class="k">br</span> <span class="k">i1</span> <span class="nv">%ifcond</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%then</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%else</span>
+
+<span class="nl">then:</span>
+ <span class="k">br</span> <span class="kt">label</span> <span class="nv">%ifcont</span>
+
+<span class="nl">else:</span>
+ <span class="nv">%subtmp</span> <span class="p">=</span> <span class="k">fsub</span> <span class="kt">double</span> <span class="nv">%x</span><span class="p">,</span> <span class="m">1.000000e+00</span>
+ <span class="nv">%calltmp</span> <span class="p">=</span> <span class="k">call</span> <span class="kt">double</span> <span class="vg">@fib</span><span class="p">(</span><span class="kt">double</span> <span class="nv">%subtmp</span><span class="p">)</span>
+ <span class="nv">%subtmp5</span> <span class="p">=</span> <span class="k">fsub</span> <span class="kt">double</span> <span class="nv">%x</span><span class="p">,</span> <span class="m">2.000000e+00</span>
+ <span class="nv">%calltmp6</span> <span class="p">=</span> <span class="k">call</span> <span class="kt">double</span> <span class="vg">@fib</span><span class="p">(</span><span class="kt">double</span> <span class="nv">%subtmp5</span><span class="p">)</span>
+ <span class="nv">%addtmp</span> <span class="p">=</span> <span class="k">fadd</span> <span class="kt">double</span> <span class="nv">%calltmp</span><span class="p">,</span> <span class="nv">%calltmp6</span>
+ <span class="k">br</span> <span class="kt">label</span> <span class="nv">%ifcont</span>
+
+<span class="nl">ifcont:</span> <span class="c">; preds = %else, %then</span>
+ <span class="nv">%iftmp</span> <span class="p">=</span> <span class="k">phi</span> <span class="kt">double</span> <span class="p">[</span> <span class="m">1.000000e+00</span><span class="p">,</span> <span class="nv">%then</span> <span class="p">],</span> <span class="p">[</span> <span class="nv">%addtmp</span><span class="p">,</span> <span class="nv">%else</span> <span class="p">]</span>
+ <span class="k">ret</span> <span class="kt">double</span> <span class="nv">%iftmp</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>This is a trivial case for mem2reg, since there are no redefinitions of
+the variable. The point of showing this is to calm your tension about
+inserting such blatant inefficiencies :).</p>
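+<p>To see why a variable without redefinitions is trivial to promote, consider the following toy rewrite over a straight-line instruction list. It is only a sketch of that one easy case, assuming the slot is stored exactly once before any load and its address never escapes; it is not the dominance-frontier algorithm that mem2reg implements. The <tt class="docutils literal"><span class="pre">operand</span></tt> and <tt class="docutils literal"><span class="pre">instr</span></tt> types and <tt class="docutils literal"><span class="pre">promote_single_store</span></tt> are hypothetical names.</p>

```ocaml
(* Toy model only: a hypothetical straight-line IR, not LLVM's data types. *)
type operand = Reg of string | Const of float

type instr =
  | Alloca of string                    (* slot *)
  | Store of operand * string           (* value, slot *)
  | Load of string * string             (* result register, slot *)
  | FSub of string * operand * operand  (* result register, lhs, rhs *)

(* Promote [slot], assuming it is stored exactly once before any load and
   its address never escapes. The alloca, the store and the loads are
   deleted, and every load result is replaced by the stored operand. *)
let promote_single_store slot instrs =
  let stored =
    List.fold_left
      (fun acc i ->
         match acc, i with
         | None, Store (v, s) when s = slot -> Some v
         | _ -> acc)
      None instrs
  in
  match stored with
  | None -> instrs
  | Some v ->
      (* Registers defined by loads of [slot] all become aliases of [v]. *)
      let aliases =
        List.filter_map
          (function Load (r, s) when s = slot -> Some r | _ -> None)
          instrs
      in
      let subst = function
        | Reg r when List.mem r aliases -> v
        | op -> op
      in
      List.filter_map
        (function
          | Alloca s when s = slot -> None
          | Store (_, s) when s = slot -> None
          | Load (_, s) when s = slot -> None
          | FSub (r, a, b) -> Some (FSub (r, subst a, subst b))
          | other -> Some other)
        instrs

(* The start of fib's 'else' path, written in the toy IR. *)
let example =
  promote_single_store "x1"
    [ Alloca "x1"; Store (Reg "x", "x1"); Load ("x3", "x1");
      FSub ("subtmp", Reg "x3", Const 1.0) ]
(* example = [FSub ("subtmp", Reg "x", Const 1.0)] *)
```

+<p>The result matches the shape of the promoted code above: the subtraction reads <tt class="docutils literal"><span class="pre">%x</span></tt> directly and the stack traffic is gone.</p>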
+<p>After the rest of the optimizers run, we get:</p>
+<div class="highlight-llvm"><div class="highlight"><pre><span class="k">define</span> <span class="kt">double</span> <span class="vg">@fib</span><span class="p">(</span><span class="kt">double</span> <span class="nv">%x</span><span class="p">)</span> <span class="p">{</span>
+<span class="nl">entry:</span>
+ <span class="nv">%cmptmp</span> <span class="p">=</span> <span class="k">fcmp</span> <span class="k">ult</span> <span class="kt">double</span> <span class="nv">%x</span><span class="p">,</span> <span class="m">3.000000e+00</span>
+ <span class="nv">%booltmp</span> <span class="p">=</span> <span class="k">uitofp</span> <span class="k">i1</span> <span class="nv">%cmptmp</span> <span class="k">to</span> <span class="kt">double</span>
+ <span class="nv">%ifcond</span> <span class="p">=</span> <span class="k">fcmp</span> <span class="k">ueq</span> <span class="kt">double</span> <span class="nv">%booltmp</span><span class="p">,</span> <span class="m">0.000000e+00</span>
+ <span class="k">br</span> <span class="k">i1</span> <span class="nv">%ifcond</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%else</span><span class="p">,</span> <span class="kt">label</span> <span class="nv">%ifcont</span>
+
+<span class="nl">else:</span>
+ <span class="nv">%subtmp</span> <span class="p">=</span> <span class="k">fsub</span> <span class="kt">double</span> <span class="nv">%x</span><span class="p">,</span> <span class="m">1.000000e+00</span>
+ <span class="nv">%calltmp</span> <span class="p">=</span> <span class="k">call</span> <span class="kt">double</span> <span class="vg">@fib</span><span class="p">(</span><span class="kt">double</span> <span class="nv">%subtmp</span><span class="p">)</span>
+ <span class="nv">%subtmp5</span> <span class="p">=</span> <span class="k">fsub</span> <span class="kt">double</span> <span class="nv">%x</span><span class="p">,</span> <span class="m">2.000000e+00</span>
+ <span class="nv">%calltmp6</span> <span class="p">=</span> <span class="k">call</span> <span class="kt">double</span> <span class="vg">@fib</span><span class="p">(</span><span class="kt">double</span> <span class="nv">%subtmp5</span><span class="p">)</span>
+ <span class="nv">%addtmp</span> <span class="p">=</span> <span class="k">fadd</span> <span class="kt">double</span> <span class="nv">%calltmp</span><span class="p">,</span> <span class="nv">%calltmp6</span>
+ <span class="k">ret</span> <span class="kt">double</span> <span class="nv">%addtmp</span>
+
+<span class="nl">ifcont:</span>
+ <span class="k">ret</span> <span class="kt">double</span> <span class="m">1.000000e+00</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+<p>Here we see that the simplifycfg pass decided to clone the return
+instruction into the end of the ‘else’ block. This allowed it to
+eliminate some branches and the PHI node.</p>
+<p>Now that all symbol table references are updated to use stack variables,
+we’ll add the assignment operator.</p>
+</div>
+<div class="section" id="new-assignment-operator">
+<h2><a class="toc-backref" href="#id7">7.6. New Assignment Operator</a><a class="headerlink" href="#new-assignment-operator" title="Permalink to this headline">¶</a></h2>
+<p>With our current framework, adding a new assignment operator is really
+simple. We will parse it just like any other binary operator, but handle
+it internally (instead of allowing the user to define it). The first
+step is to set a precedence:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">let</span> <span class="n">main</span> <span class="bp">()</span> <span class="o">=</span>
+ <span class="c">(* Install standard binary operators.</span>
+<span class="c"> * 1 is the lowest precedence. *)</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'='</span> <span class="mi">2</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'<'</span> <span class="mi">10</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'+'</span> <span class="mi">20</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'-'</span> <span class="mi">20</span><span class="o">;</span>
+ <span class="o">...</span>
+</pre></div>
+</div>
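+<p>To see what the table buys us, here is a minimal standalone sketch. The
+<tt>binop_precedence</tt> table below is a local stand-in for the one in the
+Parser module, and the <tt>precedence</tt> helper with its -1 fallback is an
+assumption made for illustration:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>(* Local stand-in for Parser.binop_precedence. *)
+let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10
+
+(* Characters that are not in the table get -1, meaning "not a binary
+ * operator" (an assumption of this sketch). *)
+let precedence c =
+  try Hashtbl.find binop_precedence c with Not_found -> -1
+
+let () =
+  Hashtbl.add binop_precedence '=' 2;
+  Hashtbl.add binop_precedence '<' 10;
+  Hashtbl.add binop_precedence '+' 20;
+  (* '+' binds tighter than '=', so "x = y + 1" groups as x = (y + 1). *)
+  assert (precedence '+' > precedence '=');
+  assert (precedence '?' = -1);
+  Printf.printf "'=' has precedence %d, '+' has precedence %d\n"
+    (precedence '=') (precedence '+')
+</pre></div>
+</div>
+<p>Giving ‘=’ a precedence lower than every arithmetic operator is what
+makes the whole right-hand expression the value being assigned.</p>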
+<p>Now that the parser knows the precedence of the binary operator, it
+takes care of all the parsing and AST generation. We just need to
+implement codegen for the assignment operator. This looks like:</p>
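+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">let</span> <span class="k">rec</span> <span class="n">codegen_expr</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">...</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">lhs</span><span class="o">,</span> <span class="n">rhs</span><span class="o">)</span> <span class="o">-></span>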
+ <span class="k">begin</span> <span class="k">match</span> <span class="n">op</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="sc">'='</span> <span class="o">-></span>
+ <span class="c">(* Special case '=' because we don't want to emit the LHS as an</span>
+<span class="c"> * expression. *)</span>
+ <span class="k">let</span> <span class="n">name</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lhs</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Variable</span> <span class="n">name</span> <span class="o">-></span> <span class="n">name</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"destination of '=' must be a variable"</span><span class="o">)</span>
+ <span class="k">in</span>
+</pre></div>
+</div>
+<p>Unlike the rest of the binary operators, our assignment operator doesn’t
+follow the “emit LHS, emit RHS, do computation” model. As such, it is
+handled as a special case before the other binary operators are handled.
+The other strange thing is that it requires the LHS to be a variable. It
+is invalid to have “(x+1) = expr”; only things like “x = expr” are
+allowed.</p>
+<div class="highlight-ocaml"><div class="highlight"><pre> <span class="c">(* Codegen the rhs. *)</span>
+ <span class="k">let</span> <span class="n">val_</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">rhs</span> <span class="k">in</span>
+
+ <span class="c">(* Lookup the name. *)</span>
+ <span class="k">let</span> <span class="n">variable</span> <span class="o">=</span> <span class="k">try</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">name</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown variable name"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">val_</span> <span class="n">variable</span> <span class="n">builder</span><span class="o">);</span>
+ <span class="n">val_</span>
+<span class="o">|</span> <span class="o">_</span> <span class="o">-></span>
+ <span class="o">...</span>
+</pre></div>
+</div>
+<p>Once we have the variable, codegen’ing the assignment is
+straightforward: we emit the RHS of the assignment, create a store, and
+return the computed value. Returning a value allows for chained
+assignments like “X = (Y = Z)”.</p>
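+<p>The same “evaluate the RHS, store, return the value” shape can be tried
+out without LLVM. The sketch below is a tiny interpreter, not the tutorial’s
+code generator: the <tt>eval</tt> and <tt>lookup</tt> functions and the table
+of <tt>float ref</tt> cells (standing in for allocas) are assumptions made
+for illustration.</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>type expr =
+  | Number of float
+  | Variable of string
+  | Binary of char * expr * expr
+
+exception Error of string
+
+(* Each name maps to a mutable cell, playing the role of an alloca. *)
+let named_values : (string, float ref) Hashtbl.t = Hashtbl.create 10
+
+let lookup name =
+  try Hashtbl.find named_values name
+  with Not_found -> raise (Error "unknown variable name")
+
+let rec eval = function
+  | Number n -> n
+  | Variable name -> !(lookup name)
+  | Binary ('=', lhs, rhs) ->
+      (* The LHS is never evaluated; it only names the destination. *)
+      let name =
+        match lhs with
+        | Variable name -> name
+        | _ -> raise (Error "destination of '=' must be a variable")
+      in
+      let v = eval rhs in
+      lookup name := v;
+      (* Returning the stored value is what makes chaining work. *)
+      v
+  | Binary ('+', lhs, rhs) ->
+      let l = eval lhs in
+      let r = eval rhs in
+      l +. r
+  | Binary (_, _, _) -> raise (Error "invalid binary operator")
+
+let () =
+  List.iter (fun n -> Hashtbl.add named_values n (ref 0.0)) ["x"; "y"; "z"];
+  lookup "z" := 4.0;
+  (* x = (y = z) *)
+  let chained =
+    Binary ('=', Variable "x", Binary ('=', Variable "y", Variable "z"))
+  in
+  assert (eval chained = 4.0);
+  assert (!(lookup "x") = 4.0 && !(lookup "y") = 4.0);
+  (* (x + 1) = z is rejected because the LHS is not a variable. *)
+  let bad =
+    Binary ('=', Binary ('+', Variable "x", Number 1.0), Variable "z")
+  in
+  assert (try ignore (eval bad); false with Error _ -> true)
+</pre></div>
+</div>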
+<p>Now that we have an assignment operator, we can mutate loop variables
+and arguments. For example, we can now run code like this:</p>
+<div class="highlight-python"><pre># Function to print a double.
+extern printd(x);
+
+# Define ':' for sequencing: as a low-precedence operator that ignores operands
+# and just returns the RHS.
+def binary : 1 (x y) y;
+
+def test(x)
+ printd(x) :
+ x = 4 :
+ printd(x);
+
+test(123);</pre>
+</div>
+<p>When run, this example prints “123” and then “4”, showing that we did
+actually mutate the value! Okay, we have now officially implemented our
+goal: getting this to work requires SSA construction in the general
+case. However, to be really useful, we want the ability to define our
+own local variables. Let’s add this next!</p>
+</div>
+<div class="section" id="user-defined-local-variables">
+<h2><a class="toc-backref" href="#id8">7.7. User-defined Local Variables</a><a class="headerlink" href="#user-defined-local-variables" title="Permalink to this headline">¶</a></h2>
+<p>Adding var/in is just like any other extension we made to
+Kaleidoscope: we extend the lexer, the parser, the AST and the code
+generator. The first step for adding our new ‘var/in’ construct is to
+extend the lexer. As before, this is pretty trivial; the code looks like
+this:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">type</span> <span class="n">token</span> <span class="o">=</span>
+ <span class="o">...</span>
+ <span class="c">(* var definition *)</span>
+ <span class="o">|</span> <span class="nn">Var</span>
+
+<span class="p">...</span>
+
+<span class="n">and</span> <span class="n">lex_ident</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">...</span>
+ <span class="o">|</span> <span class="s2">"in"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">In</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"binary"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Binary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"unary"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Unary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"var"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Var</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">...</span>
+</pre></div>
+</div>
+<p>The next step is to define the AST node that we will construct. For
+var/in, it looks like this:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">type</span> <span class="n">expr</span> <span class="o">=</span>
+ <span class="o">...</span>
+ <span class="c">(* variant for var/in. *)</span>
+ <span class="o">|</span> <span class="nc">Var</span> <span class="k">of</span> <span class="o">(</span><span class="kt">string</span> <span class="o">*</span> <span class="n">expr</span> <span class="n">option</span><span class="o">)</span> <span class="kt">array</span> <span class="o">*</span> <span class="n">expr</span>
+ <span class="o">...</span>
+</pre></div>
+</div>
+<p>var/in allows a list of names to be defined all at once, and each name
+can optionally have an initializer value. As such, we capture this
+information in an array of (name, optional initializer) pairs. var/in
+also has a body, which is allowed to access the variables defined by
+the var/in.</p>
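+<p>As a concrete illustration, here is how a small var/in expression could
+be written down as a value of this type. The <tt>expr</tt> type below is a
+cut-down copy containing only the variants the example needs, and the
+<tt>example</tt> binding is made up for this sketch:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>(* Cut-down copy of the AST type, for illustration only. *)
+type expr =
+  | Number of float
+  | Variable of string
+  | Binary of char * expr * expr
+  | Var of (string * expr option) array * expr
+
+(* AST for:  var a = 1, b in a + b  *)
+let example =
+  Var ([| ("a", Some (Number 1.0)); ("b", None) |],
+       Binary ('+', Variable "a", Variable "b"))
+
+let () =
+  match example with
+  | Var (names, _body) ->
+      Array.iter (fun (name, init) ->
+        Printf.printf "%s: %s\n" name
+          (match init with
+           | Some _ -> "has an initializer"
+           | None -> "no initializer")
+      ) names
+  | _ -> ()
+</pre></div>
+</div>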
+<p>With this in place, we can define the parser pieces. The first thing we
+do is add it as a primary expression:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* primary</span>
+<span class="c"> * ::= identifier</span>
+<span class="c"> * ::= numberexpr</span>
+<span class="c"> * ::= parenexpr</span>
+<span class="c"> * ::= ifexpr</span>
+<span class="c"> * ::= forexpr</span>
+<span class="c"> * ::= varexpr *)</span>
+<span class="k">let</span> <span class="k">rec</span> <span class="n">parse_primary</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">...</span>
+ <span class="c">(* varexpr</span>
+<span class="c"> * ::= 'var' identifier ('=' expression)?</span>
+<span class="c"> * (',' identifier ('=' expression)?)* 'in' expression *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Var</span><span class="o">;</span>
+ <span class="c">(* At least one variable name is required. *)</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span> <span class="o">??</span> <span class="s2">"expected identifier after var"</span><span class="o">;</span>
+ <span class="n">init</span><span class="o">=</span><span class="n">parse_var_init</span><span class="o">;</span>
+ <span class="n">var_names</span><span class="o">=</span><span class="n">parse_var_names</span> <span class="o">[(</span><span class="n">id</span><span class="o">,</span> <span class="n">init</span><span class="o">)];</span>
+ <span class="c">(* At this point, we have to have 'in'. *)</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">In</span> <span class="o">??</span> <span class="s2">"expected 'in' keyword after 'var'"</span><span class="o">;</span>
+ <span class="n">body</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Var</span> <span class="o">(</span><span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">var_names</span><span class="o">),</span> <span class="n">body</span><span class="o">)</span>
+
+<span class="o">...</span>
+
+<span class="ow">and</span> <span class="n">parse_var_init</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* read in the optional initializer. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'='</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span> <span class="nc">Some</span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="nc">None</span>
+
+<span class="ow">and</span> <span class="n">parse_var_names</span> <span class="n">accumulator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">','</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span> <span class="o">??</span> <span class="s2">"expected identifier list after var"</span><span class="o">;</span>
+ <span class="n">init</span><span class="o">=</span><span class="n">parse_var_init</span><span class="o">;</span>
+ <span class="n">e</span><span class="o">=</span><span class="n">parse_var_names</span> <span class="o">((</span><span class="n">id</span><span class="o">,</span> <span class="n">init</span><span class="o">)</span> <span class="o">::</span> <span class="n">accumulator</span><span class="o">)</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">accumulator</span>
+</pre></div>
+</div>
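+<p>The accumulator pattern used by <tt>parse_var_names</tt> can be seen in
+isolation in the sketch below. It swaps the camlp4 stream parser for plain
+pattern matching on a token list so that it runs without a preprocessor. The
+token type is trimmed down, and <tt>parse_expr</tt> here is a stand-in that
+accepts only a single number; both are assumptions of this sketch.</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>type token = Var | In | Ident of string | Number of float | Kwd of char
+
+exception Parse_error of string
+
+(* Stand-in for the real parse_expr: accepts a single number only. *)
+let parse_expr = function
+  | Number n :: rest -> (n, rest)
+  | _ -> raise (Parse_error "expected expression")
+
+(* Read in the optional initializer. *)
+let parse_var_init = function
+  | Kwd '=' :: rest ->
+      let e, rest = parse_expr rest in
+      (Some e, rest)
+  | tokens -> (None, tokens)
+
+(* Zero or more ", identifier [= expr]" items, accumulated in reverse. *)
+let rec parse_var_names accumulator = function
+  | Kwd ',' :: Ident id :: rest ->
+      let init, rest = parse_var_init rest in
+      parse_var_names ((id, init) :: accumulator) rest
+  | Kwd ',' :: _ -> raise (Parse_error "expected identifier list after var")
+  | tokens -> (accumulator, tokens)
+
+let parse_var = function
+  | Var :: Ident id :: rest ->
+      let init, rest = parse_var_init rest in
+      let names, rest = parse_var_names [ (id, init) ] rest in
+      (match rest with
+       | In :: rest ->
+           let body, rest = parse_expr rest in
+           (* The list was built back to front, so reverse it once. *)
+           (Array.of_list (List.rev names), body, rest)
+       | _ -> raise (Parse_error "expected 'in' keyword after 'var'"))
+  | _ -> raise (Parse_error "expected identifier after var")
+
+let () =
+  (* Tokens for:  var a = 1, b in 2  *)
+  let tokens =
+    [ Var; Ident "a"; Kwd '='; Number 1.0; Kwd ','; Ident "b"; In; Number 2.0 ]
+  in
+  let names, body, rest = parse_var tokens in
+  assert (names = [| ("a", Some 1.0); ("b", None) |]);
+  assert (body = 2.0 && rest = [])
+</pre></div>
+</div>
+<p>Consing onto the accumulator and reversing once at the end keeps the
+names in source order, which is why the real parser calls
+<tt>List.rev</tt> before <tt>Array.of_list</tt>.</p>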
+<p>Now that we can parse and represent the code, we need to support
+emission of LLVM IR for it. This code starts out with:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="k">let</span> <span class="k">rec</span> <span class="n">codegen_expr</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">...</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Var</span> <span class="o">(</span><span class="n">var_names</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">old_bindings</span> <span class="o">=</span> <span class="n">ref</span> <span class="bp">[]</span> <span class="k">in</span>
+
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">block_parent</span> <span class="o">(</span><span class="n">insertion_block</span> <span class="n">builder</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Register all variables and emit their initializer. *)</span>
+ <span class="nn">Array</span><span class="p">.</span><span class="n">iter</span> <span class="o">(</span><span class="k">fun</span> <span class="o">(</span><span class="n">var_name</span><span class="o">,</span> <span class="n">init</span><span class="o">)</span> <span class="o">-></span>
+</pre></div>
+</div>
+<p>Basically it loops over all the variables, installing them one at a
+time. For each variable we put into the symbol table, we remember the
+previous value that we replace in old_bindings.</p>
+<div class="highlight-ocaml"><div class="highlight"><pre> <span class="c">(* Emit the initializer before adding the variable to scope, this</span>
+<span class="c"> * prevents the initializer from referencing the variable itself, and</span>
+<span class="c"> * permits stuff like this:</span>
+<span class="c"> * var a = 1 in</span>
+<span class="c"> * var a = a in ... # refers to outer 'a'. *)</span>
+ <span class="k">let</span> <span class="n">init_val</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">init</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">init</span> <span class="o">-></span> <span class="n">codegen_expr</span> <span class="n">init</span>
+ <span class="c">(* If not specified, use 0.0. *)</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span>
+ <span class="k">in</span>
+
+ <span class="k">let</span> <span class="n">alloca</span> <span class="o">=</span> <span class="n">create_entry_block_alloca</span> <span class="n">the_function</span> <span class="n">var_name</span> <span class="k">in</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">init_val</span> <span class="n">alloca</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Remember the old variable binding so that we can restore the binding</span>
+<span class="c"> * when we unrecurse. *)</span>
+
+ <span class="k">begin</span>
+ <span class="k">try</span>
+ <span class="k">let</span> <span class="n">old_value</span> <span class="o">=</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="k">in</span>
+ <span class="n">old_bindings</span> <span class="o">:=</span> <span class="o">(</span><span class="n">var_name</span><span class="o">,</span> <span class="n">old_value</span><span class="o">)</span> <span class="o">::</span> <span class="o">!</span><span class="n">old_bindings</span><span class="o">;</span>
+ <span class="k">with</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="bp">()</span>
+ <span class="k">end</span><span class="o">;</span>
+
+ <span class="c">(* Remember this binding. *)</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">alloca</span><span class="o">;</span>
+<span class="o">)</span> <span class="n">var_names</span><span class="o">;</span>
+</pre></div>
+</div>
+<p>There are more comments here than code. The basic idea is that we emit
+the initializer, create the alloca, then update the symbol table to
+point to it. Once all the variables are installed in the symbol table,
+we evaluate the body of the var/in expression:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* Codegen the body, now that all vars are in scope. *)</span>
+<span class="k">let</span> <span class="n">body_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">body</span> <span class="k">in</span>
+</pre></div>
+</div>
+<p>Finally, before returning, we restore the previous variable bindings:</p>
+<div class="highlight-ocaml"><div class="highlight"><pre><span class="c">(* Pop all our variables from scope. *)</span>
+<span class="nn">List</span><span class="p">.</span><span class="n">iter</span> <span class="o">(</span><span class="k">fun</span> <span class="o">(</span><span class="n">var_name</span><span class="o">,</span> <span class="n">old_value</span><span class="o">)</span> <span class="o">-></span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">old_value</span>
+<span class="o">)</span> <span class="o">!</span><span class="n">old_bindings</span><span class="o">;</span>
+
+<span class="c">(* Return the body computation. *)</span>
+<span class="n">body_val</span>
+</pre></div>
+</div>
+<p>The end result of all of this is that we get properly scoped variable
+definitions, and we even (trivially) allow mutation of them :).</p>
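+<p>The save-and-restore dance is easier to see with the LLVM parts removed.
+In the sketch below, floats stand in for the alloca values, and
+<tt>with_bindings</tt> is a helper invented for illustration; the tutorial
+inlines this logic in <tt>codegen_expr</tt> instead.</p>
+<div class="highlight-ocaml"><div class="highlight"><pre>(* Stand-in symbol table; floats play the role of the alloca values. *)
+let named_values : (string, float) Hashtbl.t = Hashtbl.create 10
+
+(* Bind each (name, value) pair, run the body, then restore whatever the
+ * new bindings shadowed. *)
+let with_bindings bindings body =
+  let old_bindings = ref [] in
+  List.iter (fun (name, value) ->
+    (* Remember the binding we are about to shadow, if there is one. *)
+    (try
+       let old_value = Hashtbl.find named_values name in
+       old_bindings := (name, old_value) :: !old_bindings
+     with Not_found -> ());
+    Hashtbl.add named_values name value
+  ) bindings;
+  let result = body () in
+  (* Make the shadowed bindings visible again. *)
+  List.iter (fun (name, old_value) ->
+    Hashtbl.add named_values name old_value
+  ) !old_bindings;
+  result
+
+let () =
+  Hashtbl.add named_values "a" 1.0;
+  let inner =
+    with_bindings [ ("a", 2.0) ] (fun () -> Hashtbl.find named_values "a")
+  in
+  (* Inside the body the inner 'a' wins ... *)
+  assert (inner = 2.0);
+  (* ... and afterwards the outer 'a' is visible again. *)
+  assert (Hashtbl.find named_values "a" = 1.0)
+</pre></div>
+</div>
+<p>This works because <tt>Hashtbl.add</tt> hides an existing binding for the
+same key and <tt>Hashtbl.find</tt> returns the most recently added one.</p>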
+<p>With this, we have completed what we set out to do. Our nice iterative fib
+example from the intro compiles and runs just fine. The mem2reg pass
+optimizes all of our stack variables into SSA registers, inserting PHI
+nodes where needed, and our front-end remains simple: no “iterated
+dominance frontier” computation anywhere in sight.</p>
+</div>
+<div class="section" id="id1">
+<h2><a class="toc-backref" href="#id9">7.8. Full Code Listing</a><a class="headerlink" href="#id1" title="Permalink to this headline">¶</a></h2>
+<p>Here is the complete code listing for our running example, enhanced with
+mutable variables and var/in support. To build this example, use:</p>
+<div class="highlight-bash"><div class="highlight"><pre><span class="c"># Compile</span>
+ocamlbuild toy.byte
+<span class="c"># Run</span>
+./toy.byte
+</pre></div>
+</div>
+<p>Here is the code:</p>
+<dl class="docutils">
+<dt>_tags:</dt>
+<dd><div class="first last highlight-python"><pre><{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
+<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
+<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
+<*.{byte,native}>: use_llvm_scalar_opts, use_bindings</pre>
+</div>
+</dd>
+<dt>myocamlbuild.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="k">open</span> <span class="nc">Ocamlbuild_plugin</span><span class="o">;;</span>
+
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm"</span><span class="o">;;</span>
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm_analysis"</span><span class="o">;;</span>
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm_executionengine"</span><span class="o">;;</span>
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm_target"</span><span class="o">;;</span>
+<span class="n">ocaml_lib</span> <span class="o">~</span><span class="n">extern</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"llvm_scalar_opts"</span><span class="o">;;</span>
+
+<span class="n">flag</span> <span class="o">[</span><span class="s2">"link"</span><span class="o">;</span> <span class="s2">"ocaml"</span><span class="o">;</span> <span class="s2">"g++"</span><span class="o">]</span> <span class="o">(</span><span class="nc">S</span><span class="o">[</span><span class="nc">A</span><span class="s2">"-cc"</span><span class="o">;</span> <span class="nc">A</span><span class="s2">"g++"</span><span class="o">;</span> <span class="nc">A</span><span class="s2">"-cclib"</span><span class="o">;</span> <span class="nc">A</span><span class="s2">"-rdynamic"</span><span class="o">]);;</span>
+<span class="n">dep</span> <span class="o">[</span><span class="s2">"link"</span><span class="o">;</span> <span class="s2">"ocaml"</span><span class="o">;</span> <span class="s2">"use_bindings"</span><span class="o">]</span> <span class="o">[</span><span class="s2">"bindings.o"</span><span class="o">];;</span>
+</pre></div>
+</div>
+</dd>
+<dt>token.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Lexer Tokens</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="c">(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of</span>
+<span class="c"> * these others for known things. *)</span>
+<span class="k">type</span> <span class="n">token</span> <span class="o">=</span>
+ <span class="c">(* commands *)</span>
+ <span class="o">|</span> <span class="nc">Def</span> <span class="o">|</span> <span class="nc">Extern</span>
+
+ <span class="c">(* primary *)</span>
+ <span class="o">|</span> <span class="nc">Ident</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">|</span> <span class="nc">Number</span> <span class="k">of</span> <span class="kt">float</span>
+
+ <span class="c">(* unknown *)</span>
+ <span class="o">|</span> <span class="nc">Kwd</span> <span class="k">of</span> <span class="kt">char</span>
+
+ <span class="c">(* control *)</span>
+ <span class="o">|</span> <span class="nc">If</span> <span class="o">|</span> <span class="nc">Then</span> <span class="o">|</span> <span class="nc">Else</span>
+ <span class="o">|</span> <span class="nc">For</span> <span class="o">|</span> <span class="nc">In</span>
+
+ <span class="c">(* operators *)</span>
+ <span class="o">|</span> <span class="nc">Binary</span> <span class="o">|</span> <span class="nc">Unary</span>
+
+ <span class="c">(* var definition *)</span>
+ <span class="o">|</span> <span class="nc">Var</span>
+</pre></div>
+</div>
+</dd>
+<dt>lexer.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Lexer</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="k">let</span> <span class="k">rec</span> <span class="n">lex</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* Skip any whitespace. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">' '</span> <span class="o">|</span> <span class="sc">'\n'</span> <span class="o">|</span> <span class="sc">'\r'</span> <span class="o">|</span> <span class="sc">'\t'</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">lex</span> <span class="n">stream</span>
+
+ <span class="c">(* identifier: [a-zA-Z][a-zA-Z0-9]* *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'A'</span> <span class="o">..</span> <span class="sc">'Z'</span> <span class="o">|</span> <span class="sc">'a'</span> <span class="o">..</span> <span class="sc">'z'</span> <span class="k">as</span> <span class="n">c</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">buffer</span> <span class="o">=</span> <span class="nn">Buffer</span><span class="p">.</span><span class="n">create</span> <span class="mi">1</span> <span class="k">in</span>
+ <span class="nn">Buffer</span><span class="p">.</span><span class="n">add_char</span> <span class="n">buffer</span> <span class="n">c</span><span class="o">;</span>
+ <span class="n">lex_ident</span> <span class="n">buffer</span> <span class="n">stream</span>
+
+ <span class="c">(* number: [0-9.]+ *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'0'</span> <span class="o">..</span> <span class="sc">'9'</span> <span class="k">as</span> <span class="n">c</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">buffer</span> <span class="o">=</span> <span class="nn">Buffer</span><span class="p">.</span><span class="n">create</span> <span class="mi">1</span> <span class="k">in</span>
+ <span class="nn">Buffer</span><span class="p">.</span><span class="n">add_char</span> <span class="n">buffer</span> <span class="n">c</span><span class="o">;</span>
+ <span class="n">lex_number</span> <span class="n">buffer</span> <span class="n">stream</span>
+
+ <span class="c">(* Comment until end of line. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'#'</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="n">lex_comment</span> <span class="n">stream</span>
+
+ <span class="c">(* Otherwise, just return the character as its ascii value. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="n">c</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">c</span><span class="o">;</span> <span class="n">lex</span> <span class="n">stream</span> <span class="o">>]</span>
+
+ <span class="c">(* end of stream. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="o">[<</span> <span class="o">>]</span>
+
+<span class="ow">and</span> <span class="n">lex_number</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'0'</span> <span class="o">..</span> <span class="sc">'9'</span> <span class="o">|</span> <span class="sc">'.'</span> <span class="k">as</span> <span class="n">c</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Buffer</span><span class="p">.</span><span class="n">add_char</span> <span class="n">buffer</span> <span class="n">c</span><span class="o">;</span>
+ <span class="n">lex_number</span> <span class="n">buffer</span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">stream</span><span class="o">=</span><span class="n">lex</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Number</span> <span class="o">(</span><span class="n">float_of_string</span> <span class="o">(</span><span class="nn">Buffer</span><span class="p">.</span><span class="n">contents</span> <span class="n">buffer</span><span class="o">));</span> <span class="n">stream</span> <span class="o">>]</span>
+
+<span class="ow">and</span> <span class="n">lex_ident</span> <span class="n">buffer</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'A'</span> <span class="o">..</span> <span class="sc">'Z'</span> <span class="o">|</span> <span class="sc">'a'</span> <span class="o">..</span> <span class="sc">'z'</span> <span class="o">|</span> <span class="sc">'0'</span> <span class="o">..</span> <span class="sc">'9'</span> <span class="k">as</span> <span class="n">c</span><span class="o">);</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Buffer</span><span class="p">.</span><span class="n">add_char</span> <span class="n">buffer</span> <span class="n">c</span><span class="o">;</span>
+ <span class="n">lex_ident</span> <span class="n">buffer</span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">stream</span><span class="o">=</span><span class="n">lex</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">match</span> <span class="nn">Buffer</span><span class="p">.</span><span class="n">contents</span> <span class="n">buffer</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="s2">"def"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Def</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"extern"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Extern</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"if"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">If</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"then"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Then</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"else"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Else</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"for"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">For</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"in"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">In</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"binary"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Binary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"unary"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Unary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="s2">"var"</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Var</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+ <span class="o">|</span> <span class="n">id</span> <span class="o">-></span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span>
+
+<span class="ow">and</span> <span class="n">lex_comment</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span> <span class="o">(</span><span class="sc">'\n'</span><span class="o">);</span> <span class="n">stream</span><span class="o">=</span><span class="n">lex</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="n">c</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">lex_comment</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="o">[<</span> <span class="o">>]</span>
+</pre></div>
+</div>
+</dd>
+<dt>ast.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Abstract Syntax Tree (aka Parse Tree)</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="c">(* expr - Base type for all expression nodes. *)</span>
+<span class="k">type</span> <span class="n">expr</span> <span class="o">=</span>
+ <span class="c">(* variant for numeric literals like "1.0". *)</span>
+ <span class="o">|</span> <span class="nc">Number</span> <span class="k">of</span> <span class="kt">float</span>
+
+ <span class="c">(* variant for referencing a variable, like "a". *)</span>
+ <span class="o">|</span> <span class="nc">Variable</span> <span class="k">of</span> <span class="kt">string</span>
+
+ <span class="c">(* variant for a unary operator. *)</span>
+ <span class="o">|</span> <span class="nc">Unary</span> <span class="k">of</span> <span class="kt">char</span> <span class="o">*</span> <span class="n">expr</span>
+
+ <span class="c">(* variant for a binary operator. *)</span>
+ <span class="o">|</span> <span class="nc">Binary</span> <span class="k">of</span> <span class="kt">char</span> <span class="o">*</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span>
+
+ <span class="c">(* variant for function calls. *)</span>
+ <span class="o">|</span> <span class="nc">Call</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="n">expr</span> <span class="kt">array</span>
+
+ <span class="c">(* variant for if/then/else. *)</span>
+ <span class="o">|</span> <span class="nc">If</span> <span class="k">of</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span>
+
+ <span class="c">(* variant for for/in. *)</span>
+ <span class="o">|</span> <span class="nc">For</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span> <span class="o">*</span> <span class="n">expr</span> <span class="n">option</span> <span class="o">*</span> <span class="n">expr</span>
+
+ <span class="c">(* variant for var/in. *)</span>
+ <span class="o">|</span> <span class="nc">Var</span> <span class="k">of</span> <span class="o">(</span><span class="kt">string</span> <span class="o">*</span> <span class="n">expr</span> <span class="n">option</span><span class="o">)</span> <span class="kt">array</span> <span class="o">*</span> <span class="n">expr</span>
+
+<span class="c">(* proto - This type represents the "prototype" for a function, which captures</span>
+<span class="c"> * its name, and its argument names (thus implicitly the number of arguments the</span>
+<span class="c"> * function takes). *)</span>
+<span class="k">type</span> <span class="n">proto</span> <span class="o">=</span>
+ <span class="o">|</span> <span class="nc">Prototype</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="kt">string</span> <span class="kt">array</span>
+ <span class="o">|</span> <span class="nc">BinOpPrototype</span> <span class="k">of</span> <span class="kt">string</span> <span class="o">*</span> <span class="kt">string</span> <span class="kt">array</span> <span class="o">*</span> <span class="kt">int</span>
+
+<span class="c">(* func - This type represents a function definition itself. *)</span>
+<span class="k">type</span> <span class="n">func</span> <span class="o">=</span> <span class="nc">Function</span> <span class="k">of</span> <span class="n">proto</span> <span class="o">*</span> <span class="n">expr</span>
+</pre></div>
+</div>
+</dd>
+<dt>parser.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===---------------------------------------------------------------------===</span>
+<span class="c"> * Parser</span>
+<span class="c"> *===---------------------------------------------------------------------===*)</span>
+
+<span class="c">(* binop_precedence - This holds the precedence for each binary operator that is</span>
+<span class="c"> * defined. *)</span>
+<span class="k">let</span> <span class="n">binop_precedence</span><span class="o">:(</span><span class="kt">char</span><span class="o">,</span> <span class="kt">int</span><span class="o">)</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">t</span> <span class="o">=</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">create</span> <span class="mi">10</span>
+
+<span class="c">(* precedence - Get the precedence of the pending binary operator token. *)</span>
+<span class="k">let</span> <span class="n">precedence</span> <span class="n">c</span> <span class="o">=</span> <span class="k">try</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">binop_precedence</span> <span class="n">c</span> <span class="k">with</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="o">-</span><span class="mi">1</span>
+
+<span class="c">(* primary</span>
+<span class="c"> * ::= identifierexpr</span>
+<span class="c"> * ::= numberexpr</span>
+<span class="c"> * ::= parenexpr</span>
+<span class="c"> * ::= ifexpr</span>
+<span class="c"> * ::= forexpr</span>
+<span class="c"> * ::= varexpr *)</span>
+<span class="k">let</span> <span class="k">rec</span> <span class="n">parse_primary</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* numberexpr ::= number *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span> <span class="o">>]</span> <span class="o">-></span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span>
+
+ <span class="c">(* parenexpr ::= '(' expression ')' *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')'"</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+
+ <span class="c">(* identifierexpr</span>
+<span class="c"> * ::= identifier</span>
+<span class="c"> * ::= identifier '(' argumentexpr ')' *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="k">rec</span> <span class="n">parse_args</span> <span class="n">accumulator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">begin</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">','</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_args</span> <span class="o">(</span><span class="n">e</span> <span class="o">::</span> <span class="n">accumulator</span><span class="o">)</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span> <span class="o">::</span> <span class="n">accumulator</span>
+ <span class="k">end</span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">accumulator</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="k">rec</span> <span class="n">parse_ident</span> <span class="n">id</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* Call. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')'"</span><span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Call</span> <span class="o">(</span><span class="n">id</span><span class="o">,</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">))</span>
+
+ <span class="c">(* Simple variable ref. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Variable</span> <span class="n">id</span>
+ <span class="k">in</span>
+ <span class="n">parse_ident</span> <span class="n">id</span> <span class="n">stream</span>
+
+ <span class="c">(* ifexpr ::= 'if' expr 'then' expr 'else' expr *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">If</span><span class="o">;</span> <span class="n">c</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Then</span> <span class="o">??</span> <span class="s2">"expected 'then'"</span><span class="o">;</span> <span class="n">t</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Else</span> <span class="o">??</span> <span class="s2">"expected 'else'"</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">If</span> <span class="o">(</span><span class="n">c</span><span class="o">,</span> <span class="n">t</span><span class="o">,</span> <span class="n">e</span><span class="o">)</span>
+
+ <span class="c">(* forexpr</span>
+<span class="c"> ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">For</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span> <span class="o">??</span> <span class="s2">"expected identifier after for"</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'='</span> <span class="o">??</span> <span class="s2">"expected '=' after for"</span><span class="o">;</span>
+ <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">begin</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span>
+ <span class="n">start</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">','</span> <span class="o">??</span> <span class="s2">"expected ',' after for"</span><span class="o">;</span>
+ <span class="n">end_</span><span class="o">=</span><span class="n">parse_expr</span><span class="o">;</span>
+ <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">step</span> <span class="o">=</span>
+ <span class="k">begin</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">','</span><span class="o">;</span> <span class="n">step</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span> <span class="nc">Some</span> <span class="n">step</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="nc">None</span>
+ <span class="k">end</span> <span class="n">stream</span>
+ <span class="k">in</span>
+ <span class="k">begin</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">In</span><span class="o">;</span> <span class="n">body</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">For</span> <span class="o">(</span><span class="n">id</span><span class="o">,</span> <span class="n">start</span><span class="o">,</span> <span class="n">end_</span><span class="o">,</span> <span class="n">step</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"expected 'in' after for"</span><span class="o">)</span>
+ <span class="k">end</span> <span class="n">stream</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"expected '=' after for"</span><span class="o">)</span>
+ <span class="k">end</span> <span class="n">stream</span>
+
+ <span class="c">(* varexpr</span>
+<span class="c"> * ::= 'var' identifier ('=' expression)?</span>
+<span class="c"> * (',' identifier ('=' expression)?)* 'in' expression *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Var</span><span class="o">;</span>
+ <span class="c">(* At least one variable name is required. *)</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span> <span class="o">??</span> <span class="s2">"expected identifier after var"</span><span class="o">;</span>
+ <span class="n">init</span><span class="o">=</span><span class="n">parse_var_init</span><span class="o">;</span>
+ <span class="n">var_names</span><span class="o">=</span><span class="n">parse_var_names</span> <span class="o">[(</span><span class="n">id</span><span class="o">,</span> <span class="n">init</span><span class="o">)];</span>
+ <span class="c">(* At this point, we have to have 'in'. *)</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">In</span> <span class="o">??</span> <span class="s2">"expected 'in' keyword after 'var'"</span><span class="o">;</span>
+ <span class="n">body</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Var</span> <span class="o">(</span><span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">var_names</span><span class="o">),</span> <span class="n">body</span><span class="o">)</span>
+
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"unknown token when expecting an expression."</span><span class="o">)</span>
+
+<span class="c">(* unary</span>
+<span class="c"> * ::= primary</span>
+<span class="c"> * ::= '!' unary *)</span>
+<span class="ow">and</span> <span class="n">parse_unary</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* If this is a unary operator, read it. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">op</span> <span class="k">when</span> <span class="n">op</span> <span class="o">!=</span> <span class="sc">'('</span> <span class="o">&&</span> <span class="n">op</span> <span class="o">!=</span> <span class="sc">')'</span><span class="o">;</span> <span class="n">operand</span><span class="o">=</span><span class="n">parse_unary</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">operand</span><span class="o">)</span>
+
+ <span class="c">(* If the current token is not an operator, it must be a primary expr. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">parse_primary</span> <span class="n">stream</span>
+
+<span class="c">(* binoprhs</span>
+<span class="c"> * ::= ('+' unary)* *)</span>
+<span class="ow">and</span> <span class="n">parse_bin_rhs</span> <span class="n">expr_prec</span> <span class="n">lhs</span> <span class="n">stream</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="nn">Stream</span><span class="p">.</span><span class="n">peek</span> <span class="n">stream</span> <span class="k">with</span>
+ <span class="c">(* If this is a binop, find its precedence. *)</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="o">(</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">c</span><span class="o">)</span> <span class="k">when</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">mem</span> <span class="n">binop_precedence</span> <span class="n">c</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">token_prec</span> <span class="o">=</span> <span class="n">precedence</span> <span class="n">c</span> <span class="k">in</span>
+
+ <span class="c">(* If this is a binop that binds at least as tightly as the current binop,</span>
+<span class="c"> * consume it, otherwise we are done. *)</span>
+ <span class="k">if</span> <span class="n">token_prec</span> <span class="o"><</span> <span class="n">expr_prec</span> <span class="k">then</span> <span class="n">lhs</span> <span class="k">else</span> <span class="k">begin</span>
+ <span class="c">(* Eat the binop. *)</span>
+ <span class="nn">Stream</span><span class="p">.</span><span class="n">junk</span> <span class="n">stream</span><span class="o">;</span>
+
+ <span class="c">(* Parse the unary expression after the binary operator. *)</span>
+ <span class="k">let</span> <span class="n">rhs</span> <span class="o">=</span> <span class="n">parse_unary</span> <span class="n">stream</span> <span class="k">in</span>
+
+ <span class="c">(* Okay, we know this is a binop. *)</span>
+ <span class="k">let</span> <span class="n">rhs</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="nn">Stream</span><span class="p">.</span><span class="n">peek</span> <span class="n">stream</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="o">(</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">c2</span><span class="o">)</span> <span class="o">-></span>
+ <span class="c">(* If BinOp binds less tightly with rhs than the operator after</span>
+<span class="c"> * rhs, let the pending operator take rhs as its lhs. *)</span>
+ <span class="k">let</span> <span class="n">next_prec</span> <span class="o">=</span> <span class="n">precedence</span> <span class="n">c2</span> <span class="k">in</span>
+ <span class="k">if</span> <span class="n">token_prec</span> <span class="o"><</span> <span class="n">next_prec</span>
+ <span class="k">then</span> <span class="n">parse_bin_rhs</span> <span class="o">(</span><span class="n">token_prec</span> <span class="o">+</span> <span class="mi">1</span><span class="o">)</span> <span class="n">rhs</span> <span class="n">stream</span>
+ <span class="k">else</span> <span class="n">rhs</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span> <span class="n">rhs</span>
+ <span class="k">in</span>
+
+ <span class="c">(* Merge lhs/rhs. *)</span>
+ <span class="k">let</span> <span class="n">lhs</span> <span class="o">=</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">(</span><span class="n">c</span><span class="o">,</span> <span class="n">lhs</span><span class="o">,</span> <span class="n">rhs</span><span class="o">)</span> <span class="k">in</span>
+ <span class="n">parse_bin_rhs</span> <span class="n">expr_prec</span> <span class="n">lhs</span> <span class="n">stream</span>
+ <span class="k">end</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span> <span class="n">lhs</span>
+
+<span class="ow">and</span> <span class="n">parse_var_init</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="c">(* read in the optional initializer. *)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'='</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span> <span class="nc">Some</span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="nc">None</span>
+
+<span class="ow">and</span> <span class="n">parse_var_names</span> <span class="n">accumulator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">','</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span> <span class="o">??</span> <span class="s2">"expected identifier list after var"</span><span class="o">;</span>
+ <span class="n">init</span><span class="o">=</span><span class="n">parse_var_init</span><span class="o">;</span>
+ <span class="n">e</span><span class="o">=</span><span class="n">parse_var_names</span> <span class="o">((</span><span class="n">id</span><span class="o">,</span> <span class="n">init</span><span class="o">)</span> <span class="o">::</span> <span class="n">accumulator</span><span class="o">)</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">accumulator</span>
+
+<span class="c">(* expression</span>
+<span class="c"> * ::= unary binoprhs *)</span>
+<span class="ow">and</span> <span class="n">parse_expr</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">lhs</span><span class="o">=</span><span class="n">parse_unary</span><span class="o">;</span> <span class="n">stream</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">parse_bin_rhs</span> <span class="mi">0</span> <span class="n">lhs</span> <span class="n">stream</span>
+
+<span class="c">(* prototype</span>
+<span class="c"> * ::= id '(' id* ')'</span>
+<span class="c"> * ::= binary LETTER number? (id, id)</span>
+<span class="c"> * ::= unary LETTER number? (id) *)</span>
+<span class="k">let</span> <span class="n">parse_prototype</span> <span class="o">=</span>
+ <span class="k">let</span> <span class="k">rec</span> <span class="n">parse_args</span> <span class="n">accumulator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_args</span> <span class="o">(</span><span class="n">id</span><span class="o">::</span><span class="n">accumulator</span><span class="o">)</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">accumulator</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">parse_operator</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">>]</span> <span class="o">-></span> <span class="s2">"unary"</span><span class="o">,</span> <span class="mi">1</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">>]</span> <span class="o">-></span> <span class="s2">"binary"</span><span class="o">,</span> <span class="mi">2</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">parse_binary_precedence</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">int_of_float</span> <span class="n">n</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span> <span class="mi">30</span>
+ <span class="k">in</span>
+ <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Ident</span> <span class="n">id</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span> <span class="o">??</span> <span class="s2">"expected '(' in prototype"</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')' in prototype"</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="c">(* success. *)</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">id</span><span class="o">,</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">))</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">(</span><span class="n">prefix</span><span class="o">,</span> <span class="n">kind</span><span class="o">)=</span><span class="n">parse_operator</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="n">op</span> <span class="o">??</span> <span class="s2">"expected an operator"</span><span class="o">;</span>
+ <span class="c">(* Read the precedence if present. *)</span>
+ <span class="n">binary_precedence</span><span class="o">=</span><span class="n">parse_binary_precedence</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">'('</span> <span class="o">??</span> <span class="s2">"expected '(' in prototype"</span><span class="o">;</span>
+ <span class="n">args</span><span class="o">=</span><span class="n">parse_args</span> <span class="bp">[]</span><span class="o">;</span>
+ <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">')'</span> <span class="o">??</span> <span class="s2">"expected ')' in prototype"</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">name</span> <span class="o">=</span> <span class="n">prefix</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">args</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="o">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">args</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Verify right number of arguments for operator. *)</span>
+ <span class="k">if</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">args</span> <span class="o">!=</span> <span class="n">kind</span>
+ <span class="k">then</span> <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"invalid number of operands for operator"</span><span class="o">)</span>
+ <span class="k">else</span>
+ <span class="k">if</span> <span class="n">kind</span> <span class="o">==</span> <span class="mi">1</span> <span class="k">then</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">)</span>
+ <span class="k">else</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">,</span> <span class="n">binary_precedence</span><span class="o">)</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="s2">"expected function name in prototype"</span><span class="o">)</span>
+
+<span class="c">(* definition ::= 'def' prototype expression *)</span>
+<span class="k">let</span> <span class="n">parse_definition</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Def</span><span class="o">;</span> <span class="n">p</span><span class="o">=</span><span class="n">parse_prototype</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Function</span> <span class="o">(</span><span class="n">p</span><span class="o">,</span> <span class="n">e</span><span class="o">)</span>
+
+<span class="c">(* toplevelexpr ::= expression *)</span>
+<span class="k">let</span> <span class="n">parse_toplevel</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_expr</span> <span class="o">>]</span> <span class="o">-></span>
+ <span class="c">(* Make an anonymous proto. *)</span>
+ <span class="nn">Ast</span><span class="p">.</span><span class="nc">Function</span> <span class="o">(</span><span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="s2">""</span><span class="o">,</span> <span class="o">[||]),</span> <span class="n">e</span><span class="o">)</span>
+
+<span class="c">(* external ::= 'extern' prototype *)</span>
+<span class="k">let</span> <span class="n">parse_extern</span> <span class="o">=</span> <span class="n">parser</span>
+ <span class="o">|</span> <span class="o">[<</span> <span class="k">'</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Extern</span><span class="o">;</span> <span class="n">e</span><span class="o">=</span><span class="n">parse_prototype</span> <span class="o">>]</span> <span class="o">-></span> <span class="n">e</span>
+</pre></div>
+</div>
+</dd>
+<dt>codegen.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Code Generation</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="k">open</span> <span class="nc">Llvm</span>
+
+<span class="k">exception</span> <span class="nc">Error</span> <span class="k">of</span> <span class="kt">string</span>
+
+<span class="k">let</span> <span class="n">context</span> <span class="o">=</span> <span class="n">global_context</span> <span class="bp">()</span>
+<span class="k">let</span> <span class="n">the_module</span> <span class="o">=</span> <span class="n">create_module</span> <span class="n">context</span> <span class="s2">"my cool jit"</span>
+<span class="k">let</span> <span class="n">builder</span> <span class="o">=</span> <span class="n">builder</span> <span class="n">context</span>
+<span class="k">let</span> <span class="n">named_values</span><span class="o">:(</span><span class="kt">string</span><span class="o">,</span> <span class="n">llvalue</span><span class="o">)</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">t</span> <span class="o">=</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">create</span> <span class="mi">10</span>
+<span class="k">let</span> <span class="n">double_type</span> <span class="o">=</span> <span class="n">double_type</span> <span class="n">context</span>
+
+<span class="c">(* Create an alloca instruction in the entry block of the function. This</span>
+<span class="c"> * is used for mutable variables etc. *)</span>
+<span class="k">let</span> <span class="n">create_entry_block_alloca</span> <span class="n">the_function</span> <span class="n">var_name</span> <span class="o">=</span>
+ <span class="k">let</span> <span class="n">builder</span> <span class="o">=</span> <span class="n">builder_at</span> <span class="n">context</span> <span class="o">(</span><span class="n">instr_begin</span> <span class="o">(</span><span class="n">entry_block</span> <span class="n">the_function</span><span class="o">))</span> <span class="k">in</span>
+ <span class="n">build_alloca</span> <span class="n">double_type</span> <span class="n">var_name</span> <span class="n">builder</span>
+
+<span class="k">let</span> <span class="k">rec</span> <span class="n">codegen_expr</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Number</span> <span class="n">n</span> <span class="o">-></span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="n">n</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Variable</span> <span class="n">name</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">v</span> <span class="o">=</span> <span class="k">try</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">name</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown variable name"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="c">(* Load the value. *)</span>
+ <span class="n">build_load</span> <span class="n">v</span> <span class="n">name</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Unary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">operand</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">operand</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">operand</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span> <span class="s2">"unary"</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">callee</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">callee</span> <span class="o">-></span> <span class="n">callee</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown unary operator"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="n">build_call</span> <span class="n">callee</span> <span class="o">[|</span><span class="n">operand</span><span class="o">|]</span> <span class="s2">"unop"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Binary</span> <span class="o">(</span><span class="n">op</span><span class="o">,</span> <span class="n">lhs</span><span class="o">,</span> <span class="n">rhs</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">begin</span> <span class="k">match</span> <span class="n">op</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="sc">'='</span> <span class="o">-></span>
+ <span class="c">(* Special case '=' because we don't want to emit the LHS as an</span>
+<span class="c"> * expression. *)</span>
+ <span class="k">let</span> <span class="n">name</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lhs</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Variable</span> <span class="n">name</span> <span class="o">-></span> <span class="n">name</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"destination of '=' must be a variable"</span><span class="o">)</span>
+ <span class="k">in</span>
+
+ <span class="c">(* Codegen the rhs. *)</span>
+ <span class="k">let</span> <span class="n">val_</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">rhs</span> <span class="k">in</span>
+
+ <span class="c">(* Look up the name. *)</span>
+ <span class="k">let</span> <span class="n">variable</span> <span class="o">=</span> <span class="k">try</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">name</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown variable name"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">val_</span> <span class="n">variable</span> <span class="n">builder</span><span class="o">);</span>
+ <span class="n">val_</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">lhs_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">lhs</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">rhs_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">rhs</span> <span class="k">in</span>
+ <span class="k">begin</span>
+ <span class="k">match</span> <span class="n">op</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="sc">'+'</span> <span class="o">-></span> <span class="n">build_fadd</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"addtmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="sc">'-'</span> <span class="o">-></span> <span class="n">build_fsub</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"subtmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="sc">'*'</span> <span class="o">-></span> <span class="n">build_fmul</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"multmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="sc">'<'</span> <span class="o">-></span>
+ <span class="c">(* Convert bool 0/1 to double 0.0 or 1.0 *)</span>
+ <span class="k">let</span> <span class="n">i</span> <span class="o">=</span> <span class="n">build_fcmp</span> <span class="nn">Fcmp</span><span class="p">.</span><span class="nc">Ult</span> <span class="n">lhs_val</span> <span class="n">rhs_val</span> <span class="s2">"cmptmp"</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="n">build_uitofp</span> <span class="n">i</span> <span class="n">double_type</span> <span class="s2">"booltmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span>
+ <span class="c">(* If it wasn't a built-in binary operator, it must be a user-defined</span>
+<span class="c"> * one. Emit a call to it. *)</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span> <span class="s2">"binary"</span> <span class="o">^</span> <span class="o">(</span><span class="nn">String</span><span class="p">.</span><span class="n">make</span> <span class="mi">1</span> <span class="n">op</span><span class="o">)</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">callee</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">callee</span> <span class="o">-></span> <span class="n">callee</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"binary operator not found!"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="n">build_call</span> <span class="n">callee</span> <span class="o">[|</span><span class="n">lhs_val</span><span class="o">;</span> <span class="n">rhs_val</span><span class="o">|]</span> <span class="s2">"binop"</span> <span class="n">builder</span>
+ <span class="k">end</span>
+ <span class="k">end</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Call</span> <span class="o">(</span><span class="n">callee</span><span class="o">,</span> <span class="n">args</span><span class="o">)</span> <span class="o">-></span>
+ <span class="c">(* Look up the name in the module table. *)</span>
+ <span class="k">let</span> <span class="n">callee</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">callee</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">callee</span> <span class="o">-></span> <span class="n">callee</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"unknown function referenced"</span><span class="o">)</span>
+ <span class="k">in</span>
+ <span class="k">let</span> <span class="n">params</span> <span class="o">=</span> <span class="n">params</span> <span class="n">callee</span> <span class="k">in</span>
+
+ <span class="c">(* Raise an error if the argument count does not match. *)</span>
+ <span class="k">if</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">params</span> <span class="o">==</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">args</span> <span class="k">then</span> <span class="bp">()</span> <span class="k">else</span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"incorrect # arguments passed"</span><span class="o">);</span>
+ <span class="k">let</span> <span class="n">args</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">map</span> <span class="n">codegen_expr</span> <span class="n">args</span> <span class="k">in</span>
+ <span class="n">build_call</span> <span class="n">callee</span> <span class="n">args</span> <span class="s2">"calltmp"</span> <span class="n">builder</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">If</span> <span class="o">(</span><span class="n">cond</span><span class="o">,</span> <span class="n">then_</span><span class="o">,</span> <span class="n">else_</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">cond</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">cond</span> <span class="k">in</span>
+
+ <span class="c">(* Convert condition to a bool by comparing it for inequality with 0.0 *)</span>
+ <span class="k">let</span> <span class="n">zero</span> <span class="o">=</span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">cond_val</span> <span class="o">=</span> <span class="n">build_fcmp</span> <span class="nn">Fcmp</span><span class="p">.</span><span class="nc">One</span> <span class="n">cond</span> <span class="n">zero</span> <span class="s2">"ifcond"</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Grab the first block so that we might later add the conditional branch</span>
+<span class="c"> * to it at the end of the function. *)</span>
+ <span class="k">let</span> <span class="n">start_bb</span> <span class="o">=</span> <span class="n">insertion_block</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">block_parent</span> <span class="n">start_bb</span> <span class="k">in</span>
+
+ <span class="k">let</span> <span class="n">then_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"then"</span> <span class="n">the_function</span> <span class="k">in</span>
+
+ <span class="c">(* Emit 'then' value. *)</span>
+ <span class="n">position_at_end</span> <span class="n">then_bb</span> <span class="n">builder</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">then_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">then_</span> <span class="k">in</span>
+
+ <span class="c">(* Codegen of 'then' can change the current block, update then_bb for the</span>
+<span class="c"> * phi. We create a new name because one is used for the phi node, and the</span>
+<span class="c"> * other is used for the conditional branch. *)</span>
+ <span class="k">let</span> <span class="n">new_then_bb</span> <span class="o">=</span> <span class="n">insertion_block</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Emit 'else' value. *)</span>
+ <span class="k">let</span> <span class="n">else_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"else"</span> <span class="n">the_function</span> <span class="k">in</span>
+ <span class="n">position_at_end</span> <span class="n">else_bb</span> <span class="n">builder</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">else_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">else_</span> <span class="k">in</span>
+
+ <span class="c">(* Codegen of 'else' can change the current block, update else_bb for the</span>
+<span class="c"> * phi. *)</span>
+ <span class="k">let</span> <span class="n">new_else_bb</span> <span class="o">=</span> <span class="n">insertion_block</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Emit merge block. *)</span>
+ <span class="k">let</span> <span class="n">merge_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"ifcont"</span> <span class="n">the_function</span> <span class="k">in</span>
+ <span class="n">position_at_end</span> <span class="n">merge_bb</span> <span class="n">builder</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">incoming</span> <span class="o">=</span> <span class="o">[(</span><span class="n">then_val</span><span class="o">,</span> <span class="n">new_then_bb</span><span class="o">);</span> <span class="o">(</span><span class="n">else_val</span><span class="o">,</span> <span class="n">new_else_bb</span><span class="o">)]</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">phi</span> <span class="o">=</span> <span class="n">build_phi</span> <span class="n">incoming</span> <span class="s2">"iftmp"</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Return to the start block to add the conditional branch. *)</span>
+ <span class="n">position_at_end</span> <span class="n">start_bb</span> <span class="n">builder</span><span class="o">;</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">build_cond_br</span> <span class="n">cond_val</span> <span class="n">then_bb</span> <span class="n">else_bb</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Set an unconditional branch at the end of the 'then' block and the</span>
+<span class="c"> * 'else' block to the 'merge' block. *)</span>
+ <span class="n">position_at_end</span> <span class="n">new_then_bb</span> <span class="n">builder</span><span class="o">;</span> <span class="n">ignore</span> <span class="o">(</span><span class="n">build_br</span> <span class="n">merge_bb</span> <span class="n">builder</span><span class="o">);</span>
+ <span class="n">position_at_end</span> <span class="n">new_else_bb</span> <span class="n">builder</span><span class="o">;</span> <span class="n">ignore</span> <span class="o">(</span><span class="n">build_br</span> <span class="n">merge_bb</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Finally, set the builder to the end of the merge block. *)</span>
+ <span class="n">position_at_end</span> <span class="n">merge_bb</span> <span class="n">builder</span><span class="o">;</span>
+
+ <span class="n">phi</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">For</span> <span class="o">(</span><span class="n">var_name</span><span class="o">,</span> <span class="n">start</span><span class="o">,</span> <span class="n">end_</span><span class="o">,</span> <span class="n">step</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span> <span class="o">-></span>
+ <span class="c">(* Output this as:</span>
+<span class="c"> * var = alloca double</span>
+<span class="c"> * ...</span>
+<span class="c"> * start = startexpr</span>
+<span class="c"> * store start -> var</span>
+<span class="c"> * goto loop</span>
+<span class="c"> * loop:</span>
+<span class="c"> * ...</span>
+<span class="c"> * bodyexpr</span>
+<span class="c"> * ...</span>
+<span class="c"> * loopend:</span>
+<span class="c"> * step = stepexpr</span>
+<span class="c"> * endcond = endexpr</span>
+<span class="c"> *</span>
+<span class="c"> * curvar = load var</span>
+<span class="c"> * nextvar = curvar + step</span>
+<span class="c"> * store nextvar -> var</span>
+<span class="c"> * br endcond, loop, outloop</span>
+<span class="c"> * outloop: *)</span>
+
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">block_parent</span> <span class="o">(</span><span class="n">insertion_block</span> <span class="n">builder</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Create an alloca for the variable in the entry block. *)</span>
+ <span class="k">let</span> <span class="n">alloca</span> <span class="o">=</span> <span class="n">create_entry_block_alloca</span> <span class="n">the_function</span> <span class="n">var_name</span> <span class="k">in</span>
+
+ <span class="c">(* Emit the start code first, without 'variable' in scope. *)</span>
+ <span class="k">let</span> <span class="n">start_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">start</span> <span class="k">in</span>
+
+ <span class="c">(* Store the value into the alloca. *)</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">start_val</span> <span class="n">alloca</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Make the new basic block for the loop header, inserting after current</span>
+<span class="c"> * block. *)</span>
+ <span class="k">let</span> <span class="n">loop_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"loop"</span> <span class="n">the_function</span> <span class="k">in</span>
+
+ <span class="c">(* Insert an explicit fall through from the current block to the</span>
+<span class="c"> * loop_bb. *)</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">build_br</span> <span class="n">loop_bb</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Start insertion in loop_bb. *)</span>
+ <span class="n">position_at_end</span> <span class="n">loop_bb</span> <span class="n">builder</span><span class="o">;</span>
+
+ <span class="c">(* Within the loop, the variable refers to its alloca. If it</span>
+<span class="c"> * shadows an existing variable, we have to restore it, so save it</span>
+<span class="c"> * now. *)</span>
+ <span class="k">let</span> <span class="n">old_val</span> <span class="o">=</span>
+ <span class="k">try</span> <span class="nc">Some</span> <span class="o">(</span><span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">var_name</span><span class="o">)</span> <span class="k">with</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="nc">None</span>
+ <span class="k">in</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">alloca</span><span class="o">;</span>
+
+ <span class="c">(* Emit the body of the loop. This, like any other expr, can change the</span>
+<span class="c"> * current BB. Note that we ignore the value computed by the body, but</span>
+<span class="c"> * don't allow an error *)</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">codegen_expr</span> <span class="n">body</span><span class="o">);</span>
+
+ <span class="c">(* Emit the step value. *)</span>
+ <span class="k">let</span> <span class="n">step_val</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">step</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">step</span> <span class="o">-></span> <span class="n">codegen_expr</span> <span class="n">step</span>
+ <span class="c">(* If not specified, use 1.0. *)</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="mi">1</span><span class="o">.</span><span class="mi">0</span>
+ <span class="k">in</span>
+
+ <span class="c">(* Compute the end condition. *)</span>
+ <span class="k">let</span> <span class="n">end_cond</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">end_</span> <span class="k">in</span>
+
+ <span class="c">(* Reload, increment, and store back to the alloca. This handles the case</span>
+<span class="c"> * where the body of the loop mutates the variable. *)</span>
+ <span class="k">let</span> <span class="n">cur_var</span> <span class="o">=</span> <span class="n">build_load</span> <span class="n">alloca</span> <span class="n">var_name</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">next_var</span> <span class="o">=</span> <span class="n">build_fadd</span> <span class="n">cur_var</span> <span class="n">step_val</span> <span class="s2">"nextvar"</span> <span class="n">builder</span> <span class="k">in</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">next_var</span> <span class="n">alloca</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Convert condition to a bool by comparing not equal to 0.0. *)</span>
+ <span class="k">let</span> <span class="n">zero</span> <span class="o">=</span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">end_cond</span> <span class="o">=</span> <span class="n">build_fcmp</span> <span class="nn">Fcmp</span><span class="p">.</span><span class="nc">One</span> <span class="n">end_cond</span> <span class="n">zero</span> <span class="s2">"loopcond"</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Create the "after loop" block and insert it. *)</span>
+ <span class="k">let</span> <span class="n">after_bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"afterloop"</span> <span class="n">the_function</span> <span class="k">in</span>
+
+ <span class="c">(* Insert the conditional branch at the end of the current block. *)</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">build_cond_br</span> <span class="n">end_cond</span> <span class="n">loop_bb</span> <span class="n">after_bb</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Any new code will be inserted in after_bb. *)</span>
+ <span class="n">position_at_end</span> <span class="n">after_bb</span> <span class="n">builder</span><span class="o">;</span>
+
+ <span class="c">(* Restore the unshadowed variable. *)</span>
+ <span class="k">begin</span> <span class="k">match</span> <span class="n">old_val</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">old_val</span> <span class="o">-></span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">old_val</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="bp">()</span>
+ <span class="k">end</span><span class="o">;</span>
+
+ <span class="c">(* for expr always returns 0.0. *)</span>
+ <span class="n">const_null</span> <span class="n">double_type</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Var</span> <span class="o">(</span><span class="n">var_names</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">old_bindings</span> <span class="o">=</span> <span class="n">ref</span> <span class="bp">[]</span> <span class="k">in</span>
+
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">block_parent</span> <span class="o">(</span><span class="n">insertion_block</span> <span class="n">builder</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Register all variables and emit their initializer. *)</span>
+ <span class="nn">Array</span><span class="p">.</span><span class="n">iter</span> <span class="o">(</span><span class="k">fun</span> <span class="o">(</span><span class="n">var_name</span><span class="o">,</span> <span class="n">init</span><span class="o">)</span> <span class="o">-></span>
+ <span class="c">(* Emit the initializer before adding the variable to scope; this</span>
+<span class="c"> * prevents the initializer from referencing the variable itself, and</span>
+<span class="c"> * permits stuff like this:</span>
+<span class="c"> * var a = 1 in</span>
+<span class="c"> * var a = a in ... # refers to outer 'a'. *)</span>
+ <span class="k">let</span> <span class="n">init_val</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">init</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">init</span> <span class="o">-></span> <span class="n">codegen_expr</span> <span class="n">init</span>
+ <span class="c">(* If not specified, use 0.0. *)</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="n">const_float</span> <span class="n">double_type</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span>
+ <span class="k">in</span>
+
+ <span class="k">let</span> <span class="n">alloca</span> <span class="o">=</span> <span class="n">create_entry_block_alloca</span> <span class="n">the_function</span> <span class="n">var_name</span> <span class="k">in</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">init_val</span> <span class="n">alloca</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Remember the old variable binding so that we can restore the binding</span>
+<span class="c"> * when we unrecurse. *)</span>
+ <span class="k">begin</span>
+ <span class="k">try</span>
+ <span class="k">let</span> <span class="n">old_value</span> <span class="o">=</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">find</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="k">in</span>
+ <span class="n">old_bindings</span> <span class="o">:=</span> <span class="o">(</span><span class="n">var_name</span><span class="o">,</span> <span class="n">old_value</span><span class="o">)</span> <span class="o">::</span> <span class="o">!</span><span class="n">old_bindings</span><span class="o">;</span>
+ <span class="k">with</span> <span class="nc">Not_found</span> <span class="o">-></span> <span class="bp">()</span>
+ <span class="k">end</span><span class="o">;</span>
+
+ <span class="c">(* Remember this binding. *)</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">alloca</span><span class="o">;</span>
+ <span class="o">)</span> <span class="n">var_names</span><span class="o">;</span>
+
+ <span class="c">(* Codegen the body, now that all vars are in scope. *)</span>
+ <span class="k">let</span> <span class="n">body_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">body</span> <span class="k">in</span>
+
+ <span class="c">(* Pop all our variables from scope. *)</span>
+ <span class="nn">List</span><span class="p">.</span><span class="n">iter</span> <span class="o">(</span><span class="k">fun</span> <span class="o">(</span><span class="n">var_name</span><span class="o">,</span> <span class="n">old_value</span><span class="o">)</span> <span class="o">-></span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">old_value</span>
+ <span class="o">)</span> <span class="o">!</span><span class="n">old_bindings</span><span class="o">;</span>
+
+ <span class="c">(* Return the body computation. *)</span>
+ <span class="n">body_val</span>
+
+<span class="k">let</span> <span class="n">codegen_proto</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">)</span> <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">,</span> <span class="o">_)</span> <span class="o">-></span>
+ <span class="c">(* Make the function type: double(double,double) etc. *)</span>
+ <span class="k">let</span> <span class="n">doubles</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">make</span> <span class="o">(</span><span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">args</span><span class="o">)</span> <span class="n">double_type</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">ft</span> <span class="o">=</span> <span class="n">function_type</span> <span class="n">double_type</span> <span class="n">doubles</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">f</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="n">lookup_function</span> <span class="n">name</span> <span class="n">the_module</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="n">declare_function</span> <span class="n">name</span> <span class="n">ft</span> <span class="n">the_module</span>
+
+ <span class="c">(* If 'f' conflicted, there was already something named 'name'. If it</span>
+<span class="c"> * has a body, don't allow redefinition or reextern. *)</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">f</span> <span class="o">-></span>
+ <span class="c">(* If 'f' already has a body, reject this. *)</span>
+ <span class="k">if</span> <span class="n">block_begin</span> <span class="n">f</span> <span class="o"><></span> <span class="nc">At_end</span> <span class="n">f</span> <span class="k">then</span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"redefinition of function"</span><span class="o">);</span>
+
+ <span class="c">(* If 'f' took a different number of arguments, reject. *)</span>
+ <span class="k">if</span> <span class="n">element_type</span> <span class="o">(</span><span class="n">type_of</span> <span class="n">f</span><span class="o">)</span> <span class="o"><></span> <span class="n">ft</span> <span class="k">then</span>
+ <span class="k">raise</span> <span class="o">(</span><span class="nc">Error</span> <span class="s2">"redefinition of function with different # args"</span><span class="o">);</span>
+ <span class="n">f</span>
+ <span class="k">in</span>
+
+ <span class="c">(* Set names for all arguments. *)</span>
+ <span class="nn">Array</span><span class="p">.</span><span class="n">iteri</span> <span class="o">(</span><span class="k">fun</span> <span class="n">i</span> <span class="n">a</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">n</span> <span class="o">=</span> <span class="n">args</span><span class="o">.(</span><span class="n">i</span><span class="o">)</span> <span class="k">in</span>
+ <span class="n">set_value_name</span> <span class="n">n</span> <span class="n">a</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">n</span> <span class="n">a</span><span class="o">;</span>
+ <span class="o">)</span> <span class="o">(</span><span class="n">params</span> <span class="n">f</span><span class="o">);</span>
+ <span class="n">f</span>
+
+<span class="c">(* Create an alloca for each argument and register the argument in the symbol</span>
+<span class="c"> * table so that references to it will succeed. *)</span>
+<span class="k">let</span> <span class="n">create_argument_allocas</span> <span class="n">the_function</span> <span class="n">proto</span> <span class="o">=</span>
+ <span class="k">let</span> <span class="n">args</span> <span class="o">=</span> <span class="k">match</span> <span class="n">proto</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Prototype</span> <span class="o">(_,</span> <span class="n">args</span><span class="o">)</span> <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(_,</span> <span class="n">args</span><span class="o">,</span> <span class="o">_)</span> <span class="o">-></span> <span class="n">args</span>
+ <span class="k">in</span>
+ <span class="nn">Array</span><span class="p">.</span><span class="n">iteri</span> <span class="o">(</span><span class="k">fun</span> <span class="n">i</span> <span class="n">ai</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">var_name</span> <span class="o">=</span> <span class="n">args</span><span class="o">.(</span><span class="n">i</span><span class="o">)</span> <span class="k">in</span>
+ <span class="c">(* Create an alloca for this variable. *)</span>
+ <span class="k">let</span> <span class="n">alloca</span> <span class="o">=</span> <span class="n">create_entry_block_alloca</span> <span class="n">the_function</span> <span class="n">var_name</span> <span class="k">in</span>
+
+ <span class="c">(* Store the initial value into the alloca. *)</span>
+ <span class="n">ignore</span><span class="o">(</span><span class="n">build_store</span> <span class="n">ai</span> <span class="n">alloca</span> <span class="n">builder</span><span class="o">);</span>
+
+ <span class="c">(* Add arguments to variable symbol table. *)</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="n">named_values</span> <span class="n">var_name</span> <span class="n">alloca</span><span class="o">;</span>
+ <span class="o">)</span> <span class="o">(</span><span class="n">params</span> <span class="n">the_function</span><span class="o">)</span>
+
+<span class="k">let</span> <span class="n">codegen_func</span> <span class="n">the_fpm</span> <span class="o">=</span> <span class="k">function</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">Function</span> <span class="o">(</span><span class="n">proto</span><span class="o">,</span> <span class="n">body</span><span class="o">)</span> <span class="o">-></span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">clear</span> <span class="n">named_values</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="n">codegen_proto</span> <span class="n">proto</span> <span class="k">in</span>
+
+ <span class="c">(* If this is an operator, install it. *)</span>
+ <span class="k">begin</span> <span class="k">match</span> <span class="n">proto</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nn">Ast</span><span class="p">.</span><span class="nc">BinOpPrototype</span> <span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">args</span><span class="o">,</span> <span class="n">prec</span><span class="o">)</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">op</span> <span class="o">=</span> <span class="n">name</span><span class="o">.[</span><span class="nn">String</span><span class="p">.</span><span class="n">length</span> <span class="n">name</span> <span class="o">-</span> <span class="mi">1</span><span class="o">]</span> <span class="k">in</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="n">op</span> <span class="n">prec</span><span class="o">;</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span> <span class="bp">()</span>
+ <span class="k">end</span><span class="o">;</span>
+
+ <span class="c">(* Create a new basic block to start insertion into. *)</span>
+ <span class="k">let</span> <span class="n">bb</span> <span class="o">=</span> <span class="n">append_block</span> <span class="n">context</span> <span class="s2">"entry"</span> <span class="n">the_function</span> <span class="k">in</span>
+ <span class="n">position_at_end</span> <span class="n">bb</span> <span class="n">builder</span><span class="o">;</span>
+
+ <span class="k">try</span>
+ <span class="c">(* Add all arguments to the symbol table and create their allocas. *)</span>
+ <span class="n">create_argument_allocas</span> <span class="n">the_function</span> <span class="n">proto</span><span class="o">;</span>
+
+ <span class="k">let</span> <span class="n">ret_val</span> <span class="o">=</span> <span class="n">codegen_expr</span> <span class="n">body</span> <span class="k">in</span>
+
+ <span class="c">(* Finish off the function. *)</span>
+ <span class="k">let</span> <span class="o">_</span> <span class="o">=</span> <span class="n">build_ret</span> <span class="n">ret_val</span> <span class="n">builder</span> <span class="k">in</span>
+
+ <span class="c">(* Validate the generated code, checking for consistency. *)</span>
+ <span class="nn">Llvm_analysis</span><span class="p">.</span><span class="n">assert_valid_function</span> <span class="n">the_function</span><span class="o">;</span>
+
+ <span class="c">(* Optimize the function. *)</span>
+ <span class="k">let</span> <span class="o">_</span> <span class="o">=</span> <span class="nn">PassManager</span><span class="p">.</span><span class="n">run_function</span> <span class="n">the_function</span> <span class="n">the_fpm</span> <span class="k">in</span>
+
+ <span class="n">the_function</span>
+ <span class="k">with</span> <span class="n">e</span> <span class="o">-></span>
+ <span class="n">delete_function</span> <span class="n">the_function</span><span class="o">;</span>
+ <span class="k">raise</span> <span class="n">e</span>
+</pre></div>
+</div>
+</dd>
+<dt>toplevel.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Top-Level parsing and JIT Driver</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="k">open</span> <span class="nc">Llvm</span>
+<span class="k">open</span> <span class="nc">Llvm_executionengine</span>
+
+<span class="c">(* top ::= definition | external | expression | ';' *)</span>
+<span class="k">let</span> <span class="k">rec</span> <span class="n">main_loop</span> <span class="n">the_fpm</span> <span class="n">the_execution_engine</span> <span class="n">stream</span> <span class="o">=</span>
+ <span class="k">match</span> <span class="nn">Stream</span><span class="p">.</span><span class="n">peek</span> <span class="n">stream</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nc">None</span> <span class="o">-></span> <span class="bp">()</span>
+
+ <span class="c">(* Ignore top-level semicolons. *)</span>
+ <span class="o">|</span> <span class="nc">Some</span> <span class="o">(</span><span class="nn">Token</span><span class="p">.</span><span class="nc">Kwd</span> <span class="sc">';'</span><span class="o">)</span> <span class="o">-></span>
+ <span class="nn">Stream</span><span class="p">.</span><span class="n">junk</span> <span class="n">stream</span><span class="o">;</span>
+ <span class="n">main_loop</span> <span class="n">the_fpm</span> <span class="n">the_execution_engine</span> <span class="n">stream</span>
+
+ <span class="o">|</span> <span class="nc">Some</span> <span class="n">token</span> <span class="o">-></span>
+ <span class="k">begin</span>
+ <span class="k">try</span> <span class="k">match</span> <span class="n">token</span> <span class="k">with</span>
+ <span class="o">|</span> <span class="nn">Token</span><span class="p">.</span><span class="nc">Def</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">e</span> <span class="o">=</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">parse_definition</span> <span class="n">stream</span> <span class="k">in</span>
+ <span class="n">print_endline</span> <span class="s2">"parsed a function definition."</span><span class="o">;</span>
+ <span class="n">dump_value</span> <span class="o">(</span><span class="nn">Codegen</span><span class="p">.</span><span class="n">codegen_func</span> <span class="n">the_fpm</span> <span class="n">e</span><span class="o">);</span>
+ <span class="o">|</span> <span class="nn">Token</span><span class="p">.</span><span class="nc">Extern</span> <span class="o">-></span>
+ <span class="k">let</span> <span class="n">e</span> <span class="o">=</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">parse_extern</span> <span class="n">stream</span> <span class="k">in</span>
+ <span class="n">print_endline</span> <span class="s2">"parsed an extern."</span><span class="o">;</span>
+ <span class="n">dump_value</span> <span class="o">(</span><span class="nn">Codegen</span><span class="p">.</span><span class="n">codegen_proto</span> <span class="n">e</span><span class="o">);</span>
+ <span class="o">|</span> <span class="o">_</span> <span class="o">-></span>
+ <span class="c">(* Evaluate a top-level expression into an anonymous function. *)</span>
+ <span class="k">let</span> <span class="n">e</span> <span class="o">=</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">parse_toplevel</span> <span class="n">stream</span> <span class="k">in</span>
+ <span class="n">print_endline</span> <span class="s2">"parsed a top-level expr"</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">the_function</span> <span class="o">=</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">codegen_func</span> <span class="n">the_fpm</span> <span class="n">e</span> <span class="k">in</span>
+ <span class="n">dump_value</span> <span class="n">the_function</span><span class="o">;</span>
+
+ <span class="c">(* JIT and run the function, returning its result as a generic value. *)</span>
+ <span class="k">let</span> <span class="n">result</span> <span class="o">=</span> <span class="nn">ExecutionEngine</span><span class="p">.</span><span class="n">run_function</span> <span class="n">the_function</span> <span class="o">[||]</span>
+ <span class="n">the_execution_engine</span> <span class="k">in</span>
+
+ <span class="n">print_string</span> <span class="s2">"Evaluated to "</span><span class="o">;</span>
+ <span class="n">print_float</span> <span class="o">(</span><span class="nn">GenericValue</span><span class="p">.</span><span class="n">as_float</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">double_type</span> <span class="n">result</span><span class="o">);</span>
+ <span class="n">print_newline</span> <span class="bp">()</span><span class="o">;</span>
+ <span class="k">with</span> <span class="nn">Stream</span><span class="p">.</span><span class="nc">Error</span> <span class="n">s</span> <span class="o">|</span> <span class="nn">Codegen</span><span class="p">.</span><span class="nc">Error</span> <span class="n">s</span> <span class="o">-></span>
+ <span class="c">(* Skip token for error recovery. *)</span>
+ <span class="nn">Stream</span><span class="p">.</span><span class="n">junk</span> <span class="n">stream</span><span class="o">;</span>
+ <span class="n">print_endline</span> <span class="n">s</span><span class="o">;</span>
+ <span class="k">end</span><span class="o">;</span>
+ <span class="n">print_string</span> <span class="s2">"ready> "</span><span class="o">;</span> <span class="n">flush</span> <span class="n">stdout</span><span class="o">;</span>
+ <span class="n">main_loop</span> <span class="n">the_fpm</span> <span class="n">the_execution_engine</span> <span class="n">stream</span>
+</pre></div>
+</div>
+</dd>
+<dt>toy.ml:</dt>
+<dd><div class="first last highlight-ocaml"><div class="highlight"><pre><span class="c">(*===----------------------------------------------------------------------===</span>
+<span class="c"> * Main driver code.</span>
+<span class="c"> *===----------------------------------------------------------------------===*)</span>
+
+<span class="k">open</span> <span class="nc">Llvm</span>
+<span class="k">open</span> <span class="nc">Llvm_executionengine</span>
+<span class="k">open</span> <span class="nc">Llvm_target</span>
+<span class="k">open</span> <span class="nc">Llvm_scalar_opts</span>
+
+<span class="k">let</span> <span class="n">main</span> <span class="bp">()</span> <span class="o">=</span>
+ <span class="n">ignore</span> <span class="o">(</span><span class="n">initialize_native_target</span> <span class="bp">()</span><span class="o">);</span>
+
+ <span class="c">(* Install standard binary operators.</span>
+<span class="c"> * 1 is the lowest precedence. *)</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'='</span> <span class="mi">2</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'<'</span> <span class="mi">10</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'+'</span> <span class="mi">20</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'-'</span> <span class="mi">20</span><span class="o">;</span>
+ <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">add</span> <span class="nn">Parser</span><span class="p">.</span><span class="n">binop_precedence</span> <span class="sc">'*'</span> <span class="mi">40</span><span class="o">;</span> <span class="c">(* highest. *)</span>
+
+ <span class="c">(* Prime the first token. *)</span>
+ <span class="n">print_string</span> <span class="s2">"ready> "</span><span class="o">;</span> <span class="n">flush</span> <span class="n">stdout</span><span class="o">;</span>
+ <span class="k">let</span> <span class="n">stream</span> <span class="o">=</span> <span class="nn">Lexer</span><span class="p">.</span><span class="n">lex</span> <span class="o">(</span><span class="nn">Stream</span><span class="p">.</span><span class="n">of_channel</span> <span class="n">stdin</span><span class="o">)</span> <span class="k">in</span>
+
+ <span class="c">(* Create the JIT. *)</span>
+ <span class="k">let</span> <span class="n">the_execution_engine</span> <span class="o">=</span> <span class="nn">ExecutionEngine</span><span class="p">.</span><span class="n">create</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">the_module</span> <span class="k">in</span>
+ <span class="k">let</span> <span class="n">the_fpm</span> <span class="o">=</span> <span class="nn">PassManager</span><span class="p">.</span><span class="n">create_function</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">the_module</span> <span class="k">in</span>
+
+ <span class="c">(* Set up the optimizer pipeline. Start with registering info about how the</span>
+<span class="c"> * target lays out data structures. *)</span>
+ <span class="nn">DataLayout</span><span class="p">.</span><span class="n">add</span> <span class="o">(</span><span class="nn">ExecutionEngine</span><span class="p">.</span><span class="n">target_data</span> <span class="n">the_execution_engine</span><span class="o">)</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Promote allocas to registers. *)</span>
+ <span class="n">add_memory_to_register_promotion</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Do simple "peephole" optimizations and bit-twiddling optimizations. *)</span>
+ <span class="n">add_instruction_combination</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Reassociate expressions. *)</span>
+ <span class="n">add_reassociation</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Eliminate Common SubExpressions. *)</span>
+ <span class="n">add_gvn</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="c">(* Simplify the control flow graph (deleting unreachable blocks, etc). *)</span>
+ <span class="n">add_cfg_simplification</span> <span class="n">the_fpm</span><span class="o">;</span>
+
+ <span class="n">ignore</span> <span class="o">(</span><span class="nn">PassManager</span><span class="p">.</span><span class="n">initialize</span> <span class="n">the_fpm</span><span class="o">);</span>
+
+ <span class="c">(* Run the main "interpreter loop" now. *)</span>
+ <span class="nn">Toplevel</span><span class="p">.</span><span class="n">main_loop</span> <span class="n">the_fpm</span> <span class="n">the_execution_engine</span> <span class="n">stream</span><span class="o">;</span>
+
+ <span class="c">(* Print out all the generated code. *)</span>
+ <span class="n">dump_module</span> <span class="nn">Codegen</span><span class="p">.</span><span class="n">the_module</span>
+<span class="o">;;</span>
+
+<span class="n">main</span> <span class="bp">()</span>
+</pre></div>
+</div>
+</dd>
+<dt>bindings.c</dt>
+<dd><div class="first last highlight-c"><div class="highlight"><pre><span class="cp">#include &lt;stdio.h&gt;</span>
+
+<span class="cm">/* putchard - putchar that takes a double and returns 0. */</span>
+<span class="k">extern</span> <span class="kt">double</span> <span class="nf">putchard</span><span class="p">(</span><span class="kt">double</span> <span class="n">X</span><span class="p">)</span> <span class="p">{</span>
+ <span class="n">putchar</span><span class="p">((</span><span class="kt">char</span><span class="p">)</span><span class="n">X</span><span class="p">);</span>
+ <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
+<span class="p">}</span>
+
+<span class="cm">/* printd - printf that takes a double and prints it as "%f\n", returning 0. */</span>
+<span class="k">extern</span> <span class="kt">double</span> <span class="nf">printd</span><span class="p">(</span><span class="kt">double</span> <span class="n">X</span><span class="p">)</span> <span class="p">{</span>
+ <span class="n">printf</span><span class="p">(</span><span class="s">"%f</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">X</span><span class="p">);</span>
+ <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
+<span class="p">}</span>
+</pre></div>
+</div>
+</dd>
+</dl>
+<p><a class="reference external" href="OCamlLangImpl8.html">Next: Conclusion and other useful LLVM tidbits</a></p>
+</div>
+</div>
+
+
+ </div>
+ </div>
+ <div class="clearer"></div>
+ </div>
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="../genindex.html" title="General Index"
+ >index</a></li>
+ <li class="right" >
+ <a href="OCamlLangImpl8.html" title="8. Kaleidoscope: Conclusion and other useful LLVM tidbits"
+ >next</a> |</li>
+ <li class="right" >
+ <a href="OCamlLangImpl6.html" title="6. Kaleidoscope: Extending the Language: User-defined Operators"
+ >previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="../index.html">Documentation</a>»</li>
+
+ <li><a href="index.html" >LLVM Tutorial: Table of Contents</a> »</li>
+ </ul>
+ </div>
+ <div class="footer">
+ © Copyright 2003-2014, LLVM Project.
+ Last updated on 2015-01-13.
+ Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
+ </div>
+ </body>
+</html>
\ No newline at end of file
Added: www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl8.html
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl8.html?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl8.html (added)
+++ www-releases/trunk/3.5.1/docs/tutorial/OCamlLangImpl8.html Tue Jan 13 16:55:20 2015
@@ -0,0 +1,353 @@
+
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+
+ <title>8. Kaleidoscope: Conclusion and other useful LLVM tidbits — LLVM 3.5 documentation</title>
+
+ <link rel="stylesheet" href="../_static/llvm-theme.css" type="text/css" />
+ <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT: '../',
+ VERSION: '3.5',
+ COLLAPSE_INDEX: false,
+ FILE_SUFFIX: '.html',
+ HAS_SOURCE: true
+ };
+ </script>
+ <script type="text/javascript" src="../_static/jquery.js"></script>
+ <script type="text/javascript" src="../_static/underscore.js"></script>
+ <script type="text/javascript" src="../_static/doctools.js"></script>
+ <link rel="top" title="LLVM 3.5 documentation" href="../index.html" />
+ <link rel="up" title="LLVM Tutorial: Table of Contents" href="index.html" />
+ <link rel="next" title="LLVM 3.5 Release Notes" href="../ReleaseNotes.html" />
+ <link rel="prev" title="7. Kaleidoscope: Extending the Language: Mutable Variables" href="OCamlLangImpl7.html" />
+<style type="text/css">
+ table.right { float: right; margin-left: 20px; }
+ table.right td { border: 1px solid #ccc; }
+</style>
+
+ </head>
+ <body>
+<div class="logo">
+ <a href="../index.html">
+ <img src="../_static/logo.png"
+ alt="LLVM Logo" width="250" height="88"/></a>
+</div>
+
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="../genindex.html" title="General Index"
+ accesskey="I">index</a></li>
+ <li class="right" >
+ <a href="../ReleaseNotes.html" title="LLVM 3.5 Release Notes"
+ accesskey="N">next</a> |</li>
+ <li class="right" >
+ <a href="OCamlLangImpl7.html" title="7. Kaleidoscope: Extending the Language: Mutable Variables"
+ accesskey="P">previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="../index.html">Documentation</a>»</li>
+
+ <li><a href="index.html" accesskey="U">LLVM Tutorial: Table of Contents</a> »</li>
+ </ul>
+ </div>
+
+
+ <div class="document">
+ <div class="documentwrapper">
+ <div class="body">
+
+ <div class="section" id="kaleidoscope-conclusion-and-other-useful-llvm-tidbits">
+<h1>8. Kaleidoscope: Conclusion and other useful LLVM tidbits<a class="headerlink" href="#kaleidoscope-conclusion-and-other-useful-llvm-tidbits" title="Permalink to this headline">¶</a></h1>
+<div class="contents local topic" id="contents">
+<ul class="simple">
+<li><a class="reference internal" href="#tutorial-conclusion" id="id2">Tutorial Conclusion</a></li>
+<li><a class="reference internal" href="#properties-of-the-llvm-ir" id="id3">Properties of the LLVM IR</a><ul>
+<li><a class="reference internal" href="#target-independence" id="id4">Target Independence</a></li>
+<li><a class="reference internal" href="#safety-guarantees" id="id5">Safety Guarantees</a></li>
+<li><a class="reference internal" href="#language-specific-optimizations" id="id6">Language-Specific Optimizations</a></li>
+</ul>
+</li>
+<li><a class="reference internal" href="#tips-and-tricks" id="id7">Tips and Tricks</a><ul>
+<li><a class="reference internal" href="#implementing-portable-offsetof-sizeof" id="id8">Implementing portable offsetof/sizeof</a></li>
+<li><a class="reference internal" href="#garbage-collected-stack-frames" id="id9">Garbage Collected Stack Frames</a></li>
+</ul>
+</li>
+</ul>
+</div>
+<div class="section" id="tutorial-conclusion">
+<h2><a class="toc-backref" href="#id2">8.1. Tutorial Conclusion</a><a class="headerlink" href="#tutorial-conclusion" title="Permalink to this headline">¶</a></h2>
+<p>Welcome to the final chapter of the “<a class="reference external" href="index.html">Implementing a language with
+LLVM</a>” tutorial. In the course of this tutorial, we have
+grown our little Kaleidoscope language from being a useless toy, to
+being a semi-interesting (but probably still useless) toy. :)</p>
+<p>It is interesting to see how far we’ve come, and how little code it has
+taken. We built the entire lexer, parser, AST, code generator, and an
+interactive run-loop (with a JIT!) by hand in under 700 lines of
+(non-comment/non-blank) code.</p>
+<p>Our little language supports a couple of interesting features: it
+supports user-defined binary and unary operators, it uses JIT
+compilation for immediate evaluation, and it supports a few control flow
+constructs with SSA construction.</p>
+<p>Part of the idea of this tutorial was to show you how easy and fun it
+can be to define, build, and play with languages. Building a compiler
+need not be a scary or mystical process! Now that you’ve seen some of
+the basics, I strongly encourage you to take the code and hack on it.
+For example, try adding:</p>
+<ul class="simple">
+<li><strong>global variables</strong> - While global variables have questionable value
+in modern software engineering, they are often useful when putting
+together quick little hacks like the Kaleidoscope compiler itself.
+Fortunately, our current setup makes it very easy to add global
+variables: just have value lookup check to see if an unresolved
+variable is in the global variable symbol table before rejecting it.
+To create a new global variable, make an instance of the LLVM
+<tt class="docutils literal"><span class="pre">GlobalVariable</span></tt> class.</li>
+<li><strong>typed variables</strong> - Kaleidoscope currently only supports variables
+of type double. This gives the language a very nice elegance, because
+only supporting one type means that you never have to specify types.
+Different languages have different ways of handling this. The easiest
+way is to require the user to specify types for every variable
+definition, and record the type of the variable in the symbol table
+along with its Value*.</li>
+<li><strong>arrays, structs, vectors, etc</strong> - Once you add types, you can start
+extending the type system in all sorts of interesting ways. Simple
+arrays are very easy and are quite useful for many different
+applications. Adding them is mostly an exercise in learning how the
+LLVM <a class="reference external" href="../LangRef.html#i_getelementptr">getelementptr</a> instruction
+works: it is so nifty/unconventional, it <a class="reference external" href="../GetElementPtr.html">has its own
+FAQ</a>! If you add support for recursive types
+(e.g. linked lists), make sure to read the <a class="reference external" href="../ProgrammersManual.html#TypeResolve">section in the LLVM
+Programmer’s Manual</a> that
+describes how to construct them.</li>
+<li><strong>standard runtime</strong> - Our current language allows the user to access
+arbitrary external functions, and we use it for things like “printd”
+and “putchard”. As you extend the language to add higher-level
+constructs, often these constructs make the most sense if they are
+lowered to calls into a language-supplied runtime. For example, if
+you add hash tables to the language, it would probably make sense to
+add the routines to a runtime, instead of inlining them all the way.</li>
+<li><strong>memory management</strong> - Currently we can only access the stack in
+Kaleidoscope. It would also be useful to be able to allocate heap
+memory, either with calls to the standard libc malloc/free interface
+or with a garbage collector. If you would like to use garbage
+collection, note that LLVM fully supports <a class="reference external" href="../GarbageCollection.html">Accurate Garbage
+Collection</a> including algorithms that
+move objects and need to scan/update the stack.</li>
+<li><strong>debugger support</strong> - LLVM supports generation of <a class="reference external" href="../SourceLevelDebugging.html">DWARF Debug
+info</a> which is understood by common
+debuggers like GDB. Adding support for debug info is fairly
+straightforward. The best way to understand it is to compile some
+C/C++ code with “<tt class="docutils literal"><span class="pre">clang</span> <span class="pre">-g</span> <span class="pre">-O0</span></tt>” and take a look at what it
+produces.</li>
+<li><strong>exception handling support</strong> - LLVM supports generation of <a class="reference external" href="../ExceptionHandling.html">zero
+cost exceptions</a> which interoperate with
+code compiled in other languages. You could also generate code by
+implicitly making every function return an error value and checking
+it. You could also make explicit use of setjmp/longjmp. There are
+many different ways to go here.</li>
+<li><strong>object orientation, generics, database access, complex numbers,
+geometric programming, ...</strong> - Really, there is no end of crazy
+features that you can add to the language.</li>
+<li><strong>unusual domains</strong> - We’ve been talking about applying LLVM to a
+domain that many people are interested in: building a compiler for a
+specific language. However, there are many other domains that can use
+compiler technology that are not typically considered. For example,
+LLVM has been used to implement OpenGL graphics acceleration,
+translate C++ code to ActionScript, and many other cute and clever
+things. Maybe you will be the first to JIT compile a regular
+expression interpreter into native code with LLVM?</li>
+</ul>
+<p>Have fun - try doing something crazy and unusual. Building a language
+like everyone else always has is much less fun than trying something a
+little crazy or off the wall and seeing how it turns out. If you get
+stuck or want to talk about it, feel free to email the <a class="reference external" href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">llvmdev mailing
+list</a>: it has lots
+of people who are interested in languages and are often willing to help
+out.</p>
+<p>Before we end this tutorial, I want to talk about some “tips and tricks”
+for generating LLVM IR. These are some of the more subtle things that
+may not be obvious, but are very useful if you want to take advantage of
+LLVM’s capabilities.</p>
+</div>
+<div class="section" id="properties-of-the-llvm-ir">
+<h2><a class="toc-backref" href="#id3">8.2. Properties of the LLVM IR</a><a class="headerlink" href="#properties-of-the-llvm-ir" title="Permalink to this headline">¶</a></h2>
+<p>We have a couple of common questions about code in the LLVM IR form - let’s
+just get these out of the way right now, shall we?</p>
+<div class="section" id="target-independence">
+<h3><a class="toc-backref" href="#id4">8.2.1. Target Independence</a><a class="headerlink" href="#target-independence" title="Permalink to this headline">¶</a></h3>
+<p>Kaleidoscope is an example of a “portable language”: any program written
+in Kaleidoscope will work the same way on any target that it runs on.
+Many other languages have this property, e.g. Lisp, Java, Haskell,
+JavaScript, Python, etc. (note that while these languages are portable,
+not all their libraries are).</p>
+<p>One nice aspect of LLVM is that it is often capable of preserving target
+independence in the IR: you can take the LLVM IR for a
+Kaleidoscope-compiled program and run it on any target that LLVM
+supports, even emitting C code and compiling that on targets that LLVM
+doesn’t support natively. You can trivially tell that the Kaleidoscope
+compiler generates target-independent code because it never queries for
+any target-specific information when generating code.</p>
+<p>The fact that LLVM provides a compact, target-independent
+representation for code gets a lot of people excited. Unfortunately,
+these people are usually thinking about C or a language from the C
+family when they are asking questions about language portability. I say
+“unfortunately”, because there is really no way to make (fully general)
+C code portable, other than shipping the source code around (and of
+course, C source code is not actually portable in general either - ever
+port a really old application from 32 to 64 bits?).</p>
+<p>The problem with C (again, in its full generality) is that it is heavily
+laden with target-specific assumptions. As one simple example, the
+preprocessor often destructively removes target-independence from the
+code when it processes the input text:</p>
+<div class="highlight-c"><div class="highlight"><pre><span class="cp">#ifdef __i386__</span>
+ <span class="kt">int</span> <span class="n">X</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
+<span class="cp">#else</span>
+ <span class="kt">int</span> <span class="n">X</span> <span class="o">=</span> <span class="mi">42</span><span class="p">;</span>
+<span class="cp">#endif</span>
+</pre></div>
+</div>
+<p>While it is possible to engineer more and more complex solutions to
+problems like this, it cannot be solved in full generality in a way that
+is better than shipping the actual source code.</p>
+<p>That said, there are interesting subsets of C that can be made portable.
+If you are willing to fix primitive types to a fixed size (say int =
+32 bits, and long = 64 bits), don’t care about ABI compatibility with
+existing binaries, and are willing to give up some other minor features,
+you can have portable code. This can make sense for specialized domains
+such as an in-kernel language.</p>
+</div>
+<div class="section" id="safety-guarantees">
+<h3><a class="toc-backref" href="#id5">8.2.2. Safety Guarantees</a><a class="headerlink" href="#safety-guarantees" title="Permalink to this headline">¶</a></h3>
+<p>Many of the languages above are also “safe” languages: it is impossible
+for a program written in Java to corrupt its address space and crash the
+process (assuming the JVM has no bugs). Safety is an interesting
+property that requires a combination of language design, runtime
+support, and often operating system support.</p>
+<p>It is certainly possible to implement a safe language in LLVM, but LLVM
+IR does not itself guarantee safety. The LLVM IR allows unsafe pointer
+casts, use-after-free bugs, buffer overruns, and a variety of other
+problems. Safety needs to be implemented as a layer on top of LLVM and,
+conveniently, several groups have investigated this. Ask on the <a class="reference external" href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">llvmdev
+mailing list</a> if
+you are interested in more details.</p>
+</div>
+<div class="section" id="language-specific-optimizations">
+<h3><a class="toc-backref" href="#id6">8.2.3. Language-Specific Optimizations</a><a class="headerlink" href="#language-specific-optimizations" title="Permalink to this headline">¶</a></h3>
+<p>One thing about LLVM that turns off many people is that it does not
+solve all the world’s problems in one system (sorry ‘world hunger’,
+someone else will have to solve you some other day). One specific
+complaint is that people perceive LLVM as being incapable of performing
+high-level language-specific optimization: LLVM “loses too much
+information”.</p>
+<p>Unfortunately, this is really not the place to give you a full and
+unified version of “Chris Lattner’s theory of compiler design”. Instead,
+I’ll make a few observations:</p>
+<p>First, you’re right that LLVM does lose information. For example, as of
+this writing, there is no way to distinguish in the LLVM IR whether an
+SSA-value came from a C “int” or a C “long” on an ILP32 machine (other
+than debug info). Both get compiled down to an ‘i32’ value and the
+information about what it came from is lost. The more general issue
+here is that the LLVM type system uses “structural equivalence” instead
+of “name equivalence”. Another place this surprises people is if you
+have two types in a high-level language that have the same structure
+(e.g. two different structs that have a single int field): these types
+will compile down into a single LLVM type and it will be impossible to
+tell what it came from.</p>
+<p>Second, while LLVM does lose information, LLVM is not a fixed target: we
+continue to enhance and improve it in many different ways. In addition
+to adding new features (LLVM did not always support exceptions or debug
+info), we also extend the IR to capture important information for
+optimization (e.g. whether an argument is sign or zero extended,
+information about pointers aliasing, etc). Many of the enhancements are
+user-driven: people want LLVM to include some specific feature, so they
+go ahead and extend it.</p>
+<p>Third, it is <em>possible and easy</em> to add language-specific optimizations,
+and you have a number of choices in how to do it. As one trivial
+example, it is easy to add language-specific optimization passes that
+“know” things about code compiled for a language. In the case of the C
+family, there is an optimization pass that “knows” about the standard C
+library functions. If you call “exit(0)” in main(), it knows that it is
+safe to optimize that into “return 0;” because C specifies what the
+‘exit’ function does.</p>
+<p>In addition to simple library knowledge, it is possible to embed a
+variety of other language-specific information into the LLVM IR. If you
+have a specific need and run into a wall, please bring the topic up on
+the llvmdev list. At the very worst, you can always treat LLVM as if it
+were a “dumb code generator” and implement the high-level optimizations
+you desire in your front-end, on the language-specific AST.</p>
+</div>
+</div>
+<div class="section" id="tips-and-tricks">
+<h2><a class="toc-backref" href="#id7">8.3. Tips and Tricks</a><a class="headerlink" href="#tips-and-tricks" title="Permalink to this headline">¶</a></h2>
+<p>There is a variety of useful tips and tricks that you come to know after
+working on/with LLVM that aren’t obvious at first glance. Instead of
+letting everyone rediscover them, this section talks about some of these
+issues.</p>
+<div class="section" id="implementing-portable-offsetof-sizeof">
+<h3><a class="toc-backref" href="#id8">8.3.1. Implementing portable offsetof/sizeof</a><a class="headerlink" href="#implementing-portable-offsetof-sizeof" title="Permalink to this headline">¶</a></h3>
+<p>One interesting thing that comes up, if you are trying to keep the code
+generated by your compiler “target independent”, is that you often need
+to know the size of some LLVM type or the offset of some field in an
+LLVM structure. For example, you might need to pass the size of a type
+into a function that allocates memory.</p>
+<p>Unfortunately, this can vary widely across targets: for example the
+width of a pointer is trivially target-specific. However, there is a
+<a class="reference external" href="http://nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt">clever way to use the getelementptr
+instruction</a>
+that allows you to compute this in a portable way.</p>
+</div>
+<div class="section" id="garbage-collected-stack-frames">
+<h3><a class="toc-backref" href="#id9">8.3.2. Garbage Collected Stack Frames</a><a class="headerlink" href="#garbage-collected-stack-frames" title="Permalink to this headline">¶</a></h3>
+<p>Some languages want to explicitly manage their stack frames, often so
+that they are garbage collected or to allow easy implementation of
+closures. There are often better ways to implement these features than
+explicit stack frames, but <a class="reference external" href="http://nondot.org/sabre/LLVMNotes/ExplicitlyManagedStackFrames.txt">LLVM does support
+them,</a>
+if you want. It requires your front-end to convert the code into
+<a class="reference external" href="http://en.wikipedia.org/wiki/Continuation-passing_style">Continuation Passing
+Style</a> and
+the use of tail calls (which LLVM also supports).</p>
+</div>
+</div>
+</div>
+
+
+ </div>
+ </div>
+ <div class="clearer"></div>
+ </div>
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="../genindex.html" title="General Index"
+ >index</a></li>
+ <li class="right" >
+ <a href="../ReleaseNotes.html" title="LLVM 3.5 Release Notes"
+ >next</a> |</li>
+ <li class="right" >
+ <a href="OCamlLangImpl7.html" title="7. Kaleidoscope: Extending the Language: Mutable Variables"
+ >previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="../index.html">Documentation</a>»</li>
+
+ <li><a href="index.html" >LLVM Tutorial: Table of Contents</a> »</li>
+ </ul>
+ </div>
+ <div class="footer">
+ © Copyright 2003-2014, LLVM Project.
+ Last updated on 2015-01-13.
+ Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
+ </div>
+ </body>
+</html>
\ No newline at end of file
Added: www-releases/trunk/3.5.1/docs/tutorial/index.html
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/docs/tutorial/index.html?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/docs/tutorial/index.html (added)
+++ www-releases/trunk/3.5.1/docs/tutorial/index.html Tue Jan 13 16:55:20 2015
@@ -0,0 +1,179 @@
+
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+
+ <title>LLVM Tutorial: Table of Contents — LLVM 3.5 documentation</title>
+
+ <link rel="stylesheet" href="../_static/llvm-theme.css" type="text/css" />
+ <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT: '../',
+ VERSION: '3.5',
+ COLLAPSE_INDEX: false,
+ FILE_SUFFIX: '.html',
+ HAS_SOURCE: true
+ };
+ </script>
+ <script type="text/javascript" src="../_static/jquery.js"></script>
+ <script type="text/javascript" src="../_static/underscore.js"></script>
+ <script type="text/javascript" src="../_static/doctools.js"></script>
+ <link rel="top" title="LLVM 3.5 documentation" href="../index.html" />
+ <link rel="next" title="1. Kaleidoscope: Tutorial Introduction and the Lexer" href="LangImpl1.html" />
+ <link rel="prev" title="LLVM test-suite Makefile Guide" href="../TestSuiteMakefileGuide.html" />
+<style type="text/css">
+ table.right { float: right; margin-left: 20px; }
+ table.right td { border: 1px solid #ccc; }
+</style>
+
+ </head>
+ <body>
+<div class="logo">
+ <a href="../index.html">
+ <img src="../_static/logo.png"
+ alt="LLVM Logo" width="250" height="88"/></a>
+</div>
+
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="../genindex.html" title="General Index"
+ accesskey="I">index</a></li>
+ <li class="right" >
+ <a href="LangImpl1.html" title="1. Kaleidoscope: Tutorial Introduction and the Lexer"
+ accesskey="N">next</a> |</li>
+ <li class="right" >
+ <a href="../TestSuiteMakefileGuide.html" title="LLVM test-suite Makefile Guide"
+ accesskey="P">previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="../index.html">Documentation</a>»</li>
+
+ </ul>
+ </div>
+
+
+ <div class="document">
+ <div class="documentwrapper">
+ <div class="body">
+
+ <div class="section" id="llvm-tutorial-table-of-contents">
+<h1>LLVM Tutorial: Table of Contents<a class="headerlink" href="#llvm-tutorial-table-of-contents" title="Permalink to this headline">¶</a></h1>
+<div class="section" id="kaleidoscope-implementing-a-language-with-llvm">
+<h2>Kaleidoscope: Implementing a Language with LLVM<a class="headerlink" href="#kaleidoscope-implementing-a-language-with-llvm" title="Permalink to this headline">¶</a></h2>
+<div class="toctree-wrapper compound">
+<ul>
+<li class="toctree-l1"><a class="reference internal" href="LangImpl1.html">1. Kaleidoscope: Tutorial Introduction and the Lexer</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="LangImpl2.html">2. Kaleidoscope: Implementing a Parser and AST</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="LangImpl3.html">3. Kaleidoscope: Code generation to LLVM IR</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="LangImpl4.html">4. Kaleidoscope: Adding JIT and Optimizer Support</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="LangImpl5.html">5. Kaleidoscope: Extending the Language: Control Flow</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="LangImpl6.html">6. Kaleidoscope: Extending the Language: User-defined Operators</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="LangImpl7.html">7. Kaleidoscope: Extending the Language: Mutable Variables</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="LangImpl8.html">8. Kaleidoscope: Conclusion and other useful LLVM tidbits</a><ul class="simple">
+</ul>
+</li>
+</ul>
+</div>
+</div>
+<div class="section" id="kaleidoscope-implementing-a-language-with-llvm-in-objective-caml">
+<h2>Kaleidoscope: Implementing a Language with LLVM in Objective Caml<a class="headerlink" href="#kaleidoscope-implementing-a-language-with-llvm-in-objective-caml" title="Permalink to this headline">¶</a></h2>
+<div class="toctree-wrapper compound">
+<ul>
+<li class="toctree-l1"><a class="reference internal" href="OCamlLangImpl1.html">1. Kaleidoscope: Tutorial Introduction and the Lexer</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="OCamlLangImpl2.html">2. Kaleidoscope: Implementing a Parser and AST</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="OCamlLangImpl3.html">3. Kaleidoscope: Code generation to LLVM IR</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="OCamlLangImpl4.html">4. Kaleidoscope: Adding JIT and Optimizer Support</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="OCamlLangImpl5.html">5. Kaleidoscope: Extending the Language: Control Flow</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="OCamlLangImpl6.html">6. Kaleidoscope: Extending the Language: User-defined Operators</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="OCamlLangImpl7.html">7. Kaleidoscope: Extending the Language: Mutable Variables</a><ul class="simple">
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="OCamlLangImpl8.html">8. Kaleidoscope: Conclusion and other useful LLVM tidbits</a><ul class="simple">
+</ul>
+</li>
+</ul>
+</div>
+</div>
+<div class="section" id="external-tutorials">
+<h2>External Tutorials<a class="headerlink" href="#external-tutorials" title="Permalink to this headline">¶</a></h2>
+<dl class="docutils">
+<dt><a class="reference external" href="http://jonathan2251.github.com/lbd/">Tutorial: Creating an LLVM Backend for the Cpu0 Architecture</a></dt>
+<dd>A step-by-step tutorial for developing an LLVM backend. Under
+active development at <a class="reference external" href="https://github.com/Jonathan2251/lbd">https://github.com/Jonathan2251/lbd</a> (please
+contribute!).</dd>
+<dt><a class="reference external" href="http://www.embecosm.com/appnotes/ean10/ean10-howto-llvmas-1.0.html">Howto: Implementing LLVM Integrated Assembler</a></dt>
+<dd>A simple guide for how to implement an LLVM integrated assembler for an
+architecture.</dd>
+</dl>
+</div>
+<div class="section" id="advanced-topics">
+<h2>Advanced Topics<a class="headerlink" href="#advanced-topics" title="Permalink to this headline">¶</a></h2>
+<ol class="arabic simple">
+<li><a class="reference external" href="http://llvm.org/pubs/2004-09-22-LCPCLLVMTutorial.html">Writing an Optimization for LLVM</a></li>
+</ol>
+</div>
+</div>
+
+
+ </div>
+ </div>
+ <div class="clearer"></div>
+ </div>
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="../genindex.html" title="General Index"
+ >index</a></li>
+ <li class="right" >
+ <a href="LangImpl1.html" title="1. Kaleidoscope: Tutorial Introduction and the Lexer"
+ >next</a> |</li>
+ <li class="right" >
+ <a href="../TestSuiteMakefileGuide.html" title="LLVM test-suite Makefile Guide"
+ >previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="../index.html">Documentation</a>»</li>
+
+ </ul>
+ </div>
+ <div class="footer">
+ © Copyright 2003-2014, LLVM Project.
+ Last updated on 2015-01-13.
+ Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
+ </div>
+ </body>
+</html>
\ No newline at end of file
Added: www-releases/trunk/3.5.1/docs/yaml2obj.html
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/docs/yaml2obj.html?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/docs/yaml2obj.html (added)
+++ www-releases/trunk/3.5.1/docs/yaml2obj.html Tue Jan 13 16:55:20 2015
@@ -0,0 +1,307 @@
+
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+
+<html xmlns="http://www.w3.org/1999/xhtml">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+
+ <title>yaml2obj — LLVM 3.5 documentation</title>
+
+ <link rel="stylesheet" href="_static/llvm-theme.css" type="text/css" />
+ <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT: '',
+ VERSION: '3.5',
+ COLLAPSE_INDEX: false,
+ FILE_SUFFIX: '.html',
+ HAS_SOURCE: true
+ };
+ </script>
+ <script type="text/javascript" src="_static/jquery.js"></script>
+ <script type="text/javascript" src="_static/underscore.js"></script>
+ <script type="text/javascript" src="_static/doctools.js"></script>
+ <link rel="top" title="LLVM 3.5 documentation" href="index.html" />
+ <link rel="next" title="How to submit an LLVM bug report" href="HowToSubmitABug.html" />
+ <link rel="prev" title="How To Add Your Build Configuration To LLVM Buildbot Infrastructure" href="HowToAddABuilder.html" />
+<style type="text/css">
+ table.right { float: right; margin-left: 20px; }
+ table.right td { border: 1px solid #ccc; }
+</style>
+
+ </head>
+ <body>
+<div class="logo">
+ <a href="index.html">
+ <img src="_static/logo.png"
+ alt="LLVM Logo" width="250" height="88"/></a>
+</div>
+
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="genindex.html" title="General Index"
+ accesskey="I">index</a></li>
+ <li class="right" >
+ <a href="HowToSubmitABug.html" title="How to submit an LLVM bug report"
+ accesskey="N">next</a> |</li>
+ <li class="right" >
+ <a href="HowToAddABuilder.html" title="How To Add Your Build Configuration To LLVM Buildbot Infrastructure"
+ accesskey="P">previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="index.html">Documentation</a>»</li>
+
+ </ul>
+ </div>
+
+
+ <div class="document">
+ <div class="documentwrapper">
+ <div class="body">
+
+ <div class="section" id="yaml2obj">
+<h1>yaml2obj<a class="headerlink" href="#yaml2obj" title="Permalink to this headline">¶</a></h1>
+<p>yaml2obj takes a YAML description of an object file and converts it to a binary
+file.</p>
+<blockquote>
+<div>$ yaml2obj input-file</div></blockquote>
+<p>yaml2obj writes the binary to stdout.</p>
+<div class="section" id="coff-syntax">
+<h2>COFF Syntax<a class="headerlink" href="#coff-syntax" title="Permalink to this headline">¶</a></h2>
+<p>Here’s a sample COFF file.</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="l-Scalar-Plain">header</span><span class="p-Indicator">:</span>
+ <span class="l-Scalar-Plain">Machine</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">IMAGE_FILE_MACHINE_I386</span> <span class="c1"># (0x14C)</span>
+
+<span class="l-Scalar-Plain">sections</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">Name</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">.text</span>
+ <span class="l-Scalar-Plain">Characteristics</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span> <span class="nv">IMAGE_SCN_CNT_CODE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_16BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_EXECUTE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_READ</span>
+ <span class="p-Indicator">]</span> <span class="c1"># 0x60500020</span>
+ <span class="l-Scalar-Plain">SectionData</span><span class="p-Indicator">:</span>
+ <span class="s">"</span><span class="se">\x83\xEC\x0C\xC7\x44\x24\x08\x00\x00\x00\x00\xC7\x04\x24\x00\x00\x00\x00\xE8\x00\x00\x00\x00\xE8\x00\x00\x00\x00\x8B\x44\x24\x08\x83\xC4\x0C\xC3</span><span class="s">"</span> <span class="c1"># |....D$.......$...............D$.....|</span>
+
+<span class="l-Scalar-Plain">symbols</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">Name</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">.text</span>
+ <span class="l-Scalar-Plain">Value</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">0</span>
+ <span class="l-Scalar-Plain">SectionNumber</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">1</span>
+ <span class="l-Scalar-Plain">SimpleType</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">IMAGE_SYM_TYPE_NULL</span> <span class="c1"># (0)</span>
+ <span class="l-Scalar-Plain">ComplexType</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">IMAGE_SYM_DTYPE_NULL</span> <span class="c1"># (0)</span>
+ <span class="l-Scalar-Plain">StorageClass</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">IMAGE_SYM_CLASS_STATIC</span> <span class="c1"># (3)</span>
+ <span class="l-Scalar-Plain">NumberOfAuxSymbols</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">1</span>
+ <span class="l-Scalar-Plain">AuxiliaryData</span><span class="p-Indicator">:</span>
+ <span class="s">"</span><span class="se">\x24\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00</span><span class="s">"</span> <span class="c1"># |$.................|</span>
+
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">Name</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">_main</span>
+ <span class="l-Scalar-Plain">Value</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">0</span>
+ <span class="l-Scalar-Plain">SectionNumber</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">1</span>
+ <span class="l-Scalar-Plain">SimpleType</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">IMAGE_SYM_TYPE_NULL</span> <span class="c1"># (0)</span>
+ <span class="l-Scalar-Plain">ComplexType</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">IMAGE_SYM_DTYPE_NULL</span> <span class="c1"># (0)</span>
+ <span class="l-Scalar-Plain">StorageClass</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">IMAGE_SYM_CLASS_EXTERNAL</span> <span class="c1"># (2)</span>
+</pre></div>
+</div>
+<p>Here’s a simplified <a class="reference external" href="http://www.kuwata-lab.com/kwalify/ruby/users-guide.html">Kwalify</a> schema with an extension to allow alternate types.</p>
+<div class="highlight-yaml"><div class="highlight"><pre><span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">map</span>
+ <span class="l-Scalar-Plain">mapping</span><span class="p-Indicator">:</span>
+ <span class="l-Scalar-Plain">header</span><span class="p-Indicator">:</span>
+ <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">map</span>
+ <span class="l-Scalar-Plain">mapping</span><span class="p-Indicator">:</span>
+ <span class="l-Scalar-Plain">Machine</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">str</span><span class="p-Indicator">,</span> <span class="nv">enum</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">[</span> <span class="nv">IMAGE_FILE_MACHINE_UNKNOWN</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_AM33</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_AMD64</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_ARM</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_ARMNT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_EBC</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_I386</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_IA64</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_M32R</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_MIPS16</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_MIPSFPU</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_MIPSFPU16</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_POWERPC</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_POWERPCFP</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_R4000</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_SH3</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_SH3DSP</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_SH4</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_SH5</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_THUMB</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_MACHINE_WCEMIPSV2</span>
+ <span class="p-Indicator">]}</span>
+ <span class="p-Indicator">,</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">int</span><span class="p-Indicator">}</span>
+ <span class="p-Indicator">]</span>
+ <span class="l-Scalar-Plain">Characteristics</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">seq</span>
+ <span class="l-Scalar-Plain">sequence</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">str</span>
+ <span class="l-Scalar-Plain">enum</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span> <span class="nv">IMAGE_FILE_RELOCS_STRIPPED</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_EXECUTABLE_IMAGE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_LINE_NUMS_STRIPPED</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_LOCAL_SYMS_STRIPPED</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_AGGRESSIVE_WS_TRIM</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_LARGE_ADDRESS_AWARE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_BYTES_REVERSED_LO</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_32BIT_MACHINE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_DEBUG_STRIPPED</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_NET_RUN_FROM_SWAP</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_SYSTEM</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_DLL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_UP_SYSTEM_ONLY</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_FILE_BYTES_REVERSED_HI</span>
+ <span class="p-Indicator">]</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">int</span>
+ <span class="l-Scalar-Plain">sections</span><span class="p-Indicator">:</span>
+ <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">seq</span>
+ <span class="l-Scalar-Plain">sequence</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">map</span>
+ <span class="l-Scalar-Plain">mapping</span><span class="p-Indicator">:</span>
+ <span class="l-Scalar-Plain">Name</span><span class="p-Indicator">:</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">str</span><span class="p-Indicator">}</span>
+ <span class="l-Scalar-Plain">Characteristics</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">seq</span>
+ <span class="l-Scalar-Plain">sequence</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">str</span>
+ <span class="l-Scalar-Plain">enum</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span> <span class="nv">IMAGE_SCN_TYPE_NO_PAD</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_CNT_CODE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_CNT_INITIALIZED_DATA</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_CNT_UNINITIALIZED_DATA</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_LNK_OTHER</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_LNK_INFO</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_LNK_REMOVE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_LNK_COMDAT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_GPREL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_PURGEABLE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_16BIT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_LOCKED</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_PRELOAD</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_1BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_2BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_4BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_8BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_16BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_32BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_64BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_128BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_256BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_512BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_1024BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_2048BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_4096BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_ALIGN_8192BYTES</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_LNK_NRELOC_OVFL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_DISCARDABLE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_NOT_CACHED</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_NOT_PAGED</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_SHARED</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_EXECUTE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_READ</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SCN_MEM_WRITE</span>
+ <span class="p-Indicator">]</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">int</span>
+ <span class="l-Scalar-Plain">SectionData</span><span class="p-Indicator">:</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">str</span><span class="p-Indicator">}</span>
+ <span class="l-Scalar-Plain">symbols</span><span class="p-Indicator">:</span>
+ <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">seq</span>
+ <span class="l-Scalar-Plain">sequence</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">type</span><span class="p-Indicator">:</span> <span class="l-Scalar-Plain">map</span>
+ <span class="l-Scalar-Plain">mapping</span><span class="p-Indicator">:</span>
+ <span class="l-Scalar-Plain">Name</span><span class="p-Indicator">:</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">str</span><span class="p-Indicator">}</span>
+ <span class="l-Scalar-Plain">Value</span><span class="p-Indicator">:</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">int</span><span class="p-Indicator">}</span>
+ <span class="l-Scalar-Plain">SectionNumber</span><span class="p-Indicator">:</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">int</span><span class="p-Indicator">}</span>
+ <span class="l-Scalar-Plain">SimpleType</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">str</span><span class="p-Indicator">,</span> <span class="nv">enum</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span> <span class="nv">IMAGE_SYM_TYPE_NULL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_VOID</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_CHAR</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_SHORT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_INT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_LONG</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_FLOAT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_DOUBLE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_STRUCT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_UNION</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_ENUM</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_MOE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_BYTE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_WORD</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_UINT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_TYPE_DWORD</span>
+ <span class="p-Indicator">]}</span>
+ <span class="p-Indicator">,</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">int</span><span class="p-Indicator">}</span>
+ <span class="p-Indicator">]</span>
+ <span class="l-Scalar-Plain">ComplexType</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">str</span><span class="p-Indicator">,</span> <span class="nv">enum</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span> <span class="nv">IMAGE_SYM_DTYPE_NULL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_DTYPE_POINTER</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_DTYPE_FUNCTION</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_DTYPE_ARRAY</span>
+ <span class="p-Indicator">]}</span>
+ <span class="p-Indicator">,</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">int</span><span class="p-Indicator">}</span>
+ <span class="p-Indicator">]</span>
+ <span class="l-Scalar-Plain">StorageClass</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">str</span><span class="p-Indicator">,</span> <span class="nv">enum</span><span class="p-Indicator">:</span>
+ <span class="p-Indicator">[</span> <span class="nv">IMAGE_SYM_CLASS_END_OF_FUNCTION</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_NULL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_AUTOMATIC</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_EXTERNAL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_STATIC</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_REGISTER</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_EXTERNAL_DEF</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_LABEL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_UNDEFINED_LABEL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_MEMBER_OF_STRUCT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_ARGUMENT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_STRUCT_TAG</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_MEMBER_OF_UNION</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_UNION_TAG</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_TYPE_DEFINITION</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_UNDEFINED_STATIC</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_ENUM_TAG</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_MEMBER_OF_ENUM</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_REGISTER_PARAM</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_BIT_FIELD</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_BLOCK</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_FUNCTION</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_END_OF_STRUCT</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_FILE</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_SECTION</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_WEAK_EXTERNAL</span>
+ <span class="p-Indicator">,</span> <span class="nv">IMAGE_SYM_CLASS_CLR_TOKEN</span>
+ <span class="p-Indicator">]}</span>
+ <span class="p-Indicator">,</span> <span class="p-Indicator">{</span><span class="nv">type</span><span class="p-Indicator">:</span> <span class="nv">int</span><span class="p-Indicator">}</span>
+ <span class="p-Indicator">]</span>
+</pre></div>
+</div>
+</div>
+</div>
+
+
+ </div>
+ </div>
+ <div class="clearer"></div>
+ </div>
+ <div class="related">
+ <h3>Navigation</h3>
+ <ul>
+ <li class="right" style="margin-right: 10px">
+ <a href="genindex.html" title="General Index"
+ >index</a></li>
+ <li class="right" >
+ <a href="HowToSubmitABug.html" title="How to submit an LLVM bug report"
+ >next</a> |</li>
+ <li class="right" >
+ <a href="HowToAddABuilder.html" title="How To Add Your Build Configuration To LLVM Buildbot Infrastructure"
+ >previous</a> |</li>
+ <li><a href="http://llvm.org/">LLVM Home</a> | </li>
+ <li><a href="index.html">Documentation</a>»</li>
+
+ </ul>
+ </div>
+ <div class="footer">
+ © Copyright 2003-2014, LLVM Project.
+ Last updated on 2015-01-13.
+ Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.1.3.
+ </div>
+ </body>
+</html>
\ No newline at end of file
Added: www-releases/trunk/3.5.1/tools/clang/docs/AddressSanitizer.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/AddressSanitizer.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/AddressSanitizer.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/AddressSanitizer.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,216 @@
+================
+AddressSanitizer
+================
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+AddressSanitizer is a fast memory error detector. It consists of a compiler
+instrumentation module and a run-time library. The tool can detect the
+following types of bugs:
+
+* Out-of-bounds accesses to heap, stack and globals
+* Use-after-free
+* Use-after-return (to some extent)
+* Double-free, invalid free
+* Memory leaks (experimental)
+
+The typical slowdown introduced by AddressSanitizer is **2x**.
+
+How to build
+============
+
+Follow the `clang build instructions <../get_started.html>`_. The CMake build
+is supported.
+
+Usage
+=====
+
+Compile and link your program with the ``-fsanitize=address`` flag. The
+AddressSanitizer run-time library should be linked to the final executable, so
+make sure to use ``clang`` (not ``ld``) for the final link step. When linking
+shared libraries, the AddressSanitizer run-time is not linked, so
+``-Wl,-z,defs`` may cause link errors (don't use it with AddressSanitizer). To
+get reasonable performance, add ``-O1`` or higher. To get nicer stack traces
+in error messages, add ``-fno-omit-frame-pointer``. To get perfect stack traces
+you may need to disable inlining (just use ``-O1``) and tail call elimination
+(``-fno-optimize-sibling-calls``).
+
+.. code-block:: console
+
+ % cat example_UseAfterFree.cc
+ int main(int argc, char **argv) {
+ int *array = new int[100];
+ delete [] array;
+ return array[argc]; // BOOM
+ }
+
+ # Compile and link
+ % clang -O1 -g -fsanitize=address -fno-omit-frame-pointer example_UseAfterFree.cc
+
+or:
+
+.. code-block:: console
+
+ # Compile
+ % clang -O1 -g -fsanitize=address -fno-omit-frame-pointer -c example_UseAfterFree.cc
+ # Link
+ % clang -g -fsanitize=address example_UseAfterFree.o
+
+If a bug is detected, the program will print an error message to stderr and
+exit with a non-zero exit code. To make AddressSanitizer symbolize its output
+you need to set the ``ASAN_SYMBOLIZER_PATH`` environment variable to point to
+the ``llvm-symbolizer`` binary (or make sure ``llvm-symbolizer`` is in your
+``$PATH``):
+
+.. code-block:: console
+
+ % ASAN_SYMBOLIZER_PATH=/usr/local/bin/llvm-symbolizer ./a.out
+ ==9442== ERROR: AddressSanitizer heap-use-after-free on address 0x7f7ddab8c084 at pc 0x403c8c bp 0x7fff87fb82d0 sp 0x7fff87fb82c8
+ READ of size 4 at 0x7f7ddab8c084 thread T0
+ #0 0x403c8c in main example_UseAfterFree.cc:4
+ #1 0x7f7ddabcac4d in __libc_start_main ??:0
+ 0x7f7ddab8c084 is located 4 bytes inside of 400-byte region [0x7f7ddab8c080,0x7f7ddab8c210)
+ freed by thread T0 here:
+ #0 0x404704 in operator delete[](void*) ??:0
+ #1 0x403c53 in main example_UseAfterFree.cc:4
+ #2 0x7f7ddabcac4d in __libc_start_main ??:0
+ previously allocated by thread T0 here:
+ #0 0x404544 in operator new[](unsigned long) ??:0
+ #1 0x403c43 in main example_UseAfterFree.cc:2
+ #2 0x7f7ddabcac4d in __libc_start_main ??:0
+ ==9442== ABORTING
+
+If that does not work for you (e.g. your process is sandboxed), you can use a
+separate script to symbolize the result offline (online symbolization can be
+forcibly disabled by setting ``ASAN_OPTIONS=symbolize=0``):
+
+.. code-block:: console
+
+ % ASAN_OPTIONS=symbolize=0 ./a.out 2> log
+ % projects/compiler-rt/lib/asan/scripts/asan_symbolize.py / < log | c++filt
+ ==9442== ERROR: AddressSanitizer heap-use-after-free on address 0x7f7ddab8c084 at pc 0x403c8c bp 0x7fff87fb82d0 sp 0x7fff87fb82c8
+ READ of size 4 at 0x7f7ddab8c084 thread T0
+ #0 0x403c8c in main example_UseAfterFree.cc:4
+ #1 0x7f7ddabcac4d in __libc_start_main ??:0
+ ...
+
+Note that on OS X you may need to run ``dsymutil`` on your binary to have the
+file\:line info in the AddressSanitizer reports.
+
+AddressSanitizer exits on the first detected error. This is by design.
+One reason is that it makes the generated code smaller and faster (both by
+~5%). Another is that it makes fixing bugs unavoidable. With Valgrind,
+users often treat warnings as false positives (which they are not) and
+don't fix them.
+
+``__has_feature(address_sanitizer)``
+------------------------------------
+
+In some cases one may need to execute different code depending on whether
+AddressSanitizer is enabled.
+:ref:`\_\_has\_feature <langext-__has_feature-__has_extension>` can be used for
+this purpose.
+
+.. code-block:: c
+
+ #if defined(__has_feature)
+ # if __has_feature(address_sanitizer)
+ // code that builds only under AddressSanitizer
+ # endif
+ #endif
+
+``__attribute__((no_sanitize_address))``
+-----------------------------------------------
+
+Some code should not be instrumented by AddressSanitizer. One may use the
+function attribute
+:ref:`no_sanitize_address <langext-address_sanitizer>`
+(or a deprecated synonym, ``no_address_safety_analysis``)
+to disable instrumentation of a particular function. This attribute may not be
+supported by other compilers, so we suggest using it together with
+``__has_feature(address_sanitizer)``.
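+
+A minimal sketch of that combination might look as follows. The macro name
+``NO_SANITIZE_ADDRESS`` and the function ``read_raw`` are hypothetical and are
+used here only for illustration; a compiler that lacks ``__has_feature``, or a
+build without AddressSanitizer, simply sees an empty macro.
+
+.. code-block:: c
+
+    /* Hypothetical helper macro: expands to the attribute only when the
+       compiler reports that AddressSanitizer is enabled. */
+    #if defined(__has_feature)
+    #  if __has_feature(address_sanitizer)
+    #    define NO_SANITIZE_ADDRESS __attribute__((no_sanitize_address))
+    #  endif
+    #endif
+    #ifndef NO_SANITIZE_ADDRESS
+    #  define NO_SANITIZE_ADDRESS /* expands to nothing */
+    #endif
+
+    /* Hypothetical function whose memory accesses are left uninstrumented
+       when the program is built with -fsanitize=address. */
+    NO_SANITIZE_ADDRESS
+    int read_raw(const int *p) {
+      return *p;
+    }
+
+Keeping the attribute behind a macro means the same source should still build
+unchanged with compilers that do not recognize it.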
+
+Initialization order checking
+-----------------------------
+
+AddressSanitizer can optionally detect dynamic initialization order problems,
+when initialization of globals defined in one translation unit uses
+globals defined in another translation unit. To enable this check at runtime,
+set the environment variable ``ASAN_OPTIONS=check_initialization_order=1``.
+
+Blacklist
+---------
+
+AddressSanitizer supports the ``src`` and ``fun`` entity types in
+:doc:`SanitizerSpecialCaseList`, which can be used to suppress error reports
+in the specified source files or functions. Additionally, AddressSanitizer
+introduces ``global`` and ``type`` entity types that can be used to
+suppress error reports for out-of-bounds accesses to globals with certain
+names and types (you may only specify class or struct types).
+
+You may use an ``init`` category to suppress reports about initialization-order
+problems happening in certain source files or with certain global variables.
+
+.. code-block:: bash
+
+ # Suppress error reports for code in a file or in a function:
+ src:bad_file.cpp
+ # Ignore all functions with names containing MyFooBar:
+ fun:*MyFooBar*
+ # Disable out-of-bound checks for global:
+ global:bad_array
+ # Disable out-of-bound checks for global instances of a given class ...
+ type:class.Namespace::BadClassName
+ # ... or a given struct. Use wildcard to deal with anonymous namespace.
+ type:struct.Namespace2::*::BadStructName
+ # Disable initialization-order checks for globals:
+ global:bad_init_global=init
+ type:*BadInitClassSubstring*=init
+ src:bad/init/files/*=init
+
+Memory leak detection
+---------------------
+
+For the experimental memory leak detector in AddressSanitizer, see
+:doc:`LeakSanitizer`.
+
+Supported Platforms
+===================
+
+AddressSanitizer is supported on
+
+* Linux i386/x86\_64 (tested on Ubuntu 12.04);
+* Mac OS X 10.6 - 10.9 (i386/x86\_64);
+* Android ARM.
+
+Ports to various other platforms are in progress.
+
+Limitations
+===========
+
+* AddressSanitizer uses more real memory than a native run. The exact overhead
+ depends on the allocation sizes: the smaller the allocations, the bigger the
+ overhead.
+* AddressSanitizer uses more stack memory. We have seen up to a 3x increase.
+* On 64-bit platforms AddressSanitizer maps (but does not reserve) 16+ terabytes
+ of virtual address space. This means that tools like ``ulimit`` may not work
+ as expected.
+* Static linking is not supported.
+
+Current Status
+==============
+
+AddressSanitizer is fully functional on supported platforms starting from LLVM
+3.1. The test suite is integrated into the CMake build and can be run with the
+``make check-asan`` command.
+
+More Information
+================
+
+`http://code.google.com/p/address-sanitizer <http://code.google.com/p/address-sanitizer/>`_
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/AttributeReference.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/AttributeReference.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/AttributeReference.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/AttributeReference.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,1116 @@
+..
+ -------------------------------------------------------------------
+ NOTE: This file is automatically generated by running clang-tblgen
+ -gen-attr-docs. Do not edit this file by hand!!
+ -------------------------------------------------------------------
+
+===================
+Attributes in Clang
+===================
+.. contents::
+ :local:
+
+Introduction
+============
+
+This page lists the attributes currently supported by Clang.
+
+Function Attributes
+===================
+
+
+interrupt
+---------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Clang supports the GNU style ``__attribute__((interrupt("TYPE")))`` attribute on
+ARM targets. This attribute may be attached to a function definition and
+instructs the backend to generate appropriate function entry/exit code so that
+it can be used directly as an interrupt service routine.
+
+The parameter passed to the interrupt attribute is optional, but if
+provided it must be a string literal with one of the following values: "IRQ",
+"FIQ", "SWI", "ABORT", "UNDEF".
+
+The semantics are as follows:
+
+- If the function is AAPCS, Clang instructs the backend to realign the stack to
+ 8 bytes on entry. This is a general requirement of the AAPCS at public
+ interfaces, but may not hold when an exception is taken. Doing this allows
+ other AAPCS functions to be called.
+- If the CPU is M-class this is all that needs to be done since the architecture
+ itself is designed in such a way that functions obeying the normal AAPCS ABI
+ constraints are valid exception handlers.
+- If the CPU is not M-class, the prologue and epilogue are modified to save all
+ non-banked registers that are used, so that upon return the user-mode state
+ will not be corrupted. Note that to avoid unnecessary overhead, only
+ general-purpose (integer) registers are saved in this way. If VFP operations
+ are needed, that state must be saved manually.
+
+ Specifically, interrupt kinds other than "FIQ" will save all core registers
+ except "lr" and "sp". "FIQ" interrupts will save r0-r7.
+- If the CPU is not M-class, the return instruction is changed to one of the
+ canonical sequences permitted by the architecture for exception return. Where
+ possible the function itself will make the necessary "lr" adjustments so that
+ the "preferred return address" is selected.
+
+ Unfortunately the compiler is unable to make this guarantee for an "UNDEF"
+ handler, where the offset from "lr" to the preferred return address depends on
+ the execution state of the code which generated the exception. In this case
+ a sequence equivalent to "movs pc, lr" will be used.
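+
+As an illustration, a handler might be declared as follows. This is a sketch
+only: the ``IRQ_HANDLER`` macro, the ``tick_count`` variable and the
+``timer_isr`` function are hypothetical names, and the ``__arm__`` guard is an
+assumption that lets the same file build on other targets, where the macro
+expands to nothing:
+
+.. code-block:: c
+
+ #if defined(__arm__)
+ #  define IRQ_HANDLER __attribute__((interrupt("IRQ")))
+ #else
+ #  define IRQ_HANDLER /* not an ARM target: plain function */
+ #endif
+
+ /* Counter shared with non-interrupt code, hence volatile. */
+ static volatile unsigned tick_count;
+
+ /* On ARM the backend emits the entry/exit code described above. */
+ IRQ_HANDLER void timer_isr(void) {
+   tick_count = tick_count + 1;
+ }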
+
+
+acquire_capability (acquire_shared_capability, clang::acquire_capability, clang::acquire_shared_capability)
+-----------------------------------------------------------------------------------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+Marks a function as acquiring a capability.
+
+
+assert_capability (assert_shared_capability, clang::assert_capability, clang::assert_shared_capability)
+-------------------------------------------------------------------------------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+Marks a function that dynamically tests whether a capability is held, and halts
+the program if it is not held.
+
+
+availability
+------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+The ``availability`` attribute can be placed on declarations to describe the
+lifecycle of that declaration relative to operating system versions. Consider
+the function declaration for a hypothetical function ``f``:
+
+.. code-block:: c++
+
+ void f(void) __attribute__((availability(macosx,introduced=10.4,deprecated=10.6,obsoleted=10.7)));
+
+The availability attribute states that ``f`` was introduced in Mac OS X 10.4,
+deprecated in Mac OS X 10.6, and obsoleted in Mac OS X 10.7. This information
+is used by Clang to determine when it is safe to use ``f``: for example, if
+Clang is instructed to compile code for Mac OS X 10.5, a call to ``f()``
+succeeds. If Clang is instructed to compile code for Mac OS X 10.6, the call
+succeeds but Clang emits a warning specifying that the function is deprecated.
+Finally, if Clang is instructed to compile code for Mac OS X 10.7, the call
+fails because ``f()`` is no longer available.
+
+The availability attribute is a comma-separated list starting with the
+platform name and then including clauses specifying important milestones in the
+declaration's lifetime (in any order) along with additional information. Those
+clauses can be:
+
+introduced=\ *version*
+ The first version in which this declaration was introduced.
+
+deprecated=\ *version*
+ The first version in which this declaration was deprecated, meaning that
+ users should migrate away from this API.
+
+obsoleted=\ *version*
+ The first version in which this declaration was obsoleted, meaning that it
+ was removed completely and can no longer be used.
+
+unavailable
+ This declaration is never available on this platform.
+
+message=\ *string-literal*
+ Additional message text that Clang will provide when emitting a warning or
+ error about use of a deprecated or obsoleted declaration. Useful to direct
+ users to replacement APIs.
+
+Multiple availability attributes can be placed on a declaration, which may
+correspond to different platforms. Only the availability attribute with the
+platform corresponding to the target platform will be used; any others will be
+ignored. If no availability attribute specifies availability for the current
+target platform, the availability attributes are ignored. Supported platforms
+are:
+
+``ios``
+ Apple's iOS operating system. The minimum deployment target is specified by
+ the ``-mios-version-min=*version*`` or ``-miphoneos-version-min=*version*``
+ command-line arguments.
+
+``macosx``
+ Apple's Mac OS X operating system. The minimum deployment target is
+ specified by the ``-mmacosx-version-min=*version*`` command-line argument.
+
+A declaration can be used even when deploying back to a platform version prior
+to when the declaration was introduced. When this happens, the declaration is
+`weakly linked
+<https://developer.apple.com/library/mac/#documentation/MacOSX/Conceptual/BPFrameworks/Concepts/WeakLinking.html>`_,
+as if the ``weak_import`` attribute were added to the declaration. A
+weakly-linked declaration may or may not be present at run-time, and a program
+can determine whether the declaration is present by checking whether the
+address of that declaration is non-NULL.
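+
+The presence check can be sketched as below. The names ``new_api`` and
+``call_new_api_if_present`` are hypothetical. The sketch uses the generic GNU
+``weak`` attribute on an undefined symbol, an assumption that suits ELF
+toolchains such as Linux, to stand in for the weak linking that
+``availability`` implies on Apple platforms:
+
+.. code-block:: c
+
+ #include <stddef.h>
+
+ /* Hypothetical function that may be missing at run time. */
+ extern void new_api(void) __attribute__((weak));
+
+ /* Returns 1 if new_api was present and called, 0 otherwise. */
+ int call_new_api_if_present(void) {
+   if (new_api != NULL) {
+     new_api();
+     return 1;
+   }
+   return 0;
+ }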
+
+If there are multiple declarations of the same entity, the availability
+attributes must either match on a per-platform basis or later
+declarations must not have availability attributes for that
+platform. For example:
+
+.. code-block:: c
+
+ void g(void) __attribute__((availability(macosx,introduced=10.4)));
+ void g(void) __attribute__((availability(macosx,introduced=10.4))); // okay, matches
+ void g(void) __attribute__((availability(ios,introduced=4.0))); // okay, adds a new platform
+ void g(void); // okay, inherits both macosx and ios availability from above.
+ void g(void) __attribute__((availability(macosx,introduced=10.5))); // error: mismatch
+
+When one method overrides another, the overriding method can be more widely
+available than the overridden method, e.g.:
+
+.. code-block:: objc
+
+ @interface A
+ - (id)method __attribute__((availability(macosx,introduced=10.4)));
+ - (id)method2 __attribute__((availability(macosx,introduced=10.4)));
+ @end
+
+ @interface B : A
+ - (id)method __attribute__((availability(macosx,introduced=10.3))); // okay: method moved into base class later
+ - (id)method2 __attribute__((availability(macosx,introduced=10.5))); // error: this method was available via the base class in 10.4
+ @end
+
+
+_Noreturn
+---------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "","","","X"
+
+A function declared as ``_Noreturn`` shall not return to its caller. The
+compiler will generate a diagnostic for a function declared as ``_Noreturn``
+that appears to be capable of returning to its caller.
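+
+A short sketch of typical usage, assuming a C11 compiler. The functions
+``die`` and ``checked_div`` are hypothetical names chosen for illustration:
+
+.. code-block:: c
+
+ #include <stdio.h>
+ #include <stdlib.h>
+
+ /* Never returns: prints the message and terminates the process. */
+ _Noreturn void die(const char *msg) {
+   fputs(msg, stderr);
+   exit(EXIT_FAILURE);
+ }
+
+ /* Because die() is _Noreturn, the error path needs no return value. */
+ int checked_div(int a, int b) {
+   if (b == 0)
+     die("division by zero\n");
+   return a / b;
+ }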
+
+
+noreturn
+--------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "","X","",""
+
+A function declared as ``[[noreturn]]`` shall not return to its caller. The
+compiler will generate a diagnostic for a function declared as ``[[noreturn]]``
+that appears to be capable of returning to its caller.
+
+
+carries_dependency
+------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+The ``carries_dependency`` attribute specifies dependency propagation into and
+out of functions.
+
+When specified on a function or Objective-C method, the ``carries_dependency``
+attribute means that the return value carries a dependency out of the function,
+so that the implementation need not constrain ordering upon return from that
+function. Implementations of the function and its caller may choose to preserve
+dependencies instead of emitting memory ordering instructions such as fences.
+
+Note, this attribute does not change the meaning of the program, but may result
+in generation of more efficient code.
+
+
+enable_if
+---------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+.. Note:: Some features of this attribute are experimental. The meaning of
+ multiple enable_if attributes on a single declaration is subject to change in
+ a future version of clang. Also, the ABI is not standardized and the name
+ mangling may change in future versions. To avoid that, use asm labels.
+
+The ``enable_if`` attribute can be placed on function declarations to control
+which overload is selected based on the values of the function's arguments.
+When combined with the ``overloadable`` attribute, this feature is also
+available in C.
+
+.. code-block:: c++
+
+ int isdigit(int c);
+ int isdigit(int c) __attribute__((enable_if(c <= -1 || c > 255, "chosen when 'c' is out of range"))) __attribute__((unavailable("'c' must have the value of an unsigned char or EOF")));
+
+ void foo(char c) {
+ isdigit(c);
+ isdigit(10);
+ isdigit(-10); // results in a compile-time error.
+ }
+
+The enable_if attribute takes two arguments: the first is an expression written
+in terms of the function parameters, and the second is a string, displayed in
+diagnostics, that explains why this overload candidate could not be selected. The
+expression is part of the function signature for the purposes of determining
+whether it is a redeclaration (following the rules used when determining
+whether a C++ template specialization is ODR-equivalent), but is not part of
+the type.
+
+The enable_if expression is evaluated as if it were the body of a
+bool-returning constexpr function declared with the arguments of the function
+it is being applied to, then called with the parameters at the callsite. If the
+result is false or could not be determined through constant expression
+evaluation, then this overload will not be chosen and the provided string may
+be used in a diagnostic if the compile fails as a result.
+
+Because the enable_if expression is an unevaluated context, there are no global
+state changes, nor the ability to pass information from the enable_if
+expression to the function body. For example, suppose we want calls to
+strnlen(strbuf, maxlen) to resolve to strnlen_chk(strbuf, maxlen, size of
+strbuf) only if the size of strbuf can be determined:
+
+.. code-block:: c++
+
+ __attribute__((always_inline))
+ static inline size_t strnlen(const char *s, size_t maxlen)
+ __attribute__((overloadable))
+ __attribute__((enable_if(__builtin_object_size(s, 0) != -1,
+ "chosen when the buffer size is known but 'maxlen' is not")))
+ {
+ return strnlen_chk(s, maxlen, __builtin_object_size(s, 0));
+ }
+
+Multiple enable_if attributes may be applied to a single declaration. In this
+case, the enable_if expressions are evaluated from left to right in the
+following manner. First, the candidates whose enable_if expressions evaluate to
+false or cannot be evaluated are discarded. If the remaining candidates do not
+share ODR-equivalent enable_if expressions, the overload resolution is
+ambiguous. Otherwise, enable_if overload resolution continues with the next
+enable_if attribute on the candidates that have not been discarded and have
+remaining enable_if attributes. In this way, we pick the most specific
+overload out of a number of viable overloads using enable_if.
+
+.. code-block:: c++
+
+ void f() __attribute__((enable_if(true, ""))); // #1
+ void f() __attribute__((enable_if(true, ""))) __attribute__((enable_if(true, ""))); // #2
+
+ void g(int i, int j) __attribute__((enable_if(i, ""))); // #1
+ void g(int i, int j) __attribute__((enable_if(j, ""))) __attribute__((enable_if(true, ""))); // #2
+
+In this example, a call to f() is always resolved to #2, as the first enable_if
+expression is ODR-equivalent for both declarations, but #1 does not have another
+enable_if expression to continue evaluating, so the next round of evaluation has
+only a single candidate. In a call to g(1, 1), the call is ambiguous even though
+#2 has more enable_if attributes, because the first enable_if expressions are
+not ODR-equivalent.
+
+Query for this feature with ``__has_attribute(enable_if)``.
+
+
+flatten (gnu::flatten)
+----------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+The ``flatten`` attribute causes calls within the attributed function to
+be inlined unless it is impossible to do so, for example if the body of the
+callee is unavailable or if the callee has the ``noinline`` attribute.
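+
+For instance, in the sketch below (with hypothetical function names) the two
+calls to ``square`` are candidates for inlining into ``sum_of_squares``:
+
+.. code-block:: c
+
+ static int square(int x) { return x * x; }
+
+ /* Calls to square() inside this function are inlined where possible. */
+ __attribute__((flatten))
+ int sum_of_squares(int a, int b) {
+   return square(a) + square(b);
+ }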
+
+
+format (gnu::format)
+--------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+Clang supports the ``format`` attribute, which indicates that the function
+accepts a ``printf`` or ``scanf``-like format string and corresponding
+arguments or a ``va_list`` that contains these arguments.
+
+Please see `GCC documentation about format attribute
+<http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html>`_ to find details
+about attribute syntax.
+
+Clang implements two kinds of checks with this attribute.
+
+#. Clang checks that the function with the ``format`` attribute is called with
+ a format string that uses format specifiers that are allowed, and that
+ arguments match the format string. This is the ``-Wformat`` warning, which is
+ on by default.
+
+#. Clang checks that the format string argument is a literal string. This is
+ the ``-Wformat-nonliteral`` warning, which is off by default.
+
+ Clang implements this mostly the same way as GCC, but there is a difference
+ for functions that accept a ``va_list`` argument (for example, ``vprintf``).
+ GCC does not emit the ``-Wformat-nonliteral`` warning for calls to such
+ functions. Clang does not warn if the format string comes from a function
+ parameter, where the function is annotated with a compatible attribute,
+ otherwise it warns. For example:
+
+ .. code-block:: c
+
+ __attribute__((__format__ (__scanf__, 1, 3)))
+ void foo(const char* s, char *buf, ...) {
+ va_list ap;
+ va_start(ap, buf);
+
+ vprintf(s, ap); // warning: format string is not a string literal
+ }
+
+ In this case we warn because ``s`` contains a format string for a
+ ``scanf``-like function, but it is passed to a ``printf``-like function.
+
+ If the attribute is removed, clang still warns, because the format string is
+ not a string literal.
+
+ Another example:
+
+ .. code-block:: c
+
+ __attribute__((__format__ (__printf__, 1, 3)))
+ void foo(const char* s, char *buf, ...) {
+ va_list ap;
+ va_start(ap, buf);
+
+ vprintf(s, ap); // no warning
+ }
+
+ In this case Clang does not warn because the format string ``s`` and
+ the corresponding arguments are annotated. If the arguments are
+ incorrect, the caller of ``foo`` will receive a warning.
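+
+A self-contained wrapper of this kind might look like the sketch below. The
+function name ``format_into`` is hypothetical. The attribute arguments say that
+parameter 3 is the ``printf``-style format string and that the variadic
+arguments begin at position 4:
+
+.. code-block:: c
+
+ #include <stdarg.h>
+ #include <stddef.h>
+ #include <stdio.h>
+
+ /* Argument 3 is the format string; variadic arguments start at 4. */
+ __attribute__((format(printf, 3, 4)))
+ int format_into(char *buf, size_t size, const char *fmt, ...) {
+   va_list ap;
+   va_start(ap, fmt);
+   /* fmt comes from an annotated parameter, so it is forwarded as-is. */
+   int written = vsnprintf(buf, size, fmt, ap);
+   va_end(ap);
+   return written;
+ }
+
+Callers of ``format_into`` then get the usual ``-Wformat`` checking of their
+arguments against the format string.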
+
+
+noduplicate (clang::noduplicate)
+--------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+The ``noduplicate`` attribute can be placed on function declarations to control
+whether function calls to this function can be duplicated or not as a result of
+optimizations. This is required for the implementation of functions with
+certain special requirements, like the OpenCL "barrier" function, that might
+need to be run concurrently by all the threads that are executing in lockstep
+on the hardware. For example, applying this attribute to the function
+``nodupfunc`` in the code below prevents the transformation that follows it:
+
+.. code-block:: c
+
+ void nodupfunc() __attribute__((noduplicate));
+ // Setting it as a C++11 attribute is also valid
+ // void nodupfunc() [[clang::noduplicate]];
+ void foo();
+ void bar();
+
+ nodupfunc();
+ if (a > n) {
+   foo();
+ } else {
+   bar();
+ }
+
+Without the attribute, some optimizations might turn this code into something
+similar to this:
+
+.. code-block:: c
+
+ if (a > n) {
+ nodupfunc();
+ foo();
+ } else {
+ nodupfunc();
+ bar();
+ }
+
+where the call to "nodupfunc" is duplicated and sunk into the two branches
+of the condition.
+
+
+no_sanitize_address (no_address_safety_analysis, gnu::no_address_safety_analysis, gnu::no_sanitize_address)
+-----------------------------------------------------------------------------------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+.. _langext-address_sanitizer:
+
+Use ``__attribute__((no_sanitize_address))`` on a function declaration to
+specify that address safety instrumentation (e.g. AddressSanitizer) should
+not be applied to that function.
+
+
+no_sanitize_memory
+------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+.. _langext-memory_sanitizer:
+
+Use ``__attribute__((no_sanitize_memory))`` on a function declaration to
+specify that checks for uninitialized memory should not be inserted
+(e.g. by MemorySanitizer). The function may still be instrumented by the tool
+to avoid false positives in other places.
+
+
+no_sanitize_thread
+------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+.. _langext-thread_sanitizer:
+
+Use ``__attribute__((no_sanitize_thread))`` on a function declaration to
+specify that checks for data races on plain (non-atomic) memory accesses should
+not be inserted by ThreadSanitizer. The function is still instrumented by the
+tool to avoid false positives and provide meaningful stack traces.
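+
+A sketch of one possible use, with the hypothetical names
+``NO_SANITIZE_THREAD``, ``approx_hits``, ``record_hit`` and ``hits_so_far``.
+The fallback definition of ``__has_attribute`` is an assumption that keeps the
+file building on compilers that lack the macro. Whether an unsynchronized
+counter is acceptable is a decision for the program author:
+
+.. code-block:: c
+
+ #ifndef __has_attribute
+ #  define __has_attribute(x) 0 /* compiler without __has_attribute */
+ #endif
+ #if __has_attribute(no_sanitize_thread)
+ #  define NO_SANITIZE_THREAD __attribute__((no_sanitize_thread))
+ #else
+ #  define NO_SANITIZE_THREAD
+ #endif
+
+ static unsigned long approx_hits;
+
+ /* Deliberately unsynchronized: only an approximate count is needed. */
+ NO_SANITIZE_THREAD void record_hit(void) {
+   approx_hits++;
+ }
+
+ unsigned long hits_so_far(void) {
+   return approx_hits;
+ }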
+
+
+no_split_stack (gnu::no_split_stack)
+------------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+The ``no_split_stack`` attribute disables the emission of the split stack
+preamble for a particular function. It has no effect if ``-fsplit-stack``
+is not specified.
+
+
+objc_method_family
+------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Many methods in Objective-C have conventional meanings determined by their
+selectors. It is sometimes useful to be able to mark a method as having a
+particular conventional meaning despite not having the right selector, or as
+not having the conventional meaning that its selector would suggest. For these
+use cases, we provide an attribute to specifically describe the "method family"
+that a method belongs to.
+
+**Usage**: ``__attribute__((objc_method_family(X)))``, where ``X`` is one of
+``none``, ``alloc``, ``copy``, ``init``, ``mutableCopy``, or ``new``. This
+attribute can only be placed at the end of a method declaration:
+
+.. code-block:: objc
+
+ - (NSString *)initMyStringValue __attribute__((objc_method_family(none)));
+
+Users who do not wish to change the conventional meaning of a method, and who
+merely want to document its non-standard retain and release semantics, should
+use the retaining behavior attributes (``ns_returns_retained``,
+``ns_returns_not_retained``, etc).
+
+Query for this feature with ``__has_attribute(objc_method_family)``.
+
+
+objc_requires_super
+-------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Some Objective-C classes allow a subclass to override a particular method in a
+parent class but expect that the overriding method also calls the overridden
+method in the parent class. For these cases, we provide an attribute to
+designate that a method requires a "call to ``super``" in the overriding
+method in the subclass.
+
+**Usage**: ``__attribute__((objc_requires_super))``. This attribute can only
+be placed at the end of a method declaration:
+
+.. code-block:: objc
+
+ - (void)foo __attribute__((objc_requires_super));
+
+This attribute can only be applied to method declarations within a class, and
+not a protocol. Currently this attribute does not enforce any placement of
+where the call occurs in the overriding method (such as in the case of
+``-dealloc`` where the call must appear at the end). It checks only that it
+exists.
+
+Note that on both OS X and iOS, the Foundation framework provides a
+convenience macro ``NS_REQUIRES_SUPER`` that provides syntactic sugar for this
+attribute:
+
+.. code-block:: objc
+
+ - (void)foo NS_REQUIRES_SUPER;
+
+This macro is conditionally defined depending on the compiler's support for
+this attribute. If the compiler does not support the attribute the macro
+expands to nothing.
+
+Operationally, when a method has this annotation the compiler will warn if the
+implementation of an override in a subclass does not call super. For example:
+
+.. code-block:: objc
+
+ warning: method possibly missing a [super AnnotMeth] call
+ - (void) AnnotMeth{};
+ ^
+
+
+optnone (clang::optnone)
+------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+The ``optnone`` attribute suppresses essentially all optimizations
+on a function or method, regardless of the optimization level applied to
+the compilation unit as a whole. This is particularly useful when you
+need to debug a particular function, but it is infeasible to build the
+entire application without optimization. Avoiding optimization on the
+specified function can improve the quality of the debugging information
+for that function.
+
+This attribute is incompatible with the ``always_inline`` attribute.
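+
+A sketch of how a single function might be kept unoptimized for debugging. The
+``OPTNONE`` macro and the ``checksum`` function are hypothetical names, and the
+``__has_attribute`` fallback is an assumption that lets the file build on
+compilers without this attribute:
+
+.. code-block:: c
+
+ #ifndef __has_attribute
+ #  define __has_attribute(x) 0 /* compiler without __has_attribute */
+ #endif
+ #if __has_attribute(optnone)
+ #  define OPTNONE __attribute__((optnone))
+ #else
+ #  define OPTNONE
+ #endif
+
+ /* Kept unoptimized so that its locals are easier to inspect. */
+ OPTNONE int checksum(const unsigned char *p, int n) {
+   int acc = 0;
+   for (int i = 0; i < n; ++i)
+     acc = (acc * 31 + p[i]) % 65521;
+   return acc;
+ }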
+
+
+overloadable
+------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Clang provides support for C++ function overloading in C. Function overloading
+in C is introduced using the ``overloadable`` attribute. For example, one
+might provide several overloaded versions of a ``tgsin`` function that invokes
+the appropriate standard function computing the sine of a value with ``float``,
+``double``, or ``long double`` precision:
+
+.. code-block:: c
+
+ #include <math.h>
+ float __attribute__((overloadable)) tgsin(float x) { return sinf(x); }
+ double __attribute__((overloadable)) tgsin(double x) { return sin(x); }
+ long double __attribute__((overloadable)) tgsin(long double x) { return sinl(x); }
+
+Given these declarations, one can call ``tgsin`` with a ``float`` value to
+receive a ``float`` result, with a ``double`` to receive a ``double`` result,
+etc. Function overloading in C follows the rules of C++ function overloading
+to pick the best overload given the call arguments, with a few C-specific
+semantics:
+
+* Conversion from ``float`` or ``double`` to ``long double`` is ranked as a
+ floating-point promotion (per C99) rather than as a floating-point conversion
+ (as in C++).
+
+* A conversion from a pointer of type ``T*`` to a pointer of type ``U*`` is
+ considered a pointer conversion (with conversion rank) if ``T`` and ``U`` are
+ compatible types.
+
+* A conversion from type ``T`` to a value of type ``U`` is permitted if ``T``
+ and ``U`` are compatible types. This conversion is given "conversion" rank.
+
+The declaration of ``overloadable`` functions is restricted to function
+declarations and definitions. Most importantly, if any function with a given
+name is given the ``overloadable`` attribute, then all function declarations
+and definitions with that name (and in that scope) must have the
+``overloadable`` attribute. This rule even applies to redeclarations of
+functions whose original declaration had the ``overloadable`` attribute, e.g.,
+
+.. code-block:: c
+
+ int f(int) __attribute__((overloadable));
+ float f(float); // error: declaration of "f" must have the "overloadable" attribute
+
+ int g(int) __attribute__((overloadable));
+ int g(int) { } // error: redeclaration of "g" must also have the "overloadable" attribute
+
+Functions marked ``overloadable`` must have prototypes. Therefore, the
+following code is ill-formed:
+
+.. code-block:: c
+
+ int h() __attribute__((overloadable)); // error: h does not have a prototype
+
+However, ``overloadable`` functions are allowed to use an ellipsis even if there
+are no named parameters (as is permitted in C++). This feature is particularly
+useful when combined with the ``unavailable`` attribute:
+
+.. code-block:: c++
+
+ void honeypot(...) __attribute__((overloadable, unavailable)); // calling me is an error
+
+Functions declared with the ``overloadable`` attribute have their names mangled
+according to the same rules as C++ function names. For example, the three
+``tgsin`` functions in our motivating example get the mangled names
+``_Z5tgsinf``, ``_Z5tgsind``, and ``_Z5tgsine``, respectively. There are two
+caveats to this use of name mangling:
+
+* Future versions of Clang may change the name mangling of functions overloaded
+ in C, so you should not depend on a specific mangling. To be completely
+ safe, we strongly urge the use of ``static inline`` with ``overloadable``
+ functions.
+
+* The ``overloadable`` attribute has almost no meaning when used in C++,
+ because names will already be mangled and functions are already overloadable.
+ However, when an ``overloadable`` function occurs within an ``extern "C"``
+ linkage specification, its name *will* be mangled in the same way as it
+ would in C.
+
+Query for this feature with ``__has_extension(attribute_overloadable)``.
+
+
+pcs (gnu::pcs)
+--------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+On ARM targets, this attribute can be used to select calling conventions,
+similar to ``stdcall`` on x86. Valid parameter values are "aapcs" and
+"aapcs-vfp".
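+
+A sketch of selecting the base convention for one function. The ``PCS_BASE``
+macro and the ``halve`` function are hypothetical names, and the ``__arm__``
+guard is an assumption that makes the macro expand to nothing on other
+targets:
+
+.. code-block:: c
+
+ #if defined(__arm__)
+ #  define PCS_BASE __attribute__((pcs("aapcs")))
+ #else
+ #  define PCS_BASE /* not an ARM target: default convention */
+ #endif
+
+ /* Under the base "aapcs" variant, floating-point arguments travel in
+    core registers rather than VFP registers. */
+ PCS_BASE double halve(double x) {
+   return x / 2.0;
+ }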
+
+
+release_capability (release_shared_capability, clang::release_capability, clang::release_shared_capability)
+-----------------------------------------------------------------------------------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+Marks a function as releasing a capability.
+
+
+try_acquire_capability (try_acquire_shared_capability, clang::try_acquire_capability, clang::try_acquire_shared_capability)
+---------------------------------------------------------------------------------------------------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+Marks a function that attempts to acquire a capability. Such a function may fail
+to acquire the capability; the attribute takes a Boolean value determining
+whether acquiring the capability means success (true), or failing to acquire
+the capability means success (false).
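+
+The acquire, release and try-acquire attributes might be combined as in the
+sketch below. Everything named here (``toy_lock``, its functions and the
+wrapper macros) is hypothetical. Marking the struct with the ``capability``
+attribute is an assumption based on Clang's thread safety analysis, and the
+``__has_attribute`` fallback lets other compilers ignore the annotations.
+Diagnostics from these annotations appear only when ``-Wthread-safety`` is
+enabled:
+
+.. code-block:: c
+
+ #include <stdbool.h>
+
+ #ifndef __has_attribute
+ #  define __has_attribute(x) 0 /* compiler without __has_attribute */
+ #endif
+ #if __has_attribute(acquire_capability)
+ #  define CAPABILITY(name)   __attribute__((capability(name)))
+ #  define ACQUIRE(x)         __attribute__((acquire_capability(x)))
+ #  define RELEASE(x)         __attribute__((release_capability(x)))
+ #  define TRY_ACQUIRE(ok, x) __attribute__((try_acquire_capability(ok, x)))
+ #else
+ #  define CAPABILITY(name)
+ #  define ACQUIRE(x)
+ #  define RELEASE(x)
+ #  define TRY_ACQUIRE(ok, x)
+ #endif
+
+ /* Toy single-threaded "lock"; a real one would use atomics or an OS mutex. */
+ struct CAPABILITY("mutex") toy_lock { bool held; };
+
+ void toy_lock_acquire(struct toy_lock *l) ACQUIRE(l);
+ void toy_lock_release(struct toy_lock *l) RELEASE(l);
+ /* A nonzero return value means the capability was acquired. */
+ bool toy_lock_try(struct toy_lock *l) TRY_ACQUIRE(1, l);
+
+ void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
+ void toy_lock_release(struct toy_lock *l) { l->held = false; }
+ bool toy_lock_try(struct toy_lock *l) {
+   if (l->held)
+     return false;
+   l->held = true;
+   return true;
+ }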
+
+
+Variable Attributes
+===================
+
+
+section (gnu::section, __declspec(allocate))
+--------------------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","X",""
+
+The ``section`` attribute allows you to specify a specific section a
+global variable or function should be in after translation.
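+
+A sketch of placing a global in a named section. The section names, the
+``IN_CONFIG_SECTION`` macro and the ``retry_limit`` variable are hypothetical.
+The ``__APPLE__`` branch reflects the assumption that Mach-O targets expect a
+``segment,section`` pair rather than a bare section name:
+
+.. code-block:: c
+
+ #if defined(__APPLE__)
+ #  define IN_CONFIG_SECTION __attribute__((section("__DATA,__appcfg")))
+ #else
+ #  define IN_CONFIG_SECTION __attribute__((section(".appcfg")))
+ #endif
+
+ /* Emitted into the named section instead of the default data section. */
+ IN_CONFIG_SECTION int retry_limit = 3;
+
+ int get_retry_limit(void) {
+   return retry_limit;
+ }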
+
+
+tls_model (gnu::tls_model)
+--------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","X","",""
+
+The ``tls_model`` attribute allows you to specify which thread-local storage
+model to use. It accepts the following strings:
+
+* global-dynamic
+* local-dynamic
+* initial-exec
+* local-exec
+
+TLS models are mutually exclusive.
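+
+A sketch using the GNU ``__thread`` keyword together with the attribute. The
+names ``call_count`` and ``bump_call_count`` are hypothetical, and the choice
+of ``initial-exec`` is only an example, not a recommendation:
+
+.. code-block:: c
+
+ /* Each thread gets its own copy, accessed with the initial-exec model. */
+ static __thread int call_count __attribute__((tls_model("initial-exec")));
+
+ /* Increments and returns the calling thread's private counter. */
+ int bump_call_count(void) {
+   return ++call_count;
+ }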
+
+
+thread
+------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "","","X",""
+
+The ``__declspec(thread)`` attribute declares a variable with thread local
+storage. It is available under the ``-fms-extensions`` flag for MSVC
+compatibility. Documentation for the Visual C++ attribute is available on MSDN_.
+
+.. _MSDN: http://msdn.microsoft.com/en-us/library/9w1sdazb.aspx
+
+In Clang, ``__declspec(thread)`` is generally equivalent in functionality to the
+GNU ``__thread`` keyword. The variable must not have a destructor and must have
+a constant initializer, if any. The attribute only applies to variables
+declared with static storage duration, such as globals, class static data
+members, and static locals.
+
+
+Type Attributes
+===============
+
+
+__single_inheritance, __multiple_inheritance, __virtual_inheritance
+-------------------------------------------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "","","","X"
+
+This collection of keywords is enabled under ``-fms-extensions`` and controls
+the pointer-to-member representation used on ``*-*-win32`` targets.
+
+The ``*-*-win32`` targets utilize a pointer-to-member representation which
+varies in size and alignment depending on the definition of the underlying
+class.
+
+However, this is problematic when only a forward declaration is available and
+no definition has been made yet. In such cases, Clang is forced to utilize the
+most general representation that is available to it.
+
+These keywords make it possible to use a pointer-to-member representation other
+than the most general one regardless of whether or not the definition will ever
+be present in the current translation unit.
+
+This family of keywords belongs between the ``class-key`` and ``class-name``:
+
+.. code-block:: c++
+
+ struct __single_inheritance S;
+ int S::*i;
+ struct S {};
+
+This keyword can be applied to class templates but only has an effect when used
+on full specializations:
+
+.. code-block:: c++
+
+ template <typename T, typename U> struct __single_inheritance A; // warning: inheritance model ignored on primary template
+ template <typename T> struct __multiple_inheritance A<T, T>; // warning: inheritance model ignored on partial specialization
+ template <> struct __single_inheritance A<int, float>;
+
+Note that choosing an inheritance model less general than strictly necessary is
+an error:
+
+.. code-block:: c++
+
+ struct __multiple_inheritance S; // error: inheritance model does not match definition
+ int S::*i;
+ struct S {};
+
+
+Statement Attributes
+====================
+
+
+fallthrough (clang::fallthrough)
+--------------------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "","X","",""
+
+The ``clang::fallthrough`` attribute is used along with the
+``-Wimplicit-fallthrough`` argument to annotate intentional fall-through
+between switch labels. It can only be applied to a null statement placed at a
+point of execution between any statement and the next switch label. It is
+common to mark these places with a specific comment, but this attribute is
+meant to replace comments with a more strict annotation, which can be checked
+by the compiler. This attribute doesn't change semantics of the code and can
+be used wherever an intended fall-through occurs. It is designed to mimic
+control-flow statements like ``break;``, so it can be placed in most places
+where ``break;`` can, but only if there are no statements on the execution path
+between it and the next switch label.
+
+Here is an example:
+
+.. code-block:: c++
+
+ // compile with -Wimplicit-fallthrough
+ switch (n) {
+ case 22:
+ case 33: // no warning: no statements between case labels
+ f();
+ case 44: // warning: unannotated fall-through
+ g();
+ [[clang::fallthrough]];
+ case 55: // no warning
+ if (x) {
+ h();
+ break;
+ }
+ else {
+ i();
+ [[clang::fallthrough]];
+ }
+ case 66: // no warning
+ p();
+ [[clang::fallthrough]]; // warning: fallthrough annotation does not
+ // directly precede case label
+ q();
+ case 77: // warning: unannotated fall-through
+ r();
+ }
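+
+The fragment above is not compilable on its own. A self-contained sketch of
+one annotated fall-through might look like the following; the ``FALLTHROUGH``
+macro and the ``classify`` function are hypothetical, and the fallback for
+compilers other than Clang is an assumption rather than part of the attribute:
+
+.. code-block:: c++
+
+  // Hypothetical wrapper: use the attribute on Clang in C++11 mode and a
+  // harmless no-op expression elsewhere.
+  #if defined(__clang__) && __cplusplus >= 201103L
+  #  define FALLTHROUGH [[clang::fallthrough]]
+  #else
+  #  define FALLTHROUGH ((void)0)
+  #endif
+
+  int classify(int n) {
+    int score = 0;
+    switch (n) {
+    case 1:
+      score += 10;
+      FALLTHROUGH;  // intentional: case 1 also receives case 2's point
+    case 2:
+      score += 1;
+      break;
+    default:
+      score = -1;
+      break;
+    }
+    return score;
+  }
+
+Wrapping the attribute in a macro keeps the annotation in one place, so the
+same source can be built with compilers that do not recognize it.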
+
+
+Consumed Annotation Checking
+============================
+Clang supports additional attributes for checking basic resource management
+properties, specifically for unique objects that have a single owning reference.
+The following attributes are currently supported, although **the implementation
+of these annotations is currently in development and is subject to change.**
+
+callable_when
+-------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Use ``__attribute__((callable_when(...)))`` to indicate what states a method
+may be called in. Valid states are unconsumed, consumed, or unknown. Each
+argument to this attribute must be a quoted string. E.g.:
+
+``__attribute__((callable_when("unconsumed", "unknown")))``
+
+
+consumable
+----------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Each ``class`` that uses any of the typestate annotations must first be marked
+using the ``consumable`` attribute. Failure to do so will result in a warning.
+
+This attribute accepts a single parameter that must be one of the following:
+``unknown``, ``consumed``, or ``unconsumed``.
+
+
+param_typestate
+---------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+This attribute specifies expectations about function parameters. Calls to a
+function with annotated parameters will issue a warning if the corresponding
+argument isn't in the expected state. The attribute is also used to set the
+initial state of the parameter when analyzing the function's body.
+
+
+return_typestate
+----------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+The ``return_typestate`` attribute can be applied to functions or parameters.
+When applied to a function the attribute specifies the state of the returned
+value. The function's body is checked to ensure that it always returns a value
+in the specified state. On the caller side, values returned by the annotated
+function are initialized to the given state.
+
+When applied to a function parameter it modifies the state of an argument after
+a call to the function returns. The function's body is checked to ensure that
+the parameter is in the expected state before returning.
+
+
+set_typestate
+-------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Annotate methods that transition an object into a new state with
+``__attribute__((set_typestate(new_state)))``. The new state must be
+unconsumed, consumed, or unknown.
+
+
+test_typestate
+--------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Use ``__attribute__((test_typestate(tested_state)))`` to indicate that a method
+returns true if the object is in the specified state.
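+
+The consumed-annotation attributes in this section are meant to be used
+together. The following is a minimal sketch under stated assumptions: the
+``Ticket`` class, its members, ``redeemOnce`` and the wrapper macros are
+hypothetical, and the macros expand to nothing on compilers that do not report
+support for the attributes:
+
+.. code-block:: c++
+
+  // Hypothetical wrapper macros; empty when the attributes are unavailable.
+  #if defined(__has_attribute)
+  #  if __has_attribute(consumable) && __has_attribute(callable_when) && \
+        __has_attribute(set_typestate) && __has_attribute(test_typestate)
+  #    define CONSUMABLE(state)     __attribute__((consumable(state)))
+  #    define CALLABLE_WHEN(...)    __attribute__((callable_when(__VA_ARGS__)))
+  #    define SET_TYPESTATE(state)  __attribute__((set_typestate(state)))
+  #    define TEST_TYPESTATE(state) __attribute__((test_typestate(state)))
+  #  endif
+  #endif
+  #if !defined(CONSUMABLE)
+  #  define CONSUMABLE(state)
+  #  define CALLABLE_WHEN(...)
+  #  define SET_TYPESTATE(state)
+  #  define TEST_TYPESTATE(state)
+  #endif
+
+  // The class must be marked consumable before other annotations are used.
+  class CONSUMABLE(unconsumed) Ticket {
+  public:
+    explicit Ticket(int id);
+    // Returns true if the object is in the unconsumed state.
+    bool isValid() const TEST_TYPESTATE(unconsumed);
+    // May only be called while unconsumed; leaves the object consumed.
+    int redeem() CALLABLE_WHEN("unconsumed") SET_TYPESTATE(consumed);
+  private:
+    int id_;
+    bool valid_;
+  };
+
+  Ticket::Ticket(int id) : id_(id), valid_(true) {}
+  bool Ticket::isValid() const { return valid_; }
+  int Ticket::redeem() { valid_ = false; return id_; }
+
+  int redeemOnce(int id) {
+    Ticket t(id);
+    int result = t.redeem();
+    // A second t.redeem() here is the kind of call the analysis is meant
+    // to flag, since t would already be in the consumed state.
+    return result;
+  }
+
+Note that ``callable_when`` takes quoted strings while the other attributes
+take a bare state name. The diagnostics are expected to appear only when the
+analysis is enabled (assumed here to be the ``-Wconsumed`` flag).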
+
+
+Type Safety Checking
+====================
+Clang supports additional attributes to enable checking type safety properties
+that can't be enforced by the C type system. Use cases include:
+
+* MPI library implementations, where these attributes enable checking that
+ the buffer type matches the passed ``MPI_Datatype``;
+* HDF5 library implementations, which have a use case similar to MPI;
+* checking types of variadic functions' arguments for functions like
+ ``fcntl()`` and ``ioctl()``.
+
+You can detect support for these attributes with ``__has_attribute()``. For
+example:
+
+.. code-block:: c++
+
+ #if defined(__has_attribute)
+ # if __has_attribute(argument_with_type_tag) && \
+ __has_attribute(pointer_with_type_tag) && \
+ __has_attribute(type_tag_for_datatype)
+ # define ATTR_MPI_PWT(buffer_idx, type_idx) __attribute__((pointer_with_type_tag(mpi,buffer_idx,type_idx)))
+ /* ... other macros ... */
+ # endif
+ #endif
+
+ #if !defined(ATTR_MPI_PWT)
+ # define ATTR_MPI_PWT(buffer_idx, type_idx)
+ #endif
+
+ int MPI_Send(void *buf, int count, MPI_Datatype datatype /*, other args omitted */)
+ ATTR_MPI_PWT(1,3);
+
+argument_with_type_tag
+----------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Use ``__attribute__((argument_with_type_tag(arg_kind, arg_idx,
+type_tag_idx)))`` on a function declaration to specify that the function
+accepts a type tag that determines the type of some other argument.
+``arg_kind`` is an identifier that should be used when annotating all
+applicable type tags.
+
+This attribute is primarily useful for checking arguments of variadic functions
+(``pointer_with_type_tag`` can be used in most non-variadic cases).
+
+For example:
+
+.. code-block:: c++
+
+ int fcntl(int fd, int cmd, ...)
+ __attribute__(( argument_with_type_tag(fcntl,3,2) ));
+
+
+pointer_with_type_tag
+---------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Use ``__attribute__((pointer_with_type_tag(ptr_kind, ptr_idx, type_tag_idx)))``
+on a function declaration to specify that the function accepts a type tag that
+determines the pointee type of some other pointer argument.
+
+For example:
+
+.. code-block:: c++
+
+ int MPI_Send(void *buf, int count, MPI_Datatype datatype /*, other args omitted */)
+ __attribute__(( pointer_with_type_tag(mpi,1,3) ));
+
+
+type_tag_for_datatype
+---------------------
+.. csv-table:: Supported Syntaxes
+ :header: "GNU", "C++11", "__declspec", "Keyword"
+
+ "X","","",""
+
+Clang supports annotating type tags of two forms.
+
+* **Type tag that is an expression containing a reference to some declared
+ identifier.** Use ``__attribute__((type_tag_for_datatype(kind, type)))`` on a
+ declaration with that identifier:
+
+ .. code-block:: c++
+
+ extern struct mpi_datatype mpi_datatype_int
+ __attribute__(( type_tag_for_datatype(mpi,int) ));
+ #define MPI_INT ((MPI_Datatype) &mpi_datatype_int)
+
+* **Type tag that is an integral literal.** Introduce a ``static const``
+ variable with a corresponding initializer value and attach
+ ``__attribute__((type_tag_for_datatype(kind, type)))`` on that declaration,
+ for example:
+
+ .. code-block:: c++
+
+ #define MPI_INT ((MPI_Datatype) 42)
+ static const MPI_Datatype mpi_datatype_int
+ __attribute__(( type_tag_for_datatype(mpi,int) )) = 42;
+
+The attribute also accepts an optional third argument that determines how the
+expression is compared to the type tag. There are two supported flags:
+
+* ``layout_compatible`` will cause types to be compared according to
+ layout-compatibility rules (C++11 [class.mem] p 17, 18). This is
+ implemented to support annotating types like ``MPI_DOUBLE_INT``.
+
+ For example:
+
+ .. code-block:: c++
+
+ /* In mpi.h */
+ struct internal_mpi_double_int { double d; int i; };
+ extern struct mpi_datatype mpi_datatype_double_int
+ __attribute__(( type_tag_for_datatype(mpi, struct internal_mpi_double_int, layout_compatible) ));
+
+ #define MPI_DOUBLE_INT ((MPI_Datatype) &mpi_datatype_double_int)
+
+ /* In user code */
+ struct my_pair { double a; int b; };
+ struct my_pair *buffer;
+ MPI_Send(buffer, 1, MPI_DOUBLE_INT /*, ... */); // no warning
+
+ struct my_int_pair { int a; int b; };
+ struct my_int_pair *buffer2;
+ MPI_Send(buffer2, 1, MPI_DOUBLE_INT /*, ... */); // warning: actual buffer element
+ // type 'struct my_int_pair'
+ // doesn't match specified MPI_Datatype
+
+* ``must_be_null`` specifies that the expression should be a null pointer
+ constant, for example:
+
+ .. code-block:: c++
+
+ /* In mpi.h */
+ extern struct mpi_datatype mpi_datatype_null
+ __attribute__(( type_tag_for_datatype(mpi, void, must_be_null) ));
+
+ #define MPI_DATATYPE_NULL ((MPI_Datatype) &mpi_datatype_null)
+
+ /* In user code */
+ MPI_Send(buffer, 1, MPI_DATATYPE_NULL /*, ... */); // warning: MPI_DATATYPE_NULL
+ // was specified but buffer
+ // is not a null pointer
+
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/AutomaticReferenceCounting.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/AutomaticReferenceCounting.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/AutomaticReferenceCounting.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/AutomaticReferenceCounting.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,2283 @@
+.. FIXME: move to the stylesheet or Sphinx plugin
+
+.. raw:: html
+
+ <style>
+ .arc-term { font-style: italic; font-weight: bold; }
+ .revision { font-style: italic; }
+ .when-revised { font-weight: bold; font-style: normal; }
+
+ /*
+ * Automatic numbering is described in this article:
+ * http://dev.opera.com/articles/view/automatic-numbering-with-css-counters/
+ */
+ /*
+ * Automatic numbering for the TOC.
+ * This is wrong from the semantics point of view, since it is an ordered
+ * list, but uses "ul" tag.
+ */
+ div#contents.contents.local ul {
+ counter-reset: toc-section;
+ list-style-type: none;
+ }
+ div#contents.contents.local ul li {
+ counter-increment: toc-section;
+ background: none; /* Remove bullets */
+ }
+ div#contents.contents.local ul li a.reference:before {
+ content: counters(toc-section, ".") " ";
+ }
+
+ /* Automatic numbering for the body. */
+ body {
+ counter-reset: section subsection subsubsection;
+ }
+ .section h2 {
+ counter-reset: subsection subsubsection;
+ counter-increment: section;
+ }
+ .section h2 a.toc-backref:before {
+ content: counter(section) " ";
+ }
+ .section h3 {
+ counter-reset: subsubsection;
+ counter-increment: subsection;
+ }
+ .section h3 a.toc-backref:before {
+ content: counter(section) "." counter(subsection) " ";
+ }
+ .section h4 {
+ counter-increment: subsubsection;
+ }
+ .section h4 a.toc-backref:before {
+ content: counter(section) "." counter(subsection) "." counter(subsubsection) " ";
+ }
+ </style>
+
+.. role:: arc-term
+.. role:: revision
+.. role:: when-revised
+
+==============================================
+Objective-C Automatic Reference Counting (ARC)
+==============================================
+
+.. contents::
+ :local:
+
+.. _arc.meta:
+
+About this document
+===================
+
+.. _arc.meta.purpose:
+
+Purpose
+-------
+
+The first and primary purpose of this document is to serve as a complete
+technical specification of Automatic Reference Counting. Given a core
+Objective-C compiler and runtime, it should be possible to write a compiler and
+runtime which implements these new semantics.
+
+The secondary purpose is to act as a rationale for why ARC was designed in this
+way. This should remain tightly focused on the technical design and should not
+stray into marketing speculation.
+
+.. _arc.meta.background:
+
+Background
+----------
+
+This document assumes a basic familiarity with C.
+
+:arc-term:`Blocks` are a C language extension for creating anonymous functions.
+Users interact with and transfer block objects using :arc-term:`block
+pointers`, which are represented like a normal pointer. A block may capture
+values from local variables; when this occurs, memory must be dynamically
+allocated. The initial allocation is done on the stack, but the runtime
+provides a ``Block_copy`` function which, given a block pointer, either copies
+the underlying block object to the heap, setting its reference count to 1 and
+returning the new block pointer, or (if the block object is already on the
+heap) increases its reference count by 1. The paired function is
+``Block_release``, which decreases the reference count by 1 and destroys the
+object if it is on the heap and the count reaches zero.
+
+Objective-C is a set of language extensions, significant enough to be
+considered a different language. It is a strict superset of C. The extensions
+can also be imposed on C++, producing a language called Objective-C++. The
+primary feature is a single-inheritance object system; we briefly describe the
+modern dialect.
+
+Objective-C defines a new type kind, collectively called the :arc-term:`object
+pointer types`. This kind has two notable builtin members, ``id`` and
+``Class``; ``id`` is the final supertype of all object pointers. The validity
+of conversions between object pointer types is not checked at runtime. Users
+may define :arc-term:`classes`; each class is a type, and the pointer to that
+type is an object pointer type. A class may have a superclass; its pointer
+type is a subtype of its superclass's pointer type. A class has a set of
+:arc-term:`ivars`, fields which appear on all instances of that class. For
+every class *T* there's an associated metaclass; it has no fields, its
+superclass is the metaclass of *T*'s superclass, and its metaclass is a global
+class. Every class has a global object whose class is the class's metaclass;
+metaclasses have no associated type, so pointers to this object have type
+``Class``.
+
+A class declaration (``@interface``) declares a set of :arc-term:`methods`. A
+method has a return type, a list of argument types, and a :arc-term:`selector`:
+a name like ``foo:bar:baz:``, where the number of colons corresponds to the
+number of formal arguments. A method may be an instance method, in which case
+it can be invoked on objects of the class, or a class method, in which case it
+can be invoked on objects of the metaclass. A method may be invoked by
+providing an object (called the :arc-term:`receiver`) and a list of formal
+arguments interspersed with the selector, like so:
+
+.. code-block:: objc
+
+ [receiver foo: fooArg bar: barArg baz: bazArg]
+
+This looks in the dynamic class of the receiver for a method with this name,
+then in that class's superclass, etc., until it finds something it can execute.
+The receiver "expression" may also be the name of a class, in which case the
+actual receiver is the class object for that class, or (within method
+definitions) it may be ``super``, in which case the lookup algorithm starts
+with the static superclass instead of the dynamic class. The actual methods
+dynamically found in a class are not those declared in the ``@interface``, but
+those defined in a separate ``@implementation`` declaration; however, when
+compiling a call, typechecking is done based on the methods declared in the
+``@interface``.
+
+Method declarations may also be grouped into :arc-term:`protocols`, which are not
+inherently associated with any class, but which classes may claim to follow.
+Object pointer types may be qualified with additional protocols that the object
+is known to support.
+
+:arc-term:`Class extensions` are collections of ivars and methods, designed to
+allow a class's ``@interface`` to be split across multiple files; however,
+there is still a primary implementation file which must see the
+``@interface``\ s of all class extensions. :arc-term:`Categories` allow
+methods (but not ivars) to be declared *post hoc* on an arbitrary class; the
+methods in the category's ``@implementation`` will be dynamically added to that
+class's method tables when the category is loaded at runtime, replacing those
+methods in case of a collision.
+
+In the standard environment, objects are allocated on the heap, and their
+lifetime is manually managed using a reference count. This is done using two
+instance methods which all classes are expected to implement: ``retain``
+increases the object's reference count by 1, whereas ``release`` decreases it
+by 1 and calls the instance method ``dealloc`` if the count reaches 0. To
+simplify certain operations, there is also an :arc-term:`autorelease pool`, a
+thread-local list of objects to call ``release`` on later; an object can be
+added to this pool by calling ``autorelease`` on it.
+
+Block pointers may be converted to type ``id``; block objects are laid out in a
+way that makes them compatible with Objective-C objects. There is a builtin
+class that all block objects are considered to be objects of; this class
+implements ``retain`` by adjusting the reference count, not by calling
+``Block_copy``.
+
+.. _arc.meta.evolution:
+
+Evolution
+---------
+
+ARC is under continual evolution, and this document must be updated as the
+language progresses.
+
+If a change increases the expressiveness of the language, for example by
+lifting a restriction or by adding new syntax, the change will be annotated
+with a revision marker, like so:
+
+ ARC applies to Objective-C pointer types, block pointer types, and
+ :when-revised:`[beginning Apple 8.0, LLVM 3.8]` :revision:`BPTRs declared
+ within` ``extern "BCPL"`` blocks.
+
+For now, it is sensible to version this document by the releases of its sole
+implementation (and its host project), clang. "LLVM X.Y" refers to an
+open-source release of clang from the LLVM project. "Apple X.Y" refers to an
+Apple-provided release of the Apple LLVM Compiler. Other organizations that
+prepare their own, separately-versioned clang releases and wish to maintain
+similar information in this document should send requests to cfe-dev.
+
+If a change decreases the expressiveness of the language, for example by
+imposing a new restriction, this should be taken as an oversight in the
+original specification and something to be avoided in all versions. Such
+changes are generally to be avoided.
+
+.. _arc.general:
+
+General
+=======
+
+Automatic Reference Counting implements automatic memory management for
+Objective-C objects and blocks, freeing the programmer from the need to
+explicitly insert retains and releases. It does not provide a cycle collector;
+users must explicitly manage the lifetime of their objects, breaking cycles
+manually or with weak or unsafe references.
+
+ARC may be explicitly enabled with the compiler flag ``-fobjc-arc``. It may
+also be explicitly disabled with the compiler flag ``-fno-objc-arc``. The last
+of these two flags appearing on the compile line "wins".
+
+If ARC is enabled, ``__has_feature(objc_arc)`` will expand to 1 in the
+preprocessor. For more information about ``__has_feature``, see the
+:ref:`language extensions <langext-__has_feature-__has_extension>` document.
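+
+As a minimal sketch, a header shared between ARC and non-ARC translation
+units could record the setting as shown below. The preprocessor logic is the
+same in C, C++ and Objective-C sources; it is written as C++ here. The
+``BUILT_WITH_ARC`` macro and ``built_with_arc`` function are hypothetical
+names, and the fallback definition of ``__has_feature`` is an assumption
+added for compilers that do not provide it:
+
+.. code-block:: c++
+
+  // Compatibility fallback for compilers without __has_feature.
+  #ifndef __has_feature
+  #  define __has_feature(x) 0
+  #endif
+
+  // Hypothetical project macro recording whether ARC is enabled.
+  #if __has_feature(objc_arc)
+  #  define BUILT_WITH_ARC 1
+  #else
+  #  define BUILT_WITH_ARC 0
+  #endif
+
+  // Reports whether this translation unit was compiled with ARC.
+  constexpr bool built_with_arc() { return BUILT_WITH_ARC != 0; }
+
+In a plain C++ translation unit the check evaluates to 0, so code guarded
+this way falls back to its non-ARC path.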
+
+.. _arc.objects:
+
+Retainable object pointers
+==========================
+
+This section describes retainable object pointers, their basic operations, and
+the restrictions imposed on their use under ARC. Note in particular that it
+covers the rules for pointer *values* (patterns of bits indicating the location
+of a pointed-to object), not pointer *objects* (locations in memory which store
+pointer values). The rules for objects are covered in the next section.
+
+A :arc-term:`retainable object pointer` (or "retainable pointer") is a value of
+a :arc-term:`retainable object pointer type` ("retainable type"). There are
+three kinds of retainable object pointer types:
+
+* block pointers (formed by applying the caret (``^``) declarator sigil to a
+ function type)
+* Objective-C object pointers (``id``, ``Class``, ``NSFoo*``, etc.)
+* typedefs marked with ``__attribute__((NSObject))``
+
+Other pointer types, such as ``int*`` and ``CFStringRef``, are not subject to
+ARC's semantics and restrictions.
+
+.. admonition:: Rationale
+
+ We are not at liberty to require all code to be recompiled with ARC;
+ therefore, ARC must interoperate with Objective-C code which manages retains
+ and releases manually. In general, there are three requirements in order for
+ a compiler-supported reference-count system to provide reliable
+ interoperation:
+
+ * The type system must reliably identify which objects are to be managed. An
+ ``int*`` might be a pointer to a ``malloc``'ed array, or it might be an
+ interior pointer to such an array, or it might point to some field or local
+ variable. In contrast, values of the retainable object pointer types are
+ never interior.
+
+ * The type system must reliably indicate how to manage objects of a type.
+ This usually means that the type must imply a procedure for incrementing
+ and decrementing retain counts. Supporting single-ownership objects
+ requires a lot more explicit mediation in the language.
+
+ * There must be reliable conventions for whether and when "ownership" is
+ passed between caller and callee, for both arguments and return values.
+ Objective-C methods follow such a convention very reliably, at least for
+ system libraries on Mac OS X, and functions always pass objects at +0. The
+ C-based APIs for Core Foundation objects, on the other hand, have much more
+ varied transfer semantics.
+
+The use of ``__attribute__((NSObject))`` typedefs is not recommended. If it's
+absolutely necessary to use this attribute, be very explicit about using the
+typedef, and do not assume that it will be preserved by language features like
+``__typeof`` and C++ template argument substitution.
+
+.. admonition:: Rationale
+
+ Any compiler operation which incidentally strips type "sugar" from a type
+ will yield a type without the attribute, which may result in unexpected
+ behavior.
+
+.. _arc.objects.retains:
+
+Retain count semantics
+----------------------
+
+A retainable object pointer is either a :arc-term:`null pointer` or a pointer
+to a valid object. Furthermore, if it has block pointer type and is not
+``null`` then it must actually be a pointer to a block object, and if it has
+``Class`` type (possibly protocol-qualified) then it must actually be a pointer
+to a class object. Otherwise ARC does not enforce the Objective-C type system
+as long as the implementing methods follow the signature of the static type.
+It is undefined behavior if ARC is exposed to an invalid pointer.
+
+For ARC's purposes, a valid object is one with "well-behaved" retaining
+operations. Specifically, the object must be laid out such that the
+Objective-C message send machinery can successfully send it the following
+messages:
+
+* ``retain``, taking no arguments and returning a pointer to the object.
+* ``release``, taking no arguments and returning ``void``.
+* ``autorelease``, taking no arguments and returning a pointer to the object.
+
+The behavior of these methods is constrained in the following ways. The term
+:arc-term:`high-level semantics` is an intentionally vague term; the intent is
+that programmers must implement these methods in a way such that the compiler,
+modifying code in ways it deems safe according to these constraints, will not
+violate their requirements. For example, if the user puts logging statements
+in ``retain``, they should not be surprised if those statements are executed
+more or less often depending on optimization settings. These constraints are
+not exhaustive of the optimization opportunities: values held in local
+variables are subject to additional restrictions, described later in this
+document.
+
+It is undefined behavior if a computation history featuring a send of
+``retain`` followed by a send of ``release`` to the same object, with no
+intervening ``release`` on that object, is not equivalent under the high-level
+semantics to a computation history in which these sends are removed. Note that
+this implies that these methods may not raise exceptions.
+
+It is undefined behavior if a computation history features any use whatsoever
+of an object following the completion of a send of ``release`` that is not
+preceded by a send of ``retain`` to the same object.
+
+The behavior of ``autorelease`` must be equivalent to sending ``release`` when
+one of the autorelease pools currently in scope is popped. It may not throw an
+exception.
+
+When the semantics call for performing one of these operations on a retainable
+object pointer, if that pointer is ``null`` then the effect is a no-op.
+
+All of the semantics described in this document are subject to additional
+:ref:`optimization rules <arc.optimization>` which permit the removal or
+optimization of operations based on local knowledge of data flow. The
+semantics describe the high-level behaviors that the compiler implements, not
+an exact sequence of operations that a program will be compiled into.
+
+.. _arc.objects.operands:
+
+Retainable object pointers as operands and arguments
+----------------------------------------------------
+
+In general, ARC does not perform retain or release operations when simply using
+a retainable object pointer as an operand within an expression. This includes:
+
+* loading a retainable pointer from an object with non-weak :ref:`ownership
+ <arc.ownership>`,
+* passing a retainable pointer as an argument to a function or method, and
+* receiving a retainable pointer as the result of a function or method call.
+
+.. admonition:: Rationale
+
+ While this might seem uncontroversial, it is actually unsafe when multiple
+ expressions are evaluated in "parallel", as with binary operators and calls,
+ because (for example) one expression might load from an object while another
+ writes to it. However, C and C++ already call this undefined behavior
+ because the evaluations are unsequenced, and ARC simply exploits that here to
+ avoid needing to retain arguments across a large number of calls.
+
+The remainder of this section describes exceptions to these rules, how those
+exceptions are detected, and what those exceptions imply semantically.
+
+.. _arc.objects.operands.consumed:
+
+Consumed parameters
+^^^^^^^^^^^^^^^^^^^
+
+A function or method parameter of retainable object pointer type may be marked
+as :arc-term:`consumed`, signifying that the callee expects to take ownership
+of a +1 retain count. This is done by adding the ``ns_consumed`` attribute to
+the parameter declaration, like so:
+
+.. code-block:: objc
+
+ void foo(__attribute((ns_consumed)) id x);
+ - (void) foo: (id) __attribute((ns_consumed)) x;
+
+This attribute is part of the type of the function or method, not the type of
+the parameter. It controls only how the argument is passed and received.
+
+When passing such an argument, ARC retains the argument prior to making the
+call.
+
+When receiving such an argument, ARC releases the argument at the end of the
+function, subject to the usual optimizations for local values.
+
+.. admonition:: Rationale
+
+ This formalizes direct transfers of ownership from a caller to a callee. The
+ most common scenario here is passing the ``self`` parameter to ``init``, but
+ it is useful to generalize. Typically, local optimization will remove any
+ extra retains and releases: on the caller side the retain will be merged with
+ a +1 source, and on the callee side the release will be rolled into the
+ initialization of the parameter.
+
+The implicit ``self`` parameter of a method may be marked as consumed by adding
+``__attribute__((ns_consumes_self))`` to the method declaration. Methods in
+the ``init`` :ref:`family <arc.method-families>` are treated as if they were
+implicitly marked with this attribute.
+
+It is undefined behavior if an Objective-C message send to a method with
+``ns_consumed`` parameters (other than self) is made with a null receiver. It
+is undefined behavior if the method to which an Objective-C message send
+statically resolves has a different set of ``ns_consumed`` parameters than
+the method it dynamically resolves to. It is undefined behavior if a block or
+function call is made through a static type with a different set of
+``ns_consumed`` parameters than the implementation of the called block or
+function.
+
+.. admonition:: Rationale
+
+ Consumed parameters with null receiver are a guaranteed leak. Mismatches
+ with consumed parameters will cause over-retains or over-releases, depending
+ on the direction. The rule about function calls is really just an
+ application of the existing C/C++ rule about calling functions through an
+ incompatible function type, but it's useful to state it explicitly.
+
+.. _arc.object.operands.retained-return-values:
+
+Retained return values
+^^^^^^^^^^^^^^^^^^^^^^
+
+A function or method which returns a retainable object pointer type may be
+marked as returning a retained value, signifying that the caller expects to take
+ownership of a +1 retain count. This is done by adding the
+``ns_returns_retained`` attribute to the function or method declaration, like
+so:
+
+.. code-block:: objc
+
+ id foo(void) __attribute((ns_returns_retained));
+ - (id) foo __attribute((ns_returns_retained));
+
+This attribute is part of the type of the function or method.
+
+When returning from such a function or method, ARC retains the value at the
+point of evaluation of the return statement, before leaving all local scopes.
+
+When receiving a return result from such a function or method, ARC releases the
+value at the end of the full-expression it is contained within, subject to the
+usual optimizations for local values.
+
+.. admonition:: Rationale
+
+ This formalizes direct transfers of ownership from a callee to a caller. The
+ most common scenario this models is the retained return from ``init``,
+ ``alloc``, ``new``, and ``copy`` methods, but there are other cases in the
+ frameworks. After optimization there are typically no extra retains and
+ releases required.
+
+Methods in the ``alloc``, ``copy``, ``init``, ``mutableCopy``, and ``new``
+:ref:`families <arc.method-families>` are implicitly marked
+``__attribute__((ns_returns_retained))``. This may be suppressed by explicitly
+marking the method ``__attribute__((ns_returns_not_retained))``.
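+
+As a minimal sketch of these conventions (``MakeGreeting`` and ``Widget`` are
+hypothetical names used only for illustration), the declarations below show an
+explicitly annotated function, a method that is implicitly retained because of
+its family, and a method that opts out of the family convention:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  // Hypothetical function: callers receive the result at +1.
+  NSString *MakeGreeting(void) __attribute__((ns_returns_retained));
+
+  @interface Widget : NSObject
+  // In the new family by name, so implicitly ns_returns_retained.
+  + (instancetype)newWidget;
+  // Named like a new method, but explicitly returns an unretained value.
+  - (NSString *)newTitle __attribute__((ns_returns_not_retained));
+  @end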
+
+It is undefined behavior if the method to which an Objective-C message send
+statically resolves has different retain semantics on its result from the
+method it dynamically resolves to. It is undefined behavior if a block or
+function call is made through a static type with different retain semantics on
+its result from the implementation of the called block or function.
+
+.. admonition:: Rationale
+
+ Mismatches with returned results will cause over-retains or over-releases,
+ depending on the direction. Again, the rule about function calls is really
+ just an application of the existing C/C++ rule about calling functions
+ through an incompatible function type.
+
+.. _arc.objects.operands.unretained-returns:
+
+Unretained return values
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+A method or function which returns a retainable object pointer type but does
+not return a retained value must ensure that the object is still valid across
+the return boundary.
+
+When returning from such a function or method, ARC retains the value at the
+point of evaluation of the return statement, then leaves all local scopes, and
+then balances out the retain while ensuring that the value lives across the
+call boundary. In the worst case, this may involve an ``autorelease``, but
+callers must not assume that the value is actually in the autorelease pool.
+
+ARC performs no extra mandatory work on the caller side, although it may elect
+to do something to shorten the lifetime of the returned value.
+
+.. admonition:: Rationale
+
+ It is common in non-ARC code to not return an autoreleased value; therefore
+ the convention does not force either path. It is convenient to not be
+ required to do unnecessary retains and autoreleases; this permits
+ optimizations such as eliding retain/autoreleases when it can be shown that
+ the original pointer will still be valid at the point of return.
+
+A method or function may be marked with
+``__attribute__((ns_returns_autoreleased))`` to indicate that it returns a
+pointer which is guaranteed to be valid at least as long as the innermost
+autorelease pool. There are no additional semantics enforced in the definition
+of such a method; it merely enables optimizations in callers.
+
+.. _arc.objects.operands.casts:
+
+Bridged casts
+^^^^^^^^^^^^^
+
+A :arc-term:`bridged cast` is a C-style cast annotated with one of three
+keywords:
+
+* ``(__bridge T) op`` casts the operand to the destination type ``T``. If
+ ``T`` is a retainable object pointer type, then ``op`` must have a
+ non-retainable pointer type. If ``T`` is a non-retainable pointer type,
+ then ``op`` must have a retainable object pointer type. Otherwise the cast
+ is ill-formed. There is no transfer of ownership, and ARC inserts no retain
+ operations.
+* ``(__bridge_retained T) op`` casts the operand, which must have retainable
+ object pointer type, to the destination type, which must be a non-retainable
+ pointer type. ARC retains the value, subject to the usual optimizations on
+ local values, and the recipient is responsible for balancing that +1.
+* ``(__bridge_transfer T) op`` casts the operand, which must have
+ non-retainable pointer type, to the destination type, which must be a
+ retainable object pointer type. ARC will release the value at the end of
+ the enclosing full-expression, subject to the usual optimizations on local
+ values.
+
+These casts are required in order to transfer objects in and out of ARC
+control; see the rationale in the section on :ref:`conversion of retainable
+object pointers <arc.objects.restrictions.conversion>`.
+
+Using a ``__bridge_retained`` or ``__bridge_transfer`` cast purely to convince
+ARC to emit an unbalanced retain or release, respectively, is poor form.
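+
+The three forms might be used as in the following sketch, which assumes a
+Foundation/Core Foundation environment and relies on ``NSString`` and
+``CFStringRef`` being toll-free bridged. The variable names are illustrative
+only:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  int main(void) {
+    @autoreleasepool {
+      NSString *s = [[NSString alloc] initWithFormat:@"%d", 42];
+
+      // __bridge: no transfer of ownership; ARC continues to manage s.
+      CFStringRef borrowed = (__bridge CFStringRef)s;
+      NSLog(@"length = %ld", (long)CFStringGetLength(borrowed));
+
+      // __bridge_retained: ARC retains; this code must balance the +1.
+      CFStringRef owned = (__bridge_retained CFStringRef)s;
+      CFRelease(owned);
+
+      // __bridge_transfer: hand a +1 Core Foundation object over to ARC.
+      CFStringRef created = CFStringCreateWithCString(
+          kCFAllocatorDefault, "hi", kCFStringEncodingUTF8);
+      NSString *t = (__bridge_transfer NSString *)created;
+      NSLog(@"%@", t);
+    }
+    return 0;
+  }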
+
+.. _arc.objects.restrictions:
+
+Restrictions
+------------
+
+.. _arc.objects.restrictions.conversion:
+
+Conversion of retainable object pointers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In general, a program which attempts to implicitly or explicitly convert a
+value of retainable object pointer type to any non-retainable type, or
+vice-versa, is ill-formed. For example, an Objective-C object pointer shall
+not be converted to ``void*``. As an exception, a cast to ``intptr_t`` is
+allowed because such casts are not transferring ownership. The :ref:`bridged
+casts <arc.objects.operands.casts>` may be used to perform these conversions
+where necessary.
+
+.. admonition:: Rationale
+
+ We cannot ensure the correct management of the lifetime of objects if they
+ may be freely passed around as unmanaged types. The bridged casts are
+ provided so that the programmer may explicitly describe whether the cast
+ transfers control into or out of ARC.
+
+However, the following exceptions apply.
+
+.. _arc.objects.restrictions.conversion.with.known.semantics:
+
+Conversion to retainable object pointer type of expressions with known semantics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:when-revised:`[beginning Apple 4.0, LLVM 3.1]`
+:revision:`These exceptions have been greatly expanded; they previously applied
+only to a much-reduced subset which is difficult to categorize but which
+included null pointers, message sends (under the given rules), and the various
+global constants.`
+
+An unbridged conversion to a retainable object pointer type from a type other
+than a retainable object pointer type is ill-formed, as discussed above, unless
+the operand of the cast has a syntactic form which is known retained, known
+unretained, or known retain-agnostic.
+
+An expression is :arc-term:`known retain-agnostic` if it is:
+
+* an Objective-C string literal,
+* a load from a ``const`` system global variable of :ref:`C retainable pointer
+ type <arc.misc.c-retainable>`, or
+* a null pointer constant.
+
+An expression is :arc-term:`known unretained` if it is an rvalue of :ref:`C
+retainable pointer type <arc.misc.c-retainable>` and it is:
+
+* a direct call to a function, and either that function has the
+ ``cf_returns_not_retained`` attribute or it is an :ref:`audited
+ <arc.misc.c-retainable.audit>` function that does not have the
+ ``cf_returns_retained`` attribute and does not follow the create/copy naming
+ convention,
+* a message send, and the declared method either has the
+ ``cf_returns_not_retained`` attribute or it has neither the
+ ``cf_returns_retained`` attribute nor a :ref:`selector family
+ <arc.method-families>` that implies a retained result.
+
+An expression is :arc-term:`known retained` if it is an rvalue of :ref:`C
+retainable pointer type <arc.misc.c-retainable>` and it is:
+
+* a message send, and the declared method either has the
+ ``cf_returns_retained`` attribute, or it does not have the
+ ``cf_returns_not_retained`` attribute but it does have a :ref:`selector
+ family <arc.method-families>` that implies a retained result.
+
+Furthermore:
+
+* a comma expression is classified according to its right-hand side,
+* a statement expression is classified according to its result expression, if
+ it has one,
+* an lvalue-to-rvalue conversion applied to an Objective-C property lvalue is
+ classified according to the underlying message send, and
+* a conditional operator is classified according to its second and third
+ operands, if they agree in classification, or else according to the other
+ operand if one of them is known retain-agnostic.
+
+If the cast operand is known retained, the conversion is treated as a
+``__bridge_transfer`` cast. If the cast operand is known unretained or known
+retain-agnostic, the conversion is treated as a ``__bridge`` cast.
+
+.. admonition:: Rationale
+
+ Bridging casts are annoying. Absent the ability to completely automate the
+ management of CF objects, however, we are left with relatively poor attempts
+ to reduce the need for a glut of explicit bridges. Hence these rules.
+
+ We've so far consciously refrained from implicitly turning retained CF
+ results from function calls into ``__bridge_transfer`` casts. The worry is
+ that some code patterns --- for example, creating a CF value, assigning it
+ to an ObjC-typed local, and then calling ``CFRelease`` when done --- are a
+ bit too likely to be accidentally accepted, leading to mysterious behavior.
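+
+A short sketch of how these classifications play out in practice, assuming the
+Core Foundation headers. ``Example`` is a hypothetical function. The first two
+casts need no bridge keyword because their operands are known retain-agnostic.
+The last spells out ``__bridge_transfer`` because retained results of function
+calls are not bridged implicitly:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  void Example(void) {
+    // Known retain-agnostic: a load from a const system global.
+    id yes = (id)kCFBooleanTrue;
+
+    // Known retain-agnostic: a null pointer constant.
+    NSString *none = (NSString *)NULL;
+
+    // A +1 result from a function call must be transferred explicitly.
+    NSString *copy = (__bridge_transfer NSString *)CFStringCreateCopy(
+        kCFAllocatorDefault, CFSTR("x"));
+
+    NSLog(@"%@ %@ %@", yes, none, copy);
+  }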
+
+.. _arc.objects.restrictions.conversion-exception-contextual:
+
+Conversion from retainable object pointer type in certain contexts
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:when-revised:`[beginning Apple 4.0, LLVM 3.1]`
+
+If an expression of retainable object pointer type is explicitly cast to a
+:ref:`C retainable pointer type <arc.misc.c-retainable>`, the program is
+ill-formed as discussed above unless the result is immediately used:
+
+* to initialize a parameter in an Objective-C message send where the parameter
+ is not marked with the ``cf_consumed`` attribute, or
+* to initialize a parameter in a direct call to an
+ :ref:`audited <arc.misc.c-retainable.audit>` function where the parameter is
+ not marked with the ``cf_consumed`` attribute.
+
+.. admonition:: Rationale
+
+ Consumed parameters are left out because ARC would naturally balance them
+ with a retain, which was judged too treacherous. This is in part because
+ several of the most common consuming functions are in the ``Release`` family,
+ and it would be quite unfortunate for explicit releases to be silently
+ balanced out in this way.
+
+.. _arc.ownership:
+
+Ownership qualification
+=======================
+
+This section describes the behavior of *objects* of retainable object pointer
+type; that is, locations in memory which store retainable object pointers.
+
+A type is a :arc-term:`retainable object owner type` if it is a retainable
+object pointer type or an array type whose element type is a retainable object
+owner type.
+
+An :arc-term:`ownership qualifier` is a type qualifier which applies only to
+retainable object owner types. An array type is ownership-qualified according
+to its element type, and adding an ownership qualifier to an array type so
+qualifies its element type.
+
+A program is ill-formed if it attempts to apply an ownership qualifier to a
+type which is already ownership-qualified, even if it is the same qualifier.
+There is a single exception to this rule: an ownership qualifier may be applied
+to a substituted template type parameter, which overrides the ownership
+qualifier provided by the template argument.
+
+When forming a function type, the result type is adjusted so that any
+top-level ownership qualifier is deleted.
+
+Except as described under the :ref:`inference rules <arc.ownership.inference>`,
+a program is ill-formed if it attempts to form a pointer or reference type to a
+retainable object owner type which lacks an ownership qualifier.
+
+.. admonition:: Rationale
+
+ These rules, together with the inference rules, ensure that all objects and
+ lvalues of retainable object pointer type have an ownership qualifier. The
+ ability to override an ownership qualifier during template substitution is
+ required to counteract the :ref:`inference of __strong for template type
+ arguments <arc.ownership.inference.template.arguments>`. Ownership qualifiers
+ on return types are dropped because they serve no purpose there except to
+ cause spurious problems with overloading and templates.
+
+There are four ownership qualifiers:
+
+* ``__autoreleasing``
+* ``__strong``
+* ``__unsafe_unretained``
+* ``__weak``
+
+A type is :arc-term:`nontrivially ownership-qualified` if it is qualified with
+``__autoreleasing``, ``__strong``, or ``__weak``.
+
+.. _arc.ownership.spelling:
+
+Spelling
+--------
+
+The names of the ownership qualifiers are reserved for the implementation. A
+program may not assume that they are or are not implemented with macros, or
+what those macros expand to.
+
+An ownership qualifier may be written anywhere that any other type qualifier
+may be written.
+
+If an ownership qualifier appears in the *declaration-specifiers*, the
+following rules apply:
+
+* if the type specifier is a retainable object owner type, the qualifier
+ initially applies to that type;
+
+* otherwise, if the outermost non-array declarator is a pointer
+ or block pointer declarator, the qualifier initially applies to
+ that type;
+
+* otherwise the program is ill-formed.
+
+* If the qualifier is so applied at a position in the declaration
+ where the next-innermost declarator is a function declarator, and
+ there is a block declarator within that function declarator, then
+ the qualifier applies instead to that block declarator and this rule
+ is considered afresh beginning from the new position.
+
+If an ownership qualifier appears on the declarator name, or on the declared
+object, it is applied to the innermost pointer or block-pointer type.
+
+If an ownership qualifier appears anywhere else in a declarator, it applies to
+the type there.
+
+.. admonition:: Rationale
+
+ Ownership qualifiers are like ``const`` and ``volatile`` in the sense
+ that they may sensibly apply at multiple distinct positions within a
+ declarator. However, unlike those qualifiers, there are many
+ situations where they are not meaningful, and so we make an effort
+ to "move" the qualifier to a place where it will be meaningful. The
+ general goal is to allow the programmer to write, say, ``__strong``
+ before the entire declaration and have it apply in the leftmost
+ sensible place.
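+
+For illustration, the sketch below writes each qualifier in a few of the
+positions discussed above. ``Example`` and the variable names are hypothetical:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  void Example(void) {
+    // Written in the declaration-specifiers; applies to the pointer type.
+    __strong NSString *a = @"a";
+    __weak NSString *b = a;
+
+    // Written in the declarator, directly on the pointer.
+    NSString * __unsafe_unretained c = a;
+    NSString * __autoreleasing d = a;
+
+    // Qualifies the pointee of an indirect pointer.
+    NSString * __strong *p = &a;
+
+    NSLog(@"%@ %@ %@ %@ %@", a, b, c, d, *p);
+  }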
+
+.. _arc.ownership.spelling.property:
+
+Property declarations
+^^^^^^^^^^^^^^^^^^^^^
+
+A property of retainable object pointer type may have ownership. If the
+property's type is ownership-qualified, then the property has that ownership.
+If the property has one of the following modifiers, then the property has the
+corresponding ownership. A property is ill-formed if it has conflicting
+sources of ownership, or if it has redundant ownership modifiers, or if it has
+``__autoreleasing`` ownership.
+
+* ``assign`` implies ``__unsafe_unretained`` ownership.
+* ``copy`` implies ``__strong`` ownership, as well as the usual behavior of
+ copy semantics on the setter.
+* ``retain`` implies ``__strong`` ownership.
+* ``strong`` implies ``__strong`` ownership.
+* ``unsafe_unretained`` implies ``__unsafe_unretained`` ownership.
+* ``weak`` implies ``__weak`` ownership.
+
+With the exception of ``weak``, these modifiers are available in non-ARC
+modes.
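+
+The mapping from modifiers to ownership can be sketched with a hypothetical
+``Person`` class. The comments give the ownership each property has under the
+rules above:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  @interface Person : NSObject
+  @property (strong) NSString *name;        // __strong
+  @property (copy) NSString *nickname;      // __strong, and the setter copies
+  @property (retain) NSArray *friends;      // __strong
+  @property (weak) id delegate;             // __weak
+  @property (unsafe_unretained) id owner;   // __unsafe_unretained
+  @property (assign) id legacyTarget;       // __unsafe_unretained
+  @end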
+
+A property's specified ownership is preserved in its metadata, but otherwise
+the meaning is purely conventional unless the property is synthesized. If a
+property is synthesized, then the :arc-term:`associated instance variable` is
+the instance variable which is named, possibly implicitly, by the
+``@synthesize`` declaration. If the associated instance variable already
+exists, then its ownership qualification must equal the ownership of the
+property; otherwise, the instance variable is created with that ownership
+qualification.
+
+A property of retainable object pointer type which is synthesized without a
+source of ownership has the ownership of its associated instance variable, if it
+already exists; otherwise, :when-revised:`[beginning Apple 3.1, LLVM 3.1]`
+:revision:`its ownership is implicitly` ``strong``. Prior to this revision, it
+was ill-formed to synthesize such a property.
+
+.. admonition:: Rationale
+
+ Using ``strong`` by default is safe and consistent with the generic ARC rule
+ about :ref:`inferring ownership <arc.ownership.inference.variables>`. It is,
+ unfortunately, inconsistent with the non-ARC rule which states that such
+ properties are implicitly ``assign``. However, that rule is clearly
+ untenable in ARC, since it leads to default-unsafe code. The main merit to
+ banning the properties is to avoid confusion with non-ARC practice, which did
+ not ultimately strike us as sufficient to justify requiring extra syntax and
+ (more importantly) forcing novices to understand ownership rules just to
+ declare a property when the default is so reasonable. Changing the rule away
+ from non-ARC practice was acceptable because we had conservatively banned the
+ synthesis in order to give ourselves exactly this leeway.
+
+Applying ``__attribute__((NSObject))`` to a property not of retainable object
+pointer type has the same behavior it does outside of ARC: it requires the
+property type to be some sort of pointer and permits the use of modifiers other
+than ``assign``. These modifiers only affect the synthesized getter and
+setter; direct accesses to the ivar (even if synthesized) still have primitive
+semantics, and the value in the ivar will not be automatically released during
+deallocation.
+
+.. _arc.ownership.semantics:
+
+Semantics
+---------
+
+There are five :arc-term:`managed operations` which may be performed on an
+object of retainable object pointer type. Each qualifier specifies different
+semantics for each of these operations. It is still undefined behavior to
+access an object outside of its lifetime.
+
+A load or store with "primitive semantics" has the same semantics as the
+respective operation would have on a ``void*`` lvalue with the same alignment
+and non-ownership qualification.
+
+:arc-term:`Reading` occurs when performing an lvalue-to-rvalue conversion on an
+object lvalue.
+
+* For ``__weak`` objects, the current pointee is retained and then released at
+ the end of the current full-expression. This must execute atomically with
+ respect to assignments and to the final release of the pointee.
+* For all other objects, the lvalue is loaded with primitive semantics.
+
+:arc-term:`Assignment` occurs when evaluating an assignment operator. The
+semantics vary based on the qualification:
+
+* For ``__strong`` objects, the new pointee is first retained; second, the
+ lvalue is loaded with primitive semantics; third, the new pointee is stored
+ into the lvalue with primitive semantics; and finally, the old pointee is
+ released. This is not performed atomically; external synchronization must be
+ used to make this safe in the face of concurrent loads and stores.
+* For ``__weak`` objects, the lvalue is updated to point to the new pointee,
+ unless the new pointee is an object currently undergoing deallocation, in
+ which case the lvalue is updated to a null pointer. This must execute
+ atomically with respect to other assignments to the object, to reads from the
+ object, and to the final release of the new pointee.
+* For ``__unsafe_unretained`` objects, the new pointee is stored into the
+ lvalue using primitive semantics.
+* For ``__autoreleasing`` objects, the new pointee is retained, autoreleased,
+ and stored into the lvalue using primitive semantics.
+
+:arc-term:`Initialization` occurs when an object's lifetime begins, which
+depends on its storage duration. Initialization proceeds in two stages:
+
+#. First, a null pointer is stored into the lvalue using primitive semantics.
+ This step is skipped if the object is ``__unsafe_unretained``.
+#. Second, if the object has an initializer, that expression is evaluated and
+ then assigned into the object using the usual assignment semantics.
+
+:arc-term:`Destruction` occurs when an object's lifetime ends. In all cases it
+is semantically equivalent to assigning a null pointer to the object, with the
+proviso that of course the object cannot be legally read after the object's
+lifetime ends.
+
+:arc-term:`Moving` occurs in specific situations where an lvalue is "moved
+from", meaning that its current pointee will be used but the object may be left
+in a different (but still valid) state. This arises with ``__block`` variables
+and rvalue references in C++. For ``__strong`` lvalues, moving is equivalent
+to loading the lvalue with primitive semantics, writing a null pointer to it
+with primitive semantics, and then releasing the result of the load at the end
+of the current full-expression. For all other lvalues, moving is equivalent to
+reading the object.
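+
+A small sketch of how some of these operations interact for ``__strong`` and
+``__weak`` objects. The variable names are illustrative, and the comments
+describe the semantics given above rather than any particular generated code:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  int main(void) {
+    // Initialization: a null pointer is stored first, then the initializer
+    // is assigned using the usual __strong assignment semantics.
+    NSObject *strong = [[NSObject alloc] init];
+
+    // Weak assignment: the lvalue is updated without retaining the pointee.
+    __weak NSObject *weak = strong;
+
+    // Strong assignment: the new pointee (null) is stored and the old
+    // pointee is released. No other strong reference remains.
+    strong = nil;
+
+    // Reading a __weak object yields either a live pointee or null.
+    NSLog(@"weak is %@", weak ? @"alive" : @"nil");
+    return 0;
+  }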
+
+.. _arc.ownership.restrictions:
+
+Restrictions
+------------
+
+.. _arc.ownership.restrictions.weak:
+
+Weak-unavailable types
+^^^^^^^^^^^^^^^^^^^^^^
+
+It is explicitly permitted for Objective-C classes to not support ``__weak``
+references. It is undefined behavior to perform an operation with weak
+assignment semantics with a pointer to an Objective-C object whose class does
+not support ``__weak`` references.
+
+.. admonition:: Rationale
+
+ Historically, it has been possible for a class to provide its own
+ reference-count implementation by overriding ``retain``, ``release``, etc.
+ However, weak references to an object require coordination with its class's
+ reference-count implementation because, among other things, weak loads and
+ stores must be atomic with respect to the final release. Therefore, existing
+ custom reference-count implementations will generally not support weak
+ references without additional effort. This is unavoidable without breaking
+ binary compatibility.
+
+A class may indicate that it does not support weak references by providing the
+``objc_arc_weak_unavailable`` attribute on the class's interface declaration. A
+retainable object pointer type is **weak-unavailable** if it
+is a pointer to an (optionally protocol-qualified) Objective-C class ``T`` where
+``T`` or one of its superclasses has the ``objc_arc_weak_unavailable``
+attribute. A program is ill-formed if it applies the ``__weak`` ownership
+qualifier to a weak-unavailable type or if the value operand of a weak
+assignment operation has a weak-unavailable type.
+
+.. _arc.ownership.restrictions.autoreleasing:
+
+Storage duration of ``__autoreleasing`` objects
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A program is ill-formed if it declares an ``__autoreleasing`` object of
+non-automatic storage duration. A program is ill-formed if it captures an
+``__autoreleasing`` object in a block or, unless by reference, in a C++11
+lambda.
+
+.. admonition:: Rationale
+
+ Autorelease pools are tied to the current thread and scope by their nature.
+ While it is possible to have temporary objects whose instance variables are
+ filled with autoreleased objects, there is no way that ARC can provide any
+ sort of safety guarantee there.
+
+It is undefined behavior if a non-null pointer is assigned to an
+``__autoreleasing`` object while an autorelease pool is in scope and then that
+object is read after the autorelease pool's scope is left.
+
+.. _arc.ownership.restrictions.conversion.indirect:
+
+Conversion of pointers to ownership-qualified types
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A program is ill-formed if an expression of type ``T*`` is converted,
+explicitly or implicitly, to the type ``U*``, where ``T`` and ``U`` have
+different ownership qualification, unless:
+
+* ``T`` is qualified with ``__strong``, ``__autoreleasing``, or
+ ``__unsafe_unretained``, and ``U`` is qualified with both ``const`` and
+ ``__unsafe_unretained``; or
+* either ``T`` or ``U`` is ``cv void``, where ``cv`` is an optional sequence
+ of non-ownership qualifiers; or
+* the conversion is requested with a ``reinterpret_cast`` in Objective-C++; or
+* the conversion is a well-formed :ref:`pass-by-writeback
+ <arc.ownership.restrictions.pass_by_writeback>`.
+
+The analogous rule applies to ``T&`` and ``U&`` in Objective-C++.
+
+.. admonition:: Rationale
+
+ These rules provide a reasonable level of type-safety for indirect pointers,
+ as long as the underlying memory is not deallocated. The conversion to
+ ``const __unsafe_unretained`` is permitted because the semantics of reads are
+ equivalent across all these ownership semantics, and that's a very useful and
+ common pattern. The interconversion with ``void*`` is useful for allocating
+ memory or otherwise escaping the type system, but use it carefully.
+ ``reinterpret_cast`` is considered to be an obvious enough sign of taking
+ responsibility for any problems.
+
+It is undefined behavior to access an ownership-qualified object through an
+lvalue of a differently-qualified type, except that any non-``__weak`` object
+may be read through an ``__unsafe_unretained`` lvalue.
+
+It is undefined behavior if a managed operation is performed on a ``__strong``
+or ``__weak`` object without a guarantee that it contains a primitive zero
+bit-pattern, or if the storage for such an object is freed or reused without the
+object being first assigned a null pointer.
+
+.. admonition:: Rationale
+
+ ARC cannot differentiate between an assignment operator which is intended to
+ "initialize" dynamic memory and one which is intended to potentially replace
+ a value. Therefore the object's pointer must be valid before letting ARC at
+ it. Similarly, C and Objective-C do not provide any language hooks for
+ destroying objects held in dynamic memory, so it is the programmer's
+ responsibility to avoid leaks (``__strong`` objects) and consistency errors
+ (``__weak`` objects).
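+
+One way to satisfy both requirements for dynamically allocated ``__strong``
+objects is sketched below. ``Example`` is a hypothetical function. The sketch
+relies on ``calloc`` returning zero-filled memory and clears every slot by hand
+before calling ``free``:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+  #include <stdlib.h>
+
+  void Example(size_t count) {
+    // Zero-filled storage, so every __strong slot starts out as null.
+    __strong id *slots = (__strong id *)calloc(count, sizeof(id));
+    if (slots == NULL)
+      return;
+
+    for (size_t i = 0; i < count; i++)
+      slots[i] = [[NSObject alloc] init];
+
+    // Assign null to each slot so that ARC releases the pointees before
+    // the storage itself is freed.
+    for (size_t i = 0; i < count; i++)
+      slots[i] = nil;
+    free(slots);
+  }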
+
+These requirements are followed automatically in Objective-C++ when creating
+objects of retainable object owner type with ``new`` or ``new[]`` and destroying
+them with ``delete``, ``delete[]``, or a pseudo-destructor expression. Note
+that arrays of nontrivially-ownership-qualified type are not ABI compatible with
+non-ARC code because the element type is non-POD: such arrays that are
+``new[]``'d in ARC translation units cannot be ``delete[]``'d in non-ARC
+translation units and vice-versa.
+
+.. _arc.ownership.restrictions.pass_by_writeback:
+
+Passing to an out parameter by writeback
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the argument passed to a parameter of type ``T __autoreleasing *`` has type
+``U oq *``, where ``oq`` is an ownership qualifier, then the argument is a
+candidate for :arc-term:`pass-by-writeback` if:
+
+* ``oq`` is ``__strong`` or ``__weak``, and
+* it would be legal to initialize a ``T __strong *`` with a ``U __strong *``.
+
+For purposes of overload resolution, an implicit conversion sequence requiring
+a pass-by-writeback is always worse than an implicit conversion sequence not
+requiring a pass-by-writeback.
+
+The pass-by-writeback is ill-formed if the argument expression does not have a
+legal form:
+
+* ``&var``, where ``var`` is a scalar variable of automatic storage duration
+ with retainable object pointer type
+* a conditional expression where the second and third operands are both legal
+ forms
+* a cast whose operand is a legal form
+* a null pointer constant
+
+.. admonition:: Rationale
+
+ The restriction in the form of the argument serves two purposes. First, it
+ makes it impossible to pass the address of an array to the argument, which
+ serves to protect against an otherwise serious risk of mis-inferring an
+ "array" argument as an out-parameter. Second, it makes it much less likely
+ that the user will see confusing aliasing problems due to the implementation,
+ below, where their store to the writeback temporary is not immediately seen
+ in the original argument variable.
+
+A pass-by-writeback is evaluated as follows:
+
+#. The argument is evaluated to yield a pointer ``p`` of type ``U oq *``.
+#. If ``p`` is a null pointer, then a null pointer is passed as the argument,
+ and no further work is required for the pass-by-writeback.
+#. Otherwise, a temporary of type ``T __autoreleasing`` is created and
+ initialized to a null pointer.
+#. If the parameter is not an Objective-C method parameter marked ``out``,
+ then ``*p`` is read, and the result is written into the temporary with
+ primitive semantics.
+#. The address of the temporary is passed as the argument to the actual call.
+#. After the call completes, the temporary is loaded with primitive
+ semantics, and that value is assigned into ``*p``.
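+
+The steps above can be sketched by hand for the common ``NSError`` out-parameter
+pattern. ``LoadThing`` and ``Example`` are hypothetical, and the manual
+expansion is only an approximation, since ordinary assignments to ``temp`` use
+``__autoreleasing`` semantics rather than the primitive stores described above:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  // The parameter type is inferred as NSError * __autoreleasing *.
+  static BOOL LoadThing(NSError **outError) {
+    if (outError)
+      *outError = [NSError errorWithDomain:@"ExampleDomain" code:1 userInfo:nil];
+    return NO;
+  }
+
+  void Example(void) {
+    NSError *error = nil;  // a __strong local
+    LoadThing(&error);     // pass-by-writeback
+
+    // Roughly what the call above does, written out by hand:
+    NSError * __autoreleasing temp = error;  // copy the current value in
+    LoadThing(&temp);                        // pass the temporary's address
+    error = temp;                            // assign the result back
+
+    NSLog(@"%@", error);
+  }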
+
+.. admonition:: Rationale
+
+ This is all admittedly convoluted. In an ideal world, we would see that a
+ local variable is being passed to an out-parameter and retroactively modify
+ its type to be ``__autoreleasing`` rather than ``__strong``. This would be
+ remarkably difficult and not always well-founded under the C type system.
+ However, it was judged unacceptably invasive to require programmers to write
+ ``__autoreleasing`` on all the variables they intend to use for
+ out-parameters. This was the least bad solution.
+
+.. _arc.ownership.restrictions.records:
+
+Ownership-qualified fields of structs and unions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A program is ill-formed if it declares a member of a C struct or union to have
+a nontrivially ownership-qualified type.
+
+.. admonition:: Rationale
+
+ The resulting type would be non-POD in the C++ sense, but C does not give us
+ very good language tools for managing the lifetime of aggregates, so it is
+ more convenient to simply forbid them. It is still possible to manage this
+ with a ``void*`` or an ``__unsafe_unretained`` object.
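+
+Under the rule as stated here, a C struct might be written as in this sketch.
+``Node`` is a hypothetical type, and the commented-out member shows what the
+rule forbids:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  struct Node {
+    __unsafe_unretained id payload;  // not nontrivially qualified: permitted
+    void *context;                   // unmanaged pointer: permitted
+    // __strong id owner;            // ill-formed in a C struct under this rule
+  };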
+
+This restriction does not apply in Objective-C++. However, nontrivially
+ownership-qualified types are considered non-POD: in C++11 terms, they are not
+trivially default constructible, copy constructible, move constructible, copy
+assignable, move assignable, or destructible. It is a violation of C++'s One
+Definition Rule to use a class outside of ARC that, under ARC, would have a
+nontrivially ownership-qualified member.
+
+.. admonition:: Rationale
+
+ Unlike in C, we can express all the necessary ARC semantics for
+ ownership-qualified subobjects as suboperations of the (default) special
+ member functions for the class. These functions then become non-trivial.
+ This has the non-obvious result that the class will have a non-trivial copy
+ constructor and non-trivial destructor; if this would not normally be true
+ outside of ARC, objects of the type will be passed and returned in an
+ ABI-incompatible manner.
+
+.. _arc.ownership.inference:
+
+Ownership inference
+-------------------
+
+.. _arc.ownership.inference.variables:
+
+Objects
+^^^^^^^
+
+If an object is declared with retainable object owner type, but without an
+explicit ownership qualifier, its type is implicitly adjusted to have
+``__strong`` qualification.
+
+As a special case, if the object's base type is ``Class`` (possibly
+protocol-qualified), the type is adjusted to have ``__unsafe_unretained``
+qualification instead.
+
+.. _arc.ownership.inference.indirect_parameters:
+
+Indirect parameters
+^^^^^^^^^^^^^^^^^^^
+
+If a function or method parameter has type ``T*``, where ``T`` is an
+ownership-unqualified retainable object pointer type, then:
+
+* if ``T`` is ``const``-qualified or ``Class``, then it is implicitly
+ qualified with ``__unsafe_unretained``;
+* otherwise, it is implicitly qualified with ``__autoreleasing``.
+
+.. admonition:: Rationale
+
+ ``__autoreleasing`` exists mostly for this case, the Cocoa convention for
+ out-parameters. Since a pointer to ``const`` is obviously not an
+ out-parameter, we instead use a type more useful for passing arrays. If the
+ user instead intends to pass in a *mutable* array, inferring
+ ``__autoreleasing`` is the wrong thing to do; this directs some of the
+ caution in the following rules about writeback.
+
+Such a type written anywhere else would be ill-formed by the general rule
+requiring ownership qualifiers.
+
+This rule does not apply in Objective-C++ if a parameter's type is dependent in
+a template pattern and is only *instantiated* to a type which would be a
+pointer to an unqualified retainable object pointer type. Such code is still
+ill-formed.
+
+.. admonition:: Rationale
+
+ The convention is very unlikely to be intentional in template code.
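+
+As a sketch of the inference rules for objects and indirect parameters, the
+hypothetical declaration below annotates each parameter with the type it has
+after inference:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  void Configure(NSString *name,            // NSString * __strong
+                 NSError **outError,        // NSError * __autoreleasing *
+                 NSString * const *labels,  // NSString * const __unsafe_unretained *
+                 Class *outClass);          // Class __unsafe_unretained *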
+
+.. _arc.ownership.inference.template.arguments:
+
+Template arguments
+^^^^^^^^^^^^^^^^^^
+
+If a template argument for a template type parameter is a retainable object
+owner type that does not have an explicit ownership qualifier, it is adjusted
+to have ``__strong`` qualification. This adjustment occurs regardless of
+whether the template argument was deduced or explicitly specified.
+
+.. admonition:: Rationale
+
+ ``__strong`` is a useful default for containers (e.g., ``std::vector<id>``),
+ which would otherwise require explicit qualification. Moreover, unqualified
+ retainable object pointer types are unlikely to be useful within templates,
+ since they generally need to have a qualifier applied to them before being
+ used.
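+
+A minimal Objective-C++ sketch of this adjustment, assuming ARC is enabled.
+The variable names are illustrative only.
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+  #include <vector>
+
+  // 'id' carries no explicit qualifier, so the template argument is adjusted
+  // to '__strong id'; both declarations name the same type.
+  std::vector<id> implicitlyStrong;
+  std::vector<__strong id> explicitlyStrong;
+
+  // An explicit qualifier is left as written.
+  std::vector<__weak id> weakReferences;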
+
+.. _arc.method-families:
+
+Method families
+===============
+
+An Objective-C method may fall into a :arc-term:`method family`, which is a
+conventional set of behaviors ascribed to it by the Cocoa conventions.
+
+A method is in a certain method family if:
+
+* it has an ``objc_method_family`` attribute placing it in that family; or if
+ not that,
+* it does not have an ``objc_method_family`` attribute placing it in a
+ different or no family, and
+* its selector falls into the corresponding selector family, and
+* its signature obeys the added restrictions of the method family.
+
+A selector is in a certain selector family if, ignoring any leading
+underscores, the first component of the selector either consists entirely of
+the name of the method family or it begins with that name followed by a
+character other than a lowercase letter. For example, ``_perform:with:`` and
+``performWith:`` would fall into the ``perform`` family (if we recognized one),
+but ``performing:with:`` would not.
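+
+Applying the same rule to the ``init`` family, the declarations below would be
+classified as the comments describe. The class ``Widget`` and its methods are
+invented for illustration.
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  @interface Widget : NSObject
+  // First component is exactly "init": in the selector family.
+  - (id)init;
+  // "init" followed by the uppercase 'W': in the selector family.
+  - (id)initWithName:(NSString *)name;
+  // Leading underscores are ignored: in the selector family.
+  - (id)_initPrivately;
+  // "init" followed by the lowercase 'i': not in the selector family.
+  - (void)initiateTransfer;
+  @end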
+
+The families and their added restrictions are:
+
+* ``alloc`` methods must return a retainable object pointer type.
+* ``copy`` methods must return a retainable object pointer type.
+* ``mutableCopy`` methods must return a retainable object pointer type.
+* ``new`` methods must return a retainable object pointer type.
+* ``init`` methods must be instance methods and must return an Objective-C
+ pointer type. Additionally, a program is ill-formed if it declares or
+ contains a call to an ``init`` method whose return type is neither ``id`` nor
+ a pointer to a super-class or sub-class of the declaring class (if the method
+ was declared on a class) or the static receiver type of the call (if it was
+ declared on a protocol).
+
+ .. admonition:: Rationale
+
+ There are a fair number of existing methods with ``init``-like selectors
+ which nonetheless don't follow the ``init`` conventions. Typically these
+ are either accidental naming collisions or helper methods called during
+ initialization. Because of the peculiar retain/release behavior of
+ ``init`` methods, it's very important not to treat these methods as
+ ``init`` methods if they aren't meant to be. It was felt that implicitly
+ defining these methods out of the family based on the exact relationship
+ between the return type and the declaring class would be much too subtle
+ and fragile. Therefore we identify a small number of legitimate-seeming
+ return types and call everything else an error. This serves the secondary
+ purpose of encouraging programmers not to accidentally give methods names
+ in the ``init`` family.
+
+ Note that a method with an ``init``-family selector which returns a
+ non-Objective-C type (e.g. ``void``) is perfectly well-formed; it simply
+ isn't in the ``init`` family.
+
+A program is ill-formed if a method's declarations, implementations, and
+overrides do not all have the same method family.
+
+.. _arc.family.attribute:
+
+Explicit method family control
+------------------------------
+
+A method may be annotated with the ``objc_method_family`` attribute to
+precisely control which method family it belongs to. If a method in an
+``@implementation`` does not have this attribute, but there is a method
+declared in the corresponding ``@interface`` that does, then the attribute is
+copied to the declaration in the ``@implementation``. The attribute is
+available outside of ARC, and may be tested for with the preprocessor query
+``__has_attribute(objc_method_family)``.
+
+The attribute is spelled
+``__attribute__((objc_method_family(`` *family* ``)))``. If *family* is
+``none``, the method has no family, even if it would otherwise be considered to
+have one based on its selector and type. Otherwise, *family* must be one of
+``alloc``, ``copy``, ``init``, ``mutableCopy``, or ``new``, in which case the
+method is considered to belong to the corresponding family regardless of its
+selector. It is an error if a method that is explicitly added to a family in
+this way does not meet the requirements of the family other than the selector
+naming convention.
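+
+A short sketch of the attribute in use. The class ``Document`` and both of its
+methods are hypothetical.
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  @interface Document : NSObject
+  // The selector begins with "new", but this accessor does not transfer
+  // ownership to the caller, so it is removed from the family.
+  - (NSString *)newTitle __attribute__((objc_method_family(none)));
+
+  // The selector does not begin with "init", but the method is meant to
+  // behave as an initializer, so it is placed in the family explicitly. It
+  // still has to be an instance method with a suitable return type.
+  - (id)setUpWithURL:(NSURL *)url __attribute__((objc_method_family(init)));
+  @end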
+
+.. admonition:: Rationale
+
+ The rules codified in this document describe the standard conventions of
+ Objective-C. However, as these conventions have not heretofore been enforced
+ by an unforgiving mechanical system, they are only imperfectly kept,
+ especially as they haven't always even been precisely defined. While it is
+ possible to define low-level ownership semantics with attributes like
+ ``ns_returns_retained``, this attribute allows the user to communicate
+ semantic intent, which is of use both to ARC (which, e.g., treats calls to
+ ``init`` specially) and the static analyzer.
+
+.. _arc.family.semantics:
+
+Semantics of method families
+----------------------------
+
+A method's membership in a method family may imply non-standard semantics for
+its parameters and return type.
+
+Methods in the ``alloc``, ``copy``, ``mutableCopy``, and ``new`` families ---
+that is, methods in all the currently-defined families except ``init`` ---
+implicitly :ref:`return a retained object
+<arc.object.operands.retained-return-values>` as if they were annotated with
+the ``ns_returns_retained`` attribute. This can be overridden by annotating
+the method with either of the ``ns_returns_autoreleased`` or
+``ns_returns_not_retained`` attributes.
+
+Properties also follow the same naming rules as methods. This means that those
+in the ``alloc``, ``copy``, ``mutableCopy``, and ``new`` families provide access
+to :ref:`retained objects <arc.object.operands.retained-return-values>`. This
+can be overridden by annotating the property with the
+``ns_returns_not_retained`` attribute.
+
+.. _arc.family.semantics.init:
+
+Semantics of ``init``
+^^^^^^^^^^^^^^^^^^^^^
+
+Methods in the ``init`` family implicitly :ref:`consume
+<arc.objects.operands.consumed>` their ``self`` parameter and :ref:`return a
+retained object <arc.object.operands.retained-return-values>`. Neither of
+these properties can be altered through attributes.
+
+A call to an ``init`` method with a receiver that is either ``self`` (possibly
+parenthesized or cast) or ``super`` is called a :arc-term:`delegate init
+call`. It is an error for a delegate init call to be made except from an
+``init`` method, and excluding blocks within such methods.
+
+As an exception to the :ref:`usual rule <arc.misc.self>`, the variable ``self``
+is mutable in an ``init`` method and has the usual semantics for a ``__strong``
+variable. However, it is undefined behavior and the program is ill-formed, no
+diagnostic required, if an ``init`` method attempts to use the previous value
+of ``self`` after the completion of a delegate init call. It is conventional,
+but not required, for an ``init`` method to return ``self``.
+
+It is undefined behavior for a program to cause two or more calls to ``init``
+methods on the same object, except that each ``init`` method invocation may
+perform at most one delegate init call.
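+
+The conventional initializer shape satisfies these rules. A minimal sketch,
+assuming a hypothetical class ``Point2D``:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  @interface Point2D : NSObject
+  - (id)initWithX:(double)x y:(double)y;
+  @end
+
+  @implementation Point2D {
+    double _x, _y;
+  }
+
+  - (id)initWithX:(double)x y:(double)y {
+    // Delegate init call: it consumes the old 'self' and returns a retained
+    // object, which is stored back into the mutable 'self'.
+    self = [super init];
+    if (self) {
+      _x = x;
+      _y = y;
+    }
+    return self;
+  }
+
+  - (id)init {
+    // Exactly one delegate init call in this invocation.
+    return [self initWithX:0.0 y:0.0];
+  }
+  @end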
+
+.. _arc.family.semantics.result_type:
+
+Related result types
+^^^^^^^^^^^^^^^^^^^^
+
+Certain methods are candidates to have :arc-term:`related result types`:
+
+* class methods in the ``alloc`` and ``new`` method families
+* instance methods in the ``init`` family
+* the instance method ``self``
+* outside of ARC, the instance methods ``retain`` and ``autorelease``
+
+If the formal result type of such a method is ``id`` or protocol-qualified
+``id``, or a type equal to the declaring class or a superclass, then it is said
+to have a related result type. In this case, when invoked in an explicit
+message send, it is assumed to return a type related to the type of the
+receiver:
+
+* if it is a class method, and the receiver is a class name ``T``, the message
+ send expression has type ``T*``; otherwise
+* if it is an instance method, and the receiver has type ``T``, the message
+ send expression has type ``T``; otherwise
+* the message send expression has the normal result type of the method.
+
+This is a new rule of the Objective-C language and applies outside of ARC.
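+
+A brief sketch of the effect, assuming that ``alloc`` and ``init`` have related
+result types as described above. The function name ``example`` is
+illustrative.
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  void example(void) {
+    // +alloc is a class method sent to the class name NSMutableString, so the
+    // send has type 'NSMutableString *'; -init is an instance method, so its
+    // result keeps the receiver's type.
+    NSMutableString *text = [[NSMutableString alloc] init];
+    [text appendString:@"related result"];
+
+    // Because the result is typed, a mismatched initialization such as the
+    // following would be expected to draw a diagnostic:
+    //   NSArray *wrong = [[NSMutableString alloc] init];
+  }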
+
+.. admonition:: Rationale
+
+ ARC's automatic code emission is more prone than most code to signature
+ errors, i.e. errors where a call was emitted against one method signature,
+ but the implementing method has an incompatible signature. Having more
+ precise type information helps drastically lower this risk, as well as
+ catching a number of latent bugs.
+
+.. _arc.optimization:
+
+Optimization
+============
+
+Within this section, the word :arc-term:`function` will be used to
+refer to any structured unit of code, be it a C function, an
+Objective-C method, or a block.
+
+This specification describes ARC as performing specific ``retain`` and
+``release`` operations on retainable object pointers at specific
+points during the execution of a program. These operations make up a
+non-contiguous subsequence of the computation history of the program.
+The portion of this sequence for a particular retainable object
+pointer for which a specific function execution is directly
+responsible is the :arc-term:`formal local retain history` of the
+object pointer. The corresponding actual sequence executed is the
+:arc-term:`dynamic local retain history`.
+
+However, under certain circumstances, ARC is permitted to re-order and
+eliminate operations in a manner which may alter the overall
+computation history beyond what is permitted by the general "as if"
+rule of C/C++ and the :ref:`restrictions <arc.objects.retains>` on
+the implementation of ``retain`` and ``release``.
+
+.. admonition:: Rationale
+
+ Specifically, ARC is sometimes permitted to optimize ``release``
+ operations in ways which might cause an object to be deallocated
+ before it would otherwise be. Without this, it would be almost
+ impossible to eliminate any ``retain``/``release`` pairs. For
+ example, consider the following code:
+
+ .. code-block:: objc
+
+ id x = _ivar;
+ [x foo];
+
+ If we were not permitted in any event to shorten the lifetime of the
+ object in ``x``, then we would not be able to eliminate this retain
+ and release unless we could prove that the message send could not
+ modify ``_ivar`` (or deallocate ``self``). Since message sends are
+ opaque to the optimizer, this is not possible, and so ARC's hands
+ would be almost completely tied.
+
+ARC makes no guarantees about the execution of a computation history
+which contains undefined behavior. In particular, ARC makes no
+guarantees in the presence of race conditions.
+
+ARC may assume that any retainable object pointers it receives or
+generates are instantaneously valid from that point until a point
+which, by the concurrency model of the host language, happens-after
+the generation of the pointer and happens-before a release of that
+object (possibly via an aliasing pointer or indirectly due to
+destruction of a different object).
+
+.. admonition:: Rationale
+
+ There is very little point in trying to guarantee correctness in the
+ presence of race conditions. ARC does not have a stack-scanning
+ garbage collector, and guaranteeing the atomicity of every load and
+ store operation would be prohibitive and preclude a vast amount of
+ optimization.
+
+ARC may assume that non-ARC code engages in sensible balancing
+behavior and does not rely on exact or minimum retain count values
+except as guaranteed by ``__strong`` object invariants or +1 transfer
+conventions. For example, if an object is provably double-retained
+and double-released, ARC may eliminate the inner retain and release;
+it does not need to guard against code which performs an unbalanced
+release followed by a "balancing" retain.
+
+.. _arc.optimization.liveness:
+
+Object liveness
+---------------
+
+ARC may not allow a retainable object ``X`` to be deallocated at a
+time ``T`` in a computation history if:
+
+* ``X`` is the value stored in a ``__strong`` object ``S`` with
+ :ref:`precise lifetime semantics <arc.optimization.precise>`, or
+
+* ``X`` is the value stored in a ``__strong`` object ``S`` with
+ imprecise lifetime semantics and, at some point after ``T`` but
+ before the next store to ``S``, the computation history features a
+ load from ``S`` and in some way depends on the value loaded, or
+
+* ``X`` is a value described as being released at the end of the
+ current full-expression and, at some point after ``T`` but before
+ the end of the full-expression, the computation history depends
+ on that value.
+
+.. admonition:: Rationale
+
+ The intent of the second rule is to say that objects held in normal
+ ``__strong`` local variables may be released as soon as the value in
+ the variable is no longer being used: either the variable stops
+ being used completely or a new value is stored in the variable.
+
+ The intent of the third rule is to say that return values may be
+ released after they've been used.
+
+A computation history depends on a pointer value ``P`` if it:
+
+* performs a pointer comparison with ``P``,
+* loads from ``P``,
+* stores to ``P``,
+* depends on a pointer value ``Q`` derived via pointer arithmetic
+ from ``P`` (including an instance-variable or field access), or
+* depends on a pointer value ``Q`` loaded from ``P``.
+
+Dependency applies only to values derived directly or indirectly from
+a particular expression result and does not occur merely because a
+separate pointer value dynamically aliases ``P``. Furthermore, this
+dependency is not carried by values that are stored to objects.
+
+.. admonition:: Rationale
+
+ The restrictions on dependency are intended to make this analysis
+ feasible by an optimizer with only incomplete information about a
+ program. Essentially, dependence is carried to "obvious" uses of a
+ pointer. Merely passing a pointer argument to a function does not
+ itself cause dependence, but since generally the optimizer will not
+ be able to prove that the function doesn't depend on that parameter,
+ it will be forced to conservatively assume it does.
+
+ Dependency propagates to values loaded from a pointer because those
+ values might be invalidated by deallocating the object. For
+ example, given the code ``__strong id x = p->ivar;``, ARC must not
+ move the release of ``p`` to between the load of ``p->ivar`` and the
+ retain of that value for storing into ``x``.
+
+ Dependency does not propagate through stores of dependent pointer
+ values because doing so would allow dependency to outlive the
+ full-expression which produced the original value. For example, the
+ address of an instance variable could be written to some global
+ location and then freely accessed during the lifetime of the local,
+ or a function could return an inner pointer of an object and store
+ it to a local. These cases would be potentially impossible to
+ reason about and so would basically prevent any optimizations based
+ on imprecise lifetime. These are also uncommon enough to make it
+ reasonable to require the precise-lifetime annotation if someone
+ really wants to rely on them.
+
+ Dependency does propagate through return values of pointer type.
+ The compelling source of need for this rule is a property accessor
+ which returns an un-autoreleased result; the calling function must
+ have the chance to operate on the value, e.g. to retain it, before
+ ARC releases the original pointer. Note again, however, that
+ dependence does not survive a store, so ARC does not guarantee the
+ continued validity of the return value past the end of the
+ full-expression.
+
+.. _arc.optimization.object_lifetime:
+
+No object lifetime extension
+----------------------------
+
+If, in the formal computation history of the program, an object ``X``
+has been deallocated by the time of an observable side-effect, then
+ARC must cause ``X`` to be deallocated by no later than the occurrence
+of that side-effect, except as influenced by the re-ordering of the
+destruction of objects.
+
+.. admonition:: Rationale
+
+ This rule is intended to prohibit ARC from observably extending the
+ lifetime of a retainable object, other than as specified in this
+ document. Together with the rule limiting the transformation of
+ releases, this rule requires ARC to eliminate retains and releases
+ only in pairs.
+
+ ARC's power to reorder the destruction of objects is critical to its
+ ability to do any optimization, for essentially the same reason that
+ it must retain the power to decrease the lifetime of an object.
+ Unfortunately, while it's generally poor style for the destruction
+ of objects to have arbitrary side-effects, it's certainly possible.
+ Hence the caveat.
+
+.. _arc.optimization.precise:
+
+Precise lifetime semantics
+--------------------------
+
+In general, ARC maintains an invariant that a retainable object pointer held in
+a ``__strong`` object will be retained for the full formal lifetime of the
+object. Objects subject to this invariant have :arc-term:`precise lifetime
+semantics`.
+
+By default, local variables of automatic storage duration do not have precise
+lifetime semantics. Such objects are simply strong references which hold
+values of retainable object pointer type, and these values are still fully
+subject to the optimizations on values under local control.
+
+.. admonition:: Rationale
+
+ Applying these precise-lifetime semantics strictly would be prohibitive.
+ Many useful optimizations that might theoretically decrease the lifetime of
+ an object would be rendered impossible. Essentially, it promises too much.
+
+A local variable of retainable object owner type and automatic storage duration
+may be annotated with the ``objc_precise_lifetime`` attribute to indicate that
+it should be considered to be an object with precise lifetime semantics.
+
+.. admonition:: Rationale
+
+ Nonetheless, it is sometimes useful to be able to force an object to be
+ released at a precise time, even if that object does not appear to be used.
+ This is likely to be uncommon enough that the syntactic weight of explicitly
+ requesting these semantics will not be burdensome, and may even make the code
+ clearer.
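+
+A minimal sketch of the annotation. The class ``ScopedLock`` is hypothetical
+and stands for any object whose deallocation has an effect that the rest of the
+scope relies on.
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  // Hypothetical class: imagine that it acquires a lock in -init and gives
+  // it up in -dealloc.
+  @interface ScopedLock : NSObject
+  @end
+
+  @implementation ScopedLock
+  @end
+
+  void criticalSection(void) {
+    // Without the attribute, ARC may release 'guard' as soon as the variable
+    // is no longer used, which here is right after its initialization.
+    __attribute__((objc_precise_lifetime)) ScopedLock *guard =
+        [[ScopedLock alloc] init];
+
+    // ... work that must happen while 'guard' is alive ...
+  }  // 'guard' is released here, at the end of its scope.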
+
+.. _arc.misc:
+
+Miscellaneous
+=============
+
+.. _arc.misc.special_methods:
+
+Special methods
+---------------
+
+.. _arc.misc.special_methods.retain:
+
+Memory management methods
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A program is ill-formed if it contains a method definition, message send, or
+``@selector`` expression for any of the following selectors:
+
+* ``autorelease``
+* ``release``
+* ``retain``
+* ``retainCount``
+
+.. admonition:: Rationale
+
+ ``retainCount`` is banned because ARC robs it of consistent semantics. The
+ others were banned after weighing three options for how to deal with message
+ sends:
+
+ **Honoring** them would work out very poorly if a programmer naively or
+ accidentally tried to incorporate code written for manual retain/release
+ into an ARC program. At best, such code would do twice as much work as
+ necessary; quite frequently, however, ARC and the explicit code would both
+ try to balance the same retain, leading to crashes. The cost is losing the
+ ability to perform "unrooted" retains, i.e. retains not logically
+ corresponding to a strong reference in the object graph.
+
+ **Ignoring** them would badly violate user expectations about their code.
+ While it *would* make it easier to develop code simultaneously for ARC and
+ non-ARC, there is very little reason to do so except for certain library
+ developers. ARC and non-ARC translation units share an execution model and
+ can seamlessly interoperate. Within a translation unit, a developer who
+ faithfully maintains their code in non-ARC mode is suffering all the
+ restrictions of ARC for zero benefit, while a developer who isn't testing the
+ non-ARC mode is likely to be unpleasantly surprised if they try to go back to
+ it.
+
+ **Banning** them has the disadvantage of making it very awkward to migrate
+ existing code to ARC. The best answer to that, given a number of other
+ changes and restrictions in ARC, is to provide a specialized tool to assist
+ users in that migration.
+
+ Implementing these methods was banned because they are too integral to the
+ semantics of ARC; many tricks which worked tolerably under manual reference
+ counting will misbehave if ARC performs an ephemeral extra retain or two. If
+ absolutely required, it is still possible to implement them in non-ARC code,
+ for example in a category; the implementations must obey the :ref:`semantics
+ <arc.objects.retains>` laid out elsewhere in this document.
+
+.. _arc.misc.special_methods.dealloc:
+
+``dealloc``
+^^^^^^^^^^^
+
+A program is ill-formed if it contains a message send or ``@selector``
+expression for the selector ``dealloc``.
+
+.. admonition:: Rationale
+
+ There are no legitimate reasons to call ``dealloc`` directly.
+
+A class may provide a method definition for an instance method named
+``dealloc``. This method will be called after the final ``release`` of the
+object but before it is deallocated or any of its instance variables are
+destroyed. The superclass's implementation of ``dealloc`` will be called
+automatically when the method returns.
+
+.. admonition:: Rationale
+
+ Even though ARC destroys instance variables automatically, there are still
+ legitimate reasons to write a ``dealloc`` method, such as freeing
+ non-retainable resources. Failing to call ``[super dealloc]`` in such a
+ method is nearly always a bug. Sometimes, the object is simply trying to
+ prevent itself from being destroyed, but ``dealloc`` is really far too late
+ for the object to be raising such objections. Somewhat more legitimately, an
+ object may have been pool-allocated and should not be deallocated with
+ ``free``; for now, this can only be supported with a ``dealloc``
+ implementation outside of ARC. Such an implementation must be very careful
+ to do all the other work that ``NSObject``'s ``dealloc`` would, which is
+ outside the scope of this document to describe.
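+
+A sketch of a ``dealloc`` method under ARC, assuming a hypothetical class
+``PixelBuffer`` that owns ``malloc``-allocated storage:
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+  #include <stdlib.h>
+
+  @interface PixelBuffer : NSObject
+  - (id)initWithSize:(size_t)size;
+  @end
+
+  @implementation PixelBuffer {
+    void *_bytes;     // not a retainable pointer; ARC does not manage it
+    NSString *_name;  // destroyed automatically by ARC
+  }
+
+  - (id)initWithSize:(size_t)size {
+    self = [super init];
+    if (self) {
+      _bytes = malloc(size);
+      _name = @"buffer";
+    }
+    return self;
+  }
+
+  - (void)dealloc {
+    free(_bytes);
+    // No [super dealloc] here: that message send is ill-formed under ARC,
+    // and the superclass implementation is called automatically.
+  }
+  @end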
+
+The instance variables for an ARC-compiled class will be destroyed at some
+point after control enters the ``dealloc`` method for the root class of the
+class. The ordering of the destruction of instance variables is unspecified,
+both within a single class and between subclasses and superclasses.
+
+.. admonition:: Rationale
+
+ The traditional, non-ARC pattern for destroying instance variables is to
+ destroy them immediately before calling ``[super dealloc]``. Unfortunately,
+ message sends from the superclass are quite capable of reaching methods in
+ the subclass, and those methods may well read or write to those instance
+ variables. Making such message sends from ``dealloc`` is generally discouraged,
+ since the subclass may well rely on other invariants that were broken during
+ ``dealloc``, but it's not so inescapably dangerous that we felt comfortable
+ calling it undefined behavior. Therefore we chose to delay destroying the
+ instance variables to a point at which message sends are clearly disallowed:
+ the point at which the root class's deallocation routines take over.
+
+ In most code, the difference is not observable. It can, however, be observed
+ if an instance variable holds a strong reference to an object whose
+ deallocation will trigger a side-effect which must be carefully ordered with
+ respect to the destruction of the superclass. Such code violates the design
+ principle that semantically important behavior should be explicit. A simple
+ fix is to clear the instance variable manually during ``dealloc``; a more
+ holistic solution is to move semantically important side-effects out of
+ ``dealloc`` and into a separate teardown phase which can rely on working with
+ well-formed objects.
+
+.. _arc.misc.autoreleasepool:
+
+``@autoreleasepool``
+--------------------
+
+To simplify the use of autorelease pools, and to bring them under the control
+of the compiler, a new kind of statement is available in Objective-C. It is
+written ``@autoreleasepool`` followed by a *compound-statement*, i.e. by a new
+scope delimited by curly braces. Upon entry to this block, the current state
+of the autorelease pool is captured. When the block is exited normally,
+whether by fallthrough or directed control flow (such as ``return`` or
+``break``), the autorelease pool is restored to the saved state, releasing all
+the objects in it. When the block is exited with an exception, the pool is not
+drained.
+
+``@autoreleasepool`` may be used in non-ARC translation units, with equivalent
+semantics.
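+
+A small sketch of the statement inside a loop, so that temporaries created by
+each iteration are released before the next one starts. The function name
+``printUntilBlank`` is illustrative.
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  void printUntilBlank(NSArray *lines) {
+    for (NSString *line in lines) {
+      // The state of the autorelease pool is captured on entry to the block.
+      @autoreleasepool {
+        NSString *trimmed = [line stringByTrimmingCharactersInSet:
+            [NSCharacterSet whitespaceCharacterSet]];
+        if ([trimmed length] == 0)
+          break;  // directed control flow also restores the pool
+        NSLog(@"%@", trimmed);
+      }  // Fallthrough exit: the pool is restored to the saved state.
+    }
+  }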
+
+A program is ill-formed if it refers to the ``NSAutoreleasePool`` class.
+
+.. admonition:: Rationale
+
+ Autorelease pools are clearly important for the compiler to reason about, but
+ it is far too much to expect the compiler to accurately reason about control
+ dependencies between two calls. It is also very easy to accidentally forget
+ to drain an autorelease pool when using the manual API, and this can
+ significantly inflate the process's high-water-mark. The introduction of a
+ new scope is unfortunate but basically required for sane interaction with the
+ rest of the language. Not draining the pool during an unwind is apparently
+ required by the Objective-C exceptions implementation.
+
+.. _arc.misc.self:
+
+``self``
+--------
+
+The ``self`` parameter variable of an Objective-C method is never actually
+retained by the implementation. It is undefined behavior, or at least
+dangerous, to cause an object to be deallocated during a message send to that
+object.
+
+To make this safe, for Objective-C instance methods ``self`` is implicitly
+``const`` unless the method is in the :ref:`init family
+<arc.family.semantics.init>`. Further, ``self`` is **always** implicitly
+``const`` within a class method.
+
+.. admonition:: Rationale
+
+ The cost of retaining ``self`` in all methods was found to be prohibitive, as
+ it tends to be live across calls, preventing the optimizer from proving that
+ the retain and release are unnecessary --- for good reason, as it's quite
+ possible in theory to cause an object to be deallocated during its execution
+ without this retain and release. Since it's extremely uncommon to actually
+ do so, even unintentionally, and since there's no natural way for the
+ programmer to remove this retain/release pair otherwise (as there is for
+ other parameters by, say, making the variable ``__unsafe_unretained``), we
+ chose to make this optimizing assumption and shift some amount of risk to the
+ user.
+
+.. _arc.misc.enumeration:
+
+Fast enumeration iteration variables
+------------------------------------
+
+If a variable is declared in the condition of an Objective-C fast enumeration
+loop, and the variable has no explicit ownership qualifier, then it is
+qualified with ``const __strong`` and objects encountered during the
+enumeration are not actually retained.
+
+.. admonition:: Rationale
+
+ This is an optimization made possible because fast enumeration loops promise
+ to keep the objects retained during enumeration, and the collection itself
+ cannot be synchronously modified. It can be overridden by explicitly
+ qualifying the variable with ``__strong``, which will make the variable
+ mutable again and cause the loop to retain the objects it encounters.
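+
+A sketch contrasting the two forms. The function name ``visit`` is
+illustrative.
+
+.. code-block:: objc
+
+  #import <Foundation/Foundation.h>
+
+  void visit(NSArray *items) {
+    // 'item' is implicitly 'const __strong': it cannot be reassigned, and the
+    // loop does not retain each element.
+    for (id item in items) {
+      NSLog(@"%@", item);
+    }
+
+    // An explicit '__strong' makes the variable mutable again, and the loop
+    // retains each element it encounters.
+    for (__strong id item in items) {
+      if ([item isKindOfClass:[NSNull class]])
+        item = @"(null)";
+      NSLog(@"%@", item);
+    }
+  }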
+
+.. _arc.misc.blocks:
+
+Blocks
+------
+
+The implicit ``const`` capture variables created when evaluating a block
+literal expression have the same ownership semantics as the local variables
+they capture. The capture is performed by reading from the captured variable
+and initializing the capture variable with that value; the capture variable is
+destroyed when the block literal is, i.e. at the end of the enclosing scope.
+
+The :ref:`inference <arc.ownership.inference>` rules apply equally to
+``__block`` variables, which is a shift in semantics from non-ARC, where
+``__block`` variables did not implicitly retain during capture.
+
+``__block`` variables of retainable object owner type are moved off the stack
+by initializing the heap copy with the result of moving from the stack copy.
+
+With the exception of retains done as part of initializing a ``__strong``
+parameter variable or reading a ``__weak`` variable, whenever these semantics
+call for retaining a value of block-pointer type, it has the effect of a
+``Block_copy``. The optimizer may remove such copies when it sees that the
+result is used only as an argument to a call.
+
+.. _arc.misc.exceptions:
+
+Exceptions
+----------
+
+By default in Objective-C, ARC is not exception-safe for normal releases:
+
+* It does not end the lifetime of ``__strong`` variables when their scopes are
+ abnormally terminated by an exception.
+* It does not perform releases which would occur at the end of a
+ full-expression if that full-expression throws an exception.
+
+A program may be compiled with the option ``-fobjc-arc-exceptions`` in order to
+enable these, or with the option ``-fno-objc-arc-exceptions`` to explicitly
+disable them, with the last such argument "winning".
+
+.. admonition:: Rationale
+
+ The standard Cocoa convention is that exceptions signal programmer error and
+ are not intended to be recovered from. Making code exception-safe by
+ default would impose severe runtime and code size penalties on code that
+ typically does not actually care about exception safety. Therefore,
+ ARC-generated code leaks by default on exceptions, which is just fine if the
+ process is going to be immediately terminated anyway. Programs which do care
+ about recovering from exceptions should enable the option.
+
+In Objective-C++, ``-fobjc-arc-exceptions`` is enabled by default.
+
+.. admonition:: Rationale
+
+ C++ already introduces pervasive exceptions-cleanup code of the sort that ARC
+ introduces. C++ programmers who have not already disabled exceptions are
+ much more likely to actually require exception-safety.
+
+ARC does end the lifetimes of ``__weak`` objects when an exception terminates
+their scope unless exceptions are disabled in the compiler.
+
+.. admonition:: Rationale
+
+ The consequence of a local ``__weak`` object not being destroyed is very
+ likely to be corruption of the Objective-C runtime, so we want to be safer
+ here. Of course, potentially massive leaks are about as likely to take down
+ the process as this corruption is if the program does try to recover from
+ exceptions.
+
+.. _arc.misc.interior:
+
+Interior pointers
+-----------------
+
+An Objective-C method returning a non-retainable pointer may be annotated with
+the ``objc_returns_inner_pointer`` attribute to indicate that it returns a
+handle to the internal data of an object, and that this reference will be
+invalidated if the object is destroyed. When such a message is sent to an
+object, the object's lifetime will be extended until at least the earliest of:
+
+* the last use of the returned pointer, or any pointer derived from it, in the
+ calling function or
+* the autorelease pool is restored to a previous state.
+
+.. admonition:: Rationale
+
+ Not all memory and resources are managed with reference counts; it
+ is common for objects to manage private resources in their own, private way.
+ Typically these resources are completely encapsulated within the object, but
+ some classes offer their users direct access for efficiency. If ARC is not
+ aware of methods that return such "interior" pointers, its optimizations can
+ cause the owning object to be reclaimed too soon. This attribute informs ARC
+ that it must tread lightly.
+
+ The extension rules are somewhat intentionally vague. The autorelease pool
+ limit is there to permit a simple implementation to simply retain and
+ autorelease the receiver. The other limit permits some amount of
+ optimization. The phrase "derived from" is intended to encompass the results
+ both of pointer transformations, such as casts and arithmetic, and of loading
+ from such derived pointers; furthermore, it applies whether or not such
+ derivations are applied directly in the calling code or by other utility code
+ (for example, the C library routine ``strchr``). However, the implementation
+ never need account for uses after a return from the code which calls the
+ method returning an interior pointer.
+
+As an exception, no extension is required if the receiver is loaded directly
+from a ``__strong`` object with :ref:`precise lifetime semantics
+<arc.optimization.precise>`.
+
+.. admonition:: Rationale
+
+ Implicit autoreleases carry the risk of significantly inflating memory use,
+ so it's important to provide users a way of avoiding these autoreleases.
+ Tying this to precise lifetime semantics is ideal, as for local variables
+ this requires a very explicit annotation, which allows ARC to trust the user
+ with good cheer.
+
+.. _arc.misc.c-retainable:
+
+C retainable pointer types
+--------------------------
+
+A type is a :arc-term:`C retainable pointer type` if it is a pointer to
+(possibly qualified) ``void`` or a pointer to a (possibly qualified) ``struct``
+or ``class`` type.
+
+.. admonition:: Rationale
+
+ ARC does not manage pointers of CoreFoundation type (or any of the related
+ families of retainable C pointers which interoperate with Objective-C for
+ retain/release operation). In fact, ARC does not even know how to
+ distinguish these types from arbitrary C pointer types. The intent of this
+ concept is to filter out some obviously non-object types while leaving a hook
+ for later tightening if a means of exhaustively marking CF types is made
+ available.
+
+.. _arc.misc.c-retainable.audit:
+
+Auditing of C retainable pointer interfaces
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:when-revised:`[beginning Apple 4.0, LLVM 3.1]`
+
+A C function may be marked with the ``cf_audited_transfer`` attribute to
+express that, except as otherwise marked with attributes, it obeys the
+parameter (consuming vs. non-consuming) and return (retained vs. non-retained)
+conventions for a C function of its name, namely:
+
+* A parameter of C retainable pointer type is assumed to not be consumed
+ unless it is marked with the ``cf_consumed`` attribute, and
+* A result of C retainable pointer type is assumed to not be returned retained
+ unless the function is either marked ``cf_returns_retained`` or it follows
+ the create/copy naming convention and is not marked
+ ``cf_returns_not_retained``.
+
+A function obeys the :arc-term:`create/copy` naming convention if its name
+contains as a substring:
+
+* either "Create" or "Copy" not followed by a lowercase letter, or
+* either "create" or "copy" not followed by a lowercase letter and
+ not preceded by any letter, whether uppercase or lowercase.
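+
+The substring rule above can be sketched in plain C. This is only one reading
+of the rule as stated here, not the compiler's actual implementation; the names
+``word_matches_at`` and ``follows_create_copy_convention`` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `word` occurs in `name` at `pos` and is not followed by a
 * lowercase letter. When `reject_leading_letter` is set (used for the
 * lowercase spellings), the match must also not be preceded by any letter. */
static bool word_matches_at(const char *name, size_t pos, const char *word,
                            bool reject_leading_letter) {
    size_t len = strlen(word);
    if (strncmp(name + pos, word, len) != 0)
        return false;
    /* The character after the match may be the terminating NUL. */
    if (islower((unsigned char)name[pos + len]))
        return false;
    if (reject_leading_letter && pos > 0 &&
        isalpha((unsigned char)name[pos - 1]))
        return false;
    return true;
}

/* Hypothetical helper: does `name` follow the create/copy convention? */
bool follows_create_copy_convention(const char *name) {
    assert(name != NULL);
    for (size_t pos = 0; name[pos] != '\0'; ++pos) {
        if (word_matches_at(name, pos, "Create", false) ||
            word_matches_at(name, pos, "Copy", false) ||
            word_matches_at(name, pos, "create", true) ||
            word_matches_at(name, pos, "copy", true))
            return true;
    }
    return false;
}
```

+
+Under this reading, ``CFArrayCreate`` and ``create_widget`` follow the
+convention, while ``CFStringGetCopyright`` and ``recreate`` do not.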
+
+A second attribute, ``cf_unknown_transfer``, signifies that a function's
+transfer semantics cannot be accurately captured using any of these
+annotations. A program is ill-formed if it annotates the same function with
+both ``cf_audited_transfer`` and ``cf_unknown_transfer``.
+
+A pragma is provided to facilitate the mass annotation of interfaces:
+
+.. code-block:: objc
+
+ #pragma clang arc_cf_code_audited begin
+ ...
+ #pragma clang arc_cf_code_audited end
+
+All C functions declared within the extent of this pragma are treated as if
+annotated with the ``cf_audited_transfer`` attribute unless they otherwise have
+the ``cf_unknown_transfer`` attribute. The pragma is accepted in all language
+modes. A program is ill-formed if it attempts to change files, whether by
+including a file or ending the current file, within the extent of this pragma.
+
+It is possible to test for all the features in this section with
+``__has_feature(arc_cf_code_audited)``.
+
+.. admonition:: Rationale
+
+ A significant inconvenience in ARC programming is the necessity of
+ interacting with APIs based around C retainable pointers. These features are
+ designed to make it relatively easy for API authors to quickly review and
+ annotate their interfaces, in turn improving the fidelity of tools such as
+ the static analyzer and ARC. The single-file restriction on the pragma is
+ designed to eliminate the risk of accidentally annotating some other header's
+ interfaces.
+
+.. _arc.runtime:
+
+Runtime support
+===============
+
+This section describes the interaction between the ARC runtime and the code
+generated by the ARC compiler. This is not part of the ARC language
+specification; instead, it is effectively a language-specific ABI supplement,
+akin to the "Itanium" generic ABI for C++.
+
+Ownership qualification does not alter the storage requirements for objects,
+except that it is undefined behavior if a ``__weak`` object is inadequately
+aligned for an object of type ``id``. The other qualifiers may be used on
+explicitly under-aligned memory.
+
+The runtime tracks ``__weak`` objects which hold non-null values. It is
+undefined behavior to directly modify a ``__weak`` object which is being tracked
+by the runtime except through an
+:ref:`objc_storeWeak <arc.runtime.objc_storeWeak>`,
+:ref:`objc_destroyWeak <arc.runtime.objc_destroyWeak>`, or
+:ref:`objc_moveWeak <arc.runtime.objc_moveWeak>` call.
+
+The runtime must provide a number of new entrypoints which the compiler may
+emit, which are described in the remainder of this section.
+
+.. admonition:: Rationale
+
+ Several of these functions are semantically equivalent to a message send; we
+ emit calls to C functions instead because:
+
+ * the machine code to do so is significantly smaller,
+ * it is much easier to recognize the C functions in the ARC optimizer, and
+ * a sufficiently sophisticated runtime may be able to avoid the message send in
+ common cases.
+
+ Several other of these functions are "fused" operations which can be
+ described entirely in terms of other operations. We use the fused operations
+ primarily as a code-size optimization, although in some cases there is also a
+ real potential for avoiding redundant operations in the runtime.
+
+.. _arc.runtime.objc_autorelease:
+
+``id objc_autorelease(id value);``
+----------------------------------
+
+*Precondition:* ``value`` is null or a pointer to a valid object.
+
+If ``value`` is null, this call has no effect. Otherwise, it adds the object
+to the innermost autorelease pool exactly as if the object had been sent the
+``autorelease`` message.
+
+Always returns ``value``.
+
+.. _arc.runtime.objc_autoreleasePoolPop:
+
+``void objc_autoreleasePoolPop(void *pool);``
+---------------------------------------------
+
+*Precondition:* ``pool`` is the result of a previous call to
+:ref:`objc_autoreleasePoolPush <arc.runtime.objc_autoreleasePoolPush>` on the
+current thread, where neither ``pool`` nor any enclosing pool has previously
+been popped.
+
+Releases all the objects added to the given autorelease pool and any
+autorelease pools it encloses, then sets the current autorelease pool to the
+pool directly enclosing ``pool``.
+
+.. _arc.runtime.objc_autoreleasePoolPush:
+
+``void *objc_autoreleasePoolPush(void);``
+-----------------------------------------
+
+Creates a new autorelease pool that is enclosed by the current pool, makes that
+the current pool, and returns an opaque "handle" to it.
+
+.. admonition:: Rationale
+
+ While the interface is described as an explicit hierarchy of pools, the rules
+ allow the implementation to just keep a stack of objects, using the stack
+ depth as the opaque pool handle.
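+
+The stack-based strategy mentioned in the rationale can be illustrated with a
+small single-threaded model in plain C. Everything here (``toy_object``,
+``toy_pool_push``, ``toy_autorelease``, ``toy_pool_pop``, the fixed capacity) is
+a hypothetical stand-in and not the real runtime interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_POOL_CAPACITY = 64 };

/* Hypothetical object with a bare reference count. */
typedef struct { int retain_count; } toy_object;

/* One flat stack of autoreleased objects shared by all pools. */
static toy_object *toy_pool_stack[TOY_POOL_CAPACITY];
static size_t toy_pool_depth = 0;

/* Push: the opaque handle is just the current stack depth. */
void *toy_pool_push(void) {
    return (void *)(uintptr_t)toy_pool_depth;
}

/* Autorelease: remember the object; it is released when its pool is popped. */
toy_object *toy_autorelease(toy_object *value) {
    if (value == NULL)
        return NULL;
    assert(toy_pool_depth < TOY_POOL_CAPACITY);
    toy_pool_stack[toy_pool_depth++] = value;
    return value;
}

/* Pop: release everything added since the matching push, which also covers
 * any pools that were pushed after it and not yet popped. */
void toy_pool_pop(void *pool) {
    size_t target = (size_t)(uintptr_t)pool;
    assert(target <= toy_pool_depth);
    while (toy_pool_depth > target)
        toy_pool_stack[--toy_pool_depth]->retain_count--;
}
```

+
+Popping an outer handle in this model releases objects from inner pools as
+well, which mirrors the requirement that popping a pool also pops the pools it
+encloses.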
+
+.. _arc.runtime.objc_autoreleaseReturnValue:
+
+``id objc_autoreleaseReturnValue(id value);``
+---------------------------------------------
+
+*Precondition:* ``value`` is null or a pointer to a valid object.
+
+If ``value`` is null, this call has no effect. Otherwise, it makes a best
+effort to hand off ownership of a retain count on the object to a call to
+:ref:`objc_retainAutoreleasedReturnValue
+<arc.runtime.objc_retainAutoreleasedReturnValue>` for the same object in an
+enclosing call frame. If this is not possible, the object is autoreleased as
+above.
+
+Always returns ``value``.
+
+.. _arc.runtime.objc_copyWeak:
+
+``void objc_copyWeak(id *dest, id *src);``
+------------------------------------------
+
+*Precondition:* ``src`` is a valid pointer which either contains a null pointer
+or has been registered as a ``__weak`` object. ``dest`` is a valid pointer
+which has not been registered as a ``__weak`` object.
+
+``dest`` is initialized to be equivalent to ``src``, potentially registering it
+with the runtime. Equivalent to the following code:
+
+.. code-block:: objc
+
+ void objc_copyWeak(id *dest, id *src) {
+ objc_release(objc_initWeak(dest, objc_loadWeakRetained(src)));
+ }
+
+Must be atomic with respect to calls to ``objc_storeWeak`` on ``src``.
+
+.. _arc.runtime.objc_destroyWeak:
+
+``void objc_destroyWeak(id *object);``
+--------------------------------------
+
+*Precondition:* ``object`` is a valid pointer which either contains a null
+pointer or has been registered as a ``__weak`` object.
+
+``object`` is unregistered as a weak object, if it ever was. The current value
+of ``object`` is left unspecified; otherwise, equivalent to the following code:
+
+.. code-block:: objc
+
+ void objc_destroyWeak(id *object) {
+ objc_storeWeak(object, nil);
+ }
+
+Does not need to be atomic with respect to calls to ``objc_storeWeak`` on
+``object``.
+
+.. _arc.runtime.objc_initWeak:
+
+``id objc_initWeak(id *object, id value);``
+-------------------------------------------
+
+*Precondition:* ``object`` is a valid pointer which has not been registered as
+a ``__weak`` object. ``value`` is null or a pointer to a valid object.
+
+If ``value`` is a null pointer or the object to which it points has begun
+deallocation, ``object`` is zero-initialized. Otherwise, ``object`` is
+registered as a ``__weak`` object pointing to ``value``. Equivalent to the
+following code:
+
+.. code-block:: objc
+
+ id objc_initWeak(id *object, id value) {
+ *object = nil;
+ return objc_storeWeak(object, value);
+ }
+
+Returns the value of ``object`` after the call.
+
+Does not need to be atomic with respect to calls to ``objc_storeWeak`` on
+``object``.
+
+.. _arc.runtime.objc_loadWeak:
+
+``id objc_loadWeak(id *object);``
+---------------------------------
+
+*Precondition:* ``object`` is a valid pointer which either contains a null
+pointer or has been registered as a ``__weak`` object.
+
+If ``object`` is registered as a ``__weak`` object, and the last value stored
+into ``object`` has not yet been deallocated or begun deallocation, retains and
+autoreleases that value and returns it. Otherwise returns null. Equivalent to
+the following code:
+
+.. code-block:: objc
+
+ id objc_loadWeak(id *object) {
+ return objc_autorelease(objc_loadWeakRetained(object));
+ }
+
+Must be atomic with respect to calls to ``objc_storeWeak`` on ``object``.
+
+.. admonition:: Rationale
+
+ Loading weak references would be inherently prone to race conditions without
+ the retain.
+
+.. _arc.runtime.objc_loadWeakRetained:
+
+``id objc_loadWeakRetained(id *object);``
+-----------------------------------------
+
+*Precondition:* ``object`` is a valid pointer which either contains a null
+pointer or has been registered as a ``__weak`` object.
+
+If ``object`` is registered as a ``__weak`` object, and the last value stored
+into ``object`` has not yet been deallocated or begun deallocation, retains
+that value and returns it. Otherwise returns null.
+
+Must be atomic with respect to calls to ``objc_storeWeak`` on ``object``.
+
+.. _arc.runtime.objc_moveWeak:
+
+``void objc_moveWeak(id *dest, id *src);``
+------------------------------------------
+
+*Precondition:* ``src`` is a valid pointer which either contains a null pointer
+or has been registered as a ``__weak`` object. ``dest`` is a valid pointer
+which has not been registered as a ``__weak`` object.
+
+``dest`` is initialized to be equivalent to ``src``, potentially registering it
+with the runtime. ``src`` may then be left in its original state, in which
+case this call is equivalent to :ref:`objc_copyWeak
+<arc.runtime.objc_copyWeak>`, or it may be left as null.
+
+Must be atomic with respect to calls to ``objc_storeWeak`` on ``src``.
+
+.. _arc.runtime.objc_release:
+
+``void objc_release(id value);``
+--------------------------------
+
+*Precondition:* ``value`` is null or a pointer to a valid object.
+
+If ``value`` is null, this call has no effect. Otherwise, it performs a
+release operation exactly as if the object had been sent the ``release``
+message.
+
+.. _arc.runtime.objc_retain:
+
+``id objc_retain(id value);``
+-----------------------------
+
+*Precondition:* ``value`` is null or a pointer to a valid object.
+
+If ``value`` is null, this call has no effect. Otherwise, it performs a retain
+operation exactly as if the object had been sent the ``retain`` message.
+
+Always returns ``value``.
+
+.. _arc.runtime.objc_retainAutorelease:
+
+``id objc_retainAutorelease(id value);``
+----------------------------------------
+
+*Precondition:* ``value`` is null or a pointer to a valid object.
+
+If ``value`` is null, this call has no effect. Otherwise, it performs a retain
+operation followed by an autorelease operation. Equivalent to the following
+code:
+
+.. code-block:: objc
+
+ id objc_retainAutorelease(id value) {
+ return objc_autorelease(objc_retain(value));
+ }
+
+Always returns ``value``.
+
+.. _arc.runtime.objc_retainAutoreleaseReturnValue:
+
+``id objc_retainAutoreleaseReturnValue(id value);``
+---------------------------------------------------
+
+*Precondition:* ``value`` is null or a pointer to a valid object.
+
+If ``value`` is null, this call has no effect. Otherwise, it performs a retain
+operation followed by the operation described in
+:ref:`objc_autoreleaseReturnValue <arc.runtime.objc_autoreleaseReturnValue>`.
+Equivalent to the following code:
+
+.. code-block:: objc
+
+ id objc_retainAutoreleaseReturnValue(id value) {
+ return objc_autoreleaseReturnValue(objc_retain(value));
+ }
+
+Always returns ``value``.
+
+.. _arc.runtime.objc_retainAutoreleasedReturnValue:
+
+``id objc_retainAutoreleasedReturnValue(id value);``
+----------------------------------------------------
+
+*Precondition:* ``value`` is null or a pointer to a valid object.
+
+If ``value`` is null, this call has no effect. Otherwise, it attempts to
+accept a hand off of a retain count from a call to
+:ref:`objc_autoreleaseReturnValue <arc.runtime.objc_autoreleaseReturnValue>` on
+``value`` in a recently-called function or something it calls. If that fails,
+it performs a retain operation exactly like :ref:`objc_retain
+<arc.runtime.objc_retain>`.
+
+Always returns ``value``.
+
+.. _arc.runtime.objc_retainBlock:
+
+``id objc_retainBlock(id value);``
+----------------------------------
+
+*Precondition:* ``value`` is null or a pointer to a valid block object.
+
+If ``value`` is null, this call has no effect. Otherwise, if the block pointed
+to by ``value`` is still on the stack, it is copied to the heap and the address
+of the copy is returned. Otherwise a retain operation is performed on the
+block exactly as if it had been sent the ``retain`` message.
+
+.. _arc.runtime.objc_storeStrong:
+
+``id objc_storeStrong(id *object, id value);``
+----------------------------------------------
+
+*Precondition:* ``object`` is a valid pointer to a ``__strong`` object which is
+adequately aligned for a pointer. ``value`` is null or a pointer to a valid
+object.
+
+Performs the complete sequence for assigning to a ``__strong`` object of
+non-block type [*]_. Equivalent to the following code:
+
+.. code-block:: objc
+
+ id objc_storeStrong(id *object, id value) {
+ value = [value retain];
+ id oldValue = *object;
+ *object = value;
+ [oldValue release];
+ return value;
+ }
+
+Always returns ``value``.
+
+.. [*] This does not imply that a ``__strong`` object of block type is an
+ invalid argument to this function. Rather it implies that an ``objc_retain``
+ and not an ``objc_retainBlock`` operation will be emitted if the argument is
+ a block.
+
+.. _arc.runtime.objc_storeWeak:
+
+``id objc_storeWeak(id *object, id value);``
+--------------------------------------------
+
+*Precondition:* ``object`` is a valid pointer which either contains a null
+pointer or has been registered as a ``__weak`` object. ``value`` is null or a
+pointer to a valid object.
+
+If ``value`` is a null pointer or the object to which it points has begun
+deallocation, ``object`` is assigned null and unregistered as a ``__weak``
+object. Otherwise, ``object`` is registered as a ``__weak`` object or has its
+registration updated to point to ``value``.
+
+Returns the value of ``object`` after the call.
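+
+The registration rules for weak references can be modelled in a few lines of
+plain C. This is a single-threaded sketch under simplifying assumptions: it
+ignores the atomicity requirements, uses a small fixed table, and treats a full
+table as a null store. All ``toy_`` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_MAX_WEAK_SLOTS = 32 };

/* Hypothetical object model: a count plus a "deallocation has begun" flag. */
typedef struct { int retain_count; bool deallocating; } toy_object;

/* Addresses of the slots currently registered as weak references. */
static toy_object **toy_weak_slots[TOY_MAX_WEAK_SLOTS];

static void toy_unregister(toy_object **slot) {
    for (size_t i = 0; i < TOY_MAX_WEAK_SLOTS; ++i)
        if (toy_weak_slots[i] == slot)
            toy_weak_slots[i] = NULL;
}

static bool toy_register(toy_object **slot) {
    for (size_t i = 0; i < TOY_MAX_WEAK_SLOTS; ++i) {
        if (toy_weak_slots[i] == NULL) {
            toy_weak_slots[i] = slot;
            return true;
        }
    }
    return false; /* table full: a limitation of this sketch */
}

/* Models the store: null or deallocating values leave the slot null and
 * unregistered; otherwise the slot is (re)registered for `value`. */
toy_object *toy_store_weak(toy_object **slot, toy_object *value) {
    assert(slot != NULL);
    toy_unregister(slot);
    if (value == NULL || value->deallocating || !toy_register(slot)) {
        *slot = NULL;
        return NULL;
    }
    *slot = value;
    return value;
}

/* Models the retained load: returns null once deallocation has begun. */
toy_object *toy_load_weak_retained(toy_object **slot) {
    assert(slot != NULL);
    toy_object *value = *slot;
    if (value == NULL || value->deallocating)
        return NULL;
    value->retain_count++;
    return value;
}

/* Called when deallocation of `object` begins: clear every weak slot. */
void toy_begin_deallocation(toy_object *object) {
    object->deallocating = true;
    for (size_t i = 0; i < TOY_MAX_WEAK_SLOTS; ++i) {
        if (toy_weak_slots[i] != NULL && *toy_weak_slots[i] == object) {
            *toy_weak_slots[i] = NULL;
            toy_weak_slots[i] = NULL;
        }
    }
}
```

+
+In this model a store of an object that has begun deallocation leaves the slot
+null, and a retained load never hands out such an object.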
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/Block-ABI-Apple.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/Block-ABI-Apple.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/Block-ABI-Apple.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/Block-ABI-Apple.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,935 @@
+==================================
+Block Implementation Specification
+==================================
+
+.. contents::
+ :local:
+
+History
+=======
+
+* 2008/7/14 - created.
+* 2008/8/21 - revised, C++.
+* 2008/9/24 - add ``NULL`` ``isa`` field to ``__block`` storage.
+* 2008/10/1 - revise block layout to use a ``static`` descriptor structure.
+* 2008/10/6 - revise block layout to use an unsigned long int flags.
+* 2008/10/28 - specify use of ``_Block_object_assign`` and
+ ``_Block_object_dispose`` for all "Object" types in helper functions.
+* 2008/10/30 - revise new layout to have invoke function in same place.
+* 2008/10/30 - add ``__weak`` support.
+* 2010/3/16 - rev for stret return, signature field.
+* 2010/4/6 - improved wording.
+* 2013/1/6 - improved wording and converted to rst.
+
+This document describes the Apple ABI implementation specification of Blocks.
+
+The first shipping version of this ABI is found in Mac OS X 10.6, and shall be
+referred to as 10.6.ABI. As of 2010/3/16, the following describes the ABI
+contract with the runtime and the compiler, and, as necessary, will be referred
+to as ABI.2010.3.16.
+
+Since the Apple ABI references symbols from other elements of the system, any
+attempt to use this ABI on systems prior to SnowLeopard is undefined.
+
+High Level
+==========
+
+The ABI of ``Blocks`` consists of their layout and the runtime functions required
+by the compiler. A ``Block`` consists of a structure of the following form:
+
+.. code-block:: c
+
+ struct Block_literal_1 {
+ void *isa; // initialized to &_NSConcreteStackBlock or &_NSConcreteGlobalBlock
+ int flags;
+ int reserved;
+ void (*invoke)(void *, ...);
+ struct Block_descriptor_1 {
+ unsigned long int reserved; // NULL
+ unsigned long int size; // sizeof(struct Block_literal_1)
+ // optional helper functions
+ void (*copy_helper)(void *dst, void *src); // IFF (1<<25)
+ void (*dispose_helper)(void *src); // IFF (1<<25)
+ // required ABI.2010.3.16
+ const char *signature; // IFF (1<<30)
+ } *descriptor;
+ // imported variables
+ };
+
+The following flag bits are in use for a possible ABI.2010.3.16:
+
+.. code-block:: c
+
+ enum {
+ BLOCK_HAS_COPY_DISPOSE = (1 << 25),
+ BLOCK_HAS_CTOR = (1 << 26), // helpers have C++ code
+ BLOCK_IS_GLOBAL = (1 << 28),
+ BLOCK_HAS_STRET = (1 << 29), // IFF BLOCK_HAS_SIGNATURE
+ BLOCK_HAS_SIGNATURE = (1 << 30),
+ };
+
+In 10.6.ABI the (1<<29) bit was usually set and was always ignored by the runtime;
+it had been a transitional marker that did not get deleted after the
+transition. This bit is now paired with (1<<30), and represented as the pair
+(3<<29), for the following combinations of valid bit settings, and their
+meanings:
+
+.. code-block:: c
+
+ switch (flags & (3<<29)) {
+ case (0<<29): // 10.6.ABI, no signature field available
+ case (1<<29): // 10.6.ABI, no signature field available
+ case (2<<29): // ABI.2010.3.16, regular calling convention, presence of signature field
+ case (3<<29): // ABI.2010.3.16, stret calling convention, presence of signature field
+ }
+
+The signature field is not always populated.
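+
+As a compilable counterpart to the schematic ``switch`` above, the following
+sketch classifies a ``flags`` word by the two paired bits. The enum and the
+function name are hypothetical and exist only for illustration.

```c
/* Hypothetical labels for the ABI variants distinguished by bits 29 and 30. */
enum toy_block_abi {
    TOY_ABI_10_6,              /* no signature field available */
    TOY_ABI_2010_3_16_REGULAR, /* signature field, regular calling convention */
    TOY_ABI_2010_3_16_STRET    /* signature field, stret calling convention */
};

/* Classify a block's flags word by masking with (3 << 29). */
enum toy_block_abi toy_classify_flags(unsigned int flags) {
    switch (flags & (3u << 29)) {
    case (2u << 29):
        return TOY_ABI_2010_3_16_REGULAR;
    case (3u << 29):
        return TOY_ABI_2010_3_16_STRET;
    default: /* (0 << 29) and (1 << 29) */
        return TOY_ABI_10_6;
    }
}
```

+
+Other bits, such as (1<<25) or (1<<28), do not affect the result because they
+are masked out before the comparison.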
+
+The following discussions are presented as 10.6.ABI otherwise.
+
+``Block`` literals may occur within functions where the structure is created in
+stack local memory. They may also appear as initialization expressions for
+``Block`` variables with global or ``static`` local storage.
+
+When a ``Block`` literal expression is evaluated the stack based structure is
+initialized as follows:
+
+1. A ``static`` descriptor structure is declared and initialized as follows:
+
+ a. The ``invoke`` function pointer is set to a function that takes the
+ ``Block`` structure as its first argument and the rest of the arguments (if
+ any) to the ``Block`` and executes the ``Block`` compound statement.
+
+ b. The ``size`` field is set to the size of the following ``Block`` literal
+ structure.
+
+ c. The ``copy_helper`` and ``dispose_helper`` function pointers are set to
+ respective helper functions if they are required by the ``Block`` literal.
+
+2. A stack (or global) ``Block`` literal data structure is created and
+ initialized as follows:
+
+ a. The ``isa`` field is set to the address of the external
+ ``_NSConcreteStackBlock``, which is a block of uninitialized memory supplied
+ in ``libSystem``, or ``_NSConcreteGlobalBlock`` if this is a static or file
+ level ``Block`` literal.
+
+ b. The ``flags`` field is set to zero unless there are variables imported
+ into the ``Block`` that need helper functions for program level
+ ``Block_copy()`` and ``Block_release()`` operations, in which case the
+ (1<<25) flags bit is set.
+
+As an example, the ``Block`` literal expression:
+
+.. code-block:: c
+
+ ^ { printf("hello world\n"); }
+
+would cause the following to be created on a 32-bit system:
+
+.. code-block:: c
+
+ struct __block_literal_1 {
+ void *isa;
+ int flags;
+ int reserved;
+ void (*invoke)(struct __block_literal_1 *);
+ struct __block_descriptor_1 *descriptor;
+ };
+
+ void __block_invoke_1(struct __block_literal_1 *_block) {
+ printf("hello world\n");
+ }
+
+ static struct __block_descriptor_1 {
+ unsigned long int reserved;
+ unsigned long int Block_size;
+ } __block_descriptor_1 = { 0, sizeof(struct __block_literal_1) };
+
+and where the ``Block`` literal itself appears:
+
+.. code-block:: c
+
+ struct __block_literal_1 _block_literal = {
+ &_NSConcreteStackBlock,
+ (1<<29), <uninitialized>,
+ __block_invoke_1,
+ &__block_descriptor_1
+ };
+
+A ``Block`` imports other ``Block`` references, ``const`` copies of other
+variables, and variables marked ``__block``. In Objective-C, variables may
+additionally be objects.
+
+When a ``Block`` literal expression is used as the initial value of a global
+or ``static`` local variable, it is initialized as follows:
+
+.. code-block:: c
+
+ struct __block_literal_1 __block_literal_1 = {
+ &_NSConcreteGlobalBlock,
+ (1<<28)|(1<<29), <uninitialized>,
+ __block_invoke_1,
+ &__block_descriptor_1
+ };
+
+that is, a different address is provided as the first value and a particular
+(1<<28) bit is set in the ``flags`` field, and otherwise it is the same as for
+stack based ``Block`` literals. This is an optimization that can be used for
+any ``Block`` literal that imports no ``const`` or ``__block`` storage
+variables.
+
+Imported Variables
+==================
+
+Variables of ``auto`` storage class are imported as ``const`` copies. Variables
+of ``__block`` storage class are imported as a pointer to an enclosing data
+structure. Global variables are simply referenced and not considered as
+imported.
+
+Imported ``const`` copy variables
+---------------------------------
+
+Automatic storage variables not marked with ``__block`` are imported as
+``const`` copies.
+
+The simplest example is that of importing a variable of type ``int``:
+
+.. code-block:: c
+
+ int x = 10;
+ void (^vv)(void) = ^{ printf("x is %d\n", x); };
+ x = 11;
+ vv();
+
+which would be compiled to:
+
+.. code-block:: c
+
+ struct __block_literal_2 {
+ void *isa;
+ int flags;
+ int reserved;
+ void (*invoke)(struct __block_literal_2 *);
+ struct __block_descriptor_2 *descriptor;
+ const int x;
+ };
+
+ void __block_invoke_2(struct __block_literal_2 *_block) {
+ printf("x is %d\n", _block->x);
+ }
+
+ static struct __block_descriptor_2 {
+ unsigned long int reserved;
+ unsigned long int Block_size;
+ } __block_descriptor_2 = { 0, sizeof(struct __block_literal_2) };
+
+and:
+
+.. code-block:: c
+
+ struct __block_literal_2 __block_literal_2 = {
+ &_NSConcreteStackBlock,
+ (1<<29), <uninitialized>,
+ __block_invoke_2,
+ &__block_descriptor_2,
+ x
+ };
+
+In summary, scalars, structures, unions, and function pointers are generally
+imported as ``const`` copies with no need for helper functions.
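+
+The effect of a ``const`` copy can be reproduced without any Blocks support by
+building the literal structure by hand. This is a sketch under stated
+assumptions: ``toy_isa_placeholder`` stands in for ``_NSConcreteStackBlock``,
+the invoke function returns the captured value instead of printing it, and all
+``toy_`` names are hypothetical.

```c
/* Descriptor with the two fields used when no helpers are needed. */
struct toy_descriptor_2 {
    unsigned long reserved;
    unsigned long block_size;
};

/* Hand-written equivalent of the compiler-generated literal structure. */
struct toy_literal_2 {
    void *isa;
    int flags;
    int reserved;
    int (*invoke)(struct toy_literal_2 *);
    struct toy_descriptor_2 *descriptor;
    const int x; /* imported const copy */
};

/* Stand-in for _NSConcreteStackBlock, which this sketch does not link. */
static int toy_isa_placeholder;

/* The body reads the copy stored in the literal, not the original variable. */
int toy_invoke_2(struct toy_literal_2 *block) {
    return block->x;
}

static struct toy_descriptor_2 toy_descriptor_2 = {
    0, sizeof(struct toy_literal_2)
};

int toy_const_copy_demo(void) {
    int x = 10;
    struct toy_literal_2 literal = {
        &toy_isa_placeholder, 0, 0, toy_invoke_2, &toy_descriptor_2, x
    };
    x = 11; /* does not affect the copy already stored in the literal */
    return literal.invoke(&literal);
}
```

+
+Because the value is copied when the literal is initialized, the later
+assignment to ``x`` is not visible through the block.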
+
+Imported ``const`` copy of ``Block`` reference
+----------------------------------------------
+
+The first case where copy and dispose helper functions are required is when a
+``Block`` itself is imported. In this case both a
+``copy_helper`` function and a ``dispose_helper`` function are needed. The
+``copy_helper`` function is passed both the existing stack based pointer and the
+pointer to the new heap version and should call back into the runtime to
+actually do the copy operation on the imported fields within the ``Block``. The
+runtime functions are all described in :ref:`RuntimeHelperFunctions`.
+
+A quick example:
+
+.. code-block:: c
+
+ void (^existingBlock)(void) = ...;
+ void (^vv)(void) = ^{ existingBlock(); };
+ vv();
+
+ struct __block_literal_3 {
+ ...; // existing block
+ };
+
+ struct __block_literal_4 {
+ void *isa;
+ int flags;
+ int reserved;
+ void (*invoke)(struct __block_literal_4 *);
+ struct __block_descriptor_4 *descriptor;
+ struct __block_literal_3 *const existingBlock;
+ };
+
+ void __block_invoke_4(struct __block_literal_4 *_block) {
+ _block->existingBlock->invoke(_block->existingBlock);
+ }
+
+ void __block_copy_4(struct __block_literal_4 *dst, struct __block_literal_4 *src) {
+ //_Block_copy_assign(&dst->existingBlock, src->existingBlock, 0);
+ _Block_object_assign(&dst->existingBlock, src->existingBlock, BLOCK_FIELD_IS_BLOCK);
+ }
+
+ void __block_dispose_4(struct __block_literal_4 *src) {
+ // was _Block_destroy
+ _Block_object_dispose(src->existingBlock, BLOCK_FIELD_IS_BLOCK);
+ }
+
+ static struct __block_descriptor_4 {
+ unsigned long int reserved;
+ unsigned long int Block_size;
+ void (*copy_helper)(struct __block_literal_4 *dst, struct __block_literal_4 *src);
+ void (*dispose_helper)(struct __block_literal_4 *);
+ } __block_descriptor_4 = {
+ 0,
+ sizeof(struct __block_literal_4),
+ __block_copy_4,
+ __block_dispose_4,
+ };
+
+and where said ``Block`` is used:
+
+.. code-block:: c
+
+ struct __block_literal_4 _block_literal = {
+ &_NSConcreteStackBlock,
+ (1<<25)|(1<<29), <uninitialized>,
+ __block_invoke_4,
+ &__block_descriptor_4,
+ existingBlock,
+ };
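+
+How a runtime might drive the helper functions can be sketched in plain C. The
+model below is deliberately simplified and is not the real ``_Block_copy``
+implementation: it always copies to the heap, keeps no reference count on the
+block itself, and uses a hypothetical ``toy_resource`` as the captured value.
+All ``toy_`` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { TOY_BLOCK_HAS_COPY_DISPOSE = (1 << 25) };

struct toy_descriptor {
    unsigned long reserved;
    unsigned long size;
    void (*copy_helper)(void *dst, void *src);
    void (*dispose_helper)(void *src);
};

/* Common prefix shared by every block literal in this sketch. */
struct toy_block_header {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void *);
    struct toy_descriptor *descriptor;
};

/* Hypothetical reference-counted value captured by the block. */
struct toy_resource { int refcount; };

struct toy_block {
    struct toy_block_header header;
    struct toy_resource *captured;
};

/* Copy helper: the new copy takes its own reference on the captured value. */
void toy_copy_helper(void *dst, void *src) {
    struct toy_block *to = dst;
    struct toy_block *from = src;
    to->captured = from->captured;
    to->captured->refcount++;
}

/* Dispose helper: drop the reference held by the copy being destroyed. */
void toy_dispose_helper(void *src) {
    struct toy_block *block = src;
    block->captured->refcount--;
}

/* Bitwise-copy the literal, then let the helper fix up imported fields. */
void *toy_block_copy(void *block) {
    assert(block != NULL);
    struct toy_block_header *header = block;
    size_t size = (size_t)header->descriptor->size;
    void *copy = malloc(size);
    if (copy == NULL)
        return NULL;
    memcpy(copy, block, size);
    if (header->flags & TOY_BLOCK_HAS_COPY_DISPOSE)
        header->descriptor->copy_helper(copy, block);
    return copy;
}

/* Run the dispose helper, if any, before freeing a heap copy. */
void toy_block_release(void *heap_block) {
    struct toy_block_header *header = heap_block;
    if (header->flags & TOY_BLOCK_HAS_COPY_DISPOSE)
        header->descriptor->dispose_helper(heap_block);
    free(heap_block);
}
```

+
+The (1<<25) flag is what tells this model that the descriptor carries helper
+pointers at all; without it the copy is a plain ``memcpy``.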
+
+Importing ``__attribute__((NSObject))`` variables
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+GCC introduces ``__attribute__((NSObject))`` on structure pointers to mean "this
+is an object". This is useful because many low level data structures are
+declared as opaque structure pointers, e.g. ``CFStringRef``, ``CFArrayRef``,
+etc. When used from C, however, these are still really objects, and they are
+the second case that requires copy and dispose helper functions to be
+generated. The copy helper functions generated by the compiler should use the
+``_Block_object_assign`` runtime helper function, and the dispose helper should
+call the ``_Block_object_dispose`` runtime helper function.
+
+For example, ``Block`` foo in the following:
+
+.. code-block:: c
+
+ struct Opaque *__attribute__((NSObject)) objectPointer = ...;
+ ...
+ void (^foo)(void) = ^{ CFPrint(objectPointer); };
+
+would have the following helper functions generated:
+
+.. code-block:: c
+
+ void __block_copy_foo(struct __block_literal_5 *dst, struct __block_literal_5 *src) {
+ _Block_object_assign(&dst->objectPointer, src->objectPointer, BLOCK_FIELD_IS_OBJECT);
+ }
+
+ void __block_dispose_foo(struct __block_literal_5 *src) {
+ _Block_object_dispose(src->objectPointer, BLOCK_FIELD_IS_OBJECT);
+ }
+
+Imported ``__block`` marked variables
+-------------------------------------
+
+Layout of ``__block`` marked variables
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The compiler must embed variables that are marked ``__block`` in a specialized
+structure of the form:
+
+.. code-block:: c
+
+ struct _block_byref_foo {
+ void *isa;
+ struct _block_byref_foo *forwarding;
+ int flags; //refcount;
+ int size;
+ typeof(marked_variable) marked_variable;
+ };
+
+Variables of certain types require helper functions for when ``Block_copy()``
+and ``Block_release()`` are performed upon a referencing ``Block``. At the "C"
+level only variables that are of type ``Block`` or ones that have
+``__attribute__((NSObject))`` marked require helper functions. In Objective-C,
+objects require helper functions, and in C++, stack based objects require helper
+functions. Variables that require helper functions use the form:
+
+.. code-block:: c
+
+ struct _block_byref_foo {
+ void *isa;
+ struct _block_byref_foo *forwarding;
+ int flags; //refcount;
+ int size;
+ // helper functions called via Block_copy() and Block_release()
+ void (*byref_keep)(void *dst, void *src);
+ void (*byref_dispose)(void *);
+ typeof(marked_variable) marked_variable;
+ };
+
+The structure is initialized such that:
+
+ a. The ``forwarding`` pointer is set to the beginning of its enclosing
+ structure.
+
+ b. The ``size`` field is initialized to the total size of the enclosing
+ structure.
+
+ c. The ``flags`` field is set to either 0 if no helper functions are needed
+ or (1<<25) if they are.
+
+ d. The helper functions are initialized (if present).
+
+ e. The variable itself is set to its initial value.
+
+ f. The ``isa`` field is set to ``NULL``.
+
+Access to ``__block`` variables from within its lexical scope
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In order to "move" the variable to the heap upon a ``copy_helper`` operation the
+compiler must rewrite access to such a variable to be indirect through the
+structure's ``forwarding`` pointer. For example:
+
+.. code-block:: c
+
+ int __block i = 10;
+ i = 11;
+
+would be rewritten to be:
+
+.. code-block:: c
+
+ struct _block_byref_i {
+ void *isa;
+ struct _block_byref_i *forwarding;
+ int flags; //refcount;
+ int size;
+ int captured_i;
+ } i = { NULL, &i, 0, sizeof(struct _block_byref_i), 10 };
+
+ i.forwarding->captured_i = 11;
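+
+Why every access goes through ``forwarding`` becomes visible once the structure
+is moved. The following plain C sketch imitates that move with ``malloc``; the
+function ``toy_byref_move_to_heap`` is a hypothetical stand-in for what the
+runtime does during a copy, and reference counting is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_byref_i {
    void *isa;
    struct toy_byref_i *forwarding;
    int flags;
    int size;
    int captured_i;
};

/* Copy the byref structure to the heap and redirect the stack copy to it. */
struct toy_byref_i *toy_byref_move_to_heap(struct toy_byref_i *stack_copy) {
    assert(stack_copy != NULL);
    if (stack_copy->forwarding != stack_copy)
        return stack_copy->forwarding; /* already moved */
    struct toy_byref_i *heap_copy = malloc(sizeof *heap_copy);
    if (heap_copy == NULL)
        return NULL;
    memcpy(heap_copy, stack_copy, sizeof *heap_copy);
    heap_copy->forwarding = heap_copy;  /* heap copy points at itself */
    stack_copy->forwarding = heap_copy; /* stack copy now points at the heap */
    return heap_copy;
}

int toy_forwarding_demo(void) {
    struct toy_byref_i i = { NULL, &i, 0, (int)sizeof(struct toy_byref_i), 10 };
    i.forwarding->captured_i = 11; /* before the move: writes the stack copy */

    struct toy_byref_i *heap = toy_byref_move_to_heap(&i);
    if (heap == NULL)
        return -1;

    i.forwarding->captured_i = 12; /* after the move: writes the heap copy */
    int seen_through_heap = heap->captured_i;
    free(heap);
    return seen_through_heap;
}
```

+
+Code in the enclosing scope and code in a copied ``Block`` both reach the same
+storage, because both follow ``forwarding`` rather than touching the field
+directly.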
+
+In the case of a ``Block`` reference variable being marked ``__block`` the
+helper code generated must use the ``_Block_object_assign`` and
+``_Block_object_dispose`` routines supplied by the runtime to make the
+copies. For example:
+
+.. code-block:: c
+
+ __block void (^voidBlock)(void) = blockA;
+ voidBlock = blockB;
+
+would translate into:
+
+.. code-block:: c
+
+ struct _block_byref_voidBlock {
+ void *isa;
+ struct _block_byref_voidBlock *forwarding;
+ int flags; //refcount;
+ int size;
+ void (*byref_keep)(struct _block_byref_voidBlock *dst, struct _block_byref_voidBlock *src);
+ void (*byref_dispose)(struct _block_byref_voidBlock *);
+ void (^captured_voidBlock)(void);
+ };
+
+ void _block_byref_keep_helper(struct _block_byref_voidBlock *dst, struct _block_byref_voidBlock *src) {
+ //_Block_copy_assign(&dst->captured_voidBlock, src->captured_voidBlock, 0);
+ _Block_object_assign(&dst->captured_voidBlock, src->captured_voidBlock, BLOCK_FIELD_IS_BLOCK | BLOCK_BYREF_CALLER);
+ }
+
+ void _block_byref_dispose_helper(struct _block_byref_voidBlock *param) {
+ //_Block_destroy(param->captured_voidBlock, 0);
+ _Block_object_dispose(param->captured_voidBlock, BLOCK_FIELD_IS_BLOCK | BLOCK_BYREF_CALLER);
+ }
+
+and:
+
+.. code-block:: c
+
+ struct _block_byref_voidBlock voidBlock = {( .forwarding=&voidBlock, .flags=(1<<25), .size=sizeof(struct _block_byref_voidBlock),
+ .byref_keep=_block_byref_keep_helper, .byref_dispose=_block_byref_dispose_helper,
+ .captured_voidBlock=blockA )};
+
+ voidBlock.forwarding->captured_voidBlock = blockB;
+
+Importing ``__block`` variables into ``Blocks``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A ``Block`` that uses a ``__block`` variable in its compound statement body must
+import the variable and emit ``copy_helper`` and ``dispose_helper`` helper
+functions that, in turn, call back into the runtime to actually copy or release
+the ``byref`` data block using the functions ``_Block_object_assign`` and
+``_Block_object_dispose``.
+
+For example:
+
+.. code-block:: c
+
+ int __block i = 2;
+ functioncall(^{ i = 10; });
+
+would translate to:
+
+.. code-block:: c
+
+ struct _block_byref_i {
+ void *isa; // set to NULL
+ struct _block_byref_i *forwarding;
+ int flags; //refcount;
+ int size;
+ void (*byref_keep)(struct _block_byref_i *dst, struct _block_byref_i *src);
+ void (*byref_dispose)(struct _block_byref_i *);
+ int captured_i;
+ };
+
+
+ struct __block_literal_5 {
+ void *isa;
+ int flags;
+ int reserved;
+ void (*invoke)(struct __block_literal_5 *);
+ struct __block_descriptor_5 *descriptor;
+ struct _block_byref_i *i_holder;
+ };
+
+ void __block_invoke_5(struct __block_literal_5 *_block) {
+ _block->i_holder->forwarding->captured_i = 10;
+ }
+
+ void __block_copy_5(struct __block_literal_5 *dst, struct __block_literal_5 *src) {
+ //_Block_byref_assign_copy(&dst->i_holder, src->i_holder);
+ _Block_object_assign(&dst->i_holder, src->i_holder, BLOCK_FIELD_IS_BYREF | BLOCK_BYREF_CALLER);
+ }
+
+ void __block_dispose_5(struct __block_literal_5 *src) {
+ //_Block_byref_release(src->i_holder);
+ _Block_object_dispose(src->i_holder, BLOCK_FIELD_IS_BYREF | BLOCK_BYREF_CALLER);
+ }
+
+ static struct __block_descriptor_5 {
+ unsigned long int reserved;
+ unsigned long int Block_size;
+ void (*copy_helper)(struct __block_literal_5 *dst, struct __block_literal_5 *src);
+ void (*dispose_helper)(struct __block_literal_5 *);
+ } __block_descriptor_5 = { 0, sizeof(struct __block_literal_5), __block_copy_5, __block_dispose_5 };
+
+and:
+
+.. code-block:: c
+
+ struct _block_byref_i i = {( .forwarding=&i, .flags=0, .size=sizeof(struct _block_byref_i), .captured_i=2 )};
+ struct __block_literal_5 _block_literal = {
+ &_NSConcreteStackBlock,
+ (1<<25)|(1<<29), <uninitialized>,
+ __block_invoke_5,
+ &__block_descriptor_5,
+ &i, // a reference to the on-stack structure containing "captured_i"
+ };
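+
+How the descriptor's ``Block_size``, ``copy_helper`` and ``dispose_helper``
+fields cooperate during a copy can be sketched in plain C. This is a
+simplified simulation under stated assumptions, not the runtime's source: all
+``sim_`` names are hypothetical, reference counting is omitted, and the
+``counter`` field merely stands in for an imported variable so that helper
+calls can be observed.
+

```c
#include <stdlib.h>
#include <string.h>

enum { SIM_HAS_COPY_DISPOSE = 1 << 25 };   /* the (1<<25) bit used in the text */

struct sim_literal;

struct sim_descriptor {
    unsigned long int reserved;
    unsigned long int Block_size;
    void (*copy_helper)(struct sim_literal *dst, struct sim_literal *src);
    void (*dispose_helper)(struct sim_literal *src);
};

struct sim_literal {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(struct sim_literal *);
    struct sim_descriptor *descriptor;
    int *counter;   /* hypothetical imported field; tracks helper calls */
};

/* Stand-ins for compiler-generated helpers: they only record that they ran. */
static void sim_copy_helper(struct sim_literal *dst, struct sim_literal *src) {
    (void)src;
    *dst->counter += 1;
}

static void sim_dispose_helper(struct sim_literal *src) {
    *src->counter -= 1;
}

/* Simplified copy: allocate Block_size bytes, copy the literal bitwise, then
   let the copy helper fix up the imported fields if the flag bit is set. */
struct sim_literal *sim_block_copy(struct sim_literal *src) {
    struct sim_literal *dst = malloc(src->descriptor->Block_size);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, src->descriptor->Block_size);
    if (src->flags & SIM_HAS_COPY_DISPOSE)
        src->descriptor->copy_helper(dst, src);
    return dst;
}

/* Simplified release: run the dispose helper if present, then free. */
void sim_block_release(struct sim_literal *heap_block) {
    if (heap_block->flags & SIM_HAS_COPY_DISPOSE)
        heap_block->descriptor->dispose_helper(heap_block);
    free(heap_block);
}

/* Copies and releases one literal, reporting the counter after each step.
   Returns 0 on success, -1 if the allocation failed. */
int sim_copy_demo(int *after_copy, int *after_release) {
    int counter = 0;
    struct sim_descriptor desc = {
        0, sizeof(struct sim_literal), sim_copy_helper, sim_dispose_helper
    };
    struct sim_literal stack_block = {
        NULL, SIM_HAS_COPY_DISPOSE, 0, NULL, &desc, &counter
    };
    struct sim_literal *heap_block = sim_block_copy(&stack_block);
    if (heap_block == NULL)
        return -1;
    *after_copy = counter;
    sim_block_release(heap_block);
    *after_release = counter;
    return 0;
}
```

+
+In the real ABI the helpers would call ``_Block_object_assign`` and
+``_Block_object_dispose`` on each imported field rather than adjust a counter.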
+
+Importing ``__attribute__((NSObject))`` ``__block`` variables
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A ``__block`` variable that is also marked ``__attribute__((NSObject))`` should
+have ``byref_keep`` and ``byref_dispose`` helper functions that use
+``_Block_object_assign`` and ``_Block_object_dispose``.
+
+``__block`` escapes
+^^^^^^^^^^^^^^^^^^^
+
+Because ``Blocks`` referencing ``__block`` variables may have ``Block_copy()``
+performed upon them the underlying storage for the variables may move to the
+heap. In Objective-C Garbage Collection Only compilation environments the heap
+used is the garbage collected one and no further action is required. Otherwise
+the compiler must issue a call to potentially release any heap storage for
+``__block`` variables at all escapes or terminations of their scope. The call
+should be:
+
+.. code-block:: c
+
+ _Block_object_dispose(&_block_byref_foo, BLOCK_FIELD_IS_BYREF);
+
+Nesting
+^^^^^^^
+
+``Blocks`` may contain ``Block`` literal expressions. Any variables used within
+inner blocks are imported into all enclosing ``Block`` scopes even if the
+variables are not used. This includes ``const`` imports as well as ``__block``
+variables.
+
+Objective C Extensions to ``Blocks``
+====================================
+
+Importing Objects
+-----------------
+
+Objects should be treated as ``__attribute__((NSObject))`` variables; all
+``copy_helper``, ``dispose_helper``, ``byref_keep``, and ``byref_dispose``
+helper functions should use ``_Block_object_assign`` and
+``_Block_object_dispose``. There should be no code generated that uses
+``*-retain`` or ``*-release`` methods.
+
+``Blocks`` as Objects
+---------------------
+
+The compiler will treat ``Blocks`` as objects when synthesizing property setters
+and getters, will characterize them as objects when generating garbage
+collection strong and weak layout information in the same manner as objects, and
+will issue strong and weak write-barrier assignments in the same manner as
+objects.
+
+``__weak __block`` Support
+--------------------------
+
+Objective-C (and Objective-C++) support the ``__weak`` attribute on ``__block``
+variables. Under normal circumstances the compiler uses the Objective-C runtime
+helper support functions ``objc_assign_weak`` and ``objc_read_weak``. Both
+should continue to be used for all reads and writes of ``__weak __block``
+variables:
+
+.. code-block:: c
+
+ objc_read_weak(&block->byref_i->forwarding->i)
+
+The ``__weak`` variable is stored in a ``_block_byref_foo`` structure and the
+``Block`` has copy and dispose helpers for this structure that call:
+
+.. code-block:: c
+
+ _Block_object_assign(&dest->_block_byref_i, src->_block_byref_i, BLOCK_FIELD_IS_WEAK | BLOCK_FIELD_IS_BYREF);
+
+and:
+
+.. code-block:: c
+
+ _Block_object_dispose(src->_block_byref_i, BLOCK_FIELD_IS_WEAK | BLOCK_FIELD_IS_BYREF);
+
+In turn, the ``block_byref`` copy support helpers distinguish whether the
+``__block`` variable is a ``Block`` or an object and should call either:
+
+.. code-block:: c
+
+ _Block_object_assign(&dest->_block_byref_i, src->_block_byref_i, BLOCK_FIELD_IS_WEAK | BLOCK_FIELD_IS_OBJECT | BLOCK_BYREF_CALLER);
+
+for something declared as an object or:
+
+.. code-block:: c
+
+ _Block_object_assign(&dest->_block_byref_i, src->_block_byref_i, BLOCK_FIELD_IS_WEAK | BLOCK_FIELD_IS_BLOCK | BLOCK_BYREF_CALLER);
+
+for something declared as a ``Block``.
+
+A full example follows:
+
+.. code-block:: c
+
+ __block __weak id obj = <initialization expression>;
+ functioncall(^{ [obj somemessage]; });
+
+would translate to:
+
+.. code-block:: c
+
+ struct _block_byref_obj {
+ void *isa; // uninitialized
+ struct _block_byref_obj *forwarding;
+ int flags; //refcount;
+ int size;
+ void (*byref_keep)(struct _block_byref_obj *dst, struct _block_byref_obj *src);
+ void (*byref_dispose)(struct _block_byref_obj *);
+ id captured_obj;
+ };
+
+ void _block_byref_obj_keep(struct _block_byref_obj *dst, struct _block_byref_obj *src) {
+ //_Block_copy_assign(&dst->captured_obj, src->captured_obj, 0);
+ _Block_object_assign(&dst->captured_obj, src->captured_obj, BLOCK_FIELD_IS_OBJECT | BLOCK_FIELD_IS_WEAK | BLOCK_BYREF_CALLER);
+ }
+
+ void _block_byref_obj_dispose(struct _block_byref_obj *param) {
+ //_Block_destroy(param->captured_obj, 0);
+ _Block_object_dispose(param->captured_obj, BLOCK_FIELD_IS_OBJECT | BLOCK_FIELD_IS_WEAK | BLOCK_BYREF_CALLER);
+ }
+
+for the block ``byref`` part and:
+
+.. code-block:: c
+
+ struct __block_literal_5 {
+ void *isa;
+ int flags;
+ int reserved;
+ void (*invoke)(struct __block_literal_5 *);
+ struct __block_descriptor_5 *descriptor;
+ struct _block_byref_obj *byref_obj;
+ };
+
+ void __block_invoke_5(struct __block_literal_5 *_block) {
+ [objc_read_weak(&_block->byref_obj->forwarding->captured_obj) somemessage];
+ }
+
+ void __block_copy_5(struct __block_literal_5 *dst, struct __block_literal_5 *src) {
+ //_Block_byref_assign_copy(&dst->byref_obj, src->byref_obj);
+ _Block_object_assign(&dst->byref_obj, src->byref_obj, BLOCK_FIELD_IS_BYREF | BLOCK_FIELD_IS_WEAK);
+ }
+
+ void __block_dispose_5(struct __block_literal_5 *src) {
+ //_Block_byref_release(src->byref_obj);
+ _Block_object_dispose(src->byref_obj, BLOCK_FIELD_IS_BYREF | BLOCK_FIELD_IS_WEAK);
+ }
+
+ static struct __block_descriptor_5 {
+ unsigned long int reserved;
+ unsigned long int Block_size;
+ void (*copy_helper)(struct __block_literal_5 *dst, struct __block_literal_5 *src);
+ void (*dispose_helper)(struct __block_literal_5 *);
+ } __block_descriptor_5 = { 0, sizeof(struct __block_literal_5), __block_copy_5, __block_dispose_5 };
+
+and within the compound statement:
+
+.. code-block:: c
+
+ struct _block_byref_obj obj = {( .forwarding=&obj, .flags=(1<<25), .size=sizeof(struct _block_byref_obj),
+ .byref_keep=_block_byref_obj_keep, .byref_dispose=_block_byref_obj_dispose,
+ .captured_obj = <initialization expression> )};
+
+ struct __block_literal_5 _block_literal = {
+ &_NSConcreteStackBlock,
+ (1<<25)|(1<<29), <uninitialized>,
+ __block_invoke_5,
+ &__block_descriptor_5,
+ &obj, // a reference to the on-stack structure containing "captured_obj"
+ };
+
+
+ functioncall(&_block_literal);
+
+C++ Support
+===========
+
+Within a block, stack-based C++ objects are copied into ``const`` copies using
+the copy constructor. It is an error if a stack-based C++ object is used within
+a block if it does not have a copy constructor. In addition, both copy and
+destroy helper routines must be synthesized for the block to support the
+``Block_copy()`` operation, and the flags word is marked with the (1<<26) bit in
+addition to the (1<<25) bit. The copy helper should call the constructor using
+appropriate offsets of the variable within the supplied stack-based block source
+and heap-based destination for all ``const`` constructed copies, and similarly
+should call the destructor in the destroy routine.
+
+As an example, suppose a C++ class ``FOO`` existed with a copy constructor.
+Within a code block a stack version of a ``FOO`` object is declared and used
+within a ``Block`` literal expression:
+
+.. code-block:: c++
+
+ {
+ FOO foo;
+ void (^block)(void) = ^{ printf("%d\n", foo.value()); };
+ }
+
+The compiler would synthesize:
+
+.. code-block:: c++
+
+ struct __block_literal_10 {
+ void *isa;
+ int flags;
+ int reserved;
+ void (*invoke)(struct __block_literal_10 *);
+ struct __block_descriptor_10 *descriptor;
+ const FOO foo;
+ };
+
+ void __block_invoke_10(struct __block_literal_10 *_block) {
+ printf("%d\n", _block->foo.value());
+ }
+
+ void __block_copy_10(struct __block_literal_10 *dst, struct __block_literal_10 *src) {
+ FOO_ctor(&dst->foo, &src->foo);
+ }
+
+ void __block_dispose_10(struct __block_literal_10 *src) {
+ FOO_dtor(&src->foo);
+ }
+
+ static struct __block_descriptor_10 {
+ unsigned long int reserved;
+ unsigned long int Block_size;
+ void (*copy_helper)(struct __block_literal_10 *dst, struct __block_literal_10 *src);
+ void (*dispose_helper)(struct __block_literal_10 *);
+ } __block_descriptor_10 = { 0, sizeof(struct __block_literal_10), __block_copy_10, __block_dispose_10 };
+
+and the code would be:
+
+.. code-block:: c++
+
+ {
+ FOO foo;
+ comp_ctor(&foo); // default constructor
+ struct __block_literal_10 _block_literal = {
+ &_NSConcreteStackBlock,
+ (1<<25)|(1<<26)|(1<<29), <uninitialized>,
+ __block_invoke_10,
+ &__block_descriptor_10,
+ };
+ comp_ctor(&_block_literal.foo, &foo); // const copy into stack version
+ struct __block_literal_10 *block = &_block_literal; // assign literal to block variable
+ block->invoke(block); // invoke block
+ comp_dtor(&_block_literal.foo); // destroy stack version of const block copy
+ comp_dtor(&foo); // destroy original version
+ }
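+
+The way the two helper-related bits of the flags word combine can be written
+as a short C sketch. The enumerator and function names below are hypothetical
+labels for the (1<<25) and (1<<26) bits named in the text; the (1<<29) bit that
+also appears in the examples is deliberately left out because this section
+does not describe it.
+

```c
enum {
    SKETCH_HAS_COPY_DISPOSE = 1 << 25,  /* copy/dispose helpers are present */
    SKETCH_HAS_CXX_HELPERS  = 1 << 26   /* helpers run C++ constructors/destructors */
};

/* Returns the helper-related bits of a Block literal's flags word.
   Helpers that run C++ code imply that helpers exist at all, so the
   (1<<26) bit is never set without the (1<<25) bit. */
int sketch_literal_flags(int needs_helpers, int helpers_run_cxx_code) {
    int flags = 0;
    if (needs_helpers || helpers_run_cxx_code)
        flags |= SKETCH_HAS_COPY_DISPOSE;
    if (helpers_run_cxx_code)
        flags |= SKETCH_HAS_CXX_HELPERS;
    return flags;
}
```
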
+
+
+C++ objects stored in ``__block`` storage start out on the stack in a
+``block_byref`` data structure as do other variables. Such objects (if not
+``const`` objects) must support a regular copy constructor. The ``block_byref``
+data structure will have copy and destroy helper routines synthesized by the
+compiler. The copy helper will have code created to perform the copy
+constructor based on the initial stack ``block_byref`` data structure, and will
+also set the (1<<26) bit in addition to the (1<<25) bit. The destroy helper
+will have code to do the destructor on the object stored within the supplied
+``block_byref`` heap data structure. For example,
+
+.. code-block:: c++
+
+ __block FOO blockStorageFoo;
+
+requires the normal constructor for the embedded ``blockStorageFoo`` object:
+
+.. code-block:: c++
+
+ FOO_ctor(&_block_byref_blockStorageFoo->blockStorageFoo);
+
+and at scope termination the destructor:
+
+.. code-block:: c++
+
+ FOO_dtor(&_block_byref_blockStorageFoo->blockStorageFoo);
+
+Note that the forwarding indirection is *NOT* used.
+
+The compiler would need to generate (if used from a block literal) the following
+copy/dispose helpers:
+
+.. code-block:: c++
+
+ void _block_byref_obj_keep(struct _block_byref_blockStorageFoo *dst, struct _block_byref_blockStorageFoo *src) {
+ FOO_ctor(&dst->blockStorageFoo, &src->blockStorageFoo);
+ }
+
+ void _block_byref_obj_dispose(struct _block_byref_blockStorageFoo *src) {
+ FOO_dtor(&src->blockStorageFoo);
+ }
+
+for the appropriately named constructor and destructor for the class/struct
+``FOO``.
+
+To support member variable and function access the compiler will synthesize a
+``const`` pointer to a block version of the ``this`` pointer.
+
+.. _RuntimeHelperFunctions:
+
+Runtime Helper Functions
+========================
+
+The runtime helper functions are described in
+``/usr/local/include/Block_private.h``. To summarize their use, a ``Block``
+requires copy/dispose helpers if it imports any block variables, ``__block``
+storage variables, ``__attribute__((NSObject))`` variables, or C++ ``const``
+copied objects with constructor/destructors. The (1<<25) bit is set and the
+helper functions are generated.
+
+The block copy helper function should, for each of the variables of the type
+mentioned above, call:
+
+.. code-block:: c
+
+ _Block_object_assign(&dst->target, src->target, BLOCK_FIELD_<appropo>);
+
+in the copy helper and:
+
+.. code-block:: c
+
+ _Block_object_dispose(src->target, BLOCK_FIELD_<appropo>);
+
+in the dispose helper where ``<appropo>`` is:
+
+.. code-block:: c
+
+ enum {
+ BLOCK_FIELD_IS_OBJECT = 3, // id, NSObject, __attribute__((NSObject)), block, ...
+ BLOCK_FIELD_IS_BLOCK = 7, // a block variable
+ BLOCK_FIELD_IS_BYREF = 8, // the on stack structure holding the __block variable
+
+ BLOCK_FIELD_IS_WEAK = 16, // declared __weak
+
+ BLOCK_BYREF_CALLER = 128, // called from byref copy/dispose helpers
+ };
+
+and of course the constructors/destructors for ``const`` copied C++ objects.
+
+The ``block_byref`` data structure similarly requires copy/dispose helpers for
+block variables, ``__attribute__((NSObject))`` variables, or C++ ``const``
+copied objects with constructor/destructors, and again the (1<<25) bit is set
+and functions are generated in the same manner.
+
+Under ObjC we allow ``__weak`` as an attribute on ``__block`` variables, and
+this causes ``BLOCK_FIELD_IS_WEAK`` to be ORed onto the
+``BLOCK_FIELD_IS_BYREF`` flag when copying the ``block_byref`` structure in the
+``Block`` copy helper, and onto the ``BLOCK_FIELD_<appropo>`` field within the
+``block_byref`` copy/dispose helper calls.
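+
+The flag combinations described above can be computed mechanically. The
+enumerators in this sketch are the ones listed earlier in this section; the
+two function names are hypothetical and exist only to show which bits are
+combined in which kind of helper, following the ``__weak __block`` examples.
+

```c
enum {
    BLOCK_FIELD_IS_OBJECT = 3,    /* id, NSObject, __attribute__((NSObject)), block, ... */
    BLOCK_FIELD_IS_BLOCK  = 7,    /* a block variable */
    BLOCK_FIELD_IS_BYREF  = 8,    /* the on stack structure holding the __block variable */
    BLOCK_FIELD_IS_WEAK   = 16,   /* declared __weak */
    BLOCK_BYREF_CALLER    = 128   /* called from byref copy/dispose helpers */
};

/* Flags a Block copy/dispose helper passes when the imported field is the
   block_byref structure of a __block variable. */
int block_helper_flags_for_byref(int is_weak) {
    return BLOCK_FIELD_IS_BYREF | (is_weak ? BLOCK_FIELD_IS_WEAK : 0);
}

/* Flags a block_byref keep/dispose helper passes for the captured field,
   which is either a Block or an object. */
int byref_helper_flags(int field_is_block, int is_weak) {
    int flags = field_is_block ? BLOCK_FIELD_IS_BLOCK : BLOCK_FIELD_IS_OBJECT;
    flags |= BLOCK_BYREF_CALLER;
    if (is_weak)
        flags |= BLOCK_FIELD_IS_WEAK;
    return flags;
}
```
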
+
+The prototypes, and summary, of the helper functions are:
+
+.. code-block:: c
+
+ /* Certain field types require runtime assistance when being copied to the
+ heap. The following function is used to copy fields of types: blocks,
+ pointers to byref structures, and objects (including
+ __attribute__((NSObject)) pointers). BLOCK_FIELD_IS_WEAK is orthogonal to
+ the other choices which are mutually exclusive. Only in a Block copy
+ helper will one see BLOCK_FIELD_IS_BYREF.
+ */
+ void _Block_object_assign(void *destAddr, const void *object, const int flags);
+
+ /* Similarly a compiler generated dispose helper needs to call back for each
+ field of the byref data structure. (Currently the implementation only
+ packs one field into the byref structure but in principle there could be
+ more). The same flags used in the copy helper should be used for each
+ call generated to this function:
+ */
+ void _Block_object_dispose(const void *object, const int flags);
+
+Copyright
+=========
+
+Copyright 2008-2010 Apple, Inc.
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
Added: www-releases/trunk/3.5.1/tools/clang/docs/Block-ABI-Apple.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/Block-ABI-Apple.txt?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/Block-ABI-Apple.txt (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/Block-ABI-Apple.txt Tue Jan 13 16:55:20 2015
@@ -0,0 +1 @@
+*NOTE* This document has moved to http://clang.llvm.org/docs/Block-ABI-Apple.html.
Added: www-releases/trunk/3.5.1/tools/clang/docs/BlockLanguageSpec.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/BlockLanguageSpec.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/BlockLanguageSpec.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/BlockLanguageSpec.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,361 @@
+
+.. role:: block-term
+
+=================================
+Language Specification for Blocks
+=================================
+
+.. contents::
+ :local:
+
+Revisions
+=========
+
+- 2008/2/25 --- created
+- 2008/7/28 --- revised, ``__block`` syntax
+- 2008/8/13 --- revised, Block globals
+- 2008/8/21 --- revised, C++ elaboration
+- 2008/11/1 --- revised, ``__weak`` support
+- 2009/1/12 --- revised, explicit return types
+- 2009/2/10 --- revised, ``__block`` objects need retain
+
+Overview
+========
+
+A new derived type is introduced to C and, by extension, Objective-C,
+C++, and Objective-C++.
+
+The Block Type
+==============
+
+Like a function type, the :block-term:`Block type` is a pair consisting
+of a result value type and a list of parameter types. Blocks are
+intended to be used much like functions, with the key distinction
+being that in addition to executable code they also contain various
+variable bindings to automatic (stack) or managed (heap) memory.
+
+The abstract declarator,
+
+.. code-block:: c
+
+ int (^)(char, float)
+
+describes a reference to a Block that, when invoked, takes two
+parameters, the first of type char and the second of type float, and
+returns a value of type int. The referenced Block is opaque data
+that may reside in automatic (stack) memory, global memory, or heap
+memory.
+
+Block Variable Declarations
+===========================
+
+A :block-term:`variable with Block type` is declared using function
+pointer style notation substituting ``^`` for ``*``. The following are
+valid Block variable declarations:
+
+.. code-block:: c
+
+ void (^blockReturningVoidWithVoidArgument)(void);
+ int (^blockReturningIntWithIntAndCharArguments)(int, char);
+ void (^arrayOfTenBlocksReturningVoidWithIntArgument[10])(int);
+
+Variadic ``...`` arguments are supported. [variadic.c] A Block that
+takes no arguments must specify void in the argument list [voidarg.c].
+An empty parameter list does not represent, as K&R provide, an
+unspecified argument list. Note: both gcc and clang support K&R style
+as a convenience.
+
+A Block reference may be cast to a pointer of arbitrary type and vice
+versa. [cast.c] A Block reference may not be dereferenced via the
+pointer dereference operator ``*``, and thus a Block's size may not be
+computed at compile time. [sizeof.c]
+
+Block Literal Expressions
+=========================
+
+A :block-term:`Block literal expression` produces a reference to a
+Block. It is introduced by the use of the ``^`` token as a unary
+operator.
+
+.. code-block:: c
+
+ Block_literal_expression ::= ^ block_decl compound_statement_body
+ block_decl ::=
+ block_decl ::= parameter_list
+ block_decl ::= type_expression
+
+where type expression is extended to allow ``^`` as a Block reference
+(pointer) where ``*`` is allowed as a function reference (pointer).
+
+The following Block literal:
+
+.. code-block:: c
+
+ ^ void (void) { printf("hello world\n"); }
+
+produces a reference to a Block that takes no arguments and returns no value.
+
+The return type is optional and is inferred from the return
+statements. If the return statements return a value, they all must
+return a value of the same type. If there is no value returned the
+inferred type of the Block is void; otherwise it is the type of the
+return statement value.
+
+If the return type is omitted and the argument list is ``( void )``,
+the ``( void )`` argument list may also be omitted.
+
+So:
+
+.. code-block:: c
+
+ ^ ( void ) { printf("hello world\n"); }
+
+and:
+
+.. code-block:: c
+
+ ^ { printf("hello world\n"); }
+
+are exactly equivalent constructs for the same expression.
+
+The type_expression extends C expression parsing to accommodate Block
+reference declarations as it accommodates function pointer
+declarations.
+
+Given:
+
+.. code-block:: c
+
+ typedef int (*pointerToFunctionThatReturnsIntWithCharArg)(char);
+ pointerToFunctionThatReturnsIntWithCharArg functionPointer;
+ ^ pointerToFunctionThatReturnsIntWithCharArg (float x) { return functionPointer; }
+
+and:
+
+.. code-block:: c
+
+ ^ int ((*)(float x))(char) { return functionPointer; }
+
+are equivalent expressions, as is:
+
+.. code-block:: c
+
+ ^(float x) { return functionPointer; }
+
+[returnfunctionptr.c]
+
+The compound statement body establishes a new lexical scope within
+that of its parent. Variables used within the scope of the compound
+statement are bound to the Block in the normal manner with the
+exception of those in automatic (stack) storage. Thus one may access
+functions and global variables as one would expect, as well as static
+local variables. [testme]
+
+Local automatic (stack) variables referenced within the compound
+statement of a Block are imported and captured by the Block as const
+copies. The capture (binding) is performed at the time of the Block
+literal expression evaluation.
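+
+The effect of capturing a const copy at evaluation time can be imitated in
+plain C without the Blocks extension. The structure and function names below
+are hypothetical; the sketch only mirrors the semantics described above and is
+not what a compiler emits.
+

```c
/* Hypothetical stand-in for a Block that imports one automatic variable. */
struct capture_demo_block {
    int (*invoke)(const struct capture_demo_block *);
    const int captured_x;   /* const copy made when the "literal" is evaluated */
};

static int capture_demo_invoke(const struct capture_demo_block *block) {
    return block->captured_x;
}

/* Returns the value the simulated Block observes for x when it is invoked. */
int capture_demo(void) {
    int x = 1;
    /* Evaluation of the "literal": x is copied into the structure here. */
    struct capture_demo_block block = { capture_demo_invoke, x };
    x = 2;                          /* a later change to x is not seen by the copy */
    return block.invoke(&block);
}
```

+
+Invoking the simulated Block yields the value ``x`` had when the structure
+was initialized, not its later value.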
+
+The compiler is not required to capture a variable if it can prove
+that no references to the variable will actually be evaluated.
+Programmers can force a variable to be captured by referencing it in a
+statement at the beginning of the Block, like so:
+
+.. code-block:: c
+
+ (void) foo;
+
+This matters when capturing the variable has side-effects, as it can
+in Objective-C or C++.
+
+The lifetime of variables declared in a Block is that of a function;
+each activation frame contains a new copy of variables declared within
+the local scope of the Block. Such variable declarations should be
+allowed anywhere [testme] rather than only when C99 parsing is
+requested, including for statements. [testme]
+
+Block literal expressions may occur within Block literal expressions
+(nest) and all variables captured by any nested blocks are implicitly
+also captured in the scopes of their enclosing Blocks.
+
+A Block literal expression may be used as the initialization value for
+Block variables at global or local static scope.
+
+The Invoke Operator
+===================
+
+Blocks are :block-term:`invoked` using function call syntax with a
+list of expression parameters of types corresponding to the
+declaration and returning a result type also according to the
+declaration. Given:
+
+.. code-block:: c
+
+ int (^x)(char);
+ void (^z)(void);
+ int (^(*y))(char) = &x;
+
+the following are all legal Block invocations:
+
+.. code-block:: c
+
+ x('a');
+ (*y)('a');
+ (true ? x : *y)('a');
+
+The Copy and Release Operations
+===============================
+
+The compiler and runtime provide :block-term:`copy` and
+:block-term:`release` operations for Block references that create and,
+in matched use, release allocated storage for referenced Blocks.
+
+The copy operation ``Block_copy()`` is styled as a function that takes
+an arbitrary Block reference and returns a Block reference of the same
+type. The release operation, ``Block_release()``, is styled as a
+function that takes an arbitrary Block reference and, if dynamically
+matched to a Block copy operation, allows recovery of the referenced
+allocated memory.
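+
+What "matched" copy and release means can be sketched with a reference count
+in plain C. This is an illustrative model only: the ``counted_`` names are
+hypothetical, and the real ``Block_copy()`` and ``Block_release()`` operate on
+Block references rather than on this structure.
+

```c
#include <stdlib.h>

/* Hypothetical model of a Block that may live on the stack or the heap. */
struct counted_block {
    int refcount;   /* meaningful only for heap copies */
    int on_heap;
    int payload;
};

static int counted_live_heap_blocks = 0;

/* Number of heap copies that have not yet been fully released. */
int counted_live(void) {
    return counted_live_heap_blocks;
}

/* Copying a stack block allocates a heap copy with a count of one; copying a
   heap block only bumps its count and returns the same pointer. */
struct counted_block *counted_copy(struct counted_block *block) {
    if (block->on_heap) {
        block->refcount++;
        return block;
    }
    struct counted_block *heap = malloc(sizeof *heap);
    if (heap == NULL)
        return NULL;
    *heap = *block;
    heap->on_heap = 1;
    heap->refcount = 1;
    counted_live_heap_blocks++;
    return heap;
}

/* Each release undoes one copy; storage is recovered when the count hits zero.
   Releasing a stack block is a no-op in this model. */
void counted_release(struct counted_block *block) {
    if (!block->on_heap)
        return;
    if (--block->refcount == 0) {
        free(block);
        counted_live_heap_blocks--;
    }
}
```

+
+Two copies followed by two releases leave no heap storage behind, while one
+release fewer would keep the heap copy alive.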
+
+
+The ``__block`` Storage Qualifier
+=================================
+
+In addition to the new Block type we also introduce a new storage
+qualifier, :block-term:`__block`, for local variables. [testme: a
+__block declaration within a block literal] The ``__block`` storage
+qualifier is mutually exclusive with the existing local storage
+qualifiers auto, register, and static. [testme] Variables qualified by
+``__block`` act as if they were in allocated storage and this storage
+is automatically recovered after last use of said variable. An
+implementation may choose an optimization where the storage is
+initially automatic and only "moved" to allocated (heap) storage upon
+a Block_copy of a referencing Block. Such variables may be mutated as
+normal variables are.
+
+In the case where a ``__block`` variable is a Block one must assume
+that the ``__block`` variable resides in allocated storage and as such
+is assumed to reference a Block that is also in allocated storage
+(that it is the result of a ``Block_copy`` operation). Despite this
+there is no provision to do a ``Block_copy`` or a ``Block_release`` if
+an implementation provides initial automatic storage for Blocks. This
+is due to the inherent race condition of potentially several threads
+trying to update the shared variable and the need for synchronization
+around disposing of older values and copying new ones. Such
+synchronization is beyond the scope of this language specification.
+
+
+Control Flow
+============
+
+The compound statement of a Block is treated much like a function body
+with respect to control flow in that goto, break, and continue do not
+escape the Block. Exceptions are treated *normally* in that when
+thrown they pop stack frames until a catch clause is found.
+
+
+Objective-C Extensions
+======================
+
+Objective-C extends the definition of a Block reference type so that
+it is also that of id. A variable or expression of Block type may be
+messaged or used as a parameter wherever an id may be. The converse is
+also true. Block references may thus appear as properties and are
+subject to the assign, retain, and copy attribute logic that is
+reserved for objects.
+
+All Blocks are constructed to be Objective-C objects regardless of
+whether the Objective-C runtime is operational in the program or
+not. Blocks using automatic (stack) memory are objects and may be
+messaged, although they may not be assigned into ``__weak`` locations
+if garbage collection is enabled.
+
+Within a Block literal expression within a method definition
+references to instance variables are also imported into the lexical
+scope of the compound statement. These variables are implicitly
+qualified as references from self, and so self is imported as a const
+copy. The net effect is that instance variables can be mutated.
+
+The :block-term:`Block_copy` operator retains all objects held in
+variables of automatic storage referenced within the Block expression
+(or form strong references if running under garbage collection).
+Object variables of ``__block`` storage type are assumed to hold
+normal pointers with no provision for retain and release messages.
+
+Foundation defines (and supplies) ``-copy`` and ``-release`` methods for
+Blocks.
+
+In the Objective-C and Objective-C++ languages, we allow the
+``__weak`` specifier for ``__block`` variables of object type. If
+garbage collection is not enabled, this qualifier causes these
+variables to be kept without retain messages being sent. This
+knowingly leads to dangling pointers if the Block (or a copy) outlives
+the lifetime of this object.
+
+In garbage collected environments, the ``__weak`` variable is set to
+nil when the object it references is collected, as long as the
+``__block`` variable resides in the heap (either by default or via
+``Block_copy()``). The initial Apple implementation does in fact
+start ``__block`` variables on the stack and migrate them to the heap
+only as a result of a ``Block_copy()`` operation.
+
+It is a runtime error to attempt to assign a reference to a
+stack-based Block into any storage marked ``__weak``, including
+``__weak`` ``__block`` variables.
+
+
+C++ Extensions
+==============
+
+Block literal expressions within functions are extended to allow const
+use of C++ objects, pointers, or references held in automatic storage.
+
+As usual, within the block, references to captured variables become
+const-qualified, as if they were references to members of a const
+object. Note that this does not change the type of a variable of
+reference type.
+
+For example, given a class Foo:
+
+.. code-block:: c++
+
+ Foo foo;
+ Foo &fooRef = foo;
+ Foo *fooPtr = &foo;
+
+A Block that referenced these variables would import the variables as
+const variations:
+
+.. code-block:: c++
+
+ const Foo block_foo = foo;
+ Foo &block_fooRef = fooRef;
+ Foo *const block_fooPtr = fooPtr;
+
+Captured variables are copied into the Block at the instant of
+evaluating the Block literal expression. They are also copied when
+calling ``Block_copy()`` on a Block allocated on the stack. In both
+cases, they are copied as if the variable were const-qualified, and
+it's an error if there's no such constructor.
+
+Captured variables in Blocks on the stack are destroyed when control
+leaves the compound statement that contains the Block literal
+expression. Captured variables in Blocks on the heap are destroyed
+when the reference count of the Block drops to zero.
+
+Variables declared as residing in ``__block`` storage may be initially
+allocated in the heap or may first appear on the stack and be copied
+to the heap as a result of a ``Block_copy()`` operation. When copied
+from the stack, ``__block`` variables are copied using their normal
+qualification (i.e. without adding const). In C++11, ``__block``
+variables are copied as x-values if that is possible, then as l-values
+if not; if both fail, it's an error. The destructor for any initial
+stack-based version is called at the variable's normal end of scope.
+
+References to ``this``, as well as references to non-static members of
+any enclosing class, are evaluated by capturing ``this`` just like a
+normal variable of C pointer type.
+
+Member variables that are Blocks may not be overloaded by the types of
+their arguments.
Added: www-releases/trunk/3.5.1/tools/clang/docs/CMakeLists.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/CMakeLists.txt?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/CMakeLists.txt (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/CMakeLists.txt Tue Jan 13 16:55:20 2015
@@ -0,0 +1,91 @@
+
+if (DOXYGEN_FOUND)
+if (LLVM_ENABLE_DOXYGEN)
+ set(abs_srcdir ${CMAKE_CURRENT_SOURCE_DIR})
+ set(abs_builddir ${CMAKE_CURRENT_BINARY_DIR})
+
+ if (HAVE_DOT)
+ set(DOT ${LLVM_PATH_DOT})
+ endif()
+
+ if (LLVM_DOXYGEN_EXTERNAL_SEARCH)
+ set(enable_searchengine "YES")
+ set(searchengine_url "${LLVM_DOXYGEN_SEARCHENGINE_URL}")
+ set(enable_server_based_search "YES")
+ set(enable_external_search "YES")
+ set(extra_search_mappings "${LLVM_DOXYGEN_SEARCH_MAPPINGS}")
+ else()
+ set(enable_searchengine "NO")
+ set(searchengine_url "")
+ set(enable_server_based_search "NO")
+ set(enable_external_search "NO")
+ set(extra_search_mappings "")
+ endif()
+
+ # If asked, configure doxygen for the creation of a Qt Compressed Help file.
+ if (LLVM_ENABLE_DOXYGEN_QT_HELP)
+ set(CLANG_DOXYGEN_QCH_FILENAME "org.llvm.clang.qch" CACHE STRING
+ "Filename of the Qt Compressed help file")
+ set(CLANG_DOXYGEN_QHP_NAMESPACE "org.llvm.clang" CACHE STRING
+ "Namespace under which the intermediate Qt Help Project file lives")
+ set(CLANG_DOXYGEN_QHP_CUST_FILTER_NAME "Clang ${CLANG_VERSION}" CACHE STRING
+ "See http://qt-project.org/doc/qt-4.8/qthelpproject.html#custom-filters")
+ set(CLANG_DOXYGEN_QHP_CUST_FILTER_ATTRS "Clang,${CLANG_VERSION}" CACHE STRING
+ "See http://qt-project.org/doc/qt-4.8/qthelpproject.html#filter-attributes")
+ set(clang_doxygen_generate_qhp "YES")
+ set(clang_doxygen_qch_filename "${CLANG_DOXYGEN_QCH_FILENAME}")
+ set(clang_doxygen_qhp_namespace "${CLANG_DOXYGEN_QHP_NAMESPACE}")
+ set(clang_doxygen_qhelpgenerator_path "${LLVM_DOXYGEN_QHELPGENERATOR_PATH}")
+ set(clang_doxygen_qhp_cust_filter_name "${CLANG_DOXYGEN_QHP_CUST_FILTER_NAME}")
+ set(clang_doxygen_qhp_cust_filter_attrs "${CLANG_DOXYGEN_QHP_CUST_FILTER_ATTRS}")
+ else()
+ set(clang_doxygen_generate_qhp "NO")
+ set(clang_doxygen_qch_filename "")
+ set(clang_doxygen_qhp_namespace "")
+ set(clang_doxygen_qhelpgenerator_path "")
+ set(clang_doxygen_qhp_cust_filter_name "")
+ set(clang_doxygen_qhp_cust_filter_attrs "")
+ endif()
+
+ configure_file(${CMAKE_CURRENT_SOURCE_DIR}/doxygen.cfg.in
+ ${CMAKE_CURRENT_BINARY_DIR}/doxygen.cfg @ONLY)
+
+ set(abs_top_srcdir)
+ set(abs_top_builddir)
+ set(DOT)
+ set(enable_searchengine)
+ set(searchengine_url)
+ set(enable_server_based_search)
+ set(enable_external_search)
+ set(extra_search_mappings)
+ set(clang_doxygen_generate_qhp)
+ set(clang_doxygen_qch_filename)
+ set(clang_doxygen_qhp_namespace)
+ set(clang_doxygen_qhelpgenerator_path)
+ set(clang_doxygen_qhp_cust_filter_name)
+ set(clang_doxygen_qhp_cust_filter_attrs)
+
+ add_custom_target(doxygen-clang
+ COMMAND ${DOXYGEN_EXECUTABLE} ${CMAKE_CURRENT_BINARY_DIR}/doxygen.cfg
+ WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
+ COMMENT "Generating clang doxygen documentation." VERBATIM)
+
+ if (LLVM_BUILD_DOCS)
+ add_dependencies(doxygen doxygen-clang)
+ endif()
+
+ if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY)
+ install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/doxygen/html
+ DESTINATION docs/html)
+ endif()
+endif()
+endif()
+
+if (LLVM_ENABLE_SPHINX)
+ if (SPHINX_FOUND)
+ include(AddSphinxTarget)
+ if (${SPHINX_OUTPUT_HTML})
+ add_sphinx_target(html clang)
+ endif()
+ endif()
+endif()
Added: www-releases/trunk/3.5.1/tools/clang/docs/ClangCheck.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/ClangCheck.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/ClangCheck.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/ClangCheck.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,36 @@
+==========
+ClangCheck
+==========
+
+`ClangCheck` is a small wrapper around :doc:`LibTooling` which can be used to
+do basic error checking and AST dumping.
+
+.. code-block:: console
+
+ $ cat <<EOF > snippet.cc
+ > void f() {
+ > int a = 0
+ > }
+ > EOF
+ $ ~/clang/build/bin/clang-check snippet.cc -ast-dump --
+ Processing: /Users/danieljasper/clang/llvm/tools/clang/docs/snippet.cc.
+ /Users/danieljasper/clang/llvm/tools/clang/docs/snippet.cc:2:12: error: expected ';' at end of
+ declaration
+ int a = 0
+ ^
+ ;
+ (TranslationUnitDecl 0x7ff3a3029ed0 <<invalid sloc>>
+ (TypedefDecl 0x7ff3a302a410 <<invalid sloc>> __int128_t '__int128')
+ (TypedefDecl 0x7ff3a302a470 <<invalid sloc>> __uint128_t 'unsigned __int128')
+ (TypedefDecl 0x7ff3a302a830 <<invalid sloc>> __builtin_va_list '__va_list_tag [1]')
+ (FunctionDecl 0x7ff3a302a8d0 </Users/danieljasper/clang/llvm/tools/clang/docs/snippet.cc:1:1, line:3:1> f 'void (void)'
+ (CompoundStmt 0x7ff3a302aa10 <line:1:10, line:3:1>
+ (DeclStmt 0x7ff3a302a9f8 <line:2:3, line:3:1>
+ (VarDecl 0x7ff3a302a980 <line:2:3, col:11> a 'int'
+ (IntegerLiteral 0x7ff3a302a9d8 <col:11> 'int' 0))))))
+ 1 error generated.
+ Error while processing snippet.cc.
+
+The '--' at the end is important as it prevents `clang-check` from searching for a
+compilation database. For more information on how to set up and use `clang-check`
+in a project, see :doc:`HowToSetupToolingForLLVM`.
Added: www-releases/trunk/3.5.1/tools/clang/docs/ClangFormat.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/ClangFormat.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/ClangFormat.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/ClangFormat.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,187 @@
+===========
+ClangFormat
+===========
+
+`ClangFormat` describes a set of tools that are built on top of
+:doc:`LibFormat`. It can support your workflow in a variety of ways including a
+standalone tool and editor integrations.
+
+
+Standalone Tool
+===============
+
+:program:`clang-format` is located in `clang/tools/clang-format` and can be used
+to format C/C++/Obj-C code.
+
+.. code-block:: console
+
+ $ clang-format -help
+ OVERVIEW: A tool to format C/C++/Obj-C code.
+
+ If no arguments are specified, it formats the code from standard input
+ and writes the result to the standard output.
+ If <file>s are given, it reformats the files. If -i is specified
+ together with <file>s, the files are edited in-place. Otherwise, the
+ result is written to the standard output.
+
+ USAGE: clang-format [options] [<file> ...]
+
+ OPTIONS:
+
+ Clang-format options:
+
+ -cursor=<uint> - The position of the cursor when invoking
+ clang-format from an editor integration
+ -dump-config - Dump configuration options to stdout and exit.
+ Can be used with -style option.
+ -i - Inplace edit <file>s, if specified.
+ -length=<uint> - Format a range of this length (in bytes).
+ Multiple ranges can be formatted by specifying
+ several -offset and -length pairs.
+ When only a single -offset is specified without
+ -length, clang-format will format up to the end
+ of the file.
+ Can only be used with one input file.
+ -lines=<string> - <start line>:<end line> - format a range of
+ lines (both 1-based).
+ Multiple ranges can be formatted by specifying
+ several -lines arguments.
+ Can't be used with -offset and -length.
+ Can only be used with one input file.
+ -offset=<uint> - Format a range starting at this byte offset.
+ Multiple ranges can be formatted by specifying
+ several -offset and -length pairs.
+ Can only be used with one input file.
+ -output-replacements-xml - Output replacements as XML.
+ -style=<string> - Coding style, currently supports:
+ LLVM, Google, Chromium, Mozilla, WebKit.
+ Use -style=file to load style configuration from
+ .clang-format file located in one of the parent
+ directories of the source file (or current
+ directory for stdin).
+ Use -style="{key: value, ...}" to set specific
+ parameters, e.g.:
+ -style="{BasedOnStyle: llvm, IndentWidth: 8}"
+
+ General options:
+
+ -help - Display available options (-help-hidden for more)
+ -help-list - Display list of available options (-help-list-hidden for more)
+ -version - Display the version of this program
+
+
+When the desired code formatting style is different from the available options,
+the style can be customized using the ``-style="{key: value, ...}"`` option or
+by putting your style configuration in the ``.clang-format`` or ``_clang-format``
+file in your project's directory and using ``clang-format -style=file``.
+
+An easy way to create the ``.clang-format`` file is:
+
+.. code-block:: console
+
+ clang-format -style=llvm -dump-config > .clang-format
+
+Available style options are described in :doc:`ClangFormatStyleOptions`.
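+
+As a rough sketch, a hand-written ``.clang-format`` file might look like the
+following. The particular values are illustrative assumptions, not
+recommendations; any option left out keeps the value of the base style:
+
+.. code-block:: yaml
+
+ # Illustrative values only: start from a predefined style, then override.
+ BasedOnStyle: LLVM
+ IndentWidth: 4
+ ColumnLimit: 100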
+
+
+Vim Integration
+===============
+
+There is an integration for :program:`vim` which lets you run the
+:program:`clang-format` standalone tool on your current buffer, optionally
+selecting regions to reformat. The integration takes the form of a Python file
+which can be found under `clang/tools/clang-format/clang-format.py`.
+
+This can be integrated by adding the following to your `.vimrc`:
+
+.. code-block:: vim
+
+ map <C-K> :pyf <path-to-this-file>/clang-format.py<CR>
+ imap <C-K> <ESC>:pyf <path-to-this-file>/clang-format.py<CR>i
+
+The first line enables :program:`clang-format` for NORMAL and VISUAL mode, the
+second line adds support for INSERT mode. Change "C-K" to another binding if
+you need :program:`clang-format` on a different key (C-K stands for Ctrl+k).
+
+With this integration you can press the bound key and clang-format will
+format the current line in NORMAL and INSERT mode or the selected region in
+VISUAL mode. The line or region is extended to the next bigger syntactic
+entity.
+
+It operates on the current, potentially unsaved buffer and does not create
+or save any files. To revert a formatting change, just undo.
+
+
+Emacs Integration
+=================
+
+Similar to the integration for :program:`vim`, there is an integration for
+:program:`emacs`. It can be found at `clang/tools/clang-format/clang-format.el`
+and used by adding this to your `.emacs`:
+
+.. code-block:: common-lisp
+
+ (load "<path-to-clang>/tools/clang-format/clang-format.el")
+ (global-set-key [C-M-tab] 'clang-format-region)
+
+This binds the function `clang-format-region` to C-M-tab, which then formats the
+current line or selected region.
+
+
+BBEdit Integration
+==================
+
+:program:`clang-format` cannot be used as a text filter with BBEdit, but works
+well via a script. The AppleScript to do this integration can be found at
+`clang/tools/clang-format/clang-format-bbedit.applescript`; place a copy in
+`~/Library/Application Support/BBEdit/Scripts`, and edit the path within it to
+point to your local copy of :program:`clang-format`.
+
+With this integration you can select the script from the Script menu and
+:program:`clang-format` will format the selection. Note that you can rename the
+menu item by renaming the script, and can assign the menu item a keyboard
+shortcut in the BBEdit preferences, under Menus & Shortcuts.
+
+
+Visual Studio Integration
+=========================
+
+Download the latest Visual Studio extension from the `alpha build site
+<http://llvm.org/builds/>`_. The default key-binding is Ctrl-R,Ctrl-F.
+
+
+Script for patch reformatting
+=============================
+
+The Python script `clang/tools/clang-format/clang-format-diff.py` parses the output of
+a unified diff and reformats all contained lines with :program:`clang-format`.
+
+.. code-block:: console
+
+ usage: clang-format-diff.py [-h] [-i] [-p NUM] [-regex PATTERN] [-style STYLE]
+
+ Reformat changed lines in diff. Without -i option just output the diff that
+ would be introduced.
+
+ optional arguments:
+ -h, --help show this help message and exit
+ -i apply edits to files instead of displaying a diff
+ -p NUM strip the smallest prefix containing P slashes
+ -regex PATTERN custom pattern selecting file paths to reformat
+ -style STYLE formatting style to apply (LLVM, Google, Chromium, Mozilla,
+ WebKit)
+
+So to reformat all the lines in the latest :program:`git` commit, just do:
+
+.. code-block:: console
+
+ git diff -U0 HEAD^ | clang-format-diff.py -i -p1
+
+In an SVN client, you can do:
+
+.. code-block:: console
+
+ svn diff --diff-cmd=diff -x-U0 | clang-format-diff.py -i
+
+The :option:`-U0` will create a diff without context lines (otherwise the script
+would format those as well).
Added: www-releases/trunk/3.5.1/tools/clang/docs/ClangFormatStyleOptions.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/ClangFormatStyleOptions.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/ClangFormatStyleOptions.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/ClangFormatStyleOptions.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,500 @@
+==========================
+Clang-Format Style Options
+==========================
+
+:doc:`ClangFormatStyleOptions` describes configurable formatting style options
+supported by :doc:`LibFormat` and :doc:`ClangFormat`.
+
+When using the :program:`clang-format` command line utility or
+``clang::format::reformat(...)`` functions from code, one can either use one of
+the predefined styles (LLVM, Google, Chromium, Mozilla, WebKit) or create a
+custom style by configuring specific style options.
+
+
+Configuring Style with clang-format
+===================================
+
+:program:`clang-format` supports two ways to provide custom style options:
+directly specify style configuration in the ``-style=`` command line option or
+use ``-style=file`` and put style configuration in the ``.clang-format`` or
+``_clang-format`` file in the project directory.
+
+When using ``-style=file``, :program:`clang-format` for each input file will
+try to find the ``.clang-format`` file located in the closest parent directory
+of the input file. When the standard input is used, the search is started from
+the current directory.
+
+The ``.clang-format`` file uses YAML format:
+
+.. code-block:: yaml
+
+ key1: value1
+ key2: value2
+ # A comment.
+ ...
+
+An easy way to get a valid ``.clang-format`` file containing all configuration
+options of a certain predefined style is:
+
+.. code-block:: console
+
+ clang-format -style=llvm -dump-config > .clang-format
+
+When specifying configuration in the ``-style=`` option, the same configuration
+is applied for all input files. The format of the configuration is:
+
+.. code-block:: console
+
+ -style='{key1: value1, key2: value2, ...}'
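+
+The braces form a YAML flow mapping, so an inline configuration corresponds
+to the same keys written one per line in a ``.clang-format`` file. As a sketch
+(the values are illustrative assumptions), the inline form
+``-style='{BasedOnStyle: Google, IndentWidth: 4}'`` would be written in a file
+as:
+
+.. code-block:: yaml
+
+ # File form of the inline example above; values are illustrative.
+ BasedOnStyle: Google
+ IndentWidth: 4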
+
+
+Configuring Style in Code
+=========================
+
+When using ``clang::format::reformat(...)`` functions, the format is specified
+by supplying the `clang::format::FormatStyle
+<http://clang.llvm.org/doxygen/structclang_1_1format_1_1FormatStyle.html>`_
+structure.
+
+
+Configurable Format Style Options
+=================================
+
+This section lists the supported style options. Value type is specified for
+each option. For enumeration types possible values are specified both as a C++
+enumeration member (with a prefix, e.g. ``LS_Auto``), and as a value usable in
+the configuration (without a prefix: ``Auto``).
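+
+For instance, a configuration file spells enumeration values without their
+C++ prefixes. A minimal sketch, with values chosen only for illustration:
+
+.. code-block:: yaml
+
+ # In C++ code these would be LS_Auto and BS_Attach.
+ Standard: Auto
+ BreakBeforeBraces: Attach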
+
+
+**BasedOnStyle** (``string``)
+ The style used for all options not specifically set in the configuration.
+
+ This option is supported only in the :program:`clang-format` configuration
+ (both within ``-style='{...}'`` and the ``.clang-format`` file).
+
+ Possible values:
+
+ * ``LLVM``
+ A style complying with the `LLVM coding standards
+ <http://llvm.org/docs/CodingStandards.html>`_
+ * ``Google``
+ A style complying with `Google's C++ style guide
+ <http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml>`_
+ * ``Chromium``
+ A style complying with `Chromium's style guide
+ <http://www.chromium.org/developers/coding-style>`_
+ * ``Mozilla``
+ A style complying with `Mozilla's style guide
+ <https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style>`_
+ * ``WebKit``
+ A style complying with `WebKit's style guide
+ <http://www.webkit.org/coding/coding-style.html>`_
+
+.. START_FORMAT_STYLE_OPTIONS
+
+**AccessModifierOffset** (``int``)
+ The extra indent or outdent of access modifiers, e.g. ``public:``.
+
+**AlignEscapedNewlinesLeft** (``bool``)
+ If ``true``, aligns escaped newlines as far left as possible.
+ Otherwise puts them into the right-most column.
+
+**AlignTrailingComments** (``bool``)
+ If ``true``, aligns trailing comments.
+
+**AllowAllParametersOfDeclarationOnNextLine** (``bool``)
+ Allow putting all parameters of a function declaration onto
+ the next line even if ``BinPackParameters`` is ``false``.
+
+**AllowShortBlocksOnASingleLine** (``bool``)
+ Allows contracting simple braced statements to a single line.
+
+ E.g., this allows ``if (a) { return; }`` to be put on a single line.
+
+**AllowShortFunctionsOnASingleLine** (``ShortFunctionStyle``)
+ Dependent on the value, ``int f() { return 0; }`` can be put
+ on a single line.
+
+ Possible values:
+
+ * ``SFS_None`` (in configuration: ``None``)
+ Never merge functions into a single line.
+ * ``SFS_Inline`` (in configuration: ``Inline``)
+ Only merge functions defined inside a class.
+ * ``SFS_All`` (in configuration: ``All``)
+ Merge all functions fitting on a single line.
+
+
+**AllowShortIfStatementsOnASingleLine** (``bool``)
+ If ``true``, ``if (a) return;`` can be put on a single
+ line.
+
+**AllowShortLoopsOnASingleLine** (``bool``)
+ If ``true``, ``while (true) continue;`` can be put on a
+ single line.
+
+**AlwaysBreakBeforeMultilineStrings** (``bool``)
+ If ``true``, always break before multiline string literals.
+
+**AlwaysBreakTemplateDeclarations** (``bool``)
+ If ``true``, always break after the ``template<...>`` of a
+ template declaration.
+
+**BinPackParameters** (``bool``)
+ If ``false``, a function call's or function definition's parameters
+ will either all be on the same line or will have one line each.
+
+**BreakBeforeBinaryOperators** (``bool``)
+ If ``true``, binary operators will be placed after line breaks.
+
+**BreakBeforeBraces** (``BraceBreakingStyle``)
+ The brace breaking style to use.
+
+ Possible values:
+
+ * ``BS_Attach`` (in configuration: ``Attach``)
+ Always attach braces to surrounding context.
+ * ``BS_Linux`` (in configuration: ``Linux``)
+ Like ``Attach``, but break before braces on function, namespace and
+ class definitions.
+ * ``BS_Stroustrup`` (in configuration: ``Stroustrup``)
+ Like ``Attach``, but break before function definitions.
+ * ``BS_Allman`` (in configuration: ``Allman``)
+ Always break before braces.
+ * ``BS_GNU`` (in configuration: ``GNU``)
+ Always break before braces and add an extra level of indentation to
+ braces of control statements, not to those of class, function
+ or other definitions.
+
+
+**BreakBeforeTernaryOperators** (``bool``)
+ If ``true``, ternary operators will be placed after line breaks.
+
+**BreakConstructorInitializersBeforeComma** (``bool``)
+ Always break constructor initializers before commas and align
+ the commas with the colon.
+
+**ColumnLimit** (``unsigned``)
+ The column limit.
+
+ A column limit of ``0`` means that there is no column limit. In this case,
+ clang-format will respect the input's line breaking decisions within
+ statements unless they contradict other rules.
+
+**CommentPragmas** (``std::string``)
+ A regular expression that describes comments with special meaning,
+ which should not be split into lines or otherwise changed.
+
+**ConstructorInitializerAllOnOneLineOrOnePerLine** (``bool``)
+ If the constructor initializers don't fit on a line, put each
+ initializer on its own line.
+
+**ConstructorInitializerIndentWidth** (``unsigned``)
+ The number of characters to use for indentation of constructor
+ initializer lists.
+
+**ContinuationIndentWidth** (``unsigned``)
+ Indent width for line continuations.
+
+**Cpp11BracedListStyle** (``bool``)
+ If ``true``, format braced lists as best suited for C++11 braced
+ lists.
+
+ Important differences:
+ - No spaces inside the braced list.
+ - No line break before the closing brace.
+ - Indentation with the continuation indent, not with the block indent.
+
+ Fundamentally, C++11 braced lists are formatted exactly like function
+ calls would be formatted in their place. If the braced list follows a name
+ (e.g. a type or variable name), clang-format formats as if the ``{}`` were
+ the parentheses of a function call with that name. If there is no name,
+ a zero-length name is assumed.
+
+**DerivePointerAlignment** (``bool``)
+ If ``true``, analyze the formatted file for the most common
+ alignment of & and \*. ``PointerAlignment`` is then used only as fallback.
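+
+ A minimal sketch of combining the two options (the fallback value is an
+ illustrative choice):
+
+ .. code-block:: yaml
+
+   # Follow the file's prevailing alignment; fall back to Left otherwise.
+   DerivePointerAlignment: true
+   PointerAlignment: Left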
+
+**DisableFormat** (``bool``)
+ Disables formatting completely.
+
+**ExperimentalAutoDetectBinPacking** (``bool``)
+ If ``true``, clang-format detects whether function calls and
+ definitions are formatted with one parameter per line.
+
+ Each call can be bin-packed, one-per-line or inconclusive. If it is
+ inconclusive, e.g. completely on one line, but a decision needs to be
+ made, clang-format analyzes whether there are other bin-packed cases in
+ the input file and acts accordingly.
+
+ NOTE: This is an experimental flag that might go away or be renamed. Do
+ not use this in config files, etc. Use at your own risk.
+
+**ForEachMacros** (``std::vector<std::string>``)
+ A vector of macros that should be interpreted as foreach loops
+ instead of as function calls.
+
+ These are expected to be macros of the form:
+
+ .. code-block:: c++
+
+   FOREACH(<variable-declaration>, ...)
+     <loop-body>
+
+ For example: BOOST_FOREACH.
+
+**IndentCaseLabels** (``bool``)
+ Indent case labels one level from the switch statement.
+
+ When ``false``, use the same indentation level as for the switch statement.
+ Switch statement body is always indented one level more than case labels.
+
+**IndentWidth** (``unsigned``)
+ The number of columns to use for indentation.
+
+**IndentWrappedFunctionNames** (``bool``)
+ Indent if a function definition or declaration is wrapped after the
+ type.
+
+**KeepEmptyLinesAtTheStartOfBlocks** (``bool``)
+ If true, empty lines at the start of blocks are kept.
+
+**Language** (``LanguageKind``)
+ The language this format style is targeted at.
+
+ Possible values:
+
+ * ``LK_None`` (in configuration: ``None``)
+ Do not use.
+ * ``LK_Cpp`` (in configuration: ``Cpp``)
+ Should be used for C, C++, ObjectiveC, ObjectiveC++.
+ * ``LK_JavaScript`` (in configuration: ``JavaScript``)
+ Should be used for JavaScript.
+ * ``LK_Proto`` (in configuration: ``Proto``)
+ Should be used for Protocol Buffers
+ (https://developers.google.com/protocol-buffers/).
+
+
+**MaxEmptyLinesToKeep** (``unsigned``)
+ The maximum number of consecutive empty lines to keep.
+
+**NamespaceIndentation** (``NamespaceIndentationKind``)
+ The indentation used for namespaces.
+
+ Possible values:
+
+ * ``NI_None`` (in configuration: ``None``)
+ Don't indent in namespaces.
+ * ``NI_Inner`` (in configuration: ``Inner``)
+ Indent only in inner namespaces (nested in other namespaces).
+ * ``NI_All`` (in configuration: ``All``)
+ Indent in all namespaces.
+
+
+**ObjCSpaceAfterProperty** (``bool``)
+ Add a space after ``@property`` in Objective-C, i.e. use
+ ``\@property (readonly)`` instead of ``\@property(readonly)``.
+
+**ObjCSpaceBeforeProtocolList** (``bool``)
+ Add a space in front of an Objective-C protocol list, i.e. use
+ ``Foo <Protocol>`` instead of ``Foo<Protocol>``.
+
+**PenaltyBreakBeforeFirstCallParameter** (``unsigned``)
+ The penalty for breaking a function call after "call(".
+
+**PenaltyBreakComment** (``unsigned``)
+ The penalty for each line break introduced inside a comment.
+
+**PenaltyBreakFirstLessLess** (``unsigned``)
+ The penalty for breaking before the first ``<<``.
+
+**PenaltyBreakString** (``unsigned``)
+ The penalty for each line break introduced inside a string literal.
+
+**PenaltyExcessCharacter** (``unsigned``)
+ The penalty for each character outside of the column limit.
+
+**PenaltyReturnTypeOnItsOwnLine** (``unsigned``)
+ Penalty for putting the return type of a function onto its own
+ line.
+
+**PointerAlignment** (``PointerAlignmentStyle``)
+ Pointer and reference alignment style.
+
+ Possible values:
+
+ * ``PAS_Left`` (in configuration: ``Left``)
+ Align pointer to the left.
+ * ``PAS_Right`` (in configuration: ``Right``)
+ Align pointer to the right.
+ * ``PAS_Middle`` (in configuration: ``Middle``)
+ Align pointer in the middle.
+
+
+**SpaceBeforeAssignmentOperators** (``bool``)
+ If ``false``, spaces will be removed before assignment operators.
+
+**SpaceBeforeParens** (``SpaceBeforeParensOptions``)
+ Defines in which cases to put a space before opening parentheses.
+
+ Possible values:
+
+ * ``SBPO_Never`` (in configuration: ``Never``)
+ Never put a space before opening parentheses.
+ * ``SBPO_ControlStatements`` (in configuration: ``ControlStatements``)
+ Put a space before opening parentheses only after control statement
+ keywords (``for/if/while...``).
+ * ``SBPO_Always`` (in configuration: ``Always``)
+ Always put a space before opening parentheses, except when it's
+ prohibited by the syntax rules (in function-like macro definitions) or
+ when determined by other style rules (after unary operators, opening
+ parentheses, etc.)
+
+
+**SpaceInEmptyParentheses** (``bool``)
+ If ``true``, spaces may be inserted into '()'.
+
+**SpacesBeforeTrailingComments** (``unsigned``)
+ The number of spaces before trailing line comments
+ (``//`` - comments).
+
+ This does not affect trailing block comments (``/**/`` - comments) as those
+ commonly have different usage patterns and a number of special cases.
+
+**SpacesInAngles** (``bool``)
+ If ``true``, spaces will be inserted after '<' and before '>' in
+ template argument lists.
+
+**SpacesInCStyleCastParentheses** (``bool``)
+ If ``true``, spaces may be inserted into C style casts.
+
+**SpacesInContainerLiterals** (``bool``)
+ If ``true``, spaces are inserted inside container literals (e.g.
+ ObjC and JavaScript array and dict literals).
+
+**SpacesInParentheses** (``bool``)
+ If ``true``, spaces will be inserted after '(' and before ')'.
+
+**Standard** (``LanguageStandard``)
+ Format compatible with this standard, e.g. use
+ ``A<A<int> >`` instead of ``A<A<int>>`` for LS_Cpp03.
+
+ Possible values:
+
+ * ``LS_Cpp03`` (in configuration: ``Cpp03``)
+ Use C++03-compatible syntax.
+ * ``LS_Cpp11`` (in configuration: ``Cpp11``)
+ Use features of C++11 (e.g. ``A<A<int>>`` instead of
+ ``A<A<int> >``).
+ * ``LS_Auto`` (in configuration: ``Auto``)
+ Automatic detection based on the input.
+
+
+**TabWidth** (``unsigned``)
+ The number of columns used for tab stops.
+
+**UseTab** (``UseTabStyle``)
+ The way to use tab characters in the resulting file.
+
+ Possible values:
+
+ * ``UT_Never`` (in configuration: ``Never``)
+ Never use tab.
+ * ``UT_ForIndentation`` (in configuration: ``ForIndentation``)
+ Use tabs only for indentation.
+ * ``UT_Always`` (in configuration: ``Always``)
+ Use tabs whenever we need to fill whitespace that spans at least from
+ one tab stop to the next one.
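+
+ As an illustrative sketch (the widths are assumptions, not defaults), tab
+ usage is typically configured together with ``TabWidth`` and
+ ``IndentWidth``:
+
+ .. code-block:: yaml
+
+   # Indent with tabs; one tab stop per indentation level.
+   UseTab: ForIndentation
+   TabWidth: 4
+   IndentWidth: 4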
+
+
+.. END_FORMAT_STYLE_OPTIONS
+
+Examples
+========
+
+A style similar to the `Linux Kernel style
+<https://www.kernel.org/doc/Documentation/CodingStyle>`_:
+
+.. code-block:: yaml
+
+ BasedOnStyle: LLVM
+ IndentWidth: 8
+ UseTab: Always
+ BreakBeforeBraces: Linux
+ AllowShortIfStatementsOnASingleLine: false
+ IndentCaseLabels: false
+
+The result is (imagine that tabs are used for indentation here):
+
+.. code-block:: c++
+
+ void test()
+ {
+ switch (x) {
+ case 0:
+ case 1:
+ do_something();
+ break;
+ case 2:
+ do_something_else();
+ break;
+ default:
+ break;
+ }
+ if (condition)
+ do_something_completely_different();
+
+ if (x == y) {
+ q();
+ } else if (x > y) {
+ w();
+ } else {
+ r();
+ }
+ }
+
+A style similar to the default Visual Studio formatting style:
+
+.. code-block:: yaml
+
+ UseTab: Never
+ IndentWidth: 4
+ BreakBeforeBraces: Allman
+ AllowShortIfStatementsOnASingleLine: false
+ IndentCaseLabels: false
+ ColumnLimit: 0
+
+The result is:
+
+.. code-block:: c++
+
+ void test()
+ {
+ switch (suffix)
+ {
+ case 0:
+ case 1:
+ do_something();
+ break;
+ case 2:
+ do_something_else();
+ break;
+ default:
+ break;
+ }
+ if (condition)
+ do_something_completely_different();
+
+ if (x == y)
+ {
+ q();
+ }
+ else if (x > y)
+ {
+ w();
+ }
+ else
+ {
+ r();
+ }
+ }
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/ClangPlugins.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/ClangPlugins.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/ClangPlugins.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/ClangPlugins.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,90 @@
+=============
+Clang Plugins
+=============
+
+Clang Plugins make it possible to run extra user-defined actions during a
+compilation. This document will provide a basic walkthrough of how to write and
+run a Clang Plugin.
+
+Introduction
+============
+
+Clang Plugins run FrontendActions over code. See the :doc:`FrontendAction
+tutorial <RAVFrontendAction>` on how to write a ``FrontendAction`` using the
+``RecursiveASTVisitor``. In this tutorial, we'll demonstrate how to write a
+simple clang plugin.
+
+Writing a ``PluginASTAction``
+=============================
+
+The main difference from writing normal ``FrontendActions`` is that you can
+handle plugin command line options. The ``PluginASTAction`` base class declares
+a ``ParseArgs`` method which you have to implement in your plugin.
+
+.. code-block:: c++
+
+ bool ParseArgs(const CompilerInstance &CI,
+ const std::vector<std::string>& args) {
+ for (unsigned i = 0, e = args.size(); i != e; ++i) {
+ if (args[i] == "-some-arg") {
+ // Handle the command line argument.
+ }
+ }
+ return true;
+ }
+
+Registering a plugin
+====================
+
+A plugin is loaded from a dynamic library at runtime by the compiler. To
+register a plugin in a library, use ``FrontendPluginRegistry::Add<>``:
+
+.. code-block:: c++
+
+ static FrontendPluginRegistry::Add<MyPlugin> X("my-plugin-name", "my plugin description");
+
+Putting it all together
+=======================
+
+Let's look at an example plugin that prints top-level function names. This
+example is checked into the clang repository; please take a look at
+the `latest version of PrintFunctionNames.cpp
+<http://llvm.org/viewvc/llvm-project/cfe/trunk/examples/PrintFunctionNames/PrintFunctionNames.cpp?view=markup>`_.
+
+Running the plugin
+==================
+
+To run a plugin, the dynamic library containing the plugin registry must be
+loaded via the :option:`-load` command line option. This will load all plugins
+that are registered, and you can select the plugins to run by specifying the
+:option:`-plugin` option. Additional parameters for the plugins can be passed with
+:option:`-plugin-arg-<plugin-name>`.
+
+Note that those options must reach clang's cc1 process. There are two
+ways to do so:
+
+* Directly call the parsing process by using the :option:`-cc1` option; this
+ has the downside of not configuring the default header search paths, so
+ you'll need to specify the full system path configuration on the command
+ line.
+* Use clang as usual, but prefix all arguments to the cc1 process with
+ :option:`-Xclang`.
+
+For example, to run the ``print-function-names`` plugin over a source file in
+clang, first build the plugin, and then call clang with the plugin from the
+source tree:
+
+.. code-block:: console
+
+ $ export BD=/path/to/build/directory
+ $ (cd $BD && make PrintFunctionNames )
+ $ clang++ -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS \
+ -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D_GNU_SOURCE \
+ -I$BD/tools/clang/include -Itools/clang/include -I$BD/include -Iinclude \
+ tools/clang/tools/clang-check/ClangCheck.cpp -fsyntax-only \
+ -Xclang -load -Xclang $BD/lib/PrintFunctionNames.so -Xclang \
+ -plugin -Xclang print-fns
+
+Also see the print-function-name plugin example's
+`README <http://llvm.org/viewvc/llvm-project/cfe/trunk/examples/PrintFunctionNames/README.txt?view=markup>`_.
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/ClangTools.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/ClangTools.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/ClangTools.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/ClangTools.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,191 @@
+========
+Overview
+========
+
+Clang Tools are standalone command line (and potentially GUI) tools
+designed for use by C++ developers who are already using and enjoying
+Clang as their compiler. These tools provide developer-oriented
+functionality such as fast syntax checking, automatic formatting,
+refactoring, etc.
+
+Only a couple of the most basic and fundamental tools are kept in the
+primary Clang Subversion project. The rest of the tools are kept in a
+side-project so that developers who don't want or need to build them
+don't. If you want to get access to the extra Clang Tools repository,
+simply check it out into the tools tree of your Clang checkout and
+follow the usual process for building and working with a combined
+LLVM/Clang checkout:
+
+- With Subversion:
+
+ - ``cd llvm/tools/clang/tools``
+ - ``svn co http://llvm.org/svn/llvm-project/clang-tools-extra/trunk extra``
+
+- Or with Git:
+
+ - ``cd llvm/tools/clang/tools``
+ - ``git clone http://llvm.org/git/clang-tools-extra.git extra``
+
+This document provides a high-level overview of the organization of
+Clang Tools within the project and introduces some of the more important
+tools. Note, however, that this document is currently focused on Clang
+and Clang Tool developers, not on end users of these tools.
+
+Clang Tools Organization
+========================
+
+Clang Tools are CLI or GUI programs that are intended to be directly
+used by C++ developers. That is, they are *not* primarily for use by
+Clang developers, although they are hopefully useful to C++ developers
+who happen to work on Clang, and we try to actively dogfood their
+functionality. They are developed in three components: the underlying
+infrastructure for building a standalone tool based on Clang, core
+shared logic used by many different tools in the form of refactoring and
+rewriting libraries, and the tools themselves.
+
+The underlying infrastructure for Clang Tools is the
+:doc:`LibTooling <LibTooling>` platform. See its documentation for much
+more detailed information about how this infrastructure works. The
+common refactoring and rewriting toolkit-style library is also part of
+LibTooling organizationally.
+
+A few Clang Tools are developed alongside the core Clang libraries as
+examples and test cases of fundamental functionality. However, most of
+the tools are developed in a side repository to provide easy separation
+from the core libraries. We intentionally do not support public
+libraries in the side repository, as we want to carefully review and
+find good APIs for libraries as they are lifted out of a few tools and
+into the core Clang library set.
+
+Regardless of which repository Clang Tools' code resides in, the
+development process and practices for all Clang Tools are exactly those
+of Clang itself. They are entirely within the Clang *project*,
+regardless of the version control scheme.
+
+Core Clang Tools
+================
+
+The core set of Clang tools that are within the main repository are
+tools that very specifically complement, and allow use and testing of
+*Clang* specific functionality.
+
+``clang-check``
+---------------
+
+:doc:`ClangCheck` combines the LibTooling framework for running a
+Clang tool with the basic Clang diagnostics, syntax checking specific files
+through a fast command line interface. It also accepts flags to re-display the
+diagnostics in different formats, suitable for driving an IDE or editor.
+Furthermore, it can be used in fixit-mode to directly apply fixit-hints
+offered by clang. See :doc:`HowToSetupToolingForLLVM` for instructions on how
+to set up and use ``clang-check``.
+
+``clang-format``
+----------------
+
+Clang-format is both a :doc:`library <LibFormat>` and a :doc:`stand-alone tool
+<ClangFormat>` with the goal of automatically reformatting C++ source files
+according to configurable style guides. To do so, clang-format uses Clang's
+``Lexer`` to transform an input file into a token stream and then changes all
+the whitespace around those tokens. The goal is for clang-format to serve both
+as a user tool (ideally with powerful IDE integrations) and as part of other
+refactoring tools, e.g. to do a reformatting of all the lines changed during a
+renaming.
+
+``clang-modernize``
+-------------------
+
+``clang-modernize`` migrates C++ code to use C++11 features where appropriate.
+Currently it can:
+
+* convert loops to range-based for loops;
+
+* convert null pointer constants (like ``NULL`` or ``0``) to C++11 ``nullptr``;
+
+* replace the type specifier in variable declarations with the ``auto`` type specifier;
+
+* add the ``override`` specifier to applicable member functions.
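+
+As a rough illustration of these four transformations, the sketch below shows
+code in its migrated form, with the pre-C++11 spelling kept in comments. The
+names ``Shape``, ``Square``, ``sum`` and ``first_or_null`` are hypothetical,
+and the exact output of the tool may differ.
+
+.. code-block:: c++
+
+  #include <vector>
+
+  struct Shape {
+    virtual ~Shape() {}
+    virtual int sides() const { return 0; }
+  };
+
+  struct Square : Shape {
+    // Before: virtual int sides() const { return 4; }
+    int sides() const override { return 4; }
+  };
+
+  int sum(const std::vector<int> &v) {
+    int total = 0;
+    // Before: for (std::vector<int>::const_iterator it = v.begin(),
+    //              e = v.end(); it != e; ++it) total += *it;
+    for (int x : v)
+      total += x;
+    return total;
+  }
+
+  const int *first_or_null(const std::vector<int> &v) {
+    // Before: std::vector<int>::const_iterator it = v.begin();
+    auto it = v.begin();
+    // Before: return it == v.end() ? NULL : &*it;
+    return it == v.end() ? nullptr : &*it;
+  }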
+
+Extra Clang Tools
+=================
+
+As various categories of Clang Tools are added to the extra repository,
+they'll be tracked here. The focus of this documentation is on the scope
+and features of the tools for other tool developers; each tool should
+provide its own user-focused documentation.
+
+Ideas for new Tools
+===================
+
+* C++ cast conversion tool. Will convert C-style casts (``(type) value``) to
+ appropriate C++ cast (``static_cast``, ``const_cast`` or
+ ``reinterpret_cast``).
+* Non-member ``begin()`` and ``end()`` conversion tool. Will convert
+ ``foo.begin()`` into ``begin(foo)`` and similarly for ``end()``, where
+ ``foo`` is a standard container. We could also detect similar patterns for
+ arrays.
+* ``make_shared`` / ``make_unique`` conversion. Part of this transformation
+ can be incorporated into the ``auto`` transformation. Will convert
+
+ .. code-block:: c++
+
+ std::shared_ptr<Foo> sp(new Foo);
+ std::unique_ptr<Foo> up(new Foo);
+
+ func(std::shared_ptr<Foo>(new Foo), bar());
+
+ into:
+
+ .. code-block:: c++
+
+ auto sp = std::make_shared<Foo>();
+ auto up = std::make_unique<Foo>(); // In C++14 mode.
+
+ // This also affects correctness. For the cases where bar() throws,
+ // make_shared() is safe and the original code may leak.
+ func(std::make_shared<Foo>(), bar());
+
+* ``tr1`` removal tool. Will migrate source code from using TR1 library
+ features to C++11 library. For example:
+
+ .. code-block:: c++
+
+ #include <iostream>
+ #include <tr1/unordered_map>
+ int main()
+ {
+ std::tr1::unordered_map <int, int> ma;
+ std::cout << ma.size () << std::endl;
+ return 0;
+ }
+
+ should be rewritten to:
+
+ .. code-block:: c++
+
+ #include <iostream>
+ #include <unordered_map>
+ int main()
+ {
+ std::unordered_map <int, int> ma;
+ std::cout << ma.size () << std::endl;
+ return 0;
+ }
+
+* A tool to remove ``auto``. Will convert ``auto`` to an explicit type or add
+ comments with deduced types. The motivation is that there are developers
+ that don't want to use ``auto`` because they are afraid that they might lose
+ control over their code.
+
+* C++14: less verbose operator function objects (`N3421
+ <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3421.htm>`_).
+ For example:
+
+ .. code-block:: c++
+
+ sort(v.begin(), v.end(), greater<ValueType>());
+
+ should be rewritten to:
+
+ .. code-block:: c++
+
+ sort(v.begin(), v.end(), greater<>());
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/CrossCompilation.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/CrossCompilation.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/CrossCompilation.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/CrossCompilation.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,204 @@
+===================================================================
+Cross-compilation using Clang
+===================================================================
+
+Introduction
+============
+
+This document will guide you in choosing the right Clang options
+for cross-compiling your code to a different architecture. It assumes you
+already know how to compile the code in question for the host architecture,
+and that you know how to choose additional include and library paths.
+
+However, this document is *not* a "how to" and won't help you set up your
+build system or Makefiles, nor choose the right CMake options, etc.
+Also, it does not cover all the possible options, nor does it contain
+specific examples for specific architectures. For a concrete example, the
+`instructions for cross-compiling LLVM itself
+<http://llvm.org/docs/HowToCrossCompileLLVM.html>`_ may be of interest.
+
+After reading this document, you should be familiar with the main issues
+related to cross-compilation, and what main compiler options Clang provides
+for performing cross-compilation.
+
+Cross compilation issues
+========================
+
+In the GCC world, every host/target combination has its own set of binaries,
+headers, libraries, etc. So, it's usually simple to download a package
+with all the files in it, unzip it to a directory and point the build system
+to that compiler, which will know its own location and find everything it
+needs when compiling your code.
+
+On the other hand, Clang/LLVM is natively a cross-compiler, meaning that
+one set of programs can compile to all targets by setting the ``-target``
+option. That makes it a lot easier for programmers wishing to compile to
+different platforms and architectures, for compiler developers, who
+only have to maintain one build system, and for OS distributions, which
+need only one set of main packages.
+
+But, as is true of any cross-compiler, and given the complexity of
+different architectures, OS's and options, it's not always easy to find
+the headers, libraries or binutils needed to generate target-specific code.
+So you'll need special options to help Clang understand what target
+you're compiling to, where your tools are, etc.
+
+Another problem is that compilers come with standard libraries only (like
+``compiler-rt``, ``libcxx``, ``libgcc``, ``libm``, etc), so you'll have to
+find every other target-specific library required to build your software
+and make it available to the build system. It's not enough to
+have your host's libraries installed.
+
+Finally, not all toolchains are the same, and consequently, not every Clang
+option will work magically. Some options, like ``--sysroot`` (which
+effectively changes the logical root for headers and libraries), assume
+all your binaries and libraries are in the same directory, which may not be
+true when your cross-compiler was installed by the distribution's package
+management. So, for each specific case, you may use more than one
+option, and in most cases, you'll end up setting include paths (``-I``) and
+library paths (``-L``) manually.
+
+To sum up, different toolchains can:
+ * be host/target specific or more flexible
+ * be in a single directory, or spread out across your system
+ * have different sets of libraries and headers by default
+ * need special options, which your build system won't be able to figure
+ out by itself
+
+General Cross-Compilation Options in Clang
+==========================================
+
+Target Triple
+-------------
+
+The basic option is to define the target architecture. For that, use
+``-target <triple>``. If you don't specify the target, CPU names won't
+match (since Clang assumes the host triple), and the compilation will
+go ahead, creating code for the host platform, which will break later
+on when assembling or linking.
+
+The triple has the general format ``<arch><sub>-<vendor>-<sys>-<abi>``, where:
+ * ``arch`` = ``x86``, ``arm``, ``thumb``, ``mips``, etc.
+ * ``sub`` = for ex. on ARM: ``v5``, ``v6m``, ``v7a``, ``v7m``, etc.
+ * ``vendor`` = ``pc``, ``apple``, ``nvidia``, ``ibm``, etc.
+ * ``sys`` = ``none``, ``linux``, ``win32``, ``darwin``, ``cuda``, etc.
+ * ``abi`` = ``eabi``, ``gnu``, ``android``, ``macho``, ``elf``, etc.
+
+The sub-architecture options are available for their own architectures,
+of course, so "x86v7a" doesn't make sense. The vendor needs to be
+specified only if there's a relevant change, for instance between PC
+and Apple. Most of the time it can be omitted (and ``unknown`` will be
+assumed), which sets the defaults for the specified architecture.
+The system name is generally the OS (linux, darwin), but could be special
+like the bare-metal "none".
+
+When a parameter is not important, it can be omitted, or you can
+choose ``unknown`` and the defaults will be used. If you choose a parameter
+that Clang doesn't know, like ``blerg``, it'll ignore it and assume
+``unknown``, which is not always desired, so be careful.
+
+Finally, the ABI option is something that will pick default CPU/FPU,
+define the specific behaviour of your code (PCS, extensions),
+and also choose the correct library calls, etc.
+
+CPU, FPU, ABI
+-------------
+
+Once your target is specified, it's time to pick the hardware you'll
+be compiling to. For every architecture, a default set of CPU/FPU/ABI
+will be chosen, so you'll almost always have to change it via flags.
+
+Typical flags include:
+ * ``-mcpu=<cpu-name>``, like x86-64, swift, cortex-a15
+ * ``-mfpu=<fpu-name>``, like SSE3, NEON, controlling the FP unit available
+ * ``-mfloat-abi=<fabi>``, like soft, hard, controlling which registers
+ to use for floating-point
+
+The default is normally the common denominator, so that Clang doesn't
+generate code that breaks. But that also means you won't get the best
+code for your specific hardware, which may mean orders of magnitude
+slower than you expect.
+
+For example, if your target is ``arm-none-eabi``, the default CPU will
+be ``arm7tdmi`` using soft float, which is extremely slow on modern cores,
+whereas if your triple is ``armv7a-none-eabi``, it'll be Cortex-A8 with
+NEON, but still using soft-float, which is much better, but still not
+great.
+
+Toolchain Options
+-----------------
+
+There are three main options to control access to your cross-compiler:
+``--sysroot``, ``-I``, and ``-L``. The two last ones are well known,
+but they're particularly important for additional libraries
+and headers that are specific to your target.
+
+There are two main ways to have a cross-compiler:
+
+#. When you have extracted your cross-compiler from a zip file into
+ a directory, you have to use ``--sysroot=<path>``. The path is the
+ root directory where you have unpacked your file, and Clang will
+ look for the directories ``bin``, ``lib``, ``include`` in there.
+
+ In this case, your setup should be pretty much done (if no
+ additional headers or libraries are needed), as Clang will find
+ all binaries it needs (assembler, linker, etc) in there.
+
+#. When you have installed via a package manager (modern Linux
+ distributions have cross-compiler packages available), make
+ sure the target triple you set is *also* the prefix of your
+ cross-compiler toolchain.
+
+ In this case, Clang will find the other binaries (assembler,
+ linker), but not always where the target headers and libraries
+ are. People add system-specific clues to Clang often, but as
+ things change, it's more likely that it won't find them than the
+ other way around.
+
+ So, here, you'll be a lot safer if you specify the include/library
+ directories manually (via ``-I`` and ``-L``).
+
+Target-Specific Libraries
+=========================
+
+All libraries that you compile as part of your build will be
+cross-compiled to your target, and your build system will probably
+find them in the right place. But all dependencies that are
+normally checked against (like ``libxml`` or ``libz`` etc) will match
+against the host platform, not the target.
+
+So, if the build system is not aware that you want to cross-compile
+your code, it will get every dependency wrong, and your compilation
+will fail during build time, not configure time.
+
+Also, finding the libraries for your target is not as easy
+as for your host machine. There aren't many cross-libraries available
+as packages for most OS's, so you'll have to either cross-compile them
+from source, or download the package for your target platform,
+extract the libraries and headers, put them in specific directories
+and add ``-I`` and ``-L`` pointing to them.
+
+Also, some libraries have different dependencies on different targets,
+so configuration tools to find dependencies in the host can get the
+list wrong for the target platform. This means that the configuration
+of your build can get things wrong when setting their own library
+paths, and you'll have to augment it via additional flags (configure,
+Make, CMake, etc).
+
+Multilibs
+---------
+
+When you want to cross-compile to more than one configuration, for
+example hard-float-ARM and soft-float-ARM, you'll have to have multiple
+copies of your libraries and (possibly) headers.
+
+Some Linux distributions have support for Multilib, which handles that
+for you in an easier way, but if you're not careful and, for instance,
+forget to specify ``-ccc-gcc-name armv7l-linux-gnueabihf-gcc`` (which
+uses hard-float), Clang will pick the ``armv7l-linux-gnueabi-ld``
+(which uses soft-float) and linker errors will happen.
+
+The same is true if you're compiling for different ABIs, like ``gnueabi``
+and ``androideabi``: the code might even link and run, but produce run-time
+errors, which are much harder to track down and fix.
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/DataFlowSanitizer.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/DataFlowSanitizer.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/DataFlowSanitizer.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/DataFlowSanitizer.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,158 @@
+=================
+DataFlowSanitizer
+=================
+
+.. toctree::
+ :hidden:
+
+ DataFlowSanitizerDesign
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+DataFlowSanitizer is a generalised dynamic data flow analysis.
+
+Unlike other Sanitizer tools, this tool is not designed to detect a
+specific class of bugs on its own. Instead, it provides a generic
+dynamic data flow analysis framework to be used by clients to help
+detect application-specific issues within their own code.
+
+Usage
+=====
+
+With no program changes, applying DataFlowSanitizer to a program
+will not alter its behavior. To use DataFlowSanitizer, the program
+uses API functions to apply tags to data to cause it to be tracked, and to
+check the tag of a specific data item. DataFlowSanitizer manages
+the propagation of tags through the program according to its data flow.
+
+The APIs are defined in the header file ``sanitizer/dfsan_interface.h``.
+For further information about each function, please refer to the header
+file.
+
+ABI List
+--------
+
+DataFlowSanitizer uses a list of functions known as an ABI list to decide
+whether a call to a specific function should use the operating system's native
+ABI or whether it should use a variant of this ABI that also propagates labels
+through function parameters and return values. The ABI list file also controls
+how labels are propagated in the former case. DataFlowSanitizer comes with a
+default ABI list which is intended to eventually cover the glibc library on
+Linux but it may become necessary for users to extend the ABI list in cases
+where a particular library or function cannot be instrumented (e.g. because
+it is implemented in assembly or another language which DataFlowSanitizer does
+not support) or a function is called from a library or function which cannot
+be instrumented.
+
+DataFlowSanitizer's ABI list file is a :doc:`SanitizerSpecialCaseList`.
+The pass treats every function in the ``uninstrumented`` category in the
+ABI list file as conforming to the native ABI. Unless the ABI list contains
+additional categories for those functions, a call to one of those functions
+will produce a warning message, as the labelling behavior of the function
+is unknown. The other supported categories are ``discard``, ``functional``
+and ``custom``.
+
+* ``discard`` -- To the extent that this function writes to (user-accessible)
+ memory, it also updates labels in shadow memory (this condition is trivially
+ satisfied for functions which do not write to user-accessible memory). Its
+ return value is unlabelled.
+* ``functional`` -- Like ``discard``, except that the label of its return value
+ is the union of the labels of its arguments.
+* ``custom`` -- Instead of calling the function, a custom wrapper ``__dfsw_F``
+ is called, where ``F`` is the name of the function. This function may wrap
+ the original function or provide its own implementation. This category is
+ generally used for uninstrumentable functions which write to user-accessible
+ memory or which have more complex label propagation behavior. The signature
+ of ``__dfsw_F`` is based on that of ``F`` with each argument having a
+ label of type ``dfsan_label`` appended to the argument list. If ``F``
+ is of non-void return type a final argument of type ``dfsan_label *``
+ is appended to which the custom function can store the label for the
+ return value. For example:
+
+.. code-block:: c++
+
+ void f(int x);
+ void __dfsw_f(int x, dfsan_label x_label);
+
+ void *memcpy(void *dest, const void *src, size_t n);
+ void *__dfsw_memcpy(void *dest, const void *src, size_t n,
+ dfsan_label dest_label, dfsan_label src_label,
+ dfsan_label n_label, dfsan_label *ret_label);
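+
+As a hedged sketch of how a wrapper following this signature rule might be
+written, consider a hypothetical function ``add_one`` placed in the ``custom``
+category. The ``dfsan_label`` typedef below is only a stand-in so that the
+example compiles on its own; real code would include
+``sanitizer/dfsan_interface.h`` instead.
+
+.. code-block:: c++
+
+  #include <stdint.h>
+
+  typedef uint16_t dfsan_label;  // Stand-in for the real typedef (assumption).
+
+  // Hypothetical uninstrumented function.
+  int add_one(int x) { return x + 1; }
+
+  // Custom wrapper: one label per argument is appended, followed by a
+  // pointer through which the label of the return value is stored.
+  int __dfsw_add_one(int x, dfsan_label x_label, dfsan_label *ret_label) {
+    *ret_label = x_label;  // The result depends only on x.
+    return add_one(x);
+  }
+
+Here the wrapper simply forwards the argument label to the return value; a
+wrapper for a function that writes to user-accessible memory would also need
+to update the corresponding shadow labels.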
+
+If a function defined in the translation unit being compiled belongs to the
+``uninstrumented`` category, it will be compiled so as to conform to the
+native ABI. Its arguments will be assumed to be unlabelled, but it will
+propagate labels in shadow memory.
+
+For example:
+
+.. code-block:: none
+
+ # main is called by the C runtime using the native ABI.
+ fun:main=uninstrumented
+ fun:main=discard
+
+ # malloc only writes to its internal data structures, not user-accessible memory.
+ fun:malloc=uninstrumented
+ fun:malloc=discard
+
+ # tolower is a pure function.
+ fun:tolower=uninstrumented
+ fun:tolower=functional
+
+ # memcpy needs to copy the shadow from the source to the destination region.
+ # This is done in a custom function.
+ fun:memcpy=uninstrumented
+ fun:memcpy=custom
+
+Example
+=======
+
+The following program demonstrates label propagation by checking that
+the correct labels are propagated.
+
+.. code-block:: c++
+
+ #include <sanitizer/dfsan_interface.h>
+ #include <assert.h>
+
+ int main(void) {
+ int i = 1;
+ dfsan_label i_label = dfsan_create_label("i", 0);
+ dfsan_set_label(i_label, &i, sizeof(i));
+
+ int j = 2;
+ dfsan_label j_label = dfsan_create_label("j", 0);
+ dfsan_set_label(j_label, &j, sizeof(j));
+
+ int k = 3;
+ dfsan_label k_label = dfsan_create_label("k", 0);
+ dfsan_set_label(k_label, &k, sizeof(k));
+
+ dfsan_label ij_label = dfsan_get_label(i + j);
+ assert(dfsan_has_label(ij_label, i_label));
+ assert(dfsan_has_label(ij_label, j_label));
+ assert(!dfsan_has_label(ij_label, k_label));
+
+ dfsan_label ijk_label = dfsan_get_label(i + j + k);
+ assert(dfsan_has_label(ijk_label, i_label));
+ assert(dfsan_has_label(ijk_label, j_label));
+ assert(dfsan_has_label(ijk_label, k_label));
+
+ return 0;
+ }
+
+Current status
+==============
+
+DataFlowSanitizer is a work in progress, currently under development for
+x86\_64 Linux.
+
+Design
+======
+
+Please refer to the :doc:`design document<DataFlowSanitizerDesign>`.
Added: www-releases/trunk/3.5.1/tools/clang/docs/DataFlowSanitizerDesign.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/DataFlowSanitizerDesign.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/DataFlowSanitizerDesign.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/DataFlowSanitizerDesign.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,220 @@
+DataFlowSanitizer Design Document
+=================================
+
+This document sets out the design for DataFlowSanitizer, a general
+dynamic data flow analysis. Unlike other Sanitizer tools, this tool is
+not designed to detect a specific class of bugs on its own. Instead,
+it provides a generic dynamic data flow analysis framework to be used
+by clients to help detect application-specific issues within their
+own code.
+
+DataFlowSanitizer is a program instrumentation which can associate
+a number of taint labels with any data stored in any memory region
+accessible by the program. The analysis is dynamic, which means that
+it operates on a running program, and tracks how the labels propagate
+through that program. The tool shall support a large (>100) number
+of labels, such that programs which operate on large numbers of data
+items may be analysed with each data item being tracked separately.
+
+Use Cases
+---------
+
+This instrumentation can be used as a tool to help monitor how data
+flows from a program's inputs (sources) to its outputs (sinks).
+This has applications from a privacy/security perspective in that
+one can audit how a sensitive data item is used within a program and
+ensure it isn't exiting the program anywhere it shouldn't be.
+
+Interface
+---------
+
+A number of functions are provided which will create taint labels,
+attach labels to memory regions and extract the set of labels
+associated with a specific memory region. These functions are declared
+in the header file ``sanitizer/dfsan_interface.h``.
+
+.. code-block:: c
+
+ /// Creates and returns a base label with the given description and user data.
+ dfsan_label dfsan_create_label(const char *desc, void *userdata);
+
+ /// Sets the label for each address in [addr,addr+size) to \c label.
+ void dfsan_set_label(dfsan_label label, void *addr, size_t size);
+
+ /// Sets the label for each address in [addr,addr+size) to the union of the
+ /// current label for that address and \c label.
+ void dfsan_add_label(dfsan_label label, void *addr, size_t size);
+
+ /// Retrieves the label associated with the given data.
+ ///
+ /// The type of 'data' is arbitrary. The function accepts a value of any type,
+ /// which can be truncated or extended (implicitly or explicitly) as necessary.
+ /// The truncation/extension operations will preserve the label of the original
+ /// value.
+ dfsan_label dfsan_get_label(long data);
+
+ /// Retrieves a pointer to the dfsan_label_info struct for the given label.
+ const struct dfsan_label_info *dfsan_get_label_info(dfsan_label label);
+
+ /// Returns whether the given label label contains the label elem.
+ int dfsan_has_label(dfsan_label label, dfsan_label elem);
+
+ /// If the given label label contains a label with the description desc, returns
+ /// that label, else returns 0.
+ dfsan_label dfsan_has_label_with_desc(dfsan_label label, const char *desc);
+
+Taint label representation
+--------------------------
+
+As stated above, the tool must track a large number of taint
+labels. This poses an implementation challenge, as most multiple-label
+tainting systems assign one label per bit to shadow storage, and
+union taint labels using a bitwise or operation. This will not scale
+to clients which use hundreds or thousands of taint labels, as the
+label union operation becomes O(n) in the number of supported labels,
+and data associated with it will quickly dominate the live variable
+set, causing register spills and hampering performance.
+
+Instead, a low overhead approach is proposed which is best-case O(log\
+:sub:`2` n) during execution. The underlying assumption is that
+the required space of label unions is sparse, which is a reasonable
+assumption to make given that we are optimizing for the case where
+applications mostly copy data from one place to another, without often
+invoking the need for an actual union operation. The representation
+of a taint label is a 16-bit integer, and new labels are allocated
+sequentially from a pool. The label identifier 0 is special, and means
+that the data item is unlabelled.
+
+When a label union operation is requested at a join point (any
+arithmetic or logical operation with two or more operands, such as
+addition), the code checks whether a union is required, whether the
+same union has been requested before, and whether one union label
+subsumes the other. If so, it returns the previously allocated union
+label. If not, it allocates a new union label from the same pool used
+for new labels.
+
+Specifically, the instrumentation pass will insert code like this
+to decide the union label ``lu`` for a pair of labels ``l1``
+and ``l2``:
+
+.. code-block:: c
+
+ if (l1 == l2)
+ lu = l1;
+ else
+ lu = __dfsan_union(l1, l2);
+
+The equality comparison is outlined, to provide an early exit in
+the common cases where the program is processing unlabelled data, or
+where the two data items have the same label. ``__dfsan_union`` is
+a runtime library function which performs all other union computation.
+
+Further optimizations are possible, for example if ``l1`` is known
+at compile time to be zero (e.g. it is derived from a constant),
+``l2`` can be used for ``lu``, and vice versa.
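+
+The checks described above can be modelled with a small toy implementation.
+This is only a sketch under stated assumptions: the class ``LabelPool`` and
+its members are hypothetical names, the union table is an ordinary map rather
+than the runtime's table, and exhaustion of the 16-bit label space is not
+handled.
+
+.. code-block:: c++
+
+  #include <cstdint>
+  #include <map>
+  #include <utility>
+  #include <vector>
+
+  typedef std::uint16_t Label;  // 0 means "unlabelled".
+
+  class LabelPool {
+  public:
+    LabelPool() { parents_.push_back(std::make_pair(Label(0), Label(0))); }
+
+    // Allocates a new base label from the pool.
+    Label create() { return allocate(0, 0); }
+
+    // True if `label` is `elem` or is a union that contains `elem`.
+    bool has(Label label, Label elem) const {
+      if (label == elem)
+        return true;
+      const std::pair<Label, Label> &p = parents_[label];
+      if (p.first == 0)
+        return false;  // Base label or the unlabelled value.
+      return has(p.first, elem) || has(p.second, elem);
+    }
+
+    // Models the inline comparison plus the out-of-line union computation.
+    Label unite(Label l1, Label l2) {
+      if (l1 == l2) return l1;     // Early exit: same label.
+      if (l1 == 0) return l2;      // No union required.
+      if (l2 == 0) return l1;
+      if (has(l1, l2)) return l1;  // One label subsumes the other.
+      if (has(l2, l1)) return l2;
+      if (l1 > l2) std::swap(l1, l2);
+      const std::pair<Label, Label> key(l1, l2);
+      std::map<std::pair<Label, Label>, Label>::iterator it = unions_.find(key);
+      if (it != unions_.end())
+        return it->second;         // Same union requested before.
+      Label lu = allocate(l1, l2); // New union label from the same pool.
+      unions_[key] = lu;
+      return lu;
+    }
+
+  private:
+    Label allocate(Label a, Label b) {
+      parents_.push_back(std::make_pair(a, b));
+      return static_cast<Label>(parents_.size() - 1);
+    }
+
+    std::vector<std::pair<Label, Label> > parents_;  // Indexed by label.
+    std::map<std::pair<Label, Label>, Label> unions_;
+  };
+
+With three base labels ``i``, ``j`` and ``k`` created from such a pool,
+``unite(i, j)`` and ``unite(j, i)`` return the same union label, and uniting
+that label with ``i`` again returns it unchanged.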
+
+Memory layout and label management
+----------------------------------
+
+The following is the current memory layout for Linux/x86\_64:
+
++---------------+---------------+--------------------+
+| Start | End | Use |
++===============+===============+====================+
+| 0x700000008000|0x800000000000 | application memory |
++---------------+---------------+--------------------+
+| 0x200200000000|0x700000008000 | unused |
++---------------+---------------+--------------------+
+| 0x200000000000|0x200200000000 | union table |
++---------------+---------------+--------------------+
+| 0x000000010000|0x200000000000 | shadow memory |
++---------------+---------------+--------------------+
+| 0x000000000000|0x000000010000 | reserved by kernel |
++---------------+---------------+--------------------+
+
+Each byte of application memory corresponds to two bytes of shadow
+memory, which are used to store its taint label. As for LLVM SSA
+registers, we have not found it necessary to associate a label with
+each byte or bit of data, as some other tools do. Instead, labels are
+associated directly with registers. Loads will result in a union of
+all shadow labels corresponding to bytes loaded (which most of the
+time will be short circuited by the initial comparison) and stores will
+result in a copy of the label to the shadow of all bytes stored to.
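+
+The text above does not spell out how an application address is mapped to
+its shadow address. One mapping that is consistent with the table is to clear
+the top address bits and double the result, as in the sketch below. The mask
+value and the names ``kAppMask`` and ``shadow_for`` are assumptions made for
+illustration, not a statement of what the runtime does.
+
+.. code-block:: c++
+
+  #include <cstdint>
+
+  // Assumed mask: the bits that distinguish application memory in the table.
+  const std::uint64_t kAppMask = 0x700000000000ULL;
+
+  // Maps an application address to the address of its two-byte shadow label.
+  inline std::uint64_t shadow_for(std::uint64_t app_addr) {
+    return (app_addr & ~kAppMask) << 1;
+  }
+
+Under this assumption, the first application byte at ``0x700000008000`` maps
+to ``0x000000010000``, the start of shadow memory, and the shadow of the last
+application byte ends exactly at ``0x200000000000``.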
+
+Propagating labels through arguments
+------------------------------------
+
+In order to propagate labels through function arguments and return values,
+DataFlowSanitizer changes the ABI of each function in the translation unit.
+There are currently two supported ABIs:
+
+* Args -- Argument and return value labels are passed through additional
+ arguments and by modifying the return type.
+
+* TLS -- Argument and return value labels are passed through TLS variables
+ ``__dfsan_arg_tls`` and ``__dfsan_retval_tls``.
+
+The main advantage of the TLS ABI is that it is more tolerant of ABI mismatches
+(TLS storage is not shared with any other form of storage, whereas extra
+arguments may be stored in registers which under the native ABI are not used
+for parameter passing and thus could contain arbitrary values). On the other
+hand the args ABI is more efficient and allows ABI mismatches to be more easily
+identified by checking for nonzero labels in nominally unlabelled programs.
+
+Implementing the ABI list
+-------------------------
+
+The `ABI list <DataFlowSanitizer.html#abi-list>`_ provides a list of functions
+which conform to the native ABI, each of which is callable from an instrumented
+program. This is implemented by replacing each reference to a native ABI
+function with a reference to a function which uses the instrumented ABI.
+Such functions are automatically-generated wrappers for the native functions.
+For example, given the ABI list example provided in the user manual, the
+following wrappers will be generated under the args ABI:
+
+.. code-block:: llvm
+
+ define linkonce_odr { i8*, i16 } @"dfsw$malloc"(i64 %0, i16 %1) {
+ entry:
+ %2 = call i8* @malloc(i64 %0)
+ %3 = insertvalue { i8*, i16 } undef, i8* %2, 0
+ %4 = insertvalue { i8*, i16 } %3, i16 0, 1
+ ret { i8*, i16 } %4
+ }
+
+ define linkonce_odr { i32, i16 } @"dfsw$tolower"(i32 %0, i16 %1) {
+ entry:
+ %2 = call i32 @tolower(i32 %0)
+ %3 = insertvalue { i32, i16 } undef, i32 %2, 0
+ %4 = insertvalue { i32, i16 } %3, i16 %1, 1
+ ret { i32, i16 } %4
+ }
+
+ define linkonce_odr { i8*, i16 } @"dfsw$memcpy"(i8* %0, i8* %1, i64 %2, i16 %3, i16 %4, i16 %5) {
+ entry:
+ %labelreturn = alloca i16
+ %6 = call i8* @__dfsw_memcpy(i8* %0, i8* %1, i64 %2, i16 %3, i16 %4, i16 %5, i16* %labelreturn)
+ %7 = load i16* %labelreturn
+ %8 = insertvalue { i8*, i16 } undef, i8* %6, 0
+ %9 = insertvalue { i8*, i16 } %8, i16 %7, 1
+ ret { i8*, i16 } %9
+ }
+
+As an optimization, direct calls to native ABI functions will call the
+native ABI function directly and the pass will compute the appropriate label
+internally. This has the advantage of reducing the number of union operations
+required when the return value label is known to be zero (i.e. ``discard``
+functions, or ``functional`` functions with known unlabelled arguments).
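+
+The label handling of the three wrappers above can be modeled in a few lines
+of Python. This is only a sketch of the semantics visible in the generated IR:
+a ``discard`` wrapper returns an empty label, a ``functional`` wrapper returns
+the union of its argument labels, and a ``custom`` wrapper delegates to a
+hand-written function. The helper names and the use of a ``frozenset`` as a
+label are assumptions made for illustration.
+

```python
def wrap_discard(fn):
    """Model of dfsw$malloc: call the native function and return an empty label."""
    def wrapper(args, labels):
        return fn(*args), frozenset()
    return wrapper


def wrap_functional(fn):
    """Model of dfsw$tolower: the result label is the union of the argument labels."""
    def wrapper(args, labels):
        return fn(*args), frozenset().union(*labels)
    return wrapper


def wrap_custom(custom_fn):
    """Model of dfsw$memcpy: a hand-written function receives values and labels
    and returns a (value, label) pair itself."""
    def wrapper(args, labels):
        return custom_fn(*args, *labels)
    return wrapper


# Hypothetical usage: the argument label flows through a functional wrapper.
tolower = wrap_functional(str.lower)
value, label = tolower(("A",), (frozenset({"user_input"}),))
```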
+
+Checking ABI Consistency
+------------------------
+
+DFSan changes the ABI of each function in the module. This makes it possible
+for a function with the native ABI to be called with the instrumented ABI,
+or vice versa, thus possibly invoking undefined behavior. A simple way
+of statically detecting instances of this problem is to prepend the prefix
+"dfs$" to the name of each instrumented-ABI function.
+
+This will not catch every such problem; in particular function pointers passed
+across the instrumented-native barrier cannot be used on the other side.
+These problems could potentially be caught dynamically.
Added: www-releases/trunk/3.5.1/tools/clang/docs/DriverArchitecture.png
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/DriverArchitecture.png?rev=225843&view=auto
==============================================================================
Binary file - no diff available.
Propchange: www-releases/trunk/3.5.1/tools/clang/docs/DriverArchitecture.png
------------------------------------------------------------------------------
svn:mime-type = image/png
Added: www-releases/trunk/3.5.1/tools/clang/docs/DriverInternals.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/DriverInternals.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/DriverInternals.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/DriverInternals.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,400 @@
+=========================
+Driver Design & Internals
+=========================
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+This document describes the Clang driver. The purpose of this document
+is to describe both the motivation and design goals for the driver, as
+well as details of the internal implementation.
+
+Features and Goals
+==================
+
+The Clang driver is intended to be a production quality compiler driver
+providing access to the Clang compiler and tools, with a command line
+interface which is compatible with the gcc driver.
+
+Although the driver is part of and driven by the Clang project, it is
+logically a separate tool which shares many of the same goals as Clang:
+
+.. contents:: Features
+ :local:
+
+GCC Compatibility
+-----------------
+
+The number one goal of the driver is to ease the adoption of Clang by
+allowing users to drop Clang into a build system which was designed to
+call GCC. Although this makes the driver much more complicated than
+might otherwise be necessary, we decided that being very compatible with
+the gcc command line interface was worth it in order to allow users to
+quickly test clang on their projects.
+
+Flexible
+--------
+
+The driver was designed to be flexible and easily accommodate new uses
+as we grow the clang and LLVM infrastructure. As one example, the driver
+can easily support the introduction of tools which have an integrated
+assembler; something we hope to add to LLVM in the future.
+
+Similarly, most of the driver functionality is kept in a library which
+can be used to build other tools which want to implement or accept a gcc
+like interface.
+
+Low Overhead
+------------
+
+The driver should have as little overhead as possible. In practice, we
+found that the gcc driver by itself incurred a small but meaningful
+overhead when compiling many small files. The driver doesn't do much
+work compared to a compilation, but we have tried to keep it as
+efficient as possible by following a few simple principles:
+
+- Avoid memory allocation and string copying when possible.
+- Don't parse arguments more than once.
+- Provide a few simple interfaces for efficiently searching arguments.
+
+Simple
+------
+
+Finally, the driver was designed to be "as simple as possible", given
+the other goals. Notably, trying to be completely compatible with the
+gcc driver adds a significant amount of complexity. However, the design
+of the driver attempts to mitigate this complexity by dividing the
+process into a number of independent stages instead of a single
+monolithic task.
+
+Internal Design and Implementation
+==================================
+
+.. contents::
+ :local:
+ :depth: 1
+
+Internals Introduction
+----------------------
+
+In order to satisfy the stated goals, the driver was designed to
+completely subsume the functionality of the gcc executable; that is, the
+driver should not need to delegate to gcc to perform subtasks. On
+Darwin, this implies that the Clang driver also subsumes the gcc
+driver-driver, which is used to implement support for building universal
+images (binaries and object files). This also implies that the driver
+should be able to call the language specific compilers (e.g. cc1)
+directly, which means that it must have enough information to forward
+command line arguments to child processes correctly.
+
+Design Overview
+---------------
+
+The diagram below shows the significant components of the driver
+architecture and how they relate to one another. The orange components
+represent concrete data structures built by the driver, the green
+components indicate conceptually distinct stages which manipulate these
+data structures, and the blue components are important helper classes.
+
+.. image:: DriverArchitecture.png
+ :align: center
+ :alt: Driver Architecture Diagram
+
+Driver Stages
+-------------
+
+The driver functionality is conceptually divided into five stages:
+
+#. **Parse: Option Parsing**
+
+ The command line argument strings are decomposed into arguments
+ (``Arg`` instances). The driver expects to understand all available
+ options, although there is some facility for just passing certain
+ classes of options through (like ``-Wl,``).
+
+ Each argument corresponds to exactly one abstract ``Option``
+ definition, which describes how the option is parsed along with some
+ additional metadata. The Arg instances themselves are lightweight and
+ merely contain enough information for clients to determine which
+ option they correspond to and their values (if they have additional
+ parameters).
+
+ For example, a command line like "-Ifoo -I foo" would parse to two
+ Arg instances (a JoinedArg and a SeparateArg instance), but each
+ would refer to the same Option.
+
+ Options are lazily created in order to avoid populating all Option
+ classes when the driver is loaded. Most of the driver code only needs
+ to deal with options by their unique ID (e.g., ``options::OPT_I``).
+
+ Arg instances themselves do not generally store the values of
+ parameters. In many cases, this would simply result in creating
+ unnecessary string copies. Instead, Arg instances are always embedded
+ inside an ArgList structure, which contains the original vector of
+ argument strings. Each Arg itself only needs to contain an index into
+ this vector instead of storing its values directly.
+
+ The clang driver can dump the results of this stage using the
+ ``-ccc-print-options`` flag (which must precede any actual command
+ line arguments). For example:
+
+ .. code-block:: console
+
+ $ clang -ccc-print-options -Xarch_i386 -fomit-frame-pointer -Wa,-fast -Ifoo -I foo t.c
+ Option 0 - Name: "-Xarch_", Values: {"i386", "-fomit-frame-pointer"}
+ Option 1 - Name: "-Wa,", Values: {"-fast"}
+ Option 2 - Name: "-I", Values: {"foo"}
+ Option 3 - Name: "-I", Values: {"foo"}
+ Option 4 - Name: "<input>", Values: {"t.c"}
+
+ After this stage is complete the command line should be broken down
+ into well defined option objects with their appropriate parameters.
+ Subsequent stages should rarely, if ever, need to do any string
+ processing.
+
+#. **Pipeline: Compilation Job Construction**
+
+ Once the arguments are parsed, the tree of subprocess jobs needed for
+ the desired compilation sequence is constructed. This involves
+ determining the input files and their types, what work is to be done
+ on them (preprocess, compile, assemble, link, etc.), and constructing
+ a list of Action instances for each task. The result is a list of one
+ or more top-level actions, each of which generally corresponds to a
+ single output (for example, an object or linked executable).
+
+ The majority of Actions correspond to actual tasks; however, there are
+ two special Actions. The first is InputAction, which simply serves to
+ adapt an input argument for use as an input to other Actions. The
+ second is BindArchAction, which conceptually alters the architecture
+ to be used for all of its input Actions.
+
+ The clang driver can dump the results of this stage using the
+ ``-ccc-print-phases`` flag. For example:
+
+ .. code-block:: console
+
+ $ clang -ccc-print-phases -x c t.c -x assembler t.s
+ 0: input, "t.c", c
+ 1: preprocessor, {0}, cpp-output
+ 2: compiler, {1}, assembler
+ 3: assembler, {2}, object
+ 4: input, "t.s", assembler
+ 5: assembler, {4}, object
+ 6: linker, {3, 5}, image
+
+ Here the driver is constructing seven distinct actions, four to
+ compile the "t.c" input into an object file, two to assemble the
+ "t.s" input, and one to link them together.
+
+ A rather different compilation pipeline is shown here; in this
+ example there are two top level actions to compile the input files
+ into two separate object files, where each object file is built using
+ ``lipo`` to merge results built for two separate architectures.
+
+ .. code-block:: console
+
+ $ clang -ccc-print-phases -c -arch i386 -arch x86_64 t0.c t1.c
+ 0: input, "t0.c", c
+ 1: preprocessor, {0}, cpp-output
+ 2: compiler, {1}, assembler
+ 3: assembler, {2}, object
+ 4: bind-arch, "i386", {3}, object
+ 5: bind-arch, "x86_64", {3}, object
+ 6: lipo, {4, 5}, object
+ 7: input, "t1.c", c
+ 8: preprocessor, {7}, cpp-output
+ 9: compiler, {8}, assembler
+ 10: assembler, {9}, object
+ 11: bind-arch, "i386", {10}, object
+ 12: bind-arch, "x86_64", {10}, object
+ 13: lipo, {11, 12}, object
+
+ After this stage is complete the compilation process is divided into
+ a simple set of actions which need to be performed to produce
+ intermediate or final outputs (in some cases, like ``-fsyntax-only``,
+ there is no "real" final output). Phases are well known compilation
+ steps, such as "preprocess", "compile", "assemble", "link", etc.
+
+#. **Bind: Tool & Filename Selection**
+
+ This stage (in conjunction with the Translate stage) turns the tree
+ of Actions into a list of actual subprocesses to run. Conceptually, the
+ driver performs a top down matching to assign Action(s) to Tools. The
+ ToolChain is responsible for selecting the tool to perform a
+ particular action; once selected the driver interacts with the tool
+ to see if it can match additional actions (for example, by having an
+ integrated preprocessor).
+
+ Once Tools have been selected for all actions, the driver determines
+ how the tools should be connected (for example, using an inprocess
+ module, pipes, temporary files, or user provided filenames). If an
+ output file is required, the driver also computes the appropriate
+ file name (the suffix and file location depend on the input types and
+ options such as ``-save-temps``).
+
+ The driver interacts with a ToolChain to perform the Tool bindings.
+ Each ToolChain contains information about all the tools needed for
+ compilation for a particular architecture, platform, and operating
+ system. A single driver invocation may query multiple ToolChains
+ during one compilation in order to interact with tools for separate
+ architectures.
+
+ The results of this stage are not computed directly, but the driver
+ can print the results via the ``-ccc-print-bindings`` option. For
+ example:
+
+ .. code-block:: console
+
+ $ clang -ccc-print-bindings -arch i386 -arch ppc t0.c
+ # "i386-apple-darwin9" - "clang", inputs: ["t0.c"], output: "/tmp/cc-Sn4RKF.s"
+ # "i386-apple-darwin9" - "darwin::Assemble", inputs: ["/tmp/cc-Sn4RKF.s"], output: "/tmp/cc-gvSnbS.o"
+ # "i386-apple-darwin9" - "darwin::Link", inputs: ["/tmp/cc-gvSnbS.o"], output: "/tmp/cc-jgHQxi.out"
+ # "ppc-apple-darwin9" - "gcc::Compile", inputs: ["t0.c"], output: "/tmp/cc-Q0bTox.s"
+ # "ppc-apple-darwin9" - "gcc::Assemble", inputs: ["/tmp/cc-Q0bTox.s"], output: "/tmp/cc-WCdicw.o"
+ # "ppc-apple-darwin9" - "gcc::Link", inputs: ["/tmp/cc-WCdicw.o"], output: "/tmp/cc-HHBEBh.out"
+ # "i386-apple-darwin9" - "darwin::Lipo", inputs: ["/tmp/cc-jgHQxi.out", "/tmp/cc-HHBEBh.out"], output: "a.out"
+
+ This shows the tool chain, tool, inputs and outputs which have been
+ bound for this compilation sequence. Here clang is being used to
+ compile t0.c on the i386 architecture and darwin specific versions of
+ the tools are being used to assemble and link the result, but generic
+ gcc versions of the tools are being used on PowerPC.
+
+#. **Translate: Tool Specific Argument Translation**
+
+ Once a Tool has been selected to perform a particular Action, the
+ Tool must construct concrete Jobs which will be executed during
+ compilation. The main work is in translating from the gcc style
+ command line options to whatever options the subprocess expects.
+
+ Some tools, such as the assembler, only interact with a handful of
+ arguments and just determine the path of the executable to call and
+ pass on their input and output arguments. Others, like the compiler
+ or the linker, may translate a large number of arguments in addition.
+
+ The ArgList class provides a number of simple helper methods to
+ assist with translating arguments; for example, to pass on only the
+ last of arguments corresponding to some option, or all arguments for
+ an option.
+
+ The result of this stage is a list of Jobs (executable paths and
+ argument strings) to execute.
+
+#. **Execute**
+
+ Finally, the compilation pipeline is executed. This is mostly
+ straightforward, although there is some interaction with options like
+ ``-pipe``, ``-pass-exit-codes`` and ``-time``.
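+
+The Parse stage described above keeps ``Arg`` instances small by storing an
+index into the original argument vector instead of a copy of the value. The
+following Python sketch models that idea for the ``-Ifoo -I foo`` example. It
+handles only ``-I`` and plain inputs, and its class and field names loosely
+mirror the driver's concepts rather than its actual C++ interface.
+

```python
from dataclasses import dataclass

OPT_I = "OPT_I"          # stand-ins for option IDs such as options::OPT_I
OPT_INPUT = "OPT_INPUT"


@dataclass(frozen=True)
class Arg:
    option: str   # ID of the abstract Option this argument corresponds to
    kind: str     # "joined" (-Ifoo), "separate" (-I foo) or "input"
    index: int    # position of the argument's first string in ArgList.strings


class ArgList:
    def __init__(self, strings):
        self.strings = list(strings)   # the original strings, stored exactly once
        self.args = []
        i = 0
        while i < len(self.strings):
            s = self.strings[i]
            if s == "-I":
                if i + 1 == len(self.strings):
                    raise ValueError("-I requires a value")
                self.args.append(Arg(OPT_I, "separate", i))
                i += 2                 # the value lives in the next string
            elif s.startswith("-I"):
                self.args.append(Arg(OPT_I, "joined", i))
                i += 1
            else:
                self.args.append(Arg(OPT_INPUT, "input", i))
                i += 1

    def value(self, arg):
        """Recover an argument's value from the shared string vector on demand."""
        s = self.strings[arg.index]
        if arg.kind == "joined":
            return s[len("-I"):]
        if arg.kind == "separate":
            return self.strings[arg.index + 1]
        return s


parsed = ArgList(["-Ifoo", "-I", "foo", "t.c"])
```
+
+Here ``parsed.args`` holds one joined and one separate ``Arg`` that share the
+same option ID, and ``parsed.value`` yields ``foo`` for both.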
+
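+The Pipeline stage can be pictured as building a small graph of actions. The
+Python sketch below reproduces the numbering of the first
+``-ccc-print-phases`` listing above for the inputs ``t.c`` and ``t.s``. The
+``PHASES`` table and the function names are illustrative assumptions, not the
+driver's implementation.
+

```python
# For each input type: the (action kind, output type) steps needed to reach an object file.
PHASES = {
    "c": [("preprocessor", "cpp-output"),
          ("compiler", "assembler"),
          ("assembler", "object")],
    "assembler": [("assembler", "object")],
}


def build_actions(inputs):
    """inputs is a list of (filename, type) pairs; an action's id is its list index."""
    actions = []   # each entry: (kind, operand, output type)
    objects = []   # ids of the actions that produce object files
    for filename, input_type in inputs:
        actions.append(("input", filename, input_type))
        for kind, output_type in PHASES[input_type]:
            previous = len(actions) - 1
            actions.append((kind, [previous], output_type))
        objects.append(len(actions) - 1)
    actions.append(("linker", objects, "image"))
    return actions


def format_actions(actions):
    lines = []
    for i, (kind, operand, output_type) in enumerate(actions):
        if kind == "input":
            shown = '"%s"' % operand
        else:
            shown = "{" + ", ".join(str(n) for n in operand) + "}"
        lines.append("%d: %s, %s, %s" % (i, kind, shown, output_type))
    return lines


listing = format_actions(build_actions([("t.c", "c"), ("t.s", "assembler")]))
```
+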
+Additional Notes
+----------------
+
+The Compilation Object
+^^^^^^^^^^^^^^^^^^^^^^
+
+The driver constructs a Compilation object for each set of command line
+arguments. The Driver itself is intended to be invariant during
+construction of a Compilation; an IDE should be able to construct a
+single long lived driver instance to use for an entire build, for
+example.
+
+The Compilation object holds information that is particular to each
+compilation sequence. For example, the list of used temporary files
+(which must be removed once compilation is finished) and result files
+(which should be removed if compilation fails).
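+
+The two cleanup rules can be summarized in a short Python sketch. The class
+below is a hypothetical model of the bookkeeping only; the attribute and
+method names are assumptions and do not reflect the driver's C++ interface.
+

```python
import os


class Compilation:
    def __init__(self):
        self.temp_files = []     # always removed once the compilation finishes
        self.result_files = []   # removed only if the compilation fails

    def files_to_remove(self, succeeded):
        doomed = list(self.temp_files)
        if not succeeded:
            doomed.extend(self.result_files)
        return doomed

    def cleanup(self, succeeded):
        for path in self.files_to_remove(succeeded):
            if os.path.exists(path):
                os.remove(path)
```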
+
+Unified Parsing & Pipelining
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Parsing and pipelining both occur without reference to a Compilation
+instance. This is by design; the driver expects that both of these
+phases are platform neutral, with a few very well defined exceptions
+such as whether the platform uses a driver driver.
+
+ToolChain Argument Translation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In order to match gcc very closely, the clang driver currently allows
+tool chains to perform their own translation of the argument list (into
+a new ArgList data structure). Although this allows the clang driver to
+match gcc easily, it also makes the driver operation much harder to
+understand (since the Tools stop seeing some arguments the user
+provided, and see new ones instead).
+
+For example, on Darwin ``-gfull`` gets translated into two separate
+arguments, ``-g`` and ``-fno-eliminate-unused-debug-symbols``. Trying to
+write Tool logic to do something with ``-gfull`` will not work, because
+Tool argument translation is done after the arguments have been
+translated.
+
+A long term goal is to remove this tool chain specific translation, and
+instead force each tool to change its own logic to do the right thing on
+the untranslated original arguments.
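+
+A toy Python model of the ``-gfull`` example shows why Tool logic never
+observes the original spelling. The function names are hypothetical, and real
+tool chains translate many more options than this.
+

```python
def translate_args_darwin(args):
    """Tool chain step: rewrite the user's arguments into a new argument list."""
    translated = []
    for arg in args:
        if arg == "-gfull":
            translated += ["-g", "-fno-eliminate-unused-debug-symbols"]
        else:
            translated.append(arg)
    return translated


def tool_sees_gfull(tool_args):
    """Tool step: runs after translation, so it only sees the rewritten list."""
    return "-gfull" in tool_args


user_args = ["-gfull", "-c", "t.c"]
tool_args = translate_args_darwin(user_args)
```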
+
+Unused Argument Warnings
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The driver operates by parsing all arguments but giving Tools the
+opportunity to choose which arguments to pass on. One downside of this
+infrastructure is that if the user misspells some option, or is confused
+about which options to use, some command line arguments the user really
+cared about may go unused. This problem is particularly important when
+using clang as a compiler, since the clang compiler does not support
+anywhere near all the options that gcc does, and we want to make sure
+users know which ones are being used.
+
+To support this, the driver maintains a bit associated with each
+argument of whether it has been used (at all) during the compilation.
+This bit usually doesn't need to be set by hand, as the key ArgList
+accessors will set it automatically.
+
+When a compilation is successful (there are no errors), the driver
+checks the bit and emits an "unused argument" warning for any arguments
+which were never accessed. This is conservative (the argument may not
+have been used to do what the user wanted) but still catches the most
+obvious cases.
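+
+The bookkeeping behind these warnings can be sketched in Python as follows.
+This is a simplified model under assumed names: the accessor marks every
+argument it matches as used, and anything still unmarked after a successful
+compilation is reported.
+

```python
class Arg:
    def __init__(self, spelling):
        self.spelling = spelling
        self.claimed = False   # the per-argument "has been used" bit


class ArgList:
    def __init__(self, spellings):
        self.args = [Arg(s) for s in spellings]

    def has_arg(self, spelling):
        """Accessor used by Tools; it sets the bit on every match automatically."""
        found = False
        for arg in self.args:
            if arg.spelling == spelling:
                arg.claimed = True
                found = True
        return found

    def unused(self):
        return [arg.spelling for arg in self.args if not arg.claimed]


def unused_argument_warnings(arg_list, succeeded):
    if not succeeded:          # warnings are only emitted when there were no errors
        return []
    return ["unused argument: " + s for s in arg_list.unused()]


args = ArgList(["-O2", "-fmisspelled-option", "-c"])
args.has_arg("-O2")
args.has_arg("-c")
warnings = unused_argument_warnings(args, succeeded=True)
```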
+
+Relation to GCC Driver Concepts
+-------------------------------
+
+For those familiar with the gcc driver, this section provides a brief
+overview of how things from the gcc driver map to the clang driver.
+
+- **Driver Driver**
+
+ The driver driver is fully integrated into the clang driver. The
+ driver simply constructs additional Actions to bind the architecture
+ during the *Pipeline* phase. The tool chain specific argument
+ translation is responsible for handling ``-Xarch_``.
+
+ The one caveat is that this approach requires ``-Xarch_`` not be used
+ to alter the compilation itself (for example, one cannot provide
+ ``-S`` as an ``-Xarch_`` argument). The driver attempts to reject
+ such invocations, and overall there isn't a good reason to abuse
+ ``-Xarch_`` to that end in practice.
+
+ The upside is that the clang driver is more efficient and does little
+ extra work to support universal builds. It also provides better error
+ reporting and UI consistency.
+
+- **Specs**
+
+ The clang driver has no direct correspondent for "specs". The
+ majority of the functionality that is embedded in specs is in the
+ Tool specific argument translation routines. The parts of specs which
+ control the compilation pipeline are generally part of the *Pipeline*
+ stage.
+
+- **Toolchains**
+
+ The gcc driver has no direct understanding of tool chains. Each gcc
+ binary roughly corresponds to the information which is embedded
+ inside a single ToolChain.
+
+ The clang driver is intended to be portable and support complex
+ compilation environments. All platform and tool chain specific code
+ should be protected behind either abstract or well defined interfaces
+ (such as whether the platform supports use as a driver driver).
Added: www-releases/trunk/3.5.1/tools/clang/docs/ExternalClangExamples.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/ExternalClangExamples.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/ExternalClangExamples.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/ExternalClangExamples.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,87 @@
+=======================
+External Clang Examples
+=======================
+
+Introduction
+============
+
+This page provides some examples of the kinds of things that people have
+done with Clang that might serve as useful guides (or starting points) from
+which to develop your own tools. They may be helpful even for something as
+banal (but necessary) as how to set up your build to integrate Clang.
+
+Clang's library-based design is deliberately aimed at facilitating use by
+external projects, and we are always interested in improving Clang to
+better serve our external users. Some typical categories of applications
+where Clang is used are:
+
+- Static analysis.
+- Documentation/cross-reference generation.
+
+If you know of (or wrote!) a tool or project using Clang, please send an
+email to Clang's `development discussion mailing list
+<http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev>`_ to have it added
+(or, if you are already a Clang contributor, feel free to commit additions
+directly). Since the primary purpose of this page is to provide examples
+that can help developers, listed projects must generally have code available.
+
+List of projects and tools
+==========================
+
+`<https://github.com/Andersbakken/rtags/>`_
+ "RTags is a client/server application that indexes c/c++ code and keeps
+ a persistent in-memory database of references, symbolnames, completions
+ etc."
+
+`<http://rprichard.github.com/sourceweb/>`_
+ "A C/C++ source code indexer and navigator"
+
+`<https://github.com/etaoins/qconnectlint>`_
+ "qconnectlint is a Clang tool for statically verifying the consistency
+ of signal and slot connections made with Qt's ``QObject::connect``."
+
+`<https://github.com/woboq/woboq_codebrowser>`_
+ "The Woboq Code Browser is a web-based code browser for C/C++ projects.
+ Check out `<http://code.woboq.org/>`_ for an example!"
+
+`<https://github.com/mozilla/dxr>`_
+ "DXR is a source code cross-reference tool that uses static analysis
+ data collected by instrumented compilers."
+
+`<https://github.com/eschulte/clang-mutate>`_
+ "This tool performs a number of operations on C-language source files."
+
+`<https://github.com/gmarpons/Crisp>`_
+ "A coding rule validation add-on for LLVM/clang. Crisp rules are written
+ in Prolog. A high-level declarative DSL to easily write new rules is under
+ development. It will be called CRISP, an acronym for *Coding Rules in
+ Sugared Prolog*."
+
+`<https://github.com/drothlis/clang-ctags>`_
+ "Generate tag file for C++ source code."
+
+`<https://github.com/exclipy/clang_indexer>`_
+ "This is an indexer for C and C++ based on the libclang library."
+
+`<https://github.com/holtgrewe/linty>`_
+ "Linty - C/C++ Style Checking with Python & libclang."
+
+`<https://github.com/axw/cmonster>`_
+ "cmonster is a Python wrapper for the Clang C++ parser."
+
+`<https://github.com/rizsotto/Constantine>`_
+ "Constantine is a toy project to learn how to write clang plugin.
+ Implements pseudo const analysis. Generates warnings about variables,
+ which were declared without const qualifier."
+
+`<https://github.com/jessevdk/cldoc>`_
+ "cldoc is a Clang based documentation generator for C and C++.
+ cldoc tries to solve the issue of writing C/C++ software documentation
+ with a modern, non-intrusive and robust approach."
+
+`<https://github.com/AlexDenisov/ToyClangPlugin>`_
+ "The simplest Clang plugin implementing a semantic check for Objective-C.
+ This example shows how to use the ``DiagnosticsEngine`` (emit warnings,
+ errors, fixit hints). See also `<http://l.rw.rw/clang_plugin>`_ for
+ step-by-step instructions."
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/FAQ.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/FAQ.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/FAQ.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/FAQ.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,64 @@
+================================
+Frequently Asked Questions (FAQ)
+================================
+
+.. contents::
+ :local:
+
+Driver
+======
+
+I run ``clang -cc1 ...`` and get weird errors about missing headers
+-------------------------------------------------------------------
+
+Given this source file:
+
+.. code-block:: c
+
+ #include <stdio.h>
+
+ int main() {
+ printf("Hello world\n");
+ }
+
+
+If you run:
+
+.. code-block:: console
+
+ $ clang -cc1 hello.c
+ hello.c:1:10: fatal error: 'stdio.h' file not found
+ #include <stdio.h>
+ ^
+ 1 error generated.
+
+``clang -cc1`` is the frontend, ``clang`` is the :doc:`driver
+<DriverInternals>`. The driver invokes the frontend with options appropriate
+for your system. To see these options, run:
+
+.. code-block:: console
+
+ $ clang -### -c hello.c
+
+Some clang command line options are driver-only options, some are frontend-only
+options. Frontend-only options are intended to be used only by clang developers.
+Users should not run ``clang -cc1`` directly, because ``-cc1`` options are not
+guaranteed to be stable.
+
+If you want to use a frontend-only option ("a ``-cc1`` option"), for example
+``-ast-dump``, then you need to take the ``clang -cc1`` line generated by the
+driver and add the option you need. Alternatively, you can run
+``clang -Xclang <option> ...`` to force the driver to pass ``<option>`` to
+``clang -cc1``.
+
+I get errors about some headers being missing (``stddef.h``, ``stdarg.h``)
+--------------------------------------------------------------------------
+
+Some header files (``stddef.h``, ``stdarg.h``, and others) are shipped with
+Clang --- these are called builtin includes. Clang searches for them in a
+directory relative to the location of the ``clang`` binary. If you moved the
+``clang`` binary, you need to move the builtin headers, too.
+
+More information can be found in the :ref:`libtooling_builtin_includes`
+section.
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/HowToSetupToolingForLLVM.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/HowToSetupToolingForLLVM.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/HowToSetupToolingForLLVM.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/HowToSetupToolingForLLVM.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,199 @@
+===================================
+How To Setup Clang Tooling For LLVM
+===================================
+
+Clang Tooling provides infrastructure to write tools that need syntactic
+and semantic information about a program. This term also relates to a set
+of specific tools using this infrastructure (e.g. ``clang-check``). This
+document provides information on how to set up and use Clang Tooling for
+the LLVM source code.
+
+Introduction
+============
+
+Clang Tooling needs a compilation database to figure out specific build
+options for each file. Currently it can create a compilation database
+from the ``compile_commands.json`` file, generated by CMake. When
+invoking clang tools, you can either specify a path to a build directory
+using a command line parameter ``-p`` or let Clang Tooling find this
+file in your source tree. In either case you need to configure your
+build using CMake to use clang tools.
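+
+To make the role of the compilation database concrete, the Python sketch
+below looks up the build command for one source file. It assumes the JSON
+layout that CMake emits in ``compile_commands.json``, namely a list of
+objects with ``directory``, ``command`` and ``file`` keys. The function names
+are hypothetical and this is not how Clang Tooling itself is implemented.
+

```python
import json
import os


def load_compile_commands(build_dir):
    """Read compile_commands.json from a build directory."""
    with open(os.path.join(build_dir, "compile_commands.json")) as f:
        return json.load(f)


def command_for(entries, source_path):
    """Return the recorded compile command for source_path, or None if absent."""
    wanted = os.path.realpath(source_path)
    for entry in entries:
        # "file" may be relative to the entry's "directory".
        candidate = os.path.realpath(os.path.join(entry["directory"], entry["file"]))
        if candidate == wanted:
            return entry.get("command")
    return None
```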
+
+Setup Clang Tooling Using CMake and Make
+========================================
+
+If you intend to use make to build LLVM, you should have CMake 2.8.6 or
+later installed (can be found `here <http://cmake.org>`_).
+
+First, you need to generate Makefiles for LLVM with CMake. You need to
+make a build directory and run CMake from it:
+
+.. code-block:: console
+
+ $ mkdir your/build/directory
+ $ cd your/build/directory
+ $ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON path/to/llvm/sources
+
+If you want to use clang instead of GCC, you can add
+``-DCMAKE_C_COMPILER=/path/to/clang -DCMAKE_CXX_COMPILER=/path/to/clang++``.
+You can also use ``ccmake``, which provides a curses interface to configure
+CMake variables for lazy people.
+
+As a result, the new ``compile_commands.json`` file should appear in the
+current directory. You should link it to the LLVM source tree so that
+Clang Tooling is able to use it:
+
+.. code-block:: console
+
+ $ ln -s $PWD/compile_commands.json path/to/llvm/source/
+
+Now you are ready to build and test LLVM using make:
+
+.. code-block:: console
+
+ $ make check-all
+
+Using Clang Tools
+=================
+
+After you have completed the previous steps, you are ready to run clang tools. If
+you have a recent clang installed, you should have ``clang-check`` in
+``$PATH``. Try to run it on any ``.cpp`` file inside the LLVM source tree:
+
+.. code-block:: console
+
+ $ clang-check tools/clang/lib/Tooling/CompilationDatabase.cpp
+
+If you're using vim, it's convenient to have clang-check integrated. Put
+this into your ``.vimrc``:
+
+::
+
+ function! ClangCheckImpl(cmd)
+ if &autowrite | wall | endif
+ echo "Running " . a:cmd . " ..."
+ let l:output = system(a:cmd)
+ cexpr l:output
+ cwindow
+ let w:quickfix_title = a:cmd
+ if v:shell_error != 0
+ cc
+ endif
+ let g:clang_check_last_cmd = a:cmd
+ endfunction
+
+ function! ClangCheck()
+ let l:filename = expand('%')
+ if l:filename =~ '\.\(cpp\|cxx\|cc\|c\)$'
+ call ClangCheckImpl("clang-check " . l:filename)
+ elseif exists("g:clang_check_last_cmd")
+ call ClangCheckImpl(g:clang_check_last_cmd)
+ else
+ echo "Can't detect file's compilation arguments and no previous clang-check invocation!"
+ endif
+ endfunction
+
+ nmap <silent> <F5> :call ClangCheck()<CR><CR>
+
+When editing a .cpp/.cxx/.cc/.c file, hit F5 to reparse the file. In
+case the current file has a different extension (for example, .h), F5
+will re-run the last clang-check invocation made from this vim instance
+(if any). The output will go into the error window, which is opened
+automatically when clang-check finds errors, and can be re-opened with
+``:cope``.
+
+Other ``clang-check`` options that can be useful when working with clang
+AST:
+
+* ``-ast-print`` --- Build ASTs and then pretty-print them.
+* ``-ast-dump`` --- Build ASTs and then debug dump them.
+* ``-ast-dump-filter=<string>`` --- Use with ``-ast-dump`` or ``-ast-print`` to
+ dump/print only AST declaration nodes having a certain substring in a
+ qualified name. Use ``-ast-list`` to list all filterable declaration node
+ names.
+* ``-ast-list`` --- Build ASTs and print the list of declaration node qualified
+ names.
+
+Examples:
+
+.. code-block:: console
+
+ $ clang-check tools/clang/tools/clang-check/ClangCheck.cpp -ast-dump -ast-dump-filter ActionFactory::newASTConsumer
+ Processing: tools/clang/tools/clang-check/ClangCheck.cpp.
+ Dumping ::ActionFactory::newASTConsumer:
+ clang::ASTConsumer *newASTConsumer() (CompoundStmt 0x44da290 </home/alexfh/local/llvm/tools/clang/tools/clang-check/ClangCheck.cpp:64:40, line:72:3>
+ (IfStmt 0x44d97c8 <line:65:5, line:66:45>
+ <<<NULL>>>
+ (ImplicitCastExpr 0x44d96d0 <line:65:9> '_Bool':'_Bool' <UserDefinedConversion>
+ ...
+ $ clang-check tools/clang/tools/clang-check/ClangCheck.cpp -ast-print -ast-dump-filter ActionFactory::newASTConsumer
+ Processing: tools/clang/tools/clang-check/ClangCheck.cpp.
+ Printing <anonymous namespace>::ActionFactory::newASTConsumer:
+ clang::ASTConsumer *newASTConsumer() {
+ if (this->ASTList.operator _Bool())
+ return clang::CreateASTDeclNodeLister();
+ if (this->ASTDump.operator _Bool())
+ return clang::CreateASTDumper(this->ASTDumpFilter);
+ if (this->ASTPrint.operator _Bool())
+ return clang::CreateASTPrinter(&llvm::outs(), this->ASTDumpFilter);
+ return new clang::ASTConsumer();
+ }
+
+(Experimental) Using Ninja Build System
+=======================================
+
+Optionally you can use the `Ninja <https://github.com/martine/ninja>`_
+build system instead of make. It is aimed at making your builds faster.
+Currently this step will require building Ninja from sources.
+
+To take advantage of using Clang Tools along with Ninja build you need
+at least CMake 2.8.9.
+
+Clone the Ninja git repository and build Ninja from sources:
+
+.. code-block:: console
+
+ $ git clone git://github.com/martine/ninja.git
+ $ cd ninja/
+ $ ./bootstrap.py
+
+This will result in a single binary ``ninja`` in the current directory.
+It doesn't require installation and can just be copied to any location
+inside ``$PATH``, say ``/usr/local/bin/``:
+
+.. code-block:: console
+
+ $ sudo cp ninja /usr/local/bin/
+ $ sudo chmod a+rx /usr/local/bin/ninja
+
+After doing all of this, you'll need to generate Ninja build files for
+LLVM with CMake. You need to make a build directory and run CMake from
+it:
+
+.. code-block:: console
+
+ $ mkdir your/build/directory
+ $ cd your/build/directory
+ $ cmake -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON path/to/llvm/sources
+
+If you want to use clang instead of GCC, you can add
+``-DCMAKE_C_COMPILER=/path/to/clang -DCMAKE_CXX_COMPILER=/path/to/clang++``.
+You can also use ``ccmake``, which provides a curses interface to configure
+CMake variables in an interactive manner.
+
+As a result, the new ``compile_commands.json`` file should appear in the
+current directory. You should link it to the LLVM source tree so that
+Clang Tooling is able to use it:
+
+.. code-block:: console
+
+ $ ln -s $PWD/compile_commands.json path/to/llvm/source/
+
+Now you are ready to build and test LLVM using Ninja:
+
+.. code-block:: console
+
+ $ ninja check-all
+
+Other target names can be used in the same way as with make.
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/InternalsManual.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/InternalsManual.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/InternalsManual.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/InternalsManual.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,1953 @@
+============================
+"Clang" CFE Internals Manual
+============================
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+This document describes some of the more important APIs and internal design
+decisions made in the Clang C front-end. The purpose of this document is to
+both capture some of this high level information and also describe some of the
+design decisions behind it. This is meant for people interested in hacking on
+Clang, not for end-users. The description below is categorized by libraries,
+and does not describe any of the clients of the libraries.
+
+LLVM Support Library
+====================
+
+The LLVM ``libSupport`` library provides many underlying libraries and
+`data-structures <http://llvm.org/docs/ProgrammersManual.html>`_, including
+command line option processing, various containers and a system abstraction
+layer, which is used for file system access.
+
+The Clang "Basic" Library
+=========================
+
+This library certainly needs a better name. The "basic" library contains a
+number of low-level utilities for tracking and manipulating source buffers,
+locations within the source buffers, diagnostics, tokens, target abstraction,
+and information about the subset of the language being compiled for.
+
+Part of this infrastructure is specific to C (such as the ``TargetInfo``
+class), while other parts could be reused for non-C-based languages
+(``SourceLocation``, ``SourceManager``, ``Diagnostics``, ``FileManager``).
+When and if there is future demand we can figure out if it makes sense to
+introduce a new library, move the general classes somewhere else, or introduce
+some other solution.
+
+We describe the roles of these classes in order of their dependencies.
+
+The Diagnostics Subsystem
+-------------------------
+
+The Clang Diagnostics subsystem is an important part of how the compiler
+communicates with the human. Diagnostics are the warnings and errors produced
+when the code is incorrect or dubious. In Clang, each diagnostic produced has
+(at the minimum) a unique ID, an English translation associated with it, a
+:ref:`SourceLocation <SourceLocation>` to "put the caret", and a severity
+(e.g., ``WARNING`` or ``ERROR``). They can also optionally include a number of
+arguments to the diagnostic (which fill in "%0"'s in the string) as well as a
+number of source ranges that relate to the diagnostic.
+
+In this section, we'll be giving examples produced by the Clang command line
+driver, but diagnostics can be :ref:`rendered in many different ways
+<DiagnosticClient>` depending on how the ``DiagnosticClient`` interface is
+implemented. A representative example of a diagnostic is:
+
+.. code-block:: c++
+
+ t.c:38:15: error: invalid operands to binary expression ('int *' and '_Complex float')
+ P = (P-42) + Gamma*4;
+ ~~~~~~ ^ ~~~~~~~
+
+In this example, you can see the English translation, the severity (error),
+the source location (the caret ("``^``") and file/line/column info), the
+source ranges ("``~~~~``"), and the arguments to the diagnostic ("``int *``"
+and "``_Complex float``"). You'll have to believe me that there is a unique ID
+backing the diagnostic :).
+
+Getting all of this to happen has several steps and involves many moving
+pieces. This section describes them and talks about best practices when adding
+a new diagnostic.
+
+The ``Diagnostic*Kinds.td`` files
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Diagnostics are created by adding an entry to one of the
+``clang/Basic/Diagnostic*Kinds.td`` files, depending on what library will be
+using it. From this file, :program:`tblgen` generates the unique ID of the
+diagnostic, the severity of the diagnostic and the English translation + format
+string.
+
+There is little sanity with the naming of the unique ID's right now. Some
+start with ``err_``, ``warn_``, ``ext_`` to encode the severity into the name.
+Since the enum is referenced in the C++ code that produces the diagnostic, it
+is somewhat useful for it to be reasonably short.
+
+The severity of the diagnostic comes from the set {``NOTE``, ``REMARK``,
+``WARNING``,
+``EXTENSION``, ``EXTWARN``, ``ERROR``}. The ``ERROR`` severity is used for
+diagnostics indicating the program is never acceptable under any circumstances.
+When an error is emitted, the AST for the input code may not be fully built.
+The ``EXTENSION`` and ``EXTWARN`` severities are used for extensions to the
+language that Clang accepts. This means that Clang fully understands and can
+represent them in the AST, but we produce diagnostics to tell the user their
+code is non-portable. The difference is that the former are ignored by
+default, and the latter warn by default. The ``WARNING`` severity is used for
+constructs that are valid in the currently selected source language but that
+are dubious in some way. The ``REMARK`` severity provides generic information
+about the compilation that is not necessarily related to any dubious code. The
+``NOTE`` level is used to staple more information onto previous diagnostics.
+
+These *severities* are mapped into a smaller set (the ``Diagnostic::Level``
+enum, {``Ignored``, ``Note``, ``Remark``, ``Warning``, ``Error``, ``Fatal``}) of
+output *levels* by the diagnostics subsystem based on various configuration
+options. Clang internally supports a fully fine grained mapping mechanism that
+allows you to map almost any diagnostic to the output level that you want. The
+only diagnostics that cannot be mapped are ``NOTE``\ s, which always follow the
+severity of the previously emitted diagnostic, and ``ERROR``\ s, which can only
+be mapped to ``Fatal`` (it is not possible to turn an error into a warning, for
+example).
+
+Diagnostic mappings are used in many ways. For example, if the user specifies
+``-pedantic``, ``EXTENSION`` maps to ``Warning``; if they specify
+``-pedantic-errors``, it turns into ``Error``. The same mechanism is used to
+implement options like ``-Wunused-macros``, ``-Wundef`` etc.
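+
+The mapping from severities to levels can be pictured with a small standalone
+sketch. This is not Clang's implementation: ``MappingOptions`` and
+``mapSeverity`` are hypothetical names, only the two ``EXTENSION`` rules and
+the ignored-by-default behavior come from the text above, and mapping every
+other severity to its same-named level is a simplifying assumption.
+
+.. code-block:: c++
+
+  // Toy model of the severity-to-level mapping; names are hypothetical.
+  enum class Severity { Note, Remark, Warning, Extension, ExtWarn, Error };
+  enum class Level { Ignored, Note, Remark, Warning, Error, Fatal };
+
+  struct MappingOptions {
+    bool Pedantic = false;        // models -pedantic
+    bool PedanticErrors = false;  // models -pedantic-errors
+  };
+
+  Level mapSeverity(Severity S, const MappingOptions &Opts) {
+    switch (S) {
+    case Severity::Note:    return Level::Note;     // assumption: same-named level
+    case Severity::Remark:  return Level::Remark;   // assumption: same-named level
+    case Severity::Warning: return Level::Warning;
+    case Severity::ExtWarn: return Level::Warning;  // warns by default
+    case Severity::Error:   return Level::Error;    // never lowered
+    case Severity::Extension:
+      if (Opts.PedanticErrors)
+        return Level::Error;
+      if (Opts.Pedantic)
+        return Level::Warning;
+      return Level::Ignored;                        // ignored by default
+    }
+    return Level::Ignored;  // not reached; keeps compilers quiet
+  }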
+
+Mapping to ``Fatal`` should only be used for diagnostics that are considered so
+severe that error recovery won't be able to recover sensibly from them (thus
+spewing a ton of bogus errors). One example of this class of error is failure
+to ``#include`` a file.
+
+The Format String
+^^^^^^^^^^^^^^^^^
+
+The format string for the diagnostic is very simple, but it has some power. It
+takes the form of a string in English with markers that indicate where and how
+arguments to the diagnostic are inserted and formatted. For example, here are
+some simple format strings:
+
+.. code-block:: c++
+
+ "binary integer literals are an extension"
+ "format string contains '\\0' within the string body"
+ "more '%%' conversions than data arguments"
+ "invalid operands to binary expression (%0 and %1)"
+ "overloaded '%0' must be a %select{unary|binary|unary or binary}2 operator"
+ " (has %1 parameter%s1)"
+
+These examples show some important points of format strings. You can use any
+plain ASCII character in the diagnostic string except "``%``" without a
+problem, but these are C strings, so you have to use and be aware of all the C
+escape sequences (as in the second example). If you want to produce a "``%``"
+in the output, use the "``%%``" escape sequence, like the third diagnostic.
+Finally, Clang uses the "``%...[digit]``" sequences to specify where and how
+arguments to the diagnostic are formatted.
+
+Arguments to the diagnostic are numbered according to how they are specified by
+the C++ code that :ref:`produces them <internals-producing-diag>`, and are
+referenced by ``%0`` .. ``%9``. If you have more than 10 arguments to your
+diagnostic, you are doing something wrong :). Unlike ``printf``, there is no
+requirement that arguments to the diagnostic end up in the output in the same
+order as they are specified; you could have a format string with "``%1 %0``"
+that swaps them, for example. The text in between the percent and digit gives
+the formatting instructions. If there are no instructions, the argument is just
+turned into a string and substituted in.
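+
+As a rough illustration of the numbering rule, the following standalone sketch
+expands only "``%%``" and plain "``%N``" markers. It is not Clang's formatter:
+``expandPlaceholders`` is a hypothetical helper, and it assumes every argument
+has already been turned into a string.
+
+.. code-block:: c++
+
+  #include <cctype>
+  #include <cstddef>
+  #include <string>
+  #include <vector>
+
+  // Toy expander: handles "%%" and "%0".."%9" with no formatting instructions.
+  std::string expandPlaceholders(const std::string &Format,
+                                 const std::vector<std::string> &Args) {
+    std::string Out;
+    for (std::size_t I = 0; I < Format.size(); ++I) {
+      char C = Format[I];
+      if (C != '%' || I + 1 == Format.size()) {
+        Out += C;  // ordinary character, or a lone '%' at the very end
+        continue;
+      }
+      char Next = Format[++I];
+      if (Next == '%') {
+        Out += '%';  // "%%" produces a literal percent sign
+      } else if (std::isdigit(static_cast<unsigned char>(Next)) &&
+                 static_cast<std::size_t>(Next - '0') < Args.size()) {
+        Out += Args[Next - '0'];  // "%N" selects the N-th argument
+      } else {
+        Out += '%';  // unknown sequence or missing argument: keep verbatim
+        Out += Next;
+      }
+    }
+    return Out;
+  }
+
+With the arguments ``{"a", "b"}``, the format string "``%1 %0``" expands to
+"``b a``", which shows that output order is independent of argument order.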
+
+Here are some "best practices" for writing the English format string:
+
+* Keep the string short. It should ideally fit in the 80 column limit of the
+ ``DiagnosticKinds.td`` file. This avoids the diagnostic wrapping when
+ printed, and forces you to think about the important point you are conveying
+ with the diagnostic.
+* Take advantage of location information. The user will be able to see the
+ line and location of the caret, so you don't need to tell them that the
+ problem is with the 4th argument to the function: just point to it.
+* Do not capitalize the diagnostic string, and do not end it with a period.
+* If you need to quote something in the diagnostic string, use single quotes.
+
+Diagnostics should never take random English strings as arguments: you
+shouldn't use "``you have a problem with %0``" and pass in things like "``your
+argument``" or "``your return value``" as arguments. Doing this prevents
+:ref:`translating <internals-diag-translation>` the Clang diagnostics to other
+languages (because they'll get random English words in their otherwise
+localized diagnostic). The exceptions to this are C/C++ language keywords
+(e.g., ``auto``, ``const``, ``mutable``, etc) and C/C++ operators (``/=``).
+Note that things like "pointer" and "reference" are not keywords. On the other
+hand, you *can* include anything that comes from the user's source code,
+including variable names, types, labels, etc. The "``select``" format can be
+used to achieve this sort of thing in a localizable way, see below.
+
+Formatting a Diagnostic Argument
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Arguments to diagnostics are fully typed internally, and come from several
+different classes: integers, types, names, and random strings. Depending on
+the class of the argument, it can be optionally formatted in different ways.
+This gives the ``DiagnosticClient`` information about what the argument means
+without requiring it to use a specific presentation (consider this MVC for
+Clang :).
+
+Here are the different diagnostic argument formats currently supported by
+Clang:
+
+**"s" format**
+
+Example:
+ ``"requires %1 parameter%s1"``
+Class:
+ Integers
+Description:
+ This is a simple formatter for integers that is useful when producing English
+ diagnostics. When the integer is 1, it prints as nothing. When the integer
+ is not 1, it prints as "``s``". This allows some simple grammatical forms to
+ be handled correctly, and eliminates the need to use gross things like
+ ``"requires %1 parameter(s)"``.
+
+**"select" format**
+
+Example:
+ ``"must be a %select{unary|binary|unary or binary}2 operator"``
+Class:
+ Integers
+Description:
+ This format specifier is used to merge multiple related diagnostics together
+ into one common one, without requiring the difference to be specified as an
+ English string argument. Instead of specifying the string, the diagnostic
+ gets an integer argument and the format string selects the numbered option.
+ In this case, the "``%2``" value must be an integer in the range [0..2]. If
+ it is 0, it prints "unary"; if it is 1, it prints "binary"; if it is 2, it
+ prints "unary or binary". This allows other language translations to
+ substitute reasonable words (or entire phrases) based on the semantics of the
+ diagnostic instead of having to do things textually. The selected string
+ does undergo formatting.
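+
+The behavior of the "s" and "select" formats can be approximated by two small
+functions. This is a sketch under stated assumptions, not Clang's code:
+``pluralSuffix`` and ``selectOption`` are hypothetical names, and the sketch
+returns the chosen alternative as-is instead of formatting it further.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <string>
+
+  // "s" format: nothing for exactly 1, "s" for every other value.
+  std::string pluralSuffix(unsigned Count) { return Count == 1 ? "" : "s"; }
+
+  // "select" format: returns the Index-th '|'-separated alternative.
+  std::string selectOption(const std::string &Options, unsigned Index) {
+    std::size_t Start = 0;
+    for (unsigned I = 0; I < Index; ++I) {
+      std::size_t Bar = Options.find('|', Start);
+      if (Bar == std::string::npos)
+        return "";  // index out of range in this sketch
+      Start = Bar + 1;
+    }
+    std::size_t End = Options.find('|', Start);
+    if (End == std::string::npos)
+      return Options.substr(Start);
+    return Options.substr(Start, End - Start);
+  }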
+
+**"plural" format**
+
+Example:
+ ``"you have %1 %plural{1:mouse|:mice}1 connected to your computer"``
+Class:
+ Integers
+Description:
+ This is a formatter for complex plural forms. It is designed to handle even
+ the requirements of languages with very complex plural forms, as many Baltic
+ languages have. The argument consists of a series of expression/form pairs
+ separated by "|", each written as ``expression:form``, where the first form
+ whose expression evaluates to true is the result of the modifier.
+
+ An expression can be empty, in which case it is always true. See the example
+ at the top. Otherwise, it is a series of one or more numeric conditions,
+ separated by ",". If any condition matches, the expression matches. Each
+ numeric condition can take one of three forms.
+
+ * number: A simple decimal number matches if the argument is the same as the
+ number. Example: ``"%plural{1:mouse|:mice}4"``
+ * range: A range in square brackets matches if the argument is within the
+ range. The range is inclusive on both ends. Example:
+ ``"%plural{0:none|1:one|[2,5]:some|:many}2"``
+ * modulo: A modulo operator is followed by a number, an equals sign and
+ either a number or a range. The tests are the same as for plain numbers
+ and ranges, but the argument is taken modulo the number first. Example:
+ ``"%plural{%100=0:even hundred|%100=[1,50]:lower half|:everything else}1"``
+
+ The parser is very unforgiving. A syntax error, even whitespace, will abort,
+ as will a failure to match the argument against any expression.
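+
+To make the matching rules concrete, here is a minimal standalone evaluator
+for the text between the braces of a "plural" specifier. It is a sketch, not
+Clang's parser: all function names are hypothetical, it assumes well-formed
+input, and it returns an empty string where the real parser would abort.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <string>
+
+  // Parses a decimal number at S[Pos], advancing Pos past it.
+  unsigned parseNumber(const std::string &S, std::size_t &Pos) {
+    unsigned N = 0;
+    while (Pos < S.size() && S[Pos] >= '0' && S[Pos] <= '9')
+      N = N * 10 + static_cast<unsigned>(S[Pos++] - '0');
+    return N;
+  }
+
+  // Tests Value against "number" or "[low,high]" at S[Pos], advancing Pos.
+  bool matchNumberOrRange(const std::string &S, std::size_t &Pos,
+                          unsigned Value) {
+    if (Pos < S.size() && S[Pos] == '[') {
+      ++Pos;  // skip '['
+      unsigned Low = parseNumber(S, Pos);
+      ++Pos;  // skip ','
+      unsigned High = parseNumber(S, Pos);
+      ++Pos;  // skip ']'
+      return Low <= Value && Value <= High;  // inclusive on both ends
+    }
+    return parseNumber(S, Pos) == Value;
+  }
+
+  // Tests Value against a comma-separated list of numeric conditions.
+  bool matchExpression(const std::string &Expr, unsigned Value) {
+    if (Expr.empty())
+      return true;  // an empty expression is always true
+    std::size_t Pos = 0;
+    bool Matched = false;
+    while (Pos < Expr.size()) {
+      unsigned Tested = Value;
+      if (Expr[Pos] == '%') {  // modulo form: %N=condition
+        ++Pos;
+        unsigned Mod = parseNumber(Expr, Pos);
+        ++Pos;  // skip '='
+        Tested = Mod ? Value % Mod : Value;
+      }
+      if (matchNumberOrRange(Expr, Pos, Tested))
+        Matched = true;  // any matching condition is enough
+      ++Pos;  // skip ',' between conditions
+    }
+    return Matched;
+  }
+
+  // Returns the first form whose expression matches, or "" if none does.
+  std::string selectPlural(const std::string &Spec, unsigned Value) {
+    std::size_t Start = 0;
+    while (Start <= Spec.size()) {
+      std::size_t End = Spec.find('|', Start);
+      if (End == std::string::npos)
+        End = Spec.size();
+      std::string Pair = Spec.substr(Start, End - Start);
+      std::size_t Colon = Pair.find(':');
+      if (Colon != std::string::npos &&
+          matchExpression(Pair.substr(0, Colon), Value))
+        return Pair.substr(Colon + 1);
+      Start = End + 1;
+    }
+    return "";
+  }
+
+For instance, ``selectPlural("0:none|1:one|[2,5]:some|:many", 4)`` yields
+``"some"``, because the range condition is the first one that matches.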
+
+**"ordinal" format**
+
+Example:
+ ``"ambiguity in %ordinal0 argument"``
+Class:
+ Integers
+Description:
+ This is a formatter which represents the argument number as an ordinal: the
+ value ``1`` becomes ``1st``, ``3`` becomes ``3rd``, and so on. Values less
+ than ``1`` are not supported. This formatter is currently hard-coded to use
+ English ordinals.
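+
+A hard-coded English ordinal formatter might look like the sketch below. The
+function name ``toOrdinal`` is hypothetical, and the handling of 11 through 13
+follows ordinary English usage rather than anything stated above.
+
+.. code-block:: c++
+
+  #include <string>
+
+  // Formats N (expected to be at least 1) as an English ordinal.
+  std::string toOrdinal(unsigned N) {
+    const char *Suffix = "th";
+    unsigned LastTwo = N % 100;
+    if (LastTwo < 11 || LastTwo > 13) {  // 11th, 12th and 13th keep "th"
+      switch (N % 10) {
+      case 1: Suffix = "st"; break;
+      case 2: Suffix = "nd"; break;
+      case 3: Suffix = "rd"; break;
+      default: break;
+      }
+    }
+    return std::to_string(N) + Suffix;
+  }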
+
+**"objcclass" format**
+
+Example:
+ ``"method %objcclass0 not found"``
+Class:
+ ``DeclarationName``
+Description:
+ This is a simple formatter that indicates the ``DeclarationName`` corresponds
+ to an Objective-C class method selector. As such, it prints the selector
+ with a leading "``+``".
+
+**"objcinstance" format**
+
+Example:
+ ``"method %objcinstance0 not found"``
+Class:
+ ``DeclarationName``
+Description:
+ This is a simple formatter that indicates the ``DeclarationName`` corresponds
+ to an Objective-C instance method selector. As such, it prints the selector
+ with a leading "``-``".
+
+**"q" format**
+
+Example:
+ ``"candidate found by name lookup is %q0"``
+Class:
+ ``NamedDecl *``
+Description:
+ This formatter indicates that the fully-qualified name of the declaration
+ should be printed, e.g., "``std::vector``" rather than "``vector``".
+
+**"diff" format**
+
+Example:
+ ``"no known conversion %diff{from $ to $|from argument type to parameter type}1,2"``
+Class:
+ ``QualType``
+Description:
+ This formatter takes two ``QualType``\ s and attempts to print a template
+ difference between the two. If tree printing is off, the text inside the
+ braces before the pipe is printed, with the formatted text replacing the $.
+ If tree printing is on, the text after the pipe is printed and a type tree is
+ printed after the diagnostic message.
+
+It is really easy to add format specifiers to the Clang diagnostics system, but
+they should be discussed before they are added. If you are creating a lot of
+repetitive diagnostics and/or have an idea for a useful formatter, please bring
+it up on the cfe-dev mailing list.
+
+.. _internals-producing-diag:
+
+Producing the Diagnostic
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Now that you've created the diagnostic in the ``Diagnostic*Kinds.td`` file, you
+need to write the code that detects the condition in question and emits the new
+diagnostic. Various components of Clang (e.g., the preprocessor, ``Sema``,
+etc.) provide a helper function named "``Diag``". It creates a diagnostic and
+accepts the arguments, ranges, and other information that goes along with it.
+
+For example, the binary expression error comes from code like this:
+
+.. code-block:: c++
+
+ if (various things that are bad)
+ Diag(Loc, diag::err_typecheck_invalid_operands)
+ << lex->getType() << rex->getType()
+ << lex->getSourceRange() << rex->getSourceRange();
+
+This shows the use of the ``Diag`` method: it takes a location (a
+:ref:`SourceLocation <SourceLocation>` object) and a diagnostic enum value
+(which matches the name from ``Diagnostic*Kinds.td``). If the diagnostic takes
+arguments, they are specified with the ``<<`` operator: the first argument
+becomes ``%0``, the second becomes ``%1``, etc. The diagnostic interface
+allows you to specify arguments of many different types, including ``int`` and
+``unsigned`` for integer arguments, ``const char*`` and ``std::string`` for
+string arguments, ``DeclarationName`` and ``const IdentifierInfo *`` for names,
+``QualType`` for types, etc. ``SourceRange``\ s are also specified with the
+``<<`` operator, but do not have a specific ordering requirement.
+
+As you can see, adding and producing a diagnostic is pretty straightforward.
+The hard part is deciding exactly what you need to say to help the user,
+picking a suitable wording, and providing the information needed to format it
+correctly. The good news is that the call site that issues a diagnostic should
+be completely independent of how the diagnostic is formatted and in what
+language it is rendered.
+
+Fix-It Hints
+^^^^^^^^^^^^
+
+In some cases, the front end emits diagnostics when it is clear that some small
+change to the source code would fix the problem. For example, a missing
+semicolon at the end of a statement or a use of deprecated syntax that is
+easily rewritten into a more modern form. Clang tries very hard to emit the
+diagnostic and recover gracefully in these and other cases.
+
+However, for these cases where the fix is obvious, the diagnostic can be
+annotated with a hint (referred to as a "fix-it hint") that describes how to
+change the code referenced by the diagnostic to fix the problem. For example,
+it might add the missing semicolon at the end of the statement or rewrite the
+use of a deprecated construct into something more palatable. Here is one such
+example from the C++ front end, where we warn about the right-shift operator
+changing meaning from C++98 to C++11:
+
+.. code-block:: c++
+
+ test.cpp:3:7: warning: use of right-shift operator ('>>') in template argument
+ will require parentheses in C++11
+ A<100 >> 2> *a;
+ ^
+ ( )
+
+Here, the fix-it hint is suggesting that parentheses be added, and showing
+exactly where those parentheses would be inserted into the source code. The
+fix-it hints themselves describe what changes to make to the source code in an
+abstract manner, which the text diagnostic printer renders as a line of
+"insertions" below the caret line. :ref:`Other diagnostic clients
+<DiagnosticClient>` might choose to render the code differently (e.g., as
+markup inline) or even give the user the ability to automatically fix the
+problem.
+
+Fix-it hints on errors and warnings need to obey these rules:
+
+* Since they are automatically applied if ``-Xclang -fixit`` is passed to the
+ driver, they should only be used when it's very likely they match the user's
+ intent.
+* Clang must recover from errors as if the fix-it had been applied.
+
+If a fix-it can't obey these rules, put the fix-it on a note. Fix-its on notes
+are not applied automatically.
+
+All fix-it hints are described by the ``FixItHint`` class, instances of which
+should be attached to the diagnostic using the ``<<`` operator in the same way
+that highlighted source ranges and arguments are passed to the diagnostic.
+Fix-it hints can be created with one of three constructors:
+
+* ``FixItHint::CreateInsertion(Loc, Code)``
+
+ Specifies that the given ``Code`` (a string) should be inserted before the
+ source location ``Loc``.
+
+* ``FixItHint::CreateRemoval(Range)``
+
+ Specifies that the code in the given source ``Range`` should be removed.
+
+* ``FixItHint::CreateReplacement(Range, Code)``
+
+ Specifies that the code in the given source ``Range`` should be removed,
+ and replaced with the given ``Code`` string.
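+
+The effect of the three kinds of hint can be modeled on a plain string. The
+sketch below is a toy stand-in, not the ``FixItHint`` class itself:
+``ToyFixIt`` and ``applyFixIt`` are hypothetical names, and byte offsets
+replace the source locations and ranges that the real constructors take.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <string>
+  #include <utility>
+
+  // Toy hint: remove Length bytes at Begin, then insert Code there.
+  struct ToyFixIt {
+    std::size_t Begin;
+    std::size_t Length;  // 0 for a pure insertion
+    std::string Code;    // empty for a pure removal
+
+    static ToyFixIt createInsertion(std::size_t Loc, std::string Code) {
+      return {Loc, 0, std::move(Code)};
+    }
+    static ToyFixIt createRemoval(std::size_t Begin, std::size_t Length) {
+      return {Begin, Length, ""};
+    }
+    static ToyFixIt createReplacement(std::size_t Begin, std::size_t Length,
+                                      std::string Code) {
+      return {Begin, Length, std::move(Code)};
+    }
+  };
+
+  // Returns a copy of Source with the hint applied.
+  std::string applyFixIt(std::string Source, const ToyFixIt &Hint) {
+    Source.replace(Hint.Begin, Hint.Length, Hint.Code);
+    return Source;
+  }
+
+Applying ``ToyFixIt::createInsertion(9, ";")`` to ``"int x = 1"`` gives
+``"int x = 1;"``, the missing-semicolon case mentioned earlier.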
+
+.. _DiagnosticClient:
+
+The ``DiagnosticClient`` Interface
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Once code generates a diagnostic with all of the arguments and the rest of the
+relevant information, Clang needs to know what to do with it. As previously
+mentioned, the diagnostic machinery goes through some filtering to map a
+severity onto a diagnostic level, then (assuming the diagnostic is not mapped
+to "``Ignored``") it invokes an object that implements the ``DiagnosticClient``
+interface with the information.
+
+It is possible to implement this interface in many different ways. For
+example, the normal Clang ``DiagnosticClient`` (named
+``TextDiagnosticPrinter``) turns the arguments into strings (according to the
+various formatting rules), prints out the file/line/column information and the
+string, then prints out the line of code, the source ranges, and the caret.
+However, this behavior isn't required.
+
+Another implementation of the ``DiagnosticClient`` interface is the
+``TextDiagnosticBuffer`` class, which is used when Clang is in ``-verify``
+mode. Instead of formatting and printing out the diagnostics, this
+implementation just captures and remembers the diagnostics as they fly by.
+Then ``-verify`` compares the list of produced diagnostics to the list of
+expected ones. If they disagree, it prints out its own output. Full
+documentation for the ``-verify`` mode can be found in the Clang API
+documentation for `VerifyDiagnosticConsumer
+</doxygen/classclang_1_1VerifyDiagnosticConsumer.html#details>`_.
+
+There are many other possible implementations of this interface, and this is
+why we prefer diagnostics to pass down rich structured information in
+arguments. For example, an HTML output might want declaration names to be
+linkified to where they come from in the source. Another example is that a GUI
+might let you click on typedefs to expand them. This application would want to
+pass significantly more information about types through to the GUI than a
+simple flat string. The interface allows this to happen.
+
+.. _internals-diag-translation:
+
+Adding Translations to Clang
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Not possible yet! Diagnostic strings should be written in UTF-8; the client can
+translate to the relevant code page if needed. Each translation completely
+replaces the format string for the diagnostic.
+
+.. _SourceLocation:
+.. _SourceManager:
+
+The ``SourceLocation`` and ``SourceManager`` classes
+----------------------------------------------------
+
+Strangely enough, the ``SourceLocation`` class represents a location within the
+source code of the program. Important design points include:
+
+#. ``sizeof(SourceLocation)`` must be extremely small, as these are embedded
+ into many AST nodes and are passed around often. Currently it is 32 bits.
+#. ``SourceLocation`` must be a simple value object that can be efficiently
+ copied.
+#. We should be able to represent a source location for any byte of any input
+ file. This includes in the middle of tokens, in whitespace, in trigraphs,
+ etc.
+#. A ``SourceLocation`` must encode the current ``#include`` stack that was
+ active when the location was processed. For example, if the location
+ corresponds to a token, it should contain the set of ``#include``\ s active
+ when the token was lexed. This allows us to print the ``#include`` stack
+ for a diagnostic.
+#. ``SourceLocation`` must be able to describe macro expansions, capturing both
+ the ultimate instantiation point and the source of the original character
+ data.
+
+In practice, the ``SourceLocation`` works together with the ``SourceManager``
+class to encode two pieces of information about a location: its spelling
+location and its instantiation location. For most tokens, these will be the
+same. However, for a macro expansion (or tokens that came from a ``_Pragma``
+directive) these will describe the location of the characters corresponding to
+the token and the location where the token was used (i.e., the macro
+instantiation point or the location of the ``_Pragma`` itself).
+
+The Clang front-end inherently depends on the location of a token being tracked
+correctly. If it is ever incorrect, the front-end may get confused and die.
+The reason for this is that the notion of the "spelling" of a ``Token`` in
+Clang depends on being able to find the original input characters for the
+token. This concept maps directly to the "spelling location" for the token.
+
+``SourceRange`` and ``CharSourceRange``
+---------------------------------------
+
+.. mostly taken from http://lists.cs.uiuc.edu/pipermail/cfe-dev/2010-August/010595.html
+
+Clang represents most source ranges by [first, last], where "first" and "last"
+each point to the beginning of their respective tokens. For example consider
+the ``SourceRange`` of the following statement:
+
+.. code-block:: c++
+
+ x = foo + bar;
+ ^first ^last
+
+To map from this representation to a character-based representation, the "last"
+location needs to be adjusted to point to (or past) the end of that token with
+either ``Lexer::MeasureTokenLength()`` or ``Lexer::getLocForEndOfToken()``. For
+the rare cases where character-level source range information is needed, we
+use the ``CharSourceRange`` class.
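+
+The adjustment from a token range to a character range can be sketched with
+byte offsets into a string. This is an illustration only: ``measureIdentifier``
+and ``toCharRange`` are hypothetical helpers, and the measuring step assumes
+the last token is an identifier, which the real lexer does not.
+
+.. code-block:: c++
+
+  #include <cctype>
+  #include <cstddef>
+  #include <string>
+  #include <utility>
+
+  // Toy token measurement: counts identifier characters starting at Begin.
+  std::size_t measureIdentifier(const std::string &Src, std::size_t Begin) {
+    std::size_t End = Begin;
+    while (End < Src.size() &&
+           (std::isalnum(static_cast<unsigned char>(Src[End])) ||
+            Src[End] == '_'))
+      ++End;
+    return End - Begin;
+  }
+
+  // Converts a token range [First, Last] (offsets of the first and last
+  // tokens' starts) into a half-open character range [First, End).
+  std::pair<std::size_t, std::size_t>
+  toCharRange(const std::string &Src, std::size_t First, std::size_t Last) {
+    return {First, Last + measureIdentifier(Src, Last)};
+  }
+
+For ``"x = foo + bar;"`` with the first token at offset 0 and the last at
+offset 10, the character range is [0, 13), which covers ``x = foo + bar``.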
+
+The Driver Library
+==================
+
+The clang Driver and library are documented :doc:`here <DriverInternals>`.
+
+Precompiled Headers
+===================
+
+Clang supports two implementations of precompiled headers. The default
+implementation, precompiled headers (:doc:`PCH <PCHInternals>`), uses a
+serialized representation of Clang's internal data structures, encoded with the
+`LLVM bitstream format <http://llvm.org/docs/BitCodeFormat.html>`_.
+Pretokenized headers (:doc:`PTH <PTHInternals>`), on the other hand, contain a
+serialized representation of the tokens encountered when preprocessing a header
+(and anything that header includes).
+
+The Frontend Library
+====================
+
+The Frontend library contains functionality useful for building tools on top of
+the Clang libraries, for example several methods for outputting diagnostics.
+
+The Lexer and Preprocessor Library
+==================================
+
+The Lexer library contains several tightly-connected classes that are involved
+with the nasty process of lexing and preprocessing C source code. The main
+interface to this library for outside clients is the large ``Preprocessor``
+class. It contains the various pieces of state that are required to coherently
+read tokens out of a translation unit.
+
+The core interface to the ``Preprocessor`` object (once it is set up) is the
+``Preprocessor::Lex`` method, which returns the next :ref:`Token <Token>` from
+the preprocessor stream. There are two types of token providers that the
+preprocessor is capable of reading from: a buffer lexer (provided by the
+:ref:`Lexer <Lexer>` class) and a buffered token stream (provided by the
+:ref:`TokenLexer <TokenLexer>` class).
+
+.. _Token:
+
+The Token class
+---------------
+
+The ``Token`` class is used to represent a single lexed token. Tokens are
+intended to be used by the lexer/preprocessor and parser libraries, but are not
+intended to live beyond them (for example, they should not live in the ASTs).
+
+Tokens most often live on the stack (or some other location that is efficient
+to access) as the parser is running, but occasionally do get buffered up. For
+example, macro definitions are stored as a series of tokens, and the C++
+front-end periodically needs to buffer tokens up for tentative parsing and
+various pieces of look-ahead. As such, the size of a ``Token`` matters. On a
+32-bit system, ``sizeof(Token)`` is currently 16 bytes.
+
+Tokens occur in two forms: :ref:`annotation tokens <AnnotationToken>` and
+normal tokens. Normal tokens are those returned by the lexer; annotation
+tokens represent semantic information and are produced by the parser, replacing
+normal tokens in the token stream. Normal tokens contain the following
+information:
+
+* **A SourceLocation** --- This indicates the location of the start of the
+ token.
+
+* **A length** --- This stores the length of the token as stored in the
+ ``SourceBuffer``. For tokens that include them, this length includes
+ trigraphs and escaped newlines which are ignored by later phases of the
+ compiler. By pointing into the original source buffer, it is always possible
+ to get the original spelling of a token completely accurately.
+
+* **IdentifierInfo** --- If a token takes the form of an identifier, and if
+ identifier lookup was enabled when the token was lexed (e.g., the lexer was
+ not reading in "raw" mode) this contains a pointer to the unique hash value
+ for the identifier. Because the lookup happens before keyword
+ identification, this field is set even for language keywords like "``for``".
+
+* **TokenKind** --- This indicates the kind of token as classified by the
+ lexer. This includes things like ``tok::starequal`` (for the "``*=``"
+ operator), ``tok::ampamp`` for the "``&&``" token, and keyword values (e.g.,
+ ``tok::kw_for``) for identifiers that correspond to keywords. Note that
+ some tokens can be spelled multiple ways. For example, C++ supports
+ "operator keywords", where things like "``and``" are treated exactly like the
+ "``&&``" operator. In these cases, the kind value is set to ``tok::ampamp``,
+ which is good for the parser, which doesn't have to consider both forms. For
+ something that cares about which form is used (e.g., the preprocessor
+ "stringize" operator) the spelling indicates the original form.
+
+* **Flags** --- There are currently four flags tracked by the
+ lexer/preprocessor system on a per-token basis:
+
+ #. **StartOfLine** --- This was the first token that occurred on its input
+ source line.
+ #. **LeadingSpace** --- There was a space character either immediately before
+ the token or transitively before the token as it was expanded through a
+ macro. The definition of this flag is dictated by the stringizing
+ requirements of the preprocessor.
+ #. **DisableExpand** --- This flag is used internally to the preprocessor to
+ represent identifier tokens which have macro expansion disabled. This
+ prevents them from ever being considered as candidates for macro
+ expansion.
+ #. **NeedsCleaning** --- This flag is set if the original spelling for the
+ token includes a trigraph or escaped newline. Since this is uncommon,
+ many pieces of code can fast-path on tokens that did not need cleaning.
+
+One interesting (and somewhat unusual) aspect of normal tokens is that they
+don't contain any semantic information about the lexed value. For example, if
+the token was a pp-number token, we do not represent the value of the number
+that was lexed (this is left for later pieces of code to decide).
+Additionally, the lexer library has no notion of typedef names vs variable
+names: both are returned as identifiers, and the parser is left to decide
+whether a specific identifier is a typedef or a variable (tracking this
+requires scope information among other things). The parser can do this
+translation by replacing tokens returned by the preprocessor with "Annotation
+Tokens".
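+
+As a rough illustration of the per-token flags listed above, the following is a
+minimal, self-contained sketch and not Clang's actual ``Token`` definition.
+The ``ToyToken`` type, its field widths, and the flag values are assumptions
+made for this example only.
+
+.. code-block:: c++
+
+  #include <cstdint>
+
+  // Hypothetical flag values; they only mirror the four flags named above.
+  enum ToyTokenFlags : std::uint8_t {
+    StartOfLine   = 0x01, // first token on its source line
+    LeadingSpace  = 0x02, // whitespace appeared before the token
+    DisableExpand = 0x04, // identifier must never be macro-expanded
+    NeedsCleaning = 0x08  // spelling contains a trigraph or escaped newline
+  };
+
+  // A toy token: location, length, kind, and a small flag bit set.
+  struct ToyToken {
+    std::uint32_t Loc = 0;    // stand-in for a SourceLocation
+    std::uint32_t Length = 0; // length in the source buffer
+    std::uint16_t Kind = 0;   // stand-in for a TokenKind value
+    std::uint8_t Flags = 0;   // bitwise OR of ToyTokenFlags
+
+    void setFlag(ToyTokenFlags F) { Flags |= F; }
+    void clearFlag(ToyTokenFlags F) {
+      Flags &= static_cast<std::uint8_t>(~F);
+    }
+    bool hasFlag(ToyTokenFlags F) const { return (Flags & F) != 0; }
+
+    // Tokens that did not need cleaning can take the fast path when their
+    // spelling is read straight out of the source buffer.
+    bool canFastPathSpelling() const { return !hasFlag(NeedsCleaning); }
+  };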
+
+.. _AnnotationToken:
+
+Annotation Tokens
+-----------------
+
+Annotation tokens are tokens that are synthesized by the parser and injected
+into the preprocessor's token stream (replacing existing tokens) to record
+semantic information found by the parser. For example, if "``foo``" is found
+to be a typedef, the "``foo``" ``tok::identifier`` token is replaced with a
+``tok::annot_typename``. This is useful for a couple of reasons: 1) this makes
+it easy to handle qualified type names (e.g., "``foo::bar::baz<42>::t``") in
+C++ as a single "token" in the parser. 2) if the parser backtracks, the
+reparse does not need to redo semantic analysis to determine whether a token
+sequence is a variable, type, template, etc.
+
+Annotation tokens are created by the parser and reinjected into the parser's
+token stream (when backtracking is enabled). Because they can only exist in
+tokens that the preprocessor-proper is done with, they don't need to keep
+around flags like "start of line" that the preprocessor uses to do its job.
+Additionally, an annotation token may "cover" a sequence of preprocessor tokens
+(e.g., "``a::b::c``" is five preprocessor tokens). As such, the valid fields
+of an annotation token are different than the fields for a normal token (but
+they are multiplexed into the normal ``Token`` fields):
+
+* **SourceLocation "Location"** --- The ``SourceLocation`` for the annotation
+ token indicates the first token replaced by the annotation token. In the
+ example above, it would be the location of the "``a``" identifier.
+* **SourceLocation "AnnotationEndLoc"** --- This holds the location of the last
+ token replaced with the annotation token. In the example above, it would be
+ the location of the "``c``" identifier.
+* **void* "AnnotationValue"** --- This contains an opaque object that the
+ parser gets from ``Sema``. The parser merely preserves the information for
+ ``Sema`` to later interpret based on the annotation token kind.
+* **TokenKind "Kind"** --- This indicates the kind of Annotation token this is.
+ See below for the different valid kinds.
+
+Annotation tokens currently come in three kinds:
+
+#. **tok::annot_typename**: This annotation token represents a resolved
+ typename token that is potentially qualified. The ``AnnotationValue`` field
+ contains the ``QualType`` returned by ``Sema::getTypeName()``, possibly with
+ source location information attached.
+#. **tok::annot_cxxscope**: This annotation token represents a C++ scope
+ specifier, such as "``A::B::``". This corresponds to the grammar
+ productions "*::*" and "*:: [opt] nested-name-specifier*". The
+ ``AnnotationValue`` pointer is a ``NestedNameSpecifier *`` returned by the
+ ``Sema::ActOnCXXGlobalScopeSpecifier`` and
+ ``Sema::ActOnCXXNestedNameSpecifier`` callbacks.
+#. **tok::annot_template_id**: This annotation token represents a C++
+ template-id such as "``foo<int, 4>``", where "``foo``" is the name of a
+ template. The ``AnnotationValue`` pointer is a pointer to a ``malloc``'d
+ ``TemplateIdAnnotation`` object. Depending on the context, a parsed
+ template-id that names a type might become a typename annotation token (if
+ all we care about is the named type, e.g., because it occurs in a type
+ specifier) or might remain a template-id token (if we want to retain more
+ source location information or produce a new type, e.g., in a declaration of
+ a class template specialization). template-id annotation tokens that refer
+ to a type can be "upgraded" to typename annotation tokens by the parser.
+
+As mentioned above, annotation tokens are not returned by the preprocessor;
+they are formed on demand by the parser. This means that the parser has to be
+aware of cases where an annotation could occur and form it where appropriate.
+This is somewhat similar to how the parser handles Translation Phase 6 of C99:
+String Concatenation (see C99 5.1.1.2). In the case of string concatenation,
+the preprocessor just returns distinct ``tok::string_literal`` and
+``tok::wide_string_literal`` tokens and the parser eats a sequence of them
+wherever the grammar indicates that a string literal can occur.
+
+In order to do this, whenever the parser expects a ``tok::identifier`` or
+``tok::coloncolon``, it should call the ``TryAnnotateTypeOrScopeToken`` or
+``TryAnnotateCXXScopeToken`` methods to form the annotation token. These
+methods will maximally form the specified annotation tokens and replace the
+current token with them, if applicable. If the current token is not valid for
+an annotation token, it will remain an identifier or "``::``" token.
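+
+The way an annotation token "covers" a run of preprocessor tokens can be
+sketched as follows. This is a toy model, not the parser's real code:
+``ToyToken``, ``ToyKind``, and ``annotateRange`` are hypothetical names, and
+locations are plain integers rather than ``SourceLocation`` values.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <string>
+  #include <vector>
+
+  enum class ToyKind { Identifier, ColonColon, AnnotScope };
+
+  struct ToyToken {
+    ToyKind Kind;
+    std::string Spelling;        // empty for annotation tokens
+    unsigned Loc;                // location of the first covered token
+    unsigned AnnotationEndLoc;   // location of the last covered token
+    const void *AnnotationValue; // opaque payload owned by someone else
+  };
+
+  // Replace Toks[First..Last] (inclusive) with one annotation token that
+  // remembers where the covered range started and ended.
+  inline void annotateRange(std::vector<ToyToken> &Toks, std::size_t First,
+                            std::size_t Last, const void *Value) {
+    ToyToken Annot{ToyKind::AnnotScope, "", Toks[First].Loc, Toks[Last].Loc,
+                   Value};
+    Toks.erase(Toks.begin() + First, Toks.begin() + Last + 1);
+    Toks.insert(Toks.begin() + First, Annot);
+  }
+
+Applied to the five tokens of "``a::b::c``", the call leaves a single token
+whose start location is that of "``a``" and whose end location is that of
+"``c``".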
+
+.. _Lexer:
+
+The ``Lexer`` class
+-------------------
+
+The ``Lexer`` class provides the mechanics of lexing tokens out of a source
+buffer and deciding what they mean. The ``Lexer`` is complicated by the fact
+that it operates on raw buffers that have not had spelling eliminated (this is
+a necessity to get decent performance), but this is countered with careful
+coding as well as standard performance techniques (for example, the comment
+handling code is vectorized on X86 and PowerPC hosts).
+
+The lexer has a couple of interesting modal features:
+
+* The lexer can operate in "raw" mode. This mode has several features that
+ make it possible to quickly lex the file (e.g., it stops identifier lookup,
+ doesn't specially handle preprocessor tokens, handles EOF differently, etc).
+ This mode is used for lexing within an "``#if 0``" block, for example.
+* The lexer can capture and return comments as tokens. This is required to
+ support the ``-C`` preprocessor mode, which passes comments through, and is
+ used by the diagnostic checker to identify expect-error annotations.
+* The lexer can be in ``ParsingFilename`` mode, which happens when
+ preprocessing after reading a ``#include`` directive. This mode changes the
+ parsing of "``<``" to return an "angled string" instead of a bunch of tokens
+ for each thing within the filename.
+* When parsing a preprocessor directive (after "``#``") the
+ ``ParsingPreprocessorDirective`` mode is entered. This changes the lexer to
+ return an EOD token at a newline.
+* The ``Lexer`` uses a ``LangOptions`` object to know whether trigraphs are
+ enabled, whether C++ or ObjC keywords are recognized, etc.
+
+In addition to these modes, the lexer keeps track of a couple of other features
+that are local to a lexed buffer, which change as the buffer is lexed:
+
+* The ``Lexer`` uses ``BufferPtr`` to keep track of the current character being
+ lexed.
+* The ``Lexer`` uses ``IsAtStartOfLine`` to keep track of whether the next
+ lexed token will start with its "start of line" bit set.
+* The ``Lexer`` keeps track of the current "``#if``" directives that are active
+ (which can be nested).
+* The ``Lexer`` keeps track of a :ref:`MultipleIncludeOpt
+ <MultipleIncludeOpt>` object, which is used to detect whether the buffer uses
+ the standard "``#ifndef XX`` / ``#define XX``" idiom to prevent multiple
+ inclusion. If a buffer does, subsequent includes can be ignored if the
+ "``XX``" macro is defined.
+
+.. _TokenLexer:
+
+The ``TokenLexer`` class
+------------------------
+
+The ``TokenLexer`` class is a token provider that returns tokens from a list of
+tokens that came from somewhere else. It is typically used for two things: 1)
+returning tokens from a macro definition as it is being expanded, and 2)
+returning tokens from an arbitrary buffer of tokens. The latter use is employed
+by ``_Pragma`` and will most likely be used to handle unbounded look-ahead for
+the C++ parser.
+
+.. _MultipleIncludeOpt:
+
+The ``MultipleIncludeOpt`` class
+--------------------------------
+
+The ``MultipleIncludeOpt`` class implements a really simple little state
+machine that is used to detect the standard "``#ifndef XX`` / ``#define XX``"
+idiom that people typically use to prevent multiple inclusion of headers. If a
+buffer uses this idiom and is subsequently ``#include``'d, the preprocessor can
+simply check to see whether the guarding condition is defined or not. If so,
+the preprocessor can completely ignore the include of the header.
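+
+A state machine of this kind might look roughly like the sketch below. It is
+an illustration under simplifying assumptions, not the real
+``MultipleIncludeOpt`` interface: the class name, the event methods, and the
+four states are all hypothetical, and the caller is assumed to report only
+top-level events.
+
+.. code-block:: c++
+
+  #include <string>
+
+  class ToyIncludeGuardDetector {
+    enum State { Start, InGuard, AfterGuard, Invalid };
+    State S = Start;
+    std::string Guard;
+
+  public:
+    // Called for a top-level "#ifndef Macro". Only the very first thing in
+    // the buffer may open the guard.
+    void enterTopLevelIfndef(const std::string &Macro) {
+      if (S == Start) {
+        Guard = Macro;
+        S = InGuard;
+      } else {
+        S = Invalid;
+      }
+    }
+
+    // Called for the "#endif" that closes a top-level conditional.
+    void exitTopLevelConditional() {
+      S = (S == InGuard) ? AfterGuard : Invalid;
+    }
+
+    // Called for any token that is outside every conditional block. Such a
+    // token before the guard or after its "#endif" defeats the idiom.
+    void readTokenOutsideConditional() {
+      if (S != InGuard)
+        S = Invalid;
+    }
+
+    // Returns the guard macro if the whole buffer was wrapped by the idiom,
+    // or an empty string otherwise.
+    std::string guardMacro() const {
+      return S == AfterGuard ? Guard : std::string();
+    }
+  };
+
+A client would record the returned macro name for the buffer and, on a later
+``#include`` of the same file, skip the file if that macro is defined.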
+
+The Parser Library
+==================
+
+The AST Library
+===============
+
+.. _Type:
+
+The ``Type`` class and its subclasses
+-------------------------------------
+
+The ``Type`` class (and its subclasses) are an important part of the AST.
+Types are accessed through the ``ASTContext`` class, which implicitly creates
+and uniques them as they are needed. Types have a couple of non-obvious
+features: 1) they do not capture type qualifiers like ``const`` or ``volatile``
+(see :ref:`QualType <QualType>`), and 2) they implicitly capture typedef
+information. Once created, types are immutable (unlike decls).
+
+Typedefs in C make semantic analysis a bit more complex than it would be without
+them. The issue is that we want to capture typedef information and represent it
+in the AST perfectly, but the semantics of operations need to "see through"
+typedefs. For example, consider this code:
+
+.. code-block:: c++
+
+ void func() {
+ typedef int foo;
+ foo X, *Y;
+ typedef foo *bar;
+ bar Z;
+ *X; // error
+ **Y; // error
+ **Z; // error
+ }
+
+The code above is illegal, and thus we expect there to be diagnostics emitted
+on the annotated lines. In this example, we expect to get:
+
+.. code-block:: text
+
+ test.c:6:1: error: indirection requires pointer operand ('foo' invalid)
+ *X; // error
+ ^~
+ test.c:7:1: error: indirection requires pointer operand ('foo' invalid)
+ **Y; // error
+ ^~~
+ test.c:8:1: error: indirection requires pointer operand ('foo' invalid)
+ **Z; // error
+ ^~~
+
+While this example is somewhat silly, it illustrates the point: we want to
+retain typedef information where possible, so that we can emit errors about
+"``std::string``" instead of "``std::basic_string<char, std:...``". Doing this
+requires properly keeping typedef information (for example, the type of ``X``
+is "``foo``", not "``int``"), and requires properly propagating it through the
+various operators (for example, the type of ``*Y`` is "``foo``", not
+"``int``"). In order to retain this information, the type of these expressions
+is an instance of the ``TypedefType`` class, which indicates that the type of
+these expressions is a typedef for "``foo``".
+
+Representing types like this is great for diagnostics, because the
+user-specified type is always immediately available. There are two problems
+with this: first, various semantic checks need to make judgements about the
+*actual structure* of a type, ignoring typedefs. Second, we need an efficient
+way to query whether two types are structurally identical to each other,
+ignoring typedefs. The solution to both of these problems is the idea of
+canonical types.
+
+Canonical Types
+^^^^^^^^^^^^^^^
+
+Every instance of the ``Type`` class contains a canonical type pointer. For
+simple types with no typedefs involved (e.g., "``int``", "``int*``",
+"``int**``"), the type just points to itself. For types that have a typedef
+somewhere in their structure (e.g., "``foo``", "``foo*``", "``foo**``",
+"``bar``"), the canonical type pointer points to their structurally equivalent
+type without any typedefs (e.g., "``int``", "``int*``", "``int**``", and
+"``int*``" respectively).
+
+This design provides a constant time operation (dereferencing the canonical type
+pointer) that gives us access to the structure of types. For example, we can
+trivially tell that "``bar``" and "``foo*``" are the same type by dereferencing
+their canonical type pointers and doing a pointer comparison (they both point
+to the single "``int*``" type).
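+
+The canonical-pointer idea can be modelled in a few lines. The sketch below is
+a toy under stated assumptions, not Clang's ``Type`` or ``ASTContext``:
+``ToyType``, ``ToyTypeContext``, and ``isSameType`` are hypothetical names,
+only pointer types are uniqued, and each builtin is assumed to be created once.
+
+.. code-block:: c++
+
+  #include <map>
+  #include <memory>
+  #include <string>
+  #include <vector>
+
+  struct ToyType {
+    enum Kind { Builtin, Pointer, Typedef };
+    Kind K;
+    std::string Name;         // spelled name for Builtin and Typedef nodes
+    const ToyType *Inner;     // pointee (Pointer) or underlying type (Typedef)
+    const ToyType *Canonical; // the node itself when no typedef is involved
+  };
+
+  class ToyTypeContext {
+    std::vector<std::unique_ptr<ToyType>> Types;             // owns all nodes
+    std::map<const ToyType *, const ToyType *> PointerTypes; // uniquing table
+
+    const ToyType *create(ToyType::Kind K, const std::string &Name,
+                          const ToyType *Inner, const ToyType *Canon) {
+      Types.emplace_back(new ToyType{K, Name, Inner, Canon});
+      ToyType *T = Types.back().get();
+      if (!T->Canonical)
+        T->Canonical = T; // typedef-free types are their own canonical type
+      return T;
+    }
+
+  public:
+    // Not uniqued in this sketch: call once per builtin type.
+    const ToyType *getBuiltin(const std::string &Name) {
+      return create(ToyType::Builtin, Name, nullptr, nullptr);
+    }
+    const ToyType *getTypedef(const std::string &Name,
+                              const ToyType *Underlying) {
+      return create(ToyType::Typedef, Name, Underlying, Underlying->Canonical);
+    }
+    const ToyType *getPointer(const ToyType *Pointee) {
+      auto It = PointerTypes.find(Pointee);
+      if (It != PointerTypes.end())
+        return It->second;
+      // A pointer to a sugared type is canonically a pointer to the
+      // pointee's canonical type.
+      const ToyType *Canon = nullptr;
+      if (Pointee->Canonical != Pointee)
+        Canon = getPointer(Pointee->Canonical);
+      const ToyType *T = create(ToyType::Pointer, "", Pointee, Canon);
+      PointerTypes[Pointee] = T;
+      return T;
+    }
+  };
+
+  // Structural identity is one pointer comparison on the canonical types.
+  inline bool isSameType(const ToyType *A, const ToyType *B) {
+    return A->Canonical == B->Canonical;
+  }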
+
+Canonical types and typedef types bring up some complexities that must be
+carefully managed. Specifically, the ``isa``/``cast``/``dyn_cast`` operators
+generally shouldn't be used in code that is inspecting the AST. For example,
+when type checking the indirection operator (unary "``*``" on a pointer), the
+type checker must verify that the operand has a pointer type. It would not be
+correct to check that with "``isa<PointerType>(SubExpr->getType())``", because
+this predicate would fail if the subexpression had a typedef type.
+
+The solution to this problem is a set of helper methods on ``Type``, used to
+check their properties. In this case, it would be correct to use
+"``SubExpr->getType()->isPointerType()``" to do the check. This predicate will
+return true if the *canonical type is a pointer*, which is true any time the
+type is structurally a pointer type. The only hard part here is remembering
+not to use the ``isa``/``cast``/``dyn_cast`` operations.
+
+The second problem we face is how to get access to the pointer type once we
+know it exists. To continue the example, the result type of the indirection
+operator is the pointee type of the subexpression. In order to determine the
+type, we need to get the instance of ``PointerType`` that best captures the
+typedef information in the program. If the type of the expression is literally
+a ``PointerType``, we can return that, otherwise we have to dig through the
+typedefs to find the pointer type. For example, if the subexpression had type
+"``foo*``", we could return that type as the result. If the subexpression had
+type "``bar``", we want to return "``foo*``" (note that we do *not* want
+"``int*``"). In order to provide all of this, ``Type`` has a
+``getAsPointerType()`` method that checks whether the type is structurally a
+``PointerType`` and, if so, returns the best one. If not, it returns a null
+pointer.
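+
+The difference between checking the node kind directly and asking the
+structural question can be shown with a small stand-alone model. The names
+``ToyType``, ``isPointerType``, and ``getAsPointerType`` below are free
+functions written for this sketch; they imitate the behaviour described above
+and are not the Clang methods themselves.
+
+.. code-block:: c++
+
+  struct ToyType {
+    enum Kind { Builtin, Pointer, Typedef };
+    Kind K;
+    const ToyType *Inner;     // pointee (Pointer) or underlying type (Typedef)
+    const ToyType *Canonical; // structurally equivalent typedef-free type
+  };
+
+  // True whenever the type is structurally a pointer, even if it is spelled
+  // through a typedef. A direct check of T.K would miss the typedef case.
+  inline bool isPointerType(const ToyType &T) {
+    return T.Canonical->K == ToyType::Pointer;
+  }
+
+  // Returns the pointer node that keeps the most typedef information, or a
+  // null pointer if the type is not structurally a pointer.
+  inline const ToyType *getAsPointerType(const ToyType &T) {
+    if (!isPointerType(T))
+      return nullptr;
+    const ToyType *Cur = &T;
+    while (Cur->K == ToyType::Typedef)
+      Cur = Cur->Inner; // peel one typedef layer at a time
+    return Cur;
+  }
+
+With nodes built for "``int``", "``foo``", "``foo*``", and "``bar``", calling
+``getAsPointerType`` on "``bar``" yields the "``foo*``" node rather than the
+canonical "``int*``" node.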
+
+This structure is somewhat mystical, but after meditating on it, it will make
+sense to you :).
+
+.. _QualType:
+
+The ``QualType`` class
+----------------------
+
+The ``QualType`` class is designed as a trivial value class that is small,
+passed by-value and is efficient to query. The idea of ``QualType`` is that it
+stores the type qualifiers (``const``, ``volatile``, ``restrict``, plus some
+extended qualifiers required by language extensions) separately from the types
+themselves. ``QualType`` is conceptually a pair of "``Type*``" and the bits
+for these type qualifiers.
+
+By storing the type qualifiers as bits in the conceptual pair, it is extremely
+efficient to get the set of qualifiers on a ``QualType`` (just return the field
+of the pair), add a type qualifier (which is a trivial constant-time operation
+that sets a bit), and remove one or more type qualifiers (just return a
+``QualType`` with the bitfield set to empty).
+
+Further, because the bits are stored outside of the type itself, we do not need
+to create duplicates of types with different sets of qualifiers (i.e. there is
+only a single heap allocated "``int``" type: "``const int``" and "``volatile
+const int``" both point to the same heap allocated "``int``" type). This
+reduces the heap size used to represent bits and also means we do not have to
+consider qualifiers when uniquing types (:ref:`Type <Type>` does not even
+contain qualifiers).
+
+In practice, the two most common type qualifiers (``const`` and ``restrict``)
+are stored in the low bits of the pointer to the ``Type`` object, together with
+a flag indicating whether extended qualifiers are present (which must be
+heap-allocated). This means that ``QualType`` is exactly the same size as a
+pointer.
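+
+The low-bit packing described above can be sketched as follows. This is a
+simplified model and not Clang's ``QualType``: ``ToyType``, ``ToyQualType``,
+and the bit assignments are assumptions for illustration, and the trick relies
+on the type object being aligned so that its three low address bits are zero
+(the pointer-to-integer mapping is implementation-defined, though this holds on
+common platforms).
+
+.. code-block:: c++
+
+  #include <cstdint>
+
+  // 8-byte alignment frees the three low bits of any ToyType address.
+  struct alignas(8) ToyType {
+    int Id;
+  };
+
+  class ToyQualType {
+    std::uintptr_t Value; // type pointer and qualifier bits in one word
+
+  public:
+    enum : std::uintptr_t {
+      Const = 0x1,
+      Restrict = 0x2,
+      HasExtQuals = 0x4, // extended qualifiers would be stored elsewhere
+      QualMask = 0x7
+    };
+
+    ToyQualType(const ToyType *T, std::uintptr_t Quals)
+        : Value(reinterpret_cast<std::uintptr_t>(T) | (Quals & QualMask)) {}
+
+    const ToyType *getTypePtr() const {
+      return reinterpret_cast<const ToyType *>(
+          Value & ~static_cast<std::uintptr_t>(QualMask));
+    }
+    std::uintptr_t getQualifiers() const { return Value & QualMask; }
+    bool isConstQualified() const { return (Value & Const) != 0; }
+
+    // Adding a qualifier sets a bit; no new type object is allocated.
+    ToyQualType withConst() const {
+      return ToyQualType(getTypePtr(), getQualifiers() | Const);
+    }
+    // Removing all qualifiers clears the low bits.
+    ToyQualType getUnqualifiedType() const {
+      return ToyQualType(getTypePtr(), 0);
+    }
+  };
+
+  static_assert(sizeof(ToyQualType) == sizeof(void *),
+                "qualifiers must fit in the pointer's low bits");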
+
+.. _DeclarationName:
+
+Declaration names
+-----------------
+
+The ``DeclarationName`` class represents the name of a declaration in Clang.
+Declarations in the C family of languages can take several different forms.
+Most declarations are named by simple identifiers, e.g., "``f``" and "``x``" in
+the function declaration ``f(int x)``. In C++, declaration names can also name
+class constructors ("``Class``" in ``struct Class { Class(); }``), class
+destructors ("``~Class``"), overloaded operator names ("``operator+``"), and
+conversion functions ("``operator void const *``"). In Objective-C,
+declaration names can refer to the names of Objective-C methods, which involve
+the method name and the parameters, collectively called a *selector*, e.g.,
+"``setWidth:height:``". Since all of these kinds of entities --- variables,
+functions, Objective-C methods, C++ constructors, destructors, and operators
+--- are represented as subclasses of Clang's common ``NamedDecl`` class,
+``DeclarationName`` is designed to efficiently represent any kind of name.
+
+Given a ``DeclarationName`` ``N``, ``N.getNameKind()`` will produce a value
+that describes what kind of name ``N`` stores. There are 10 options (all of
+the names are inside the ``DeclarationName`` class).
+
+``Identifier``
+
+ The name is a simple identifier. Use ``N.getAsIdentifierInfo()`` to retrieve
+ the corresponding ``IdentifierInfo*`` pointing to the actual identifier.
+
+``ObjCZeroArgSelector``, ``ObjCOneArgSelector``, ``ObjCMultiArgSelector``
+
+ The name is an Objective-C selector, which can be retrieved as a ``Selector``
+ instance via ``N.getObjCSelector()``. The three possible name kinds for
+ Objective-C reflect an optimization within the ``DeclarationName`` class:
+ both zero- and one-argument selectors are stored as a masked
+ ``IdentifierInfo`` pointer, and therefore require very little space, since
+ zero- and one-argument selectors are far more common than multi-argument
+ selectors (which use a different structure).
+
+``CXXConstructorName``
+
+ The name is a C++ constructor name. Use ``N.getCXXNameType()`` to retrieve
+ the :ref:`type <QualType>` that this constructor is meant to construct. The
+ type is always the canonical type, since all constructors for a given type
+ have the same name.
+
+``CXXDestructorName``
+
+ The name is a C++ destructor name. Use ``N.getCXXNameType()`` to retrieve
+ the :ref:`type <QualType>` whose destructor is being named. This type is
+ always a canonical type.
+
+``CXXConversionFunctionName``
+
+ The name is a C++ conversion function. Conversion functions are named
+ according to the type they convert to, e.g., "``operator void const *``".
+ Use ``N.getCXXNameType()`` to retrieve the type that this conversion function
+ converts to. This type is always a canonical type.
+
+``CXXOperatorName``
+
+ The name is a C++ overloaded operator name. Overloaded operators are named
+ according to their spelling, e.g., "``operator+``" or "``operator new []``".
+ Use ``N.getCXXOverloadedOperator()`` to retrieve the overloaded operator (a
+ value of type ``OverloadedOperatorKind``).
+
+``CXXLiteralOperatorName``
+
+ The name is a C++11 user-defined literal operator. User-defined literal
+ operators are named according to the suffix they define, e.g., "``_foo``" for
+ "``operator "" _foo``". Use ``N.getCXXLiteralIdentifier()`` to retrieve the
+ corresponding ``IdentifierInfo*`` pointing to the identifier.
+
+``CXXUsingDirective``
+
+ The name is a C++ using directive. Using directives are not really
+ NamedDecls, in that they all have the same name, but they are
+ implemented as such in order to store them in DeclContext
+ effectively.
+
+``DeclarationName``\ s are cheap to create, copy, and compare. They require
+only a single pointer's worth of storage in the common cases (identifiers,
+zero- and one-argument Objective-C selectors) and use dense, uniqued storage
+for the other kinds of names. Two ``DeclarationName``\ s can be compared for
+equality (``==``, ``!=``) using a simple bitwise comparison, can be ordered
+with ``<``, ``>``, ``<=``, and ``>=`` (which provide a lexicographical ordering
+for normal identifiers but an unspecified ordering for other kinds of names),
+and can be placed into LLVM ``DenseMap``\ s and ``DenseSet``\ s.
+
+``DeclarationName`` instances can be created in different ways depending on
+what kind of name the instance will store. Normal identifiers
+(``IdentifierInfo`` pointers) and Objective-C selectors (``Selector``) can be
+implicitly converted to ``DeclarationNames``. Names for C++ constructors,
+destructors, conversion functions, and overloaded operators can be retrieved
+from the ``DeclarationNameTable``, an instance of which is available as
+``ASTContext::DeclarationNames``. The member functions
+``getCXXConstructorName``, ``getCXXDestructorName``,
+``getCXXConversionFunctionName``, and ``getCXXOperatorName``, respectively,
+return ``DeclarationName`` instances for the four kinds of C++ special function
+names.
+
+.. _DeclContext:
+
+Declaration contexts
+--------------------
+
+Every declaration in a program exists within some *declaration context*, such
+as a translation unit, namespace, class, or function. Declaration contexts in
+Clang are represented by the ``DeclContext`` class, from which the various
+declaration-context AST nodes (``TranslationUnitDecl``, ``NamespaceDecl``,
+``RecordDecl``, ``FunctionDecl``, etc.) derive. The ``DeclContext`` class
+provides several facilities common to each declaration context:
+
+Source-centric vs. Semantics-centric View of Declarations
+
+ ``DeclContext`` provides two views of the declarations stored within a
+ declaration context. The source-centric view accurately represents the
+ program source code as written, including multiple declarations of entities
+ where present (see the section :ref:`Redeclarations and Overloads
+ <Redeclarations>`), while the semantics-centric view represents the program
+ semantics. The two views are kept synchronized by semantic analysis while
+ the ASTs are being constructed.
+
+Storage of declarations within that context
+
+ Every declaration context can contain some number of declarations. For
+ example, a C++ class (represented by ``RecordDecl``) contains various member
+ functions, fields, nested types, and so on. All of these declarations will
+ be stored within the ``DeclContext``, and one can iterate over the
+ declarations via [``DeclContext::decls_begin()``,
+ ``DeclContext::decls_end()``). This mechanism provides the source-centric
+ view of declarations in the context.
+
+Lookup of declarations within that context
+
+ The ``DeclContext`` structure provides efficient name lookup for names within
+ that declaration context. For example, if ``N`` is a namespace we can look
+ for the name ``N::f`` using ``DeclContext::lookup``. The lookup itself is
+ based on a lazily-constructed array (for declaration contexts with a small
+ number of declarations) or hash table (for declaration contexts with more
+ declarations). The lookup operation provides the semantics-centric view of
+ the declarations in the context.
+
+Ownership of declarations
+
+ The ``DeclContext`` owns all of the declarations that were declared within
+ its declaration context, and is responsible for the management of their
+ memory as well as their (de-)serialization.
+
+All declarations are stored within a declaration context, and one can query
+information about the context in which each declaration lives. One can
+retrieve the ``DeclContext`` that contains a particular ``Decl`` using
+``Decl::getDeclContext``. However, see the section
+:ref:`LexicalAndSemanticContexts` for more information about how to interpret
+this context information.
+
+.. _Redeclarations:
+
+Redeclarations and Overloads
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Within a translation unit, it is common for an entity to be declared several
+times. For example, we might declare a function "``f``" and then later
+re-declare it as part of an inlined definition:
+
+.. code-block:: c++
+
+ void f(int x, int y, int z = 1);
+
+ inline void f(int x, int y, int z) { /* ... */ }
+
+The representation of "``f``" differs in the source-centric and
+semantics-centric views of a declaration context. In the source-centric view,
+all redeclarations will be present, in the order they occurred in the source
+code, making this view suitable for clients that wish to see the structure of
+the source code. In the semantics-centric view, only the most recent "``f``"
+will be found by the lookup, since it effectively replaces the first
+declaration of "``f``".
+
+In the semantics-centric view, overloading of functions is represented
+explicitly. For example, given two declarations of a function "``g``" that are
+overloaded, e.g.,
+
+.. code-block:: c++
+
+ void g();
+ void g(int);
+
+the ``DeclContext::lookup`` operation will return a
+``DeclContext::lookup_result`` that contains a range of iterators over
+declarations of "``g``". Clients that perform semantic analysis on a program
+and are not concerned with the actual source code will primarily use this
+semantics-centric view.
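+
+The two views can be modelled with a declaration list plus a lookup table.
+The sketch below is only an analogy for the behaviour described in this
+section: ``ToyDecl`` and ``ToyDeclContext`` are hypothetical, and comparing
+signature strings is a stand-in assumption for real redeclaration matching.
+
+.. code-block:: c++
+
+  #include <cstddef>
+  #include <map>
+  #include <string>
+  #include <utility>
+  #include <vector>
+
+  struct ToyDecl {
+    std::string Name;
+    std::string Signature; // used here to tell redeclarations from overloads
+  };
+
+  class ToyDeclContext {
+    std::vector<ToyDecl> Decls; // source-centric view, in source order
+    std::map<std::string, std::vector<std::size_t>> Visible; // semantic view
+
+  public:
+    void addDecl(ToyDecl D) {
+      std::size_t Index = Decls.size();
+      std::vector<std::size_t> &Found = Visible[D.Name];
+      bool Replaced = false;
+      for (std::size_t &I : Found) {
+        if (Decls[I].Signature == D.Signature) {
+          I = Index; // a redeclaration replaces the earlier entry
+          Replaced = true;
+        }
+      }
+      if (!Replaced)
+        Found.push_back(Index); // a new overload is added alongside
+      Decls.push_back(std::move(D));
+    }
+
+    // Source-centric view: every declaration, including redeclarations.
+    const std::vector<ToyDecl> &decls() const { return Decls; }
+
+    // Semantics-centric view: the declarations found by name lookup.
+    std::vector<const ToyDecl *> lookup(const std::string &Name) const {
+      std::vector<const ToyDecl *> Result;
+      auto It = Visible.find(Name);
+      if (It != Visible.end())
+        for (std::size_t I : It->second)
+          Result.push_back(&Decls[I]);
+      return Result;
+    }
+  };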
+
+.. _LexicalAndSemanticContexts:
+
+Lexical and Semantic Contexts
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Each declaration has two potentially different declaration contexts: a
+*lexical* context, which corresponds to the source-centric view of the
+declaration context, and a *semantic* context, which corresponds to the
+semantics-centric view. The lexical context is accessible via
+``Decl::getLexicalDeclContext`` while the semantic context is accessible via
+``Decl::getDeclContext``, both of which return ``DeclContext`` pointers. For
+most declarations, the two contexts are identical. For example:
+
+.. code-block:: c++
+
+ class X {
+ public:
+ void f(int x);
+ };
+
+Here, the semantic and lexical contexts of ``X::f`` are the ``DeclContext``
+associated with the class ``X`` (itself stored as a ``RecordDecl`` AST node).
+However, we can now define ``X::f`` out-of-line:
+
+.. code-block:: c++
+
+ void X::f(int x = 17) { /* ... */ }
+
+This definition of "``f``" has different lexical and semantic contexts. The
+lexical context corresponds to the declaration context in which the actual
+declaration occurred in the source code, e.g., the translation unit containing
+``X``. Thus, this declaration of ``X::f`` can be found by traversing the
+declarations provided by [``decls_begin()``, ``decls_end()``) in the
+translation unit.
+
+The semantic context of ``X::f`` corresponds to the class ``X``, since this
+member function is (semantically) a member of ``X``. Lookup of the name ``f``
+into the ``DeclContext`` associated with ``X`` will then return the definition
+of ``X::f`` (including information about the default argument).
+
+Transparent Declaration Contexts
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In C and C++, there are several contexts in which names that are logically
+declared inside another declaration will actually "leak" out into the enclosing
+scope from the perspective of name lookup. The most obvious instance of this
+behavior is in enumeration types, e.g.,
+
+.. code-block:: c++
+
+ enum Color {
+ Red,
+ Green,
+ Blue
+ };
+
+Here, ``Color`` is an enumeration, which is a declaration context that contains
+the enumerators ``Red``, ``Green``, and ``Blue``. Thus, traversing the list of
+declarations contained in the enumeration ``Color`` will yield ``Red``,
+``Green``, and ``Blue``. However, outside of the scope of ``Color`` one can
+name the enumerator ``Red`` without qualifying the name, e.g.,
+
+.. code-block:: c++
+
+ Color c = Red;
+
+There are other entities in C++ that provide similar behavior. For example,
+linkage specifications that use curly braces:
+
+.. code-block:: c++
+
+ extern "C" {
+ void f(int);
+ void g(int);
+ }
+ // f and g are visible here
+
+For source-level accuracy, we treat the linkage specification and the
+enumeration type each as a declaration context in which their enclosed
+declarations ("``Red``", "``Green``", and "``Blue``"; "``f``" and "``g``") are
+declared. However, these declarations are visible outside of the scope of the
+declaration context.
+
+These language features (and several others, described below) have roughly the
+same set of requirements: declarations are declared within a particular lexical
+context, but the declarations are also found via name lookup in scopes
+enclosing the declaration itself. This feature is implemented via
+*transparent* declaration contexts (see
+``DeclContext::isTransparentContext()``), whose declarations are visible in the
+nearest enclosing non-transparent declaration context. This means that the
+lexical context of the declaration (e.g., an enumerator) will be the
+transparent ``DeclContext`` itself, as will the semantic context, but the
+declaration will be visible in every outer context up to and including the
+first non-transparent declaration context (since transparent declaration
+contexts can be nested).
+
+The transparent ``DeclContext``\ s are:
+
+* Enumerations (but not C++11 "scoped enumerations"):
+
+ .. code-block:: c++
+
+ enum Color {
+ Red,
+ Green,
+ Blue
+ };
+ // Red, Green, and Blue are in scope
+
+* C++ linkage specifications:
+
+ .. code-block:: c++
+
+ extern "C" {
+ void f(int);
+ void g(int);
+ }
+ // f and g are in scope
+
+* Anonymous unions and structs:
+
+ .. code-block:: c++
+
+ struct LookupTable {
+ bool IsVector;
+ union {
+ std::vector<Item> *Vector;
+ std::set<Item> *Set;
+ };
+ };
+
+ LookupTable LT;
+ LT.Vector = 0; // Okay: finds Vector inside the unnamed union
+
+* C++11 inline namespaces:
+
+ .. code-block:: c++
+
+ namespace mylib {
+ inline namespace debug {
+ class X;
+ }
+ }
+ mylib::X *xp; // okay: mylib::X refers to mylib::debug::X
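+
+The visibility rule for transparent contexts can be expressed as a short
+recursive walk. The following is a minimal sketch, assuming a toy
+``ToyDeclContext`` with a parent pointer and a name set; it does not reflect
+how Clang stores its lookup tables.
+
+.. code-block:: c++
+
+  #include <set>
+  #include <string>
+
+  struct ToyDeclContext {
+    ToyDeclContext *Parent;
+    bool Transparent;
+    std::set<std::string> VisibleNames;
+
+    ToyDeclContext(ToyDeclContext *P, bool IsTransparent)
+        : Parent(P), Transparent(IsTransparent) {}
+
+    // Make a name visible here and, while the current context is
+    // transparent, in each enclosing context up to and including the first
+    // non-transparent one.
+    void makeVisible(const std::string &Name) {
+      VisibleNames.insert(Name);
+      if (Transparent && Parent)
+        Parent->makeVisible(Name);
+    }
+
+    bool canFind(const std::string &Name) const {
+      return VisibleNames.count(Name) != 0;
+    }
+  };
+
+For an unscoped enumeration nested in a namespace, declaring an enumerator
+makes it findable in both the enumeration and the namespace, but not in the
+translation unit that encloses the namespace.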
+
+.. _MultiDeclContext:
+
+Multiply-Defined Declaration Contexts
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+C++ namespaces have the interesting --- and, so far, unique --- property that
+the namespace can be defined multiple times, and the declarations provided by
+each namespace definition are effectively merged (from the semantic point of
+view). For example, the following two code snippets are semantically
+indistinguishable:
+
+.. code-block:: c++
+
+ // Snippet #1:
+ namespace N {
+ void f();
+ }
+ namespace N {
+ void f(int);
+ }
+
+ // Snippet #2:
+ namespace N {
+ void f();
+ void f(int);
+ }
+
+In Clang's representation, the source-centric view of declaration contexts will
+actually have two separate ``NamespaceDecl`` nodes in Snippet #1, each of which
+is a declaration context that contains a single declaration of "``f``".
+However, the semantics-centric view provided by name lookup into the namespace
+``N`` for "``f``" will return a ``DeclContext::lookup_result`` that contains a
+range of iterators over declarations of "``f``".
+
+``DeclContext`` manages multiply-defined declaration contexts internally. The
+function ``DeclContext::getPrimaryContext`` retrieves the "primary" context for
+a given ``DeclContext`` instance, which is the ``DeclContext`` responsible for
+maintaining the lookup table used for the semantics-centric view. Given the
+primary context, one can follow the chain of ``DeclContext`` nodes that define
+additional declarations via ``DeclContext::getNextContext``. Note that these
+functions are used internally within the lookup and insertion methods of the
+``DeclContext``, so the vast majority of clients can ignore them.
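+
+The relationship between the two views can be sketched with a small toy model.
+This is not Clang code: ``ToyDecl``, ``ToyContext``, ``chain``, ``addDecl`` and
+``lookup`` are hypothetical names invented for illustration, and the sketch
+assumes that the first block of a namespace acts as the primary context that
+owns the merged lookup table.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a declaration; not a Clang class.
struct ToyDecl {
  std::string Name;
  std::string Signature;
};

// Hypothetical stand-in for one `namespace N { ... }` block.
struct ToyContext {
  ToyContext *Primary = this;   // the block that owns the lookup table
  ToyContext *Next = nullptr;   // the next block that reopens the namespace
  std::vector<ToyDecl> Written; // source-centric view: what this block contains
  std::map<std::string, std::vector<ToyDecl>> Table; // used by the primary only

  // Appends Later to the end of the chain that starts at the primary context.
  void chain(ToyContext &Later) {
    ToyContext *Last = Primary;
    while (Last->Next)
      Last = Last->Next;
    Last->Next = &Later;
    Later.Primary = Primary;
  }

  // Records the declaration in this block and in the merged lookup table.
  void addDecl(const ToyDecl &D) {
    Written.push_back(D);
    Primary->Table[D.Name].push_back(D);
  }

  // Semantics-centric view: always answered by the primary context.
  std::vector<ToyDecl> lookup(const std::string &Name) const {
    auto It = Primary->Table.find(Name);
    return It == Primary->Table.end() ? std::vector<ToyDecl>() : It->second;
  }
};
```

+In this sketch, each block of Snippet #1 would hold one written declaration,
+while a lookup of ``f`` through either block would return both overloads.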
+
+.. _CFG:
+
+The ``CFG`` class
+-----------------
+
+The ``CFG`` class is designed to represent a source-level control-flow graph
+for a single statement (``Stmt*``). Typically instances of ``CFG`` are
+constructed for function bodies (usually an instance of ``CompoundStmt``), but
+can also be instantiated to represent the control-flow of any class that
+subclasses ``Stmt``, which includes simple expressions. Control-flow graphs
+are especially useful for performing `flow- or path-sensitive
+<http://en.wikipedia.org/wiki/Data_flow_analysis#Sensitivities>`_ program
+analyses on a given function.
+
+Basic Blocks
+^^^^^^^^^^^^
+
+Concretely, an instance of ``CFG`` is a collection of basic blocks. Each basic
+block is an instance of ``CFGBlock``, which simply contains an ordered sequence
+of ``Stmt*`` (each referring to statements in the AST). The ordering of
+statements within a block indicates unconditional flow of control from one
+statement to the next. :ref:`Conditional control-flow
+<ConditionalControlFlow>` is represented using edges between basic blocks. The
+statements within a given ``CFGBlock`` can be traversed using the
+``CFGBlock::*iterator`` interface.
+
+A ``CFG`` object owns the instances of ``CFGBlock`` within the control-flow
+graph it represents. Each ``CFGBlock`` within a CFG is also uniquely numbered
+(accessible via ``CFGBlock::getBlockID()``). Currently the number is based on
+the order in which the blocks were created, but no assumptions should be made
+about how ``CFGBlocks`` are numbered other than that their numbers are unique
+and that they are numbered from 0..N-1 (where N is the number of basic blocks
+in the CFG).
+
+Entry and Exit Blocks
+^^^^^^^^^^^^^^^^^^^^^
+
+Each instance of ``CFG`` contains two special blocks: an *entry* block
+(accessible via ``CFG::getEntry()``), which has no incoming edges, and an
+*exit* block (accessible via ``CFG::getExit()``), which has no outgoing edges.
+Neither block contains any statements, and they serve the role of providing a
+clear entrance and exit for a body of code such as a function body. The
+presence of these empty blocks greatly simplifies the implementation of many
+analyses built on top of CFGs.
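+
+To see why dense block numbers and dedicated entry and exit blocks are
+convenient, consider the following toy model. It is a sketch only: ``ToyBlock``,
+``ToyCFG`` and ``countReachable`` are hypothetical names, statements are plain
+strings, and the order in which the toy assigns IDs is an arbitrary choice that
+does not reflect how Clang numbers its blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for CFGBlock; statements are plain strings here.
struct ToyBlock {
  unsigned ID;
  std::vector<std::string> Stmts;
  std::vector<ToyBlock *> Preds, Succs;
};

// Hypothetical stand-in for CFG: owns its blocks and numbers them 0..N-1.
class ToyCFG {
  std::vector<std::unique_ptr<ToyBlock>> Blocks;
  ToyBlock *Entry = nullptr;
  ToyBlock *Exit = nullptr;

public:
  ToyCFG() {
    Exit = createBlock();  // empty block with no outgoing edges
    Entry = createBlock(); // empty block with no incoming edges
  }

  // Creates a block whose ID is the number of blocks created so far.
  ToyBlock *createBlock() {
    std::unique_ptr<ToyBlock> B(new ToyBlock());
    B->ID = static_cast<unsigned>(Blocks.size());
    Blocks.push_back(std::move(B));
    return Blocks.back().get();
  }

  // Records the edge on both ends so predecessors and successors stay in sync.
  static void addEdge(ToyBlock *From, ToyBlock *To) {
    From->Succs.push_back(To);
    To->Preds.push_back(From);
  }

  ToyBlock &getEntry() { return *Entry; }
  ToyBlock &getExit() { return *Exit; }
  std::size_t size() const { return Blocks.size(); }
};

// Counts the blocks reachable from the entry block with an iterative
// depth-first search. Block IDs index directly into the visited table.
std::size_t countReachable(ToyCFG &G) {
  std::vector<bool> Seen(G.size(), false);
  std::vector<ToyBlock *> Work{&G.getEntry()};
  std::size_t Count = 0;
  while (!Work.empty()) {
    ToyBlock *B = Work.back();
    Work.pop_back();
    if (Seen[B->ID])
      continue;
    Seen[B->ID] = true;
    ++Count;
    for (ToyBlock *S : B->Succs)
      Work.push_back(S);
  }
  return Count;
}
```

+Because every traversal starts at the single entry block and IDs are dense, the
+analysis needs neither a search for start nodes nor a map keyed by pointer.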
+
+.. _ConditionalControlFlow:
+
+Conditional Control-Flow
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Conditional control-flow (such as that induced by if-statements and loops) is
+represented as edges between ``CFGBlocks``. Because different C language
+constructs can induce control-flow, each ``CFGBlock`` also records an extra
+``Stmt*`` that represents the *terminator* of the block. A terminator is
+simply the statement that caused the control-flow, and is used to identify the
+nature of the conditional control-flow between blocks. For example, in the
+case of an if-statement, the terminator refers to the ``IfStmt`` object in the
+AST that represented the given branch.
+
+To illustrate, consider the following code example:
+
+.. code-block:: c++
+
+  int foo(int x) {
+    x = x + 1;
+    if (x > 2)
+      x++;
+    else {
+      x += 2;
+      x *= 2;
+    }
+
+    return x;
+  }
+
+After invoking the parser+semantic analyzer on this code fragment, the AST of
+the body of ``foo`` is referenced by a single ``Stmt*``. We can then construct
+an instance of ``CFG`` representing the control-flow graph of this function
+body by a single call to a static class method:
+
+.. code-block:: c++
+
+ Stmt *FooBody = ...
+ CFG *FooCFG = CFG::buildCFG(FooBody);
+
+It is the responsibility of the caller of ``CFG::buildCFG`` to ``delete`` the
+returned ``CFG*`` when the CFG is no longer needed.
+
+Along with providing an interface to iterate over its ``CFGBlocks``, the
+``CFG`` class also provides methods that are useful for debugging and
+visualizing CFGs. For example, the method ``CFG::dump()`` dumps a
+pretty-printed version of the CFG to standard error. This is especially useful
+when one is using a debugger such as gdb. For example, here is the output of
+``FooCFG->dump()``:
+
+.. code-block:: c++
+
+ [ B5 (ENTRY) ]
+ Predecessors (0):
+ Successors (1): B4
+
+ [ B4 ]
+ 1: x = x + 1
+ 2: (x > 2)
+ T: if [B4.2]
+ Predecessors (1): B5
+ Successors (2): B3 B2
+
+ [ B3 ]
+ 1: x++
+ Predecessors (1): B4
+ Successors (1): B1
+
+ [ B2 ]
+ 1: x += 2
+ 2: x *= 2
+ Predecessors (1): B4
+ Successors (1): B1
+
+ [ B1 ]
+ 1: return x;
+ Predecessors (2): B2 B3
+ Successors (1): B0
+
+ [ B0 (EXIT) ]
+ Predecessors (1): B1
+ Successors (0):
+
+For each block, the pretty-printed output displays the number of
+*predecessor* blocks (blocks that have outgoing control-flow to the given
+block) and *successor* blocks (blocks that have incoming control-flow from
+the given block). We can also clearly see the special entry
+and exit blocks at the beginning and end of the pretty-printed output. For the
+entry block (block B5), the number of predecessor blocks is 0, while for the
+exit block (block B0) the number of successor blocks is 0.
+
+The most interesting block here is B4, whose outgoing control-flow represents
+the branching caused by the sole if-statement in ``foo``. Of particular
+interest is the second statement in the block, ``(x > 2)``, and the terminator,
+printed as ``if [B4.2]``. The second statement represents the evaluation of
+the condition of the if-statement, which occurs before the actual branching of
+control-flow. Within the ``CFGBlock`` for B4, the ``Stmt*`` for the second
+statement refers to the actual expression in the AST for ``(x > 2)``. Thus
+pointers to subclasses of ``Expr`` can appear in the list of statements in a
+block, and not just subclasses of ``Stmt`` that refer to proper C statements.
+
+The terminator of block B4 is a pointer to the ``IfStmt`` object in the AST.
+The pretty-printer outputs ``if [B4.2]`` because the condition expression of
+the if-statement has an actual place in the basic block, and thus the
+terminator is essentially *referring* to the expression that is the second
+statement of block B4 (i.e., B4.2). In this manner, conditions for
+control-flow (which also includes conditions for loops and switch statements)
+are hoisted into the actual basic block.
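+
+The hoisting of the condition and the role of the terminator can be mimicked
+in a few lines. The sketch below is a toy, not the Clang API: ``ToyBlock``,
+``terminatorLabel`` and ``follow`` are hypothetical, and it assumes that the
+condition is the last statement of its block and that the successor taken when
+the condition is true is listed first (as the dump above suggests, since B3
+holds ``x++``).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for CFGBlock.
struct ToyBlock {
  unsigned ID;
  std::vector<std::string> Stmts;      // stand-ins for the Stmt* sequence
  std::string TerminatorKind;          // e.g. "if"; empty if no terminator
  std::vector<const ToyBlock *> Succs; // outgoing edges
};

// Renders the terminator in the style of the dump: the kind of statement
// followed by a reference to the statement that holds the condition
// (assumed here to be the last statement in the block).
std::string terminatorLabel(const ToyBlock &B) {
  if (B.TerminatorKind.empty())
    return "";
  return B.TerminatorKind + " [B" + std::to_string(B.ID) + "." +
         std::to_string(B.Stmts.size()) + "]";
}

// Picks the successor reached for a given value of the hoisted condition.
// Assumption of this toy: the "true" successor is listed first.
const ToyBlock *follow(const ToyBlock &B, bool ConditionValue) {
  if (B.Succs.empty())
    return nullptr;
  if (B.Succs.size() == 1)
    return B.Succs[0];
  return ConditionValue ? B.Succs[0] : B.Succs[1];
}
```

+Building blocks B1 through B4 of ``foo`` with this structure would give B4 the
+label ``if [B4.2]``, while B3 and B2, which fall through unconditionally, would
+have no label at all.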
+
+.. Implicit Control-Flow
+.. ^^^^^^^^^^^^^^^^^^^^^
+
+.. A key design principle of the ``CFG`` class was to not require any
+.. transformations to the AST in order to represent control-flow. Thus the
+.. ``CFG`` does not perform any "lowering" of the statements in an AST: loops
+.. are not transformed into guarded gotos, short-circuit operations are not
+.. converted to a set of if-statements, and so on.
+
+Constant Folding in the Clang AST
+---------------------------------
+
+There are several places where constants and constant folding matter a lot to
+the Clang front-end. First, in general, we prefer the AST to retain the source
+code as close to how the user wrote it as possible. This means that if they
+wrote "``5+4``", we want to keep the addition and two constants in the AST
+rather than fold them to "``9``". As a result, constant folding in various
+ways turns into a tree walk that needs to handle the various cases.
+
+However, there are places in both C and C++ that require constants to be
+folded. For example, the C standard defines what an "integer constant
+expression" (i-c-e) is with very precise and specific requirements. The
+language then requires i-c-e's in a lot of places (for example, the size of a
+bitfield, the value for a case statement, etc). For these, we have to be able
+to constant fold the constants, to do semantic checks (e.g., verify bitfield
+size is non-negative and that case statements aren't duplicated). We aim for
+Clang to be very pedantic about this, diagnosing cases when the code does not
+use an i-c-e where one is required, but accepting the code unless running with
+``-pedantic-errors``.
+
+Things get a little bit more tricky when it comes to compatibility with
+real-world source code. Specifically, GCC has historically accepted a huge
+superset of expressions as i-c-e's, and a lot of real world code depends on
+this unfortunate accident of history (including, e.g., the glibc system
+headers). GCC accepts anything its "fold" optimizer is capable of reducing to
+an integer constant, which means that the definition of what it accepts changes
+as its optimizer does. One example is that GCC accepts things like "``case
+X-X:``" even when ``X`` is a variable, because it can fold this to 0.
+
+Another issue is how constants interact with the extensions we support, such
+as ``__builtin_constant_p``, ``__builtin_inf``, ``__extension__`` and many
+others. C99 obviously does not specify the semantics of any of these
+extensions, and the definition of i-c-e does not include them. However, these
+extensions are often used in real code, and we have to have a way to reason
+about them.
+
+Finally, this is not just a problem for semantic analysis. The code generator
+and other clients have to be able to fold constants (e.g., to initialize global
+variables) and have to handle a superset of what C99 allows. Further, these
+clients can benefit from extended information. For example, we know that
+"``foo() || 1``" always evaluates to ``true``, but we can't replace the
+expression with ``true`` because it has side effects.
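+
+The last point is easy to observe directly. In the sketch below, ``foo``,
+``Calls`` and ``evaluate`` are hypothetical names chosen for illustration: the
+result of the expression is always ``true``, yet the call still has to happen.

```cpp
#include <cassert>

// Counts how many times foo() has actually run.
static int Calls = 0;

// Hypothetical function with an observable side effect.
int foo() {
  ++Calls;
  return 0;
}

// The value is known to be true, but foo() must still be called because the
// left operand of || is always evaluated.
bool evaluate() { return foo() || 1; }
```

+Replacing the body of ``evaluate`` with ``return true;`` would keep the result
+but lose the increment of ``Calls``, which is why a client needs to know both
+the folded value and whether side effects were present.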
+
+Implementation Approach
+^^^^^^^^^^^^^^^^^^^^^^^
+
+After trying several different approaches, we've finally converged on a design.
+(Note that, at the time of this writing, not all of this has been implemented;
+consider this a design goal!) Our basic approach is to define a single
+recursive evaluation method (``Expr::Evaluate``), which is implemented
+in ``AST/ExprConstant.cpp``. Given an expression with "scalar" type (integer,
+fp, complex, or pointer) this method returns the following information:
+
+* Whether the expression is an integer constant expression, a general constant
+ that was folded but has no side effects, a general constant that was folded
+ but that does have side effects, or an uncomputable/unfoldable value.
+* If the expression was computable in any way, this method returns the
+ ``APValue`` for the result of the expression.
+* If the expression is not evaluatable at all, this method returns information
+ on one of the problems with the expression. This includes a
+ ``SourceLocation`` for where the problem is, and a diagnostic ID that explains
+ the problem. The diagnostic should have ``ERROR`` type.
+* If the expression is not an integer constant expression, this method returns
+ information on one of the problems with the expression. This includes a
+ ``SourceLocation`` for where the problem is, and a diagnostic ID that
+ explains the problem. The diagnostic should have ``EXTENSION`` type.
+
+This information gives various clients the flexibility that they want, and we
+will eventually have some helper methods for various extensions. For example,
+``Sema`` should have a ``Sema::VerifyIntegerConstantExpression`` method, which
+calls ``Evaluate`` on the expression. If the expression is not foldable, the
+error is emitted, and it would return ``true``. If the expression is not an
+i-c-e, the ``EXTENSION`` diagnostic is emitted. Finally it would return
+``false`` to indicate that the AST is OK.
+
+Other clients can use the information in other ways, for example, codegen can
+just use expressions that are foldable in any way.
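+
+The shape of this design can be sketched on a toy expression tree. Everything
+below is hypothetical and deliberately tiny: ``ToyExpr``, ``FoldKind``,
+``evaluate`` and ``verifyIntegerConstantExpression`` are illustrative names,
+not the Clang implementation, and the folding rules cover only the examples
+mentioned in this section (``5+4``, ``X-X`` and ``foo() || 1``).

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>

// Ordered from most to least constrained, so that std::max picks the weaker
// classification when two operands are combined.
enum class FoldKind { ICE, FoldedNoSideEffects, FoldedWithSideEffects, Unfoldable };

struct EvalResult {
  FoldKind Kind;
  long Value; // only meaningful when Kind != FoldKind::Unfoldable
};

struct ToyExpr {
  enum OpKind { Literal, Variable, Call, Add, Sub, LogicalOr } Op;
  long Value;                        // used by Literal
  std::string Name;                  // used by Variable and Call
  std::shared_ptr<ToyExpr> LHS, RHS; // used by Add, Sub and LogicalOr
};
using ExprPtr = std::shared_ptr<ToyExpr>;

ExprPtr lit(long V) {
  return std::make_shared<ToyExpr>(ToyExpr{ToyExpr::Literal, V, "", nullptr, nullptr});
}
ExprPtr var(const std::string &N) {
  return std::make_shared<ToyExpr>(ToyExpr{ToyExpr::Variable, 0, N, nullptr, nullptr});
}
ExprPtr call(const std::string &N) {
  return std::make_shared<ToyExpr>(ToyExpr{ToyExpr::Call, 0, N, nullptr, nullptr});
}
ExprPtr bin(ToyExpr::OpKind Op, ExprPtr L, ExprPtr R) {
  return std::make_shared<ToyExpr>(ToyExpr{Op, 0, "", L, R});
}

// Single recursive evaluation routine: classifies the expression and, when it
// is foldable in any way, also produces its value.
EvalResult evaluate(const ToyExpr &E) {
  const EvalResult Unfoldable = {FoldKind::Unfoldable, 0};
  switch (E.Op) {
  case ToyExpr::Literal:
    return {FoldKind::ICE, E.Value};
  case ToyExpr::Variable:
  case ToyExpr::Call:
    return Unfoldable;
  case ToyExpr::Add:
  case ToyExpr::Sub: {
    // GCC-style fold from the text: X - X is 0 even though X is a variable.
    if (E.Op == ToyExpr::Sub && E.LHS->Op == ToyExpr::Variable &&
        E.RHS->Op == ToyExpr::Variable && E.LHS->Name == E.RHS->Name)
      return {FoldKind::FoldedNoSideEffects, 0};
    EvalResult L = evaluate(*E.LHS), R = evaluate(*E.RHS);
    if (L.Kind == FoldKind::Unfoldable || R.Kind == FoldKind::Unfoldable)
      return Unfoldable;
    long V = E.Op == ToyExpr::Add ? L.Value + R.Value : L.Value - R.Value;
    return {std::max(L.Kind, R.Kind), V};
  }
  case ToyExpr::LogicalOr: {
    EvalResult L = evaluate(*E.LHS), R = evaluate(*E.RHS);
    if (L.Kind != FoldKind::Unfoldable && R.Kind != FoldKind::Unfoldable)
      return {std::max(L.Kind, R.Kind), (L.Value || R.Value) ? 1L : 0L};
    // A known-true left operand short-circuits; the right side never runs.
    if (L.Kind != FoldKind::Unfoldable && L.Value != 0)
      return {std::max(L.Kind, FoldKind::FoldedNoSideEffects), 1};
    // "foo() || 1": the value is known, but the call must still run.
    if (E.LHS->Op == ToyExpr::Call && R.Kind != FoldKind::Unfoldable && R.Value != 0)
      return {FoldKind::FoldedWithSideEffects, 1};
    return Unfoldable;
  }
  }
  return Unfoldable;
}

// Follows the convention described above: returns true on error and false if
// the expression is acceptable. ExtensionDiagnosed stands in for emitting the
// EXTENSION diagnostic when the expression folds but is not an i-c-e.
bool verifyIntegerConstantExpression(const ToyExpr &E, bool &ExtensionDiagnosed) {
  EvalResult R = evaluate(E);
  ExtensionDiagnosed = false;
  if (R.Kind == FoldKind::Unfoldable)
    return true;
  ExtensionDiagnosed = R.Kind != FoldKind::ICE;
  return false;
}
```

+A code generator built on this toy would accept any result other than
+``Unfoldable``, whereas the verifier distinguishes a true i-c-e from a value
+that merely happens to fold.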
+
+Extensions
+^^^^^^^^^^
+
+This section describes how some of the various extensions Clang supports
+interact with constant evaluation:
+
+* ``__extension__``: The expression form of this extension causes any
+ evaluatable subexpression to be accepted as an integer constant expression.
+* ``__builtin_constant_p``: This returns true (as an integer constant
+ expression) if the operand evaluates to either a numeric value (that is, not
+ a pointer cast to integral type) of integral, enumeration, floating or
+ complex type, or if it evaluates to the address of the first character of a
+ string literal (possibly cast to some other type). As a special case, if
+ ``__builtin_constant_p`` is the (potentially parenthesized) condition of a
+ conditional operator expression ("``?:``"), only the true side of the
+ conditional operator is considered, and it is evaluated with full constant
+ folding.
+* ``__builtin_choose_expr``: The condition is required to be an integer
+ constant expression, but we accept any constant as an "extension of an
+ extension". This only evaluates one operand depending on which way the
+ condition evaluates.
+* ``__builtin_classify_type``: This always returns an integer constant
+ expression.
+* ``__builtin_inf, nan, ...``: These are treated just like a floating-point
+ literal.
+* ``__builtin_abs, copysign, ...``: These are constant folded as general
+ constant expressions.
+* ``__builtin_strlen`` and ``strlen``: These are constant folded as integer
+ constant expressions if the argument is a string literal.
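+
+A few of these builtins can be exercised in constant contexts. The checks below
+are a sketch that assumes a GCC- or Clang-compatible compiler in C++11 mode or
+later; ``HelloLength`` is a name chosen for illustration, and how a particular
+compiler version classifies each builtin may differ from the design described
+here.

```cpp
#include <cassert>

// __builtin_strlen of a string literal folds to a constant.
static_assert(__builtin_strlen("hello") == 5, "strlen of a literal should fold");

// __builtin_constant_p is true for an operand that evaluates to a number.
static_assert(__builtin_constant_p(5 + 4) == 1, "5 + 4 should be constant");

// __builtin_inf() behaves like a floating-point literal.
static_assert(__builtin_inf() > 1.0e308, "infinity exceeds this finite double");

// __builtin_fabs of a literal folds as a general constant expression.
static_assert(__builtin_fabs(-2.0) == 2.0, "fabs of a literal should fold");

// The folded result can initialize a compile-time constant.
constexpr unsigned long HelloLength = __builtin_strlen("hello");
```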
+
+How to change Clang
+===================
+
+How to add an attribute
+-----------------------
+
+Attribute Basics
+^^^^^^^^^^^^^^^^
+
+Attributes in clang come in two forms: parsed form, and semantic form. Both
+forms are represented via a tablegen definition of the attribute, specified in
+Attr.td.
+
+
+``include/clang/Basic/Attr.td``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+First, add your attribute to the `include/clang/Basic/Attr.td
+<http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Attr.td?view=markup>`_
+file.
+
+Each attribute gets a ``def`` inheriting from ``Attr`` or one of its
+subclasses. ``InheritableAttr`` means that the attribute also applies to
+subsequent declarations of the same name. ``InheritableParamAttr`` is similar
+to ``InheritableAttr``, except that the attribute is written on a parameter
+instead of a declaration, type or statement. Attributes inheriting from
+``TypeAttr`` are pure type attributes which generally are not given a
+representation in the AST. Attributes inheriting from ``TargetSpecificAttr``
+are attributes specific to one or more target architectures. An attribute that
+inherits from ``IgnoredAttr`` is parsed, but will generate an ignored attribute
+diagnostic when used. The attribute type may be useful when an attribute is
+supported by another vendor, but not supported by clang.
+
+``Spellings`` lists the strings that can appear in ``__attribute__((here))`` or
+``[[here]]``. All such strings will be synonymous. Possible ``Spellings``
+are: ``GNU`` (for use with GNU-style __attribute__ spellings), ``Declspec``
+(for use with Microsoft Visual Studio-style __declspec spellings), ``CXX11``
+(for use with C++11-style [[foo]] and [[foo::bar]] spellings), and ``Keyword``
+(for use with attributes that are implemented as keywords, like C++11's
+``override`` or ``final``). If you want to allow the ``[[]]`` C++11 syntax, you
+have to define a list of ``Namespaces``, which will let users write
+``[[namespace::spelling]]``. Using the empty string for a namespace will allow
+users to write just the spelling with no "``::``". Attributes which g++-4.8
+or later accepts should also have a ``CXX11<"gnu", "spelling">`` spelling.
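+
+From the user's side, two spellings of one attribute look as follows. This
+sketch uses the existing ``unused`` attribute and assumes a GCC- or
+Clang-compatible compiler in C++11 mode or later; the function names are
+hypothetical.

```cpp
#include <cassert>

// GNU-style spelling, written inside __attribute__((...)).
__attribute__((unused)) static int gnuSpelled() { return 1; }

// C++11-style spelling in the "gnu" namespace, naming the same attribute.
[[gnu::unused]] static int cxx11Spelled() { return 2; }
```

+Both declarations carry the same attribute; only the syntax used to write it
+differs.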
+
+``Subjects`` restricts the kinds of AST node to which this attribute can
+appertain (roughly, attach). The subjects are specified via a ``SubjectList``,
+which specifies the list of subjects. Additionally, subject-related diagnostics
+can be specified to be warnings or errors, with the default being a warning.
+The diagnostics displayed to the user are automatically determined based on
+the subjects in the list, but a custom diagnostic parameter can also be
+specified in the ``SubjectList``. The diagnostics generated for subject list
+violations are either ``diag::warn_attribute_wrong_decl_type`` or
+``diag::err_attribute_wrong_decl_type``, and the parameter enumeration is
+found in `include/clang/Sema/AttributeList.h
+<http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Sema/AttributeList.h?view=markup>`_.
+If you add new Decl nodes to the ``SubjectList``, you may need to update the
+logic used to automatically determine the diagnostic parameter in `utils/TableGen/ClangAttrEmitter.cpp
+<http://llvm.org/viewvc/llvm-project/cfe/trunk/utils/TableGen/ClangAttrEmitter.cpp?view=markup>`_.
+
+Diagnostic checking for attribute subject lists is automated except when
+``HasCustomParsing`` is set to ``1``.
+
+By default, all subjects in the SubjectList must either be a Decl node defined
+in ``DeclNodes.td``, or a statement node defined in ``StmtNodes.td``. However,
+more complex subjects can be created by creating a ``SubsetSubject`` object.
+Each such object has a base subject which it appertains to (which must be a
+Decl or Stmt node, and not a SubsetSubject node), and some custom code which is
+called when determining whether an attribute appertains to the subject. For
+instance, a ``NonBitField`` SubsetSubject appertains to a ``FieldDecl``, and
+tests that the given FieldDecl is not a bit field. When a SubsetSubject is
+specified in a SubjectList, a custom diagnostic parameter must also be provided.
+
+``Args`` names the arguments the attribute takes, in order. If ``Args`` is
+``[StringArgument<"Arg1">, IntArgument<"Arg2">]`` then
+``__attribute__((myattribute("Hello", 3)))`` will be a valid use. Attribute
+arguments specify both the parsed form and the semantic form of the attribute.
+The previous example shows an attribute which requires two arguments while
+parsing, and the Attr subclass' constructor for the attribute will require a
+string and integer argument.
+
+Diagnostic checking for argument counts is automated except when
+``HasCustomParsing`` is set to ``1``, or when the attribute uses an optional or
+variadic argument. Diagnostic checking for argument semantics is not automated.
+
+If the parsed form of the attribute is more complex, or differs from the
+semantic form, the ``HasCustomParsing`` bit can be set to ``1`` for the class,
+and the parsing code in `Parser::ParseGNUAttributeArgs
+<http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Parse/ParseDecl.cpp?view=markup>`_
+can be updated for the special case. Note that this only applies to arguments
+with a GNU spelling -- attributes with a __declspec spelling currently ignore
+this flag and are handled by ``Parser::ParseMicrosoftDeclSpec``.
+
+Custom accessors can be generated for an attribute based on the spelling list
+for that attribute. For instance, if an attribute has two different spellings:
+'Foo' and 'Bar', accessors can be created:
+``[Accessor<"isFoo", [GNU<"Foo">]>, Accessor<"isBar", [GNU<"Bar">]>]``
+These accessors will be generated on the semantic form of the attribute,
+accepting no arguments and returning a Boolean.
+
+Attributes which do not require an AST node should set the ``ASTNode`` field to
+``0`` to avoid polluting the AST. Note that anything inheriting from
+``TypeAttr`` or ``IgnoredAttr`` automatically does not generate an AST node. All
+other attributes generate an AST node by default. The AST node is the semantic
+representation of the attribute.
+
+Attributes which do not require custom semantic handling should set the
+``SemaHandler`` field to ``0``. Note that anything inheriting from
+``IgnoredAttr`` automatically does not get a semantic handler. All other
+attributes are assumed to use a semantic handler by default. Attributes
+without a semantic handler are not given a parsed attribute Kind enumeration.
+
+The ``LangOpts`` field can be used to specify a list of language options
+required by the attribute. For instance, all of the CUDA-specific attributes
+specify ``[CUDA]`` for the ``LangOpts`` field, and when the CUDA language
+option is not enabled, an "attribute ignored" warning diagnostic is emitted.
+Since language options are not table generated nodes, new language options must
+be created manually and should specify the spelling used by the ``LangOptions`` class.
+
+Target-specific attributes sometimes share a spelling with other attributes in
+different targets. For instance, the ARM and MSP430 targets both have an
+attribute spelled ``GNU<"interrupt">``, but with different parsing and semantic
+requirements. To support this feature, an attribute inheriting from
+``TargetSpecificAttr`` may specify a ``ParseKind`` field. This field
+should be the same value between all attributes sharing a spelling, and
+corresponds to the parsed attribute's Kind enumeration. This allows attributes
+to share a parsed attribute kind, but have distinct semantic attribute classes.
+For instance, ``AttributeList::AT_Interrupt`` is the shared parsed attribute
+kind, but ARMInterruptAttr and MSP430InterruptAttr are the semantic attributes
+generated.
+
+By default, when declarations are merging attributes, an attribute will not be
+duplicated. However, if an attribute can be duplicated during this merging
+stage, set ``DuplicatesAllowedWhileMerging`` to ``1``, and the attribute will
+be merged.
+
+By default, attribute arguments are parsed in an evaluated context. If the
+arguments for an attribute should be parsed in an unevaluated context (akin to
+the way the argument to a ``sizeof`` expression is parsed), you can set
+``ParseArgumentsAsUnevaluated`` to ``1``.
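+
+As a reminder of what "unevaluated" means for ``sizeof``, the following sketch
+shows that the operand is type-checked but never executed. ``bump``,
+``Evaluations`` and ``sizeOfResult`` are hypothetical names used only for this
+illustration.

```cpp
#include <cassert>
#include <cstddef>

// Counts how many times bump() has actually run.
static int Evaluations = 0;

int bump() {
  ++Evaluations;
  return 0;
}

// The operand of sizeof is an unevaluated context: only the type of the call
// expression matters, so bump() is never called here.
std::size_t sizeOfResult() { return sizeof(bump()); }
```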
+
+If additional functionality is desired for the semantic form of the attribute,
+the ``AdditionalMembers`` field specifies code to be copied verbatim into the
+semantic attribute class object.
+
+All attributes must have one or more forms of documentation, which is provided
+in the ``Documentation`` list. Generally, the documentation for an attribute
+is a stand-alone definition in `include/clang/Basic/AttrDocs.td
+<http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/AttrDocs.td?view=markup>`_
+that is named after the attribute being documented. Each documentation element
+is given a ``Category`` (variable, function, or type) and ``Content``. A single
+attribute may contain multiple documentation elements for distinct categories.
+For instance, an attribute which can appertain to both functions and types (such
+as a calling convention attribute), should contain two documentation elements.
+The ``Content`` for an attribute uses reStructuredText (RST) syntax.
+
+If an attribute is used internally by the compiler, but is not written by users
+(such as attributes with an empty spelling list), it can use the
+``Undocumented`` documentation element.
+
+Boilerplate
+^^^^^^^^^^^
+
+All semantic processing of declaration attributes happens in `lib/Sema/SemaDeclAttr.cpp
+<http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaDeclAttr.cpp?view=markup>`_,
+and generally starts in the ``ProcessDeclAttribute`` function. If your
+attribute is a "simple" attribute -- meaning that it requires no custom
+semantic processing aside from what is automatically provided for you -- you can
+add a call to ``handleSimpleAttribute<YourAttr>(S, D, Attr);`` to the switch
+statement. Otherwise, write a new ``handleYourAttr()`` function, and add that
+to the switch statement.
+
+If your attribute causes extra warnings to fire, define a ``DiagGroup`` in
+`include/clang/Basic/DiagnosticGroups.td
+<http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticGroups.td?view=markup>`_
+named after the attribute's ``Spelling`` with "_"s replaced by "-"s. If you're
+only defining one diagnostic, you can skip ``DiagnosticGroups.td`` and use
+``InGroup<DiagGroup<"your-attribute">>`` directly in `DiagnosticSemaKinds.td
+<http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?view=markup>`_.
+
+All semantic diagnostics generated for your attribute, including automatically-
+generated ones (such as subjects and argument counts), should have a
+corresponding test case.
+
+The meat of your attribute
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Find an appropriate place in Clang to do whatever your attribute needs to do.
+Check for the attribute's presence using ``Decl::getAttr<YourAttr>()``.
+
+Update the :doc:`LanguageExtensions` document to describe your new attribute.
+
+How to add an expression or statement
+-------------------------------------
+
+Expressions and statements are among the most fundamental constructs within a
+compiler, because they interact with many different parts of the AST, semantic
+analysis, and IR generation. Therefore, adding a new expression or statement
+kind into Clang requires some care. The following list details the various
+places in Clang where an expression or statement needs to be introduced, along
+with patterns to follow to ensure that the new expression or statement works
+well across all of the C languages. We focus on expressions, but statements
+are similar.
+
+#. Introduce parsing actions into the parser. Recursive-descent parsing is
+ mostly self-explanatory, but there are a few things that are worth keeping
+ in mind:
+
+ * Keep as much source location information as possible! You'll want it later
+ to produce great diagnostics and support Clang's various features that map
+ between source code and the AST.
+ * Write tests for all of the "bad" parsing cases, to make sure your recovery
+ is good. If you have matched delimiters (e.g., parentheses, square
+ brackets, etc.), use ``Parser::BalancedDelimiterTracker`` to give nice
+ diagnostics when things go wrong.
+
+#. Introduce semantic analysis actions into ``Sema``. Semantic analysis should
+ always involve two functions: an ``ActOnXXX`` function that will be called
+ directly from the parser, and a ``BuildXXX`` function that performs the
+ actual semantic analysis and will (eventually!) build the AST node. It's
+ fairly common for the ``ActOnXXX`` function to do very little (often just
+ some minor translation from the parser's representation to ``Sema``'s
+ representation of the same thing), but the separation is still important:
+ C++ template instantiation, for example, should always call the ``BuildXXX``
+ variant. Several notes on semantic analysis before we get into construction
+ of the AST:
+
+ * Your expression probably involves some types and some subexpressions.
+ Make sure to fully check that those types, and the types of those
+ subexpressions, meet your expectations. Add implicit conversions where
+ necessary to make sure that all of the types line up exactly the way you
+ want them. Write extensive tests to check that you're getting good
+ diagnostics for mistakes and that you can use various forms of
+ subexpressions with your expression.
+ * When type-checking a type or subexpression, make sure to first check
+ whether the type is "dependent" (``Type::isDependentType()``) or whether a
+ subexpression is type-dependent (``Expr::isTypeDependent()``). If any of
+ these return ``true``, then you're inside a template and you can't do much
+ type-checking now. That's normal, and your AST node (when you get there)
+ will have to deal with this case. At this point, you can write tests that
+ use your expression within templates, but don't try to instantiate the
+ templates.
+ * For each subexpression, be sure to call ``Sema::CheckPlaceholderExpr()``
+ to deal with "weird" expressions that don't behave well as subexpressions.
+ Then, determine whether you need to perform lvalue-to-rvalue conversions
+ (``Sema::DefaultLvalueConversions``) or the usual unary conversions
+ (``Sema::UsualUnaryConversions``), for places where the subexpression is
+ producing a value you intend to use.
+ * Your ``BuildXXX`` function will probably just return ``ExprError()`` at
+ this point, since you don't have an AST. That's perfectly fine, and
+ shouldn't impact your testing.
+
+#. Introduce an AST node for your new expression. This starts with declaring
+ the node in ``include/clang/Basic/StmtNodes.td`` and creating a new class for
+ your expression in the appropriate ``include/clang/AST/Expr*.h`` header. It's best to
+ look at the class for a similar expression to get ideas, and there are some
+ specific things to watch for:
+
+ * If you need to allocate memory, use the ``ASTContext`` allocator to
+ allocate memory. Never use raw ``malloc`` or ``new``, and never hold any
+ resources in an AST node, because the destructor of an AST node is never
+ called.
+ * Make sure that ``getSourceRange()`` covers the exact source range of your
+ expression. This is needed for diagnostics and for IDE support.
+ * Make sure that ``children()`` visits all of the subexpressions. This is
+ important for a number of features (e.g., IDE support, C++ variadic
+ templates). If you have sub-types, you'll also need to visit those
+ sub-types in ``RecursiveASTVisitor`` and ``DataRecursiveASTVisitor``.
+ * Add printing support (``StmtPrinter.cpp``) for your expression.
+ * Add profiling support (``StmtProfile.cpp``) for your AST node, noting the
+ distinguishing (non-source location) characteristics of an instance of
+ your expression. Omitting this step will lead to hard-to-diagnose
+ failures regarding matching of template declarations.
+ * Add serialization support (``ASTReaderStmt.cpp``, ``ASTWriterStmt.cpp``)
+ for your AST node.
+
+#. Teach semantic analysis to build your AST node. At this point, you can wire
+ up your ``Sema::BuildXXX`` function to actually create your AST. A few
+ things to check at this point:
+
+ * If your expression can construct a new C++ class or return a new
+ Objective-C object, be sure to update and then call
+ ``Sema::MaybeBindToTemporary`` for your just-created AST node to be sure
+ that the object gets properly destructed. An easy way to test this is to
+ return a C++ class with a private destructor: semantic analysis should
+ flag an error here with the attempt to call the destructor.
+ * Inspect the generated AST by printing it using ``clang -cc1 -ast-print``,
+ to make sure you're capturing all of the important information about how
+ the AST was written.
+ * Inspect the generated AST under ``clang -cc1 -ast-dump`` to verify that
+ all of the types in the generated AST line up the way you want them.
+ Remember that clients of the AST should never have to "think" to
+ understand what's going on. For example, all implicit conversions should
+ show up explicitly in the AST.
+ * Write tests that use your expression as a subexpression of other,
+ well-known expressions. Can you call a function using your expression as
+ an argument? Can you use the ternary operator?
+
+#. Teach code generation to create IR for your AST node. This step is the first
+ (and only) step that requires knowledge of LLVM IR. There are several things to
+ keep in mind:
+
+ * Code generation is separated into scalar/aggregate/complex and
+ lvalue/rvalue paths, depending on what kind of result your expression
+ produces. On occasion, this requires some careful factoring of code to
+ avoid duplication.
+ * ``CodeGenFunction`` contains functions ``ConvertType`` and
+ ``ConvertTypeForMem`` that convert Clang's types (``clang::Type*`` or
+ ``clang::QualType``) to LLVM types. Use the former for values, and the
+ latter for memory locations: test with the C++ "``bool``" type to check
+ this. If you find that you are having to use LLVM bitcasts to make the
+ subexpressions of your expression have the type that your expression
+ expects, STOP! Go fix semantic analysis and the AST so that you don't
+ need these bitcasts.
+ * The ``CodeGenFunction`` class has a number of helper functions to make
+ certain operations easy, such as generating code to produce an lvalue or
+ an rvalue, or to initialize a memory location with a given value. Prefer
+ to use these functions rather than directly writing loads and stores,
+ because these functions take care of some of the tricky details for you
+ (e.g., for exceptions).
+ * If your expression requires some special behavior in the event of an
+ exception, look at the ``push*Cleanup`` functions in ``CodeGenFunction``
+ to introduce a cleanup. You shouldn't have to deal with
+ exception-handling directly.
+ * Testing is extremely important in IR generation. Use ``clang -cc1
+ -emit-llvm`` and `FileCheck
+ <http://llvm.org/docs/CommandGuide/FileCheck.html>`_ to verify that you're
+ generating the right IR.
+
+#. Teach template instantiation how to cope with your AST node, which requires
+ some fairly simple code:
+
+ * Make sure that your expression's constructor properly computes the flags
+ for type dependence (i.e., the type your expression produces can change
+ from one instantiation to the next), value dependence (i.e., the constant
+ value your expression produces can change from one instantiation to the
+ next), instantiation dependence (i.e., a template parameter occurs
+ anywhere in your expression), and whether your expression contains a
+ parameter pack (for variadic templates). Often, computing these flags
+ just means combining the results from the various types and
+ subexpressions.
+ * Add ``TransformXXX`` and ``RebuildXXX`` functions to the ``TreeTransform``
+ class template in ``Sema``. ``TransformXXX`` should (recursively)
+ transform all of the subexpressions and types within your expression,
+ using ``getDerived().TransformYYY``. If all of the subexpressions and
+ types transform without error, it will then call the ``RebuildXXX``
+ function, which will in turn call ``getSema().BuildXXX`` to perform
+ semantic analysis and build your expression.
+ * To test template instantiation, take those tests you wrote to make sure
+ that you were type checking with type-dependent expressions and dependent
+ types (from step #2) and instantiate those templates with various types,
+ some of which type-check and some that don't, and test the error messages
+ in each case.
+
+#. There are some "extras" that make other features work better. It's worth
+ handling these extras to give your expression complete integration into
+ Clang:
+
+ * Add code completion support for your expression in
+ ``SemaCodeComplete.cpp``.
+ * If your expression has types in it, or has any "interesting" features
+ other than subexpressions, extend libclang's ``CursorVisitor`` to provide
+ proper visitation for your expression, enabling various IDE features such
+ as syntax highlighting, cross-referencing, and so on. The
+ ``c-index-test`` helper program can be used to test these features.
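+
+As a rough sketch of the IR-generation testing recommended in the code
+generation step above, a test file might pair a ``RUN`` line with loose
+``CHECK`` patterns. The function ``add``, the target triple and the use of the
+``%clang_cc1`` lit substitution are illustrative assumptions rather than part
+of an existing test, and the exact IR text can differ between Clang versions,
+so the patterns are deliberately permissive:
+
+.. code-block:: c++
+
+  // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -o - %s | FileCheck %s
+
+  // Hypothetical function under test. extern "C" avoids name mangling, which
+  // keeps the CHECK-LABEL pattern simple.
+  extern "C" int add(int a, int b) {
+    // CHECK-LABEL: define {{.*}}@add(
+    // CHECK: add nsw i32
+    // CHECK: ret i32
+    return a + b;
+  }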
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/IntroductionToTheClangAST.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/IntroductionToTheClangAST.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/IntroductionToTheClangAST.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/IntroductionToTheClangAST.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,126 @@
+=============================
+Introduction to the Clang AST
+=============================
+
+This document gives a gentle introduction to the mysteries of the Clang
+AST. It is targeted at developers who either want to contribute to
+Clang, or use tools that work based on Clang's AST, like the AST
+matchers.
+
+.. raw:: html
+
+ <center><iframe width="560" height="315" src="http://www.youtube.com/embed/VqCkCDFLSsc?vq=hd720" frameborder="0" allowfullscreen></iframe></center>
+
+`Slides <http://llvm.org/devmtg/2013-04/klimek-slides.pdf>`_
+
+Introduction
+============
+
+Clang's AST is different from ASTs produced by some other compilers in
+that it closely resembles both the written C++ code and the C++
+standard. For example, parenthesis expressions and compile time
+constants are available in an unreduced form in the AST. This makes
+Clang's AST a good fit for refactoring tools.
+
+Documentation for all Clang AST nodes is available via the generated
+`Doxygen <http://clang.llvm.org/doxygen>`_. The online Doxygen
+documentation is also indexed by search engines, so searching for
+"clang" plus the AST node's class name usually turns up the Doxygen
+page of the class you're looking for (for example, search for:
+clang ParenExpr).
+
+Examining the AST
+=================
+
+A good way to familiarize yourself with the Clang AST is to actually look
+at it on some simple example code. Clang has a builtin AST-dump mode,
+which can be enabled with the flag ``-ast-dump``.
+
+Let's look at a simple example AST:
+
+::
+
+ $ cat test.cc
+ int f(int x) {
+   int result = (x / 42);
+   return result;
+ }
+
+ # Clang by default is a frontend for many tools; -Xclang is used to pass
+ # options directly to the C++ frontend.
+ $ clang -Xclang -ast-dump -fsyntax-only test.cc
+ TranslationUnitDecl 0x5aea0d0 <<invalid sloc>>
+ ... cutting out internal declarations of clang ...
+ `-FunctionDecl 0x5aeab50 <test.cc:1:1, line:4:1> f 'int (int)'
+   |-ParmVarDecl 0x5aeaa90 <line:1:7, col:11> x 'int'
+   `-CompoundStmt 0x5aead88 <col:14, line:4:1>
+     |-DeclStmt 0x5aead10 <line:2:3, col:24>
+     | `-VarDecl 0x5aeac10 <col:3, col:23> result 'int'
+     |   `-ParenExpr 0x5aeacf0 <col:16, col:23> 'int'
+     |     `-BinaryOperator 0x5aeacc8 <col:17, col:21> 'int' '/'
+     |       |-ImplicitCastExpr 0x5aeacb0 <col:17> 'int' <LValueToRValue>
+     |       | `-DeclRefExpr 0x5aeac68 <col:17> 'int' lvalue ParmVar 0x5aeaa90 'x' 'int'
+     |       `-IntegerLiteral 0x5aeac90 <col:21> 'int' 42
+     `-ReturnStmt 0x5aead68 <line:3:3, col:10>
+       `-ImplicitCastExpr 0x5aead50 <col:10> 'int' <LValueToRValue>
+         `-DeclRefExpr 0x5aead28 <col:10> 'int' lvalue Var 0x5aeac10 'result' 'int'
+
+The top-level declaration in
+a translation unit is always the `translation unit
+declaration <http://clang.llvm.org/doxygen/classclang_1_1TranslationUnitDecl.html>`_.
+In this example, our first user-written declaration is the `function
+declaration <http://clang.llvm.org/doxygen/classclang_1_1FunctionDecl.html>`_
+of "``f``". The body of "``f``" is a `compound
+statement <http://clang.llvm.org/doxygen/classclang_1_1CompoundStmt.html>`_,
+whose child nodes are a `declaration
+statement <http://clang.llvm.org/doxygen/classclang_1_1DeclStmt.html>`_
+that declares our result variable, and the `return
+statement <http://clang.llvm.org/doxygen/classclang_1_1ReturnStmt.html>`_.
+
+AST Context
+===========
+
+All information about the AST for a translation unit is bundled up in
+the class
+`ASTContext <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html>`_.
+It allows traversal of the whole translation unit starting from
+`getTranslationUnitDecl <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#abd909fb01ef10cfd0244832a67b1dd64>`_,
+or to access Clang's `table of
+identifiers <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#a4f95adb9958e22fbe55212ae6482feb4>`_
+for the parsed translation unit.
+
+AST Nodes
+=========
+
+Clang's AST nodes are modeled on a class hierarchy that does not have a
+common ancestor. Instead, there are multiple larger hierarchies for
+basic node types like
+`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_ and
+`Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_. Many
+important AST nodes derive from
+`Type <http://clang.llvm.org/doxygen/classclang_1_1Type.html>`_,
+`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_,
+`DeclContext <http://clang.llvm.org/doxygen/classclang_1_1DeclContext.html>`_
+or `Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_, with
+some classes deriving from both Decl and DeclContext.
+
+There are also a multitude of nodes in the AST that are not part of a
+larger hierarchy, and are only reachable from specific other nodes, like
+`CXXBaseSpecifier <http://clang.llvm.org/doxygen/classclang_1_1CXXBaseSpecifier.html>`_.
+
+Thus, to traverse the full AST, one starts from the
+`TranslationUnitDecl <http://clang.llvm.org/doxygen/classclang_1_1TranslationUnitDecl.html>`_
+and then recursively traverses everything that can be reached from that
+node - this information has to be encoded for each specific node type.
+This algorithm is encoded in the
+`RecursiveASTVisitor <http://clang.llvm.org/doxygen/classclang_1_1RecursiveASTVisitor.html>`_.
+See the `RecursiveASTVisitor
+tutorial <http://clang.llvm.org/docs/RAVFrontendAction.html>`_.
+
+The two most basic nodes in the Clang AST are statements
+(`Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_) and
+declarations
+(`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_). Note
+that expressions
+(`Expr <http://clang.llvm.org/doxygen/classclang_1_1Expr.html>`_) are
+also statements in Clang's AST.
Added: www-releases/trunk/3.5.1/tools/clang/docs/JSONCompilationDatabase.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/JSONCompilationDatabase.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/JSONCompilationDatabase.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/JSONCompilationDatabase.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,88 @@
+==============================================
+JSON Compilation Database Format Specification
+==============================================
+
+This document describes a format for specifying how to replay single
+compilations independently of the build system.
+
+Background
+==========
+
+Tools based on the C++ Abstract Syntax Tree need full information on how to
+parse a translation unit. Usually this information is implicitly
+available in the build system, but running tools as part of the build
+system is not necessarily the best solution:
+
+- Build systems are inherently change driven, so running multiple tools
+ over the same code base without changing the code does not fit into
+ the architecture of many build systems.
+- Figuring out whether things have changed is often an IO bound
+ process; this makes it hard to build low latency end user tools based
+ on the build system.
+- Build systems are inherently sequential in the build graph, for
+ example due to generated source code. While tools that run
+ independently of the build still need the generated source code to
+ exist, running tools multiple times over unchanging source does not
+ require serialization of the runs according to the build dependency
+ graph.
+
+Supported Systems
+=================
+
+Currently `CMake <http://cmake.org>`_ (since 2.8.5) supports generation
+of compilation databases for Unix Makefile builds (Ninja builds in the
+works) with the option ``CMAKE_EXPORT_COMPILE_COMMANDS``.
+
+For projects on Linux, an alternative is to intercept compiler
+calls with a tool called `Bear <https://github.com/rizsotto/Bear>`_.
+
+Clang's tooling interface supports reading compilation databases; see
+the :doc:`LibTooling documentation <LibTooling>`. libclang and its
+python bindings also support this (since clang 3.2); see
+`CXCompilationDatabase.h </doxygen/group__COMPILATIONDB.html>`_.
+
+Format
+======
+
+A compilation database is a JSON file that consists of an array of
+"command objects", where each command object specifies one way a
+translation unit is compiled in the project.
+
+Each command object contains the translation unit's main file, the
+working directory of the compile run and the actual compile command.
+
+Example:
+
+::
+
+ [
+ { "directory": "/home/user/llvm/build",
+ "command": "/usr/bin/clang++ -Irelative -DSOMEDEF=\"With spaces, quotes and \\-es.\" -c -o file.o file.cc",
+ "file": "file.cc" },
+ ...
+ ]
+
+The contracts for each field in the command object are:
+
+- **directory:** The working directory of the compilation. All paths
+ specified in the **command** or **file** fields must be either
+ absolute or relative to this directory.
+- **file:** The main translation unit source processed by this
+ compilation step. This is used by tools as the key into the
+ compilation database. There can be multiple command objects for the
+ same file, for example if the same source file is compiled with
+ different configurations.
+- **command:** The compile command executed. After JSON unescaping,
+ this must be a valid command to rerun the exact compilation step for
+ the translation unit in the environment the build system uses.
+ Parameters use shell quoting and shell escaping of quotes, with '``"``'
+ and '``\``' being the only special characters. Shell expansion is not
+ supported.
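+
+The quoting rules for **command** mean that each argument is escaped twice:
+once with shell-style quoting, and once more for the JSON string. The
+following is a minimal sketch of that layering. The helper names
+(``shellQuote``, ``jsonEscape``, ``makeCommandObject``) are hypothetical and
+not part of any Clang API, and wrapping a whole argument in double quotes is
+just one of several valid ways to quote it:
+
+.. code-block:: c++
+
+  #include <string>
+  #include <vector>
+
+  // Shell-style quoting as described for "command": wrap the argument in
+  // double quotes and backslash-escape '"' and '\'. Arguments without
+  // spaces or special characters are passed through unchanged.
+  std::string shellQuote(const std::string &Arg) {
+    if (!Arg.empty() && Arg.find_first_of(" \"\\") == std::string::npos)
+      return Arg;
+    std::string Out = "\"";
+    for (std::string::size_type I = 0; I != Arg.size(); ++I) {
+      if (Arg[I] == '"' || Arg[I] == '\\')
+        Out += '\\';
+      Out += Arg[I];
+    }
+    return Out + "\"";
+  }
+
+  // Escape '"' and '\' for a JSON string literal. Control characters are
+  // not handled in this sketch.
+  std::string jsonEscape(const std::string &S) {
+    std::string Out;
+    for (std::string::size_type I = 0; I != S.size(); ++I) {
+      if (S[I] == '"' || S[I] == '\\')
+        Out += '\\';
+      Out += S[I];
+    }
+    return Out;
+  }
+
+  // Build one command object from a working directory, a main file and an
+  // argument list (hypothetical helper).
+  std::string makeCommandObject(const std::string &Directory,
+                                const std::string &File,
+                                const std::vector<std::string> &Args) {
+    std::string Command;
+    for (std::vector<std::string>::size_type I = 0; I != Args.size(); ++I) {
+      if (I != 0)
+        Command += ' ';
+      Command += shellQuote(Args[I]);
+    }
+    std::string Out = "{ \"directory\": \"" + jsonEscape(Directory) + "\",\n";
+    Out += "  \"command\": \"" + jsonEscape(Command) + "\",\n";
+    Out += "  \"file\": \"" + jsonEscape(File) + "\" }";
+    return Out;
+  }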
+
+Build System Integration
+========================
+
+The convention is to name the file compile\_commands.json and put it at
+the top of the build directory. Clang tools are pointed to the top of
+the build directory to detect the file and use the compilation database
+to parse C++ code in the source tree.
Added: www-releases/trunk/3.5.1/tools/clang/docs/LanguageExtensions.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/LanguageExtensions.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/LanguageExtensions.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/LanguageExtensions.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,1847 @@
+=========================
+Clang Language Extensions
+=========================
+
+.. contents::
+ :local:
+ :depth: 1
+
+.. toctree::
+ :hidden:
+
+ ObjectiveCLiterals
+ BlockLanguageSpec
+ Block-ABI-Apple
+ AutomaticReferenceCounting
+
+Introduction
+============
+
+This document describes the language extensions provided by Clang. In addition
+to the language extensions listed here, Clang aims to support a broad range of
+GCC extensions. Please see the `GCC manual
+<http://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html>`_ for more information on
+these extensions.
+
+.. _langext-feature_check:
+
+Feature Checking Macros
+=======================
+
+Language extensions can be very useful, but only if you know you can depend on
+them. In order to allow fine-grained feature checks, we support several builtin
+function-like macros. This allows you to directly test for a feature in your
+code without having to resort to something like autoconf or fragile "compiler
+version checks".
+
+``__has_builtin``
+-----------------
+
+This function-like macro takes a single identifier argument that is the name of
+a builtin function. It evaluates to 1 if the builtin is supported or 0 if not.
+It can be used like this:
+
+.. code-block:: c++
+
+ #ifndef __has_builtin // Optional of course.
+ #define __has_builtin(x) 0 // Compatibility with non-clang compilers.
+ #endif
+
+ ...
+ #if __has_builtin(__builtin_trap)
+ __builtin_trap();
+ #else
+ abort();
+ #endif
+ ...
+
+.. _langext-__has_feature-__has_extension:
+
+``__has_feature`` and ``__has_extension``
+-----------------------------------------
+
+These function-like macros take a single identifier argument that is the name
+of a feature. ``__has_feature`` evaluates to 1 if the feature is both
+supported by Clang and standardized in the current language standard or 0 if
+not (but see :ref:`below <langext-has-feature-back-compat>`), while
+``__has_extension`` evaluates to 1 if the feature is supported by Clang in the
+current language (either as a language extension or a standard language
+feature) or 0 if not. They can be used like this:
+
+.. code-block:: c++
+
+ #ifndef __has_feature // Optional of course.
+ #define __has_feature(x) 0 // Compatibility with non-clang compilers.
+ #endif
+ #ifndef __has_extension
+ #define __has_extension __has_feature // Compatibility with pre-3.0 compilers.
+ #endif
+
+ ...
+ #if __has_feature(cxx_rvalue_references)
+ // This code will only be compiled with the -std=c++11 and -std=gnu++11
+ // options, because rvalue references are only standardized in C++11.
+ #endif
+
+ #if __has_extension(cxx_rvalue_references)
+ // This code will be compiled with the -std=c++11, -std=gnu++11, -std=c++98
+ // and -std=gnu++98 options, because rvalue references are supported as a
+ // language extension in C++98.
+ #endif
+
+.. _langext-has-feature-back-compat:
+
+For backward compatibility, ``__has_feature`` can also be used to test
+for support for non-standardized features, i.e. features not prefixed ``c_``,
+``cxx_`` or ``objc_``.
+
+Another use of ``__has_feature`` is to check for compiler features not related
+to the language standard, such as :doc:`AddressSanitizer
+<AddressSanitizer>`.
+
+If the ``-pedantic-errors`` option is given, ``__has_extension`` is equivalent
+to ``__has_feature``.
+
+The feature tag is described along with the language feature below.
+
+The feature name or extension name can also be specified with a preceding and
+following ``__`` (double underscore) to avoid interference from a macro with
+the same name. For instance, ``__cxx_rvalue_references__`` can be used instead
+of ``cxx_rvalue_references``.
+
+``__has_attribute``
+-------------------
+
+This function-like macro takes a single identifier argument that is the name of
+an attribute. It evaluates to 1 if the attribute is supported by the current
+compilation target, or 0 if not. It can be used like this:
+
+.. code-block:: c++
+
+ #ifndef __has_attribute // Optional of course.
+ #define __has_attribute(x) 0 // Compatibility with non-clang compilers.
+ #endif
+
+ ...
+ #if __has_attribute(always_inline)
+ #define ALWAYS_INLINE __attribute__((always_inline))
+ #else
+ #define ALWAYS_INLINE
+ #endif
+ ...
+
+The attribute name can also be specified with a preceding and following ``__``
+(double underscore) to avoid interference from a macro with the same name. For
+instance, ``__always_inline__`` can be used instead of ``always_inline``.
+
+``__is_identifier``
+-------------------
+
+This function-like macro takes a single identifier argument that might be either
+a reserved word or a regular identifier. It evaluates to 1 if the argument is just
+a regular identifier and not a reserved word, in the sense that it can then be
+used as the name of a user-defined function or variable. Otherwise it evaluates
+to 0. It can be used like this:
+
+.. code-block:: c++
+
+ ...
+ #ifdef __is_identifier // Compatibility with non-clang compilers.
+ #if __is_identifier(__wchar_t)
+ typedef wchar_t __wchar_t;
+ #endif
+ #endif
+
+ __wchar_t WideCharacter;
+ ...
+
+Include File Checking Macros
+============================
+
+Not all development systems have the same include files. The
+:ref:`langext-__has_include` and :ref:`langext-__has_include_next` macros allow
+you to check for the existence of an include file before doing a possibly
+failing ``#include`` directive. Include file checking macros must be used
+as expressions in ``#if`` or ``#elif`` preprocessing directives.
+
+.. _langext-__has_include:
+
+``__has_include``
+-----------------
+
+This function-like macro takes a single file name string argument that is the
+name of an include file. It evaluates to 1 if the file can be found using the
+include paths, or 0 otherwise:
+
+.. code-block:: c++
+
+ // Note the two possible file name string formats.
+ #if __has_include("myinclude.h") && __has_include(<stdint.h>)
+ # include "myinclude.h"
+ #endif
+
+To test for this feature, use ``#if defined(__has_include)``:
+
+.. code-block:: c++
+
+ // To avoid problems with non-clang compilers not having this macro.
+ #if defined(__has_include)
+ #if __has_include("myinclude.h")
+ # include "myinclude.h"
+ #endif
+ #endif
+
+.. _langext-__has_include_next:
+
+``__has_include_next``
+----------------------
+
+This function-like macro takes a single file name string argument that is the
+name of an include file. It is like ``__has_include`` except that it looks for
+the second instance of the given file found in the include paths. It evaluates
+to 1 if the second instance of the file can be found using the include paths,
+or 0 otherwise:
+
+.. code-block:: c++
+
+ // Note the two possible file name string formats.
+ #if __has_include_next("myinclude.h") && __has_include_next(<stdint.h>)
+ # include_next "myinclude.h"
+ #endif
+
+ // To avoid problems with non-clang compilers not having this macro.
+ #if defined(__has_include_next)
+ #if __has_include_next("myinclude.h")
+ # include_next "myinclude.h"
+ #endif
+ #endif
+
+Note that ``__has_include_next``, like the GNU extension ``#include_next``
+directive, is intended for use in headers only, and will issue a warning if
+used in the top-level compilation file. A warning will also be issued if an
+absolute path is used in the file argument.
+
+``__has_warning``
+-----------------
+
+This function-like macro takes a string literal that represents a command line
+option for a warning and returns true if that is a valid warning option.
+
+.. code-block:: c++
+
+ #if __has_warning("-Wformat")
+ ...
+ #endif
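+
+One possible use, sketched here under the assumption that the code should
+still build with compilers that lack ``__has_warning``, is to suppress a
+warning locally only when the compiler knows about it. The function
+``answer`` is a made-up example:
+
+.. code-block:: c++
+
+  #ifndef __has_warning
+  #define __has_warning(x) 0 // Compatibility with non-clang compilers.
+  #endif
+
+  #if __has_warning("-Wunused-variable")
+  #pragma clang diagnostic push
+  #pragma clang diagnostic ignored "-Wunused-variable"
+  #endif
+
+  // Hypothetical function that deliberately leaves a variable unused.
+  int answer() {
+    int scratch;
+    return 42;
+  }
+
+  #if __has_warning("-Wunused-variable")
+  #pragma clang diagnostic pop
+  #endif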
+
+Builtin Macros
+==============
+
+``__BASE_FILE__``
+ Defined to a string that contains the name of the main input file passed to
+ Clang.
+
+``__COUNTER__``
+ Defined to an integer value that starts at zero and is incremented each time
+ the ``__COUNTER__`` macro is expanded.
+
+``__INCLUDE_LEVEL__``
+ Defined to an integral value that is the include depth of the file currently
+ being translated. For the main file, this value is zero.
+
+``__TIMESTAMP__``
+ Defined to the date and time of the last modification of the current source
+ file.
+
+``__clang__``
+ Defined when compiling with Clang.
+
+``__clang_major__``
+ Defined to the major marketing version number of Clang (e.g., the 2 in
+ 2.0.1). Note that marketing version numbers should not be used to check for
+ language features, as different vendors use different numbering schemes.
+ Instead, use the :ref:`langext-feature_check`.
+
+``__clang_minor__``
+ Defined to the minor version number of Clang (e.g., the 0 in 2.0.1). Note
+ that marketing version numbers should not be used to check for language
+ features, as different vendors use different numbering schemes. Instead, use
+ the :ref:`langext-feature_check`.
+
+``__clang_patchlevel__``
+ Defined to the marketing patch level of Clang (e.g., the 1 in 2.0.1).
+
+``__clang_version__``
+ Defined to a string that captures the Clang marketing version, including the
+ Subversion tag or revision number, e.g., "``1.5 (trunk 102332)``".
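+
+A small sketch of how some of these macros might be used. The macro and
+function names below (``UNIQUE_INT``, ``counterStep``, ``printCompilerInfo``)
+are hypothetical, and the example assumes a compiler that provides
+``__COUNTER__``, ``__BASE_FILE__`` and ``__INCLUDE_LEVEL__``:
+
+.. code-block:: c++
+
+  #include <cstdio>
+
+  // Two-level concatenation so that __COUNTER__ is expanded before pasting.
+  #define CONCAT_IMPL(a, b) a##b
+  #define CONCAT(a, b) CONCAT_IMPL(a, b)
+  // Hypothetical helper: declare an int with a unique generated name.
+  #define UNIQUE_INT(value) int CONCAT(unique_int_, __COUNTER__) = (value)
+
+  UNIQUE_INT(1);
+  UNIQUE_INT(2); // Gets a different name, so there is no redefinition.
+
+  // Each expansion of __COUNTER__ yields the previous value plus one.
+  int counterStep() {
+    int first = __COUNTER__;
+    int second = __COUNTER__;
+    return second - first;
+  }
+
+  void printCompilerInfo() {
+  #ifdef __clang__
+    std::printf("clang %d.%d.%d\n", __clang_major__, __clang_minor__,
+                __clang_patchlevel__);
+  #else
+    std::printf("not clang\n");
+  #endif
+    std::printf("main file: %s, include depth: %d\n", __BASE_FILE__,
+                __INCLUDE_LEVEL__);
+  }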
+
+.. _langext-vectors:
+
+Vectors and Extended Vectors
+============================
+
+Supports the GCC, OpenCL, AltiVec and NEON vector extensions.
+
+OpenCL vector types are created using the ``ext_vector_type`` attribute. It
+supports the ``V.xyzw`` syntax and other tidbits as seen in OpenCL. An example
+is:
+
+.. code-block:: c++
+
+ typedef float float4 __attribute__((ext_vector_type(4)));
+ typedef float float2 __attribute__((ext_vector_type(2)));
+
+ float4 foo(float2 a, float2 b) {
+ float4 c;
+ c.xz = a;
+ c.yw = b;
+ return c;
+ }
+
+Query for this feature with ``__has_extension(attribute_ext_vector_type)``.
+
+Giving the ``-faltivec`` option to clang enables support for AltiVec vector syntax
+and functions. For example:
+
+.. code-block:: c++
+
+ vector float foo(vector int a) {
+ vector int b;
+ b = vec_add(a, a) + a;
+ return (vector float)b;
+ }
+
+NEON vector types are created using the ``neon_vector_type`` and
+``neon_polyvector_type`` attributes. For example:
+
+.. code-block:: c++
+
+ typedef __attribute__((neon_vector_type(8))) int8_t int8x8_t;
+ typedef __attribute__((neon_polyvector_type(16))) poly8_t poly8x16_t;
+
+ int8x8_t foo(int8x8_t a) {
+ int8x8_t v;
+ v = a;
+ return v;
+ }
+
+Vector Literals
+---------------
+
+Vector literals can be used to create vectors from a set of scalars, or
+vectors. Either the parentheses form or the braces form can be used. In the
+parentheses form the number of literal values specified must be one, i.e.
+referring to a scalar value, or must match the size of the vector type being
+created. If a single scalar literal value is specified, the scalar literal
+value will be replicated to all the components of the vector type. In the
+braces form any number of literals can be specified. For example:
+
+.. code-block:: c++
+
+ typedef int v4si __attribute__((__vector_size__(16)));
+ typedef float float4 __attribute__((ext_vector_type(4)));
+ typedef float float2 __attribute__((ext_vector_type(2)));
+
+ v4si vsi = (v4si){1, 2, 3, 4};
+ float4 vf = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
+ vector int vi1 = (vector int)(1); // vi1 will be (1, 1, 1, 1).
+ vector int vi2 = (vector int){1}; // vi2 will be (1, 0, 0, 0).
+ vector int vi3 = (vector int)(1, 2); // error
+ vector int vi4 = (vector int){1, 2}; // vi4 will be (1, 2, 0, 0).
+ vector int vi5 = (vector int)(1, 2, 3, 4);
+ float4 vf2 = (float4)((float2)(1.0f, 2.0f), (float2)(3.0f, 4.0f));
+
+Vector Operations
+-----------------
+
+The table below shows the support for each operation by vector extension. A
+dash indicates that an operation is not accepted according to a corresponding
+specification.
+
+============================== ====== ======= === ====
+           Operator            OpenCL AltiVec GCC NEON
+============================== ====== ======= === ====
+[]                             yes    yes     yes --
+unary operators +, --          yes    yes     yes --
+++, -- --                      yes    yes     yes --
++,--,*,/,%                     yes    yes     yes --
+bitwise operators &,|,^,~      yes    yes     yes --
+>>,<<                          yes    yes     yes --
+!, &&, ||                      no     --      --  --
+==, !=, >, <, >=, <=           yes    yes     --  --
+=                              yes    yes     yes yes
+:?                             yes    --      --  --
+sizeof                         yes    yes     yes yes
+============================== ====== ======= === ====
+
+See also :ref:`langext-__builtin_shufflevector`.
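+
+To illustrate a few rows of the table for the GCC vector extension, here is a
+minimal sketch that uses element-wise arithmetic and subscripting. The helper
+names ``mulAddSum`` and ``example`` are made up for this illustration:
+
+.. code-block:: c++
+
+  // GCC-style vector of four ints (16 bytes).
+  typedef int v4si __attribute__((__vector_size__(16)));
+
+  // Hypothetical helper: compute a * b + a element-wise, then add up the
+  // four elements of the result.
+  int mulAddSum(v4si a, v4si b) {
+    v4si r = a * b + a;               // Binary operators apply per element.
+    return r[0] + r[1] + r[2] + r[3]; // [] reads a single element.
+  }
+
+  int example() {
+    v4si a = {1, 2, 3, 4};
+    v4si b = {10, 20, 30, 40};
+    return mulAddSum(a, b); // (11 + 42 + 93 + 164) == 310
+  }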
+
+Messages on ``deprecated`` and ``unavailable`` Attributes
+=========================================================
+
+An optional string message can be added to the ``deprecated`` and
+``unavailable`` attributes. For example:
+
+.. code-block:: c++
+
+ void explode(void) __attribute__((deprecated("extremely unsafe, use 'combust' instead!!!")));
+
+If the deprecated or unavailable declaration is used, the message will be
+incorporated into the appropriate diagnostic:
+
+.. code-block:: none
+
+ harmless.c:4:3: warning: 'explode' is deprecated: extremely unsafe, use 'combust' instead!!!
+ [-Wdeprecated-declarations]
+ explode();
+ ^
+
+Query for this feature with
+``__has_extension(attribute_deprecated_with_message)`` and
+``__has_extension(attribute_unavailable_with_message)``.
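+
+A hedged sketch of how such a query might drive a portability macro. The
+macro name ``DEPRECATED_MSG`` is hypothetical, the ``__has_extension``
+fallback is simplified compared to the one shown earlier, and the ``#else``
+branch assumes a compiler that at least accepts the GNU-style ``deprecated``
+attribute without a message:
+
+.. code-block:: c++
+
+  #ifndef __has_extension
+  #define __has_extension(x) 0 // Compatibility with non-clang compilers.
+  #endif
+
+  // Attach the message only when the compiler reports support for it.
+  #if __has_extension(attribute_deprecated_with_message)
+  #define DEPRECATED_MSG(msg) __attribute__((deprecated(msg)))
+  #else
+  #define DEPRECATED_MSG(msg) __attribute__((deprecated))
+  #endif
+
+  int combust(void) { return 1; }
+
+  // Any use of explode() is diagnosed as deprecated.
+  int explode(void) DEPRECATED_MSG("extremely unsafe, use 'combust' instead");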
+
+Attributes on Enumerators
+=========================
+
+Clang allows attributes to be written on individual enumerators. This allows
+enumerators to be deprecated, made unavailable, etc. The attribute must appear
+after the enumerator name and before any initializer, like so:
+
+.. code-block:: c++
+
+ enum OperationMode {
+ OM_Invalid,
+ OM_Normal,
+ OM_Terrified __attribute__((deprecated)),
+ OM_AbortOnError __attribute__((deprecated)) = 4
+ };
+
+Attributes on the ``enum`` declaration do not apply to individual enumerators.
+
+Query for this feature with ``__has_extension(enumerator_attributes)``.
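+
+For code that must also compile where enumerator attributes are not
+available, one option is to hide the attribute behind a macro. This is only a
+sketch: ``ENUMERATOR_DEPRECATED`` and ``isUsable`` are hypothetical names, and
+the fallback simply drops the attribute:
+
+.. code-block:: c++
+
+  #ifndef __has_extension
+  #define __has_extension(x) 0 // Compatibility with non-clang compilers.
+  #endif
+
+  #if __has_extension(enumerator_attributes)
+  #define ENUMERATOR_DEPRECATED __attribute__((deprecated))
+  #else
+  #define ENUMERATOR_DEPRECATED
+  #endif
+
+  enum OperationMode {
+    OM_Invalid,
+    OM_Normal,
+    OM_Terrified ENUMERATOR_DEPRECATED,
+    OM_AbortOnError ENUMERATOR_DEPRECATED = 4
+  };
+
+  // Only refers to enumerators that are not deprecated.
+  bool isUsable(OperationMode Mode) { return Mode == OM_Normal; }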
+
+'User-Specified' System Frameworks
+==================================
+
+Clang provides a mechanism by which frameworks can be built in such a way that
+they will always be treated as being "system frameworks", even if they are not
+present in a system framework directory. This can be useful to system
+framework developers who want to be able to test building other applications
+with development builds of their framework, including the manner in which the
+compiler changes warning behavior for system headers.
+
+Framework developers can opt in to this mechanism by creating a
+"``.system_framework``" file at the top-level of their framework. That is, the
+framework should have contents like:
+
+.. code-block:: none
+
+ .../TestFramework.framework
+ .../TestFramework.framework/.system_framework
+ .../TestFramework.framework/Headers
+ .../TestFramework.framework/Headers/TestFramework.h
+ ...
+
+Clang will treat the presence of this file as an indicator that the framework
+should be treated as a system framework, regardless of how it was found in the
+framework search path. For consistency, we recommend that such files never be
+included in installed versions of the framework.
+
+Checks for Standard Language Features
+=====================================
+
+The ``__has_feature`` macro can be used to query if certain standard language
+features are enabled. The ``__has_extension`` macro can be used to query if
+language features are available as an extension when compiling for a standard
+which does not provide them. The features which can be tested are listed here.
+
+C++98
+-----
+
+The features listed below are part of the C++98 standard. These features are
+enabled by default when compiling C++ code.
+
+C++ exceptions
+^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_exceptions)`` to determine if C++ exceptions have been
+enabled. For example, compiling code with ``-fno-exceptions`` disables C++
+exceptions.
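+
+As an illustration, a library could select an exception-free code path when
+exceptions are disabled. This is a sketch under the assumption that both
+paths should report failure the same way; ``divideOrMinusOne`` and
+``checkedDivide`` are hypothetical names:
+
+.. code-block:: c++
+
+  #include <stdexcept>
+
+  #ifndef __has_feature
+  #define __has_feature(x) 0 // Compatibility with non-clang compilers.
+  #endif
+
+  #if __has_feature(cxx_exceptions)
+  // Exceptions are enabled: signal the error by throwing.
+  static int checkedDivide(int a, int b) {
+    if (b == 0)
+      throw std::domain_error("division by zero");
+    return a / b;
+  }
+
+  int divideOrMinusOne(int a, int b) {
+    try {
+      return checkedDivide(a, b);
+    } catch (const std::domain_error &) {
+      return -1;
+    }
+  }
+  #else
+  // Exceptions are disabled (or cannot be detected): check up front.
+  int divideOrMinusOne(int a, int b) { return b == 0 ? -1 : a / b; }
+  #endif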
+
+C++ RTTI
+^^^^^^^^
+
+Use ``__has_feature(cxx_rtti)`` to determine if C++ RTTI has been enabled. For
+example, compiling code with ``-fno-rtti`` disables the use of RTTI.
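+
+A minimal sketch of guarding ``dynamic_cast`` with this check. The classes
+``Shape`` and ``Circle`` and the function ``asCircle`` are invented for the
+example, and the fallback relies on a hand-written virtual type tag:
+
+.. code-block:: c++
+
+  #ifndef __has_feature
+  #define __has_feature(x) 0 // Compatibility with non-clang compilers.
+  #endif
+
+  struct Shape {
+    virtual ~Shape() {}
+    virtual bool isCircle() const { return false; } // Manual type tag.
+  };
+
+  struct Circle : Shape {
+    virtual bool isCircle() const { return true; }
+  };
+
+  // Returns S as a Circle, or a null pointer if it is some other Shape.
+  const Circle *asCircle(const Shape *S) {
+  #if __has_feature(cxx_rtti)
+    return dynamic_cast<const Circle *>(S);
+  #else
+    return S->isCircle() ? static_cast<const Circle *>(S) : 0;
+  #endif
+  }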
+
+C++11
+-----
+
+The features listed below are part of the C++11 standard. As a result, all
+these features are enabled with the ``-std=c++11`` or ``-std=gnu++11`` option
+when compiling C++ code.
+
+C++11 SFINAE includes access control
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_access_control_sfinae)`` or
+``__has_extension(cxx_access_control_sfinae)`` to determine whether
+access-control errors (e.g., calling a private constructor) are considered to
+be template argument deduction errors (aka SFINAE errors), per `C++ DR1170
+<http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1170>`_.
+
+C++11 alias templates
+^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_alias_templates)`` or
+``__has_extension(cxx_alias_templates)`` to determine if support for C++11's
+alias declarations and alias templates is enabled.
+
+C++11 alignment specifiers
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_alignas)`` or ``__has_extension(cxx_alignas)`` to
+determine if support for alignment specifiers using ``alignas`` is enabled.
+
+C++11 attributes
+^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_attributes)`` or ``__has_extension(cxx_attributes)`` to
+determine if support for attribute parsing with C++11's square bracket notation
+is enabled.
+
+C++11 generalized constant expressions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_constexpr)`` to determine if support for generalized
+constant expressions (e.g., ``constexpr``) is enabled.
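+
+One common pattern, shown here only as a sketch, is a macro that expands to
+``constexpr`` when the feature is reported and to nothing otherwise. The
+names ``CONSTEXPR`` and ``square`` are hypothetical:
+
+.. code-block:: c++
+
+  #ifndef __has_feature
+  #define __has_feature(x) 0 // Compatibility with non-clang compilers.
+  #endif
+
+  #if __has_feature(cxx_constexpr)
+  #define CONSTEXPR constexpr
+  #else
+  #define CONSTEXPR
+  #endif
+
+  CONSTEXPR int square(int x) { return x * x; }
+
+  #if __has_feature(cxx_constexpr)
+  // Only checked at compile time when constexpr is actually available.
+  static_assert(square(4) == 16, "square is usable in constant expressions");
+  #endif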
+
+C++11 ``decltype()``
+^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_decltype)`` or ``__has_extension(cxx_decltype)`` to
+determine if support for the ``decltype()`` specifier is enabled. C++11's
+``decltype`` does not require type-completeness of a function call expression.
+Use ``__has_feature(cxx_decltype_incomplete_return_types)`` or
+``__has_extension(cxx_decltype_incomplete_return_types)`` to determine if
+support for this feature is enabled.
+
+C++11 default template arguments in function templates
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_default_function_template_args)`` or
+``__has_extension(cxx_default_function_template_args)`` to determine if support
+for default template arguments in function templates is enabled.
+
+C++11 ``default``\ ed functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_defaulted_functions)`` or
+``__has_extension(cxx_defaulted_functions)`` to determine if support for
+defaulted function definitions (with ``= default``) is enabled.
+
+C++11 delegating constructors
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_delegating_constructors)`` to determine if support for
+delegating constructors is enabled.
+
+C++11 ``deleted`` functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_deleted_functions)`` or
+``__has_extension(cxx_deleted_functions)`` to determine if support for deleted
+function definitions (with ``= delete``) is enabled.
+
+C++11 explicit conversion functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_explicit_conversions)`` to determine if support for
+``explicit`` conversion functions is enabled.
+
+C++11 generalized initializers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_generalized_initializers)`` to determine if support for
+generalized initializers (using braced lists and ``std::initializer_list``) is
+enabled.
+
+C++11 implicit move constructors/assignment operators
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_implicit_moves)`` to determine if Clang will implicitly
+generate move constructors and move assignment operators where needed.
+
+C++11 inheriting constructors
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_inheriting_constructors)`` to determine if support for
+inheriting constructors is enabled.
+
+C++11 inline namespaces
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_inline_namespaces)`` or
+``__has_extension(cxx_inline_namespaces)`` to determine if support for inline
+namespaces is enabled.
+
+C++11 lambdas
+^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_lambdas)`` or ``__has_extension(cxx_lambdas)`` to
+determine if support for lambdas is enabled.
+
+C++11 local and unnamed types as template arguments
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_local_type_template_args)`` or
+``__has_extension(cxx_local_type_template_args)`` to determine if support for
+local and unnamed types as template arguments is enabled.
+
+C++11 noexcept
+^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_noexcept)`` or ``__has_extension(cxx_noexcept)`` to
+determine if support for noexcept exception specifications is enabled.
+
+C++11 in-class non-static data member initialization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_nonstatic_member_init)`` to determine whether in-class
+initialization of non-static data members is enabled.
+
+C++11 ``nullptr``
+^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_nullptr)`` or ``__has_extension(cxx_nullptr)`` to
+determine if support for ``nullptr`` is enabled.
+
+C++11 ``override control``
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_override_control)`` or
+``__has_extension(cxx_override_control)`` to determine if support for the
+override control keywords is enabled.
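+
+A minimal sketch of how a header might use this check. The ``MY_OVERRIDE``
+macro and the ``Shape``/``Square`` classes are hypothetical names chosen for
+illustration, and the fallback definitions of ``__has_feature`` and
+``__has_extension`` are an assumption made so that the snippet also compiles
+on compilers that lack these macros:
+
+.. code-block:: c++
+
+ // Compatibility shims for compilers without the feature-checking macros.
+ #ifndef __has_feature
+ #define __has_feature(x) 0
+ #endif
+ #ifndef __has_extension
+ #define __has_extension(x) __has_feature(x)
+ #endif
+
+ // MY_OVERRIDE is a hypothetical macro, not something Clang provides.
+ #if __has_extension(cxx_override_control)
+ #define MY_OVERRIDE override
+ #else
+ #define MY_OVERRIDE
+ #endif
+
+ struct Shape {
+   virtual ~Shape() {}
+   virtual int sides() const { return 0; }
+ };
+
+ struct Square : Shape {
+   // With the keyword available, a signature mismatch here is diagnosed.
+   int sides() const MY_OVERRIDE { return 4; }
+ };
+
+ // Calls through the base class to exercise virtual dispatch.
+ inline int count_sides(const Shape &s) { return s.sides(); }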
+
+C++11 reference-qualified functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_reference_qualified_functions)`` or
+``__has_extension(cxx_reference_qualified_functions)`` to determine if support
+for reference-qualified functions (e.g., member functions with ``&`` or ``&&``
+applied to ``*this``) is enabled.
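+
+As an illustration, the sketch below overloads a member function on the value
+category of ``*this`` when the feature is available. ``Token``,
+``MY_HAVE_REF_QUALIFIERS``, and the two helper functions are hypothetical
+names, and the ``__has_feature`` shim is an assumption for compilers that do
+not define it:
+
+.. code-block:: c++
+
+ #ifndef __has_feature
+ #define __has_feature(x) 0
+ #endif
+
+ // MY_HAVE_REF_QUALIFIERS is a hypothetical macro used only in this sketch.
+ #if __has_feature(cxx_reference_qualified_functions)
+ #define MY_HAVE_REF_QUALIFIERS 1
+ #else
+ #define MY_HAVE_REF_QUALIFIERS 0
+ #endif
+
+ struct Token {
+ #if MY_HAVE_REF_QUALIFIERS
+   // Selected when *this is an lvalue.
+   int kind() & { return 1; }
+   // Selected when *this is an rvalue, such as a temporary.
+   int kind() && { return 2; }
+ #else
+   // Without ref-qualifiers a single overload serves both cases.
+   int kind() { return 1; }
+ #endif
+ };
+
+ inline int kind_of_lvalue() { Token t; return t.kind(); }
+ inline int kind_of_temporary() { return Token().kind(); }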
+
+C++11 range-based ``for`` loop
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_range_for)`` or ``__has_extension(cxx_range_for)`` to
+determine if support for the range-based for loop is enabled.
+
+C++11 raw string literals
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_raw_string_literals)`` to determine if support for raw
+string literals (e.g., ``R"x(foo\bar)x"``) is enabled.
+
+C++11 rvalue references
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_rvalue_references)`` or
+``__has_extension(cxx_rvalue_references)`` to determine if support for rvalue
+references is enabled.
+
+C++11 ``static_assert()``
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_static_assert)`` or
+``__has_extension(cxx_static_assert)`` to determine if support for compile-time
+assertions using ``static_assert`` is enabled.
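+
+One possible use of this check is a portable assertion macro, sketched below.
+``MY_STATIC_ASSERT``, ``MY_CONCAT``, ``Header``, and ``header_size`` are
+hypothetical names, and the negative-array-size fallback is a common idiom
+assumed here rather than anything Clang prescribes:
+
+.. code-block:: c++
+
+ #ifndef __has_feature
+ #define __has_feature(x) 0
+ #endif
+ #ifndef __has_extension
+ #define __has_extension(x) __has_feature(x)
+ #endif
+
+ #if __has_extension(cxx_static_assert)
+ #define MY_STATIC_ASSERT(cond, msg) static_assert(cond, msg)
+ #else
+ // Fallback: a negative array size is ill-formed, so a false condition
+ // still fails to compile, although the message is lost.
+ #define MY_CONCAT_(a, b) a##b
+ #define MY_CONCAT(a, b) MY_CONCAT_(a, b)
+ #define MY_STATIC_ASSERT(cond, msg) \
+   typedef char MY_CONCAT(my_static_assert_, __LINE__)[(cond) ? 1 : -1]
+ #endif
+
+ struct Header {
+   unsigned char tag;
+   unsigned char length;
+ };
+
+ MY_STATIC_ASSERT(sizeof(Header) == 2, "Header must stay two bytes");
+
+ inline unsigned header_size() { return sizeof(Header); }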
+
+C++11 ``thread_local``
+^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_thread_local)`` to determine if support for
+``thread_local`` variables is enabled.
+
+C++11 type inference
+^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_auto_type)`` or ``__has_extension(cxx_auto_type)`` to
+determine if C++11 type inference is supported using the ``auto`` specifier. If
+this is disabled, ``auto`` will instead be a storage class specifier, as in C
+or C++98.
+
+C++11 strongly typed enumerations
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_strong_enums)`` or
+``__has_extension(cxx_strong_enums)`` to determine if support for strongly
+typed, scoped enumerations is enabled.
+
+C++11 trailing return type
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_trailing_return)`` or
+``__has_extension(cxx_trailing_return)`` to determine if support for the
+alternate function declaration syntax with trailing return type is enabled.
+
+C++11 Unicode string literals
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_unicode_literals)`` to determine if support for Unicode
+string literals is enabled.
+
+C++11 unrestricted unions
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_unrestricted_unions)`` to determine if support for
+unrestricted unions is enabled.
+
+C++11 user-defined literals
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_user_literals)`` to determine if support for
+user-defined literals is enabled.
+
+C++11 variadic templates
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_variadic_templates)`` or
+``__has_extension(cxx_variadic_templates)`` to determine if support for
+variadic templates is enabled.
+
+C++1y
+-----
+
+The features listed below are part of the committee draft for the C++1y
+standard. As a result, all these features are enabled with the ``-std=c++1y``
+or ``-std=gnu++1y`` option when compiling C++ code.
+
+C++1y binary literals
+^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_binary_literals)`` or
+``__has_extension(cxx_binary_literals)`` to determine whether
+binary literals (for instance, ``0b10010``) are recognized. Clang supports this
+feature as an extension in all language modes.
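+
+A small sketch of guarding a binary literal behind this check. The names
+``kMask`` and ``mask_value`` are hypothetical, and the shims are an assumption
+for compilers that lack the feature-checking macros:
+
+.. code-block:: c++
+
+ #ifndef __has_feature
+ #define __has_feature(x) 0
+ #endif
+ #ifndef __has_extension
+ #define __has_extension(x) __has_feature(x)
+ #endif
+
+ #if __has_extension(cxx_binary_literals)
+ static const unsigned kMask = 0b10010;
+ #else
+ static const unsigned kMask = 0x12;  // same value, spelled in hexadecimal
+ #endif
+
+ inline unsigned mask_value() { return kMask; }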
+
+C++1y contextual conversions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_contextual_conversions)`` or
+``__has_extension(cxx_contextual_conversions)`` to determine if the C++1y rules
+are used when performing an implicit conversion for an array bound in a
+*new-expression*, the operand of a *delete-expression*, an integral constant
+expression, or a condition in a ``switch`` statement.
+
+C++1y decltype(auto)
+^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_decltype_auto)`` or
+``__has_extension(cxx_decltype_auto)`` to determine if support
+for the ``decltype(auto)`` placeholder type is enabled.
+
+C++1y default initializers for aggregates
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_aggregate_nsdmi)`` or
+``__has_extension(cxx_aggregate_nsdmi)`` to determine if support
+for default initializers in aggregate members is enabled.
+
+C++1y generalized lambda capture
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_init_captures)`` or
+``__has_extension(cxx_init_captures)`` to determine if support for
+lambda captures with explicit initializers is enabled
+(for instance, ``[n(0)] { return ++n; }``).
+
+C++1y generic lambdas
+^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_generic_lambdas)`` or
+``__has_extension(cxx_generic_lambdas)`` to determine if support for generic
+(polymorphic) lambdas is enabled
+(for instance, ``[] (auto x) { return x + 1; }``).
+
+C++1y relaxed constexpr
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_relaxed_constexpr)`` or
+``__has_extension(cxx_relaxed_constexpr)`` to determine if variable
+declarations, local variable modification, and control flow constructs
+are permitted in ``constexpr`` functions.
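+
+The sketch below shows one way such a check might be used.
+``MY_RELAXED_CONSTEXPR`` and ``factorial`` are hypothetical names. The
+function falls back to a plain ``inline`` function when the relaxed rules are
+not available, and the ``__has_feature`` shim is an assumption for compilers
+that do not define it:
+
+.. code-block:: c++
+
+ #ifndef __has_feature
+ #define __has_feature(x) 0
+ #endif
+
+ #if __has_feature(cxx_relaxed_constexpr)
+ #define MY_RELAXED_CONSTEXPR constexpr
+ #else
+ #define MY_RELAXED_CONSTEXPR inline
+ #endif
+
+ // Uses a local variable, mutation, and a loop, all of which the C++11
+ // constexpr rules reject.
+ MY_RELAXED_CONSTEXPR unsigned factorial(unsigned n) {
+   unsigned result = 1;
+   for (unsigned i = 2; i <= n; ++i)
+     result *= i;
+   return result;
+ }
+
+ #if __has_feature(cxx_relaxed_constexpr)
+ static_assert(factorial(5) == 120, "evaluated at compile time");
+ #endif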
+
+C++1y return type deduction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_return_type_deduction)`` or
+``__has_extension(cxx_return_type_deduction)`` to determine if support
+for return type deduction for functions (using ``auto`` as a return type)
+is enabled.
+
+C++1y runtime-sized arrays
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_runtime_array)`` or
+``__has_extension(cxx_runtime_array)`` to determine if support
+for arrays of runtime bound (a restricted form of variable-length arrays)
+is enabled.
+Clang's implementation of this feature is incomplete.
+
+C++1y variable templates
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(cxx_variable_templates)`` or
+``__has_extension(cxx_variable_templates)`` to determine if support for
+templated variable declarations is enabled.
+
+C11
+---
+
+The features listed below are part of the C11 standard. As a result, all these
+features are enabled with the ``-std=c11`` or ``-std=gnu11`` option when
+compiling C code. Additionally, because these features are all
+backward-compatible, they are available as extensions in all language modes.
+
+C11 alignment specifiers
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(c_alignas)`` or ``__has_extension(c_alignas)`` to determine
+if support for alignment specifiers using ``_Alignas`` is enabled.
+
+C11 atomic operations
+^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(c_atomic)`` or ``__has_extension(c_atomic)`` to determine
+if support for atomic types using ``_Atomic`` is enabled. Clang also provides
+:ref:`a set of builtins <langext-__c11_atomic>` which can be used to implement
+the ``<stdatomic.h>`` operations on ``_Atomic`` types.
+
+C11 generic selections
+^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(c_generic_selections)`` or
+``__has_extension(c_generic_selections)`` to determine if support for generic
+selections is enabled.
+
+As an extension, the C11 generic selection expression is available in all
+languages supported by Clang. The syntax is the same as that given in the C11
+standard.
+
+In C, type compatibility is decided according to the rules given in the
+appropriate standard, but in C++, which lacks the type compatibility rules used
+in C, types are considered compatible only if they are equivalent.
+
+C11 ``_Static_assert()``
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(c_static_assert)`` or ``__has_extension(c_static_assert)``
+to determine if support for compile-time assertions using ``_Static_assert`` is
+enabled.
+
+C11 ``_Thread_local``
+^^^^^^^^^^^^^^^^^^^^^
+
+Use ``__has_feature(c_thread_local)`` or ``__has_extension(c_thread_local)``
+to determine if support for ``_Thread_local`` variables is enabled.
+
+Checks for Type Trait Primitives
+================================
+
+Type trait primitives are special builtin constant expressions that can be used
+by the standard C++ library to facilitate or simplify the implementation of
+user-facing type traits in the ``<type_traits>`` header.
+
+They are not intended to be used directly by user code because they are
+implementation-defined and subject to change -- as such they're tied closely to
+the supported set of system headers, currently:
+
+* LLVM's own libc++
+* GNU libstdc++
+* The Microsoft standard C++ library
+
+Clang supports the `GNU C++ type traits
+<http://gcc.gnu.org/onlinedocs/gcc/Type-Traits.html>`_ and a subset of the
+`Microsoft Visual C++ Type traits
+<http://msdn.microsoft.com/en-us/library/ms177194(v=VS.100).aspx>`_.
+
+Feature detection is supported only for some of the primitives at present. User
+code should not use these checks because they bear no direct relation to the
+actual set of type traits supported by the C++ standard library.
+
+For type trait ``__X``, ``__has_extension(X)`` indicates the presence of the
+type trait primitive in the compiler. A simplistic usage example as might be
+seen in standard C++ headers follows:
+
+.. code-block:: c++
+
+ #if __has_extension(is_convertible_to)
+ template<typename From, typename To>
+ struct is_convertible_to {
+   static const bool value = __is_convertible_to(From, To);
+ };
+ #else
+ // Emulate type trait for compatibility with other compilers.
+ #endif
+
+The following type trait primitives are supported by Clang:
+
+* ``__has_nothrow_assign`` (GNU, Microsoft)
+* ``__has_nothrow_copy`` (GNU, Microsoft)
+* ``__has_nothrow_constructor`` (GNU, Microsoft)
+* ``__has_trivial_assign`` (GNU, Microsoft)
+* ``__has_trivial_copy`` (GNU, Microsoft)
+* ``__has_trivial_constructor`` (GNU, Microsoft)
+* ``__has_trivial_destructor`` (GNU, Microsoft)
+* ``__has_virtual_destructor`` (GNU, Microsoft)
+* ``__is_abstract`` (GNU, Microsoft)
+* ``__is_base_of`` (GNU, Microsoft)
+* ``__is_class`` (GNU, Microsoft)
+* ``__is_convertible_to`` (Microsoft)
+* ``__is_empty`` (GNU, Microsoft)
+* ``__is_enum`` (GNU, Microsoft)
+* ``__is_interface_class`` (Microsoft)
+* ``__is_pod`` (GNU, Microsoft)
+* ``__is_polymorphic`` (GNU, Microsoft)
+* ``__is_union`` (GNU, Microsoft)
+* ``__is_literal(type)``: Determines whether the given type is a literal type
+* ``__is_final``: Determines whether the given type is declared with a
+ ``final`` class-virt-specifier.
+* ``__underlying_type(type)``: Retrieves the underlying type for a given
+ ``enum`` type. This trait is required to implement the C++11 standard
+ library.
+* ``__is_trivially_assignable(totype, fromtype)``: Determines whether a value
+ of type ``totype`` can be assigned to from a value of type ``fromtype`` such
+ that no non-trivial functions are called as part of that assignment. This
+ trait is required to implement the C++11 standard library.
+* ``__is_trivially_constructible(type, argtypes...)``: Determines whether a
+ value of type ``type`` can be direct-initialized with arguments of types
+ ``argtypes...`` such that no non-trivial functions are called as part of
+ that initialization. This trait is required to implement the C++11 standard
+ library.
+* ``__is_destructible`` (MSVC 2013): partially implemented
+* ``__is_nothrow_destructible`` (MSVC 2013): partially implemented
+* ``__is_nothrow_assignable`` (MSVC 2013, clang)
+* ``__is_constructible`` (MSVC 2013, clang)
+* ``__is_nothrow_constructible`` (MSVC 2013, clang)
+
+Blocks
+======
+
+The syntax and a high-level description of the language feature are in
+:doc:`BlockLanguageSpec<BlockLanguageSpec>`. Implementation and ABI details for
+the clang implementation are in :doc:`Block-ABI-Apple<Block-ABI-Apple>`.
+
+Query for this feature with ``__has_extension(blocks)``.
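+
+A minimal sketch of such a query, assuming a build where the blocks runtime is
+available whenever the extension is reported. The ``scale`` function is a
+hypothetical name, and the shims are an assumption for compilers that lack
+the feature-checking macros:
+
+.. code-block:: c++
+
+ #ifndef __has_feature
+ #define __has_feature(x) 0
+ #endif
+ #ifndef __has_extension
+ #define __has_extension(x) __has_feature(x)
+ #endif
+
+ #if __has_extension(blocks)
+ inline int scale(int value, int factor) {
+   // A block literal that captures 'factor' by value.
+   int (^multiply)(int) = ^(int x) { return x * factor; };
+   return multiply(value);
+ }
+ #else
+ // Plain C++ fallback with the same behavior.
+ inline int scale(int value, int factor) { return value * factor; }
+ #endif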
+
+Objective-C Features
+====================
+
+Related result types
+--------------------
+
+According to Cocoa conventions, Objective-C methods with certain names
+("``init``", "``alloc``", etc.) always return objects that are an instance of
+the receiving class's type. Such methods are said to have a "related result
+type", meaning that a message send to one of these methods will have the same
+static type as an instance of the receiver class. For example, given the
+following classes:
+
+.. code-block:: objc
+
+ @interface NSObject
+ + (id)alloc;
+ - (id)init;
+ @end
+
+ @interface NSArray : NSObject
+ @end
+
+and this common initialization pattern
+
+.. code-block:: objc
+
+ NSArray *array = [[NSArray alloc] init];
+
+the type of the expression ``[NSArray alloc]`` is ``NSArray*`` because
+``alloc`` implicitly has a related result type. Similarly, the type of the
+expression ``[[NSArray alloc] init]`` is ``NSArray*``, since ``init`` has a
+related result type and its receiver is known to have the type ``NSArray *``.
+If neither ``alloc`` nor ``init`` had a related result type, the expressions
+would have had type ``id``, as declared in the method signature.
+
+A method with a related result type can be declared by using the type
+``instancetype`` as its result type. ``instancetype`` is a contextual keyword
+that is only permitted in the result type of an Objective-C method, e.g.
+
+.. code-block:: objc
+
+ @interface A
+ + (instancetype)constructAnA;
+ @end
+
+The related result type can also be inferred for some methods. To determine
+whether a method has an inferred related result type, the first word in the
+camel-case selector (e.g., "``init``" in "``initWithObjects``") is considered,
+and the method will have a related result type if its return type is compatible
+with the type of its class and if:
+
+* the first word is "``alloc``" or "``new``", and the method is a class method,
+ or
+
+* the first word is "``autorelease``", "``init``", "``retain``", or "``self``",
+ and the method is an instance method.
+
+If a method with a related result type is overridden by a subclass method, the
+subclass method must also return a type that is compatible with the subclass
+type. For example:
+
+.. code-block:: objc
+
+ @interface NSString : NSObject
+ - (NSUnrelated *)init; // incorrect usage: NSUnrelated is not NSString or a superclass of NSString
+ @end
+
+Related result types only affect the type of a message send or property access
+via the given method. In all other respects, a method with a related result
+type is treated the same way as a method that returns ``id``.
+
+Use ``__has_feature(objc_instancetype)`` to determine whether the
+``instancetype`` contextual keyword is available.
+
+Automatic reference counting
+----------------------------
+
+Clang provides support for :doc:`automated reference counting
+<AutomaticReferenceCounting>` in Objective-C, which eliminates the need
+for manual ``retain``/``release``/``autorelease`` message sends. There are two
+feature macros associated with automatic reference counting:
+``__has_feature(objc_arc)`` indicates the availability of automated reference
+counting in general, while ``__has_feature(objc_arc_weak)`` indicates that
+automated reference counting also includes support for ``__weak`` pointers to
+Objective-C objects.
+
+.. _objc-fixed-enum:
+
+Enumerations with a fixed underlying type
+-----------------------------------------
+
+Clang provides support for C++11 enumerations with a fixed underlying type
+within Objective-C. For example, one can write an enumeration type as:
+
+.. code-block:: c++
+
+ typedef enum : unsigned char { Red, Green, Blue } Color;
+
+This specifies that the underlying type, which is used to store the enumeration
+value, is ``unsigned char``.
+
+Use ``__has_feature(objc_fixed_enum)`` to determine whether support for fixed
+underlying types is available in Objective-C.
+
+Interoperability with C++11 lambdas
+-----------------------------------
+
+Clang provides interoperability between C++11 lambdas and blocks-based APIs, by
+permitting a lambda to be implicitly converted to a block pointer with the
+corresponding signature. For example, consider an API such as ``NSArray``'s
+array-sorting method:
+
+.. code-block:: objc
+
+ - (NSArray *)sortedArrayUsingComparator:(NSComparator)cmptr;
+
+``NSComparator`` is simply a typedef for the block pointer ``NSComparisonResult
+(^)(id, id)``, and parameters of this type are generally provided with block
+literals as arguments. However, one can also use a C++11 lambda so long as it
+provides the same signature (in this case, accepting two parameters of type
+``id`` and returning an ``NSComparisonResult``):
+
+.. code-block:: objc
+
+ NSArray *array = @[@"string 1", @"string 21", @"string 12", @"String 11",
+                    @"String 02"];
+ const NSStringCompareOptions comparisonOptions
+   = NSCaseInsensitiveSearch | NSNumericSearch |
+     NSWidthInsensitiveSearch | NSForcedOrderingSearch;
+ NSLocale *currentLocale = [NSLocale currentLocale];
+ NSArray *sorted
+   = [array sortedArrayUsingComparator:[=](id s1, id s2) -> NSComparisonResult {
+       NSRange string1Range = NSMakeRange(0, [s1 length]);
+       return [s1 compare:s2 options:comparisonOptions
+                    range:string1Range locale:currentLocale];
+     }];
+ NSLog(@"sorted: %@", sorted);
+
+This code relies on an implicit conversion from the type of the lambda
+expression (an unnamed, local class type called the *closure type*) to the
+corresponding block pointer type. The conversion itself is expressed by a
+conversion operator in that closure type that produces a block pointer with the
+same signature as the lambda itself, e.g.,
+
+.. code-block:: objc
+
+ operator NSComparisonResult (^)(id, id)() const;
+
+This conversion function returns a new block that simply forwards the two
+parameters to the lambda object (which it captures by copy), then returns the
+result. The returned block is first copied (with ``Block_copy``) and then
+autoreleased. As an optimization, if a lambda expression is immediately
+converted to a block pointer (as in the first example, above), then the block
+is not copied and autoreleased: rather, it is given the same lifetime as a
+block literal written at that point in the program, which avoids the overhead
+of copying a block to the heap in the common case.
+
+The conversion from a lambda to a block pointer is only available in
+Objective-C++, and not in C++ with blocks, due to its use of Objective-C memory
+management (autorelease).
+
+Object Literals and Subscripting
+--------------------------------
+
+Clang provides support for :doc:`Object Literals and Subscripting
+<ObjectiveCLiterals>` in Objective-C, which simplifies common Objective-C
+programming patterns, makes programs more concise, and improves the safety of
+container creation. There are several feature macros associated with object
+literals and subscripting: ``__has_feature(objc_array_literals)`` tests the
+availability of array literals; ``__has_feature(objc_dictionary_literals)``
+tests the availability of dictionary literals;
+``__has_feature(objc_subscripting)`` tests the availability of object
+subscripting.
+
+Objective-C Autosynthesis of Properties
+---------------------------------------
+
+Clang provides support for autosynthesis of declared properties. Using this
+feature, Clang provides default synthesis of those properties not declared
+``@dynamic`` and not having user-provided backing getter and setter methods.
+``__has_feature(objc_default_synthesize_properties)`` checks for availability
+of this feature in the version of Clang being used.
+
+.. _langext-objc-retain-release:
+
+Objective-C retaining behavior attributes
+-----------------------------------------
+
+In Objective-C, functions and methods are generally assumed to follow the
+`Cocoa Memory Management
+<http://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/MemoryMgmt/Articles/mmRules.html>`_
+conventions for ownership of object arguments and
+return values. However, there are exceptions, and so Clang provides attributes
+to allow these exceptions to be documented. These are used by ARC and the
+`static analyzer <http://clang-analyzer.llvm.org>`_. Some exceptions may be
+better described using the ``objc_method_family`` attribute instead.
+
+**Usage**: The ``ns_returns_retained``, ``ns_returns_not_retained``,
+``ns_returns_autoreleased``, ``cf_returns_retained``, and
+``cf_returns_not_retained`` attributes can be placed on methods and functions
+that return Objective-C or CoreFoundation objects. They are commonly placed at
+the end of a function prototype or method declaration:
+
+.. code-block:: objc
+
+ id foo() __attribute__((ns_returns_retained));
+
+ - (NSString *)bar:(int)x __attribute__((ns_returns_retained));
+
+The ``*_returns_retained`` attributes specify that the returned object has a +1
+retain count. The ``*_returns_not_retained`` attributes specify that the returned
+object has a +0 retain count, even if the normal convention for its selector
+would be +1. ``ns_returns_autoreleased`` specifies that the returned object is
++0, but is guaranteed to live at least as long as the next flush of an
+autorelease pool.
+
+**Usage**: The ``ns_consumed`` and ``cf_consumed`` attributes can be placed on
+a parameter declaration; they specify that the argument is expected to have a
++1 retain count, which will be balanced in some way by the function or method.
+The ``ns_consumes_self`` attribute can only be placed on an Objective-C
+method; it specifies that the method expects its ``self`` parameter to have a
++1 retain count, which it will balance in some way.
+
+.. code-block:: objc
+
+ void foo(__attribute__((ns_consumed)) NSString *string);
+
+ - (void) bar __attribute__((ns_consumes_self));
+ - (void) baz:(id) __attribute__((ns_consumed)) x;
+
+Further examples of these attributes are available in the static analyzer's `list of annotations for analysis
+<http://clang-analyzer.llvm.org/annotations.html#cocoa_mem>`_.
+
+Query for these features with ``__has_attribute(ns_consumed)``,
+``__has_attribute(ns_returns_retained)``, etc.
+
+
+Objective-C++ ABI: protocol-qualifier mangling of parameters
+------------------------------------------------------------
+
+Starting with LLVM 3.4, Clang produces a new mangling for parameters whose
+type is a qualified-``id`` (e.g., ``id<Foo>``). This mangling allows such
+parameters to be differentiated from those with the regular unqualified ``id``
+type.
+
+This was a non-backward compatible mangling change to the ABI. This change
+allows proper overloading, and also prevents mangling conflicts with template
+parameters of protocol-qualified type.
+
+Query the presence of this new mangling with
+``__has_feature(objc_protocol_qualifier_mangling)``.
+
+.. _langext-overloading:
+
+Initializer lists for complex numbers in C
+==========================================
+
+clang supports an extension which allows the following in C:
+
+.. code-block:: c++
+
+ #include <math.h>
+ #include <complex.h>
+ complex float x = { 1.0f, INFINITY }; // Init to (1, Inf)
+
+This construct is useful because there is no way to separately initialize the
+real and imaginary parts of a complex variable in standard C, given that clang
+does not support ``_Imaginary``. (Clang also supports the ``__real__`` and
+``__imag__`` extensions from gcc, which help in some cases, but are not usable
+in static initializers.)
+
+Note that this extension does not allow eliding the braces; the meaning of the
+following two lines is different:
+
+.. code-block:: c++
+
+ complex float x[] = { { 1.0f, 1.0f } }; // [0] = (1, 1)
+ complex float x[] = { 1.0f, 1.0f }; // [0] = (1, 0), [1] = (1, 0)
+
+This extension also works in C++ mode, as far as that goes, but does not apply
+to the C++ ``std::complex``. (In C++11, list initialization allows the same
+syntax to be used with ``std::complex`` with the same meaning.)
+
+Builtin Functions
+=================
+
+Clang supports a number of builtin library functions with the same syntax as
+GCC, including things like ``__builtin_nan``, ``__builtin_constant_p``,
+``__builtin_choose_expr``, ``__builtin_types_compatible_p``,
+``__sync_fetch_and_add``, etc. In addition to the GCC builtins, Clang supports
+a number of builtins that GCC does not, which are listed here.
+
+Please note that Clang does not and will not support all of the GCC builtins
+for vector operations. Instead of using builtins, you should use the functions
+defined in target-specific header files like ``<xmmintrin.h>``, which define
+portable wrappers for these. Many of the Clang versions of these functions are
+implemented directly in terms of :ref:`extended vector support
+<langext-vectors>` instead of builtins, in order to reduce the number of
+builtins that we need to implement.
+
+``__builtin_readcyclecounter``
+------------------------------
+
+``__builtin_readcyclecounter`` is used to access the cycle counter register (or
+a similar low-latency, high-accuracy clock) on those targets that support it.
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_readcyclecounter()
+
+**Example of Use**:
+
+.. code-block:: c++
+
+ unsigned long long t0 = __builtin_readcyclecounter();
+ do_something();
+ unsigned long long t1 = __builtin_readcyclecounter();
+ unsigned long long cycles_to_do_something = t1 - t0; // assuming no overflow
+
+**Description**:
+
+The ``__builtin_readcyclecounter()`` builtin returns the cycle counter value,
+which may be either global or process/thread-specific depending on the target.
+As the backing counters often overflow quickly (on the order of seconds) this
+should only be used for timing small intervals. When not supported by the
+target, the return value is always zero. This builtin takes no arguments and
+produces an unsigned long long result.
+
+Query for this feature with ``__has_builtin(__builtin_readcyclecounter)``. Note
+that even if present, its use may depend on run-time privilege or other OS
+controlled state.
+
+.. _langext-__builtin_shufflevector:
+
+``__builtin_shufflevector``
+---------------------------
+
+``__builtin_shufflevector`` is used to express generic vector
+permutation/shuffle/swizzle operations. This builtin is also very important
+for the implementation of various target-specific header files like
+``<xmmintrin.h>``.
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_shufflevector(vec1, vec2, index1, index2, ...)
+
+**Examples**:
+
+.. code-block:: c++
+
+ // Identity operation: return the 4-element vector V1.
+ __builtin_shufflevector(V1, V1, 0, 1, 2, 3)
+
+ // "Splat" element 0 of V1 into a 4-element result.
+ __builtin_shufflevector(V1, V1, 0, 0, 0, 0)
+
+ // Reverse 4-element vector V1.
+ __builtin_shufflevector(V1, V1, 3, 2, 1, 0)
+
+ // Concatenate every other element of 4-element vectors V1 and V2.
+ __builtin_shufflevector(V1, V2, 0, 2, 4, 6)
+
+ // Concatenate every other element of 8-element vectors V1 and V2.
+ __builtin_shufflevector(V1, V2, 0, 2, 4, 6, 8, 10, 12, 14)
+
+ // Shuffle V1 with some elements being undefined.
+ __builtin_shufflevector(V1, V1, 3, -1, 1, -1)
+
+**Description**:
+
+The first two arguments to ``__builtin_shufflevector`` are vectors that have
+the same element type. The remaining arguments are a list of integers that
+specify the element indices of the first two vectors that should be extracted
+and returned in a new vector. These element indices are numbered sequentially
+starting with the first vector, continuing into the second vector. Thus, if
+``vec1`` is a 4-element vector, index 5 would refer to the second element of
+``vec2``. An index of -1 can be used to indicate that the corresponding element
+in the returned vector is a don't care and can be optimized by the backend.
+
+The result of ``__builtin_shufflevector`` is a vector with the same element
+type as ``vec1``/``vec2`` but that has an element count equal to the number of
+indices specified.
+
+Query for this feature with ``__has_builtin(__builtin_shufflevector)``.
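+
+A sketch of guarding the builtin behind that query, with an element-wise
+fallback. The ``float4`` typedef and the ``make4``, ``lane``, and ``reverse4``
+helpers are hypothetical names, and the ``__has_builtin`` shim is an
+assumption for compilers that do not define it:
+
+.. code-block:: c++
+
+ #ifndef __has_builtin
+ #define __has_builtin(x) 0
+ #endif
+
+ typedef float float4 __attribute__((__vector_size__(16)));
+
+ // Builds a vector from four scalars.
+ inline float4 make4(float a, float b, float c, float d) {
+   float4 v = { a, b, c, d };
+   return v;
+ }
+
+ // Reads one element of a vector.
+ inline float lane(float4 v, int i) { return v[i]; }
+
+ // Returns the elements of v in reverse order.
+ inline float4 reverse4(float4 v) {
+ #if __has_builtin(__builtin_shufflevector)
+   return __builtin_shufflevector(v, v, 3, 2, 1, 0);
+ #else
+   float4 r = { v[3], v[2], v[1], v[0] };
+   return r;
+ #endif
+ }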
+
+``__builtin_convertvector``
+---------------------------
+
+``__builtin_convertvector`` is used to express generic vector
+type-conversion operations. The input vector and the output vector
+type must have the same number of elements.
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_convertvector(src_vec, dst_vec_type)
+
+**Examples**:
+
+.. code-block:: c++
+
+ typedef double vector4double __attribute__((__vector_size__(32)));
+ typedef float vector4float __attribute__((__vector_size__(16)));
+ typedef short vector4short __attribute__((__vector_size__(8)));
+ vector4float vf; vector4short vs;
+
+ // convert from a vector of 4 floats to a vector of 4 doubles.
+ __builtin_convertvector(vf, vector4double)
+ // equivalent to:
+ (vector4double) { (double) vf[0], (double) vf[1], (double) vf[2], (double) vf[3] }
+
+ // convert from a vector of 4 shorts to a vector of 4 floats.
+ __builtin_convertvector(vs, vector4float)
+ // equivalent to:
+ (vector4float) { (float) vs[0], (float) vs[1], (float) vs[2], (float) vs[3] }
+
+**Description**:
+
+The first argument to ``__builtin_convertvector`` is a vector, and the second
+argument is a vector type with the same number of elements as the first
+argument.
+
+The result of ``__builtin_convertvector`` is a vector with the same element
+type as the second argument, with a value defined in terms of the action of a
+C-style cast applied to each element of the first argument.
+
+Query for this feature with ``__has_builtin(__builtin_convertvector)``.
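+
+A sketch of such a guarded use, with a fallback that spells out the per-element
+casts described above. The typedefs and the ``make_short4``, ``lane``, and
+``to_float4`` helpers are hypothetical names, and the ``__has_builtin`` shim
+is an assumption for compilers that do not define it:
+
+.. code-block:: c++
+
+ #ifndef __has_builtin
+ #define __has_builtin(x) 0
+ #endif
+
+ typedef short short4 __attribute__((__vector_size__(8)));
+ typedef float float4 __attribute__((__vector_size__(16)));
+
+ // Builds a vector from four scalars.
+ inline short4 make_short4(short a, short b, short c, short d) {
+   short4 v = { a, b, c, d };
+   return v;
+ }
+
+ // Reads one element of a vector.
+ inline float lane(float4 v, int i) { return v[i]; }
+
+ // Converts each short element to float.
+ inline float4 to_float4(short4 v) {
+ #if __has_builtin(__builtin_convertvector)
+   return __builtin_convertvector(v, float4);
+ #else
+   float4 r = { (float) v[0], (float) v[1], (float) v[2], (float) v[3] };
+   return r;
+ #endif
+ }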
+
+``__builtin_unreachable``
+-------------------------
+
+``__builtin_unreachable`` is used to indicate that a specific point in the
+program cannot be reached, even if the compiler might otherwise think it can.
+This is useful for improving optimization and eliminating certain warnings. For
+example, without the ``__builtin_unreachable`` in the example below, the
+compiler assumes that the inline asm can fall through and prints a "function
+declared '``noreturn``' should not return" warning.
+
+**Syntax**:
+
+.. code-block:: c++
+
+ __builtin_unreachable()
+
+**Example of use**:
+
+.. code-block:: c++
+
+ void myabort(void) __attribute__((noreturn));
+ void myabort(void) {
+ asm("int3");
+ __builtin_unreachable();
+ }
+
+**Description**:
+
+Reaching a call to ``__builtin_unreachable()`` has completely undefined
+behavior. The call therefore acts as a statement that it is never reached, and
+the optimizer can take advantage of this to produce better code. This builtin
+takes no arguments and produces a void result.
+
+Query for this feature with ``__has_builtin(__builtin_unreachable)``.
+
+``__sync_swap``
+---------------
+
+``__sync_swap`` is used to atomically swap integers or pointers in memory.
+
+**Syntax**:
+
+.. code-block:: c++
+
+ type __sync_swap(type *ptr, type value, ...)
+
+**Example of Use**:
+
+.. code-block:: c++
+
+ int old_value = __sync_swap(&value, new_value);
+
+**Description**:
+
+The ``__sync_swap()`` builtin extends the existing ``__sync_*()`` family of
+atomic intrinsics to allow code to atomically swap the current value with the
+new value. More importantly, it helps developers write more efficient and
+correct code by avoiding expensive loops around
+``__sync_bool_compare_and_swap()`` or relying on the platform-specific
+implementation details of ``__sync_lock_test_and_set()``. The
+``__sync_swap()`` builtin is a full barrier.
+
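+As a minimal sketch of how the exchange can be used, the helpers below build a
+simple spinlock on top of ``__sync_swap``. The names ``exchange_int``,
+``spin_lock`` and ``spin_unlock`` are hypothetical, and the
+``__atomic_exchange_n`` fallback for compilers that lack ``__sync_swap`` is an
+assumption of this example rather than part of the builtin's definition.
+

```c
/* Lets the feature check below compile on compilers without __has_builtin. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical helper: atomically store new_value in *ptr, return the old value. */
int exchange_int(int *ptr, int new_value) {
#if __has_builtin(__sync_swap)
  return __sync_swap(ptr, new_value);
#else
  /* Assumed fallback, not taken from this document: GCC-style atomic exchange. */
  return __atomic_exchange_n(ptr, new_value, __ATOMIC_SEQ_CST);
#endif
}

/* Spin until the previous value was 0, i.e. until this caller took the lock. */
void spin_lock(int *lock) {
  while (exchange_int(lock, 1) != 0) {
    /* busy-wait */
  }
}

/* Release the lock by swapping 0 back in. */
void spin_unlock(int *lock) {
  (void)exchange_int(lock, 0);
}
```

+
+A single unconditional swap is enough here, so no retry loop around a
+compare-and-swap is needed.
+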
+``__builtin_addressof``
+-----------------------
+
+``__builtin_addressof`` performs the functionality of the built-in ``&``
+operator, ignoring any ``operator&`` overload. This is useful in constant
+expressions in C++11, where there is no other way to take the address of an
+object that overloads ``operator&``.
+
+**Example of use**:
+
+.. code-block:: c++
+
+ template<typename T> constexpr T *addressof(T &value) {
+ return __builtin_addressof(value);
+ }
+
+``__builtin_operator_new`` and ``__builtin_operator_delete``
+------------------------------------------------------------
+
+``__builtin_operator_new`` allocates memory just like a non-placement non-class
+*new-expression*. This is exactly like directly calling the normal
+non-placement ``::operator new``, except that it allows certain optimizations
+that the C++ standard does not permit for a direct function call to
+``::operator new`` (in particular, removing ``new`` / ``delete`` pairs and
+merging allocations).
+
+Likewise, ``__builtin_operator_delete`` deallocates memory just like a
+non-class *delete-expression*, and is exactly like directly calling the normal
+``::operator delete``, except that it permits optimizations. Only the unsized
+form of ``__builtin_operator_delete`` is currently available.
+
+These builtins are intended for use in the implementation of ``std::allocator``
+and other similar allocation libraries, and are only available in C++.
+
+Multiprecision Arithmetic Builtins
+----------------------------------
+
+Clang provides a set of builtins which expose multiprecision arithmetic in a
+manner amenable to C. They all have the following form:
+
+.. code-block:: c
+
+ unsigned x = ..., y = ..., carryin = ..., carryout;
+ unsigned sum = __builtin_addc(x, y, carryin, &carryout);
+
+Thus one can form a multiprecision addition chain in the following manner:
+
+.. code-block:: c
+
+ unsigned *x, *y, *z, carryin=0, carryout;
+ z[0] = __builtin_addc(x[0], y[0], carryin, &carryout);
+ carryin = carryout;
+ z[1] = __builtin_addc(x[1], y[1], carryin, &carryout);
+ carryin = carryout;
+ z[2] = __builtin_addc(x[2], y[2], carryin, &carryout);
+ carryin = carryout;
+ z[3] = __builtin_addc(x[3], y[3], carryin, &carryout);
+
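+The same chain can be written as a loop over limbs. The following is only a
+sketch: ``add_limbs`` is a hypothetical helper name, limbs are assumed to be
+stored least significant first, and the portable branch is an assumed fallback
+for compilers that do not provide ``__builtin_addc``.
+

```c
#include <stddef.h>

/* Lets the feature check below compile on compilers without __has_builtin. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical helper: z = x + y over n limbs (least significant limb first).
   Returns the carry out of the most significant limb (0 or 1). */
unsigned add_limbs(const unsigned *x, const unsigned *y, unsigned *z, size_t n) {
  unsigned carry = 0;
  for (size_t i = 0; i < n; ++i) {
#if __has_builtin(__builtin_addc)
    unsigned carryout;
    z[i] = __builtin_addc(x[i], y[i], carry, &carryout);
    carry = carryout;
#else
    /* Assumed portable fallback: detect wrap-around after each partial add. */
    unsigned partial = x[i] + y[i];
    unsigned wrapped_first = partial < x[i];
    unsigned sum = partial + carry;
    unsigned wrapped_second = sum < partial;
    z[i] = sum;
    carry = wrapped_first | wrapped_second;
#endif
  }
  return carry;
}
```

+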
+The complete list of builtins is:
+
+.. code-block:: c
+
+ unsigned char __builtin_addcb (unsigned char x, unsigned char y, unsigned char carryin, unsigned char *carryout);
+ unsigned short __builtin_addcs (unsigned short x, unsigned short y, unsigned short carryin, unsigned short *carryout);
+ unsigned __builtin_addc (unsigned x, unsigned y, unsigned carryin, unsigned *carryout);
+ unsigned long __builtin_addcl (unsigned long x, unsigned long y, unsigned long carryin, unsigned long *carryout);
+ unsigned long long __builtin_addcll(unsigned long long x, unsigned long long y, unsigned long long carryin, unsigned long long *carryout);
+ unsigned char __builtin_subcb (unsigned char x, unsigned char y, unsigned char carryin, unsigned char *carryout);
+ unsigned short __builtin_subcs (unsigned short x, unsigned short y, unsigned short carryin, unsigned short *carryout);
+ unsigned __builtin_subc (unsigned x, unsigned y, unsigned carryin, unsigned *carryout);
+ unsigned long __builtin_subcl (unsigned long x, unsigned long y, unsigned long carryin, unsigned long *carryout);
+ unsigned long long __builtin_subcll(unsigned long long x, unsigned long long y, unsigned long long carryin, unsigned long long *carryout);
+
+Checked Arithmetic Builtins
+---------------------------
+
+Clang provides a set of builtins that implement checked arithmetic for
+security-critical applications in a manner that is fast and easily expressible
+in C. As an example of their usage:
+
+.. code-block:: c
+
+ errorcode_t security_critical_application(...) {
+ unsigned x, y, result;
+ ...
+ if (__builtin_umul_overflow(x, y, &result))
+ return kErrorCodeHackers;
+ ...
+ use_multiply(result);
+ ...
+ }
+
+A complete enumeration of the builtins is:
+
+.. code-block:: c
+
+ bool __builtin_uadd_overflow (unsigned x, unsigned y, unsigned *sum);
+ bool __builtin_uaddl_overflow (unsigned long x, unsigned long y, unsigned long *sum);
+ bool __builtin_uaddll_overflow(unsigned long long x, unsigned long long y, unsigned long long *sum);
+ bool __builtin_usub_overflow (unsigned x, unsigned y, unsigned *diff);
+ bool __builtin_usubl_overflow (unsigned long x, unsigned long y, unsigned long *diff);
+ bool __builtin_usubll_overflow(unsigned long long x, unsigned long long y, unsigned long long *diff);
+ bool __builtin_umul_overflow (unsigned x, unsigned y, unsigned *prod);
+ bool __builtin_umull_overflow (unsigned long x, unsigned long y, unsigned long *prod);
+ bool __builtin_umulll_overflow(unsigned long long x, unsigned long long y, unsigned long long *prod);
+ bool __builtin_sadd_overflow (int x, int y, int *sum);
+ bool __builtin_saddl_overflow (long x, long y, long *sum);
+ bool __builtin_saddll_overflow(long long x, long long y, long long *sum);
+ bool __builtin_ssub_overflow (int x, int y, int *diff);
+ bool __builtin_ssubl_overflow (long x, long y, long *diff);
+ bool __builtin_ssubll_overflow(long long x, long long y, long long *diff);
+ bool __builtin_smul_overflow (int x, int y, int *prod);
+ bool __builtin_smull_overflow (long x, long y, long *prod);
+ bool __builtin_smulll_overflow(long long x, long long y, long long *prod);
+
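+As a further sketch, the helpers below wrap two of the builtins listed above.
+``checked_array_bytes`` and ``checked_sum`` are hypothetical names chosen for
+illustration; each returns ``false`` when the corresponding builtin reports an
+overflow, following the convention of the usage example earlier in this
+section.
+

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: compute count * size for an allocation request.
   Returns false (and leaves *bytes unspecified for the caller) on overflow. */
bool checked_array_bytes(unsigned long count, unsigned long size,
                         unsigned long *bytes) {
  return !__builtin_umull_overflow(count, size, bytes);
}

/* Hypothetical helper: sum n ints, stopping at the first overflowing add.
   *total is only written when the whole sum fits in an int. */
bool checked_sum(const int *values, size_t n, int *total) {
  int acc = 0;
  for (size_t i = 0; i < n; ++i) {
    if (__builtin_sadd_overflow(acc, values[i], &acc))
      return false;
  }
  *total = acc;
  return true;
}
```

+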
+
+.. _langext-__c11_atomic:
+
+__c11_atomic builtins
+---------------------
+
+Clang provides a set of builtins which are intended to be used to implement
+C11's ``<stdatomic.h>`` header. These builtins provide the semantics of the
+``_explicit`` form of the corresponding C11 operation, and are named with a
+``__c11_`` prefix. The supported operations are:
+
+* ``__c11_atomic_init``
+* ``__c11_atomic_thread_fence``
+* ``__c11_atomic_signal_fence``
+* ``__c11_atomic_is_lock_free``
+* ``__c11_atomic_store``
+* ``__c11_atomic_load``
+* ``__c11_atomic_exchange``
+* ``__c11_atomic_compare_exchange_strong``
+* ``__c11_atomic_compare_exchange_weak``
+* ``__c11_atomic_fetch_add``
+* ``__c11_atomic_fetch_sub``
+* ``__c11_atomic_fetch_and``
+* ``__c11_atomic_fetch_or``
+* ``__c11_atomic_fetch_xor``
+
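+A minimal sketch of how two of these builtins might be called directly is
+shown below. The counter and the helper names ``record_event`` and
+``read_events`` are hypothetical, ``__ATOMIC_SEQ_CST`` is assumed to be
+predefined by the compiler, and the ``<stdatomic.h>`` branch is an assumed
+fallback for compilers other than Clang.
+

```c
#if !defined(__clang__)
#include <stdatomic.h> /* assumed fallback for non-Clang compilers */
#endif

/* Hypothetical shared counter; static storage, so it starts at zero. */
_Atomic(int) event_count;

/* Atomically add one to the counter and return its previous value. */
int record_event(void) {
#if defined(__clang__)
  return __c11_atomic_fetch_add(&event_count, 1, __ATOMIC_SEQ_CST);
#else
  return atomic_fetch_add_explicit(&event_count, 1, memory_order_seq_cst);
#endif
}

/* Atomically read the current counter value. */
int read_events(void) {
#if defined(__clang__)
  return __c11_atomic_load(&event_count, __ATOMIC_SEQ_CST);
#else
  return atomic_load_explicit(&event_count, memory_order_seq_cst);
#endif
}
```

+
+Application code would normally use ``<stdatomic.h>`` itself; calling the
+builtins directly is mainly of interest when implementing that header.
+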
+Low-level ARM exclusive memory builtins
+---------------------------------------
+
+Clang provides overloaded builtins giving direct access to the three key ARM
+instructions for implementing atomic operations.
+
+.. code-block:: c
+
+ T __builtin_arm_ldrex(const volatile T *addr);
+ T __builtin_arm_ldaex(const volatile T *addr);
+ int __builtin_arm_strex(T val, volatile T *addr);
+ int __builtin_arm_stlex(T val, volatile T *addr);
+ void __builtin_arm_clrex(void);
+
+The types ``T`` currently supported are:
+
+* Integer types with width at most 64 bits (or 128 bits on AArch64).
+* Floating-point types.
+* Pointer types.
+
+Note that the compiler does not guarantee it will not insert stores which clear
+the exclusive monitor in between an ``ldrex`` type operation and its paired
+``strex``. In practice this is usually only a risk when the extra store is on
+the same cache line as the variable being modified. Clang will only insert
+stack stores on its own, so it is best not to use these operations on variables
+with automatic storage duration.
+
+Also, loads and stores may be implicit in code written between the ``ldrex`` and
+``strex``. Clang will not necessarily mitigate the effects of these either, so
+care should be exercised.
+
+For these reasons the higher level atomic primitives should be preferred where
+possible.
+
+Non-standard C++11 Attributes
+=============================
+
+Clang's non-standard C++11 attributes live in the ``clang`` attribute
+namespace.
+
+Clang supports GCC's ``gnu`` attribute namespace. All GCC attributes which
+are accepted with the ``__attribute__((foo))`` syntax are also accepted as
+``[[gnu::foo]]``. This only extends to attributes which are specified by GCC
+(see the list of `GCC function attributes
+<http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html>`_, `GCC variable
+attributes <http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html>`_, and
+`GCC type attributes
+<http://gcc.gnu.org/onlinedocs/gcc/Type-Attributes.html>`_). As with the GCC
+implementation, these attributes must appertain to the *declarator-id* in a
+declaration, which means they must go either at the start of the declaration or
+immediately after the name being declared.
+
+For example, this applies the GNU ``unused`` attribute to ``a`` and ``f``, and
+also applies the GNU ``noreturn`` attribute to ``f``.
+
+.. code-block:: c++
+
+ [[gnu::unused]] int a, f [[gnu::noreturn]] ();
+
+Target-Specific Extensions
+==========================
+
+Clang supports some language features conditionally on some targets.
+
+ARM/AArch64 Language Extensions
+-------------------------------
+
+Memory Barrier Intrinsics
+^^^^^^^^^^^^^^^^^^^^^^^^^
+Clang implements the ``__dmb``, ``__dsb`` and ``__isb`` intrinsics as defined
+in the `ARM C Language Extensions Release 2.0
+<http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf>`_.
+Note that these intrinsics are implemented as motion barriers that block
+reordering of memory accesses and side effect instructions. Other instructions
+like simple arithmetic may be reordered around the intrinsic. If you expect to
+have no reordering at all, use inline assembly instead.
+
+X86/X86-64 Language Extensions
+------------------------------
+
+The X86 backend has these language extensions:
+
+Memory references off the GS segment
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Annotating a pointer with address space #256 causes it to be code generated
+relative to the X86 GS segment register, and address space #257 causes it to be
+relative to the X86 FS segment. Note that this is a very low-level
+feature that should only be used if you know what you're doing (for example in
+an OS kernel).
+
+Here is an example:
+
+.. code-block:: c++
+
+ #define GS_RELATIVE __attribute__((address_space(256)))
+ int foo(int GS_RELATIVE *P) {
+ return *P;
+ }
+
+This compiles to (on X86-32):
+
+.. code-block:: gas
+
+ _foo:
+ movl 4(%esp), %eax
+ movl %gs:(%eax), %eax
+ ret
+
+Extensions for Static Analysis
+==============================
+
+Clang supports additional attributes that are useful for documenting program
+invariants and rules for static analysis tools, such as the `Clang Static
+Analyzer <http://clang-analyzer.llvm.org/>`_. These attributes are documented
+in the analyzer's `list of source-level annotations
+<http://clang-analyzer.llvm.org/annotations.html>`_.
+
+
+Extensions for Dynamic Analysis
+===============================
+
+Use ``__has_feature(address_sanitizer)`` to check if the code is being built
+with :doc:`AddressSanitizer`.
+
+Use ``__has_feature(thread_sanitizer)`` to check if the code is being built
+with :doc:`ThreadSanitizer`.
+
+Use ``__has_feature(memory_sanitizer)`` to check if the code is being built
+with :doc:`MemorySanitizer`.
+
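+As a small sketch, such a check can be folded into an ordinary macro so that
+the rest of the code does not depend on ``__has_feature`` being available.
+``BUILT_WITH_ASAN`` and ``stress_iterations`` are hypothetical names, and the
+iteration counts are arbitrary values picked for illustration.
+

```c
/* Lets the check below compile on compilers without __has_feature. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Hypothetical macro: 1 when built with AddressSanitizer, 0 otherwise. */
#if __has_feature(address_sanitizer)
#define BUILT_WITH_ASAN 1
#else
#define BUILT_WITH_ASAN 0
#endif

/* Hypothetical helper: run fewer stress-test iterations in sanitized builds,
   on the assumption that instrumented code runs more slowly. */
unsigned stress_iterations(void) {
  return BUILT_WITH_ASAN ? 1000u : 100000u;
}
```

+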
+
+Extensions for selectively disabling optimization
+=================================================
+
+Clang provides a mechanism for selectively disabling optimizations in functions
+and methods.
+
+To disable optimizations in a single function definition, the GNU-style or C++11
+non-standard attribute ``optnone`` can be used.
+
+.. code-block:: c++
+
+ // The following functions will not be optimized.
+ // GNU-style attribute
+ __attribute__((optnone)) int foo() {
+ // ... code
+ }
+ // C++11 attribute
+ [[clang::optnone]] int bar() {
+ // ... code
+ }
+
+To facilitate disabling optimization for a range of function definitions, a
+range-based pragma is provided. Its syntax is ``#pragma clang optimize``
+followed by ``off`` or ``on``.
+
+All function definitions in the region between an ``off`` and the following
+``on`` will be decorated with the ``optnone`` attribute unless doing so would
+conflict with explicit attributes already present on the function (e.g. the
+ones that control inlining).
+
+.. code-block:: c++
+
+ #pragma clang optimize off
+ // This function will be decorated with optnone.
+ int foo() {
+ // ... code
+ }
+
+ // optnone conflicts with always_inline, so bar() will not be decorated.
+ __attribute__((always_inline)) int bar() {
+ // ... code
+ }
+ #pragma clang optimize on
+
+If no ``on`` is found to close an ``off`` region, the end of the region is the
+end of the compilation unit.
+
+Note that a stray ``#pragma clang optimize on`` does not selectively enable
+additional optimizations when compiling at low optimization levels. This feature
+can only be used to selectively disable optimizations.
+
+The pragma has an effect on functions only at the point of their definition; for
+function templates, this means that the state of the pragma at the point of an
+instantiation is not necessarily relevant. Consider the following example:
+
+.. code-block:: c++
+
+ template<typename T> T twice(T t) {
+ return 2 * t;
+ }
+
+ #pragma clang optimize off
+ template<typename T> T thrice(T t) {
+ return 3 * t;
+ }
+
+ int container(int a, int b) {
+ return twice(a) + thrice(b);
+ }
+ #pragma clang optimize on
+
+In this example, the definition of the template function ``twice`` is outside
+the pragma region, whereas the definition of ``thrice`` is inside the region.
+The ``container`` function is also in the region and will not be optimized, but
+it causes the instantiation of ``twice`` and ``thrice`` with an ``int`` type; of
+these two instantiations, ``twice`` will be optimized (because its definition
+was outside the region) and ``thrice`` will not be optimized.
+
+.. _langext-pragma-loop:
+
+Extensions for loop hint optimizations
+======================================
+
+The ``#pragma clang loop`` directive is used to specify hints for optimizing the
+subsequent for, while, do-while, or C++11 range-based for loop. The directive
+provides options for vectorization and interleaving. Loop hints can be specified
+before any loop and will be ignored if the optimization is not safe to apply.
+
+A vectorized loop performs multiple iterations of the original loop
+in parallel using vector instructions. The instruction set of the target
+processor determines which vector instructions are available and their vector
+widths. This restricts the types of loops that can be vectorized. The vectorizer
+automatically determines if the loop is safe and profitable to vectorize. A
+vector instruction cost model is used to select the vector width.
+
+Interleaving multiple loop iterations allows modern processors to further
+improve instruction-level parallelism (ILP) using advanced hardware features,
+such as multiple execution units and out-of-order execution. The vectorizer uses
+a cost model that depends on the register pressure and generated code size to
+select the interleaving count.
+
+Vectorization is enabled by ``vectorize(enable)`` and interleaving is enabled
+by ``interleave(enable)``. This is useful when compiling with ``-Os`` to
+manually enable vectorization or interleaving.
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize(enable)
+ #pragma clang loop interleave(enable)
+ for(...) {
+ ...
+ }
+
+The vector width is specified by ``vectorize_width(_value_)`` and the interleave
+count is specified by ``interleave_count(_value_)``, where
+_value_ is a positive integer. This is useful for specifying the optimal
+width/count of the set of target architectures supported by your application.
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize_width(2)
+ #pragma clang loop interleave_count(2)
+ for(...) {
+ ...
+ }
+
+Specifying a width/count of 1 disables the optimization, and is equivalent to
+``vectorize(disable)`` or ``interleave(disable)``.
+
+For convenience, multiple loop hints can be specified on a single line.
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize_width(4) interleave_count(8)
+ for(...) {
+ ...
+ }
+
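+For a concrete loop, a sketch might look like the following. The function
+``scale_add`` is a hypothetical example, and the ``__clang__`` guard is an
+assumption made here so that other compilers do not see the pragma.
+

```c
#include <stddef.h>

/* Hypothetical example: out[i] = a[i] * k + b[i] for i in [0, n).
   restrict tells the compiler the three arrays do not overlap. */
void scale_add(float *restrict out, const float *restrict a,
               const float *restrict b, float k, size_t n) {
#if defined(__clang__)
#pragma clang loop vectorize(enable) interleave(enable)
#endif
  for (size_t i = 0; i < n; ++i)
    out[i] = a[i] * k + b[i];
}
```

+
+The hints only request the optimizations; the result of the loop is the same
+whether or not they are applied.
+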
+If an optimization cannot be applied, any hints that apply to it will be
+ignored. For example, the hint ``vectorize_width(4)`` is ignored if the loop is
+not proven safe to vectorize. To identify and diagnose optimization issues, use
+the ``-Rpass``, ``-Rpass-missed``, and ``-Rpass-analysis`` command line
+options. See the user guide for details.
Added: www-releases/trunk/3.5.1/tools/clang/docs/LeakSanitizer.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/LeakSanitizer.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/LeakSanitizer.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/LeakSanitizer.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,32 @@
+================
+LeakSanitizer
+================
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+LeakSanitizer is a run-time memory leak detector. It can be combined with
+:doc:`AddressSanitizer` to get both memory error and leak detection.
+LeakSanitizer does not introduce any additional slowdown when used in this mode.
+The LeakSanitizer runtime can also be linked in separately to get leak detection
+only, at a minimal performance cost.
+
+Current status
+==============
+
+LeakSanitizer is experimental and supported only on x86\_64 Linux.
+
+The combined mode has been tested on fairly large software projects. The
+stand-alone mode has received much less testing.
+
+There are plans to support LeakSanitizer in :doc:`MemorySanitizer` builds.
+
+More Information
+================
+
+`https://code.google.com/p/address-sanitizer/wiki/LeakSanitizer
+<https://code.google.com/p/address-sanitizer/wiki/LeakSanitizer>`_
+
Added: www-releases/trunk/3.5.1/tools/clang/docs/LibASTMatchers.rst
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/3.5.1/tools/clang/docs/LibASTMatchers.rst?rev=225843&view=auto
==============================================================================
--- www-releases/trunk/3.5.1/tools/clang/docs/LibASTMatchers.rst (added)
+++ www-releases/trunk/3.5.1/tools/clang/docs/LibASTMatchers.rst Tue Jan 13 16:55:20 2015
@@ -0,0 +1,134 @@
+======================
+Matching the Clang AST
+======================
+
+This document explains how to use Clang's LibASTMatchers to match interesting
+nodes of the AST and execute code that uses the matched nodes. Combined with
+:doc:`LibTooling`, LibASTMatchers helps to write code-to-code transformation
+tools or query tools.
+
+We assume basic knowledge about the Clang AST. See the :doc:`Introduction
+to the Clang AST <IntroductionToTheClangAST>` if you want to learn more
+about how the AST is structured.
+
+.. FIXME: create tutorial and link to the tutorial
+
+Introduction
+------------
+
+LibASTMatchers provides a domain-specific language to create predicates on
+Clang's AST. This DSL is written in and can be used from C++, allowing users
+to write a single program to both match AST nodes and access the node's C++
+interface to extract attributes, source locations, or any other information
+provided on the AST level.
+
+AST matchers are predicates on nodes in the AST. Matchers are created by
+calling creator functions that allow building up a tree of matchers, where
+inner matchers are used to make the match more specific.
+
+For example, to create a matcher that matches all class or union declarations
+in the AST of a translation unit, you can call `recordDecl()
+<LibASTMatchersReference.html#recordDecl0Anchor>`_. To narrow the match down,
+for example to find all class or union declarations with the name "``Foo``",
+insert a `hasName <LibASTMatchersReference.html#hasName0Anchor>`_ matcher: the
+call ``recordDecl(hasName("Foo"))`` returns a matcher that matches classes or
+unions that are named "``Foo``", in any namespace. By default, matchers that
+accept multiple inner matchers use an implicit `allOf()
+<LibASTMatchersReference.html#allOf0Anchor>`_. This allows further narrowing
+down the match, for example to match all classes that are derived from
+"``Bar``": ``recordDecl(hasName("Foo"), isDerivedFrom("Bar"))``.
+
+How to create a matcher
+-----------------------
+
+With more than a thousand classes in the Clang AST, one can quickly get lost
+when trying to figure out how to create a matcher for a specific pattern. This
+section will teach you how to use a rigorous step-by-step pattern to build the
+matcher you are interested in. Note that there will always be matchers missing
+for some part of the AST. See the section about :ref:`how to write your own
+AST matchers <astmatchers-writing>` later in this document.
+
+.. FIXME: why is it linking back to the same section?!
+
+The precondition to using the matchers is to understand what the AST for what
+you want to match looks like. The
+:doc:`Introduction to the Clang AST <IntroductionToTheClangAST>` teaches you
+how to dump a translation unit's AST into a human-readable format.
+
+.. FIXME: Introduce link to ASTMatchersTutorial.html
+.. FIXME: Introduce link to ASTMatchersCookbook.html
+
+In general, the strategy to create the right matchers is:
+
+#. Find the outermost class in Clang's AST you want to match.
+#. Look at the `AST Matcher Reference <LibASTMatchersReference.html>`_ for
+ matchers that either match the node you're interested in or narrow down
+ attributes on the node.
+#. Create your outer match expression. Verify that it works as expected.
+#. Examine the matchers to find the next inner node you want to match.
+#. Repeat until the matcher is finished.
+
+.. _astmatchers-bind:
+
+Binding nodes in match expressions
+----------------------------------
+
+Matcher expressions allow you to specify which parts of the AST are interesting
+for a certain task. Often you will want to then do something with the nodes
+that were matched, like building source code transformations.
+
+To that end, matchers that match specific AST nodes (so called node matchers)
+are bindable; for example, ``recordDecl(hasName("MyClass")).bind("id")`` will
+bind the matched ``recordDecl`` node to the string "``id``", to be later
+retrieved in the `match callback
+<http://clang.llvm.org/doxygen/classclang_1_1ast__matchers_1_1MatchFinder_1_1MatchCallback.html>`_.
+
+.. FIXME: Introduce link to ASTMatchersTutorial.html
+.. FIXME: Introduce link to ASTMatchersCookbook.html
+
+Writing your own matchers
+-------------------------
+
+There are multiple different ways to define a matcher, depending on its type
+and flexibility.
+
+``VariadicDynCastAllOfMatcher<Base, Derived>``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Those match all nodes of type *Base* if they can be dynamically cast to
+*Derived*. The names of those matchers are nouns, which closely resemble
+*Derived*. ``VariadicDynCastAllOfMatchers`` are the backbone of the matcher
+hierarchy. Most often, your match expression will start with one of them, and
+you can :ref:`bind <astmatchers-bind>` the node they represent to ids for later
+processing.
+
+``VariadicDynCastAllOfMatchers`` are callable classes that model variadic
+template functions in C++03. They take an arbitrary number of
+``Matcher<Derived>`` and return a ``Matcher<Base>``.
+
+``AST_MATCHER_P(Type, Name, ParamType, Param)``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Most matcher definitions use the matcher creation macros. Those define both
+the matcher of type ``Matcher<Type>`` itself, and a matcher-creation function
+named *Name* that takes a parameter of type *ParamType* and returns the
+corresponding matcher.
+
+There are multiple matcher definition macros that deal with polymorphic return
+values and different parameter counts. See `ASTMatchersMacros.h
+<http://clang.llvm.org/doxygen/ASTMatchersMacros_8h.html>`_.
+
+.. _astmatchers-writing:
+
+Matcher creation functions
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Matchers are generated by nesting calls to matcher creation functions. Most of
+the time those functions are either created by using
+``VariadicDynCastAllOfMatcher`` or the matcher creation macros (see above).
+The free-standing functions are an indication that this matcher is just a
+combination of other matchers, as is for example the case with `callee
+<LibASTMatchersReference.html#callee1Anchor>`_.
+
+