[cfe-commits] r170812 - in /cfe/trunk/docs: DriverInternals.html DriverInternals.rst index.rst

Sean Silva silvas at purdue.edu
Thu Dec 20 16:16:54 PST 2012


Author: silvas
Date: Thu Dec 20 18:16:53 2012
New Revision: 170812

URL: http://llvm.org/viewvc/llvm-project?rev=170812&view=rev
Log:
docs: Convert DriverInternals to reST.

Added:
    cfe/trunk/docs/DriverInternals.rst
Removed:
    cfe/trunk/docs/DriverInternals.html
Modified:
    cfe/trunk/docs/index.rst

Removed: cfe/trunk/docs/DriverInternals.html
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/DriverInternals.html?rev=170811&view=auto
==============================================================================
--- cfe/trunk/docs/DriverInternals.html (original)
+++ cfe/trunk/docs/DriverInternals.html (removed)
@@ -1,523 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
-          "http://www.w3.org/TR/html4/strict.dtd">
-<html>
-  <head>
-    <title>Clang Driver Manual</title>
-    <link type="text/css" rel="stylesheet" href="../menu.css">
-    <link type="text/css" rel="stylesheet" href="../content.css">
-    <style type="text/css">
-      td {
-      vertical-align: top;
-      }
-    </style>
-  </head>
-  <body>
-
-    <!--#include virtual="../menu.html.incl"-->
-
-    <div id="content">
-
-      <h1>Driver Design & Internals</h1>
-
-      <ul>
-        <li><a href="#intro">Introduction</a></li>
-        <li><a href="#features">Features and Goals</a>
-        <ul>
-          <li><a href="#gcccompat">GCC Compatibility</a></li>
-          <li><a href="#components">Flexible</a></li>
-          <li><a href="#performance">Low Overhead</a></li>
-          <li><a href="#simple">Simple</a></li>
-        </ul>
-        </li>
-        <li><a href="#design">Design</a>
-        <ul>
-          <li><a href="#int_intro">Internals Introduction</a></li>
-          <li><a href="#int_overview">Design Overview</a></li>
-          <li><a href="#int_notes">Additional Notes</a>
-          <ul>
-            <li><a href="#int_compilation">The Compilation Object</a></li>
-            <li><a href="#int_unified_parsing">Unified Parsing & Pipelining</a></li>
-            <li><a href="#int_toolchain_translation">ToolChain Argument Translation</a></li>
-            <li><a href="#int_unused_warnings">Unused Argument Warnings</a></li>
-          </ul>
-          </li>
-          <li><a href="#int_gcc_concepts">Relation to GCC Driver Concepts</a></li>
-        </ul>
-        </li>
-      </ul>
-
-
-      <!-- ======================================================================= -->
-      <h2 id="intro">Introduction</h2>
-      <!-- ======================================================================= -->
-
-      <p>This document describes the Clang driver. The purpose of this
-        document is to describe both the motivation and design goals
-        for the driver, as well as details of the internal
-        implementation.</p>
-
-      <!-- ======================================================================= -->
-      <h2 id="features">Features and Goals</h2>
-      <!-- ======================================================================= -->
-
-      <p>The Clang driver is intended to be a production quality
-        compiler driver providing access to the Clang compiler and
-        tools, with a command line interface which is compatible with
-        the gcc driver.</p>
-
-      <p>Although the driver is part of and driven by the Clang
-        project, it is logically a separate tool which shares many of
-        the same goals as Clang:</p>
-
-      <p><b>Features</b>:</p>
-      <ul>
-        <li><a href="#gcccompat">GCC Compatibility</a></li>
-        <li><a href="#components">Flexible</a></li>
-        <li><a href="#performance">Low Overhead</a></li>
-        <li><a href="#simple">Simple</a></li>
-      </ul>
-
-      <!--=======================================================================-->
-      <h3 id="gcccompat">GCC Compatibility</h3>
-      <!--=======================================================================-->
-
-      <p>The number one goal of the driver is to ease the adoption of
-        Clang by allowing users to drop Clang into a build system
-        which was designed to call GCC. Although this makes the driver
-        much more complicated than might otherwise be necessary, we
-        decided that being very compatible with the gcc command line
-        interface was worth it in order to allow users to quickly test
-        clang on their projects.</p>
-
-      <!--=======================================================================-->
-      <h3 id="components">Flexible</h3>
-      <!--=======================================================================-->
-
-      <p>The driver was designed to be flexible and easily accommodate
-        new uses as we grow the clang and LLVM infrastructure. As one
-        example, the driver can easily support the introduction of
-        tools which have an integrated assembler, something we hope to
-        add to LLVM in the future.</p>
-
-      <p>Similarly, most of the driver functionality is kept in a
-        library which can be used to build other tools which want to
-        implement or accept a gcc-like interface.</p>
-
-      <!--=======================================================================-->
-      <h3 id="performance">Low Overhead</h3>
-      <!--=======================================================================-->
-
-      <p>The driver should have as little overhead as possible. In
-        practice, we found that the gcc driver by itself incurred a
-        small but meaningful overhead when compiling many small
-        files. The driver doesn't do much work compared to a
-        compilation, but we have tried to keep it as efficient as
-        possible by following a few simple principles:</p>
-      <ul>
-        <li>Avoid memory allocation and string copying when
-          possible.</li>
-
-        <li>Don't parse arguments more than once.</li>
-
-        <li>Provide a few simple interfaces for efficiently searching
-          arguments.</li>
-      </ul>
-
-      <!--=======================================================================-->
-      <h3 id="simple">Simple</h3>
-      <!--=======================================================================-->
-
-      <p>Finally, the driver was designed to be "as simple as
-        possible", given the other goals. Notably, trying to be
-        completely compatible with the gcc driver adds a significant
-        amount of complexity. However, the design of the driver
-        attempts to mitigate this complexity by dividing the process
-        into a number of independent stages instead of a single
-        monolithic task.</p>
-
-      <!-- ======================================================================= -->
-      <h2 id="design">Internal Design and Implementation</h2>
-      <!-- ======================================================================= -->
-
-      <ul>
-        <li><a href="#int_intro">Internals Introduction</a></li>
-        <li><a href="#int_overview">Design Overview</a></li>
-        <li><a href="#int_notes">Additional Notes</a></li>
-        <li><a href="#int_gcc_concepts">Relation to GCC Driver Concepts</a></li>
-      </ul>
-
-      <!--=======================================================================-->
-      <h3><a name="int_intro">Internals Introduction</a></h3>
-      <!--=======================================================================-->
-
-      <p>In order to satisfy the stated goals, the driver was designed
-        to completely subsume the functionality of the gcc executable;
-        that is, the driver should not need to delegate to gcc to
-        perform subtasks. On Darwin, this implies that the Clang
-        driver also subsumes the gcc driver-driver, which is used to
-        implement support for building universal images (binaries and
-        object files). This also implies that the driver should be
-        able to call the language specific compilers (e.g. cc1)
-        directly, which means that it must have enough information to
-        forward command line arguments to child processes
-        correctly.</p>
-
-      <!--=======================================================================-->
-      <h3><a name="int_overview">Design Overview</a></h3>
-      <!--=======================================================================-->
-
-      <p>The diagram below shows the significant components of the
-        driver architecture and how they relate to one another. The
-        orange components represent concrete data structures built by
-        the driver, the green components indicate conceptually
-        distinct stages which manipulate these data structures, and
-        the blue components are important helper classes. </p>
-
-      <div style="text-align:center">
-        <a href="DriverArchitecture.png">
-          <img width=400 src="DriverArchitecture.png"
-               alt="Driver Architecture Diagram">
-        </a>
-      </div>
-
-      <!--=======================================================================-->
-      <h3><a name="int_stages">Driver Stages</a></h3>
-      <!--=======================================================================-->
-
-      <p>The driver functionality is conceptually divided into five stages:</p>
-
-      <ol>
-        <li>
-          <b>Parse: Option Parsing</b>
-
-          <p>The command line argument strings are decomposed into
-            arguments (<tt>Arg</tt> instances). The driver expects to
-            understand all available options, although there is some
-            facility for just passing certain classes of options
-            through (like <tt>-Wl,</tt>).</p>
-
-          <p>Each argument corresponds to exactly one
-            abstract <tt>Option</tt> definition, which describes how
-            the option is parsed along with some additional
-            metadata. The Arg instances themselves are lightweight and
-            merely contain enough information for clients to determine
-            which option they correspond to and their values (if they
-            have additional parameters).</p>
-
-          <p>For example, a command line like "-Ifoo -I foo" would
-            parse to two Arg instances (a JoinedArg and a SeparateArg
-            instance), but each would refer to the same Option.</p>
-
-          <p>Options are lazily created in order to avoid populating
-            all Option classes when the driver is loaded. Most of the
-            driver code only needs to deal with options by their
-            unique ID (e.g., <tt>options::OPT_I</tt>).</p>
-
-          <p>Arg instances themselves do not generally store the
-            values of parameters. In many cases, this would
-            simply result in creating unnecessary string
-            copies. Instead, Arg instances are always embedded inside
-            an ArgList structure, which contains the original vector
-            of argument strings. Each Arg itself only needs to contain
-            an index into this vector instead of storing its values
-            directly.</p>
-
-          <p>The clang driver can dump the results of this
-            stage using the <tt>-ccc-print-options</tt> flag (which
-            must precede any actual command line arguments). For
-            example:</p>
-          <pre>
-            $ <b>clang -ccc-print-options -Xarch_i386 -fomit-frame-pointer -Wa,-fast -Ifoo -I foo t.c</b>
-            Option 0 - Name: "-Xarch_", Values: {"i386", "-fomit-frame-pointer"}
-            Option 1 - Name: "-Wa,", Values: {"-fast"}
-            Option 2 - Name: "-I", Values: {"foo"}
-            Option 3 - Name: "-I", Values: {"foo"}
-            Option 4 - Name: "<input>", Values: {"t.c"}
-          </pre>
-
-          <p>After this stage is complete the command line should be
-            broken down into well defined option objects with their
-            appropriate parameters.  Subsequent stages should rarely,
-            if ever, need to do any string processing.</p>
-        </li>
-
-        <li>
-          <b>Pipeline: Compilation Job Construction</b>
-
-          <p>Once the arguments are parsed, the tree of subprocess
-            jobs needed for the desired compilation sequence is
-            constructed. This involves determining the input files and
-            their types, what work is to be done on them (preprocess,
-            compile, assemble, link, etc.), and constructing a list of
-            Action instances for each task. The result is a list of
-            one or more top-level actions, each of which generally
-            corresponds to a single output (for example, an object or
-            linked executable).</p>
-
-          <p>The majority of Actions correspond to actual tasks;
-            however, there are two special Actions. The first is
-            InputAction, which simply serves to adapt an input
-            argument for use as an input to other Actions. The second
-            is BindArchAction, which conceptually alters the
-            architecture to be used for all of its input Actions.</p>
-
-          <p>The clang driver can dump the results of this
-            stage using the <tt>-ccc-print-phases</tt> flag. For
-            example:</p>
-          <pre>
-            $ <b>clang -ccc-print-phases -x c t.c -x assembler t.s</b>
-            0: input, "t.c", c
-            1: preprocessor, {0}, cpp-output
-            2: compiler, {1}, assembler
-            3: assembler, {2}, object
-            4: input, "t.s", assembler
-            5: assembler, {4}, object
-            6: linker, {3, 5}, image
-          </pre>
-          <p>Here the driver is constructing seven distinct actions,
-            four to compile the "t.c" input into an object file, two to
-            assemble the "t.s" input, and one to link them together.</p>
-
-          <p>A rather different compilation pipeline is shown here; in
-            this example there are two top level actions to compile
-            the input files into two separate object files, where each
-            object file is built using <tt>lipo</tt> to merge results
-            built for two separate architectures.</p>
-          <pre>
-            $ <b>clang -ccc-print-phases -c -arch i386 -arch x86_64 t0.c t1.c</b>
-            0: input, "t0.c", c
-            1: preprocessor, {0}, cpp-output
-            2: compiler, {1}, assembler
-            3: assembler, {2}, object
-            4: bind-arch, "i386", {3}, object
-            5: bind-arch, "x86_64", {3}, object
-            6: lipo, {4, 5}, object
-            7: input, "t1.c", c
-            8: preprocessor, {7}, cpp-output
-            9: compiler, {8}, assembler
-            10: assembler, {9}, object
-            11: bind-arch, "i386", {10}, object
-            12: bind-arch, "x86_64", {10}, object
-            13: lipo, {11, 12}, object
-          </pre>
-
-          <p>After this stage is complete the compilation process is
-            divided into a simple set of actions which need to be
-            performed to produce intermediate or final outputs (in
-            some cases, like <tt>-fsyntax-only</tt>, there is no
-            "real" final output). Phases are well known compilation
-            steps, such as "preprocess", "compile", "assemble",
-            "link", etc.</p>
-        </li>
-
-        <li>
-          <b>Bind: Tool & Filename Selection</b>
-
-          <p>This stage (in conjunction with the Translate stage)
-            turns the tree of Actions into a list of actual subprocesses
-            to run. Conceptually, the driver performs a top down
-            matching to assign Action(s) to Tools. The ToolChain is
-            responsible for selecting the tool to perform a particular
-            action; once selected the driver interacts with the tool
-            to see if it can match additional actions (for example, by
-            having an integrated preprocessor).</p>
-
-          <p>Once Tools have been selected for all actions, the driver
-            determines how the tools should be connected (for example,
-            using an inprocess module, pipes, temporary files, or user
-            provided filenames). If an output file is required, the
-            driver also computes the appropriate file name (the suffix
-            and file location depend on the input types and options
-            such as <tt>-save-temps</tt>).</p>
-
-          <p>The driver interacts with a ToolChain to perform the Tool
-            bindings. Each ToolChain contains information about all
-            the tools needed for compilation for a particular
-            architecture, platform, and operating system. A single
-            driver invocation may query multiple ToolChains during one
-            compilation in order to interact with tools for separate
-            architectures.</p>
-
-          <p>The results of this stage are not computed directly, but
-            the driver can print the results via
-            the <tt>-ccc-print-bindings</tt> option. For example:</p>
-          <pre>
-            $ <b>clang -ccc-print-bindings -arch i386 -arch ppc t0.c</b>
-            # "i386-apple-darwin9" - "clang", inputs: ["t0.c"], output: "/tmp/cc-Sn4RKF.s"
-            # "i386-apple-darwin9" - "darwin::Assemble", inputs: ["/tmp/cc-Sn4RKF.s"], output: "/tmp/cc-gvSnbS.o"
-            # "i386-apple-darwin9" - "darwin::Link", inputs: ["/tmp/cc-gvSnbS.o"], output: "/tmp/cc-jgHQxi.out"
-            # "ppc-apple-darwin9" - "gcc::Compile", inputs: ["t0.c"], output: "/tmp/cc-Q0bTox.s"
-            # "ppc-apple-darwin9" - "gcc::Assemble", inputs: ["/tmp/cc-Q0bTox.s"], output: "/tmp/cc-WCdicw.o"
-            # "ppc-apple-darwin9" - "gcc::Link", inputs: ["/tmp/cc-WCdicw.o"], output: "/tmp/cc-HHBEBh.out"
-            # "i386-apple-darwin9" - "darwin::Lipo", inputs: ["/tmp/cc-jgHQxi.out", "/tmp/cc-HHBEBh.out"], output: "a.out"
-          </pre>
-
-          <p>This shows the tool chain, tool, inputs and outputs which
-            have been bound for this compilation sequence. Here clang
-            is being used to compile t0.c on the i386 architecture and
-            darwin specific versions of the tools are being used to
-            assemble and link the result, but generic gcc versions of
-            the tools are being used on PowerPC.</p>
-        </li>
-
-        <li>
-          <b>Translate: Tool Specific Argument Translation</b>
-
-          <p>Once a Tool has been selected to perform a particular
-            Action, the Tool must construct concrete Jobs which will be
-            executed during compilation. The main work is in translating
-            from the gcc style command line options to whatever options
-            the subprocess expects.</p>
-
-          <p>Some tools, such as the assembler, only interact with a
-            handful of arguments and just determine the path of the
-            executable to call and pass on their input and output
-            arguments. Others, like the compiler or the linker, may
-            translate a large number of arguments in addition.</p>
-
-          <p>The ArgList class provides a number of simple helper
-            methods to assist with translating arguments; for example,
-            to pass on only the last of arguments corresponding to some
-            option, or all arguments for an option.</p>
-
-          <p>The result of this stage is a list of Jobs (executable
-            paths and argument strings) to execute.</p>
-        </li>
-
-        <li>
-          <b>Execute</b>
-          <p>Finally, the compilation pipeline is executed. This is
-            mostly straightforward, although there is some interaction
-            with options
-            like <tt>-pipe</tt>, <tt>-pass-exit-codes</tt>
-            and <tt>-time</tt>.</p>
-        </li>
-
-      </ol>
-
-      <!--=======================================================================-->
-      <h3><a name="int_notes">Additional Notes</a></h3>
-      <!--=======================================================================-->
-
-      <h4 id="int_compilation">The Compilation Object</h4>
-
-      <p>The driver constructs a Compilation object for each set of
-        command line arguments. The Driver itself is intended to be
-        invariant during construction of a Compilation; an IDE should be
-        able to construct a single long lived driver instance to use
-        for an entire build, for example.</p>
-
-      <p>The Compilation object holds information that is particular
-        to each compilation sequence. For example, the list of used
-        temporary files (which must be removed once compilation is
-        finished) and result files (which should be removed if
-        compilation fails).</p>
-
-      <h4 id="int_unified_parsing">Unified Parsing & Pipelining</h4>
-
-      <p>Parsing and pipelining both occur without reference to a
-        Compilation instance. This is by design; the driver expects that
-        both of these phases are platform neutral, with a few very well
-        defined exceptions such as whether the platform uses a driver
-        driver.</p>
-
-      <h4 id="int_toolchain_translation">ToolChain Argument Translation</h4>
-
-      <p>In order to match gcc very closely, the clang driver
-        currently allows tool chains to perform their own translation of
-        the argument list (into a new ArgList data structure). Although
-        this allows the clang driver to match gcc easily, it also makes
-        the driver operation much harder to understand (since the Tools
-        stop seeing some arguments the user provided, and see new ones
-        instead).</p>
-
-      <p>For example, on Darwin <tt>-gfull</tt> gets translated into two
-        separate arguments, <tt>-g</tt>
-        and <tt>-fno-eliminate-unused-debug-symbols</tt>. Trying to write Tool
-        logic to do something with <tt>-gfull</tt> will not work, because Tool
-        argument translation is done after the arguments have been
-        translated.</p>
-
-      <p>A long term goal is to remove this tool chain specific
-        translation, and instead force each tool to change its own logic
-        to do the right thing on the untranslated original arguments.</p>
-
-      <h4 id="int_unused_warnings">Unused Argument Warnings</h4>
-      <p>The driver operates by parsing all arguments but giving Tools
-        the opportunity to choose which arguments to pass on. One
-        downside of this infrastructure is that if the user misspells
-        some option, or is confused about which options to use, some
-        command line arguments the user really cared about may go
-        unused. This problem is particularly important when using
-        clang as a compiler, since the clang compiler does not support
-        anywhere near all the options that gcc does, and we want to make
-        sure users know which ones are being used.</p>
-
-      <p>To support this, the driver maintains a bit associated with
-        each argument of whether it has been used (at all) during the
-        compilation. This bit usually doesn't need to be set by hand,
-        as the key ArgList accessors will set it automatically.</p>
-
-      <p>When a compilation is successful (there are no errors), the
-        driver checks the bit and emits an "unused argument" warning for
-        any arguments which were never accessed. This is conservative
-        (the argument may not have been used to do what the user wanted)
-        but still catches the most obvious cases.</p>
-
-      <!--=======================================================================-->
-      <h3><a name="int_gcc_concepts">Relation to GCC Driver Concepts</a></h3>
-      <!--=======================================================================-->
-
-      <p>For those familiar with the gcc driver, this section provides
-        a brief overview of how things from the gcc driver map to the
-        clang driver.</p>
-
-      <ul>
-        <li>
-          <b>Driver Driver</b>
-          <p>The driver driver is fully integrated into the clang
-            driver. The driver simply constructs additional Actions to
-            bind the architecture during the <i>Pipeline</i>
-            phase. The tool chain specific argument translation is
-            responsible for handling <tt>-Xarch_</tt>.</p>
-
-          <p>The one caveat is that this approach
-            requires <tt>-Xarch_</tt> not be used to alter the
-            compilation itself (for example, one cannot
-            provide <tt>-S</tt> as an <tt>-Xarch_</tt> argument). The
-            driver attempts to reject such invocations, and overall
-            there isn't a good reason to abuse <tt>-Xarch_</tt> to
-            that end in practice.</p>
-
-          <p>The upside is that the clang driver is more efficient and
-            does little extra work to support universal builds. It also
-            provides better error reporting and UI consistency.</p>
-        </li>
-
-        <li>
-          <b>Specs</b>
-          <p>The clang driver has no direct correspondent for
-            "specs". The majority of the functionality that is
-            embedded in specs is in the Tool specific argument
-            translation routines. The parts of specs which control the
-            compilation pipeline are generally part of
-            the <i>Pipeline</i> stage.</p>
-        </li>
-
-        <li>
-          <b>Toolchains</b>
-          <p>The gcc driver has no direct understanding of tool
-            chains. Each gcc binary roughly corresponds to the
-            information which is embedded inside a single
-            ToolChain.</p>
-
-          <p>The clang driver is intended to be portable and support
-            complex compilation environments. All platform and tool
-            chain specific code should be protected behind either
-            abstract or well defined interfaces (such as whether the
-            platform supports use as a driver driver).</p>
-        </li>
-      </ul>
-    </div>
-  </body>
-</html>
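
The Parse stage and the "unused argument" bit described in the page above can be illustrated with a small Python sketch. This is not Clang's code or API: the names `Arg`, `ArgList`, `parse`, `get_all`, `unclaimed`, the `offset` field, and the string option IDs are all hypothetical, and only `-I` is modelled. It assumes each argument records an index into the original string vector rather than a stored value, and that accessors set the "used" bit as a side effect.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical option IDs, loosely mirroring options::OPT_I in the text.
OPT_I = "OPT_I"
OPT_INPUT = "OPT_INPUT"


@dataclass
class Arg:
    option: str            # ID of the Option this argument corresponds to
    index: int             # index into ArgList.strings where the value lives
    offset: int = 0        # assumption: characters to skip for joined forms
    claimed: bool = False  # the "has this been used" bit


@dataclass
class ArgList:
    strings: List[str]                       # original argument strings
    args: List[Arg] = field(default_factory=list)

    def value(self, arg: Arg) -> str:
        # Look the value up in the original vector on demand;
        # the Arg itself holds no string of its own.
        return self.strings[arg.index][arg.offset:]

    def get_all(self, option: str) -> List[Arg]:
        # Accessors mark everything they return as used.
        found = [a for a in self.args if a.option == option]
        for a in found:
            a.claimed = True
        return found

    def unclaimed(self) -> List[str]:
        # The string each never-accessed argument points at.
        return [self.strings[a.index] for a in self.args if not a.claimed]


def parse(strings: List[str]) -> ArgList:
    arglist = ArgList(list(strings))
    i = 0
    while i < len(strings):
        s = strings[i]
        if s == "-I":
            if i + 1 == len(strings):
                raise ValueError("-I requires a value")
            arglist.args.append(Arg(OPT_I, i + 1))        # separate form
            i += 2
        elif s.startswith("-I"):
            arglist.args.append(Arg(OPT_I, i, offset=2))  # joined form
            i += 1
        elif s.startswith("-"):
            arglist.args.append(Arg(s, i))  # other flags, keyed by spelling
            i += 1
        else:
            arglist.args.append(Arg(OPT_INPUT, i))
            i += 1
    return arglist
```

Parsing `["-Ifoo", "-I", "bar"]` with this sketch yields two `Arg` objects that share the option ID `OPT_I` but differ in `index` and `offset`, which is the joined/separate distinction the page draws. Anything still returned by `unclaimed()` after the tools have queried the list is a candidate for an "unused argument" warning.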

Added: cfe/trunk/docs/DriverInternals.rst
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/DriverInternals.rst?rev=170812&view=auto
==============================================================================
--- cfe/trunk/docs/DriverInternals.rst (added)
+++ cfe/trunk/docs/DriverInternals.rst Thu Dec 20 18:16:53 2012
@@ -0,0 +1,400 @@
+=========================
+Driver Design & Internals
+=========================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+This document describes the Clang driver. The purpose of this document
+is to describe both the motivation and design goals for the driver, as
+well as details of the internal implementation.
+
+Features and Goals
+==================
+
+The Clang driver is intended to be a production quality compiler driver
+providing access to the Clang compiler and tools, with a command line
+interface which is compatible with the gcc driver.
+
+Although the driver is part of and driven by the Clang project, it is
+logically a separate tool which shares many of the same goals as Clang:
+
+.. contents:: Features
+   :local:
+
+GCC Compatibility
+-----------------
+
+The number one goal of the driver is to ease the adoption of Clang by
+allowing users to drop Clang into a build system which was designed to
+call GCC. Although this makes the driver much more complicated than
+might otherwise be necessary, we decided that being very compatible with
+the gcc command line interface was worth it in order to allow users to
+quickly test clang on their projects.
+
+Flexible
+--------
+
+The driver was designed to be flexible and easily accommodate new uses
+as we grow the clang and LLVM infrastructure. As one example, the driver
+can easily support the introduction of tools which have an integrated
+assembler, something we hope to add to LLVM in the future.
+
+Similarly, most of the driver functionality is kept in a library which
+can be used to build other tools which want to implement or accept a
+gcc-like interface.
+
+Low Overhead
+------------
+
+The driver should have as little overhead as possible. In practice, we
+found that the gcc driver by itself incurred a small but meaningful
+overhead when compiling many small files. The driver doesn't do much
+work compared to a compilation, but we have tried to keep it as
+efficient as possible by following a few simple principles:
+
+-  Avoid memory allocation and string copying when possible.
+-  Don't parse arguments more than once.
+-  Provide a few simple interfaces for efficiently searching arguments.
+
+Simple
+------
+
+Finally, the driver was designed to be "as simple as possible", given
+the other goals. Notably, trying to be completely compatible with the
+gcc driver adds a significant amount of complexity. However, the design
+of the driver attempts to mitigate this complexity by dividing the
+process into a number of independent stages instead of a single
+monolithic task.
+
+Internal Design and Implementation
+==================================
+
+.. contents::
+   :local:
+   :depth: 1
+
+Internals Introduction
+----------------------
+
+In order to satisfy the stated goals, the driver was designed to
+completely subsume the functionality of the gcc executable; that is, the
+driver should not need to delegate to gcc to perform subtasks. On
+Darwin, this implies that the Clang driver also subsumes the gcc
+driver-driver, which is used to implement support for building universal
+images (binaries and object files). This also implies that the driver
+should be able to call the language specific compilers (e.g. cc1)
+directly, which means that it must have enough information to forward
+command line arguments to child processes correctly.
+
+Design Overview
+---------------
+
+The diagram below shows the significant components of the driver
+architecture and how they relate to one another. The orange components
+represent concrete data structures built by the driver, the green
+components indicate conceptually distinct stages which manipulate these
+data structures, and the blue components are important helper classes.
+
+.. image:: DriverArchitecture.png
+   :align: center
+   :alt: Driver Architecture Diagram
+
+Driver Stages
+-------------
+
+The driver functionality is conceptually divided into five stages:
+
+#. **Parse: Option Parsing**
+
+   The command line argument strings are decomposed into arguments
+   (``Arg`` instances). The driver expects to understand all available
+   options, although there is some facility for just passing certain
+   classes of options through (like ``-Wl,``).
+
+   Each argument corresponds to exactly one abstract ``Option``
+   definition, which describes how the option is parsed along with some
+   additional metadata. The Arg instances themselves are lightweight and
+   merely contain enough information for clients to determine which
+   option they correspond to and their values (if they have additional
+   parameters).
+
+   For example, a command line like "-Ifoo -I foo" would parse to two
+   Arg instances (a JoinedArg and a SeparateArg instance), but each
+   would refer to the same Option.
+
+   Options are lazily created in order to avoid populating all Option
+   classes when the driver is loaded. Most of the driver code only needs
+   to deal with options by their unique ID (e.g., ``options::OPT_I``).
+
+   Arg instances themselves do not generally store the values of
+   parameters. In many cases, this would simply result in creating
+   unnecessary string copies. Instead, Arg instances are always embedded
+   inside an ArgList structure, which contains the original vector of
+   argument strings. Each Arg itself only needs to contain an index into
+   this vector instead of storing its values directly.
+
+   The clang driver can dump the results of this stage using the
+   ``-ccc-print-options`` flag (which must precede any actual command
+   line arguments). For example:
+
+   .. code-block:: console
+
+      $ clang -ccc-print-options -Xarch_i386 -fomit-frame-pointer -Wa,-fast -Ifoo -I foo t.c
+      Option 0 - Name: "-Xarch_", Values: {"i386", "-fomit-frame-pointer"}
+      Option 1 - Name: "-Wa,", Values: {"-fast"}
+      Option 2 - Name: "-I", Values: {"foo"}
+      Option 3 - Name: "-I", Values: {"foo"}
+      Option 4 - Name: "<input>", Values: {"t.c"}
+
+   After this stage is complete, the command line should be broken down
+   into well-defined option objects with their appropriate parameters.
+   Subsequent stages should rarely, if ever, need to do any string
+   processing.
+
+#. **Pipeline: Compilation Job Construction**
+
+   Once the arguments are parsed, the tree of subprocess jobs needed for
+   the desired compilation sequence is constructed. This involves
+   determining the input files and their types, what work is to be done
+   on them (preprocess, compile, assemble, link, etc.), and constructing
+   a list of Action instances for each task. The result is a list of one
+   or more top-level actions, each of which generally corresponds to a
+   single output (for example, an object or linked executable).
+
+   The majority of Actions correspond to actual tasks; however, there are
+   two special Actions. The first is InputAction, which simply serves to
+   adapt an input argument for use as an input to other Actions. The
+   second is BindArchAction, which conceptually alters the architecture
+   to be used for all of its input Actions.
+
+   The clang driver can dump the results of this stage using the
+   ``-ccc-print-phases`` flag. For example:
+
+   .. code-block:: console
+
+      $ clang -ccc-print-phases -x c t.c -x assembler t.s
+      0: input, "t.c", c
+      1: preprocessor, {0}, cpp-output
+      2: compiler, {1}, assembler
+      3: assembler, {2}, object
+      4: input, "t.s", assembler
+      5: assembler, {4}, object
+      6: linker, {3, 5}, image
+
+   Here the driver is constructing seven distinct actions: four to
+   compile the "t.c" input into an object file, two to assemble the
+   "t.s" input, and one to link them together.
+
+   A rather different compilation pipeline is shown here; in this
+   example there are two top level actions to compile the input files
+   into two separate object files, where each object file is built using
+   ``lipo`` to merge results built for two separate architectures.
+
+   .. code-block:: console
+
+      $ clang -ccc-print-phases -c -arch i386 -arch x86_64 t0.c t1.c
+      0: input, "t0.c", c
+      1: preprocessor, {0}, cpp-output
+      2: compiler, {1}, assembler
+      3: assembler, {2}, object
+      4: bind-arch, "i386", {3}, object
+      5: bind-arch, "x86_64", {3}, object
+      6: lipo, {4, 5}, object
+      7: input, "t1.c", c
+      8: preprocessor, {7}, cpp-output
+      9: compiler, {8}, assembler
+      10: assembler, {9}, object
+      11: bind-arch, "i386", {10}, object
+      12: bind-arch, "x86_64", {10}, object
+      13: lipo, {11, 12}, object
+
+   After this stage is complete, the compilation process is divided into
+   a simple set of actions which need to be performed to produce
+   intermediate or final outputs (in some cases, like ``-fsyntax-only``,
+   there is no "real" final output). Phases are well known compilation
+   steps, such as "preprocess", "compile", "assemble", "link", etc.
+
+#. **Bind: Tool & Filename Selection**
+
+   This stage (in conjunction with the Translate stage) turns the tree
+   of Actions into a list of actual subprocesses to run. Conceptually, the
+   driver performs a top-down matching to assign Action(s) to Tools. The
+   ToolChain is responsible for selecting the tool to perform a
+   particular action; once selected the driver interacts with the tool
+   to see if it can match additional actions (for example, by having an
+   integrated preprocessor).
+
+   Once Tools have been selected for all actions, the driver determines
+   how the tools should be connected (for example, using an in-process
+   module, pipes, temporary files, or user provided filenames). If an
+   output file is required, the driver also computes the appropriate
+   file name (the suffix and file location depend on the input types and
+   options such as ``-save-temps``).
+
+   The driver interacts with a ToolChain to perform the Tool bindings.
+   Each ToolChain contains information about all the tools needed for
+   compilation for a particular architecture, platform, and operating
+   system. A single driver invocation may query multiple ToolChains
+   during one compilation in order to interact with tools for separate
+   architectures.
+
+   The results of this stage are not computed directly, but the driver
+   can print the results via the ``-ccc-print-bindings`` option. For
+   example:
+
+   .. code-block:: console
+
+      $ clang -ccc-print-bindings -arch i386 -arch ppc t0.c
+      # "i386-apple-darwin9" - "clang", inputs: ["t0.c"], output: "/tmp/cc-Sn4RKF.s"
+      # "i386-apple-darwin9" - "darwin::Assemble", inputs: ["/tmp/cc-Sn4RKF.s"], output: "/tmp/cc-gvSnbS.o"
+      # "i386-apple-darwin9" - "darwin::Link", inputs: ["/tmp/cc-gvSnbS.o"], output: "/tmp/cc-jgHQxi.out"
+      # "ppc-apple-darwin9" - "gcc::Compile", inputs: ["t0.c"], output: "/tmp/cc-Q0bTox.s"
+      # "ppc-apple-darwin9" - "gcc::Assemble", inputs: ["/tmp/cc-Q0bTox.s"], output: "/tmp/cc-WCdicw.o"
+      # "ppc-apple-darwin9" - "gcc::Link", inputs: ["/tmp/cc-WCdicw.o"], output: "/tmp/cc-HHBEBh.out"
+      # "i386-apple-darwin9" - "darwin::Lipo", inputs: ["/tmp/cc-jgHQxi.out", "/tmp/cc-HHBEBh.out"], output: "a.out"
+
+   This shows the tool chain, tool, inputs and outputs which have been
+   bound for this compilation sequence. Here clang is being used to
+   compile t0.c on the i386 architecture and Darwin-specific versions of
+   the tools are being used to assemble and link the result, but generic
+   gcc versions of the tools are being used on PowerPC.
+
+#. **Translate: Tool Specific Argument Translation**
+
+   Once a Tool has been selected to perform a particular Action, the
+   Tool must construct concrete Jobs which will be executed during
+   compilation. The main work is in translating from the gcc style
+   command line options to whatever options the subprocess expects.
+
+   Some tools, such as the assembler, only interact with a handful of
+   arguments and just determine the path of the executable to call and
+   pass on their input and output arguments. Others, like the compiler
+   or the linker, may translate a large number of arguments in addition.
+
+   The ArgList class provides a number of simple helper methods to
+   assist with translating arguments; for example, to pass on only the
+   last of the arguments corresponding to some option, or all arguments
+   for an option.
+
+   The result of this stage is a list of Jobs (executable paths and
+   argument strings) to execute.
+
+#. **Execute**
+
+   Finally, the compilation pipeline is executed. This is mostly
+   straightforward, although there is some interaction with options like
+   ``-pipe``, ``-pass-exit-codes`` and ``-time``.
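+
+As a rough illustration of the Parse stage above, the sketch below models
+how an Arg can refer back to the original argument strings by index
+instead of copying its value. ``ToyArg``, ``ToyArgList`` and ``parseArgs``
+are hypothetical names invented for this example; they are not the
+driver's real classes, and only ``-I`` and plain inputs are handled.
+
```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the driver's types; not the real clang API.
enum OptID { OPT_INPUT, OPT_I };
enum ArgKind { Positional, Joined, Separate };

struct ToyArg {
  OptID Opt;       // which Option this argument refers to
  ArgKind Kind;    // how it was spelled on the command line
  unsigned Index;  // index of the string that holds the value
  unsigned Offset; // where the value starts inside that string
};

struct ToyArgList {
  std::vector<std::string> Strings; // the original argument strings
  std::vector<ToyArg> Args;         // lightweight parsed arguments

  // Recover an argument's value from the original strings on demand.
  std::string value(const ToyArg &A) const {
    return Strings[A.Index].substr(A.Offset);
  }
};

// Parse only "-I<dir>", "-I <dir>", and plain inputs.
ToyArgList parseArgs(std::vector<std::string> Argv) {
  ToyArgList L;
  L.Strings = std::move(Argv);
  for (unsigned I = 0; I < L.Strings.size(); ++I) {
    const std::string &S = L.Strings[I];
    if (S == "-I" && I + 1 < L.Strings.size()) {
      L.Args.push_back({OPT_I, Separate, I + 1, 0});
      ++I; // the next string was consumed as the value
    } else if (S.size() > 2 && S.compare(0, 2, "-I") == 0) {
      L.Args.push_back({OPT_I, Joined, I, 2});
    } else {
      L.Args.push_back({OPT_INPUT, Positional, I, 0});
    }
  }
  return L;
}
```
+
+For the input ``-Ifoo -I foo t.c`` this yields two ``OPT_I`` arguments
+(one joined, one separate) plus one input, and both ``-I`` values resolve
+to ``foo`` without being stored in the arguments themselves.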
+
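+The Pipeline stage can be pictured in a similar way. The following is a
+minimal sketch, assuming a much simplified rule set (C inputs are
+preprocessed, compiled and assembled; assembler inputs are only
+assembled; everything is linked at the end). ``ToyAction``, ``ToyInput``
+and ``buildActions`` are hypothetical and do not mirror the real Action
+class hierarchy.
+
```cpp
#include <string>
#include <vector>

// Hypothetical, flattened model of an Action; not the real clang classes.
struct ToyAction {
  std::string Kind;             // "input", "preprocessor", ...
  std::string InputName;        // only meaningful for "input" actions
  std::vector<unsigned> Inputs; // indices of the actions this one consumes
  std::string OutputType;       // e.g. "cpp-output", "object", "image"
};

struct ToyInput {
  std::string Name;
  std::string Type; // "c" or "assembler" in this sketch
};

std::vector<ToyAction> buildActions(const std::vector<ToyInput> &Inputs) {
  std::vector<ToyAction> Actions;
  std::vector<unsigned> Objects; // indices of the object-producing actions

  // Append an action that consumes the most recently added one.
  auto chain = [&Actions](const std::string &Kind, const std::string &Out) {
    unsigned Prev = static_cast<unsigned>(Actions.size() - 1);
    Actions.push_back({Kind, "", {Prev}, Out});
  };

  for (const ToyInput &In : Inputs) {
    Actions.push_back({"input", In.Name, {}, In.Type});
    if (In.Type == "c") {
      chain("preprocessor", "cpp-output");
      chain("compiler", "assembler");
    }
    chain("assembler", "object");
    Objects.push_back(static_cast<unsigned>(Actions.size() - 1));
  }

  // A single top-level action links every object together.
  Actions.push_back({"linker", "", Objects, "image"});
  return Actions;
}
```
+
+Given ``t.c`` (type ``c``) and ``t.s`` (type ``assembler``), the sketch
+builds seven actions whose numbering and inputs line up with the first
+``-ccc-print-phases`` listing above.
+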
+Additional Notes
+----------------
+
+The Compilation Object
+^^^^^^^^^^^^^^^^^^^^^^
+
+The driver constructs a Compilation object for each set of command line
+arguments. The Driver itself is intended to be invariant during
+construction of a Compilation; an IDE should be able to construct a
+single long lived driver instance to use for an entire build, for
+example.
+
+The Compilation object holds information that is particular to each
+compilation sequence: for example, the list of used temporary files
+(which must be removed once compilation is finished) and result files
+(which should be removed if compilation fails).
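+
+The cleanup rule described above can be sketched as follows. This is only
+an illustration under assumed names: ``ToyCompilation`` and
+``filesToRemove`` are hypothetical and are not the interface of the real
+Compilation class.
+
```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of the bookkeeping described above; the names are
// illustrative and are not the real Compilation interface.
struct ToyCompilation {
  std::vector<std::string> TempFiles;   // always removed at the end
  std::vector<std::string> ResultFiles; // removed only on failure

  // Return the files that should be deleted once the jobs have run.
  std::set<std::string> filesToRemove(bool Succeeded) const {
    std::set<std::string> Doomed(TempFiles.begin(), TempFiles.end());
    if (!Succeeded)
      Doomed.insert(ResultFiles.begin(), ResultFiles.end());
    return Doomed;
  }
};
```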
+
+Unified Parsing & Pipelining
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Parsing and pipelining both occur without reference to a Compilation
+instance. This is by design; the driver expects that both of these
+phases are platform neutral, with a few very well defined exceptions
+such as whether the platform uses a driver driver.
+
+ToolChain Argument Translation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In order to match gcc very closely, the clang driver currently allows
+tool chains to perform their own translation of the argument list (into
+a new ArgList data structure). Although this allows the clang driver to
+match gcc easily, it also makes the driver operation much harder to
+understand (since the Tools stop seeing some arguments the user
+provided, and see new ones instead).
+
+For example, on Darwin ``-gfull`` gets translated into two separate
+arguments, ``-g`` and ``-fno-eliminate-unused-debug-symbols``. Trying to
+write Tool logic to do something with ``-gfull`` will not work, because
+Tool argument translation is done after the tool chain has already
+translated the arguments.
+
+A long term goal is to remove this tool chain specific translation, and
+instead force each tool to change its own logic to do the right thing on
+the untranslated original arguments.
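+
+To make the ``-gfull`` pitfall concrete, here is a small sketch that works
+on plain strings rather than on the driver's ArgList. The function names
+``translateForDarwin`` and ``toolSeesGFull`` are hypothetical; only the
+``-gfull`` rewrite itself is taken from the description above.
+
```cpp
#include <string>
#include <vector>

// Hypothetical sketch of a tool chain rewriting the argument list before
// any Tool sees it. Only the -gfull rule comes from the text above.
std::vector<std::string>
translateForDarwin(const std::vector<std::string> &Args) {
  std::vector<std::string> Out;
  for (const std::string &A : Args) {
    if (A == "-gfull") {
      Out.push_back("-g");
      Out.push_back("-fno-eliminate-unused-debug-symbols");
    } else {
      Out.push_back(A);
    }
  }
  return Out;
}

// A Tool that looks for -gfull in the translated list never finds it.
bool toolSeesGFull(const std::vector<std::string> &Translated) {
  for (const std::string &A : Translated)
    if (A == "-gfull")
      return true;
  return false;
}
```
+
+Because the Tool only ever receives the translated list, a check such as
+``toolSeesGFull`` returns false even though the user passed ``-gfull``.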
+
+Unused Argument Warnings
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The driver operates by parsing all arguments but giving Tools the
+opportunity to choose which arguments to pass on. One downside of this
+infrastructure is that if the user misspells some option, or is confused
+about which options to use, some command line arguments the user really
+cared about may go unused. This problem is particularly important when
+using clang as a compiler, since the clang compiler does not support
+anywhere near all the options that gcc does, and we want to make sure
+users know which ones are being used.
+
+To support this, the driver maintains a bit for each argument that
+records whether it has been used (at all) during the compilation.
+This bit usually doesn't need to be set by hand, as the key ArgList
+accessors will set it automatically.
+
+When a compilation is successful (there are no errors), the driver
+checks the bit and emits an "unused argument" warning for any arguments
+which were never accessed. This is conservative (the argument may not
+have been used to do what the user wanted) but still catches the most
+obvious cases.
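+
+The bookkeeping can be pictured with the sketch below, in which an
+accessor sets the bit as a side effect of a lookup. ``ClaimedArg``,
+``ClaimingArgList``, ``hasArg`` and ``unusedArgs`` are hypothetical names
+chosen for the example and should not be read as the driver's API; the
+flag ``-fnot-a-real-flag`` in the usage note is likewise made up.
+
```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the per-argument "used" bit; not the real API.
struct ClaimedArg {
  std::string Spelling;
  mutable bool Claimed; // set the first time the argument is looked up

  explicit ClaimedArg(std::string S)
      : Spelling(std::move(S)), Claimed(false) {}
};

struct ClaimingArgList {
  std::vector<ClaimedArg> Args;

  // Accessor that marks matching arguments as used as a side effect.
  bool hasArg(const std::string &Spelling) const {
    bool Found = false;
    for (const ClaimedArg &A : Args)
      if (A.Spelling == Spelling) {
        A.Claimed = true;
        Found = true;
      }
    return Found;
  }

  // Arguments nobody asked about; candidates for an "unused" warning.
  std::vector<std::string> unusedArgs() const {
    std::vector<std::string> Unused;
    for (const ClaimedArg &A : Args)
      if (!A.Claimed)
        Unused.push_back(A.Spelling);
    return Unused;
  }
};
```
+
+If the list holds ``-c`` and ``-fnot-a-real-flag`` and the only lookup is
+``hasArg("-c")``, then ``unusedArgs`` reports just ``-fnot-a-real-flag``.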
+
+Relation to GCC Driver Concepts
+-------------------------------
+
+For those familiar with the gcc driver, this section provides a brief
+overview of how things from the gcc driver map to the clang driver.
+
+-  **Driver Driver**
+
+   The driver driver is fully integrated into the clang driver. The
+   driver simply constructs additional Actions to bind the architecture
+   during the *Pipeline* phase. The tool chain specific argument
+   translation is responsible for handling ``-Xarch_``.
+
+   The one caveat is that this approach requires ``-Xarch_`` not be used
+   to alter the compilation itself (for example, one cannot provide
+   ``-S`` as an ``-Xarch_`` argument). The driver attempts to reject
+   such invocations, and overall there isn't a good reason to abuse
+   ``-Xarch_`` to that end in practice.
+
+   The upside is that the clang driver is more efficient and does little
+   extra work to support universal builds. It also provides better error
+   reporting and UI consistency.
+
+-  **Specs**
+
+   The clang driver has no direct correspondent for "specs". The
+   majority of the functionality that is embedded in specs is in the
+   Tool specific argument translation routines. The parts of specs which
+   control the compilation pipeline are generally part of the *Pipeline*
+   stage.
+
+-  **Toolchains**
+
+   The gcc driver has no direct understanding of tool chains. Each gcc
+   binary roughly corresponds to the information which is embedded
+   inside a single ToolChain.
+
+   The clang driver is intended to be portable and support complex
+   compilation environments. All platform and tool chain specific code
+   should be protected behind either abstract or well defined interfaces
+   (such as whether the platform supports use as a driver driver).

Modified: cfe/trunk/docs/index.rst
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/index.rst?rev=170812&r1=170811&r2=170812&view=diff
==============================================================================
--- cfe/trunk/docs/index.rst (original)
+++ cfe/trunk/docs/index.rst Thu Dec 20 18:16:53 2012
@@ -30,6 +30,7 @@
    UsersManual
    AutomaticReferenceCounting
    InternalsManual
+   DriverInternals
 
 Indices and tables
 ==================




