[PATCH] Re-factor TableGen docs

Aaron Ballman aaron.ballman at gmail.com
Wed Mar 19 10:10:11 PDT 2014


This is great! A few comments below:

> Index: docs/TableGen/BackEnds.rst
> ===================================================================
> --- /dev/null
> +++ docs/TableGen/BackEnds.rst
> @@ -0,0 +1,281 @@
> +=================
> +TableGen BackEnds
> +=================
> +
> +.. contents::
> +   :local:
> +
> +Introduction
> +============
> +
> +TableGen backends are at the core of TableGen's functionality. The source files
> +provide the semantics to a generated (in memory) structure, but it's up to the
> +backend to print this out in a way that is meaningful to the user (normally a
> +C program including a file or a textual list of warnings, options and error
> +messages).
> +
> +TableGen is used by both LLVM and Clang with very different goals. LLVM uses it
> +as a way to automate the generation of massive amounts of information regarding
> +instructions, schedules, cores and architecture features. Some backends generate
> +output that is consumed by more than one source file, so they need to be created
> +in a way that is easy to use pre-processor tricks. Some backends can also print
> +C code structures, so that they can be directly included as-is.
> +
> +Clang, on the other hand, uses it mainly for diagnostic messages (errors,
> +warnings, tips) and attributes, so more on the textual end of the scale.
> +
> +LLVM BackEnds
> +=============
> +
> +.. warning::
> +   This document is raw. Each section below needs three sub-sections: description
> +   of its purpose with a list of users, output generated from generic input, and
> +   finally why it needed a new backend (in case there's something similar).
> +
> +Emitter
> +-------
> +
> +Generate machine code emitter.
> +
> +RegisterInfo
> +------------
> +
> +Generate registers and register classes info.
> +
> +InstrInfo
> +---------
> +
> +Generate instruction descriptions.
> +
> +AsmWriter
> +---------
> +
> +Generate calling convention descriptions.
> +
> +AsmMatcher
> +----------
> +
> +Generate assembly writer.
> +
> +Disassembler
> +------------
> +
> +Generate disassembler.
> +
> +PseudoLowering
> +--------------
> +
> +Generate pseudo instruction lowering.
> +
> +CallingConv
> +-----------
> +
> +Generate assembly instruction matcher.
> +
> +DAGISel
> +-------
> +
> +Generate a DAG instruction selector.
> +
> +DFAPacketizer
> +-------------
> +
> +Generate DFA Packetizer for VLIW targets.
> +
> +FastISel
> +--------
> +
> +Generate a "fast" instruction selector.
> +
> +Subtarget
> +---------
> +
> +Generate subtarget enumerations.
> +
> +Intrinsic
> +---------
> +
> +Generate intrinsic information.
> +
> +TgtIntrinsic
> +------------
> +
> +Generate target intrinsic information.
> +
> +OptParserDefs
> +-------------
> +
> +Print enum values for a class.
> +
> +CTags
> +-----
> +
> +Generate ctags-compatible index.
> +
> +
> +Clang BackEnds
> +==============
> +
> +ClangAttrClasses
> +----------------
> +
> +Generate clang attribute clases.
> +
> +ClangAttrParserStringSwitches
> +-----------------------------
> +
> +Generate all parser-related attribute string switches.
> +
> +ClangAttrImpl
> +-------------
> +
> +Generate clang attribute implementations.
> +
> +ClangAttrList
> +-------------
> +
> +Generate a clang attribute list.
> +
> +ClangAttrPCHRead
> +----------------
> +
> +Generate clang PCH attribute reader.
> +
> +ClangAttrPCHWrite
> +-----------------
> +
> +Generate clang PCH attribute writer.
> +
> +ClangAttrSpellingList
> +---------------------
> +
> +Generate a clang attribute spelling list.
> +
> +ClangAttrSpellingListIndex
> +--------------------------
> +
> +Generate a clang attribute spelling index.
> +
> +ClangAttrASTVisitor
> +-------------------
> +
> +Generate a recursive AST visitor for clang attribute.
> +
> +ClangAttrTemplateInstantiate
> +----------------------------
> +
> +Generate a clang template instantiate code.
> +
> +ClangAttrParsedAttrList
> +-----------------------
> +
> +Generate a clang parsed attribute list.
> +
> +ClangAttrParsedAttrImpl
> +-----------------------
> +
> +Generate the clang parsed attribute helpers.
> +
> +ClangAttrParsedAttrKinds
> +------------------------
> +
> +Generate a clang parsed attribute kinds.
> +
> +ClangAttrDump
> +-------------
> +
> +Generate clang attribute dumper.
> +
> +ClangDiagsDefs
> +--------------
> +
> +Generate Clang diagnostics definitions.
> +
> +ClangDiagGroups
> +---------------
> +
> +Generate Clang diagnostic groups.
> +
> +ClangDiagsIndexName
> +-------------------
> +
> +Generate Clang diagnostic name index.
> +
> +ClangCommentNodes
> +-----------------
> +
> +Generate Clang AST comment nodes.
> +
> +ClangDeclNodes
> +--------------
> +
> +Generate Clang AST declaration nodes.
> +
> +ClangStmtNodes
> +--------------
> +
> +Generate Clang AST statement nodes.
> +
> +ClangSACheckers
> +---------------
> +
> +Generate Clang Static Analyzer checkers.
> +
> +ClangCommentHTMLTags
> +--------------------
> +
> +Generate efficient matchers for HTML tag names that are used in documentation comments.
> +
> +ClangCommentHTMLTagsProperties
> +------------------------------
> +
> +Generate efficient matchers for HTML tag properties.
> +
> +ClangCommentHTMLNamedCharacterReferences
> +----------------------------------------
> +
> +Generate function to translate named character references to UTF-8 sequences.
> +
> +ClangCommentCommandInfo
> +-----------------------
> +
> +Generate command properties for commands that are used in documentation comments.
> +
> +ClangCommentCommandList
> +-----------------------
> +
> +Generate list of commands that are used in documentation comments.
> +
> +ArmNeon
> +-------
> +
> +Generate arm_neon.h for clang.
> +
> +ArmNeonSema
> +-----------
> +
> +Generate ARM NEON sema support for clang.
> +
> +ArmNeonTest
> +-----------
> +
> +Generate ARM NEON tests for clang.
> +
> +AttrDocs
> +--------
> +
> +Generate attribute documentation.
> +
> +How to write a back-end
> +=======================
> +
> +TODO.
> +
> +Until we get a step-by-step HowTo for writing TableGen backends, you can at
> +least grab the boilerplate (build system, new files, etc.) from Clang's
> +r173931.
> +
> +TODO: How they work, how to write one.  This section should not contain details
> +about any particular backend, except maybe ``-print-enums`` as an example.  This
> +should highlight the APIs in ``TableGen/Record.h``.
> +
> Index: docs/TableGen/Deficiencies.rst
> ===================================================================
> --- /dev/null
> +++ docs/TableGen/Deficiencies.rst
> @@ -0,0 +1,29 @@
> +=====================
> +TableGen Deficiencies
> +=====================
> +
> +.. contents::
> +   :local:
> +
> +Introduction
> +============
> +
> +Despite being very generic, TableGen has some deficiencies that have been
> +pointed out numerous times. The common theme is that, while TableGen allows
> +you to build Domain-Specific-Languages, the final languages that you create
> +lack the power of other DSLs, which in turn increase considerably the size
> +and complexity of TableGen files.
> +
> +At the same time, TableGen allows you to create virtually any meaning of
> +the basic concepts via custom-made back-ends, which can pervert the original
> +design and make it very hard for newcomers to understand it.
> +
> +There are some in favour of extending the semantics even more, but making sure
> +back-ends adhere to strict rules. Others suggesting we should move to less,
> +more powerful DSLs designed with specific purposes, or even re-using existing

less, more powerful DSLs? Probably want to drop the "less"?

> +DSLs.
> +
> +Discussions
> +===========
> +
> +TODO: Add here concerns and proposals.

This seems more like it's meant for the mailing list than as a
user-facing document? Or is there a reason we need a long-lived
document for concerns and proposals?


> Index: docs/TableGen/LangRef.rst
> ===================================================================
> --- docs/TableGen/LangRef.rst
> +++ docs/TableGen/LangRef.rst
> @@ -2,8 +2,6 @@
>  TableGen Language Reference
>  ===========================
>
> -.. sectionauthor:: Sean Silva <silvas at purdue.edu>
> -
>  .. contents::
>     :local:
>
> @@ -18,369 +16,587 @@
>  in and of itself (i.e. how to understand a given construct in terms of how
>  it affects the final set of records represented by the TableGen file). If
>  you are unsure if this document is really what you are looking for, please
> -read :doc:`/TableGenFundamentals` first.
> -
> -Notation
> -========
> -
> -The lexical and syntax notation used here is intended to imitate
> -`Python's`_. In particular, for lexical definitions, the productions
> -operate at the character level and there is no implied whitespace between
> -elements. The syntax definitions operate at the token level, so there is
> -implied whitespace between tokens.
> -
> -.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation
> -
> -Lexical Analysis
> -================
> -
> -TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
> -comments.
> -
> -The following is a listing of the basic punctuation tokens::
> -
> -   - + [ ] { } ( ) < > : ; .  = ? #
> -
> -Numeric literals take one of the following forms:
> -
> -.. TableGen actually will lex some pretty strange sequences an interpret
> -   them as numbers. What is shown here is an attempt to approximate what it
> -   "should" accept.
> -
> -.. productionlist::
> -   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
> -   DecimalInteger: ["+" | "-"] ("0"..."9")+
> -   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
> -   BinInteger: "0b" ("0" | "1")+
> -
> -One aspect to note is that the :token:`DecimalInteger` token *includes* the
> -``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
> -most languages do.
> -
> -TableGen has identifier-like tokens:
> -
> -.. productionlist::
> -   ualpha: "a"..."z" | "A"..."Z" | "_"
> -   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
> -   TokVarName: "$" `ualpha` (`ualpha` |  "0"..."9")*
> -
> -Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
> -begin with a number. In case of ambiguity, a token will be interpreted as a
> -numeric literal rather than an identifier.
> -
> -TableGen also has two string-like literals:
> -
> -.. productionlist::
> -   TokString: '"' <non-'"' characters and C-like escapes> '"'
> -   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"
> -
> -:token:`TokCodeFragment` is essentially a multiline string literal
> -delimited by ``[{`` and ``}]``.
> -
> -.. note::
> -   The current implementation accepts the following C-like escapes::
> -
> -      \\ \' \" \t \n
> +read :doc:`the introduction <index>` first.
>
> -TableGen also has the following keywords::
> +TableGen syntax
> +===============
>
> -   bit   bits      class   code         dag
> -   def   foreach   defm    field        in
> -   int   let       list    multiclass   string
> +TableGen doesn't care about the meaning of data (that is up to the backend to
> +define), but it does care about syntax, and it enforces a simple type system.
> +This section describes the syntax and the constructs allowed in a TableGen file.
>
> -TableGen also has "bang operators" which have a
> -wide variety of meanings:
> +TableGen primitives
> +-------------------
>
> -.. productionlist::
> -   BangOperator: one of
> -               :!eq     !if      !head    !tail      !con
> -               :!add    !shl     !sra     !srl
> -               :!cast   !empty   !subst   !foreach   !strconcat
> +TableGen comments
> +^^^^^^^^^^^^^^^^^
>
> -Syntax
> -======
> +TableGen supports BCPL style "``//``" comments, which run to the end of the
> +line, and it also supports **nestable** "``/* */``" comments.

C++-style comments instead of BCPL (not everyone is a programming
language historian, but everyone reading this doc needs to understand
C++)?

>
> -TableGen has an ``include`` mechanism. It does not play a role in the
> -syntax per se, since it is lexically replaced with the contents of the
> -included file.
> +.. _TableGen type:
>
> -.. productionlist::
> -   IncludeDirective: "include" `TokString`
> +The TableGen type system
> +^^^^^^^^^^^^^^^^^^^^^^^^
>
> -TableGen's top-level production consists of "objects".
> +TableGen files are strongly typed, in a simple (but complete) type-system.
> +These types are used to perform automatic conversions, check for errors, and to
> +help interface designers constrain the input that they allow.  Every `value
> +definition`_ is required to have an associated type.
>
> -.. productionlist::
> -   TableGenFile: `Object`*
> -   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`
> +TableGen supports a mixture of very low-level types (such as ``bit``) and very
> +high-level types (such as ``dag``).  This flexibility is what allows it to
> +describe a wide range of information conveniently and compactly.  The TableGen
> +types are:
>
> -``class``\es
> -------------
> +``bit``
> +    A 'bit' is a boolean value that can hold either 0 or 1.
>
> -.. productionlist::
> -   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`
> +``int``
> +    The 'int' type represents a simple 32-bit integer value, such as 5.

Signed integer?

>
> -A ``class`` declaration creates a record which other records can inherit
> -from. A class can be parametrized by a list of "template arguments", whose
> -values can be used in the class body.
> +``string``
> +    The 'string' type represents an ordered sequence of characters of arbitrary
> +    length.

Do we want to mention anything about encodings and null terminators?

>
> -A given class can only be defined once. A ``class`` declaration is
> -considered to define the class if any of the following is true:
> +``bits<n>``
> +    A 'bits' type is an arbitrary, but fixed, size integer that is broken up
> +    into individual bits.  This type is useful because it can handle some bits
> +    being defined while others are undefined.
>
> -.. break ObjectBody into its consituents so that they are present here?
> +``list<ty>``
> +    This type represents a list whose elements are some other type.  The
> +    contained type is arbitrary: it can even be another list type.
>
> -#. The :token:`TemplateArgList` is present.
> -#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
> -#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.
> +Class type
> +    Specifying a class name in a type context means that the defined value must
> +    be a subclass of the specified class.  This is useful in conjunction with
> +    the ``list`` type, for example, to constrain the elements of the list to a
> +    common base class (e.g., a ``list<Register>`` can only contain definitions
> +    derived from the "``Register``" class).
>
> -You can declare an empty class by giving and empty :token:`TemplateArgList`
> -and an empty :token:`ObjectBody`. This can serve as a restricted form of
> -forward declaration: note that records deriving from the forward-declared
> -class will inherit no fields from it since the record expansion is done
> -when the record is parsed.
> +``dag``
> +    This type represents a nestable directed graph of elements.
>
> -.. productionlist::
> -   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"
> +To date, these types have been sufficient for describing things that TableGen
> +has been used for, but it is straight-forward to extend this list if needed.

What about the "code" type (which is basically just a fancy way to
specify a multiline string literal)?

>
> -Declarations
> -------------
> +.. _TableGen expressions:
>
> -.. Omitting mention of arcane "field" prefix to discourage its use.
> +TableGen values and expressions
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> -The declaration syntax is pretty much what you would expect as a C++
> -programmer.
> +TableGen allows for a pretty reasonable number of different expression forms
> +when building up values.  These forms allow the TableGen file to be written in a
> +natural syntax and flavor for the application.  The current expression forms
> +supported include:
>
> -.. productionlist::
> -   Declaration: `Type` `TokIdentifier` ["=" `Value`]
> +``?``
> +    uninitialized field
>
> -It assigns the value to the identifer.
> +``0b1001011``
> +    binary integer value
>
> -Types
> ------
> +``07654321``
> +    octal integer value (indicated by a leading 0)
>
> -.. productionlist::
> -   Type: "string" | "code" | "bit" | "int" | "dag"
> -       :| "bits" "<" `TokInteger` ">"
> -       :| "list" "<" `Type` ">"
> -       :| `ClassID`
> -   ClassID: `TokIdentifier`
> +``7``
> +    decimal integer value
>
> -Both ``string`` and ``code`` correspond to the string type; the difference
> -is purely to indicate programmer intention.
> +``0x7F``
> +    hexadecimal integer value
>
> -The :token:`ClassID` must identify a class that has been previously
> -declared or defined.
> +``"foo"``
> +    string value
>
> -Values
> -------
> +``[{ ... }]``
> +    usually called a "code fragment", but is just a multiline string literal
>
> -.. productionlist::
> -   Value: `SimpleValue` `ValueSuffix`*
> -   ValueSuffix: "{" `RangeList` "}"
> -              :| "[" `RangeList` "]"
> -              :| "." `TokIdentifier`
> -   RangeList: `RangePiece` ("," `RangePiece`)*
> -   RangePiece: `TokInteger`
> -             :| `TokInteger` "-" `TokInteger`
> -             :| `TokInteger` `TokInteger`
> +``[ X, Y, Z ]<type>``
> +    list value.  <type> is the type of the list element and is usually optional.
> +    In rare cases, TableGen is unable to deduce the element type in which case
> +    the user must specify it explicitly.
>
> -The peculiar last form of :token:`RangePiece` is due to the fact that the
> -"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
> -two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
> -instead of "1", "-", and "5".
> -The :token:`RangeList` can be thought of as specifying "list slice" in some
> -contexts.
> +``{ a, b, c }``
> +    initializer for a "bits<3>" value
>
> +``value``
> +    value reference
>
> -:token:`SimpleValue` has a number of forms:
> +``value{17}``
> +    access to one bit of a value
>
> +``value{15-17}``
> +    access to multiple bits of a value
>
> -.. productionlist::
> -   SimpleValue: `TokIdentifier`
> +``DEF``
> +    reference to a record definition
>
> -The value will be the variable referenced by the identifier. It can be one
> -of:
> +``CLASS<val list>``
> +    reference to a new anonymous definition of CLASS with the specified template
> +    arguments.
>
> -.. The code for this is exceptionally abstruse. These examples are a
> -   best-effort attempt.
> +``X.Y``
> +    reference to the subfield of a value
>
> -* name of a ``def``, such as the use of ``Bar`` in::
> +``list[4-7,17,2-3]``
> +    A slice of the 'list' list, including elements 4,5,6,7,17,2, and 3 from it.
> +    Elements may be included multiple times.
>
> -     def Bar : SomeClass {
> -       int X = 5;
> -     }
> +``foreach <var> = [ <list> ] in { <body> }``
>
> -     def Foo {
> -       SomeClass Baz = Bar;
> -     }
> +``foreach <var> = [ <list> ] in <def>``
> +    Replicate <body> or <def>, replacing instances of <var> with each value
> +    in <list>.  <var> is scoped at the level of the ``foreach`` loop and must
> +    not conflict with any other object introduced in <body> or <def>.  Currently
> +    only ``def``\s are expanded within <body>.
>
> -* value local to a ``def``, such as the use of ``Bar`` in::
> +``foreach <var> = 0-15 in ...``
>
> -     def Foo {
> -       int Bar = 5;
> -       int Baz = Bar;
> -     }
> +``foreach <var> = {0-15,32-47} in ...``
> +    Loop over ranges of integers. The braces are required for multiple ranges.
>
> -* a template arg of a ``class``, such as the use of ``Bar`` in::
> +``(DEF a, b)``
> +    a dag value.  The first element is required to be a record definition, the
> +    remaining elements in the list may be arbitrary other values, including
> +    nested ```dag``' values.
>
> -     class Foo<int Bar> {
> -       int Baz = Bar;
> -     }
> +``!strconcat(a, b)``
> +    A string value that is the result of concatenating the 'a' and 'b' strings.
>
> -* value local to a ``multiclass``, such as the use of ``Bar`` in::
> +``str1#str2``
> +    "#" (paste) is a shorthand for !strconcat.  It may concatenate things that
> +    are not quoted strings, in which case an implicit !cast<string> is done on
> +    the operand of the paste.
>
> -     multiclass Foo {
> -       int Bar = 5;
> -       int Baz = Bar;
> -     }
> +``!cast<type>(a)``
> +    A symbol of type *type* obtained by looking up the string 'a' in the symbol
> +    table.  If the type of 'a' does not match *type*, TableGen aborts with an
> +    error. !cast<string> is a special case in that the argument must be an
> +    object defined by a 'def' construct.
>
> -* a template arg to a ``multiclass``, such as the use of ``Bar`` in::
> +``!subst(a, b, c)``
> +    If 'a' and 'b' are of string type or are symbol references, substitute 'b'
> +    for 'a' in 'c.'  This operation is analogous to $(subst) in GNU make.
>
> -     multiclass Foo<int Bar> {
> -       int Baz = Bar;
> -     }
> +``!foreach(a, b, c)``
> +    For each member 'b' of dag or list 'a' apply operator 'c.'  'b' is a dummy
> +    variable that should be declared as a member variable of an instantiated
> +    class.  This operation is analogous to $(foreach) in GNU make.
>
> -.. productionlist::
> -   SimpleValue: `TokInteger`
> +``!head(a)``
> +    The first element of list 'a.'
>
> -This represents the numeric value of the integer.
> +``!tail(a)``
> +    The 2nd-N elements of list 'a.'
>
> -.. productionlist::
> -   SimpleValue: `TokString`+
> +``!empty(a)``
> +    An integer {0,1} indicating whether list 'a' is empty.
>
> -Multiple adjacent string literals are concatenated like in C/C++. The value
> -is the concatenation of the strings.
> +``!if(a,b,c)``
> +  'b' if the result of 'int' or 'bit' operator 'a' is nonzero, 'c' otherwise.
>
> -.. productionlist::
> -   SimpleValue: `TokCodeFragment`
> +``!eq(a,b)``
> +    'bit 1' if string a is equal to string b, 0 otherwise.  This only operates
> +    on string, int and bit objects.  Use !cast<string> to compare other types of
> +    objects.
>
> -The value is the string value of the code fragment.
> +Note that all of the values have rules specifying how they convert to values
> +for different types.  These rules allow you to assign a value like "``7``"
> +to a "``bits<4>``" value, for example.
>
> -.. productionlist::
> -   SimpleValue: "?"
> +Classes and definitions
> +-----------------------
>
> -``?`` represents an "unset" initializer.
> +As mentioned in the :doc:`introduction <index>`, classes and definitions (collectively known as
> +'records') in TableGen are the main high-level unit of information that TableGen
> +collects.  Records are defined with a ``def`` or ``class`` keyword, the record
> +name, and an optional list of "`template arguments`_".  If the record has
> +superclasses, they are specified as a comma separated list that starts with a
> +colon character ("``:``").  If `value definitions`_ or `let expressions`_ are
> +needed for the class, they are enclosed in curly braces ("``{}``"); otherwise,
> +the record ends with a semicolon.
>
> -.. productionlist::
> -   SimpleValue: "{" `ValueList` "}"
> -   ValueList: [`ValueListNE`]
> -   ValueListNE: `Value` ("," `Value`)*
> +Here is a simple TableGen file:
>
> -This represents a sequence of bits, as would be used to initialize a
> -``bits<n>`` field (where ``n`` is the number of bits).
> +.. code-block:: llvm
>
> -.. productionlist::
> -   SimpleValue: `ClassID` "<" `ValueListNE` ">"
> +  class C { bit V = 1; }
> +  def X : C;
> +  def Y : C {
> +    string Greeting = "hello";
> +  }
> +
> +This example defines two definitions, ``X`` and ``Y``, both of which derive from
> +the ``C`` class.  Because of this, they both get the ``V`` bit value.  The ``Y``
> +definition also gets the Greeting member as well.
> +
> +In general, classes are useful for collecting together the commonality between a
> +group of records and isolating it in a single place.  Also, classes permit the
> +specification of default values for their subclasses, allowing the subclasses to
> +override them as they wish.
>
> -This generates a new anonymous record definition (as would be created by an
> -unnamed ``def`` inheriting from the given class with the given template
> -arguments) and the value is the value of that record definition.
> +.. _value definition:
> +.. _value definitions:
> +
> +Value definitions
> +^^^^^^^^^^^^^^^^^
> +
> +Value definitions define named entries in records.  A value must be defined
> +before it can be referred to as the operand for another value definition or
> +before the value is reset with a `let expression`_.  A value is defined by
> +specifying a `TableGen type`_ and a name.  If an initial value is available, it
> +may be specified after the type with an equal sign.  Value definitions require
> +terminating semicolons.
> +
> +.. _let expression:
> +.. _let expressions:
> +.. _"let" expressions within a record:
> +
> +'let' expressions
> +^^^^^^^^^^^^^^^^^
> +
> +A record-level let expression is used to change the value of a value definition
> +in a record.  This is primarily useful when a superclass defines a value that a
> +derived class or definition wants to override.  Let expressions consist of the
> +'``let``' keyword followed by a value name, an equal sign ("``=``"), and a new
> +value.  For example, a new class could be added to the example above, redefining
> +the ``V`` field for all of its subclasses:
> +
> +.. code-block:: llvm
> +
> +  class D : C { let V = 0; }
> +  def Z : D;
> +
> +In this case, the ``Z`` definition will have a zero value for its ``V`` value,
> +despite the fact that it derives (indirectly) from the ``C`` class, because the
> +``D`` class overrode its value.
> +
> +.. _template arguments:
> +
> +Class template arguments
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +TableGen permits the definition of parameterized classes as well as normal
> +concrete classes.  Parameterized TableGen classes specify a list of variable
> +bindings (which may optionally have defaults) that are bound when used.  Here is
> +a simple example:
> +
> +.. code-block:: llvm
> +
> +  class FPFormat<bits<3> val> {
> +    bits<3> Value = val;
> +  }
> +  def NotFP      : FPFormat<0>;
> +  def ZeroArgFP  : FPFormat<1>;
> +  def OneArgFP   : FPFormat<2>;
> +  def OneArgFPRW : FPFormat<3>;
> +  def TwoArgFP   : FPFormat<4>;
> +  def CompareFP  : FPFormat<5>;
> +  def CondMovFP  : FPFormat<6>;
> +  def SpecialFP  : FPFormat<7>;
> +
> +In this case, template arguments are used as a space efficient way to specify a
> +list of "enumeration values", each with a "``Value``" field set to the specified
> +integer.
> +
> +The more esoteric forms of `TableGen expressions`_ are useful in conjunction
> +with template arguments.  As an example:
> +
> +.. code-block:: llvm
> +
> +  class ModRefVal<bits<2> val> {
> +    bits<2> Value = val;
> +  }
> +
> +  def None   : ModRefVal<0>;
> +  def Mod    : ModRefVal<1>;
> +  def Ref    : ModRefVal<2>;
> +  def ModRef : ModRefVal<3>;
> +
> +  class Value<ModRefVal MR> {
> +    // Decode some information into a more convenient format, while providing
> +    // a nice interface to the user of the "Value" class.
> +    bit isMod = MR.Value{0};
> +    bit isRef = MR.Value{1};
> +
> +    // other stuff...
> +  }
> +
> +  // Example uses
> +  def bork : Value<Mod>;
> +  def zork : Value<Ref>;
> +  def hork : Value<ModRef>;
> +
> +This is obviously a contrived example, but it shows how template arguments can
> +be used to decouple the interface provided to the user of the class from the
> +actual internal data representation expected by the class.  In this case,
> +running ``llvm-tblgen`` on the example prints the following definitions:
> +
> +.. code-block:: llvm
> +
> +  def bork {      // Value
> +    bit isMod = 1;
> +    bit isRef = 0;
> +  }
> +  def hork {      // Value
> +    bit isMod = 1;
> +    bit isRef = 1;
> +  }
> +  def zork {      // Value
> +    bit isMod = 0;
> +    bit isRef = 1;
> +  }
> +
> +This shows that TableGen was able to dig into the argument and extract a piece
> +of information that was requested by the designer of the "Value" class.  For
> +more realistic examples, please see existing users of TableGen, such as the X86
> +backend.
> +
> +Multiclass definitions and instances
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +While classes with template arguments are a good way to factor commonality
> +between two instances of a definition, multiclasses allow a convenient notation
> +for defining multiple definitions at once (instances of implicitly constructed
> +classes).  For example, consider an 3-address instruction set whose instructions
> +come in two forms: "``reg = reg op reg``" and "``reg = reg op imm``"
> +(e.g. SPARC). In this case, you'd like to specify in one place that this
> +commonality exists, then in a separate place indicate what all the ops are.
> +
> +Here is an example TableGen fragment that shows this idea:
> +
> +.. code-block:: llvm
> +
> +  def ops;
> +  def GPR;
> +  def Imm;
> +  class inst<int opc, string asmstr, dag operandlist>;
> +
> +  multiclass ri_inst<int opc, string asmstr> {
> +    def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
> +                   (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
> +    def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
> +                   (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
> +  }
> +
> +  // Instantiations of the ri_inst multiclass.
> +  defm ADD : ri_inst<0b111, "add">;
> +  defm SUB : ri_inst<0b101, "sub">;
> +  defm MUL : ri_inst<0b100, "mul">;
> +  ...
> +
> +The name of the resultant definitions has the multidef fragment names appended
> +to them, so this defines ``ADD_rr``, ``ADD_ri``, ``SUB_rr``, etc.  A defm may
> +inherit from multiple multiclasses, instantiating definitions from each
> +multiclass.  Using a multiclass this way is exactly equivalent to instantiating
> +the classes multiple times yourself, e.g. by writing:
> +
> +.. code-block:: llvm
> +
> +  def ops;
> +  def GPR;
> +  def Imm;
> +  class inst<int opc, string asmstr, dag operandlist>;
> +
> +  class rrinst<int opc, string asmstr>
> +    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
> +           (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
> +
> +  class riinst<int opc, string asmstr>
> +    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
> +           (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
> +
> +  // Instantiations of the ri_inst multiclass.
> +  def ADD_rr : rrinst<0b111, "add">;
> +  def ADD_ri : riinst<0b111, "add">;
> +  def SUB_rr : rrinst<0b101, "sub">;
> +  def SUB_ri : riinst<0b101, "sub">;
> +  def MUL_rr : rrinst<0b100, "mul">;
> +  def MUL_ri : riinst<0b100, "mul">;
> +  ...
> +
> +A ``defm`` can also be used inside a multiclass providing several levels of
> +multiclass instantiations.
> +
> +.. code-block:: llvm
> +
> +  class Instruction<bits<4> opc, string Name> {
> +    bits<4> opcode = opc;
> +    string name = Name;
> +  }
> +
> +  multiclass basic_r<bits<4> opc> {
> +    def rr : Instruction<opc, "rr">;
> +    def rm : Instruction<opc, "rm">;
> +  }
> +
> +  multiclass basic_s<bits<4> opc> {
> +    defm SS : basic_r<opc>;
> +    defm SD : basic_r<opc>;
> +    def X : Instruction<opc, "x">;
> +  }
> +
> +  multiclass basic_p<bits<4> opc> {
> +    defm PS : basic_r<opc>;
> +    defm PD : basic_r<opc>;
> +    def Y : Instruction<opc, "y">;
> +  }
> +
> +  defm ADD : basic_s<0xf>, basic_p<0xf>;
> +  ...
> +
> +  // Results
> +  def ADDPDrm { ...
> +  def ADDPDrr { ...
> +  def ADDPSrm { ...
> +  def ADDPSrr { ...
> +  def ADDSDrm { ...
> +  def ADDSDrr { ...
> +  def ADDY { ...
> +  def ADDX { ...
> +
> +``defm`` declarations can inherit from classes too, the rule to follow is that
> +the class list must start after the last multiclass, and there must be at least
> +one multiclass before them.
> +
> +.. code-block:: llvm
> +
> +  class XD { bits<4> Prefix = 11; }
> +  class XS { bits<4> Prefix = 12; }
> +
> +  class I<bits<4> op> {
> +    bits<4> opcode = op;
> +  }
> +
> +  multiclass R {
> +    def rr : I<4>;
> +    def rm : I<2>;
> +  }
> +
> +  multiclass Y {
> +    defm SS : R, XD;
> +    defm SD : R, XS;
> +  }
> +
> +  defm Instr : Y;
> +
> +  // Results
> +  def InstrSDrm {
> +    bits<4> opcode = { 0, 0, 1, 0 };
> +    bits<4> Prefix = { 1, 1, 0, 0 };
> +  }
> +  ...
> +  def InstrSSrr {
> +    bits<4> opcode = { 0, 1, 0, 0 };
> +    bits<4> Prefix = { 1, 0, 1, 1 };
> +  }
> +
> +File scope entities
> +-------------------
> +
> +File inclusion
> +^^^^^^^^^^^^^^
> +
> +TableGen supports the '``include``' token, which textually substitutes the
> +specified file in place of the include directive.  The filename should be
> +specified as a double quoted string immediately after the '``include``' keyword.
> +Example:
> +
> +.. code-block:: llvm
> +
> +  include "foo.td"

Should we mention anything about how includes are searched (what are
the includes relative to)?

> +
> +'let' expressions
> +^^^^^^^^^^^^^^^^^
> +
> +"Let" expressions at file scope are similar to `"let" expressions within a
> +record`_, except they can specify a value binding for multiple records at a
> +time, and may be useful in certain other cases.  File-scope let expressions are
> +really just another way that TableGen allows the end-user to factor out
> +commonality from the records.
> +
> +File-scope "let" expressions take a comma-separated list of bindings to apply,
> +and one or more records to bind the values in.  Here are some examples:
> +
> +.. code-block:: llvm
> +
> +  let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in
> +    def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>;
> +
> +  let isCall = 1 in
> +    // All calls clobber the non-callee saved registers...
> +    let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
> +                MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
> +                XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in {
> +      def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops),
> +                             "call\t${dst:call}", []>;
> +      def CALL32r     : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
> +                          "call\t{*}$dst", [(X86call GR32:$dst)]>;
> +      def CALL32m     : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
> +                          "call\t{*}$dst", []>;
> +    }
> +
> +File-scope "let" expressions are often useful when a couple of definitions need
> +to be added to several records, and the records do not otherwise need to be
> +opened, as in the case with the ``CALL*`` instructions above.
> +
> +It's also possible to use "let" expressions inside multiclasses, providing more
> +ways to factor out commonality from the records, specially if using several
> +levels of multiclass instantiations. This also avoids the need of using "let"
> +expressions within subsequent records inside a multiclass.
> +
> +.. code-block:: llvm
> +
> +  multiclass basic_r<bits<4> opc> {
> +    let Predicates = [HasSSE2] in {
> +      def rr : Instruction<opc, "rr">;
> +      def rm : Instruction<opc, "rm">;
> +    }
> +    let Predicates = [HasSSE3] in
> +      def rx : Instruction<opc, "rx">;
> +  }
>
> -.. productionlist::
> -   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]
> +  multiclass basic_ss<bits<4> opc> {
> +    let IsDouble = 0 in
> +      defm SS : basic_r<opc>;
>
> -A list initializer. The optional :token:`Type` can be used to indicate a
> -specific element type, otherwise the element type will be deduced from the
> -given values.
> +    let IsDouble = 1 in
> +      defm SD : basic_r<opc>;
> +  }
>
> -.. The initial `DagArg` of the dag must start with an identifier or
> -   !cast, but this is more of an implementation detail and so for now just
> -   leave it out.
> +  defm ADD : basic_ss<0xf>;
> +
> +Looping
> +^^^^^^^
> +
> +TableGen supports the '``foreach``' block, which textually replicates the loop
> +body, substituting iterator values for iterator references in the body.
> +Example:
>
> -.. productionlist::
> -   SimpleValue: "(" `DagArg` `DagArgList` ")"
> -   DagArgList: `DagArg` ("," `DagArg`)*
> -   DagArg: `Value` [":" `TokVarName`] | `TokVarName`
> +.. code-block:: llvm
> +
> +  foreach i = [0, 1, 2, 3] in {
> +    def R#i : Register<...>;
> +    def F#i : Register<...>;
> +  }
>
> -The initial :token:`DagArg` is called the "operator" of the dag.
> +This will create objects ``R0``, ``R1``, ``R2`` and ``R3``.  ``foreach`` blocks
> +may be nested. If there is only one item in the body the braces may be
> +elided:
>
> -.. productionlist::
> -   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
> +.. code-block:: llvm
>
> -Bodies
> -------
> +  foreach i = [0, 1, 2, 3] in
> +    def R#i : Register<...>;
>
> -.. productionlist::
> -   ObjectBody: `BaseClassList` `Body`
> -   BaseClassList: [":" `BaseClassListNE`]
> -   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
> -   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
> -   DefmID: `TokIdentifier`
> -
> -The version with the :token:`MultiClassID` is only valid in the
> -:token:`BaseClassList` of a ``defm``.
> -The :token:`MultiClassID` should be the name of a ``multiclass``.
> -
> -.. put this somewhere else
> -
> -It is after parsing the base class list that the "let stack" is applied.
> -
> -.. productionlist::
> -   Body: ";" | "{" BodyList "}"
> -   BodyList: BodyItem*
> -   BodyItem: `Declaration` ";"
> -           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"
> -
> -The ``let`` form allows overriding the value of an inherited field.
> -
> -``def``
> --------
> -
> -.. TODO::
> -   There can be pastes in the names here, like ``#NAME#``. Look into that
> -   and document it (it boils down to ParseIDValue with IDParseMode ==
> -   ParseNameMode). ParseObjectName calls into the general ParseValue, with
> -   the only different from "arbitrary expression parsing" being IDParseMode
> -   == Mode.
> -
> -.. productionlist::
> -   Def: "def" `TokIdentifier` `ObjectBody`
> -
> -Defines a record whose name is given by the :token:`TokIdentifier`. The
> -fields of the record are inherited from the base classes and defined in the
> -body.
> -
> -Special handling occurs if this ``def`` appears inside a ``multiclass`` or
> -a ``foreach``.
> -
> -``defm``
> ---------
> -
> -.. productionlist::
> -   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"
> -
> -Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
> -precede any ``class``'s that appear.
> -
> -``foreach``
> ------------
> -
> -.. productionlist::
> -   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
> -          :| "foreach" `Declaration` "in" `Object`
> -
> -The value assigned to the variable in the declaration is iterated over and
> -the object or object list is reevaluated with the variable set at each
> -iterated value.
> -
> -Top-Level ``let``
> ------------------
> -
> -.. productionlist::
> -   Let:  "let" `LetList` "in" "{" `Object`* "}"
> -      :| "let" `LetList` "in" `Object`
> -   LetList: `LetItem` ("," `LetItem`)*
> -   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`
> +Code Generator backend info
> +===========================
>
> -This is effectively equivalent to ``let`` inside the body of a record
> -except that it applies to multiple records at a time. The bindings are
> -applied at the end of parsing the base classes of a record.
> +Expressions used by code generator to describe instructions and isel patterns:
>
> -``multiclass``
> ---------------
> +``(implicit a)``
> +    an implicitly defined physical register.  This tells the dag instruction
> +    selection emitter the input pattern's extra definitions matches implicit
> +    physical register definitions.
>
> -.. productionlist::
> -   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
> -             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
> -   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
> -   MultiClassID: `TokIdentifier`
> -   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
> Index: docs/TableGen/index.rst
> ===================================================================
> --- /dev/null
> +++ docs/TableGen/index.rst
> @@ -0,0 +1,306 @@
> +========
> +TableGen
> +========
> +
> +.. contents::
> +   :local:
> +
> +.. toctree::
> +   :hidden:
> +
> +   BackEnds
> +   LangRef
> +   Deficiencies
> +
> +Introduction
> +============
> +
> +TableGen's purpose is to help a human develop and maintain records of
> +domain-specific information.  Because there may be a large number of these
> +records, it is specifically designed to allow writing flexible descriptions and
> +for common features of these records to be factored out.  This reduces the
> +amount of duplication in the description, reduces the chance of error, and makes
> +it easier to structure domain specific information.
> +
> +The core part of TableGen parses a file, instantiates the declarations, and
> +hands the result off to a domain-specific `backends`_ for processing.
> +
> +The current major users of TableGen are :doc:`../CodeGenerator`
> +and the
> +`Clang diagnostics and attributes <http://clang.llvm.org/docs/UsersManual.html#controlling-errors-and-warnings>`_.
> +
> +Note that if you work on TableGen much, and use emacs or vim, that you can find
> +an emacs "TableGen mode" and a vim language file in the ``llvm/utils/emacs`` and
> +``llvm/utils/vim`` directories of your LLVM distribution, respectively.
> +
> +.. _intro:
> +
> +
> +The TableGen program
> +====================
> +
> +TableGen files are interpreted by the TableGen program: `llvm-tblgen` available
> +on your build directory under `bin`. It is not installed in the system (or where
> +your sysroot is set to), since it has no use beyond LLVM's build process.
> +
> +Running TableGen
> +----------------
> +
> +TableGen runs just like any other LLVM tool.  The first (optional) argument
> +specifies the file to read.  If a filename is not specified, ``llvm-tblgen``
> +reads from standard input.
> +
> +To be useful, one of the `backends`_ must be used.  These backends are
> +selectable on the command line (type '``llvm-tblgen -help``' for a list).  For
> +example, to get a list of all of the definitions that subclass a particular type
> +(which can be useful for building up an enum list of these records), use the
> +``-print-enums`` option:

Since there's so much commonality here, would it make sense to ditch
foo-tblgen as an identifier, and simply call out that there's
llvm-tblgen and clang-tblgen (perhaps explaining why they're different
and when you might want to add a new tablegen driver)?

> +
> +.. code-block:: bash
> +
> +  $ llvm-tblgen X86.td -print-enums -class=Register
> +  AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX,
> +  ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP,
> +  MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D,
> +  R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15,
> +  R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI,
> +  RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7,
> +  XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5,
> +  XMM6, XMM7, XMM8, XMM9,
> +
> +  $ llvm-tblgen X86.td -print-enums -class=Instruction
> +  ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri,
> +  ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8,
> +  ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm,
> +  ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr,
> +  ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ...
> +
> +The default backend prints out all of the records.
> +
> +If you plan to use TableGen, you will most likely have to write a `backend`_
> +that extracts the information specific to what you need and formats it in the
> +appropriate way.
> +
> +Example
> +-------
> +
> +With no other arguments, `llvm-tblgen` parses the specified file and prints out all
> +of the classes, then all of the definitions.  This is a good way to see what the
> +various definitions expand to fully.  Running this on the ``X86.td`` file prints
> +this (at the time of this writing):
> +
> +.. code-block:: llvm
> +
> +  ...
> +  def ADD32rr {   // Instruction X86Inst I
> +    string Namespace = "X86";
> +    dag OutOperandList = (outs GR32:$dst);
> +    dag InOperandList = (ins GR32:$src1, GR32:$src2);
> +    string AsmString = "add{l}\t{$src2, $dst|$dst, $src2}";
> +    list<dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))];
> +    list<Register> Uses = [];
> +    list<Register> Defs = [EFLAGS];
> +    list<Predicate> Predicates = [];
> +    int CodeSize = 3;
> +    int AddedComplexity = 0;
> +    bit isReturn = 0;
> +    bit isBranch = 0;
> +    bit isIndirectBranch = 0;
> +    bit isBarrier = 0;
> +    bit isCall = 0;
> +    bit canFoldAsLoad = 0;
> +    bit mayLoad = 0;
> +    bit mayStore = 0;
> +    bit isImplicitDef = 0;
> +    bit isConvertibleToThreeAddress = 1;
> +    bit isCommutable = 1;
> +    bit isTerminator = 0;
> +    bit isReMaterializable = 0;
> +    bit isPredicable = 0;
> +    bit hasDelaySlot = 0;
> +    bit usesCustomInserter = 0;
> +    bit hasCtrlDep = 0;
> +    bit isNotDuplicable = 0;
> +    bit hasSideEffects = 0;
> +    bit neverHasSideEffects = 0;
> +    InstrItinClass Itinerary = NoItinerary;
> +    string Constraints = "";
> +    string DisableEncoding = "";
> +    bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 };
> +    Format Form = MRMDestReg;
> +    bits<6> FormBits = { 0, 0, 0, 0, 1, 1 };
> +    ImmType ImmT = NoImm;
> +    bits<3> ImmTypeBits = { 0, 0, 0 };
> +    bit hasOpSizePrefix = 0;
> +    bit hasAdSizePrefix = 0;
> +    bits<4> Prefix = { 0, 0, 0, 0 };
> +    bit hasREX_WPrefix = 0;
> +    FPFormat FPForm = ?;
> +    bits<3> FPFormBits = { 0, 0, 0 };
> +  }
> +  ...
> +
> +This definition corresponds to the 32-bit register-register ``add`` instruction
> +of the x86 architecture.  ``def ADD32rr`` defines a record named
> +``ADD32rr``, and the comment at the end of the line indicates the superclasses
> +of the definition.  The body of the record contains all of the data that
> +TableGen assembled for the record, indicating that the instruction is part of
> +the "X86" namespace, the pattern indicating how the instruction should be
> +emitted into the assembly file, that it is a two-address instruction, has a
> +particular encoding, etc.  The contents and semantics of the information in the
> +record are specific to the needs of the X86 backend, and are only shown as an
> +example.
> +
> +As you can see, a lot of information is needed for every instruction supported
> +by the code generator, and specifying it all manually would be unmaintainable,
> +prone to bugs, and tiring to do in the first place.  Because we are using
> +TableGen, all of the information was derived from the following definition:
> +
> +.. code-block:: llvm
> +
> +  let Defs = [EFLAGS],
> +      isCommutable = 1,                  // X = ADD Y,Z --> X = ADD Z,Y
> +      isConvertibleToThreeAddress = 1 in // Can transform into LEA.
> +  def ADD32rr  : I<0x01, MRMDestReg, (outs GR32:$dst),
> +                                     (ins GR32:$src1, GR32:$src2),
> +                   "add{l}\t{$src2, $dst|$dst, $src2}",
> +                   [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
> +
> +This definition makes use of the custom class ``I`` (extended from the custom
> +class ``X86Inst``), which is defined in the X86-specific TableGen file, to
> +factor out the common features that instructions of its class share.  A key
> +feature of TableGen is that it allows the end-user to define the abstractions
> +they prefer to use when describing their information.
> +
> +Each ``def`` record has a special entry called "NAME".  This is the name of the
> +record ("``ADD32rr``" above).  In the general case ``def`` names can be formed
> +from various kinds of string processing expressions and ``NAME`` resolves to the
> +final value obtained after resolving all of those expressions.  The user may
> +refer to ``NAME`` anywhere she desires to use the ultimate name of the ``def``.
> +``NAME`` should not be defined anywhere else in user code to avoid conflicts.
> +
> +Syntax
> +======
> +
> +TableGen has a syntax that is losely based on C++ templates, with built-in
> +types and specification. In addition, TableGen's syntax introduces some
> +automation concepts like multiclass, foreach, let, etc.
> +
> +Basic concepts
> +--------------
> +
> +TableGen files consist of two key parts: 'classes' and 'definitions', both of
> +which are considered 'records'.
> +
> +**TableGen records** have a unique name, a list of values, and a list of
> +superclasses.  The list of values is the main data that TableGen builds for each
> +record; it is this that holds the domain specific information for the
> +application.  The interpretation of this data is left to a specific `backends`_,
> +but the structure and format rules are taken care of and are fixed by
> +TableGen.
> +
> +**TableGen definitions** are the concrete form of 'records'.  These generally do
> +not have any undefined values, and are marked with the '``def``' keyword.
> +
> +.. code-block:: llvm
> +
> +  def FeatureFPARMv8 : SubtargetFeature<"fp-armv8", "HasFPARMv8", "true",
> +                                        "Enable ARMv8 FP">;
> +
> +In this example, FeatureFPARMv8 is ``SubtargetFeature`` record initialised
> +with some values. The names of the classes are defined via the
> +keyword `class` either on the same file or some other included. Most target
> +TableGen files include the generic ones in ``include/llvm/Target``.
> +
> +**TableGen classes** are abstract records that are used to build and describe
> +other records.  These classes allow the end-user to build abstractions for
> +either the domain they are targeting (such as "Register", "RegisterClass", and
> +"Instruction" in the LLVM code generator) or for the implementor to help factor
> +out common properties of records (such as "FPInst", which is used to represent
> +floating point instructions in the X86 backend).  TableGen keeps track of all of
> +the classes that are used to build up a definition, so the backend can find all
> +definitions of a particular class, such as "Instruction".
> +
> +.. code-block:: llvm
> +
> + class ProcNoItin<string Name, list<SubtargetFeature> Features>
> +       : Processor<Name, NoItineraries, Features>;
> +
> +Here, the class ProcNoItin, receiving parameters `Name` of type `string` and
> +a list of target features is specializing the class Processor by passing the
> +arguments down as well as hard-coding NoItineraries.
> +
> +**TableGen multiclasses** are groups of abstract records that are instantiated
> +all at once.  Each instantiation can result in multiple TableGen definitions.
> +If a multiclass inherits from another multiclass, the definitions in the
> +sub-multiclass become part of the current multiclass, as if they were declared
> +in the current multiclass.
> +
> +.. code-block:: llvm
> +
> +  multiclass ro_signed_pats<string T, string Rm, dag Base, dag Offset, dag Extend,
> +                          dag address, ValueType sty> {
> +  def : Pat<(i32 (!cast<SDNode>("sextload" # sty) address)),
> +            (!cast<Instruction>("LDRS" # T # "w_" # Rm # "_RegOffset")
> +              Base, Offset, Extend)>;
> +
> +  def : Pat<(i64 (!cast<SDNode>("sextload" # sty) address)),
> +            (!cast<Instruction>("LDRS" # T # "x_" # Rm # "_RegOffset")
> +              Base, Offset, Extend)>;
> +  }
> +
> +  defm : ro_signed_pats<"B", Rm, Base, Offset, Extend,
> +                        !foreach(decls.pattern, address,
> +                                 !subst(SHIFT, imm_eq0, decls.pattern)),
> +                        i8>;
> +
> +
> +
> +See the `TableGen Language Reference <LangRef.html>`_ for more information.
> +
> +.. _backend:
> +.. _backends:
> +
> +TableGen backends
> +=================
> +
> +TableGen files have no real meaning without a back-end. The default operation
> +of running ``llvm-tblgen`` is to print the information in a textual format, but
> +that's only useful for debugging of the TableGen files themselves. The power
> +in TableGen is, however, to interpret the source files into an internal
> +representation that can be generated into anything you want.
> +
> +Current usage of TableGen is to create include huge files with tables that you
> +can either include directly (if the output is in the language you're coding),
> +or be used in pre-processing via macros surrounding the include of the file.
> +
> +Direct output can be used if the back-end already prints a table in C format
> +or if the output is just a list of strings (for error and warning messages).
> +Pre-processed output should be used if the same information needs to be used
> +in different contexts (like Instruction names), so your back-end should print
> +a meta-information list that can be shaped into different compile-time formats.
> +
> +See the `TableGen BackEnds <BackEnds.html>`_ for more information.
> +
> +TableGen Deficiencies
> +=====================
> +
> +Despite being very generic, TableGen has some deficiencies that have been
> +pointed out numerous times. The common theme is that, while TableGen allows
> +you to build Domain-Specific-Languages, the final languages that you create
> +lack the power of other DSLs, which in turn increase considerably the size
> +and complecity of TableGen files.
> +
> +At the same time, TableGen allows you to create virtually any meaning of
> +the basic concepts via custom-made back-ends, which can pervert the original
> +design and make it very hard for newcomers to understand the evil TableGen
> +file.
> +
> +There are some in favour of extending the semantics even more, but makeing sure
> +back-ends adhere to strict rules. Others suggesting we should move to less,
> +more powerful DSLs designed with specific purposes, or even re-using existing
> +DSLs.
> +
> +Either way, this is a discussion that is likely spanning across several years,
> +if not decades. You can read more in the `TableGen Deficiencies <Deficiencies.html>`_
> +document.
> Index: docs/index.rst
> ===================================================================
> --- docs/index.rst
> +++ docs/index.rst
> @@ -222,6 +222,7 @@
>     LinkTimeOptimization
>     SegmentedStacks
>     TableGenFundamentals
> +   TableGen/index
>     DebuggingJITedCode
>     GoldPlugin
>     MarkedUpDisassembly
> @@ -231,7 +232,6 @@
>     WritingAnLLVMBackend
>     GarbageCollection
>     WritingAnLLVMPass
> -   TableGen/LangRef
>     HowToUseAttributes
>     NVPTXUsage
>     StackMaps
>

~Aaron

On Wed, Mar 19, 2014 at 12:23 PM, Renato Golin <renato.golin at linaro.org> wrote:
>   Removing TableGen from root's toctree, since they're in the TableGen/index toctree.
>
> Hi silvas, aaron.ballman,
>
> http://llvm-reviews.chandlerc.com/D3120
>
> CHANGE SINCE LAST DIFF
>   http://llvm-reviews.chandlerc.com/D3120?vs=7952&id=7953#toc
>
> Files:
>   docs/TableGen/BackEnds.rst
>   docs/TableGen/Deficiencies.rst
>   docs/TableGen/LangRef.rst
>   docs/TableGen/index.rst
>   docs/index.rst



More information about the llvm-commits mailing list