[www-releases] r372328 - Check in 9.0.0 source and docs
Hans Wennborg via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 19 07:32:55 PDT 2019
Added: www-releases/trunk/9.0.0/docs/_sources/TransformMetadata.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TransformMetadata.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TransformMetadata.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TransformMetadata.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,441 @@
+.. _transformation-metadata:
+
+============================
+Code Transformation Metadata
+============================
+
+.. contents::
+ :local:
+
+Overview
+========
+
+LLVM transformation passes can be controlled by attaching metadata to
+the code to transform. By default, transformation passes use heuristics
+to determine whether or not to perform transformations, and when doing
+so, other details of how the transformations are applied (e.g., which
+vectorization factor to select).
+Unless the optimizer is otherwise directed, transformations are applied
+conservatively. This conservatism generally allows the optimizer to
+avoid unprofitable transformations, but in practice, this results in the
+optimizer not applying transformations that would be highly profitable.
+
+Frontends can give additional hints to LLVM passes on which
+transformations they should apply. This can be additional knowledge that
+cannot be derived from the emitted IR, or directives passed from the
+user/programmer. OpenMP pragmas are an example of the latter.
+
+If any such metadata is dropped from the program, the code's semantics
+must not change.
+
+Metadata on Loops
+=================
+
+Attributes can be attached to loops as described in :ref:`llvm.loop`.
+Attributes can describe properties of the loop, disable transformations,
+force specific transformations and set transformation options.
+
+Because metadata nodes are immutable (with the exception of
+``MDNode::replaceOperandWith`` which is dangerous to use on uniqued
+metadata), in order to add or remove a loop attributes, a new ``MDNode``
+must be created and assigned as the new ``llvm.loop`` metadata. Any
+connection between the old ``MDNode`` and the loop is lost. The
+``llvm.loop`` node is also used as LoopID (``Loop::getLoopID()``), i.e.
+the loop effectively gets a new identifier. For instance,
+``llvm.mem.parallel_loop_access`` references the LoopID. Therefore, if
+the parallel access property is to be preserved after adding/removing
+loop attributes, any ``llvm.mem.parallel_loop_access`` reference must be
+updated to the new LoopID.
+
+Transformation Metadata Structure
+=================================
+
+Some attributes describe code transformations (unrolling, vectorizing,
+loop distribution, etc.). They can either be a hint to the optimizer
+that a transformation might be beneficial, instruction to use a specific
+option, , or convey a specific request from the user (such as
+``#pragma clang loop`` or ``#pragma omp simd``).
+
+If a transformation is forced but cannot be carried-out for any reason,
+an optimization-missed warning must be emitted. Semantic information
+such as a transformation being safe (e.g.
+``llvm.mem.parallel_loop_access``) can be unused by the optimizer
+without generating a warning.
+
+Unless explicitly disabled, any optimization pass may heuristically
+determine whether a transformation is beneficial and apply it. If
+metadata for another transformation was specified, applying a different
+transformation before it might be inadvertent due to being applied on a
+different loop or the loop not existing anymore. To avoid having to
+explicitly disable an unknown number of passes, the attribute
+``llvm.loop.disable_nonforced`` disables all optional, high-level,
+restructuring transformations.
+
+The following example avoids the loop being altered before being
+vectorized, for instance being unrolled.
+
+.. code-block:: llvm
+
+ br i1 %exitcond, label %for.exit, label %for.header, !llvm.loop !0
+ ...
+ !0 = distinct !{!0, !1, !2}
+ !1 = !{!"llvm.loop.vectorize.enable", i1 true}
+ !2 = !{!"llvm.loop.disable_nonforced"}
+
+After a transformation is applied, follow-up attributes are set on the
+transformed and/or new loop(s). This allows additional attributes
+including followup-transformations to be specified. Specifying multiple
+transformations in the same metadata node is possible for compatibility
+reasons, but their execution order is undefined. For instance, when
+``llvm.loop.vectorize.enable`` and ``llvm.loop.unroll.enable`` are
+specified at the same time, unrolling may occur either before or after
+vectorization.
+
+As an example, the following instructs a loop to be vectorized and only
+then unrolled.
+
+.. code-block:: llvm
+
+ !0 = distinct !{!0, !1, !2, !3}
+ !1 = !{!"llvm.loop.vectorize.enable", i1 true}
+ !2 = !{!"llvm.loop.disable_nonforced"}
+ !3 = !{!"llvm.loop.vectorize.followup_vectorized", !{"llvm.loop.unroll.enable"}}
+
+If, and only if, no followup is specified, the pass may add attributes itself.
+For instance, the vectorizer adds a ``llvm.loop.isvectorized`` attribute and
+all attributes from the original loop excluding its loop vectorizer
+attributes. To avoid this, an empty followup attribute can be used, e.g.
+
+.. code-block:: llvm
+
+ !3 = !{!"llvm.loop.vectorize.followup_vectorized"}
+
+The followup attributes of a transformation that cannot be applied will
+never be added to a loop and are therefore effectively ignored. This means
+that any followup-transformation in such attributes requires that its
+prior transformations are applied before the followup-transformation.
+The user should receive a warning about the first transformation in the
+transformation chain that could not be applied if it a forced
+transformation. All following transformations are skipped.
+
+Pass-Specific Transformation Metadata
+=====================================
+
+Transformation options are specific to each transformation. In the
+following, we present the model for each LLVM loop optimization pass and
+the metadata to influence them.
+
+Loop Vectorization and Interleaving
+-----------------------------------
+
+Loop vectorization and interleaving is interpreted as a single
+transformation. It is interpreted as forced if
+``!{"llvm.loop.vectorize.enable", i1 true}`` is set.
+
+Assuming the pre-vectorization loop is
+
+.. code-block:: c
+
+ for (int i = 0; i < n; i+=1) // original loop
+ Stmt(i);
+
+then the code after vectorization will be approximately (assuming an
+SIMD width of 4):
+
+.. code-block:: c
+
+ int i = 0;
+ if (rtc) {
+ for (; i + 3 < n; i+=4) // vectorized/interleaved loop
+ Stmt(i:i+3);
+ }
+ for (; i < n; i+=1) // epilogue loop
+ Stmt(i);
+
+where ``rtc`` is a generated runtime check.
+
+``llvm.loop.vectorize.followup_vectorized`` will set the attributes for
+the vectorized loop. If not specified, ``llvm.loop.isvectorized`` is
+combined with the original loop's attributes to avoid it being
+vectorized multiple times.
+
+``llvm.loop.vectorize.followup_epilogue`` will set the attributes for
+the remainder loop. If not specified, it will have the original loop's
+attributes combined with ``llvm.loop.isvectorized`` and
+``llvm.loop.unroll.runtime.disable`` (unless the original loop already
+has unroll metadata).
+
+The attributes specified by ``llvm.loop.vectorize.followup_all`` are
+added to both loops.
+
+When using a follow-up attribute, it replaces any automatically deduced
+attributes for the generated loop in question. Therefore it is
+recommended to add ``llvm.loop.isvectorized`` to
+``llvm.loop.vectorize.followup_all`` which avoids that the loop
+vectorizer tries to optimize the loops again.
+
+Loop Unrolling
+--------------
+
+Unrolling is interpreted as forced any ``!{!"llvm.loop.unroll.enable"}``
+metadata or option (``llvm.loop.unroll.count``, ``llvm.loop.unroll.full``)
+is present. Unrolling can be full unrolling, partial unrolling of a loop
+with constant trip count or runtime unrolling of a loop with a trip
+count unknown at compile-time.
+
+If the loop has been unrolled fully, there is no followup-loop. For
+partial/runtime unrolling, the original loop of
+
+.. code-block:: c
+
+ for (int i = 0; i < n; i+=1) // original loop
+ Stmt(i);
+
+is transformed into (using an unroll factor of 4):
+
+.. code-block:: c
+
+ int i = 0;
+ for (; i + 3 < n; i+=4) // unrolled loop
+ Stmt(i);
+ Stmt(i+1);
+ Stmt(i+2);
+ Stmt(i+3);
+ }
+ for (; i < n; i+=1) // remainder loop
+ Stmt(i);
+
+``llvm.loop.unroll.followup_unrolled`` will set the loop attributes of
+the unrolled loop. If not specified, the attributes of the original loop
+without the ``llvm.loop.unroll.*`` attributes are copied and
+``llvm.loop.unroll.disable`` added to it.
+
+``llvm.loop.unroll.followup_remainder`` defines the attributes of the
+remainder loop. If not specified the remainder loop will have no
+attributes. The remainder loop might not be present due to being fully
+unrolled in which case this attribute has no effect.
+
+Attributes defined in ``llvm.loop.unroll.followup_all`` are added to the
+unrolled and remainder loops.
+
+To avoid that the partially unrolled loop is unrolled again, it is
+recommended to add ``llvm.loop.unroll.disable`` to
+``llvm.loop.unroll.followup_all``. If no follow-up attribute specified
+for a generated loop, it is added automatically.
+
+Unroll-And-Jam
+--------------
+
+Unroll-and-jam uses the following transformation model (here with an
+unroll factor if 2). Currently, it does not support a fallback version
+when the transformation is unsafe.
+
+.. code-block:: c
+
+ for (int i = 0; i < n; i+=1) { // original outer loop
+ Fore(i);
+ for (int j = 0; j < m; j+=1) // original inner loop
+ SubLoop(i, j);
+ Aft(i);
+ }
+
+.. code-block:: c
+
+ int i = 0;
+ for (; i + 1 < n; i+=2) { // unrolled outer loop
+ Fore(i);
+ Fore(i+1);
+ for (int j = 0; j < m; j+=1) { // unrolled inner loop
+ SubLoop(i, j);
+ SubLoop(i+1, j);
+ }
+ Aft(i);
+ Aft(i+1);
+ }
+ for (; i < n; i+=1) { // remainder outer loop
+ Fore(i);
+ for (int j = 0; j < m; j+=1) // remainder inner loop
+ SubLoop(i, j);
+ Aft(i);
+ }
+
+``llvm.loop.unroll_and_jam.followup_outer`` will set the loop attributes
+of the unrolled outer loop. If not specified, the attributes of the
+original outer loop without the ``llvm.loop.unroll.*`` attributes are
+copied and ``llvm.loop.unroll.disable`` added to it.
+
+``llvm.loop.unroll_and_jam.followup_inner`` will set the loop attributes
+of the unrolled inner loop. If not specified, the attributes of the
+original inner loop are used unchanged.
+
+``llvm.loop.unroll_and_jam.followup_remainder_outer`` sets the loop
+attributes of the outer remainder loop. If not specified it will not
+have any attributes. The remainder loop might not be present due to
+being fully unrolled.
+
+``llvm.loop.unroll_and_jam.followup_remainder_inner`` sets the loop
+attributes of the inner remainder loop. If not specified it will have
+the attributes of the original inner loop. It the outer remainder loop
+is unrolled, the inner remainder loop might be present multiple times.
+
+Attributes defined in ``llvm.loop.unroll_and_jam.followup_all`` are
+added to all of the aforementioned output loops.
+
+To avoid that the unrolled loop is unrolled again, it is
+recommended to add ``llvm.loop.unroll.disable`` to
+``llvm.loop.unroll_and_jam.followup_all``. It suppresses unroll-and-jam
+as well as an additional inner loop unrolling. If no follow-up
+attribute specified for a generated loop, it is added automatically.
+
+Loop Distribution
+-----------------
+
+The LoopDistribution pass tries to separate vectorizable parts of a loop
+from the non-vectorizable part (which otherwise would make the entire
+loop non-vectorizable). Conceptually, it transforms a loop such as
+
+.. code-block:: c
+
+ for (int i = 1; i < n; i+=1) { // original loop
+ A[i] = i;
+ B[i] = 2 + B[i];
+ C[i] = 3 + C[i - 1];
+ }
+
+into the following code:
+
+.. code-block:: c
+
+ if (rtc) {
+ for (int i = 1; i < n; i+=1) // coincident loop
+ A[i] = i;
+ for (int i = 1; i < n; i+=1) // coincident loop
+ B[i] = 2 + B[i];
+ for (int i = 1; i < n; i+=1) // sequential loop
+ C[i] = 3 + C[i - 1];
+ } else {
+ for (int i = 1; i < n; i+=1) { // fallback loop
+ A[i] = i;
+ B[i] = 2 + B[i];
+ C[i] = 3 + C[i - 1];
+ }
+ }
+
+where ``rtc`` is a generated runtime check.
+
+``llvm.loop.distribute.followup_coincident`` sets the loop attributes of
+all loops without loop-carried dependencies (i.e. vectorizable loops).
+There might be more than one such loops. If not defined, the loops will
+inherit the original loop's attributes.
+
+``llvm.loop.distribute.followup_sequential`` sets the loop attributes of the
+loop with potentially unsafe dependencies. There should be at most one
+such loop. If not defined, the loop will inherit the original loop's
+attributes.
+
+``llvm.loop.distribute.followup_fallback`` defines the loop attributes
+for the fallback loop, which is a copy of the original loop for when
+loop versioning is required. If undefined, the fallback loop inherits
+all attributes from the original loop.
+
+Attributes defined in ``llvm.loop.distribute.followup_all`` are added to
+all of the aforementioned output loops.
+
+It is recommended to add ``llvm.loop.disable_nonforced`` to
+``llvm.loop.distribute.followup_fallback``. This avoids that the
+fallback version (which is likely never executed) is further optimzed
+which would increase the code size.
+
+Versioning LICM
+---------------
+
+The pass hoists code out of loops that are only loop-invariant when
+dynamic conditions apply. For instance, it transforms the loop
+
+.. code-block:: c
+
+ for (int i = 0; i < n; i+=1) // original loop
+ A[i] = B[0];
+
+into:
+
+.. code-block:: c
+
+ if (rtc) {
+ auto b = B[0];
+ for (int i = 0; i < n; i+=1) // versioned loop
+ A[i] = b;
+ } else {
+ for (int i = 0; i < n; i+=1) // unversioned loop
+ A[i] = B[0];
+ }
+
+The runtime condition (``rtc``) checks that the array ``A`` and the
+element `B[0]` do not alias.
+
+Currently, this transformation does not support followup-attributes.
+
+Loop Interchange
+----------------
+
+Currently, the ``LoopInterchange`` pass does not use any metadata.
+
+Ambiguous Transformation Order
+==============================
+
+If there multiple transformations defined, the order in which they are
+executed depends on the order in LLVM's pass pipeline, which is subject
+to change. The default optimization pipeline (anything higher than
+``-O0``) has the following order.
+
+When using the legacy pass manager:
+
+ - LoopInterchange (if enabled)
+ - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
+ - VersioningLICM (if enabled)
+ - LoopDistribute
+ - LoopVectorizer
+ - LoopUnrollAndJam (if enabled)
+ - LoopUnroll (partial and runtime unrolling)
+
+When using the legacy pass manager with LTO:
+
+ - LoopInterchange (if enabled)
+ - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
+ - LoopVectorizer
+ - LoopUnroll (partial and runtime unrolling)
+
+When using the new pass manager:
+
+ - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
+ - LoopDistribute
+ - LoopVectorizer
+ - LoopUnrollAndJam (if enabled)
+ - LoopUnroll (partial and runtime unrolling)
+
+Leftover Transformations
+========================
+
+Forced transformations that have not been applied after the last
+transformation pass should be reported to the user. The transformation
+passes themselves cannot be responsible for this reporting because they
+might not be in the pipeline, there might be multiple passes able to
+apply a transformation (e.g. ``LoopInterchange`` and Polly) or a
+transformation attribute may be 'hidden' inside another passes' followup
+attribute.
+
+The pass ``-transform-warning`` (``WarnMissedTransformationsPass``)
+emits such warnings. It should be placed after the last transformation
+pass.
+
+The current pass pipeline has a fixed order in which transformations
+passes are executed. A transformation can be in the followup of a pass
+that is executed later and thus leftover. For instance, a loop nest
+cannot be distributed and then interchanged with the current pass
+pipeline. The loop distribution will execute, but there is no loop
+interchange pass following such that any loop interchange metadata will
+be ignored. The ``-transform-warning`` should emit a warning in this
+case.
+
+Future versions of LLVM may fix this by executing transformations using
+a dynamic ordering.
Added: www-releases/trunk/9.0.0/docs/_sources/TypeMetadata.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/TypeMetadata.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/TypeMetadata.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/TypeMetadata.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,226 @@
+=============
+Type Metadata
+=============
+
+Type metadata is a mechanism that allows IR modules to co-operatively build
+pointer sets corresponding to addresses within a given set of globals. LLVM's
+`control flow integrity`_ implementation uses this metadata to efficiently
+check (at each call site) that a given address corresponds to either a
+valid vtable or function pointer for a given class or function type, and its
+whole-program devirtualization pass uses the metadata to identify potential
+callees for a given virtual call.
+
+To use the mechanism, a client creates metadata nodes with two elements:
+
+1. a byte offset into the global (generally zero for functions)
+2. a metadata object representing an identifier for the type
+
+These metadata nodes are associated with globals by using global object
+metadata attachments with the ``!type`` metadata kind.
+
+Each type identifier must exclusively identify either global variables
+or functions.
+
+.. admonition:: Limitation
+
+ The current implementation only supports attaching metadata to functions on
+ the x86-32 and x86-64 architectures.
+
+An intrinsic, :ref:`llvm.type.test <type.test>`, is used to test whether a
+given pointer is associated with a type identifier.
+
+.. _control flow integrity: http://clang.llvm.org/docs/ControlFlowIntegrity.html
+
+Representing Type Information using Type Metadata
+=================================================
+
+This section describes how Clang represents C++ type information associated with
+virtual tables using type metadata.
+
+Consider the following inheritance hierarchy:
+
+.. code-block:: c++
+
+ struct A {
+ virtual void f();
+ };
+
+ struct B : A {
+ virtual void f();
+ virtual void g();
+ };
+
+ struct C {
+ virtual void h();
+ };
+
+ struct D : A, C {
+ virtual void f();
+ virtual void h();
+ };
+
+The virtual table objects for A, B, C and D look like this (under the Itanium ABI):
+
+.. csv-table:: Virtual Table Layout for A, B, C, D
+ :header: Class, 0, 1, 2, 3, 4, 5, 6
+
+ A, A::offset-to-top, &A::rtti, &A::f
+ B, B::offset-to-top, &B::rtti, &B::f, &B::g
+ C, C::offset-to-top, &C::rtti, &C::h
+ D, D::offset-to-top, &D::rtti, &D::f, &D::h, D::offset-to-top, &D::rtti, thunk for &D::h
+
+When an object of type A is constructed, the address of ``&A::f`` in A's
+virtual table object is stored in the object's vtable pointer. In ABI parlance
+this address is known as an `address point`_. Similarly, when an object of type
+B is constructed, the address of ``&B::f`` is stored in the vtable pointer. In
+this way, the vtable in B's virtual table object is compatible with A's vtable.
+
+D is a little more complicated, due to the use of multiple inheritance. Its
+virtual table object contains two vtables, one compatible with A's vtable and
+the other compatible with C's vtable. Objects of type D contain two virtual
+pointers, one belonging to the A subobject and containing the address of
+the vtable compatible with A's vtable, and the other belonging to the C
+subobject and containing the address of the vtable compatible with C's vtable.
+
+The full set of compatibility information for the above class hierarchy is
+shown below. The following table shows the name of a class, the offset of an
+address point within that class's vtable and the name of one of the classes
+with which that address point is compatible.
+
+.. csv-table:: Type Offsets for A, B, C, D
+ :header: VTable for, Offset, Compatible Class
+
+ A, 16, A
+ B, 16, A
+ , , B
+ C, 16, C
+ D, 16, A
+ , , D
+ , 48, C
+
+The next step is to encode this compatibility information into the IR. The way
+this is done is to create type metadata named after each of the compatible
+classes, with which we associate each of the compatible address points in
+each vtable. For example, these type metadata entries encode the compatibility
+information for the above hierarchy:
+
+::
+
+ @_ZTV1A = constant [...], !type !0
+ @_ZTV1B = constant [...], !type !0, !type !1
+ @_ZTV1C = constant [...], !type !2
+ @_ZTV1D = constant [...], !type !0, !type !3, !type !4
+
+ !0 = !{i64 16, !"_ZTS1A"}
+ !1 = !{i64 16, !"_ZTS1B"}
+ !2 = !{i64 16, !"_ZTS1C"}
+ !3 = !{i64 16, !"_ZTS1D"}
+ !4 = !{i64 48, !"_ZTS1C"}
+
+With this type metadata, we can now use the ``llvm.type.test`` intrinsic to
+test whether a given pointer is compatible with a type identifier. Working
+backwards, if ``llvm.type.test`` returns true for a particular pointer,
+we can also statically determine the identities of the virtual functions
+that a particular virtual call may call. For example, if a program assumes
+a pointer to be a member of ``!"_ZST1A"``, we know that the address can
+be only be one of ``_ZTV1A+16``, ``_ZTV1B+16`` or ``_ZTV1D+16`` (i.e. the
+address points of the vtables of A, B and D respectively). If we then load
+an address from that pointer, we know that the address can only be one of
+``&A::f``, ``&B::f`` or ``&D::f``.
+
+.. _address point: https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vtable-general
+
+Testing Addresses For Type Membership
+=====================================
+
+If a program tests an address using ``llvm.type.test``, this will cause
+a link-time optimization pass, ``LowerTypeTests``, to replace calls to this
+intrinsic with efficient code to perform type member tests. At a high level,
+the pass will lay out referenced globals in a consecutive memory region in
+the object file, construct bit vectors that map onto that memory region,
+and generate code at each of the ``llvm.type.test`` call sites to test
+pointers against those bit vectors. Because of the layout manipulation, the
+globals' definitions must be available at LTO time. For more information,
+see the `control flow integrity design document`_.
+
+A type identifier that identifies functions is transformed into a jump table,
+which is a block of code consisting of one branch instruction for each
+of the functions associated with the type identifier that branches to the
+target function. The pass will redirect any taken function addresses to the
+corresponding jump table entry. In the object file's symbol table, the jump
+table entries take the identities of the original functions, so that addresses
+taken outside the module will pass any verification done inside the module.
+
+Jump tables may call external functions, so their definitions need not
+be available at LTO time. Note that if an externally defined function is
+associated with a type identifier, there is no guarantee that its identity
+within the module will be the same as its identity outside of the module,
+as the former will be the jump table entry if a jump table is necessary.
+
+The `GlobalLayoutBuilder`_ class is responsible for laying out the globals
+efficiently to minimize the sizes of the underlying bitsets.
+
+.. _control flow integrity design document: http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html
+
+:Example:
+
+::
+
+ target datalayout = "e-p:32:32"
+
+ @a = internal global i32 0, !type !0
+ @b = internal global i32 0, !type !0, !type !1
+ @c = internal global i32 0, !type !1
+ @d = internal global [2 x i32] [i32 0, i32 0], !type !2
+
+ define void @e() !type !3 {
+ ret void
+ }
+
+ define void @f() {
+ ret void
+ }
+
+ declare void @g() !type !3
+
+ !0 = !{i32 0, !"typeid1"}
+ !1 = !{i32 0, !"typeid2"}
+ !2 = !{i32 4, !"typeid2"}
+ !3 = !{i32 0, !"typeid3"}
+
+ declare i1 @llvm.type.test(i8* %ptr, metadata %typeid) nounwind readnone
+
+ define i1 @foo(i32* %p) {
+ %pi8 = bitcast i32* %p to i8*
+ %x = call i1 @llvm.type.test(i8* %pi8, metadata !"typeid1")
+ ret i1 %x
+ }
+
+ define i1 @bar(i32* %p) {
+ %pi8 = bitcast i32* %p to i8*
+ %x = call i1 @llvm.type.test(i8* %pi8, metadata !"typeid2")
+ ret i1 %x
+ }
+
+ define i1 @baz(void ()* %p) {
+ %pi8 = bitcast void ()* %p to i8*
+ %x = call i1 @llvm.type.test(i8* %pi8, metadata !"typeid3")
+ ret i1 %x
+ }
+
+ define void @main() {
+ %a1 = call i1 @foo(i32* @a) ; returns 1
+ %b1 = call i1 @foo(i32* @b) ; returns 1
+ %c1 = call i1 @foo(i32* @c) ; returns 0
+ %a2 = call i1 @bar(i32* @a) ; returns 0
+ %b2 = call i1 @bar(i32* @b) ; returns 1
+ %c2 = call i1 @bar(i32* @c) ; returns 1
+ %d02 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 0)) ; returns 0
+ %d12 = call i1 @bar(i32* getelementptr ([2 x i32]* @d, i32 0, i32 1)) ; returns 1
+ %e = call i1 @baz(void ()* @e) ; returns 1
+ %f = call i1 @baz(void ()* @f) ; returns 0
+ %g = call i1 @baz(void ()* @g) ; returns 1
+ ret void
+ }
+
+.. _GlobalLayoutBuilder: https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Transforms/IPO/LowerTypeTests.h
Added: www-releases/trunk/9.0.0/docs/_sources/Vectorizers.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/Vectorizers.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/Vectorizers.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/Vectorizers.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,435 @@
+==========================
+Auto-Vectorization in LLVM
+==========================
+
+.. contents::
+ :local:
+
+LLVM has two vectorizers: The :ref:`Loop Vectorizer <loop-vectorizer>`,
+which operates on Loops, and the :ref:`SLP Vectorizer
+<slp-vectorizer>`. These vectorizers
+focus on different optimization opportunities and use different techniques.
+The SLP vectorizer merges multiple scalars that are found in the code into
+vectors while the Loop Vectorizer widens instructions in loops
+to operate on multiple consecutive iterations.
+
+Both the Loop Vectorizer and the SLP Vectorizer are enabled by default.
+
+.. _loop-vectorizer:
+
+The Loop Vectorizer
+===================
+
+Usage
+-----
+
+The Loop Vectorizer is enabled by default, but it can be disabled
+through clang using the command line flag:
+
+.. code-block:: console
+
+ $ clang ... -fno-vectorize file.c
+
+Command line flags
+^^^^^^^^^^^^^^^^^^
+
+The loop vectorizer uses a cost model to decide on the optimal vectorization factor
+and unroll factor. However, users of the vectorizer can force the vectorizer to use
+specific values. Both 'clang' and 'opt' support the flags below.
+
+Users can control the vectorization SIMD width using the command line flag "-force-vector-width".
+
+.. code-block:: console
+
+ $ clang -mllvm -force-vector-width=8 ...
+ $ opt -loop-vectorize -force-vector-width=8 ...
+
+Users can control the unroll factor using the command line flag "-force-vector-interleave"
+
+.. code-block:: console
+
+ $ clang -mllvm -force-vector-interleave=2 ...
+ $ opt -loop-vectorize -force-vector-interleave=2 ...
+
+Pragma loop hint directives
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``#pragma clang loop`` directive allows loop vectorization hints to be
+specified for the subsequent for, while, do-while, or c++11 range-based for
+loop. The directive allows vectorization and interleaving to be enabled or
+disabled. Vector width as well as interleave count can also be manually
+specified. The following example explicitly enables vectorization and
+interleaving:
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize(enable) interleave(enable)
+ while(...) {
+ ...
+ }
+
+The following example implicitly enables vectorization and interleaving by
+specifying a vector width and interleaving count:
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize_width(2) interleave_count(2)
+ for(...) {
+ ...
+ }
+
+See the Clang
+`language extensions
+<http://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations>`_
+for details.
+
+Diagnostics
+-----------
+
+Many loops cannot be vectorized including loops with complicated control flow,
+unvectorizable types, and unvectorizable calls. The loop vectorizer generates
+optimization remarks which can be queried using command line options to identify
+and diagnose loops that are skipped by the loop-vectorizer.
+
+Optimization remarks are enabled using:
+
+``-Rpass=loop-vectorize`` identifies loops that were successfully vectorized.
+
+``-Rpass-missed=loop-vectorize`` identifies loops that failed vectorization and
+indicates if vectorization was specified.
+
+``-Rpass-analysis=loop-vectorize`` identifies the statements that caused
+vectorization to fail. If in addition ``-fsave-optimization-record`` is
+provided, multiple causes of vectorization failure may be listed (this behavior
+might change in the future).
+
+Consider the following loop:
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize(enable)
+ for (int i = 0; i < Length; i++) {
+ switch(A[i]) {
+ case 0: A[i] = i*2; break;
+ case 1: A[i] = i; break;
+ default: A[i] = 0;
+ }
+ }
+
+The command line ``-Rpass-missed=loop-vectorized`` prints the remark:
+
+.. code-block:: console
+
+ no_switch.cpp:4:5: remark: loop not vectorized: vectorization is explicitly enabled [-Rpass-missed=loop-vectorize]
+
+And the command line ``-Rpass-analysis=loop-vectorize`` indicates that the
+switch statement cannot be vectorized.
+
+.. code-block:: console
+
+ no_switch.cpp:4:5: remark: loop not vectorized: loop contains a switch statement [-Rpass-analysis=loop-vectorize]
+ switch(A[i]) {
+ ^
+
+To ensure line and column numbers are produced include the command line options
+``-gline-tables-only`` and ``-gcolumn-info``. See the Clang `user manual
+<http://clang.llvm.org/docs/UsersManual.html#options-to-emit-optimization-reports>`_
+for details
+
+Features
+--------
+
+The LLVM Loop Vectorizer has a number of features that allow it to vectorize
+complex loops.
+
+Loops with unknown trip count
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The Loop Vectorizer supports loops with an unknown trip count.
+In the loop below, the iteration ``start`` and ``finish`` points are unknown,
+and the Loop Vectorizer has a mechanism to vectorize loops that do not start
+at zero. In this example, 'n' may not be a multiple of the vector width, and
+the vectorizer has to execute the last few iterations as scalar code. Keeping
+a scalar copy of the loop increases the code size.
+
+.. code-block:: c++
+
+ void bar(float *A, float* B, float K, int start, int end) {
+ for (int i = start; i < end; ++i)
+ A[i] *= B[i] + K;
+ }
+
+Runtime Checks of Pointers
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In the example below, if the pointers A and B point to consecutive addresses,
+then it is illegal to vectorize the code because some elements of A will be
+written before they are read from array B.
+
+Some programmers use the 'restrict' keyword to notify the compiler that the
+pointers are disjointed, but in our example, the Loop Vectorizer has no way of
+knowing that the pointers A and B are unique. The Loop Vectorizer handles this
+loop by placing code that checks, at runtime, if the arrays A and B point to
+disjointed memory locations. If arrays A and B overlap, then the scalar version
+of the loop is executed.
+
+.. code-block:: c++
+
+ void bar(float *A, float* B, float K, int n) {
+ for (int i = 0; i < n; ++i)
+ A[i] *= B[i] + K;
+ }
+
+
+Reductions
+^^^^^^^^^^
+
+In this example the ``sum`` variable is used by consecutive iterations of
+the loop. Normally, this would prevent vectorization, but the vectorizer can
+detect that 'sum' is a reduction variable. The variable 'sum' becomes a vector
+of integers, and at the end of the loop the elements of the array are added
+together to create the correct result. We support a number of different
+reduction operations, such as addition, multiplication, XOR, AND and OR.
+
+.. code-block:: c++
+
+ int foo(int *A, int *B, int n) {
+ unsigned sum = 0;
+ for (int i = 0; i < n; ++i)
+ sum += A[i] + 5;
+ return sum;
+ }
+
+We support floating point reduction operations when `-ffast-math` is used.
+
+Inductions
+^^^^^^^^^^
+
+In this example the value of the induction variable ``i`` is saved into an
+array. The Loop Vectorizer knows to vectorize induction variables.
+
+.. code-block:: c++
+
+ void bar(float *A, float* B, float K, int n) {
+ for (int i = 0; i < n; ++i)
+ A[i] = i;
+ }
+
+If Conversion
+^^^^^^^^^^^^^
+
+The Loop Vectorizer is able to "flatten" the IF statement in the code and
+generate a single stream of instructions. The Loop Vectorizer supports any
+control flow in the innermost loop. The innermost loop may contain complex
+nesting of IFs, ELSEs and even GOTOs.
+
+.. code-block:: c++
+
+ int foo(int *A, int *B, int n) {
+ unsigned sum = 0;
+ for (int i = 0; i < n; ++i)
+ if (A[i] > B[i])
+ sum += A[i] + 5;
+ return sum;
+ }
+
+Pointer Induction Variables
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This example uses the "accumulate" function of the standard c++ library. This
+loop uses C++ iterators, which are pointers, and not integer indices.
+The Loop Vectorizer detects pointer induction variables and can vectorize
+this loop. This feature is important because many C++ programs use iterators.
+
+.. code-block:: c++
+
+ int baz(int *A, int n) {
+ return std::accumulate(A, A + n, 0);
+ }
+
+Reverse Iterators
+^^^^^^^^^^^^^^^^^
+
+The Loop Vectorizer can vectorize loops that count backwards.
+
+.. code-block:: c++
+
+ int foo(int *A, int *B, int n) {
+ for (int i = n; i > 0; --i)
+ A[i] +=1;
+ }
+
+Scatter / Gather
+^^^^^^^^^^^^^^^^
+
+The Loop Vectorizer can vectorize code that becomes a sequence of scalar instructions
+that scatter/gathers memory.
+
+.. code-block:: c++
+
+ int foo(int * A, int * B, int n) {
+ for (intptr_t i = 0; i < n; ++i)
+ A[i] += B[i * 4];
+ }
+
+In many situations the cost model will inform LLVM that this is not beneficial
+and LLVM will only vectorize such code if forced with "-mllvm -force-vector-width=#".
+
+Vectorization of Mixed Types
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The Loop Vectorizer can vectorize programs with mixed types. The Vectorizer
+cost model can estimate the cost of the type conversion and decide if
+vectorization is profitable.
+
+.. code-block:: c++
+
+ int foo(int *A, char *B, int n, int k) {
+ for (int i = 0; i < n; ++i)
+ A[i] += 4 * B[i];
+ }
+
+Global Structures Alias Analysis
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Access to global structures can also be vectorized, with alias analysis being
+used to make sure accesses don't alias. Run-time checks can also be added on
+pointer access to structure members.
+
+Many variations are supported, but some that rely on undefined behaviour being
+ignored (as other compilers do) are still being left un-vectorized.
+
+.. code-block:: c++
+
+ struct { int A[100], K, B[100]; } Foo;
+
+ int foo() {
+ for (int i = 0; i < 100; ++i)
+ Foo.A[i] = Foo.B[i] + 100;
+ }
+
+Vectorization of function calls
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The Loop Vectorizer can vectorize intrinsic math functions.
+See the table below for a list of these functions.
+
++-----+-----+---------+
+| pow | exp | exp2 |
++-----+-----+---------+
+| sin | cos | sqrt |
++-----+-----+---------+
+| log |log2 | log10 |
++-----+-----+---------+
+|fabs |floor| ceil |
++-----+-----+---------+
+|fma |trunc|nearbyint|
++-----+-----+---------+
+| | | fmuladd |
++-----+-----+---------+
+
+Note that the optimizer may not be able to vectorize math library functions
+that correspond to these intrinsics if the library calls access external state
+such as "errno". To allow better optimization of C/C++ math library functions,
+use "-fno-math-errno".
+
+The loop vectorizer knows about special instructions on the target and will
+vectorize a loop containing a function call that maps to the instructions. For
+example, the loop below will be vectorized on Intel x86 if the SSE4.1 roundps
+instruction is available.
+
+.. code-block:: c++
+
+ void foo(float *f) {
+ for (int i = 0; i != 1024; ++i)
+ f[i] = floorf(f[i]);
+ }
+
+Partial unrolling during vectorization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Modern processors feature multiple execution units, and only programs that contain a
+high degree of parallelism can fully utilize the entire width of the machine.
+The Loop Vectorizer increases the instruction level parallelism (ILP) by
+performing partial-unrolling of loops.
+
+In the example below the entire array is accumulated into the variable 'sum'.
+This is inefficient because only a single execution port can be used by the processor.
+By unrolling the code the Loop Vectorizer allows two or more execution ports
+to be used simultaneously.
+
+.. code-block:: c++
+
+ int foo(int *A, int *B, int n) {
+ unsigned sum = 0;
+ for (int i = 0; i < n; ++i)
+ sum += A[i];
+ return sum;
+ }
+
+The Loop Vectorizer uses a cost model to decide when it is profitable to unroll loops.
+The decision to unroll the loop depends on the register pressure and the generated code size.
+
+Performance
+-----------
+
+This section shows the execution time of Clang on a simple benchmark:
+`gcc-loops <https://github.com/llvm/llvm-test-suite/tree/master/SingleSource/UnitTests/Vectorizer>`_.
+This benchmarks is a collection of loops from the GCC autovectorization
+`page <http://gcc.gnu.org/projects/tree-ssa/vectorization.html>`_ by Dorit Nuzman.
+
+The chart below compares GCC-4.7, ICC-13, and Clang-SVN with and without loop vectorization at -O3, tuned for "corei7-avx", running on a Sandybridge iMac.
+The Y-axis shows the time in msec. Lower is better. The last column shows the geomean of all the kernels.
+
+.. image:: gcc-loops.png
+
+And Linpack-pc with the same configuration. Result is Mflops, higher is better.
+
+.. image:: linpack-pc.png
+
+Ongoing Development Directions
+------------------------------
+
+.. toctree::
+ :hidden:
+
+ Proposals/VectorizationPlan
+
+:doc:`Proposals/VectorizationPlan`
+ Modeling the process and upgrading the infrastructure of LLVM's Loop Vectorizer.
+
+.. _slp-vectorizer:
+
+The SLP Vectorizer
+==================
+
+Details
+-------
+
+The goal of SLP vectorization (a.k.a. superword-level parallelism) is
+to combine similar independent instructions
+into vector instructions. Memory accesses, arithmetic operations, comparison
+operations, PHI-nodes, can all be vectorized using this technique.
+
+For example, the following function performs very similar operations on its
+inputs (a1, b1) and (a2, b2). The basic-block vectorizer may combine these
+into vector operations.
+
+.. code-block:: c++
+
+ void foo(int a1, int a2, int b1, int b2, int *A) {
+ A[0] = a1*(a1 + b1)/b1 + 50*b1/a1;
+ A[1] = a2*(a2 + b2)/b2 + 50*b2/a2;
+ }
+
+The SLP-vectorizer processes the code bottom-up, across basic blocks, in search of scalars to combine.
+
+Usage
+------
+
+The SLP Vectorizer is enabled by default, but it can be disabled
+through clang using the command line flag:
+
+.. code-block:: console
+
+ $ clang -fno-slp-vectorize file.c
Added: www-releases/trunk/9.0.0/docs/_sources/WritingAnLLVMBackend.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/WritingAnLLVMBackend.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/WritingAnLLVMBackend.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/WritingAnLLVMBackend.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,1979 @@
+=======================
+Writing an LLVM Backend
+=======================
+
+.. toctree::
+ :hidden:
+
+ HowToUseInstrMappings
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+This document describes techniques for writing compiler backends that convert
+the LLVM Intermediate Representation (IR) to code for a specified machine or
+other languages. Code intended for a specific machine can take the form of
+either assembly code or binary code (usable for a JIT compiler).
+
+The backend of LLVM features a target-independent code generator that may
+create output for several types of target CPUs --- including X86, PowerPC,
+ARM, and SPARC. The backend may also be used to generate code targeted at SPUs
+of the Cell processor or GPUs to support the execution of compute kernels.
+
+The document focuses on existing examples found in subdirectories of
+``llvm/lib/Target`` in a downloaded LLVM release. In particular, this document
+focuses on the example of creating a static compiler (one that emits text
+assembly) for a SPARC target, because SPARC has fairly standard
+characteristics, such as a RISC instruction set and straightforward calling
+conventions.
+
+Audience
+--------
+
+The audience for this document is anyone who needs to write an LLVM backend to
+generate code for a specific hardware or software target.
+
+Prerequisite Reading
+--------------------
+
+These essential documents must be read before reading this document:
+
+* `LLVM Language Reference Manual <LangRef.html>`_ --- a reference manual for
+ the LLVM assembly language.
+
+* :doc:`CodeGenerator` --- a guide to the components (classes and code
+ generation algorithms) for translating the LLVM internal representation into
+ machine code for a specified target. Pay particular attention to the
+ descriptions of code generation stages: Instruction Selection, Scheduling and
+ Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code
+ Insertion, Late Machine Code Optimizations, and Code Emission.
+
+* :doc:`TableGen/index` --- a document that describes the TableGen
+ (``tblgen``) application that manages domain-specific information to support
+ LLVM code generation. TableGen processes input from a target description
+ file (``.td`` suffix) and generates C++ code that can be used for code
+ generation.
+
+* :doc:`WritingAnLLVMPass` --- The assembly printer is a ``FunctionPass``, as
+ are several ``SelectionDAG`` processing steps.
+
+To follow the SPARC examples in this document, have a copy of `The SPARC
+Architecture Manual, Version 8 <http://www.sparc.org/standards/V8.pdf>`_ for
+reference. For details about the ARM instruction set, refer to the `ARM
+Architecture Reference Manual <http://infocenter.arm.com/>`_. For more about
+the GNU Assembler format (``GAS``), see `Using As
+<http://sourceware.org/binutils/docs/as/index.html>`_, especially for the
+assembly printer. "Using As" contains a list of target machine dependent
+features.
+
+Basic Steps
+-----------
+
+To write a compiler backend for LLVM that converts the LLVM IR to code for a
+specified target (machine or other language), follow these steps:
+
+* Create a subclass of the ``TargetMachine`` class that describes
+ characteristics of your target machine. Copy existing examples of specific
+ ``TargetMachine`` class and header files; for example, start with
+ ``SparcTargetMachine.cpp`` and ``SparcTargetMachine.h``, but change the file
+ names for your target. Similarly, change code that references "``Sparc``" to
+ reference your target.
+
+* Describe the register set of the target. Use TableGen to generate code for
+ register definition, register aliases, and register classes from a
+ target-specific ``RegisterInfo.td`` input file. You should also write
+ additional code for a subclass of the ``TargetRegisterInfo`` class that
+ represents the class register file data used for register allocation and also
+ describes the interactions between registers.
+
+* Describe the instruction set of the target. Use TableGen to generate code
+ for target-specific instructions from target-specific versions of
+ ``TargetInstrFormats.td`` and ``TargetInstrInfo.td``. You should write
+ additional code for a subclass of the ``TargetInstrInfo`` class to represent
+ machine instructions supported by the target machine.
+
+* Describe the selection and conversion of the LLVM IR from a Directed Acyclic
+ Graph (DAG) representation of instructions to native target-specific
+ instructions. Use TableGen to generate code that matches patterns and
+ selects instructions based on additional information in a target-specific
+ version of ``TargetInstrInfo.td``. Write code for ``XXXISelDAGToDAG.cpp``,
+ where ``XXX`` identifies the specific target, to perform pattern matching and
+ DAG-to-DAG instruction selection. Also write code in ``XXXISelLowering.cpp``
+ to replace or remove operations and data types that are not supported
+ natively in a SelectionDAG.
+
+* Write code for an assembly printer that converts LLVM IR to a GAS format for
+ your target machine. You should add assembly strings to the instructions
+ defined in your target-specific version of ``TargetInstrInfo.td``. You
+ should also write code for a subclass of ``AsmPrinter`` that performs the
+ LLVM-to-assembly conversion and a trivial subclass of ``TargetAsmInfo``.
+
+* Optionally, add support for subtargets (i.e., variants with different
+ capabilities). You should also write code for a subclass of the
+ ``TargetSubtarget`` class, which allows you to use the ``-mcpu=`` and
+ ``-mattr=`` command-line options.
+
+* Optionally, add JIT support and create a machine code emitter (subclass of
+ ``TargetJITInfo``) that is used to emit binary code directly into memory.
+
+In the ``.cpp`` and ``.h``. files, initially stub up these methods and then
+implement them later. Initially, you may not know which private members that
+the class will need and which components will need to be subclassed.
+
+Preliminaries
+-------------
+
+To actually create your compiler backend, you need to create and modify a few
+files. The absolute minimum is discussed here. But to actually use the LLVM
+target-independent code generator, you must perform the steps described in the
+:doc:`LLVM Target-Independent Code Generator <CodeGenerator>` document.
+
+First, you should create a subdirectory under ``lib/Target`` to hold all the
+files related to your target. If your target is called "Dummy", create the
+directory ``lib/Target/Dummy``.
+
+In this new directory, create a ``CMakeLists.txt``. It is easiest to copy a
+``CMakeLists.txt`` of another target and modify it. It should at least contain
+the ``LLVM_TARGET_DEFINITIONS`` variable. The library can be named ``LLVMDummy``
+(for example, see the MIPS target). Alternatively, you can split the library
+into ``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which
+should be implemented in a subdirectory below ``lib/Target/Dummy`` (for example,
+see the PowerPC target).
+
+Note that these two naming schemes are hardcoded into ``llvm-config``. Using
+any other naming scheme will confuse ``llvm-config`` and produce a lot of
+(seemingly unrelated) linker errors when linking ``llc``.
+
+To make your target actually do something, you need to implement a subclass of
+``TargetMachine``. This implementation should typically be in the file
+``lib/Target/DummyTargetMachine.cpp``, but any file in the ``lib/Target``
+directory will be built and should work. To use LLVM's target independent code
+generator, you should do what all current machine backends do: create a
+subclass of ``LLVMTargetMachine``. (To create a target from scratch, create a
+subclass of ``TargetMachine``.)
+
+To get LLVM to actually build and link your target, you need to run ``cmake``
+with ``-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=Dummy``. This will build your
+target without needing to add it to the list of all the targets.
+
+Once your target is stable, you can add it to the ``LLVM_ALL_TARGETS`` variable
+located in the main ``CMakeLists.txt``.
+
+Target Machine
+==============
+
+``LLVMTargetMachine`` is designed as a base class for targets implemented with
+the LLVM target-independent code generator. The ``LLVMTargetMachine`` class
+should be specialized by a concrete target class that implements the various
+virtual methods. ``LLVMTargetMachine`` is defined as a subclass of
+``TargetMachine`` in ``include/llvm/Target/TargetMachine.h``. The
+``TargetMachine`` class implementation (``TargetMachine.cpp``) also processes
+numerous command-line options.
+
+To create a concrete target-specific subclass of ``LLVMTargetMachine``, start
+by copying an existing ``TargetMachine`` class and header. You should name the
+files that you create to reflect your specific target. For instance, for the
+SPARC target, name the files ``SparcTargetMachine.h`` and
+``SparcTargetMachine.cpp``.
+
+For a target machine ``XXX``, the implementation of ``XXXTargetMachine`` must
+have access methods to obtain objects that represent target components. These
+methods are named ``get*Info``, and are intended to obtain the instruction set
+(``getInstrInfo``), register set (``getRegisterInfo``), stack frame layout
+(``getFrameInfo``), and similar information. ``XXXTargetMachine`` must also
+implement the ``getDataLayout`` method to access an object with target-specific
+data characteristics, such as data type size and alignment requirements.
+
+For instance, for the SPARC target, the header file ``SparcTargetMachine.h``
+declares prototypes for several ``get*Info`` and ``getDataLayout`` methods that
+simply return a class member.
+
+.. code-block:: c++
+
+ namespace llvm {
+
+ class Module;
+
+ class SparcTargetMachine : public LLVMTargetMachine {
+ const DataLayout DataLayout; // Calculates type size & alignment
+ SparcSubtarget Subtarget;
+ SparcInstrInfo InstrInfo;
+ TargetFrameInfo FrameInfo;
+
+ protected:
+ virtual const TargetAsmInfo *createTargetAsmInfo() const;
+
+ public:
+ SparcTargetMachine(const Module &M, const std::string &FS);
+
+ virtual const SparcInstrInfo *getInstrInfo() const {return &InstrInfo; }
+ virtual const TargetFrameInfo *getFrameInfo() const {return &FrameInfo; }
+ virtual const TargetSubtarget *getSubtargetImpl() const{return &Subtarget; }
+ virtual const TargetRegisterInfo *getRegisterInfo() const {
+ return &InstrInfo.getRegisterInfo();
+ }
+ virtual const DataLayout *getDataLayout() const { return &DataLayout; }
+ static unsigned getModuleMatchQuality(const Module &M);
+
+ // Pass Pipeline Configuration
+ virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
+ virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
+ };
+
+ } // end namespace llvm
+
+* ``getInstrInfo()``
+* ``getRegisterInfo()``
+* ``getFrameInfo()``
+* ``getDataLayout()``
+* ``getSubtargetImpl()``
+
+For some targets, you also need to support the following methods:
+
+* ``getTargetLowering()``
+* ``getJITInfo()``
+
+Some architectures, such as GPUs, do not support jumping to an arbitrary
+program location and implement branching using masked execution and loop using
+special instructions around the loop body. In order to avoid CFG modifications
+that introduce irreducible control flow not handled by such hardware, a target
+must call `setRequiresStructuredCFG(true)` when being initialized.
+
+In addition, the ``XXXTargetMachine`` constructor should specify a
+``TargetDescription`` string that determines the data layout for the target
+machine, including characteristics such as pointer size, alignment, and
+endianness. For example, the constructor for ``SparcTargetMachine`` contains
+the following:
+
+.. code-block:: c++
+
+ SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
+ : DataLayout("E-p:32:32-f128:128:128"),
+ Subtarget(M, FS), InstrInfo(Subtarget),
+ FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
+ }
+
+Hyphens separate portions of the ``TargetDescription`` string.
+
+* An upper-case "``E``" in the string indicates a big-endian target data model.
+ A lower-case "``e``" indicates little-endian.
+
+* "``p:``" is followed by pointer information: size, ABI alignment, and
+ preferred alignment. If only two figures follow "``p:``", then the first
+ value is pointer size, and the second value is both ABI and preferred
+ alignment.
+
+* Then a letter for numeric type alignment: "``i``", "``f``", "``v``", or
+ "``a``" (corresponding to integer, floating point, vector, or aggregate).
+ "``i``", "``v``", or "``a``" are followed by ABI alignment and preferred
+ alignment. "``f``" is followed by three values: the first indicates the size
+ of a long double, then ABI alignment, and then ABI preferred alignment.
+
+Target Registration
+===================
+
+You must also register your target with the ``TargetRegistry``, which is what
+other LLVM tools use to be able to lookup and use your target at runtime. The
+``TargetRegistry`` can be used directly, but for most targets there are helper
+templates which should take care of the work for you.
+
+All targets should declare a global ``Target`` object which is used to
+represent the target during registration. Then, in the target's ``TargetInfo``
+library, the target should define that object and use the ``RegisterTarget``
+template to register the target. For example, the Sparc registration code
+looks like this:
+
+.. code-block:: c++
+
+ Target llvm::getTheSparcTarget();
+
+ extern "C" void LLVMInitializeSparcTargetInfo() {
+ RegisterTarget<Triple::sparc, /*HasJIT=*/false>
+ X(getTheSparcTarget(), "sparc", "Sparc");
+ }
+
+This allows the ``TargetRegistry`` to look up the target by name or by target
+triple. In addition, most targets will also register additional features which
+are available in separate libraries. These registration steps are separate,
+because some clients may wish to only link in some parts of the target --- the
+JIT code generator does not require the use of the assembler printer, for
+example. Here is an example of registering the Sparc assembly printer:
+
+.. code-block:: c++
+
+ extern "C" void LLVMInitializeSparcAsmPrinter() {
+ RegisterAsmPrinter<SparcAsmPrinter> X(getTheSparcTarget());
+ }
+
+For more information, see "`llvm/Target/TargetRegistry.h
+</doxygen/TargetRegistry_8h-source.html>`_".
+
+Register Set and Register Classes
+=================================
+
+You should describe a concrete target-specific class that represents the
+register file of a target machine. This class is called ``XXXRegisterInfo``
+(where ``XXX`` identifies the target) and represents the class register file
+data that is used for register allocation. It also describes the interactions
+between registers.
+
+You also need to define register classes to categorize related registers. A
+register class should be added for groups of registers that are all treated the
+same way for some instruction. Typical examples are register classes for
+integer, floating-point, or vector registers. A register allocator allows an
+instruction to use any register in a specified register class to perform the
+instruction in a similar manner. Register classes allocate virtual registers
+to instructions from these sets, and register classes let the
+target-independent register allocator automatically choose the actual
+registers.
+
+Much of the code for registers, including register definition, register
+aliases, and register classes, is generated by TableGen from
+``XXXRegisterInfo.td`` input files and placed in ``XXXGenRegisterInfo.h.inc``
+and ``XXXGenRegisterInfo.inc`` output files. Some of the code in the
+implementation of ``XXXRegisterInfo`` requires hand-coding.
+
+Defining a Register
+-------------------
+
+The ``XXXRegisterInfo.td`` file typically starts with register definitions for
+a target machine. The ``Register`` class (specified in ``Target.td``) is used
+to define an object for each register. The specified string ``n`` becomes the
+``Name`` of the register. The basic ``Register`` object does not have any
+subregisters and does not specify any aliases.
+
+.. code-block:: text
+
+ class Register<string n> {
+ string Namespace = "";
+ string AsmName = n;
+ string Name = n;
+ int SpillSize = 0;
+ int SpillAlignment = 0;
+ list<Register> Aliases = [];
+ list<Register> SubRegs = [];
+ list<int> DwarfNumbers = [];
+ }
+
+For example, in the ``X86RegisterInfo.td`` file, there are register definitions
+that utilize the ``Register`` class, such as:
+
+.. code-block:: text
+
+ def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;
+
+This defines the register ``AL`` and assigns it values (with ``DwarfRegNum``)
+that are used by ``gcc``, ``gdb``, or a debug information writer to identify a
+register. For register ``AL``, ``DwarfRegNum`` takes an array of 3 values
+representing 3 different modes: the first element is for X86-64, the second for
+exception handling (EH) on X86-32, and the third is generic. -1 is a special
+Dwarf number that indicates the gcc number is undefined, and -2 indicates the
+register number is invalid for this mode.
+
+From the previously described line in the ``X86RegisterInfo.td`` file, TableGen
+generates this code in the ``X86GenRegisterInfo.inc`` file:
+
+.. code-block:: c++
+
+ static const unsigned GR8[] = { X86::AL, ... };
+
+ const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };
+
+ const TargetRegisterDesc RegisterDescriptors[] = {
+ ...
+ { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...
+
+From the register info file, TableGen generates a ``TargetRegisterDesc`` object
+for each register. ``TargetRegisterDesc`` is defined in
+``include/llvm/Target/TargetRegisterInfo.h`` with the following fields:
+
+.. code-block:: c++
+
+ struct TargetRegisterDesc {
+ const char *AsmName; // Assembly language name for the register
+ const char *Name; // Printable name for the reg (for debugging)
+ const unsigned *AliasSet; // Register Alias Set
+ const unsigned *SubRegs; // Sub-register set
+ const unsigned *ImmSubRegs; // Immediate sub-register set
+ const unsigned *SuperRegs; // Super-register set
+ };
+
+TableGen uses the entire target description file (``.td``) to determine text
+names for the register (in the ``AsmName`` and ``Name`` fields of
+``TargetRegisterDesc``) and the relationships of other registers to the defined
+register (in the other ``TargetRegisterDesc`` fields). In this example, other
+definitions establish the registers "``AX``", "``EAX``", and "``RAX``" as
+aliases for one another, so TableGen generates a null-terminated array
+(``AL_AliasSet``) for this register alias set.
+
+The ``Register`` class is commonly used as a base class for more complex
+classes. In ``Target.td``, the ``Register`` class is the base for the
+``RegisterWithSubRegs`` class that is used to define registers that need to
+specify subregisters in the ``SubRegs`` list, as shown here:
+
+.. code-block:: text
+
+ class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
+ let SubRegs = subregs;
+ }
+
+In ``SparcRegisterInfo.td``, additional register classes are defined for SPARC:
+a ``Register`` subclass, ``SparcReg``, and further subclasses: ``Ri``, ``Rf``,
+and ``Rd``. SPARC registers are identified by 5-bit ID numbers, which is a
+feature common to these subclasses. Note the use of "``let``" expressions to
+override values that are initially defined in a superclass (such as ``SubRegs``
+field in the ``Rd`` class).
+
+.. code-block:: text
+
+ class SparcReg<string n> : Register<n> {
+ field bits<5> Num;
+ let Namespace = "SP";
+ }
+ // Ri - 32-bit integer registers
+ class Ri<bits<5> num, string n> :
+ SparcReg<n> {
+ let Num = num;
+ }
+ // Rf - 32-bit floating-point registers
+ class Rf<bits<5> num, string n> :
+ SparcReg<n> {
+ let Num = num;
+ }
+ // Rd - Slots in the FP register file for 64-bit floating-point values.
+ class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
+ let Num = num;
+ let SubRegs = subregs;
+ }
+
+In the ``SparcRegisterInfo.td`` file, there are register definitions that
+utilize these subclasses of ``Register``, such as:
+
+.. code-block:: text
+
+ def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
+ def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
+ ...
+ def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
+ def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
+ ...
+ def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
+ def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;
+
+The last two registers shown above (``D0`` and ``D1``) are double-precision
+floating-point registers that are aliases for pairs of single-precision
+floating-point sub-registers. In addition to aliases, the sub-register and
+super-register relationships of the defined register are in fields of a
+register's ``TargetRegisterDesc``.
+
+Defining a Register Class
+-------------------------
+
+The ``RegisterClass`` class (specified in ``Target.td``) is used to define an
+object that represents a group of related registers and also defines the
+default allocation order of the registers. A target description file
+``XXXRegisterInfo.td`` that uses ``Target.td`` can construct register classes
+using the following class:
+
+.. code-block:: text
+
+ class RegisterClass<string namespace,
+ list<ValueType> regTypes, int alignment, dag regList> {
+ string Namespace = namespace;
+ list<ValueType> RegTypes = regTypes;
+ int Size = 0; // spill size, in bits; zero lets tblgen pick the size
+ int Alignment = alignment;
+
+ // CopyCost is the cost of copying a value between two registers
+ // default value 1 means a single instruction
+ // A negative value means copying is extremely expensive or impossible
+ int CopyCost = 1;
+ dag MemberList = regList;
+
+ // for register classes that are subregisters of this class
+ list<RegisterClass> SubRegClassList = [];
+
+ code MethodProtos = [{}]; // to insert arbitrary code
+ code MethodBodies = [{}];
+ }
+
+To define a ``RegisterClass``, use the following 4 arguments:
+
+* The first argument of the definition is the name of the namespace.
+
+* The second argument is a list of ``ValueType`` register type values that are
+ defined in ``include/llvm/CodeGen/ValueTypes.td``. Defined values include
+ integer types (such as ``i16``, ``i32``, and ``i1`` for Boolean),
+ floating-point types (``f32``, ``f64``), and vector types (for example,
+ ``v8i16`` for an ``8 x i16`` vector). All registers in a ``RegisterClass``
+ must have the same ``ValueType``, but some registers may store vector data in
+ different configurations. For example a register that can process a 128-bit
+ vector may be able to handle 16 8-bit integer elements, 8 16-bit integers, 4
+ 32-bit integers, and so on.
+
+* The third argument of the ``RegisterClass`` definition specifies the
+ alignment required of the registers when they are stored or loaded to
+ memory.
+
+* The final argument, ``regList``, specifies which registers are in this class.
+ If an alternative allocation order method is not specified, then ``regList``
+ also defines the order of allocation used by the register allocator. Besides
+ simply listing registers with ``(add R0, R1, ...)``, more advanced set
+ operators are available. See ``include/llvm/Target/Target.td`` for more
+ information.
+
+In ``SparcRegisterInfo.td``, three ``RegisterClass`` objects are defined:
+``FPRegs``, ``DFPRegs``, and ``IntRegs``. For all three register classes, the
+first argument defines the namespace with the string "``SP``". ``FPRegs``
+defines a group of 32 single-precision floating-point registers (``F0`` to
+``F31``); ``DFPRegs`` defines a group of 16 double-precision registers
+(``D0-D15``).
+
+.. code-block:: text
+
+ // F0, F1, F2, ..., F31
+ def FPRegs : RegisterClass<"SP", [f32], 32, (sequence "F%u", 0, 31)>;
+
+ def DFPRegs : RegisterClass<"SP", [f64], 64,
+ (add D0, D1, D2, D3, D4, D5, D6, D7, D8,
+ D9, D10, D11, D12, D13, D14, D15)>;
+
+ def IntRegs : RegisterClass<"SP", [i32], 32,
+ (add L0, L1, L2, L3, L4, L5, L6, L7,
+ I0, I1, I2, I3, I4, I5,
+ O0, O1, O2, O3, O4, O5, O7,
+ G1,
+ // Non-allocatable regs:
+ G2, G3, G4,
+ O6, // stack ptr
+ I6, // frame ptr
+ I7, // return address
+ G0, // constant zero
+ G5, G6, G7 // reserved for kernel
+ )>;
+
+Using ``SparcRegisterInfo.td`` with TableGen generates several output files
+that are intended for inclusion in other source code that you write.
+``SparcRegisterInfo.td`` generates ``SparcGenRegisterInfo.h.inc``, which should
+be included in the header file for the implementation of the SPARC register
+implementation that you write (``SparcRegisterInfo.h``). In
+``SparcGenRegisterInfo.h.inc`` a new structure is defined called
+``SparcGenRegisterInfo`` that uses ``TargetRegisterInfo`` as its base. It also
+specifies types, based upon the defined register classes: ``DFPRegsClass``,
+``FPRegsClass``, and ``IntRegsClass``.
+
+``SparcRegisterInfo.td`` also generates ``SparcGenRegisterInfo.inc``, which is
+included at the bottom of ``SparcRegisterInfo.cpp``, the SPARC register
+implementation. The code below shows only the generated integer registers and
+associated register classes. The order of registers in ``IntRegs`` reflects
+the order in the definition of ``IntRegs`` in the target description file.
+
+.. code-block:: c++
+
+ // IntRegs Register Class...
+ static const unsigned IntRegs[] = {
+ SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
+ SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
+ SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
+ SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
+ SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
+ SP::G6, SP::G7,
+ };
+
+ // IntRegsVTs Register Class Value Types...
+ static const MVT::ValueType IntRegsVTs[] = {
+ MVT::i32, MVT::Other
+ };
+
+ namespace SP { // Register class instances
+ DFPRegsClass DFPRegsRegClass;
+ FPRegsClass FPRegsRegClass;
+ IntRegsClass IntRegsRegClass;
+ ...
+ // IntRegs Sub-register Classes...
+ static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
+ NULL
+ };
+ ...
+ // IntRegs Super-register Classes..
+ static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
+ NULL
+ };
+ ...
+ // IntRegs Register Class sub-classes...
+ static const TargetRegisterClass* const IntRegsSubclasses [] = {
+ NULL
+ };
+ ...
+ // IntRegs Register Class super-classes...
+ static const TargetRegisterClass* const IntRegsSuperclasses [] = {
+ NULL
+ };
+
+ IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
+ IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
+ IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
+ }
+
+The register allocators will avoid using reserved registers, and callee saved
+registers are not used until all the volatile registers have been used. That
+is usually good enough, but in some cases it may be necessary to provide custom
+allocation orders.
+
+Implement a subclass of ``TargetRegisterInfo``
+----------------------------------------------
+
+The final step is to hand code portions of ``XXXRegisterInfo``, which
+implements the interface described in ``TargetRegisterInfo.h`` (see
+:ref:`TargetRegisterInfo`). These functions return ``0``, ``NULL``, or
+``false``, unless overridden. Here is a list of functions that are overridden
+for the SPARC implementation in ``SparcRegisterInfo.cpp``:
+
+* ``getCalleeSavedRegs`` --- Returns a list of callee-saved registers in the
+ order of the desired callee-save stack frame offset.
+
+* ``getReservedRegs`` --- Returns a bitset indexed by physical register
+ numbers, indicating if a particular register is unavailable.
+
+* ``hasFP`` --- Return a Boolean indicating if a function should have a
+ dedicated frame pointer register.
+
+* ``eliminateCallFramePseudoInstr`` --- If call frame setup or destroy pseudo
+ instructions are used, this can be called to eliminate them.
+
+* ``eliminateFrameIndex`` --- Eliminate abstract frame indices from
+ instructions that may use them.
+
+* ``emitPrologue`` --- Insert prologue code into the function.
+
+* ``emitEpilogue`` --- Insert epilogue code into the function.
+
+.. _instruction-set:
+
+Instruction Set
+===============
+
+During the early stages of code generation, the LLVM IR code is converted to a
+``SelectionDAG`` with nodes that are instances of the ``SDNode`` class
+containing target instructions. An ``SDNode`` has an opcode, operands, type
+requirements, and operation properties. For example, is an operation
+commutative, does an operation load from memory. The various operation node
+types are described in the ``include/llvm/CodeGen/SelectionDAGNodes.h`` file
+(values of the ``NodeType`` enum in the ``ISD`` namespace).
+
+TableGen uses the following target description (``.td``) input files to
+generate much of the code for instruction definition:
+
+* ``Target.td`` --- Where the ``Instruction``, ``Operand``, ``InstrInfo``, and
+ other fundamental classes are defined.
+
+* ``TargetSelectionDAG.td`` --- Used by ``SelectionDAG`` instruction selection
+ generators, contains ``SDTC*`` classes (selection DAG type constraint),
+ definitions of ``SelectionDAG`` nodes (such as ``imm``, ``cond``, ``bb``,
+ ``add``, ``fadd``, ``sub``), and pattern support (``Pattern``, ``Pat``,
+ ``PatFrag``, ``PatLeaf``, ``ComplexPattern``.
+
+* ``XXXInstrFormats.td`` --- Patterns for definitions of target-specific
+ instructions.
+
+* ``XXXInstrInfo.td`` --- Target-specific definitions of instruction templates,
+ condition codes, and instructions of an instruction set. For architecture
+ modifications, a different file name may be used. For example, for Pentium
+ with SSE instruction, this file is ``X86InstrSSE.td``, and for Pentium with
+ MMX, this file is ``X86InstrMMX.td``.
+
+There is also a target-specific ``XXX.td`` file, where ``XXX`` is the name of
+the target. The ``XXX.td`` file includes the other ``.td`` input files, but
+its contents are only directly important for subtargets.
+
+You should describe a concrete target-specific class ``XXXInstrInfo`` that
+represents machine instructions supported by a target machine.
+``XXXInstrInfo`` contains an array of ``XXXInstrDescriptor`` objects, each of
+which describes one instruction. An instruction descriptor defines:
+
+* Opcode mnemonic
+* Number of operands
+* List of implicit register definitions and uses
+* Target-independent properties (such as memory access, is commutable)
+* Target-specific flags
+
+The Instruction class (defined in ``Target.td``) is mostly used as a base for
+more complex instruction classes.
+
+.. code-block:: text
+
+ class Instruction {
+ string Namespace = "";
+ dag OutOperandList; // A dag containing the MI def operand list.
+ dag InOperandList; // A dag containing the MI use operand list.
+ string AsmString = ""; // The .s format to print the instruction with.
+ list<dag> Pattern; // Set to the DAG pattern for this instruction.
+ list<Register> Uses = [];
+ list<Register> Defs = [];
+ list<Predicate> Predicates = []; // predicates turned into isel match code
+ ... remainder not shown for space ...
+ }
+
+A ``SelectionDAG`` node (``SDNode``) should contain an object representing a
+target-specific instruction that is defined in ``XXXInstrInfo.td``. The
+instruction objects should represent instructions from the architecture manual
+of the target machine (such as the SPARC Architecture Manual for the SPARC
+target).
+
+A single instruction from the architecture manual is often modeled as multiple
+target instructions, depending upon its operands. For example, a manual might
+describe an add instruction that takes a register or an immediate operand. An
+LLVM target could model this with two instructions named ``ADDri`` and
+``ADDrr``.
+
+You should define a class for each instruction category and define each opcode
+as a subclass of the category with appropriate parameters such as the fixed
+binary encoding of opcodes and extended opcodes. You should map the register
+bits to the bits of the instruction in which they are encoded (for the JIT).
+Also you should specify how the instruction should be printed when the
+automatic assembly printer is used.
+
+As is described in the SPARC Architecture Manual, Version 8, there are three
+major 32-bit formats for instructions. Format 1 is only for the ``CALL``
+instruction. Format 2 is for branch on condition codes and ``SETHI`` (set high
+bits of a register) instructions. Format 3 is for other instructions.
+
+Each of these formats has corresponding classes in ``SparcInstrFormat.td``.
+``InstSP`` is a base class for other instruction classes. Additional base
+classes are specified for more precise formats: for example in
+``SparcInstrFormat.td``, ``F2_1`` is for ``SETHI``, and ``F2_2`` is for
+branches. There are three other base classes: ``F3_1`` for register/register
+operations, ``F3_2`` for register/immediate operations, and ``F3_3`` for
+floating-point operations. ``SparcInstrInfo.td`` also adds the base class
+``Pseudo`` for synthetic SPARC instructions.
+
+``SparcInstrInfo.td`` largely consists of operand and instruction definitions
+for the SPARC target. In ``SparcInstrInfo.td``, the following target
+description file entry, ``LDrr``, defines the Load Integer instruction for a
+Word (the ``LD`` SPARC opcode) from a memory address to a register. The first
+parameter, the value 3 (``11``\ :sub:`2`), is the operation value for this
+category of operation. The second parameter (``000000``\ :sub:`2`) is the
+specific operation value for ``LD``/Load Word. The third parameter is the
+output destination, which is a register operand and defined in the ``Register``
+target description file (``IntRegs``).
+
+.. code-block:: text
+
+ def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
+ "ld [$addr], $dst",
+ [(set i32:$dst, (load ADDRrr:$addr))]>;
+
+The fourth parameter is the input source, which uses the address operand
+``MEMrr`` that is defined earlier in ``SparcInstrInfo.td``:
+
+.. code-block:: text
+
+ def MEMrr : Operand<i32> {
+ let PrintMethod = "printMemOperand";
+ let MIOperandInfo = (ops IntRegs, IntRegs);
+ }
+
+The fifth parameter is a string that is used by the assembly printer and can be
+left as an empty string until the assembly printer interface is implemented.
+The sixth and final parameter is the pattern used to match the instruction
+during the SelectionDAG Select Phase described in :doc:`CodeGenerator`.
+This parameter is detailed in the next section, :ref:`instruction-selector`.
+
+Instruction class definitions are not overloaded for different operand types,
+so separate versions of instructions are needed for register, memory, or
+immediate value operands. For example, to perform a Load Integer instruction
+for a Word from an immediate operand to a register, the following instruction
+class is defined:
+
+.. code-block:: text
+
+ def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
+ "ld [$addr], $dst",
+ [(set i32:$dst, (load ADDRri:$addr))]>;
+
+Writing these definitions for so many similar instructions can involve a lot of
+cut and paste. In ``.td`` files, the ``multiclass`` directive enables the
+creation of templates to define several instruction classes at once (using the
+``defm`` directive). For example in ``SparcInstrInfo.td``, the ``multiclass``
+pattern ``F3_12`` is defined to create 2 instruction classes each time
+``F3_12`` is invoked:
+
+.. code-block:: text
+
+ multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
+ def rr : F3_1 <2, Op3Val,
+ (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
+ !strconcat(OpcStr, " $b, $c, $dst"),
+ [(set i32:$dst, (OpNode i32:$b, i32:$c))]>;
+ def ri : F3_2 <2, Op3Val,
+ (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
+ !strconcat(OpcStr, " $b, $c, $dst"),
+ [(set i32:$dst, (OpNode i32:$b, simm13:$c))]>;
+ }
+
+So when the ``defm`` directive is used for the ``XOR`` and ``ADD``
+instructions, as seen below, it creates four instruction objects: ``XORrr``,
+``XORri``, ``ADDrr``, and ``ADDri``.
+
+.. code-block:: text
+
+ defm XOR : F3_12<"xor", 0b000011, xor>;
+ defm ADD : F3_12<"add", 0b000000, add>;
+
+``SparcInstrInfo.td`` also includes definitions for condition codes that are
+referenced by branch instructions. The following definitions in
+``SparcInstrInfo.td`` indicate the bit location of the SPARC condition code.
+For example, the 10\ :sup:`th` bit represents the "greater than" condition for
+integers, and the 22\ :sup:`nd` bit represents the "greater than" condition for
+floats.
+
+.. code-block:: text
+
+ def ICC_NE : ICC_VAL< 9>; // Not Equal
+ def ICC_E : ICC_VAL< 1>; // Equal
+ def ICC_G : ICC_VAL<10>; // Greater
+ ...
+ def FCC_U : FCC_VAL<23>; // Unordered
+ def FCC_G : FCC_VAL<22>; // Greater
+ def FCC_UG : FCC_VAL<21>; // Unordered or Greater
+ ...
+
+(Note that ``Sparc.h`` also defines enums that correspond to the same SPARC
+condition codes. Care must be taken to ensure the values in ``Sparc.h``
+correspond to the values in ``SparcInstrInfo.td``. I.e., ``SPCC::ICC_NE = 9``,
+``SPCC::FCC_U = 23`` and so on.)
+
+Instruction Operand Mapping
+---------------------------
+
+The code generator backend maps instruction operands to fields in the
+instruction. Operands are assigned to unbound fields in the instruction in the
+order they are defined. Fields are bound when they are assigned a value. For
+example, the Sparc target defines the ``XNORrr`` instruction as a ``F3_1``
+format instruction having three operands.
+
+.. code-block:: text
+
+ def XNORrr : F3_1<2, 0b000111,
+ (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
+ "xnor $b, $c, $dst",
+ [(set i32:$dst, (not (xor i32:$b, i32:$c)))]>;
+
+The instruction templates in ``SparcInstrFormats.td`` show the base class for
+``F3_1`` is ``InstSP``.
+
+.. code-block:: text
+
+ class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
+ field bits<32> Inst;
+ let Namespace = "SP";
+ bits<2> op;
+ let Inst{31-30} = op;
+ dag OutOperandList = outs;
+ dag InOperandList = ins;
+ let AsmString = asmstr;
+ let Pattern = pattern;
+ }
+
+``InstSP`` leaves the ``op`` field unbound.
+
+.. code-block:: text
+
+ class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
+ : InstSP<outs, ins, asmstr, pattern> {
+ bits<5> rd;
+ bits<6> op3;
+ bits<5> rs1;
+ let op{1} = 1; // Op = 2 or 3
+ let Inst{29-25} = rd;
+ let Inst{24-19} = op3;
+ let Inst{18-14} = rs1;
+ }
+
+``F3`` binds the ``op`` field and defines the ``rd``, ``op3``, and ``rs1``
+fields. ``F3`` format instructions will bind the operands ``rd``, ``op3``, and
+``rs1`` fields.
+
+.. code-block:: text
+
+ class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
+ string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
+ bits<8> asi = 0; // asi not currently used
+ bits<5> rs2;
+ let op = opVal;
+ let op3 = op3val;
+ let Inst{13} = 0; // i field = 0
+ let Inst{12-5} = asi; // address space identifier
+ let Inst{4-0} = rs2;
+ }
+
+``F3_1`` binds the ``op3`` field and defines the ``rs2`` fields. ``F3_1``
+format instructions will bind the operands to the ``rd``, ``rs1``, and ``rs2``
+fields. This results in the ``XNORrr`` instruction binding ``$dst``, ``$b``,
+and ``$c`` operands to the ``rd``, ``rs1``, and ``rs2`` fields respectively.
+
+Instruction Operand Name Mapping
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+TableGen will also generate a function called getNamedOperandIdx() which
+can be used to look up an operand's index in a MachineInstr based on its
+TableGen name. Setting the UseNamedOperandTable bit in an instruction's
+TableGen definition will add all of its operands to an enumeration in the
+llvm::XXX:OpName namespace and also add an entry for it into the OperandMap
+table, which can be queried using getNamedOperandIdx()
+
+.. code-block:: text
+
+ int DstIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::dst); // => 0
+ int BIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::b); // => 1
+ int CIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::c); // => 2
+ int DIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::d); // => -1
+
+ ...
+
+The entries in the OpName enum are taken verbatim from the TableGen definitions,
+so operands with lowercase names will have lower case entries in the enum.
+
+To include the getNamedOperandIdx() function in your backend, you will need
+to define a few preprocessor macros in XXXInstrInfo.cpp and XXXInstrInfo.h.
+For example:
+
+XXXInstrInfo.cpp:
+
+.. code-block:: c++
+
+ #define GET_INSTRINFO_NAMED_OPS // For getNamedOperandIdx() function
+ #include "XXXGenInstrInfo.inc"
+
+XXXInstrInfo.h:
+
+.. code-block:: c++
+
+ #define GET_INSTRINFO_OPERAND_ENUM // For OpName enum
+ #include "XXXGenInstrInfo.inc"
+
+ namespace XXX {
+ int16_t getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIndex);
+ } // End namespace XXX
+
+Instruction Operand Types
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+TableGen will also generate an enumeration consisting of all named Operand
+types defined in the backend, in the llvm::XXX::OpTypes namespace.
+Some common immediate Operand types (for instance i8, i32, i64, f32, f64)
+are defined for all targets in ``include/llvm/Target/Target.td``, and are
+available in each Target's OpTypes enum. Also, only named Operand types appear
+in the enumeration: anonymous types are ignored.
+For example, the X86 backend defines ``brtarget`` and ``brtarget8``, both
+instances of the TableGen ``Operand`` class, which represent branch target
+operands:
+
+.. code-block:: text
+
+ def brtarget : Operand<OtherVT>;
+ def brtarget8 : Operand<OtherVT>;
+
+This results in:
+
+.. code-block:: c++
+
+ namespace X86 {
+ namespace OpTypes {
+ enum OperandType {
+ ...
+ brtarget,
+ brtarget8,
+ ...
+ i32imm,
+ i64imm,
+ ...
+ OPERAND_TYPE_LIST_END
+ } // End namespace OpTypes
+ } // End namespace X86
+
+In typical TableGen fashion, to use the enum, you will need to define a
+preprocessor macro:
+
+.. code-block:: c++
+
+ #define GET_INSTRINFO_OPERAND_TYPES_ENUM // For OpTypes enum
+ #include "XXXGenInstrInfo.inc"
+
+
+Instruction Scheduling
+----------------------
+
+Instruction itineraries can be queried using MCDesc::getSchedClass(). The
+value can be named by an enumeration in llvm::XXX::Sched namespace generated
+by TableGen in XXXGenInstrInfo.inc. The name of the schedule classes are
+the same as provided in XXXSchedule.td plus a default NoItinerary class.
+
+The schedule models are generated by TableGen by the SubtargetEmitter,
+using the ``CodeGenSchedModels`` class. This is distinct from the itinerary
+method of specifying machine resource use. The tool ``utils/schedcover.py``
+can be used to determine which instructions have been covered by the
+schedule model description and which haven't. The first step is to use the
+instructions below to create an output file. Then run ``schedcover.py`` on the
+output file:
+
+.. code-block:: shell
+
+ $ <src>/utils/schedcover.py <build>/lib/Target/AArch64/tblGenSubtarget.with
+ instruction, default, CortexA53Model, CortexA57Model, CycloneModel, ExynosM1Model, FalkorModel, KryoModel, ThunderX2T99Model, ThunderXT8XModel
+ ABSv16i8, WriteV, , , CyWriteV3, M1WriteNMISC1, FalkorWr_2VXVY_2cyc, KryoWrite_2cyc_XY_XY_150ln, ,
+ ABSv1i64, WriteV, , , CyWriteV3, M1WriteNMISC1, FalkorWr_1VXVY_2cyc, KryoWrite_2cyc_XY_noRSV_67ln, ,
+ ...
+
+To capture the debug output from generating a schedule model, change to the
+appropriate target directory and use the following command:
+command with the ``subtarget-emitter`` debug option:
+
+.. code-block:: shell
+
+ $ <build>/bin/llvm-tblgen -debug-only=subtarget-emitter -gen-subtarget \
+ -I <src>/lib/Target/<target> -I <src>/include \
+ -I <src>/lib/Target <src>/lib/Target/<target>/<target>.td \
+ -o <build>/lib/Target/<target>/<target>GenSubtargetInfo.inc.tmp \
+ > tblGenSubtarget.dbg 2>&1
+
+Where ``<build>`` is the build directory, ``src`` is the source directory,
+and ``<target>`` is the name of the target.
+To double check that the above command is what is needed, one can capture the
+exact TableGen command from a build by using:
+
+.. code-block:: shell
+
+ $ VERBOSE=1 make ...
+
+and search for ``llvm-tblgen`` commands in the output.
+
+
+Instruction Relation Mapping
+----------------------------
+
+This TableGen feature is used to relate instructions with each other. It is
+particularly useful when you have multiple instruction formats and need to
+switch between them after instruction selection. This entire feature is driven
+by relation models which can be defined in ``XXXInstrInfo.td`` files
+according to the target-specific instruction set. Relation models are defined
+using ``InstrMapping`` class as a base. TableGen parses all the models
+and generates instruction relation maps using the specified information.
+Relation maps are emitted as tables in the ``XXXGenInstrInfo.inc`` file
+along with the functions to query them. For the detailed information on how to
+use this feature, please refer to :doc:`HowToUseInstrMappings`.
+
+Implement a subclass of ``TargetInstrInfo``
+-------------------------------------------
+
+The final step is to hand code portions of ``XXXInstrInfo``, which implements
+the interface described in ``TargetInstrInfo.h`` (see :ref:`TargetInstrInfo`).
+These functions return ``0`` or a Boolean or they assert, unless overridden.
+Here's a list of functions that are overridden for the SPARC implementation in
+``SparcInstrInfo.cpp``:
+
+* ``isLoadFromStackSlot`` --- If the specified machine instruction is a direct
+ load from a stack slot, return the register number of the destination and the
+ ``FrameIndex`` of the stack slot.
+
+* ``isStoreToStackSlot`` --- If the specified machine instruction is a direct
+ store to a stack slot, return the register number of the destination and the
+ ``FrameIndex`` of the stack slot.
+
+* ``copyPhysReg`` --- Copy values between a pair of physical registers.
+
+* ``storeRegToStackSlot`` --- Store a register value to a stack slot.
+
+* ``loadRegFromStackSlot`` --- Load a register value from a stack slot.
+
+* ``storeRegToAddr`` --- Store a register value to memory.
+
+* ``loadRegFromAddr`` --- Load a register value from memory.
+
+* ``foldMemoryOperand`` --- Attempt to combine instructions of any load or
+ store instruction for the specified operand(s).
+
+Branch Folding and If Conversion
+--------------------------------
+
+Performance can be improved by combining instructions or by eliminating
+instructions that are never reached. The ``AnalyzeBranch`` method in
+``XXXInstrInfo`` may be implemented to examine conditional instructions and
+remove unnecessary instructions. ``AnalyzeBranch`` looks at the end of a
+machine basic block (MBB) for opportunities for improvement, such as branch
+folding and if conversion. The ``BranchFolder`` and ``IfConverter`` machine
+function passes (see the source files ``BranchFolding.cpp`` and
+``IfConversion.cpp`` in the ``lib/CodeGen`` directory) call ``AnalyzeBranch``
+to improve the control flow graph that represents the instructions.
+
+Several implementations of ``AnalyzeBranch`` (for ARM, Alpha, and X86) can be
+examined as models for your own ``AnalyzeBranch`` implementation. Since SPARC
+does not implement a useful ``AnalyzeBranch``, the ARM target implementation is
+shown below.
+
+``AnalyzeBranch`` returns a Boolean value and takes four parameters:
+
+* ``MachineBasicBlock &MBB`` --- The incoming block to be examined.
+
+* ``MachineBasicBlock *&TBB`` --- A destination block that is returned. For a
+ conditional branch that evaluates to true, ``TBB`` is the destination.
+
+* ``MachineBasicBlock *&FBB`` --- For a conditional branch that evaluates to
+ false, ``FBB`` is returned as the destination.
+
+* ``std::vector<MachineOperand> &Cond`` --- List of operands to evaluate a
+ condition for a conditional branch.
+
+In the simplest case, if a block ends without a branch, then it falls through
+to the successor block. No destination blocks are specified for either ``TBB``
+or ``FBB``, so both parameters return ``NULL``. The start of the
+``AnalyzeBranch`` (see code below for the ARM target) shows the function
+parameters and the code for the simplest case.
+
+.. code-block:: c++
+
+ bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
+ MachineBasicBlock *&TBB,
+ MachineBasicBlock *&FBB,
+ std::vector<MachineOperand> &Cond) const
+ {
+ MachineBasicBlock::iterator I = MBB.end();
+ if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
+ return false;
+
+If a block ends with a single unconditional branch instruction, then
+``AnalyzeBranch`` (shown below) should return the destination of that branch in
+the ``TBB`` parameter.
+
+.. code-block:: c++
+
+ if (LastOpc == ARM::B || LastOpc == ARM::tB) {
+ TBB = LastInst->getOperand(0).getMBB();
+ return false;
+ }
+
+If a block ends with two unconditional branches, then the second branch is
+never reached. In that situation, as shown below, remove the last branch
+instruction and return the penultimate branch in the ``TBB`` parameter.
+
+.. code-block:: c++
+
+ if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
+ (LastOpc == ARM::B || LastOpc == ARM::tB)) {
+ TBB = SecondLastInst->getOperand(0).getMBB();
+ I = LastInst;
+ I->eraseFromParent();
+ return false;
+ }
+
+A block may end with a single conditional branch instruction that falls through
+to successor block if the condition evaluates to false. In that case,
+``AnalyzeBranch`` (shown below) should return the destination of that
+conditional branch in the ``TBB`` parameter and a list of operands in the
+``Cond`` parameter to evaluate the condition.
+
+.. code-block:: c++
+
+ if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
+ // Block ends with fall-through condbranch.
+ TBB = LastInst->getOperand(0).getMBB();
+ Cond.push_back(LastInst->getOperand(1));
+ Cond.push_back(LastInst->getOperand(2));
+ return false;
+ }
+
+If a block ends with both a conditional branch and an ensuing unconditional
+branch, then ``AnalyzeBranch`` (shown below) should return the conditional
+branch destination (assuming it corresponds to a conditional evaluation of
+"``true``") in the ``TBB`` parameter and the unconditional branch destination
+in the ``FBB`` (corresponding to a conditional evaluation of "``false``"). A
+list of operands to evaluate the condition should be returned in the ``Cond``
+parameter.
+
+.. code-block:: c++
+
+ unsigned SecondLastOpc = SecondLastInst->getOpcode();
+
+ if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
+ (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
+ TBB = SecondLastInst->getOperand(0).getMBB();
+ Cond.push_back(SecondLastInst->getOperand(1));
+ Cond.push_back(SecondLastInst->getOperand(2));
+ FBB = LastInst->getOperand(0).getMBB();
+ return false;
+ }
+
+For the last two cases (ending with a single conditional branch or ending with
+one conditional and one unconditional branch), the operands returned in the
+``Cond`` parameter can be passed to methods of other instructions to create new
+branches or perform other operations. An implementation of ``AnalyzeBranch``
+requires the helper methods ``RemoveBranch`` and ``InsertBranch`` to manage
+subsequent operations.
+
+``AnalyzeBranch`` should return false indicating success in most circumstances.
+``AnalyzeBranch`` should only return true when the method is stumped about what
+to do, for example, if a block has three terminating branches.
+``AnalyzeBranch`` may return true if it encounters a terminator it cannot
+handle, such as an indirect branch.
+
+.. _instruction-selector:
+
+Instruction Selector
+====================
+
+LLVM uses a ``SelectionDAG`` to represent LLVM IR instructions, and nodes of
+the ``SelectionDAG`` ideally represent native target instructions. During code
+generation, instruction selection passes are performed to convert non-native
+DAG instructions into native target-specific instructions. The pass described
+in ``XXXISelDAGToDAG.cpp`` is used to match patterns and perform DAG-to-DAG
+instruction selection. Optionally, a pass may be defined (in
+``XXXBranchSelector.cpp``) to perform similar DAG-to-DAG operations for branch
+instructions. Later, the code in ``XXXISelLowering.cpp`` replaces or removes
+operations and data types not supported natively (legalizes) in a
+``SelectionDAG``.
+
+TableGen generates code for instruction selection using the following target
+description input files:
+
+* ``XXXInstrInfo.td`` --- Contains definitions of instructions in a
+ target-specific instruction set, generates ``XXXGenDAGISel.inc``, which is
+ included in ``XXXISelDAGToDAG.cpp``.
+
+* ``XXXCallingConv.td`` --- Contains the calling and return value conventions
+ for the target architecture, and it generates ``XXXGenCallingConv.inc``,
+ which is included in ``XXXISelLowering.cpp``.
+
+The implementation of an instruction selection pass must include a header that
+declares the ``FunctionPass`` class or a subclass of ``FunctionPass``. In
+``XXXTargetMachine.cpp``, a Pass Manager (PM) should add each instruction
+selection pass into the queue of passes to run.
+
+The LLVM static compiler (``llc``) is an excellent tool for visualizing the
+contents of DAGs. To display the ``SelectionDAG`` before or after specific
+processing phases, use the command line options for ``llc``, described at
+:ref:`SelectionDAG-Process`.
+
+To describe instruction selector behavior, you should add patterns for lowering
+LLVM code into a ``SelectionDAG`` as the last parameter of the instruction
+definitions in ``XXXInstrInfo.td``. For example, in ``SparcInstrInfo.td``,
+this entry defines a register store operation, and the last parameter describes
+a pattern with the store DAG operator.
+
+.. code-block:: text
+
+ def STrr : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
+ "st $src, [$addr]", [(store i32:$src, ADDRrr:$addr)]>;
+
+``ADDRrr`` is a memory mode that is also defined in ``SparcInstrInfo.td``:
+
+.. code-block:: text
+
+ def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;
+
+The definition of ``ADDRrr`` refers to ``SelectADDRrr``, which is a function
+defined in an implementation of the Instructor Selector (such as
+``SparcISelDAGToDAG.cpp``).
+
+In ``lib/Target/TargetSelectionDAG.td``, the DAG operator for store is defined
+below:
+
+.. code-block:: text
+
+ def store : PatFrag<(ops node:$val, node:$ptr),
+ (st node:$val, node:$ptr), [{
+ if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
+ return !ST->isTruncatingStore() &&
+ ST->getAddressingMode() == ISD::UNINDEXED;
+ return false;
+ }]>;
+
+``XXXInstrInfo.td`` also generates (in ``XXXGenDAGISel.inc``) the
+``SelectCode`` method that is used to call the appropriate processing method
+for an instruction. In this example, ``SelectCode`` calls ``Select_ISD_STORE``
+for the ``ISD::STORE`` opcode.
+
+.. code-block:: c++
+
+ SDNode *SelectCode(SDValue N) {
+ ...
+ MVT::ValueType NVT = N.getNode()->getValueType(0);
+ switch (N.getOpcode()) {
+ case ISD::STORE: {
+ switch (NVT) {
+ default:
+ return Select_ISD_STORE(N);
+ break;
+ }
+ break;
+ }
+ ...
+
+The pattern for ``STrr`` is matched, so elsewhere in ``XXXGenDAGISel.inc``,
+code for ``STrr`` is created for ``Select_ISD_STORE``. The ``Emit_22`` method
+is also generated in ``XXXGenDAGISel.inc`` to complete the processing of this
+instruction.
+
+.. code-block:: c++
+
+ SDNode *Select_ISD_STORE(const SDValue &N) {
+ SDValue Chain = N.getOperand(0);
+ if (Predicate_store(N.getNode())) {
+ SDValue N1 = N.getOperand(1);
+ SDValue N2 = N.getOperand(2);
+ SDValue CPTmp0;
+ SDValue CPTmp1;
+
+ // Pattern: (st:void i32:i32:$src,
+ // ADDRrr:i32:$addr)<<P:Predicate_store>>
+ // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
+ // Pattern complexity = 13 cost = 1 size = 0
+ if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
+ N1.getNode()->getValueType(0) == MVT::i32 &&
+ N2.getNode()->getValueType(0) == MVT::i32) {
+ return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
+ }
+ ...
+
+The SelectionDAG Legalize Phase
+-------------------------------
+
+The Legalize phase converts a DAG to use types and operations that are natively
+supported by the target. For natively unsupported types and operations, you
+need to add code to the target-specific ``XXXTargetLowering`` implementation to
+convert unsupported types and operations to supported ones.
+
+In the constructor for the ``XXXTargetLowering`` class, first use the
+``addRegisterClass`` method to specify which types are supported and which
+register classes are associated with them. The code for the register classes
+are generated by TableGen from ``XXXRegisterInfo.td`` and placed in
+``XXXGenRegisterInfo.h.inc``. For example, the implementation of the
+constructor for the SparcTargetLowering class (in ``SparcISelLowering.cpp``)
+starts with the following code:
+
+.. code-block:: c++
+
+ addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
+ addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
+ addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
+
+You should examine the node types in the ``ISD`` namespace
+(``include/llvm/CodeGen/SelectionDAGNodes.h``) and determine which operations
+the target natively supports. For operations that do **not** have native
+support, add a callback to the constructor for the ``XXXTargetLowering`` class,
+so the instruction selection process knows what to do. The ``TargetLowering``
+class callback methods (declared in ``llvm/Target/TargetLowering.h``) are:
+
+* ``setOperationAction`` --- General operation.
+* ``setLoadExtAction`` --- Load with extension.
+* ``setTruncStoreAction`` --- Truncating store.
+* ``setIndexedLoadAction`` --- Indexed load.
+* ``setIndexedStoreAction`` --- Indexed store.
+* ``setConvertAction`` --- Type conversion.
+* ``setCondCodeAction`` --- Support for a given condition code.
+
+Note: on older releases, ``setLoadXAction`` is used instead of
+``setLoadExtAction``. Also, on older releases, ``setCondCodeAction`` may not
+be supported. Examine your release to see what methods are specifically
+supported.
+
+These callbacks are used to determine that an operation does or does not work
+with a specified type (or types). And in all cases, the third parameter is a
+``LegalAction`` type enum value: ``Promote``, ``Expand``, ``Custom``, or
+``Legal``. ``SparcISelLowering.cpp`` contains examples of all four
+``LegalAction`` values.
+
+Promote
+^^^^^^^
+
+For an operation without native support for a given type, the specified type
+may be promoted to a larger type that is supported. For example, SPARC does
+not support a sign-extending load for Boolean values (``i1`` type), so in
+``SparcISelLowering.cpp`` the third parameter below, ``Promote``, changes
+``i1`` type values to a large type before loading.
+
+.. code-block:: c++
+
+ setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
+
+Expand
+^^^^^^
+
+For a type without native support, a value may need to be broken down further,
+rather than promoted. For an operation without native support, a combination
+of other operations may be used to similar effect. In SPARC, the
+floating-point sine and cosine trig operations are supported by expansion to
+other operations, as indicated by the third parameter, ``Expand``, to
+``setOperationAction``:
+
+.. code-block:: c++
+
+ setOperationAction(ISD::FSIN, MVT::f32, Expand);
+ setOperationAction(ISD::FCOS, MVT::f32, Expand);
+
+Custom
+^^^^^^
+
+For some operations, simple type promotion or operation expansion may be
+insufficient. In some cases, a special intrinsic function must be implemented.
+
+For example, a constant value may require special treatment, or an operation
+may require spilling and restoring registers in the stack and working with
+register allocators.
+
+As seen in ``SparcISelLowering.cpp`` code below, to perform a type conversion
+from a floating point value to a signed integer, first the
+``setOperationAction`` should be called with ``Custom`` as the third parameter:
+
+.. code-block:: c++
+
+ setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
+
+In the ``LowerOperation`` method, for each ``Custom`` operation, a case
+statement should be added to indicate what function to call. In the following
+code, an ``FP_TO_SINT`` opcode will call the ``LowerFP_TO_SINT`` method:
+
+.. code-block:: c++
+
+ SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
+ switch (Op.getOpcode()) {
+ case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
+ ...
+ }
+ }
+
+Finally, the ``LowerFP_TO_SINT`` method is implemented, using an FP register to
+convert the floating-point value to an integer.
+
+.. code-block:: c++
+
+ static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
+ assert(Op.getValueType() == MVT::i32);
+ Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
+ return DAG.getNode(ISD::BITCAST, MVT::i32, Op);
+ }
+
+Legal
+^^^^^
+
+The ``Legal`` ``LegalizeAction`` enum value simply indicates that an operation
+**is** natively supported. ``Legal`` represents the default condition, so it
+is rarely used. In ``SparcISelLowering.cpp``, the action for ``CTPOP`` (an
+operation to count the bits set in an integer) is natively supported only for
+SPARC v9. The following code enables the ``Expand`` conversion technique for
+non-v9 SPARC implementations.
+
+.. code-block:: c++
+
+ setOperationAction(ISD::CTPOP, MVT::i32, Expand);
+ ...
+ if (TM.getSubtarget<SparcSubtarget>().isV9())
+ setOperationAction(ISD::CTPOP, MVT::i32, Legal);
+
+Calling Conventions
+-------------------
+
+To support target-specific calling conventions, ``XXXGenCallingConv.td`` uses
+interfaces (such as ``CCIfType`` and ``CCAssignToReg``) that are defined in
+``lib/Target/TargetCallingConv.td``. TableGen can take the target descriptor
+file ``XXXGenCallingConv.td`` and generate the header file
+``XXXGenCallingConv.inc``, which is typically included in
+``XXXISelLowering.cpp``. You can use the interfaces in
+``TargetCallingConv.td`` to specify:
+
+* The order of parameter allocation.
+
+* Where parameters and return values are placed (that is, on the stack or in
+ registers).
+
+* Which registers may be used.
+
+* Whether the caller or callee unwinds the stack.
+
+The following example demonstrates the use of the ``CCIfType`` and
+``CCAssignToReg`` interfaces. If the ``CCIfType`` predicate is true (that is,
+if the current argument is of type ``f32`` or ``f64``), then the action is
+performed. In this case, the ``CCAssignToReg`` action assigns the argument
+value to the first available register: either ``R0`` or ``R1``.
+
+.. code-block:: text
+
+ CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>
+
+``SparcCallingConv.td`` contains definitions for a target-specific return-value
+calling convention (``RetCC_Sparc32``) and a basic 32-bit C calling convention
+(``CC_Sparc32``). The definition of ``RetCC_Sparc32`` (shown below) indicates
+which registers are used for specified scalar return types. A single-precision
+float is returned to register ``F0``, and a double-precision float goes to
+register ``D0``. A 32-bit integer is returned in register ``I0`` or ``I1``.
+
+.. code-block:: text
+
+ def RetCC_Sparc32 : CallingConv<[
+ CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
+ CCIfType<[f32], CCAssignToReg<[F0]>>,
+ CCIfType<[f64], CCAssignToReg<[D0]>>
+ ]>;
+
+The definition of ``CC_Sparc32`` in ``SparcCallingConv.td`` introduces
+``CCAssignToStack``, which assigns the value to a stack slot with the specified
+size and alignment. In the example below, the first parameter, 4, indicates
+the size of the slot, and the second parameter, also 4, indicates the stack
+alignment along 4-byte units. (Special cases: if size is zero, then the ABI
+size is used; if alignment is zero, then the ABI alignment is used.)
+
+.. code-block:: text
+
+ def CC_Sparc32 : CallingConv<[
+ // All arguments get passed in integer registers if there is space.
+ CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
+ CCAssignToStack<4, 4>
+ ]>;
+
+``CCDelegateTo`` is another commonly used interface, which tries to find a
+specified sub-calling convention, and, if a match is found, it is invoked. In
+the following example (in ``X86CallingConv.td``), the definition of
+``RetCC_X86_32_C`` ends with ``CCDelegateTo``. After the current value is
+assigned to the register ``ST0`` or ``ST1``, the ``RetCC_X86Common`` is
+invoked.
+
+.. code-block:: text
+
+ def RetCC_X86_32_C : CallingConv<[
+ CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
+ CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
+ CCDelegateTo<RetCC_X86Common>
+ ]>;
+
+``CCIfCC`` is an interface that attempts to match the given name to the current
+calling convention. If the name identifies the current calling convention,
+then a specified action is invoked. In the following example (in
+``X86CallingConv.td``), if the ``Fast`` calling convention is in use, then
+``RetCC_X86_32_Fast`` is invoked. If the ``SSECall`` calling convention is in
+use, then ``RetCC_X86_32_SSE`` is invoked.
+
+.. code-block:: text
+
+ def RetCC_X86_32 : CallingConv<[
+ CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
+ CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
+ CCDelegateTo<RetCC_X86_32_C>
+ ]>;
+
+Other calling convention interfaces include:
+
+* ``CCIf <predicate, action>`` --- If the predicate matches, apply the action.
+
+* ``CCIfInReg <action>`` --- If the argument is marked with the "``inreg``"
+ attribute, then apply the action.
+
+* ``CCIfNest <action>`` --- If the argument is marked with the "``nest``"
+ attribute, then apply the action.
+
+* ``CCIfNotVarArg <action>`` --- If the current function does not take a
+ variable number of arguments, apply the action.
+
+* ``CCAssignToRegWithShadow <registerList, shadowList>`` --- similar to
+ ``CCAssignToReg``, but with a shadow list of registers.
+
+* ``CCPassByVal <size, align>`` --- Assign value to a stack slot with the
+ minimum specified size and alignment.
+
+* ``CCPromoteToType <type>`` --- Promote the current value to the specified
+ type.
+
+* ``CallingConv <[actions]>`` --- Define each calling convention that is
+ supported.
+
+Assembly Printer
+================
+
+During the code emission stage, the code generator may utilize an LLVM pass to
+produce assembly output. To do this, you want to implement the code for a
+printer that converts LLVM IR to a GAS-format assembly language for your target
+machine, using the following steps:
+
+* Define all the assembly strings for your target, adding them to the
+ instructions defined in the ``XXXInstrInfo.td`` file. (See
+ :ref:`instruction-set`.) TableGen will produce an output file
+ (``XXXGenAsmWriter.inc``) with an implementation of the ``printInstruction``
+ method for the ``XXXAsmPrinter`` class.
+
+* Write ``XXXTargetAsmInfo.h``, which contains the bare-bones declaration of
+ the ``XXXTargetAsmInfo`` class (a subclass of ``TargetAsmInfo``).
+
+* Write ``XXXTargetAsmInfo.cpp``, which contains target-specific values for
+ ``TargetAsmInfo`` properties and sometimes new implementations for methods.
+
+* Write ``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that
+ performs the LLVM-to-assembly conversion.
+
+The code in ``XXXTargetAsmInfo.h`` is usually a trivial declaration of the
+``XXXTargetAsmInfo`` class for use in ``XXXTargetAsmInfo.cpp``. Similarly,
+``XXXTargetAsmInfo.cpp`` usually has a few declarations of ``XXXTargetAsmInfo``
+replacement values that override the default values in ``TargetAsmInfo.cpp``.
+For example in ``SparcTargetAsmInfo.cpp``:
+
+.. code-block:: c++
+
+ SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
+ Data16bitsDirective = "\t.half\t";
+ Data32bitsDirective = "\t.word\t";
+ Data64bitsDirective = 0; // .xword is only supported by V9.
+ ZeroDirective = "\t.skip\t";
+ CommentString = "!";
+ ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
+ }
+
+The X86 assembly printer implementation (``X86TargetAsmInfo``) is an example
+where the target specific ``TargetAsmInfo`` class uses an overridden methods:
+``ExpandInlineAsm``.
+
+A target-specific implementation of ``AsmPrinter`` is written in
+``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that converts
+the LLVM to printable assembly. The implementation must include the following
+headers that have declarations for the ``AsmPrinter`` and
+``MachineFunctionPass`` classes. The ``MachineFunctionPass`` is a subclass of
+``FunctionPass``.
+
+.. code-block:: c++
+
+ #include "llvm/CodeGen/AsmPrinter.h"
+ #include "llvm/CodeGen/MachineFunctionPass.h"
+
+As a ``FunctionPass``, ``AsmPrinter`` first calls ``doInitialization`` to set
+up the ``AsmPrinter``. In ``SparcAsmPrinter``, a ``Mangler`` object is
+instantiated to process variable names.
+
+In ``XXXAsmPrinter.cpp``, the ``runOnMachineFunction`` method (declared in
+``MachineFunctionPass``) must be implemented for ``XXXAsmPrinter``. In
+``MachineFunctionPass``, the ``runOnFunction`` method invokes
+``runOnMachineFunction``. Target-specific implementations of
+``runOnMachineFunction`` differ, but generally do the following to process each
+machine function:
+
+* Call ``SetupMachineFunction`` to perform initialization.
+
+* Call ``EmitConstantPool`` to print out (to the output stream) constants which
+ have been spilled to memory.
+
+* Call ``EmitJumpTableInfo`` to print out jump tables used by the current
+ function.
+
+* Print out the label for the current function.
+
+* Print out the code for the function, including basic block labels and the
+ assembly for the instruction (using ``printInstruction``)
+
+The ``XXXAsmPrinter`` implementation must also include the code generated by
+TableGen that is output in the ``XXXGenAsmWriter.inc`` file. The code in
+``XXXGenAsmWriter.inc`` contains an implementation of the ``printInstruction``
+method that may call these methods:
+
+* ``printOperand``
+* ``printMemOperand``
+* ``printCCOperand`` (for conditional statements)
+* ``printDataDirective``
+* ``printDeclare``
+* ``printImplicitDef``
+* ``printInlineAsm``
+
+The implementations of ``printDeclare``, ``printImplicitDef``,
+``printInlineAsm``, and ``printLabel`` in ``AsmPrinter.cpp`` are generally
+adequate for printing assembly and do not need to be overridden.
+
+The ``printOperand`` method is implemented with a long ``switch``/``case``
+statement for the type of operand: register, immediate, basic block, external
+symbol, global address, constant pool index, or jump table index. For an
+instruction with a memory address operand, the ``printMemOperand`` method
+should be implemented to generate the proper output. Similarly,
+``printCCOperand`` should be used to print a conditional operand.
+
+``doFinalization`` should be overridden in ``XXXAsmPrinter``, and it should be
+called to shut down the assembly printer. During ``doFinalization``, global
+variables and constants are printed to output.
+
+Subtarget Support
+=================
+
+Subtarget support is used to inform the code generation process of instruction
+set variations for a given chip set. For example, the LLVM SPARC
+implementation provided covers three major versions of the SPARC microprocessor
+architecture: Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a
+64-bit architecture), and the UltraSPARC architecture. V8 has 16
+double-precision floating-point registers that are also usable as either 32
+single-precision or 8 quad-precision registers. V8 is also purely big-endian.
+V9 has 32 double-precision floating-point registers that are also usable as 16
+quad-precision registers, but cannot be used as single-precision registers.
+The UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
+extensions.
+
+If subtarget support is needed, you should implement a target-specific
+``XXXSubtarget`` class for your architecture. This class should process the
+command-line options ``-mcpu=`` and ``-mattr=``.
+
+TableGen uses definitions in the ``Target.td`` and ``Sparc.td`` files to
+generate code in ``SparcGenSubtarget.inc``. In ``Target.td``, shown below, the
+``SubtargetFeature`` interface is defined. The first 4 string parameters of
+the ``SubtargetFeature`` interface are a feature name, an attribute set by the
+feature, the value of the attribute, and a description of the feature. (The
+fifth parameter is a list of features whose presence is implied, and its
+default value is an empty array.)
+
+.. code-block:: text
+
+ class SubtargetFeature<string n, string a, string v, string d,
+ list<SubtargetFeature> i = []> {
+ string Name = n;
+ string Attribute = a;
+ string Value = v;
+ string Desc = d;
+ list<SubtargetFeature> Implies = i;
+ }
+
+In the ``Sparc.td`` file, the ``SubtargetFeature`` is used to define the
+following features.
+
+.. code-block:: text
+
+ def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
+ "Enable SPARC-V9 instructions">;
+ def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
+ "V8DeprecatedInsts", "true",
+ "Enable deprecated V8 instructions in V9 mode">;
+ def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
+ "Enable UltraSPARC Visual Instruction Set extensions">;
+
+Elsewhere in ``Sparc.td``, the ``Proc`` class is defined and then is used to
+define particular SPARC processor subtypes that may have the previously
+described features.
+
+.. code-block:: text
+
+ class Proc<string Name, list<SubtargetFeature> Features>
+ : Processor<Name, NoItineraries, Features>;
+
+ def : Proc<"generic", []>;
+ def : Proc<"v8", []>;
+ def : Proc<"supersparc", []>;
+ def : Proc<"sparclite", []>;
+ def : Proc<"f934", []>;
+ def : Proc<"hypersparc", []>;
+ def : Proc<"sparclite86x", []>;
+ def : Proc<"sparclet", []>;
+ def : Proc<"tsc701", []>;
+ def : Proc<"v9", [FeatureV9]>;
+ def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated]>;
+ def : Proc<"ultrasparc3", [FeatureV9, FeatureV8Deprecated]>;
+ def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
+
+From ``Target.td`` and ``Sparc.td`` files, the resulting
+``SparcGenSubtarget.inc`` specifies enum values to identify the features,
+arrays of constants to represent the CPU features and CPU subtypes, and the
+``ParseSubtargetFeatures`` method that parses the features string that sets
+specified subtarget options. The generated ``SparcGenSubtarget.inc`` file
+should be included in the ``SparcSubtarget.cpp``. The target-specific
+implementation of the ``XXXSubtarget`` method should follow this pseudocode:
+
+.. code-block:: c++
+
+ XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
+ // Set the default features
+ // Determine default and user specified characteristics of the CPU
+ // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
+ // Perform any additional operations
+ }
+
+JIT Support
+===========
+
+The implementation of a target machine optionally includes a Just-In-Time (JIT)
+code generator that emits machine code and auxiliary structures as binary
+output that can be written directly to memory. To do this, implement JIT code
+generation by performing the following steps:
+
+* Write an ``XXXCodeEmitter.cpp`` file that contains a machine function pass
+ that transforms target-machine instructions into relocatable machine
+ code.
+
+* Write an ``XXXJITInfo.cpp`` file that implements the JIT interfaces for
+ target-specific code-generation activities, such as emitting machine code and
+ stubs.
+
+* Modify ``XXXTargetMachine`` so that it provides a ``TargetJITInfo`` object
+ through its ``getJITInfo`` method.
+
+There are several different approaches to writing the JIT support code. For
+instance, TableGen and target descriptor files may be used for creating a JIT
+code generator, but are not mandatory. For the Alpha and PowerPC target
+machines, TableGen is used to generate ``XXXGenCodeEmitter.inc``, which
+contains the binary coding of machine instructions and the
+``getBinaryCodeForInstr`` method to access those codes. Other JIT
+implementations do not.
+
+Both ``XXXJITInfo.cpp`` and ``XXXCodeEmitter.cpp`` must include the
+``llvm/CodeGen/MachineCodeEmitter.h`` header file that defines the
+``MachineCodeEmitter`` class containing code for several callback functions
+that write data (in bytes, words, strings, etc.) to the output stream.
+
+Machine Code Emitter
+--------------------
+
+In ``XXXCodeEmitter.cpp``, a target-specific of the ``Emitter`` class is
+implemented as a function pass (subclass of ``MachineFunctionPass``). The
+target-specific implementation of ``runOnMachineFunction`` (invoked by
+``runOnFunction`` in ``MachineFunctionPass``) iterates through the
+``MachineBasicBlock`` calls ``emitInstruction`` to process each instruction and
+emit binary code. ``emitInstruction`` is largely implemented with case
+statements on the instruction types defined in ``XXXInstrInfo.h``. For
+example, in ``X86CodeEmitter.cpp``, the ``emitInstruction`` method is built
+around the following ``switch``/``case`` statements:
+
+.. code-block:: c++
+
+ switch (Desc->TSFlags & X86::FormMask) {
+ case X86II::Pseudo: // for not yet implemented instructions
+ ... // or pseudo-instructions
+ break;
+ case X86II::RawFrm: // for instructions with a fixed opcode value
+ ...
+ break;
+ case X86II::AddRegFrm: // for instructions that have one register operand
+ ... // added to their opcode
+ break;
+ case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
+ ... // to specify a destination (register)
+ break;
+ case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
+ ... // to specify a destination (memory)
+ break;
+ case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
+ ... // to specify a source (register)
+ break;
+ case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
+ ... // to specify a source (memory)
+ break;
+ case X86II::MRM0r: case X86II::MRM1r: // for instructions that operate on
+ case X86II::MRM2r: case X86II::MRM3r: // a REGISTER r/m operand and
+ case X86II::MRM4r: case X86II::MRM5r: // use the Mod/RM byte and a field
+ case X86II::MRM6r: case X86II::MRM7r: // to hold extended opcode data
+ ...
+ break;
+ case X86II::MRM0m: case X86II::MRM1m: // for instructions that operate on
+ case X86II::MRM2m: case X86II::MRM3m: // a MEMORY r/m operand and
+ case X86II::MRM4m: case X86II::MRM5m: // use the Mod/RM byte and a field
+ case X86II::MRM6m: case X86II::MRM7m: // to hold extended opcode data
+ ...
+ break;
+ case X86II::MRMInitReg: // for instructions whose source and
+ ... // destination are the same register
+ break;
+ }
+
+The implementations of these case statements often first emit the opcode and
+then get the operand(s). Then depending upon the operand, helper methods may
+be called to process the operand(s). For example, in ``X86CodeEmitter.cpp``,
+for the ``X86II::AddRegFrm`` case, the first data emitted (by ``emitByte``) is
+the opcode added to the register operand. Then an object representing the
+machine operand, ``MO1``, is extracted. The helper methods such as
+``isImmediate``, ``isGlobalAddress``, ``isExternalSymbol``,
+``isConstantPoolIndex``, and ``isJumpTableIndex`` determine the operand type.
+(``X86CodeEmitter.cpp`` also has private methods such as ``emitConstant``,
+``emitGlobalAddress``, ``emitExternalSymbolAddress``, ``emitConstPoolAddress``,
+and ``emitJumpTableAddress`` that emit the data into the output stream.)
+
+.. code-block:: c++
+
+ case X86II::AddRegFrm:
+ MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));
+
+ if (CurOp != NumOps) {
+ const MachineOperand &MO1 = MI.getOperand(CurOp++);
+ unsigned Size = X86InstrInfo::sizeOfImm(Desc);
+ if (MO1.isImmediate())
+ emitConstant(MO1.getImm(), Size);
+ else {
+ unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
+ : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
+ if (Opcode == X86::MOV64ri)
+ rt = X86::reloc_absolute_dword; // FIXME: add X86II flag?
+ if (MO1.isGlobalAddress()) {
+ bool NeedStub = isa<Function>(MO1.getGlobal());
+ bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
+ emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
+ NeedStub, isLazy);
+ } else if (MO1.isExternalSymbol())
+ emitExternalSymbolAddress(MO1.getSymbolName(), rt);
+ else if (MO1.isConstantPoolIndex())
+ emitConstPoolAddress(MO1.getIndex(), rt);
+ else if (MO1.isJumpTableIndex())
+ emitJumpTableAddress(MO1.getIndex(), rt);
+ }
+ }
+ break;
+
+In the previous example, ``XXXCodeEmitter.cpp`` uses the variable ``rt``, which
+is a ``RelocationType`` enum that may be used to relocate addresses (for
+example, a global address with a PIC base offset). The ``RelocationType`` enum
+for that target is defined in the short target-specific ``XXXRelocations.h``
+file. The ``RelocationType`` is used by the ``relocate`` method defined in
+``XXXJITInfo.cpp`` to rewrite addresses for referenced global symbols.
+
+For example, ``X86Relocations.h`` specifies the following relocation types for
+the X86 addresses. In all four cases, the relocated value is added to the
+value already in memory. For ``reloc_pcrel_word`` and ``reloc_picrel_word``,
+there is an additional initial adjustment.
+
+.. code-block:: c++
+
+ enum RelocationType {
+ reloc_pcrel_word = 0, // add reloc value after adjusting for the PC loc
+ reloc_picrel_word = 1, // add reloc value after adjusting for the PIC base
+ reloc_absolute_word = 2, // absolute relocation; no additional adjustment
+ reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
+ };
+
+Target JIT Info
+---------------
+
+``XXXJITInfo.cpp`` implements the JIT interfaces for target-specific
+code-generation activities, such as emitting machine code and stubs. At
+minimum, a target-specific version of ``XXXJITInfo`` implements the following:
+
+* ``getLazyResolverFunction`` --- Initializes the JIT, gives the target a
+ function that is used for compilation.
+
+* ``emitFunctionStub`` --- Returns a native function with a specified address
+ for a callback function.
+
+* ``relocate`` --- Changes the addresses of referenced globals, based on
+ relocation types.
+
+* Callback function that are wrappers to a function stub that is used when the
+ real target is not initially known.
+
+``getLazyResolverFunction`` is generally trivial to implement. It makes the
+incoming parameter as the global ``JITCompilerFunction`` and returns the
+callback function that will be used a function wrapper. For the Alpha target
+(in ``AlphaJITInfo.cpp``), the ``getLazyResolverFunction`` implementation is
+simply:
+
+.. code-block:: c++
+
+ TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
+ JITCompilerFn F) {
+ JITCompilerFunction = F;
+ return AlphaCompilationCallback;
+ }
+
+For the X86 target, the ``getLazyResolverFunction`` implementation is a little
+more complicated, because it returns a different callback function for
+processors with SSE instructions and XMM registers.
+
+The callback function initially saves and later restores the callee register
+values, incoming arguments, and frame and return address. The callback
+function needs low-level access to the registers or stack, so it is typically
+implemented with assembler.
+
Added: www-releases/trunk/9.0.0/docs/_sources/WritingAnLLVMPass.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/WritingAnLLVMPass.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/WritingAnLLVMPass.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/WritingAnLLVMPass.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,1456 @@
+====================
+Writing an LLVM Pass
+====================
+
+.. program:: opt
+
+.. contents::
+ :local:
+
+Introduction --- What is a pass?
+================================
+
+The LLVM Pass Framework is an important part of the LLVM system, because LLVM
+passes are where most of the interesting parts of the compiler exist. Passes
+perform the transformations and optimizations that make up the compiler, they
+build the analysis results that are used by these transformations, and they
+are, above all, a structuring technique for compiler code.
+
+All LLVM passes are subclasses of the `Pass
+<http://llvm.org/doxygen/classllvm_1_1Pass.html>`_ class, which implement
+functionality by overriding virtual methods inherited from ``Pass``. Depending
+on how your pass works, you should inherit from the :ref:`ModulePass
+<writing-an-llvm-pass-ModulePass>` , :ref:`CallGraphSCCPass
+<writing-an-llvm-pass-CallGraphSCCPass>`, :ref:`FunctionPass
+<writing-an-llvm-pass-FunctionPass>` , or :ref:`LoopPass
+<writing-an-llvm-pass-LoopPass>`, or :ref:`RegionPass
+<writing-an-llvm-pass-RegionPass>`, or :ref:`BasicBlockPass
+<writing-an-llvm-pass-BasicBlockPass>` classes, which gives the system more
+information about what your pass does, and how it can be combined with other
+passes. One of the main features of the LLVM Pass Framework is that it
+schedules passes to run in an efficient way based on the constraints that your
+pass meets (which are indicated by which class they derive from).
+
+We start by showing you how to construct a pass, everything from setting up the
+code, to compiling, loading, and executing it. After the basics are down, more
+advanced features are discussed.
+
+Quick Start --- Writing hello world
+===================================
+
+Here we describe how to write the "hello world" of passes. The "Hello" pass is
+designed to simply print out the name of non-external functions that exist in
+the program being compiled. It does not modify the program at all, it just
+inspects it. The source code and files for this pass are available in the LLVM
+source tree in the ``lib/Transforms/Hello`` directory.
+
+.. _writing-an-llvm-pass-makefile:
+
+Setting up the build environment
+--------------------------------
+
+First, configure and build LLVM. Next, you need to create a new directory
+somewhere in the LLVM source base. For this example, we'll assume that you
+made ``lib/Transforms/Hello``. Finally, you must set up a build script
+that will compile the source code for the new pass. To do this,
+copy the following into ``CMakeLists.txt``:
+
+.. code-block:: cmake
+
+ add_llvm_library( LLVMHello MODULE
+ Hello.cpp
+
+ PLUGIN_TOOL
+ opt
+ )
+
+and the following line into ``lib/Transforms/CMakeLists.txt``:
+
+.. code-block:: cmake
+
+ add_subdirectory(Hello)
+
+(Note that there is already a directory named ``Hello`` with a sample "Hello"
+pass; you may play with it -- in which case you don't need to modify any
+``CMakeLists.txt`` files -- or, if you want to create everything from scratch,
+use another name.)
+
+This build script specifies that ``Hello.cpp`` file in the current directory
+is to be compiled and linked into a shared object ``$(LEVEL)/lib/LLVMHello.so`` that
+can be dynamically loaded by the :program:`opt` tool via its :option:`-load`
+option. If your operating system uses a suffix other than ``.so`` (such as
+Windows or macOS), the appropriate extension will be used.
+
+Now that we have the build scripts set up, we just need to write the code for
+the pass itself.
+
+.. _writing-an-llvm-pass-basiccode:
+
+Basic code required
+-------------------
+
+Now that we have a way to compile our new pass, we just have to write it.
+Start out with:
+
+.. code-block:: c++
+
+ #include "llvm/Pass.h"
+ #include "llvm/IR/Function.h"
+ #include "llvm/Support/raw_ostream.h"
+
+Which are needed because we are writing a `Pass
+<http://llvm.org/doxygen/classllvm_1_1Pass.html>`_, we are operating on
+`Function <http://llvm.org/doxygen/classllvm_1_1Function.html>`_\ s, and we will
+be doing some printing.
+
+Next we have:
+
+.. code-block:: c++
+
+ using namespace llvm;
+
+... which is required because the functions from the include files live in the
+llvm namespace.
+
+Next we have:
+
+.. code-block:: c++
+
+ namespace {
+
+... which starts out an anonymous namespace. Anonymous namespaces are to C++
+what the "``static``" keyword is to C (at global scope). It makes the things
+declared inside of the anonymous namespace visible only to the current file.
+If you're not familiar with them, consult a decent C++ book for more
+information.
+
+Next, we declare our pass itself:
+
+.. code-block:: c++
+
+ struct Hello : public FunctionPass {
+
+This declares a "``Hello``" class that is a subclass of :ref:`FunctionPass
+<writing-an-llvm-pass-FunctionPass>`. The different builtin pass subclasses
+are described in detail :ref:`later <writing-an-llvm-pass-pass-classes>`, but
+for now, know that ``FunctionPass`` operates on a function at a time.
+
+.. code-block:: c++
+
+ static char ID;
+ Hello() : FunctionPass(ID) {}
+
+This declares pass identifier used by LLVM to identify pass. This allows LLVM
+to avoid using expensive C++ runtime information.
+
+.. code-block:: c++
+
+ bool runOnFunction(Function &F) override {
+ errs() << "Hello: ";
+ errs().write_escaped(F.getName()) << '\n';
+ return false;
+ }
+ }; // end of struct Hello
+ } // end of anonymous namespace
+
+We declare a :ref:`runOnFunction <writing-an-llvm-pass-runOnFunction>` method,
+which overrides an abstract virtual method inherited from :ref:`FunctionPass
+<writing-an-llvm-pass-FunctionPass>`. This is where we are supposed to do our
+thing, so we just print out our message with the name of each function.
+
+.. code-block:: c++
+
+ char Hello::ID = 0;
+
+We initialize pass ID here. LLVM uses ID's address to identify a pass, so
+initialization value is not important.
+
+.. code-block:: c++
+
+ static RegisterPass<Hello> X("hello", "Hello World Pass",
+ false /* Only looks at CFG */,
+ false /* Analysis Pass */);
+
+Lastly, we :ref:`register our class <writing-an-llvm-pass-registration>`
+``Hello``, giving it a command line argument "``hello``", and a name "Hello
+World Pass". The last two arguments describe its behavior: if a pass walks CFG
+without modifying it then the third argument is set to ``true``; if a pass is
+an analysis pass, for example dominator tree pass, then ``true`` is supplied as
+the fourth argument.
+
+If we want to register the pass as a step of an existing pipeline, some extension
+points are provided, e.g. ``PassManagerBuilder::EP_EarlyAsPossible`` to apply our
+pass before any optimization, or ``PassManagerBuilder::EP_FullLinkTimeOptimizationLast``
+to apply it after Link Time Optimizations.
+
+.. code-block:: c++
+
+ static llvm::RegisterStandardPasses Y(
+ llvm::PassManagerBuilder::EP_EarlyAsPossible,
+ [](const llvm::PassManagerBuilder &Builder,
+ llvm::legacy::PassManagerBase &PM) { PM.add(new Hello()); });
+
+As a whole, the ``.cpp`` file looks like:
+
+.. code-block:: c++
+
+ #include "llvm/Pass.h"
+ #include "llvm/IR/Function.h"
+ #include "llvm/Support/raw_ostream.h"
+
+ #include "llvm/IR/LegacyPassManager.h"
+ #include "llvm/Transforms/IPO/PassManagerBuilder.h"
+
+ using namespace llvm;
+
+ namespace {
+ struct Hello : public FunctionPass {
+ static char ID;
+ Hello() : FunctionPass(ID) {}
+
+ bool runOnFunction(Function &F) override {
+ errs() << "Hello: ";
+ errs().write_escaped(F.getName()) << '\n';
+ return false;
+ }
+ }; // end of struct Hello
+ } // end of anonymous namespace
+
+ char Hello::ID = 0;
+ static RegisterPass<Hello> X("hello", "Hello World Pass",
+ false /* Only looks at CFG */,
+ false /* Analysis Pass */);
+
+ static RegisterStandardPasses Y(
+ PassManagerBuilder::EP_EarlyAsPossible,
+ [](const PassManagerBuilder &Builder,
+ legacy::PassManagerBase &PM) { PM.add(new Hello()); });
+
+Now that it's all together, compile the file with a simple "``gmake``" command
+from the top level of your build directory and you should get a new file
+"``lib/LLVMHello.so``". Note that everything in this file is
+contained in an anonymous namespace --- this reflects the fact that passes
+are self contained units that do not need external interfaces (although they
+can have them) to be useful.
+
+Running a pass with ``opt``
+---------------------------
+
+Now that you have a brand new shiny shared object file, we can use the
+:program:`opt` command to run an LLVM program through your pass. Because you
+registered your pass with ``RegisterPass``, you will be able to use the
+:program:`opt` tool to access it, once loaded.
+
+To test it, follow the example at the end of the :doc:`GettingStarted` to
+compile "Hello World" to LLVM. We can now run the bitcode file (hello.bc) for
+the program through our transformation like this (or course, any bitcode file
+will work):
+
+.. code-block:: console
+
+ $ opt -load lib/LLVMHello.so -hello < hello.bc > /dev/null
+ Hello: __main
+ Hello: puts
+ Hello: main
+
+The :option:`-load` option specifies that :program:`opt` should load your pass
+as a shared object, which makes "``-hello``" a valid command line argument
+(which is one reason you need to :ref:`register your pass
+<writing-an-llvm-pass-registration>`). Because the Hello pass does not modify
+the program in any interesting way, we just throw away the result of
+:program:`opt` (sending it to ``/dev/null``).
+
+To see what happened to the other string you registered, try running
+:program:`opt` with the :option:`-help` option:
+
+.. code-block:: console
+
+ $ opt -load lib/LLVMHello.so -help
+ OVERVIEW: llvm .bc -> .bc modular optimizer and analysis printer
+
+ USAGE: opt [subcommand] [options] <input bitcode file>
+
+ OPTIONS:
+ Optimizations available:
+ ...
+ -guard-widening - Widen guards
+ -gvn - Global Value Numbering
+ -gvn-hoist - Early GVN Hoisting of Expressions
+ -hello - Hello World Pass
+ -indvars - Induction Variable Simplification
+ -inferattrs - Infer set function attributes
+ ...
+
+The pass name gets added as the information string for your pass, giving some
+documentation to users of :program:`opt`. Now that you have a working pass,
+you would go ahead and make it do the cool transformations you want. Once you
+get it all working and tested, it may become useful to find out how fast your
+pass is. The :ref:`PassManager <writing-an-llvm-pass-passmanager>` provides a
+nice command line option (:option:`-time-passes`) that allows you to get
+information about the execution time of your pass along with the other passes
+you queue up. For example:
+
+.. code-block:: console
+
+ $ opt -load lib/LLVMHello.so -hello -time-passes < hello.bc > /dev/null
+ Hello: __main
+ Hello: puts
+ Hello: main
+ ===-------------------------------------------------------------------------===
+ ... Pass execution timing report ...
+ ===-------------------------------------------------------------------------===
+ Total Execution Time: 0.0007 seconds (0.0005 wall clock)
+
+ ---User Time--- --User+System-- ---Wall Time--- --- Name ---
+ 0.0004 ( 55.3%) 0.0004 ( 55.3%) 0.0004 ( 75.7%) Bitcode Writer
+ 0.0003 ( 44.7%) 0.0003 ( 44.7%) 0.0001 ( 13.6%) Hello World Pass
+ 0.0000 ( 0.0%) 0.0000 ( 0.0%) 0.0001 ( 10.7%) Module Verifier
+ 0.0007 (100.0%) 0.0007 (100.0%) 0.0005 (100.0%) Total
+
+As you can see, our implementation above is pretty fast. The additional
+passes listed are automatically inserted by the :program:`opt` tool to verify
+that the LLVM emitted by your pass is still valid and well formed LLVM, which
+hasn't been broken somehow.
+
+Now that you have seen the basics of the mechanics behind passes, we can talk
+about some more details of how they work and how to use them.
+
+.. _writing-an-llvm-pass-pass-classes:
+
+Pass classes and requirements
+=============================
+
+One of the first things that you should do when designing a new pass is to
+decide what class you should subclass for your pass. The :ref:`Hello World
+<writing-an-llvm-pass-basiccode>` example uses the :ref:`FunctionPass
+<writing-an-llvm-pass-FunctionPass>` class for its implementation, but we did
+not discuss why or when this should occur. Here we talk about the classes
+available, from the most general to the most specific.
+
+When choosing a superclass for your ``Pass``, you should choose the **most
+specific** class possible, while still being able to meet the requirements
+listed. This gives the LLVM Pass Infrastructure information necessary to
+optimize how passes are run, so that the resultant compiler isn't unnecessarily
+slow.
+
+The ``ImmutablePass`` class
+---------------------------
+
+The most plain and boring type of pass is the "`ImmutablePass
+<http://llvm.org/doxygen/classllvm_1_1ImmutablePass.html>`_" class. This pass
+type is used for passes that do not have to be run, do not change state, and
+never need to be updated. This is not a normal type of transformation or
+analysis, but can provide information about the current compiler configuration.
+
+Although this pass class is very infrequently used, it is important for
+providing information about the current target machine being compiled for, and
+other static information that can affect the various transformations.
+
+``ImmutablePass``\ es never invalidate other transformations, are never
+invalidated, and are never "run".
+
+.. _writing-an-llvm-pass-ModulePass:
+
+The ``ModulePass`` class
+------------------------
+
+The `ModulePass <http://llvm.org/doxygen/classllvm_1_1ModulePass.html>`_ class
+is the most general of all superclasses that you can use. Deriving from
+``ModulePass`` indicates that your pass uses the entire program as a unit,
+referring to function bodies in no predictable order, or adding and removing
+functions. Because nothing is known about the behavior of ``ModulePass``
+subclasses, no optimization can be done for their execution.
+
+A module pass can use function level passes (e.g. dominators) using the
+``getAnalysis`` interface ``getAnalysis<DominatorTree>(llvm::Function *)`` to
+provide the function to retrieve analysis result for, if the function pass does
+not require any module or immutable passes. Note that this can only be done
+for functions for which the analysis ran, e.g. in the case of dominators you
+should only ask for the ``DominatorTree`` for function definitions, not
+declarations.
+
+To write a correct ``ModulePass`` subclass, derive from ``ModulePass`` and
+overload the ``runOnModule`` method with the following signature:
+
+The ``runOnModule`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool runOnModule(Module &M) = 0;
+
+The ``runOnModule`` method performs the interesting work of the pass. It
+should return ``true`` if the module was modified by the transformation and
+``false`` otherwise.
+
+.. _writing-an-llvm-pass-CallGraphSCCPass:
+
+The ``CallGraphSCCPass`` class
+------------------------------
+
+The `CallGraphSCCPass
+<http://llvm.org/doxygen/classllvm_1_1CallGraphSCCPass.html>`_ is used by
+passes that need to traverse the program bottom-up on the call graph (callees
+before callers). Deriving from ``CallGraphSCCPass`` provides some mechanics
+for building and traversing the ``CallGraph``, but also allows the system to
+optimize execution of ``CallGraphSCCPass``\ es. If your pass meets the
+requirements outlined below, and doesn't meet the requirements of a
+:ref:`FunctionPass <writing-an-llvm-pass-FunctionPass>` or :ref:`BasicBlockPass
+<writing-an-llvm-pass-BasicBlockPass>`, you should derive from
+``CallGraphSCCPass``.
+
+``TODO``: explain briefly what SCC, Tarjan's algo, and B-U mean.
+
+To be explicit, CallGraphSCCPass subclasses are:
+
+#. ... *not allowed* to inspect or modify any ``Function``\ s other than those
+ in the current SCC and the direct callers and direct callees of the SCC.
+#. ... *required* to preserve the current ``CallGraph`` object, updating it to
+ reflect any changes made to the program.
+#. ... *not allowed* to add or remove SCC's from the current Module, though
+ they may change the contents of an SCC.
+#. ... *allowed* to add or remove global variables from the current Module.
+#. ... *allowed* to maintain state across invocations of :ref:`runOnSCC
+ <writing-an-llvm-pass-runOnSCC>` (including global data).
+
+Implementing a ``CallGraphSCCPass`` is slightly tricky in some cases because it
+has to handle SCCs with more than one node in it. All of the virtual methods
+described below should return ``true`` if they modified the program, or
+``false`` if they didn't.
+
+The ``doInitialization(CallGraph &)`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doInitialization(CallGraph &CG);
+
+The ``doInitialization`` method is allowed to do most of the things that
+``CallGraphSCCPass``\ es are not allowed to do. They can add and remove
+functions, get pointers to functions, etc. The ``doInitialization`` method is
+designed to do simple initialization type of stuff that does not depend on the
+SCCs being processed. The ``doInitialization`` method call is not scheduled to
+overlap with any other pass executions (thus it should be very fast).
+
+.. _writing-an-llvm-pass-runOnSCC:
+
+The ``runOnSCC`` method
+^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool runOnSCC(CallGraphSCC &SCC) = 0;
+
+The ``runOnSCC`` method performs the interesting work of the pass, and should
+return ``true`` if the module was modified by the transformation, ``false``
+otherwise.
+
+The ``doFinalization(CallGraph &)`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doFinalization(CallGraph &CG);
+
+The ``doFinalization`` method is an infrequently used method that is called
+when the pass framework has finished calling :ref:`runOnSCC
+<writing-an-llvm-pass-runOnSCC>` for every SCC in the program being compiled.
+
+.. _writing-an-llvm-pass-FunctionPass:
+
+The ``FunctionPass`` class
+--------------------------
+
+In contrast to ``ModulePass`` subclasses, `FunctionPass
+<http://llvm.org/doxygen/classllvm_1_1Pass.html>`_ subclasses do have a
+predictable, local behavior that can be expected by the system. All
+``FunctionPass`` execute on each function in the program independent of all of
+the other functions in the program. ``FunctionPass``\ es do not require that
+they are executed in a particular order, and ``FunctionPass``\ es do not modify
+external functions.
+
+To be explicit, ``FunctionPass`` subclasses are not allowed to:
+
+#. Inspect or modify a ``Function`` other than the one currently being processed.
+#. Add or remove ``Function``\ s from the current ``Module``.
+#. Add or remove global variables from the current ``Module``.
+#. Maintain state across invocations of :ref:`runOnFunction
+ <writing-an-llvm-pass-runOnFunction>` (including global data).
+
+Implementing a ``FunctionPass`` is usually straightforward (See the :ref:`Hello
+World <writing-an-llvm-pass-basiccode>` pass for example).
+``FunctionPass``\ es may overload three virtual methods to do their work. All
+of these methods should return ``true`` if they modified the program, or
+``false`` if they didn't.
+
+.. _writing-an-llvm-pass-doInitialization-mod:
+
+The ``doInitialization(Module &)`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doInitialization(Module &M);
+
+The ``doInitialization`` method is allowed to do most of the things that
+``FunctionPass``\ es are not allowed to do. They can add and remove functions,
+get pointers to functions, etc. The ``doInitialization`` method is designed to
+do simple initialization type of stuff that does not depend on the functions
+being processed. The ``doInitialization`` method call is not scheduled to
+overlap with any other pass executions (thus it should be very fast).
+
+A good example of how this method should be used is the `LowerAllocations
+<http://llvm.org/doxygen/LowerAllocations_8cpp-source.html>`_ pass. This pass
+converts ``malloc`` and ``free`` instructions into platform dependent
+``malloc()`` and ``free()`` function calls. It uses the ``doInitialization``
+method to get a reference to the ``malloc`` and ``free`` functions that it
+needs, adding prototypes to the module if necessary.
+
+.. _writing-an-llvm-pass-runOnFunction:
+
+The ``runOnFunction`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool runOnFunction(Function &F) = 0;
+
+The ``runOnFunction`` method must be implemented by your subclass to do the
+transformation or analysis work of your pass. As usual, a ``true`` value
+should be returned if the function is modified.
+
+.. _writing-an-llvm-pass-doFinalization-mod:
+
+The ``doFinalization(Module &)`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doFinalization(Module &M);
+
+The ``doFinalization`` method is an infrequently used method that is called
+when the pass framework has finished calling :ref:`runOnFunction
+<writing-an-llvm-pass-runOnFunction>` for every function in the program being
+compiled.
+
+.. _writing-an-llvm-pass-LoopPass:
+
+The ``LoopPass`` class
+----------------------
+
+All ``LoopPass`` execute on each loop in the function independent of all of the
+other loops in the function. ``LoopPass`` processes loops in loop nest order
+such that outer most loop is processed last.
+
+``LoopPass`` subclasses are allowed to update loop nest using ``LPPassManager``
+interface. Implementing a loop pass is usually straightforward.
+``LoopPass``\ es may overload three virtual methods to do their work. All
+these methods should return ``true`` if they modified the program, or ``false``
+if they didn't.
+
+A ``LoopPass`` subclass which is intended to run as part of the main loop pass
+pipeline needs to preserve all of the same *function* analyses that the other
+loop passes in its pipeline require. To make that easier,
+a ``getLoopAnalysisUsage`` function is provided by ``LoopUtils.h``. It can be
+called within the subclass's ``getAnalysisUsage`` override to get consistent
+and correct behavior. Analogously, ``INITIALIZE_PASS_DEPENDENCY(LoopPass)``
+will initialize this set of function analyses.
+
+The ``doInitialization(Loop *, LPPassManager &)`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doInitialization(Loop *, LPPassManager &LPM);
+
+The ``doInitialization`` method is designed to do simple initialization type of
+stuff that does not depend on the functions being processed. The
+``doInitialization`` method call is not scheduled to overlap with any other
+pass executions (thus it should be very fast). ``LPPassManager`` interface
+should be used to access ``Function`` or ``Module`` level analysis information.
+
+.. _writing-an-llvm-pass-runOnLoop:
+
+The ``runOnLoop`` method
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool runOnLoop(Loop *, LPPassManager &LPM) = 0;
+
+The ``runOnLoop`` method must be implemented by your subclass to do the
+transformation or analysis work of your pass. As usual, a ``true`` value
+should be returned if the function is modified. ``LPPassManager`` interface
+should be used to update loop nest.
+
+The ``doFinalization()`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doFinalization();
+
+The ``doFinalization`` method is an infrequently used method that is called
+when the pass framework has finished calling :ref:`runOnLoop
+<writing-an-llvm-pass-runOnLoop>` for every loop in the program being compiled.
+
+.. _writing-an-llvm-pass-RegionPass:
+
+The ``RegionPass`` class
+------------------------
+
+``RegionPass`` is similar to :ref:`LoopPass <writing-an-llvm-pass-LoopPass>`,
+but executes on each single entry single exit region in the function.
+``RegionPass`` processes regions in nested order such that the outer most
+region is processed last.
+
+``RegionPass`` subclasses are allowed to update the region tree by using the
+``RGPassManager`` interface. You may overload three virtual methods of
+``RegionPass`` to implement your own region pass. All these methods should
+return ``true`` if they modified the program, or ``false`` if they did not.
+
+The ``doInitialization(Region *, RGPassManager &)`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doInitialization(Region *, RGPassManager &RGM);
+
+The ``doInitialization`` method is designed to do simple initialization type of
+stuff that does not depend on the functions being processed. The
+``doInitialization`` method call is not scheduled to overlap with any other
+pass executions (thus it should be very fast). ``RPPassManager`` interface
+should be used to access ``Function`` or ``Module`` level analysis information.
+
+.. _writing-an-llvm-pass-runOnRegion:
+
+The ``runOnRegion`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool runOnRegion(Region *, RGPassManager &RGM) = 0;
+
+The ``runOnRegion`` method must be implemented by your subclass to do the
+transformation or analysis work of your pass. As usual, a true value should be
+returned if the region is modified. ``RGPassManager`` interface should be used to
+update region tree.
+
+The ``doFinalization()`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doFinalization();
+
+The ``doFinalization`` method is an infrequently used method that is called
+when the pass framework has finished calling :ref:`runOnRegion
+<writing-an-llvm-pass-runOnRegion>` for every region in the program being
+compiled.
+
+.. _writing-an-llvm-pass-BasicBlockPass:
+
+The ``BasicBlockPass`` class
+----------------------------
+
+``BasicBlockPass``\ es are just like :ref:`FunctionPass's
+<writing-an-llvm-pass-FunctionPass>` , except that they must limit their scope
+of inspection and modification to a single basic block at a time. As such,
+they are **not** allowed to do any of the following:
+
+#. Modify or inspect any basic blocks outside of the current one.
+#. Maintain state across invocations of :ref:`runOnBasicBlock
+ <writing-an-llvm-pass-runOnBasicBlock>`.
+#. Modify the control flow graph (by altering terminator instructions)
+#. Any of the things forbidden for :ref:`FunctionPasses
+ <writing-an-llvm-pass-FunctionPass>`.
+
+``BasicBlockPass``\ es are useful for traditional local and "peephole"
+optimizations. They may override the same :ref:`doInitialization(Module &)
+<writing-an-llvm-pass-doInitialization-mod>` and :ref:`doFinalization(Module &)
+<writing-an-llvm-pass-doFinalization-mod>` methods that :ref:`FunctionPass's
+<writing-an-llvm-pass-FunctionPass>` have, but also have the following virtual
+methods that may also be implemented:
+
+The ``doInitialization(Function &)`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doInitialization(Function &F);
+
+The ``doInitialization`` method is allowed to do most of the things that
+``BasicBlockPass``\ es are not allowed to do, but that ``FunctionPass``\ es
+can. The ``doInitialization`` method is designed to do simple initialization
+that does not depend on the ``BasicBlock``\ s being processed. The
+``doInitialization`` method call is not scheduled to overlap with any other
+pass executions (thus it should be very fast).
+
+.. _writing-an-llvm-pass-runOnBasicBlock:
+
+The ``runOnBasicBlock`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool runOnBasicBlock(BasicBlock &BB) = 0;
+
+Override this function to do the work of the ``BasicBlockPass``. This function
+is not allowed to inspect or modify basic blocks other than the parameter, and
+are not allowed to modify the CFG. A ``true`` value must be returned if the
+basic block is modified.
+
+The ``doFinalization(Function &)`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool doFinalization(Function &F);
+
+The ``doFinalization`` method is an infrequently used method that is called
+when the pass framework has finished calling :ref:`runOnBasicBlock
+<writing-an-llvm-pass-runOnBasicBlock>` for every ``BasicBlock`` in the program
+being compiled. This can be used to perform per-function finalization.
+
+The ``MachineFunctionPass`` class
+---------------------------------
+
+A ``MachineFunctionPass`` is a part of the LLVM code generator that executes on
+the machine-dependent representation of each LLVM function in the program.
+
+Code generator passes are registered and initialized specially by
+``TargetMachine::addPassesToEmitFile`` and similar routines, so they cannot
+generally be run from the :program:`opt` or :program:`bugpoint` commands.
+
+A ``MachineFunctionPass`` is also a ``FunctionPass``, so all the restrictions
+that apply to a ``FunctionPass`` also apply to it. ``MachineFunctionPass``\ es
+also have additional restrictions. In particular, ``MachineFunctionPass``\ es
+are not allowed to do any of the following:
+
+#. Modify or create any LLVM IR ``Instruction``\ s, ``BasicBlock``\ s,
+ ``Argument``\ s, ``Function``\ s, ``GlobalVariable``\ s,
+ ``GlobalAlias``\ es, or ``Module``\ s.
+#. Modify a ``MachineFunction`` other than the one currently being processed.
+#. Maintain state across invocations of :ref:`runOnMachineFunction
+ <writing-an-llvm-pass-runOnMachineFunction>` (including global data).
+
+.. _writing-an-llvm-pass-runOnMachineFunction:
+
+The ``runOnMachineFunction(MachineFunction &MF)`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual bool runOnMachineFunction(MachineFunction &MF) = 0;
+
+``runOnMachineFunction`` can be considered the main entry point of a
+``MachineFunctionPass``; that is, you should override this method to do the
+work of your ``MachineFunctionPass``.
+
+The ``runOnMachineFunction`` method is called on every ``MachineFunction`` in a
+``Module``, so that the ``MachineFunctionPass`` may perform optimizations on
+the machine-dependent representation of the function. If you want to get at
+the LLVM ``Function`` for the ``MachineFunction`` you're working on, use
+``MachineFunction``'s ``getFunction()`` accessor method --- but remember, you
+may not modify the LLVM ``Function`` or its contents from a
+``MachineFunctionPass``.
+
+.. _writing-an-llvm-pass-registration:
+
+Pass registration
+-----------------
+
+In the :ref:`Hello World <writing-an-llvm-pass-basiccode>` example pass we
+illustrated how pass registration works, and discussed some of the reasons that
+it is used and what it does. Here we discuss how and why passes are
+registered.
+
+As we saw above, passes are registered with the ``RegisterPass`` template. The
+template parameter is the name of the pass that is to be used on the command
+line to specify that the pass should be added to a program (for example, with
+:program:`opt` or :program:`bugpoint`). The first argument is the name of the
+pass, which is to be used for the :option:`-help` output of programs, as well
+as for debug output generated by the `--debug-pass` option.
+
+If you want your pass to be easily dumpable, you should implement the virtual
+print method:
+
+The ``print`` method
+^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual void print(llvm::raw_ostream &O, const Module *M) const;
+
+The ``print`` method must be implemented by "analyses" in order to print a
+human readable version of the analysis results. This is useful for debugging
+an analysis itself, as well as for other people to figure out how an analysis
+works. Use the opt ``-analyze`` argument to invoke this method.
+
+The ``llvm::raw_ostream`` parameter specifies the stream to write the results
+on, and the ``Module`` parameter gives a pointer to the top level module of the
+program that has been analyzed. Note however that this pointer may be ``NULL``
+in certain circumstances (such as calling the ``Pass::dump()`` from a
+debugger), so it should only be used to enhance debug output, it should not be
+depended on.
+
+.. _writing-an-llvm-pass-interaction:
+
+Specifying interactions between passes
+--------------------------------------
+
+One of the main responsibilities of the ``PassManager`` is to make sure that
+passes interact with each other correctly. Because ``PassManager`` tries to
+:ref:`optimize the execution of passes <writing-an-llvm-pass-passmanager>` it
+must know how the passes interact with each other and what dependencies exist
+between the various passes. To track this, each pass can declare the set of
+passes that are required to be executed before the current pass, and the passes
+which are invalidated by the current pass.
+
+Typically this functionality is used to require that analysis results are
+computed before your pass is run. Running arbitrary transformation passes can
+invalidate the computed analysis results, which is what the invalidation set
+specifies. If a pass does not implement the :ref:`getAnalysisUsage
+<writing-an-llvm-pass-getAnalysisUsage>` method, it defaults to not having any
+prerequisite passes, and invalidating **all** other passes.
+
+.. _writing-an-llvm-pass-getAnalysisUsage:
+
+The ``getAnalysisUsage`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual void getAnalysisUsage(AnalysisUsage &Info) const;
+
+By implementing the ``getAnalysisUsage`` method, the required and invalidated
+sets may be specified for your transformation. The implementation should fill
+in the `AnalysisUsage
+<http://llvm.org/doxygen/classllvm_1_1AnalysisUsage.html>`_ object with
+information about which passes are required and not invalidated. To do this, a
+pass may call any of the following methods on the ``AnalysisUsage`` object:
+
+The ``AnalysisUsage::addRequired<>`` and ``AnalysisUsage::addRequiredTransitive<>`` methods
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If your pass requires a previous pass to be executed (an analysis for example),
+it can use one of these methods to arrange for it to be run before your pass.
+LLVM has many different types of analyses and passes that can be required,
+spanning the range from ``DominatorSet`` to ``BreakCriticalEdges``. Requiring
+``BreakCriticalEdges``, for example, guarantees that there will be no critical
+edges in the CFG when your pass has been run.
+
+Some analyses chain to other analyses to do their job. For example, an
+`AliasAnalysis <AliasAnalysis>` implementation is required to :ref:`chain
+<aliasanalysis-chaining>` to other alias analysis passes. In cases where
+analyses chain, the ``addRequiredTransitive`` method should be used instead of
+the ``addRequired`` method. This informs the ``PassManager`` that the
+transitively required pass should be alive as long as the requiring pass is.
+
+The ``AnalysisUsage::addPreserved<>`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+One of the jobs of the ``PassManager`` is to optimize how and when analyses are
+run. In particular, it attempts to avoid recomputing data unless it needs to.
+For this reason, passes are allowed to declare that they preserve (i.e., they
+don't invalidate) an existing analysis if it's available. For example, a
+simple constant folding pass would not modify the CFG, so it can't possibly
+affect the results of dominator analysis. By default, all passes are assumed
+to invalidate all others.
+
+The ``AnalysisUsage`` class provides several methods which are useful in
+certain circumstances that are related to ``addPreserved``. In particular, the
+``setPreservesAll`` method can be called to indicate that the pass does not
+modify the LLVM program at all (which is true for analyses), and the
+``setPreservesCFG`` method can be used by transformations that change
+instructions in the program but do not modify the CFG or terminator
+instructions (note that this property is implicitly set for
+:ref:`BasicBlockPass <writing-an-llvm-pass-BasicBlockPass>`\ es).
+
+``addPreserved`` is particularly useful for transformations like
+``BreakCriticalEdges``. This pass knows how to update a small set of loop and
+dominator related analyses if they exist, so it can preserve them, despite the
+fact that it hacks on the CFG.
+
+Example implementations of ``getAnalysisUsage``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ // This example modifies the program, but does not modify the CFG
+ void LICM::getAnalysisUsage(AnalysisUsage &AU) const {
+ AU.setPreservesCFG();
+ AU.addRequired<LoopInfoWrapperPass>();
+ }
+
+.. _writing-an-llvm-pass-getAnalysis:
+
+The ``getAnalysis<>`` and ``getAnalysisIfAvailable<>`` methods
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``Pass::getAnalysis<>`` method is automatically inherited by your class,
+providing you with access to the passes that you declared that you required
+with the :ref:`getAnalysisUsage <writing-an-llvm-pass-getAnalysisUsage>`
+method. It takes a single template argument that specifies which pass class
+you want, and returns a reference to that pass. For example:
+
+.. code-block:: c++
+
+ bool LICM::runOnFunction(Function &F) {
+ LoopInfo &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
+ //...
+ }
+
+This method call returns a reference to the pass desired. You may get a
+runtime assertion failure if you attempt to get an analysis that you did not
+declare as required in your :ref:`getAnalysisUsage
+<writing-an-llvm-pass-getAnalysisUsage>` implementation. This method can be
+called by your ``run*`` method implementation, or by any other local method
+invoked by your ``run*`` method.
+
+A module level pass can use function level analysis info using this interface.
+For example:
+
+.. code-block:: c++
+
+ bool ModuleLevelPass::runOnModule(Module &M) {
+ //...
+ DominatorTree &DT = getAnalysis<DominatorTree>(Func);
+ //...
+ }
+
+In above example, ``runOnFunction`` for ``DominatorTree`` is called by pass
+manager before returning a reference to the desired pass.
+
+If your pass is capable of updating analyses if they exist (e.g.,
+``BreakCriticalEdges``, as described above), you can use the
+``getAnalysisIfAvailable`` method, which returns a pointer to the analysis if
+it is active. For example:
+
+.. code-block:: c++
+
+ if (DominatorSet *DS = getAnalysisIfAvailable<DominatorSet>()) {
+ // A DominatorSet is active. This code will update it.
+ }
+
+Implementing Analysis Groups
+----------------------------
+
+Now that we understand the basics of how passes are defined, how they are used,
+and how they are required from other passes, it's time to get a little bit
+fancier. All of the pass relationships that we have seen so far are very
+simple: one pass depends on one other specific pass to be run before it can
+run. For many applications, this is great, for others, more flexibility is
+required.
+
+In particular, some analyses are defined such that there is a single simple
+interface to the analysis results, but multiple ways of calculating them.
+Consider alias analysis for example. The most trivial alias analysis returns
+"may alias" for any alias query. The most sophisticated analysis a
+flow-sensitive, context-sensitive interprocedural analysis that can take a
+significant amount of time to execute (and obviously, there is a lot of room
+between these two extremes for other implementations). To cleanly support
+situations like this, the LLVM Pass Infrastructure supports the notion of
+Analysis Groups.
+
+Analysis Group Concepts
+^^^^^^^^^^^^^^^^^^^^^^^
+
+An Analysis Group is a single simple interface that may be implemented by
+multiple different passes. Analysis Groups can be given human readable names
+just like passes, but unlike passes, they need not derive from the ``Pass``
+class. An analysis group may have one or more implementations, one of which is
+the "default" implementation.
+
+Analysis groups are used by client passes just like other passes are: the
+``AnalysisUsage::addRequired()`` and ``Pass::getAnalysis()`` methods. In order
+to resolve this requirement, the :ref:`PassManager
+<writing-an-llvm-pass-passmanager>` scans the available passes to see if any
+implementations of the analysis group are available. If none is available, the
+default implementation is created for the pass to use. All standard rules for
+:ref:`interaction between passes <writing-an-llvm-pass-interaction>` still
+apply.
+
+Although :ref:`Pass Registration <writing-an-llvm-pass-registration>` is
+optional for normal passes, all analysis group implementations must be
+registered, and must use the :ref:`INITIALIZE_AG_PASS
+<writing-an-llvm-pass-RegisterAnalysisGroup>` template to join the
+implementation pool. Also, a default implementation of the interface **must**
+be registered with :ref:`RegisterAnalysisGroup
+<writing-an-llvm-pass-RegisterAnalysisGroup>`.
+
+As a concrete example of an Analysis Group in action, consider the
+`AliasAnalysis <http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`_
+analysis group. The default implementation of the alias analysis interface
+(the `basicaa <http://llvm.org/doxygen/structBasicAliasAnalysis.html>`_ pass)
+just does a few simple checks that don't require significant analysis to
+compute (such as: two different globals can never alias each other, etc).
+Passes that use the `AliasAnalysis
+<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`_ interface (for
+example the `gvn <http://llvm.org/doxygen/classllvm_1_1GVN.html>`_ pass), do not
+care which implementation of alias analysis is actually provided, they just use
+the designated interface.
+
+From the user's perspective, commands work just like normal. Issuing the
+command ``opt -gvn ...`` will cause the ``basicaa`` class to be instantiated
+and added to the pass sequence. Issuing the command ``opt -somefancyaa -gvn
+...`` will cause the ``gvn`` pass to use the ``somefancyaa`` alias analysis
+(which doesn't actually exist, it's just a hypothetical example) instead.
+
+.. _writing-an-llvm-pass-RegisterAnalysisGroup:
+
+Using ``RegisterAnalysisGroup``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``RegisterAnalysisGroup`` template is used to register the analysis group
+itself, while the ``INITIALIZE_AG_PASS`` is used to add pass implementations to
+the analysis group. First, an analysis group should be registered, with a
+human readable name provided for it. Unlike registration of passes, there is
+no command line argument to be specified for the Analysis Group Interface
+itself, because it is "abstract":
+
+.. code-block:: c++
+
+ static RegisterAnalysisGroup<AliasAnalysis> A("Alias Analysis");
+
+Once the analysis is registered, passes can declare that they are valid
+implementations of the interface by using the following code:
+
+.. code-block:: c++
+
+ namespace {
+ // Declare that we implement the AliasAnalysis interface
+ INITIALIZE_AG_PASS(FancyAA, AliasAnalysis , "somefancyaa",
+ "A more complex alias analysis implementation",
+ false, // Is CFG Only?
+ true, // Is Analysis?
+ false); // Is default Analysis Group implementation?
+ }
+
+This just shows a class ``FancyAA`` that uses the ``INITIALIZE_AG_PASS`` macro
+both to register and to "join" the `AliasAnalysis
+<http://llvm.org/doxygen/classllvm_1_1AliasAnalysis.html>`_ analysis group.
+Every implementation of an analysis group should join using this macro.
+
+.. code-block:: c++
+
+ namespace {
+ // Declare that we implement the AliasAnalysis interface
+ INITIALIZE_AG_PASS(BasicAA, AliasAnalysis, "basicaa",
+ "Basic Alias Analysis (default AA impl)",
+ false, // Is CFG Only?
+ true, // Is Analysis?
+ true); // Is default Analysis Group implementation?
+ }
+
+Here we show how the default implementation is specified (using the final
+argument to the ``INITIALIZE_AG_PASS`` template). There must be exactly one
+default implementation available at all times for an Analysis Group to be used.
+Only default implementation can derive from ``ImmutablePass``. Here we declare
+that the `BasicAliasAnalysis
+<http://llvm.org/doxygen/structBasicAliasAnalysis.html>`_ pass is the default
+implementation for the interface.
+
+Pass Statistics
+===============
+
+The `Statistic <http://llvm.org/doxygen/Statistic_8h_source.html>`_ class is
+designed to be an easy way to expose various success metrics from passes.
+These statistics are printed at the end of a run, when the :option:`-stats`
+command line option is enabled on the command line. See the :ref:`Statistics
+section <Statistic>` in the Programmer's Manual for details.
+
+.. _writing-an-llvm-pass-passmanager:
+
+What PassManager does
+---------------------
+
+The `PassManager <http://llvm.org/doxygen/PassManager_8h_source.html>`_ `class
+<http://llvm.org/doxygen/classllvm_1_1PassManager.html>`_ takes a list of
+passes, ensures their :ref:`prerequisites <writing-an-llvm-pass-interaction>`
+are set up correctly, and then schedules passes to run efficiently. All of the
+LLVM tools that run passes use the PassManager for execution of these passes.
+
+The PassManager does two main things to try to reduce the execution time of a
+series of passes:
+
+#. **Share analysis results.** The ``PassManager`` attempts to avoid
+ recomputing analysis results as much as possible. This means keeping track
+ of which analyses are available already, which analyses get invalidated, and
+ which analyses are needed to be run for a pass. An important part of work
+ is that the ``PassManager`` tracks the exact lifetime of all analysis
+ results, allowing it to :ref:`free memory
+ <writing-an-llvm-pass-releaseMemory>` allocated to holding analysis results
+ as soon as they are no longer needed.
+
+#. **Pipeline the execution of passes on the program.** The ``PassManager``
+ attempts to get better cache and memory usage behavior out of a series of
+ passes by pipelining the passes together. This means that, given a series
+ of consecutive :ref:`FunctionPass <writing-an-llvm-pass-FunctionPass>`, it
+ will execute all of the :ref:`FunctionPass
+ <writing-an-llvm-pass-FunctionPass>` on the first function, then all of the
+ :ref:`FunctionPasses <writing-an-llvm-pass-FunctionPass>` on the second
+ function, etc... until the entire program has been run through the passes.
+
+ This improves the cache behavior of the compiler, because it is only
+ touching the LLVM program representation for a single function at a time,
+ instead of traversing the entire program. It reduces the memory consumption
+ of compiler, because, for example, only one `DominatorSet
+ <http://llvm.org/doxygen/classllvm_1_1DominatorSet.html>`_ needs to be
+ calculated at a time. This also makes it possible to implement some
+ :ref:`interesting enhancements <writing-an-llvm-pass-SMP>` in the future.
+
+The effectiveness of the ``PassManager`` is influenced directly by how much
+information it has about the behaviors of the passes it is scheduling. For
+example, the "preserved" set is intentionally conservative in the face of an
+unimplemented :ref:`getAnalysisUsage <writing-an-llvm-pass-getAnalysisUsage>`
+method. Not implementing when it should be implemented will have the effect of
+not allowing any analysis results to live across the execution of your pass.
+
+The ``PassManager`` class exposes a ``--debug-pass`` command line options that
+is useful for debugging pass execution, seeing how things work, and diagnosing
+when you should be preserving more analyses than you currently are. (To get
+information about all of the variants of the ``--debug-pass`` option, just type
+"``opt -help-hidden``").
+
+By using the --debug-pass=Structure option, for example, we can see how our
+:ref:`Hello World <writing-an-llvm-pass-basiccode>` pass interacts with other
+passes. Lets try it out with the gvn and licm passes:
+
+.. code-block:: console
+
+ $ opt -load lib/LLVMHello.so -gvn -licm --debug-pass=Structure < hello.bc > /dev/null
+ ModulePass Manager
+ FunctionPass Manager
+ Dominator Tree Construction
+ Basic Alias Analysis (stateless AA impl)
+ Function Alias Analysis Results
+ Memory Dependence Analysis
+ Global Value Numbering
+ Natural Loop Information
+ Canonicalize natural loops
+ Loop-Closed SSA Form Pass
+ Basic Alias Analysis (stateless AA impl)
+ Function Alias Analysis Results
+ Scalar Evolution Analysis
+ Loop Pass Manager
+ Loop Invariant Code Motion
+ Module Verifier
+ Bitcode Writer
+
+This output shows us when passes are constructed.
+Here we see that GVN uses dominator tree information to do its job. The LICM pass
+uses natural loop information, which uses dominator tree as well.
+
+After the LICM pass, the module verifier runs (which is automatically added by
+the :program:`opt` tool), which uses the dominator tree to check that the
+resultant LLVM code is well formed. Note that the dominator tree is computed
+once, and shared by three passes.
+
+Lets see how this changes when we run the :ref:`Hello World
+<writing-an-llvm-pass-basiccode>` pass in between the two passes:
+
+.. code-block:: console
+
+ $ opt -load lib/LLVMHello.so -gvn -hello -licm --debug-pass=Structure < hello.bc > /dev/null
+ ModulePass Manager
+ FunctionPass Manager
+ Dominator Tree Construction
+ Basic Alias Analysis (stateless AA impl)
+ Function Alias Analysis Results
+ Memory Dependence Analysis
+ Global Value Numbering
+ Hello World Pass
+ Dominator Tree Construction
+ Natural Loop Information
+ Canonicalize natural loops
+ Loop-Closed SSA Form Pass
+ Basic Alias Analysis (stateless AA impl)
+ Function Alias Analysis Results
+ Scalar Evolution Analysis
+ Loop Pass Manager
+ Loop Invariant Code Motion
+ Module Verifier
+ Bitcode Writer
+ Hello: __main
+ Hello: puts
+ Hello: main
+
+Here we see that the :ref:`Hello World <writing-an-llvm-pass-basiccode>` pass
+has killed the Dominator Tree pass, even though it doesn't modify the code at
+all! To fix this, we need to add the following :ref:`getAnalysisUsage
+<writing-an-llvm-pass-getAnalysisUsage>` method to our pass:
+
+.. code-block:: c++
+
+ // We don't modify the program, so we preserve all analyses
+ void getAnalysisUsage(AnalysisUsage &AU) const override {
+ AU.setPreservesAll();
+ }
+
+Now when we run our pass, we get this output:
+
+.. code-block:: console
+
+ $ opt -load lib/LLVMHello.so -gvn -hello -licm --debug-pass=Structure < hello.bc > /dev/null
+ Pass Arguments: -gvn -hello -licm
+ ModulePass Manager
+ FunctionPass Manager
+ Dominator Tree Construction
+ Basic Alias Analysis (stateless AA impl)
+ Function Alias Analysis Results
+ Memory Dependence Analysis
+ Global Value Numbering
+ Hello World Pass
+ Natural Loop Information
+ Canonicalize natural loops
+ Loop-Closed SSA Form Pass
+ Basic Alias Analysis (stateless AA impl)
+ Function Alias Analysis Results
+ Scalar Evolution Analysis
+ Loop Pass Manager
+ Loop Invariant Code Motion
+ Module Verifier
+ Bitcode Writer
+ Hello: __main
+ Hello: puts
+ Hello: main
+
+Which shows that we don't accidentally invalidate dominator information
+anymore, and therefore do not have to compute it twice.
+
+.. _writing-an-llvm-pass-releaseMemory:
+
+The ``releaseMemory`` method
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c++
+
+ virtual void releaseMemory();
+
+The ``PassManager`` automatically determines when to compute analysis results,
+and how long to keep them around for. Because the lifetime of the pass object
+itself is effectively the entire duration of the compilation process, we need
+some way to free analysis results when they are no longer useful. The
+``releaseMemory`` virtual method is the way to do this.
+
+If you are writing an analysis or any other pass that retains a significant
+amount of state (for use by another pass which "requires" your pass and uses
+the :ref:`getAnalysis <writing-an-llvm-pass-getAnalysis>` method) you should
+implement ``releaseMemory`` to, well, release the memory allocated to maintain
+this internal state. This method is called after the ``run*`` method for the
+class, before the next call of ``run*`` in your pass.
+
+Registering dynamically loaded passes
+=====================================
+
+*Size matters* when constructing production quality tools using LLVM, both for
+the purposes of distribution, and for regulating the resident code size when
+running on the target system. Therefore, it becomes desirable to selectively
+use some passes, while omitting others and maintain the flexibility to change
+configurations later on. You want to be able to do all this, and, provide
+feedback to the user. This is where pass registration comes into play.
+
+The fundamental mechanisms for pass registration are the
+``MachinePassRegistry`` class and subclasses of ``MachinePassRegistryNode``.
+
+An instance of ``MachinePassRegistry`` is used to maintain a list of
+``MachinePassRegistryNode`` objects. This instance maintains the list and
+communicates additions and deletions to the command line interface.
+
+An instance of ``MachinePassRegistryNode`` subclass is used to maintain
+information provided about a particular pass. This information includes the
+command line name, the command help string and the address of the function used
+to create an instance of the pass. A global static constructor of one of these
+instances *registers* with a corresponding ``MachinePassRegistry``, the static
+destructor *unregisters*. Thus a pass that is statically linked in the tool
+will be registered at start up. A dynamically loaded pass will register on
+load and unregister at unload.
+
+Using existing registries
+-------------------------
+
+There are predefined registries to track instruction scheduling
+(``RegisterScheduler``) and register allocation (``RegisterRegAlloc``) machine
+passes. Here we will describe how to *register* a register allocator machine
+pass.
+
+Implement your register allocator machine pass. In your register allocator
+``.cpp`` file add the following include:
+
+.. code-block:: c++
+
+ #include "llvm/CodeGen/RegAllocRegistry.h"
+
+Also in your register allocator ``.cpp`` file, define a creator function in the
+form:
+
+.. code-block:: c++
+
+ FunctionPass *createMyRegisterAllocator() {
+ return new MyRegisterAllocator();
+ }
+
+Note that the signature of this function should match the type of
+``RegisterRegAlloc::FunctionPassCtor``. In the same file add the "installing"
+declaration, in the form:
+
+.. code-block:: c++
+
+ static RegisterRegAlloc myRegAlloc("myregalloc",
+ "my register allocator help string",
+ createMyRegisterAllocator);
+
+Note the two spaces prior to the help string produces a tidy result on the
+:option:`-help` query.
+
+.. code-block:: console
+
+ $ llc -help
+ ...
+ -regalloc - Register allocator to use (default=linearscan)
+ =linearscan - linear scan register allocator
+ =local - local register allocator
+ =simple - simple register allocator
+ =myregalloc - my register allocator help string
+ ...
+
+And that's it. The user is now free to use ``-regalloc=myregalloc`` as an
+option. Registering instruction schedulers is similar except use the
+``RegisterScheduler`` class. Note that the
+``RegisterScheduler::FunctionPassCtor`` is significantly different from
+``RegisterRegAlloc::FunctionPassCtor``.
+
+To force the load/linking of your register allocator into the
+:program:`llc`/:program:`lli` tools, add your creator function's global
+declaration to ``Passes.h`` and add a "pseudo" call line to
+``llvm/Codegen/LinkAllCodegenComponents.h``.
+
+Creating new registries
+-----------------------
+
+The easiest way to get started is to clone one of the existing registries; we
+recommend ``llvm/CodeGen/RegAllocRegistry.h``. The key things to modify are
+the class name and the ``FunctionPassCtor`` type.
+
+Then you need to declare the registry. Example: if your pass registry is
+``RegisterMyPasses`` then define:
+
+.. code-block:: c++
+
+ MachinePassRegistry RegisterMyPasses::Registry;
+
+And finally, declare the command line option for your passes. Example:
+
+.. code-block:: c++
+
+ cl::opt<RegisterMyPasses::FunctionPassCtor, false,
+ RegisterPassParser<RegisterMyPasses> >
+ MyPassOpt("mypass",
+ cl::init(&createDefaultMyPass),
+ cl::desc("my pass option help"));
+
+Here the command option is "``mypass``", with ``createDefaultMyPass`` as the
+default creator.
+
+Using GDB with dynamically loaded passes
+----------------------------------------
+
+Unfortunately, using GDB with dynamically loaded passes is not as easy as it
+should be. First of all, you can't set a breakpoint in a shared object that
+has not been loaded yet, and second of all there are problems with inlined
+functions in shared objects. Here are some suggestions to debugging your pass
+with GDB.
+
+For sake of discussion, I'm going to assume that you are debugging a
+transformation invoked by :program:`opt`, although nothing described here
+depends on that.
+
+Setting a breakpoint in your pass
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+First thing you do is start gdb on the opt process:
+
+.. code-block:: console
+
+ $ gdb opt
+ GNU gdb 5.0
+ Copyright 2000 Free Software Foundation, Inc.
+ GDB is free software, covered by the GNU General Public License, and you are
+ welcome to change it and/or distribute copies of it under certain conditions.
+ Type "show copying" to see the conditions.
+ There is absolutely no warranty for GDB. Type "show warranty" for details.
+ This GDB was configured as "sparc-sun-solaris2.6"...
+ (gdb)
+
+Note that :program:`opt` has a lot of debugging information in it, so it takes
+time to load. Be patient. Since we cannot set a breakpoint in our pass yet
+(the shared object isn't loaded until runtime), we must execute the process,
+and have it stop before it invokes our pass, but after it has loaded the shared
+object. The most foolproof way of doing this is to set a breakpoint in
+``PassManager::run`` and then run the process with the arguments you want:
+
+.. code-block:: console
+
+ $ (gdb) break llvm::PassManager::run
+ Breakpoint 1 at 0x2413bc: file Pass.cpp, line 70.
+ (gdb) run test.bc -load $(LLVMTOP)/llvm/Debug+Asserts/lib/[libname].so -[passoption]
+ Starting program: opt test.bc -load $(LLVMTOP)/llvm/Debug+Asserts/lib/[libname].so -[passoption]
+ Breakpoint 1, PassManager::run (this=0xffbef174, M=@0x70b298) at Pass.cpp:70
+ 70 bool PassManager::run(Module &M) { return PM->run(M); }
+ (gdb)
+
+Once the :program:`opt` stops in the ``PassManager::run`` method you are now
+free to set breakpoints in your pass so that you can trace through execution or
+do other standard debugging stuff.
+
+Miscellaneous Problems
+^^^^^^^^^^^^^^^^^^^^^^
+
+Once you have the basics down, there are a couple of problems that GDB has,
+some with solutions, some without.
+
+* Inline functions have bogus stack information. In general, GDB does a pretty
+ good job getting stack traces and stepping through inline functions. When a
+ pass is dynamically loaded however, it somehow completely loses this
+ capability. The only solution I know of is to de-inline a function (move it
+ from the body of a class to a ``.cpp`` file).
+
+* Restarting the program breaks breakpoints. After following the information
+ above, you have succeeded in getting some breakpoints planted in your pass.
+ Next thing you know, you restart the program (i.e., you type "``run``" again),
+ and you start getting errors about breakpoints being unsettable. The only
+ way I have found to "fix" this problem is to delete the breakpoints that are
+ already set in your pass, run the program, and re-set the breakpoints once
+ execution stops in ``PassManager::run``.
+
+Hopefully these tips will help with common case debugging situations. If you'd
+like to contribute some tips of your own, just contact `Chris
+<mailto:sabre at nondot.org>`_.
+
+Future extensions planned
+-------------------------
+
+Although the LLVM Pass Infrastructure is very capable as it stands, and does
+some nifty stuff, there are things we'd like to add in the future. Here is
+where we are going:
+
+.. _writing-an-llvm-pass-SMP:
+
+Multithreaded LLVM
+^^^^^^^^^^^^^^^^^^
+
+Multiple CPU machines are becoming more common and compilation can never be
+fast enough: obviously we should allow for a multithreaded compiler. Because
+of the semantics defined for passes above (specifically they cannot maintain
+state across invocations of their ``run*`` methods), a nice clean way to
+implement a multithreaded compiler would be for the ``PassManager`` class to
+create multiple instances of each pass object, and allow the separate instances
+to be hacking on different parts of the program at the same time.
+
+This implementation would prevent each of the passes from having to implement
+multithreaded constructs, requiring only the LLVM core to have locking in a few
+places (for global resources). Although this is a simple extension, we simply
+haven't had time (or multiprocessor machines, thus a reason) to implement this.
+Despite that, we have kept the LLVM passes SMP ready, and you should too.
+
Added: www-releases/trunk/9.0.0/docs/_sources/XRay.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/XRay.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/XRay.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/XRay.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,339 @@
+====================
+XRay Instrumentation
+====================
+
+:Version: 1 as of 2016-11-08
+
+.. contents::
+ :local:
+
+
+Introduction
+============
+
+XRay is a function call tracing system which combines compiler-inserted
+instrumentation points and a runtime library that can dynamically enable and
+disable the instrumentation.
+
+More high level information about XRay can be found in the `XRay whitepaper`_.
+
+This document describes how to use XRay as implemented in LLVM.
+
+XRay in LLVM
+============
+
+XRay consists of three main parts:
+
+- Compiler-inserted instrumentation points.
+- A runtime library for enabling/disabling tracing at runtime.
+- A suite of tools for analysing the traces.
+
+ **NOTE:** As of July 25, 2018 , XRay is only available for the following
+ architectures running Linux: x86_64, arm7 (no thumb), aarch64, powerpc64le,
+ mips, mipsel, mips64, mips64el, NetBSD: x86_64, FreeBSD: x86_64 and
+ OpenBSD: x86_64.
+
+The compiler-inserted instrumentation points come in the form of nop-sleds in
+the final generated binary, and an ELF section named ``xray_instr_map`` which
+contains entries pointing to these instrumentation points. The runtime library
+relies on being able to access the entries of the ``xray_instr_map``, and
+overwrite the instrumentation points at runtime.
+
+Using XRay
+==========
+
+You can use XRay in a couple of ways:
+
+- Instrumenting your C/C++/Objective-C/Objective-C++ application.
+- Generating LLVM IR with the correct function attributes.
+
+The rest of this section covers these main ways and later on how to customise
+what XRay does in an XRay-instrumented binary.
+
+Instrumenting your C/C++/Objective-C Application
+------------------------------------------------
+
+The easiest way of getting XRay instrumentation for your application is by
+enabling the ``-fxray-instrument`` flag in your clang invocation.
+
+For example:
+
+::
+
+ clang -fxray-instrument ...
+
+By default, functions that have at least 200 instructions will get XRay
+instrumentation points. You can tweak that number through the
+``-fxray-instruction-threshold=`` flag:
+
+::
+
+ clang -fxray-instrument -fxray-instruction-threshold=1 ...
+
+You can also specifically instrument functions in your binary to either always
+or never be instrumented using source-level attributes. You can do it using the
+GCC-style attributes or C++11-style attributes.
+
+.. code-block:: c++
+
+ [[clang::xray_always_instrument]] void always_instrumented();
+
+ [[clang::xray_never_instrument]] void never_instrumented();
+
+ void alt_always_instrumented() __attribute__((xray_always_instrument));
+
+ void alt_never_instrumented() __attribute__((xray_never_instrument));
+
+When linking a binary, you can either manually link in the `XRay Runtime
+Library`_ or use ``clang`` to link it in automatically with the
+``-fxray-instrument`` flag. Alternatively, you can statically link-in the XRay
+runtime library from compiler-rt -- those archive files will take the name of
+`libclang_rt.xray-{arch}` where `{arch}` is the mnemonic supported by clang
+(x86_64, arm7, etc.).
+
+LLVM Function Attribute
+-----------------------
+
+If you're using LLVM IR directly, you can add the ``function-instrument``
+string attribute to your functions, to get the similar effect that the
+C/C++/Objective-C source-level attributes would get:
+
+.. code-block:: llvm
+
+ define i32 @always_instrument() uwtable "function-instrument"="xray-always" {
+ ; ...
+ }
+
+ define i32 @never_instrument() uwtable "function-instrument"="xray-never" {
+ ; ...
+ }
+
+You can also set the ``xray-instruction-threshold`` attribute and provide a
+numeric string value for how many instructions should be in the function before
+it gets instrumented.
+
+.. code-block:: llvm
+
+ define i32 @maybe_instrument() uwtable "xray-instruction-threshold"="2" {
+ ; ...
+ }
+
+Special Case File
+-----------------
+
+Attributes can be imbued through the use of special case files instead of
+adding them to the original source files. You can use this to mark certain
+functions and classes to be never, always, or instrumented with first-argument
+logging from a file. The file's format is described below:
+
+.. code-block:: bash
+
+ # Comments are supported
+ [always]
+ fun:always_instrument
+ fun:log_arg1=arg1 # Log the first argument for the function
+
+ [never]
+ fun:never_instrument
+
+These files can be provided through the ``-fxray-attr-list=`` flag to clang.
+You may have multiple files loaded through multiple instances of the flag.
+
+XRay Runtime Library
+--------------------
+
+The XRay Runtime Library is part of the compiler-rt project, which implements
+the runtime components that perform the patching and unpatching of inserted
+instrumentation points. When you use ``clang`` to link your binaries and the
+``-fxray-instrument`` flag, it will automatically link in the XRay runtime.
+
+The default implementation of the XRay runtime will enable XRay instrumentation
+before ``main`` starts, which works for applications that have a short
+lifetime. This implementation also records all function entry and exit events
+which may result in a lot of records in the resulting trace.
+
+Also by default the filename of the XRay trace is ``xray-log.XXXXXX`` where the
+``XXXXXX`` part is randomly generated.
+
+These options can be controlled through the ``XRAY_OPTIONS`` environment
+variable, where we list down the options and their defaults below.
+
++-------------------+-----------------+---------------+------------------------+
+| Option | Type | Default | Description |
++===================+=================+===============+========================+
+| patch_premain | ``bool`` | ``false`` | Whether to patch |
+| | | | instrumentation points |
+| | | | before main. |
++-------------------+-----------------+---------------+------------------------+
+| xray_mode | ``const char*`` | ``""`` | Default mode to |
+| | | | install and initialize |
+| | | | before ``main``. |
++-------------------+-----------------+---------------+------------------------+
+| xray_logfile_base | ``const char*`` | ``xray-log.`` | Filename base for the |
+| | | | XRay logfile. |
++-------------------+-----------------+---------------+------------------------+
+| verbosity | ``int`` | ``0`` | Runtime verbosity |
+| | | | level. |
++-------------------+-----------------+---------------+------------------------+
+
+
+If you choose to not use the default logging implementation that comes with the
+XRay runtime and/or control when/how the XRay instrumentation runs, you may use
+the XRay APIs directly for doing so. To do this, you'll need to include the
+``xray_log_interface.h`` from the compiler-rt ``xray`` directory. The important API
+functions we list below:
+
+- ``__xray_log_register_mode(...)``: Register a logging implementation against
+ a string Mode identifier. The implementation is an instance of
+ ``XRayLogImpl`` defined in ``xray/xray_log_interface.h``.
+- ``__xray_log_select_mode(...)``: Select the mode to install, associated with
+ a string Mode identifier. Only implementations registered with
+ ``__xray_log_register_mode(...)`` can be chosen with this function.
+- ``__xray_log_init_mode(...)``: This function allows for initializing and
+ re-initializing an installed logging implementation. See
+ ``xray/xray_log_interface.h`` for details, part of the XRay compiler-rt
+ installation.
+
+Once a logging implementation has been initialized, it can be "stopped" by
+finalizing the implementation through the ``__xray_log_finalize()`` function.
+The finalization routine is the opposite of the initialization. When finalized,
+an implementation's data can be cleared out through the
+``__xray_log_flushLog()`` function. For implementations that support in-memory
+processing, these should register an iterator function to provide access to the
+data via the ``__xray_log_set_buffer_iterator(...)`` which allows code calling
+the ``__xray_log_process_buffers(...)`` function to deal with the data in
+memory.
+
+All of this is better explained in the ``xray/xray_log_interface.h`` header.
+
+Basic Mode
+----------
+
+XRay supports a basic logging mode which will trace the application's
+execution, and periodically append to a single log. This mode can be
+installed/enabled by setting ``xray_mode=xray-basic`` in the ``XRAY_OPTIONS``
+environment variable. Combined with ``patch_premain=true`` this can allow for
+tracing applications from start to end.
+
+Like all the other modes installed through ``__xray_log_select_mode(...)``, the
+implementation can be configured through the ``__xray_log_init_mode(...)``
+function, providing the mode string and the flag options. Basic-mode specific
+defaults can be provided in the ``XRAY_BASIC_OPTIONS`` environment variable.
+
+Flight Data Recorder Mode
+-------------------------
+
+XRay supports a logging mode which allows the application to only capture a
+fixed amount of memory's worth of events. Flight Data Recorder (FDR) mode works
+very much like a plane's "black box" which keeps recording data to memory in a
+fixed-size circular queue of buffers, and have the data available
+programmatically until the buffers are finalized and flushed. To use FDR mode
+on your application, you may set the ``xray_mode`` variable to ``xray-fdr`` in
+the ``XRAY_OPTIONS`` environment variable. Additional options to the FDR mode
+implementation can be provided in the ``XRAY_FDR_OPTIONS`` environment
+variable. Programmatic configuration can be done by calling
+``__xray_log_init_mode("xray-fdr", <configuration string>)`` once it has been
+selected/installed.
+
+When the buffers are flushed to disk, the result is a binary trace format
+described by `XRay FDR format <XRayFDRFormat.html>`_
+
+When FDR mode is on, it will keep writing and recycling memory buffers until
+the logging implementation is finalized -- at which point it can be flushed and
+re-initialised later. To do this programmatically, we follow the workflow
+provided below:
+
+.. code-block:: c++
+
+ // Patch the sleds, if we haven't yet.
+ auto patch_status = __xray_patch();
+
+ // Maybe handle the patch_status errors.
+
+ // When we want to flush the log, we need to finalize it first, to give
+ // threads a chance to return buffers to the queue.
+ auto finalize_status = __xray_log_finalize();
+ if (finalize_status != XRAY_LOG_FINALIZED) {
+ // maybe retry, or bail out.
+ }
+
+ // At this point, we are sure that the log is finalized, so we may try
+ // flushing the log.
+ auto flush_status = __xray_log_flushLog();
+ if (flush_status != XRAY_LOG_FLUSHED) {
+ // maybe retry, or bail out.
+ }
+
+The default settings for the FDR mode implementation will create logs named
+similarly to the basic log implementation, but will have a different log
+format. All the trace analysis tools (and the trace reading library) will
+support all versions of the FDR mode format as we add more functionality and
+record types in the future.
+
+ **NOTE:** We do not promise perpetual support for when we update the log
+ versions we support going forward. Deprecation of the formats will be
+ announced and discussed on the developers mailing list.
+
+Trace Analysis Tools
+--------------------
+
+We currently have the beginnings of a trace analysis tool in LLVM, which can be
+found in the ``tools/llvm-xray`` directory. The ``llvm-xray`` tool currently
+supports the following subcommands:
+
+- ``extract``: Extract the instrumentation map from a binary, and return it as
+ YAML.
+- ``account``: Performs basic function call accounting statistics with various
+ options for sorting, and output formats (supports CSV, YAML, and
+ console-friendly TEXT).
+- ``convert``: Converts an XRay log file from one format to another. We can
+ convert from binary XRay traces (both basic and FDR mode) to YAML,
+ `flame-graph <https://github.com/brendangregg/FlameGraph>`_ friendly text
+ formats, as well as `Chrome Trace Viewer (catapult)
+ <https://github.com/catapult-project/catapult>` formats.
+- ``graph``: Generates a DOT graph of the function call relationships between
+ functions found in an XRay trace.
+- ``stack``: Reconstructs function call stacks from a timeline of function
+ calls in an XRay trace.
+
+These subcommands use various library components found as part of the XRay
+libraries, distributed with the LLVM distribution. These are:
+
+- ``llvm/XRay/Trace.h`` : A trace reading library for conveniently loading
+ an XRay trace of supported forms, into a convenient in-memory representation.
+ All the analysis tools that deal with traces use this implementation.
+- ``llvm/XRay/Graph.h`` : A semi-generic graph type used by the graph
+ subcommand to conveniently represent a function call graph with statistics
+ associated with edges and vertices.
+- ``llvm/XRay/InstrumentationMap.h``: A convenient tool for analyzing the
+ instrumentation map in XRay-instrumented object files and binaries. The
+ ``extract`` and ``stack`` subcommands uses this particular library.
+
+Future Work
+===========
+
+There are a number of ongoing efforts for expanding the toolset building around
+the XRay instrumentation system.
+
+Trace Analysis Tools
+--------------------
+
+- Work is in progress to integrate with or develop tools to visualize findings
+ from an XRay trace. Particularly, the ``stack`` tool is being expanded to
+ output formats that allow graphing and exploring the duration of time in each
+ call stack.
+- With a large instrumented binary, the size of generated XRay traces can
+ quickly become unwieldy. We are working on integrating pruning techniques and
+ heuristics for the analysis tools to sift through the traces and surface only
+ relevant information.
+
+More Platforms
+--------------
+
+We're looking forward to contributions to port XRay to more architectures and
+operating systems.
+
+.. References...
+
+.. _`XRay whitepaper`: http://research.google.com/pubs/pub45287.html
+
Added: www-releases/trunk/9.0.0/docs/_sources/XRayExample.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/XRayExample.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/XRayExample.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/XRayExample.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,349 @@
+===================
+Debugging with XRay
+===================
+
+This document shows an example of how you would go about analyzing applications
+built with XRay instrumentation. Here we will attempt to debug ``llc``
+compiling some sample LLVM IR generated by Clang.
+
+.. contents::
+ :local:
+
+Building with XRay
+------------------
+
+To debug an application with XRay instrumentation, we need to build it with a
+Clang that supports the ``-fxray-instrument`` option. See `XRay <XRay.html>`_
+for more technical details of how XRay works for background information.
+
+In our example, we need to add ``-fxray-instrument`` to the list of flags
+passed to Clang when building a binary. Note that we need to link with Clang as
+well to get the XRay runtime linked in appropriately. For building ``llc`` with
+XRay, we do something similar below for our LLVM build:
+
+::
+
+ $ mkdir -p llvm-build && cd llvm-build
+ # Assume that the LLVM sources are at ../llvm
+ $ cmake -GNinja ../llvm -DCMAKE_BUILD_TYPE=Release \
+ -DCMAKE_C_FLAGS_RELEASE="-fxray-instrument" -DCMAKE_CXX_FLAGS="-fxray-instrument" \
+ # Once this finishes, we should build llc
+ $ ninja llc
+
+
+To verify that we have an XRay instrumented binary, we can use ``objdump`` to
+look for the ``xray_instr_map`` section.
+
+::
+
+ $ objdump -h -j xray_instr_map ./bin/llc
+ ./bin/llc: file format elf64-x86-64
+
+ Sections:
+ Idx Name Size VMA LMA File off Algn
+ 14 xray_instr_map 00002fc0 00000000041516c6 00000000041516c6 03d516c6 2**0
+ CONTENTS, ALLOC, LOAD, READONLY, DATA
+
+Getting Traces
+--------------
+
+By default, XRay does not write out the trace files or patch the application
+before main starts. If we run ``llc`` it should work like a normally built
+binary. If we want to get a full trace of the application's operations (of the
+functions we do end up instrumenting with XRay) then we need to enable XRay
+at application start. To do this, XRay checks the ``XRAY_OPTIONS`` environment
+variable.
+
+::
+
+ # The following doesn't create an XRay trace by default.
+ $ ./bin/llc input.ll
+
+ # We need to set the XRAY_OPTIONS to enable some features.
+ $ XRAY_OPTIONS="patch_premain=true xray_mode=xray-basic verbosity=1" ./bin/llc input.ll
+ ==69819==XRay: Log file in 'xray-log.llc.m35qPB'
+
+At this point we now have an XRay trace we can start analysing.
+
+The ``llvm-xray`` Tool
+----------------------
+
+Having a trace then allows us to do basic accounting of the functions that were
+instrumented, and how much time we're spending in parts of the code. To make
+sense of this data, we use the ``llvm-xray`` tool which has a few subcommands
+to help us understand our trace.
+
+One of the things we can do is to get an accounting of the functions that have
+been instrumented. We can see an example accounting with ``llvm-xray account``:
+
+::
+
+ $ llvm-xray account xray-log.llc.m35qPB -top=10 -sort=sum -sortorder=dsc -instr_map ./bin/llc
+ Functions with latencies: 29
+ funcid count [ min, med, 90p, 99p, max] sum function
+ 187 360 [ 0.000000, 0.000001, 0.000014, 0.000032, 0.000075] 0.001596 LLLexer.cpp:446:0: llvm::LLLexer::LexIdentifier()
+ 85 130 [ 0.000000, 0.000000, 0.000018, 0.000023, 0.000156] 0.000799 X86ISelDAGToDAG.cpp:1984:0: (anonymous namespace)::X86DAGToDAGISel::Select(llvm::SDNode*)
+ 138 130 [ 0.000000, 0.000000, 0.000017, 0.000155, 0.000155] 0.000774 SelectionDAGISel.cpp:2963:0: llvm::SelectionDAGISel::SelectCodeCommon(llvm::SDNode*, unsigned char const*, unsigned int)
+ 188 103 [ 0.000000, 0.000000, 0.000003, 0.000123, 0.000214] 0.000737 LLParser.cpp:2692:0: llvm::LLParser::ParseValID(llvm::ValID&, llvm::LLParser::PerFunctionState*)
+ 88 1 [ 0.000562, 0.000562, 0.000562, 0.000562, 0.000562] 0.000562 X86ISelLowering.cpp:83:0: llvm::X86TargetLowering::X86TargetLowering(llvm::X86TargetMachine const&, llvm::X86Subtarget const&)
+ 125 102 [ 0.000001, 0.000003, 0.000010, 0.000017, 0.000049] 0.000471 Verifier.cpp:3714:0: (anonymous namespace)::Verifier::visitInstruction(llvm::Instruction&)
+ 90 8 [ 0.000023, 0.000035, 0.000106, 0.000106, 0.000106] 0.000342 X86ISelLowering.cpp:3363:0: llvm::X86TargetLowering::LowerCall(llvm::TargetLowering::CallLoweringInfo&, llvm::SmallVectorImpl<llvm::SDValue>&) const
+ 124 32 [ 0.000003, 0.000007, 0.000016, 0.000041, 0.000041] 0.000310 Verifier.cpp:1967:0: (anonymous namespace)::Verifier::visitFunction(llvm::Function const&)
+ 123 1 [ 0.000302, 0.000302, 0.000302, 0.000302, 0.000302] 0.000302 LLVMContextImpl.cpp:54:0: llvm::LLVMContextImpl::~LLVMContextImpl()
+ 139 46 [ 0.000000, 0.000002, 0.000006, 0.000008, 0.000019] 0.000138 TargetLowering.cpp:506:0: llvm::TargetLowering::SimplifyDemandedBits(llvm::SDValue, llvm::APInt const&, llvm::APInt&, llvm::APInt&, llvm::TargetLowering::TargetLoweringOpt&, unsigned int, bool) const
+
+This shows us that for our input file, ``llc`` spent the most cumulative time
+in the lexer (a total of 1 millisecond). If we wanted for example to work with
+this data in a spreadsheet, we can output the results as CSV using the
+``-format=csv`` option to the command for further analysis.
+
+If we want to get a textual representation of the raw trace we can use the
+``llvm-xray convert`` tool to get YAML output. The first few lines of that
+output for an example trace would look like the following:
+
+::
+
+ $ llvm-xray convert -f yaml -symbolize -instr_map=./bin/llc xray-log.llc.m35qPB
+ ---
+ header:
+ version: 1
+ type: 0
+ constant-tsc: true
+ nonstop-tsc: true
+ cycle-frequency: 2601000000
+ records:
+ - { type: 0, func-id: 110, function: __cxx_global_var_init.8, cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426023268520 }
+ - { type: 0, func-id: 110, function: __cxx_global_var_init.8, cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426023523052 }
+ - { type: 0, func-id: 164, function: __cxx_global_var_init, cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426029925386 }
+ - { type: 0, func-id: 164, function: __cxx_global_var_init, cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426030031128 }
+ - { type: 0, func-id: 142, function: '(anonymous namespace)::CommandLineParser::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*)', cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426046951388 }
+ - { type: 0, func-id: 142, function: '(anonymous namespace)::CommandLineParser::ParseCommandLineOptions(int, char const* const*, llvm::StringRef, llvm::raw_ostream*)', cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426047282020 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426047857332 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426047984152 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426048036584 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426048042292 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-enter, tsc: 5434426048055056 }
+ - { type: 0, func-id: 187, function: 'llvm::LLLexer::LexIdentifier()', cpu: 37, thread: 69819, kind: function-exit, tsc: 5434426048067316 }
+
+Controlling Fidelity
+--------------------
+
+So far in our examples, we haven't been getting full coverage of the functions
+we have in the binary. To get that, we need to modify the compiler flags so
+that we can instrument more (if not all) the functions we have in the binary.
+We have two options for doing that, and we explore both of these below.
+
+Instruction Threshold
+`````````````````````
+
+The first "blunt" way of doing this is by setting the minimum threshold for
+function bodies to 1. We can do that with the
+``-fxray-instruction-threshold=N`` flag when building our binary. We rebuild
+``llc`` with this option and observe the results:
+
+::
+
+ $ rm CMakeCache.txt
+ $ cmake -GNinja ../llvm -DCMAKE_BUILD_TYPE=Release \
+ -DCMAKE_C_FLAGS_RELEASE="-fxray-instrument -fxray-instruction-threshold=1" \
+ -DCMAKE_CXX_FLAGS="-fxray-instrument -fxray-instruction-threshold=1"
+ $ ninja llc
+ $ XRAY_OPTIONS="patch_premain=true" ./bin/llc input.ll
+ ==69819==XRay: Log file in 'xray-log.llc.5rqxkU'
+
+ $ llvm-xray account xray-log.llc.5rqxkU -top=10 -sort=sum -sortorder=dsc -instr_map ./bin/llc
+ Functions with latencies: 36652
+ funcid count [ min, med, 90p, 99p, max] sum function
+ 75 1 [ 0.672368, 0.672368, 0.672368, 0.672368, 0.672368] 0.672368 llc.cpp:271:0: main
+ 78 1 [ 0.626455, 0.626455, 0.626455, 0.626455, 0.626455] 0.626455 llc.cpp:381:0: compileModule(char**, llvm::LLVMContext&)
+ 139617 1 [ 0.472618, 0.472618, 0.472618, 0.472618, 0.472618] 0.472618 LegacyPassManager.cpp:1723:0: llvm::legacy::PassManager::run(llvm::Module&)
+ 139610 1 [ 0.472618, 0.472618, 0.472618, 0.472618, 0.472618] 0.472618 LegacyPassManager.cpp:1681:0: llvm::legacy::PassManagerImpl::run(llvm::Module&)
+ 139612 1 [ 0.470948, 0.470948, 0.470948, 0.470948, 0.470948] 0.470948 LegacyPassManager.cpp:1564:0: (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&)
+ 139607 2 [ 0.147345, 0.315994, 0.315994, 0.315994, 0.315994] 0.463340 LegacyPassManager.cpp:1530:0: llvm::FPPassManager::runOnModule(llvm::Module&)
+ 139605 21 [ 0.000002, 0.000002, 0.102593, 0.213336, 0.213336] 0.463331 LegacyPassManager.cpp:1491:0: llvm::FPPassManager::runOnFunction(llvm::Function&)
+ 139563 26096 [ 0.000002, 0.000002, 0.000037, 0.000063, 0.000215] 0.225708 LegacyPassManager.cpp:1083:0: llvm::PMDataManager::findAnalysisPass(void const*, bool)
+ 108055 188 [ 0.000002, 0.000120, 0.001375, 0.004523, 0.062624] 0.159279 MachineFunctionPass.cpp:38:0: llvm::MachineFunctionPass::runOnFunction(llvm::Function&)
+ 62635 22 [ 0.000041, 0.000046, 0.000050, 0.126744, 0.126744] 0.127715 X86TargetMachine.cpp:242:0: llvm::X86TargetMachine::getSubtargetImpl(llvm::Function const&) const
+
+
+Instrumentation Attributes
+``````````````````````````
+
+The other way is to use configuration files for selecting which functions
+should always be instrumented by the compiler. This gives us a way of ensuring
+that certain functions are either always or never instrumented by not having to
+add the attribute to the source.
+
+To use this feature, you can define one file for the functions to always
+instrument, and another for functions to never instrument. The format of these
+files are exactly the same as the SanitizerLists files that control similar
+things for the sanitizer implementations. For example:
+
+::
+
+ # xray-attr-list.txt
+ # always instrument functions that match the following filters:
+ [always]
+ fun:main
+
+ # never instrument functions that match the following filters:
+ [never]
+ fun:__cxx_*
+
+Given the file above we can re-build by providing it to the
+``-fxray-attr-list=`` flag to clang. You can have multiple files, each defining
+different sets of attribute sets, to be combined into a single list by clang.
+
+The XRay stack tool
+-------------------
+
+Given a trace, and optionally an instrumentation map, the ``llvm-xray stack``
+command can be used to analyze a call stack graph constructed from the function
+call timeline.
+
+The way to use the command is to output the top stacks by call count and time spent.
+
+::
+
+ $ llvm-xray stack xray-log.llc.5rqxkU -instr_map ./bin/llc
+
+ Unique Stacks: 3069
+ Top 10 Stacks by leaf sum:
+
+ Sum: 9633790
+ lvl function count sum
+ #0 main 1 58421550
+ #1 compileModule(char**, llvm::LLVMContext&) 1 51440360
+ #2 llvm::legacy::PassManagerImpl::run(llvm::Module&) 1 40535375
+ #3 llvm::FPPassManager::runOnModule(llvm::Module&) 2 39337525
+ #4 llvm::FPPassManager::runOnFunction(llvm::Function&) 6 39331465
+ #5 llvm::PMDataManager::verifyPreservedAnalysis(llvm::Pass*) 399 16628590
+ #6 llvm::PMTopLevelManager::findAnalysisPass(void const*) 4584 15155600
+ #7 llvm::PMDataManager::findAnalysisPass(void const*, bool) 32088 9633790
+
+ ..etc..
+
+In the default mode, identical stacks on different threads are independently
+aggregated. In a multithreaded program, you may end up having identical call
+stacks fill your list of top calls.
+
+To address this, you may specify the ``-aggregate-threads`` or
+``-per-thread-stacks`` flags. ``-per-thread-stacks`` treats the thread id as an
+implicit root in each call stack tree, while ``-aggregate-threads`` combines
+identical stacks from all threads.
+
+Flame Graph Generation
+----------------------
+
+The ``llvm-xray stack`` tool may also be used to generate flamegraphs for
+visualizing your instrumented invocations. The tool does not generate the graphs
+themselves, but instead generates a format that can be used with Brendan Gregg's
+FlameGraph tool, currently available on `github
+<https://github.com/brendangregg/FlameGraph>`_.
+
+To generate output for a flamegraph, a few more options are necessary.
+
+- ``-all-stacks`` - Emits all of the stacks.
+- ``-stack-format`` - Choose the flamegraph output format 'flame'.
+- ``-aggregation-type`` - Choose the metric to graph.
+
+You may pipe the command output directly to the flamegraph tool to obtain an
+svg file.
+
+::
+
+ $llvm-xray stack xray-log.llc.5rqxkU -instr_map ./bin/llc -stack-format=flame -aggregation-type=time -all-stacks | \
+ /path/to/FlameGraph/flamegraph.pl > flamegraph.svg
+
+If you open the svg in a browser, mouse events allow exploring the call stacks.
+
+Chrome Trace Viewer Visualization
+---------------------------------
+
+We can also generate a trace which can be loaded by the Chrome Trace Viewer
+from the same generated trace:
+
+::
+
+ $ llvm-xray convert -symbolize -instr_map=./bin/llc \
+ -output-format=trace_event xray-log.llc.5rqxkU \
+ | gzip > llc-trace.txt.gz
+
+From a Chrome browser, navigating to ``chrome:///tracing`` allows us to load
+the ``sample-trace.txt.gz`` file to visualize the execution trace.
+
+Further Exploration
+-------------------
+
+The ``llvm-xray`` tool has a few other subcommands that are in various stages
+of being developed. One interesting subcommand that can highlight a few
+interesting things is the ``graph`` subcommand. Given for example the following
+toy program that we build with XRay instrumentation, we can see how the
+generated graph may be a helpful indicator of where time is being spent for the
+application.
+
+.. code-block:: c++
+
+ // sample.cc
+ #include <iostream>
+ #include <thread>
+
+ [[clang::xray_always_instrument]] void f() {
+ std::cerr << '.';
+ }
+
+ [[clang::xray_always_instrument]] void g() {
+ for (int i = 0; i < 1 << 10; ++i) {
+ std::cerr << '-';
+ }
+ }
+
+ int main(int argc, char* argv[]) {
+ std::thread t1([] {
+ for (int i = 0; i < 1 << 10; ++i)
+ f();
+ });
+ std::thread t2([] {
+ g();
+ });
+ t1.join();
+ t2.join();
+ std::cerr << '\n';
+ }
+
+We then build the above with XRay instrumentation:
+
+::
+
+ $ clang++ -o sample -O3 sample.cc -std=c++11 -fxray-instrument -fxray-instruction-threshold=1
+ $ XRAY_OPTIONS="patch_premain=true xray_mode=xray-basic" ./sample
+
+We can then explore the graph rendering of the trace generated by this sample
+application. We assume you have the graphviz toosl available in your system,
+including both ``unflatten`` and ``dot``. If you prefer rendering or exploring
+the graph using another tool, then that should be feasible as well. ``llvm-xray
+graph`` will create DOT format graphs which should be usable in most graph
+rendering applications. One example invocation of the ``llvm-xray graph``
+command should yield some interesting insights to the workings of C++
+applications:
+
+::
+
+ $ llvm-xray graph xray-log.sample.* -m sample -color-edges=sum -edge-label=sum \
+ | unflatten -f -l10 | dot -Tsvg -o sample.svg
+
+
+Next Steps
+----------
+
+If you have some interesting analyses you'd like to implement as part of the
+llvm-xray tool, please feel free to propose them on the llvm-dev@ mailing list.
+The following are some ideas to inspire you in getting involved and potentially
+making things better.
+
+ - Implement a query/filtering library that allows for finding patterns in the
+ XRay traces.
+ - Collecting function call stacks and how often they're encountered in the
+ XRay trace.
+
+
Added: www-releases/trunk/9.0.0/docs/_sources/XRayFDRFormat.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/XRayFDRFormat.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/XRayFDRFormat.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/XRayFDRFormat.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,401 @@
+======================================
+XRay Flight Data Recorder Trace Format
+======================================
+
+:Version: 1 as of 2017-07-20
+
+.. contents::
+ :local:
+
+
+Introduction
+============
+
+When gathering XRay traces in Flight Data Recorder mode, each thread of an
+application will claim buffers to fill with trace data, which at some point
+is finalized and flushed.
+
+A goal of the profiler is to minimize overhead, the flushed data directly
+corresponds to the buffer.
+
+This document describes the format of a trace file.
+
+
+General
+=======
+
+Each trace file corresponds to a sequence of events in a particular thread.
+
+The file has a header followed by a sequence of discriminated record types.
+
+The endianness of byte fields matches the endianess of the platform which
+produced the trace file.
+
+
+Header Section
+==============
+
+A trace file begins with a 32 byte header.
+
++-------------------+-----------------+----------------------------------------+
+| Field | Size (bytes) | Description |
++===================+=================+========================================+
+| version | ``2`` | Anticipates versioned readers. This |
+| | | document describes the format when |
+| | | version == 1 |
++-------------------+-----------------+----------------------------------------+
+| type | ``2`` | An enumeration encoding the type of |
+| | | trace. Flight Data Recorder mode |
+| | | traces have type == 1 |
++-------------------+-----------------+----------------------------------------+
+| bitfield | ``4`` | Holds parameters that are not aligned |
+| | | to bytes. Further described below. |
++-------------------+-----------------+----------------------------------------+
+| cycle_frequency | ``8`` | The frequency in hertz of the CPU |
+| | | oscillator used to measure duration of |
+| | | events in ticks. |
++-------------------+-----------------+----------------------------------------+
+| buffer_size | ``8`` | The size in bytes of the data portion |
+| | | of the trace following the header. |
++-------------------+-----------------+----------------------------------------+
+| reserved | ``8`` | Reserved for future use. |
++-------------------+-----------------+----------------------------------------+
+
+The bitfield parameter of the file header is composed of the following fields.
+
++-------------------+----------------+-----------------------------------------+
+| Field | Size (bits) | Description |
++===================+================+=========================================+
+| constant_tsc | ``1`` | Whether the platform's timestamp |
+| | | counter used to record ticks between |
+| | | events ticks at a constant frequency |
+| | | despite CPU frequency changes. |
+| | | 0 == non-constant. 1 == constant. |
++-------------------+----------------+-----------------------------------------+
+| nonstop_tsc | ``1`` | Whether the tsc continues to count |
+| | | despite whether the CPU is in a low |
+| | | power state. 0 == stop. 1 == non-stop. |
++-------------------+----------------+-----------------------------------------+
+| reserved | ``30`` | Not meaningful. |
++-------------------+----------------+-----------------------------------------+
+
+
+Data Section
+============
+
+Following the header in a trace is a data section with size matching the
+buffer_size field in the header.
+
+The data section is a stream of elements of different types.
+
+There are a few categories of data in the sequence.
+
+- ``Function Records``: Function Records contain the timing of entry into and
+ exit from function execution. Function Records have 8 bytes each.
+
+- ``Metadata Records``: Metadata records serve many purposes. Mostly, they
+ capture information that may be too costly to record for each function, but
+ that is required to contextualize the fine-grained timings. They also are used
+ as markers for user-defined Event Data payloads. Metadata records have 16
+ bytes each.
+
+- ``Event Data``: Free form data may be associated with events that are traced
+ by the binary and encode data defined by a handler function. Event data is
+ always preceded with a marker record which indicates how large it is.
+
+- ``Function Arguments``: The arguments to some functions are included in the
+ trace. These are either pointer addresses or primitives that are read and
+ logged independently of their types in a high level language. To the tracer,
+ they are all numbers. Function Records that have attached arguments will
+ indicate their presence on the function entry record. We only support logging
+ contiguous function argument sequences starting with argument zero, which will
+ be the "this" pointer for member function invocations. For example, we don't
+ support logging the first and third argument.
+
+A reader of the memory format must maintain a state machine. The format makes no
+attempt to pad for alignment, and it is not seekable.
+
+
+Function Records
+----------------
+
+Function Records have an 8 byte layout. This layout encodes information to
+reconstruct a call stack of instrumented function and their durations.
+
++---------------+--------------+-----------------------------------------------+
+| Field | Size (bits) | Description |
++===============+==============+===============================================+
+| discriminant | ``1`` | Indicates whether a reader should read a |
+| | | Function or Metadata record. Set to ``0`` for |
+| | | Function records. |
++---------------+--------------+-----------------------------------------------+
+| action | ``3`` | Specifies whether the function is being |
+| | | entered, exited, or is a non-standard entry |
+| | | or exit produced by optimizations. |
++---------------+--------------+-----------------------------------------------+
+| function_id | ``28`` | A numeric ID for the function. Resolved to a |
+| | | name via the xray instrumentation map. The |
+| | | instrumentation map is built by xray at |
+| | | compile time into an object file and pairs |
+| | | the function ids to addresses. It is used for |
+| | | patching and as a lookup into the binary's |
+| | | symbols to obtain names. |
++---------------+--------------+-----------------------------------------------+
+| tsc_delta | ``32`` | The number of ticks of the timestamp counter |
+| | | since a previous record recorded a delta or |
+| | | other TSC resetting event. |
++---------------+--------------+-----------------------------------------------+
+
+On little-endian machines, the bitfields are ordered from least significant bit
+bit to most significant bit. A reader can read an 8 bit value and apply the mask
+``0x01`` for the discriminant. Similarly, they can read 32 bits and unsigned
+shift right by ``0x04`` to obtain the function_id field.
+
+On big-endian machine, the bitfields are written in order from most significant
+bit to least significant bit. A reader would read an 8 bit value and unsigned
+shift right by 7 bits for the discriminant. The function_id field could be
+obtained by reading a 32 bit value and applying the mask ``0x0FFFFFFF``.
+
+Function action types are as follows.
+
++---------------+--------------+-----------------------------------------------+
+| Type | Number | Description |
++===============+==============+===============================================+
+| Entry | ``0`` | Typical function entry. |
++---------------+--------------+-----------------------------------------------+
+| Exit | ``1`` | Typical function exit. |
++---------------+--------------+-----------------------------------------------+
+| Tail_Exit | ``2`` | An exit from a function due to tail call |
+| | | optimization. |
++---------------+--------------+-----------------------------------------------+
+| Entry_Args | ``3`` | A function entry that records arguments. |
++---------------+--------------+-----------------------------------------------+
+
+Entry_Args records do not contain the arguments themselves. Instead, metadata
+records for each of the logged args follow the function record in the stream.
+
+
+Metadata Records
+----------------
+
+Interspersed throughout the buffer are 16 byte Metadata records. For typically
+instrumented binaries, they will be sparser than Function records, and they
+provide a fuller picture of the binary execution state.
+
+Metadata record layout is partially record dependent, but they share a common
+structure.
+
+The same bit field rules described for function records apply to the first byte
+of MetadataRecords. Within this byte, little endian machines use lsb to msb
+ordering and big endian machines use msb to lsb ordering.
+
++---------------+--------------+-----------------------------------------------+
+| Field | Size | Description |
++===============+==============+===============================================+
+| discriminant | ``1 bit`` | Indicates whether a reader should read a |
+| | | Function or Metadata record. Set to ``1`` for |
+| | | Metadata records. |
++---------------+--------------+-----------------------------------------------+
+| record_kind | ``7 bits`` | The type of Metadata record. |
++---------------+--------------+-----------------------------------------------+
+| data | ``15 bytes`` | A data field used differently for each record |
+| | | type. |
++---------------+--------------+-----------------------------------------------+
+
+Here is a table of the enumerated record kinds.
+
++--------+---------------------------+
+| Number | Type |
++========+===========================+
+| 0 | NewBuffer |
++--------+---------------------------+
+| 1 | EndOfBuffer |
++--------+---------------------------+
+| 2 | NewCPUId |
++--------+---------------------------+
+| 3 | TSCWrap |
++--------+---------------------------+
+| 4 | WallTimeMarker |
++--------+---------------------------+
+| 5 | CustomEventMarker |
++--------+---------------------------+
+| 6 | CallArgument |
++--------+---------------------------+
+
+
+NewBuffer Records
+-----------------
+
+Each buffer begins with a NewBuffer record immediately after the header.
+It records the thread ID of the thread that the trace belongs to.
+
+Its data segment is as follows.
+
++---------------+--------------+-----------------------------------------------+
+| Field | Size (bytes) | Description |
++===============+==============+===============================================+
+| thread_Id | ``2`` | Thread ID for buffer. |
++---------------+--------------+-----------------------------------------------+
+| reserved | ``13`` | Unused. |
++---------------+--------------+-----------------------------------------------+
+
+
+WallClockTime Records
+---------------------
+
+Following the NewBuffer record, each buffer records an absolute time as a frame
+of reference for the durations recorded by timestamp counter deltas.
+
+Its data segment is as follows.
+
++---------------+--------------+-----------------------------------------------+
+| Field | Size (bytes) | Description |
++===============+==============+===============================================+
+| seconds | ``8`` | Seconds on absolute timescale. The starting |
+| | | point is unspecified and depends on the |
+| | | implementation and platform configured by the |
+| | | tracer. |
++---------------+--------------+-----------------------------------------------+
+| microseconds | ``4`` | The microsecond component of the time. |
++---------------+--------------+-----------------------------------------------+
+| reserved | ``3`` | Unused. |
++---------------+--------------+-----------------------------------------------+
+
+
+NewCpuId Records
+----------------
+
+Each function entry invokes a routine to determine what CPU is executing.
+Typically, this is done with readtscp, which reads the timestamp counter at the
+same time.
+
+If the tracing detects that the execution has switched CPUs or if this is the
+first instrumented entry point, the tracer will output a NewCpuId record.
+
+Its data segment is as follows.
+
++---------------+--------------+-----------------------------------------------+
+| Field | Size (bytes) | Description |
++===============+==============+===============================================+
+| cpu_id | ``2`` | CPU Id. |
++---------------+--------------+-----------------------------------------------+
+| absolute_tsc | ``8`` | The absolute value of the timestamp counter. |
++---------------+--------------+-----------------------------------------------+
+| reserved | ``5`` | Unused. |
++---------------+--------------+-----------------------------------------------+
+
+
+TSCWrap Records
+---------------
+
+Since each function record uses a 32 bit value to represent the number of ticks
+of the timestamp counter since the last reference, it is possible for this value
+to overflow, particularly for sparsely instrumented binaries.
+
+When this delta would not fit into a 32 bit representation, a reference absolute
+timestamp counter record is written in the form of a TSCWrap record.
+
+Its data segment is as follows.
+
++---------------+--------------+-----------------------------------------------+
+| Field | Size (bytes) | Description |
++===============+==============+===============================================+
+| absolute_tsc | ``8`` | Timestamp counter value. |
++---------------+--------------+-----------------------------------------------+
+| reserved | ``7`` | Unused. |
++---------------+--------------+-----------------------------------------------+
+
+
+CallArgument Records
+--------------------
+
+Immediately following an Entry_Args type function record, there may be one or
+more CallArgument records that contain the traced function's parameter values.
+
+The order of the CallArgument Record sequency corresponds one to one with the
+order of the function parameters.
+
+CallArgument data segment:
+
++---------------+--------------+-----------------------------------------------+
+| Field | Size (bytes) | Description |
++===============+==============+===============================================+
+| argument | ``8`` | Numeric argument (may be pointer address). |
++---------------+--------------+-----------------------------------------------+
+| reserved | ``7`` | Unused. |
++---------------+--------------+-----------------------------------------------+
+
+
+CustomEventMarker Records
+-------------------------
+
+XRay provides the feature of logging custom events. This may be leveraged to
+record tracing info for RPCs or similarly trace data that is application
+specific.
+
+Custom Events themselves are an unstructured (application defined) segment of
+memory with arbitrary size within the buffer. They are preceded by
+CustomEventMarkers to indicate their presence and size.
+
+CustomEventMarker data segment:
+
++---------------+--------------+-----------------------------------------------+
+| Field | Size (bytes) | Description |
++===============+==============+===============================================+
+| event_size | ``4`` | Size of preceded event. |
++---------------+--------------+-----------------------------------------------+
+| absolute_tsc | ``8`` | A timestamp counter of the event. |
++---------------+--------------+-----------------------------------------------+
+| reserved | ``3`` | Unused. |
++---------------+--------------+-----------------------------------------------+
+
+
+EndOfBuffer Records
+-------------------
+
+An EndOfBuffer record type indicates that there is no more trace data in this
+buffer. The reader is expected to seek past the remaining buffer_size expressed
+before the start of buffer and look for either another header or EOF.
+
+
+Format Grammar and Invariants
+=============================
+
+Not all sequences of Metadata records and Function records are valid data. A
+sequence should be parsed as a state machine. The expectations for a valid
+format can be expressed as a context free grammar.
+
+This is an attempt to explain the format with statements in EBNF format.
+
+- Format := Header ThreadBuffer* EOF
+
+- ThreadBuffer := NewBuffer WallClockTime NewCPUId BodySequence* End
+
+- BodySequence := NewCPUId | TSCWrap | Function | CustomEvent
+
+- Function := (Function_Entry_Args CallArgument*) | Function_Other_Type
+
+- CustomEvent := CustomEventMarker CustomEventUnstructuredMemory
+
+- End := EndOfBuffer RemainingBufferSizeToSkip
+
+
+Function Record Order
+---------------------
+
+There are a few clarifications that may help understand what is expected of
+Function records.
+
+- Functions with an Exit are expected to have a corresponding Entry or
+ Entry_Args function record precede them in the trace.
+
+- Tail_Exit Function records record the Function ID of the function whose return
+ address the program counter will take. In other words, the final function that
+ would be popped off of the call stack if tail call optimization was not used.
+
+- Not all functions marked for instrumentation are necessarily in the trace. The
+ tracer uses heuristics to preserve the trace for non-trivial functions.
+
+- Not every entry must have a traced Exit or Tail Exit. The buffer may run out
+ of space or the program may request for the tracer to finalize toreturn the
+ buffer before an instrumented function exits.
Added: www-releases/trunk/9.0.0/docs/_sources/YamlIO.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/YamlIO.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/YamlIO.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/YamlIO.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,1035 @@
+=====================
+YAML I/O
+=====================
+
+.. contents::
+ :local:
+
+Introduction to YAML
+====================
+
+YAML is a human readable data serialization language. The full YAML language
+spec can be read at `yaml.org
+<http://www.yaml.org/spec/1.2/spec.html#Introduction>`_. The simplest form of
+yaml is just "scalars", "mappings", and "sequences". A scalar is any number
+or string. The pound/hash symbol (#) begins a comment line. A mapping is
+a set of key-value pairs where the key ends with a colon. For example:
+
+.. code-block:: yaml
+
+ # a mapping
+ name: Tom
+ hat-size: 7
+
+A sequence is a list of items where each item starts with a leading dash ('-').
+For example:
+
+.. code-block:: yaml
+
+ # a sequence
+ - x86
+ - x86_64
+ - PowerPC
+
+You can combine mappings and sequences by indenting. For example a sequence
+of mappings in which one of the mapping values is itself a sequence:
+
+.. code-block:: yaml
+
+ # a sequence of mappings with one key's value being a sequence
+ - name: Tom
+ cpus:
+ - x86
+ - x86_64
+ - name: Bob
+ cpus:
+ - x86
+ - name: Dan
+ cpus:
+ - PowerPC
+ - x86
+
+Sometime sequences are known to be short and the one entry per line is too
+verbose, so YAML offers an alternate syntax for sequences called a "Flow
+Sequence" in which you put comma separated sequence elements into square
+brackets. The above example could then be simplified to :
+
+
+.. code-block:: yaml
+
+ # a sequence of mappings with one key's value being a flow sequence
+ - name: Tom
+ cpus: [ x86, x86_64 ]
+ - name: Bob
+ cpus: [ x86 ]
+ - name: Dan
+ cpus: [ PowerPC, x86 ]
+
+
+Introduction to YAML I/O
+========================
+
+The use of indenting makes the YAML easy for a human to read and understand,
+but having a program read and write YAML involves a lot of tedious details.
+The YAML I/O library structures and simplifies reading and writing YAML
+documents.
+
+YAML I/O assumes you have some "native" data structures which you want to be
+able to dump as YAML and recreate from YAML. The first step is to try
+writing example YAML for your data structures. You may find after looking at
+possible YAML representations that a direct mapping of your data structures
+to YAML is not very readable. Often the fields are not in the order that
+a human would find readable. Or the same information is replicated in multiple
+locations, making it hard for a human to write such YAML correctly.
+
+In relational database theory there is a design step called normalization in
+which you reorganize fields and tables. The same considerations need to
+go into the design of your YAML encoding. But, you may not want to change
+your existing native data structures. Therefore, when writing out YAML
+there may be a normalization step, and when reading YAML there would be a
+corresponding denormalization step.
+
+YAML I/O uses a non-invasive, traits based design. YAML I/O defines some
+abstract base templates. You specialize those templates on your data types.
+For instance, if you have an enumerated type FooBar you could specialize
+ScalarEnumerationTraits on that type and define the enumeration() method:
+
+.. code-block:: c++
+
+ using llvm::yaml::ScalarEnumerationTraits;
+ using llvm::yaml::IO;
+
+ template <>
+ struct ScalarEnumerationTraits<FooBar> {
+ static void enumeration(IO &io, FooBar &value) {
+ ...
+ }
+ };
+
+
+As with all YAML I/O template specializations, the ScalarEnumerationTraits is used for
+both reading and writing YAML. That is, the mapping between in-memory enum
+values and the YAML string representation is only in one place.
+This assures that the code for writing and parsing of YAML stays in sync.
+
+To specify a YAML mappings, you define a specialization on
+llvm::yaml::MappingTraits.
+If your native data structure happens to be a struct that is already normalized,
+then the specialization is simple. For example:
+
+.. code-block:: c++
+
+ using llvm::yaml::MappingTraits;
+ using llvm::yaml::IO;
+
+ template <>
+ struct MappingTraits<Person> {
+ static void mapping(IO &io, Person &info) {
+ io.mapRequired("name", info.name);
+ io.mapOptional("hat-size", info.hatSize);
+ }
+ };
+
+
+A YAML sequence is automatically inferred if you data type has begin()/end()
+iterators and a push_back() method. Therefore any of the STL containers
+(such as std::vector<>) will automatically translate to YAML sequences.
+
+Once you have defined specializations for your data types, you can
+programmatically use YAML I/O to write a YAML document:
+
+.. code-block:: c++
+
+ using llvm::yaml::Output;
+
+ Person tom;
+ tom.name = "Tom";
+ tom.hatSize = 8;
+ Person dan;
+ dan.name = "Dan";
+ dan.hatSize = 7;
+ std::vector<Person> persons;
+ persons.push_back(tom);
+ persons.push_back(dan);
+
+ Output yout(llvm::outs());
+ yout << persons;
+
+This would write the following:
+
+.. code-block:: yaml
+
+ - name: Tom
+ hat-size: 8
+ - name: Dan
+ hat-size: 7
+
+And you can also read such YAML documents with the following code:
+
+.. code-block:: c++
+
+ using llvm::yaml::Input;
+
+ typedef std::vector<Person> PersonList;
+ std::vector<PersonList> docs;
+
+ Input yin(document.getBuffer());
+ yin >> docs;
+
+ if ( yin.error() )
+ return;
+
+ // Process read document
+ for ( PersonList &pl : docs ) {
+ for ( Person &person : pl ) {
+ cout << "name=" << person.name;
+ }
+ }
+
+One other feature of YAML is the ability to define multiple documents in a
+single file. That is why reading YAML produces a vector of your document type.
+
+
+
+Error Handling
+==============
+
+When parsing a YAML document, if the input does not match your schema (as
+expressed in your XxxTraits<> specializations). YAML I/O
+will print out an error message and your Input object's error() method will
+return true. For instance the following document:
+
+.. code-block:: yaml
+
+ - name: Tom
+ shoe-size: 12
+ - name: Dan
+ hat-size: 7
+
+Has a key (shoe-size) that is not defined in the schema. YAML I/O will
+automatically generate this error:
+
+.. code-block:: yaml
+
+ YAML:2:2: error: unknown key 'shoe-size'
+ shoe-size: 12
+ ^~~~~~~~~
+
+Similar errors are produced for other input not conforming to the schema.
+
+
+Scalars
+=======
+
+YAML scalars are just strings (i.e. not a sequence or mapping). The YAML I/O
+library provides support for translating between YAML scalars and specific
+C++ types.
+
+
+Built-in types
+--------------
+The following types have built-in support in YAML I/O:
+
+* bool
+* float
+* double
+* StringRef
+* std::string
+* int64_t
+* int32_t
+* int16_t
+* int8_t
+* uint64_t
+* uint32_t
+* uint16_t
+* uint8_t
+
+That is, you can use those types in fields of MappingTraits or as element type
+in sequence. When reading, YAML I/O will validate that the string found
+is convertible to that type and error out if not.
+
+
+Unique types
+------------
+Given that YAML I/O is trait based, the selection of how to convert your data
+to YAML is based on the type of your data. But in C++ type matching, typedefs
+do not generate unique type names. That means if you have two typedefs of
+unsigned int, to YAML I/O both types look exactly like unsigned int. To
+facilitate make unique type names, YAML I/O provides a macro which is used
+like a typedef on built-in types, but expands to create a class with conversion
+operators to and from the base type. For example:
+
+.. code-block:: c++
+
+ LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyFooFlags)
+ LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyBarFlags)
+
+This generates two classes MyFooFlags and MyBarFlags which you can use in your
+native data structures instead of uint32_t. They are implicitly
+converted to and from uint32_t. The point of creating these unique types
+is that you can now specify traits on them to get different YAML conversions.
+
+Hex types
+---------
+An example use of a unique type is that YAML I/O provides fixed sized unsigned
+integers that are written with YAML I/O as hexadecimal instead of the decimal
+format used by the built-in integer types:
+
+* Hex64
+* Hex32
+* Hex16
+* Hex8
+
+You can use llvm::yaml::Hex32 instead of uint32_t and the only different will
+be that when YAML I/O writes out that type it will be formatted in hexadecimal.
+
+
+ScalarEnumerationTraits
+-----------------------
+YAML I/O supports translating between in-memory enumerations and a set of string
+values in YAML documents. This is done by specializing ScalarEnumerationTraits<>
+on your enumeration type and define a enumeration() method.
+For instance, suppose you had an enumeration of CPUs and a struct with it as
+a field:
+
+.. code-block:: c++
+
+ enum CPUs {
+ cpu_x86_64 = 5,
+ cpu_x86 = 7,
+ cpu_PowerPC = 8
+ };
+
+ struct Info {
+ CPUs cpu;
+ uint32_t flags;
+ };
+
+To support reading and writing of this enumeration, you can define a
+ScalarEnumerationTraits specialization on CPUs, which can then be used
+as a field type:
+
+.. code-block:: c++
+
+ using llvm::yaml::ScalarEnumerationTraits;
+ using llvm::yaml::MappingTraits;
+ using llvm::yaml::IO;
+
+ template <>
+ struct ScalarEnumerationTraits<CPUs> {
+ static void enumeration(IO &io, CPUs &value) {
+ io.enumCase(value, "x86_64", cpu_x86_64);
+ io.enumCase(value, "x86", cpu_x86);
+ io.enumCase(value, "PowerPC", cpu_PowerPC);
+ }
+ };
+
+ template <>
+ struct MappingTraits<Info> {
+ static void mapping(IO &io, Info &info) {
+ io.mapRequired("cpu", info.cpu);
+ io.mapOptional("flags", info.flags, 0);
+ }
+ };
+
+When reading YAML, if the string found does not match any of the strings
+specified by enumCase() methods, an error is automatically generated.
+When writing YAML, if the value being written does not match any of the values
+specified by the enumCase() methods, a runtime assertion is triggered.
+
+
+BitValue
+--------
+Another common data structure in C++ is a field where each bit has a unique
+meaning. This is often used in a "flags" field. YAML I/O has support for
+converting such fields to a flow sequence. For instance suppose you
+had the following bit flags defined:
+
+.. code-block:: c++
+
+ enum {
+ flagsPointy = 1
+ flagsHollow = 2
+ flagsFlat = 4
+ flagsRound = 8
+ };
+
+ LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyFlags)
+
+To support reading and writing of MyFlags, you specialize ScalarBitSetTraits<>
+on MyFlags and provide the bit values and their names.
+
+.. code-block:: c++
+
+ using llvm::yaml::ScalarBitSetTraits;
+ using llvm::yaml::MappingTraits;
+ using llvm::yaml::IO;
+
+ template <>
+ struct ScalarBitSetTraits<MyFlags> {
+ static void bitset(IO &io, MyFlags &value) {
+ io.bitSetCase(value, "hollow", flagHollow);
+ io.bitSetCase(value, "flat", flagFlat);
+ io.bitSetCase(value, "round", flagRound);
+ io.bitSetCase(value, "pointy", flagPointy);
+ }
+ };
+
+ struct Info {
+ StringRef name;
+ MyFlags flags;
+ };
+
+ template <>
+ struct MappingTraits<Info> {
+ static void mapping(IO &io, Info& info) {
+ io.mapRequired("name", info.name);
+ io.mapRequired("flags", info.flags);
+ }
+ };
+
+With the above, YAML I/O (when writing) will test mask each value in the
+bitset trait against the flags field, and each that matches will
+cause the corresponding string to be added to the flow sequence. The opposite
+is done when reading and any unknown string values will result in a error. With
+the above schema, a same valid YAML document is:
+
+.. code-block:: yaml
+
+ name: Tom
+ flags: [ pointy, flat ]
+
+Sometimes a "flags" field might contains an enumeration part
+defined by a bit-mask.
+
+.. code-block:: c++
+
+ enum {
+ flagsFeatureA = 1,
+ flagsFeatureB = 2,
+ flagsFeatureC = 4,
+
+ flagsCPUMask = 24,
+
+ flagsCPU1 = 8,
+ flagsCPU2 = 16
+ };
+
+To support reading and writing such fields, you need to use the maskedBitSet()
+method and provide the bit values, their names and the enumeration mask.
+
+.. code-block:: c++
+
+ template <>
+ struct ScalarBitSetTraits<MyFlags> {
+ static void bitset(IO &io, MyFlags &value) {
+ io.bitSetCase(value, "featureA", flagsFeatureA);
+ io.bitSetCase(value, "featureB", flagsFeatureB);
+ io.bitSetCase(value, "featureC", flagsFeatureC);
+ io.maskedBitSetCase(value, "CPU1", flagsCPU1, flagsCPUMask);
+ io.maskedBitSetCase(value, "CPU2", flagsCPU2, flagsCPUMask);
+ }
+ };
+
+YAML I/O (when writing) will apply the enumeration mask to the flags field,
+and compare the result and values from the bitset. As in case of a regular
+bitset, each that matches will cause the corresponding string to be added
+to the flow sequence.
+
+Custom Scalar
+-------------
+Sometimes for readability a scalar needs to be formatted in a custom way. For
+instance your internal data structure may use a integer for time (seconds since
+some epoch), but in YAML it would be much nicer to express that integer in
+some time format (e.g. 4-May-2012 10:30pm). YAML I/O has a way to support
+custom formatting and parsing of scalar types by specializing ScalarTraits<> on
+your data type. When writing, YAML I/O will provide the native type and
+your specialization must create a temporary llvm::StringRef. When reading,
+YAML I/O will provide an llvm::StringRef of scalar and your specialization
+must convert that to your native data type. An outline of a custom scalar type
+looks like:
+
+.. code-block:: c++
+
+ using llvm::yaml::ScalarTraits;
+ using llvm::yaml::IO;
+
+ template <>
+ struct ScalarTraits<MyCustomType> {
+ static void output(const MyCustomType &value, void*,
+ llvm::raw_ostream &out) {
+ out << value; // do custom formatting here
+ }
+ static StringRef input(StringRef scalar, void*, MyCustomType &value) {
+ // do custom parsing here. Return the empty string on success,
+ // or an error message on failure.
+ return StringRef();
+ }
+ // Determine if this scalar needs quotes.
+ static QuotingType mustQuote(StringRef) { return QuotingType::Single; }
+ };
+
+Block Scalars
+-------------
+
+YAML block scalars are string literals that are represented in YAML using the
+literal block notation, just like the example shown below:
+
+.. code-block:: yaml
+
+ text: |
+ First line
+ Second line
+
+The YAML I/O library provides support for translating between YAML block scalars
+and specific C++ types by allowing you to specialize BlockScalarTraits<> on
+your data type. The library doesn't provide any built-in support for block
+scalar I/O for types like std::string and llvm::StringRef as they are already
+supported by YAML I/O and use the ordinary scalar notation by default.
+
+BlockScalarTraits specializations are very similar to the
+ScalarTraits specialization - YAML I/O will provide the native type and your
+specialization must create a temporary llvm::StringRef when writing, and
+it will also provide an llvm::StringRef that has the value of that block scalar
+and your specialization must convert that to your native data type when reading.
+An example of a custom type with an appropriate specialization of
+BlockScalarTraits is shown below:
+
+.. code-block:: c++
+
+ using llvm::yaml::BlockScalarTraits;
+ using llvm::yaml::IO;
+
+ struct MyStringType {
+ std::string Str;
+ };
+
+ template <>
+ struct BlockScalarTraits<MyStringType> {
+ static void output(const MyStringType &Value, void *Ctxt,
+ llvm::raw_ostream &OS) {
+ OS << Value.Str;
+ }
+
+ static StringRef input(StringRef Scalar, void *Ctxt,
+ MyStringType &Value) {
+ Value.Str = Scalar.str();
+ return StringRef();
+ }
+ };
+
+
+
+Mappings
+========
+
+To be translated to or from a YAML mapping for your type T you must specialize
+llvm::yaml::MappingTraits on T and implement the "void mapping(IO &io, T&)"
+method. If your native data structures use pointers to a class everywhere,
+you can specialize on the class pointer. Examples:
+
+.. code-block:: c++
+
+ using llvm::yaml::MappingTraits;
+ using llvm::yaml::IO;
+
+ // Example of struct Foo which is used by value
+ template <>
+ struct MappingTraits<Foo> {
+ static void mapping(IO &io, Foo &foo) {
+ io.mapOptional("size", foo.size);
+ ...
+ }
+ };
+
+ // Example of struct Bar which is natively always a pointer
+ template <>
+ struct MappingTraits<Bar*> {
+ static void mapping(IO &io, Bar *&bar) {
+ io.mapOptional("size", bar->size);
+ ...
+ }
+ };
+
+
+No Normalization
+----------------
+
+The mapping() method is responsible, if needed, for normalizing and
+denormalizing. In a simple case where the native data structure requires no
+normalization, the mapping method just uses mapOptional() or mapRequired() to
+bind the struct's fields to YAML key names. For example:
+
+.. code-block:: c++
+
+ using llvm::yaml::MappingTraits;
+ using llvm::yaml::IO;
+
+ template <>
+ struct MappingTraits<Person> {
+ static void mapping(IO &io, Person &info) {
+ io.mapRequired("name", info.name);
+ io.mapOptional("hat-size", info.hatSize);
+ }
+ };
+
+
+Normalization
+----------------
+
+When [de]normalization is required, the mapping() method needs a way to access
+normalized values as fields. To help with this, there is
+a template MappingNormalization<> which you can then use to automatically
+do the normalization and denormalization. The template is used to create
+a local variable in your mapping() method which contains the normalized keys.
+
+Suppose you have native data type
+Polar which specifies a position in polar coordinates (distance, angle):
+
+.. code-block:: c++
+
+ struct Polar {
+ float distance;
+ float angle;
+ };
+
+but you've decided the normalized YAML for should be in x,y coordinates. That
+is, you want the yaml to look like:
+
+.. code-block:: yaml
+
+ x: 10.3
+ y: -4.7
+
+You can support this by defining a MappingTraits that normalizes the polar
+coordinates to x,y coordinates when writing YAML and denormalizes x,y
+coordinates into polar when reading YAML.
+
+.. code-block:: c++
+
+ using llvm::yaml::MappingTraits;
+ using llvm::yaml::IO;
+
+ template <>
+ struct MappingTraits<Polar> {
+
+ class NormalizedPolar {
+ public:
+ NormalizedPolar(IO &io)
+ : x(0.0), y(0.0) {
+ }
+ NormalizedPolar(IO &, Polar &polar)
+ : x(polar.distance * cos(polar.angle)),
+ y(polar.distance * sin(polar.angle)) {
+ }
+ Polar denormalize(IO &) {
+ return Polar(sqrt(x*x+y*y), arctan(x,y));
+ }
+
+ float x;
+ float y;
+ };
+
+ static void mapping(IO &io, Polar &polar) {
+ MappingNormalization<NormalizedPolar, Polar> keys(io, polar);
+
+ io.mapRequired("x", keys->x);
+ io.mapRequired("y", keys->y);
+ }
+ };
+
+When writing YAML, the local variable "keys" will be a stack allocated
+instance of NormalizedPolar, constructed from the supplied polar object which
+initializes it x and y fields. The mapRequired() methods then write out the x
+and y values as key/value pairs.
+
+When reading YAML, the local variable "keys" will be a stack allocated instance
+of NormalizedPolar, constructed by the empty constructor. The mapRequired
+methods will find the matching key in the YAML document and fill in the x and y
+fields of the NormalizedPolar object keys. At the end of the mapping() method
+when the local keys variable goes out of scope, the denormalize() method will
+automatically be called to convert the read values back to polar coordinates,
+and then assigned back to the second parameter to mapping().
+
+In some cases, the normalized class may be a subclass of the native type and
+could be returned by the denormalize() method, except that the temporary
+normalized instance is stack allocated. In these cases, the utility template
+MappingNormalizationHeap<> can be used instead. It just like
+MappingNormalization<> except that it heap allocates the normalized object
+when reading YAML. It never destroys the normalized object. The denormalize()
+method can this return "this".
+
+
+Default values
+--------------
+Within a mapping() method, calls to io.mapRequired() mean that that key is
+required to exist when parsing YAML documents, otherwise YAML I/O will issue an
+error.
+
+On the other hand, keys registered with io.mapOptional() are allowed to not
+exist in the YAML document being read. So what value is put in the field
+for those optional keys?
+There are two steps to how those optional fields are filled in. First, the
+second parameter to the mapping() method is a reference to a native class. That
+native class must have a default constructor. Whatever value the default
+constructor initially sets for an optional field will be that field's value.
+Second, the mapOptional() method has an optional third parameter. If provided
+it is the value that mapOptional() should set that field to if the YAML document
+does not have that key.
+
+There is one important difference between those two ways (default constructor
+and third parameter to mapOptional). When YAML I/O generates a YAML document,
+if the mapOptional() third parameter is used, if the actual value being written
+is the same as (using ==) the default value, then that key/value is not written.
+
+
+Order of Keys
+--------------
+
+When writing out a YAML document, the keys are written in the order that the
+calls to mapRequired()/mapOptional() are made in the mapping() method. This
+gives you a chance to write the fields in an order that a human reader of
+the YAML document would find natural. This may be different that the order
+of the fields in the native class.
+
+When reading in a YAML document, the keys in the document can be in any order,
+but they are processed in the order that the calls to mapRequired()/mapOptional()
+are made in the mapping() method. That enables some interesting
+functionality. For instance, if the first field bound is the cpu and the second
+field bound is flags, and the flags are cpu specific, you can programmatically
+switch how the flags are converted to and from YAML based on the cpu.
+This works for both reading and writing. For example:
+
+.. code-block:: c++
+
+ using llvm::yaml::MappingTraits;
+ using llvm::yaml::IO;
+
+ struct Info {
+ CPUs cpu;
+ uint32_t flags;
+ };
+
+ template <>
+ struct MappingTraits<Info> {
+ static void mapping(IO &io, Info &info) {
+ io.mapRequired("cpu", info.cpu);
+ // flags must come after cpu for this to work when reading yaml
+ if ( info.cpu == cpu_x86_64 )
+ io.mapRequired("flags", *(My86_64Flags*)info.flags);
+ else
+ io.mapRequired("flags", *(My86Flags*)info.flags);
+ }
+ };
+
+
+Tags
+----
+
+The YAML syntax supports tags as a way to specify the type of a node before
+it is parsed. This allows dynamic types of nodes. But the YAML I/O model uses
+static typing, so there are limits to how you can use tags with the YAML I/O
+model. Recently, we added support to YAML I/O for checking/setting the optional
+tag on a map. Using this functionality it is even possbile to support different
+mappings, as long as they are convertible.
+
+To check a tag, inside your mapping() method you can use io.mapTag() to specify
+what the tag should be. This will also add that tag when writing yaml.
+
+Validation
+----------
+
+Sometimes in a yaml map, each key/value pair is valid, but the combination is
+not. This is similar to something having no syntax errors, but still having
+semantic errors. To support semantic level checking, YAML I/O allows
+an optional ``validate()`` method in a MappingTraits template specialization.
+
+When parsing yaml, the ``validate()`` method is call *after* all key/values in
+the map have been processed. Any error message returned by the ``validate()``
+method during input will be printed just a like a syntax error would be printed.
+When writing yaml, the ``validate()`` method is called *before* the yaml
+key/values are written. Any error during output will trigger an ``assert()``
+because it is a programming error to have invalid struct values.
+
+
+.. code-block:: c++
+
+ using llvm::yaml::MappingTraits;
+ using llvm::yaml::IO;
+
+ struct Stuff {
+ ...
+ };
+
+ template <>
+ struct MappingTraits<Stuff> {
+ static void mapping(IO &io, Stuff &stuff) {
+ ...
+ }
+ static StringRef validate(IO &io, Stuff &stuff) {
+ // Look at all fields in 'stuff' and if there
+ // are any bad values return a string describing
+ // the error. Otherwise return an empty string.
+ return StringRef();
+ }
+ };
+
+Flow Mapping
+------------
+A YAML "flow mapping" is a mapping that uses the inline notation
+(e.g { x: 1, y: 0 } ) when written to YAML. To specify that a type should be
+written in YAML using flow mapping, your MappingTraits specialization should
+add "static const bool flow = true;". For instance:
+
+.. code-block:: c++
+
+ using llvm::yaml::MappingTraits;
+ using llvm::yaml::IO;
+
+ struct Stuff {
+ ...
+ };
+
+ template <>
+ struct MappingTraits<Stuff> {
+ static void mapping(IO &io, Stuff &stuff) {
+ ...
+ }
+
+ static const bool flow = true;
+ }
+
+Flow mappings are subject to line wrapping according to the Output object
+configuration.
+
+Sequence
+========
+
+To be translated to or from a YAML sequence for your type T you must specialize
+llvm::yaml::SequenceTraits on T and implement two methods:
+``size_t size(IO &io, T&)`` and
+``T::value_type& element(IO &io, T&, size_t indx)``. For example:
+
+.. code-block:: c++
+
+ template <>
+ struct SequenceTraits<MySeq> {
+ static size_t size(IO &io, MySeq &list) { ... }
+ static MySeqEl &element(IO &io, MySeq &list, size_t index) { ... }
+ };
+
+The size() method returns how many elements are currently in your sequence.
+The element() method returns a reference to the i'th element in the sequence.
+When parsing YAML, the element() method may be called with an index one bigger
+than the current size. Your element() method should allocate space for one
+more element (using default constructor if element is a C++ object) and returns
+a reference to that new allocated space.
+
+
+Flow Sequence
+-------------
+A YAML "flow sequence" is a sequence that when written to YAML it uses the
+inline notation (e.g [ foo, bar ] ). To specify that a sequence type should
+be written in YAML as a flow sequence, your SequenceTraits specialization should
+add "static const bool flow = true;". For instance:
+
+.. code-block:: c++
+
+ template <>
+ struct SequenceTraits<MyList> {
+ static size_t size(IO &io, MyList &list) { ... }
+ static MyListEl &element(IO &io, MyList &list, size_t index) { ... }
+
+ // The existence of this member causes YAML I/O to use a flow sequence
+ static const bool flow = true;
+ };
+
+With the above, if you used MyList as the data type in your native data
+structures, then when converted to YAML, a flow sequence of integers
+will be used (e.g. [ 10, -3, 4 ]).
+
+Flow sequences are subject to line wrapping according to the Output object
+configuration.
+
+Utility Macros
+--------------
+Since a common source of sequences is std::vector<>, YAML I/O provides macros:
+LLVM_YAML_IS_SEQUENCE_VECTOR() and LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR() which
+can be used to easily specify SequenceTraits<> on a std::vector type. YAML
+I/O does not partial specialize SequenceTraits on std::vector<> because that
+would force all vectors to be sequences. An example use of the macros:
+
+.. code-block:: c++
+
+ std::vector<MyType1>;
+ std::vector<MyType2>;
+ LLVM_YAML_IS_SEQUENCE_VECTOR(MyType1)
+ LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(MyType2)
+
+
+
+Document List
+=============
+
+YAML allows you to define multiple "documents" in a single YAML file. Each
+new document starts with a left aligned "---" token. The end of all documents
+is denoted with a left aligned "..." token. Many users of YAML will never
+have need for multiple documents. The top level node in their YAML schema
+will be a mapping or sequence. For those cases, the following is not needed.
+But for cases where you do want multiple documents, you can specify a
+trait for you document list type. The trait has the same methods as
+SequenceTraits but is named DocumentListTraits. For example:
+
+.. code-block:: c++
+
+ template <>
+ struct DocumentListTraits<MyDocList> {
+ static size_t size(IO &io, MyDocList &list) { ... }
+ static MyDocType element(IO &io, MyDocList &list, size_t index) { ... }
+ };
+
+
+User Context Data
+=================
+When an llvm::yaml::Input or llvm::yaml::Output object is created their
+constructors take an optional "context" parameter. This is a pointer to
+whatever state information you might need.
+
+For instance, in a previous example we showed how the conversion type for a
+flags field could be determined at runtime based on the value of another field
+in the mapping. But what if an inner mapping needs to know some field value
+of an outer mapping? That is where the "context" parameter comes in. You
+can set values in the context in the outer map's mapping() method and
+retrieve those values in the inner map's mapping() method.
+
+The context value is just a void*. All your traits which use the context
+and operate on your native data types, need to agree what the context value
+actually is. It could be a pointer to an object or struct which your various
+traits use to shared context sensitive information.
+
+
+Output
+======
+
+The llvm::yaml::Output class is used to generate a YAML document from your
+in-memory data structures, using traits defined on your data types.
+To instantiate an Output object you need an llvm::raw_ostream, an optional
+context pointer and an optional wrapping column:
+
+.. code-block:: c++
+
+ class Output : public IO {
+ public:
+ Output(llvm::raw_ostream &, void *context = NULL, int WrapColumn = 70);
+
+Once you have an Output object, you can use the C++ stream operator on it
+to write your native data as YAML. One thing to recall is that a YAML file
+can contain multiple "documents". If the top level data structure you are
+streaming as YAML is a mapping, scalar, or sequence, then Output assumes you
+are generating one document and wraps the mapping output
+with "``---``" and trailing "``...``".
+
+The WrapColumn parameter will cause the flow mappings and sequences to
+line-wrap when they go over the supplied column. Pass 0 to completely
+suppress the wrapping.
+
+.. code-block:: c++
+
+ using llvm::yaml::Output;
+
+ void dumpMyMapDoc(const MyMapType &info) {
+ Output yout(llvm::outs());
+ yout << info;
+ }
+
+The above could produce output like:
+
+.. code-block:: yaml
+
+ ---
+ name: Tom
+ hat-size: 7
+ ...
+
+On the other hand, if the top level data structure you are streaming as YAML
+has a DocumentListTraits specialization, then Output walks through each element
+of your DocumentList and generates a "---" before the start of each element
+and ends with a "...".
+
+.. code-block:: c++
+
+ using llvm::yaml::Output;
+
+ void dumpMyMapDoc(const MyDocListType &docList) {
+ Output yout(llvm::outs());
+ yout << docList;
+ }
+
+The above could produce output like:
+
+.. code-block:: yaml
+
+ ---
+ name: Tom
+ hat-size: 7
+ ---
+ name: Tom
+ shoe-size: 11
+ ...
+
+Input
+=====
+
+The llvm::yaml::Input class is used to parse YAML document(s) into your native
+data structures. To instantiate an Input
+object you need a StringRef to the entire YAML file, and optionally a context
+pointer:
+
+.. code-block:: c++
+
+ class Input : public IO {
+ public:
+ Input(StringRef inputContent, void *context=NULL);
+
+Once you have an Input object, you can use the C++ stream operator to read
+the document(s). If you expect there might be multiple YAML documents in
+one file, you'll need to specialize DocumentListTraits on a list of your
+document type and stream in that document list type. Otherwise you can
+just stream in the document type. Also, you can check if there was
+any syntax errors in the YAML be calling the error() method on the Input
+object. For example:
+
+.. code-block:: c++
+
+ // Reading a single document
+ using llvm::yaml::Input;
+
+ Input yin(mb.getBuffer());
+
+ // Parse the YAML file
+ MyDocType theDoc;
+ yin >> theDoc;
+
+ // Check for error
+ if ( yin.error() )
+ return;
+
+
+.. code-block:: c++
+
+ // Reading multiple documents in one file
+ using llvm::yaml::Input;
+
+ LLVM_YAML_IS_DOCUMENT_LIST_VECTOR(MyDocType)
+
+ Input yin(mb.getBuffer());
+
+ // Parse the YAML file
+ std::vector<MyDocType> theDocList;
+ yin >> theDocList;
+
+ // Check for error
+ if ( yin.error() )
+ return;
+
+
Added: www-releases/trunk/9.0.0/docs/_sources/index.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/index.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/index.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/index.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,626 @@
+Overview
+========
+
+The LLVM compiler infrastructure supports a wide range of projects, from
+industrial strength compilers to specialized JIT applications to small
+research projects.
+
+Similarly, documentation is broken down into several high-level groupings
+targeted at different audiences:
+
+LLVM Design & Overview
+======================
+
+Several introductory papers and presentations.
+
+.. toctree::
+ :hidden:
+
+ LangRef
+
+:doc:`LangRef`
+ Defines the LLVM intermediate representation.
+
+`Introduction to the LLVM Compiler`__
+ Presentation providing a users introduction to LLVM.
+
+ .. __: http://llvm.org/pubs/2008-10-04-ACAT-LLVM-Intro.html
+
+`Intro to LLVM`__
+ Book chapter providing a compiler hacker's introduction to LLVM.
+
+ .. __: http://www.aosabook.org/en/llvm.html
+
+
+`LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation`__
+ Design overview.
+
+ .. __: http://llvm.org/pubs/2004-01-30-CGO-LLVM.html
+
+`LLVM: An Infrastructure for Multi-Stage Optimization`__
+ More details (quite old now).
+
+ .. __: http://llvm.org/pubs/2002-12-LattnerMSThesis.html
+
+`Publications mentioning LLVM <http://llvm.org/pubs>`_
+ ..
+
+User Guides
+===========
+
+For those new to the LLVM system.
+
+NOTE: If you are a user who is only interested in using an LLVM-based compiler,
+you should look into `Clang <http://clang.llvm.org>`_ instead. The
+documentation here is intended for users who have a need to work with the
+intermediate LLVM representation.
+
+.. toctree::
+ :hidden:
+
+ CMake
+ CMakePrimer
+ AdvancedBuilds
+ HowToBuildOnARM
+ HowToBuildWithPGO
+ HowToCrossCompileBuiltinsOnArm
+ HowToCrossCompileLLVM
+ CommandGuide/index
+ GettingStarted
+ GettingStartedVS
+ FAQ
+ Lexicon
+ HowToAddABuilder
+ yaml2obj
+ HowToSubmitABug
+ SphinxQuickstartTemplate
+ MarkdownQuickstartTemplate
+ Phabricator
+ TestingGuide
+ tutorial/index
+ ReleaseNotes
+ Passes
+ YamlIO
+ GetElementPtr
+ Frontend/PerformanceTips
+ MCJITDesignAndImplementation
+ ORCv2
+ CodeOfConduct
+ CompileCudaWithLLVM
+ ReportingGuide
+ Benchmarking
+ Docker
+ BuildingADistribution
+ Remarks
+
+:doc:`GettingStarted`
+ Discusses how to get up and running quickly with the LLVM infrastructure.
+ Everything from unpacking and compilation of the distribution to execution
+ of some tools.
+
+:doc:`CMake`
+ An addendum to the main Getting Started guide for those using the `CMake
+ build system <http://www.cmake.org>`_.
+
+:doc:`HowToBuildOnARM`
+ Notes on building and testing LLVM/Clang on ARM.
+
+:doc:`HowToBuildWithPGO`
+ Notes on building LLVM/Clang with PGO.
+
+:doc:`HowToCrossCompileBuiltinsOnArm`
+ Notes on cross-building and testing the compiler-rt builtins for Arm.
+
+:doc:`HowToCrossCompileLLVM`
+ Notes on cross-building and testing LLVM/Clang.
+
+:doc:`GettingStartedVS`
+ An addendum to the main Getting Started guide for those using Visual Studio
+ on Windows.
+
+:doc:`tutorial/index`
+ Tutorials about using LLVM. Includes a tutorial about making a custom
+ language with LLVM.
+
+:doc:`LLVM Command Guide <CommandGuide/index>`
+ A reference manual for the LLVM command line utilities ("man" pages for LLVM
+ tools).
+
+:doc:`Passes`
+ A list of optimizations and analyses implemented in LLVM.
+
+:doc:`FAQ`
+ A list of common questions and problems and their solutions.
+
+:doc:`Release notes for the current release <ReleaseNotes>`
+ This describes new features, known bugs, and other limitations.
+
+:doc:`HowToSubmitABug`
+ Instructions for properly submitting information about any bugs you run into
+ in the LLVM system.
+
+:doc:`SphinxQuickstartTemplate`
+ A template + tutorial for writing new Sphinx documentation. It is meant
+ to be read in source form.
+
+:doc:`LLVM Testing Infrastructure Guide <TestingGuide>`
+ A reference manual for using the LLVM testing infrastructure.
+
+:doc:`TestSuiteGuide`
+ Describes how to compile and run the test-suite benchmarks.
+
+`How to build the C, C++, ObjC, and ObjC++ front end`__
+ Instructions for building the clang front-end from source.
+
+ .. __: http://clang.llvm.org/get_started.html
+
+:doc:`Lexicon`
+ Definition of acronyms, terms and concepts used in LLVM.
+
+:doc:`HowToAddABuilder`
+ Instructions for adding new builder to LLVM buildbot master.
+
+:doc:`YamlIO`
+ A reference guide for using LLVM's YAML I/O library.
+
+:doc:`GetElementPtr`
+ Answers to some very frequent questions about LLVM's most frequently
+ misunderstood instruction.
+
+:doc:`Frontend/PerformanceTips`
+ A collection of tips for frontend authors on how to generate IR
+ which LLVM is able to effectively optimize.
+
+:doc:`Docker`
+ A reference for using Dockerfiles provided with LLVM.
+
+:doc:`BuildingADistribution`
+ A best-practices guide for using LLVM's CMake build system to package and
+ distribute LLVM-based tools.
+
+:doc:`Remarks`
+ A reference on the implementation of remarks in LLVM.
+
+Programming Documentation
+=========================
+
+For developers of applications which use LLVM as a library.
+
+.. toctree::
+ :hidden:
+
+ Atomics
+ CodingStandards
+ CommandLine
+ CompilerWriterInfo
+ ExtendingLLVM
+ HowToSetUpLLVMStyleRTTI
+ ProgrammersManual
+ Extensions
+ LibFuzzer
+ FuzzingLLVM
+ ScudoHardenedAllocator
+ OptBisect
+
+:doc:`LLVM Language Reference Manual <LangRef>`
+ Defines the LLVM intermediate representation and the assembly form of the
+ different nodes.
+
+:doc:`Atomics`
+ Information about LLVM's concurrency model.
+
+:doc:`ProgrammersManual`
+ Introduction to the general layout of the LLVM sourcebase, important classes
+ and APIs, and some tips & tricks.
+
+:doc:`Extensions`
+ LLVM-specific extensions to tools and formats LLVM seeks compatibility with.
+
+:doc:`CommandLine`
+ Provides information on using the command line parsing library.
+
+:doc:`CodingStandards`
+ Details the LLVM coding standards and provides useful information on writing
+ efficient C++ code.
+
+:doc:`HowToSetUpLLVMStyleRTTI`
+ How to make ``isa<>``, ``dyn_cast<>``, etc. available for clients of your
+ class hierarchy.
+
+:doc:`ExtendingLLVM`
+ Look here to see how to add instructions and intrinsics to LLVM.
+
+`Doxygen generated documentation <http://llvm.org/doxygen/>`_
+ (`classes <http://llvm.org/doxygen/inherits.html>`_)
+
+`Documentation for Go bindings <http://godoc.org/llvm.org/llvm/bindings/go/llvm>`_
+
+`Github Source Repository Browser <http://github.com/llvm/llvm-project//>`_
+ ..
+
+:doc:`CompilerWriterInfo`
+ A list of helpful links for compiler writers.
+
+:doc:`LibFuzzer`
+ A library for writing in-process guided fuzzers.
+
+:doc:`FuzzingLLVM`
+ Information on writing and using Fuzzers to find bugs in LLVM.
+
+:doc:`ScudoHardenedAllocator`
+ A library that implements a security-hardened `malloc()`.
+
+:doc:`OptBisect`
+ A command line option for debugging optimization-induced failures.
+
+.. _index-subsystem-docs:
+
+Subsystem Documentation
+=======================
+
+For API clients and LLVM developers.
+
+.. toctree::
+ :hidden:
+
+ AliasAnalysis
+ MemorySSA
+ BitCodeFormat
+ BlockFrequencyTerminology
+ BranchWeightMetadata
+ Bugpoint
+ CodeGenerator
+ ExceptionHandling
+ AddingConstrainedIntrinsics
+ LinkTimeOptimization
+ SegmentedStacks
+ TableGenFundamentals
+ TableGen/index
+ DebuggingJITedCode
+ GoldPlugin
+ MarkedUpDisassembly
+ SystemLibrary
+ SupportLibrary
+ SourceLevelDebugging
+ Vectorizers
+ WritingAnLLVMBackend
+ GarbageCollection
+ WritingAnLLVMPass
+ HowToUseAttributes
+ NVPTXUsage
+ AMDGPUUsage
+ StackMaps
+ InAlloca
+ BigEndianNEON
+ CoverageMappingFormat
+ Statepoints
+ MergeFunctions
+ TypeMetadata
+ TransformMetadata
+ FaultMaps
+ MIRLangRef
+ Coroutines
+ GlobalISel
+ XRay
+ XRayExample
+ XRayFDRFormat
+ PDB/index
+ CFIVerify
+ SpeculativeLoadHardening
+ StackSafetyAnalysis
+
+:doc:`WritingAnLLVMPass`
+ Information on how to write LLVM transformations and analyses.
+
+:doc:`WritingAnLLVMBackend`
+ Information on how to write LLVM backends for machine targets.
+
+:doc:`CodeGenerator`
+ The design and implementation of the LLVM code generator. Useful if you are
+ working on retargetting LLVM to a new architecture, designing a new codegen
+ pass, or enhancing existing components.
+
+:doc:`Machine IR (MIR) Format Reference Manual <MIRLangRef>`
+ A reference manual for the MIR serialization format, which is used to test
+ LLVM's code generation passes.
+
+:doc:`TableGen <TableGen/index>`
+ Describes the TableGen tool, which is used heavily by the LLVM code
+ generator.
+
+:doc:`AliasAnalysis`
+ Information on how to write a new alias analysis implementation or how to
+ use existing analyses.
+
+:doc:`MemorySSA`
+ Information about the MemorySSA utility in LLVM, as well as how to use it.
+
+:doc:`GarbageCollection`
+ The interfaces source-language compilers should use for compiling GC'd
+ programs.
+
+:doc:`Source Level Debugging with LLVM <SourceLevelDebugging>`
+ This document describes the design and philosophy behind the LLVM
+ source-level debugger.
+
+:doc:`Vectorizers`
+ This document describes the current status of vectorization in LLVM.
+
+:doc:`ExceptionHandling`
+ This document describes the design and implementation of exception handling
+ in LLVM.
+
+:doc:`AddingConstrainedIntrinsics`
+ Gives the steps necessary when adding a new constrained math intrinsic
+ to LLVM.
+
+:doc:`Bugpoint`
+ Automatic bug finder and test-case reducer description and usage
+ information.
+
+:doc:`BitCodeFormat`
+ This describes the file format and encoding used for LLVM "bc" files.
+
+:doc:`Support Library <SupportLibrary>`
+ This document describes the LLVM Support Library (``lib/Support``) and
+ how to keep LLVM source code portable
+
+:doc:`LinkTimeOptimization`
+ This document describes the interface between LLVM intermodular optimizer
+ and the linker and its design
+
+:doc:`GoldPlugin`
+ How to build your programs with link-time optimization on Linux.
+
+:doc:`DebuggingJITedCode`
+ How to debug JITed code with GDB.
+
+:doc:`MCJITDesignAndImplementation`
+ Describes the inner workings of MCJIT execution engine.
+
+:doc:`ORCv2`
+ Describes the design and implementation of the ORC APIs, including some
+ usage examples, and a guide for users transitioning from ORCv1 to ORCv2.
+
+:doc:`BranchWeightMetadata`
+ Provides information about Branch Prediction Information.
+
+:doc:`BlockFrequencyTerminology`
+ Provides information about terminology used in the ``BlockFrequencyInfo``
+ analysis pass.
+
+:doc:`SegmentedStacks`
+ This document describes segmented stacks and how they are used in LLVM.
+
+:doc:`MarkedUpDisassembly`
+ This document describes the optional rich disassembly output syntax.
+
+:doc:`HowToUseAttributes`
+ Answers some questions about the new Attributes infrastructure.
+
+:doc:`NVPTXUsage`
+ This document describes using the NVPTX backend to compile GPU kernels.
+
+:doc:`AMDGPUUsage`
+ This document describes using the AMDGPU backend to compile GPU kernels.
+
+:doc:`StackMaps`
+ LLVM support for mapping instruction addresses to the location of
+ values and allowing code to be patched.
+
+:doc:`BigEndianNEON`
+ LLVM's support for generating NEON instructions on big endian ARM targets is
+ somewhat nonintuitive. This document explains the implementation and rationale.
+
+:doc:`CoverageMappingFormat`
+ This describes the format and encoding used for LLVM’s code coverage mapping.
+
+:doc:`Statepoints`
+ This describes a set of experimental extensions for garbage
+ collection support.
+
+:doc:`MergeFunctions`
+ Describes functions merging optimization.
+
+:doc:`InAlloca`
+ Description of the ``inalloca`` argument attribute.
+
+:doc:`FaultMaps`
+ LLVM support for folding control flow into faulting machine instructions.
+
+:doc:`CompileCudaWithLLVM`
+ LLVM support for CUDA.
+
+:doc:`Coroutines`
+ LLVM support for coroutines.
+
+:doc:`GlobalISel`
+ This describes the prototype instruction selection replacement, GlobalISel.
+
+:doc:`XRay`
+ High-level documentation of how to use XRay in LLVM.
+
+:doc:`XRayExample`
+ An example of how to debug an application with XRay.
+
+:doc:`The Microsoft PDB File Format <PDB/index>`
+ A detailed description of the Microsoft PDB (Program Database) file format.
+
+:doc:`CFIVerify`
+ A description of the verification tool for Control Flow Integrity.
+
+:doc:`SpeculativeLoadHardening`
+ A description of the Speculative Load Hardening mitigation for Spectre v1.
+
+:doc:`StackSafetyAnalysis`
+ This document describes the design of the stack safety analysis of local
+ variables.
+
+Development Process Documentation
+=================================
+
+Information about LLVM's development process.
+
+.. toctree::
+ :hidden:
+
+ Contributing
+ DeveloperPolicy
+ Projects
+ LLVMBuild
+ HowToReleaseLLVM
+ Packaging
+ ReleaseProcess
+ Phabricator
+ BugLifeCycle
+
+:doc:`Contributing`
+ An overview on how to contribute to LLVM.
+
+:doc:`DeveloperPolicy`
+ The LLVM project's policy towards developers and their contributions.
+
+:doc:`Projects`
+ How-to guide and templates for new projects that *use* the LLVM
+ infrastructure. The templates (directory organization, Makefiles, and test
+ tree) allow the project code to be located outside (or inside) the ``llvm/``
+ tree, while using LLVM header files and libraries.
+
+:doc:`LLVMBuild`
+ Describes the LLVMBuild organization and files used by LLVM to specify
+ component descriptions.
+
+:doc:`HowToReleaseLLVM`
+ This is a guide to preparing LLVM releases. Most developers can ignore it.
+
+:doc:`ReleaseProcess`
+ This is a guide to validate a new release, during the release process. Most developers can ignore it.
+
+:doc:`Packaging`
+ Advice on packaging LLVM into a distribution.
+
+:doc:`Phabricator`
+ Describes how to use the Phabricator code review tool hosted on
+ http://reviews.llvm.org/ and its command line interface, Arcanist.
+
+:doc:`BugLifeCycle`
+ Describes how bugs are reported, triaged and closed.
+
+Community
+=========
+
+LLVM has a thriving community of friendly and helpful developers.
+The two primary communication mechanisms in the LLVM community are mailing
+lists and IRC.
+
+Mailing Lists
+-------------
+
+If you can't find what you need in these docs, try consulting the mailing
+lists.
+
+`Developer's List (llvm-dev)`__
+ This list is for people who want to be included in technical discussions of
+ LLVM. People post to this list when they have questions about writing code
+ for or using the LLVM tools. It is relatively low volume.
+
+ .. __: http://lists.llvm.org/mailman/listinfo/llvm-dev
+
+`Commits Archive (llvm-commits)`__
+ This list contains all commit messages that are made when LLVM developers
+ commit code changes to the repository. It also serves as a forum for
+ patch review (i.e. send patches here). It is useful for those who want to
+ stay on the bleeding edge of LLVM development. This list is very high
+ volume.
+
+ .. __: http://lists.llvm.org/pipermail/llvm-commits/
+
+`Bugs & Patches Archive (llvm-bugs)`__
+ This list gets emailed every time a bug is opened and closed. It is
+ higher volume than the LLVM-dev list.
+
+ .. __: http://lists.llvm.org/pipermail/llvm-bugs/
+
+`Test Results Archive (llvm-testresults)`__
+ A message is automatically sent to this list by every active nightly tester
+ when it completes. As such, this list gets email several times each day,
+ making it a high volume list.
+
+ .. __: http://lists.llvm.org/pipermail/llvm-testresults/
+
+`LLVM Announcements List (llvm-announce)`__
+ This is a low volume list that provides important announcements regarding
+ LLVM. It gets email about once a month.
+
+ .. __: http://lists.llvm.org/mailman/listinfo/llvm-announce
+
+IRC
+---
+
+Users and developers of the LLVM project (including subprojects such as Clang)
+can be found in #llvm on `irc.oftc.net <irc://irc.oftc.net/llvm>`_.
+
+This channel has several bots.
+
+* Buildbot reporters
+
+ * llvmbb - Bot for the main LLVM buildbot master.
+ http://lab.llvm.org:8011/console
+ * smooshlab - Apple's internal buildbot master.
+
+* robot - Bugzilla linker. %bug <number>
+
+* clang-bot - A `geordi <http://www.eelis.net/geordi/>`_ instance running
+ near-trunk clang instead of gcc.
+
+Meetups and social events
+-------------------------
+
+.. toctree::
+ :hidden:
+
+ MeetupGuidelines
+
+Besides developer `meetings and conferences <https://llvm.org/devmtg/>`_,
+there are several user groups called
+`LLVM Socials <https://www.meetup.com/pro/llvm/>`_. We greatly encourage you to
+join one in your city. Or start a new one if there is none:
+
+:doc:`MeetupGuidelines`
+
+Community wide proposals
+------------------------
+
+Proposals for massive changes in how the community behaves and how the work flow
+can be better.
+
+.. toctree::
+ :hidden:
+
+ CodeOfConduct
+ Proposals/GitHubMove
+ Proposals/TestSuite
+ Proposals/VariableNames
+ Proposals/VectorizationPlan
+
+:doc:`CodeOfConduct`
+ Proposal to adopt a code of conduct on the LLVM social spaces (lists, events,
+ IRC, etc).
+
+:doc:`Proposals/GitHubMove`
+ Proposal to move from SVN/Git to GitHub.
+
+:doc:`Proposals/TestSuite`
+ Proposals for additional benchmarks/programs for llvm's test-suite.
+
+:doc:`Proposals/VariableNames`
+ Proposal to change the variable names coding standard.
+
+:doc:`Proposals/VectorizationPlan`
+ Proposal to model the process and upgrade the infrastructure of LLVM's Loop Vectorizer.
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`search`
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,323 @@
+=======================================================
+Building a JIT: Starting out with KaleidoscopeJIT
+=======================================================
+
+.. contents::
+ :local:
+
+Chapter 1 Introduction
+======================
+
+**Warning: This tutorial is currently being updated to account for ORC API
+changes. Only Chapters 1 and 2 are up-to-date.**
+
+**Example code from Chapters 3 to 5 will compile and run, but has not been
+updated**
+
+Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
+tutorial runs through the implementation of a JIT compiler using LLVM's
+On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
+KaleidoscopeJIT class used in the
+`Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then
+introduces new features like concurrent compilation, optimization, lazy
+compilation and remote execution.
+
+The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how
+these APIs interact with other parts of LLVM, and to teach you how to recombine
+them to build a custom JIT that is suited to your use-case.
+
+The structure of the tutorial is:
+
+- Chapter #1: Investigate the simple KaleidoscopeJIT class. This will
+ introduce some of the basic concepts of the ORC JIT APIs, including the
+ idea of an ORC *Layer*.
+
+- `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding
+ a new layer that will optimize IR and generated code.
+
+- `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a
+ Compile-On-Demand layer to lazily compile IR.
+
+- `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by
+ replacing the Compile-On-Demand layer with a custom layer that uses the ORC
+ Compile Callbacks API directly to defer IR-generation until functions are
+ called.
+
+- `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into
+ a remote process with reduced privileges using the JIT Remote APIs.
+
+To provide input for our JIT we will use a lightly modified version of the
+Kaleidoscope REPL from `Chapter 7 <LangImpl07.html>`_ of the "Implementing a
+language in LLVM tutorial".
+
+Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API.
+It was preceded by MCJIT, and before that by the (now deleted) legacy JIT.
+These tutorials don't assume any experience with these earlier APIs, but
+readers acquainted with them will see many familiar elements. Where appropriate
+we will make this connection with the earlier APIs explicit to help people who
+are transitioning from them to ORC.
+
+JIT API Basics
+==============
+
+The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed,
+rather than compiling whole programs to disk ahead of time as a traditional
+compiler does. To support that aim our initial, bare-bones JIT API will have
+just two functions:
+
+1. ``Error addModule(std::unique_ptr<Module> M)``: Make the given IR module
+ available for execution.
+2. ``Expected<JITEvaluatedSymbol> lookup()``: Search for pointers to
+ symbols (functions or variables) that have been added to the JIT.
+
+A basic use-case for this API, executing the 'main' function from a module,
+will look like:
+
+.. code-block:: c++
+
+ JIT J;
+ J.addModule(buildModule());
+ auto *Main = (int(*)(int, char*[]))J.lookup("main").getAddress();
+ int Result = Main();
+
+The APIs that we build in these tutorials will all be variations on this simple
+theme. Behind this API we will refine the implementation of the JIT to add
+support for concurrent compilation, optimization and lazy compilation.
+Eventually we will extend the API itself to allow higher-level program
+representations (e.g. ASTs) to be added to the JIT.
+
+KaleidoscopeJIT
+===============
+
+In the previous section we described our API, now we examine a simple
+implementation of it: The KaleidoscopeJIT class [1]_ that was used in the
+`Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use
+the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
+input for our JIT: Each time the user enters an expression the REPL will add a
+new IR module containing the code for that expression to the JIT. If the
+expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
+use the lookup method of our JIT class find and execute the code for the
+expression. In later chapters of this tutorial we will modify the REPL to enable
+new interactions with our JIT class, but for now we will take this setup for
+granted and focus our attention on the implementation of our JIT itself.
+
+Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the
+usual include guards and #includes [2]_, we get to the definition of our class:
+
+.. code-block:: c++
+
+ #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
+ #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
+
+ #include "llvm/ADT/StringRef.h"
+ #include "llvm/ExecutionEngine/JITSymbol.h"
+ #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
+ #include "llvm/ExecutionEngine/Orc/Core.h"
+ #include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"
+ #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
+ #include "llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h"
+ #include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
+ #include "llvm/ExecutionEngine/SectionMemoryManager.h"
+ #include "llvm/IR/DataLayout.h"
+ #include "llvm/IR/LLVMContext.h"
+ #include <memory>
+
+ namespace llvm {
+ namespace orc {
+
+ class KaleidoscopeJIT {
+ private:
+ ExecutionSession ES;
+ RTDyldObjectLinkingLayer ObjectLayer;
+ IRCompileLayer CompileLayer;
+
+ DataLayout DL;
+ MangleAndInterner Mangle;
+ ThreadSafeContext Ctx;
+
+ public:
+ KaleidoscopeJIT(JITTargetMachineBuilder JTMB, DataLayout DL)
+ : ObjectLayer(ES,
+ []() { return llvm::make_unique<SectionMemoryManager>(); }),
+ CompileLayer(ES, ObjectLayer, ConcurrentIRCompiler(std::move(JTMB))),
+ DL(std::move(DL)), Mangle(ES, this->DL),
+ Ctx(llvm::make_unique<LLVMContext>()) {
+ ES.getMainJITDylib().setGenerator(
+ cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(DL)));
+ }
+
+Our class begins with six member variables: An ExecutionSession member, ``ES``,
+which provides context for our running JIT'd code (including the string pool,
+global mutex, and error reporting facilities); An RTDyldObjectLinkingLayer,
+``ObjectLayer``, that can be used to add object files to our JIT (though we will
+not use it directly); An IRCompileLayer, ``CompileLayer``, that can be used to
+add LLVM Modules to our JIT (and which builds on the ObjectLayer), A DataLayout
+and MangleAndInterner, ``DL`` and ``Mangle``, that will be used for symbol mangling
+(more on that later); and finally an LLVMContext that clients will use when
+building IR files for the JIT.
+
+Next up we have our class constructor, which takes a `JITTargetMachineBuilder``
+that will be used by our IRCompiler, and a ``DataLayout`` that we will use to
+initialize our DL member. The constructor begins by initializing our
+ObjectLayer. The ObjectLayer requires a reference to the ExecutionSession, and
+a function object that will build a JIT memory manager for each module that is
+added (a JIT memory manager manages memory allocations, memory permissions, and
+registration of exception handlers for JIT'd code). For this we use a lambda
+that returns a SectionMemoryManager, an off-the-shelf utility that provides all
+the basic memory management functionality required for this chapter. Next we
+initialize our CompileLayer. The CompileLayer needs three things: (1) A
+reference to the ExecutionSession, (2) A reference to our object layer, and (3)
+a compiler instance to use to perform the actual compilation from IR to object
+files. We use the off-the-shelf ConcurrentIRCompiler utility as our compiler,
+which we construct using this constructor's JITTargetMachineBuilder argument.
+The ConcurrentIRCompiler utility will use the JITTargetMachineBuilder to build
+llvm TargetMachines (which are not thread safe) as needed for compiles. After
+this, we initialize our supporting members: ``DL``, ``Mangler`` and ``Ctx`` with
+the input DataLayout, the ExecutionSession and DL member, and a new default
+constucted LLVMContext respectively. Now that our members have been initialized,
+so the one thing that remains to do is to tweak the configuration of the
+*JITDylib* that we will store our code in. We want to modify this dylib to
+contain not only the symbols that we add to it, but also the symbols from our
+REPL process as well. We do this by attaching a
+``DynamicLibrarySearchGenerator`` instance using the
+``DynamicLibrarySearchGenerator::GetForCurrentProcess`` method.
+
+
+.. code-block:: c++
+
+ static Expected<std::unique_ptr<KaleidoscopeJIT>> Create() {
+ auto JTMB = JITTargetMachineBuilder::detectHost();
+
+ if (!JTMB)
+ return JTMB.takeError();
+
+ auto DL = JTMB->getDefaultDataLayoutForTarget();
+ if (!DL)
+ return DL.takeError();
+
+ return llvm::make_unique<KaleidoscopeJIT>(std::move(*JTMB), std::move(*DL));
+ }
+
+ const DataLayout &getDataLayout() const { return DL; }
+
+ LLVMContext &getContext() { return *Ctx.getContext(); }
+
+Next we have a named constructor, ``Create``, which will build a KaleidoscopeJIT
+instance that is configured to generate code for our host process. It does this
+by first generating a JITTargetMachineBuilder instance using that clases's
+detectHost method and then using that instance to generate a datalayout for
+the target process. Each of these operations can fail, so each returns its
+result wrapped in an Expected value [3]_ that we must check for error before
+continuing. If both operations succeed we can unwrap their results (using the
+dereference operator) and pass them into KaleidoscopeJIT's constructor on the
+last line of the function.
+
+Following the named constructor we have the ``getDataLayout()`` and
+``getContext()`` methods. These are used to make data structures created and
+managed by the JIT (especially the LLVMContext) available to the REPL code that
+will build our IR modules.
+
+.. code-block:: c++
+
+ void addModule(std::unique_ptr<Module> M) {
+ cantFail(CompileLayer.add(ES.getMainJITDylib(),
+ ThreadSafeModule(std::move(M), Ctx)));
+ }
+
+ Expected<JITEvaluatedSymbol> lookup(StringRef Name) {
+ return ES.lookup({&ES.getMainJITDylib()}, Mangle(Name.str()));
+ }
+
+Now we come to the first of our JIT API methods: addModule. This method is
+responsible for adding IR to the JIT and making it available for execution. In
+this initial implementation of our JIT we will make our modules "available for
+execution" by adding them to the CompileLayer, which will it turn store the
+Module in the main JITDylib. This process will create new symbol table entries
+in the JITDylib for each definition in the module, and will defer compilation of
+the module until any of its definitions is looked up. Note that this is not lazy
+compilation: just referencing a definition, even if it is never used, will be
+enough to trigger compilation. In later chapters we will teach our JIT to defer
+compilation of functions until they're actually called. To add our Module we
+must first wrap it in a ThreadSafeModule instance, which manages the lifetime of
+the Module's LLVMContext (our Ctx member) in a thread-friendly way. In our
+example, all modules will share the Ctx member, which will exist for the
+duration of the JIT. Once we switch to concurrent compilation in later chapters
+we will use a new context per module.
+
+Our last method is ``lookup``, which allows us to look up addresses for
+function and variable definitions added to the JIT based on their symbol names.
+As noted above, lookup will implicitly trigger compilation for any symbol
+that has not already been compiled. Our lookup method calls through to
+`ExecutionSession::lookup`, passing in a list of dylibs to search (in our case
+just the main dylib), and the symbol name to search for, with a twist: We have
+to *mangle* the name of the symbol we're searching for first. The ORC JIT
+components use mangled symbols internally the same way a static compiler and
+linker would, rather than using plain IR symbol names. This allows JIT'd code
+to interoperate easily with precompiled code in the application or shared
+libraries. The kind of mangling will depend on the DataLayout, which in turn
+depends on the target platform. To allow us to remain portable and search based
+on the un-mangled name, we just re-produce this mangling ourselves using our
+``Mangle`` member function object.
+
+This brings us to the end of Chapter 1 of Building a JIT. You now have a basic
+but fully functioning JIT stack that you can use to take LLVM IR and make it
+executable within the context of your JIT process. In the next chapter we'll
+look at how to extend this JIT to produce better quality code, and in the
+process take a deeper look at the ORC layer concept.
+
+`Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example. To build this
+example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h
+ :language: c++
+
+.. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a
+ simplifying assumption: symbols cannot be re-defined. This will make it
+ impossible to re-define symbols in the REPL, but will make our symbol
+ lookup logic simpler. Re-introducing support for symbol redefinition is
+ left as an exercise for the reader. (The KaleidoscopeJIT.h used in the
+ original tutorials will be a helpful reference).
+
+.. [2] +-----------------------------+-----------------------------------------------+
+ | File | Reason for inclusion |
+ +=============================+===============================================+
+ | JITSymbol.h | Defines the lookup result type |
+ | | JITEvaluatedSymbol |
+ +-----------------------------+-----------------------------------------------+
+ | CompileUtils.h | Provides the SimpleCompiler class. |
+ +-----------------------------+-----------------------------------------------+
+ | Core.h | Core utilities such as ExecutionSession and |
+ | | JITDylib. |
+ +-----------------------------+-----------------------------------------------+
+ | ExecutionUtils.h | Provides the DynamicLibrarySearchGenerator |
+ | | class. |
+ +-----------------------------+-----------------------------------------------+
+ | IRCompileLayer.h | Provides the IRCompileLayer class. |
+ +-----------------------------+-----------------------------------------------+
+ | JITTargetMachineBuilder.h | Provides the JITTargetMachineBuilder class. |
+ +-----------------------------+-----------------------------------------------+
+ | RTDyldObjectLinkingLayer.h | Provides the RTDyldObjectLinkingLayer class. |
+ +-----------------------------+-----------------------------------------------+
+ | SectionMemoryManager.h | Provides the SectionMemoryManager class. |
+ +-----------------------------+-----------------------------------------------+
+ | DataLayout.h | Provides the DataLayout class. |
+ +-----------------------------+-----------------------------------------------+
+ | LLVMContext.h | Provides the LLVMContext class. |
+ +-----------------------------+-----------------------------------------------+
+
+.. [3] See the ErrorHandling section in the LLVM Programmer's Manual
+ (http://llvm.org/docs/ProgrammersManual.html#error-handling)
\ No newline at end of file
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,277 @@
+=====================================================================
+Building a JIT: Adding Optimizations -- An introduction to ORC Layers
+=====================================================================
+
+.. contents::
+ :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 2 Introduction
+======================
+
+**Warning: This tutorial is currently being updated to account for ORC API
+changes. Only Chapters 1 and 2 are up-to-date.**
+
+**Example code from Chapters 3 to 5 will compile and run, but has not been
+updated**
+
+Welcome to Chapter 2 of the "Building an ORC-based JIT in LLVM" tutorial. In
+`Chapter 1 <BuildingAJIT1.html>`_ of this series we examined a basic JIT
+class, KaleidoscopeJIT, that could take LLVM IR modules as input and produce
+executable code in memory. KaleidoscopeJIT was able to do this with relatively
+little code by composing two off-the-shelf *ORC layers*: IRCompileLayer and
+ObjectLinkingLayer, to do much of the heavy lifting.
+
+In this layer we'll learn more about the ORC layer concept by using a new layer,
+IRTransformLayer, to add IR optimization support to KaleidoscopeJIT.
+
+Optimizing Modules using the IRTransformLayer
+=============================================
+
+In `Chapter 4 <LangImpl04.html>`_ of the "Implementing a language with LLVM"
+tutorial series the llvm *FunctionPassManager* is introduced as a means for
+optimizing LLVM IR. Interested readers may read that chapter for details, but
+in short: to optimize a Module we create an llvm::FunctionPassManager
+instance, configure it with a set of optimizations, then run the PassManager on
+a Module to mutate it into a (hopefully) more optimized but semantically
+equivalent form. In the original tutorial series the FunctionPassManager was
+created outside the KaleidoscopeJIT and modules were optimized before being
+added to it. In this Chapter we will make optimization a phase of our JIT
+instead. For now this will provide us a motivation to learn more about ORC
+layers, but in the long term making optimization part of our JIT will yield an
+important benefit: When we begin lazily compiling code (i.e. deferring
+compilation of each function until the first time it's run) having
+optimization managed by our JIT will allow us to optimize lazily too, rather
+than having to do all our optimization up-front.
+
+To add optimization support to our JIT we will take the KaleidoscopeJIT from
+Chapter 1 and compose an ORC *IRTransformLayer* on top. We will look at how the
+IRTransformLayer works in more detail below, but the interface is simple: the
+constructor for this layer takes a reference to the execution session and the
+layer below (as all layers do) plus an *IR optimization function* that it will
+apply to each Module that is added via addModule:
+
+.. code-block:: c++
+
+ class KaleidoscopeJIT {
+ private:
+ ExecutionSession ES;
+ RTDyldObjectLinkingLayer ObjectLayer;
+ IRCompileLayer CompileLayer;
+ IRTransformLayer TransformLayer;
+
+ DataLayout DL;
+ MangleAndInterner Mangle;
+ ThreadSafeContext Ctx;
+
+ public:
+
+ KaleidoscopeJIT(JITTargetMachineBuilder JTMB, DataLayout DL)
+ : ObjectLayer(ES,
+ []() { return llvm::make_unique<SectionMemoryManager>(); }),
+ CompileLayer(ES, ObjectLayer, ConcurrentIRCompiler(std::move(JTMB))),
+ TransformLayer(ES, CompileLayer, optimizeModule),
+ DL(std::move(DL)), Mangle(ES, this->DL),
+ Ctx(llvm::make_unique<LLVMContext>()) {
+ ES.getMainJITDylib().setGenerator(
+ cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(DL)));
+ }
+
+Our extended KaleidoscopeJIT class starts out the same as it did in Chapter 1,
+but after the CompileLayer we introduce a new member, TransformLayer, which sits
+on top of our CompileLayer. We initialize our OptimizeLayer with a reference to
+the ExecutionSession and output layer (standard practice for layers), along with
+a *transform function*. For our transform function we supply our classes
+optimizeModule static method.
+
+.. code-block:: c++
+
+ // ...
+ return cantFail(OptimizeLayer.addModule(std::move(M),
+ std::move(Resolver)));
+ // ...
+
+Next we need to update our addModule method to replace the call to
+``CompileLayer::add`` with a call to ``OptimizeLayer::add`` instead.
+
+.. code-block:: c++
+
+ static Expected<ThreadSafeModule>
+ optimizeModule(ThreadSafeModule M, const MaterializationResponsibility &R) {
+ // Create a function pass manager.
+ auto FPM = llvm::make_unique<legacy::FunctionPassManager>(M.get());
+
+ // Add some optimizations.
+ FPM->add(createInstructionCombiningPass());
+ FPM->add(createReassociatePass());
+ FPM->add(createGVNPass());
+ FPM->add(createCFGSimplificationPass());
+ FPM->doInitialization();
+
+ // Run the optimizations over all functions in the module being added to
+ // the JIT.
+ for (auto &F : *M)
+ FPM->run(F);
+
+ return M;
+ }
+
+At the bottom of our JIT we add a private method to do the actual optimization:
+*optimizeModule*. This function takes the module to be transformed as input (as
+a ThreadSafeModule) along with a reference to a reference to a new class:
+``MaterializationResponsibility``. The MaterializationResponsibility argument
+can be used to query JIT state for the module being transformed, such as the set
+of definitions in the module that JIT'd code is actively trying to call/access.
+For now we will ignore this argument and use a standard optimization
+pipeline. To do this we set up a FunctionPassManager, add some passes to it, run
+it over every function in the module, and then return the mutated module. The
+specific optimizations are the same ones used in `Chapter 4 <LangImpl04.html>`_
+of the "Implementing a language with LLVM" tutorial series. Readers may visit
+that chapter for a more in-depth discussion of these, and of IR optimization in
+general.
+
+And that's it in terms of changes to KaleidoscopeJIT: When a module is added via
+addModule the OptimizeLayer will call our optimizeModule function before passing
+the transformed module on to the CompileLayer below. Of course, we could have
+called optimizeModule directly in our addModule function and not gone to the
+bother of using the IRTransformLayer, but doing so gives us another opportunity
+to see how layers compose. It also provides a neat entry point to the *layer*
+concept itself, because IRTransformLayer is one of the simplest layers that
+can be implemented.
+
+.. code-block:: c++
+
+ // From IRTransformLayer.h:
+ class IRTransformLayer : public IRLayer {
+ public:
+ using TransformFunction = std::function<Expected<ThreadSafeModule>(
+ ThreadSafeModule, const MaterializationResponsibility &R)>;
+
+ IRTransformLayer(ExecutionSession &ES, IRLayer &BaseLayer,
+ TransformFunction Transform = identityTransform);
+
+ void setTransform(TransformFunction Transform) {
+ this->Transform = std::move(Transform);
+ }
+
+ static ThreadSafeModule
+ identityTransform(ThreadSafeModule TSM,
+ const MaterializationResponsibility &R) {
+ return TSM;
+ }
+
+ void emit(MaterializationResponsibility R, ThreadSafeModule TSM) override;
+
+ private:
+ IRLayer &BaseLayer;
+ TransformFunction Transform;
+ };
+
+ // From IRTransfomrLayer.cpp:
+
+ IRTransformLayer::IRTransformLayer(ExecutionSession &ES,
+ IRLayer &BaseLayer,
+ TransformFunction Transform)
+ : IRLayer(ES), BaseLayer(BaseLayer), Transform(std::move(Transform)) {}
+
+ void IRTransformLayer::emit(MaterializationResponsibility R,
+ ThreadSafeModule TSM) {
+ assert(TSM.getModule() && "Module must not be null");
+
+ if (auto TransformedTSM = Transform(std::move(TSM), R))
+ BaseLayer.emit(std::move(R), std::move(*TransformedTSM));
+ else {
+ R.failMaterialization();
+ getExecutionSession().reportError(TransformedTSM.takeError());
+ }
+ }
+
+This is the whole definition of IRTransformLayer, from
+``llvm/include/llvm/ExecutionEngine/Orc/IRTransformLayer.h`` and
+``llvm/lib/ExecutionEngine/Orc/IRTransformLayer.cpp``. This class is concerned
+with two very simple jobs: (1) Running every IR Module that is emitted via this
+layer through the transform function object, and (2) implementing the ORC
+``IRLayer`` interface (which itself conforms to the general ORC Layer concept,
+more on that below). Most of the class is straightforward: a typedef for the
+transform function, a constructor to initialize the members, a setter for the
+transform function value, and a default no-op transform. The most important
+method is ``emit`` as this is half of our IRLayer interface. The emit method
+applies our transform to each module that it is called on and, if the transform
+succeeds, passes the transformed module to the base layer. If the transform
+fails, our emit function calls
+``MaterializationResponsibility::failMaterialization`` (this JIT clients who
+may be waiting on other threads know that the code they were waiting for has
+failed to compile) and logs the error with the execution session before bailing
+out.
+
+The other half of the IRLayer interface we inherit unmodified from the IRLayer
+class:
+
+.. code-block:: c++
+
+ Error IRLayer::add(JITDylib &JD, ThreadSafeModule TSM, VModuleKey K) {
+ return JD.define(llvm::make_unique<BasicIRLayerMaterializationUnit>(
+ *this, std::move(K), std::move(TSM)));
+ }
+
+This code, from ``llvm/lib/ExecutionEngine/Orc/Layer.cpp``, adds a
+ThreadSafeModule to a given JITDylib by wrapping it up in a
+``MaterializationUnit`` (in this case a ``BasicIRLayerMaterializationUnit``).
+Most layers that derived from IRLayer can rely on this default implementation
+of the ``add`` method.
+
+These two operations, ``add`` and ``emit``, together constitute the layer
+concept: A layer is a way to wrap a portion of a compiler pipeline (in this case
+the "opt" phase of an LLVM compiler) whose API is is opaque to ORC in an
+interface that allows ORC to invoke it when needed. The add method takes an
+module in some input program representation (in this case an LLVM IR module) and
+stores it in the target JITDylib, arranging for it to be passed back to the
+Layer's emit method when any symbol defined by that module is requested. Layers
+can compose neatly by calling the 'emit' method of a base layer to complete
+their work. For example, in this tutorial our IRTransformLayer calls through to
+our IRCompileLayer to compile the transformed IR, and our IRCompileLayer in turn
+calls our ObjectLayer to link the object file produced by our compiler.
+
+
+So far we have learned how to optimize and compile our LLVM IR, but we have not
+focused on when compilation happens. Our current REPL is eager: Each function
+definition is optimized and compiled as soon as it is referenced by any other
+code, regardless of whether it is ever called at runtime. In the next chapter we
+will introduce fully lazy compilation, in which functions are not compiled until
+they are first called at run-time. At this point the trade-offs get much more
+interesting: the lazier we are, the quicker we can start executing the first
+function, but the more often we will have to pause to compile newly encountered
+functions. If we only code-gen lazily, but optimize eagerly, we will have a
+longer startup time (as everything is optimized) but relatively short pauses as
+each function just passes through code-gen. If we both optimize and code-gen
+lazily we can start executing the first function more quickly, but we will have
+longer pauses as each function has to be both optimized and code-gen'd when it
+is first executed. Things become even more interesting if we consider
+interproceedural optimizations like inlining, which must be performed eagerly.
+These are complex trade-offs, and there is no one-size-fits all solution to
+them, but by providing composable layers we leave the decisions to the person
+implementing the JIT, and make it easy for them to experiment with different
+configurations.
+
+`Next: Adding Per-function Lazy Compilation <BuildingAJIT3.html>`_
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example with an
+IRTransformLayer added to enable optimization. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter2/KaleidoscopeJIT.h
+ :language: c++
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,191 @@
+=============================================
+Building a JIT: Per-function Lazy Compilation
+=============================================
+
+.. contents::
+ :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 3 Introduction
+======================
+
+**Warning: This text is currently out of date due to ORC API updates.**
+
+**The example code has been updated and can be used. The text will be updated
+once the API churn dies down.**
+
+Welcome to Chapter 3 of the "Building an ORC-based JIT in LLVM" tutorial. This
+chapter discusses lazy JITing and shows you how to enable it by adding an ORC
+CompileOnDemand layer the JIT from `Chapter 2 <BuildingAJIT2.html>`_.
+
+Lazy Compilation
+================
+
+When we add a module to the KaleidoscopeJIT class from Chapter 2 it is
+immediately optimized, compiled and linked for us by the IRTransformLayer,
+IRCompileLayer and RTDyldObjectLinkingLayer respectively. This scheme, where all the
+work to make a Module executable is done up front, is simple to understand and
+its performance characteristics are easy to reason about. However, it will lead
+to very high startup times if the amount of code to be compiled is large, and
+may also do a lot of unnecessary compilation if only a few compiled functions
+are ever called at runtime. A truly "just-in-time" compiler should allow us to
+defer the compilation of any given function until the moment that function is
+first called, improving launch times and eliminating redundant work. In fact,
+the ORC APIs provide us with a layer to lazily compile LLVM IR:
+*CompileOnDemandLayer*.
+
+The CompileOnDemandLayer class conforms to the layer interface described in
+Chapter 2, but its addModule method behaves quite differently from the layers
+we have seen so far: rather than doing any work up front, it just scans the
+Modules being added and arranges for each function in them to be compiled the
+first time it is called. To do this, the CompileOnDemandLayer creates two small
+utilities for each function that it scans: a *stub* and a *compile
+callback*. The stub is a pair of a function pointer (which will be pointed at
+the function's implementation once the function has been compiled) and an
+indirect jump through the pointer. By fixing the address of the indirect jump
+for the lifetime of the program we can give the function a permanent "effective
+address", one that can be safely used for indirection and function pointer
+comparison even if the function's implementation is never compiled, or if it is
+compiled more than once (due to, for example, recompiling the function at a
+higher optimization level) and changes address. The second utility, the compile
+callback, represents a re-entry point from the program into the compiler that
+will trigger compilation and then execution of a function. By initializing the
+function's stub to point at the function's compile callback, we enable lazy
+compilation: The first attempted call to the function will follow the function
+pointer and trigger the compile callback instead. The compile callback will
+compile the function, update the function pointer for the stub, then execute
+the function. On all subsequent calls to the function, the function pointer
+will point at the already-compiled function, so there is no further overhead
+from the compiler. We will look at this process in more detail in the next
+chapter of this tutorial, but for now we'll trust the CompileOnDemandLayer to
+set all the stubs and callbacks up for us. All we need to do is to add the
+CompileOnDemandLayer to the top of our stack and we'll get the benefits of
+lazy compilation. We just need a few changes to the source:
+
+.. code-block:: c++
+
+ ...
+ #include "llvm/ExecutionEngine/SectionMemoryManager.h"
+ #include "llvm/ExecutionEngine/Orc/CompileOnDemandLayer.h"
+ #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
+ ...
+
+ ...
+ class KaleidoscopeJIT {
+ private:
+ std::unique_ptr<TargetMachine> TM;
+ const DataLayout DL;
+ RTDyldObjectLinkingLayer ObjectLayer;
+ IRCompileLayer<decltype(ObjectLayer), SimpleCompiler> CompileLayer;
+
+ using OptimizeFunction =
+ std::function<std::shared_ptr<Module>(std::shared_ptr<Module>)>;
+
+ IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;
+
+ std::unique_ptr<JITCompileCallbackManager> CompileCallbackManager;
+ CompileOnDemandLayer<decltype(OptimizeLayer)> CODLayer;
+
+ public:
+ using ModuleHandle = decltype(CODLayer)::ModuleHandleT;
+
+First we need to include the CompileOnDemandLayer.h header, then add two new
+members: a std::unique_ptr<JITCompileCallbackManager> and a CompileOnDemandLayer,
+to our class. The CompileCallbackManager member is used by the CompileOnDemandLayer
+to create the compile callback needed for each function.
+
+.. code-block:: c++
+
+ KaleidoscopeJIT()
+ : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
+ ObjectLayer([]() { return std::make_shared<SectionMemoryManager>(); }),
+ CompileLayer(ObjectLayer, SimpleCompiler(*TM)),
+ OptimizeLayer(CompileLayer,
+ [this](std::shared_ptr<Module> M) {
+ return optimizeModule(std::move(M));
+ }),
+ CompileCallbackManager(
+ orc::createLocalCompileCallbackManager(TM->getTargetTriple(), 0)),
+ CODLayer(OptimizeLayer,
+ [this](Function &F) { return std::set<Function*>({&F}); },
+ *CompileCallbackManager,
+ orc::createLocalIndirectStubsManagerBuilder(
+ TM->getTargetTriple())) {
+ llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
+ }
+
+Next we have to update our constructor to initialize the new members. To create
+an appropriate compile callback manager we use the
+createLocalCompileCallbackManager function, which takes a TargetMachine and a
+JITTargetAddress to call if it receives a request to compile an unknown
+function. In our simple JIT this situation is unlikely to come up, so we'll
+cheat and just pass '0' here. In a production quality JIT you could give the
+address of a function that throws an exception in order to unwind the JIT'd
+code's stack.
+
+Now we can construct our CompileOnDemandLayer. Following the pattern from
+previous layers we start by passing a reference to the next layer down in our
+stack -- the OptimizeLayer. Next we need to supply a 'partitioning function':
+when a not-yet-compiled function is called, the CompileOnDemandLayer will call
+this function to ask us what we would like to compile. At a minimum we need to
+compile the function being called (given by the argument to the partitioning
+function), but we could also request that the CompileOnDemandLayer compile other
+functions that are unconditionally called (or highly likely to be called) from
+the function being called. For KaleidoscopeJIT we'll keep it simple and just
+request compilation of the function that was called. Next we pass a reference to
+our CompileCallbackManager. Finally, we need to supply an "indirect stubs
+manager builder": a utility function that constructs IndirectStubManagers, which
+are in turn used to build the stubs for the functions in each module. The
+CompileOnDemandLayer will call the indirect stub manager builder once for each
+call to addModule, and use the resulting indirect stubs manager to create
+stubs for all functions in all modules in the set. If/when the module set is
+removed from the JIT the indirect stubs manager will be deleted, freeing any
+memory allocated to the stubs. We supply this function by using the
+createLocalIndirectStubsManagerBuilder utility.
+
+.. code-block:: c++
+
+ // ...
+ if (auto Sym = CODLayer.findSymbol(Name, false))
+ // ...
+ return cantFail(CODLayer.addModule(std::move(Ms),
+ std::move(Resolver)));
+ // ...
+
+ // ...
+ return CODLayer.findSymbol(MangledNameStream.str(), true);
+ // ...
+
+ // ...
+ CODLayer.removeModule(H);
+ // ...
+
+Finally, we need to replace the references to OptimizeLayer in our addModule,
+findSymbol, and removeModule methods. With that, we're up and running.
+
+**To be done:**
+
+** Chapter conclusion.**
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example with a CompileOnDemand
+layer added to enable lazy function-at-a-time compilation. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter3/KaleidoscopeJIT.h
+ :language: c++
+
+`Next: Extreme Laziness -- Using Compile Callbacks to JIT directly from ASTs <BuildingAJIT4.html>`_
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,48 @@
+===========================================================================
+Building a JIT: Extreme Laziness - Using Compile Callbacks to JIT from ASTs
+===========================================================================
+
+.. contents::
+ :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 4 Introduction
+======================
+
+Welcome to Chapter 4 of the "Building an ORC-based JIT in LLVM" tutorial. This
+chapter introduces the Compile Callbacks and Indirect Stubs APIs and shows how
+they can be used to replace the CompileOnDemand layer from
+`Chapter 3 <BuildingAJIT3.html>`_ with a custom lazy-JITing scheme that JITs
+directly from Kaleidoscope ASTs.
+
+**To be done:**
+
+**(1) Describe the drawbacks of JITing from IR (have to compile to IR first,
+which reduces the benefits of laziness).**
+
+**(2) Describe CompileCallbackManagers and IndirectStubManagers in detail.**
+
+**(3) Run through the implementation of addFunctionAST.**
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example that JITs lazily from
+Kaleidoscope ASTS. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter4/KaleidoscopeJIT.h
+ :language: c++
+
+`Next: Remote-JITing -- Process-isolation and laziness-at-a-distance <BuildingAJIT5.html>`_
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT5.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT5.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT5.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/BuildingAJIT5.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,57 @@
+=============================================================================
+Building a JIT: Remote-JITing -- Process Isolation and Laziness at a Distance
+=============================================================================
+
+.. contents::
+ :local:
+
+**This tutorial is under active development. It is incomplete and details may
+change frequently.** Nonetheless we invite you to try it out as it stands, and
+we welcome any feedback.
+
+Chapter 5 Introduction
+======================
+
+Welcome to Chapter 5 of the "Building an ORC-based JIT in LLVM" tutorial. This
+chapter introduces the ORC RemoteJIT Client/Server APIs and shows how to use
+them to build a JIT stack that will execute its code via a communications
+channel with a different process. This can be a separate process on the same
+machine, a process on a different machine, or even a process on a different
+platform/architecture. The code builds on top of the lazy-AST-compiling JIT
+stack from `Chapter 4 <BuildingAJIT3.html>`_.
+
+**To be done -- this is going to be a long one:**
+
+**(1) Introduce channels, RPC, RemoteJIT Client and Server APIs**
+
+**(2) Describe the client code in greater detail. Discuss modifications of the
+KaleidoscopeJIT class, and the REPL itself.**
+
+**(3) Describe the server code.**
+
+**(4) Describe how to run the demo.**
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example that JITs lazily from
+Kaleidoscope ASTS. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
+ clang++ -g Server/server.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy-server
+ # Run
+ ./toy-server &
+ ./toy
+
+Here is the code for the modified KaleidoscopeJIT:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter5/KaleidoscopeJIT.h
+ :language: c++
+
+And the code for the JIT server:
+
+.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter5/Server/server.cpp
+ :language: c++
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl01.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl01.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl01.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl01.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,293 @@
+=================================================
+Kaleidoscope: Tutorial Introduction and the Lexer
+=================================================
+
+.. contents::
+ :local:
+
+Tutorial Introduction
+=====================
+
+Welcome to the "Implementing a language with LLVM" tutorial. This
+tutorial runs through the implementation of a simple language, showing
+how fun and easy it can be. This tutorial will get you up and started as
+well as help to build a framework you can extend to other languages. The
+code in this tutorial can also be used as a playground to hack on other
+LLVM specific things.
+
+The goal of this tutorial is to progressively unveil our language,
+describing how it is built up over time. This will let us cover a fairly
+broad range of language design and LLVM-specific usage issues, showing
+and explaining the code for it all along the way, without overwhelming
+you with tons of details up front.
+
+It is useful to point out ahead of time that this tutorial is really
+about teaching compiler techniques and LLVM specifically, *not* about
+teaching modern and sane software engineering principles. In practice,
+this means that we'll take a number of shortcuts to simplify the
+exposition. For example, the code uses global variables
+all over the place, doesn't use nice design patterns like
+`visitors <http://en.wikipedia.org/wiki/Visitor_pattern>`_, etc... but
+it is very simple. If you dig in and use the code as a basis for future
+projects, fixing these deficiencies shouldn't be hard.
+
+I've tried to put this tutorial together in a way that makes chapters
+easy to skip over if you are already familiar with or are uninterested
+in the various pieces. The structure of the tutorial is:
+
+- `Chapter #1 <#language>`_: Introduction to the Kaleidoscope
+ language, and the definition of its Lexer - This shows where we are
+ going and the basic functionality that we want it to do. In order to
+ make this tutorial maximally understandable and hackable, we choose
+ to implement everything in C++ instead of using lexer and parser
+ generators. LLVM obviously works just fine with such tools, feel free
+ to use one if you prefer.
+- `Chapter #2 <LangImpl02.html>`_: Implementing a Parser and AST -
+ With the lexer in place, we can talk about parsing techniques and
+ basic AST construction. This tutorial describes recursive descent
+ parsing and operator precedence parsing. Nothing in Chapters 1 or 2
+ is LLVM-specific, the code doesn't even link in LLVM at this point.
+ :)
+- `Chapter #3 <LangImpl03.html>`_: Code generation to LLVM IR - With
+ the AST ready, we can show off how easy generation of LLVM IR really
+ is.
+- `Chapter #4 <LangImpl04.html>`_: Adding JIT and Optimizer Support
+ - Because a lot of people are interested in using LLVM as a JIT,
+ we'll dive right into it and show you the 3 lines it takes to add JIT
+ support. LLVM is also useful in many other ways, but this is one
+ simple and "sexy" way to show off its power. :)
+- `Chapter #5 <LangImpl05.html>`_: Extending the Language: Control
+ Flow - With the language up and running, we show how to extend it
+ with control flow operations (if/then/else and a 'for' loop). This
+ gives us a chance to talk about simple SSA construction and control
+ flow.
+- `Chapter #6 <LangImpl06.html>`_: Extending the Language:
+ User-defined Operators - This is a silly but fun chapter that talks
+ about extending the language to let the user program define their own
+ arbitrary unary and binary operators (with assignable precedence!).
+ This lets us build a significant piece of the "language" as library
+ routines.
+- `Chapter #7 <LangImpl07.html>`_: Extending the Language: Mutable
+ Variables - This chapter talks about adding user-defined local
+ variables along with an assignment operator. The interesting part
+ about this is how easy and trivial it is to construct SSA form in
+ LLVM: no, LLVM does *not* require your front-end to construct SSA
+ form!
+- `Chapter #8 <LangImpl08.html>`_: Compiling to Object Files - This
+ chapter explains how to take LLVM IR and compile it down to object
+ files.
+- `Chapter #9 <LangImpl09.html>`_: Extending the Language: Debug
+ Information - Having built a decent little programming language with
+ control flow, functions and mutable variables, we consider what it
+ takes to add debug information to standalone executables. This debug
+ information will allow you to set breakpoints in Kaleidoscope
+ functions, print out argument variables, and call functions - all
+ from within the debugger!
+- `Chapter #10 <LangImpl10.html>`_: Conclusion and other useful LLVM
+ tidbits - This chapter wraps up the series by talking about
+ potential ways to extend the language, but also includes a bunch of
+ pointers to info about "special topics" like adding garbage
+ collection support, exceptions, debugging, support for "spaghetti
+ stacks", and a bunch of other tips and tricks.
+
+By the end of the tutorial, we'll have written a bit less than 1000 lines
+of non-comment, non-blank, lines of code. With this small amount of
+code, we'll have built up a very reasonable compiler for a non-trivial
+language including a hand-written lexer, parser, AST, as well as code
+generation support with a JIT compiler. While other systems may have
+interesting "hello world" tutorials, I think the breadth of this
+tutorial is a great testament to the strengths of LLVM and why you
+should consider it if you're interested in language or compiler design.
+
+A note about this tutorial: we expect you to extend the language and
+play with it on your own. Take the code and go crazy hacking away at it,
+compilers don't need to be scary creatures - it can be a lot of fun to
+play with languages!
+
+The Basic Language
+==================
+
+This tutorial will be illustrated with a toy language that we'll call
+"`Kaleidoscope <http://en.wikipedia.org/wiki/Kaleidoscope>`_" (derived
+from "meaning beautiful, form, and view"). Kaleidoscope is a procedural
+language that allows you to define functions, use conditionals, math,
+etc. Over the course of the tutorial, we'll extend Kaleidoscope to
+support the if/then/else construct, a for loop, user defined operators,
+JIT compilation with a simple command line interface, etc.
+
+Because we want to keep things simple, the only datatype in Kaleidoscope
+is a 64-bit floating point type (aka 'double' in C parlance). As such,
+all values are implicitly double precision and the language doesn't
+require type declarations. This gives the language a very nice and
+simple syntax. For example, the following simple example computes
+`Fibonacci numbers: <http://en.wikipedia.org/wiki/Fibonacci_number>`_
+
+::
+
+ # Compute the x'th fibonacci number.
+ def fib(x)
+ if x < 3 then
+ 1
+ else
+ fib(x-1)+fib(x-2)
+
+ # This expression will compute the 40th number.
+ fib(40)
+
+We also allow Kaleidoscope to call into standard library functions (the
+LLVM JIT makes this completely trivial). This means that you can use the
+'extern' keyword to define a function before you use it (this is also
+useful for mutually recursive functions). For example:
+
+::
+
+ extern sin(arg);
+ extern cos(arg);
+ extern atan2(arg1 arg2);
+
+ atan2(sin(.4), cos(42))
+
+A more interesting example is included in Chapter 6 where we write a
+little Kaleidoscope application that `displays a Mandelbrot
+Set <LangImpl06.html#kicking-the-tires>`_ at various levels of magnification.
+
+Lets dive into the implementation of this language!
+
+The Lexer
+=========
+
+When it comes to implementing a language, the first thing needed is the
+ability to process a text file and recognize what it says. The
+traditional way to do this is to use a
+"`lexer <http://en.wikipedia.org/wiki/Lexical_analysis>`_" (aka
+'scanner') to break the input up into "tokens". Each token returned by
+the lexer includes a token code and potentially some metadata (e.g. the
+numeric value of a number). First, we define the possibilities:
+
+.. code-block:: c++
+
+ // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
+ // of these for known things.
+ enum Token {
+ tok_eof = -1,
+
+ // commands
+ tok_def = -2,
+ tok_extern = -3,
+
+ // primary
+ tok_identifier = -4,
+ tok_number = -5,
+ };
+
+ static std::string IdentifierStr; // Filled in if tok_identifier
+ static double NumVal; // Filled in if tok_number
+
+Each token returned by our lexer will either be one of the Token enum
+values or it will be an 'unknown' character like '+', which is returned
+as its ASCII value. If the current token is an identifier, the
+``IdentifierStr`` global variable holds the name of the identifier. If
+the current token is a numeric literal (like 1.0), ``NumVal`` holds its
+value. Note that we use global variables for simplicity, this is not the
+best choice for a real language implementation :).
+
+The actual implementation of the lexer is a single function named
+``gettok``. The ``gettok`` function is called to return the next token
+from standard input. Its definition starts as:
+
+.. code-block:: c++
+
+ /// gettok - Return the next token from standard input.
+ static int gettok() {
+ static int LastChar = ' ';
+
+ // Skip any whitespace.
+ while (isspace(LastChar))
+ LastChar = getchar();
+
+``gettok`` works by calling the C ``getchar()`` function to read
+characters one at a time from standard input. It eats them as it
+recognizes them and stores the last character read, but not processed,
+in LastChar. The first thing that it has to do is ignore whitespace
+between tokens. This is accomplished with the loop above.
+
+The next thing ``gettok`` needs to do is recognize identifiers and
+specific keywords like "def". Kaleidoscope does this with this simple
+loop:
+
+.. code-block:: c++
+
+ if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
+ IdentifierStr = LastChar;
+ while (isalnum((LastChar = getchar())))
+ IdentifierStr += LastChar;
+
+ if (IdentifierStr == "def")
+ return tok_def;
+ if (IdentifierStr == "extern")
+ return tok_extern;
+ return tok_identifier;
+ }
+
+Note that this code sets the '``IdentifierStr``' global whenever it
+lexes an identifier. Also, since language keywords are matched by the
+same loop, we handle them here inline. Numeric values are similar:
+
+.. code-block:: c++
+
+ if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
+ std::string NumStr;
+ do {
+ NumStr += LastChar;
+ LastChar = getchar();
+ } while (isdigit(LastChar) || LastChar == '.');
+
+ NumVal = strtod(NumStr.c_str(), 0);
+ return tok_number;
+ }
+
+This is all pretty straight-forward code for processing input. When
+reading a numeric value from input, we use the C ``strtod`` function to
+convert it to a numeric value that we store in ``NumVal``. Note that
+this isn't doing sufficient error checking: it will incorrectly read
+"1.23.45.67" and handle it as if you typed in "1.23". Feel free to
+extend it :). Next we handle comments:
+
+.. code-block:: c++
+
+ if (LastChar == '#') {
+ // Comment until end of line.
+ do
+ LastChar = getchar();
+ while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
+
+ if (LastChar != EOF)
+ return gettok();
+ }
+
+We handle comments by skipping to the end of the line and then return
+the next token. Finally, if the input doesn't match one of the above
+cases, it is either an operator character like '+' or the end of the
+file. These are handled with this code:
+
+.. code-block:: c++
+
+ // Check for end of file. Don't eat the EOF.
+ if (LastChar == EOF)
+ return tok_eof;
+
+ // Otherwise, just return the character as its ascii value.
+ int ThisChar = LastChar;
+ LastChar = getchar();
+ return ThisChar;
+ }
+
+With this, we have the complete lexer for the basic Kaleidoscope
+language (the `full code listing <LangImpl02.html#full-code-listing>`_ for the Lexer
+is available in the `next chapter <LangImpl02.html>`_ of the tutorial).
+Next we'll `build a simple parser that uses this to build an Abstract
+Syntax Tree <LangImpl02.html>`_. When we have that, we'll include a
+driver so that you can use the lexer and parser together.
+
+`Next: Implementing a Parser and AST <LangImpl02.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl02.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl02.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl02.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl02.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,737 @@
+===========================================
+Kaleidoscope: Implementing a Parser and AST
+===========================================
+
+.. contents::
+ :local:
+
+Chapter 2 Introduction
+======================
+
+Welcome to Chapter 2 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. This chapter shows you how to use the
+lexer, built in `Chapter 1 <LangImpl01.html>`_, to build a full
+`parser <http://en.wikipedia.org/wiki/Parsing>`_ for our Kaleidoscope
+language. Once we have a parser, we'll define and build an `Abstract
+Syntax Tree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`_ (AST).
+
+The parser we will build uses a combination of `Recursive Descent
+Parsing <http://en.wikipedia.org/wiki/Recursive_descent_parser>`_ and
+`Operator-Precedence
+Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_ to
+parse the Kaleidoscope language (the latter for binary expressions and
+the former for everything else). Before we get to parsing though, let's
+talk about the output of the parser: the Abstract Syntax Tree.
+
+The Abstract Syntax Tree (AST)
+==============================
+
+The AST for a program captures its behavior in such a way that it is
+easy for later stages of the compiler (e.g. code generation) to
+interpret. We basically want one object for each construct in the
+language, and the AST should closely model the language. In
+Kaleidoscope, we have expressions, a prototype, and a function object.
+We'll start with expressions first:
+
+.. code-block:: c++
+
+ /// ExprAST - Base class for all expression nodes.
+ class ExprAST {
+ public:
+ virtual ~ExprAST() {}
+ };
+
+ /// NumberExprAST - Expression class for numeric literals like "1.0".
+ class NumberExprAST : public ExprAST {
+ double Val;
+
+ public:
+ NumberExprAST(double Val) : Val(Val) {}
+ };
+
+The code above shows the definition of the base ExprAST class and one
+subclass which we use for numeric literals. The important thing to note
+about this code is that the NumberExprAST class captures the numeric
+value of the literal as an instance variable. This allows later phases
+of the compiler to know what the stored numeric value is.
+
+Right now we only create the AST, so there are no useful accessor
+methods on them. It would be very easy to add a virtual method to pretty
+print the code, for example. Here are the other expression AST node
+definitions that we'll use in the basic form of the Kaleidoscope
+language:
+
+.. code-block:: c++
+
+ /// VariableExprAST - Expression class for referencing a variable, like "a".
+ class VariableExprAST : public ExprAST {
+ std::string Name;
+
+ public:
+ VariableExprAST(const std::string &Name) : Name(Name) {}
+ };
+
+ /// BinaryExprAST - Expression class for a binary operator.
+ class BinaryExprAST : public ExprAST {
+ char Op;
+ std::unique_ptr<ExprAST> LHS, RHS;
+
+ public:
+ BinaryExprAST(char op, std::unique_ptr<ExprAST> LHS,
+ std::unique_ptr<ExprAST> RHS)
+ : Op(op), LHS(std::move(LHS)), RHS(std::move(RHS)) {}
+ };
+
+ /// CallExprAST - Expression class for function calls.
+ class CallExprAST : public ExprAST {
+ std::string Callee;
+ std::vector<std::unique_ptr<ExprAST>> Args;
+
+ public:
+ CallExprAST(const std::string &Callee,
+ std::vector<std::unique_ptr<ExprAST>> Args)
+ : Callee(Callee), Args(std::move(Args)) {}
+ };
+
+This is all (intentionally) rather straight-forward: variables capture
+the variable name, binary operators capture their opcode (e.g. '+'), and
+calls capture a function name as well as a list of any argument
+expressions. One thing that is nice about our AST is that it captures
+the language features without talking about the syntax of the language.
+Note that there is no discussion about precedence of binary operators,
+lexical structure, etc.
+
+For our basic language, these are all of the expression nodes we'll
+define. Because it doesn't have conditional control flow, it isn't
+Turing-complete; we'll fix that in a later installment. The two things
+we need next are a way to talk about the interface to a function, and a
+way to talk about functions themselves:
+
+.. code-block:: c++
+
+ /// PrototypeAST - This class represents the "prototype" for a function,
+ /// which captures its name, and its argument names (thus implicitly the number
+ /// of arguments the function takes).
+ class PrototypeAST {
+ std::string Name;
+ std::vector<std::string> Args;
+
+ public:
+ PrototypeAST(const std::string &name, std::vector<std::string> Args)
+ : Name(name), Args(std::move(Args)) {}
+
+ const std::string &getName() const { return Name; }
+ };
+
+ /// FunctionAST - This class represents a function definition itself.
+ class FunctionAST {
+ std::unique_ptr<PrototypeAST> Proto;
+ std::unique_ptr<ExprAST> Body;
+
+ public:
+ FunctionAST(std::unique_ptr<PrototypeAST> Proto,
+ std::unique_ptr<ExprAST> Body)
+ : Proto(std::move(Proto)), Body(std::move(Body)) {}
+ };
+
+In Kaleidoscope, functions are typed with just a count of their
+arguments. Since all values are double precision floating point, the
+type of each argument doesn't need to be stored anywhere. In a more
+aggressive and realistic language, the "ExprAST" class would probably
+have a type field.
+
+With this scaffolding, we can now talk about parsing expressions and
+function bodies in Kaleidoscope.
+
+Parser Basics
+=============
+
+Now that we have an AST to build, we need to define the parser code to
+build it. The idea here is that we want to parse something like "x+y"
+(which is returned as three tokens by the lexer) into an AST that could
+be generated with calls like this:
+
+.. code-block:: c++
+
+ auto LHS = llvm::make_unique<VariableExprAST>("x");
+ auto RHS = llvm::make_unique<VariableExprAST>("y");
+ auto Result = std::make_unique<BinaryExprAST>('+', std::move(LHS),
+ std::move(RHS));
+
+In order to do this, we'll start by defining some basic helper routines:
+
+.. code-block:: c++
+
+ /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
+ /// token the parser is looking at. getNextToken reads another token from the
+ /// lexer and updates CurTok with its results.
+ static int CurTok;
+ static int getNextToken() {
+ return CurTok = gettok();
+ }
+
+This implements a simple token buffer around the lexer. This allows us
+to look one token ahead at what the lexer is returning. Every function
+in our parser will assume that CurTok is the current token that needs to
+be parsed.
+
+.. code-block:: c++
+
+
+ /// LogError* - These are little helper functions for error handling.
+ std::unique_ptr<ExprAST> LogError(const char *Str) {
+ fprintf(stderr, "LogError: %s\n", Str);
+ return nullptr;
+ }
+ std::unique_ptr<PrototypeAST> LogErrorP(const char *Str) {
+ LogError(Str);
+ return nullptr;
+ }
+
+The ``LogError`` routines are simple helper routines that our parser will
+use to handle errors. The error recovery in our parser will not be the
+best and is not particular user-friendly, but it will be enough for our
+tutorial. These routines make it easier to handle errors in routines
+that have various return types: they always return null.
+
+With these basic helper functions, we can implement the first piece of
+our grammar: numeric literals.
+
+Basic Expression Parsing
+========================
+
+We start with numeric literals, because they are the simplest to
+process. For each production in our grammar, we'll define a function
+which parses that production. For numeric literals, we have:
+
+.. code-block:: c++
+
+ /// numberexpr ::= number
+ static std::unique_ptr<ExprAST> ParseNumberExpr() {
+ auto Result = llvm::make_unique<NumberExprAST>(NumVal);
+ getNextToken(); // consume the number
+ return std::move(Result);
+ }
+
+This routine is very simple: it expects to be called when the current
+token is a ``tok_number`` token. It takes the current number value,
+creates a ``NumberExprAST`` node, advances the lexer to the next token,
+and finally returns.
+
+There are some interesting aspects to this. The most important one is
+that this routine eats all of the tokens that correspond to the
+production and returns the lexer buffer with the next token (which is
+not part of the grammar production) ready to go. This is a fairly
+standard way to go for recursive descent parsers. For a better example,
+the parenthesis operator is defined like this:
+
+.. code-block:: c++
+
+ /// parenexpr ::= '(' expression ')'
+ static std::unique_ptr<ExprAST> ParseParenExpr() {
+ getNextToken(); // eat (.
+ auto V = ParseExpression();
+ if (!V)
+ return nullptr;
+
+ if (CurTok != ')')
+ return LogError("expected ')'");
+ getNextToken(); // eat ).
+ return V;
+ }
+
+This function illustrates a number of interesting things about the
+parser:
+
+1) It shows how we use the LogError routines. When called, this function
+expects that the current token is a '(' token, but after parsing the
+subexpression, it is possible that there is no ')' waiting. For example,
+if the user types in "(4 x" instead of "(4)", the parser should emit an
+error. Because errors can occur, the parser needs a way to indicate that
+they happened: in our parser, we return null on an error.
+
+2) Another interesting aspect of this function is that it uses recursion
+by calling ``ParseExpression`` (we will soon see that
+``ParseExpression`` can call ``ParseParenExpr``). This is powerful
+because it allows us to handle recursive grammars, and keeps each
+production very simple. Note that parentheses do not cause construction
+of AST nodes themselves. While we could do it this way, the most
+important role of parentheses are to guide the parser and provide
+grouping. Once the parser constructs the AST, parentheses are not
+needed.
+
+The next simple production is for handling variable references and
+function calls:
+
+.. code-block:: c++
+
+ /// identifierexpr
+ /// ::= identifier
+ /// ::= identifier '(' expression* ')'
+ static std::unique_ptr<ExprAST> ParseIdentifierExpr() {
+ std::string IdName = IdentifierStr;
+
+ getNextToken(); // eat identifier.
+
+ if (CurTok != '(') // Simple variable ref.
+ return llvm::make_unique<VariableExprAST>(IdName);
+
+ // Call.
+ getNextToken(); // eat (
+ std::vector<std::unique_ptr<ExprAST>> Args;
+ if (CurTok != ')') {
+ while (1) {
+ if (auto Arg = ParseExpression())
+ Args.push_back(std::move(Arg));
+ else
+ return nullptr;
+
+ if (CurTok == ')')
+ break;
+
+ if (CurTok != ',')
+ return LogError("Expected ')' or ',' in argument list");
+ getNextToken();
+ }
+ }
+
+ // Eat the ')'.
+ getNextToken();
+
+ return llvm::make_unique<CallExprAST>(IdName, std::move(Args));
+ }
+
+This routine follows the same style as the other routines. (It expects
+to be called if the current token is a ``tok_identifier`` token). It
+also has recursion and error handling. One interesting aspect of this is
+that it uses *look-ahead* to determine if the current identifier is a
+stand alone variable reference or if it is a function call expression.
+It handles this by checking to see if the token after the identifier is
+a '(' token, constructing either a ``VariableExprAST`` or
+``CallExprAST`` node as appropriate.
+
+Now that we have all of our simple expression-parsing logic in place, we
+can define a helper function to wrap it together into one entry point.
+We call this class of expressions "primary" expressions, for reasons
+that will become more clear `later in the
+tutorial <LangImpl6.html#user-defined-unary-operators>`_. In order to parse an arbitrary
+primary expression, we need to determine what sort of expression it is:
+
+.. code-block:: c++
+
+ /// primary
+ /// ::= identifierexpr
+ /// ::= numberexpr
+ /// ::= parenexpr
+ static std::unique_ptr<ExprAST> ParsePrimary() {
+ switch (CurTok) {
+ default:
+ return LogError("unknown token when expecting an expression");
+ case tok_identifier:
+ return ParseIdentifierExpr();
+ case tok_number:
+ return ParseNumberExpr();
+ case '(':
+ return ParseParenExpr();
+ }
+ }
+
+Now that you see the definition of this function, it is more obvious why
+we can assume the state of CurTok in the various functions. This uses
+look-ahead to determine which sort of expression is being inspected, and
+then parses it with a function call.
+
+Now that basic expressions are handled, we need to handle binary
+expressions. They are a bit more complex.
+
+Binary Expression Parsing
+=========================
+
+Binary expressions are significantly harder to parse because they are
+often ambiguous. For example, when given the string "x+y\*z", the parser
+can choose to parse it as either "(x+y)\*z" or "x+(y\*z)". With common
+definitions from mathematics, we expect the later parse, because "\*"
+(multiplication) has higher *precedence* than "+" (addition).
+
+There are many ways to handle this, but an elegant and efficient way is
+to use `Operator-Precedence
+Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_.
+This parsing technique uses the precedence of binary operators to guide
+recursion. To start with, we need a table of precedences:
+
+.. code-block:: c++
+
+ /// BinopPrecedence - This holds the precedence for each binary operator that is
+ /// defined.
+ static std::map<char, int> BinopPrecedence;
+
+ /// GetTokPrecedence - Get the precedence of the pending binary operator token.
+ static int GetTokPrecedence() {
+ if (!isascii(CurTok))
+ return -1;
+
+ // Make sure it's a declared binop.
+ int TokPrec = BinopPrecedence[CurTok];
+ if (TokPrec <= 0) return -1;
+ return TokPrec;
+ }
+
+ int main() {
+ // Install standard binary operators.
+ // 1 is lowest precedence.
+ BinopPrecedence['<'] = 10;
+ BinopPrecedence['+'] = 20;
+ BinopPrecedence['-'] = 20;
+ BinopPrecedence['*'] = 40; // highest.
+ ...
+ }
+
+For the basic form of Kaleidoscope, we will only support 4 binary
+operators (this can obviously be extended by you, our brave and intrepid
+reader). The ``GetTokPrecedence`` function returns the precedence for
+the current token, or -1 if the token is not a binary operator. Having a
+map makes it easy to add new operators and makes it clear that the
+algorithm doesn't depend on the specific operators involved, but it
+would be easy enough to eliminate the map and do the comparisons in the
+``GetTokPrecedence`` function. (Or just use a fixed-size array).
+
+With the helper above defined, we can now start parsing binary
+expressions. The basic idea of operator precedence parsing is to break
+down an expression with potentially ambiguous binary operators into
+pieces. Consider, for example, the expression "a+b+(c+d)\*e\*f+g".
+Operator precedence parsing considers this as a stream of primary
+expressions separated by binary operators. As such, it will first parse
+the leading primary expression "a", then it will see the pairs [+, b]
+[+, (c+d)] [\*, e] [\*, f] and [+, g]. Note that because parentheses are
+primary expressions, the binary expression parser doesn't need to worry
+about nested subexpressions like (c+d) at all.
+
+To start, an expression is a primary expression potentially followed by
+a sequence of [binop,primaryexpr] pairs:
+
+.. code-block:: c++
+
+ /// expression
+ /// ::= primary binoprhs
+ ///
+ static std::unique_ptr<ExprAST> ParseExpression() {
+ auto LHS = ParsePrimary();
+ if (!LHS)
+ return nullptr;
+
+ return ParseBinOpRHS(0, std::move(LHS));
+ }
+
+``ParseBinOpRHS`` is the function that parses the sequence of pairs for
+us. It takes a precedence and a pointer to an expression for the part
+that has been parsed so far. Note that "x" is a perfectly valid
+expression: As such, "binoprhs" is allowed to be empty, in which case it
+returns the expression that is passed into it. In our example above, the
+code passes the expression for "a" into ``ParseBinOpRHS`` and the
+current token is "+".
+
+The precedence value passed into ``ParseBinOpRHS`` indicates the
+*minimal operator precedence* that the function is allowed to eat. For
+example, if the current pair stream is [+, x] and ``ParseBinOpRHS`` is
+passed in a precedence of 40, it will not consume any tokens (because
+the precedence of '+' is only 20). With this in mind, ``ParseBinOpRHS``
+starts with:
+
+.. code-block:: c++
+
+ /// binoprhs
+ /// ::= ('+' primary)*
+ static std::unique_ptr<ExprAST> ParseBinOpRHS(int ExprPrec,
+ std::unique_ptr<ExprAST> LHS) {
+ // If this is a binop, find its precedence.
+ while (1) {
+ int TokPrec = GetTokPrecedence();
+
+ // If this is a binop that binds at least as tightly as the current binop,
+ // consume it, otherwise we are done.
+ if (TokPrec < ExprPrec)
+ return LHS;
+
+This code gets the precedence of the current token and checks to see if
+if is too low. Because we defined invalid tokens to have a precedence of
+-1, this check implicitly knows that the pair-stream ends when the token
+stream runs out of binary operators. If this check succeeds, we know
+that the token is a binary operator and that it will be included in this
+expression:
+
+.. code-block:: c++
+
+ // Okay, we know this is a binop.
+ int BinOp = CurTok;
+ getNextToken(); // eat binop
+
+ // Parse the primary expression after the binary operator.
+ auto RHS = ParsePrimary();
+ if (!RHS)
+ return nullptr;
+
+As such, this code eats (and remembers) the binary operator and then
+parses the primary expression that follows. This builds up the whole
+pair, the first of which is [+, b] for the running example.
+
+Now that we parsed the left-hand side of an expression and one pair of
+the RHS sequence, we have to decide which way the expression associates.
+In particular, we could have "(a+b) binop unparsed" or "a + (b binop
+unparsed)". To determine this, we look ahead at "binop" to determine its
+precedence and compare it to BinOp's precedence (which is '+' in this
+case):
+
+.. code-block:: c++
+
+ // If BinOp binds less tightly with RHS than the operator after RHS, let
+ // the pending operator take RHS as its LHS.
+ int NextPrec = GetTokPrecedence();
+ if (TokPrec < NextPrec) {
+
+If the precedence of the binop to the right of "RHS" is lower or equal
+to the precedence of our current operator, then we know that the
+parentheses associate as "(a+b) binop ...". In our example, the current
+operator is "+" and the next operator is "+", we know that they have the
+same precedence. In this case we'll create the AST node for "a+b", and
+then continue parsing:
+
+.. code-block:: c++
+
+ ... if body omitted ...
+ }
+
+ // Merge LHS/RHS.
+ LHS = llvm::make_unique<BinaryExprAST>(BinOp, std::move(LHS),
+ std::move(RHS));
+ } // loop around to the top of the while loop.
+ }
+
+In our example above, this will turn "a+b+" into "(a+b)" and execute the
+next iteration of the loop, with "+" as the current token. The code
+above will eat, remember, and parse "(c+d)" as the primary expression,
+which makes the current pair equal to [+, (c+d)]. It will then evaluate
+the 'if' conditional above with "\*" as the binop to the right of the
+primary. In this case, the precedence of "\*" is higher than the
+precedence of "+" so the if condition will be entered.
+
+The critical question left here is "how can the if condition parse the
+right hand side in full"? In particular, to build the AST correctly for
+our example, it needs to get all of "(c+d)\*e\*f" as the RHS expression
+variable. The code to do this is surprisingly simple (code from the
+above two blocks duplicated for context):
+
+.. code-block:: c++
+
+ // If BinOp binds less tightly with RHS than the operator after RHS, let
+ // the pending operator take RHS as its LHS.
+ int NextPrec = GetTokPrecedence();
+ if (TokPrec < NextPrec) {
+ RHS = ParseBinOpRHS(TokPrec+1, std::move(RHS));
+ if (!RHS)
+ return nullptr;
+ }
+ // Merge LHS/RHS.
+ LHS = llvm::make_unique<BinaryExprAST>(BinOp, std::move(LHS),
+ std::move(RHS));
+ } // loop around to the top of the while loop.
+ }
+
+At this point, we know that the binary operator to the RHS of our
+primary has higher precedence than the binop we are currently parsing.
+As such, we know that any sequence of pairs whose operators are all
+higher precedence than "+" should be parsed together and returned as
+"RHS". To do this, we recursively invoke the ``ParseBinOpRHS`` function
+specifying "TokPrec+1" as the minimum precedence required for it to
+continue. In our example above, this will cause it to return the AST
+node for "(c+d)\*e\*f" as RHS, which is then set as the RHS of the '+'
+expression.
+
+Finally, on the next iteration of the while loop, the "+g" piece is
+parsed and added to the AST. With this little bit of code (14
+non-trivial lines), we correctly handle fully general binary expression
+parsing in a very elegant way. This was a whirlwind tour of this code,
+and it is somewhat subtle. I recommend running through it with a few
+tough examples to see how it works.
+
+This wraps up handling of expressions. At this point, we can point the
+parser at an arbitrary token stream and build an expression from it,
+stopping at the first token that is not part of the expression. Next up
+we need to handle function definitions, etc.
+
+Parsing the Rest
+================
+
+The next thing missing is handling of function prototypes. In
+Kaleidoscope, these are used both for 'extern' function declarations as
+well as function body definitions. The code to do this is
+straight-forward and not very interesting (once you've survived
+expressions):
+
+.. code-block:: c++
+
+ /// prototype
+ /// ::= id '(' id* ')'
+ static std::unique_ptr<PrototypeAST> ParsePrototype() {
+ if (CurTok != tok_identifier)
+ return LogErrorP("Expected function name in prototype");
+
+ std::string FnName = IdentifierStr;
+ getNextToken();
+
+ if (CurTok != '(')
+ return LogErrorP("Expected '(' in prototype");
+
+ // Read the list of argument names.
+ std::vector<std::string> ArgNames;
+ while (getNextToken() == tok_identifier)
+ ArgNames.push_back(IdentifierStr);
+ if (CurTok != ')')
+ return LogErrorP("Expected ')' in prototype");
+
+ // success.
+ getNextToken(); // eat ')'.
+
+ return llvm::make_unique<PrototypeAST>(FnName, std::move(ArgNames));
+ }
+
+Given this, a function definition is very simple, just a prototype plus
+an expression to implement the body:
+
+.. code-block:: c++
+
+ /// definition ::= 'def' prototype expression
+ static std::unique_ptr<FunctionAST> ParseDefinition() {
+ getNextToken(); // eat def.
+ auto Proto = ParsePrototype();
+ if (!Proto) return nullptr;
+
+ if (auto E = ParseExpression())
+ return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
+ return nullptr;
+ }
+
+In addition, we support 'extern' to declare functions like 'sin' and
+'cos' as well as to support forward declaration of user functions. These
+'extern's are just prototypes with no body:
+
+.. code-block:: c++
+
+ /// external ::= 'extern' prototype
+ static std::unique_ptr<PrototypeAST> ParseExtern() {
+ getNextToken(); // eat extern.
+ return ParsePrototype();
+ }
+
+Finally, we'll also let the user type in arbitrary top-level expressions
+and evaluate them on the fly. We will handle this by defining anonymous
+nullary (zero argument) functions for them:
+
+.. code-block:: c++
+
+ /// toplevelexpr ::= expression
+ static std::unique_ptr<FunctionAST> ParseTopLevelExpr() {
+ if (auto E = ParseExpression()) {
+ // Make an anonymous proto.
+ auto Proto = llvm::make_unique<PrototypeAST>("", std::vector<std::string>());
+ return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
+ }
+ return nullptr;
+ }
+
+Now that we have all the pieces, let's build a little driver that will
+let us actually *execute* this code we've built!
+
+The Driver
+==========
+
+The driver for this simply invokes all of the parsing pieces with a
+top-level dispatch loop. There isn't much interesting here, so I'll just
+include the top-level loop. See `below <#full-code-listing>`_ for full code in the
+"Top-Level Parsing" section.
+
+.. code-block:: c++
+
+ /// top ::= definition | external | expression | ';'
+ static void MainLoop() {
+ while (1) {
+ fprintf(stderr, "ready> ");
+ switch (CurTok) {
+ case tok_eof:
+ return;
+ case ';': // ignore top-level semicolons.
+ getNextToken();
+ break;
+ case tok_def:
+ HandleDefinition();
+ break;
+ case tok_extern:
+ HandleExtern();
+ break;
+ default:
+ HandleTopLevelExpression();
+ break;
+ }
+ }
+ }
+
+The most interesting part of this is that we ignore top-level
+semicolons. Why is this, you ask? The basic reason is that if you type
+"4 + 5" at the command line, the parser doesn't know whether that is the
+end of what you will type or not. For example, on the next line you
+could type "def foo..." in which case 4+5 is the end of a top-level
+expression. Alternatively you could type "\* 6", which would continue
+the expression. Having top-level semicolons allows you to type "4+5;",
+and the parser will know you are done.
+
+Conclusions
+===========
+
+With just under 400 lines of commented code (240 lines of non-comment,
+non-blank code), we fully defined our minimal language, including a
+lexer, parser, and AST builder. With this done, the executable will
+validate Kaleidoscope code and tell us if it is grammatically invalid.
+For example, here is a sample interaction:
+
+.. code-block:: bash
+
+ $ ./a.out
+ ready> def foo(x y) x+foo(y, 4.0);
+ Parsed a function definition.
+ ready> def foo(x y) x+y y;
+ Parsed a function definition.
+ Parsed a top-level expr
+ ready> def foo(x y) x+y );
+ Parsed a function definition.
+ Error: unknown token when expecting an expression
+ ready> extern sin(a);
+ ready> Parsed an extern
+ ready> ^D
+ $
+
+There is a lot of room for extension here. You can define new AST nodes,
+extend the language in many ways, etc. In the `next
+installment <LangImpl03.html>`_, we will describe how to generate LLVM
+Intermediate Representation (IR) from the AST.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example. Because this
+uses the LLVM libraries, we need to link them in. To do this, we use the
+`llvm-config <http://llvm.org/cmds/llvm-config.html>`_ tool to inform
+our makefile/command line about which options to use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g -O3 toy.cpp `llvm-config --cxxflags`
+ # Run
+ ./a.out
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter2/toy.cpp
+ :language: c++
+
+`Next: Implementing Code Generation to LLVM IR <LangImpl03.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl03.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl03.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl03.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl03.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,568 @@
+========================================
+Kaleidoscope: Code generation to LLVM IR
+========================================
+
+.. contents::
+ :local:
+
+Chapter 3 Introduction
+======================
+
+Welcome to Chapter 3 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. This chapter shows you how to transform
+the `Abstract Syntax Tree <LangImpl02.html>`_, built in Chapter 2, into
+LLVM IR. This will teach you a little bit about how LLVM does things, as
+well as demonstrate how easy it is to use. It's much more work to build
+a lexer and parser than it is to generate LLVM IR code. :)
+
+**Please note**: the code in this chapter and later require LLVM 3.7 or
+later. LLVM 3.6 and before will not work with it. Also note that you
+need to use a version of this tutorial that matches your LLVM release:
+If you are using an official LLVM release, use the version of the
+documentation included with your release or on the `llvm.org releases
+page <http://llvm.org/releases/>`_.
+
+Code Generation Setup
+=====================
+
+In order to generate LLVM IR, we want some simple setup to get started.
+First we define virtual code generation (codegen) methods in each AST
+class:
+
+.. code-block:: c++
+
+ /// ExprAST - Base class for all expression nodes.
+ class ExprAST {
+ public:
+ virtual ~ExprAST() {}
+ virtual Value *codegen() = 0;
+ };
+
+ /// NumberExprAST - Expression class for numeric literals like "1.0".
+ class NumberExprAST : public ExprAST {
+ double Val;
+
+ public:
+ NumberExprAST(double Val) : Val(Val) {}
+ virtual Value *codegen();
+ };
+ ...
+
+The codegen() method says to emit IR for that AST node along with all
+the things it depends on, and they all return an LLVM Value object.
+"Value" is the class used to represent a "`Static Single Assignment
+(SSA) <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+register" or "SSA value" in LLVM. The most distinct aspect of SSA values
+is that their value is computed as the related instruction executes, and
+it does not get a new value until (and if) the instruction re-executes.
+In other words, there is no way to "change" an SSA value. For more
+information, please read up on `Static Single
+Assignment <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+- the concepts are really quite natural once you grok them.
+
+Note that instead of adding virtual methods to the ExprAST class
+hierarchy, it could also make sense to use a `visitor
+pattern <http://en.wikipedia.org/wiki/Visitor_pattern>`_ or some other
+way to model this. Again, this tutorial won't dwell on good software
+engineering practices: for our purposes, adding a virtual method is
+simplest.
+
+The second thing we want is an "LogError" method like we used for the
+parser, which will be used to report errors found during code generation
+(for example, use of an undeclared parameter):
+
+.. code-block:: c++
+
+ static LLVMContext TheContext;
+ static IRBuilder<> Builder(TheContext);
+ static std::unique_ptr<Module> TheModule;
+ static std::map<std::string, Value *> NamedValues;
+
+ Value *LogErrorV(const char *Str) {
+ LogError(Str);
+ return nullptr;
+ }
+
+The static variables will be used during code generation. ``TheContext``
+is an opaque object that owns a lot of core LLVM data structures, such as
+the type and constant value tables. We don't need to understand it in
+detail, we just need a single instance to pass into APIs that require it.
+
+The ``Builder`` object is a helper object that makes it easy to generate
+LLVM instructions. Instances of the
+`IRBuilder <http://llvm.org/doxygen/IRBuilder_8h-source.html>`_
+class template keep track of the current place to insert instructions
+and has methods to create new instructions.
+
+``TheModule`` is an LLVM construct that contains functions and global
+variables. In many ways, it is the top-level structure that the LLVM IR
+uses to contain code. It will own the memory for all of the IR that we
+generate, which is why the codegen() method returns a raw Value\*,
+rather than a unique_ptr<Value>.
+
+The ``NamedValues`` map keeps track of which values are defined in the
+current scope and what their LLVM representation is. (In other words, it
+is a symbol table for the code). In this form of Kaleidoscope, the only
+things that can be referenced are function parameters. As such, function
+parameters will be in this map when generating code for their function
+body.
+
+With these basics in place, we can start talking about how to generate
+code for each expression. Note that this assumes that the ``Builder``
+has been set up to generate code *into* something. For now, we'll assume
+that this has already been done, and we'll just use it to emit code.
+
+Expression Code Generation
+==========================
+
+Generating LLVM code for expression nodes is very straightforward: less
+than 45 lines of commented code for all four of our expression nodes.
+First we'll do numeric literals:
+
+.. code-block:: c++
+
+ Value *NumberExprAST::codegen() {
+ return ConstantFP::get(TheContext, APFloat(Val));
+ }
+
+In the LLVM IR, numeric constants are represented with the
+``ConstantFP`` class, which holds the numeric value in an ``APFloat``
+internally (``APFloat`` has the capability of holding floating point
+constants of Arbitrary Precision). This code basically just creates
+and returns a ``ConstantFP``. Note that in the LLVM IR that constants
+are all uniqued together and shared. For this reason, the API uses the
+"foo::get(...)" idiom instead of "new foo(..)" or "foo::Create(..)".
+
+.. code-block:: c++
+
+ Value *VariableExprAST::codegen() {
+ // Look this variable up in the function.
+ Value *V = NamedValues[Name];
+ if (!V)
+ LogErrorV("Unknown variable name");
+ return V;
+ }
+
+References to variables are also quite simple using LLVM. In the simple
+version of Kaleidoscope, we assume that the variable has already been
+emitted somewhere and its value is available. In practice, the only
+values that can be in the ``NamedValues`` map are function arguments.
+This code simply checks to see that the specified name is in the map (if
+not, an unknown variable is being referenced) and returns the value for
+it. In future chapters, we'll add support for `loop induction
+variables <LangImpl5.html#for-loop-expression>`_ in the symbol table, and for `local
+variables <LangImpl7.html#user-defined-local-variables>`_.
+
+.. code-block:: c++
+
+ Value *BinaryExprAST::codegen() {
+ Value *L = LHS->codegen();
+ Value *R = RHS->codegen();
+ if (!L || !R)
+ return nullptr;
+
+ switch (Op) {
+ case '+':
+ return Builder.CreateFAdd(L, R, "addtmp");
+ case '-':
+ return Builder.CreateFSub(L, R, "subtmp");
+ case '*':
+ return Builder.CreateFMul(L, R, "multmp");
+ case '<':
+ L = Builder.CreateFCmpULT(L, R, "cmptmp");
+ // Convert bool 0/1 to double 0.0 or 1.0
+ return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext),
+ "booltmp");
+ default:
+ return LogErrorV("invalid binary operator");
+ }
+ }
+
+Binary operators start to get more interesting. The basic idea here is
+that we recursively emit code for the left-hand side of the expression,
+then the right-hand side, then we compute the result of the binary
+expression. In this code, we do a simple switch on the opcode to create
+the right LLVM instruction.
+
+In the example above, the LLVM builder class is starting to show its
+value. IRBuilder knows where to insert the newly created instruction,
+all you have to do is specify what instruction to create (e.g. with
+``CreateFAdd``), which operands to use (``L`` and ``R`` here) and
+optionally provide a name for the generated instruction.
+
+One nice thing about LLVM is that the name is just a hint. For instance,
+if the code above emits multiple "addtmp" variables, LLVM will
+automatically provide each one with an increasing, unique numeric
+suffix. Local value names for instructions are purely optional, but it
+makes it much easier to read the IR dumps.
+
+`LLVM instructions <../LangRef.html#instruction-reference>`_ are constrained by strict
+rules: for example, the Left and Right operators of an `add
+instruction <../LangRef.html#add-instruction>`_ must have the same type, and the
+result type of the add must match the operand types. Because all values
+in Kaleidoscope are doubles, this makes for very simple code for add,
+sub and mul.
+
+On the other hand, LLVM specifies that the `fcmp
+instruction <../LangRef.html#fcmp-instruction>`_ always returns an 'i1' value (a
+one bit integer). The problem with this is that Kaleidoscope wants the
+value to be a 0.0 or 1.0 value. In order to get these semantics, we
+combine the fcmp instruction with a `uitofp
+instruction <../LangRef.html#uitofp-to-instruction>`_. This instruction converts its
+input integer into a floating point value by treating the input as an
+unsigned value. In contrast, if we used the `sitofp
+instruction <../LangRef.html#sitofp-to-instruction>`_, the Kaleidoscope '<' operator
+would return 0.0 and -1.0, depending on the input value.
+
+.. code-block:: c++
+
+ Value *CallExprAST::codegen() {
+ // Look up the name in the global module table.
+ Function *CalleeF = TheModule->getFunction(Callee);
+ if (!CalleeF)
+ return LogErrorV("Unknown function referenced");
+
+ // If argument mismatch error.
+ if (CalleeF->arg_size() != Args.size())
+ return LogErrorV("Incorrect # arguments passed");
+
+ std::vector<Value *> ArgsV;
+ for (unsigned i = 0, e = Args.size(); i != e; ++i) {
+ ArgsV.push_back(Args[i]->codegen());
+ if (!ArgsV.back())
+ return nullptr;
+ }
+
+ return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
+ }
+
+Code generation for function calls is quite straightforward with LLVM. The code
+above initially does a function name lookup in the LLVM Module's symbol table.
+Recall that the LLVM Module is the container that holds the functions we are
+JIT'ing. By giving each function the same name as what the user specifies, we
+can use the LLVM symbol table to resolve function names for us.
+
+Once we have the function to call, we recursively codegen each argument
+that is to be passed in, and create an LLVM `call
+instruction <../LangRef.html#call-instruction>`_. Note that LLVM uses the native C
+calling conventions by default, allowing these calls to also call into
+standard library functions like "sin" and "cos", with no additional
+effort.
+
+This wraps up our handling of the four basic expressions that we have so
+far in Kaleidoscope. Feel free to go in and add some more. For example,
+by browsing the `LLVM language reference <../LangRef.html>`_ you'll find
+several other interesting instructions that are really easy to plug into
+our basic framework.
+
+Function Code Generation
+========================
+
+Code generation for prototypes and functions must handle a number of
+details, which make their code less beautiful than expression code
+generation, but allows us to illustrate some important points. First,
+let's talk about code generation for prototypes: they are used both for
+function bodies and external function declarations. The code starts
+with:
+
+.. code-block:: c++
+
+ Function *PrototypeAST::codegen() {
+ // Make the function type: double(double,double) etc.
+ std::vector<Type*> Doubles(Args.size(),
+ Type::getDoubleTy(TheContext));
+ FunctionType *FT =
+ FunctionType::get(Type::getDoubleTy(TheContext), Doubles, false);
+
+ Function *F =
+ Function::Create(FT, Function::ExternalLinkage, Name, TheModule.get());
+
+This code packs a lot of power into a few lines. Note first that this
+function returns a "Function\*" instead of a "Value\*". Because a
+"prototype" really talks about the external interface for a function
+(not the value computed by an expression), it makes sense for it to
+return the LLVM Function it corresponds to when codegen'd.
+
+The call to ``FunctionType::get`` creates the ``FunctionType`` that
+should be used for a given Prototype. Since all function arguments in
+Kaleidoscope are of type double, the first line creates a vector of "N"
+LLVM double types. It then uses the ``Functiontype::get`` method to
+create a function type that takes "N" doubles as arguments, returns one
+double as a result, and that is not vararg (the false parameter
+indicates this). Note that Types in LLVM are uniqued just like Constants
+are, so you don't "new" a type, you "get" it.
+
+The final line above actually creates the IR Function corresponding to
+the Prototype. This indicates the type, linkage and name to use, as
+well as which module to insert into. "`external
+linkage <../LangRef.html#linkage>`_" means that the function may be
+defined outside the current module and/or that it is callable by
+functions outside the module. The Name passed in is the name the user
+specified: since "``TheModule``" is specified, this name is registered
+in "``TheModule``"s symbol table.
+
+.. code-block:: c++
+
+ // Set names for all arguments.
+ unsigned Idx = 0;
+ for (auto &Arg : F->args())
+ Arg.setName(Args[Idx++]);
+
+ return F;
+
+Finally, we set the name of each of the function's arguments according to the
+names given in the Prototype. This step isn't strictly necessary, but keeping
+the names consistent makes the IR more readable, and allows subsequent code to
+refer directly to the arguments for their names, rather than having to look up
+them up in the Prototype AST.
+
+At this point we have a function prototype with no body. This is how LLVM IR
+represents function declarations. For extern statements in Kaleidoscope, this
+is as far as we need to go. For function definitions however, we need to
+codegen and attach a function body.
+
+.. code-block:: c++
+
+ Function *FunctionAST::codegen() {
+ // First, check for an existing function from a previous 'extern' declaration.
+ Function *TheFunction = TheModule->getFunction(Proto->getName());
+
+ if (!TheFunction)
+ TheFunction = Proto->codegen();
+
+ if (!TheFunction)
+ return nullptr;
+
+ if (!TheFunction->empty())
+ return (Function*)LogErrorV("Function cannot be redefined.");
+
+
+For function definitions, we start by searching TheModule's symbol table for an
+existing version of this function, in case one has already been created using an
+'extern' statement. If Module::getFunction returns null then no previous version
+exists, so we'll codegen one from the Prototype. In either case, we want to
+assert that the function is empty (i.e. has no body yet) before we start.
+
+.. code-block:: c++
+
+ // Create a new basic block to start insertion into.
+ BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
+ Builder.SetInsertPoint(BB);
+
+ // Record the function arguments in the NamedValues map.
+ NamedValues.clear();
+ for (auto &Arg : TheFunction->args())
+ NamedValues[Arg.getName()] = &Arg;
+
+Now we get to the point where the ``Builder`` is set up. The first line
+creates a new `basic block <http://en.wikipedia.org/wiki/Basic_block>`_
+(named "entry"), which is inserted into ``TheFunction``. The second line
+then tells the builder that new instructions should be inserted into the
+end of the new basic block. Basic blocks in LLVM are an important part
+of functions that define the `Control Flow
+Graph <http://en.wikipedia.org/wiki/Control_flow_graph>`_. Since we
+don't have any control flow, our functions will only contain one block
+at this point. We'll fix this in `Chapter 5 <LangImpl05.html>`_ :).
+
+Next we add the function arguments to the NamedValues map (after first clearing
+it out) so that they're accessible to ``VariableExprAST`` nodes.
+
+.. code-block:: c++
+
+ if (Value *RetVal = Body->codegen()) {
+ // Finish off the function.
+ Builder.CreateRet(RetVal);
+
+ // Validate the generated code, checking for consistency.
+ verifyFunction(*TheFunction);
+
+ return TheFunction;
+ }
+
+Once the insertion point has been set up and the NamedValues map populated,
+we call the ``codegen()`` method for the root expression of the function. If no
+error happens, this emits code to compute the expression into the entry block
+and returns the value that was computed. Assuming no error, we then create an
+LLVM `ret instruction <../LangRef.html#ret-instruction>`_, which completes the function.
+Once the function is built, we call ``verifyFunction``, which is
+provided by LLVM. This function does a variety of consistency checks on
+the generated code, to determine if our compiler is doing everything
+right. Using this is important: it can catch a lot of bugs. Once the
+function is finished and validated, we return it.
+
+.. code-block:: c++
+
+ // Error reading body, remove function.
+ TheFunction->eraseFromParent();
+ return nullptr;
+ }
+
+The only piece left here is handling of the error case. For simplicity,
+we handle this by merely deleting the function we produced with the
+``eraseFromParent`` method. This allows the user to redefine a function
+that they incorrectly typed in before: if we didn't delete it, it would
+live in the symbol table, with a body, preventing future redefinition.
+
+This code does have a bug, though: If the ``FunctionAST::codegen()`` method
+finds an existing IR Function, it does not validate its signature against the
+definition's own prototype. This means that an earlier 'extern' declaration will
+take precedence over the function definition's signature, which can cause
+codegen to fail, for instance if the function arguments are named differently.
+There are a number of ways to fix this bug, see what you can come up with! Here
+is a testcase:
+
+::
+
+ extern foo(a); # ok, defines foo.
+ def foo(b) b; # Error: Unknown variable name. (decl using 'a' takes precedence).
+
+Driver Changes and Closing Thoughts
+===================================
+
+For now, code generation to LLVM doesn't really get us much, except that
+we can look at the pretty IR calls. The sample code inserts calls to
+codegen into the "``HandleDefinition``", "``HandleExtern``" etc
+functions, and then dumps out the LLVM IR. This gives a nice way to look
+at the LLVM IR for simple functions. For example:
+
+::
+
+ ready> 4+5;
+ Read top-level expression:
+ define double @0() {
+ entry:
+ ret double 9.000000e+00
+ }
+
+Note how the parser turns the top-level expression into anonymous
+functions for us. This will be handy when we add `JIT
+support <LangImpl4.html#adding-a-jit-compiler>`_ in the next chapter. Also note that the
+code is very literally transcribed, no optimizations are being performed
+except simple constant folding done by IRBuilder. We will `add
+optimizations <LangImpl4.html#trivial-constant-folding>`_ explicitly in the next
+chapter.
+
+::
+
+ ready> def foo(a b) a*a + 2*a*b + b*b;
+ Read function definition:
+ define double @foo(double %a, double %b) {
+ entry:
+ %multmp = fmul double %a, %a
+ %multmp1 = fmul double 2.000000e+00, %a
+ %multmp2 = fmul double %multmp1, %b
+ %addtmp = fadd double %multmp, %multmp2
+ %multmp3 = fmul double %b, %b
+ %addtmp4 = fadd double %addtmp, %multmp3
+ ret double %addtmp4
+ }
+
+This shows some simple arithmetic. Notice the striking similarity to the
+LLVM builder calls that we use to create the instructions.
+
+::
+
+ ready> def bar(a) foo(a, 4.0) + bar(31337);
+ Read function definition:
+ define double @bar(double %a) {
+ entry:
+ %calltmp = call double @foo(double %a, double 4.000000e+00)
+ %calltmp1 = call double @bar(double 3.133700e+04)
+ %addtmp = fadd double %calltmp, %calltmp1
+ ret double %addtmp
+ }
+
+This shows some function calls. Note that this function will take a long
+time to execute if you call it. In the future we'll add conditional
+control flow to actually make recursion useful :).
+
+::
+
+ ready> extern cos(x);
+ Read extern:
+ declare double @cos(double)
+
+ ready> cos(1.234);
+ Read top-level expression:
+ define double @1() {
+ entry:
+ %calltmp = call double @cos(double 1.234000e+00)
+ ret double %calltmp
+ }
+
+This shows an extern for the libm "cos" function, and a call to it.
+
+.. TODO:: Abandon Pygments' horrible `llvm` lexer. It just totally gives up
+ on highlighting this due to the first line.
+
+::
+
+ ready> ^D
+ ; ModuleID = 'my cool jit'
+
+ define double @0() {
+ entry:
+ %addtmp = fadd double 4.000000e+00, 5.000000e+00
+ ret double %addtmp
+ }
+
+ define double @foo(double %a, double %b) {
+ entry:
+ %multmp = fmul double %a, %a
+ %multmp1 = fmul double 2.000000e+00, %a
+ %multmp2 = fmul double %multmp1, %b
+ %addtmp = fadd double %multmp, %multmp2
+ %multmp3 = fmul double %b, %b
+ %addtmp4 = fadd double %addtmp, %multmp3
+ ret double %addtmp4
+ }
+
+ define double @bar(double %a) {
+ entry:
+ %calltmp = call double @foo(double %a, double 4.000000e+00)
+ %calltmp1 = call double @bar(double 3.133700e+04)
+ %addtmp = fadd double %calltmp, %calltmp1
+ ret double %addtmp
+ }
+
+ declare double @cos(double)
+
+ define double @1() {
+ entry:
+ %calltmp = call double @cos(double 1.234000e+00)
+ ret double %calltmp
+ }
+
+When you quit the current demo (by sending an EOF via CTRL+D on Linux
+or CTRL+Z and ENTER on Windows), it dumps out the IR for the entire
+module generated. Here you can see the big picture with all the
+functions referencing each other.
+
+This wraps up the third chapter of the Kaleidoscope tutorial. Up next,
+we'll describe how to `add JIT codegen and optimizer
+support <LangImpl04.html>`_ to this so we can actually start running
+code!
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the LLVM code generator. Because this uses the LLVM libraries, we need
+to link them in. To do this, we use the
+`llvm-config <http://llvm.org/cmds/llvm-config.html>`_ tool to inform
+our makefile/command line about which options to use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g -O3 toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core` -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter3/toy.cpp
+ :language: c++
+
+`Next: Adding JIT and Optimizer Support <LangImpl04.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl04.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl04.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl04.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl04.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,659 @@
+==============================================
+Kaleidoscope: Adding JIT and Optimizer Support
+==============================================
+
+.. contents::
+ :local:
+
+Chapter 4 Introduction
+======================
+
+Welcome to Chapter 4 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. Chapters 1-3 described the implementation
+of a simple language and added support for generating LLVM IR. This
+chapter describes two new techniques: adding optimizer support to your
+language, and adding JIT compiler support. These additions will
+demonstrate how to get nice, efficient code for the Kaleidoscope
+language.
+
+Trivial Constant Folding
+========================
+
+Our demonstration for Chapter 3 is elegant and easy to extend.
+Unfortunately, it does not produce wonderful code. The IRBuilder,
+however, does give us obvious optimizations when compiling simple code:
+
+::
+
+ ready> def test(x) 1+2+x;
+ Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double 3.000000e+00, %x
+ ret double %addtmp
+ }
+
+This code is not a literal transcription of the AST built by parsing the
+input. That would be:
+
+::
+
+ ready> def test(x) 1+2+x;
+ Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double 2.000000e+00, 1.000000e+00
+ %addtmp1 = fadd double %addtmp, %x
+ ret double %addtmp1
+ }
+
+Constant folding, as seen above, in particular, is a very common and
+very important optimization: so much so that many language implementors
+implement constant folding support in their AST representation.
+
+With LLVM, you don't need this support in the AST. Since all calls to
+build LLVM IR go through the LLVM IR builder, the builder itself checked
+to see if there was a constant folding opportunity when you call it. If
+so, it just does the constant fold and return the constant instead of
+creating an instruction.
+
+Well, that was easy :). In practice, we recommend always using
+``IRBuilder`` when generating code like this. It has no "syntactic
+overhead" for its use (you don't have to uglify your compiler with
+constant checks everywhere) and it can dramatically reduce the amount of
+LLVM IR that is generated in some cases (particular for languages with a
+macro preprocessor or that use a lot of constants).
+
+On the other hand, the ``IRBuilder`` is limited by the fact that it does
+all of its analysis inline with the code as it is built. If you take a
+slightly more complex example:
+
+::
+
+ ready> def test(x) (1+2+x)*(x+(1+2));
+ ready> Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double 3.000000e+00, %x
+ %addtmp1 = fadd double %x, 3.000000e+00
+ %multmp = fmul double %addtmp, %addtmp1
+ ret double %multmp
+ }
+
+In this case, the LHS and RHS of the multiplication are the same value.
+We'd really like to see this generate "``tmp = x+3; result = tmp*tmp;``"
+instead of computing "``x+3``" twice.
+
+Unfortunately, no amount of local analysis will be able to detect and
+correct this. This requires two transformations: reassociation of
+expressions (to make the add's lexically identical) and Common
+Subexpression Elimination (CSE) to delete the redundant add instruction.
+Fortunately, LLVM provides a broad range of optimizations that you can
+use, in the form of "passes".
+
+LLVM Optimization Passes
+========================
+
+.. warning::
+
+ Due to the transition to the new PassManager infrastructure this tutorial
+ is based on ``llvm::legacy::FunctionPassManager`` which can be found in
+ `LegacyPassManager.h <http://llvm.org/doxygen/classllvm_1_1legacy_1_1FunctionPassManager.html>`_.
+ For the purpose of the this tutorial the above should be used until
+ the pass manager transition is complete.
+
+LLVM provides many optimization passes, which do many different sorts of
+things and have different tradeoffs. Unlike other systems, LLVM doesn't
+hold to the mistaken notion that one set of optimizations is right for
+all languages and for all situations. LLVM allows a compiler implementor
+to make complete decisions about what optimizations to use, in which
+order, and in what situation.
+
+As a concrete example, LLVM supports both "whole module" passes, which
+look across as large of body of code as they can (often a whole file,
+but if run at link time, this can be a substantial portion of the whole
+program). It also supports and includes "per-function" passes which just
+operate on a single function at a time, without looking at other
+functions. For more information on passes and how they are run, see the
+`How to Write a Pass <../WritingAnLLVMPass.html>`_ document and the
+`List of LLVM Passes <../Passes.html>`_.
+
+For Kaleidoscope, we are currently generating functions on the fly, one
+at a time, as the user types them in. We aren't shooting for the
+ultimate optimization experience in this setting, but we also want to
+catch the easy and quick stuff where possible. As such, we will choose
+to run a few per-function optimizations as the user types the function
+in. If we wanted to make a "static Kaleidoscope compiler", we would use
+exactly the code we have now, except that we would defer running the
+optimizer until the entire file has been parsed.
+
+In order to get per-function optimizations going, we need to set up a
+`FunctionPassManager <../WritingAnLLVMPass.html#what-passmanager-doesr>`_ to hold
+and organize the LLVM optimizations that we want to run. Once we have
+that, we can add a set of optimizations to run. We'll need a new
+FunctionPassManager for each module that we want to optimize, so we'll
+write a function to create and initialize both the module and pass manager
+for us:
+
+.. code-block:: c++
+
+ void InitializeModuleAndPassManager(void) {
+ // Open a new module.
+ TheModule = llvm::make_unique<Module>("my cool jit", TheContext);
+
+ // Create a new pass manager attached to it.
+ TheFPM = llvm::make_unique<FunctionPassManager>(TheModule.get());
+
+ // Do simple "peephole" optimizations and bit-twiddling optzns.
+ TheFPM->add(createInstructionCombiningPass());
+ // Reassociate expressions.
+ TheFPM->add(createReassociatePass());
+ // Eliminate Common SubExpressions.
+ TheFPM->add(createGVNPass());
+ // Simplify the control flow graph (deleting unreachable blocks, etc).
+ TheFPM->add(createCFGSimplificationPass());
+
+ TheFPM->doInitialization();
+ }
+
+This code initializes the global module ``TheModule``, and the function pass
+manager ``TheFPM``, which is attached to ``TheModule``. Once the pass manager is
+set up, we use a series of "add" calls to add a bunch of LLVM passes.
+
+In this case, we choose to add four optimization passes.
+The passes we choose here are a pretty standard set
+of "cleanup" optimizations that are useful for a wide variety of code. I won't
+delve into what they do but, believe me, they are a good starting place :).
+
+Once the PassManager is set up, we need to make use of it. We do this by
+running it after our newly created function is constructed (in
+``FunctionAST::codegen()``), but before it is returned to the client:
+
+.. code-block:: c++
+
+ if (Value *RetVal = Body->codegen()) {
+ // Finish off the function.
+ Builder.CreateRet(RetVal);
+
+ // Validate the generated code, checking for consistency.
+ verifyFunction(*TheFunction);
+
+ // Optimize the function.
+ TheFPM->run(*TheFunction);
+
+ return TheFunction;
+ }
+
+As you can see, this is pretty straightforward. The
+``FunctionPassManager`` optimizes and updates the LLVM Function\* in
+place, improving (hopefully) its body. With this in place, we can try
+our test above again:
+
+::
+
+ ready> def test(x) (1+2+x)*(x+(1+2));
+ ready> Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double %x, 3.000000e+00
+ %multmp = fmul double %addtmp, %addtmp
+ ret double %multmp
+ }
+
+As expected, we now get our nicely optimized code, saving a floating
+point add instruction from every execution of this function.
+
+LLVM provides a wide variety of optimizations that can be used in
+certain circumstances. Some `documentation about the various
+passes <../Passes.html>`_ is available, but it isn't very complete.
+Another good source of ideas can come from looking at the passes that
+``Clang`` runs to get started. The "``opt``" tool allows you to
+experiment with passes from the command line, so you can see if they do
+anything.
+
+Now that we have reasonable code coming out of our front-end, let's talk
+about executing it!
+
+Adding a JIT Compiler
+=====================
+
+Code that is available in LLVM IR can have a wide variety of tools
+applied to it. For example, you can run optimizations on it (as we did
+above), you can dump it out in textual or binary forms, you can compile
+the code to an assembly file (.s) for some target, or you can JIT
+compile it. The nice thing about the LLVM IR representation is that it
+is the "common currency" between many different parts of the compiler.
+
+In this section, we'll add JIT compiler support to our interpreter. The
+basic idea that we want for Kaleidoscope is to have the user enter
+function bodies as they do now, but immediately evaluate the top-level
+expressions they type in. For example, if they type in "1 + 2;", we
+should evaluate and print out 3. If they define a function, they should
+be able to call it from the command line.
+
+In order to do this, we first prepare the environment to create code for
+the current native target and declare and initialize the JIT. This is
+done by calling some ``InitializeNativeTarget\*`` functions and
+adding a global variable ``TheJIT``, and initializing it in
+``main``:
+
+.. code-block:: c++
+
+ static std::unique_ptr<KaleidoscopeJIT> TheJIT;
+ ...
+ int main() {
+ InitializeNativeTarget();
+ InitializeNativeTargetAsmPrinter();
+ InitializeNativeTargetAsmParser();
+
+ // Install standard binary operators.
+ // 1 is lowest precedence.
+ BinopPrecedence['<'] = 10;
+ BinopPrecedence['+'] = 20;
+ BinopPrecedence['-'] = 20;
+ BinopPrecedence['*'] = 40; // highest.
+
+ // Prime the first token.
+ fprintf(stderr, "ready> ");
+ getNextToken();
+
+ TheJIT = llvm::make_unique<KaleidoscopeJIT>();
+
+ // Run the main "interpreter loop" now.
+ MainLoop();
+
+ return 0;
+ }
+
+We also need to setup the data layout for the JIT:
+
+.. code-block:: c++
+
+ void InitializeModuleAndPassManager(void) {
+ // Open a new module.
+ TheModule = llvm::make_unique<Module>("my cool jit", TheContext);
+ TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout());
+
+ // Create a new pass manager attached to it.
+ TheFPM = llvm::make_unique<FunctionPassManager>(TheModule.get());
+ ...
+
+The KaleidoscopeJIT class is a simple JIT built specifically for these
+tutorials, available inside the LLVM source code
+at llvm-src/examples/Kaleidoscope/include/KaleidoscopeJIT.h.
+In later chapters we will look at how it works and extend it with
+new features, but for now we will take it as given. Its API is very simple:
+``addModule`` adds an LLVM IR module to the JIT, making its functions
+available for execution; ``removeModule`` removes a module, freeing any
+memory associated with the code in that module; and ``findSymbol`` allows us
+to look up pointers to the compiled code.
+
+We can take this simple API and change our code that parses top-level expressions to
+look like this:
+
+.. code-block:: c++
+
+ static void HandleTopLevelExpression() {
+ // Evaluate a top-level expression into an anonymous function.
+ if (auto FnAST = ParseTopLevelExpr()) {
+ if (FnAST->codegen()) {
+
+ // JIT the module containing the anonymous expression, keeping a handle so
+ // we can free it later.
+ auto H = TheJIT->addModule(std::move(TheModule));
+ InitializeModuleAndPassManager();
+
+ // Search the JIT for the __anon_expr symbol.
+ auto ExprSymbol = TheJIT->findSymbol("__anon_expr");
+ assert(ExprSymbol && "Function not found");
+
+ // Get the symbol's address and cast it to the right type (takes no
+ // arguments, returns a double) so we can call it as a native function.
+ double (*FP)() = (double (*)())(intptr_t)ExprSymbol.getAddress();
+ fprintf(stderr, "Evaluated to %f\n", FP());
+
+ // Delete the anonymous expression module from the JIT.
+ TheJIT->removeModule(H);
+ }
+
+If parsing and codegen succeeed, the next step is to add the module containing
+the top-level expression to the JIT. We do this by calling addModule, which
+triggers code generation for all the functions in the module, and returns a
+handle that can be used to remove the module from the JIT later. Once the module
+has been added to the JIT it can no longer be modified, so we also open a new
+module to hold subsequent code by calling ``InitializeModuleAndPassManager()``.
+
+Once we've added the module to the JIT we need to get a pointer to the final
+generated code. We do this by calling the JIT's findSymbol method, and passing
+the name of the top-level expression function: ``__anon_expr``. Since we just
+added this function, we assert that findSymbol returned a result.
+
+Next, we get the in-memory address of the ``__anon_expr`` function by calling
+``getAddress()`` on the symbol. Recall that we compile top-level expressions
+into a self-contained LLVM function that takes no arguments and returns the
+computed double. Because the LLVM JIT compiler matches the native platform ABI,
+this means that you can just cast the result pointer to a function pointer of
+that type and call it directly. This means, there is no difference between JIT
+compiled code and native machine code that is statically linked into your
+application.
+
+Finally, since we don't support re-evaluation of top-level expressions, we
+remove the module from the JIT when we're done to free the associated memory.
+Recall, however, that the module we created a few lines earlier (via
+``InitializeModuleAndPassManager``) is still open and waiting for new code to be
+added.
+
+With just these two changes, let's see how Kaleidoscope works now!
+
+::
+
+ ready> 4+5;
+ Read top-level expression:
+ define double @0() {
+ entry:
+ ret double 9.000000e+00
+ }
+
+ Evaluated to 9.000000
+
+Well this looks like it is basically working. The dump of the function
+shows the "no argument function that always returns double" that we
+synthesize for each top-level expression that is typed in. This
+demonstrates very basic functionality, but can we do more?
+
+::
+
+ ready> def testfunc(x y) x + y*2;
+ Read function definition:
+ define double @testfunc(double %x, double %y) {
+ entry:
+ %multmp = fmul double %y, 2.000000e+00
+ %addtmp = fadd double %multmp, %x
+ ret double %addtmp
+ }
+
+ ready> testfunc(4, 10);
+ Read top-level expression:
+ define double @1() {
+ entry:
+ %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)
+ ret double %calltmp
+ }
+
+ Evaluated to 24.000000
+
+ ready> testfunc(5, 10);
+ ready> LLVM ERROR: Program used external function 'testfunc' which could not be resolved!
+
+
+Function definitions and calls also work, but something went very wrong on that
+last line. The call looks valid, so what happened? As you may have guessed from
+the API a Module is a unit of allocation for the JIT, and testfunc was part
+of the same module that contained anonymous expression. When we removed that
+module from the JIT to free the memory for the anonymous expression, we deleted
+the definition of ``testfunc`` along with it. Then, when we tried to call
+testfunc a second time, the JIT could no longer find it.
+
+The easiest way to fix this is to put the anonymous expression in a separate
+module from the rest of the function definitions. The JIT will happily resolve
+function calls across module boundaries, as long as each of the functions called
+has a prototype, and is added to the JIT before it is called. By putting the
+anonymous expression in a different module we can delete it without affecting
+the rest of the functions.
+
+In fact, we're going to go a step further and put every function in its own
+module. Doing so allows us to exploit a useful property of the KaleidoscopeJIT
+that will make our environment more REPL-like: Functions can be added to the
+JIT more than once (unlike a module where every function must have a unique
+definition). When you look up a symbol in KaleidoscopeJIT it will always return
+the most recent definition:
+
+::
+
+ ready> def foo(x) x + 1;
+ Read function definition:
+ define double @foo(double %x) {
+ entry:
+ %addtmp = fadd double %x, 1.000000e+00
+ ret double %addtmp
+ }
+
+ ready> foo(2);
+ Evaluated to 3.000000
+
+ ready> def foo(x) x + 2;
+ define double @foo(double %x) {
+ entry:
+ %addtmp = fadd double %x, 2.000000e+00
+ ret double %addtmp
+ }
+
+ ready> foo(2);
+ Evaluated to 4.000000
+
+
+To allow each function to live in its own module we'll need a way to
+re-generate previous function declarations into each new module we open:
+
+.. code-block:: c++
+
+ static std::unique_ptr<KaleidoscopeJIT> TheJIT;
+
+ ...
+
+ Function *getFunction(std::string Name) {
+ // First, see if the function has already been added to the current module.
+ if (auto *F = TheModule->getFunction(Name))
+ return F;
+
+ // If not, check whether we can codegen the declaration from some existing
+ // prototype.
+ auto FI = FunctionProtos.find(Name);
+ if (FI != FunctionProtos.end())
+ return FI->second->codegen();
+
+ // If no existing prototype exists, return null.
+ return nullptr;
+ }
+
+ ...
+
+ Value *CallExprAST::codegen() {
+ // Look up the name in the global module table.
+ Function *CalleeF = getFunction(Callee);
+
+ ...
+
+ Function *FunctionAST::codegen() {
+ // Transfer ownership of the prototype to the FunctionProtos map, but keep a
+ // reference to it for use below.
+ auto &P = *Proto;
+ FunctionProtos[Proto->getName()] = std::move(Proto);
+ Function *TheFunction = getFunction(P.getName());
+ if (!TheFunction)
+ return nullptr;
+
+
+To enable this, we'll start by adding a new global, ``FunctionProtos``, that
+holds the most recent prototype for each function. We'll also add a convenience
+method, ``getFunction()``, to replace calls to ``TheModule->getFunction()``.
+Our convenience method searches ``TheModule`` for an existing function
+declaration, falling back to generating a new declaration from FunctionProtos if
+it doesn't find one. In ``CallExprAST::codegen()`` we just need to replace the
+call to ``TheModule->getFunction()``. In ``FunctionAST::codegen()`` we need to
+update the FunctionProtos map first, then call ``getFunction()``. With this
+done, we can always obtain a function declaration in the current module for any
+previously declared function.
+
+We also need to update HandleDefinition and HandleExtern:
+
+.. code-block:: c++
+
+ static void HandleDefinition() {
+ if (auto FnAST = ParseDefinition()) {
+ if (auto *FnIR = FnAST->codegen()) {
+ fprintf(stderr, "Read function definition:");
+ FnIR->print(errs());
+ fprintf(stderr, "\n");
+ TheJIT->addModule(std::move(TheModule));
+ InitializeModuleAndPassManager();
+ }
+ } else {
+ // Skip token for error recovery.
+ getNextToken();
+ }
+ }
+
+ static void HandleExtern() {
+ if (auto ProtoAST = ParseExtern()) {
+ if (auto *FnIR = ProtoAST->codegen()) {
+ fprintf(stderr, "Read extern: ");
+ FnIR->print(errs());
+ fprintf(stderr, "\n");
+ FunctionProtos[ProtoAST->getName()] = std::move(ProtoAST);
+ }
+ } else {
+ // Skip token for error recovery.
+ getNextToken();
+ }
+ }
+
+In HandleDefinition, we add two lines to transfer the newly defined function to
+the JIT and open a new module. In HandleExtern, we just need to add one line to
+add the prototype to FunctionProtos.
+
+With these changes made, let's try our REPL again (I removed the dump of the
+anonymous functions this time, you should get the idea by now :) :
+
+::
+
+ ready> def foo(x) x + 1;
+ ready> foo(2);
+ Evaluated to 3.000000
+
+ ready> def foo(x) x + 2;
+ ready> foo(2);
+ Evaluated to 4.000000
+
+It works!
+
+Even with this simple code, we get some surprisingly powerful capabilities -
+check this out:
+
+::
+
+ ready> extern sin(x);
+ Read extern:
+ declare double @sin(double)
+
+ ready> extern cos(x);
+ Read extern:
+ declare double @cos(double)
+
+ ready> sin(1.0);
+ Read top-level expression:
+ define double @2() {
+ entry:
+ ret double 0x3FEAED548F090CEE
+ }
+
+ Evaluated to 0.841471
+
+ ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x);
+ Read function definition:
+ define double @foo(double %x) {
+ entry:
+ %calltmp = call double @sin(double %x)
+ %multmp = fmul double %calltmp, %calltmp
+ %calltmp2 = call double @cos(double %x)
+ %multmp4 = fmul double %calltmp2, %calltmp2
+ %addtmp = fadd double %multmp, %multmp4
+ ret double %addtmp
+ }
+
+ ready> foo(4.0);
+ Read top-level expression:
+ define double @3() {
+ entry:
+ %calltmp = call double @foo(double 4.000000e+00)
+ ret double %calltmp
+ }
+
+ Evaluated to 1.000000
+
+Whoa, how does the JIT know about sin and cos? The answer is surprisingly
+simple: The KaleidoscopeJIT has a straightforward symbol resolution rule that
+it uses to find symbols that aren't available in any given module: First
+it searches all the modules that have already been added to the JIT, from the
+most recent to the oldest, to find the newest definition. If no definition is
+found inside the JIT, it falls back to calling "``dlsym("sin")``" on the
+Kaleidoscope process itself. Since "``sin``" is defined within the JIT's
+address space, it simply patches up calls in the module to call the libm
+version of ``sin`` directly. But in some cases this even goes further:
+as sin and cos are names of standard math functions, the constant folder
+will directly evaluate the function calls to the correct result when called
+with constants like in the "``sin(1.0)``" above.
+
+In the future we'll see how tweaking this symbol resolution rule can be used to
+enable all sorts of useful features, from security (restricting the set of
+symbols available to JIT'd code), to dynamic code generation based on symbol
+names, and even lazy compilation.
+
+One immediate benefit of the symbol resolution rule is that we can now extend
+the language by writing arbitrary C++ code to implement operations. For example,
+if we add:
+
+.. code-block:: c++
+
+ #ifdef _WIN32
+ #define DLLEXPORT __declspec(dllexport)
+ #else
+ #define DLLEXPORT
+ #endif
+
+ /// putchard - putchar that takes a double and returns 0.
+ extern "C" DLLEXPORT double putchard(double X) {
+ fputc((char)X, stderr);
+ return 0;
+ }
+
+Note, that for Windows we need to actually export the functions because
+the dynamic symbol loader will use GetProcAddress to find the symbols.
+
+Now we can produce simple output to the console by using things like:
+"``extern putchard(x); putchard(120);``", which prints a lowercase 'x'
+on the console (120 is the ASCII code for 'x'). Similar code could be
+used to implement file I/O, console input, and many other capabilities
+in Kaleidoscope.
+
+This completes the JIT and optimizer chapter of the Kaleidoscope
+tutorial. At this point, we can compile a non-Turing-complete
+programming language, optimize and JIT compile it in a user-driven way.
+Next up we'll look into `extending the language with control flow
+constructs <LangImpl05.html>`_, tackling some interesting LLVM IR issues
+along the way.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the LLVM JIT and optimizer. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+If you are compiling this on Linux, make sure to add the "-rdynamic"
+option as well. This makes sure that the external functions are resolved
+properly at runtime.
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter4/toy.cpp
+ :language: c++
+
+`Next: Extending the language: control flow <LangImpl05.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl05.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl05.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl05.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl05.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,814 @@
+==================================================
+Kaleidoscope: Extending the Language: Control Flow
+==================================================
+
+.. contents::
+ :local:
+
+Chapter 5 Introduction
+======================
+
+Welcome to Chapter 5 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. Parts 1-4 described the implementation of
+the simple Kaleidoscope language and included support for generating
+LLVM IR, followed by optimizations and a JIT compiler. Unfortunately, as
+presented, Kaleidoscope is mostly useless: it has no control flow other
+than call and return. This means that you can't have conditional
+branches in the code, significantly limiting its power. In this episode
+of "build that compiler", we'll extend Kaleidoscope to have an
+if/then/else expression plus a simple 'for' loop.
+
+If/Then/Else
+============
+
+Extending Kaleidoscope to support if/then/else is quite straightforward.
+It basically requires adding support for this "new" concept to the
+lexer, parser, AST, and LLVM code emitter. This example is nice, because
+it shows how easy it is to "grow" a language over time, incrementally
+extending it as new ideas are discovered.
+
+Before we get going on "how" we add this extension, let's talk about
+"what" we want. The basic idea is that we want to be able to write this
+sort of thing:
+
+::
+
+ def fib(x)
+ if x < 3 then
+ 1
+ else
+ fib(x-1)+fib(x-2);
+
+In Kaleidoscope, every construct is an expression: there are no
+statements. As such, the if/then/else expression needs to return a value
+like any other. Since we're using a mostly functional form, we'll have
+it evaluate its conditional, then return the 'then' or 'else' value
+based on how the condition was resolved. This is very similar to the C
+"?:" expression.
+
+The semantics of the if/then/else expression is that it evaluates the
+condition to a boolean equality value: 0.0 is considered to be false and
+everything else is considered to be true. If the condition is true, the
+first subexpression is evaluated and returned, if the condition is
+false, the second subexpression is evaluated and returned. Since
+Kaleidoscope allows side-effects, this behavior is important to nail
+down.
+
+Now that we know what we "want", let's break this down into its
+constituent pieces.
+
+Lexer Extensions for If/Then/Else
+---------------------------------
+
+The lexer extensions are straightforward. First we add new enum values
+for the relevant tokens:
+
+.. code-block:: c++
+
+ // control
+ tok_if = -6,
+ tok_then = -7,
+ tok_else = -8,
+
+Once we have that, we recognize the new keywords in the lexer. This is
+pretty simple stuff:
+
+.. code-block:: c++
+
+ ...
+ if (IdentifierStr == "def")
+ return tok_def;
+ if (IdentifierStr == "extern")
+ return tok_extern;
+ if (IdentifierStr == "if")
+ return tok_if;
+ if (IdentifierStr == "then")
+ return tok_then;
+ if (IdentifierStr == "else")
+ return tok_else;
+ return tok_identifier;
+
+AST Extensions for If/Then/Else
+-------------------------------
+
+To represent the new expression we add a new AST node for it:
+
+.. code-block:: c++
+
+ /// IfExprAST - Expression class for if/then/else.
+ class IfExprAST : public ExprAST {
+ std::unique_ptr<ExprAST> Cond, Then, Else;
+
+ public:
+ IfExprAST(std::unique_ptr<ExprAST> Cond, std::unique_ptr<ExprAST> Then,
+ std::unique_ptr<ExprAST> Else)
+ : Cond(std::move(Cond)), Then(std::move(Then)), Else(std::move(Else)) {}
+
+ Value *codegen() override;
+ };
+
+The AST node just has pointers to the various subexpressions.
+
+Parser Extensions for If/Then/Else
+----------------------------------
+
+Now that we have the relevant tokens coming from the lexer and we have
+the AST node to build, our parsing logic is relatively straightforward.
+First we define a new parsing function:
+
+.. code-block:: c++
+
+ /// ifexpr ::= 'if' expression 'then' expression 'else' expression
+ static std::unique_ptr<ExprAST> ParseIfExpr() {
+ getNextToken(); // eat the if.
+
+ // condition.
+ auto Cond = ParseExpression();
+ if (!Cond)
+ return nullptr;
+
+ if (CurTok != tok_then)
+ return LogError("expected then");
+ getNextToken(); // eat the then
+
+ auto Then = ParseExpression();
+ if (!Then)
+ return nullptr;
+
+ if (CurTok != tok_else)
+ return LogError("expected else");
+
+ getNextToken();
+
+ auto Else = ParseExpression();
+ if (!Else)
+ return nullptr;
+
+ return llvm::make_unique<IfExprAST>(std::move(Cond), std::move(Then),
+ std::move(Else));
+ }
+
+Next we hook it up as a primary expression:
+
+.. code-block:: c++
+
+ static std::unique_ptr<ExprAST> ParsePrimary() {
+ switch (CurTok) {
+ default:
+ return LogError("unknown token when expecting an expression");
+ case tok_identifier:
+ return ParseIdentifierExpr();
+ case tok_number:
+ return ParseNumberExpr();
+ case '(':
+ return ParseParenExpr();
+ case tok_if:
+ return ParseIfExpr();
+ }
+ }
+
+LLVM IR for If/Then/Else
+------------------------
+
+Now that we have it parsing and building the AST, the final piece is
+adding LLVM code generation support. This is the most interesting part
+of the if/then/else example, because this is where it starts to
+introduce new concepts. All of the code above has been thoroughly
+described in previous chapters.
+
+To motivate the code we want to produce, let's take a look at a simple
+example. Consider:
+
+::
+
+ extern foo();
+ extern bar();
+ def baz(x) if x then foo() else bar();
+
+If you disable optimizations, the code you'll (soon) get from
+Kaleidoscope looks like this:
+
+.. code-block:: llvm
+
+ declare double @foo()
+
+ declare double @bar()
+
+ define double @baz(double %x) {
+ entry:
+ %ifcond = fcmp one double %x, 0.000000e+00
+ br i1 %ifcond, label %then, label %else
+
+ then: ; preds = %entry
+ %calltmp = call double @foo()
+ br label %ifcont
+
+ else: ; preds = %entry
+ %calltmp1 = call double @bar()
+ br label %ifcont
+
+ ifcont: ; preds = %else, %then
+ %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
+ ret double %iftmp
+ }
+
+To visualize the control flow graph, you can use a nifty feature of the
+LLVM '`opt <http://llvm.org/cmds/opt.html>`_' tool. If you put this LLVM
+IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
+window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll
+see this graph:
+
+.. figure:: LangImpl05-cfg.png
+ :align: center
+ :alt: Example CFG
+
+ Example CFG
+
+Another way to get this is to call "``F->viewCFG()``" or
+"``F->viewCFGOnly()``" (where F is a "``Function*``") either by
+inserting actual calls into the code and recompiling or by calling these
+in the debugger. LLVM has many nice features for visualizing various
+graphs.
+
+Getting back to the generated code, it is fairly simple: the entry block
+evaluates the conditional expression ("x" in our case here) and compares
+the result to 0.0 with the "``fcmp one``" instruction ('one' is "Ordered
+and Not Equal"). Based on the result of this expression, the code jumps
+to either the "then" or "else" blocks, which contain the expressions for
+the true/false cases.
+
+Once the then/else blocks are finished executing, they both branch back
+to the 'ifcont' block to execute the code that happens after the
+if/then/else. In this case the only thing left to do is to return to the
+caller of the function. The question then becomes: how does the code
+know which expression to return?
+
+The answer to this question involves an important SSA operation: the
+`Phi
+operation <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
+If you're not familiar with SSA, `the wikipedia
+article <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+is a good introduction and there are various other introductions to it
+available on your favorite search engine. The short version is that
+"execution" of the Phi operation requires "remembering" which block
+control came from. The Phi operation takes on the value corresponding to
+the input control block. In this case, if control comes in from the
+"then" block, it gets the value of "calltmp". If control comes from the
+"else" block, it gets the value of "calltmp1".
+
+At this point, you are probably starting to think "Oh no! This means my
+simple and elegant front-end will have to start generating SSA form in
+order to use LLVM!". Fortunately, this is not the case, and we strongly
+advise *not* implementing an SSA construction algorithm in your
+front-end unless there is an amazingly good reason to do so. In
+practice, there are two sorts of values that float around in code
+written for your average imperative programming language that might need
+Phi nodes:
+
+#. Code that involves user variables: ``x = 1; x = x + 1;``
+#. Values that are implicit in the structure of your AST, such as the
+ Phi node in this case.
+
+In `Chapter 7 <LangImpl07.html>`_ of this tutorial ("mutable variables"),
+we'll talk about #1 in depth. For now, just believe me that you don't
+need SSA construction to handle this case. For #2, you have the choice
+of using the techniques that we will describe for #1, or you can insert
+Phi nodes directly, if convenient. In this case, it is really
+easy to generate the Phi node, so we choose to do it directly.
+
+Okay, enough of the motivation and overview, let's generate code!
+
+Code Generation for If/Then/Else
+--------------------------------
+
+In order to generate code for this, we implement the ``codegen`` method
+for ``IfExprAST``:
+
+.. code-block:: c++
+
+ Value *IfExprAST::codegen() {
+ Value *CondV = Cond->codegen();
+ if (!CondV)
+ return nullptr;
+
+ // Convert condition to a bool by comparing non-equal to 0.0.
+ CondV = Builder.CreateFCmpONE(
+ CondV, ConstantFP::get(TheContext, APFloat(0.0)), "ifcond");
+
+This code is straightforward and similar to what we saw before. We emit
+the expression for the condition, then compare that value to zero to get
+a truth value as a 1-bit (bool) value.
+
+.. code-block:: c++
+
+ Function *TheFunction = Builder.GetInsertBlock()->getParent();
+
+ // Create blocks for the then and else cases. Insert the 'then' block at the
+ // end of the function.
+ BasicBlock *ThenBB =
+ BasicBlock::Create(TheContext, "then", TheFunction);
+ BasicBlock *ElseBB = BasicBlock::Create(TheContext, "else");
+ BasicBlock *MergeBB = BasicBlock::Create(TheContext, "ifcont");
+
+ Builder.CreateCondBr(CondV, ThenBB, ElseBB);
+
+This code creates the basic blocks that are related to the if/then/else
+statement, and correspond directly to the blocks in the example above.
+The first line gets the current Function object that is being built. It
+gets this by asking the builder for the current BasicBlock, and asking
+that block for its "parent" (the function it is currently embedded
+into).
+
+Once it has that, it creates three blocks. Note that it passes
+"TheFunction" into the constructor for the "then" block. This causes the
+constructor to automatically insert the new block into the end of the
+specified function. The other two blocks are created, but aren't yet
+inserted into the function.
+
+Once the blocks are created, we can emit the conditional branch that
+chooses between them. Note that creating new blocks does not implicitly
+affect the IRBuilder, so it is still inserting into the block that the
+condition went into. Also note that it is creating a branch to the
+"then" block and the "else" block, even though the "else" block isn't
+inserted into the function yet. This is all ok: it is the standard way
+that LLVM supports forward references.
+
+.. code-block:: c++
+
+ // Emit then value.
+ Builder.SetInsertPoint(ThenBB);
+
+ Value *ThenV = Then->codegen();
+ if (!ThenV)
+ return nullptr;
+
+ Builder.CreateBr(MergeBB);
+ // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
+ ThenBB = Builder.GetInsertBlock();
+
+After the conditional branch is inserted, we move the builder to start
+inserting into the "then" block. Strictly speaking, this call moves the
+insertion point to be at the end of the specified block. However, since
+the "then" block is empty, it also starts out by inserting at the
+beginning of the block. :)
+
+Once the insertion point is set, we recursively codegen the "then"
+expression from the AST. To finish off the "then" block, we create an
+unconditional branch to the merge block. One interesting (and very
+important) aspect of the LLVM IR is that it `requires all basic blocks
+to be "terminated" <../LangRef.html#functionstructure>`_ with a `control
+flow instruction <../LangRef.html#terminators>`_ such as return or
+branch. This means that all control flow, *including fall throughs* must
+be made explicit in the LLVM IR. If you violate this rule, the verifier
+will emit an error.
+
+The final line here is quite subtle, but is very important. The basic
+issue is that when we create the Phi node in the merge block, we need to
+set up the block/value pairs that indicate how the Phi will work.
+Importantly, the Phi node expects to have an entry for each predecessor
+of the block in the CFG. Why then, are we getting the current block when
+we just set it to ThenBB 5 lines above? The problem is that the "Then"
+expression may actually itself change the block that the Builder is
+emitting into if, for example, it contains a nested "if/then/else"
+expression. Because calling ``codegen()`` recursively could arbitrarily change
+the notion of the current block, we are required to get an up-to-date
+value for code that will set up the Phi node.
+
+.. code-block:: c++
+
+ // Emit else block.
+ TheFunction->getBasicBlockList().push_back(ElseBB);
+ Builder.SetInsertPoint(ElseBB);
+
+ Value *ElseV = Else->codegen();
+ if (!ElseV)
+ return nullptr;
+
+ Builder.CreateBr(MergeBB);
+ // codegen of 'Else' can change the current block, update ElseBB for the PHI.
+ ElseBB = Builder.GetInsertBlock();
+
+Code generation for the 'else' block is basically identical to codegen
+for the 'then' block. The only significant difference is the first line,
+which adds the 'else' block to the function. Recall previously that the
+'else' block was created, but not added to the function. Now that the
+'then' and 'else' blocks are emitted, we can finish up with the merge
+code:
+
+.. code-block:: c++
+
+ // Emit merge block.
+ TheFunction->getBasicBlockList().push_back(MergeBB);
+ Builder.SetInsertPoint(MergeBB);
+ PHINode *PN =
+ Builder.CreatePHI(Type::getDoubleTy(TheContext), 2, "iftmp");
+
+ PN->addIncoming(ThenV, ThenBB);
+ PN->addIncoming(ElseV, ElseBB);
+ return PN;
+ }
+
+The first two lines here are now familiar: the first adds the "merge"
+block to the Function object (it was previously floating, like the else
+block above). The second changes the insertion point so that newly
+created code will go into the "merge" block. Once that is done, we need
+to create the PHI node and set up the block/value pairs for the PHI.
+
+Finally, the CodeGen function returns the phi node as the value computed
+by the if/then/else expression. In our example above, this returned
+value will feed into the code for the top-level function, which will
+create the return instruction.
+
+Overall, we now have the ability to execute conditional code in
+Kaleidoscope. With this extension, Kaleidoscope is a fairly complete
+language that can calculate a wide variety of numeric functions. Next up
+we'll add another useful expression that is familiar from non-functional
+languages...
+
+'for' Loop Expression
+=====================
+
+Now that we know how to add basic control flow constructs to the
+language, we have the tools to add more powerful things. Let's add
+something more aggressive, a 'for' expression:
+
+::
+
+ extern putchard(char);
+ def printstar(n)
+ for i = 1, i < n, 1.0 in
+ putchard(42); # ascii 42 = '*'
+
+ # print 100 '*' characters
+ printstar(100);
+
+This expression defines a new variable ("i" in this case) which iterates
+from a starting value, while the condition ("i < n" in this case) is
+true, incrementing by an optional step value ("1.0" in this case). If
+the step value is omitted, it defaults to 1.0. While the loop is true,
+it executes its body expression. Because we don't have anything better
+to return, we'll just define the loop as always returning 0.0. In the
+future when we have mutable variables, it will get more useful.
+
+As before, let's talk about the changes that we need to Kaleidoscope to
+support this.
+
+Lexer Extensions for the 'for' Loop
+-----------------------------------
+
+The lexer extensions are the same sort of thing as for if/then/else:
+
+.. code-block:: c++
+
+ ... in enum Token ...
+ // control
+ tok_if = -6, tok_then = -7, tok_else = -8,
+ tok_for = -9, tok_in = -10
+
+ ... in gettok ...
+ if (IdentifierStr == "def")
+ return tok_def;
+ if (IdentifierStr == "extern")
+ return tok_extern;
+ if (IdentifierStr == "if")
+ return tok_if;
+ if (IdentifierStr == "then")
+ return tok_then;
+ if (IdentifierStr == "else")
+ return tok_else;
+ if (IdentifierStr == "for")
+ return tok_for;
+ if (IdentifierStr == "in")
+ return tok_in;
+ return tok_identifier;
+
+AST Extensions for the 'for' Loop
+---------------------------------
+
+The AST node is just as simple. It basically boils down to capturing the
+variable name and the constituent expressions in the node.
+
+.. code-block:: c++
+
+ /// ForExprAST - Expression class for for/in.
+ class ForExprAST : public ExprAST {
+ std::string VarName;
+ std::unique_ptr<ExprAST> Start, End, Step, Body;
+
+ public:
+ ForExprAST(const std::string &VarName, std::unique_ptr<ExprAST> Start,
+ std::unique_ptr<ExprAST> End, std::unique_ptr<ExprAST> Step,
+ std::unique_ptr<ExprAST> Body)
+ : VarName(VarName), Start(std::move(Start)), End(std::move(End)),
+ Step(std::move(Step)), Body(std::move(Body)) {}
+
+ Value *codegen() override;
+ };
+
+Parser Extensions for the 'for' Loop
+------------------------------------
+
+The parser code is also fairly standard. The only interesting thing here
+is handling of the optional step value. The parser code handles it by
+checking to see if the second comma is present. If not, it sets the step
+value to null in the AST node:
+
+.. code-block:: c++
+
+ /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
+ static std::unique_ptr<ExprAST> ParseForExpr() {
+ getNextToken(); // eat the for.
+
+ if (CurTok != tok_identifier)
+ return LogError("expected identifier after for");
+
+ std::string IdName = IdentifierStr;
+ getNextToken(); // eat identifier.
+
+ if (CurTok != '=')
+ return LogError("expected '=' after for");
+ getNextToken(); // eat '='.
+
+
+ auto Start = ParseExpression();
+ if (!Start)
+ return nullptr;
+ if (CurTok != ',')
+ return LogError("expected ',' after for start value");
+ getNextToken();
+
+ auto End = ParseExpression();
+ if (!End)
+ return nullptr;
+
+ // The step value is optional.
+ std::unique_ptr<ExprAST> Step;
+ if (CurTok == ',') {
+ getNextToken();
+ Step = ParseExpression();
+ if (!Step)
+ return nullptr;
+ }
+
+ if (CurTok != tok_in)
+ return LogError("expected 'in' after for");
+ getNextToken(); // eat 'in'.
+
+ auto Body = ParseExpression();
+ if (!Body)
+ return nullptr;
+
+ return llvm::make_unique<ForExprAST>(IdName, std::move(Start),
+ std::move(End), std::move(Step),
+ std::move(Body));
+ }
+
+And again we hook it up as a primary expression:
+
+.. code-block:: c++
+
+ static std::unique_ptr<ExprAST> ParsePrimary() {
+ switch (CurTok) {
+ default:
+ return LogError("unknown token when expecting an expression");
+ case tok_identifier:
+ return ParseIdentifierExpr();
+ case tok_number:
+ return ParseNumberExpr();
+ case '(':
+ return ParseParenExpr();
+ case tok_if:
+ return ParseIfExpr();
+ case tok_for:
+ return ParseForExpr();
+ }
+ }
+
+LLVM IR for the 'for' Loop
+--------------------------
+
+Now we get to the good part: the LLVM IR we want to generate for this
+thing. With the simple example above, we get this LLVM IR (note that
+this dump is generated with optimizations disabled for clarity):
+
+.. code-block:: llvm
+
+ declare double @putchard(double)
+
+ define double @printstar(double %n) {
+ entry:
+ ; initial value = 1.0 (inlined into phi)
+ br label %loop
+
+ loop: ; preds = %loop, %entry
+ %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
+ ; body
+ %calltmp = call double @putchard(double 4.200000e+01)
+ ; increment
+ %nextvar = fadd double %i, 1.000000e+00
+
+ ; termination test
+ %cmptmp = fcmp ult double %i, %n
+ %booltmp = uitofp i1 %cmptmp to double
+ %loopcond = fcmp one double %booltmp, 0.000000e+00
+ br i1 %loopcond, label %loop, label %afterloop
+
+ afterloop: ; preds = %loop
+ ; loop always returns 0.0
+ ret double 0.000000e+00
+ }
+
+This loop contains all the same constructs we saw before: a phi node,
+several expressions, and some basic blocks. Let's see how this fits
+together.
+
+Code Generation for the 'for' Loop
+----------------------------------
+
+The first part of codegen is very simple: we just output the start
+expression for the loop value:
+
+.. code-block:: c++
+
+ Value *ForExprAST::codegen() {
+ // Emit the start code first, without 'variable' in scope.
+ Value *StartVal = Start->codegen();
+ if (!StartVal)
+ return nullptr;
+
+With this out of the way, the next step is to set up the LLVM basic
+block for the start of the loop body. In the case above, the whole loop
+body is one block, but remember that the body code itself could consist
+of multiple blocks (e.g. if it contains an if/then/else or a for/in
+expression).
+
+.. code-block:: c++
+
+ // Make the new basic block for the loop header, inserting after current
+ // block.
+ Function *TheFunction = Builder.GetInsertBlock()->getParent();
+ BasicBlock *PreheaderBB = Builder.GetInsertBlock();
+ BasicBlock *LoopBB =
+ BasicBlock::Create(TheContext, "loop", TheFunction);
+
+ // Insert an explicit fall through from the current block to the LoopBB.
+ Builder.CreateBr(LoopBB);
+
+This code is similar to what we saw for if/then/else. Because we will
+need it to create the Phi node, we remember the block that falls through
+into the loop. Once we have that, we create the actual block that starts
+the loop and create an unconditional branch for the fall-through between
+the two blocks.
+
+.. code-block:: c++
+
+ // Start insertion in LoopBB.
+ Builder.SetInsertPoint(LoopBB);
+
+ // Start the PHI node with an entry for Start.
+ PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(TheContext),
+ 2, VarName.c_str());
+ Variable->addIncoming(StartVal, PreheaderBB);
+
+Now that the "preheader" for the loop is set up, we switch to emitting
+code for the loop body. To begin with, we move the insertion point and
+create the PHI node for the loop induction variable. Since we already
+know the incoming value for the starting value, we add it to the Phi
+node. Note that the Phi will eventually get a second value for the
+backedge, but we can't set it up yet (because it doesn't exist!).
+
+.. code-block:: c++
+
+ // Within the loop, the variable is defined equal to the PHI node. If it
+ // shadows an existing variable, we have to restore it, so save it now.
+ Value *OldVal = NamedValues[VarName];
+ NamedValues[VarName] = Variable;
+
+ // Emit the body of the loop. This, like any other expr, can change the
+ // current BB. Note that we ignore the value computed by the body, but don't
+ // allow an error.
+ if (!Body->codegen())
+ return nullptr;
+
+Now the code starts to get more interesting. Our 'for' loop introduces a
+new variable to the symbol table. This means that our symbol table can
+now contain either function arguments or loop variables. To handle this,
+before we codegen the body of the loop, we add the loop variable as the
+current value for its name. Note that it is possible that there is a
+variable of the same name in the outer scope. It would be easy to make
+this an error (emit an error and return null if there is already an
+entry for VarName) but we choose to allow shadowing of variables. In
+order to handle this correctly, we remember the Value that we are
+potentially shadowing in ``OldVal`` (which will be null if there is no
+shadowed variable).
+
+Once the loop variable is set into the symbol table, the code
+recursively codegen's the body. This allows the body to use the loop
+variable: any references to it will naturally find it in the symbol
+table.
+
+.. code-block:: c++
+
+ // Emit the step value.
+ Value *StepVal = nullptr;
+ if (Step) {
+ StepVal = Step->codegen();
+ if (!StepVal)
+ return nullptr;
+ } else {
+ // If not specified, use 1.0.
+ StepVal = ConstantFP::get(TheContext, APFloat(1.0));
+ }
+
+ Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
+
+Now that the body is emitted, we compute the next value of the iteration
+variable by adding the step value, or 1.0 if it isn't present.
+'``NextVar``' will be the value of the loop variable on the next
+iteration of the loop.
+
+.. code-block:: c++
+
+ // Compute the end condition.
+ Value *EndCond = End->codegen();
+ if (!EndCond)
+ return nullptr;
+
+ // Convert condition to a bool by comparing non-equal to 0.0.
+ EndCond = Builder.CreateFCmpONE(
+ EndCond, ConstantFP::get(TheContext, APFloat(0.0)), "loopcond");
+
+Finally, we evaluate the exit value of the loop, to determine whether
+the loop should exit. This mirrors the condition evaluation for the
+if/then/else statement.
+
+.. code-block:: c++
+
+ // Create the "after loop" block and insert it.
+ BasicBlock *LoopEndBB = Builder.GetInsertBlock();
+ BasicBlock *AfterBB =
+ BasicBlock::Create(TheContext, "afterloop", TheFunction);
+
+ // Insert the conditional branch into the end of LoopEndBB.
+ Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
+
+ // Any new code will be inserted in AfterBB.
+ Builder.SetInsertPoint(AfterBB);
+
+With the code for the body of the loop complete, we just need to finish
+up the control flow for it. This code remembers the end block (for the
+phi node), then creates the block for the loop exit ("afterloop"). Based
+on the value of the exit condition, it creates a conditional branch that
+chooses between executing the loop again and exiting the loop. Any
+future code is emitted in the "afterloop" block, so it sets the
+insertion position to it.
+
+.. code-block:: c++
+
+ // Add a new entry to the PHI node for the backedge.
+ Variable->addIncoming(NextVar, LoopEndBB);
+
+ // Restore the unshadowed variable.
+ if (OldVal)
+ NamedValues[VarName] = OldVal;
+ else
+ NamedValues.erase(VarName);
+
+ // for expr always returns 0.0.
+ return Constant::getNullValue(Type::getDoubleTy(TheContext));
+ }
+
+The final code handles various cleanups: now that we have the "NextVar"
+value, we can add the incoming value to the loop PHI node. After that,
+we remove the loop variable from the symbol table, so that it isn't in
+scope after the for loop. Finally, code generation of the for loop
+always returns 0.0, so that is what we return from
+``ForExprAST::codegen()``.
+
+With this, we conclude the "adding control flow to Kaleidoscope" chapter
+of the tutorial. In this chapter we added two control flow constructs,
+and used them to motivate a couple of aspects of the LLVM IR that are
+important for front-end implementors to know. In the next chapter of our
+saga, we will get a bit crazier and add `user-defined
+operators <LangImpl06.html>`_ to our poor innocent language.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the if/then/else and for expressions. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter5/toy.cpp
+ :language: c++
+
+`Next: Extending the language: user-defined operators <LangImpl06.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl06.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl06.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl06.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl06.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,768 @@
+============================================================
+Kaleidoscope: Extending the Language: User-defined Operators
+============================================================
+
+.. contents::
+ :local:
+
+Chapter 6 Introduction
+======================
+
+Welcome to Chapter 6 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. At this point in our tutorial, we now
+have a fully functional language that is fairly minimal, but also
+useful. There is still one big problem with it, however. Our language
+doesn't have many useful operators (like division, logical negation, or
+even any comparisons besides less-than).
+
+This chapter of the tutorial takes a wild digression into adding
+user-defined operators to the simple and beautiful Kaleidoscope
+language. This digression now gives us a simple and ugly language in
+some ways, but also a powerful one at the same time. One of the great
+things about creating your own language is that you get to decide what
+is good or bad. In this tutorial we'll assume that it is okay to use
+this as a way to show some interesting parsing techniques.
+
+At the end of this tutorial, we'll run through an example Kaleidoscope
+application that `renders the Mandelbrot set <#kicking-the-tires>`_. This gives an
+example of what you can build with Kaleidoscope and its feature set.
+
+User-defined Operators: the Idea
+================================
+
+The "operator overloading" that we will add to Kaleidoscope is more
+general than in languages like C++. In C++, you are only allowed to
+redefine existing operators: you can't programmatically change the
+grammar, introduce new operators, change precedence levels, etc. In this
+chapter, we will add this capability to Kaleidoscope, which will let the
+user round out the set of operators that are supported.
+
+The point of going into user-defined operators in a tutorial like this
+is to show the power and flexibility of using a hand-written parser.
+Thus far, the parser we have been implementing uses recursive descent
+for most parts of the grammar and operator precedence parsing for the
+expressions. See `Chapter 2 <LangImpl02.html>`_ for details. By
+using operator precedence parsing, it is very easy to allow
+the programmer to introduce new operators into the grammar: the grammar
+is dynamically extensible as the JIT runs.
+
+The two specific features we'll add are programmable unary operators
+(right now, Kaleidoscope has no unary operators at all) as well as
+binary operators. An example of this is:
+
+::
+
+ # Logical unary not.
+ def unary!(v)
+ if v then
+ 0
+ else
+ 1;
+
+ # Define > with the same precedence as <.
+ def binary> 10 (LHS RHS)
+ RHS < LHS;
+
+ # Binary "logical or", (note that it does not "short circuit")
+ def binary| 5 (LHS RHS)
+ if LHS then
+ 1
+ else if RHS then
+ 1
+ else
+ 0;
+
+ # Define = with slightly lower precedence than relationals.
+ def binary= 9 (LHS RHS)
+ !(LHS < RHS | LHS > RHS);
+
+Many languages aspire to being able to implement their standard runtime
+library in the language itself. In Kaleidoscope, we can implement
+significant parts of the language in the library!
+
+We will break down implementation of these features into two parts:
+implementing support for user-defined binary operators and adding unary
+operators.
+
+User-defined Binary Operators
+=============================
+
+Adding support for user-defined binary operators is pretty simple with
+our current framework. We'll first add support for the unary/binary
+keywords:
+
+.. code-block:: c++
+
+ enum Token {
+ ...
+ // operators
+ tok_binary = -11,
+ tok_unary = -12
+ };
+ ...
+ static int gettok() {
+ ...
+ if (IdentifierStr == "for")
+ return tok_for;
+ if (IdentifierStr == "in")
+ return tok_in;
+ if (IdentifierStr == "binary")
+ return tok_binary;
+ if (IdentifierStr == "unary")
+ return tok_unary;
+ return tok_identifier;
+
+This just adds lexer support for the unary and binary keywords, like we
+did in `previous chapters <LangImpl5.html#lexer-extensions-for-if-then-else>`_. One nice thing
+about our current AST, is that we represent binary operators with full
+generalisation by using their ASCII code as the opcode. For our extended
+operators, we'll use this same representation, so we don't need any new
+AST or parser support.
+
+On the other hand, we have to be able to represent the definitions of
+these new operators, in the "def binary\| 5" part of the function
+definition. In our grammar so far, the "name" for the function
+definition is parsed as the "prototype" production and into the
+``PrototypeAST`` AST node. To represent our new user-defined operators
+as prototypes, we have to extend the ``PrototypeAST`` AST node like
+this:
+
+.. code-block:: c++
+
+ /// PrototypeAST - This class represents the "prototype" for a function,
+ /// which captures its argument names as well as if it is an operator.
+ class PrototypeAST {
+ std::string Name;
+ std::vector<std::string> Args;
+ bool IsOperator;
+ unsigned Precedence; // Precedence if a binary op.
+
+ public:
+ PrototypeAST(const std::string &name, std::vector<std::string> Args,
+ bool IsOperator = false, unsigned Prec = 0)
+ : Name(name), Args(std::move(Args)), IsOperator(IsOperator),
+ Precedence(Prec) {}
+
+ Function *codegen();
+ const std::string &getName() const { return Name; }
+
+ bool isUnaryOp() const { return IsOperator && Args.size() == 1; }
+ bool isBinaryOp() const { return IsOperator && Args.size() == 2; }
+
+ char getOperatorName() const {
+ assert(isUnaryOp() || isBinaryOp());
+ return Name[Name.size() - 1];
+ }
+
+ unsigned getBinaryPrecedence() const { return Precedence; }
+ };
+
+Basically, in addition to knowing a name for the prototype, we now keep
+track of whether it was an operator, and if it was, what precedence
+level the operator is at. The precedence is only used for binary
+operators (as you'll see below, it just doesn't apply for unary
+operators). Now that we have a way to represent the prototype for a
+user-defined operator, we need to parse it:
+
+.. code-block:: c++
+
+ /// prototype
+ /// ::= id '(' id* ')'
+ /// ::= binary LETTER number? (id, id)
+ static std::unique_ptr<PrototypeAST> ParsePrototype() {
+ std::string FnName;
+
+ unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
+ unsigned BinaryPrecedence = 30;
+
+ switch (CurTok) {
+ default:
+ return LogErrorP("Expected function name in prototype");
+ case tok_identifier:
+ FnName = IdentifierStr;
+ Kind = 0;
+ getNextToken();
+ break;
+ case tok_binary:
+ getNextToken();
+ if (!isascii(CurTok))
+ return LogErrorP("Expected binary operator");
+ FnName = "binary";
+ FnName += (char)CurTok;
+ Kind = 2;
+ getNextToken();
+
+ // Read the precedence if present.
+ if (CurTok == tok_number) {
+ if (NumVal < 1 || NumVal > 100)
+ return LogErrorP("Invalid precedence: must be 1..100");
+ BinaryPrecedence = (unsigned)NumVal;
+ getNextToken();
+ }
+ break;
+ }
+
+ if (CurTok != '(')
+ return LogErrorP("Expected '(' in prototype");
+
+ std::vector<std::string> ArgNames;
+ while (getNextToken() == tok_identifier)
+ ArgNames.push_back(IdentifierStr);
+ if (CurTok != ')')
+ return LogErrorP("Expected ')' in prototype");
+
+ // success.
+ getNextToken(); // eat ')'.
+
+ // Verify right number of names for operator.
+ if (Kind && ArgNames.size() != Kind)
+ return LogErrorP("Invalid number of operands for operator");
+
+ return llvm::make_unique<PrototypeAST>(FnName, std::move(ArgNames), Kind != 0,
+ BinaryPrecedence);
+ }
+
+This is all fairly straightforward parsing code, and we have already
+seen a lot of similar code in the past. One interesting part about the
+code above is the couple lines that set up ``FnName`` for binary
+operators. This builds names like "binary@" for a newly defined "@"
+operator. It then takes advantage of the fact that symbol names in the
+LLVM symbol table are allowed to have any character in them, including
+embedded nul characters.
+
+The next interesting thing to add, is codegen support for these binary
+operators. Given our current structure, this is a simple addition of a
+default case for our existing binary operator node:
+
+.. code-block:: c++
+
+ Value *BinaryExprAST::codegen() {
+ Value *L = LHS->codegen();
+ Value *R = RHS->codegen();
+ if (!L || !R)
+ return nullptr;
+
+ switch (Op) {
+ case '+':
+ return Builder.CreateFAdd(L, R, "addtmp");
+ case '-':
+ return Builder.CreateFSub(L, R, "subtmp");
+ case '*':
+ return Builder.CreateFMul(L, R, "multmp");
+ case '<':
+ L = Builder.CreateFCmpULT(L, R, "cmptmp");
+ // Convert bool 0/1 to double 0.0 or 1.0
+ return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext),
+ "booltmp");
+ default:
+ break;
+ }
+
+ // If it wasn't a builtin binary operator, it must be a user defined one. Emit
+ // a call to it.
+ Function *F = getFunction(std::string("binary") + Op);
+ assert(F && "binary operator not found!");
+
+ Value *Ops[2] = { L, R };
+ return Builder.CreateCall(F, Ops, "binop");
+ }
+
+As you can see above, the new code is actually really simple. It just
+does a lookup for the appropriate operator in the symbol table and
+generates a function call to it. Since user-defined operators are just
+built as normal functions (because the "prototype" boils down to a
+function with the right name) everything falls into place.
+
+The final piece of code we are missing, is a bit of top-level magic:
+
+.. code-block:: c++
+
+ Function *FunctionAST::codegen() {
+ // Transfer ownership of the prototype to the FunctionProtos map, but keep a
+ // reference to it for use below.
+ auto &P = *Proto;
+ FunctionProtos[Proto->getName()] = std::move(Proto);
+ Function *TheFunction = getFunction(P.getName());
+ if (!TheFunction)
+ return nullptr;
+
+ // If this is an operator, install it.
+ if (P.isBinaryOp())
+ BinopPrecedence[P.getOperatorName()] = P.getBinaryPrecedence();
+
+ // Create a new basic block to start insertion into.
+ BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
+ ...
+
+Basically, before codegening a function, if it is a user-defined
+operator, we register it in the precedence table. This allows the binary
+operator parsing logic we already have in place to handle it. Since we
+are working on a fully-general operator precedence parser, this is all
+we need to do to "extend the grammar".
+
+Now we have useful user-defined binary operators. This builds a lot on
+the previous framework we built for other operators. Adding unary
+operators is a bit more challenging, because we don't have any framework
+for it yet - let's see what it takes.
+
+User-defined Unary Operators
+============================
+
+Since we don't currently support unary operators in the Kaleidoscope
+language, we'll need to add everything to support them. Above, we added
+simple support for the 'unary' keyword to the lexer. In addition to
+that, we need an AST node:
+
+.. code-block:: c++
+
+ /// UnaryExprAST - Expression class for a unary operator.
+ class UnaryExprAST : public ExprAST {
+ char Opcode;
+ std::unique_ptr<ExprAST> Operand;
+
+ public:
+ UnaryExprAST(char Opcode, std::unique_ptr<ExprAST> Operand)
+ : Opcode(Opcode), Operand(std::move(Operand)) {}
+
+ Value *codegen() override;
+ };
+
+This AST node is very simple and obvious by now. It directly mirrors the
+binary operator AST node, except that it only has one child. With this,
+we need to add the parsing logic. Parsing a unary operator is pretty
+simple: we'll add a new function to do it:
+
+.. code-block:: c++
+
+ /// unary
+ /// ::= primary
+ /// ::= '!' unary
+ static std::unique_ptr<ExprAST> ParseUnary() {
+ // If the current token is not an operator, it must be a primary expr.
+ if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
+ return ParsePrimary();
+
+ // If this is a unary operator, read it.
+ int Opc = CurTok;
+ getNextToken();
+ if (auto Operand = ParseUnary())
+ return llvm::make_unique<UnaryExprAST>(Opc, std::move(Operand));
+ return nullptr;
+ }
+
+The grammar we add is pretty straightforward here. If we see a unary
+operator when parsing a primary operator, we eat the operator as a
+prefix and parse the remaining piece as another unary operator. This
+allows us to handle multiple unary operators (e.g. "!!x"). Note that
+unary operators can't have ambiguous parses like binary operators can,
+so there is no need for precedence information.
+
+The problem with this function, is that we need to call ParseUnary from
+somewhere. To do this, we change previous callers of ParsePrimary to
+call ParseUnary instead:
+
+.. code-block:: c++
+
+ /// binoprhs
+ /// ::= ('+' unary)*
+ static std::unique_ptr<ExprAST> ParseBinOpRHS(int ExprPrec,
+ std::unique_ptr<ExprAST> LHS) {
+ ...
+ // Parse the unary expression after the binary operator.
+ auto RHS = ParseUnary();
+ if (!RHS)
+ return nullptr;
+ ...
+ }
+ /// expression
+ /// ::= unary binoprhs
+ ///
+ static std::unique_ptr<ExprAST> ParseExpression() {
+ auto LHS = ParseUnary();
+ if (!LHS)
+ return nullptr;
+
+ return ParseBinOpRHS(0, std::move(LHS));
+ }
+
+With these two simple changes, we are now able to parse unary operators
+and build the AST for them. Next up, we need to add parser support for
+prototypes, to parse the unary operator prototype. We extend the binary
+operator code above with:
+
+.. code-block:: c++
+
+ /// prototype
+ /// ::= id '(' id* ')'
+ /// ::= binary LETTER number? (id, id)
+ /// ::= unary LETTER (id)
+ static std::unique_ptr<PrototypeAST> ParsePrototype() {
+ std::string FnName;
+
+ unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
+ unsigned BinaryPrecedence = 30;
+
+ switch (CurTok) {
+ default:
+ return LogErrorP("Expected function name in prototype");
+ case tok_identifier:
+ FnName = IdentifierStr;
+ Kind = 0;
+ getNextToken();
+ break;
+ case tok_unary:
+ getNextToken();
+ if (!isascii(CurTok))
+ return LogErrorP("Expected unary operator");
+ FnName = "unary";
+ FnName += (char)CurTok;
+ Kind = 1;
+ getNextToken();
+ break;
+ case tok_binary:
+ ...
+
+As with binary operators, we name unary operators with a name that
+includes the operator character. This assists us at code generation
+time. Speaking of, the final piece we need to add is codegen support for
+unary operators. It looks like this:
+
+.. code-block:: c++
+
+ Value *UnaryExprAST::codegen() {
+ Value *OperandV = Operand->codegen();
+ if (!OperandV)
+ return nullptr;
+
+ Function *F = getFunction(std::string("unary") + Opcode);
+ if (!F)
+ return LogErrorV("Unknown unary operator");
+
+ return Builder.CreateCall(F, OperandV, "unop");
+ }
+
+This code is similar to, but simpler than, the code for binary
+operators. It is simpler primarily because it doesn't need to handle any
+predefined operators.
+
+Kicking the Tires
+=================
+
+It is somewhat hard to believe, but with a few simple extensions we've
+covered in the last chapters, we have grown a real-ish language. With
+this, we can do a lot of interesting things, including I/O, math, and a
+bunch of other things. For example, we can now add a nice sequencing
+operator (printd is defined to print out the specified value and a
+newline):
+
+::
+
+ ready> extern printd(x);
+ Read extern:
+ declare double @printd(double)
+
+ ready> def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.
+ ...
+ ready> printd(123) : printd(456) : printd(789);
+ 123.000000
+ 456.000000
+ 789.000000
+ Evaluated to 0.000000
+
+We can also define a bunch of other "primitive" operations, such as:
+
+::
+
+ # Logical unary not.
+ def unary!(v)
+ if v then
+ 0
+ else
+ 1;
+
+ # Unary negate.
+ def unary-(v)
+ 0-v;
+
+ # Define > with the same precedence as <.
+ def binary> 10 (LHS RHS)
+ RHS < LHS;
+
+ # Binary logical or, which does not short circuit.
+ def binary| 5 (LHS RHS)
+ if LHS then
+ 1
+ else if RHS then
+ 1
+ else
+ 0;
+
+ # Binary logical and, which does not short circuit.
+ def binary& 6 (LHS RHS)
+ if !LHS then
+ 0
+ else
+ !!RHS;
+
+ # Define = with slightly lower precedence than relationals.
+ def binary = 9 (LHS RHS)
+ !(LHS < RHS | LHS > RHS);
+
+ # Define ':' for sequencing: as a low-precedence operator that ignores operands
+ # and just returns the RHS.
+ def binary : 1 (x y) y;
+
+Given the previous if/then/else support, we can also define interesting
+functions for I/O. For example, the following prints out a character
+whose "density" reflects the value passed in: the lower the value, the
+denser the character:
+
+::
+
+ ready> extern putchard(char);
+ ...
+ ready> def printdensity(d)
+ if d > 8 then
+ putchard(32) # ' '
+ else if d > 4 then
+ putchard(46) # '.'
+ else if d > 2 then
+ putchard(43) # '+'
+ else
+ putchard(42); # '*'
+ ...
+ ready> printdensity(1): printdensity(2): printdensity(3):
+ printdensity(4): printdensity(5): printdensity(9):
+ putchard(10);
+ **++.
+ Evaluated to 0.000000
+
+Based on these simple primitive operations, we can start to define more
+interesting things. For example, here's a little function that determines
+the number of iterations it takes for a certain function in the complex
+plane to diverge:
+
+::
+
+ # Determine whether the specific location diverges.
+ # Solve for z = z^2 + c in the complex plane.
+ def mandelconverger(real imag iters creal cimag)
+ if iters > 255 | (real*real + imag*imag > 4) then
+ iters
+ else
+ mandelconverger(real*real - imag*imag + creal,
+ 2*real*imag + cimag,
+ iters+1, creal, cimag);
+
+ # Return the number of iterations required for the iteration to escape
+ def mandelconverge(real imag)
+ mandelconverger(real, imag, 0, real, imag);
+
+This "``z = z2 + c``" function is a beautiful little creature that is
+the basis for computation of the `Mandelbrot
+Set <http://en.wikipedia.org/wiki/Mandelbrot_set>`_. Our
+``mandelconverge`` function returns the number of iterations that it
+takes for a complex orbit to escape, saturating to 255. This is not a
+very useful function by itself, but if you plot its value over a
+two-dimensional plane, you can see the Mandelbrot set. Given that we are
+limited to using putchard here, our amazing graphical output is limited,
+but we can whip together something using the density plotter above:
+
+::
+
+ # Compute and plot the mandelbrot set with the specified 2 dimensional range
+ # info.
+ def mandelhelp(xmin xmax xstep ymin ymax ystep)
+ for y = ymin, y < ymax, ystep in (
+ (for x = xmin, x < xmax, xstep in
+ printdensity(mandelconverge(x,y)))
+ : putchard(10)
+ )
+
+ # mandel - This is a convenient helper function for plotting the mandelbrot set
+ # from the specified position with the specified Magnification.
+ def mandel(realstart imagstart realmag imagmag)
+ mandelhelp(realstart, realstart+realmag*78, realmag,
+ imagstart, imagstart+imagmag*40, imagmag);
+
+Given this, we can try plotting out the mandelbrot set! Lets try it out:
+
+::
+
+ ready> mandel(-2.3, -1.3, 0.05, 0.07);
+ *******************************+++++++++++*************************************
+ *************************+++++++++++++++++++++++*******************************
+ **********************+++++++++++++++++++++++++++++****************************
+ *******************+++++++++++++++++++++.. ...++++++++*************************
+ *****************++++++++++++++++++++++.... ...+++++++++***********************
+ ***************+++++++++++++++++++++++..... ...+++++++++*********************
+ **************+++++++++++++++++++++++.... ....+++++++++********************
+ *************++++++++++++++++++++++...... .....++++++++*******************
+ ************+++++++++++++++++++++....... .......+++++++******************
+ ***********+++++++++++++++++++.... ... .+++++++*****************
+ **********+++++++++++++++++....... .+++++++****************
+ *********++++++++++++++........... ...+++++++***************
+ ********++++++++++++............ ...++++++++**************
+ ********++++++++++... .......... .++++++++**************
+ *******+++++++++..... .+++++++++*************
+ *******++++++++...... ..+++++++++*************
+ *******++++++....... ..+++++++++*************
+ *******+++++...... ..+++++++++*************
+ *******.... .... ...+++++++++*************
+ *******.... . ...+++++++++*************
+ *******+++++...... ...+++++++++*************
+ *******++++++....... ..+++++++++*************
+ *******++++++++...... .+++++++++*************
+ *******+++++++++..... ..+++++++++*************
+ ********++++++++++... .......... .++++++++**************
+ ********++++++++++++............ ...++++++++**************
+ *********++++++++++++++.......... ...+++++++***************
+ **********++++++++++++++++........ .+++++++****************
+ **********++++++++++++++++++++.... ... ..+++++++****************
+ ***********++++++++++++++++++++++....... .......++++++++*****************
+ ************+++++++++++++++++++++++...... ......++++++++******************
+ **************+++++++++++++++++++++++.... ....++++++++********************
+ ***************+++++++++++++++++++++++..... ...+++++++++*********************
+ *****************++++++++++++++++++++++.... ...++++++++***********************
+ *******************+++++++++++++++++++++......++++++++*************************
+ *********************++++++++++++++++++++++.++++++++***************************
+ *************************+++++++++++++++++++++++*******************************
+ ******************************+++++++++++++************************************
+ *******************************************************************************
+ *******************************************************************************
+ *******************************************************************************
+ Evaluated to 0.000000
+ ready> mandel(-2, -1, 0.02, 0.04);
+ **************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
+ ***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ *********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
+ *******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
+ *****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
+ ***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
+ **************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
+ ************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
+ ***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
+ **********++++++++++++++++++++++++++++++++++++++++++++++.............
+ ********+++++++++++++++++++++++++++++++++++++++++++..................
+ *******+++++++++++++++++++++++++++++++++++++++.......................
+ ******+++++++++++++++++++++++++++++++++++...........................
+ *****++++++++++++++++++++++++++++++++............................
+ *****++++++++++++++++++++++++++++...............................
+ ****++++++++++++++++++++++++++...... .........................
+ ***++++++++++++++++++++++++......... ...... ...........
+ ***++++++++++++++++++++++............
+ **+++++++++++++++++++++..............
+ **+++++++++++++++++++................
+ *++++++++++++++++++.................
+ *++++++++++++++++............ ...
+ *++++++++++++++..............
+ *+++....++++................
+ *.......... ...........
+ *
+ *.......... ...........
+ *+++....++++................
+ *++++++++++++++..............
+ *++++++++++++++++............ ...
+ *++++++++++++++++++.................
+ **+++++++++++++++++++................
+ **+++++++++++++++++++++..............
+ ***++++++++++++++++++++++............
+ ***++++++++++++++++++++++++......... ...... ...........
+ ****++++++++++++++++++++++++++...... .........................
+ *****++++++++++++++++++++++++++++...............................
+ *****++++++++++++++++++++++++++++++++............................
+ ******+++++++++++++++++++++++++++++++++++...........................
+ *******+++++++++++++++++++++++++++++++++++++++.......................
+ ********+++++++++++++++++++++++++++++++++++++++++++..................
+ Evaluated to 0.000000
+ ready> mandel(-0.9, -1.4, 0.02, 0.03);
+ *******************************************************************************
+ *******************************************************************************
+ *******************************************************************************
+ **********+++++++++++++++++++++************************************************
+ *+++++++++++++++++++++++++++++++++++++++***************************************
+ +++++++++++++++++++++++++++++++++++++++++++++**********************************
+ ++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
+ ++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
+ +++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
+ +++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
+ +++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
+ +++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
+ ++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
+ +++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
+ ++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
+ ++++++++++++++++++++++++............. .......++++++++++++++++++++++******
+ +++++++++++++++++++++++............. ........+++++++++++++++++++++++****
+ ++++++++++++++++++++++........... ..........++++++++++++++++++++++***
+ ++++++++++++++++++++........... .........++++++++++++++++++++++*
+ ++++++++++++++++++............ ...........++++++++++++++++++++
+ ++++++++++++++++............... .............++++++++++++++++++
+ ++++++++++++++................. ...............++++++++++++++++
+ ++++++++++++.................. .................++++++++++++++
+ +++++++++.................. .................+++++++++++++
+ ++++++........ . ......... ..++++++++++++
+ ++............ ...... ....++++++++++
+ .............. ...++++++++++
+ .............. ....+++++++++
+ .............. .....++++++++
+ ............. ......++++++++
+ ........... .......++++++++
+ ......... ........+++++++
+ ......... ........+++++++
+ ......... ....+++++++
+ ........ ...+++++++
+ ....... ...+++++++
+ ....+++++++
+ .....+++++++
+ ....+++++++
+ ....+++++++
+ ....+++++++
+ Evaluated to 0.000000
+ ready> ^D
+
+At this point, you may be starting to realize that Kaleidoscope is a
+real and powerful language. It may not be self-similar :), but it can be
+used to plot things that are!
+
+With this, we conclude the "adding user-defined operators" chapter of
+the tutorial. We have successfully augmented our language, adding the
+ability to extend the language in the library, and we have shown how
+this can be used to build a simple but interesting end-user application
+in Kaleidoscope. At this point, Kaleidoscope can build a variety of
+applications that are functional and can call functions with
+side-effects, but it can't actually define and mutate a variable itself.
+
+Strikingly, variable mutation is an important feature of some languages,
+and it is not at all obvious how to `add support for mutable
+variables <LangImpl07.html>`_ without having to add an "SSA construction"
+phase to your front-end. In the next chapter, we will describe how you
+can add variable mutation without building SSA in your front-end.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the support for user-defined operators. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+On some platforms, you will need to specify -rdynamic or
+-Wl,--export-dynamic when linking. This ensures that symbols defined in
+the main executable are exported to the dynamic linker and so are
+available for symbol resolution at run time. This is not needed if you
+compile your support code into a shared library, although doing that
+will cause problems on Windows.
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter6/toy.cpp
+ :language: c++
+
+`Next: Extending the language: mutable variables / SSA
+construction <LangImpl07.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl07.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl07.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl07.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl07.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,883 @@
+=======================================================
+Kaleidoscope: Extending the Language: Mutable Variables
+=======================================================
+
+.. contents::
+ :local:
+
+Chapter 7 Introduction
+======================
+
+Welcome to Chapter 7 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
+very respectable, albeit simple, `functional programming
+language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
+journey, we learned some parsing techniques, how to build and represent
+an AST, how to build LLVM IR, and how to optimize the resultant code as
+well as JIT compile it.
+
+While Kaleidoscope is interesting as a functional language, the fact
+that it is functional makes it "too easy" to generate LLVM IR for it. In
+particular, a functional language makes it very easy to build LLVM IR
+directly in `SSA
+form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
+Since LLVM requires that the input code be in SSA form, this is a very
+nice property and it is often unclear to newcomers how to generate code
+for an imperative language with mutable variables.
+
+The short (and happy) summary of this chapter is that there is no need
+for your front-end to build SSA form: LLVM provides highly tuned and
+well tested support for this, though the way it works is a bit
+unexpected for some.
+
+Why is this a hard problem?
+===========================
+
+To understand why mutable variables cause complexities in SSA
+construction, consider this extremely simple C example:
+
+.. code-block:: c
+
+ int G, H;
+ int test(_Bool Condition) {
+ int X;
+ if (Condition)
+ X = G;
+ else
+ X = H;
+ return X;
+ }
+
+In this case, we have the variable "X", whose value depends on the path
+executed in the program. Because there are two different possible values
+for X before the return instruction, a PHI node is inserted to merge the
+two values. The LLVM IR that we want for this example looks like this:
+
+.. code-block:: llvm
+
+ @G = weak global i32 0 ; type of @G is i32*
+ @H = weak global i32 0 ; type of @H is i32*
+
+ define i32 @test(i1 %Condition) {
+ entry:
+ br i1 %Condition, label %cond_true, label %cond_false
+
+ cond_true:
+ %X.0 = load i32* @G
+ br label %cond_next
+
+ cond_false:
+ %X.1 = load i32* @H
+ br label %cond_next
+
+ cond_next:
+ %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
+ ret i32 %X.2
+ }
+
+In this example, the loads from the G and H global variables are
+explicit in the LLVM IR, and they live in the then/else branches of the
+if statement (cond\_true/cond\_false). In order to merge the incoming
+values, the X.2 phi node in the cond\_next block selects the right value
+to use based on where control flow is coming from: if control flow comes
+from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
+control flow comes from cond\_true, it gets the value of X.0. The intent
+of this chapter is not to explain the details of SSA form. For more
+information, see one of the many `online
+references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
+
+The question for this article is "who places the phi nodes when lowering
+assignments to mutable variables?". The issue here is that LLVM
+*requires* that its IR be in SSA form: there is no "non-ssa" mode for
+it. However, SSA construction requires non-trivial algorithms and data
+structures, so it is inconvenient and wasteful for every front-end to
+have to reproduce this logic.
+
+Memory in LLVM
+==============
+
+The 'trick' here is that while LLVM does require all register values to
+be in SSA form, it does not require (or permit) memory objects to be in
+SSA form. In the example above, note that the loads from G and H are
+direct accesses to G and H: they are not renamed or versioned. This
+differs from some other compiler systems, which do try to version memory
+objects. In LLVM, instead of encoding dataflow analysis of memory into
+the LLVM IR, it is handled with `Analysis
+Passes <../WritingAnLLVMPass.html>`_ which are computed on demand.
+
+With this in mind, the high-level idea is that we want to make a stack
+variable (which lives in memory, because it is on the stack) for each
+mutable object in a function. To take advantage of this trick, we need
+to talk about how LLVM represents stack variables.
+
+In LLVM, all memory accesses are explicit with load/store instructions,
+and it is carefully designed not to have (or need) an "address-of"
+operator. Notice how the type of the @G/@H global variables is actually
+"i32\*" even though the variable is defined as "i32". What this means is
+that @G defines *space* for an i32 in the global data area, but its
+*name* actually refers to the address for that space. Stack variables
+work the same way, except that instead of being declared with global
+variable definitions, they are declared with the `LLVM alloca
+instruction <../LangRef.html#alloca-instruction>`_:
+
+.. code-block:: llvm
+
+ define i32 @example() {
+ entry:
+ %X = alloca i32 ; type of %X is i32*.
+ ...
+ %tmp = load i32* %X ; load the stack value %X from the stack.
+ %tmp2 = add i32 %tmp, 1 ; increment it
+ store i32 %tmp2, i32* %X ; store it back
+ ...
+
+This code shows an example of how you can declare and manipulate a stack
+variable in the LLVM IR. Stack memory allocated with the alloca
+instruction is fully general: you can pass the address of the stack slot
+to functions, you can store it in other variables, etc. In our example
+above, we could rewrite the example to use the alloca technique to avoid
+using a PHI node:
+
+.. code-block:: llvm
+
+ @G = weak global i32 0 ; type of @G is i32*
+ @H = weak global i32 0 ; type of @H is i32*
+
+ define i32 @test(i1 %Condition) {
+ entry:
+ %X = alloca i32 ; type of %X is i32*.
+ br i1 %Condition, label %cond_true, label %cond_false
+
+ cond_true:
+ %X.0 = load i32* @G
+ store i32 %X.0, i32* %X ; Update X
+ br label %cond_next
+
+ cond_false:
+ %X.1 = load i32* @H
+ store i32 %X.1, i32* %X ; Update X
+ br label %cond_next
+
+ cond_next:
+ %X.2 = load i32* %X ; Read X
+ ret i32 %X.2
+ }
+
+With this, we have discovered a way to handle arbitrary mutable
+variables without the need to create Phi nodes at all:
+
+#. Each mutable variable becomes a stack allocation.
+#. Each read of the variable becomes a load from the stack.
+#. Each update of the variable becomes a store to the stack.
+#. Taking the address of a variable just uses the stack address
+ directly.
+
+While this solution has solved our immediate problem, it introduced
+another one: we have now apparently introduced a lot of stack traffic
+for very simple and common operations, a major performance problem.
+Fortunately for us, the LLVM optimizer has a highly-tuned optimization
+pass named "mem2reg" that handles this case, promoting allocas like this
+into SSA registers, inserting Phi nodes as appropriate. If you run this
+example through the pass, for example, you'll get:
+
+.. code-block:: bash
+
+ $ llvm-as < example.ll | opt -mem2reg | llvm-dis
+ @G = weak global i32 0
+ @H = weak global i32 0
+
+ define i32 @test(i1 %Condition) {
+ entry:
+ br i1 %Condition, label %cond_true, label %cond_false
+
+ cond_true:
+ %X.0 = load i32* @G
+ br label %cond_next
+
+ cond_false:
+ %X.1 = load i32* @H
+ br label %cond_next
+
+ cond_next:
+ %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
+ ret i32 %X.01
+ }
+
+The mem2reg pass implements the standard "iterated dominance frontier"
+algorithm for constructing SSA form and has a number of optimizations
+that speed up (very common) degenerate cases. The mem2reg optimization
+pass is the answer to dealing with mutable variables, and we highly
+recommend that you depend on it. Note that mem2reg only works on
+variables in certain circumstances:
+
+#. mem2reg is alloca-driven: it looks for allocas and if it can handle
+ them, it promotes them. It does not apply to global variables or heap
+ allocations.
+#. mem2reg only looks for alloca instructions in the entry block of the
+ function. Being in the entry block guarantees that the alloca is only
+ executed once, which makes analysis simpler.
+#. mem2reg only promotes allocas whose uses are direct loads and stores.
+ If the address of the stack object is passed to a function, or if any
+ funny pointer arithmetic is involved, the alloca will not be
+ promoted.
+#. mem2reg only works on allocas of `first
+ class <../LangRef.html#first-class-types>`_ values (such as pointers,
+ scalars and vectors), and only if the array size of the allocation is
+ 1 (or missing in the .ll file). mem2reg is not capable of promoting
+ structs or arrays to registers. Note that the "sroa" pass is
+ more powerful and can promote structs, "unions", and arrays in many
+ cases.
+
+All of these properties are easy to satisfy for most imperative
+languages, and we'll illustrate it below with Kaleidoscope. The final
+question you may be asking is: should I bother with this nonsense for my
+front-end? Wouldn't it be better if I just did SSA construction
+directly, avoiding use of the mem2reg optimization pass? In short, we
+strongly recommend that you use this technique for building SSA form,
+unless there is an extremely good reason not to. Using this technique
+is:
+
+- Proven and well tested: clang uses this technique
+ for local mutable variables. As such, the most common clients of LLVM
+ are using this to handle a bulk of their variables. You can be sure
+ that bugs are found fast and fixed early.
+- Extremely Fast: mem2reg has a number of special cases that make it
+ fast in common cases as well as fully general. For example, it has
+ fast-paths for variables that are only used in a single block,
+ variables that only have one assignment point, good heuristics to
+ avoid insertion of unneeded phi nodes, etc.
+- Needed for debug info generation: `Debug information in
+ LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
+ the variable exposed so that debug info can be attached to it. This
+ technique dovetails very naturally with this style of debug info.
+
+If nothing else, this makes it much easier to get your front-end up and
+running, and is very simple to implement. Let's extend Kaleidoscope with
+mutable variables now!
+
+Mutable Variables in Kaleidoscope
+=================================
+
+Now that we know the sort of problem we want to tackle, let's see what
+this looks like in the context of our little Kaleidoscope language.
+We're going to add two features:
+
+#. The ability to mutate variables with the '=' operator.
+#. The ability to define new variables.
+
+While the first item is really what this is about, we only have
+variables for incoming arguments as well as for induction variables, and
+redefining those only goes so far :). Also, the ability to define new
+variables is a useful thing regardless of whether you will be mutating
+them. Here's a motivating example that shows how we could use these:
+
+::
+
+ # Define ':' for sequencing: as a low-precedence operator that ignores operands
+ # and just returns the RHS.
+ def binary : 1 (x y) y;
+
+ # Recursive fib, we could do this before.
+ def fib(x)
+ if (x < 3) then
+ 1
+ else
+ fib(x-1)+fib(x-2);
+
+ # Iterative fib.
+ def fibi(x)
+ var a = 1, b = 1, c in
+ (for i = 3, i < x in
+ c = a + b :
+ a = b :
+ b = c) :
+ b;
+
+ # Call it.
+ fibi(10);
+
+In order to mutate variables, we have to change our existing variables
+to use the "alloca trick". Once we have that, we'll add our new
+operator, then extend Kaleidoscope to support new variable definitions.
+
+Adjusting Existing Variables for Mutation
+=========================================
+
+The symbol table in Kaleidoscope is managed at code generation time by
+the '``NamedValues``' map. This map currently keeps track of the LLVM
+"Value\*" that holds the double value for the named variable. In order
+to support mutation, we need to change this slightly, so that
+``NamedValues`` holds the *memory location* of the variable in question.
+Note that this change is a refactoring: it changes the structure of the
+code, but does not (by itself) change the behavior of the compiler. All
+of these changes are isolated in the Kaleidoscope code generator.
+
+At this point in Kaleidoscope's development, it only supports variables
+for two things: incoming arguments to functions and the induction
+variable of 'for' loops. For consistency, we'll allow mutation of these
+variables in addition to other user-defined variables. This means that
+these will both need memory locations.
+
+To start our transformation of Kaleidoscope, we'll change the
+NamedValues map so that it maps to AllocaInst\* instead of Value\*. Once
+we do this, the C++ compiler will tell us what parts of the code we need
+to update:
+
+.. code-block:: c++
+
+ static std::map<std::string, AllocaInst*> NamedValues;
+
+Also, since we will need to create these allocas, we'll use a helper
+function that ensures that the allocas are created in the entry block of
+the function:
+
+.. code-block:: c++
+
+ /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
+ /// the function. This is used for mutable variables etc.
+ static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
+ const std::string &VarName) {
+ IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
+ TheFunction->getEntryBlock().begin());
+ return TmpB.CreateAlloca(Type::getDoubleTy(TheContext), 0,
+ VarName.c_str());
+ }
+
+This funny looking code creates an IRBuilder object that is pointing at
+the first instruction (.begin()) of the entry block. It then creates an
+alloca with the expected name and returns it. Because all values in
+Kaleidoscope are doubles, there is no need to pass in a type to use.
+
+With this in place, the first functionality change we want to make belongs to
+variable references. In our new scheme, variables live on the stack, so
+code generating a reference to them actually needs to produce a load
+from the stack slot:
+
+.. code-block:: c++
+
+ Value *VariableExprAST::codegen() {
+ // Look this variable up in the function.
+ Value *V = NamedValues[Name];
+ if (!V)
+ return LogErrorV("Unknown variable name");
+
+ // Load the value.
+ return Builder.CreateLoad(V, Name.c_str());
+ }
+
+As you can see, this is pretty straightforward. Now we need to update
+the things that define the variables to set up the alloca. We'll start
+with ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for
+the unabridged code):
+
+.. code-block:: c++
+
+ Function *TheFunction = Builder.GetInsertBlock()->getParent();
+
+ // Create an alloca for the variable in the entry block.
+ AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
+
+ // Emit the start code first, without 'variable' in scope.
+ Value *StartVal = Start->codegen();
+ if (!StartVal)
+ return nullptr;
+
+ // Store the value into the alloca.
+ Builder.CreateStore(StartVal, Alloca);
+ ...
+
+ // Compute the end condition.
+ Value *EndCond = End->codegen();
+ if (!EndCond)
+ return nullptr;
+
+ // Reload, increment, and restore the alloca. This handles the case where
+ // the body of the loop mutates the variable.
+ Value *CurVar = Builder.CreateLoad(Alloca);
+ Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
+ Builder.CreateStore(NextVar, Alloca);
+ ...
+
+This code is virtually identical to the code `before we allowed mutable
+variables <LangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that we
+no longer have to construct a PHI node, and we use load/store to access
+the variable as needed.
+
+To support mutable argument variables, we need to also make allocas for
+them. The code for this is also pretty simple:
+
+.. code-block:: c++
+
+ Function *FunctionAST::codegen() {
+ ...
+ Builder.SetInsertPoint(BB);
+
+ // Record the function arguments in the NamedValues map.
+ NamedValues.clear();
+ for (auto &Arg : TheFunction->args()) {
+ // Create an alloca for this variable.
+ AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());
+
+ // Store the initial value into the alloca.
+ Builder.CreateStore(&Arg, Alloca);
+
+ // Add arguments to variable symbol table.
+ NamedValues[Arg.getName()] = Alloca;
+ }
+
+ if (Value *RetVal = Body->codegen()) {
+ ...
+
+For each argument, we make an alloca, store the input value to the
+function into the alloca, and register the alloca as the memory location
+for the argument. This method gets invoked by ``FunctionAST::codegen()``
+right after it sets up the entry block for the function.
+
+The final missing piece is adding the mem2reg pass, which allows us to
+get good codegen once again:
+
+.. code-block:: c++
+
+ // Promote allocas to registers.
+ TheFPM->add(createPromoteMemoryToRegisterPass());
+ // Do simple "peephole" optimizations and bit-twiddling optzns.
+ TheFPM->add(createInstructionCombiningPass());
+ // Reassociate expressions.
+ TheFPM->add(createReassociatePass());
+ ...
+
+It is interesting to see what the code looks like before and after the
+mem2reg optimization runs. For example, this is the before/after code
+for our recursive fib function. Before the optimization:
+
+.. code-block:: llvm
+
+ define double @fib(double %x) {
+ entry:
+ %x1 = alloca double
+ store double %x, double* %x1
+ %x2 = load double, double* %x1
+ %cmptmp = fcmp ult double %x2, 3.000000e+00
+ %booltmp = uitofp i1 %cmptmp to double
+ %ifcond = fcmp one double %booltmp, 0.000000e+00
+ br i1 %ifcond, label %then, label %else
+
+ then: ; preds = %entry
+ br label %ifcont
+
+ else: ; preds = %entry
+ %x3 = load double, double* %x1
+ %subtmp = fsub double %x3, 1.000000e+00
+ %calltmp = call double @fib(double %subtmp)
+ %x4 = load double, double* %x1
+ %subtmp5 = fsub double %x4, 2.000000e+00
+ %calltmp6 = call double @fib(double %subtmp5)
+ %addtmp = fadd double %calltmp, %calltmp6
+ br label %ifcont
+
+ ifcont: ; preds = %else, %then
+ %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
+ ret double %iftmp
+ }
+
+Here there is only one variable (x, the input argument) but you can
+still see the extremely simple-minded code generation strategy we are
+using. In the entry block, an alloca is created, and the initial input
+value is stored into it. Each reference to the variable does a reload
+from the stack. Also, note that we didn't modify the if/then/else
+expression, so it still inserts a PHI node. While we could make an
+alloca for it, it is actually easier to create a PHI node for it, so we
+still just make the PHI.
+
+Here is the code after the mem2reg pass runs:
+
+.. code-block:: llvm
+
+ define double @fib(double %x) {
+ entry:
+ %cmptmp = fcmp ult double %x, 3.000000e+00
+ %booltmp = uitofp i1 %cmptmp to double
+ %ifcond = fcmp one double %booltmp, 0.000000e+00
+ br i1 %ifcond, label %then, label %else
+
+ then:
+ br label %ifcont
+
+ else:
+ %subtmp = fsub double %x, 1.000000e+00
+ %calltmp = call double @fib(double %subtmp)
+ %subtmp5 = fsub double %x, 2.000000e+00
+ %calltmp6 = call double @fib(double %subtmp5)
+ %addtmp = fadd double %calltmp, %calltmp6
+ br label %ifcont
+
+ ifcont: ; preds = %else, %then
+ %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
+ ret double %iftmp
+ }
+
+This is a trivial case for mem2reg, since there are no redefinitions of
+the variable. The point of showing this is to calm your tension about
+inserting such blatent inefficiencies :).
+
+After the rest of the optimizers run, we get:
+
+.. code-block:: llvm
+
+ define double @fib(double %x) {
+ entry:
+ %cmptmp = fcmp ult double %x, 3.000000e+00
+ %booltmp = uitofp i1 %cmptmp to double
+ %ifcond = fcmp ueq double %booltmp, 0.000000e+00
+ br i1 %ifcond, label %else, label %ifcont
+
+ else:
+ %subtmp = fsub double %x, 1.000000e+00
+ %calltmp = call double @fib(double %subtmp)
+ %subtmp5 = fsub double %x, 2.000000e+00
+ %calltmp6 = call double @fib(double %subtmp5)
+ %addtmp = fadd double %calltmp, %calltmp6
+ ret double %addtmp
+
+ ifcont:
+ ret double 1.000000e+00
+ }
+
+Here we see that the simplifycfg pass decided to clone the return
+instruction into the end of the 'else' block. This allowed it to
+eliminate some branches and the PHI node.
+
+Now that all symbol table references are updated to use stack variables,
+we'll add the assignment operator.
+
+New Assignment Operator
+=======================
+
+With our current framework, adding a new assignment operator is really
+simple. We will parse it just like any other binary operator, but handle
+it internally (instead of allowing the user to define it). The first
+step is to set a precedence:
+
+.. code-block:: c++
+
+ int main() {
+ // Install standard binary operators.
+ // 1 is lowest precedence.
+ BinopPrecedence['='] = 2;
+ BinopPrecedence['<'] = 10;
+ BinopPrecedence['+'] = 20;
+ BinopPrecedence['-'] = 20;
+
+Now that the parser knows the precedence of the binary operator, it
+takes care of all the parsing and AST generation. We just need to
+implement codegen for the assignment operator. This looks like:
+
+.. code-block:: c++
+
+ Value *BinaryExprAST::codegen() {
+ // Special case '=' because we don't want to emit the LHS as an expression.
+ if (Op == '=') {
+ // Assignment requires the LHS to be an identifier.
+ VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get());
+ if (!LHSE)
+ return LogErrorV("destination of '=' must be a variable");
+
+Unlike the rest of the binary operators, our assignment operator doesn't
+follow the "emit LHS, emit RHS, do computation" model. As such, it is
+handled as a special case before the other binary operators are handled.
+The other strange thing is that it requires the LHS to be a variable. It
+is invalid to have "(x+1) = expr" - only things like "x = expr" are
+allowed.
+
+.. code-block:: c++
+
+ // Codegen the RHS.
+ Value *Val = RHS->codegen();
+ if (!Val)
+ return nullptr;
+
+ // Look up the name.
+ Value *Variable = NamedValues[LHSE->getName()];
+ if (!Variable)
+ return LogErrorV("Unknown variable name");
+
+ Builder.CreateStore(Val, Variable);
+ return Val;
+ }
+ ...
+
+Once we have the variable, codegen'ing the assignment is
+straightforward: we emit the RHS of the assignment, create a store, and
+return the computed value. Returning a value allows for chained
+assignments like "X = (Y = Z)".
+
+Now that we have an assignment operator, we can mutate loop variables
+and arguments. For example, we can now run code like this:
+
+::
+
+ # Function to print a double.
+ extern printd(x);
+
+ # Define ':' for sequencing: as a low-precedence operator that ignores operands
+ # and just returns the RHS.
+ def binary : 1 (x y) y;
+
+ def test(x)
+ printd(x) :
+ x = 4 :
+ printd(x);
+
+ test(123);
+
+When run, this example prints "123" and then "4", showing that we did
+actually mutate the value! Okay, we have now officially implemented our
+goal: getting this to work requires SSA construction in the general
+case. However, to be really useful, we want the ability to define our
+own local variables, let's add this next!
+
+User-defined Local Variables
+============================
+
+Adding var/in is just like any other extension we made to
+Kaleidoscope: we extend the lexer, the parser, the AST and the code
+generator. The first step for adding our new 'var/in' construct is to
+extend the lexer. As before, this is pretty trivial, the code looks like
+this:
+
+.. code-block:: c++
+
+ enum Token {
+ ...
+ // var definition
+ tok_var = -13
+ ...
+ }
+ ...
+ static int gettok() {
+ ...
+ if (IdentifierStr == "in")
+ return tok_in;
+ if (IdentifierStr == "binary")
+ return tok_binary;
+ if (IdentifierStr == "unary")
+ return tok_unary;
+ if (IdentifierStr == "var")
+ return tok_var;
+ return tok_identifier;
+ ...
+
+The next step is to define the AST node that we will construct. For
+var/in, it looks like this:
+
+.. code-block:: c++
+
+ /// VarExprAST - Expression class for var/in
+ class VarExprAST : public ExprAST {
+ std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
+ std::unique_ptr<ExprAST> Body;
+
+ public:
+ VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
+ std::unique_ptr<ExprAST> Body)
+ : VarNames(std::move(VarNames)), Body(std::move(Body)) {}
+
+ Value *codegen() override;
+ };
+
+var/in allows a list of names to be defined all at once, and each name
+can optionally have an initializer value. As such, we capture this
+information in the VarNames vector. Also, var/in has a body, this body
+is allowed to access the variables defined by the var/in.
+
+With this in place, we can define the parser pieces. The first thing we
+do is add it as a primary expression:
+
+.. code-block:: c++
+
+ /// primary
+ /// ::= identifierexpr
+ /// ::= numberexpr
+ /// ::= parenexpr
+ /// ::= ifexpr
+ /// ::= forexpr
+ /// ::= varexpr
+ static std::unique_ptr<ExprAST> ParsePrimary() {
+ switch (CurTok) {
+ default:
+ return LogError("unknown token when expecting an expression");
+ case tok_identifier:
+ return ParseIdentifierExpr();
+ case tok_number:
+ return ParseNumberExpr();
+ case '(':
+ return ParseParenExpr();
+ case tok_if:
+ return ParseIfExpr();
+ case tok_for:
+ return ParseForExpr();
+ case tok_var:
+ return ParseVarExpr();
+ }
+ }
+
+Next we define ParseVarExpr:
+
+.. code-block:: c++
+
+ /// varexpr ::= 'var' identifier ('=' expression)?
+ // (',' identifier ('=' expression)?)* 'in' expression
+ static std::unique_ptr<ExprAST> ParseVarExpr() {
+ getNextToken(); // eat the var.
+
+ std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
+
+ // At least one variable name is required.
+ if (CurTok != tok_identifier)
+ return LogError("expected identifier after var");
+
+The first part of this code parses the list of identifier/expr pairs
+into the local ``VarNames`` vector.
+
+.. code-block:: c++
+
+ while (1) {
+ std::string Name = IdentifierStr;
+ getNextToken(); // eat identifier.
+
+ // Read the optional initializer.
+ std::unique_ptr<ExprAST> Init;
+ if (CurTok == '=') {
+ getNextToken(); // eat the '='.
+
+ Init = ParseExpression();
+ if (!Init) return nullptr;
+ }
+
+ VarNames.push_back(std::make_pair(Name, std::move(Init)));
+
+ // End of var list, exit loop.
+ if (CurTok != ',') break;
+ getNextToken(); // eat the ','.
+
+ if (CurTok != tok_identifier)
+ return LogError("expected identifier list after var");
+ }
+
+Once all the variables are parsed, we then parse the body and create the
+AST node:
+
+.. code-block:: c++
+
+ // At this point, we have to have 'in'.
+ if (CurTok != tok_in)
+ return LogError("expected 'in' keyword after 'var'");
+ getNextToken(); // eat 'in'.
+
+ auto Body = ParseExpression();
+ if (!Body)
+ return nullptr;
+
+ return llvm::make_unique<VarExprAST>(std::move(VarNames),
+ std::move(Body));
+ }
+
+Now that we can parse and represent the code, we need to support
+emission of LLVM IR for it. This code starts out with:
+
+.. code-block:: c++
+
+ Value *VarExprAST::codegen() {
+ std::vector<AllocaInst *> OldBindings;
+
+ Function *TheFunction = Builder.GetInsertBlock()->getParent();
+
+ // Register all variables and emit their initializer.
+ for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
+ const std::string &VarName = VarNames[i].first;
+ ExprAST *Init = VarNames[i].second.get();
+
+Basically it loops over all the variables, installing them one at a
+time. For each variable we put into the symbol table, we remember the
+previous value that we replace in OldBindings.
+
+.. code-block:: c++
+
+ // Emit the initializer before adding the variable to scope, this prevents
+ // the initializer from referencing the variable itself, and permits stuff
+ // like this:
+ // var a = 1 in
+ // var a = a in ... # refers to outer 'a'.
+ Value *InitVal;
+ if (Init) {
+ InitVal = Init->codegen();
+ if (!InitVal)
+ return nullptr;
+ } else { // If not specified, use 0.0.
+ InitVal = ConstantFP::get(TheContext, APFloat(0.0));
+ }
+
+ AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
+ Builder.CreateStore(InitVal, Alloca);
+
+ // Remember the old variable binding so that we can restore the binding when
+ // we unrecurse.
+ OldBindings.push_back(NamedValues[VarName]);
+
+ // Remember this binding.
+ NamedValues[VarName] = Alloca;
+ }
+
+There are more comments here than code. The basic idea is that we emit
+the initializer, create the alloca, then update the symbol table to
+point to it. Once all the variables are installed in the symbol table,
+we evaluate the body of the var/in expression:
+
+.. code-block:: c++
+
+ // Codegen the body, now that all vars are in scope.
+ Value *BodyVal = Body->codegen();
+ if (!BodyVal)
+ return nullptr;
+
+Finally, before returning, we restore the previous variable bindings:
+
+.. code-block:: c++
+
+ // Pop all our variables from scope.
+ for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
+ NamedValues[VarNames[i].first] = OldBindings[i];
+
+ // Return the body computation.
+ return BodyVal;
+ }
+
+The end result of all of this is that we get properly scoped variable
+definitions, and we even (trivially) allow mutation of them :).
+
+With this, we completed what we set out to do. Our nice iterative fib
+example from the intro compiles and runs just fine. The mem2reg pass
+optimizes all of our stack variables into SSA registers, inserting PHI
+nodes where needed, and our front-end remains simple: no "iterated
+dominance frontier" computation anywhere in sight.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+mutable variables and var/in support. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter7/toy.cpp
+ :language: c++
+
+`Next: Compiling to Object Code <LangImpl08.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl08.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl08.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl08.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl08.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,218 @@
+========================================
+ Kaleidoscope: Compiling to Object Code
+========================================
+
+.. contents::
+ :local:
+
+Chapter 8 Introduction
+======================
+
+Welcome to Chapter 8 of the "`Implementing a language with LLVM
+<index.html>`_" tutorial. This chapter describes how to compile our
+language down to object files.
+
+Choosing a target
+=================
+
+LLVM has native support for cross-compilation. You can compile to the
+architecture of your current machine, or just as easily compile for
+other architectures. In this tutorial, we'll target the current
+machine.
+
+To specify the architecture that you want to target, we use a string
+called a "target triple". This takes the form
+``<arch><sub>-<vendor>-<sys>-<abi>`` (see the `cross compilation docs
+<http://clang.llvm.org/docs/CrossCompilation.html#target-triple>`_).
+
+As an example, we can see what clang thinks is our current target
+triple:
+
+::
+
+ $ clang --version | grep Target
+ Target: x86_64-unknown-linux-gnu
+
+Running this command may show something different on your machine as
+you might be using a different architecture or operating system to me.
+
+Fortunately, we don't need to hard-code a target triple to target the
+current machine. LLVM provides ``sys::getDefaultTargetTriple``, which
+returns the target triple of the current machine.
+
+.. code-block:: c++
+
+ auto TargetTriple = sys::getDefaultTargetTriple();
+
+LLVM doesn't require us to link in all the target
+functionality. For example, if we're just using the JIT, we don't need
+the assembly printers. Similarly, if we're only targeting certain
+architectures, we can only link in the functionality for those
+architectures.
+
+For this example, we'll initialize all the targets for emitting object
+code.
+
+.. code-block:: c++
+
+ InitializeAllTargetInfos();
+ InitializeAllTargets();
+ InitializeAllTargetMCs();
+ InitializeAllAsmParsers();
+ InitializeAllAsmPrinters();
+
+We can now use our target triple to get a ``Target``:
+
+.. code-block:: c++
+
+ std::string Error;
+ auto Target = TargetRegistry::lookupTarget(TargetTriple, Error);
+
+ // Print an error and exit if we couldn't find the requested target.
+ // This generally occurs if we've forgotten to initialise the
+ // TargetRegistry or we have a bogus target triple.
+ if (!Target) {
+ errs() << Error;
+ return 1;
+ }
+
+Target Machine
+==============
+
+We will also need a ``TargetMachine``. This class provides a complete
+machine description of the machine we're targeting. If we want to
+target a specific feature (such as SSE) or a specific CPU (such as
+Intel's Sandylake), we do so now.
+
+To see which features and CPUs that LLVM knows about, we can use
+``llc``. For example, let's look at x86:
+
+::
+
+ $ llvm-as < /dev/null | llc -march=x86 -mattr=help
+ Available CPUs for this target:
+
+ amdfam10 - Select the amdfam10 processor.
+ athlon - Select the athlon processor.
+ athlon-4 - Select the athlon-4 processor.
+ ...
+
+ Available features for this target:
+
+ 16bit-mode - 16-bit mode (i8086).
+ 32bit-mode - 32-bit mode (80386).
+ 3dnow - Enable 3DNow! instructions.
+ 3dnowa - Enable 3DNow! Athlon instructions.
+ ...
+
+For our example, we'll use the generic CPU without any additional
+features, options or relocation model.
+
+.. code-block:: c++
+
+ auto CPU = "generic";
+ auto Features = "";
+
+ TargetOptions opt;
+ auto RM = Optional<Reloc::Model>();
+ auto TargetMachine = Target->createTargetMachine(TargetTriple, CPU, Features, opt, RM);
+
+
+Configuring the Module
+======================
+
+We're now ready to configure our module, to specify the target and
+data layout. This isn't strictly necessary, but the `frontend
+performance guide <../Frontend/PerformanceTips.html>`_ recommends
+this. Optimizations benefit from knowing about the target and data
+layout.
+
+.. code-block:: c++
+
+ TheModule->setDataLayout(TargetMachine->createDataLayout());
+ TheModule->setTargetTriple(TargetTriple);
+
+Emit Object Code
+================
+
+We're ready to emit object code! Let's define where we want to write
+our file to:
+
+.. code-block:: c++
+
+ auto Filename = "output.o";
+ std::error_code EC;
+ raw_fd_ostream dest(Filename, EC, sys::fs::F_None);
+
+ if (EC) {
+ errs() << "Could not open file: " << EC.message();
+ return 1;
+ }
+
+Finally, we define a pass that emits object code, then we run that
+pass:
+
+.. code-block:: c++
+
+ legacy::PassManager pass;
+ auto FileType = TargetMachine::CGFT_ObjectFile;
+
+ if (TargetMachine->addPassesToEmitFile(pass, dest, nullptr, FileType)) {
+ errs() << "TargetMachine can't emit a file of this type";
+ return 1;
+ }
+
+ pass.run(*TheModule);
+ dest.flush();
+
+Putting It All Together
+=======================
+
+Does it work? Let's give it a try. We need to compile our code, but
+note that the arguments to ``llvm-config`` are different to the previous chapters.
+
+::
+
+ $ clang++ -g -O3 toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs all` -o toy
+
+Let's run it, and define a simple ``average`` function. Press Ctrl-D
+when you're done.
+
+::
+
+ $ ./toy
+ ready> def average(x y) (x + y) * 0.5;
+ ^D
+ Wrote output.o
+
+We have an object file! To test it, let's write a simple program and
+link it with our output. Here's the source code:
+
+.. code-block:: c++
+
+ #include <iostream>
+
+ extern "C" {
+ double average(double, double);
+ }
+
+ int main() {
+ std::cout << "average of 3.0 and 4.0: " << average(3.0, 4.0) << std::endl;
+ }
+
+We link our program to output.o and check the result is what we
+expected:
+
+::
+
+ $ clang++ main.cpp output.o -o main
+ $ ./main
+ average of 3.0 and 4.0: 3.5
+
+Full Code Listing
+=================
+
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter8/toy.cpp
+ :language: c++
+
+`Next: Adding Debug Information <LangImpl09.html>`_
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl09.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl09.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl09.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl09.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,465 @@
+======================================
+Kaleidoscope: Adding Debug Information
+======================================
+
+.. contents::
+ :local:
+
+Chapter 9 Introduction
+======================
+
+Welcome to Chapter 9 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. In chapters 1 through 8, we've built a
+decent little programming language with functions and variables.
+What happens if something goes wrong though, how do you debug your
+program?
+
+Source level debugging uses formatted data that helps a debugger
+translate from binary and the state of the machine back to the
+source that the programmer wrote. In LLVM we generally use a format
+called `DWARF <http://dwarfstd.org>`_. DWARF is a compact encoding
+that represents types, source locations, and variable locations.
+
+The short summary of this chapter is that we'll go through the
+various things you have to add to a programming language to
+support debug info, and how you translate that into DWARF.
+
+Caveat: For now we can't debug via the JIT, so we'll need to compile
+our program down to something small and standalone. As part of this
+we'll make a few modifications to the running of the language and
+how programs are compiled. This means that we'll have a source file
+with a simple program written in Kaleidoscope rather than the
+interactive JIT. It does involve a limitation that we can only
+have one "top level" command at a time to reduce the number of
+changes necessary.
+
+Here's the sample program we'll be compiling:
+
+.. code-block:: python
+
+ def fib(x)
+ if x < 3 then
+ 1
+ else
+ fib(x-1)+fib(x-2);
+
+ fib(10)
+
+
+Why is this a hard problem?
+===========================
+
+Debug information is a hard problem for a few different reasons - mostly
+centered around optimized code. First, optimization makes keeping source
+locations more difficult. In LLVM IR we keep the original source location
+for each IR level instruction on the instruction. Optimization passes
+should keep the source locations for newly created instructions, but merged
+instructions only get to keep a single location - this can cause jumping
+around when stepping through optimized programs. Secondly, optimization
+can move variables in ways that are either optimized out, shared in memory
+with other variables, or difficult to track. For the purposes of this
+tutorial we're going to avoid optimization (as you'll see with one of the
+next sets of patches).
+
+Ahead-of-Time Compilation Mode
+==============================
+
+To highlight only the aspects of adding debug information to a source
+language without needing to worry about the complexities of JIT debugging
+we're going to make a few changes to Kaleidoscope to support compiling
+the IR emitted by the front end into a simple standalone program that
+you can execute, debug, and see results.
+
+First we make our anonymous function that contains our top level
+statement be our "main":
+
+.. code-block:: udiff
+
+ - auto Proto = llvm::make_unique<PrototypeAST>("", std::vector<std::string>());
+ + auto Proto = llvm::make_unique<PrototypeAST>("main", std::vector<std::string>());
+
+just with the simple change of giving it a name.
+
+Then we're going to remove the command line code wherever it exists:
+
+.. code-block:: udiff
+
+ @@ -1129,7 +1129,6 @@ static void HandleTopLevelExpression() {
+ /// top ::= definition | external | expression | ';'
+ static void MainLoop() {
+ while (1) {
+ - fprintf(stderr, "ready> ");
+ switch (CurTok) {
+ case tok_eof:
+ return;
+ @@ -1184,7 +1183,6 @@ int main() {
+ BinopPrecedence['*'] = 40; // highest.
+
+ // Prime the first token.
+ - fprintf(stderr, "ready> ");
+ getNextToken();
+
+Lastly we're going to disable all of the optimization passes and the JIT so
+that the only thing that happens after we're done parsing and generating
+code is that the LLVM IR goes to standard error:
+
+.. code-block:: udiff
+
+ @@ -1108,17 +1108,8 @@ static void HandleExtern() {
+ static void HandleTopLevelExpression() {
+ // Evaluate a top-level expression into an anonymous function.
+ if (auto FnAST = ParseTopLevelExpr()) {
+ - if (auto *FnIR = FnAST->codegen()) {
+ - // We're just doing this to make sure it executes.
+ - TheExecutionEngine->finalizeObject();
+ - // JIT the function, returning a function pointer.
+ - void *FPtr = TheExecutionEngine->getPointerToFunction(FnIR);
+ -
+ - // Cast it to the right type (takes no arguments, returns a double) so we
+ - // can call it as a native function.
+ - double (*FP)() = (double (*)())(intptr_t)FPtr;
+ - // Ignore the return value for this.
+ - (void)FP;
+ + if (!F->codegen()) {
+ + fprintf(stderr, "Error generating code for top level expr");
+ }
+ } else {
+ // Skip token for error recovery.
+ @@ -1439,11 +1459,11 @@ int main() {
+ // target lays out data structures.
+ TheModule->setDataLayout(TheExecutionEngine->getDataLayout());
+ OurFPM.add(new DataLayoutPass());
+ +#if 0
+ OurFPM.add(createBasicAliasAnalysisPass());
+ // Promote allocas to registers.
+ OurFPM.add(createPromoteMemoryToRegisterPass());
+ @@ -1218,7 +1210,7 @@ int main() {
+ OurFPM.add(createGVNPass());
+ // Simplify the control flow graph (deleting unreachable blocks, etc).
+ OurFPM.add(createCFGSimplificationPass());
+ -
+ + #endif
+ OurFPM.doInitialization();
+
+ // Set the global so the code gen can use this.
+
+This relatively small set of changes get us to the point that we can compile
+our piece of Kaleidoscope language down to an executable program via this
+command line:
+
+.. code-block:: bash
+
+ Kaleidoscope-Ch9 < fib.ks | & clang -x ir -
+
+which gives an a.out/a.exe in the current working directory.
+
+Compile Unit
+============
+
+The top level container for a section of code in DWARF is a compile unit.
+This contains the type and function data for an individual translation unit
+(read: one file of source code). So the first thing we need to do is
+construct one for our fib.ks file.
+
+DWARF Emission Setup
+====================
+
+Similar to the ``IRBuilder`` class we have a
+`DIBuilder <http://llvm.org/doxygen/classllvm_1_1DIBuilder.html>`_ class
+that helps in constructing debug metadata for an LLVM IR file. It
+corresponds 1:1 similarly to ``IRBuilder`` and LLVM IR, but with nicer names.
+Using it does require that you be more familiar with DWARF terminology than
+you needed to be with ``IRBuilder`` and ``Instruction`` names, but if you
+read through the general documentation on the
+`Metadata Format <http://llvm.org/docs/SourceLevelDebugging.html>`_ it
+should be a little more clear. We'll be using this class to construct all
+of our IR level descriptions. Construction for it takes a module so we
+need to construct it shortly after we construct our module. We've left it
+as a global static variable to make it a bit easier to use.
+
+Next we're going to create a small container to cache some of our frequent
+data. The first will be our compile unit, but we'll also write a bit of
+code for our one type since we won't have to worry about multiple typed
+expressions:
+
+.. code-block:: c++
+
+ static DIBuilder *DBuilder;
+
+ struct DebugInfo {
+ DICompileUnit *TheCU;
+ DIType *DblTy;
+
+ DIType *getDoubleTy();
+ } KSDbgInfo;
+
+ DIType *DebugInfo::getDoubleTy() {
+ if (DblTy)
+ return DblTy;
+
+ DblTy = DBuilder->createBasicType("double", 64, dwarf::DW_ATE_float);
+ return DblTy;
+ }
+
+And then later on in ``main`` when we're constructing our module:
+
+.. code-block:: c++
+
+ DBuilder = new DIBuilder(*TheModule);
+
+ KSDbgInfo.TheCU = DBuilder->createCompileUnit(
+ dwarf::DW_LANG_C, DBuilder->createFile("fib.ks", "."),
+ "Kaleidoscope Compiler", 0, "", 0);
+
+There are a couple of things to note here. First, while we're producing a
+compile unit for a language called Kaleidoscope we used the language
+constant for C. This is because a debugger wouldn't necessarily understand
+the calling conventions or default ABI for a language it doesn't recognize
+and we follow the C ABI in our LLVM code generation so it's the closest
+thing to accurate. This ensures we can actually call functions from the
+debugger and have them execute. Secondly, you'll see the "fib.ks" in the
+call to ``createCompileUnit``. This is a default hard coded value since
+we're using shell redirection to put our source into the Kaleidoscope
+compiler. In a usual front end you'd have an input file name and it would
+go there.
+
+One last thing as part of emitting debug information via DIBuilder is that
+we need to "finalize" the debug information. The reasons are part of the
+underlying API for DIBuilder, but make sure you do this near the end of
+main:
+
+.. code-block:: c++
+
+ DBuilder->finalize();
+
+before you dump out the module.
+
+Functions
+=========
+
+Now that we have our ``Compile Unit`` and our source locations, we can add
+function definitions to the debug info. So in ``PrototypeAST::codegen()`` we
+add a few lines of code to describe a context for our subprogram, in this
+case the "File", and the actual definition of the function itself.
+
+So the context:
+
+.. code-block:: c++
+
+ DIFile *Unit = DBuilder->createFile(KSDbgInfo.TheCU.getFilename(),
+ KSDbgInfo.TheCU.getDirectory());
+
+giving us an DIFile and asking the ``Compile Unit`` we created above for the
+directory and filename where we are currently. Then, for now, we use some
+source locations of 0 (since our AST doesn't currently have source location
+information) and construct our function definition:
+
+.. code-block:: c++
+
+ DIScope *FContext = Unit;
+ unsigned LineNo = 0;
+ unsigned ScopeLine = 0;
+ DISubprogram *SP = DBuilder->createFunction(
+ FContext, P.getName(), StringRef(), Unit, LineNo,
+ CreateFunctionType(TheFunction->arg_size(), Unit),
+ false /* internal linkage */, true /* definition */, ScopeLine,
+ DINode::FlagPrototyped, false);
+ TheFunction->setSubprogram(SP);
+
+and we now have an DISubprogram that contains a reference to all of our
+metadata for the function.
+
+Source Locations
+================
+
+The most important thing for debug information is accurate source location -
+this makes it possible to map your source code back. We have a problem though,
+Kaleidoscope really doesn't have any source location information in the lexer
+or parser so we'll need to add it.
+
+.. code-block:: c++
+
+ struct SourceLocation {
+ int Line;
+ int Col;
+ };
+ static SourceLocation CurLoc;
+ static SourceLocation LexLoc = {1, 0};
+
+ static int advance() {
+ int LastChar = getchar();
+
+ if (LastChar == '\n' || LastChar == '\r') {
+ LexLoc.Line++;
+ LexLoc.Col = 0;
+ } else
+ LexLoc.Col++;
+ return LastChar;
+ }
+
+In this set of code we've added some functionality on how to keep track of the
+line and column of the "source file". As we lex every token we set our current
+current "lexical location" to the assorted line and column for the beginning
+of the token. We do this by overriding all of the previous calls to
+``getchar()`` with our new ``advance()`` that keeps track of the information
+and then we have added to all of our AST classes a source location:
+
+.. code-block:: c++
+
+ class ExprAST {
+ SourceLocation Loc;
+
+ public:
+ ExprAST(SourceLocation Loc = CurLoc) : Loc(Loc) {}
+ virtual ~ExprAST() {}
+ virtual Value* codegen() = 0;
+ int getLine() const { return Loc.Line; }
+ int getCol() const { return Loc.Col; }
+ virtual raw_ostream &dump(raw_ostream &out, int ind) {
+ return out << ':' << getLine() << ':' << getCol() << '\n';
+ }
+
+that we pass down through when we create a new expression:
+
+.. code-block:: c++
+
+ LHS = llvm::make_unique<BinaryExprAST>(BinLoc, BinOp, std::move(LHS),
+ std::move(RHS));
+
+giving us locations for each of our expressions and variables.
+
+To make sure that every instruction gets proper source location information,
+we have to tell ``Builder`` whenever we're at a new source location.
+We use a small helper function for this:
+
+.. code-block:: c++
+
+ void DebugInfo::emitLocation(ExprAST *AST) {
+ DIScope *Scope;
+ if (LexicalBlocks.empty())
+ Scope = TheCU;
+ else
+ Scope = LexicalBlocks.back();
+ Builder.SetCurrentDebugLocation(
+ DebugLoc::get(AST->getLine(), AST->getCol(), Scope));
+ }
+
+This both tells the main ``IRBuilder`` where we are, but also what scope
+we're in. The scope can either be on compile-unit level or be the nearest
+enclosing lexical block like the current function.
+To represent this we create a stack of scopes:
+
+.. code-block:: c++
+
+ std::vector<DIScope *> LexicalBlocks;
+
+and push the scope (function) to the top of the stack when we start
+generating the code for each function:
+
+.. code-block:: c++
+
+ KSDbgInfo.LexicalBlocks.push_back(SP);
+
+Also, we may not forget to pop the scope back off of the scope stack at the
+end of the code generation for the function:
+
+.. code-block:: c++
+
+ // Pop off the lexical block for the function since we added it
+ // unconditionally.
+ KSDbgInfo.LexicalBlocks.pop_back();
+
+Then we make sure to emit the location every time we start to generate code
+for a new AST object:
+
+.. code-block:: c++
+
+ KSDbgInfo.emitLocation(this);
+
+Variables
+=========
+
+Now that we have functions, we need to be able to print out the variables
+we have in scope. Let's get our function arguments set up so we can get
+decent backtraces and see how our functions are being called. It isn't
+a lot of code, and we generally handle it when we're creating the
+argument allocas in ``FunctionAST::codegen``.
+
+.. code-block:: c++
+
+ // Record the function arguments in the NamedValues map.
+ NamedValues.clear();
+ unsigned ArgIdx = 0;
+ for (auto &Arg : TheFunction->args()) {
+ // Create an alloca for this variable.
+ AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());
+
+ // Create a debug descriptor for the variable.
+ DILocalVariable *D = DBuilder->createParameterVariable(
+ SP, Arg.getName(), ++ArgIdx, Unit, LineNo, KSDbgInfo.getDoubleTy(),
+ true);
+
+ DBuilder->insertDeclare(Alloca, D, DBuilder->createExpression(),
+ DebugLoc::get(LineNo, 0, SP),
+ Builder.GetInsertBlock());
+
+ // Store the initial value into the alloca.
+ Builder.CreateStore(&Arg, Alloca);
+
+ // Add arguments to variable symbol table.
+ NamedValues[Arg.getName()] = Alloca;
+ }
+
+
+Here we're first creating the variable, giving it the scope (``SP``),
+the name, source location, type, and since it's an argument, the argument
+index. Next, we create an ``lvm.dbg.declare`` call to indicate at the IR
+level that we've got a variable in an alloca (and it gives a starting
+location for the variable), and setting a source location for the
+beginning of the scope on the declare.
+
+One interesting thing to note at this point is that various debuggers have
+assumptions based on how code and debug information was generated for them
+in the past. In this case we need to do a little bit of a hack to avoid
+generating line information for the function prologue so that the debugger
+knows to skip over those instructions when setting a breakpoint. So in
+``FunctionAST::CodeGen`` we add some more lines:
+
+.. code-block:: c++
+
+ // Unset the location for the prologue emission (leading instructions with no
+ // location in a function are considered part of the prologue and the debugger
+ // will run past them when breaking on a function)
+ KSDbgInfo.emitLocation(nullptr);
+
+and then emit a new location when we actually start generating code for the
+body of the function:
+
+.. code-block:: c++
+
+ KSDbgInfo.emitLocation(Body.get());
+
+With this we have enough debug information to set breakpoints in functions,
+print out argument variables, and call functions. Not too bad for just a
+few simple lines of code!
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+debug information. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../examples/Kaleidoscope/Chapter9/toy.cpp
+ :language: c++
+
+`Next: Conclusion and other useful LLVM tidbits <LangImpl10.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl10.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl10.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl10.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/LangImpl10.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,254 @@
+======================================================
+Kaleidoscope: Conclusion and other useful LLVM tidbits
+======================================================
+
+.. contents::
+ :local:
+
+Tutorial Conclusion
+===================
+
+Welcome to the final chapter of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. In the course of this tutorial, we have
+grown our little Kaleidoscope language from being a useless toy, to
+being a semi-interesting (but probably still useless) toy. :)
+
+It is interesting to see how far we've come, and how little code it has
+taken. We built the entire lexer, parser, AST, code generator, an
+interactive run-loop (with a JIT!), and emitted debug information in
+standalone executables - all in under 1000 lines of (non-comment/non-blank)
+code.
+
+Our little language supports a couple of interesting features: it
+supports user defined binary and unary operators, it uses JIT
+compilation for immediate evaluation, and it supports a few control flow
+constructs with SSA construction.
+
+Part of the idea of this tutorial was to show you how easy and fun it
+can be to define, build, and play with languages. Building a compiler
+need not be a scary or mystical process! Now that you've seen some of
+the basics, I strongly encourage you to take the code and hack on it.
+For example, try adding:
+
+- **global variables** - While global variables have questional value
+ in modern software engineering, they are often useful when putting
+ together quick little hacks like the Kaleidoscope compiler itself.
+ Fortunately, our current setup makes it very easy to add global
+ variables: just have value lookup check to see if an unresolved
+ variable is in the global variable symbol table before rejecting it.
+ To create a new global variable, make an instance of the LLVM
+ ``GlobalVariable`` class.
+- **typed variables** - Kaleidoscope currently only supports variables
+ of type double. This gives the language a very nice elegance, because
+ only supporting one type means that you never have to specify types.
+ Different languages have different ways of handling this. The easiest
+ way is to require the user to specify types for every variable
+ definition, and record the type of the variable in the symbol table
+ along with its Value\*.
+- **arrays, structs, vectors, etc** - Once you add types, you can start
+ extending the type system in all sorts of interesting ways. Simple
+ arrays are very easy and are quite useful for many different
+ applications. Adding them is mostly an exercise in learning how the
+ LLVM `getelementptr <../LangRef.html#getelementptr-instruction>`_ instruction
+ works: it is so nifty/unconventional, it `has its own
+ FAQ <../GetElementPtr.html>`_!
+- **standard runtime** - Our current language allows the user to access
+ arbitrary external functions, and we use it for things like "printd"
+ and "putchard". As you extend the language to add higher-level
+ constructs, often these constructs make the most sense if they are
+ lowered to calls into a language-supplied runtime. For example, if
+ you add hash tables to the language, it would probably make sense to
+ add the routines to a runtime, instead of inlining them all the way.
+- **memory management** - Currently we can only access the stack in
+ Kaleidoscope. It would also be useful to be able to allocate heap
+ memory, either with calls to the standard libc malloc/free interface
+ or with a garbage collector. If you would like to use garbage
+ collection, note that LLVM fully supports `Accurate Garbage
+ Collection <../GarbageCollection.html>`_ including algorithms that
+ move objects and need to scan/update the stack.
+- **exception handling support** - LLVM supports generation of `zero
+ cost exceptions <../ExceptionHandling.html>`_ which interoperate with
+ code compiled in other languages. You could also generate code by
+ implicitly making every function return an error value and checking
+ it. You could also make explicit use of setjmp/longjmp. There are
+ many different ways to go here.
+- **object orientation, generics, database access, complex numbers,
+ geometric programming, ...** - Really, there is no end of crazy
+ features that you can add to the language.
+- **unusual domains** - We've been talking about applying LLVM to a
+ domain that many people are interested in: building a compiler for a
+ specific language. However, there are many other domains that can use
+ compiler technology that are not typically considered. For example,
+ LLVM has been used to implement OpenGL graphics acceleration,
+ translate C++ code to ActionScript, and many other cute and clever
+ things. Maybe you will be the first to JIT compile a regular
+ expression interpreter into native code with LLVM?
+
+Have fun - try doing something crazy and unusual. Building a language
+like everyone else always has, is much less fun than trying something a
+little crazy or off the wall and seeing how it turns out. If you get
+stuck or want to talk about it, feel free to email the `llvm-dev mailing
+list <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_: it has lots
+of people who are interested in languages and are often willing to help
+out.
+
+Before we end this tutorial, I want to talk about some "tips and tricks"
+for generating LLVM IR. These are some of the more subtle things that
+may not be obvious, but are very useful if you want to take advantage of
+LLVM's capabilities.
+
+Properties of the LLVM IR
+=========================
+
+We have a couple of common questions about code in the LLVM IR form -
+let's just get these out of the way right now, shall we?
+
+Target Independence
+-------------------
+
+Kaleidoscope is an example of a "portable language": any program written
+in Kaleidoscope will work the same way on any target that it runs on.
+Many other languages have this property, e.g. lisp, java, haskell,
+javascript, python, etc (note that while these languages are portable,
+not all their libraries are).
+
+One nice aspect of LLVM is that it is often capable of preserving target
+independence in the IR: you can take the LLVM IR for a
+Kaleidoscope-compiled program and run it on any target that LLVM
+supports, even emitting C code and compiling that on targets that LLVM
+doesn't support natively. You can trivially tell that the Kaleidoscope
+compiler generates target-independent code because it never queries for
+any target-specific information when generating code.
+
+The fact that LLVM provides a compact, target-independent,
+representation for code gets a lot of people excited. Unfortunately,
+these people are usually thinking about C or a language from the C
+family when they are asking questions about language portability. I say
+"unfortunately", because there is really no way to make (fully general)
+C code portable, other than shipping the source code around (and of
+course, C source code is not actually portable in general either - ever
+port a really old application from 32- to 64-bits?).
+
+The problem with C (again, in its full generality) is that it is heavily
+laden with target specific assumptions. As one simple example, the
+preprocessor often destructively removes target-independence from the
+code when it processes the input text:
+
+.. code-block:: c
+
+ #ifdef __i386__
+ int X = 1;
+ #else
+ int X = 42;
+ #endif
+
+While it is possible to engineer more and more complex solutions to
+problems like this, it cannot be solved in full generality in a way that
+is better than shipping the actual source code.
+
+That said, there are interesting subsets of C that can be made portable.
+If you are willing to fix primitive types to a fixed size (say int =
+32-bits, and long = 64-bits), don't care about ABI compatibility with
+existing binaries, and are willing to give up some other minor features,
+you can have portable code. This can make sense for specialized domains
+such as an in-kernel language.
+
+Safety Guarantees
+-----------------
+
+Many of the languages above are also "safe" languages: it is impossible
+for a program written in Java to corrupt its address space and crash the
+process (assuming the JVM has no bugs). Safety is an interesting
+property that requires a combination of language design, runtime
+support, and often operating system support.
+
+It is certainly possible to implement a safe language in LLVM, but LLVM
+IR does not itself guarantee safety. The LLVM IR allows unsafe pointer
+casts, use after free bugs, buffer over-runs, and a variety of other
+problems. Safety needs to be implemented as a layer on top of LLVM and,
+conveniently, several groups have investigated this. Ask on the `llvm-dev
+mailing list <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ if
+you are interested in more details.
+
+Language-Specific Optimizations
+-------------------------------
+
+One thing about LLVM that turns off many people is that it does not
+solve all the world's problems in one system. One specific
+complaint is that people perceive LLVM as being incapable of performing
+high-level language-specific optimization: LLVM "loses too much
+information". Here are a few observations about this:
+
+First, you're right that LLVM does lose information. For example, as of
+this writing, there is no way to distinguish in the LLVM IR whether an
+SSA-value came from a C "int" or a C "long" on an ILP32 machine (other
+than debug info). Both get compiled down to an 'i32' value and the
+information about what it came from is lost. The more general issue
+here, is that the LLVM type system uses "structural equivalence" instead
+of "name equivalence". Another place this surprises people is if you
+have two types in a high-level language that have the same structure
+(e.g. two different structs that have a single int field): these types
+will compile down into a single LLVM type and it will be impossible to
+tell what it came from.
+
+Second, while LLVM does lose information, LLVM is not a fixed target: we
+continue to enhance and improve it in many different ways. In addition
+to adding new features (LLVM did not always support exceptions or debug
+info), we also extend the IR to capture important information for
+optimization (e.g. whether an argument is sign or zero extended,
+information about pointers aliasing, etc). Many of the enhancements are
+user-driven: people want LLVM to include some specific feature, so they
+go ahead and extend it.
+
+Third, it is *possible and easy* to add language-specific optimizations,
+and you have a number of choices in how to do it. As one trivial
+example, it is easy to add language-specific optimization passes that
+"know" things about code compiled for a language. In the case of the C
+family, there is an optimization pass that "knows" about the standard C
+library functions. If you call "exit(0)" in main(), it knows that it is
+safe to optimize that into "return 0;" because C specifies what the
+'exit' function does.
+
+In addition to simple library knowledge, it is possible to embed a
+variety of other language-specific information into the LLVM IR. If you
+have a specific need and run into a wall, please bring the topic up on
+the llvm-dev list. At the very worst, you can always treat LLVM as if it
+were a "dumb code generator" and implement the high-level optimizations
+you desire in your front-end, on the language-specific AST.
+
+Tips and Tricks
+===============
+
+There is a variety of useful tips and tricks that you come to know after
+working on/with LLVM that aren't obvious at first glance. Instead of
+letting everyone rediscover them, this section talks about some of these
+issues.
+
+Implementing portable offsetof/sizeof
+-------------------------------------
+
+One interesting thing that comes up, if you are trying to keep the code
+generated by your compiler "target independent", is that you often need
+to know the size of some LLVM type or the offset of some field in an
+llvm structure. For example, you might need to pass the size of a type
+into a function that allocates memory.
+
+Unfortunately, this can vary widely across targets: for example the
+width of a pointer is trivially target-specific. However, there is a
+`clever way to use the getelementptr
+instruction <http://nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt>`_
+that allows you to compute this in a portable way.
+
+Garbage Collected Stack Frames
+------------------------------
+
+Some languages want to explicitly manage their stack frames, often so
+that they are garbage collected or to allow easy implementation of
+closures. There are often better ways to implement these features than
+explicit stack frames, but `LLVM does support
+them, <http://nondot.org/sabre/LLVMNotes/ExplicitlyManagedStackFrames.txt>`_
+if you want. It requires your front-end to convert the code into
+`Continuation Passing
+Style <http://en.wikipedia.org/wiki/Continuation-passing_style>`_ and
+the use of tail calls (which LLVM also supports).
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl01.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl01.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl01.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl01.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,196 @@
+:orphan:
+
+=====================================================
+Kaleidoscope: Kaleidoscope Introduction and the Lexer
+=====================================================
+
+.. contents::
+ :local:
+
+The Kaleidoscope Language
+=========================
+
+This tutorial is illustrated with a toy language called
+"`Kaleidoscope <http://en.wikipedia.org/wiki/Kaleidoscope>`_" (derived
+from "meaning beautiful, form, and view"). Kaleidoscope is a procedural
+language that allows you to define functions, use conditionals, math,
+etc. Over the course of the tutorial, we'll extend Kaleidoscope to
+support the if/then/else construct, a for loop, user defined operators,
+JIT compilation with a simple command line interface, debug info, etc.
+
+We want to keep things simple, so the only datatype in Kaleidoscope
+is a 64-bit floating point type (aka 'double' in C parlance). As such,
+all values are implicitly double precision and the language doesn't
+require type declarations. This gives the language a very nice and
+simple syntax. For example, the following simple example computes
+`Fibonacci numbers: <http://en.wikipedia.org/wiki/Fibonacci_number>`_
+
+::
+
+ # Compute the x'th fibonacci number.
+ def fib(x)
+ if x < 3 then
+ 1
+ else
+ fib(x-1)+fib(x-2)
+
+ # This expression will compute the 40th number.
+ fib(40)
+
+We also allow Kaleidoscope to call into standard library functions - the
+LLVM JIT makes this really easy. This means that you can use the
+'extern' keyword to define a function before you use it (this is also
+useful for mutually recursive functions). For example:
+
+::
+
+ extern sin(arg);
+ extern cos(arg);
+ extern atan2(arg1 arg2);
+
+ atan2(sin(.4), cos(42))
+
+A more interesting example is included in Chapter 6 where we write a
+little Kaleidoscope application that `displays a Mandelbrot
+Set <LangImpl06.html#kicking-the-tires>`_ at various levels of magnification.
+
+Let's dive into the implementation of this language!
+
+The Lexer
+=========
+
+When it comes to implementing a language, the first thing needed is the
+ability to process a text file and recognize what it says. The
+traditional way to do this is to use a
+"`lexer <http://en.wikipedia.org/wiki/Lexical_analysis>`_" (aka
+'scanner') to break the input up into "tokens". Each token returned by
+the lexer includes a token code and potentially some metadata (e.g. the
+numeric value of a number). First, we define the possibilities:
+
+.. code-block:: c++
+
+ // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
+ // of these for known things.
+ enum Token {
+ tok_eof = -1,
+
+ // commands
+ tok_def = -2,
+ tok_extern = -3,
+
+ // primary
+ tok_identifier = -4,
+ tok_number = -5,
+ };
+
+ static std::string IdentifierStr; // Filled in if tok_identifier
+ static double NumVal; // Filled in if tok_number
+
+Each token returned by our lexer will either be one of the Token enum
+values or it will be an 'unknown' character like '+', which is returned
+as its ASCII value. If the current token is an identifier, the
+``IdentifierStr`` global variable holds the name of the identifier. If
+the current token is a numeric literal (like 1.0), ``NumVal`` holds its
+value. We use global variables for simplicity, but this is not the
+best choice for a real language implementation :).
+
+The actual implementation of the lexer is a single function named
+``gettok``. The ``gettok`` function is called to return the next token
+from standard input. Its definition starts as:
+
+.. code-block:: c++
+
+ /// gettok - Return the next token from standard input.
+ static int gettok() {
+ static int LastChar = ' ';
+
+ // Skip any whitespace.
+ while (isspace(LastChar))
+ LastChar = getchar();
+
+``gettok`` works by calling the C ``getchar()`` function to read
+characters one at a time from standard input. It eats them as it
+recognizes them and stores the last character read, but not processed,
+in LastChar. The first thing that it has to do is ignore whitespace
+between tokens. This is accomplished with the loop above.
+
+The next thing ``gettok`` needs to do is recognize identifiers and
+specific keywords like "def". Kaleidoscope does this with this simple
+loop:
+
+.. code-block:: c++
+
+ if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
+ IdentifierStr = LastChar;
+ while (isalnum((LastChar = getchar())))
+ IdentifierStr += LastChar;
+
+ if (IdentifierStr == "def")
+ return tok_def;
+ if (IdentifierStr == "extern")
+ return tok_extern;
+ return tok_identifier;
+ }
+
+Note that this code sets the '``IdentifierStr``' global whenever it
+lexes an identifier. Also, since language keywords are matched by the
+same loop, we handle them here inline. Numeric values are similar:
+
+.. code-block:: c++
+
+ if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
+ std::string NumStr;
+ do {
+ NumStr += LastChar;
+ LastChar = getchar();
+ } while (isdigit(LastChar) || LastChar == '.');
+
+ NumVal = strtod(NumStr.c_str(), 0);
+ return tok_number;
+ }
+
+This is all pretty straightforward code for processing input. When
+reading a numeric value from input, we use the C ``strtod`` function to
+convert it to a numeric value that we store in ``NumVal``. Note that
+this isn't doing sufficient error checking: it will incorrectly read
+"1.23.45.67" and handle it as if you typed in "1.23". Feel free to
+extend it! Next we handle comments:
+
+.. code-block:: c++
+
+ if (LastChar == '#') {
+ // Comment until end of line.
+ do
+ LastChar = getchar();
+ while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
+
+ if (LastChar != EOF)
+ return gettok();
+ }
+
+We handle comments by skipping to the end of the line and then return
+the next token. Finally, if the input doesn't match one of the above
+cases, it is either an operator character like '+' or the end of the
+file. These are handled with this code:
+
+.. code-block:: c++
+
+ // Check for end of file. Don't eat the EOF.
+ if (LastChar == EOF)
+ return tok_eof;
+
+ // Otherwise, just return the character as its ascii value.
+ int ThisChar = LastChar;
+ LastChar = getchar();
+ return ThisChar;
+ }
+
+With this, we have the complete lexer for the basic Kaleidoscope
+language (the `full code listing <LangImpl02.html#full-code-listing>`_ for the Lexer
+is available in the `next chapter <LangImpl02.html>`_ of the tutorial).
+Next we'll `build a simple parser that uses this to build an Abstract
+Syntax Tree <LangImpl02.html>`_. When we have that, we'll include a
+driver so that you can use the lexer and parser together.
+
+`Next: Implementing a Parser and AST <LangImpl02.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl02.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl02.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl02.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl02.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,739 @@
+:orphan:
+
+===========================================
+Kaleidoscope: Implementing a Parser and AST
+===========================================
+
+.. contents::
+ :local:
+
+Chapter 2 Introduction
+======================
+
+Welcome to Chapter 2 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. This chapter shows you how to use the
+lexer, built in `Chapter 1 <LangImpl01.html>`_, to build a full
+`parser <http://en.wikipedia.org/wiki/Parsing>`_ for our Kaleidoscope
+language. Once we have a parser, we'll define and build an `Abstract
+Syntax Tree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`_ (AST).
+
+The parser we will build uses a combination of `Recursive Descent
+Parsing <http://en.wikipedia.org/wiki/Recursive_descent_parser>`_ and
+`Operator-Precedence
+Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_ to
+parse the Kaleidoscope language (the latter for binary expressions and
+the former for everything else). Before we get to parsing though, let's
+talk about the output of the parser: the Abstract Syntax Tree.
+
+The Abstract Syntax Tree (AST)
+==============================
+
+The AST for a program captures its behavior in such a way that it is
+easy for later stages of the compiler (e.g. code generation) to
+interpret. We basically want one object for each construct in the
+language, and the AST should closely model the language. In
+Kaleidoscope, we have expressions, a prototype, and a function object.
+We'll start with expressions first:
+
+.. code-block:: c++
+
+ /// ExprAST - Base class for all expression nodes.
+ class ExprAST {
+ public:
+ virtual ~ExprAST() {}
+ };
+
+ /// NumberExprAST - Expression class for numeric literals like "1.0".
+ class NumberExprAST : public ExprAST {
+ double Val;
+
+ public:
+ NumberExprAST(double Val) : Val(Val) {}
+ };
+
+The code above shows the definition of the base ExprAST class and one
+subclass which we use for numeric literals. The important thing to note
+about this code is that the NumberExprAST class captures the numeric
+value of the literal as an instance variable. This allows later phases
+of the compiler to know what the stored numeric value is.
+
+Right now we only create the AST, so there are no useful accessor
+methods on them. It would be very easy to add a virtual method to pretty
+print the code, for example. Here are the other expression AST node
+definitions that we'll use in the basic form of the Kaleidoscope
+language:
+
+.. code-block:: c++
+
+ /// VariableExprAST - Expression class for referencing a variable, like "a".
+ class VariableExprAST : public ExprAST {
+ std::string Name;
+
+ public:
+ VariableExprAST(const std::string &Name) : Name(Name) {}
+ };
+
+ /// BinaryExprAST - Expression class for a binary operator.
+ class BinaryExprAST : public ExprAST {
+ char Op;
+ std::unique_ptr<ExprAST> LHS, RHS;
+
+ public:
+ BinaryExprAST(char op, std::unique_ptr<ExprAST> LHS,
+ std::unique_ptr<ExprAST> RHS)
+ : Op(op), LHS(std::move(LHS)), RHS(std::move(RHS)) {}
+ };
+
+ /// CallExprAST - Expression class for function calls.
+ class CallExprAST : public ExprAST {
+ std::string Callee;
+ std::vector<std::unique_ptr<ExprAST>> Args;
+
+ public:
+ CallExprAST(const std::string &Callee,
+ std::vector<std::unique_ptr<ExprAST>> Args)
+ : Callee(Callee), Args(std::move(Args)) {}
+ };
+
+This is all (intentionally) rather straight-forward: variables capture
+the variable name, binary operators capture their opcode (e.g. '+'), and
+calls capture a function name as well as a list of any argument
+expressions. One thing that is nice about our AST is that it captures
+the language features without talking about the syntax of the language.
+Note that there is no discussion about precedence of binary operators,
+lexical structure, etc.
+
+For our basic language, these are all of the expression nodes we'll
+define. Because it doesn't have conditional control flow, it isn't
+Turing-complete; we'll fix that in a later installment. The two things
+we need next are a way to talk about the interface to a function, and a
+way to talk about functions themselves:
+
+.. code-block:: c++
+
+ /// PrototypeAST - This class represents the "prototype" for a function,
+ /// which captures its name, and its argument names (thus implicitly the number
+ /// of arguments the function takes).
+ class PrototypeAST {
+ std::string Name;
+ std::vector<std::string> Args;
+
+ public:
+ PrototypeAST(const std::string &name, std::vector<std::string> Args)
+ : Name(name), Args(std::move(Args)) {}
+
+ const std::string &getName() const { return Name; }
+ };
+
+ /// FunctionAST - This class represents a function definition itself.
+ class FunctionAST {
+ std::unique_ptr<PrototypeAST> Proto;
+ std::unique_ptr<ExprAST> Body;
+
+ public:
+ FunctionAST(std::unique_ptr<PrototypeAST> Proto,
+ std::unique_ptr<ExprAST> Body)
+ : Proto(std::move(Proto)), Body(std::move(Body)) {}
+ };
+
+In Kaleidoscope, functions are typed with just a count of their
+arguments. Since all values are double precision floating point, the
+type of each argument doesn't need to be stored anywhere. In a more
+aggressive and realistic language, the "ExprAST" class would probably
+have a type field.
+
+With this scaffolding, we can now talk about parsing expressions and
+function bodies in Kaleidoscope.
+
+Parser Basics
+=============
+
+Now that we have an AST to build, we need to define the parser code to
+build it. The idea here is that we want to parse something like "x+y"
+(which is returned as three tokens by the lexer) into an AST that could
+be generated with calls like this:
+
+.. code-block:: c++
+
+ auto LHS = llvm::make_unique<VariableExprAST>("x");
+ auto RHS = llvm::make_unique<VariableExprAST>("y");
+ auto Result = std::make_unique<BinaryExprAST>('+', std::move(LHS),
+ std::move(RHS));
+
+In order to do this, we'll start by defining some basic helper routines:
+
+.. code-block:: c++
+
+ /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
+ /// token the parser is looking at. getNextToken reads another token from the
+ /// lexer and updates CurTok with its results.
+ static int CurTok;
+ static int getNextToken() {
+ return CurTok = gettok();
+ }
+
+This implements a simple token buffer around the lexer. This allows us
+to look one token ahead at what the lexer is returning. Every function
+in our parser will assume that CurTok is the current token that needs to
+be parsed.
+
+.. code-block:: c++
+
+
+ /// LogError* - These are little helper functions for error handling.
+ std::unique_ptr<ExprAST> LogError(const char *Str) {
+ fprintf(stderr, "LogError: %s\n", Str);
+ return nullptr;
+ }
+ std::unique_ptr<PrototypeAST> LogErrorP(const char *Str) {
+ LogError(Str);
+ return nullptr;
+ }
+
+The ``LogError`` routines are simple helper routines that our parser will
+use to handle errors. The error recovery in our parser will not be the
+best and is not particular user-friendly, but it will be enough for our
+tutorial. These routines make it easier to handle errors in routines
+that have various return types: they always return null.
+
+With these basic helper functions, we can implement the first piece of
+our grammar: numeric literals.
+
+Basic Expression Parsing
+========================
+
+We start with numeric literals, because they are the simplest to
+process. For each production in our grammar, we'll define a function
+which parses that production. For numeric literals, we have:
+
+.. code-block:: c++
+
+ /// numberexpr ::= number
+ static std::unique_ptr<ExprAST> ParseNumberExpr() {
+ auto Result = llvm::make_unique<NumberExprAST>(NumVal);
+ getNextToken(); // consume the number
+ return std::move(Result);
+ }
+
+This routine is very simple: it expects to be called when the current
+token is a ``tok_number`` token. It takes the current number value,
+creates a ``NumberExprAST`` node, advances the lexer to the next token,
+and finally returns.
+
+There are some interesting aspects to this. The most important one is
+that this routine eats all of the tokens that correspond to the
+production and returns the lexer buffer with the next token (which is
+not part of the grammar production) ready to go. This is a fairly
+standard way to go for recursive descent parsers. For a better example,
+the parenthesis operator is defined like this:
+
+.. code-block:: c++
+
+ /// parenexpr ::= '(' expression ')'
+ static std::unique_ptr<ExprAST> ParseParenExpr() {
+ getNextToken(); // eat (.
+ auto V = ParseExpression();
+ if (!V)
+ return nullptr;
+
+ if (CurTok != ')')
+ return LogError("expected ')'");
+ getNextToken(); // eat ).
+ return V;
+ }
+
+This function illustrates a number of interesting things about the
+parser:
+
+1) It shows how we use the LogError routines. When called, this function
+expects that the current token is a '(' token, but after parsing the
+subexpression, it is possible that there is no ')' waiting. For example,
+if the user types in "(4 x" instead of "(4)", the parser should emit an
+error. Because errors can occur, the parser needs a way to indicate that
+they happened: in our parser, we return null on an error.
+
+2) Another interesting aspect of this function is that it uses recursion
+by calling ``ParseExpression`` (we will soon see that
+``ParseExpression`` can call ``ParseParenExpr``). This is powerful
+because it allows us to handle recursive grammars, and keeps each
+production very simple. Note that parentheses do not cause construction
+of AST nodes themselves. While we could do it this way, the most
+important role of parentheses are to guide the parser and provide
+grouping. Once the parser constructs the AST, parentheses are not
+needed.
+
+The next simple production is for handling variable references and
+function calls:
+
+.. code-block:: c++
+
+ /// identifierexpr
+ /// ::= identifier
+ /// ::= identifier '(' expression* ')'
+ static std::unique_ptr<ExprAST> ParseIdentifierExpr() {
+ std::string IdName = IdentifierStr;
+
+ getNextToken(); // eat identifier.
+
+ if (CurTok != '(') // Simple variable ref.
+ return llvm::make_unique<VariableExprAST>(IdName);
+
+ // Call.
+ getNextToken(); // eat (
+ std::vector<std::unique_ptr<ExprAST>> Args;
+ if (CurTok != ')') {
+ while (1) {
+ if (auto Arg = ParseExpression())
+ Args.push_back(std::move(Arg));
+ else
+ return nullptr;
+
+ if (CurTok == ')')
+ break;
+
+ if (CurTok != ',')
+ return LogError("Expected ')' or ',' in argument list");
+ getNextToken();
+ }
+ }
+
+ // Eat the ')'.
+ getNextToken();
+
+ return llvm::make_unique<CallExprAST>(IdName, std::move(Args));
+ }
+
+This routine follows the same style as the other routines. (It expects
+to be called if the current token is a ``tok_identifier`` token). It
+also has recursion and error handling. One interesting aspect of this is
+that it uses *look-ahead* to determine if the current identifier is a
+stand alone variable reference or if it is a function call expression.
+It handles this by checking to see if the token after the identifier is
+a '(' token, constructing either a ``VariableExprAST`` or
+``CallExprAST`` node as appropriate.
+
+Now that we have all of our simple expression-parsing logic in place, we
+can define a helper function to wrap it together into one entry point.
+We call this class of expressions "primary" expressions, for reasons
+that will become more clear `later in the
+tutorial <LangImpl6.html#user-defined-unary-operators>`_. In order to parse an arbitrary
+primary expression, we need to determine what sort of expression it is:
+
+.. code-block:: c++
+
+ /// primary
+ /// ::= identifierexpr
+ /// ::= numberexpr
+ /// ::= parenexpr
+ static std::unique_ptr<ExprAST> ParsePrimary() {
+ switch (CurTok) {
+ default:
+ return LogError("unknown token when expecting an expression");
+ case tok_identifier:
+ return ParseIdentifierExpr();
+ case tok_number:
+ return ParseNumberExpr();
+ case '(':
+ return ParseParenExpr();
+ }
+ }
+
+Now that you see the definition of this function, it is more obvious why
+we can assume the state of CurTok in the various functions. This uses
+look-ahead to determine which sort of expression is being inspected, and
+then parses it with a function call.
+
+Now that basic expressions are handled, we need to handle binary
+expressions. They are a bit more complex.
+
+Binary Expression Parsing
+=========================
+
+Binary expressions are significantly harder to parse because they are
+often ambiguous. For example, when given the string "x+y\*z", the parser
+can choose to parse it as either "(x+y)\*z" or "x+(y\*z)". With common
+definitions from mathematics, we expect the later parse, because "\*"
+(multiplication) has higher *precedence* than "+" (addition).
+
+There are many ways to handle this, but an elegant and efficient way is
+to use `Operator-Precedence
+Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_.
+This parsing technique uses the precedence of binary operators to guide
+recursion. To start with, we need a table of precedences:
+
+.. code-block:: c++
+
+ /// BinopPrecedence - This holds the precedence for each binary operator that is
+ /// defined.
+ static std::map<char, int> BinopPrecedence;
+
+ /// GetTokPrecedence - Get the precedence of the pending binary operator token.
+ static int GetTokPrecedence() {
+ if (!isascii(CurTok))
+ return -1;
+
+ // Make sure it's a declared binop.
+ int TokPrec = BinopPrecedence[CurTok];
+ if (TokPrec <= 0) return -1;
+ return TokPrec;
+ }
+
+ int main() {
+ // Install standard binary operators.
+ // 1 is lowest precedence.
+ BinopPrecedence['<'] = 10;
+ BinopPrecedence['+'] = 20;
+ BinopPrecedence['-'] = 20;
+ BinopPrecedence['*'] = 40; // highest.
+ ...
+ }
+
+For the basic form of Kaleidoscope, we will only support 4 binary
+operators (this can obviously be extended by you, our brave and intrepid
+reader). The ``GetTokPrecedence`` function returns the precedence for
+the current token, or -1 if the token is not a binary operator. Having a
+map makes it easy to add new operators and makes it clear that the
+algorithm doesn't depend on the specific operators involved, but it
+would be easy enough to eliminate the map and do the comparisons in the
+``GetTokPrecedence`` function. (Or just use a fixed-size array).
+
+With the helper above defined, we can now start parsing binary
+expressions. The basic idea of operator precedence parsing is to break
+down an expression with potentially ambiguous binary operators into
+pieces. Consider, for example, the expression "a+b+(c+d)\*e\*f+g".
+Operator precedence parsing considers this as a stream of primary
+expressions separated by binary operators. As such, it will first parse
+the leading primary expression "a", then it will see the pairs [+, b]
+[+, (c+d)] [\*, e] [\*, f] and [+, g]. Note that because parentheses are
+primary expressions, the binary expression parser doesn't need to worry
+about nested subexpressions like (c+d) at all.
+
+To start, an expression is a primary expression potentially followed by
+a sequence of [binop,primaryexpr] pairs:
+
+.. code-block:: c++
+
+ /// expression
+ /// ::= primary binoprhs
+ ///
+ static std::unique_ptr<ExprAST> ParseExpression() {
+ auto LHS = ParsePrimary();
+ if (!LHS)
+ return nullptr;
+
+ return ParseBinOpRHS(0, std::move(LHS));
+ }
+
+``ParseBinOpRHS`` is the function that parses the sequence of pairs for
+us. It takes a precedence and a pointer to an expression for the part
+that has been parsed so far. Note that "x" is a perfectly valid
+expression: As such, "binoprhs" is allowed to be empty, in which case it
+returns the expression that is passed into it. In our example above, the
+code passes the expression for "a" into ``ParseBinOpRHS`` and the
+current token is "+".
+
+The precedence value passed into ``ParseBinOpRHS`` indicates the
+*minimal operator precedence* that the function is allowed to eat. For
+example, if the current pair stream is [+, x] and ``ParseBinOpRHS`` is
+passed in a precedence of 40, it will not consume any tokens (because
+the precedence of '+' is only 20). With this in mind, ``ParseBinOpRHS``
+starts with:
+
+.. code-block:: c++
+
+ /// binoprhs
+ /// ::= ('+' primary)*
+ static std::unique_ptr<ExprAST> ParseBinOpRHS(int ExprPrec,
+ std::unique_ptr<ExprAST> LHS) {
+ // If this is a binop, find its precedence.
+ while (1) {
+ int TokPrec = GetTokPrecedence();
+
+ // If this is a binop that binds at least as tightly as the current binop,
+ // consume it, otherwise we are done.
+ if (TokPrec < ExprPrec)
+ return LHS;
+
+This code gets the precedence of the current token and checks to see if
+if is too low. Because we defined invalid tokens to have a precedence of
+-1, this check implicitly knows that the pair-stream ends when the token
+stream runs out of binary operators. If this check succeeds, we know
+that the token is a binary operator and that it will be included in this
+expression:
+
+.. code-block:: c++
+
+ // Okay, we know this is a binop.
+ int BinOp = CurTok;
+ getNextToken(); // eat binop
+
+ // Parse the primary expression after the binary operator.
+ auto RHS = ParsePrimary();
+ if (!RHS)
+ return nullptr;
+
+As such, this code eats (and remembers) the binary operator and then
+parses the primary expression that follows. This builds up the whole
+pair, the first of which is [+, b] for the running example.
+
+Now that we parsed the left-hand side of an expression and one pair of
+the RHS sequence, we have to decide which way the expression associates.
+In particular, we could have "(a+b) binop unparsed" or "a + (b binop
+unparsed)". To determine this, we look ahead at "binop" to determine its
+precedence and compare it to BinOp's precedence (which is '+' in this
+case):
+
+.. code-block:: c++
+
+ // If BinOp binds less tightly with RHS than the operator after RHS, let
+ // the pending operator take RHS as its LHS.
+ int NextPrec = GetTokPrecedence();
+ if (TokPrec < NextPrec) {
+
+If the precedence of the binop to the right of "RHS" is lower or equal
+to the precedence of our current operator, then we know that the
+parentheses associate as "(a+b) binop ...". In our example, the current
+operator is "+" and the next operator is "+", we know that they have the
+same precedence. In this case we'll create the AST node for "a+b", and
+then continue parsing:
+
+.. code-block:: c++
+
+ ... if body omitted ...
+ }
+
+ // Merge LHS/RHS.
+ LHS = llvm::make_unique<BinaryExprAST>(BinOp, std::move(LHS),
+ std::move(RHS));
+ } // loop around to the top of the while loop.
+ }
+
+In our example above, this will turn "a+b+" into "(a+b)" and execute the
+next iteration of the loop, with "+" as the current token. The code
+above will eat, remember, and parse "(c+d)" as the primary expression,
+which makes the current pair equal to [+, (c+d)]. It will then evaluate
+the 'if' conditional above with "\*" as the binop to the right of the
+primary. In this case, the precedence of "\*" is higher than the
+precedence of "+" so the if condition will be entered.
+
+The critical question left here is "how can the if condition parse the
+right hand side in full"? In particular, to build the AST correctly for
+our example, it needs to get all of "(c+d)\*e\*f" as the RHS expression
+variable. The code to do this is surprisingly simple (code from the
+above two blocks duplicated for context):
+
+.. code-block:: c++
+
+ // If BinOp binds less tightly with RHS than the operator after RHS, let
+ // the pending operator take RHS as its LHS.
+ int NextPrec = GetTokPrecedence();
+ if (TokPrec < NextPrec) {
+ RHS = ParseBinOpRHS(TokPrec+1, std::move(RHS));
+ if (!RHS)
+ return nullptr;
+ }
+ // Merge LHS/RHS.
+ LHS = llvm::make_unique<BinaryExprAST>(BinOp, std::move(LHS),
+ std::move(RHS));
+ } // loop around to the top of the while loop.
+ }
+
+At this point, we know that the binary operator to the RHS of our
+primary has higher precedence than the binop we are currently parsing.
+As such, we know that any sequence of pairs whose operators are all
+higher precedence than "+" should be parsed together and returned as
+"RHS". To do this, we recursively invoke the ``ParseBinOpRHS`` function
+specifying "TokPrec+1" as the minimum precedence required for it to
+continue. In our example above, this will cause it to return the AST
+node for "(c+d)\*e\*f" as RHS, which is then set as the RHS of the '+'
+expression.
+
+Finally, on the next iteration of the while loop, the "+g" piece is
+parsed and added to the AST. With this little bit of code (14
+non-trivial lines), we correctly handle fully general binary expression
+parsing in a very elegant way. This was a whirlwind tour of this code,
+and it is somewhat subtle. I recommend running through it with a few
+tough examples to see how it works.
+
+This wraps up handling of expressions. At this point, we can point the
+parser at an arbitrary token stream and build an expression from it,
+stopping at the first token that is not part of the expression. Next up
+we need to handle function definitions, etc.
+
+Parsing the Rest
+================
+
+The next thing missing is handling of function prototypes. In
+Kaleidoscope, these are used both for 'extern' function declarations as
+well as function body definitions. The code to do this is
+straight-forward and not very interesting (once you've survived
+expressions):
+
+.. code-block:: c++
+
+ /// prototype
+ /// ::= id '(' id* ')'
+ static std::unique_ptr<PrototypeAST> ParsePrototype() {
+ if (CurTok != tok_identifier)
+ return LogErrorP("Expected function name in prototype");
+
+ std::string FnName = IdentifierStr;
+ getNextToken();
+
+ if (CurTok != '(')
+ return LogErrorP("Expected '(' in prototype");
+
+ // Read the list of argument names.
+ std::vector<std::string> ArgNames;
+ while (getNextToken() == tok_identifier)
+ ArgNames.push_back(IdentifierStr);
+ if (CurTok != ')')
+ return LogErrorP("Expected ')' in prototype");
+
+ // success.
+ getNextToken(); // eat ')'.
+
+ return llvm::make_unique<PrototypeAST>(FnName, std::move(ArgNames));
+ }
+
+Given this, a function definition is very simple, just a prototype plus
+an expression to implement the body:
+
+.. code-block:: c++
+
+ /// definition ::= 'def' prototype expression
+ static std::unique_ptr<FunctionAST> ParseDefinition() {
+ getNextToken(); // eat def.
+ auto Proto = ParsePrototype();
+ if (!Proto) return nullptr;
+
+ if (auto E = ParseExpression())
+ return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
+ return nullptr;
+ }
+
+In addition, we support 'extern' to declare functions like 'sin' and
+'cos' as well as to support forward declaration of user functions. These
+'extern's are just prototypes with no body:
+
+.. code-block:: c++
+
+ /// external ::= 'extern' prototype
+ static std::unique_ptr<PrototypeAST> ParseExtern() {
+ getNextToken(); // eat extern.
+ return ParsePrototype();
+ }
+
+Finally, we'll also let the user type in arbitrary top-level expressions
+and evaluate them on the fly. We will handle this by defining anonymous
+nullary (zero argument) functions for them:
+
+.. code-block:: c++
+
+ /// toplevelexpr ::= expression
+ static std::unique_ptr<FunctionAST> ParseTopLevelExpr() {
+ if (auto E = ParseExpression()) {
+ // Make an anonymous proto.
+ auto Proto = llvm::make_unique<PrototypeAST>("", std::vector<std::string>());
+ return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
+ }
+ return nullptr;
+ }
+
+Now that we have all the pieces, let's build a little driver that will
+let us actually *execute* this code we've built!
+
+The Driver
+==========
+
+The driver for this simply invokes all of the parsing pieces with a
+top-level dispatch loop. There isn't much interesting here, so I'll just
+include the top-level loop. See `below <#full-code-listing>`_ for full code in the
+"Top-Level Parsing" section.
+
+.. code-block:: c++
+
+ /// top ::= definition | external | expression | ';'
+ static void MainLoop() {
+ while (1) {
+ fprintf(stderr, "ready> ");
+ switch (CurTok) {
+ case tok_eof:
+ return;
+ case ';': // ignore top-level semicolons.
+ getNextToken();
+ break;
+ case tok_def:
+ HandleDefinition();
+ break;
+ case tok_extern:
+ HandleExtern();
+ break;
+ default:
+ HandleTopLevelExpression();
+ break;
+ }
+ }
+ }
+
+The most interesting part of this is that we ignore top-level
+semicolons. Why is this, you ask? The basic reason is that if you type
+"4 + 5" at the command line, the parser doesn't know whether that is the
+end of what you will type or not. For example, on the next line you
+could type "def foo..." in which case 4+5 is the end of a top-level
+expression. Alternatively you could type "\* 6", which would continue
+the expression. Having top-level semicolons allows you to type "4+5;",
+and the parser will know you are done.
+
+Conclusions
+===========
+
+With just under 400 lines of commented code (240 lines of non-comment,
+non-blank code), we fully defined our minimal language, including a
+lexer, parser, and AST builder. With this done, the executable will
+validate Kaleidoscope code and tell us if it is grammatically invalid.
+For example, here is a sample interaction:
+
+.. code-block:: bash
+
+ $ ./a.out
+ ready> def foo(x y) x+foo(y, 4.0);
+ Parsed a function definition.
+ ready> def foo(x y) x+y y;
+ Parsed a function definition.
+ Parsed a top-level expr
+ ready> def foo(x y) x+y );
+ Parsed a function definition.
+ Error: unknown token when expecting an expression
+ ready> extern sin(a);
+ ready> Parsed an extern
+ ready> ^D
+ $
+
+There is a lot of room for extension here. You can define new AST nodes,
+extend the language in many ways, etc. In the `next
+installment <LangImpl03.html>`_, we will describe how to generate LLVM
+Intermediate Representation (IR) from the AST.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example. Because this
+uses the LLVM libraries, we need to link them in. To do this, we use the
+`llvm-config <http://llvm.org/cmds/llvm-config.html>`_ tool to inform
+our makefile/command line about which options to use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g -O3 toy.cpp `llvm-config --cxxflags`
+ # Run
+ ./a.out
+
+Here is the code:
+
+.. literalinclude:: ../../../examples/Kaleidoscope/Chapter2/toy.cpp
+ :language: c++
+
+`Next: Implementing Code Generation to LLVM IR <LangImpl03.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl03.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl03.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl03.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl03.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,570 @@
+:orphan:
+
+========================================
+Kaleidoscope: Code generation to LLVM IR
+========================================
+
+.. contents::
+ :local:
+
+Chapter 3 Introduction
+======================
+
+Welcome to Chapter 3 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. This chapter shows you how to transform
+the `Abstract Syntax Tree <LangImpl02.html>`_, built in Chapter 2, into
+LLVM IR. This will teach you a little bit about how LLVM does things, as
+well as demonstrate how easy it is to use. It's much more work to build
+a lexer and parser than it is to generate LLVM IR code. :)
+
+**Please note**: the code in this chapter and later require LLVM 3.7 or
+later. LLVM 3.6 and before will not work with it. Also note that you
+need to use a version of this tutorial that matches your LLVM release:
+If you are using an official LLVM release, use the version of the
+documentation included with your release or on the `llvm.org releases
+page <http://llvm.org/releases/>`_.
+
+Code Generation Setup
+=====================
+
+In order to generate LLVM IR, we want some simple setup to get started.
+First we define virtual code generation (codegen) methods in each AST
+class:
+
+.. code-block:: c++
+
+ /// ExprAST - Base class for all expression nodes.
+ class ExprAST {
+ public:
+ virtual ~ExprAST() {}
+ virtual Value *codegen() = 0;
+ };
+
+ /// NumberExprAST - Expression class for numeric literals like "1.0".
+ class NumberExprAST : public ExprAST {
+ double Val;
+
+ public:
+ NumberExprAST(double Val) : Val(Val) {}
+ virtual Value *codegen();
+ };
+ ...
+
+The codegen() method says to emit IR for that AST node along with all
+the things it depends on, and they all return an LLVM Value object.
+"Value" is the class used to represent a "`Static Single Assignment
+(SSA) <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+register" or "SSA value" in LLVM. The most distinct aspect of SSA values
+is that their value is computed as the related instruction executes, and
+it does not get a new value until (and if) the instruction re-executes.
+In other words, there is no way to "change" an SSA value. For more
+information, please read up on `Static Single
+Assignment <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+- the concepts are really quite natural once you grok them.
+
+Note that instead of adding virtual methods to the ExprAST class
+hierarchy, it could also make sense to use a `visitor
+pattern <http://en.wikipedia.org/wiki/Visitor_pattern>`_ or some other
+way to model this. Again, this tutorial won't dwell on good software
+engineering practices: for our purposes, adding a virtual method is
+simplest.
+
+The second thing we want is an "LogError" method like we used for the
+parser, which will be used to report errors found during code generation
+(for example, use of an undeclared parameter):
+
+.. code-block:: c++
+
+ static LLVMContext TheContext;
+ static IRBuilder<> Builder(TheContext);
+ static std::unique_ptr<Module> TheModule;
+ static std::map<std::string, Value *> NamedValues;
+
+ Value *LogErrorV(const char *Str) {
+ LogError(Str);
+ return nullptr;
+ }
+
+The static variables will be used during code generation. ``TheContext``
+is an opaque object that owns a lot of core LLVM data structures, such as
+the type and constant value tables. We don't need to understand it in
+detail, we just need a single instance to pass into APIs that require it.
+
+The ``Builder`` object is a helper object that makes it easy to generate
+LLVM instructions. Instances of the
+`IRBuilder <http://llvm.org/doxygen/IRBuilder_8h-source.html>`_
+class template keep track of the current place to insert instructions
+and has methods to create new instructions.
+
+``TheModule`` is an LLVM construct that contains functions and global
+variables. In many ways, it is the top-level structure that the LLVM IR
+uses to contain code. It will own the memory for all of the IR that we
+generate, which is why the codegen() method returns a raw Value\*,
+rather than a unique_ptr<Value>.
+
+The ``NamedValues`` map keeps track of which values are defined in the
+current scope and what their LLVM representation is. (In other words, it
+is a symbol table for the code). In this form of Kaleidoscope, the only
+things that can be referenced are function parameters. As such, function
+parameters will be in this map when generating code for their function
+body.
+
+With these basics in place, we can start talking about how to generate
+code for each expression. Note that this assumes that the ``Builder``
+has been set up to generate code *into* something. For now, we'll assume
+that this has already been done, and we'll just use it to emit code.
+
+Expression Code Generation
+==========================
+
+Generating LLVM code for expression nodes is very straightforward: less
+than 45 lines of commented code for all four of our expression nodes.
+First we'll do numeric literals:
+
+.. code-block:: c++
+
+ Value *NumberExprAST::codegen() {
+ return ConstantFP::get(TheContext, APFloat(Val));
+ }
+
+In the LLVM IR, numeric constants are represented with the
+``ConstantFP`` class, which holds the numeric value in an ``APFloat``
+internally (``APFloat`` has the capability of holding floating point
+constants of Arbitrary Precision). This code basically just creates
+and returns a ``ConstantFP``. Note that in the LLVM IR that constants
+are all uniqued together and shared. For this reason, the API uses the
+"foo::get(...)" idiom instead of "new foo(..)" or "foo::Create(..)".
+
+.. code-block:: c++
+
+ Value *VariableExprAST::codegen() {
+ // Look this variable up in the function.
+ Value *V = NamedValues[Name];
+ if (!V)
+ LogErrorV("Unknown variable name");
+ return V;
+ }
+
+References to variables are also quite simple using LLVM. In the simple
+version of Kaleidoscope, we assume that the variable has already been
+emitted somewhere and its value is available. In practice, the only
+values that can be in the ``NamedValues`` map are function arguments.
+This code simply checks to see that the specified name is in the map (if
+not, an unknown variable is being referenced) and returns the value for
+it. In future chapters, we'll add support for `loop induction
+variables <LangImpl5.html#for-loop-expression>`_ in the symbol table, and for `local
+variables <LangImpl7.html#user-defined-local-variables>`_.
+
+.. code-block:: c++
+
+ Value *BinaryExprAST::codegen() {
+ Value *L = LHS->codegen();
+ Value *R = RHS->codegen();
+ if (!L || !R)
+ return nullptr;
+
+ switch (Op) {
+ case '+':
+ return Builder.CreateFAdd(L, R, "addtmp");
+ case '-':
+ return Builder.CreateFSub(L, R, "subtmp");
+ case '*':
+ return Builder.CreateFMul(L, R, "multmp");
+ case '<':
+ L = Builder.CreateFCmpULT(L, R, "cmptmp");
+ // Convert bool 0/1 to double 0.0 or 1.0
+ return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext),
+ "booltmp");
+ default:
+ return LogErrorV("invalid binary operator");
+ }
+ }
+
+Binary operators start to get more interesting. The basic idea here is
+that we recursively emit code for the left-hand side of the expression,
+then the right-hand side, then we compute the result of the binary
+expression. In this code, we do a simple switch on the opcode to create
+the right LLVM instruction.
+
+In the example above, the LLVM builder class is starting to show its
+value. IRBuilder knows where to insert the newly created instruction,
+all you have to do is specify what instruction to create (e.g. with
+``CreateFAdd``), which operands to use (``L`` and ``R`` here) and
+optionally provide a name for the generated instruction.
+
+One nice thing about LLVM is that the name is just a hint. For instance,
+if the code above emits multiple "addtmp" variables, LLVM will
+automatically provide each one with an increasing, unique numeric
+suffix. Local value names for instructions are purely optional, but it
+makes it much easier to read the IR dumps.
+
+`LLVM instructions <../LangRef.html#instruction-reference>`_ are constrained by strict
+rules: for example, the Left and Right operators of an `add
+instruction <../LangRef.html#add-instruction>`_ must have the same type, and the
+result type of the add must match the operand types. Because all values
+in Kaleidoscope are doubles, this makes for very simple code for add,
+sub and mul.
+
+On the other hand, LLVM specifies that the `fcmp
+instruction <../LangRef.html#fcmp-instruction>`_ always returns an 'i1' value (a
+one bit integer). The problem with this is that Kaleidoscope wants the
+value to be a 0.0 or 1.0 value. In order to get these semantics, we
+combine the fcmp instruction with a `uitofp
+instruction <../LangRef.html#uitofp-to-instruction>`_. This instruction converts its
+input integer into a floating point value by treating the input as an
+unsigned value. In contrast, if we used the `sitofp
+instruction <../LangRef.html#sitofp-to-instruction>`_, the Kaleidoscope '<' operator
+would return 0.0 and -1.0, depending on the input value.
+
+.. code-block:: c++
+
+ Value *CallExprAST::codegen() {
+ // Look up the name in the global module table.
+ Function *CalleeF = TheModule->getFunction(Callee);
+ if (!CalleeF)
+ return LogErrorV("Unknown function referenced");
+
+ // If argument mismatch error.
+ if (CalleeF->arg_size() != Args.size())
+ return LogErrorV("Incorrect # arguments passed");
+
+ std::vector<Value *> ArgsV;
+ for (unsigned i = 0, e = Args.size(); i != e; ++i) {
+ ArgsV.push_back(Args[i]->codegen());
+ if (!ArgsV.back())
+ return nullptr;
+ }
+
+ return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
+ }
+
+Code generation for function calls is quite straightforward with LLVM. The code
+above initially does a function name lookup in the LLVM Module's symbol table.
+Recall that the LLVM Module is the container that holds the functions we are
+JIT'ing. By giving each function the same name as what the user specifies, we
+can use the LLVM symbol table to resolve function names for us.
+
+Once we have the function to call, we recursively codegen each argument
+that is to be passed in, and create an LLVM `call
+instruction <../LangRef.html#call-instruction>`_. Note that LLVM uses the native C
+calling conventions by default, allowing these calls to also call into
+standard library functions like "sin" and "cos", with no additional
+effort.
+
+This wraps up our handling of the four basic expressions that we have so
+far in Kaleidoscope. Feel free to go in and add some more. For example,
+by browsing the `LLVM language reference <../LangRef.html>`_ you'll find
+several other interesting instructions that are really easy to plug into
+our basic framework.
+
+Function Code Generation
+========================
+
+Code generation for prototypes and functions must handle a number of
+details, which make their code less beautiful than expression code
+generation, but allows us to illustrate some important points. First,
+let's talk about code generation for prototypes: they are used both for
+function bodies and external function declarations. The code starts
+with:
+
+.. code-block:: c++
+
+ Function *PrototypeAST::codegen() {
+ // Make the function type: double(double,double) etc.
+ std::vector<Type*> Doubles(Args.size(),
+ Type::getDoubleTy(TheContext));
+ FunctionType *FT =
+ FunctionType::get(Type::getDoubleTy(TheContext), Doubles, false);
+
+ Function *F =
+ Function::Create(FT, Function::ExternalLinkage, Name, TheModule.get());
+
+This code packs a lot of power into a few lines. Note first that this
+function returns a "Function\*" instead of a "Value\*". Because a
+"prototype" really talks about the external interface for a function
+(not the value computed by an expression), it makes sense for it to
+return the LLVM Function it corresponds to when codegen'd.
+
+The call to ``FunctionType::get`` creates the ``FunctionType`` that
+should be used for a given Prototype. Since all function arguments in
+Kaleidoscope are of type double, the first line creates a vector of "N"
+LLVM double types. It then uses the ``Functiontype::get`` method to
+create a function type that takes "N" doubles as arguments, returns one
+double as a result, and that is not vararg (the false parameter
+indicates this). Note that Types in LLVM are uniqued just like Constants
+are, so you don't "new" a type, you "get" it.
+
+The final line above actually creates the IR Function corresponding to
+the Prototype. This indicates the type, linkage and name to use, as
+well as which module to insert into. "`external
+linkage <../LangRef.html#linkage>`_" means that the function may be
+defined outside the current module and/or that it is callable by
+functions outside the module. The Name passed in is the name the user
+specified: since "``TheModule``" is specified, this name is registered
+in "``TheModule``"s symbol table.
+
+.. code-block:: c++
+
+ // Set names for all arguments.
+ unsigned Idx = 0;
+ for (auto &Arg : F->args())
+ Arg.setName(Args[Idx++]);
+
+ return F;
+
+Finally, we set the name of each of the function's arguments according to the
+names given in the Prototype. This step isn't strictly necessary, but keeping
+the names consistent makes the IR more readable, and allows subsequent code to
+refer directly to the arguments for their names, rather than having to look up
+them up in the Prototype AST.
+
+At this point we have a function prototype with no body. This is how LLVM IR
+represents function declarations. For extern statements in Kaleidoscope, this
+is as far as we need to go. For function definitions however, we need to
+codegen and attach a function body.
+
+.. code-block:: c++
+
+ Function *FunctionAST::codegen() {
+ // First, check for an existing function from a previous 'extern' declaration.
+ Function *TheFunction = TheModule->getFunction(Proto->getName());
+
+ if (!TheFunction)
+ TheFunction = Proto->codegen();
+
+ if (!TheFunction)
+ return nullptr;
+
+ if (!TheFunction->empty())
+ return (Function*)LogErrorV("Function cannot be redefined.");
+
+
+For function definitions, we start by searching TheModule's symbol table for an
+existing version of this function, in case one has already been created using an
+'extern' statement. If Module::getFunction returns null then no previous version
+exists, so we'll codegen one from the Prototype. In either case, we want to
+assert that the function is empty (i.e. has no body yet) before we start.
+
+.. code-block:: c++
+
+ // Create a new basic block to start insertion into.
+ BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
+ Builder.SetInsertPoint(BB);
+
+ // Record the function arguments in the NamedValues map.
+ NamedValues.clear();
+ for (auto &Arg : TheFunction->args())
+ NamedValues[Arg.getName()] = &Arg;
+
+Now we get to the point where the ``Builder`` is set up. The first line
+creates a new `basic block <http://en.wikipedia.org/wiki/Basic_block>`_
+(named "entry"), which is inserted into ``TheFunction``. The second line
+then tells the builder that new instructions should be inserted into the
+end of the new basic block. Basic blocks in LLVM are an important part
+of functions that define the `Control Flow
+Graph <http://en.wikipedia.org/wiki/Control_flow_graph>`_. Since we
+don't have any control flow, our functions will only contain one block
+at this point. We'll fix this in `Chapter 5 <LangImpl05.html>`_ :).
+
+Next we add the function arguments to the NamedValues map (after first clearing
+it out) so that they're accessible to ``VariableExprAST`` nodes.
+
+.. code-block:: c++
+
+ if (Value *RetVal = Body->codegen()) {
+ // Finish off the function.
+ Builder.CreateRet(RetVal);
+
+ // Validate the generated code, checking for consistency.
+ verifyFunction(*TheFunction);
+
+ return TheFunction;
+ }
+
+Once the insertion point has been set up and the NamedValues map populated,
+we call the ``codegen()`` method for the root expression of the function. If no
+error happens, this emits code to compute the expression into the entry block
+and returns the value that was computed. Assuming no error, we then create an
+LLVM `ret instruction <../LangRef.html#ret-instruction>`_, which completes the function.
+Once the function is built, we call ``verifyFunction``, which is
+provided by LLVM. This function does a variety of consistency checks on
+the generated code, to determine if our compiler is doing everything
+right. Using this is important: it can catch a lot of bugs. Once the
+function is finished and validated, we return it.
+
+.. code-block:: c++
+
+ // Error reading body, remove function.
+ TheFunction->eraseFromParent();
+ return nullptr;
+ }
+
+The only piece left here is handling of the error case. For simplicity,
+we handle this by merely deleting the function we produced with the
+``eraseFromParent`` method. This allows the user to redefine a function
+that they incorrectly typed in before: if we didn't delete it, it would
+live in the symbol table, with a body, preventing future redefinition.
+
+This code does have a bug, though: If the ``FunctionAST::codegen()`` method
+finds an existing IR Function, it does not validate its signature against the
+definition's own prototype. This means that an earlier 'extern' declaration will
+take precedence over the function definition's signature, which can cause
+codegen to fail, for instance if the function arguments are named differently.
+There are a number of ways to fix this bug, see what you can come up with! Here
+is a testcase:
+
+::
+
+ extern foo(a); # ok, defines foo.
+ def foo(b) b; # Error: Unknown variable name. (decl using 'a' takes precedence).
+
+Driver Changes and Closing Thoughts
+===================================
+
+For now, code generation to LLVM doesn't really get us much, except that
+we can look at the pretty IR calls. The sample code inserts calls to
+codegen into the "``HandleDefinition``", "``HandleExtern``" etc
+functions, and then dumps out the LLVM IR. This gives a nice way to look
+at the LLVM IR for simple functions. For example:
+
+::
+
+ ready> 4+5;
+ Read top-level expression:
+ define double @0() {
+ entry:
+ ret double 9.000000e+00
+ }
+
+Note how the parser turns the top-level expression into anonymous
+functions for us. This will be handy when we add `JIT
+support <LangImpl4.html#adding-a-jit-compiler>`_ in the next chapter. Also note that the
+code is very literally transcribed, no optimizations are being performed
+except simple constant folding done by IRBuilder. We will `add
+optimizations <LangImpl4.html#trivial-constant-folding>`_ explicitly in the next
+chapter.
+
+::
+
+ ready> def foo(a b) a*a + 2*a*b + b*b;
+ Read function definition:
+ define double @foo(double %a, double %b) {
+ entry:
+ %multmp = fmul double %a, %a
+ %multmp1 = fmul double 2.000000e+00, %a
+ %multmp2 = fmul double %multmp1, %b
+ %addtmp = fadd double %multmp, %multmp2
+ %multmp3 = fmul double %b, %b
+ %addtmp4 = fadd double %addtmp, %multmp3
+ ret double %addtmp4
+ }
+
+This shows some simple arithmetic. Notice the striking similarity to the
+LLVM builder calls that we use to create the instructions.
+
+::
+
+ ready> def bar(a) foo(a, 4.0) + bar(31337);
+ Read function definition:
+ define double @bar(double %a) {
+ entry:
+ %calltmp = call double @foo(double %a, double 4.000000e+00)
+ %calltmp1 = call double @bar(double 3.133700e+04)
+ %addtmp = fadd double %calltmp, %calltmp1
+ ret double %addtmp
+ }
+
+This shows some function calls. Note that this function will take a long
+time to execute if you call it. In the future we'll add conditional
+control flow to actually make recursion useful :).
+
+::
+
+ ready> extern cos(x);
+ Read extern:
+ declare double @cos(double)
+
+ ready> cos(1.234);
+ Read top-level expression:
+ define double @1() {
+ entry:
+ %calltmp = call double @cos(double 1.234000e+00)
+ ret double %calltmp
+ }
+
+This shows an extern for the libm "cos" function, and a call to it.
+
+.. TODO:: Abandon Pygments' horrible `llvm` lexer. It just totally gives up
+ on highlighting this due to the first line.
+
+::
+
+ ready> ^D
+ ; ModuleID = 'my cool jit'
+
+ define double @0() {
+ entry:
+ %addtmp = fadd double 4.000000e+00, 5.000000e+00
+ ret double %addtmp
+ }
+
+ define double @foo(double %a, double %b) {
+ entry:
+ %multmp = fmul double %a, %a
+ %multmp1 = fmul double 2.000000e+00, %a
+ %multmp2 = fmul double %multmp1, %b
+ %addtmp = fadd double %multmp, %multmp2
+ %multmp3 = fmul double %b, %b
+ %addtmp4 = fadd double %addtmp, %multmp3
+ ret double %addtmp4
+ }
+
+ define double @bar(double %a) {
+ entry:
+ %calltmp = call double @foo(double %a, double 4.000000e+00)
+ %calltmp1 = call double @bar(double 3.133700e+04)
+ %addtmp = fadd double %calltmp, %calltmp1
+ ret double %addtmp
+ }
+
+ declare double @cos(double)
+
+ define double @1() {
+ entry:
+ %calltmp = call double @cos(double 1.234000e+00)
+ ret double %calltmp
+ }
+
+When you quit the current demo (by sending an EOF via CTRL+D on Linux
+or CTRL+Z and ENTER on Windows), it dumps out the IR for the entire
+module generated. Here you can see the big picture with all the
+functions referencing each other.
+
+This wraps up the third chapter of the Kaleidoscope tutorial. Up next,
+we'll describe how to `add JIT codegen and optimizer
+support <LangImpl04.html>`_ to this so we can actually start running
+code!
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the LLVM code generator. Because this uses the LLVM libraries, we need
+to link them in. To do this, we use the
+`llvm-config <http://llvm.org/cmds/llvm-config.html>`_ tool to inform
+our makefile/command line about which options to use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g -O3 toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core` -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../../examples/Kaleidoscope/Chapter3/toy.cpp
+ :language: c++
+
+`Next: Adding JIT and Optimizer Support <LangImpl04.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl04.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl04.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl04.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl04.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,661 @@
+:orphan:
+
+==============================================
+Kaleidoscope: Adding JIT and Optimizer Support
+==============================================
+
+.. contents::
+ :local:
+
+Chapter 4 Introduction
+======================
+
+Welcome to Chapter 4 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. Chapters 1-3 described the implementation
+of a simple language and added support for generating LLVM IR. This
+chapter describes two new techniques: adding optimizer support to your
+language, and adding JIT compiler support. These additions will
+demonstrate how to get nice, efficient code for the Kaleidoscope
+language.
+
+Trivial Constant Folding
+========================
+
+Our demonstration for Chapter 3 is elegant and easy to extend.
+Unfortunately, it does not produce wonderful code. The IRBuilder,
+however, does give us obvious optimizations when compiling simple code:
+
+::
+
+ ready> def test(x) 1+2+x;
+ Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double 3.000000e+00, %x
+ ret double %addtmp
+ }
+
+This code is not a literal transcription of the AST built by parsing the
+input. That would be:
+
+::
+
+ ready> def test(x) 1+2+x;
+ Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double 2.000000e+00, 1.000000e+00
+ %addtmp1 = fadd double %addtmp, %x
+ ret double %addtmp1
+ }
+
+Constant folding, as seen above, in particular, is a very common and
+very important optimization: so much so that many language implementors
+implement constant folding support in their AST representation.
+
+With LLVM, you don't need this support in the AST. Since all calls to
+build LLVM IR go through the LLVM IR builder, the builder itself checked
+to see if there was a constant folding opportunity when you call it. If
+so, it just does the constant fold and return the constant instead of
+creating an instruction.
+
+Well, that was easy :). In practice, we recommend always using
+``IRBuilder`` when generating code like this. It has no "syntactic
+overhead" for its use (you don't have to uglify your compiler with
+constant checks everywhere) and it can dramatically reduce the amount of
+LLVM IR that is generated in some cases (particular for languages with a
+macro preprocessor or that use a lot of constants).
+
+On the other hand, the ``IRBuilder`` is limited by the fact that it does
+all of its analysis inline with the code as it is built. If you take a
+slightly more complex example:
+
+::
+
+ ready> def test(x) (1+2+x)*(x+(1+2));
+ ready> Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double 3.000000e+00, %x
+ %addtmp1 = fadd double %x, 3.000000e+00
+ %multmp = fmul double %addtmp, %addtmp1
+ ret double %multmp
+ }
+
+In this case, the LHS and RHS of the multiplication are the same value.
+We'd really like to see this generate "``tmp = x+3; result = tmp*tmp;``"
+instead of computing "``x+3``" twice.
+
+Unfortunately, no amount of local analysis will be able to detect and
+correct this. This requires two transformations: reassociation of
+expressions (to make the add's lexically identical) and Common
+Subexpression Elimination (CSE) to delete the redundant add instruction.
+Fortunately, LLVM provides a broad range of optimizations that you can
+use, in the form of "passes".
+
+LLVM Optimization Passes
+========================
+
+.. warning::
+
+ Due to the transition to the new PassManager infrastructure this tutorial
+ is based on ``llvm::legacy::FunctionPassManager`` which can be found in
+ `LegacyPassManager.h <http://llvm.org/doxygen/classllvm_1_1legacy_1_1FunctionPassManager.html>`_.
+ For the purpose of the this tutorial the above should be used until
+ the pass manager transition is complete.
+
+LLVM provides many optimization passes, which do many different sorts of
+things and have different tradeoffs. Unlike other systems, LLVM doesn't
+hold to the mistaken notion that one set of optimizations is right for
+all languages and for all situations. LLVM allows a compiler implementor
+to make complete decisions about what optimizations to use, in which
+order, and in what situation.
+
+As a concrete example, LLVM supports both "whole module" passes, which
+look across as large of body of code as they can (often a whole file,
+but if run at link time, this can be a substantial portion of the whole
+program). It also supports and includes "per-function" passes which just
+operate on a single function at a time, without looking at other
+functions. For more information on passes and how they are run, see the
+`How to Write a Pass <../WritingAnLLVMPass.html>`_ document and the
+`List of LLVM Passes <../Passes.html>`_.
+
+For Kaleidoscope, we are currently generating functions on the fly, one
+at a time, as the user types them in. We aren't shooting for the
+ultimate optimization experience in this setting, but we also want to
+catch the easy and quick stuff where possible. As such, we will choose
+to run a few per-function optimizations as the user types the function
+in. If we wanted to make a "static Kaleidoscope compiler", we would use
+exactly the code we have now, except that we would defer running the
+optimizer until the entire file has been parsed.
+
+In order to get per-function optimizations going, we need to set up a
+`FunctionPassManager <../WritingAnLLVMPass.html#what-passmanager-doesr>`_ to hold
+and organize the LLVM optimizations that we want to run. Once we have
+that, we can add a set of optimizations to run. We'll need a new
+FunctionPassManager for each module that we want to optimize, so we'll
+write a function to create and initialize both the module and pass manager
+for us:
+
+.. code-block:: c++
+
+ void InitializeModuleAndPassManager(void) {
+ // Open a new module.
+ TheModule = llvm::make_unique<Module>("my cool jit", TheContext);
+
+ // Create a new pass manager attached to it.
+ TheFPM = llvm::make_unique<FunctionPassManager>(TheModule.get());
+
+ // Do simple "peephole" optimizations and bit-twiddling optzns.
+ TheFPM->add(createInstructionCombiningPass());
+ // Reassociate expressions.
+ TheFPM->add(createReassociatePass());
+ // Eliminate Common SubExpressions.
+ TheFPM->add(createGVNPass());
+ // Simplify the control flow graph (deleting unreachable blocks, etc).
+ TheFPM->add(createCFGSimplificationPass());
+
+ TheFPM->doInitialization();
+ }
+
+This code initializes the global module ``TheModule``, and the function pass
+manager ``TheFPM``, which is attached to ``TheModule``. Once the pass manager is
+set up, we use a series of "add" calls to add a bunch of LLVM passes.
+
+In this case, we choose to add four optimization passes.
+The passes we choose here are a pretty standard set
+of "cleanup" optimizations that are useful for a wide variety of code. I won't
+delve into what they do but, believe me, they are a good starting place :).
+
+Once the PassManager is set up, we need to make use of it. We do this by
+running it after our newly created function is constructed (in
+``FunctionAST::codegen()``), but before it is returned to the client:
+
+.. code-block:: c++
+
+ if (Value *RetVal = Body->codegen()) {
+ // Finish off the function.
+ Builder.CreateRet(RetVal);
+
+ // Validate the generated code, checking for consistency.
+ verifyFunction(*TheFunction);
+
+ // Optimize the function.
+ TheFPM->run(*TheFunction);
+
+ return TheFunction;
+ }
+
+As you can see, this is pretty straightforward. The
+``FunctionPassManager`` optimizes and updates the LLVM Function\* in
+place, improving (hopefully) its body. With this in place, we can try
+our test above again:
+
+::
+
+ ready> def test(x) (1+2+x)*(x+(1+2));
+ ready> Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double %x, 3.000000e+00
+ %multmp = fmul double %addtmp, %addtmp
+ ret double %multmp
+ }
+
+As expected, we now get our nicely optimized code, saving a floating
+point add instruction from every execution of this function.
+
+LLVM provides a wide variety of optimizations that can be used in
+certain circumstances. Some `documentation about the various
+passes <../Passes.html>`_ is available, but it isn't very complete.
+Another good source of ideas can come from looking at the passes that
+``Clang`` runs to get started. The "``opt``" tool allows you to
+experiment with passes from the command line, so you can see if they do
+anything.
+
+Now that we have reasonable code coming out of our front-end, let's talk
+about executing it!
+
+Adding a JIT Compiler
+=====================
+
+Code that is available in LLVM IR can have a wide variety of tools
+applied to it. For example, you can run optimizations on it (as we did
+above), you can dump it out in textual or binary forms, you can compile
+the code to an assembly file (.s) for some target, or you can JIT
+compile it. The nice thing about the LLVM IR representation is that it
+is the "common currency" between many different parts of the compiler.
+
+In this section, we'll add JIT compiler support to our interpreter. The
+basic idea that we want for Kaleidoscope is to have the user enter
+function bodies as they do now, but immediately evaluate the top-level
+expressions they type in. For example, if they type in "1 + 2;", we
+should evaluate and print out 3. If they define a function, they should
+be able to call it from the command line.
+
+In order to do this, we first prepare the environment to create code for
+the current native target and declare and initialize the JIT. This is
+done by calling some ``InitializeNativeTarget\*`` functions and
+adding a global variable ``TheJIT``, and initializing it in
+``main``:
+
+.. code-block:: c++
+
+ static std::unique_ptr<KaleidoscopeJIT> TheJIT;
+ ...
+ int main() {
+ InitializeNativeTarget();
+ InitializeNativeTargetAsmPrinter();
+ InitializeNativeTargetAsmParser();
+
+ // Install standard binary operators.
+ // 1 is lowest precedence.
+ BinopPrecedence['<'] = 10;
+ BinopPrecedence['+'] = 20;
+ BinopPrecedence['-'] = 20;
+ BinopPrecedence['*'] = 40; // highest.
+
+ // Prime the first token.
+ fprintf(stderr, "ready> ");
+ getNextToken();
+
+ TheJIT = llvm::make_unique<KaleidoscopeJIT>();
+
+ // Run the main "interpreter loop" now.
+ MainLoop();
+
+ return 0;
+ }
+
+We also need to setup the data layout for the JIT:
+
+.. code-block:: c++
+
+ void InitializeModuleAndPassManager(void) {
+ // Open a new module.
+ TheModule = llvm::make_unique<Module>("my cool jit", TheContext);
+ TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout());
+
+ // Create a new pass manager attached to it.
+ TheFPM = llvm::make_unique<FunctionPassManager>(TheModule.get());
+ ...
+
+The KaleidoscopeJIT class is a simple JIT built specifically for these
+tutorials, available inside the LLVM source code
+at llvm-src/examples/Kaleidoscope/include/KaleidoscopeJIT.h.
+In later chapters we will look at how it works and extend it with
+new features, but for now we will take it as given. Its API is very simple:
+``addModule`` adds an LLVM IR module to the JIT, making its functions
+available for execution; ``removeModule`` removes a module, freeing any
+memory associated with the code in that module; and ``findSymbol`` allows us
+to look up pointers to the compiled code.
+
+We can take this simple API and change our code that parses top-level expressions to
+look like this:
+
+.. code-block:: c++
+
+ static void HandleTopLevelExpression() {
+ // Evaluate a top-level expression into an anonymous function.
+ if (auto FnAST = ParseTopLevelExpr()) {
+ if (FnAST->codegen()) {
+
+ // JIT the module containing the anonymous expression, keeping a handle so
+ // we can free it later.
+ auto H = TheJIT->addModule(std::move(TheModule));
+ InitializeModuleAndPassManager();
+
+ // Search the JIT for the __anon_expr symbol.
+ auto ExprSymbol = TheJIT->findSymbol("__anon_expr");
+ assert(ExprSymbol && "Function not found");
+
+ // Get the symbol's address and cast it to the right type (takes no
+ // arguments, returns a double) so we can call it as a native function.
+ double (*FP)() = (double (*)())(intptr_t)ExprSymbol.getAddress();
+ fprintf(stderr, "Evaluated to %f\n", FP());
+
+ // Delete the anonymous expression module from the JIT.
+ TheJIT->removeModule(H);
+ }
+
+If parsing and codegen succeeed, the next step is to add the module containing
+the top-level expression to the JIT. We do this by calling addModule, which
+triggers code generation for all the functions in the module, and returns a
+handle that can be used to remove the module from the JIT later. Once the module
+has been added to the JIT it can no longer be modified, so we also open a new
+module to hold subsequent code by calling ``InitializeModuleAndPassManager()``.
+
+Once we've added the module to the JIT we need to get a pointer to the final
+generated code. We do this by calling the JIT's findSymbol method, and passing
+the name of the top-level expression function: ``__anon_expr``. Since we just
+added this function, we assert that findSymbol returned a result.
+
+Next, we get the in-memory address of the ``__anon_expr`` function by calling
+``getAddress()`` on the symbol. Recall that we compile top-level expressions
+into a self-contained LLVM function that takes no arguments and returns the
+computed double. Because the LLVM JIT compiler matches the native platform ABI,
+this means that you can just cast the result pointer to a function pointer of
+that type and call it directly. This means, there is no difference between JIT
+compiled code and native machine code that is statically linked into your
+application.
+
+Finally, since we don't support re-evaluation of top-level expressions, we
+remove the module from the JIT when we're done to free the associated memory.
+Recall, however, that the module we created a few lines earlier (via
+``InitializeModuleAndPassManager``) is still open and waiting for new code to be
+added.
+
+With just these two changes, let's see how Kaleidoscope works now!
+
+::
+
+ ready> 4+5;
+ Read top-level expression:
+ define double @0() {
+ entry:
+ ret double 9.000000e+00
+ }
+
+ Evaluated to 9.000000
+
+Well this looks like it is basically working. The dump of the function
+shows the "no argument function that always returns double" that we
+synthesize for each top-level expression that is typed in. This
+demonstrates very basic functionality, but can we do more?
+
+::
+
+ ready> def testfunc(x y) x + y*2;
+ Read function definition:
+ define double @testfunc(double %x, double %y) {
+ entry:
+ %multmp = fmul double %y, 2.000000e+00
+ %addtmp = fadd double %multmp, %x
+ ret double %addtmp
+ }
+
+ ready> testfunc(4, 10);
+ Read top-level expression:
+ define double @1() {
+ entry:
+ %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)
+ ret double %calltmp
+ }
+
+ Evaluated to 24.000000
+
+ ready> testfunc(5, 10);
+ ready> LLVM ERROR: Program used external function 'testfunc' which could not be resolved!
+
+
+Function definitions and calls also work, but something went very wrong on that
+last line. The call looks valid, so what happened? As you may have guessed from
+the API a Module is a unit of allocation for the JIT, and testfunc was part
+of the same module that contained anonymous expression. When we removed that
+module from the JIT to free the memory for the anonymous expression, we deleted
+the definition of ``testfunc`` along with it. Then, when we tried to call
+testfunc a second time, the JIT could no longer find it.
+
+The easiest way to fix this is to put the anonymous expression in a separate
+module from the rest of the function definitions. The JIT will happily resolve
+function calls across module boundaries, as long as each of the functions called
+has a prototype, and is added to the JIT before it is called. By putting the
+anonymous expression in a different module we can delete it without affecting
+the rest of the functions.
+
+In fact, we're going to go a step further and put every function in its own
+module. Doing so allows us to exploit a useful property of the KaleidoscopeJIT
+that will make our environment more REPL-like: Functions can be added to the
+JIT more than once (unlike a module where every function must have a unique
+definition). When you look up a symbol in KaleidoscopeJIT it will always return
+the most recent definition:
+
+::
+
+ ready> def foo(x) x + 1;
+ Read function definition:
+ define double @foo(double %x) {
+ entry:
+ %addtmp = fadd double %x, 1.000000e+00
+ ret double %addtmp
+ }
+
+ ready> foo(2);
+ Evaluated to 3.000000
+
+ ready> def foo(x) x + 2;
+ define double @foo(double %x) {
+ entry:
+ %addtmp = fadd double %x, 2.000000e+00
+ ret double %addtmp
+ }
+
+ ready> foo(2);
+ Evaluated to 4.000000
+
+
+To allow each function to live in its own module we'll need a way to
+re-generate previous function declarations into each new module we open:
+
+.. code-block:: c++
+
+ static std::unique_ptr<KaleidoscopeJIT> TheJIT;
+
+ ...
+
+ Function *getFunction(std::string Name) {
+ // First, see if the function has already been added to the current module.
+ if (auto *F = TheModule->getFunction(Name))
+ return F;
+
+ // If not, check whether we can codegen the declaration from some existing
+ // prototype.
+ auto FI = FunctionProtos.find(Name);
+ if (FI != FunctionProtos.end())
+ return FI->second->codegen();
+
+ // If no existing prototype exists, return null.
+ return nullptr;
+ }
+
+ ...
+
+ Value *CallExprAST::codegen() {
+ // Look up the name in the global module table.
+ Function *CalleeF = getFunction(Callee);
+
+ ...
+
+ Function *FunctionAST::codegen() {
+ // Transfer ownership of the prototype to the FunctionProtos map, but keep a
+ // reference to it for use below.
+ auto &P = *Proto;
+ FunctionProtos[Proto->getName()] = std::move(Proto);
+ Function *TheFunction = getFunction(P.getName());
+ if (!TheFunction)
+ return nullptr;
+
+
+To enable this, we'll start by adding a new global, ``FunctionProtos``, that
+holds the most recent prototype for each function. We'll also add a convenience
+method, ``getFunction()``, to replace calls to ``TheModule->getFunction()``.
+Our convenience method searches ``TheModule`` for an existing function
+declaration, falling back to generating a new declaration from FunctionProtos if
+it doesn't find one. In ``CallExprAST::codegen()`` we just need to replace the
+call to ``TheModule->getFunction()``. In ``FunctionAST::codegen()`` we need to
+update the FunctionProtos map first, then call ``getFunction()``. With this
+done, we can always obtain a function declaration in the current module for any
+previously declared function.
+
+We also need to update HandleDefinition and HandleExtern:
+
+.. code-block:: c++
+
+ static void HandleDefinition() {
+ if (auto FnAST = ParseDefinition()) {
+ if (auto *FnIR = FnAST->codegen()) {
+ fprintf(stderr, "Read function definition:");
+ FnIR->print(errs());
+ fprintf(stderr, "\n");
+ TheJIT->addModule(std::move(TheModule));
+ InitializeModuleAndPassManager();
+ }
+ } else {
+ // Skip token for error recovery.
+ getNextToken();
+ }
+ }
+
+ static void HandleExtern() {
+ if (auto ProtoAST = ParseExtern()) {
+ if (auto *FnIR = ProtoAST->codegen()) {
+ fprintf(stderr, "Read extern: ");
+ FnIR->print(errs());
+ fprintf(stderr, "\n");
+ FunctionProtos[ProtoAST->getName()] = std::move(ProtoAST);
+ }
+ } else {
+ // Skip token for error recovery.
+ getNextToken();
+ }
+ }
+
+In HandleDefinition, we add two lines to transfer the newly defined function to
+the JIT and open a new module. In HandleExtern, we just need to add one line to
+add the prototype to FunctionProtos.
+
+With these changes made, let's try our REPL again (I removed the dump of the
+anonymous functions this time, you should get the idea by now :) :
+
+::
+
+ ready> def foo(x) x + 1;
+ ready> foo(2);
+ Evaluated to 3.000000
+
+ ready> def foo(x) x + 2;
+ ready> foo(2);
+ Evaluated to 4.000000
+
+It works!
+
+Even with this simple code, we get some surprisingly powerful capabilities -
+check this out:
+
+::
+
+ ready> extern sin(x);
+ Read extern:
+ declare double @sin(double)
+
+ ready> extern cos(x);
+ Read extern:
+ declare double @cos(double)
+
+ ready> sin(1.0);
+ Read top-level expression:
+ define double @2() {
+ entry:
+ ret double 0x3FEAED548F090CEE
+ }
+
+ Evaluated to 0.841471
+
+ ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x);
+ Read function definition:
+ define double @foo(double %x) {
+ entry:
+ %calltmp = call double @sin(double %x)
+ %multmp = fmul double %calltmp, %calltmp
+ %calltmp2 = call double @cos(double %x)
+ %multmp4 = fmul double %calltmp2, %calltmp2
+ %addtmp = fadd double %multmp, %multmp4
+ ret double %addtmp
+ }
+
+ ready> foo(4.0);
+ Read top-level expression:
+ define double @3() {
+ entry:
+ %calltmp = call double @foo(double 4.000000e+00)
+ ret double %calltmp
+ }
+
+ Evaluated to 1.000000
+
+Whoa, how does the JIT know about sin and cos? The answer is surprisingly
+simple: The KaleidoscopeJIT has a straightforward symbol resolution rule that
+it uses to find symbols that aren't available in any given module: First
+it searches all the modules that have already been added to the JIT, from the
+most recent to the oldest, to find the newest definition. If no definition is
+found inside the JIT, it falls back to calling "``dlsym("sin")``" on the
+Kaleidoscope process itself. Since "``sin``" is defined within the JIT's
+address space, it simply patches up calls in the module to call the libm
+version of ``sin`` directly. But in some cases this even goes further:
+as sin and cos are names of standard math functions, the constant folder
+will directly evaluate the function calls to the correct result when called
+with constants like in the "``sin(1.0)``" above.
+
+In the future we'll see how tweaking this symbol resolution rule can be used to
+enable all sorts of useful features, from security (restricting the set of
+symbols available to JIT'd code), to dynamic code generation based on symbol
+names, and even lazy compilation.
+
+One immediate benefit of the symbol resolution rule is that we can now extend
+the language by writing arbitrary C++ code to implement operations. For example,
+if we add:
+
+.. code-block:: c++
+
+ #ifdef _WIN32
+ #define DLLEXPORT __declspec(dllexport)
+ #else
+ #define DLLEXPORT
+ #endif
+
+ /// putchard - putchar that takes a double and returns 0.
+ extern "C" DLLEXPORT double putchard(double X) {
+ fputc((char)X, stderr);
+ return 0;
+ }
+
+Note, that for Windows we need to actually export the functions because
+the dynamic symbol loader will use GetProcAddress to find the symbols.
+
+Now we can produce simple output to the console by using things like:
+"``extern putchard(x); putchard(120);``", which prints a lowercase 'x'
+on the console (120 is the ASCII code for 'x'). Similar code could be
+used to implement file I/O, console input, and many other capabilities
+in Kaleidoscope.
+
+This completes the JIT and optimizer chapter of the Kaleidoscope
+tutorial. At this point, we can compile a non-Turing-complete
+programming language, optimize and JIT compile it in a user-driven way.
+Next up we'll look into `extending the language with control flow
+constructs <LangImpl05.html>`_, tackling some interesting LLVM IR issues
+along the way.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the LLVM JIT and optimizer. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+If you are compiling this on Linux, make sure to add the "-rdynamic"
+option as well. This makes sure that the external functions are resolved
+properly at runtime.
+
+Here is the code:
+
+.. literalinclude:: ../../../examples/Kaleidoscope/Chapter4/toy.cpp
+ :language: c++
+
+`Next: Extending the language: control flow <LangImpl05.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl05.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl05.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl05.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl05.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,816 @@
+:orphan:
+
+==================================================
+Kaleidoscope: Extending the Language: Control Flow
+==================================================
+
+.. contents::
+ :local:
+
+Chapter 5 Introduction
+======================
+
+Welcome to Chapter 5 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. Parts 1-4 described the implementation of
+the simple Kaleidoscope language and included support for generating
+LLVM IR, followed by optimizations and a JIT compiler. Unfortunately, as
+presented, Kaleidoscope is mostly useless: it has no control flow other
+than call and return. This means that you can't have conditional
+branches in the code, significantly limiting its power. In this episode
+of "build that compiler", we'll extend Kaleidoscope to have an
+if/then/else expression plus a simple 'for' loop.
+
+If/Then/Else
+============
+
+Extending Kaleidoscope to support if/then/else is quite straightforward.
+It basically requires adding support for this "new" concept to the
+lexer, parser, AST, and LLVM code emitter. This example is nice, because
+it shows how easy it is to "grow" a language over time, incrementally
+extending it as new ideas are discovered.
+
+Before we get going on "how" we add this extension, let's talk about
+"what" we want. The basic idea is that we want to be able to write this
+sort of thing:
+
+::
+
+ def fib(x)
+ if x < 3 then
+ 1
+ else
+ fib(x-1)+fib(x-2);
+
+In Kaleidoscope, every construct is an expression: there are no
+statements. As such, the if/then/else expression needs to return a value
+like any other. Since we're using a mostly functional form, we'll have
+it evaluate its conditional, then return the 'then' or 'else' value
+based on how the condition was resolved. This is very similar to the C
+"?:" expression.
+
+The semantics of the if/then/else expression is that it evaluates the
+condition to a boolean equality value: 0.0 is considered to be false and
+everything else is considered to be true. If the condition is true, the
+first subexpression is evaluated and returned, if the condition is
+false, the second subexpression is evaluated and returned. Since
+Kaleidoscope allows side-effects, this behavior is important to nail
+down.
+
+Now that we know what we "want", let's break this down into its
+constituent pieces.
+
+Lexer Extensions for If/Then/Else
+---------------------------------
+
+The lexer extensions are straightforward. First we add new enum values
+for the relevant tokens:
+
+.. code-block:: c++
+
+ // control
+ tok_if = -6,
+ tok_then = -7,
+ tok_else = -8,
+
+Once we have that, we recognize the new keywords in the lexer. This is
+pretty simple stuff:
+
+.. code-block:: c++
+
+ ...
+ if (IdentifierStr == "def")
+ return tok_def;
+ if (IdentifierStr == "extern")
+ return tok_extern;
+ if (IdentifierStr == "if")
+ return tok_if;
+ if (IdentifierStr == "then")
+ return tok_then;
+ if (IdentifierStr == "else")
+ return tok_else;
+ return tok_identifier;
+
+AST Extensions for If/Then/Else
+-------------------------------
+
+To represent the new expression we add a new AST node for it:
+
+.. code-block:: c++
+
+ /// IfExprAST - Expression class for if/then/else.
+ class IfExprAST : public ExprAST {
+ std::unique_ptr<ExprAST> Cond, Then, Else;
+
+ public:
+ IfExprAST(std::unique_ptr<ExprAST> Cond, std::unique_ptr<ExprAST> Then,
+ std::unique_ptr<ExprAST> Else)
+ : Cond(std::move(Cond)), Then(std::move(Then)), Else(std::move(Else)) {}
+
+ Value *codegen() override;
+ };
+
+The AST node just has pointers to the various subexpressions.
+
+Parser Extensions for If/Then/Else
+----------------------------------
+
+Now that we have the relevant tokens coming from the lexer and we have
+the AST node to build, our parsing logic is relatively straightforward.
+First we define a new parsing function:
+
+.. code-block:: c++
+
+ /// ifexpr ::= 'if' expression 'then' expression 'else' expression
+ static std::unique_ptr<ExprAST> ParseIfExpr() {
+ getNextToken(); // eat the if.
+
+ // condition.
+ auto Cond = ParseExpression();
+ if (!Cond)
+ return nullptr;
+
+ if (CurTok != tok_then)
+ return LogError("expected then");
+ getNextToken(); // eat the then
+
+ auto Then = ParseExpression();
+ if (!Then)
+ return nullptr;
+
+ if (CurTok != tok_else)
+ return LogError("expected else");
+
+ getNextToken();
+
+ auto Else = ParseExpression();
+ if (!Else)
+ return nullptr;
+
+ return llvm::make_unique<IfExprAST>(std::move(Cond), std::move(Then),
+ std::move(Else));
+ }
+
+Next we hook it up as a primary expression:
+
+.. code-block:: c++
+
+ static std::unique_ptr<ExprAST> ParsePrimary() {
+ switch (CurTok) {
+ default:
+ return LogError("unknown token when expecting an expression");
+ case tok_identifier:
+ return ParseIdentifierExpr();
+ case tok_number:
+ return ParseNumberExpr();
+ case '(':
+ return ParseParenExpr();
+ case tok_if:
+ return ParseIfExpr();
+ }
+ }
+
+LLVM IR for If/Then/Else
+------------------------
+
+Now that we have it parsing and building the AST, the final piece is
+adding LLVM code generation support. This is the most interesting part
+of the if/then/else example, because this is where it starts to
+introduce new concepts. All of the code above has been thoroughly
+described in previous chapters.
+
+To motivate the code we want to produce, let's take a look at a simple
+example. Consider:
+
+::
+
+ extern foo();
+ extern bar();
+ def baz(x) if x then foo() else bar();
+
+If you disable optimizations, the code you'll (soon) get from
+Kaleidoscope looks like this:
+
+.. code-block:: llvm
+
+ declare double @foo()
+
+ declare double @bar()
+
+ define double @baz(double %x) {
+ entry:
+ %ifcond = fcmp one double %x, 0.000000e+00
+ br i1 %ifcond, label %then, label %else
+
+ then: ; preds = %entry
+ %calltmp = call double @foo()
+ br label %ifcont
+
+ else: ; preds = %entry
+ %calltmp1 = call double @bar()
+ br label %ifcont
+
+ ifcont: ; preds = %else, %then
+ %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
+ ret double %iftmp
+ }
+
+To visualize the control flow graph, you can use a nifty feature of the
+LLVM '`opt <http://llvm.org/cmds/opt.html>`_' tool. If you put this LLVM
+IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
+window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll
+see this graph:
+
+.. figure:: LangImpl05-cfg.png
+ :align: center
+ :alt: Example CFG
+
+ Example CFG
+
+Another way to get this is to call "``F->viewCFG()``" or
+"``F->viewCFGOnly()``" (where F is a "``Function*``") either by
+inserting actual calls into the code and recompiling or by calling these
+in the debugger. LLVM has many nice features for visualizing various
+graphs.
+
+Getting back to the generated code, it is fairly simple: the entry block
+evaluates the conditional expression ("x" in our case here) and compares
+the result to 0.0 with the "``fcmp one``" instruction ('one' is "Ordered
+and Not Equal"). Based on the result of this expression, the code jumps
+to either the "then" or "else" blocks, which contain the expressions for
+the true/false cases.
+
+Once the then/else blocks are finished executing, they both branch back
+to the 'ifcont' block to execute the code that happens after the
+if/then/else. In this case the only thing left to do is to return to the
+caller of the function. The question then becomes: how does the code
+know which expression to return?
+
+The answer to this question involves an important SSA operation: the
+`Phi
+operation <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
+If you're not familiar with SSA, `the wikipedia
+article <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+is a good introduction and there are various other introductions to it
+available on your favorite search engine. The short version is that
+"execution" of the Phi operation requires "remembering" which block
+control came from. The Phi operation takes on the value corresponding to
+the input control block. In this case, if control comes in from the
+"then" block, it gets the value of "calltmp". If control comes from the
+"else" block, it gets the value of "calltmp1".
+
+At this point, you are probably starting to think "Oh no! This means my
+simple and elegant front-end will have to start generating SSA form in
+order to use LLVM!". Fortunately, this is not the case, and we strongly
+advise *not* implementing an SSA construction algorithm in your
+front-end unless there is an amazingly good reason to do so. In
+practice, there are two sorts of values that float around in code
+written for your average imperative programming language that might need
+Phi nodes:
+
+#. Code that involves user variables: ``x = 1; x = x + 1;``
+#. Values that are implicit in the structure of your AST, such as the
+ Phi node in this case.
+
+In `Chapter 7 <LangImpl07.html>`_ of this tutorial ("mutable variables"),
+we'll talk about #1 in depth. For now, just believe me that you don't
+need SSA construction to handle this case. For #2, you have the choice
+of using the techniques that we will describe for #1, or you can insert
+Phi nodes directly, if convenient. In this case, it is really
+easy to generate the Phi node, so we choose to do it directly.
+
+Okay, enough of the motivation and overview, let's generate code!
+
+Code Generation for If/Then/Else
+--------------------------------
+
+In order to generate code for this, we implement the ``codegen`` method
+for ``IfExprAST``:
+
+.. code-block:: c++
+
+ Value *IfExprAST::codegen() {
+ Value *CondV = Cond->codegen();
+ if (!CondV)
+ return nullptr;
+
+ // Convert condition to a bool by comparing non-equal to 0.0.
+ CondV = Builder.CreateFCmpONE(
+ CondV, ConstantFP::get(TheContext, APFloat(0.0)), "ifcond");
+
+This code is straightforward and similar to what we saw before. We emit
+the expression for the condition, then compare that value to zero to get
+a truth value as a 1-bit (bool) value.
+
+.. code-block:: c++
+
+ Function *TheFunction = Builder.GetInsertBlock()->getParent();
+
+ // Create blocks for the then and else cases. Insert the 'then' block at the
+ // end of the function.
+ BasicBlock *ThenBB =
+ BasicBlock::Create(TheContext, "then", TheFunction);
+ BasicBlock *ElseBB = BasicBlock::Create(TheContext, "else");
+ BasicBlock *MergeBB = BasicBlock::Create(TheContext, "ifcont");
+
+ Builder.CreateCondBr(CondV, ThenBB, ElseBB);
+
+This code creates the basic blocks that are related to the if/then/else
+statement, and correspond directly to the blocks in the example above.
+The first line gets the current Function object that is being built. It
+gets this by asking the builder for the current BasicBlock, and asking
+that block for its "parent" (the function it is currently embedded
+into).
+
+Once it has that, it creates three blocks. Note that it passes
+"TheFunction" into the constructor for the "then" block. This causes the
+constructor to automatically insert the new block into the end of the
+specified function. The other two blocks are created, but aren't yet
+inserted into the function.
+
+Once the blocks are created, we can emit the conditional branch that
+chooses between them. Note that creating new blocks does not implicitly
+affect the IRBuilder, so it is still inserting into the block that the
+condition went into. Also note that it is creating a branch to the
+"then" block and the "else" block, even though the "else" block isn't
+inserted into the function yet. This is all ok: it is the standard way
+that LLVM supports forward references.
+
+.. code-block:: c++
+
+ // Emit then value.
+ Builder.SetInsertPoint(ThenBB);
+
+ Value *ThenV = Then->codegen();
+ if (!ThenV)
+ return nullptr;
+
+ Builder.CreateBr(MergeBB);
+ // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
+ ThenBB = Builder.GetInsertBlock();
+
+After the conditional branch is inserted, we move the builder to start
+inserting into the "then" block. Strictly speaking, this call moves the
+insertion point to be at the end of the specified block. However, since
+the "then" block is empty, it also starts out by inserting at the
+beginning of the block. :)
+
+Once the insertion point is set, we recursively codegen the "then"
+expression from the AST. To finish off the "then" block, we create an
+unconditional branch to the merge block. One interesting (and very
+important) aspect of the LLVM IR is that it `requires all basic blocks
+to be "terminated" <../LangRef.html#functionstructure>`_ with a `control
+flow instruction <../LangRef.html#terminators>`_ such as return or
+branch. This means that all control flow, *including fall throughs* must
+be made explicit in the LLVM IR. If you violate this rule, the verifier
+will emit an error.
+
+The final line here is quite subtle, but is very important. The basic
+issue is that when we create the Phi node in the merge block, we need to
+set up the block/value pairs that indicate how the Phi will work.
+Importantly, the Phi node expects to have an entry for each predecessor
+of the block in the CFG. Why then, are we getting the current block when
+we just set it to ThenBB 5 lines above? The problem is that the "Then"
+expression may actually itself change the block that the Builder is
+emitting into if, for example, it contains a nested "if/then/else"
+expression. Because calling ``codegen()`` recursively could arbitrarily change
+the notion of the current block, we are required to get an up-to-date
+value for code that will set up the Phi node.
+
+.. code-block:: c++
+
+ // Emit else block.
+ TheFunction->getBasicBlockList().push_back(ElseBB);
+ Builder.SetInsertPoint(ElseBB);
+
+ Value *ElseV = Else->codegen();
+ if (!ElseV)
+ return nullptr;
+
+ Builder.CreateBr(MergeBB);
+ // codegen of 'Else' can change the current block, update ElseBB for the PHI.
+ ElseBB = Builder.GetInsertBlock();
+
+Code generation for the 'else' block is basically identical to codegen
+for the 'then' block. The only significant difference is the first line,
+which adds the 'else' block to the function. Recall previously that the
+'else' block was created, but not added to the function. Now that the
+'then' and 'else' blocks are emitted, we can finish up with the merge
+code:
+
+.. code-block:: c++
+
+ // Emit merge block.
+ TheFunction->getBasicBlockList().push_back(MergeBB);
+ Builder.SetInsertPoint(MergeBB);
+ PHINode *PN =
+ Builder.CreatePHI(Type::getDoubleTy(TheContext), 2, "iftmp");
+
+ PN->addIncoming(ThenV, ThenBB);
+ PN->addIncoming(ElseV, ElseBB);
+ return PN;
+ }
+
+The first two lines here are now familiar: the first adds the "merge"
+block to the Function object (it was previously floating, like the else
+block above). The second changes the insertion point so that newly
+created code will go into the "merge" block. Once that is done, we need
+to create the PHI node and set up the block/value pairs for the PHI.
+
+Finally, the CodeGen function returns the phi node as the value computed
+by the if/then/else expression. In our example above, this returned
+value will feed into the code for the top-level function, which will
+create the return instruction.
+
+Overall, we now have the ability to execute conditional code in
+Kaleidoscope. With this extension, Kaleidoscope is a fairly complete
+language that can calculate a wide variety of numeric functions. Next up
+we'll add another useful expression that is familiar from non-functional
+languages...
+
+'for' Loop Expression
+=====================
+
+Now that we know how to add basic control flow constructs to the
+language, we have the tools to add more powerful things. Let's add
+something more aggressive, a 'for' expression:
+
+::
+
+ extern putchard(char);
+ def printstar(n)
+ for i = 1, i < n, 1.0 in
+ putchard(42); # ascii 42 = '*'
+
+ # print 100 '*' characters
+ printstar(100);
+
+This expression defines a new variable ("i" in this case) which iterates
+from a starting value, while the condition ("i < n" in this case) is
+true, incrementing by an optional step value ("1.0" in this case). If
+the step value is omitted, it defaults to 1.0. While the loop is true,
+it executes its body expression. Because we don't have anything better
+to return, we'll just define the loop as always returning 0.0. In the
+future when we have mutable variables, it will get more useful.
+
+As before, let's talk about the changes that we need to Kaleidoscope to
+support this.
+
+Lexer Extensions for the 'for' Loop
+-----------------------------------
+
+The lexer extensions are the same sort of thing as for if/then/else:
+
+.. code-block:: c++
+
+ ... in enum Token ...
+ // control
+ tok_if = -6, tok_then = -7, tok_else = -8,
+ tok_for = -9, tok_in = -10
+
+ ... in gettok ...
+ if (IdentifierStr == "def")
+ return tok_def;
+ if (IdentifierStr == "extern")
+ return tok_extern;
+ if (IdentifierStr == "if")
+ return tok_if;
+ if (IdentifierStr == "then")
+ return tok_then;
+ if (IdentifierStr == "else")
+ return tok_else;
+ if (IdentifierStr == "for")
+ return tok_for;
+ if (IdentifierStr == "in")
+ return tok_in;
+ return tok_identifier;
+
+AST Extensions for the 'for' Loop
+---------------------------------
+
+The AST node is just as simple. It basically boils down to capturing the
+variable name and the constituent expressions in the node.
+
+.. code-block:: c++
+
+ /// ForExprAST - Expression class for for/in.
+ class ForExprAST : public ExprAST {
+ std::string VarName;
+ std::unique_ptr<ExprAST> Start, End, Step, Body;
+
+ public:
+ ForExprAST(const std::string &VarName, std::unique_ptr<ExprAST> Start,
+ std::unique_ptr<ExprAST> End, std::unique_ptr<ExprAST> Step,
+ std::unique_ptr<ExprAST> Body)
+ : VarName(VarName), Start(std::move(Start)), End(std::move(End)),
+ Step(std::move(Step)), Body(std::move(Body)) {}
+
+ Value *codegen() override;
+ };
+
+Parser Extensions for the 'for' Loop
+------------------------------------
+
+The parser code is also fairly standard. The only interesting thing here
+is handling of the optional step value. The parser code handles it by
+checking to see if the second comma is present. If not, it sets the step
+value to null in the AST node:
+
+.. code-block:: c++
+
+ /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
+ static std::unique_ptr<ExprAST> ParseForExpr() {
+ getNextToken(); // eat the for.
+
+ if (CurTok != tok_identifier)
+ return LogError("expected identifier after for");
+
+ std::string IdName = IdentifierStr;
+ getNextToken(); // eat identifier.
+
+ if (CurTok != '=')
+ return LogError("expected '=' after for");
+ getNextToken(); // eat '='.
+
+
+ auto Start = ParseExpression();
+ if (!Start)
+ return nullptr;
+ if (CurTok != ',')
+ return LogError("expected ',' after for start value");
+ getNextToken();
+
+ auto End = ParseExpression();
+ if (!End)
+ return nullptr;
+
+ // The step value is optional.
+ std::unique_ptr<ExprAST> Step;
+ if (CurTok == ',') {
+ getNextToken();
+ Step = ParseExpression();
+ if (!Step)
+ return nullptr;
+ }
+
+ if (CurTok != tok_in)
+ return LogError("expected 'in' after for");
+ getNextToken(); // eat 'in'.
+
+ auto Body = ParseExpression();
+ if (!Body)
+ return nullptr;
+
+ return llvm::make_unique<ForExprAST>(IdName, std::move(Start),
+ std::move(End), std::move(Step),
+ std::move(Body));
+ }
+
+And again we hook it up as a primary expression:
+
+.. code-block:: c++
+
+ static std::unique_ptr<ExprAST> ParsePrimary() {
+ switch (CurTok) {
+ default:
+ return LogError("unknown token when expecting an expression");
+ case tok_identifier:
+ return ParseIdentifierExpr();
+ case tok_number:
+ return ParseNumberExpr();
+ case '(':
+ return ParseParenExpr();
+ case tok_if:
+ return ParseIfExpr();
+ case tok_for:
+ return ParseForExpr();
+ }
+ }
+
+LLVM IR for the 'for' Loop
+--------------------------
+
+Now we get to the good part: the LLVM IR we want to generate for this
+thing. With the simple example above, we get this LLVM IR (note that
+this dump is generated with optimizations disabled for clarity):
+
+.. code-block:: llvm
+
+ declare double @putchard(double)
+
+ define double @printstar(double %n) {
+ entry:
+ ; initial value = 1.0 (inlined into phi)
+ br label %loop
+
+ loop: ; preds = %loop, %entry
+ %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
+ ; body
+ %calltmp = call double @putchard(double 4.200000e+01)
+ ; increment
+ %nextvar = fadd double %i, 1.000000e+00
+
+ ; termination test
+ %cmptmp = fcmp ult double %i, %n
+ %booltmp = uitofp i1 %cmptmp to double
+ %loopcond = fcmp one double %booltmp, 0.000000e+00
+ br i1 %loopcond, label %loop, label %afterloop
+
+ afterloop: ; preds = %loop
+ ; loop always returns 0.0
+ ret double 0.000000e+00
+ }
+
+This loop contains all the same constructs we saw before: a phi node,
+several expressions, and some basic blocks. Let's see how this fits
+together.
+
+Code Generation for the 'for' Loop
+----------------------------------
+
+The first part of codegen is very simple: we just output the start
+expression for the loop value:
+
+.. code-block:: c++
+
+ Value *ForExprAST::codegen() {
+ // Emit the start code first, without 'variable' in scope.
+ Value *StartVal = Start->codegen();
+ if (!StartVal)
+ return nullptr;
+
+With this out of the way, the next step is to set up the LLVM basic
+block for the start of the loop body. In the case above, the whole loop
+body is one block, but remember that the body code itself could consist
+of multiple blocks (e.g. if it contains an if/then/else or a for/in
+expression).
+
+.. code-block:: c++
+
+ // Make the new basic block for the loop header, inserting after current
+ // block.
+ Function *TheFunction = Builder.GetInsertBlock()->getParent();
+ BasicBlock *PreheaderBB = Builder.GetInsertBlock();
+ BasicBlock *LoopBB =
+ BasicBlock::Create(TheContext, "loop", TheFunction);
+
+ // Insert an explicit fall through from the current block to the LoopBB.
+ Builder.CreateBr(LoopBB);
+
+This code is similar to what we saw for if/then/else. Because we will
+need it to create the Phi node, we remember the block that falls through
+into the loop. Once we have that, we create the actual block that starts
+the loop and create an unconditional branch for the fall-through between
+the two blocks.
+
+.. code-block:: c++
+
+ // Start insertion in LoopBB.
+ Builder.SetInsertPoint(LoopBB);
+
+ // Start the PHI node with an entry for Start.
+ PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(TheContext),
+ 2, VarName.c_str());
+ Variable->addIncoming(StartVal, PreheaderBB);
+
+Now that the "preheader" for the loop is set up, we switch to emitting
+code for the loop body. To begin with, we move the insertion point and
+create the PHI node for the loop induction variable. Since we already
+know the incoming value for the starting value, we add it to the Phi
+node. Note that the Phi will eventually get a second value for the
+backedge, but we can't set it up yet (because it doesn't exist!).
+
+.. code-block:: c++
+
+ // Within the loop, the variable is defined equal to the PHI node. If it
+ // shadows an existing variable, we have to restore it, so save it now.
+ Value *OldVal = NamedValues[VarName];
+ NamedValues[VarName] = Variable;
+
+ // Emit the body of the loop. This, like any other expr, can change the
+ // current BB. Note that we ignore the value computed by the body, but don't
+ // allow an error.
+ if (!Body->codegen())
+ return nullptr;
+
+Now the code starts to get more interesting. Our 'for' loop introduces a
+new variable to the symbol table. This means that our symbol table can
+now contain either function arguments or loop variables. To handle this,
+before we codegen the body of the loop, we add the loop variable as the
+current value for its name. Note that it is possible that there is a
+variable of the same name in the outer scope. It would be easy to make
+this an error (emit an error and return null if there is already an
+entry for VarName) but we choose to allow shadowing of variables. In
+order to handle this correctly, we remember the Value that we are
+potentially shadowing in ``OldVal`` (which will be null if there is no
+shadowed variable).
+
+Once the loop variable is set into the symbol table, the code
+recursively codegen's the body. This allows the body to use the loop
+variable: any references to it will naturally find it in the symbol
+table.
+
+.. code-block:: c++
+
+ // Emit the step value.
+ Value *StepVal = nullptr;
+ if (Step) {
+ StepVal = Step->codegen();
+ if (!StepVal)
+ return nullptr;
+ } else {
+ // If not specified, use 1.0.
+ StepVal = ConstantFP::get(TheContext, APFloat(1.0));
+ }
+
+ Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
+
+Now that the body is emitted, we compute the next value of the iteration
+variable by adding the step value, or 1.0 if it isn't present.
+'``NextVar``' will be the value of the loop variable on the next
+iteration of the loop.
+
+.. code-block:: c++
+
+ // Compute the end condition.
+ Value *EndCond = End->codegen();
+ if (!EndCond)
+ return nullptr;
+
+ // Convert condition to a bool by comparing non-equal to 0.0.
+ EndCond = Builder.CreateFCmpONE(
+ EndCond, ConstantFP::get(TheContext, APFloat(0.0)), "loopcond");
+
+Finally, we evaluate the exit value of the loop, to determine whether
+the loop should exit. This mirrors the condition evaluation for the
+if/then/else statement.
+
+.. code-block:: c++
+
+ // Create the "after loop" block and insert it.
+ BasicBlock *LoopEndBB = Builder.GetInsertBlock();
+ BasicBlock *AfterBB =
+ BasicBlock::Create(TheContext, "afterloop", TheFunction);
+
+ // Insert the conditional branch into the end of LoopEndBB.
+ Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
+
+ // Any new code will be inserted in AfterBB.
+ Builder.SetInsertPoint(AfterBB);
+
+With the code for the body of the loop complete, we just need to finish
+up the control flow for it. This code remembers the end block (for the
+phi node), then creates the block for the loop exit ("afterloop"). Based
+on the value of the exit condition, it creates a conditional branch that
+chooses between executing the loop again and exiting the loop. Any
+future code is emitted in the "afterloop" block, so it sets the
+insertion position to it.
+
+.. code-block:: c++
+
+ // Add a new entry to the PHI node for the backedge.
+ Variable->addIncoming(NextVar, LoopEndBB);
+
+ // Restore the unshadowed variable.
+ if (OldVal)
+ NamedValues[VarName] = OldVal;
+ else
+ NamedValues.erase(VarName);
+
+ // for expr always returns 0.0.
+ return Constant::getNullValue(Type::getDoubleTy(TheContext));
+ }
+
+The final code handles various cleanups: now that we have the "NextVar"
+value, we can add the incoming value to the loop PHI node. After that,
+we remove the loop variable from the symbol table, so that it isn't in
+scope after the for loop. Finally, code generation of the for loop
+always returns 0.0, so that is what we return from
+``ForExprAST::codegen()``.
+
+With this, we conclude the "adding control flow to Kaleidoscope" chapter
+of the tutorial. In this chapter we added two control flow constructs,
+and used them to motivate a couple of aspects of the LLVM IR that are
+important for front-end implementors to know. In the next chapter of our
+saga, we will get a bit crazier and add `user-defined
+operators <LangImpl06.html>`_ to our poor innocent language.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the if/then/else and for expressions. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../../examples/Kaleidoscope/Chapter5/toy.cpp
+ :language: c++
+
+`Next: Extending the language: user-defined operators <LangImpl06.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl06.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl06.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl06.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl06.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,770 @@
+:orphan:
+
+============================================================
+Kaleidoscope: Extending the Language: User-defined Operators
+============================================================
+
+.. contents::
+ :local:
+
+Chapter 6 Introduction
+======================
+
+Welcome to Chapter 6 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. At this point in our tutorial, we now
+have a fully functional language that is fairly minimal, but also
+useful. There is still one big problem with it, however. Our language
+doesn't have many useful operators (like division, logical negation, or
+even any comparisons besides less-than).
+
+This chapter of the tutorial takes a wild digression into adding
+user-defined operators to the simple and beautiful Kaleidoscope
+language. This digression now gives us a simple and ugly language in
+some ways, but also a powerful one at the same time. One of the great
+things about creating your own language is that you get to decide what
+is good or bad. In this tutorial we'll assume that it is okay to use
+this as a way to show some interesting parsing techniques.
+
+At the end of this tutorial, we'll run through an example Kaleidoscope
+application that `renders the Mandelbrot set <#kicking-the-tires>`_. This gives an
+example of what you can build with Kaleidoscope and its feature set.
+
+User-defined Operators: the Idea
+================================
+
+The "operator overloading" that we will add to Kaleidoscope is more
+general than in languages like C++. In C++, you are only allowed to
+redefine existing operators: you can't programmatically change the
+grammar, introduce new operators, change precedence levels, etc. In this
+chapter, we will add this capability to Kaleidoscope, which will let the
+user round out the set of operators that are supported.
+
+The point of going into user-defined operators in a tutorial like this
+is to show the power and flexibility of using a hand-written parser.
+Thus far, the parser we have been implementing uses recursive descent
+for most parts of the grammar and operator precedence parsing for the
+expressions. See `Chapter 2 <LangImpl02.html>`_ for details. By
+using operator precedence parsing, it is very easy to allow
+the programmer to introduce new operators into the grammar: the grammar
+is dynamically extensible as the JIT runs.
+
+The two specific features we'll add are programmable unary operators
+(right now, Kaleidoscope has no unary operators at all) as well as
+binary operators. An example of this is:
+
+::
+
+ # Logical unary not.
+ def unary!(v)
+ if v then
+ 0
+ else
+ 1;
+
+ # Define > with the same precedence as <.
+ def binary> 10 (LHS RHS)
+ RHS < LHS;
+
+ # Binary "logical or", (note that it does not "short circuit")
+ def binary| 5 (LHS RHS)
+ if LHS then
+ 1
+ else if RHS then
+ 1
+ else
+ 0;
+
+ # Define = with slightly lower precedence than relationals.
+ def binary= 9 (LHS RHS)
+ !(LHS < RHS | LHS > RHS);
+
+Many languages aspire to being able to implement their standard runtime
+library in the language itself. In Kaleidoscope, we can implement
+significant parts of the language in the library!
+
+We will break down implementation of these features into two parts:
+implementing support for user-defined binary operators and adding unary
+operators.
+
+User-defined Binary Operators
+=============================
+
+Adding support for user-defined binary operators is pretty simple with
+our current framework. We'll first add support for the unary/binary
+keywords:
+
+.. code-block:: c++
+
+ enum Token {
+ ...
+ // operators
+ tok_binary = -11,
+ tok_unary = -12
+ };
+ ...
+ static int gettok() {
+ ...
+ if (IdentifierStr == "for")
+ return tok_for;
+ if (IdentifierStr == "in")
+ return tok_in;
+ if (IdentifierStr == "binary")
+ return tok_binary;
+ if (IdentifierStr == "unary")
+ return tok_unary;
+ return tok_identifier;
+
+This just adds lexer support for the unary and binary keywords, like we
+did in `previous chapters <LangImpl5.html#lexer-extensions-for-if-then-else>`_. One nice thing
+about our current AST, is that we represent binary operators with full
+generalisation by using their ASCII code as the opcode. For our extended
+operators, we'll use this same representation, so we don't need any new
+AST or parser support.
+
+On the other hand, we have to be able to represent the definitions of
+these new operators, in the "def binary\| 5" part of the function
+definition. In our grammar so far, the "name" for the function
+definition is parsed as the "prototype" production and into the
+``PrototypeAST`` AST node. To represent our new user-defined operators
+as prototypes, we have to extend the ``PrototypeAST`` AST node like
+this:
+
+.. code-block:: c++
+
+ /// PrototypeAST - This class represents the "prototype" for a function,
+ /// which captures its argument names as well as if it is an operator.
+ class PrototypeAST {
+ std::string Name;
+ std::vector<std::string> Args;
+ bool IsOperator;
+ unsigned Precedence; // Precedence if a binary op.
+
+ public:
+ PrototypeAST(const std::string &name, std::vector<std::string> Args,
+ bool IsOperator = false, unsigned Prec = 0)
+ : Name(name), Args(std::move(Args)), IsOperator(IsOperator),
+ Precedence(Prec) {}
+
+ Function *codegen();
+ const std::string &getName() const { return Name; }
+
+ bool isUnaryOp() const { return IsOperator && Args.size() == 1; }
+ bool isBinaryOp() const { return IsOperator && Args.size() == 2; }
+
+ char getOperatorName() const {
+ assert(isUnaryOp() || isBinaryOp());
+ return Name[Name.size() - 1];
+ }
+
+ unsigned getBinaryPrecedence() const { return Precedence; }
+ };
+
+Basically, in addition to knowing a name for the prototype, we now keep
+track of whether it was an operator, and if it was, what precedence
+level the operator is at. The precedence is only used for binary
+operators (as you'll see below, it just doesn't apply for unary
+operators). Now that we have a way to represent the prototype for a
+user-defined operator, we need to parse it:
+
+.. code-block:: c++
+
+ /// prototype
+ /// ::= id '(' id* ')'
+ /// ::= binary LETTER number? (id, id)
+ static std::unique_ptr<PrototypeAST> ParsePrototype() {
+ std::string FnName;
+
+ unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
+ unsigned BinaryPrecedence = 30;
+
+ switch (CurTok) {
+ default:
+ return LogErrorP("Expected function name in prototype");
+ case tok_identifier:
+ FnName = IdentifierStr;
+ Kind = 0;
+ getNextToken();
+ break;
+ case tok_binary:
+ getNextToken();
+ if (!isascii(CurTok))
+ return LogErrorP("Expected binary operator");
+ FnName = "binary";
+ FnName += (char)CurTok;
+ Kind = 2;
+ getNextToken();
+
+ // Read the precedence if present.
+ if (CurTok == tok_number) {
+ if (NumVal < 1 || NumVal > 100)
+ return LogErrorP("Invalid precedence: must be 1..100");
+ BinaryPrecedence = (unsigned)NumVal;
+ getNextToken();
+ }
+ break;
+ }
+
+ if (CurTok != '(')
+ return LogErrorP("Expected '(' in prototype");
+
+ std::vector<std::string> ArgNames;
+ while (getNextToken() == tok_identifier)
+ ArgNames.push_back(IdentifierStr);
+ if (CurTok != ')')
+ return LogErrorP("Expected ')' in prototype");
+
+ // success.
+ getNextToken(); // eat ')'.
+
+ // Verify right number of names for operator.
+ if (Kind && ArgNames.size() != Kind)
+ return LogErrorP("Invalid number of operands for operator");
+
+ return llvm::make_unique<PrototypeAST>(FnName, std::move(ArgNames), Kind != 0,
+ BinaryPrecedence);
+ }
+
+This is all fairly straightforward parsing code, and we have already
+seen a lot of similar code in the past. One interesting part about the
+code above is the couple lines that set up ``FnName`` for binary
+operators. This builds names like "binary@" for a newly defined "@"
+operator. It then takes advantage of the fact that symbol names in the
+LLVM symbol table are allowed to have any character in them, including
+embedded nul characters.
+
+The next interesting thing to add, is codegen support for these binary
+operators. Given our current structure, this is a simple addition of a
+default case for our existing binary operator node:
+
+.. code-block:: c++
+
+ Value *BinaryExprAST::codegen() {
+ Value *L = LHS->codegen();
+ Value *R = RHS->codegen();
+ if (!L || !R)
+ return nullptr;
+
+ switch (Op) {
+ case '+':
+ return Builder.CreateFAdd(L, R, "addtmp");
+ case '-':
+ return Builder.CreateFSub(L, R, "subtmp");
+ case '*':
+ return Builder.CreateFMul(L, R, "multmp");
+ case '<':
+ L = Builder.CreateFCmpULT(L, R, "cmptmp");
+ // Convert bool 0/1 to double 0.0 or 1.0
+ return Builder.CreateUIToFP(L, Type::getDoubleTy(TheContext),
+ "booltmp");
+ default:
+ break;
+ }
+
+ // If it wasn't a builtin binary operator, it must be a user defined one. Emit
+ // a call to it.
+ Function *F = getFunction(std::string("binary") + Op);
+ assert(F && "binary operator not found!");
+
+ Value *Ops[2] = { L, R };
+ return Builder.CreateCall(F, Ops, "binop");
+ }
+
+As you can see above, the new code is actually really simple. It just
+does a lookup for the appropriate operator in the symbol table and
+generates a function call to it. Since user-defined operators are just
+built as normal functions (because the "prototype" boils down to a
+function with the right name) everything falls into place.
+
+The final piece of code we are missing, is a bit of top-level magic:
+
+.. code-block:: c++
+
+ Function *FunctionAST::codegen() {
+ // Transfer ownership of the prototype to the FunctionProtos map, but keep a
+ // reference to it for use below.
+ auto &P = *Proto;
+ FunctionProtos[Proto->getName()] = std::move(Proto);
+ Function *TheFunction = getFunction(P.getName());
+ if (!TheFunction)
+ return nullptr;
+
+ // If this is an operator, install it.
+ if (P.isBinaryOp())
+ BinopPrecedence[P.getOperatorName()] = P.getBinaryPrecedence();
+
+ // Create a new basic block to start insertion into.
+ BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
+ ...
+
+Basically, before codegening a function, if it is a user-defined
+operator, we register it in the precedence table. This allows the binary
+operator parsing logic we already have in place to handle it. Since we
+are working on a fully-general operator precedence parser, this is all
+we need to do to "extend the grammar".
+
+Now we have useful user-defined binary operators. This builds a lot on
+the previous framework we built for other operators. Adding unary
+operators is a bit more challenging, because we don't have any framework
+for it yet - let's see what it takes.
+
+User-defined Unary Operators
+============================
+
+Since we don't currently support unary operators in the Kaleidoscope
+language, we'll need to add everything to support them. Above, we added
+simple support for the 'unary' keyword to the lexer. In addition to
+that, we need an AST node:
+
+.. code-block:: c++
+
+ /// UnaryExprAST - Expression class for a unary operator.
+ class UnaryExprAST : public ExprAST {
+ char Opcode;
+ std::unique_ptr<ExprAST> Operand;
+
+ public:
+ UnaryExprAST(char Opcode, std::unique_ptr<ExprAST> Operand)
+ : Opcode(Opcode), Operand(std::move(Operand)) {}
+
+ Value *codegen() override;
+ };
+
+This AST node is very simple and obvious by now. It directly mirrors the
+binary operator AST node, except that it only has one child. With this,
+we need to add the parsing logic. Parsing a unary operator is pretty
+simple: we'll add a new function to do it:
+
+.. code-block:: c++
+
+ /// unary
+ /// ::= primary
+ /// ::= '!' unary
+ static std::unique_ptr<ExprAST> ParseUnary() {
+ // If the current token is not an operator, it must be a primary expr.
+ if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
+ return ParsePrimary();
+
+ // If this is a unary operator, read it.
+ int Opc = CurTok;
+ getNextToken();
+ if (auto Operand = ParseUnary())
+ return llvm::make_unique<UnaryExprAST>(Opc, std::move(Operand));
+ return nullptr;
+ }
+
+The grammar we add is pretty straightforward here. If we see a unary
+operator when parsing a primary operator, we eat the operator as a
+prefix and parse the remaining piece as another unary operator. This
+allows us to handle multiple unary operators (e.g. "!!x"). Note that
+unary operators can't have ambiguous parses like binary operators can,
+so there is no need for precedence information.
+
+The problem with this function, is that we need to call ParseUnary from
+somewhere. To do this, we change previous callers of ParsePrimary to
+call ParseUnary instead:
+
+.. code-block:: c++
+
+ /// binoprhs
+ /// ::= ('+' unary)*
+ static std::unique_ptr<ExprAST> ParseBinOpRHS(int ExprPrec,
+ std::unique_ptr<ExprAST> LHS) {
+ ...
+ // Parse the unary expression after the binary operator.
+ auto RHS = ParseUnary();
+ if (!RHS)
+ return nullptr;
+ ...
+ }
+ /// expression
+ /// ::= unary binoprhs
+ ///
+ static std::unique_ptr<ExprAST> ParseExpression() {
+ auto LHS = ParseUnary();
+ if (!LHS)
+ return nullptr;
+
+ return ParseBinOpRHS(0, std::move(LHS));
+ }
+
+With these two simple changes, we are now able to parse unary operators
+and build the AST for them. Next up, we need to add parser support for
+prototypes, to parse the unary operator prototype. We extend the binary
+operator code above with:
+
+.. code-block:: c++
+
+ /// prototype
+ /// ::= id '(' id* ')'
+ /// ::= binary LETTER number? (id, id)
+ /// ::= unary LETTER (id)
+ static std::unique_ptr<PrototypeAST> ParsePrototype() {
+ std::string FnName;
+
+ unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
+ unsigned BinaryPrecedence = 30;
+
+ switch (CurTok) {
+ default:
+ return LogErrorP("Expected function name in prototype");
+ case tok_identifier:
+ FnName = IdentifierStr;
+ Kind = 0;
+ getNextToken();
+ break;
+ case tok_unary:
+ getNextToken();
+ if (!isascii(CurTok))
+ return LogErrorP("Expected unary operator");
+ FnName = "unary";
+ FnName += (char)CurTok;
+ Kind = 1;
+ getNextToken();
+ break;
+ case tok_binary:
+ ...
+
+As with binary operators, we name unary operators with a name that
+includes the operator character. This assists us at code generation
+time. Speaking of, the final piece we need to add is codegen support for
+unary operators. It looks like this:
+
+.. code-block:: c++
+
+ Value *UnaryExprAST::codegen() {
+ Value *OperandV = Operand->codegen();
+ if (!OperandV)
+ return nullptr;
+
+ Function *F = getFunction(std::string("unary") + Opcode);
+ if (!F)
+ return LogErrorV("Unknown unary operator");
+
+ return Builder.CreateCall(F, OperandV, "unop");
+ }
+
+This code is similar to, but simpler than, the code for binary
+operators. It is simpler primarily because it doesn't need to handle any
+predefined operators.
+
+Kicking the Tires
+=================
+
+It is somewhat hard to believe, but with a few simple extensions we've
+covered in the last chapters, we have grown a real-ish language. With
+this, we can do a lot of interesting things, including I/O, math, and a
+bunch of other things. For example, we can now add a nice sequencing
+operator (printd is defined to print out the specified value and a
+newline):
+
+::
+
+ ready> extern printd(x);
+ Read extern:
+ declare double @printd(double)
+
+ ready> def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.
+ ...
+ ready> printd(123) : printd(456) : printd(789);
+ 123.000000
+ 456.000000
+ 789.000000
+ Evaluated to 0.000000
+
+We can also define a bunch of other "primitive" operations, such as:
+
+::
+
+ # Logical unary not.
+ def unary!(v)
+ if v then
+ 0
+ else
+ 1;
+
+ # Unary negate.
+ def unary-(v)
+ 0-v;
+
+ # Define > with the same precedence as <.
+ def binary> 10 (LHS RHS)
+ RHS < LHS;
+
+ # Binary logical or, which does not short circuit.
+ def binary| 5 (LHS RHS)
+ if LHS then
+ 1
+ else if RHS then
+ 1
+ else
+ 0;
+
+ # Binary logical and, which does not short circuit.
+ def binary& 6 (LHS RHS)
+ if !LHS then
+ 0
+ else
+ !!RHS;
+
+ # Define = with slightly lower precedence than relationals.
+ def binary = 9 (LHS RHS)
+ !(LHS < RHS | LHS > RHS);
+
+ # Define ':' for sequencing: as a low-precedence operator that ignores operands
+ # and just returns the RHS.
+ def binary : 1 (x y) y;
+
+Given the previous if/then/else support, we can also define interesting
+functions for I/O. For example, the following prints out a character
+whose "density" reflects the value passed in: the lower the value, the
+denser the character:
+
+::
+
+ ready> extern putchard(char);
+ ...
+ ready> def printdensity(d)
+ if d > 8 then
+ putchard(32) # ' '
+ else if d > 4 then
+ putchard(46) # '.'
+ else if d > 2 then
+ putchard(43) # '+'
+ else
+ putchard(42); # '*'
+ ...
+ ready> printdensity(1): printdensity(2): printdensity(3):
+ printdensity(4): printdensity(5): printdensity(9):
+ putchard(10);
+ **++.
+ Evaluated to 0.000000
+
+Based on these simple primitive operations, we can start to define more
+interesting things. For example, here's a little function that determines
+the number of iterations it takes for a certain function in the complex
+plane to diverge:
+
+::
+
+ # Determine whether the specific location diverges.
+ # Solve for z = z^2 + c in the complex plane.
+ def mandelconverger(real imag iters creal cimag)
+ if iters > 255 | (real*real + imag*imag > 4) then
+ iters
+ else
+ mandelconverger(real*real - imag*imag + creal,
+ 2*real*imag + cimag,
+ iters+1, creal, cimag);
+
+ # Return the number of iterations required for the iteration to escape
+ def mandelconverge(real imag)
+ mandelconverger(real, imag, 0, real, imag);
+
+This "``z = z2 + c``" function is a beautiful little creature that is
+the basis for computation of the `Mandelbrot
+Set <http://en.wikipedia.org/wiki/Mandelbrot_set>`_. Our
+``mandelconverge`` function returns the number of iterations that it
+takes for a complex orbit to escape, saturating to 255. This is not a
+very useful function by itself, but if you plot its value over a
+two-dimensional plane, you can see the Mandelbrot set. Given that we are
+limited to using putchard here, our amazing graphical output is limited,
+but we can whip together something using the density plotter above:
+
+::
+
+ # Compute and plot the mandelbrot set with the specified 2 dimensional range
+ # info.
+ def mandelhelp(xmin xmax xstep ymin ymax ystep)
+ for y = ymin, y < ymax, ystep in (
+ (for x = xmin, x < xmax, xstep in
+ printdensity(mandelconverge(x,y)))
+ : putchard(10)
+ )
+
+ # mandel - This is a convenient helper function for plotting the mandelbrot set
+ # from the specified position with the specified Magnification.
+ def mandel(realstart imagstart realmag imagmag)
+ mandelhelp(realstart, realstart+realmag*78, realmag,
+ imagstart, imagstart+imagmag*40, imagmag);
+
+Given this, we can try plotting out the mandelbrot set! Lets try it out:
+
+::
+
+ ready> mandel(-2.3, -1.3, 0.05, 0.07);
+ *******************************+++++++++++*************************************
+ *************************+++++++++++++++++++++++*******************************
+ **********************+++++++++++++++++++++++++++++****************************
+ *******************+++++++++++++++++++++.. ...++++++++*************************
+ *****************++++++++++++++++++++++.... ...+++++++++***********************
+ ***************+++++++++++++++++++++++..... ...+++++++++*********************
+ **************+++++++++++++++++++++++.... ....+++++++++********************
+ *************++++++++++++++++++++++...... .....++++++++*******************
+ ************+++++++++++++++++++++....... .......+++++++******************
+ ***********+++++++++++++++++++.... ... .+++++++*****************
+ **********+++++++++++++++++....... .+++++++****************
+ *********++++++++++++++........... ...+++++++***************
+ ********++++++++++++............ ...++++++++**************
+ ********++++++++++... .......... .++++++++**************
+ *******+++++++++..... .+++++++++*************
+ *******++++++++...... ..+++++++++*************
+ *******++++++....... ..+++++++++*************
+ *******+++++...... ..+++++++++*************
+ *******.... .... ...+++++++++*************
+ *******.... . ...+++++++++*************
+ *******+++++...... ...+++++++++*************
+ *******++++++....... ..+++++++++*************
+ *******++++++++...... .+++++++++*************
+ *******+++++++++..... ..+++++++++*************
+ ********++++++++++... .......... .++++++++**************
+ ********++++++++++++............ ...++++++++**************
+ *********++++++++++++++.......... ...+++++++***************
+ **********++++++++++++++++........ .+++++++****************
+ **********++++++++++++++++++++.... ... ..+++++++****************
+ ***********++++++++++++++++++++++....... .......++++++++*****************
+ ************+++++++++++++++++++++++...... ......++++++++******************
+ **************+++++++++++++++++++++++.... ....++++++++********************
+ ***************+++++++++++++++++++++++..... ...+++++++++*********************
+ *****************++++++++++++++++++++++.... ...++++++++***********************
+ *******************+++++++++++++++++++++......++++++++*************************
+ *********************++++++++++++++++++++++.++++++++***************************
+ *************************+++++++++++++++++++++++*******************************
+ ******************************+++++++++++++************************************
+ *******************************************************************************
+ *******************************************************************************
+ *******************************************************************************
+ Evaluated to 0.000000
+ ready> mandel(-2, -1, 0.02, 0.04);
+ **************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
+ ***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ *********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
+ *******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
+ *****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
+ ***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
+ **************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
+ ************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
+ ***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
+ **********++++++++++++++++++++++++++++++++++++++++++++++.............
+ ********+++++++++++++++++++++++++++++++++++++++++++..................
+ *******+++++++++++++++++++++++++++++++++++++++.......................
+ ******+++++++++++++++++++++++++++++++++++...........................
+ *****++++++++++++++++++++++++++++++++............................
+ *****++++++++++++++++++++++++++++...............................
+ ****++++++++++++++++++++++++++...... .........................
+ ***++++++++++++++++++++++++......... ...... ...........
+ ***++++++++++++++++++++++............
+ **+++++++++++++++++++++..............
+ **+++++++++++++++++++................
+ *++++++++++++++++++.................
+ *++++++++++++++++............ ...
+ *++++++++++++++..............
+ *+++....++++................
+ *.......... ...........
+ *
+ *.......... ...........
+ *+++....++++................
+ *++++++++++++++..............
+ *++++++++++++++++............ ...
+ *++++++++++++++++++.................
+ **+++++++++++++++++++................
+ **+++++++++++++++++++++..............
+ ***++++++++++++++++++++++............
+ ***++++++++++++++++++++++++......... ...... ...........
+ ****++++++++++++++++++++++++++...... .........................
+ *****++++++++++++++++++++++++++++...............................
+ *****++++++++++++++++++++++++++++++++............................
+ ******+++++++++++++++++++++++++++++++++++...........................
+ *******+++++++++++++++++++++++++++++++++++++++.......................
+ ********+++++++++++++++++++++++++++++++++++++++++++..................
+ Evaluated to 0.000000
+ ready> mandel(-0.9, -1.4, 0.02, 0.03);
+ *******************************************************************************
+ *******************************************************************************
+ *******************************************************************************
+ **********+++++++++++++++++++++************************************************
+ *+++++++++++++++++++++++++++++++++++++++***************************************
+ +++++++++++++++++++++++++++++++++++++++++++++**********************************
+ ++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
+ ++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
+ +++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
+ +++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
+ +++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
+ +++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
+ ++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
+ +++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
+ ++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
+ ++++++++++++++++++++++++............. .......++++++++++++++++++++++******
+ +++++++++++++++++++++++............. ........+++++++++++++++++++++++****
+ ++++++++++++++++++++++........... ..........++++++++++++++++++++++***
+ ++++++++++++++++++++........... .........++++++++++++++++++++++*
+ ++++++++++++++++++............ ...........++++++++++++++++++++
+ ++++++++++++++++............... .............++++++++++++++++++
+ ++++++++++++++................. ...............++++++++++++++++
+ ++++++++++++.................. .................++++++++++++++
+ +++++++++.................. .................+++++++++++++
+ ++++++........ . ......... ..++++++++++++
+ ++............ ...... ....++++++++++
+ .............. ...++++++++++
+ .............. ....+++++++++
+ .............. .....++++++++
+ ............. ......++++++++
+ ........... .......++++++++
+ ......... ........+++++++
+ ......... ........+++++++
+ ......... ....+++++++
+ ........ ...+++++++
+ ....... ...+++++++
+ ....+++++++
+ .....+++++++
+ ....+++++++
+ ....+++++++
+ ....+++++++
+ Evaluated to 0.000000
+ ready> ^D
+
+At this point, you may be starting to realize that Kaleidoscope is a
+real and powerful language. It may not be self-similar :), but it can be
+used to plot things that are!
+
+With this, we conclude the "adding user-defined operators" chapter of
+the tutorial. We have successfully augmented our language, adding the
+ability to extend the language in the library, and we have shown how
+this can be used to build a simple but interesting end-user application
+in Kaleidoscope. At this point, Kaleidoscope can build a variety of
+applications that are functional and can call functions with
+side-effects, but it can't actually define and mutate a variable itself.
+
+Strikingly, variable mutation is an important feature of some languages,
+and it is not at all obvious how to `add support for mutable
+variables <LangImpl07.html>`_ without having to add an "SSA construction"
+phase to your front-end. In the next chapter, we will describe how you
+can add variable mutation without building SSA in your front-end.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the support for user-defined operators. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+On some platforms, you will need to specify -rdynamic or
+-Wl,--export-dynamic when linking. This ensures that symbols defined in
+the main executable are exported to the dynamic linker and so are
+available for symbol resolution at run time. This is not needed if you
+compile your support code into a shared library, although doing that
+will cause problems on Windows.
+
+Here is the code:
+
+.. literalinclude:: ../../../examples/Kaleidoscope/Chapter6/toy.cpp
+ :language: c++
+
+`Next: Extending the language: mutable variables / SSA
+construction <LangImpl07.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl07.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl07.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl07.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl07.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,885 @@
+:orphan:
+
+=======================================================
+Kaleidoscope: Extending the Language: Mutable Variables
+=======================================================
+
+.. contents::
+ :local:
+
+Chapter 7 Introduction
+======================
+
+Welcome to Chapter 7 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
+very respectable, albeit simple, `functional programming
+language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
+journey, we learned some parsing techniques, how to build and represent
+an AST, how to build LLVM IR, and how to optimize the resultant code as
+well as JIT compile it.
+
+While Kaleidoscope is interesting as a functional language, the fact
+that it is functional makes it "too easy" to generate LLVM IR for it. In
+particular, a functional language makes it very easy to build LLVM IR
+directly in `SSA
+form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
+Since LLVM requires that the input code be in SSA form, this is a very
+nice property and it is often unclear to newcomers how to generate code
+for an imperative language with mutable variables.
+
+The short (and happy) summary of this chapter is that there is no need
+for your front-end to build SSA form: LLVM provides highly tuned and
+well tested support for this, though the way it works is a bit
+unexpected for some.
+
+Why is this a hard problem?
+===========================
+
+To understand why mutable variables cause complexities in SSA
+construction, consider this extremely simple C example:
+
+.. code-block:: c
+
+ int G, H;
+ int test(_Bool Condition) {
+ int X;
+ if (Condition)
+ X = G;
+ else
+ X = H;
+ return X;
+ }
+
+In this case, we have the variable "X", whose value depends on the path
+executed in the program. Because there are two different possible values
+for X before the return instruction, a PHI node is inserted to merge the
+two values. The LLVM IR that we want for this example looks like this:
+
+.. code-block:: llvm
+
+ @G = weak global i32 0 ; type of @G is i32*
+ @H = weak global i32 0 ; type of @H is i32*
+
+ define i32 @test(i1 %Condition) {
+ entry:
+ br i1 %Condition, label %cond_true, label %cond_false
+
+ cond_true:
+ %X.0 = load i32* @G
+ br label %cond_next
+
+ cond_false:
+ %X.1 = load i32* @H
+ br label %cond_next
+
+ cond_next:
+ %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
+ ret i32 %X.2
+ }
+
+In this example, the loads from the G and H global variables are
+explicit in the LLVM IR, and they live in the then/else branches of the
+if statement (cond\_true/cond\_false). In order to merge the incoming
+values, the X.2 phi node in the cond\_next block selects the right value
+to use based on where control flow is coming from: if control flow comes
+from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
+control flow comes from cond\_true, it gets the value of X.0. The intent
+of this chapter is not to explain the details of SSA form. For more
+information, see one of the many `online
+references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
+
+The question for this article is "who places the phi nodes when lowering
+assignments to mutable variables?". The issue here is that LLVM
+*requires* that its IR be in SSA form: there is no "non-ssa" mode for
+it. However, SSA construction requires non-trivial algorithms and data
+structures, so it is inconvenient and wasteful for every front-end to
+have to reproduce this logic.
+
+Memory in LLVM
+==============
+
+The 'trick' here is that while LLVM does require all register values to
+be in SSA form, it does not require (or permit) memory objects to be in
+SSA form. In the example above, note that the loads from G and H are
+direct accesses to G and H: they are not renamed or versioned. This
+differs from some other compiler systems, which do try to version memory
+objects. In LLVM, instead of encoding dataflow analysis of memory into
+the LLVM IR, it is handled with `Analysis
+Passes <../WritingAnLLVMPass.html>`_ which are computed on demand.
+
+With this in mind, the high-level idea is that we want to make a stack
+variable (which lives in memory, because it is on the stack) for each
+mutable object in a function. To take advantage of this trick, we need
+to talk about how LLVM represents stack variables.
+
+In LLVM, all memory accesses are explicit with load/store instructions,
+and it is carefully designed not to have (or need) an "address-of"
+operator. Notice how the type of the @G/@H global variables is actually
+"i32\*" even though the variable is defined as "i32". What this means is
+that @G defines *space* for an i32 in the global data area, but its
+*name* actually refers to the address for that space. Stack variables
+work the same way, except that instead of being declared with global
+variable definitions, they are declared with the `LLVM alloca
+instruction <../LangRef.html#alloca-instruction>`_:
+
+.. code-block:: llvm
+
+ define i32 @example() {
+ entry:
+ %X = alloca i32 ; type of %X is i32*.
+ ...
+ %tmp = load i32* %X ; load the stack value %X from the stack.
+ %tmp2 = add i32 %tmp, 1 ; increment it
+ store i32 %tmp2, i32* %X ; store it back
+ ...
+
+This code shows an example of how you can declare and manipulate a stack
+variable in the LLVM IR. Stack memory allocated with the alloca
+instruction is fully general: you can pass the address of the stack slot
+to functions, you can store it in other variables, etc. In our example
+above, we could rewrite the example to use the alloca technique to avoid
+using a PHI node:
+
+.. code-block:: llvm
+
+ @G = weak global i32 0 ; type of @G is i32*
+ @H = weak global i32 0 ; type of @H is i32*
+
+ define i32 @test(i1 %Condition) {
+ entry:
+ %X = alloca i32 ; type of %X is i32*.
+ br i1 %Condition, label %cond_true, label %cond_false
+
+ cond_true:
+ %X.0 = load i32* @G
+ store i32 %X.0, i32* %X ; Update X
+ br label %cond_next
+
+ cond_false:
+ %X.1 = load i32* @H
+ store i32 %X.1, i32* %X ; Update X
+ br label %cond_next
+
+ cond_next:
+ %X.2 = load i32* %X ; Read X
+ ret i32 %X.2
+ }
+
+With this, we have discovered a way to handle arbitrary mutable
+variables without the need to create Phi nodes at all:
+
+#. Each mutable variable becomes a stack allocation.
+#. Each read of the variable becomes a load from the stack.
+#. Each update of the variable becomes a store to the stack.
+#. Taking the address of a variable just uses the stack address
+ directly.
+
+While this solution has solved our immediate problem, it introduced
+another one: we have now apparently introduced a lot of stack traffic
+for very simple and common operations, a major performance problem.
+Fortunately for us, the LLVM optimizer has a highly-tuned optimization
+pass named "mem2reg" that handles this case, promoting allocas like this
+into SSA registers, inserting Phi nodes as appropriate. If you run this
+example through the pass, for example, you'll get:
+
+.. code-block:: bash
+
+ $ llvm-as < example.ll | opt -mem2reg | llvm-dis
+ @G = weak global i32 0
+ @H = weak global i32 0
+
+ define i32 @test(i1 %Condition) {
+ entry:
+ br i1 %Condition, label %cond_true, label %cond_false
+
+ cond_true:
+ %X.0 = load i32* @G
+ br label %cond_next
+
+ cond_false:
+ %X.1 = load i32* @H
+ br label %cond_next
+
+ cond_next:
+ %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
+ ret i32 %X.01
+ }
+
+The mem2reg pass implements the standard "iterated dominance frontier"
+algorithm for constructing SSA form and has a number of optimizations
+that speed up (very common) degenerate cases. The mem2reg optimization
+pass is the answer to dealing with mutable variables, and we highly
+recommend that you depend on it. Note that mem2reg only works on
+variables in certain circumstances:
+
+#. mem2reg is alloca-driven: it looks for allocas and if it can handle
+ them, it promotes them. It does not apply to global variables or heap
+ allocations.
+#. mem2reg only looks for alloca instructions in the entry block of the
+ function. Being in the entry block guarantees that the alloca is only
+ executed once, which makes analysis simpler.
+#. mem2reg only promotes allocas whose uses are direct loads and stores.
+ If the address of the stack object is passed to a function, or if any
+ funny pointer arithmetic is involved, the alloca will not be
+ promoted.
+#. mem2reg only works on allocas of `first
+ class <../LangRef.html#first-class-types>`_ values (such as pointers,
+ scalars and vectors), and only if the array size of the allocation is
+ 1 (or missing in the .ll file). mem2reg is not capable of promoting
+ structs or arrays to registers. Note that the "sroa" pass is
+ more powerful and can promote structs, "unions", and arrays in many
+ cases.
+
+All of these properties are easy to satisfy for most imperative
+languages, and we'll illustrate it below with Kaleidoscope. The final
+question you may be asking is: should I bother with this nonsense for my
+front-end? Wouldn't it be better if I just did SSA construction
+directly, avoiding use of the mem2reg optimization pass? In short, we
+strongly recommend that you use this technique for building SSA form,
+unless there is an extremely good reason not to. Using this technique
+is:
+
+- Proven and well tested: clang uses this technique
+ for local mutable variables. As such, the most common clients of LLVM
+ are using this to handle a bulk of their variables. You can be sure
+ that bugs are found fast and fixed early.
+- Extremely Fast: mem2reg has a number of special cases that make it
+ fast in common cases as well as fully general. For example, it has
+ fast-paths for variables that are only used in a single block,
+ variables that only have one assignment point, good heuristics to
+ avoid insertion of unneeded phi nodes, etc.
+- Needed for debug info generation: `Debug information in
+ LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
+ the variable exposed so that debug info can be attached to it. This
+ technique dovetails very naturally with this style of debug info.
+
+If nothing else, this makes it much easier to get your front-end up and
+running, and is very simple to implement. Let's extend Kaleidoscope with
+mutable variables now!
+
+Mutable Variables in Kaleidoscope
+=================================
+
+Now that we know the sort of problem we want to tackle, let's see what
+this looks like in the context of our little Kaleidoscope language.
+We're going to add two features:
+
+#. The ability to mutate variables with the '=' operator.
+#. The ability to define new variables.
+
+While the first item is really what this is about, we only have
+variables for incoming arguments as well as for induction variables, and
+redefining those only goes so far :). Also, the ability to define new
+variables is a useful thing regardless of whether you will be mutating
+them. Here's a motivating example that shows how we could use these:
+
+::
+
+ # Define ':' for sequencing: as a low-precedence operator that ignores operands
+ # and just returns the RHS.
+ def binary : 1 (x y) y;
+
+ # Recursive fib, we could do this before.
+ def fib(x)
+ if (x < 3) then
+ 1
+ else
+ fib(x-1)+fib(x-2);
+
+ # Iterative fib.
+ def fibi(x)
+ var a = 1, b = 1, c in
+ (for i = 3, i < x in
+ c = a + b :
+ a = b :
+ b = c) :
+ b;
+
+ # Call it.
+ fibi(10);
+
+In order to mutate variables, we have to change our existing variables
+to use the "alloca trick". Once we have that, we'll add our new
+operator, then extend Kaleidoscope to support new variable definitions.
+
+Adjusting Existing Variables for Mutation
+=========================================
+
+The symbol table in Kaleidoscope is managed at code generation time by
+the '``NamedValues``' map. This map currently keeps track of the LLVM
+"Value\*" that holds the double value for the named variable. In order
+to support mutation, we need to change this slightly, so that
+``NamedValues`` holds the *memory location* of the variable in question.
+Note that this change is a refactoring: it changes the structure of the
+code, but does not (by itself) change the behavior of the compiler. All
+of these changes are isolated in the Kaleidoscope code generator.
+
+At this point in Kaleidoscope's development, it only supports variables
+for two things: incoming arguments to functions and the induction
+variable of 'for' loops. For consistency, we'll allow mutation of these
+variables in addition to other user-defined variables. This means that
+these will both need memory locations.
+
+To start our transformation of Kaleidoscope, we'll change the
+NamedValues map so that it maps to AllocaInst\* instead of Value\*. Once
+we do this, the C++ compiler will tell us what parts of the code we need
+to update:
+
+.. code-block:: c++
+
+ static std::map<std::string, AllocaInst*> NamedValues;
+
+Also, since we will need to create these allocas, we'll use a helper
+function that ensures that the allocas are created in the entry block of
+the function:
+
+.. code-block:: c++
+
+ /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
+ /// the function. This is used for mutable variables etc.
+ static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
+ const std::string &VarName) {
+ IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
+ TheFunction->getEntryBlock().begin());
+ return TmpB.CreateAlloca(Type::getDoubleTy(TheContext), 0,
+ VarName.c_str());
+ }
+
+This funny looking code creates an IRBuilder object that is pointing at
+the first instruction (.begin()) of the entry block. It then creates an
+alloca with the expected name and returns it. Because all values in
+Kaleidoscope are doubles, there is no need to pass in a type to use.
+
+With this in place, the first functionality change we want to make belongs to
+variable references. In our new scheme, variables live on the stack, so
+code generating a reference to them actually needs to produce a load
+from the stack slot:
+
+.. code-block:: c++
+
+ Value *VariableExprAST::codegen() {
+ // Look this variable up in the function.
+ Value *V = NamedValues[Name];
+ if (!V)
+ return LogErrorV("Unknown variable name");
+
+ // Load the value.
+ return Builder.CreateLoad(V, Name.c_str());
+ }
+
+As you can see, this is pretty straightforward. Now we need to update
+the things that define the variables to set up the alloca. We'll start
+with ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for
+the unabridged code):
+
+.. code-block:: c++
+
+ Function *TheFunction = Builder.GetInsertBlock()->getParent();
+
+ // Create an alloca for the variable in the entry block.
+ AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
+
+ // Emit the start code first, without 'variable' in scope.
+ Value *StartVal = Start->codegen();
+ if (!StartVal)
+ return nullptr;
+
+ // Store the value into the alloca.
+ Builder.CreateStore(StartVal, Alloca);
+ ...
+
+ // Compute the end condition.
+ Value *EndCond = End->codegen();
+ if (!EndCond)
+ return nullptr;
+
+ // Reload, increment, and restore the alloca. This handles the case where
+ // the body of the loop mutates the variable.
+ Value *CurVar = Builder.CreateLoad(Alloca);
+ Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
+ Builder.CreateStore(NextVar, Alloca);
+ ...
+
+This code is virtually identical to the code `before we allowed mutable
+variables <LangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that we
+no longer have to construct a PHI node, and we use load/store to access
+the variable as needed.
+
+To support mutable argument variables, we need to also make allocas for
+them. The code for this is also pretty simple:
+
+.. code-block:: c++
+
+ Function *FunctionAST::codegen() {
+ ...
+ Builder.SetInsertPoint(BB);
+
+ // Record the function arguments in the NamedValues map.
+ NamedValues.clear();
+ for (auto &Arg : TheFunction->args()) {
+ // Create an alloca for this variable.
+ AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());
+
+ // Store the initial value into the alloca.
+ Builder.CreateStore(&Arg, Alloca);
+
+ // Add arguments to variable symbol table.
+ NamedValues[Arg.getName()] = Alloca;
+ }
+
+ if (Value *RetVal = Body->codegen()) {
+ ...
+
+For each argument, we make an alloca, store the input value to the
+function into the alloca, and register the alloca as the memory location
+for the argument. This method gets invoked by ``FunctionAST::codegen()``
+right after it sets up the entry block for the function.
+
+The final missing piece is adding the mem2reg pass, which allows us to
+get good codegen once again:
+
+.. code-block:: c++
+
+ // Promote allocas to registers.
+ TheFPM->add(createPromoteMemoryToRegisterPass());
+ // Do simple "peephole" optimizations and bit-twiddling optzns.
+ TheFPM->add(createInstructionCombiningPass());
+ // Reassociate expressions.
+ TheFPM->add(createReassociatePass());
+ ...
+
+It is interesting to see what the code looks like before and after the
+mem2reg optimization runs. For example, this is the before/after code
+for our recursive fib function. Before the optimization:
+
+.. code-block:: llvm
+
+ define double @fib(double %x) {
+ entry:
+ %x1 = alloca double
+ store double %x, double* %x1
+ %x2 = load double, double* %x1
+ %cmptmp = fcmp ult double %x2, 3.000000e+00
+ %booltmp = uitofp i1 %cmptmp to double
+ %ifcond = fcmp one double %booltmp, 0.000000e+00
+ br i1 %ifcond, label %then, label %else
+
+ then: ; preds = %entry
+ br label %ifcont
+
+ else: ; preds = %entry
+ %x3 = load double, double* %x1
+ %subtmp = fsub double %x3, 1.000000e+00
+ %calltmp = call double @fib(double %subtmp)
+ %x4 = load double, double* %x1
+ %subtmp5 = fsub double %x4, 2.000000e+00
+ %calltmp6 = call double @fib(double %subtmp5)
+ %addtmp = fadd double %calltmp, %calltmp6
+ br label %ifcont
+
+ ifcont: ; preds = %else, %then
+ %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
+ ret double %iftmp
+ }
+
+Here there is only one variable (x, the input argument) but you can
+still see the extremely simple-minded code generation strategy we are
+using. In the entry block, an alloca is created, and the initial input
+value is stored into it. Each reference to the variable does a reload
+from the stack. Also, note that we didn't modify the if/then/else
+expression, so it still inserts a PHI node. While we could make an
+alloca for it, it is actually easier to create a PHI node for it, so we
+still just make the PHI.
+
+Here is the code after the mem2reg pass runs:
+
+.. code-block:: llvm
+
+ define double @fib(double %x) {
+ entry:
+ %cmptmp = fcmp ult double %x, 3.000000e+00
+ %booltmp = uitofp i1 %cmptmp to double
+ %ifcond = fcmp one double %booltmp, 0.000000e+00
+ br i1 %ifcond, label %then, label %else
+
+ then:
+ br label %ifcont
+
+ else:
+ %subtmp = fsub double %x, 1.000000e+00
+ %calltmp = call double @fib(double %subtmp)
+ %subtmp5 = fsub double %x, 2.000000e+00
+ %calltmp6 = call double @fib(double %subtmp5)
+ %addtmp = fadd double %calltmp, %calltmp6
+ br label %ifcont
+
+ ifcont: ; preds = %else, %then
+ %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
+ ret double %iftmp
+ }
+
+This is a trivial case for mem2reg, since there are no redefinitions of
+the variable. The point of showing this is to calm your tension about
+inserting such blatent inefficiencies :).
+
+After the rest of the optimizers run, we get:
+
+.. code-block:: llvm
+
+ define double @fib(double %x) {
+ entry:
+ %cmptmp = fcmp ult double %x, 3.000000e+00
+ %booltmp = uitofp i1 %cmptmp to double
+ %ifcond = fcmp ueq double %booltmp, 0.000000e+00
+ br i1 %ifcond, label %else, label %ifcont
+
+ else:
+ %subtmp = fsub double %x, 1.000000e+00
+ %calltmp = call double @fib(double %subtmp)
+ %subtmp5 = fsub double %x, 2.000000e+00
+ %calltmp6 = call double @fib(double %subtmp5)
+ %addtmp = fadd double %calltmp, %calltmp6
+ ret double %addtmp
+
+ ifcont:
+ ret double 1.000000e+00
+ }
+
+Here we see that the simplifycfg pass decided to clone the return
+instruction into the end of the 'else' block. This allowed it to
+eliminate some branches and the PHI node.
+
+Now that all symbol table references are updated to use stack variables,
+we'll add the assignment operator.
+
+New Assignment Operator
+=======================
+
+With our current framework, adding a new assignment operator is really
+simple. We will parse it just like any other binary operator, but handle
+it internally (instead of allowing the user to define it). The first
+step is to set a precedence:
+
+.. code-block:: c++
+
+ int main() {
+ // Install standard binary operators.
+ // 1 is lowest precedence.
+ BinopPrecedence['='] = 2;
+ BinopPrecedence['<'] = 10;
+ BinopPrecedence['+'] = 20;
+ BinopPrecedence['-'] = 20;
+
+Now that the parser knows the precedence of the binary operator, it
+takes care of all the parsing and AST generation. We just need to
+implement codegen for the assignment operator. This looks like:
+
+.. code-block:: c++
+
+ Value *BinaryExprAST::codegen() {
+ // Special case '=' because we don't want to emit the LHS as an expression.
+ if (Op == '=') {
+ // Assignment requires the LHS to be an identifier.
+ VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get());
+ if (!LHSE)
+ return LogErrorV("destination of '=' must be a variable");
+
+Unlike the rest of the binary operators, our assignment operator doesn't
+follow the "emit LHS, emit RHS, do computation" model. As such, it is
+handled as a special case before the other binary operators are handled.
+The other strange thing is that it requires the LHS to be a variable. It
+is invalid to have "(x+1) = expr" - only things like "x = expr" are
+allowed.
+
+.. code-block:: c++
+
+ // Codegen the RHS.
+ Value *Val = RHS->codegen();
+ if (!Val)
+ return nullptr;
+
+ // Look up the name.
+ Value *Variable = NamedValues[LHSE->getName()];
+ if (!Variable)
+ return LogErrorV("Unknown variable name");
+
+ Builder.CreateStore(Val, Variable);
+ return Val;
+ }
+ ...
+
+Once we have the variable, codegen'ing the assignment is
+straightforward: we emit the RHS of the assignment, create a store, and
+return the computed value. Returning a value allows for chained
+assignments like "X = (Y = Z)".
+
+Now that we have an assignment operator, we can mutate loop variables
+and arguments. For example, we can now run code like this:
+
+::
+
+ # Function to print a double.
+ extern printd(x);
+
+ # Define ':' for sequencing: as a low-precedence operator that ignores operands
+ # and just returns the RHS.
+ def binary : 1 (x y) y;
+
+ def test(x)
+ printd(x) :
+ x = 4 :
+ printd(x);
+
+ test(123);
+
+When run, this example prints "123" and then "4", showing that we did
+actually mutate the value! Okay, we have now officially implemented our
+goal: getting this to work requires SSA construction in the general
+case. However, to be really useful, we want the ability to define our
+own local variables, let's add this next!
+
+User-defined Local Variables
+============================
+
+Adding var/in is just like any other extension we made to
+Kaleidoscope: we extend the lexer, the parser, the AST and the code
+generator. The first step for adding our new 'var/in' construct is to
+extend the lexer. As before, this is pretty trivial, the code looks like
+this:
+
+.. code-block:: c++
+
+ enum Token {
+ ...
+ // var definition
+ tok_var = -13
+ ...
+ }
+ ...
+ static int gettok() {
+ ...
+ if (IdentifierStr == "in")
+ return tok_in;
+ if (IdentifierStr == "binary")
+ return tok_binary;
+ if (IdentifierStr == "unary")
+ return tok_unary;
+ if (IdentifierStr == "var")
+ return tok_var;
+ return tok_identifier;
+ ...
+
+The next step is to define the AST node that we will construct. For
+var/in, it looks like this:
+
+.. code-block:: c++
+
+ /// VarExprAST - Expression class for var/in
+ class VarExprAST : public ExprAST {
+ std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
+ std::unique_ptr<ExprAST> Body;
+
+ public:
+ VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
+ std::unique_ptr<ExprAST> Body)
+ : VarNames(std::move(VarNames)), Body(std::move(Body)) {}
+
+ Value *codegen() override;
+ };
+
+var/in allows a list of names to be defined all at once, and each name
+can optionally have an initializer value. As such, we capture this
+information in the VarNames vector. Also, var/in has a body, this body
+is allowed to access the variables defined by the var/in.
+
+With this in place, we can define the parser pieces. The first thing we
+do is add it as a primary expression:
+
+.. code-block:: c++
+
+ /// primary
+ /// ::= identifierexpr
+ /// ::= numberexpr
+ /// ::= parenexpr
+ /// ::= ifexpr
+ /// ::= forexpr
+ /// ::= varexpr
+ static std::unique_ptr<ExprAST> ParsePrimary() {
+ switch (CurTok) {
+ default:
+ return LogError("unknown token when expecting an expression");
+ case tok_identifier:
+ return ParseIdentifierExpr();
+ case tok_number:
+ return ParseNumberExpr();
+ case '(':
+ return ParseParenExpr();
+ case tok_if:
+ return ParseIfExpr();
+ case tok_for:
+ return ParseForExpr();
+ case tok_var:
+ return ParseVarExpr();
+ }
+ }
+
+Next we define ParseVarExpr:
+
+.. code-block:: c++
+
+ /// varexpr ::= 'var' identifier ('=' expression)?
+ // (',' identifier ('=' expression)?)* 'in' expression
+ static std::unique_ptr<ExprAST> ParseVarExpr() {
+ getNextToken(); // eat the var.
+
+ std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
+
+ // At least one variable name is required.
+ if (CurTok != tok_identifier)
+ return LogError("expected identifier after var");
+
+The first part of this code parses the list of identifier/expr pairs
+into the local ``VarNames`` vector.
+
+.. code-block:: c++
+
+ while (1) {
+ std::string Name = IdentifierStr;
+ getNextToken(); // eat identifier.
+
+ // Read the optional initializer.
+ std::unique_ptr<ExprAST> Init;
+ if (CurTok == '=') {
+ getNextToken(); // eat the '='.
+
+ Init = ParseExpression();
+ if (!Init) return nullptr;
+ }
+
+ VarNames.push_back(std::make_pair(Name, std::move(Init)));
+
+ // End of var list, exit loop.
+ if (CurTok != ',') break;
+ getNextToken(); // eat the ','.
+
+ if (CurTok != tok_identifier)
+ return LogError("expected identifier list after var");
+ }
+
+Once all the variables are parsed, we then parse the body and create the
+AST node:
+
+.. code-block:: c++
+
+ // At this point, we have to have 'in'.
+ if (CurTok != tok_in)
+ return LogError("expected 'in' keyword after 'var'");
+ getNextToken(); // eat 'in'.
+
+ auto Body = ParseExpression();
+ if (!Body)
+ return nullptr;
+
+ return llvm::make_unique<VarExprAST>(std::move(VarNames),
+ std::move(Body));
+ }
+
+Now that we can parse and represent the code, we need to support
+emission of LLVM IR for it. This code starts out with:
+
+.. code-block:: c++
+
+ Value *VarExprAST::codegen() {
+ std::vector<AllocaInst *> OldBindings;
+
+ Function *TheFunction = Builder.GetInsertBlock()->getParent();
+
+ // Register all variables and emit their initializer.
+ for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
+ const std::string &VarName = VarNames[i].first;
+ ExprAST *Init = VarNames[i].second.get();
+
+Basically it loops over all the variables, installing them one at a
+time. For each variable we put into the symbol table, we remember the
+previous value that we replace in OldBindings.
+
+.. code-block:: c++
+
+ // Emit the initializer before adding the variable to scope, this prevents
+ // the initializer from referencing the variable itself, and permits stuff
+ // like this:
+ // var a = 1 in
+ // var a = a in ... # refers to outer 'a'.
+ Value *InitVal;
+ if (Init) {
+ InitVal = Init->codegen();
+ if (!InitVal)
+ return nullptr;
+ } else { // If not specified, use 0.0.
+ InitVal = ConstantFP::get(TheContext, APFloat(0.0));
+ }
+
+ AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
+ Builder.CreateStore(InitVal, Alloca);
+
+ // Remember the old variable binding so that we can restore the binding when
+ // we unrecurse.
+ OldBindings.push_back(NamedValues[VarName]);
+
+ // Remember this binding.
+ NamedValues[VarName] = Alloca;
+ }
+
+There are more comments here than code. The basic idea is that we emit
+the initializer, create the alloca, then update the symbol table to
+point to it. Once all the variables are installed in the symbol table,
+we evaluate the body of the var/in expression:
+
+.. code-block:: c++
+
+ // Codegen the body, now that all vars are in scope.
+ Value *BodyVal = Body->codegen();
+ if (!BodyVal)
+ return nullptr;
+
+Finally, before returning, we restore the previous variable bindings:
+
+.. code-block:: c++
+
+ // Pop all our variables from scope.
+ for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
+ NamedValues[VarNames[i].first] = OldBindings[i];
+
+ // Return the body computation.
+ return BodyVal;
+ }
+
+The end result of all of this is that we get properly scoped variable
+definitions, and we even (trivially) allow mutation of them :).
+
+With this, we completed what we set out to do. Our nice iterative fib
+example from the intro compiles and runs just fine. The mem2reg pass
+optimizes all of our stack variables into SSA registers, inserting PHI
+nodes where needed, and our front-end remains simple: no "iterated
+dominance frontier" computation anywhere in sight.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+mutable variables and var/in support. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../../examples/Kaleidoscope/Chapter7/toy.cpp
+ :language: c++
+
+`Next: Compiling to Object Code <LangImpl08.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl08.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl08.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl08.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl08.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,220 @@
+:orphan:
+
+========================================
+ Kaleidoscope: Compiling to Object Code
+========================================
+
+.. contents::
+ :local:
+
+Chapter 8 Introduction
+======================
+
+Welcome to Chapter 8 of the "`Implementing a language with LLVM
+<index.html>`_" tutorial. This chapter describes how to compile our
+language down to object files.
+
+Choosing a target
+=================
+
+LLVM has native support for cross-compilation. You can compile to the
+architecture of your current machine, or just as easily compile for
+other architectures. In this tutorial, we'll target the current
+machine.
+
+To specify the architecture that you want to target, we use a string
+called a "target triple". This takes the form
+``<arch><sub>-<vendor>-<sys>-<abi>`` (see the `cross compilation docs
+<http://clang.llvm.org/docs/CrossCompilation.html#target-triple>`_).
+
+As an example, we can see what clang thinks is our current target
+triple:
+
+::
+
+ $ clang --version | grep Target
+ Target: x86_64-unknown-linux-gnu
+
+Running this command may show something different on your machine as
+you might be using a different architecture or operating system to me.
+
+Fortunately, we don't need to hard-code a target triple to target the
+current machine. LLVM provides ``sys::getDefaultTargetTriple``, which
+returns the target triple of the current machine.
+
+.. code-block:: c++
+
+ auto TargetTriple = sys::getDefaultTargetTriple();
+
+LLVM doesn't require us to link in all the target
+functionality. For example, if we're just using the JIT, we don't need
+the assembly printers. Similarly, if we're only targeting certain
+architectures, we can only link in the functionality for those
+architectures.
+
+For this example, we'll initialize all the targets for emitting object
+code.
+
+.. code-block:: c++
+
+ InitializeAllTargetInfos();
+ InitializeAllTargets();
+ InitializeAllTargetMCs();
+ InitializeAllAsmParsers();
+ InitializeAllAsmPrinters();
+
+We can now use our target triple to get a ``Target``:
+
+.. code-block:: c++
+
+ std::string Error;
+ auto Target = TargetRegistry::lookupTarget(TargetTriple, Error);
+
+ // Print an error and exit if we couldn't find the requested target.
+ // This generally occurs if we've forgotten to initialise the
+ // TargetRegistry or we have a bogus target triple.
+ if (!Target) {
+ errs() << Error;
+ return 1;
+ }
+
+Target Machine
+==============
+
+We will also need a ``TargetMachine``. This class provides a complete
+machine description of the machine we're targeting. If we want to
+target a specific feature (such as SSE) or a specific CPU (such as
+Intel's Sandylake), we do so now.
+
+To see which features and CPUs that LLVM knows about, we can use
+``llc``. For example, let's look at x86:
+
+::
+
+ $ llvm-as < /dev/null | llc -march=x86 -mattr=help
+ Available CPUs for this target:
+
+ amdfam10 - Select the amdfam10 processor.
+ athlon - Select the athlon processor.
+ athlon-4 - Select the athlon-4 processor.
+ ...
+
+ Available features for this target:
+
+ 16bit-mode - 16-bit mode (i8086).
+ 32bit-mode - 32-bit mode (80386).
+ 3dnow - Enable 3DNow! instructions.
+ 3dnowa - Enable 3DNow! Athlon instructions.
+ ...
+
+For our example, we'll use the generic CPU without any additional
+features, options or relocation model.
+
+.. code-block:: c++
+
+ auto CPU = "generic";
+ auto Features = "";
+
+ TargetOptions opt;
+ auto RM = Optional<Reloc::Model>();
+ auto TargetMachine = Target->createTargetMachine(TargetTriple, CPU, Features, opt, RM);
+
+
+Configuring the Module
+======================
+
+We're now ready to configure our module, to specify the target and
+data layout. This isn't strictly necessary, but the `frontend
+performance guide <../Frontend/PerformanceTips.html>`_ recommends
+this. Optimizations benefit from knowing about the target and data
+layout.
+
+.. code-block:: c++
+
+ TheModule->setDataLayout(TargetMachine->createDataLayout());
+ TheModule->setTargetTriple(TargetTriple);
+
+Emit Object Code
+================
+
+We're ready to emit object code! Let's define where we want to write
+our file to:
+
+.. code-block:: c++
+
+ auto Filename = "output.o";
+ std::error_code EC;
+ raw_fd_ostream dest(Filename, EC, sys::fs::F_None);
+
+ if (EC) {
+ errs() << "Could not open file: " << EC.message();
+ return 1;
+ }
+
+Finally, we define a pass that emits object code, then we run that
+pass:
+
+.. code-block:: c++
+
+ legacy::PassManager pass;
+ auto FileType = TargetMachine::CGFT_ObjectFile;
+
+ if (TargetMachine->addPassesToEmitFile(pass, dest, FileType)) {
+ errs() << "TargetMachine can't emit a file of this type";
+ return 1;
+ }
+
+ pass.run(*TheModule);
+ dest.flush();
+
+Putting It All Together
+=======================
+
+Does it work? Let's give it a try. We need to compile our code, but
+note that the arguments to ``llvm-config`` are different to the previous chapters.
+
+::
+
+ $ clang++ -g -O3 toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs all` -o toy
+
+Let's run it, and define a simple ``average`` function. Press Ctrl-D
+when you're done.
+
+::
+
+ $ ./toy
+ ready> def average(x y) (x + y) * 0.5;
+ ^D
+ Wrote output.o
+
+We have an object file! To test it, let's write a simple program and
+link it with our output. Here's the source code:
+
+.. code-block:: c++
+
+ #include <iostream>
+
+ extern "C" {
+ double average(double, double);
+ }
+
+ int main() {
+ std::cout << "average of 3.0 and 4.0: " << average(3.0, 4.0) << std::endl;
+ }
+
+We link our program to output.o and check the result is what we
+expected:
+
+::
+
+ $ clang++ main.cpp output.o -o main
+ $ ./main
+ average of 3.0 and 4.0: 3.5
+
+Full Code Listing
+=================
+
+.. literalinclude:: ../../../examples/Kaleidoscope/Chapter8/toy.cpp
+ :language: c++
+
+`Next: Adding Debug Information <LangImpl09.html>`_
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl09.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl09.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl09.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl09.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,467 @@
+:orphan:
+
+======================================
+Kaleidoscope: Adding Debug Information
+======================================
+
+.. contents::
+ :local:
+
+Chapter 9 Introduction
+======================
+
+Welcome to Chapter 9 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. In chapters 1 through 8, we've built a
+decent little programming language with functions and variables.
+What happens if something goes wrong though, how do you debug your
+program?
+
+Source level debugging uses formatted data that helps a debugger
+translate from binary and the state of the machine back to the
+source that the programmer wrote. In LLVM we generally use a format
+called `DWARF <http://dwarfstd.org>`_. DWARF is a compact encoding
+that represents types, source locations, and variable locations.
+
+The short summary of this chapter is that we'll go through the
+various things you have to add to a programming language to
+support debug info, and how you translate that into DWARF.
+
+Caveat: For now we can't debug via the JIT, so we'll need to compile
+our program down to something small and standalone. As part of this
+we'll make a few modifications to the running of the language and
+how programs are compiled. This means that we'll have a source file
+with a simple program written in Kaleidoscope rather than the
+interactive JIT. It does involve a limitation that we can only
+have one "top level" command at a time to reduce the number of
+changes necessary.
+
+Here's the sample program we'll be compiling:
+
+.. code-block:: python
+
+ def fib(x)
+ if x < 3 then
+ 1
+ else
+ fib(x-1)+fib(x-2);
+
+ fib(10)
+
+
+Why is this a hard problem?
+===========================
+
+Debug information is a hard problem for a few different reasons - mostly
+centered around optimized code. First, optimization makes keeping source
+locations more difficult. In LLVM IR we keep the original source location
+for each IR level instruction on the instruction. Optimization passes
+should keep the source locations for newly created instructions, but merged
+instructions only get to keep a single location - this can cause jumping
+around when stepping through optimized programs. Secondly, optimization
+can move variables in ways that are either optimized out, shared in memory
+with other variables, or difficult to track. For the purposes of this
+tutorial we're going to avoid optimization (as you'll see with one of the
+next sets of patches).
+
+Ahead-of-Time Compilation Mode
+==============================
+
+To highlight only the aspects of adding debug information to a source
+language without needing to worry about the complexities of JIT debugging
+we're going to make a few changes to Kaleidoscope to support compiling
+the IR emitted by the front end into a simple standalone program that
+you can execute, debug, and see results.
+
+First we make our anonymous function that contains our top level
+statement be our "main":
+
+.. code-block:: udiff
+
+ - auto Proto = llvm::make_unique<PrototypeAST>("", std::vector<std::string>());
+ + auto Proto = llvm::make_unique<PrototypeAST>("main", std::vector<std::string>());
+
+just with the simple change of giving it a name.
+
+Then we're going to remove the command line code wherever it exists:
+
+.. code-block:: udiff
+
+ @@ -1129,7 +1129,6 @@ static void HandleTopLevelExpression() {
+ /// top ::= definition | external | expression | ';'
+ static void MainLoop() {
+ while (1) {
+ - fprintf(stderr, "ready> ");
+ switch (CurTok) {
+ case tok_eof:
+ return;
+ @@ -1184,7 +1183,6 @@ int main() {
+ BinopPrecedence['*'] = 40; // highest.
+
+ // Prime the first token.
+ - fprintf(stderr, "ready> ");
+ getNextToken();
+
+Lastly we're going to disable all of the optimization passes and the JIT so
+that the only thing that happens after we're done parsing and generating
+code is that the LLVM IR goes to standard error:
+
+.. code-block:: udiff
+
+ @@ -1108,17 +1108,8 @@ static void HandleExtern() {
+ static void HandleTopLevelExpression() {
+ // Evaluate a top-level expression into an anonymous function.
+ if (auto FnAST = ParseTopLevelExpr()) {
+ - if (auto *FnIR = FnAST->codegen()) {
+ - // We're just doing this to make sure it executes.
+ - TheExecutionEngine->finalizeObject();
+ - // JIT the function, returning a function pointer.
+ - void *FPtr = TheExecutionEngine->getPointerToFunction(FnIR);
+ -
+ - // Cast it to the right type (takes no arguments, returns a double) so we
+ - // can call it as a native function.
+ - double (*FP)() = (double (*)())(intptr_t)FPtr;
+ - // Ignore the return value for this.
+ - (void)FP;
+ + if (!F->codegen()) {
+ + fprintf(stderr, "Error generating code for top level expr");
+ }
+ } else {
+ // Skip token for error recovery.
+ @@ -1439,11 +1459,11 @@ int main() {
+ // target lays out data structures.
+ TheModule->setDataLayout(TheExecutionEngine->getDataLayout());
+ OurFPM.add(new DataLayoutPass());
+ +#if 0
+ OurFPM.add(createBasicAliasAnalysisPass());
+ // Promote allocas to registers.
+ OurFPM.add(createPromoteMemoryToRegisterPass());
+ @@ -1218,7 +1210,7 @@ int main() {
+ OurFPM.add(createGVNPass());
+ // Simplify the control flow graph (deleting unreachable blocks, etc).
+ OurFPM.add(createCFGSimplificationPass());
+ -
+ + #endif
+ OurFPM.doInitialization();
+
+ // Set the global so the code gen can use this.
+
+This relatively small set of changes get us to the point that we can compile
+our piece of Kaleidoscope language down to an executable program via this
+command line:
+
+.. code-block:: bash
+
+ Kaleidoscope-Ch9 < fib.ks | & clang -x ir -
+
+which gives an a.out/a.exe in the current working directory.
+
+Compile Unit
+============
+
+The top level container for a section of code in DWARF is a compile unit.
+This contains the type and function data for an individual translation unit
+(read: one file of source code). So the first thing we need to do is
+construct one for our fib.ks file.
+
+DWARF Emission Setup
+====================
+
+Similar to the ``IRBuilder`` class we have a
+`DIBuilder <http://llvm.org/doxygen/classllvm_1_1DIBuilder.html>`_ class
+that helps in constructing debug metadata for an LLVM IR file. It
+corresponds 1:1 similarly to ``IRBuilder`` and LLVM IR, but with nicer names.
+Using it does require that you be more familiar with DWARF terminology than
+you needed to be with ``IRBuilder`` and ``Instruction`` names, but if you
+read through the general documentation on the
+`Metadata Format <http://llvm.org/docs/SourceLevelDebugging.html>`_ it
+should be a little more clear. We'll be using this class to construct all
+of our IR level descriptions. Construction for it takes a module so we
+need to construct it shortly after we construct our module. We've left it
+as a global static variable to make it a bit easier to use.
+
+Next we're going to create a small container to cache some of our frequent
+data. The first will be our compile unit, but we'll also write a bit of
+code for our one type since we won't have to worry about multiple typed
+expressions:
+
+.. code-block:: c++
+
+ static DIBuilder *DBuilder;
+
+ struct DebugInfo {
+ DICompileUnit *TheCU;
+ DIType *DblTy;
+
+ DIType *getDoubleTy();
+ } KSDbgInfo;
+
+ DIType *DebugInfo::getDoubleTy() {
+ if (DblTy)
+ return DblTy;
+
+ DblTy = DBuilder->createBasicType("double", 64, dwarf::DW_ATE_float);
+ return DblTy;
+ }
+
+And then later on in ``main`` when we're constructing our module:
+
+.. code-block:: c++
+
+ DBuilder = new DIBuilder(*TheModule);
+
+ KSDbgInfo.TheCU = DBuilder->createCompileUnit(
+ dwarf::DW_LANG_C, DBuilder->createFile("fib.ks", "."),
+ "Kaleidoscope Compiler", 0, "", 0);
+
+There are a couple of things to note here. First, while we're producing a
+compile unit for a language called Kaleidoscope we used the language
+constant for C. This is because a debugger wouldn't necessarily understand
+the calling conventions or default ABI for a language it doesn't recognize
+and we follow the C ABI in our LLVM code generation so it's the closest
+thing to accurate. This ensures we can actually call functions from the
+debugger and have them execute. Secondly, you'll see the "fib.ks" in the
+call to ``createCompileUnit``. This is a default hard coded value since
+we're using shell redirection to put our source into the Kaleidoscope
+compiler. In a usual front end you'd have an input file name and it would
+go there.
+
+One last thing as part of emitting debug information via DIBuilder is that
+we need to "finalize" the debug information. The reasons are part of the
+underlying API for DIBuilder, but make sure you do this near the end of
+main:
+
+.. code-block:: c++
+
+ DBuilder->finalize();
+
+before you dump out the module.
+
+Functions
+=========
+
+Now that we have our ``Compile Unit`` and our source locations, we can add
+function definitions to the debug info. So in ``PrototypeAST::codegen()`` we
+add a few lines of code to describe a context for our subprogram, in this
+case the "File", and the actual definition of the function itself.
+
+So the context:
+
+.. code-block:: c++
+
+ DIFile *Unit = DBuilder->createFile(KSDbgInfo.TheCU.getFilename(),
+ KSDbgInfo.TheCU.getDirectory());
+
+giving us an DIFile and asking the ``Compile Unit`` we created above for the
+directory and filename where we are currently. Then, for now, we use some
+source locations of 0 (since our AST doesn't currently have source location
+information) and construct our function definition:
+
+.. code-block:: c++
+
+ DIScope *FContext = Unit;
+ unsigned LineNo = 0;
+ unsigned ScopeLine = 0;
+ DISubprogram *SP = DBuilder->createFunction(
+ FContext, P.getName(), StringRef(), Unit, LineNo,
+ CreateFunctionType(TheFunction->arg_size(), Unit),
+ false /* internal linkage */, true /* definition */, ScopeLine,
+ DINode::FlagPrototyped, false);
+ TheFunction->setSubprogram(SP);
+
+and we now have an DISubprogram that contains a reference to all of our
+metadata for the function.
+
+Source Locations
+================
+
+The most important thing for debug information is accurate source location -
+this makes it possible to map your source code back. We have a problem though,
+Kaleidoscope really doesn't have any source location information in the lexer
+or parser so we'll need to add it.
+
+.. code-block:: c++
+
+ struct SourceLocation {
+ int Line;
+ int Col;
+ };
+ static SourceLocation CurLoc;
+ static SourceLocation LexLoc = {1, 0};
+
+ static int advance() {
+ int LastChar = getchar();
+
+ if (LastChar == '\n' || LastChar == '\r') {
+ LexLoc.Line++;
+ LexLoc.Col = 0;
+ } else
+ LexLoc.Col++;
+ return LastChar;
+ }
+
+In this set of code we've added some functionality on how to keep track of the
+line and column of the "source file". As we lex every token we set our current
+current "lexical location" to the assorted line and column for the beginning
+of the token. We do this by overriding all of the previous calls to
+``getchar()`` with our new ``advance()`` that keeps track of the information
+and then we have added to all of our AST classes a source location:
+
+.. code-block:: c++
+
+ class ExprAST {
+ SourceLocation Loc;
+
+ public:
+ ExprAST(SourceLocation Loc = CurLoc) : Loc(Loc) {}
+ virtual ~ExprAST() {}
+ virtual Value* codegen() = 0;
+ int getLine() const { return Loc.Line; }
+ int getCol() const { return Loc.Col; }
+ virtual raw_ostream &dump(raw_ostream &out, int ind) {
+ return out << ':' << getLine() << ':' << getCol() << '\n';
+ }
+
+that we pass down through when we create a new expression:
+
+.. code-block:: c++
+
+ LHS = llvm::make_unique<BinaryExprAST>(BinLoc, BinOp, std::move(LHS),
+ std::move(RHS));
+
+giving us locations for each of our expressions and variables.
+
+To make sure that every instruction gets proper source location information,
+we have to tell ``Builder`` whenever we're at a new source location.
+We use a small helper function for this:
+
+.. code-block:: c++
+
+ void DebugInfo::emitLocation(ExprAST *AST) {
+ DIScope *Scope;
+ if (LexicalBlocks.empty())
+ Scope = TheCU;
+ else
+ Scope = LexicalBlocks.back();
+ Builder.SetCurrentDebugLocation(
+ DebugLoc::get(AST->getLine(), AST->getCol(), Scope));
+ }
+
+This both tells the main ``IRBuilder`` where we are, but also what scope
+we're in. The scope can either be on compile-unit level or be the nearest
+enclosing lexical block like the current function.
+To represent this we create a stack of scopes:
+
+.. code-block:: c++
+
+ std::vector<DIScope *> LexicalBlocks;
+
+and push the scope (function) to the top of the stack when we start
+generating the code for each function:
+
+.. code-block:: c++
+
+ KSDbgInfo.LexicalBlocks.push_back(SP);
+
+Also, we may not forget to pop the scope back off of the scope stack at the
+end of the code generation for the function:
+
+.. code-block:: c++
+
+ // Pop off the lexical block for the function since we added it
+ // unconditionally.
+ KSDbgInfo.LexicalBlocks.pop_back();
+
+Then we make sure to emit the location every time we start to generate code
+for a new AST object:
+
+.. code-block:: c++
+
+ KSDbgInfo.emitLocation(this);
+
+Variables
+=========
+
+Now that we have functions, we need to be able to print out the variables
+we have in scope. Let's get our function arguments set up so we can get
+decent backtraces and see how our functions are being called. It isn't
+a lot of code, and we generally handle it when we're creating the
+argument allocas in ``FunctionAST::codegen``.
+
+.. code-block:: c++
+
+ // Record the function arguments in the NamedValues map.
+ NamedValues.clear();
+ unsigned ArgIdx = 0;
+ for (auto &Arg : TheFunction->args()) {
+ // Create an alloca for this variable.
+ AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());
+
+ // Create a debug descriptor for the variable.
+ DILocalVariable *D = DBuilder->createParameterVariable(
+ SP, Arg.getName(), ++ArgIdx, Unit, LineNo, KSDbgInfo.getDoubleTy(),
+ true);
+
+ DBuilder->insertDeclare(Alloca, D, DBuilder->createExpression(),
+ DebugLoc::get(LineNo, 0, SP),
+ Builder.GetInsertBlock());
+
+ // Store the initial value into the alloca.
+ Builder.CreateStore(&Arg, Alloca);
+
+ // Add arguments to variable symbol table.
+ NamedValues[Arg.getName()] = Alloca;
+ }
+
+
+Here we're first creating the variable, giving it the scope (``SP``),
+the name, source location, type, and since it's an argument, the argument
+index. Next, we create an ``lvm.dbg.declare`` call to indicate at the IR
+level that we've got a variable in an alloca (and it gives a starting
+location for the variable), and setting a source location for the
+beginning of the scope on the declare.
+
+One interesting thing to note at this point is that various debuggers have
+assumptions based on how code and debug information was generated for them
+in the past. In this case we need to do a little bit of a hack to avoid
+generating line information for the function prologue so that the debugger
+knows to skip over those instructions when setting a breakpoint. So in
+``FunctionAST::CodeGen`` we add some more lines:
+
+.. code-block:: c++
+
+ // Unset the location for the prologue emission (leading instructions with no
+ // location in a function are considered part of the prologue and the debugger
+ // will run past them when breaking on a function)
+ KSDbgInfo.emitLocation(nullptr);
+
+and then emit a new location when we actually start generating code for the
+body of the function:
+
+.. code-block:: c++
+
+ KSDbgInfo.emitLocation(Body.get());
+
+With this we have enough debug information to set breakpoints in functions,
+print out argument variables, and call functions. Not too bad for just a
+few simple lines of code!
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+debug information. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
+ # Run
+ ./toy
+
+Here is the code:
+
+.. literalinclude:: ../../../examples/Kaleidoscope/Chapter9/toy.cpp
+ :language: c++
+
+`Next: Conclusion and other useful LLVM tidbits <LangImpl10.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl10.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl10.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl10.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/LangImpl10.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,256 @@
+:orphan:
+
+======================================================
+Kaleidoscope: Conclusion and other useful LLVM tidbits
+======================================================
+
+.. contents::
+ :local:
+
+Tutorial Conclusion
+===================
+
+Welcome to the final chapter of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. In the course of this tutorial, we have
+grown our little Kaleidoscope language from being a useless toy, to
+being a semi-interesting (but probably still useless) toy. :)
+
+It is interesting to see how far we've come, and how little code it has
+taken. We built the entire lexer, parser, AST, code generator, an
+interactive run-loop (with a JIT!), and emitted debug information in
+standalone executables - all in under 1000 lines of (non-comment/non-blank)
+code.
+
+Our little language supports a couple of interesting features: it
+supports user defined binary and unary operators, it uses JIT
+compilation for immediate evaluation, and it supports a few control flow
+constructs with SSA construction.
+
+Part of the idea of this tutorial was to show you how easy and fun it
+can be to define, build, and play with languages. Building a compiler
+need not be a scary or mystical process! Now that you've seen some of
+the basics, I strongly encourage you to take the code and hack on it.
+For example, try adding:
+
+- **global variables** - While global variables have questional value
+ in modern software engineering, they are often useful when putting
+ together quick little hacks like the Kaleidoscope compiler itself.
+ Fortunately, our current setup makes it very easy to add global
+ variables: just have value lookup check to see if an unresolved
+ variable is in the global variable symbol table before rejecting it.
+ To create a new global variable, make an instance of the LLVM
+ ``GlobalVariable`` class.
+- **typed variables** - Kaleidoscope currently only supports variables
+ of type double. This gives the language a very nice elegance, because
+ only supporting one type means that you never have to specify types.
+ Different languages have different ways of handling this. The easiest
+ way is to require the user to specify types for every variable
+ definition, and record the type of the variable in the symbol table
+ along with its Value\*.
+- **arrays, structs, vectors, etc** - Once you add types, you can start
+ extending the type system in all sorts of interesting ways. Simple
+ arrays are very easy and are quite useful for many different
+ applications. Adding them is mostly an exercise in learning how the
+ LLVM `getelementptr <../LangRef.html#getelementptr-instruction>`_ instruction
+ works: it is so nifty/unconventional, it `has its own
+ FAQ <../GetElementPtr.html>`_!
+- **standard runtime** - Our current language allows the user to access
+ arbitrary external functions, and we use it for things like "printd"
+ and "putchard". As you extend the language to add higher-level
+ constructs, often these constructs make the most sense if they are
+ lowered to calls into a language-supplied runtime. For example, if
+ you add hash tables to the language, it would probably make sense to
+ add the routines to a runtime, instead of inlining them all the way.
+- **memory management** - Currently we can only access the stack in
+ Kaleidoscope. It would also be useful to be able to allocate heap
+ memory, either with calls to the standard libc malloc/free interface
+ or with a garbage collector. If you would like to use garbage
+ collection, note that LLVM fully supports `Accurate Garbage
+ Collection <../GarbageCollection.html>`_ including algorithms that
+ move objects and need to scan/update the stack.
+- **exception handling support** - LLVM supports generation of `zero
+ cost exceptions <../ExceptionHandling.html>`_ which interoperate with
+ code compiled in other languages. You could also generate code by
+ implicitly making every function return an error value and checking
+ it. You could also make explicit use of setjmp/longjmp. There are
+ many different ways to go here.
+- **object orientation, generics, database access, complex numbers,
+ geometric programming, ...** - Really, there is no end of crazy
+ features that you can add to the language.
+- **unusual domains** - We've been talking about applying LLVM to a
+ domain that many people are interested in: building a compiler for a
+ specific language. However, there are many other domains that can use
+ compiler technology that are not typically considered. For example,
+ LLVM has been used to implement OpenGL graphics acceleration,
+ translate C++ code to ActionScript, and many other cute and clever
+ things. Maybe you will be the first to JIT compile a regular
+ expression interpreter into native code with LLVM?
+
+Have fun - try doing something crazy and unusual. Building a language
+like everyone else always has, is much less fun than trying something a
+little crazy or off the wall and seeing how it turns out. If you get
+stuck or want to talk about it, feel free to email the `llvm-dev mailing
+list <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_: it has lots
+of people who are interested in languages and are often willing to help
+out.
+
+Before we end this tutorial, I want to talk about some "tips and tricks"
+for generating LLVM IR. These are some of the more subtle things that
+may not be obvious, but are very useful if you want to take advantage of
+LLVM's capabilities.
+
+Properties of the LLVM IR
+=========================
+
+We have a couple of common questions about code in the LLVM IR form -
+let's just get these out of the way right now, shall we?
+
+Target Independence
+-------------------
+
+Kaleidoscope is an example of a "portable language": any program written
+in Kaleidoscope will work the same way on any target that it runs on.
+Many other languages have this property, e.g. lisp, java, haskell,
+javascript, python, etc (note that while these languages are portable,
+not all their libraries are).
+
+One nice aspect of LLVM is that it is often capable of preserving target
+independence in the IR: you can take the LLVM IR for a
+Kaleidoscope-compiled program and run it on any target that LLVM
+supports, even emitting C code and compiling that on targets that LLVM
+doesn't support natively. You can trivially tell that the Kaleidoscope
+compiler generates target-independent code because it never queries for
+any target-specific information when generating code.
+
+The fact that LLVM provides a compact, target-independent,
+representation for code gets a lot of people excited. Unfortunately,
+these people are usually thinking about C or a language from the C
+family when they are asking questions about language portability. I say
+"unfortunately", because there is really no way to make (fully general)
+C code portable, other than shipping the source code around (and of
+course, C source code is not actually portable in general either - ever
+port a really old application from 32- to 64-bits?).
+
+The problem with C (again, in its full generality) is that it is heavily
+laden with target specific assumptions. As one simple example, the
+preprocessor often destructively removes target-independence from the
+code when it processes the input text:
+
+.. code-block:: c
+
+ #ifdef __i386__
+ int X = 1;
+ #else
+ int X = 42;
+ #endif
+
+While it is possible to engineer more and more complex solutions to
+problems like this, it cannot be solved in full generality in a way that
+is better than shipping the actual source code.
+
+That said, there are interesting subsets of C that can be made portable.
+If you are willing to fix primitive types to a fixed size (say int =
+32-bits, and long = 64-bits), don't care about ABI compatibility with
+existing binaries, and are willing to give up some other minor features,
+you can have portable code. This can make sense for specialized domains
+such as an in-kernel language.
+
+Safety Guarantees
+-----------------
+
+Many of the languages above are also "safe" languages: it is impossible
+for a program written in Java to corrupt its address space and crash the
+process (assuming the JVM has no bugs). Safety is an interesting
+property that requires a combination of language design, runtime
+support, and often operating system support.
+
+It is certainly possible to implement a safe language in LLVM, but LLVM
+IR does not itself guarantee safety. The LLVM IR allows unsafe pointer
+casts, use after free bugs, buffer over-runs, and a variety of other
+problems. Safety needs to be implemented as a layer on top of LLVM and,
+conveniently, several groups have investigated this. Ask on the `llvm-dev
+mailing list <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ if
+you are interested in more details.
+
+Language-Specific Optimizations
+-------------------------------
+
+One thing about LLVM that turns off many people is that it does not
+solve all the world's problems in one system. One specific
+complaint is that people perceive LLVM as being incapable of performing
+high-level language-specific optimization: LLVM "loses too much
+information". Here are a few observations about this:
+
+First, you're right that LLVM does lose information. For example, as of
+this writing, there is no way to distinguish in the LLVM IR whether an
+SSA-value came from a C "int" or a C "long" on an ILP32 machine (other
+than debug info). Both get compiled down to an 'i32' value and the
+information about what it came from is lost. The more general issue
+here, is that the LLVM type system uses "structural equivalence" instead
+of "name equivalence". Another place this surprises people is if you
+have two types in a high-level language that have the same structure
+(e.g. two different structs that have a single int field): these types
+will compile down into a single LLVM type and it will be impossible to
+tell what it came from.
+
+Second, while LLVM does lose information, LLVM is not a fixed target: we
+continue to enhance and improve it in many different ways. In addition
+to adding new features (LLVM did not always support exceptions or debug
+info), we also extend the IR to capture important information for
+optimization (e.g. whether an argument is sign or zero extended,
+information about pointers aliasing, etc). Many of the enhancements are
+user-driven: people want LLVM to include some specific feature, so they
+go ahead and extend it.
+
+Third, it is *possible and easy* to add language-specific optimizations,
+and you have a number of choices in how to do it. As one trivial
+example, it is easy to add language-specific optimization passes that
+"know" things about code compiled for a language. In the case of the C
+family, there is an optimization pass that "knows" about the standard C
+library functions. If you call "exit(0)" in main(), it knows that it is
+safe to optimize that into "return 0;" because C specifies what the
+'exit' function does.
+
+In addition to simple library knowledge, it is possible to embed a
+variety of other language-specific information into the LLVM IR. If you
+have a specific need and run into a wall, please bring the topic up on
+the llvm-dev list. At the very worst, you can always treat LLVM as if it
+were a "dumb code generator" and implement the high-level optimizations
+you desire in your front-end, on the language-specific AST.
+
+Tips and Tricks
+===============
+
+There is a variety of useful tips and tricks that you come to know after
+working on/with LLVM that aren't obvious at first glance. Instead of
+letting everyone rediscover them, this section talks about some of these
+issues.
+
+Implementing portable offsetof/sizeof
+-------------------------------------
+
+One interesting thing that comes up, if you are trying to keep the code
+generated by your compiler "target independent", is that you often need
+to know the size of some LLVM type or the offset of some field in an
+llvm structure. For example, you might need to pass the size of a type
+into a function that allocates memory.
+
+Unfortunately, this can vary widely across targets: for example the
+width of a pointer is trivially target-specific. However, there is a
+`clever way to use the getelementptr
+instruction <http://nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt>`_
+that allows you to compute this in a portable way.
+
+Garbage Collected Stack Frames
+------------------------------
+
+Some languages want to explicitly manage their stack frames, often so
+that they are garbage collected or to allow easy implementation of
+closures. There are often better ways to implement these features than
+explicit stack frames, but `LLVM does support
+them, <http://nondot.org/sabre/LLVMNotes/ExplicitlyManagedStackFrames.txt>`_
+if you want. It requires your front-end to convert the code into
+`Continuation Passing
+Style <http://en.wikipedia.org/wiki/Continuation-passing_style>`_ and
+the use of tail calls (which LLVM also supports).
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/index.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/index.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/index.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/MyFirstLanguageFrontend/index.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,85 @@
+:orphan:
+
+=============================================
+My First Language Frontend with LLVM Tutorial
+=============================================
+
+**Requirements:** This tutorial assumes you know C++, but no previous
+compiler experience is necessary.
+
+Welcome to the "My First Language Frontend with LLVM" tutorial. Here we
+run through the implementation of a simple language, showing
+how fun and easy it can be. This tutorial will get you up and running
+fast and show a concrete example of something that uses LLVM to generate
+code.
+
+This tutorial introduces the simple "Kaleidoscope" language, building it
+iteratively over the course of several chapters, showing how it is built
+over time. This lets us cover a range of language design and LLVM-specific
+ideas, showing and explaining the code for it all along the way,
+and reduces the overwhelming amount of details up front. We strongly
+encourage that you *work with this code* - make a copy and hack it up and
+experiment.
+
+**Warning**: In order to focus on teaching compiler techniques and LLVM
+specifically,
+this tutorial does *not* show best practices in software engineering
+principles. For example, the code uses global variables
+pervasively, doesn't use
+`visitors <http://en.wikipedia.org/wiki/Visitor_pattern>`_, etc... but
+instead keeps things simple and focuses on the topics at hand.
+
+This tutorial is structured into chapters covering individual topics,
+allowing you to skip ahead as you wish:
+
+- `Chapter #1: Kaleidoscope language and Lexer <LangImpl01.html>`_ -
+ This shows where we are
+ going and the basic functionality that we want to build. A lexer
+ is also the first part of building a parser for a language, and we
+ use a simple C++ lexer which is easy to understand.
+- `Chapter #2: Implementing a Parser and AST <LangImpl02.html>`_ -
+ With the lexer in place, we can talk about parsing techniques and
+ basic AST construction. This tutorial describes recursive descent
+ parsing and operator precedence parsing.
+- `Chapter #3: Code generation to LLVM IR <LangImpl03.html>`_ - with
+ the AST ready, we show how easy it is to generate LLVM IR, and show
+ a simple way to incorporate LLVM into your project.
+- `Chapter #4: Adding JIT and Optimizer Support <LangImpl04.html>`_ -
+ One great thing about LLVM is its support for JIT compilation, so
+ we'll dive right into it and show you the 3 lines it takes to add JIT
+ support. Later chapters show how to generate .o files.
+- `Chapter #5: Extending the Language: Control Flow <LangImpl05.html>`_ - With the basic language up and running, we show how to extend
+ it with control flow operations ('if' statement and a 'for' loop). This
+ gives us a chance to talk about SSA construction and control
+ flow.
+- `Chapter #6: Extending the Language: User-defined Operators
+ <LangImpl06.html>`_ - This chapter extends the language to let
+ users define arbitrary unary and binary operators - with assignable
+ precedence! This allows us to build a significant piece of the
+ "language" as library routines.
+- `Chapter #7: Extending the Language: Mutable Variables
+ <LangImpl07.html>`_ - This chapter talks about adding user-defined local
+ variables along with an assignment operator. This shows how easy it is
+ to construct SSA form in LLVM: LLVM does *not* require your front-end
+ to construct SSA form in order to use it!
+- `Chapter #8: Compiling to Object Files <LangImpl08.html>`_ - This
+ chapter explains how to take LLVM IR and compile it down to object
+ files, like a static compiler does.
+- `Chapter #9: Debug Information <LangImpl09.html>`_ - A real language
+ needs to support debuggers, so we
+ add debug information that allows setting breakpoints in Kaleidoscope
+ functions, print out argument variables, and call functions!
+- `Chapter #10: Conclusion and other tidbits <LangImpl10.html>`_ - This
+ chapter wraps up the series by discussing ways to extend the language
+ and includes pointers to info on "special topics" like adding garbage
+ collection support, exceptions, debugging, support for "spaghetti
+ stacks", etc.
+
+By the end of the tutorial, we'll have written a bit less than 1000 lines
+of (non-comment, non-blank) lines of code. With this small amount of
+code, we'll have built up a nice little compiler for a non-trivial
+language including a hand-written lexer, parser, AST, as well as code
+generation support - both static and JIT! The breadth of this is a great
+testament to the strengths of LLVM and shows why it is such a popular
+target for language designers and others who need high performance code
+generation.
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl1.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl1.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl1.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl1.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,285 @@
+=================================================
+Kaleidoscope: Tutorial Introduction and the Lexer
+=================================================
+
+.. contents::
+ :local:
+
+Tutorial Introduction
+=====================
+
+Welcome to the "Implementing a language with LLVM" tutorial. This
+tutorial runs through the implementation of a simple language, showing
+how fun and easy it can be. This tutorial will get you up and started as
+well as help to build a framework you can extend to other languages. The
+code in this tutorial can also be used as a playground to hack on other
+LLVM specific things.
+
+The goal of this tutorial is to progressively unveil our language,
+describing how it is built up over time. This will let us cover a fairly
+broad range of language design and LLVM-specific usage issues, showing
+and explaining the code for it all along the way, without overwhelming
+you with tons of details up front.
+
+It is useful to point out ahead of time that this tutorial is really
+about teaching compiler techniques and LLVM specifically, *not* about
+teaching modern and sane software engineering principles. In practice,
+this means that we'll take a number of shortcuts to simplify the
+exposition. For example, the code leaks memory, uses global variables
+all over the place, doesn't use nice design patterns like
+`visitors <http://en.wikipedia.org/wiki/Visitor_pattern>`_, etc... but
+it is very simple. If you dig in and use the code as a basis for future
+projects, fixing these deficiencies shouldn't be hard.
+
+I've tried to put this tutorial together in a way that makes chapters
+easy to skip over if you are already familiar with or are uninterested
+in the various pieces. The structure of the tutorial is:
+
+- `Chapter #1 <#language>`_: Introduction to the Kaleidoscope
+ language, and the definition of its Lexer - This shows where we are
+ going and the basic functionality that we want it to do. In order to
+ make this tutorial maximally understandable and hackable, we choose
+ to implement everything in Objective Caml instead of using lexer and
+ parser generators. LLVM obviously works just fine with such tools,
+ feel free to use one if you prefer.
+- `Chapter #2 <OCamlLangImpl2.html>`_: Implementing a Parser and
+ AST - With the lexer in place, we can talk about parsing techniques
+ and basic AST construction. This tutorial describes recursive descent
+ parsing and operator precedence parsing. Nothing in Chapters 1 or 2
+ is LLVM-specific, the code doesn't even link in LLVM at this point.
+ :)
+- `Chapter #3 <OCamlLangImpl3.html>`_: Code generation to LLVM IR -
+ With the AST ready, we can show off how easy generation of LLVM IR
+ really is.
+- `Chapter #4 <OCamlLangImpl4.html>`_: Adding JIT and Optimizer
+ Support - Because a lot of people are interested in using LLVM as a
+ JIT, we'll dive right into it and show you the 3 lines it takes to
+ add JIT support. LLVM is also useful in many other ways, but this is
+ one simple and "sexy" way to shows off its power. :)
+- `Chapter #5 <OCamlLangImpl5.html>`_: Extending the Language:
+ Control Flow - With the language up and running, we show how to
+ extend it with control flow operations (if/then/else and a 'for'
+ loop). This gives us a chance to talk about simple SSA construction
+ and control flow.
+- `Chapter #6 <OCamlLangImpl6.html>`_: Extending the Language:
+ User-defined Operators - This is a silly but fun chapter that talks
+ about extending the language to let the user program define their own
+ arbitrary unary and binary operators (with assignable precedence!).
+ This lets us build a significant piece of the "language" as library
+ routines.
+- `Chapter #7 <OCamlLangImpl7.html>`_: Extending the Language:
+ Mutable Variables - This chapter talks about adding user-defined
+ local variables along with an assignment operator. The interesting
+ part about this is how easy and trivial it is to construct SSA form
+ in LLVM: no, LLVM does *not* require your front-end to construct SSA
+ form!
+- `Chapter #8 <OCamlLangImpl8.html>`_: Conclusion and other useful
+ LLVM tidbits - This chapter wraps up the series by talking about
+ potential ways to extend the language, but also includes a bunch of
+ pointers to info about "special topics" like adding garbage
+ collection support, exceptions, debugging, support for "spaghetti
+ stacks", and a bunch of other tips and tricks.
+
+By the end of the tutorial, we'll have written a bit less than 700 lines
+of non-comment, non-blank, lines of code. With this small amount of
+code, we'll have built up a very reasonable compiler for a non-trivial
+language including a hand-written lexer, parser, AST, as well as code
+generation support with a JIT compiler. While other systems may have
+interesting "hello world" tutorials, I think the breadth of this
+tutorial is a great testament to the strengths of LLVM and why you
+should consider it if you're interested in language or compiler design.
+
+A note about this tutorial: we expect you to extend the language and
+play with it on your own. Take the code and go crazy hacking away at it,
+compilers don't need to be scary creatures - it can be a lot of fun to
+play with languages!
+
+The Basic Language
+==================
+
+This tutorial will be illustrated with a toy language that we'll call
+"`Kaleidoscope <http://en.wikipedia.org/wiki/Kaleidoscope>`_" (derived
+from "meaning beautiful, form, and view"). Kaleidoscope is a procedural
+language that allows you to define functions, use conditionals, math,
+etc. Over the course of the tutorial, we'll extend Kaleidoscope to
+support the if/then/else construct, a for loop, user defined operators,
+JIT compilation with a simple command line interface, etc.
+
+Because we want to keep things simple, the only datatype in Kaleidoscope
+is a 64-bit floating point type (aka 'float' in OCaml parlance). As
+such, all values are implicitly double precision and the language
+doesn't require type declarations. This gives the language a very nice
+and simple syntax. For example, the following simple example computes
+`Fibonacci numbers: <http://en.wikipedia.org/wiki/Fibonacci_number>`_
+
+::
+
+ # Compute the x'th fibonacci number.
+ def fib(x)
+ if x < 3 then
+ 1
+ else
+ fib(x-1)+fib(x-2)
+
+ # This expression will compute the 40th number.
+ fib(40)
+
+We also allow Kaleidoscope to call into standard library functions (the
+LLVM JIT makes this completely trivial). This means that you can use the
+'extern' keyword to define a function before you use it (this is also
+useful for mutually recursive functions). For example:
+
+::
+
+ extern sin(arg);
+ extern cos(arg);
+ extern atan2(arg1 arg2);
+
+ atan2(sin(.4), cos(42))
+
+A more interesting example is included in Chapter 6 where we write a
+little Kaleidoscope application that `displays a Mandelbrot
+Set <OCamlLangImpl6.html#kicking-the-tires>`_ at various levels of magnification.
+
+Lets dive into the implementation of this language!
+
+The Lexer
+=========
+
+When it comes to implementing a language, the first thing needed is the
+ability to process a text file and recognize what it says. The
+traditional way to do this is to use a
+"`lexer <http://en.wikipedia.org/wiki/Lexical_analysis>`_" (aka
+'scanner') to break the input up into "tokens". Each token returned by
+the lexer includes a token code and potentially some metadata (e.g. the
+numeric value of a number). First, we define the possibilities:
+
+.. code-block:: ocaml
+
+ (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
+ * these others for known things. *)
+ type token =
+ (* commands *)
+ | Def | Extern
+
+ (* primary *)
+ | Ident of string | Number of float
+
+ (* unknown *)
+ | Kwd of char
+
+Each token returned by our lexer will be one of the token variant
+values. An unknown character like '+' will be returned as
+``Token.Kwd '+'``. If the curr token is an identifier, the value will be
+``Token.Ident s``. If the current token is a numeric literal (like 1.0),
+the value will be ``Token.Number 1.0``.
+
+The actual implementation of the lexer is a collection of functions
+driven by a function named ``Lexer.lex``. The ``Lexer.lex`` function is
+called to return the next token from standard input. We will use
+`Camlp4 <http://caml.inria.fr/pub/docs/manual-camlp4/index.html>`_ to
+simplify the tokenization of the standard input. Its definition starts
+as:
+
+.. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Lexer
+ *===----------------------------------------------------------------------===*)
+
+ let rec lex = parser
+ (* Skip any whitespace. *)
+ | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
+
+``Lexer.lex`` works by recursing over a ``char Stream.t`` to read
+characters one at a time from the standard input. It eats them as it
+recognizes them and stores them in a ``Token.token`` variant. The
+first thing that it has to do is ignore whitespace between tokens. This
+is accomplished with the recursive call above.
+
+The next thing ``Lexer.lex`` needs to do is recognize identifiers and
+specific keywords like "def". Kaleidoscope does this with a pattern
+match and a helper function.
+
+.. code-block:: ocaml
+
+ (* identifier: [a-zA-Z][a-zA-Z0-9] *)
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+
+ ...
+
+ and lex_ident buffer = parser
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+ | [< stream=lex >] ->
+ match Buffer.contents buffer with
+ | "def" -> [< 'Token.Def; stream >]
+ | "extern" -> [< 'Token.Extern; stream >]
+ | id -> [< 'Token.Ident id; stream >]
+
+Numeric values are similar:
+
+.. code-block:: ocaml
+
+ (* number: [0-9.]+ *)
+ | [< ' ('0' .. '9' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+
+ ...
+
+ and lex_number buffer = parser
+ | [< ' ('0' .. '9' | '.' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+ | [< stream=lex >] ->
+ [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
+
+This is all pretty straight-forward code for processing input. When
+reading a numeric value from input, we use the ocaml ``float_of_string``
+function to convert it to a numeric value that we store in
+``Token.Number``. Note that this isn't doing sufficient error checking:
+it will raise ``Failure`` if the string "1.23.45.67". Feel free to
+extend it :). Next we handle comments:
+
+.. code-block:: ocaml
+
+ (* Comment until end of line. *)
+ | [< ' ('#'); stream >] ->
+ lex_comment stream
+
+ ...
+
+ and lex_comment = parser
+ | [< ' ('\n'); stream=lex >] -> stream
+ | [< 'c; e=lex_comment >] -> e
+ | [< >] -> [< >]
+
+We handle comments by skipping to the end of the line and then return
+the next token. Finally, if the input doesn't match one of the above
+cases, it is either an operator character like '+' or the end of the
+file. These are handled with this code:
+
+.. code-block:: ocaml
+
+ (* Otherwise, just return the character as its ascii value. *)
+ | [< 'c; stream >] ->
+ [< 'Token.Kwd c; lex stream >]
+
+ (* end of stream. *)
+ | [< >] -> [< >]
+
+With this, we have the complete lexer for the basic Kaleidoscope
+language (the `full code listing <OCamlLangImpl2.html#full-code-listing>`_ for the
+Lexer is available in the `next chapter <OCamlLangImpl2.html>`_ of the
+tutorial). Next we'll `build a simple parser that uses this to build an
+Abstract Syntax Tree <OCamlLangImpl2.html>`_. When we have that, we'll
+include a driver so that you can use the lexer and parser together.
+
+`Next: Implementing a Parser and AST <OCamlLangImpl2.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl2.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl2.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl2.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl2.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,899 @@
+===========================================
+Kaleidoscope: Implementing a Parser and AST
+===========================================
+
+.. contents::
+ :local:
+
+Chapter 2 Introduction
+======================
+
+Welcome to Chapter 2 of the "`Implementing a language with LLVM in
+Objective Caml <index.html>`_" tutorial. This chapter shows you how to
+use the lexer, built in `Chapter 1 <OCamlLangImpl1.html>`_, to build a
+full `parser <http://en.wikipedia.org/wiki/Parsing>`_ for our
+Kaleidoscope language. Once we have a parser, we'll define and build an
+`Abstract Syntax
+Tree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`_ (AST).
+
+The parser we will build uses a combination of `Recursive Descent
+Parsing <http://en.wikipedia.org/wiki/Recursive_descent_parser>`_ and
+`Operator-Precedence
+Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_ to
+parse the Kaleidoscope language (the latter for binary expressions and
+the former for everything else). Before we get to parsing though, lets
+talk about the output of the parser: the Abstract Syntax Tree.
+
+The Abstract Syntax Tree (AST)
+==============================
+
+The AST for a program captures its behavior in such a way that it is
+easy for later stages of the compiler (e.g. code generation) to
+interpret. We basically want one object for each construct in the
+language, and the AST should closely model the language. In
+Kaleidoscope, we have expressions, a prototype, and a function object.
+We'll start with expressions first:
+
+.. code-block:: ocaml
+
+ (* expr - Base type for all expression nodes. *)
+ type expr =
+ (* variant for numeric literals like "1.0". *)
+ | Number of float
+
+The code above shows the definition of the base ExprAST class and one
+subclass which we use for numeric literals. The important thing to note
+about this code is that the Number variant captures the numeric value of
+the literal as an instance variable. This allows later phases of the
+compiler to know what the stored numeric value is.
+
+Right now we only create the AST, so there are no useful functions on
+them. It would be very easy to add a function to pretty print the code,
+for example. Here are the other expression AST node definitions that
+we'll use in the basic form of the Kaleidoscope language:
+
+.. code-block:: ocaml
+
+ (* variant for referencing a variable, like "a". *)
+ | Variable of string
+
+ (* variant for a binary operator. *)
+ | Binary of char * expr * expr
+
+ (* variant for function calls. *)
+ | Call of string * expr array
+
+This is all (intentionally) rather straight-forward: variables capture
+the variable name, binary operators capture their opcode (e.g. '+'), and
+calls capture a function name as well as a list of any argument
+expressions. One thing that is nice about our AST is that it captures
+the language features without talking about the syntax of the language.
+Note that there is no discussion about precedence of binary operators,
+lexical structure, etc.
+
+For our basic language, these are all of the expression nodes we'll
+define. Because it doesn't have conditional control flow, it isn't
+Turing-complete; we'll fix that in a later installment. The two things
+we need next are a way to talk about the interface to a function, and a
+way to talk about functions themselves:
+
+.. code-block:: ocaml
+
+ (* proto - This type represents the "prototype" for a function, which captures
+ * its name, and its argument names (thus implicitly the number of arguments the
+ * function takes). *)
+ type proto = Prototype of string * string array
+
+ (* func - This type represents a function definition itself. *)
+ type func = Function of proto * expr
+
+In Kaleidoscope, functions are typed with just a count of their
+arguments. Since all values are double precision floating point, the
+type of each argument doesn't need to be stored anywhere. In a more
+aggressive and realistic language, the "expr" variants would probably
+have a type field.
+
+With this scaffolding, we can now talk about parsing expressions and
+function bodies in Kaleidoscope.
+
+Parser Basics
+=============
+
+Now that we have an AST to build, we need to define the parser code to
+build it. The idea here is that we want to parse something like "x+y"
+(which is returned as three tokens by the lexer) into an AST that could
+be generated with calls like this:
+
+.. code-block:: ocaml
+
+ let x = Variable "x" in
+ let y = Variable "y" in
+ let result = Binary ('+', x, y) in
+ ...
+
+The error handling routines make use of the builtin ``Stream.Failure``
+and ``Stream.Error``s. ``Stream.Failure`` is raised when the parser is
+unable to find any matching token in the first position of a pattern.
+``Stream.Error`` is raised when the first token matches, but the rest do
+not. The error recovery in our parser will not be the best and is not
+particular user-friendly, but it will be enough for our tutorial. These
+exceptions make it easier to handle errors in routines that have various
+return types.
+
+With these basic types and exceptions, we can implement the first piece
+of our grammar: numeric literals.
+
+Basic Expression Parsing
+========================
+
+We start with numeric literals, because they are the simplest to
+process. For each production in our grammar, we'll define a function
+which parses that production. We call this class of expressions
+"primary" expressions, for reasons that will become more clear `later in
+the tutorial <OCamlLangImpl6.html#user-defined-unary-operators>`_. In order to parse an
+arbitrary primary expression, we need to determine what sort of
+expression it is. For numeric literals, we have:
+
+.. code-block:: ocaml
+
+ (* primary
+ * ::= identifier
+ * ::= numberexpr
+ * ::= parenexpr *)
+ parse_primary = parser
+ (* numberexpr ::= number *)
+ | [< 'Token.Number n >] -> Ast.Number n
+
+This routine is very simple: it expects to be called when the current
+token is a ``Token.Number`` token. It takes the current number value,
+creates a ``Ast.Number`` node, advances the lexer to the next token, and
+finally returns.
+
+There are some interesting aspects to this. The most important one is
+that this routine eats all of the tokens that correspond to the
+production and returns the lexer buffer with the next token (which is
+not part of the grammar production) ready to go. This is a fairly
+standard way to go for recursive descent parsers. For a better example,
+the parenthesis operator is defined like this:
+
+.. code-block:: ocaml
+
+ (* parenexpr ::= '(' expression ')' *)
+ | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
+
+This function illustrates a number of interesting things about the
+parser:
+
+1) It shows how we use the ``Stream.Error`` exception. When called, this
+function expects that the current token is a '(' token, but after
+parsing the subexpression, it is possible that there is no ')' waiting.
+For example, if the user types in "(4 x" instead of "(4)", the parser
+should emit an error. Because errors can occur, the parser needs a way
+to indicate that they happened. In our parser, we use the camlp4
+shortcut syntax ``token ?? "parse error"``, where if the token before
+the ``??`` does not match, then ``Stream.Error "parse error"`` will be
+raised.
+
+2) Another interesting aspect of this function is that it uses recursion
+by calling ``Parser.parse_primary`` (we will soon see that
+``Parser.parse_primary`` can call ``Parser.parse_primary``). This is
+powerful because it allows us to handle recursive grammars, and keeps
+each production very simple. Note that parentheses do not cause
+construction of AST nodes themselves. While we could do it this way, the
+most important role of parentheses are to guide the parser and provide
+grouping. Once the parser constructs the AST, parentheses are not
+needed.
+
+The next simple production is for handling variable references and
+function calls:
+
+.. code-block:: ocaml
+
+ (* identifierexpr
+ * ::= identifier
+ * ::= identifier '(' argumentexpr ')' *)
+ | [< 'Token.Ident id; stream >] ->
+ let rec parse_args accumulator = parser
+ | [< e=parse_expr; stream >] ->
+ begin parser
+ | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
+ | [< >] -> e :: accumulator
+ end stream
+ | [< >] -> accumulator
+ in
+ let rec parse_ident id = parser
+ (* Call. *)
+ | [< 'Token.Kwd '(';
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')'">] ->
+ Ast.Call (id, Array.of_list (List.rev args))
+
+ (* Simple variable ref. *)
+ | [< >] -> Ast.Variable id
+ in
+ parse_ident id stream
+
+This routine follows the same style as the other routines. (It expects
+to be called if the current token is a ``Token.Ident`` token). It also
+has recursion and error handling. One interesting aspect of this is that
+it uses *look-ahead* to determine if the current identifier is a stand
+alone variable reference or if it is a function call expression. It
+handles this by checking to see if the token after the identifier is a
+'(' token, constructing either a ``Ast.Variable`` or ``Ast.Call`` node
+as appropriate.
+
+We finish up by raising an exception if we received a token we didn't
+expect:
+
+.. code-block:: ocaml
+
+ | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
+
+Now that basic expressions are handled, we need to handle binary
+expressions. They are a bit more complex.
+
+Binary Expression Parsing
+=========================
+
+Binary expressions are significantly harder to parse because they are
+often ambiguous. For example, when given the string "x+y\*z", the parser
+can choose to parse it as either "(x+y)\*z" or "x+(y\*z)". With common
+definitions from mathematics, we expect the later parse, because "\*"
+(multiplication) has higher *precedence* than "+" (addition).
+
+There are many ways to handle this, but an elegant and efficient way is
+to use `Operator-Precedence
+Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_.
+This parsing technique uses the precedence of binary operators to guide
+recursion. To start with, we need a table of precedences:
+
+.. code-block:: ocaml
+
+ (* binop_precedence - This holds the precedence for each binary operator that is
+ * defined *)
+ let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
+
+ (* precedence - Get the precedence of the pending binary operator token. *)
+ let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
+
+ ...
+
+ let main () =
+ (* Install standard binary operators.
+ * 1 is the lowest precedence. *)
+ Hashtbl.add Parser.binop_precedence '<' 10;
+ Hashtbl.add Parser.binop_precedence '+' 20;
+ Hashtbl.add Parser.binop_precedence '-' 20;
+ Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *)
+ ...
+
+For the basic form of Kaleidoscope, we will only support 4 binary
+operators (this can obviously be extended by you, our brave and intrepid
+reader). The ``Parser.precedence`` function returns the precedence for
+the current token, or -1 if the token is not a binary operator. Having a
+``Hashtbl.t`` makes it easy to add new operators and makes it clear that
+the algorithm doesn't depend on the specific operators involved, but it
+would be easy enough to eliminate the ``Hashtbl.t`` and do the
+comparisons in the ``Parser.precedence`` function. (Or just use a
+fixed-size array).
+
+With the helper above defined, we can now start parsing binary
+expressions. The basic idea of operator precedence parsing is to break
+down an expression with potentially ambiguous binary operators into
+pieces. Consider, for example, the expression "a+b+(c+d)\*e\*f+g".
+Operator precedence parsing considers this as a stream of primary
+expressions separated by binary operators. As such, it will first parse
+the leading primary expression "a", then it will see the pairs [+, b]
+[+, (c+d)] [\*, e] [\*, f] and [+, g]. Note that because parentheses are
+primary expressions, the binary expression parser doesn't need to worry
+about nested subexpressions like (c+d) at all.
+
+To start, an expression is a primary expression potentially followed by
+a sequence of [binop,primaryexpr] pairs:
+
+.. code-block:: ocaml
+
+ (* expression
+ * ::= primary binoprhs *)
+ and parse_expr = parser
+ | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
+
+``Parser.parse_bin_rhs`` is the function that parses the sequence of
+pairs for us. It takes a precedence and a pointer to an expression for
+the part that has been parsed so far. Note that "x" is a perfectly valid
+expression: As such, "binoprhs" is allowed to be empty, in which case it
+returns the expression that is passed into it. In our example above, the
+code passes the expression for "a" into ``Parser.parse_bin_rhs`` and the
+current token is "+".
+
+The precedence value passed into ``Parser.parse_bin_rhs`` indicates the
+*minimal operator precedence* that the function is allowed to eat. For
+example, if the current pair stream is [+, x] and
+``Parser.parse_bin_rhs`` is passed in a precedence of 40, it will not
+consume any tokens (because the precedence of '+' is only 20). With this
+in mind, ``Parser.parse_bin_rhs`` starts with:
+
+.. code-block:: ocaml
+
+ (* binoprhs
+ * ::= ('+' primary)* *)
+ and parse_bin_rhs expr_prec lhs stream =
+ match Stream.peek stream with
+ (* If this is a binop, find its precedence. *)
+ | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
+ let token_prec = precedence c in
+
+ (* If this is a binop that binds at least as tightly as the current binop,
+ * consume it, otherwise we are done. *)
+ if token_prec < expr_prec then lhs else begin
+
+This code gets the precedence of the current token and checks to see if
+if is too low. Because we defined invalid tokens to have a precedence of
+-1, this check implicitly knows that the pair-stream ends when the token
+stream runs out of binary operators. If this check succeeds, we know
+that the token is a binary operator and that it will be included in this
+expression:
+
+.. code-block:: ocaml
+
+ (* Eat the binop. *)
+ Stream.junk stream;
+
+ (* Parse the primary expression after the binary operator *)
+ let rhs = parse_primary stream in
+
+ (* Okay, we know this is a binop. *)
+ let rhs =
+ match Stream.peek stream with
+ | Some (Token.Kwd c2) ->
+
+As such, this code eats (and remembers) the binary operator and then
+parses the primary expression that follows. This builds up the whole
+pair, the first of which is [+, b] for the running example.
+
+Now that we parsed the left-hand side of an expression and one pair of
+the RHS sequence, we have to decide which way the expression associates.
+In particular, we could have "(a+b) binop unparsed" or "a + (b binop
+unparsed)". To determine this, we look ahead at "binop" to determine its
+precedence and compare it to BinOp's precedence (which is '+' in this
+case):
+
+.. code-block:: ocaml
+
+ (* If BinOp binds less tightly with rhs than the operator after
+ * rhs, let the pending operator take rhs as its lhs. *)
+ let next_prec = precedence c2 in
+ if token_prec < next_prec
+
+If the precedence of the binop to the right of "RHS" is lower or equal
+to the precedence of our current operator, then we know that the
+parentheses associate as "(a+b) binop ...". In our example, the current
+operator is "+" and the next operator is "+", we know that they have the
+same precedence. In this case we'll create the AST node for "a+b", and
+then continue parsing:
+
+.. code-block:: ocaml
+
+ ... if body omitted ...
+ in
+
+ (* Merge lhs/rhs. *)
+ let lhs = Ast.Binary (c, lhs, rhs) in
+ parse_bin_rhs expr_prec lhs stream
+ end
+
+In our example above, this will turn "a+b+" into "(a+b)" and execute the
+next iteration of the loop, with "+" as the current token. The code
+above will eat, remember, and parse "(c+d)" as the primary expression,
+which makes the current pair equal to [+, (c+d)]. It will then evaluate
+the 'if' conditional above with "\*" as the binop to the right of the
+primary. In this case, the precedence of "\*" is higher than the
+precedence of "+" so the if condition will be entered.
+
+The critical question left here is "how can the if condition parse the
+right hand side in full"? In particular, to build the AST correctly for
+our example, it needs to get all of "(c+d)\*e\*f" as the RHS expression
+variable. The code to do this is surprisingly simple (code from the
+above two blocks duplicated for context):
+
+.. code-block:: ocaml
+
+ match Stream.peek stream with
+ | Some (Token.Kwd c2) ->
+ (* If BinOp binds less tightly with rhs than the operator after
+ * rhs, let the pending operator take rhs as its lhs. *)
+ if token_prec < precedence c2
+ then parse_bin_rhs (token_prec + 1) rhs stream
+ else rhs
+ | _ -> rhs
+ in
+
+ (* Merge lhs/rhs. *)
+ let lhs = Ast.Binary (c, lhs, rhs) in
+ parse_bin_rhs expr_prec lhs stream
+ end
+
+At this point, we know that the binary operator to the RHS of our
+primary has higher precedence than the binop we are currently parsing.
+As such, we know that any sequence of pairs whose operators are all
+higher precedence than "+" should be parsed together and returned as
+"RHS". To do this, we recursively invoke the ``Parser.parse_bin_rhs``
+function specifying "token\_prec+1" as the minimum precedence required
+for it to continue. In our example above, this will cause it to return
+the AST node for "(c+d)\*e\*f" as RHS, which is then set as the RHS of
+the '+' expression.
+
+Finally, on the next iteration of the while loop, the "+g" piece is
+parsed and added to the AST. With this little bit of code (14
+non-trivial lines), we correctly handle fully general binary expression
+parsing in a very elegant way. This was a whirlwind tour of this code,
+and it is somewhat subtle. I recommend running through it with a few
+tough examples to see how it works.
+
+This wraps up handling of expressions. At this point, we can point the
+parser at an arbitrary token stream and build an expression from it,
+stopping at the first token that is not part of the expression. Next up
+we need to handle function definitions, etc.
+
+Parsing the Rest
+================
+
+The next thing missing is handling of function prototypes. In
+Kaleidoscope, these are used both for 'extern' function declarations as
+well as function body definitions. The code to do this is
+straight-forward and not very interesting (once you've survived
+expressions):
+
+.. code-block:: ocaml
+
+ (* prototype
+ * ::= id '(' id* ')' *)
+ let parse_prototype =
+ let rec parse_args accumulator = parser
+ | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
+ | [< >] -> accumulator
+ in
+
+ parser
+ | [< 'Token.Ident id;
+ 'Token.Kwd '(' ?? "expected '(' in prototype";
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
+ (* success. *)
+ Ast.Prototype (id, Array.of_list (List.rev args))
+
+ | [< >] ->
+ raise (Stream.Error "expected function name in prototype")
+
+Given this, a function definition is very simple, just a prototype plus
+an expression to implement the body:
+
+.. code-block:: ocaml
+
+ (* definition ::= 'def' prototype expression *)
+ let parse_definition = parser
+ | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
+ Ast.Function (p, e)
+
+In addition, we support 'extern' to declare functions like 'sin' and
+'cos' as well as to support forward declaration of user functions. These
+'extern's are just prototypes with no body:
+
+.. code-block:: ocaml
+
+ (* external ::= 'extern' prototype *)
+ let parse_extern = parser
+ | [< 'Token.Extern; e=parse_prototype >] -> e
+
+Finally, we'll also let the user type in arbitrary top-level expressions
+and evaluate them on the fly. We will handle this by defining anonymous
+nullary (zero argument) functions for them:
+
+.. code-block:: ocaml
+
+ (* toplevelexpr ::= expression *)
+ let parse_toplevel = parser
+ | [< e=parse_expr >] ->
+ (* Make an anonymous proto. *)
+ Ast.Function (Ast.Prototype ("", [||]), e)
+
+Now that we have all the pieces, let's build a little driver that will
+let us actually *execute* this code we've built!
+
+The Driver
+==========
+
+The driver for this simply invokes all of the parsing pieces with a
+top-level dispatch loop. There isn't much interesting here, so I'll just
+include the top-level loop. See `below <#full-code-listing>`_ for full code in the
+"Top-Level Parsing" section.
+
+.. code-block:: ocaml
+
+ (* top ::= definition | external | expression | ';' *)
+ let rec main_loop stream =
+ match Stream.peek stream with
+ | None -> ()
+
+ (* ignore top-level semicolons. *)
+ | Some (Token.Kwd ';') ->
+ Stream.junk stream;
+ main_loop stream
+
+ | Some token ->
+ begin
+ try match token with
+ | Token.Def ->
+ ignore(Parser.parse_definition stream);
+ print_endline "parsed a function definition.";
+ | Token.Extern ->
+ ignore(Parser.parse_extern stream);
+ print_endline "parsed an extern.";
+ | _ ->
+ (* Evaluate a top-level expression into an anonymous function. *)
+ ignore(Parser.parse_toplevel stream);
+ print_endline "parsed a top-level expr";
+ with Stream.Error s ->
+ (* Skip token for error recovery. *)
+ Stream.junk stream;
+ print_endline s;
+ end;
+ print_string "ready> "; flush stdout;
+ main_loop stream
+
+The most interesting part of this is that we ignore top-level
+semicolons. Why is this, you ask? The basic reason is that if you type
+"4 + 5" at the command line, the parser doesn't know whether that is the
+end of what you will type or not. For example, on the next line you
+could type "def foo..." in which case 4+5 is the end of a top-level
+expression. Alternatively you could type "\* 6", which would continue
+the expression. Having top-level semicolons allows you to type "4+5;",
+and the parser will know you are done.
+
+Conclusions
+===========
+
+With just under 300 lines of commented code (240 lines of non-comment,
+non-blank code), we fully defined our minimal language, including a
+lexer, parser, and AST builder. With this done, the executable will
+validate Kaleidoscope code and tell us if it is grammatically invalid.
+For example, here is a sample interaction:
+
+.. code-block:: bash
+
+ $ ./toy.byte
+ ready> def foo(x y) x+foo(y, 4.0);
+ Parsed a function definition.
+ ready> def foo(x y) x+y y;
+ Parsed a function definition.
+ Parsed a top-level expr
+ ready> def foo(x y) x+y );
+ Parsed a function definition.
+ Error: unknown token when expecting an expression
+ ready> extern sin(a);
+ ready> Parsed an extern
+ ready> ^D
+ $
+
+There is a lot of room for extension here. You can define new AST nodes,
+extend the language in many ways, etc. In the `next
+installment <OCamlLangImpl3.html>`_, we will describe how to generate
+LLVM Intermediate Representation (IR) from the AST.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for this and the previous chapter.
+Note that it is fully self-contained: you don't need LLVM or any
+external libraries at all for this. (Besides the ocaml standard
+libraries, of course.) To build this, just compile with:
+
+.. code-block:: bash
+
+ # Compile
+ ocamlbuild toy.byte
+ # Run
+ ./toy.byte
+
+Here is the code:
+
+\_tags:
+ ::
+
+ <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
+
+token.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Lexer Tokens
+ *===----------------------------------------------------------------------===*)
+
+ (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
+ * these others for known things. *)
+ type token =
+ (* commands *)
+ | Def | Extern
+
+ (* primary *)
+ | Ident of string | Number of float
+
+ (* unknown *)
+ | Kwd of char
+
+lexer.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Lexer
+ *===----------------------------------------------------------------------===*)
+
+ let rec lex = parser
+ (* Skip any whitespace. *)
+ | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
+
+ (* identifier: [a-zA-Z][a-zA-Z0-9] *)
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+
+ (* number: [0-9.]+ *)
+ | [< ' ('0' .. '9' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+
+ (* Comment until end of line. *)
+ | [< ' ('#'); stream >] ->
+ lex_comment stream
+
+ (* Otherwise, just return the character as its ascii value. *)
+ | [< 'c; stream >] ->
+ [< 'Token.Kwd c; lex stream >]
+
+ (* end of stream. *)
+ | [< >] -> [< >]
+
+ and lex_number buffer = parser
+ | [< ' ('0' .. '9' | '.' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+ | [< stream=lex >] ->
+ [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
+
+ and lex_ident buffer = parser
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+ | [< stream=lex >] ->
+ match Buffer.contents buffer with
+ | "def" -> [< 'Token.Def; stream >]
+ | "extern" -> [< 'Token.Extern; stream >]
+ | id -> [< 'Token.Ident id; stream >]
+
+ and lex_comment = parser
+ | [< ' ('\n'); stream=lex >] -> stream
+ | [< 'c; e=lex_comment >] -> e
+ | [< >] -> [< >]
+
+ast.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Abstract Syntax Tree (aka Parse Tree)
+ *===----------------------------------------------------------------------===*)
+
+ (* expr - Base type for all expression nodes. *)
+ type expr =
+ (* variant for numeric literals like "1.0". *)
+ | Number of float
+
+ (* variant for referencing a variable, like "a". *)
+ | Variable of string
+
+ (* variant for a binary operator. *)
+ | Binary of char * expr * expr
+
+ (* variant for function calls. *)
+ | Call of string * expr array
+
+ (* proto - This type represents the "prototype" for a function, which captures
+ * its name, and its argument names (thus implicitly the number of arguments the
+ * function takes). *)
+ type proto = Prototype of string * string array
+
+ (* func - This type represents a function definition itself. *)
+ type func = Function of proto * expr
+
+parser.ml:
+ .. code-block:: ocaml
+
+ (*===---------------------------------------------------------------------===
+ * Parser
+ *===---------------------------------------------------------------------===*)
+
+ (* binop_precedence - This holds the precedence for each binary operator that is
+ * defined *)
+ let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
+
+ (* precedence - Get the precedence of the pending binary operator token. *)
+ let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
+
+ (* primary
+ * ::= identifier
+ * ::= numberexpr
+ * ::= parenexpr *)
+ let rec parse_primary = parser
+ (* numberexpr ::= number *)
+ | [< 'Token.Number n >] -> Ast.Number n
+
+ (* parenexpr ::= '(' expression ')' *)
+ | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
+
+ (* identifierexpr
+ * ::= identifier
+ * ::= identifier '(' argumentexpr ')' *)
+ | [< 'Token.Ident id; stream >] ->
+ let rec parse_args accumulator = parser
+ | [< e=parse_expr; stream >] ->
+ begin parser
+ | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
+ | [< >] -> e :: accumulator
+ end stream
+ | [< >] -> accumulator
+ in
+ let rec parse_ident id = parser
+ (* Call. *)
+ | [< 'Token.Kwd '(';
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')'">] ->
+ Ast.Call (id, Array.of_list (List.rev args))
+
+ (* Simple variable ref. *)
+ | [< >] -> Ast.Variable id
+ in
+ parse_ident id stream
+
+ | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
+
+ (* binoprhs
+ * ::= ('+' primary)* *)
+ and parse_bin_rhs expr_prec lhs stream =
+ match Stream.peek stream with
+ (* If this is a binop, find its precedence. *)
+ | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
+ let token_prec = precedence c in
+
+ (* If this is a binop that binds at least as tightly as the current binop,
+ * consume it, otherwise we are done. *)
+ if token_prec < expr_prec then lhs else begin
+ (* Eat the binop. *)
+ Stream.junk stream;
+
+ (* Parse the primary expression after the binary operator. *)
+ let rhs = parse_primary stream in
+
+ (* Okay, we know this is a binop. *)
+ let rhs =
+ match Stream.peek stream with
+ | Some (Token.Kwd c2) ->
+ (* If BinOp binds less tightly with rhs than the operator after
+ * rhs, let the pending operator take rhs as its lhs. *)
+ let next_prec = precedence c2 in
+ if token_prec < next_prec
+ then parse_bin_rhs (token_prec + 1) rhs stream
+ else rhs
+ | _ -> rhs
+ in
+
+ (* Merge lhs/rhs. *)
+ let lhs = Ast.Binary (c, lhs, rhs) in
+ parse_bin_rhs expr_prec lhs stream
+ end
+ | _ -> lhs
+
+ (* expression
+ * ::= primary binoprhs *)
+ and parse_expr = parser
+ | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
+
+ (* prototype
+ * ::= id '(' id* ')' *)
+ let parse_prototype =
+ let rec parse_args accumulator = parser
+ | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
+ | [< >] -> accumulator
+ in
+
+ parser
+ | [< 'Token.Ident id;
+ 'Token.Kwd '(' ?? "expected '(' in prototype";
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
+ (* success. *)
+ Ast.Prototype (id, Array.of_list (List.rev args))
+
+ | [< >] ->
+ raise (Stream.Error "expected function name in prototype")
+
+ (* definition ::= 'def' prototype expression *)
+ let parse_definition = parser
+ | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
+ Ast.Function (p, e)
+
+ (* toplevelexpr ::= expression *)
+ let parse_toplevel = parser
+ | [< e=parse_expr >] ->
+ (* Make an anonymous proto. *)
+ Ast.Function (Ast.Prototype ("", [||]), e)
+
+ (* external ::= 'extern' prototype *)
+ let parse_extern = parser
+ | [< 'Token.Extern; e=parse_prototype >] -> e
+
+toplevel.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Top-Level parsing and JIT Driver
+ *===----------------------------------------------------------------------===*)
+
+ (* top ::= definition | external | expression | ';' *)
+ let rec main_loop stream =
+ match Stream.peek stream with
+ | None -> ()
+
+ (* ignore top-level semicolons. *)
+ | Some (Token.Kwd ';') ->
+ Stream.junk stream;
+ main_loop stream
+
+ | Some token ->
+ begin
+ try match token with
+ | Token.Def ->
+ ignore(Parser.parse_definition stream);
+ print_endline "parsed a function definition.";
+ | Token.Extern ->
+ ignore(Parser.parse_extern stream);
+ print_endline "parsed an extern.";
+ | _ ->
+ (* Evaluate a top-level expression into an anonymous function. *)
+ ignore(Parser.parse_toplevel stream);
+ print_endline "parsed a top-level expr";
+ with Stream.Error s ->
+ (* Skip token for error recovery. *)
+ Stream.junk stream;
+ print_endline s;
+ end;
+ print_string "ready> "; flush stdout;
+ main_loop stream
+
+toy.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Main driver code.
+ *===----------------------------------------------------------------------===*)
+
+ let main () =
+ (* Install standard binary operators.
+ * 1 is the lowest precedence. *)
+ Hashtbl.add Parser.binop_precedence '<' 10;
+ Hashtbl.add Parser.binop_precedence '+' 20;
+ Hashtbl.add Parser.binop_precedence '-' 20;
+ Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *)
+
+ (* Prime the first token. *)
+ print_string "ready> "; flush stdout;
+ let stream = Lexer.lex (Stream.of_channel stdin) in
+
+ (* Run the main "interpreter loop" now. *)
+ Toplevel.main_loop stream;
+ ;;
+
+ main ()
+
+`Next: Implementing Code Generation to LLVM IR <OCamlLangImpl3.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl3.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl3.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl3.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl3.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,961 @@
+========================================
+Kaleidoscope: Code generation to LLVM IR
+========================================
+
+.. contents::
+ :local:
+
+Chapter 3 Introduction
+======================
+
+Welcome to Chapter 3 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. This chapter shows you how to transform
+the `Abstract Syntax Tree <OCamlLangImpl2.html>`_, built in Chapter 2,
+into LLVM IR. This will teach you a little bit about how LLVM does
+things, as well as demonstrate how easy it is to use. It's much more
+work to build a lexer and parser than it is to generate LLVM IR code. :)
+
+**Please note**: the code in this chapter and later require LLVM 2.3 or
+LLVM SVN to work. LLVM 2.2 and before will not work with it.
+
+Code Generation Setup
+=====================
+
+In order to generate LLVM IR, we want some simple setup to get started.
+First we define virtual code generation (codegen) methods in each AST
+class:
+
+.. code-block:: ocaml
+
+ let rec codegen_expr = function
+ | Ast.Number n -> ...
+ | Ast.Variable name -> ...
+
+The ``Codegen.codegen_expr`` function says to emit IR for that AST node
+along with all the things it depends on, and they all return an LLVM
+Value object. "Value" is the class used to represent a "`Static Single
+Assignment
+(SSA) <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+register" or "SSA value" in LLVM. The most distinct aspect of SSA values
+is that their value is computed as the related instruction executes, and
+it does not get a new value until (and if) the instruction re-executes.
+In other words, there is no way to "change" an SSA value. For more
+information, please read up on `Static Single
+Assignment <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+- the concepts are really quite natural once you grok them.
+
+The second thing we want is an "Error" exception like we used for the
+parser, which will be used to report errors found during code generation
+(for example, use of an undeclared parameter):
+
+.. code-block:: ocaml
+
+ exception Error of string
+
+ let context = global_context ()
+ let the_module = create_module context "my cool jit"
+ let builder = builder context
+ let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
+ let double_type = double_type context
+
+The static variables will be used during code generation.
+``Codgen.the_module`` is the LLVM construct that contains all of the
+functions and global variables in a chunk of code. In many ways, it is
+the top-level structure that the LLVM IR uses to contain code.
+
+The ``Codegen.builder`` object is a helper object that makes it easy to
+generate LLVM instructions. Instances of the
+`IRBuilder <http://llvm.org/doxygen/IRBuilder_8h-source.html>`_
+class keep track of the current place to insert instructions and has
+methods to create new instructions.
+
+The ``Codegen.named_values`` map keeps track of which values are defined
+in the current scope and what their LLVM representation is. (In other
+words, it is a symbol table for the code). In this form of Kaleidoscope,
+the only things that can be referenced are function parameters. As such,
+function parameters will be in this map when generating code for their
+function body.
+
+With these basics in place, we can start talking about how to generate
+code for each expression. Note that this assumes that the
+``Codgen.builder`` has been set up to generate code *into* something.
+For now, we'll assume that this has already been done, and we'll just
+use it to emit code.
+
+Expression Code Generation
+==========================
+
+Generating LLVM code for expression nodes is very straightforward: less
+than 30 lines of commented code for all four of our expression nodes.
+First we'll do numeric literals:
+
+.. code-block:: ocaml
+
+ | Ast.Number n -> const_float double_type n
+
+In the LLVM IR, numeric constants are represented with the
+``ConstantFP`` class, which holds the numeric value in an ``APFloat``
+internally (``APFloat`` has the capability of holding floating point
+constants of Arbitrary Precision). This code basically just creates
+and returns a ``ConstantFP``. Note that in the LLVM IR that constants
+are all uniqued together and shared. For this reason, the API uses "the
+foo::get(..)" idiom instead of "new foo(..)" or "foo::Create(..)".
+
+.. code-block:: ocaml
+
+ | Ast.Variable name ->
+ (try Hashtbl.find named_values name with
+ | Not_found -> raise (Error "unknown variable name"))
+
+References to variables are also quite simple using LLVM. In the simple
+version of Kaleidoscope, we assume that the variable has already been
+emitted somewhere and its value is available. In practice, the only
+values that can be in the ``Codegen.named_values`` map are function
+arguments. This code simply checks to see that the specified name is in
+the map (if not, an unknown variable is being referenced) and returns
+the value for it. In future chapters, we'll add support for `loop
+induction variables <LangImpl5.html#for-loop-expression>`_ in the symbol table, and for
+`local variables <LangImpl7.html#user-defined-local-variables>`_.
+
+.. code-block:: ocaml
+
+ | Ast.Binary (op, lhs, rhs) ->
+ let lhs_val = codegen_expr lhs in
+ let rhs_val = codegen_expr rhs in
+ begin
+ match op with
+ | '+' -> build_fadd lhs_val rhs_val "addtmp" builder
+ | '-' -> build_fsub lhs_val rhs_val "subtmp" builder
+ | '*' -> build_fmul lhs_val rhs_val "multmp" builder
+ | '<' ->
+ (* Convert bool 0/1 to double 0.0 or 1.0 *)
+ let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
+ build_uitofp i double_type "booltmp" builder
+ | _ -> raise (Error "invalid binary operator")
+ end
+
+Binary operators start to get more interesting. The basic idea here is
+that we recursively emit code for the left-hand side of the expression,
+then the right-hand side, then we compute the result of the binary
+expression. In this code, we do a simple switch on the opcode to create
+the right LLVM instruction.
+
+In the example above, the LLVM builder class is starting to show its
+value. IRBuilder knows where to insert the newly created instruction,
+all you have to do is specify what instruction to create (e.g. with
+``Llvm.create_add``), which operands to use (``lhs`` and ``rhs`` here)
+and optionally provide a name for the generated instruction.
+
+One nice thing about LLVM is that the name is just a hint. For instance,
+if the code above emits multiple "addtmp" variables, LLVM will
+automatically provide each one with an increasing, unique numeric
+suffix. Local value names for instructions are purely optional, but it
+makes it much easier to read the IR dumps.
+
+`LLVM instructions <../LangRef.html#instruction-reference>`_ are constrained by strict
+rules: for example, the Left and Right operators of an `add
+instruction <../LangRef.html#add-instruction>`_ must have the same type, and the
+result type of the add must match the operand types. Because all values
+in Kaleidoscope are doubles, this makes for very simple code for add,
+sub and mul.
+
+On the other hand, LLVM specifies that the `fcmp
+instruction <../LangRef.html#fcmp-instruction>`_ always returns an 'i1' value (a
+one bit integer). The problem with this is that Kaleidoscope wants the
+value to be a 0.0 or 1.0 value. In order to get these semantics, we
+combine the fcmp instruction with a `uitofp
+instruction <../LangRef.html#uitofp-to-instruction>`_. This instruction converts its
+input integer into a floating point value by treating the input as an
+unsigned value. In contrast, if we used the `sitofp
+instruction <../LangRef.html#sitofp-to-instruction>`_, the Kaleidoscope '<' operator
+would return 0.0 and -1.0, depending on the input value.
+
+.. code-block:: ocaml
+
+ | Ast.Call (callee, args) ->
+ (* Look up the name in the module table. *)
+ let callee =
+ match lookup_function callee the_module with
+ | Some callee -> callee
+ | None -> raise (Error "unknown function referenced")
+ in
+ let params = params callee in
+
+ (* If argument mismatch error. *)
+ if Array.length params == Array.length args then () else
+ raise (Error "incorrect # arguments passed");
+ let args = Array.map codegen_expr args in
+ build_call callee args "calltmp" builder
+
+Code generation for function calls is quite straightforward with LLVM.
+The code above initially does a function name lookup in the LLVM
+Module's symbol table. Recall that the LLVM Module is the container that
+holds all of the functions we are JIT'ing. By giving each function the
+same name as what the user specifies, we can use the LLVM symbol table
+to resolve function names for us.
+
+Once we have the function to call, we recursively codegen each argument
+that is to be passed in, and create an LLVM `call
+instruction <../LangRef.html#call-instruction>`_. Note that LLVM uses the native C
+calling conventions by default, allowing these calls to also call into
+standard library functions like "sin" and "cos", with no additional
+effort.
+
+This wraps up our handling of the four basic expressions that we have so
+far in Kaleidoscope. Feel free to go in and add some more. For example,
+by browsing the `LLVM language reference <../LangRef.html>`_ you'll find
+several other interesting instructions that are really easy to plug into
+our basic framework.
+
+Function Code Generation
+========================
+
+Code generation for prototypes and functions must handle a number of
+details, which make their code less beautiful than expression code
+generation, but allows us to illustrate some important points. First,
+lets talk about code generation for prototypes: they are used both for
+function bodies and external function declarations. The code starts
+with:
+
+.. code-block:: ocaml
+
+ let codegen_proto = function
+ | Ast.Prototype (name, args) ->
+ (* Make the function type: double(double,double) etc. *)
+ let doubles = Array.make (Array.length args) double_type in
+ let ft = function_type double_type doubles in
+ let f =
+ match lookup_function name the_module with
+
+This code packs a lot of power into a few lines. Note first that this
+function returns a "Function\*" instead of a "Value\*" (although at the
+moment they both are modeled by ``llvalue`` in ocaml). Because a
+"prototype" really talks about the external interface for a function
+(not the value computed by an expression), it makes sense for it to
+return the LLVM Function it corresponds to when codegen'd.
+
+The call to ``Llvm.function_type`` creates the ``Llvm.llvalue`` that
+should be used for a given Prototype. Since all function arguments in
+Kaleidoscope are of type double, the first line creates a vector of "N"
+LLVM double types. It then uses the ``Llvm.function_type`` method to
+create a function type that takes "N" doubles as arguments, returns one
+double as a result, and that is not vararg (that uses the function
+``Llvm.var_arg_function_type``). Note that Types in LLVM are uniqued
+just like ``Constant``'s are, so you don't "new" a type, you "get" it.
+
+The final line above checks if the function has already been defined in
+``Codegen.the_module``. If not, we will create it.
+
+.. code-block:: ocaml
+
+ | None -> declare_function name ft the_module
+
+This indicates the type and name to use, as well as which module to
+insert into. By default we assume a function has
+``Llvm.Linkage.ExternalLinkage``. "`external
+linkage <../LangRef.html#linkage>`_" means that the function may be defined
+outside the current module and/or that it is callable by functions
+outside the module. The "``name``" passed in is the name the user
+specified: this name is registered in "``Codegen.the_module``"s symbol
+table, which is used by the function call code above.
+
+In Kaleidoscope, I choose to allow redefinitions of functions in two
+cases: first, we want to allow 'extern'ing a function more than once, as
+long as the prototypes for the externs match (since all arguments have
+the same type, we just have to check that the number of arguments
+match). Second, we want to allow 'extern'ing a function and then
+defining a body for it. This is useful when defining mutually recursive
+functions.
+
+.. code-block:: ocaml
+
+ (* If 'f' conflicted, there was already something named 'name'. If it
+ * has a body, don't allow redefinition or reextern. *)
+ | Some f ->
+ (* If 'f' already has a body, reject this. *)
+ if Array.length (basic_blocks f) == 0 then () else
+ raise (Error "redefinition of function");
+
+ (* If 'f' took a different number of arguments, reject. *)
+ if Array.length (params f) == Array.length args then () else
+ raise (Error "redefinition of function with different # args");
+ f
+ in
+
+In order to verify the logic above, we first check to see if the
+pre-existing function is "empty". In this case, empty means that it has
+no basic blocks in it, which means it has no body. If it has no body, it
+is a forward declaration. Since we don't allow anything after a full
+definition of the function, the code rejects this case. If the previous
+reference to a function was an 'extern', we simply verify that the
+number of arguments for that definition and this one match up. If not,
+we emit an error.
+
+.. code-block:: ocaml
+
+ (* Set names for all arguments. *)
+ Array.iteri (fun i a ->
+ let n = args.(i) in
+ set_value_name n a;
+ Hashtbl.add named_values n a;
+ ) (params f);
+ f
+
+The last bit of code for prototypes loops over all of the arguments in
+the function, setting the name of the LLVM Argument objects to match,
+and registering the arguments in the ``Codegen.named_values`` map for
+future use by the ``Ast.Variable`` variant. Once this is set up, it
+returns the Function object to the caller. Note that we don't check for
+conflicting argument names here (e.g. "extern foo(a b a)"). Doing so
+would be very straight-forward with the mechanics we have already used
+above.
+
+.. code-block:: ocaml
+
+ let codegen_func = function
+ | Ast.Function (proto, body) ->
+ Hashtbl.clear named_values;
+ let the_function = codegen_proto proto in
+
+Code generation for function definitions starts out simply enough: we
+just codegen the prototype (Proto) and verify that it is ok. We then
+clear out the ``Codegen.named_values`` map to make sure that there isn't
+anything in it from the last function we compiled. Code generation of
+the prototype ensures that there is an LLVM Function object that is
+ready to go for us.
+
+.. code-block:: ocaml
+
+ (* Create a new basic block to start insertion into. *)
+ let bb = append_block context "entry" the_function in
+ position_at_end bb builder;
+
+ try
+ let ret_val = codegen_expr body in
+
+Now we get to the point where the ``Codegen.builder`` is set up. The
+first line creates a new `basic
+block <http://en.wikipedia.org/wiki/Basic_block>`_ (named "entry"),
+which is inserted into ``the_function``. The second line then tells the
+builder that new instructions should be inserted into the end of the new
+basic block. Basic blocks in LLVM are an important part of functions
+that define the `Control Flow
+Graph <http://en.wikipedia.org/wiki/Control_flow_graph>`_. Since we
+don't have any control flow, our functions will only contain one block
+at this point. We'll fix this in `Chapter 5 <OCamlLangImpl5.html>`_ :).
+
+.. code-block:: ocaml
+
+ let ret_val = codegen_expr body in
+
+ (* Finish off the function. *)
+ let _ = build_ret ret_val builder in
+
+ (* Validate the generated code, checking for consistency. *)
+ Llvm_analysis.assert_valid_function the_function;
+
+ the_function
+
+Once the insertion point is set up, we call the ``Codegen.codegen_func``
+method for the root expression of the function. If no error happens,
+this emits code to compute the expression into the entry block and
+returns the value that was computed. Assuming no error, we then create
+an LLVM `ret instruction <../LangRef.html#ret-instruction>`_, which completes the
+function. Once the function is built, we call
+``Llvm_analysis.assert_valid_function``, which is provided by LLVM. This
+function does a variety of consistency checks on the generated code, to
+determine if our compiler is doing everything right. Using this is
+important: it can catch a lot of bugs. Once the function is finished and
+validated, we return it.
+
+.. code-block:: ocaml
+
+ with e ->
+ delete_function the_function;
+ raise e
+
+The only piece left here is handling of the error case. For simplicity,
+we handle this by merely deleting the function we produced with the
+``Llvm.delete_function`` method. This allows the user to redefine a
+function that they incorrectly typed in before: if we didn't delete it,
+it would live in the symbol table, with a body, preventing future
+redefinition.
+
+This code does have a bug, though. Since the ``Codegen.codegen_proto``
+can return a previously defined forward declaration, our code can
+actually delete a forward declaration. There are a number of ways to fix
+this bug, see what you can come up with! Here is a testcase:
+
+::
+
+ extern foo(a b); # ok, defines foo.
+ def foo(a b) c; # error, 'c' is invalid.
+ def bar() foo(1, 2); # error, unknown function "foo"
+
+Driver Changes and Closing Thoughts
+===================================
+
+For now, code generation to LLVM doesn't really get us much, except that
+we can look at the pretty IR calls. The sample code inserts calls to
+Codegen into the "``Toplevel.main_loop``", and then dumps out the LLVM
+IR. This gives a nice way to look at the LLVM IR for simple functions.
+For example:
+
+::
+
+ ready> 4+5;
+ Read top-level expression:
+ define double @""() {
+ entry:
+ %addtmp = fadd double 4.000000e+00, 5.000000e+00
+ ret double %addtmp
+ }
+
+Note how the parser turns the top-level expression into anonymous
+functions for us. This will be handy when we add `JIT
+support <OCamlLangImpl4.html#adding-a-jit-compiler>`_ in the next chapter. Also note that
+the code is very literally transcribed, no optimizations are being
+performed. We will `add
+optimizations <OCamlLangImpl4.html#trivial-constant-folding>`_ explicitly in the
+next chapter.
+
+::
+
+ ready> def foo(a b) a*a + 2*a*b + b*b;
+ Read function definition:
+ define double @foo(double %a, double %b) {
+ entry:
+ %multmp = fmul double %a, %a
+ %multmp1 = fmul double 2.000000e+00, %a
+ %multmp2 = fmul double %multmp1, %b
+ %addtmp = fadd double %multmp, %multmp2
+ %multmp3 = fmul double %b, %b
+ %addtmp4 = fadd double %addtmp, %multmp3
+ ret double %addtmp4
+ }
+
+This shows some simple arithmetic. Notice the striking similarity to the
+LLVM builder calls that we use to create the instructions.
+
+::
+
+ ready> def bar(a) foo(a, 4.0) + bar(31337);
+ Read function definition:
+ define double @bar(double %a) {
+ entry:
+ %calltmp = call double @foo(double %a, double 4.000000e+00)
+ %calltmp1 = call double @bar(double 3.133700e+04)
+ %addtmp = fadd double %calltmp, %calltmp1
+ ret double %addtmp
+ }
+
+This shows some function calls. Note that this function will take a long
+time to execute if you call it. In the future we'll add conditional
+control flow to actually make recursion useful :).
+
+::
+
+ ready> extern cos(x);
+ Read extern:
+ declare double @cos(double)
+
+ ready> cos(1.234);
+ Read top-level expression:
+ define double @""() {
+ entry:
+ %calltmp = call double @cos(double 1.234000e+00)
+ ret double %calltmp
+ }
+
+This shows an extern for the libm "cos" function, and a call to it.
+
+::
+
+ ready> ^D
+ ; ModuleID = 'my cool jit'
+
+ define double @""() {
+ entry:
+ %addtmp = fadd double 4.000000e+00, 5.000000e+00
+ ret double %addtmp
+ }
+
+ define double @foo(double %a, double %b) {
+ entry:
+ %multmp = fmul double %a, %a
+ %multmp1 = fmul double 2.000000e+00, %a
+ %multmp2 = fmul double %multmp1, %b
+ %addtmp = fadd double %multmp, %multmp2
+ %multmp3 = fmul double %b, %b
+ %addtmp4 = fadd double %addtmp, %multmp3
+ ret double %addtmp4
+ }
+
+ define double @bar(double %a) {
+ entry:
+ %calltmp = call double @foo(double %a, double 4.000000e+00)
+ %calltmp1 = call double @bar(double 3.133700e+04)
+ %addtmp = fadd double %calltmp, %calltmp1
+ ret double %addtmp
+ }
+
+ declare double @cos(double)
+
+ define double @""() {
+ entry:
+ %calltmp = call double @cos(double 1.234000e+00)
+ ret double %calltmp
+ }
+
+When you quit the current demo, it dumps out the IR for the entire
+module generated. Here you can see the big picture with all the
+functions referencing each other.
+
+This wraps up the third chapter of the Kaleidoscope tutorial. Up next,
+we'll describe how to `add JIT codegen and optimizer
+support <OCamlLangImpl4.html>`_ to this so we can actually start running
+code!
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the LLVM code generator. Because this uses the LLVM libraries, we need
+to link them in. To do this, we use the
+`llvm-config <http://llvm.org/cmds/llvm-config.html>`_ tool to inform
+our makefile/command line about which options to use:
+
+.. code-block:: bash
+
+ # Compile
+ ocamlbuild toy.byte
+ # Run
+ ./toy.byte
+
+Here is the code:
+
+\_tags:
+ ::
+
+ <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
+ <*.{byte,native}>: g++, use_llvm, use_llvm_analysis
+
+myocamlbuild.ml:
+ .. code-block:: ocaml
+
+ open Ocamlbuild_plugin;;
+
+ ocaml_lib ~extern:true "llvm";;
+ ocaml_lib ~extern:true "llvm_analysis";;
+
+ flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
+
+token.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Lexer Tokens
+ *===----------------------------------------------------------------------===*)
+
+ (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
+ * these others for known things. *)
+ type token =
+ (* commands *)
+ | Def | Extern
+
+ (* primary *)
+ | Ident of string | Number of float
+
+ (* unknown *)
+ | Kwd of char
+
+lexer.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Lexer
+ *===----------------------------------------------------------------------===*)
+
+ let rec lex = parser
+ (* Skip any whitespace. *)
+ | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
+
+ (* identifier: [a-zA-Z][a-zA-Z0-9] *)
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+
+ (* number: [0-9.]+ *)
+ | [< ' ('0' .. '9' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+
+ (* Comment until end of line. *)
+ | [< ' ('#'); stream >] ->
+ lex_comment stream
+
+ (* Otherwise, just return the character as its ascii value. *)
+ | [< 'c; stream >] ->
+ [< 'Token.Kwd c; lex stream >]
+
+ (* end of stream. *)
+ | [< >] -> [< >]
+
+ and lex_number buffer = parser
+ | [< ' ('0' .. '9' | '.' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+ | [< stream=lex >] ->
+ [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
+
+ and lex_ident buffer = parser
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+ | [< stream=lex >] ->
+ match Buffer.contents buffer with
+ | "def" -> [< 'Token.Def; stream >]
+ | "extern" -> [< 'Token.Extern; stream >]
+ | id -> [< 'Token.Ident id; stream >]
+
+ and lex_comment = parser
+ | [< ' ('\n'); stream=lex >] -> stream
+ | [< 'c; e=lex_comment >] -> e
+ | [< >] -> [< >]
+
+ast.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Abstract Syntax Tree (aka Parse Tree)
+ *===----------------------------------------------------------------------===*)
+
+ (* expr - Base type for all expression nodes. *)
+ type expr =
+ (* variant for numeric literals like "1.0". *)
+ | Number of float
+
+ (* variant for referencing a variable, like "a". *)
+ | Variable of string
+
+ (* variant for a binary operator. *)
+ | Binary of char * expr * expr
+
+ (* variant for function calls. *)
+ | Call of string * expr array
+
+ (* proto - This type represents the "prototype" for a function, which captures
+ * its name, and its argument names (thus implicitly the number of arguments the
+ * function takes). *)
+ type proto = Prototype of string * string array
+
+ (* func - This type represents a function definition itself. *)
+ type func = Function of proto * expr
+
+parser.ml:
+ .. code-block:: ocaml
+
+ (*===---------------------------------------------------------------------===
+ * Parser
+ *===---------------------------------------------------------------------===*)
+
+ (* binop_precedence - This holds the precedence for each binary operator that is
+ * defined *)
+ let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
+
+ (* precedence - Get the precedence of the pending binary operator token. *)
+ let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
+
+ (* primary
+ * ::= identifier
+ * ::= numberexpr
+ * ::= parenexpr *)
+ let rec parse_primary = parser
+ (* numberexpr ::= number *)
+ | [< 'Token.Number n >] -> Ast.Number n
+
+ (* parenexpr ::= '(' expression ')' *)
+ | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
+
+ (* identifierexpr
+ * ::= identifier
+ * ::= identifier '(' argumentexpr ')' *)
+ | [< 'Token.Ident id; stream >] ->
+ let rec parse_args accumulator = parser
+ | [< e=parse_expr; stream >] ->
+ begin parser
+ | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
+ | [< >] -> e :: accumulator
+ end stream
+ | [< >] -> accumulator
+ in
+ let rec parse_ident id = parser
+ (* Call. *)
+ | [< 'Token.Kwd '(';
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')'">] ->
+ Ast.Call (id, Array.of_list (List.rev args))
+
+ (* Simple variable ref. *)
+ | [< >] -> Ast.Variable id
+ in
+ parse_ident id stream
+
+ | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
+
+ (* binoprhs
+ * ::= ('+' primary)* *)
+ and parse_bin_rhs expr_prec lhs stream =
+ match Stream.peek stream with
+ (* If this is a binop, find its precedence. *)
+ | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
+ let token_prec = precedence c in
+
+ (* If this is a binop that binds at least as tightly as the current binop,
+ * consume it, otherwise we are done. *)
+ if token_prec < expr_prec then lhs else begin
+ (* Eat the binop. *)
+ Stream.junk stream;
+
+ (* Parse the primary expression after the binary operator. *)
+ let rhs = parse_primary stream in
+
+ (* Okay, we know this is a binop. *)
+ let rhs =
+ match Stream.peek stream with
+ | Some (Token.Kwd c2) ->
+ (* If BinOp binds less tightly with rhs than the operator after
+ * rhs, let the pending operator take rhs as its lhs. *)
+ let next_prec = precedence c2 in
+ if token_prec < next_prec
+ then parse_bin_rhs (token_prec + 1) rhs stream
+ else rhs
+ | _ -> rhs
+ in
+
+ (* Merge lhs/rhs. *)
+ let lhs = Ast.Binary (c, lhs, rhs) in
+ parse_bin_rhs expr_prec lhs stream
+ end
+ | _ -> lhs
+
+ (* expression
+ * ::= primary binoprhs *)
+ and parse_expr = parser
+ | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
+
+ (* prototype
+ * ::= id '(' id* ')' *)
+ let parse_prototype =
+ let rec parse_args accumulator = parser
+ | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
+ | [< >] -> accumulator
+ in
+
+ parser
+ | [< 'Token.Ident id;
+ 'Token.Kwd '(' ?? "expected '(' in prototype";
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
+ (* success. *)
+ Ast.Prototype (id, Array.of_list (List.rev args))
+
+ | [< >] ->
+ raise (Stream.Error "expected function name in prototype")
+
+ (* definition ::= 'def' prototype expression *)
+ let parse_definition = parser
+ | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
+ Ast.Function (p, e)
+
+ (* toplevelexpr ::= expression *)
+ let parse_toplevel = parser
+ | [< e=parse_expr >] ->
+ (* Make an anonymous proto. *)
+ Ast.Function (Ast.Prototype ("", [||]), e)
+
+ (* external ::= 'extern' prototype *)
+ let parse_extern = parser
+ | [< 'Token.Extern; e=parse_prototype >] -> e
+
+codegen.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Code Generation
+ *===----------------------------------------------------------------------===*)
+
+ open Llvm
+
+ exception Error of string
+
+ let context = global_context ()
+ let the_module = create_module context "my cool jit"
+ let builder = builder context
+ let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
+ let double_type = double_type context
+
+ let rec codegen_expr = function
+ | Ast.Number n -> const_float double_type n
+ | Ast.Variable name ->
+ (try Hashtbl.find named_values name with
+ | Not_found -> raise (Error "unknown variable name"))
+ | Ast.Binary (op, lhs, rhs) ->
+ let lhs_val = codegen_expr lhs in
+ let rhs_val = codegen_expr rhs in
+ begin
+ match op with
+ | '+' -> build_add lhs_val rhs_val "addtmp" builder
+ | '-' -> build_sub lhs_val rhs_val "subtmp" builder
+ | '*' -> build_mul lhs_val rhs_val "multmp" builder
+ | '<' ->
+ (* Convert bool 0/1 to double 0.0 or 1.0 *)
+ let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
+ build_uitofp i double_type "booltmp" builder
+ | _ -> raise (Error "invalid binary operator")
+ end
+ | Ast.Call (callee, args) ->
+ (* Look up the name in the module table. *)
+ let callee =
+ match lookup_function callee the_module with
+ | Some callee -> callee
+ | None -> raise (Error "unknown function referenced")
+ in
+ let params = params callee in
+
+ (* If argument mismatch error. *)
+ if Array.length params == Array.length args then () else
+ raise (Error "incorrect # arguments passed");
+ let args = Array.map codegen_expr args in
+ build_call callee args "calltmp" builder
+
+ let codegen_proto = function
+ | Ast.Prototype (name, args) ->
+ (* Make the function type: double(double,double) etc. *)
+ let doubles = Array.make (Array.length args) double_type in
+ let ft = function_type double_type doubles in
+ let f =
+ match lookup_function name the_module with
+ | None -> declare_function name ft the_module
+
+ (* If 'f' conflicted, there was already something named 'name'. If it
+ * has a body, don't allow redefinition or reextern. *)
+ | Some f ->
+ (* If 'f' already has a body, reject this. *)
+ if block_begin f <> At_end f then
+ raise (Error "redefinition of function");
+
+ (* If 'f' took a different number of arguments, reject. *)
+ if element_type (type_of f) <> ft then
+ raise (Error "redefinition of function with different # args");
+ f
+ in
+
+ (* Set names for all arguments. *)
+ Array.iteri (fun i a ->
+ let n = args.(i) in
+ set_value_name n a;
+ Hashtbl.add named_values n a;
+ ) (params f);
+ f
+
+ let codegen_func = function
+ | Ast.Function (proto, body) ->
+ Hashtbl.clear named_values;
+ let the_function = codegen_proto proto in
+
+ (* Create a new basic block to start insertion into. *)
+ let bb = append_block context "entry" the_function in
+ position_at_end bb builder;
+
+ try
+ let ret_val = codegen_expr body in
+
+ (* Finish off the function. *)
+ let _ = build_ret ret_val builder in
+
+ (* Validate the generated code, checking for consistency. *)
+ Llvm_analysis.assert_valid_function the_function;
+
+ the_function
+ with e ->
+ delete_function the_function;
+ raise e
+
+toplevel.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Top-Level parsing and JIT Driver
+ *===----------------------------------------------------------------------===*)
+
+ open Llvm
+
+ (* top ::= definition | external | expression | ';' *)
+ let rec main_loop stream =
+ match Stream.peek stream with
+ | None -> ()
+
+ (* ignore top-level semicolons. *)
+ | Some (Token.Kwd ';') ->
+ Stream.junk stream;
+ main_loop stream
+
+ | Some token ->
+ begin
+ try match token with
+ | Token.Def ->
+ let e = Parser.parse_definition stream in
+ print_endline "parsed a function definition.";
+ dump_value (Codegen.codegen_func e);
+ | Token.Extern ->
+ let e = Parser.parse_extern stream in
+ print_endline "parsed an extern.";
+ dump_value (Codegen.codegen_proto e);
+ | _ ->
+ (* Evaluate a top-level expression into an anonymous function. *)
+ let e = Parser.parse_toplevel stream in
+ print_endline "parsed a top-level expr";
+ dump_value (Codegen.codegen_func e);
+ with Stream.Error s | Codegen.Error s ->
+ (* Skip token for error recovery. *)
+ Stream.junk stream;
+ print_endline s;
+ end;
+ print_string "ready> "; flush stdout;
+ main_loop stream
+
+toy.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Main driver code.
+ *===----------------------------------------------------------------------===*)
+
+ open Llvm
+
+ let main () =
+ (* Install standard binary operators.
+ * 1 is the lowest precedence. *)
+ Hashtbl.add Parser.binop_precedence '<' 10;
+ Hashtbl.add Parser.binop_precedence '+' 20;
+ Hashtbl.add Parser.binop_precedence '-' 20;
+ Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *)
+
+ (* Prime the first token. *)
+ print_string "ready> "; flush stdout;
+ let stream = Lexer.lex (Stream.of_channel stdin) in
+
+ (* Run the main "interpreter loop" now. *)
+ Toplevel.main_loop stream;
+
+ (* Print out all the generated code. *)
+ dump_module Codegen.the_module
+ ;;
+
+ main ()
+
+`Next: Adding JIT and Optimizer Support <OCamlLangImpl4.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl4.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl4.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl4.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl4.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,915 @@
+==============================================
+Kaleidoscope: Adding JIT and Optimizer Support
+==============================================
+
+.. contents::
+ :local:
+
+Chapter 4 Introduction
+======================
+
+Welcome to Chapter 4 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. Chapters 1-3 described the implementation
+of a simple language and added support for generating LLVM IR. This
+chapter describes two new techniques: adding optimizer support to your
+language, and adding JIT compiler support. These additions will
+demonstrate how to get nice, efficient code for the Kaleidoscope
+language.
+
+Trivial Constant Folding
+========================
+
+**Note:** the default ``IRBuilder`` now always includes the constant
+folding optimisations below.
+
+Our demonstration for Chapter 3 is elegant and easy to extend.
+Unfortunately, it does not produce wonderful code. For example, when
+compiling simple code, we don't get obvious optimizations:
+
+::
+
+ ready> def test(x) 1+2+x;
+ Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double 1.000000e+00, 2.000000e+00
+ %addtmp1 = fadd double %addtmp, %x
+ ret double %addtmp1
+ }
+
+This code is a very, very literal transcription of the AST built by
+parsing the input. As such, this transcription lacks optimizations like
+constant folding (we'd like to get "``add x, 3.0``" in the example
+above) as well as other more important optimizations. Constant folding,
+in particular, is a very common and very important optimization: so much
+so that many language implementors implement constant folding support in
+their AST representation.
+
+With LLVM, you don't need this support in the AST. Since all calls to
+build LLVM IR go through the LLVM builder, it would be nice if the
+builder itself checked to see if there was a constant folding
+opportunity when you call it. If so, it could just do the constant fold
+and return the constant instead of creating an instruction. This is
+exactly what the ``LLVMFoldingBuilder`` class does.
+
+All we did was switch from ``LLVMBuilder`` to ``LLVMFoldingBuilder``.
+Though we change no other code, we now have all of our instructions
+implicitly constant folded without us having to do anything about it.
+For example, the input above now compiles to:
+
+::
+
+ ready> def test(x) 1+2+x;
+ Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double 3.000000e+00, %x
+ ret double %addtmp
+ }
+
+Well, that was easy :). In practice, we recommend always using
+``LLVMFoldingBuilder`` when generating code like this. It has no
+"syntactic overhead" for its use (you don't have to uglify your compiler
+with constant checks everywhere) and it can dramatically reduce the
+amount of LLVM IR that is generated in some cases (particular for
+languages with a macro preprocessor or that use a lot of constants).
+
+On the other hand, the ``LLVMFoldingBuilder`` is limited by the fact
+that it does all of its analysis inline with the code as it is built. If
+you take a slightly more complex example:
+
+::
+
+ ready> def test(x) (1+2+x)*(x+(1+2));
+ ready> Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double 3.000000e+00, %x
+ %addtmp1 = fadd double %x, 3.000000e+00
+ %multmp = fmul double %addtmp, %addtmp1
+ ret double %multmp
+ }
+
+In this case, the LHS and RHS of the multiplication are the same value.
+We'd really like to see this generate "``tmp = x+3; result = tmp*tmp;``"
+instead of computing "``x*3``" twice.
+
+Unfortunately, no amount of local analysis will be able to detect and
+correct this. This requires two transformations: reassociation of
+expressions (to make the add's lexically identical) and Common
+Subexpression Elimination (CSE) to delete the redundant add instruction.
+Fortunately, LLVM provides a broad range of optimizations that you can
+use, in the form of "passes".
+
+LLVM Optimization Passes
+========================
+
+LLVM provides many optimization passes, which do many different sorts of
+things and have different tradeoffs. Unlike other systems, LLVM doesn't
+hold to the mistaken notion that one set of optimizations is right for
+all languages and for all situations. LLVM allows a compiler implementor
+to make complete decisions about what optimizations to use, in which
+order, and in what situation.
+
+As a concrete example, LLVM supports both "whole module" passes, which
+look across as large of body of code as they can (often a whole file,
+but if run at link time, this can be a substantial portion of the whole
+program). It also supports and includes "per-function" passes which just
+operate on a single function at a time, without looking at other
+functions. For more information on passes and how they are run, see the
+`How to Write a Pass <../WritingAnLLVMPass.html>`_ document and the
+`List of LLVM Passes <../Passes.html>`_.
+
+For Kaleidoscope, we are currently generating functions on the fly, one
+at a time, as the user types them in. We aren't shooting for the
+ultimate optimization experience in this setting, but we also want to
+catch the easy and quick stuff where possible. As such, we will choose
+to run a few per-function optimizations as the user types the function
+in. If we wanted to make a "static Kaleidoscope compiler", we would use
+exactly the code we have now, except that we would defer running the
+optimizer until the entire file has been parsed.
+
+In order to get per-function optimizations going, we need to set up a
+`Llvm.PassManager <../WritingAnLLVMPass.html#what-passmanager-does>`_ to hold and
+organize the LLVM optimizations that we want to run. Once we have that,
+we can add a set of optimizations to run. The code looks like this:
+
+.. code-block:: ocaml
+
+ (* Create the JIT. *)
+ let the_execution_engine = ExecutionEngine.create Codegen.the_module in
+ let the_fpm = PassManager.create_function Codegen.the_module in
+
+ (* Set up the optimizer pipeline. Start with registering info about how the
+ * target lays out data structures. *)
+ DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
+
+ (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
+ add_instruction_combining the_fpm;
+
+ (* reassociate expressions. *)
+ add_reassociation the_fpm;
+
+ (* Eliminate Common SubExpressions. *)
+ add_gvn the_fpm;
+
+ (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
+ add_cfg_simplification the_fpm;
+
+ ignore (PassManager.initialize the_fpm);
+
+ (* Run the main "interpreter loop" now. *)
+ Toplevel.main_loop the_fpm the_execution_engine stream;
+
+The meat of the matter here, is the definition of "``the_fpm``". It
+requires a pointer to the ``the_module`` to construct itself. Once it is
+set up, we use a series of "add" calls to add a bunch of LLVM passes.
+The first pass is basically boilerplate, it adds a pass so that later
+optimizations know how the data structures in the program are laid out.
+The "``the_execution_engine``" variable is related to the JIT, which we
+will get to in the next section.
+
+In this case, we choose to add 4 optimization passes. The passes we
+chose here are a pretty standard set of "cleanup" optimizations that are
+useful for a wide variety of code. I won't delve into what they do but,
+believe me, they are a good starting place :).
+
+Once the ``Llvm.PassManager.`` is set up, we need to make use of it. We
+do this by running it after our newly created function is constructed
+(in ``Codegen.codegen_func``), but before it is returned to the client:
+
+.. code-block:: ocaml
+
+ let codegen_func the_fpm = function
+ ...
+ try
+ let ret_val = codegen_expr body in
+
+ (* Finish off the function. *)
+ let _ = build_ret ret_val builder in
+
+ (* Validate the generated code, checking for consistency. *)
+ Llvm_analysis.assert_valid_function the_function;
+
+ (* Optimize the function. *)
+ let _ = PassManager.run_function the_function the_fpm in
+
+ the_function
+
+As you can see, this is pretty straightforward. The ``the_fpm``
+optimizes and updates the LLVM Function\* in place, improving
+(hopefully) its body. With this in place, we can try our test above
+again:
+
+::
+
+ ready> def test(x) (1+2+x)*(x+(1+2));
+ ready> Read function definition:
+ define double @test(double %x) {
+ entry:
+ %addtmp = fadd double %x, 3.000000e+00
+ %multmp = fmul double %addtmp, %addtmp
+ ret double %multmp
+ }
+
+As expected, we now get our nicely optimized code, saving a floating
+point add instruction from every execution of this function.
+
+LLVM provides a wide variety of optimizations that can be used in
+certain circumstances. Some `documentation about the various
+passes <../Passes.html>`_ is available, but it isn't very complete.
+Another good source of ideas can come from looking at the passes that
+``Clang`` runs to get started. The "``opt``" tool allows you to
+experiment with passes from the command line, so you can see if they do
+anything.
+
+Now that we have reasonable code coming out of our front-end, lets talk
+about executing it!
+
+Adding a JIT Compiler
+=====================
+
+Code that is available in LLVM IR can have a wide variety of tools
+applied to it. For example, you can run optimizations on it (as we did
+above), you can dump it out in textual or binary forms, you can compile
+the code to an assembly file (.s) for some target, or you can JIT
+compile it. The nice thing about the LLVM IR representation is that it
+is the "common currency" between many different parts of the compiler.
+
+In this section, we'll add JIT compiler support to our interpreter. The
+basic idea that we want for Kaleidoscope is to have the user enter
+function bodies as they do now, but immediately evaluate the top-level
+expressions they type in. For example, if they type in "1 + 2;", we
+should evaluate and print out 3. If they define a function, they should
+be able to call it from the command line.
+
+In order to do this, we first declare and initialize the JIT. This is
+done by adding a global variable and a call in ``main``:
+
+.. code-block:: ocaml
+
+ ...
+ let main () =
+ ...
+ (* Create the JIT. *)
+ let the_execution_engine = ExecutionEngine.create Codegen.the_module in
+ ...
+
+This creates an abstract "Execution Engine" which can be either a JIT
+compiler or the LLVM interpreter. LLVM will automatically pick a JIT
+compiler for you if one is available for your platform, otherwise it
+will fall back to the interpreter.
+
+Once the ``Llvm_executionengine.ExecutionEngine.t`` is created, the JIT
+is ready to be used. There are a variety of APIs that are useful, but
+the simplest one is the
+"``Llvm_executionengine.ExecutionEngine.run_function``" function. This
+method JIT compiles the specified LLVM Function and returns a function
+pointer to the generated machine code. In our case, this means that we
+can change the code that parses a top-level expression to look like
+this:
+
+.. code-block:: ocaml
+
+ (* Evaluate a top-level expression into an anonymous function. *)
+ let e = Parser.parse_toplevel stream in
+ print_endline "parsed a top-level expr";
+ let the_function = Codegen.codegen_func the_fpm e in
+ dump_value the_function;
+
+ (* JIT the function, returning a function pointer. *)
+ let result = ExecutionEngine.run_function the_function [||]
+ the_execution_engine in
+
+ print_string "Evaluated to ";
+ print_float (GenericValue.as_float Codegen.double_type result);
+ print_newline ();
+
+Recall that we compile top-level expressions into a self-contained LLVM
+function that takes no arguments and returns the computed double.
+Because the LLVM JIT compiler matches the native platform ABI, this
+means that you can just cast the result pointer to a function pointer of
+that type and call it directly. This means, there is no difference
+between JIT compiled code and native machine code that is statically
+linked into your application.
+
+With just these two changes, lets see how Kaleidoscope works now!
+
+::
+
+ ready> 4+5;
+ define double @""() {
+ entry:
+ ret double 9.000000e+00
+ }
+
+ Evaluated to 9.000000
+
+Well this looks like it is basically working. The dump of the function
+shows the "no argument function that always returns double" that we
+synthesize for each top level expression that is typed in. This
+demonstrates very basic functionality, but can we do more?
+
+::
+
+ ready> def testfunc(x y) x + y*2;
+ Read function definition:
+ define double @testfunc(double %x, double %y) {
+ entry:
+ %multmp = fmul double %y, 2.000000e+00
+ %addtmp = fadd double %multmp, %x
+ ret double %addtmp
+ }
+
+ ready> testfunc(4, 10);
+ define double @""() {
+ entry:
+ %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)
+ ret double %calltmp
+ }
+
+ Evaluated to 24.000000
+
+This illustrates that we can now call user code, but there is something
+a bit subtle going on here. Note that we only invoke the JIT on the
+anonymous functions that *call testfunc*, but we never invoked it on
+*testfunc* itself. What actually happened here is that the JIT scanned
+for all non-JIT'd functions transitively called from the anonymous
+function and compiled all of them before returning from
+``run_function``.
+
+The JIT provides a number of other more advanced interfaces for things
+like freeing allocated machine code, rejit'ing functions to update them,
+etc. However, even with this simple code, we get some surprisingly
+powerful capabilities - check this out (I removed the dump of the
+anonymous functions, you should get the idea by now :) :
+
+::
+
+ ready> extern sin(x);
+ Read extern:
+ declare double @sin(double)
+
+ ready> extern cos(x);
+ Read extern:
+ declare double @cos(double)
+
+ ready> sin(1.0);
+ Evaluated to 0.841471
+
+ ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x);
+ Read function definition:
+ define double @foo(double %x) {
+ entry:
+ %calltmp = call double @sin(double %x)
+ %multmp = fmul double %calltmp, %calltmp
+ %calltmp2 = call double @cos(double %x)
+ %multmp4 = fmul double %calltmp2, %calltmp2
+ %addtmp = fadd double %multmp, %multmp4
+ ret double %addtmp
+ }
+
+ ready> foo(4.0);
+ Evaluated to 1.000000
+
+Whoa, how does the JIT know about sin and cos? The answer is
+surprisingly simple: in this example, the JIT started execution of a
+function and got to a function call. It realized that the function was
+not yet JIT compiled and invoked the standard set of routines to resolve
+the function. In this case, there is no body defined for the function,
+so the JIT ended up calling "``dlsym("sin")``" on the Kaleidoscope
+process itself. Since "``sin``" is defined within the JIT's address
+space, it simply patches up calls in the module to call the libm version
+of ``sin`` directly.
+
+The LLVM JIT provides a number of interfaces (look in the
+``llvm_executionengine.mli`` file) for controlling how unknown functions
+get resolved. It allows you to establish explicit mappings between IR
+objects and addresses (useful for LLVM global variables that you want to
+map to static tables, for example), allows you to dynamically decide on
+the fly based on the function name, and even allows you to have the JIT
+compile functions lazily the first time they're called.
+
+One interesting application of this is that we can now extend the
+language by writing arbitrary C code to implement operations. For
+example, if we add:
+
+.. code-block:: c++
+
+ /* putchard - putchar that takes a double and returns 0. */
+ extern "C"
+ double putchard(double X) {
+ putchar((char)X);
+ return 0;
+ }
+
+Now we can produce simple output to the console by using things like:
+"``extern putchard(x); putchard(120);``", which prints a lowercase 'x'
+on the console (120 is the ASCII code for 'x'). Similar code could be
+used to implement file I/O, console input, and many other capabilities
+in Kaleidoscope.
+
+This completes the JIT and optimizer chapter of the Kaleidoscope
+tutorial. At this point, we can compile a non-Turing-complete
+programming language, optimize and JIT compile it in a user-driven way.
+Next up we'll look into `extending the language with control flow
+constructs <OCamlLangImpl5.html>`_, tackling some interesting LLVM IR
+issues along the way.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the LLVM JIT and optimizer. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ ocamlbuild toy.byte
+ # Run
+ ./toy.byte
+
+Here is the code:
+
+\_tags:
+ ::
+
+ <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
+ <*.{byte,native}>: g++, use_llvm, use_llvm_analysis
+ <*.{byte,native}>: use_llvm_executionengine, use_llvm_target
+ <*.{byte,native}>: use_llvm_scalar_opts, use_bindings
+
+myocamlbuild.ml:
+ .. code-block:: ocaml
+
+ open Ocamlbuild_plugin;;
+
+ ocaml_lib ~extern:true "llvm";;
+ ocaml_lib ~extern:true "llvm_analysis";;
+ ocaml_lib ~extern:true "llvm_executionengine";;
+ ocaml_lib ~extern:true "llvm_target";;
+ ocaml_lib ~extern:true "llvm_scalar_opts";;
+
+ flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
+ dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
+
+token.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Lexer Tokens
+ *===----------------------------------------------------------------------===*)
+
+ (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
+ * these others for known things. *)
+ type token =
+ (* commands *)
+ | Def | Extern
+
+ (* primary *)
+ | Ident of string | Number of float
+
+ (* unknown *)
+ | Kwd of char
+
+lexer.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Lexer
+ *===----------------------------------------------------------------------===*)
+
+ let rec lex = parser
+ (* Skip any whitespace. *)
+ | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
+
+ (* identifier: [a-zA-Z][a-zA-Z0-9] *)
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+
+ (* number: [0-9.]+ *)
+ | [< ' ('0' .. '9' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+
+ (* Comment until end of line. *)
+ | [< ' ('#'); stream >] ->
+ lex_comment stream
+
+ (* Otherwise, just return the character as its ascii value. *)
+ | [< 'c; stream >] ->
+ [< 'Token.Kwd c; lex stream >]
+
+ (* end of stream. *)
+ | [< >] -> [< >]
+
+ and lex_number buffer = parser
+ | [< ' ('0' .. '9' | '.' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+ | [< stream=lex >] ->
+ [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
+
+ and lex_ident buffer = parser
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+ | [< stream=lex >] ->
+ match Buffer.contents buffer with
+ | "def" -> [< 'Token.Def; stream >]
+ | "extern" -> [< 'Token.Extern; stream >]
+ | id -> [< 'Token.Ident id; stream >]
+
+ and lex_comment = parser
+ | [< ' ('\n'); stream=lex >] -> stream
+ | [< 'c; e=lex_comment >] -> e
+ | [< >] -> [< >]
+
+ast.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Abstract Syntax Tree (aka Parse Tree)
+ *===----------------------------------------------------------------------===*)
+
+ (* expr - Base type for all expression nodes. *)
+ type expr =
+ (* variant for numeric literals like "1.0". *)
+ | Number of float
+
+ (* variant for referencing a variable, like "a". *)
+ | Variable of string
+
+ (* variant for a binary operator. *)
+ | Binary of char * expr * expr
+
+ (* variant for function calls. *)
+ | Call of string * expr array
+
+ (* proto - This type represents the "prototype" for a function, which captures
+ * its name, and its argument names (thus implicitly the number of arguments the
+ * function takes). *)
+ type proto = Prototype of string * string array
+
+ (* func - This type represents a function definition itself. *)
+ type func = Function of proto * expr
+
+parser.ml:
+ .. code-block:: ocaml
+
+ (*===---------------------------------------------------------------------===
+ * Parser
+ *===---------------------------------------------------------------------===*)
+
+ (* binop_precedence - This holds the precedence for each binary operator that is
+ * defined *)
+ let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
+
+ (* precedence - Get the precedence of the pending binary operator token. *)
+ let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
+
+ (* primary
+ * ::= identifier
+ * ::= numberexpr
+ * ::= parenexpr *)
+ let rec parse_primary = parser
+ (* numberexpr ::= number *)
+ | [< 'Token.Number n >] -> Ast.Number n
+
+ (* parenexpr ::= '(' expression ')' *)
+ | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
+
+ (* identifierexpr
+ * ::= identifier
+ * ::= identifier '(' argumentexpr ')' *)
+ | [< 'Token.Ident id; stream >] ->
+ let rec parse_args accumulator = parser
+ | [< e=parse_expr; stream >] ->
+ begin parser
+ | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
+ | [< >] -> e :: accumulator
+ end stream
+ | [< >] -> accumulator
+ in
+ let rec parse_ident id = parser
+ (* Call. *)
+ | [< 'Token.Kwd '(';
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')'">] ->
+ Ast.Call (id, Array.of_list (List.rev args))
+
+ (* Simple variable ref. *)
+ | [< >] -> Ast.Variable id
+ in
+ parse_ident id stream
+
+ | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
+
+ (* binoprhs
+ * ::= ('+' primary)* *)
+ and parse_bin_rhs expr_prec lhs stream =
+ match Stream.peek stream with
+ (* If this is a binop, find its precedence. *)
+ | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
+ let token_prec = precedence c in
+
+ (* If this is a binop that binds at least as tightly as the current binop,
+ * consume it, otherwise we are done. *)
+ if token_prec < expr_prec then lhs else begin
+ (* Eat the binop. *)
+ Stream.junk stream;
+
+ (* Parse the primary expression after the binary operator. *)
+ let rhs = parse_primary stream in
+
+ (* Okay, we know this is a binop. *)
+ let rhs =
+ match Stream.peek stream with
+ | Some (Token.Kwd c2) ->
+ (* If BinOp binds less tightly with rhs than the operator after
+ * rhs, let the pending operator take rhs as its lhs. *)
+ let next_prec = precedence c2 in
+ if token_prec < next_prec
+ then parse_bin_rhs (token_prec + 1) rhs stream
+ else rhs
+ | _ -> rhs
+ in
+
+ (* Merge lhs/rhs. *)
+ let lhs = Ast.Binary (c, lhs, rhs) in
+ parse_bin_rhs expr_prec lhs stream
+ end
+ | _ -> lhs
+
+ (* expression
+ * ::= primary binoprhs *)
+ and parse_expr = parser
+ | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
+
+ (* prototype
+ * ::= id '(' id* ')' *)
+ let parse_prototype =
+ let rec parse_args accumulator = parser
+ | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
+ | [< >] -> accumulator
+ in
+
+ parser
+ | [< 'Token.Ident id;
+ 'Token.Kwd '(' ?? "expected '(' in prototype";
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
+ (* success. *)
+ Ast.Prototype (id, Array.of_list (List.rev args))
+
+ | [< >] ->
+ raise (Stream.Error "expected function name in prototype")
+
+ (* definition ::= 'def' prototype expression *)
+ let parse_definition = parser
+ | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
+ Ast.Function (p, e)
+
+ (* toplevelexpr ::= expression *)
+ let parse_toplevel = parser
+ | [< e=parse_expr >] ->
+ (* Make an anonymous proto. *)
+ Ast.Function (Ast.Prototype ("", [||]), e)
+
+ (* external ::= 'extern' prototype *)
+ let parse_extern = parser
+ | [< 'Token.Extern; e=parse_prototype >] -> e
+
+codegen.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Code Generation
+ *===----------------------------------------------------------------------===*)
+
+ open Llvm
+
+ exception Error of string
+
+ let context = global_context ()
+ let the_module = create_module context "my cool jit"
+ let builder = builder context
+ let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
+ let double_type = double_type context
+
+ let rec codegen_expr = function
+ | Ast.Number n -> const_float double_type n
+ | Ast.Variable name ->
+ (try Hashtbl.find named_values name with
+ | Not_found -> raise (Error "unknown variable name"))
+ | Ast.Binary (op, lhs, rhs) ->
+ let lhs_val = codegen_expr lhs in
+ let rhs_val = codegen_expr rhs in
+ begin
+ match op with
+ | '+' -> build_add lhs_val rhs_val "addtmp" builder
+ | '-' -> build_sub lhs_val rhs_val "subtmp" builder
+ | '*' -> build_mul lhs_val rhs_val "multmp" builder
+ | '<' ->
+ (* Convert bool 0/1 to double 0.0 or 1.0 *)
+ let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
+ build_uitofp i double_type "booltmp" builder
+ | _ -> raise (Error "invalid binary operator")
+ end
+ | Ast.Call (callee, args) ->
+ (* Look up the name in the module table. *)
+ let callee =
+ match lookup_function callee the_module with
+ | Some callee -> callee
+ | None -> raise (Error "unknown function referenced")
+ in
+ let params = params callee in
+
+ (* If argument mismatch error. *)
+ if Array.length params == Array.length args then () else
+ raise (Error "incorrect # arguments passed");
+ let args = Array.map codegen_expr args in
+ build_call callee args "calltmp" builder
+
+ let codegen_proto = function
+ | Ast.Prototype (name, args) ->
+ (* Make the function type: double(double,double) etc. *)
+ let doubles = Array.make (Array.length args) double_type in
+ let ft = function_type double_type doubles in
+ let f =
+ match lookup_function name the_module with
+ | None -> declare_function name ft the_module
+
+ (* If 'f' conflicted, there was already something named 'name'. If it
+ * has a body, don't allow redefinition or reextern. *)
+ | Some f ->
+ (* If 'f' already has a body, reject this. *)
+ if block_begin f <> At_end f then
+ raise (Error "redefinition of function");
+
+ (* If 'f' took a different number of arguments, reject. *)
+ if element_type (type_of f) <> ft then
+ raise (Error "redefinition of function with different # args");
+ f
+ in
+
+ (* Set names for all arguments. *)
+ Array.iteri (fun i a ->
+ let n = args.(i) in
+ set_value_name n a;
+ Hashtbl.add named_values n a;
+ ) (params f);
+ f
+
+ let codegen_func the_fpm = function
+ | Ast.Function (proto, body) ->
+ Hashtbl.clear named_values;
+ let the_function = codegen_proto proto in
+
+ (* Create a new basic block to start insertion into. *)
+ let bb = append_block context "entry" the_function in
+ position_at_end bb builder;
+
+ try
+ let ret_val = codegen_expr body in
+
+ (* Finish off the function. *)
+ let _ = build_ret ret_val builder in
+
+ (* Validate the generated code, checking for consistency. *)
+ Llvm_analysis.assert_valid_function the_function;
+
+ (* Optimize the function. *)
+ let _ = PassManager.run_function the_function the_fpm in
+
+ the_function
+ with e ->
+ delete_function the_function;
+ raise e
+
+toplevel.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Top-Level parsing and JIT Driver
+ *===----------------------------------------------------------------------===*)
+
+ open Llvm
+ open Llvm_executionengine
+
+ (* top ::= definition | external | expression | ';' *)
+ let rec main_loop the_fpm the_execution_engine stream =
+ match Stream.peek stream with
+ | None -> ()
+
+ (* ignore top-level semicolons. *)
+ | Some (Token.Kwd ';') ->
+ Stream.junk stream;
+ main_loop the_fpm the_execution_engine stream
+
+ | Some token ->
+ begin
+ try match token with
+ | Token.Def ->
+ let e = Parser.parse_definition stream in
+ print_endline "parsed a function definition.";
+ dump_value (Codegen.codegen_func the_fpm e);
+ | Token.Extern ->
+ let e = Parser.parse_extern stream in
+ print_endline "parsed an extern.";
+ dump_value (Codegen.codegen_proto e);
+ | _ ->
+ (* Evaluate a top-level expression into an anonymous function. *)
+ let e = Parser.parse_toplevel stream in
+ print_endline "parsed a top-level expr";
+ let the_function = Codegen.codegen_func the_fpm e in
+ dump_value the_function;
+
+ (* JIT the function, returning a function pointer. *)
+ let result = ExecutionEngine.run_function the_function [||]
+ the_execution_engine in
+
+ print_string "Evaluated to ";
+ print_float (GenericValue.as_float Codegen.double_type result);
+ print_newline ();
+ with Stream.Error s | Codegen.Error s ->
+ (* Skip token for error recovery. *)
+ Stream.junk stream;
+ print_endline s;
+ end;
+ print_string "ready> "; flush stdout;
+ main_loop the_fpm the_execution_engine stream
+
+toy.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Main driver code.
+ *===----------------------------------------------------------------------===*)
+
+ open Llvm
+ open Llvm_executionengine
+ open Llvm_target
+ open Llvm_scalar_opts
+
+ let main () =
+ ignore (initialize_native_target ());
+
+ (* Install standard binary operators.
+ * 1 is the lowest precedence. *)
+ Hashtbl.add Parser.binop_precedence '<' 10;
+ Hashtbl.add Parser.binop_precedence '+' 20;
+ Hashtbl.add Parser.binop_precedence '-' 20;
+ Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *)
+
+ (* Prime the first token. *)
+ print_string "ready> "; flush stdout;
+ let stream = Lexer.lex (Stream.of_channel stdin) in
+
+ (* Create the JIT. *)
+ let the_execution_engine = ExecutionEngine.create Codegen.the_module in
+ let the_fpm = PassManager.create_function Codegen.the_module in
+
+ (* Set up the optimizer pipeline. Start with registering info about how the
+ * target lays out data structures. *)
+ DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
+
+ (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
+ add_instruction_combination the_fpm;
+
+ (* reassociate expressions. *)
+ add_reassociation the_fpm;
+
+ (* Eliminate Common SubExpressions. *)
+ add_gvn the_fpm;
+
+ (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
+ add_cfg_simplification the_fpm;
+
+ ignore (PassManager.initialize the_fpm);
+
+ (* Run the main "interpreter loop" now. *)
+ Toplevel.main_loop the_fpm the_execution_engine stream;
+
+ (* Print out all the generated code. *)
+ dump_module Codegen.the_module
+ ;;
+
+ main ()
+
+bindings.c
+ .. code-block:: c
+
+ #include <stdio.h>
+
+ /* putchard - putchar that takes a double and returns 0. */
+ extern double putchard(double X) {
+ putchar((char)X);
+ return 0;
+ }
+
+`Next: Extending the language: control flow <OCamlLangImpl5.html>`_
+
Added: www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl5.rst.txt
URL: http://llvm.org/viewvc/llvm-project/www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl5.rst.txt?rev=372328&view=auto
==============================================================================
--- www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl5.rst.txt (added)
+++ www-releases/trunk/9.0.0/docs/_sources/tutorial/OCamlLangImpl5.rst.txt Thu Sep 19 07:32:46 2019
@@ -0,0 +1,1350 @@
+==================================================
+Kaleidoscope: Extending the Language: Control Flow
+==================================================
+
+.. contents::
+ :local:
+
+Chapter 5 Introduction
+======================
+
+Welcome to Chapter 5 of the "`Implementing a language with
+LLVM <index.html>`_" tutorial. Parts 1-4 described the implementation of
+the simple Kaleidoscope language and included support for generating
+LLVM IR, followed by optimizations and a JIT compiler. Unfortunately, as
+presented, Kaleidoscope is mostly useless: it has no control flow other
+than call and return. This means that you can't have conditional
+branches in the code, significantly limiting its power. In this episode
+of "build that compiler", we'll extend Kaleidoscope to have an
+if/then/else expression plus a simple 'for' loop.
+
+If/Then/Else
+============
+
+Extending Kaleidoscope to support if/then/else is quite straightforward.
+It basically requires adding lexer support for this "new" concept to the
+lexer, parser, AST, and LLVM code emitter. This example is nice, because
+it shows how easy it is to "grow" a language over time, incrementally
+extending it as new ideas are discovered.
+
+Before we get going on "how" we add this extension, lets talk about
+"what" we want. The basic idea is that we want to be able to write this
+sort of thing:
+
+::
+
+ def fib(x)
+ if x < 3 then
+ 1
+ else
+ fib(x-1)+fib(x-2);
+
+In Kaleidoscope, every construct is an expression: there are no
+statements. As such, the if/then/else expression needs to return a value
+like any other. Since we're using a mostly functional form, we'll have
+it evaluate its conditional, then return the 'then' or 'else' value
+based on how the condition was resolved. This is very similar to the C
+"?:" expression.
+
+The semantics of the if/then/else expression is that it evaluates the
+condition to a boolean equality value: 0.0 is considered to be false and
+everything else is considered to be true. If the condition is true, the
+first subexpression is evaluated and returned, if the condition is
+false, the second subexpression is evaluated and returned. Since
+Kaleidoscope allows side-effects, this behavior is important to nail
+down.
+
+Now that we know what we "want", lets break this down into its
+constituent pieces.
+
+Lexer Extensions for If/Then/Else
+---------------------------------
+
+The lexer extensions are straightforward. First we add new variants for
+the relevant tokens:
+
+.. code-block:: ocaml
+
+ (* control *)
+ | If | Then | Else | For | In
+
+Once we have that, we recognize the new keywords in the lexer. This is
+pretty simple stuff:
+
+.. code-block:: ocaml
+
+ ...
+ match Buffer.contents buffer with
+ | "def" -> [< 'Token.Def; stream >]
+ | "extern" -> [< 'Token.Extern; stream >]
+ | "if" -> [< 'Token.If; stream >]
+ | "then" -> [< 'Token.Then; stream >]
+ | "else" -> [< 'Token.Else; stream >]
+ | "for" -> [< 'Token.For; stream >]
+ | "in" -> [< 'Token.In; stream >]
+ | id -> [< 'Token.Ident id; stream >]
+
+AST Extensions for If/Then/Else
+-------------------------------
+
+To represent the new expression we add a new AST variant for it:
+
+.. code-block:: ocaml
+
+ type expr =
+ ...
+ (* variant for if/then/else. *)
+ | If of expr * expr * expr
+
+The AST variant just has pointers to the various subexpressions.
+
+Parser Extensions for If/Then/Else
+----------------------------------
+
+Now that we have the relevant tokens coming from the lexer and we have
+the AST node to build, our parsing logic is relatively straightforward.
+Next we add a new case for parsing a if-expression as a primary expression:
+
+.. code-block:: ocaml
+
+ let rec parse_primary = parser
+ ...
+ (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
+ | [< 'Token.If; c=parse_expr;
+ 'Token.Then ?? "expected 'then'"; t=parse_expr;
+ 'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
+ Ast.If (c, t, e)
+
+LLVM IR for If/Then/Else
+------------------------
+
+Now that we have it parsing and building the AST, the final piece is
+adding LLVM code generation support. This is the most interesting part
+of the if/then/else example, because this is where it starts to
+introduce new concepts. All of the code above has been thoroughly
+described in previous chapters.
+
+To motivate the code we want to produce, lets take a look at a simple
+example. Consider:
+
+::
+
+ extern foo();
+ extern bar();
+ def baz(x) if x then foo() else bar();
+
+If you disable optimizations, the code you'll (soon) get from
+Kaleidoscope looks like this:
+
+.. code-block:: llvm
+
+ declare double @foo()
+
+ declare double @bar()
+
+ define double @baz(double %x) {
+ entry:
+ %ifcond = fcmp one double %x, 0.000000e+00
+ br i1 %ifcond, label %then, label %else
+
+ then: ; preds = %entry
+ %calltmp = call double @foo()
+ br label %ifcont
+
+ else: ; preds = %entry
+ %calltmp1 = call double @bar()
+ br label %ifcont
+
+ ifcont: ; preds = %else, %then
+ %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
+ ret double %iftmp
+ }
+
+To visualize the control flow graph, you can use a nifty feature of the
+LLVM '`opt <http://llvm.org/cmds/opt.html>`_' tool. If you put this LLVM
+IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
+window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll
+see this graph:
+
+.. figure:: LangImpl05-cfg.png
+ :align: center
+ :alt: Example CFG
+
+ Example CFG
+
+Another way to get this is to call
+"``Llvm_analysis.view_function_cfg f``" or
+"``Llvm_analysis.view_function_cfg_only f``" (where ``f`` is a
+"``Function``") either by inserting actual calls into the code and
+recompiling or by calling these in the debugger. LLVM has many nice
+features for visualizing various graphs.
+
+Getting back to the generated code, it is fairly simple: the entry block
+evaluates the conditional expression ("x" in our case here) and compares
+the result to 0.0 with the "``fcmp one``" instruction ('one' is "Ordered
+and Not Equal"). Based on the result of this expression, the code jumps
+to either the "then" or "else" blocks, which contain the expressions for
+the true/false cases.
+
+Once the then/else blocks are finished executing, they both branch back
+to the 'ifcont' block to execute the code that happens after the
+if/then/else. In this case the only thing left to do is to return to the
+caller of the function. The question then becomes: how does the code
+know which expression to return?
+
+The answer to this question involves an important SSA operation: the
+`Phi
+operation <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
+If you're not familiar with SSA, `the wikipedia
+article <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
+is a good introduction and there are various other introductions to it
+available on your favorite search engine. The short version is that
+"execution" of the Phi operation requires "remembering" which block
+control came from. The Phi operation takes on the value corresponding to
+the input control block. In this case, if control comes in from the
+"then" block, it gets the value of "calltmp". If control comes from the
+"else" block, it gets the value of "calltmp1".
+
+At this point, you are probably starting to think "Oh no! This means my
+simple and elegant front-end will have to start generating SSA form in
+order to use LLVM!". Fortunately, this is not the case, and we strongly
+advise *not* implementing an SSA construction algorithm in your
+front-end unless there is an amazingly good reason to do so. In
+practice, there are two sorts of values that float around in code
+written for your average imperative programming language that might need
+Phi nodes:
+
+#. Code that involves user variables: ``x = 1; x = x + 1;``
+#. Values that are implicit in the structure of your AST, such as the
+ Phi node in this case.
+
+In `Chapter 7 <OCamlLangImpl7.html>`_ of this tutorial ("mutable
+variables"), we'll talk about #1 in depth. For now, just believe me that
+you don't need SSA construction to handle this case. For #2, you have
+the choice of using the techniques that we will describe for #1, or you
+can insert Phi nodes directly, if convenient. In this case, it is really
+really easy to generate the Phi node, so we choose to do it directly.
+
+Okay, enough of the motivation and overview, lets generate code!
+
+Code Generation for If/Then/Else
+--------------------------------
+
+In order to generate code for this, we implement the ``Codegen`` method
+for ``IfExprAST``:
+
+.. code-block:: ocaml
+
+ let rec codegen_expr = function
+ ...
+ | Ast.If (cond, then_, else_) ->
+ let cond = codegen_expr cond in
+
+ (* Convert condition to a bool by comparing equal to 0.0 *)
+ let zero = const_float double_type 0.0 in
+ let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
+
+This code is straightforward and similar to what we saw before. We emit
+the expression for the condition, then compare that value to zero to get
+a truth value as a 1-bit (bool) value.
+
+.. code-block:: ocaml
+
+ (* Grab the first block so that we might later add the conditional branch
+ * to it at the end of the function. *)
+ let start_bb = insertion_block builder in
+ let the_function = block_parent start_bb in
+
+ let then_bb = append_block context "then" the_function in
+ position_at_end then_bb builder;
+
+As opposed to the `C++ tutorial <LangImpl05.html>`_, we have to build our
+basic blocks bottom up since we can't have dangling BasicBlocks. We
+start off by saving a pointer to the first block (which might not be the
+entry block), which we'll need to build a conditional branch later. We
+do this by asking the ``builder`` for the current BasicBlock. The fourth
+line gets the current Function object that is being built. It gets this
+by the ``start_bb`` for its "parent" (the function it is currently
+embedded into).
+
+Once it has that, it creates one block. It is automatically appended
+into the function's list of blocks.
+
+.. code-block:: ocaml
+
+ (* Emit 'then' value. *)
+ position_at_end then_bb builder;
+ let then_val = codegen_expr then_ in
+
+ (* Codegen of 'then' can change the current block, update then_bb for the
+ * phi. We create a new name because one is used for the phi node, and the
+ * other is used for the conditional branch. *)
+ let new_then_bb = insertion_block builder in
+
+We move the builder to start inserting into the "then" block. Strictly
+speaking, this call moves the insertion point to be at the end of the
+specified block. However, since the "then" block is empty, it also
+starts out by inserting at the beginning of the block. :)
+
+Once the insertion point is set, we recursively codegen the "then"
+expression from the AST.
+
+The final line here is quite subtle, but is very important. The basic
+issue is that when we create the Phi node in the merge block, we need to
+set up the block/value pairs that indicate how the Phi will work.
+Importantly, the Phi node expects to have an entry for each predecessor
+of the block in the CFG. Why then, are we getting the current block when
+we just set it to ThenBB 5 lines above? The problem is that the "Then"
+expression may actually itself change the block that the Builder is
+emitting into if, for example, it contains a nested "if/then/else"
+expression. Because calling Codegen recursively could arbitrarily change
+the notion of the current block, we are required to get an up-to-date
+value for code that will set up the Phi node.
+
+.. code-block:: ocaml
+
+ (* Emit 'else' value. *)
+ let else_bb = append_block context "else" the_function in
+ position_at_end else_bb builder;
+ let else_val = codegen_expr else_ in
+
+ (* Codegen of 'else' can change the current block, update else_bb for the
+ * phi. *)
+ let new_else_bb = insertion_block builder in
+
+Code generation for the 'else' block is basically identical to codegen
+for the 'then' block.
+
+.. code-block:: ocaml
+
+ (* Emit merge block. *)
+ let merge_bb = append_block context "ifcont" the_function in
+ position_at_end merge_bb builder;
+ let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
+ let phi = build_phi incoming "iftmp" builder in
+
+The first two lines here are now familiar: the first adds the "merge"
+block to the Function object. The second changes the insertion
+point so that newly created code will go into the "merge" block. Once
+that is done, we need to create the PHI node and set up the block/value
+pairs for the PHI.
+
+.. code-block:: ocaml
+
+ (* Return to the start block to add the conditional branch. *)
+ position_at_end start_bb builder;
+ ignore (build_cond_br cond_val then_bb else_bb builder);
+
+Once the blocks are created, we can emit the conditional branch that
+chooses between them. Note that creating new blocks does not implicitly
+affect the IRBuilder, so it is still inserting into the block that the
+condition went into. This is why we needed to save the "start" block.
+
+.. code-block:: ocaml
+
+ (* Set a unconditional branch at the end of the 'then' block and the
+ * 'else' block to the 'merge' block. *)
+ position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
+ position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
+
+ (* Finally, set the builder to the end of the merge block. *)
+ position_at_end merge_bb builder;
+
+ phi
+
+To finish off the blocks, we create an unconditional branch to the merge
+block. One interesting (and very important) aspect of the LLVM IR is
+that it `requires all basic blocks to be
+"terminated" <../LangRef.html#functionstructure>`_ with a `control flow
+instruction <../LangRef.html#terminators>`_ such as return or branch.
+This means that all control flow, *including fall throughs* must be made
+explicit in the LLVM IR. If you violate this rule, the verifier will
+emit an error.
+
+Finally, the CodeGen function returns the phi node as the value computed
+by the if/then/else expression. In our example above, this returned
+value will feed into the code for the top-level function, which will
+create the return instruction.
+
+Overall, we now have the ability to execute conditional code in
+Kaleidoscope. With this extension, Kaleidoscope is a fairly complete
+language that can calculate a wide variety of numeric functions. Next up
+we'll add another useful expression that is familiar from non-functional
+languages...
+
+'for' Loop Expression
+=====================
+
+Now that we know how to add basic control flow constructs to the
+language, we have the tools to add more powerful things. Lets add
+something more aggressive, a 'for' expression:
+
+::
+
+ extern putchard(char);
+ def printstar(n)
+ for i = 1, i < n, 1.0 in
+ putchard(42); # ascii 42 = '*'
+
+ # print 100 '*' characters
+ printstar(100);
+
+This expression defines a new variable ("i" in this case) which iterates
+from a starting value, while the condition ("i < n" in this case) is
+true, incrementing by an optional step value ("1.0" in this case). If
+the step value is omitted, it defaults to 1.0. While the loop is true,
+it executes its body expression. Because we don't have anything better
+to return, we'll just define the loop as always returning 0.0. In the
+future when we have mutable variables, it will get more useful.
+
+As before, lets talk about the changes that we need to Kaleidoscope to
+support this.
+
+Lexer Extensions for the 'for' Loop
+-----------------------------------
+
+The lexer extensions are the same sort of thing as for if/then/else:
+
+.. code-block:: ocaml
+
+ ... in Token.token ...
+ (* control *)
+ | If | Then | Else
+ | For | In
+
+ ... in Lexer.lex_ident...
+ match Buffer.contents buffer with
+ | "def" -> [< 'Token.Def; stream >]
+ | "extern" -> [< 'Token.Extern; stream >]
+ | "if" -> [< 'Token.If; stream >]
+ | "then" -> [< 'Token.Then; stream >]
+ | "else" -> [< 'Token.Else; stream >]
+ | "for" -> [< 'Token.For; stream >]
+ | "in" -> [< 'Token.In; stream >]
+ | id -> [< 'Token.Ident id; stream >]
+
+AST Extensions for the 'for' Loop
+---------------------------------
+
+The AST variant is just as simple. It basically boils down to capturing
+the variable name and the constituent expressions in the node.
+
+.. code-block:: ocaml
+
+ type expr =
+ ...
+ (* variant for for/in. *)
+ | For of string * expr * expr * expr option * expr
+
+Parser Extensions for the 'for' Loop
+------------------------------------
+
+The parser code is also fairly standard. The only interesting thing here
+is handling of the optional step value. The parser code handles it by
+checking to see if the second comma is present. If not, it sets the step
+value to null in the AST node:
+
+.. code-block:: ocaml
+
+ let rec parse_primary = parser
+ ...
+ (* forexpr
+ ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
+ | [< 'Token.For;
+ 'Token.Ident id ?? "expected identifier after for";
+ 'Token.Kwd '=' ?? "expected '=' after for";
+ stream >] ->
+ begin parser
+ | [<
+ start=parse_expr;
+ 'Token.Kwd ',' ?? "expected ',' after for";
+ end_=parse_expr;
+ stream >] ->
+ let step =
+ begin parser
+ | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
+ | [< >] -> None
+ end stream
+ in
+ begin parser
+ | [< 'Token.In; body=parse_expr >] ->
+ Ast.For (id, start, end_, step, body)
+ | [< >] ->
+ raise (Stream.Error "expected 'in' after for")
+ end stream
+ | [< >] ->
+ raise (Stream.Error "expected '=' after for")
+ end stream
+
+LLVM IR for the 'for' Loop
+--------------------------
+
+Now we get to the good part: the LLVM IR we want to generate for this
+thing. With the simple example above, we get this LLVM IR (note that
+this dump is generated with optimizations disabled for clarity):
+
+.. code-block:: llvm
+
+ declare double @putchard(double)
+
+ define double @printstar(double %n) {
+ entry:
+ ; initial value = 1.0 (inlined into phi)
+ br label %loop
+
+ loop: ; preds = %loop, %entry
+ %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
+ ; body
+ %calltmp = call double @putchard(double 4.200000e+01)
+ ; increment
+ %nextvar = fadd double %i, 1.000000e+00
+
+ ; termination test
+ %cmptmp = fcmp ult double %i, %n
+ %booltmp = uitofp i1 %cmptmp to double
+ %loopcond = fcmp one double %booltmp, 0.000000e+00
+ br i1 %loopcond, label %loop, label %afterloop
+
+ afterloop: ; preds = %loop
+ ; loop always returns 0.0
+ ret double 0.000000e+00
+ }
+
+This loop contains all the same constructs we saw before: a phi node,
+several expressions, and some basic blocks. Lets see how this fits
+together.
+
+Code Generation for the 'for' Loop
+----------------------------------
+
+The first part of Codegen is very simple: we just output the start
+expression for the loop value:
+
+.. code-block:: ocaml
+
+ let rec codegen_expr = function
+ ...
+ | Ast.For (var_name, start, end_, step, body) ->
+ (* Emit the start code first, without 'variable' in scope. *)
+ let start_val = codegen_expr start in
+
+With this out of the way, the next step is to set up the LLVM basic
+block for the start of the loop body. In the case above, the whole loop
+body is one block, but remember that the body code itself could consist
+of multiple blocks (e.g. if it contains an if/then/else or a for/in
+expression).
+
+.. code-block:: ocaml
+
+ (* Make the new basic block for the loop header, inserting after current
+ * block. *)
+ let preheader_bb = insertion_block builder in
+ let the_function = block_parent preheader_bb in
+ let loop_bb = append_block context "loop" the_function in
+
+ (* Insert an explicit fall through from the current block to the
+ * loop_bb. *)
+ ignore (build_br loop_bb builder);
+
+This code is similar to what we saw for if/then/else. Because we will
+need it to create the Phi node, we remember the block that falls through
+into the loop. Once we have that, we create the actual block that starts
+the loop and create an unconditional branch for the fall-through between
+the two blocks.
+
+.. code-block:: ocaml
+
+ (* Start insertion in loop_bb. *)
+ position_at_end loop_bb builder;
+
+ (* Start the PHI node with an entry for start. *)
+ let variable = build_phi [(start_val, preheader_bb)] var_name builder in
+
+Now that the "preheader" for the loop is set up, we switch to emitting
+code for the loop body. To begin with, we move the insertion point and
+create the PHI node for the loop induction variable. Since we already
+know the incoming value for the starting value, we add it to the Phi
+node. Note that the Phi will eventually get a second value for the
+backedge, but we can't set it up yet (because it doesn't exist!).
+
+.. code-block:: ocaml
+
+ (* Within the loop, the variable is defined equal to the PHI node. If it
+ * shadows an existing variable, we have to restore it, so save it
+ * now. *)
+ let old_val =
+ try Some (Hashtbl.find named_values var_name) with Not_found -> None
+ in
+ Hashtbl.add named_values var_name variable;
+
+ (* Emit the body of the loop. This, like any other expr, can change the
+ * current BB. Note that we ignore the value computed by the body, but
+ * don't allow an error *)
+ ignore (codegen_expr body);
+
+Now the code starts to get more interesting. Our 'for' loop introduces a
+new variable to the symbol table. This means that our symbol table can
+now contain either function arguments or loop variables. To handle this,
+before we codegen the body of the loop, we add the loop variable as the
+current value for its name. Note that it is possible that there is a
+variable of the same name in the outer scope. It would be easy to make
+this an error (emit an error and return null if there is already an
+entry for VarName) but we choose to allow shadowing of variables. In
+order to handle this correctly, we remember the Value that we are
+potentially shadowing in ``old_val`` (which will be None if there is no
+shadowed variable).
+
+Once the loop variable is set into the symbol table, the code
+recursively codegen's the body. This allows the body to use the loop
+variable: any references to it will naturally find it in the symbol
+table.
+
+.. code-block:: ocaml
+
+ (* Emit the step value. *)
+ let step_val =
+ match step with
+ | Some step -> codegen_expr step
+ (* If not specified, use 1.0. *)
+ | None -> const_float double_type 1.0
+ in
+
+ let next_var = build_add variable step_val "nextvar" builder in
+
+Now that the body is emitted, we compute the next value of the iteration
+variable by adding the step value, or 1.0 if it isn't present.
+'``next_var``' will be the value of the loop variable on the next
+iteration of the loop.
+
+.. code-block:: ocaml
+
+ (* Compute the end condition. *)
+ let end_cond = codegen_expr end_ in
+
+ (* Convert condition to a bool by comparing equal to 0.0. *)
+ let zero = const_float double_type 0.0 in
+ let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
+
+Finally, we evaluate the exit value of the loop, to determine whether
+the loop should exit. This mirrors the condition evaluation for the
+if/then/else statement.
+
+.. code-block:: ocaml
+
+ (* Create the "after loop" block and insert it. *)
+ let loop_end_bb = insertion_block builder in
+ let after_bb = append_block context "afterloop" the_function in
+
+ (* Insert the conditional branch into the end of loop_end_bb. *)
+ ignore (build_cond_br end_cond loop_bb after_bb builder);
+
+ (* Any new code will be inserted in after_bb. *)
+ position_at_end after_bb builder;
+
+With the code for the body of the loop complete, we just need to finish
+up the control flow for it. This code remembers the end block (for the
+phi node), then creates the block for the loop exit ("afterloop"). Based
+on the value of the exit condition, it creates a conditional branch that
+chooses between executing the loop again and exiting the loop. Any
+future code is emitted in the "afterloop" block, so it sets the
+insertion position to it.
+
+.. code-block:: ocaml
+
+ (* Add a new entry to the PHI node for the backedge. *)
+ add_incoming (next_var, loop_end_bb) variable;
+
+ (* Restore the unshadowed variable. *)
+ begin match old_val with
+ | Some old_val -> Hashtbl.add named_values var_name old_val
+ | None -> ()
+ end;
+
+ (* for expr always returns 0.0. *)
+ const_null double_type
+
+The final code handles various cleanups: now that we have the
+"``next_var``" value, we can add the incoming value to the loop PHI
+node. After that, we remove the loop variable from the symbol table, so
+that it isn't in scope after the for loop. Finally, code generation of
+the for loop always returns 0.0, so that is what we return from
+``Codegen.codegen_expr``.
+
+With this, we conclude the "adding control flow to Kaleidoscope" chapter
+of the tutorial. In this chapter we added two control flow constructs,
+and used them to motivate a couple of aspects of the LLVM IR that are
+important for front-end implementors to know. In the next chapter of our
+saga, we will get a bit crazier and add `user-defined
+operators <OCamlLangImpl6.html>`_ to our poor innocent language.
+
+Full Code Listing
+=================
+
+Here is the complete code listing for our running example, enhanced with
+the if/then/else and for expressions.. To build this example, use:
+
+.. code-block:: bash
+
+ # Compile
+ ocamlbuild toy.byte
+ # Run
+ ./toy.byte
+
+Here is the code:
+
+\_tags:
+ ::
+
+ <{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
+ <*.{byte,native}>: g++, use_llvm, use_llvm_analysis
+ <*.{byte,native}>: use_llvm_executionengine, use_llvm_target
+ <*.{byte,native}>: use_llvm_scalar_opts, use_bindings
+
+myocamlbuild.ml:
+ .. code-block:: ocaml
+
+ open Ocamlbuild_plugin;;
+
+ ocaml_lib ~extern:true "llvm";;
+ ocaml_lib ~extern:true "llvm_analysis";;
+ ocaml_lib ~extern:true "llvm_executionengine";;
+ ocaml_lib ~extern:true "llvm_target";;
+ ocaml_lib ~extern:true "llvm_scalar_opts";;
+
+ flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
+ dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
+
+token.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Lexer Tokens
+ *===----------------------------------------------------------------------===*)
+
+ (* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
+ * these others for known things. *)
+ type token =
+ (* commands *)
+ | Def | Extern
+
+ (* primary *)
+ | Ident of string | Number of float
+
+ (* unknown *)
+ | Kwd of char
+
+ (* control *)
+ | If | Then | Else
+ | For | In
+
+lexer.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Lexer
+ *===----------------------------------------------------------------------===*)
+
+ let rec lex = parser
+ (* Skip any whitespace. *)
+ | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream
+
+ (* identifier: [a-zA-Z][a-zA-Z0-9] *)
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+
+ (* number: [0-9.]+ *)
+ | [< ' ('0' .. '9' as c); stream >] ->
+ let buffer = Buffer.create 1 in
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+
+ (* Comment until end of line. *)
+ | [< ' ('#'); stream >] ->
+ lex_comment stream
+
+ (* Otherwise, just return the character as its ascii value. *)
+ | [< 'c; stream >] ->
+ [< 'Token.Kwd c; lex stream >]
+
+ (* end of stream. *)
+ | [< >] -> [< >]
+
+ and lex_number buffer = parser
+ | [< ' ('0' .. '9' | '.' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_number buffer stream
+ | [< stream=lex >] ->
+ [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
+
+ and lex_ident buffer = parser
+ | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
+ Buffer.add_char buffer c;
+ lex_ident buffer stream
+ | [< stream=lex >] ->
+ match Buffer.contents buffer with
+ | "def" -> [< 'Token.Def; stream >]
+ | "extern" -> [< 'Token.Extern; stream >]
+ | "if" -> [< 'Token.If; stream >]
+ | "then" -> [< 'Token.Then; stream >]
+ | "else" -> [< 'Token.Else; stream >]
+ | "for" -> [< 'Token.For; stream >]
+ | "in" -> [< 'Token.In; stream >]
+ | id -> [< 'Token.Ident id; stream >]
+
+ and lex_comment = parser
+ | [< ' ('\n'); stream=lex >] -> stream
+ | [< 'c; e=lex_comment >] -> e
+ | [< >] -> [< >]
+
+ast.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Abstract Syntax Tree (aka Parse Tree)
+ *===----------------------------------------------------------------------===*)
+
+ (* expr - Base type for all expression nodes. *)
+ type expr =
+ (* variant for numeric literals like "1.0". *)
+ | Number of float
+
+ (* variant for referencing a variable, like "a". *)
+ | Variable of string
+
+ (* variant for a binary operator. *)
+ | Binary of char * expr * expr
+
+ (* variant for function calls. *)
+ | Call of string * expr array
+
+ (* variant for if/then/else. *)
+ | If of expr * expr * expr
+
+ (* variant for for/in. *)
+ | For of string * expr * expr * expr option * expr
+
+ (* proto - This type represents the "prototype" for a function, which captures
+ * its name, and its argument names (thus implicitly the number of arguments the
+ * function takes). *)
+ type proto = Prototype of string * string array
+
+ (* func - This type represents a function definition itself. *)
+ type func = Function of proto * expr
+
+parser.ml:
+ .. code-block:: ocaml
+
+ (*===---------------------------------------------------------------------===
+ * Parser
+ *===---------------------------------------------------------------------===*)
+
+ (* binop_precedence - This holds the precedence for each binary operator that is
+ * defined *)
+ let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
+
+ (* precedence - Get the precedence of the pending binary operator token. *)
+ let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
+
+ (* primary
+ * ::= identifier
+ * ::= numberexpr
+ * ::= parenexpr
+ * ::= ifexpr
+ * ::= forexpr *)
+ let rec parse_primary = parser
+ (* numberexpr ::= number *)
+ | [< 'Token.Number n >] -> Ast.Number n
+
+ (* parenexpr ::= '(' expression ')' *)
+ | [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
+
+ (* identifierexpr
+ * ::= identifier
+ * ::= identifier '(' argumentexpr ')' *)
+ | [< 'Token.Ident id; stream >] ->
+ let rec parse_args accumulator = parser
+ | [< e=parse_expr; stream >] ->
+ begin parser
+ | [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
+ | [< >] -> e :: accumulator
+ end stream
+ | [< >] -> accumulator
+ in
+ let rec parse_ident id = parser
+ (* Call. *)
+ | [< 'Token.Kwd '(';
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')'">] ->
+ Ast.Call (id, Array.of_list (List.rev args))
+
+ (* Simple variable ref. *)
+ | [< >] -> Ast.Variable id
+ in
+ parse_ident id stream
+
+ (* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
+ | [< 'Token.If; c=parse_expr;
+ 'Token.Then ?? "expected 'then'"; t=parse_expr;
+ 'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
+ Ast.If (c, t, e)
+
+ (* forexpr
+ ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
+ | [< 'Token.For;
+ 'Token.Ident id ?? "expected identifier after for";
+ 'Token.Kwd '=' ?? "expected '=' after for";
+ stream >] ->
+ begin parser
+ | [<
+ start=parse_expr;
+ 'Token.Kwd ',' ?? "expected ',' after for";
+ end_=parse_expr;
+ stream >] ->
+ let step =
+ begin parser
+ | [< 'Token.Kwd ','; step=parse_expr >] -> Some step
+ | [< >] -> None
+ end stream
+ in
+ begin parser
+ | [< 'Token.In; body=parse_expr >] ->
+ Ast.For (id, start, end_, step, body)
+ | [< >] ->
+ raise (Stream.Error "expected 'in' after for")
+ end stream
+ | [< >] ->
+ raise (Stream.Error "expected '=' after for")
+ end stream
+
+ | [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
+
+ (* binoprhs
+ * ::= ('+' primary)* *)
+ and parse_bin_rhs expr_prec lhs stream =
+ match Stream.peek stream with
+ (* If this is a binop, find its precedence. *)
+ | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
+ let token_prec = precedence c in
+
+ (* If this is a binop that binds at least as tightly as the current binop,
+ * consume it, otherwise we are done. *)
+ if token_prec < expr_prec then lhs else begin
+ (* Eat the binop. *)
+ Stream.junk stream;
+
+ (* Parse the primary expression after the binary operator. *)
+ let rhs = parse_primary stream in
+
+ (* Okay, we know this is a binop. *)
+ let rhs =
+ match Stream.peek stream with
+ | Some (Token.Kwd c2) ->
+ (* If BinOp binds less tightly with rhs than the operator after
+ * rhs, let the pending operator take rhs as its lhs. *)
+ let next_prec = precedence c2 in
+ if token_prec < next_prec
+ then parse_bin_rhs (token_prec + 1) rhs stream
+ else rhs
+ | _ -> rhs
+ in
+
+ (* Merge lhs/rhs. *)
+ let lhs = Ast.Binary (c, lhs, rhs) in
+ parse_bin_rhs expr_prec lhs stream
+ end
+ | _ -> lhs
+
+ (* expression
+ * ::= primary binoprhs *)
+ and parse_expr = parser
+ | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
+
+ (* prototype
+ * ::= id '(' id* ')' *)
+ let parse_prototype =
+ let rec parse_args accumulator = parser
+ | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
+ | [< >] -> accumulator
+ in
+
+ parser
+ | [< 'Token.Ident id;
+ 'Token.Kwd '(' ?? "expected '(' in prototype";
+ args=parse_args [];
+ 'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
+ (* success. *)
+ Ast.Prototype (id, Array.of_list (List.rev args))
+
+ | [< >] ->
+ raise (Stream.Error "expected function name in prototype")
+
+ (* definition ::= 'def' prototype expression *)
+ let parse_definition = parser
+ | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
+ Ast.Function (p, e)
+
+ (* toplevelexpr ::= expression *)
+ let parse_toplevel = parser
+ | [< e=parse_expr >] ->
+ (* Make an anonymous proto. *)
+ Ast.Function (Ast.Prototype ("", [||]), e)
+
+ (* external ::= 'extern' prototype *)
+ let parse_extern = parser
+ | [< 'Token.Extern; e=parse_prototype >] -> e
+
+codegen.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Code Generation
+ *===----------------------------------------------------------------------===*)
+
+ open Llvm
+
+ exception Error of string
+
+ let context = global_context ()
+ let the_module = create_module context "my cool jit"
+ let builder = builder context
+ let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
+ let double_type = double_type context
+
+ let rec codegen_expr = function
+ | Ast.Number n -> const_float double_type n
+ | Ast.Variable name ->
+ (try Hashtbl.find named_values name with
+ | Not_found -> raise (Error "unknown variable name"))
+ | Ast.Binary (op, lhs, rhs) ->
+ let lhs_val = codegen_expr lhs in
+ let rhs_val = codegen_expr rhs in
+ begin
+ match op with
+ | '+' -> build_add lhs_val rhs_val "addtmp" builder
+ | '-' -> build_sub lhs_val rhs_val "subtmp" builder
+ | '*' -> build_mul lhs_val rhs_val "multmp" builder
+ | '<' ->
+ (* Convert bool 0/1 to double 0.0 or 1.0 *)
+ let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
+ build_uitofp i double_type "booltmp" builder
+ | _ -> raise (Error "invalid binary operator")
+ end
+ | Ast.Call (callee, args) ->
+ (* Look up the name in the module table. *)
+ let callee =
+ match lookup_function callee the_module with
+ | Some callee -> callee
+ | None -> raise (Error "unknown function referenced")
+ in
+ let params = params callee in
+
+ (* If argument mismatch error. *)
+ if Array.length params == Array.length args then () else
+ raise (Error "incorrect # arguments passed");
+ let args = Array.map codegen_expr args in
+ build_call callee args "calltmp" builder
+ | Ast.If (cond, then_, else_) ->
+ let cond = codegen_expr cond in
+
+ (* Convert condition to a bool by comparing equal to 0.0 *)
+ let zero = const_float double_type 0.0 in
+ let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
+
+ (* Grab the first block so that we might later add the conditional branch
+ * to it at the end of the function. *)
+ let start_bb = insertion_block builder in
+ let the_function = block_parent start_bb in
+
+ let then_bb = append_block context "then" the_function in
+
+ (* Emit 'then' value. *)
+ position_at_end then_bb builder;
+ let then_val = codegen_expr then_ in
+
+ (* Codegen of 'then' can change the current block, update then_bb for the
+ * phi. We create a new name because one is used for the phi node, and the
+ * other is used for the conditional branch. *)
+ let new_then_bb = insertion_block builder in
+
+ (* Emit 'else' value. *)
+ let else_bb = append_block context "else" the_function in
+ position_at_end else_bb builder;
+ let else_val = codegen_expr else_ in
+
+ (* Codegen of 'else' can change the current block, update else_bb for the
+ * phi. *)
+ let new_else_bb = insertion_block builder in
+
+ (* Emit merge block. *)
+ let merge_bb = append_block context "ifcont" the_function in
+ position_at_end merge_bb builder;
+ let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
+ let phi = build_phi incoming "iftmp" builder in
+
+ (* Return to the start block to add the conditional branch. *)
+ position_at_end start_bb builder;
+ ignore (build_cond_br cond_val then_bb else_bb builder);
+
+ (* Set a unconditional branch at the end of the 'then' block and the
+ * 'else' block to the 'merge' block. *)
+ position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
+ position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
+
+ (* Finally, set the builder to the end of the merge block. *)
+ position_at_end merge_bb builder;
+
+ phi
+ | Ast.For (var_name, start, end_, step, body) ->
+ (* Emit the start code first, without 'variable' in scope. *)
+ let start_val = codegen_expr start in
+
+ (* Make the new basic block for the loop header, inserting after current
+ * block. *)
+ let preheader_bb = insertion_block builder in
+ let the_function = block_parent preheader_bb in
+ let loop_bb = append_block context "loop" the_function in
+
+ (* Insert an explicit fall through from the current block to the
+ * loop_bb. *)
+ ignore (build_br loop_bb builder);
+
+ (* Start insertion in loop_bb. *)
+ position_at_end loop_bb builder;
+
+ (* Start the PHI node with an entry for start. *)
+ let variable = build_phi [(start_val, preheader_bb)] var_name builder in
+
+ (* Within the loop, the variable is defined equal to the PHI node. If it
+ * shadows an existing variable, we have to restore it, so save it
+ * now. *)
+ let old_val =
+ try Some (Hashtbl.find named_values var_name) with Not_found -> None
+ in
+ Hashtbl.add named_values var_name variable;
+
+ (* Emit the body of the loop. This, like any other expr, can change the
+ * current BB. Note that we ignore the value computed by the body, but
+ * don't allow an error *)
+ ignore (codegen_expr body);
+
+ (* Emit the step value. *)
+ let step_val =
+ match step with
+ | Some step -> codegen_expr step
+ (* If not specified, use 1.0. *)
+ | None -> const_float double_type 1.0
+ in
+
+ let next_var = build_add variable step_val "nextvar" builder in
+
+ (* Compute the end condition. *)
+ let end_cond = codegen_expr end_ in
+
+ (* Convert condition to a bool by comparing equal to 0.0. *)
+ let zero = const_float double_type 0.0 in
+ let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
+
+ (* Create the "after loop" block and insert it. *)
+ let loop_end_bb = insertion_block builder in
+ let after_bb = append_block context "afterloop" the_function in
+
+ (* Insert the conditional branch into the end of loop_end_bb. *)
+ ignore (build_cond_br end_cond loop_bb after_bb builder);
+
+ (* Any new code will be inserted in after_bb. *)
+ position_at_end after_bb builder;
+
+ (* Add a new entry to the PHI node for the backedge. *)
+ add_incoming (next_var, loop_end_bb) variable;
+
+ (* Restore the unshadowed variable. *)
+ begin match old_val with
+ | Some old_val -> Hashtbl.add named_values var_name old_val
+ | None -> ()
+ end;
+
+ (* for expr always returns 0.0. *)
+ const_null double_type
+
+ let codegen_proto = function
+ | Ast.Prototype (name, args) ->
+ (* Make the function type: double(double,double) etc. *)
+ let doubles = Array.make (Array.length args) double_type in
+ let ft = function_type double_type doubles in
+ let f =
+ match lookup_function name the_module with
+ | None -> declare_function name ft the_module
+
+ (* If 'f' conflicted, there was already something named 'name'. If it
+ * has a body, don't allow redefinition or reextern. *)
+ | Some f ->
+ (* If 'f' already has a body, reject this. *)
+ if block_begin f <> At_end f then
+ raise (Error "redefinition of function");
+
+ (* If 'f' took a different number of arguments, reject. *)
+ if element_type (type_of f) <> ft then
+ raise (Error "redefinition of function with different # args");
+ f
+ in
+
+ (* Set names for all arguments. *)
+ Array.iteri (fun i a ->
+ let n = args.(i) in
+ set_value_name n a;
+ Hashtbl.add named_values n a;
+ ) (params f);
+ f
+
+ let codegen_func the_fpm = function
+ | Ast.Function (proto, body) ->
+ Hashtbl.clear named_values;
+ let the_function = codegen_proto proto in
+
+ (* Create a new basic block to start insertion into. *)
+ let bb = append_block context "entry" the_function in
+ position_at_end bb builder;
+
+ try
+ let ret_val = codegen_expr body in
+
+ (* Finish off the function. *)
+ let _ = build_ret ret_val builder in
+
+ (* Validate the generated code, checking for consistency. *)
+ Llvm_analysis.assert_valid_function the_function;
+
+ (* Optimize the function. *)
+ let _ = PassManager.run_function the_function the_fpm in
+
+ the_function
+ with e ->
+ delete_function the_function;
+ raise e
+
+toplevel.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Top-Level parsing and JIT Driver
+ *===----------------------------------------------------------------------===*)
+
+ open Llvm
+ open Llvm_executionengine
+
+ (* top ::= definition | external | expression | ';' *)
+ let rec main_loop the_fpm the_execution_engine stream =
+ match Stream.peek stream with
+ | None -> ()
+
+ (* ignore top-level semicolons. *)
+ | Some (Token.Kwd ';') ->
+ Stream.junk stream;
+ main_loop the_fpm the_execution_engine stream
+
+ | Some token ->
+ begin
+ try match token with
+ | Token.Def ->
+ let e = Parser.parse_definition stream in
+ print_endline "parsed a function definition.";
+ dump_value (Codegen.codegen_func the_fpm e);
+ | Token.Extern ->
+ let e = Parser.parse_extern stream in
+ print_endline "parsed an extern.";
+ dump_value (Codegen.codegen_proto e);
+ | _ ->
+ (* Evaluate a top-level expression into an anonymous function. *)
+ let e = Parser.parse_toplevel stream in
+ print_endline "parsed a top-level expr";
+ let the_function = Codegen.codegen_func the_fpm e in
+ dump_value the_function;
+
+ (* JIT the function, returning a function pointer. *)
+ let result = ExecutionEngine.run_function the_function [||]
+ the_execution_engine in
+
+ print_string "Evaluated to ";
+ print_float (GenericValue.as_float Codegen.double_type result);
+ print_newline ();
+ with Stream.Error s | Codegen.Error s ->
+ (* Skip token for error recovery. *)
+ Stream.junk stream;
+ print_endline s;
+ end;
+ print_string "ready> "; flush stdout;
+ main_loop the_fpm the_execution_engine stream
+
+toy.ml:
+ .. code-block:: ocaml
+
+ (*===----------------------------------------------------------------------===
+ * Main driver code.
+ *===----------------------------------------------------------------------===*)
+
+ open Llvm
+ open Llvm_executionengine
+ open Llvm_target
+ open Llvm_scalar_opts
+
+ let main () =
+ ignore (initialize_native_target ());
+
+ (* Install standard binary operators.
+ * 1 is the lowest precedence. *)
+ Hashtbl.add Parser.binop_precedence '<' 10;
+ Hashtbl.add Parser.binop_precedence '+' 20;
+ Hashtbl.add Parser.binop_precedence '-' 20;
+ Hashtbl.add Parser.binop_precedence '*' 40; (* highest. *)
+
+ (* Prime the first token. *)
+ print_string "ready> "; flush stdout;
+ let stream = Lexer.lex (Stream.of_channel stdin) in
+
+ (* Create the JIT. *)
+ let the_execution_engine = ExecutionEngine.create Codegen.the_module in
+ let the_fpm = PassManager.create_function Codegen.the_module in
+
+ (* Set up the optimizer pipeline. Start with registering info about how the
+ * target lays out data structures. *)
+ DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
+
+ (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
+ add_instruction_combination the_fpm;
+
+ (* reassociate expressions. *)
+ add_reassociation the_fpm;
+
+ (* Eliminate Common SubExpressions. *)
+ add_gvn the_fpm;
+
+ (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
+ add_cfg_simplification the_fpm;
+
+ ignore (PassManager.initialize the_fpm);
+
+ (* Run the main "interpreter loop" now. *)
+ Toplevel.main_loop the_fpm the_execution_engine stream;
+
+ (* Print out all the generated code. *)
+ dump_module Codegen.the_module
+ ;;
+
+ main ()
+
+bindings.c
+ .. code-block:: c
+
+ #include <stdio.h>
+
+ /* putchard - putchar that takes a double and returns 0. */
+ extern double putchard(double X) {
+ putchar((char)X);
+ return 0;
+ }
+
+`Next: Extending the language: user-defined
+operators <OCamlLangImpl6.html>`_
+
More information about the llvm-commits
mailing list