[llvm-commits] [llvm] r44157 - in /llvm/trunk/docs/tutorial: LangImpl6.html LangImpl7.html LangImpl8.html

Chris Lattner sabre at nondot.org
Wed Nov 14 20:51:32 PST 2007


Author: lattner
Date: Wed Nov 14 22:51:31 2007
New Revision: 44157

URL: http://llvm.org/viewvc/llvm-project?rev=44157&view=rev
Log:
many edits, patch by Kelly Wilson!

Modified:
    llvm/trunk/docs/tutorial/LangImpl6.html
    llvm/trunk/docs/tutorial/LangImpl7.html
    llvm/trunk/docs/tutorial/LangImpl8.html

Modified: llvm/trunk/docs/tutorial/LangImpl6.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/tutorial/LangImpl6.html?rev=44157&r1=44156&r2=44157&view=diff

==============================================================================
--- llvm/trunk/docs/tutorial/LangImpl6.html (original)
+++ llvm/trunk/docs/tutorial/LangImpl6.html Wed Nov 14 22:51:31 2007
@@ -41,15 +41,16 @@
 
 <p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language
 with LLVM</a>" tutorial.  At this point in our tutorial, we now have a fully
-functional language that is fairly minimal, but also useful.  One big problem
-with it though is that it doesn't have many useful operators (like division,
-logical negation, or even any comparisons other than less-than.</p>
+functional language that is fairly minimal, but also useful.  There
+is still one big problem with it, however. Our language doesn't have many 
+useful operators (like division, logical negation, or even any comparisons 
+besides less-than).</p>
 
 <p>This chapter of the tutorial takes a wild digression into adding user-defined
-operators to the simple and beautiful Kaleidoscope language, giving us a 
-simple and ugly language in some ways, but also a powerful one at the same time.
+operators to the simple and beautiful Kaleidoscope language. This digression now gives 
+us a simple and ugly language in some ways, but also a powerful one at the same time.
 One of the great things about creating your own language is that you get to
-decide what is good or bad.  In this tutorial we'll assume that it is okay and
+decide what is good or bad.  In this tutorial we'll assume that it is okay to
 use this as a way to show some interesting parsing techniques.</p>
 
 <p>At the end of this tutorial, we'll run through an example Kaleidoscope 
@@ -73,8 +74,8 @@
 operators that are supported.</p>
 
 <p>The point of going into user-defined operators in a tutorial like this is to
-show the power and flexibility of using a hand-written parser.  The parser we
-are using so far is using recursive descent for most parts of the grammar, and 
+show the power and flexibility of using a hand-written parser.  Thus far, the parser
+we have been implementing uses recursive descent for most parts of the grammar and 
 operator precedence parsing for the expressions.  See <a 
 href="LangImpl2.html">Chapter 2</a> for details.  Without using operator
 precedence parsing, it would be very difficult to allow the programmer to
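
[Editor's note: the precedence machinery this passage relies on can be sketched in a few lines of self-contained C++. The names mirror the tutorial's, but this is an illustrative sketch, not the tutorial's actual code.]

```cpp
#include <map>

// Minimal sketch of the parser's operator-precedence table: the binary
// expression parser asks for the precedence of the next token, and -1
// means "not a binary operator, stop". Because the table is just a map,
// a user-defined operator is installed with a single insertion at runtime.
static std::map<char, int> BinopPrecedence = {
    {'<', 10}, {'+', 20}, {'-', 20}, {'*', 40}};

static int GetTokPrecedence(char Op) {
  auto It = BinopPrecedence.find(Op);
  return It == BinopPrecedence.end() ? -1 : It->second;
}
```

Registering `def binary| 5` then amounts to `BinopPrecedence['|'] = 5;`, after which the ordinary expression parser handles `|` with no grammar changes.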
@@ -152,12 +153,12 @@
 
 <p>This just adds lexer support for the unary and binary keywords, like we
 did in <a href="LangImpl5.html#iflexer">previous chapters</a>.  One nice thing
-about our current AST is that we represent binary operators fully generally
-with their ASCII code as the opcode.  For our extended operators, we'll use the
+about our current AST is that we represent binary operators in a fully general
+way, using their ASCII code as the opcode.  For our extended operators, we'll use this
 same representation, so we don't need any new AST or parser support.</p>
 
 <p>On the other hand, we have to be able to represent the definitions of these
-new operators, in the "def binary| 5" part of the function definition.  In the
+new operators, in the "def binary| 5" part of the function definition.  In our
 grammar so far, the "name" for the function definition is parsed as the
 "prototype" production and into the <tt>PrototypeAST</tt> AST node.  To
 represent our new user-defined operators as prototypes, we have to extend
@@ -257,14 +258,14 @@
 </pre>
 </div>
 
-<p>This is all fairly straight-forward parsing code, and we have already seen
-a lot of similar code in the past.  One interesting piece of this is the part
-that sets up <tt>FnName</tt> for binary operators.  This builds names like
-"binary@" for a newly defined "@" operator.  This takes advantage of the fact
-that symbol names in the LLVM symbol table are allowed to have any character in
-them, even including embedded nul characters.</p>
+<p>This is all fairly straightforward parsing code, and we have already seen
+a lot of similar code in the past.  One interesting part about the code above is 
+the couple of lines that set up <tt>FnName</tt> for binary operators.  This builds names 
+like "binary@" for a newly defined "@" operator.  This then takes advantage of the 
+fact that symbol names in the LLVM symbol table are allowed to have any character in
+them, including embedded nul characters.</p>
 
-<p>The next interesting piece is codegen support for these binary operators.
+<p>The next interesting thing to add is codegen support for these binary operators.
 Given our current structure, this is a simple addition of a default case for our
 existing binary operator node:</p>
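
[Editor's note: the name-mangling trick described above is tiny; here is a self-contained sketch of it — illustrative, not the tutorial's exact code.]

```cpp
#include <string>

// Sketch of the FnName construction for user-defined binary operators:
// the function implementing "@" is simply named "binary@". LLVM symbol
// names may contain any byte, so the operator character can be embedded
// in the name directly.
static std::string MangleBinaryOp(char Op) {
  return std::string("binary") + Op;
}
```

The codegen default case then just looks this name up in the module's symbol table and emits a call to the resulting function.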
 
@@ -301,10 +302,10 @@
 <p>As you can see above, the new code is actually really simple.  It just does
 a lookup for the appropriate operator in the symbol table and generates a 
 function call to it.  Since user-defined operators are just built as normal
-functions (because the "prototype" boils down into a function with the right
+functions (because the "prototype" boils down to a function with the right
 name) everything falls into place.</p>
 
-<p>The final missing piece is a bit of top level magic, here:</p>
+<p>The final piece of code we are missing is a bit of top level magic:</p>
 
 <div class="doc_code">
 <pre>
@@ -330,10 +331,9 @@
 
 <p>Basically, before codegening a function, if it is a user-defined operator, we
 register it in the precedence table.  This allows the binary operator parsing
-logic we already have to handle it.  Since it is a fully-general operator
-precedence parser, this is all we need to do to "extend the grammar".</p>
+logic we already have in place to handle it.  Since we are working with a fully-general operator precedence parser, this is all we need to do to "extend the grammar".</p>
 
-<p>With that, we have useful user-defined binary operators.  This builds a lot
+<p>Now we have useful user-defined binary operators.  This builds a lot
 on the previous framework we built for other operators.  Adding unary operators
 is a bit more challenging, because we don't have any framework for it yet - lets
 see what it takes.</p>
@@ -347,7 +347,7 @@
 <div class="doc_text">
 
 <p>Since we don't currently support unary operators in the Kaleidoscope
-language, we'll need to add everything for them.  Above, we added simple
+language, we'll need to add everything to support them.  Above, we added simple
 support for the 'unary' keyword to the lexer.  In addition to that, we need an
 AST node:</p>
 
@@ -390,14 +390,14 @@
 </pre>
 </div>
 
-<p>The grammar we add is pretty straight-forward here.  If we see a unary
+<p>The grammar we add is pretty straightforward here.  If we see a unary
 operator when parsing a primary operator, we eat the operator as a prefix and
 parse the remaining piece as another unary operator.  This allows us to handle
 multiple unary operators (e.g. "!!x").  Note that unary operators can't have 
 ambiguous parses like binary operators can, so there is no need for precedence
 information.</p>
 
-<p>The problem with the above is that we need to call ParseUnary from somewhere.
+<p>The problem with this function is that we need to call ParseUnary from somewhere.
 To do this, we change previous callers of ParsePrimary to call ParseUnary
 instead:</p>
 
@@ -424,7 +424,7 @@
 </pre>
 </div>
 
-<p>With these two simple changes, we now parse unary operators and build the
+<p>With these two simple changes, we are now able to parse unary operators and build the
 AST for them.  Next up, we need to add parser support for prototypes, to parse
 the unary operator prototype.  We extend the binary operator code above 
 with:</p>
@@ -587,7 +587,7 @@
 
 <p>Based on these simple primitive operations, we can start to define more
 interesting things.  For example, here's a little function that solves for the
-number of iterations it takes for a function in the complex plane to
+number of iterations it takes a function in the complex plane to
 converge:</p>
 
 <div class="doc_code">
@@ -779,8 +779,8 @@
 plot things that are!</p>
 
 <p>With this, we conclude the "adding user-defined operators" chapter of the
-tutorial.  We successfully extended our language with the ability to extend the
-language in the library, and showed how this can be used to build a simple but
+tutorial.  We have successfully augmented our language, adding the ability to extend the
+language in the library, and we have shown how this can be used to build a simple but
 interesting end-user application in Kaleidoscope.  At this point, Kaleidoscope
 can build a variety of applications that are functional and can call functions
 with side-effects, but it can't actually define and mutate a variable itself.
@@ -790,7 +790,7 @@
 languages, and it is not at all obvious how to <a href="LangImpl7.html">add
 support for mutable variables</a> without having to add an "SSA construction"
 phase to your front-end.  In the next chapter, we will describe how you can
-add this without building SSA in your front-end.</p>
+add variable mutation without building SSA in your front-end.</p>
 
 </div>
 

Modified: llvm/trunk/docs/tutorial/LangImpl7.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/tutorial/LangImpl7.html?rev=44157&r1=44156&r2=44157&view=diff

==============================================================================
--- llvm/trunk/docs/tutorial/LangImpl7.html (original)
+++ llvm/trunk/docs/tutorial/LangImpl7.html Wed Nov 14 22:51:31 2007
@@ -49,11 +49,11 @@
 href="http://en.wikipedia.org/wiki/Functional_programming">functional
 programming language</a>.  In our journey, we learned some parsing techniques,
 how to build and represent an AST, how to build LLVM IR, and how to optimize
-the resultant code and JIT compile it.</p>
+the resultant code as well as JIT compile it.</p>
 
-<p>While Kaleidoscope is interesting as a functional language, this makes it 
-"too easy" to generate LLVM IR for it.  In particular, a functional language
-makes it very easy to build LLVM IR directly in <a 
+<p>While Kaleidoscope is interesting as a functional language, the fact that it
+is functional makes it "too easy" to generate LLVM IR for it.  In particular, a 
+functional language makes it very easy to build LLVM IR directly in <a 
 href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA form</a>.
 Since LLVM requires that the input code be in SSA form, this is a very nice
 property and it is often unclear to newcomers how to generate code for an
@@ -124,13 +124,13 @@
 (cond_true/cond_false).  In order to merge the incoming values, the X.2 phi node
 in the cond_next block selects the right value to use based on where control 
 flow is coming from: if control flow comes from the cond_false block, X.2 gets
-the value of X.1.  Alternatively, if control flow comes from cond_tree, it gets
+the value of X.1.  Alternatively, if control flow comes from cond_true, it gets
 the value of X.0.  The intent of this chapter is not to explain the details of
 SSA form.  For more information, see one of the many <a 
 href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online 
 references</a>.</p>
 
-<p>The question for this article is "who places phi nodes when lowering 
+<p>The question for this article is "who places the phi nodes when lowering 
 assignments to mutable variables?".  The issue here is that LLVM 
 <em>requires</em> that its IR be in SSA form: there is no "non-ssa" mode for it.
 However, SSA construction requires non-trivial algorithms and data structures,
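
[Editor's note: the behavior of the X.2 phi node has a direct analogue in ordinary imperative code. The following sketch uses arbitrary illustrative values, not the tutorial's.]

```cpp
// Plain-C++ analogue of the SSA merge described above: X0 and X1 are the
// distinct immutable values produced on the two paths, and X2 plays the
// role of the phi node, taking its value from whichever predecessor
// block control flow actually came from.
static int MergeExample(bool CondTrue) {
  const int X0 = 17;  // value produced on the cond_true path (arbitrary)
  const int X1 = 42;  // value produced on the cond_false path (arbitrary)
  int X2;
  if (CondTrue)
    X2 = X0;  // control flow arrived from cond_true: phi yields X.0
  else
    X2 = X1;  // control flow arrived from cond_false: phi yields X.1
  return X2;
}
```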
@@ -162,12 +162,12 @@
 </p>
 
 <p>In LLVM, all memory accesses are explicit with load/store instructions, and
-it is carefully designed to not have (or need) an "address-of" operator.  Notice
+it is carefully designed not to have (or need) an "address-of" operator.  Notice
 how the type of the @G/@H global variables is actually "i32*" even though the 
 variable is defined as "i32".  What this means is that @G defines <em>space</em>
 for an i32 in the global data area, but its <em>name</em> actually refers to the
-address for that space.  Stack variables work the same way, but instead of being
-declared with global variable definitions, they are declared with the 
+address for that space.  Stack variables work the same way, except that instead of 
+being declared with global variable definitions, they are declared with the 
 <a href="../LangRef.html#i_alloca">LLVM alloca instruction</a>:</p>
 
 <div class="doc_code">
@@ -259,10 +259,10 @@
 </pre>
 </div>
 
-<p>The mem2reg pass implements the standard "iterated dominator frontier"
+<p>The mem2reg pass implements the standard "iterated dominance frontier"
 algorithm for constructing SSA form and has a number of optimizations that speed
-up (very common) degenerate cases. mem2reg is the answer for dealing with
-mutable variables, and we highly recommend that you depend on it.  Note that
+up (very common) degenerate cases. The mem2reg optimization pass is the answer to dealing 
+with mutable variables, and we highly recommend that you depend on it.  Note that
 mem2reg only works on variables in certain circumstances:</p>
 
 <ol>
@@ -288,10 +288,10 @@
 
 <p>
 All of these properties are easy to satisfy for most imperative languages, and
-we'll illustrate this below with Kaleidoscope.  The final question you may be
+we'll illustrate it below with Kaleidoscope.  The final question you may be
 asking is: should I bother with this nonsense for my front-end?  Wouldn't it be
 better if I just did SSA construction directly, avoiding use of the mem2reg
-optimization pass?  In short, we strongly recommend that use you this technique
+optimization pass?  In short, we strongly recommend that you use this technique
 for building SSA form, unless there is an extremely good reason not to.  Using
 this technique is:</p>
 
@@ -309,8 +309,8 @@
 
 <li>Needed for debug info generation: <a href="../SourceLevelDebugging.html">
 Debug information in LLVM</a> relies on having the address of the variable
-exposed to attach debug info to it.  This technique dovetails very naturally 
-with this style of debug info.</li>
+exposed so that debug info can be attached to it.  This technique dovetails 
+very naturally with this style of debug info.</li>
 </ul>
 
 <p>If nothing else, this makes it much easier to get your front-end up and 
@@ -337,7 +337,7 @@
 </ol>
 
 <p>While the first item is really what this is about, we only have variables
-for incoming arguments and for induction variables, and redefining those only
+for incoming arguments as well as for induction variables, and redefining those only
 goes so far :).  Also, the ability to define new variables is a
 useful thing regardless of whether you will be mutating them.  Here's a
 motivating example that shows how we could use these:</p>
@@ -403,8 +403,8 @@
 </p>
 
 <p>To start our transformation of Kaleidoscope, we'll change the NamedValues
-map to map to AllocaInst* instead of Value*.  Once we do this, the C++ compiler
-will tell use what parts of the code we need to update:</p>
+map so that it maps to AllocaInst* instead of Value*.  Once we do this, the C++ 
+compiler will tell us what parts of the code we need to update:</p>
 
 <div class="doc_code">
 <pre>
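
[Editor's note: the effect of this map change can be mimicked in plain C++. This is an analogy only — in the tutorial the mapped type is LLVM's AllocaInst*, not a raw pointer.]

```cpp
#include <map>
#include <string>

// Analogy for the NamedValues change: instead of mapping a name to a
// value, map it to a storage slot (standing in for AllocaInst*). A
// write becomes a store through the slot and every read a fresh load --
// the alloca/load/store discipline that mem2reg later promotes back
// into SSA registers.
static std::map<std::string, double *> NamedValues;

static void StoreVar(const std::string &Name, double V) {
  *NamedValues.at(Name) = V;  // "store" into the variable's slot
}

static double LoadVar(const std::string &Name) {
  return *NamedValues.at(Name);  // "load" the current value
}
```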
@@ -452,7 +452,7 @@
 </pre>
 </div>
 
-<p>As you can see, this is pretty straight-forward.  Next we need to update the
+<p>As you can see, this is pretty straightforward.  Now we need to update the
 things that define the variables to set up the alloca.  We'll start with 
 <tt>ForExprAST::Codegen</tt> (see the <a href="#code">full code listing</a> for
 the unabridged code):</p>
@@ -518,7 +518,7 @@
 argument.  This method gets invoked by <tt>FunctionAST::Codegen</tt> right after
 it sets up the entry block for the function.</p>
 
-<p>The final missing piece is adding the 'mem2reg' pass, which allows us to get
+<p>The final missing piece is adding the mem2reg pass, which allows us to get
 good codegen once again:</p>
 
 <div class="doc_code">
@@ -537,7 +537,7 @@
 
 <p>It is interesting to see what the code looks like before and after the
 mem2reg optimization runs.  For example, this is the before/after code for our
-recursive fib.  Before the optimization:</p>
+recursive fib function.  Before the optimization:</p>
 
 <div class="doc_code">
 <pre>
@@ -709,7 +709,7 @@
 </pre>
 </div>
 
-<p>Once it has the variable, codegen'ing the assignment is straight-forward:
+<p>Once we have the variable, codegen'ing the assignment is straightforward:
 we emit the RHS of the assignment, create a store, and return the computed
 value.  Returning a value allows for chained assignments like "X = (Y = Z)".</p>
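
[Editor's note: the convention of returning the stored value is the same one ordinary C++ assignment uses; a minimal sketch with an illustrative helper name:]

```cpp
// Sketch of the assignment convention: store the right-hand side into
// the variable's slot, then hand the value back so that assignments
// chain, e.g. X = (Y = Z).
static double Assign(double &Slot, double V) {
  Slot = V;   // the store
  return V;   // the returned value enables chaining
}
```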
 
@@ -799,7 +799,7 @@
 the VarNames vector.  Also, var/in has a body, this body is allowed to access
 the variables defined by the var/in.</p>
 
-<p>With this ready, we can define the parser pieces.  First thing we do is add
+<p>With this in place, we can define the parser pieces.  The first thing we do is add
 it as a primary expression:</p>
 
 <div class="doc_code">
@@ -972,7 +972,7 @@
 <p>With this, we completed what we set out to do.  Our nice iterative fib
 example from the intro compiles and runs just fine.  The mem2reg pass optimizes
 all of our stack variables into SSA registers, inserting PHI nodes where needed,
-and our front-end remains simple: no iterated dominator frontier computation
+and our front-end remains simple: no "iterated dominance frontier" computation
 anywhere in sight.</p>
 
 </div>

Modified: llvm/trunk/docs/tutorial/LangImpl8.html
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/tutorial/LangImpl8.html?rev=44157&r1=44156&r2=44157&view=diff

==============================================================================
--- llvm/trunk/docs/tutorial/LangImpl8.html (original)
+++ llvm/trunk/docs/tutorial/LangImpl8.html Wed Nov 14 22:51:31 2007
@@ -98,7 +98,7 @@
 <li><b>standard runtime</b> - Our current language allows the user to access
 arbitrary external functions, and we use it for things like "printd" and
 "putchard".  As you extend the language to add higher-level constructs, often
-these constructs make the most amount of sense to be lowered into calls into a
+these constructs make the most sense if they are lowered to calls into a
 language-supplied runtime.  For example, if you add hash tables to the language,
 it would probably make sense to add the routines to a runtime, instead of 
 inlining them all the way.</li>
@@ -106,14 +106,14 @@
 <li><b>memory management</b> - Currently we can only access the stack in
 Kaleidoscope.  It would also be useful to be able to allocate heap memory,
 either with calls to the standard libc malloc/free interface or with a garbage
-collector.  If you choose to use garbage collection, note that LLVM fully
+collector.  If you would like to use garbage collection, note that LLVM fully
 supports <a href="../GarbageCollection.html">Accurate Garbage Collection</a>
 including algorithms that move objects and need to scan/update the stack.</li>
 
 <li><b>debugger support</b> - LLVM supports generation of <a 
 href="../SourceLevelDebugging.html">DWARF Debug info</a> which is understood by
 common debuggers like GDB.  Adding support for debug info is fairly 
-straight-forward.  The best way to understand it is to compile some C/C++ code
+straightforward.  The best way to understand it is to compile some C/C++ code
 with "<tt>llvm-gcc -g -O0</tt>" and taking a look at what it produces.</li>
 
 <li><b>exception handling support</b> - LLVM supports generation of <a 
@@ -139,22 +139,22 @@
 
 <p>
 Have fun - try doing something crazy and unusual.  Building a language like
-everyone else always has is much less fun than trying something a little crazy
-and off the wall and seeing how it turns out.  If you get stuck or want to talk
+everyone else always has is much less fun than trying something a little crazy
+or off the wall and seeing how it turns out.  If you get stuck or want to talk
 about it, feel free to email the <a 
 href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">llvmdev mailing 
 list</a>: it has lots of people who are interested in languages and are often
 willing to help out.
 </p>
 
-<p>Before we end, I want to talk about some "tips and tricks" for generating
+<p>Before we end this tutorial, I want to talk about some "tips and tricks" for generating
 LLVM IR.  These are some of the more subtle things that may not be obvious, but
 are very useful if you want to take advantage of LLVM's capabilities.</p>
 
 </div>
 
 <!-- *********************************************************************** -->
-<div class="doc_section"><a name="llvmirproperties">Properties of LLVM 
+<div class="doc_section"><a name="llvmirproperties">Properties of the LLVM 
 IR</a></div>
 <!-- *********************************************************************** -->
 
@@ -184,7 +184,7 @@
 tell that the Kaleidoscope compiler generates target-independent code because it
 never queries for any target-specific information when generating code.</p>
 
-<p>The fact that LLVM provides a compact target-independent representation for
+<p>The fact that LLVM provides a compact, target-independent representation for
 code gets a lot of people excited.  Unfortunately, these people are usually
 thinking about C or a language from the C family when they are asking questions
 about language portability.  I say "unfortunately", because there is really no
@@ -208,7 +208,7 @@
 </div>
 
 <p>While it is possible to engineer more and more complex solutions to problems
-like this, it cannot be solved in full generality in a way better than shipping
+like this, it cannot be solved in full generality in a way that is better than shipping
 the actual source code.</p>
 
 <p>That said, there are interesting subsets of C that can be made portable.  If
@@ -263,7 +263,7 @@
 writing, there is no way to distinguish in the LLVM IR whether an SSA-value came
 from a C "int" or a C "long" on an ILP32 machine (other than debug info).  Both
 get compiled down to an 'i32' value and the information about what it came from
-is lost.  The more general issue here is that the LLVM type system uses
+is lost.  The more general issue here is that the LLVM type system uses
 "structural equivalence" instead of "name equivalence".  Another place this
 surprises people is if you have two types in a high-level language that have the
 same structure (e.g. two different structs that have a single int field): these
@@ -275,10 +275,10 @@
 adding new features (LLVM did not always support exceptions or debug info), we
 also extend the IR to capture important information for optimization (e.g.
 whether an argument is sign or zero extended, information about pointers
-aliasing, etc.  Many of the enhancements are user-driven: people want LLVM to
-do some specific feature, so they go ahead and extend it to do so.</p>
+aliasing, etc).  Many of the enhancements are user-driven: people want LLVM to
+include some specific feature, so they go ahead and extend it.</p>
 
-<p>Third, it <em>is possible and easy</em> to add language-specific
+<p>Third, it is <em>possible and easy</em> to add language-specific
 optimizations, and you have a number of choices in how to do it.  As one trivial
 example, it is easy to add language-specific optimization passes that
 "know" things about code compiled for a language.  In the case of the C family,
@@ -291,7 +291,7 @@
 other language-specific information into the LLVM IR.  If you have a specific
 need and run into a wall, please bring the topic up on the llvmdev list.  At the
 very worst, you can always treat LLVM as if it were a "dumb code generator" and
-implement the high-level optimizations you desire in your front-end on the
+implement the high-level optimizations you desire in your front-end, on the
 language-specific AST.
 </p>
 
@@ -316,8 +316,8 @@
 
 <div class="doc_text">
 
-<p>One interesting thing that comes up if you are trying to keep the code 
-generated by your compiler "target independent" is that you often need to know
+<p>One interesting thing that comes up if you are trying to keep the code 
+generated by your compiler "target independent" is that you often need to know
 the size of some LLVM type or the offset of some field in an llvm structure.
 For example, you might need to pass the size of a type into a function that
 allocates memory.</p>
@@ -342,10 +342,10 @@
 are often better ways to implement these features than explicit stack frames,
 but <a 
 href="http://nondot.org/sabre/LLVMNotes/ExplicitlyManagedStackFrames.txt">LLVM
-does support them if you want</a>.  It requires your front-end to convert the
+does support them</a>, if you want.  It requires your front-end to convert the
 code into <a 
 href="http://en.wikipedia.org/wiki/Continuation-passing_style">Continuation
-Passing Style</a> and use of tail calls (which LLVM also supports).</p>
+Passing Style</a> and the use of tail calls (which LLVM also supports).</p>
 
 </div>
 




