[LLVMbugs] [Bug 13266] New: Kaleidoscope "for" loop codegen is semantically incorrect (Ch5-Ch7)

Wed Jul 4 00:14:41 PDT 2012

http://llvm.org/bugs/show_bug.cgi?id=13266

             Bug #: 13266
           Summary: Kaleidoscope "for" loop codegen is semantically
                    incorrect (Ch5-Ch7)
           Product: new-bugs
           Version: trunk
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: harrism at gmail.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Created attachment 8827
  --> http://llvm.org/bugs/attachment.cgi?id=8827
Patch file to replace for loop codegen in Kaleidoscope tutorial Ch5-Ch7 (code
and docs). Patch created with "git diff --no-prefix > patchfile"

The implementation of for loops in the (excellent) Kaleidoscope tutorial
starting in Chapter 5 evaluates the end condition after executing the loop
body. This means that the body is always executed once even if the end
condition is initially false. The consequence is that it always executes one
iteration more than expected (based on the semantics of for loops in just about
any language).  It's effectively a strange hybrid between a "for" loop and a
"do while" loop.

It might be OK to have Kaleidoscope's "for loops" to have unique semantics
given the limited feature set of the language. However, I am currently using
the Kaleidoscope interpreter code as a learning platform, and my first project
was to add arrays to the language. When iterating over arrays, it is dangerous
for the loop to go one step further than expected -- it can easily lead to a
buffer overrun, or crash (which is how I discovered this). Also, it makes for
unexpected behavior: the "printstars(n)" example in the tutorial prints 101
stars when n==100, and the "fibi(x)" example has different loop limits than
most similar implementations found on the web in other languages.

I have provided a fix for this issue in the attached patch (modified from
svn/git trunk), which fixes the code for chapters 5-7 as well as the HTML
documentation for the tutorial. I did not fix the OCaml implementation of the
tutorial. The fix is contained in the ForExprAST::Codegen() function, and
changes the output IR 

from this:

    declare double @putchard(double)

    define double @printstar(double %n) {
    entry:
      ; initial value = 1.0 (inlined into phi)
      br label %loop

    loop:        ; preds = %loop, %entry
      %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
      ; body
      %calltmp = call double @putchard(double 4.200000e+01)
      ; increment
      %nextvar = fadd double %i, 1.000000e+00

      ; termination test
      %cmptmp = fcmp ult double %i, %n
      %booltmp = uitofp i1 %cmptmp to double
      %loopcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %loopcond, label %loop, label %afterloop

    afterloop:        ; preds = %loop
      ; loop always returns 0.0
      ret double 0.000000e+00
    }

to this:

    declare double @putchard(double)

    define double @printstar(double %n) {
    entry:
      ; initial value = 1.0 (inlined into phi)
      br label %loop

    loopstart:                      ; preds = %loopbody, %entry
      %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loopbody ]

      ; termination test
      %cmptmp = fcmp ult double %i, %n
      %booltmp = uitofp i1 %cmptmp to double
      %loopcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %loopcond, label %loopbody, label %loopexit

    loopbody:                       ; preds = %loopstart
      ; body
      %calltmp = call double @putchard(double 4.200000e+01)
      ; increment
      %nextvar = fadd double %i, 1.000000e+00
      br label %loopstart  

    loopexit:                       ; preds = %loopstart
      ; loop always returns 0.0
      ret double 0.000000e+00
    }

Notice in the latter we first evaluate the end condition, and branch to the
loop body when it is true, and to the loop exit if false. The end of the loop
body is an unconditional branch back to the loop start. (This is the IR for
Ch5.  For Ch7 there is an alloca/load/store rather than phi).

-- 
Configure bugmail: http://llvm.org/bugs/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.