[LLVMdev] ANTLR+LLVM example for simple C

Nick Lewycky nicholas at mxc.ca
Sat Dec 12 18:18:37 PST 2009

Terence Parr wrote:
> Howdy,
> I just finished a book called Language Implementation Patterns but I ran out of room at 400 pages before I could squeeze in an LLVM example.  I left a link in the book to the ANTLR wiki so I can slap something together:
> http://www.antlr.org/wiki/display/ANTLR3/LLVM
> The code is good but the description was slapped together I'm afraid (i.e., don't take it as an example of the book quality. ha!).  The example is cool because the same generator can emit LLVM IR or C code depending on which templates we use.  It's simple enough that people might find it useful to learn about LLVM's IR and about source-to-source translators (I generate the text-based IR).
> I'd welcome any feedback and corrections to the description or code. thanks!

Hi Terence,

Strangely enough, that wiki page appears to require a login to edit, and 
the login page doesn't have any facility for creating new accounts. I 
gave it a cursory glance and have a few edits:

"global variables start with @, but registers a local variables start 
with %."
'a local' --> 'and local'. Note that local variables *are* registers in 
LLVM parlance. Notably, there is no address-of operation.
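
Since a register has no address, a variable that must live in memory is
usually given a stack slot with alloca and accessed through load and
store. A minimal sketch, using the same typed-pointer syntax as the
snippets in this thread (later LLVM releases changed the load and
pointer syntax); the names @local_in_memory, %x.addr and %x are invented
for illustration:

```llvm
define i32 @local_in_memory() {
entry:
  %x.addr = alloca i32              ; reserve a stack slot; %x.addr is an i32*
  store i32 99, i32* %x.addr        ; write 99 into the slot
  %x = load i32* %x.addr            ; read the slot back into a register
  ret i32 %x
}
```

Here %x.addr is the closest thing to "&x": the address comes from the
alloca itself, not from an operator applied to a register.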

"all basic blocks must start with a label (functions automatically get 
Not quite. Consider this IR:
   define void @test(i32, i32) {
     add i32 %0, %1
     br label %4
     ret void
   }
which uses anonymous values everywhere. %0 is the first argument, %1 is 
the second, %2 is the first basic block, %3 is the result from the add 
and %4 is the name of the second basic block. Confused yet? Here's how 
it looks through llvm-as | llvm-dis:
   define void @test(i32, i32) {
     %3 = add i32 %0, %1                             ; <i32> [#uses=0]
     br label %4

   ; <label>:4                                       ; preds = %2
     ret void
   }
Just remember that all anonymous values are numbered sequentially 
within each function, in the order they appear in the .ll. We commonly 
leave off the first basic block's label (even llvm-dis does it!) because 
you aren't allowed to branch to it and there's never a need to mention 
it in a phi node.
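
One way to sidestep the numbering puzzle is to name things explicitly.
A minimal sketch (the names @sum, entry, done and %total are made up for
illustration): the two arguments are still %0 and %1, but because both
blocks and the add result carry names, nothing else consumes a number.

```llvm
define i32 @sum(i32, i32) {
entry:                              ; named, so the entry block takes no number
  %total = add i32 %0, %1           ; %0 and %1 are the unnamed arguments
  br label %done

done:                               ; named branch target instead of %4
  ret i32 %total
}
```

A generator that emits text IR can always give blocks and temporaries
names like these, so it never has to track the implicit counter.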

At the end of the fibo example:
     ret i32 %r11
     ret i32 0
Surely you want to remove the 'ret i32 0' line.

In the print 99 example:
     ; get 99 into a register; t0 = 99+0 (LLVM has no load int instruction)
     %t0 = add i32 99,0
     ; call printf with "%d\n" and t0 as arguments
     call i32 (i8 *, ...)* @printf(i8* %ps, i32 %t0)
Is there any reason not to just write:
   ; call printf with "%d\n" and 99 as arguments
   call i32 (i8*, ...)* @printf(i8* %ps, i32 99)
? Doing an 'add i32 99, 0' is awfully awkward. If you really want to use 
a register here, the preferred trick is '%t0 = bitcast i32 99 to i32' 
since it's more general across different types -- but this sort of thing 
is strongly discouraged.
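
To show what "more general across different types" means, here is a
sketch in the same typed-pointer syntax as the snippets above (the
function and value names are invented). The same no-op bitcast form
copies an integer, a float and a pointer, whereas 'add ..., 0' only
applies to integer types. It illustrates the form only; passing the
constant directly remains the better choice.

```llvm
define void @copies(i8* %ps) {
  %i = bitcast i32 99 to i32        ; integer constant into a register
  %f = bitcast float 1.0 to float   ; same form for a floating-point value
  %p = bitcast i8* %ps to i8*       ; same form for a pointer
  ret void
}
```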

The link to llvm.org/docs/LangRef.html is far too well hidden given its 
importance to anyone trying to write in IR. I'm not sure exactly what to 
do with it yet.

Finally, congrats on finishing your book!


> Terence
> PS  If you're curious about the book, here's the publisher's link:
> http://pragprog.com/titles/tpdsl/language-implementation-patterns
> It's at the printer as we speak!  In stores by New Year's.
