[LLVMdev] Fibonacci example in OCaml
Gordon Henriksen
gordonhenriksen at mac.com
Mon Nov 26 08:21:22 PST 2007
On Nov 26, 2007, at 00:47, Jon Harrop wrote:
> Here is a complete 104-line native code compiler for a tiny subset
> of OCaml that is expressive enough to compile an external Fibonacci
> program:
>
> [...]
>
> I was kind of hoping that function pointers would just magically
> work, so this:
>
> do (if 1 <= 2 then fib else fib) 40
>
> would run, but instead it produces this error:
>
> $ llc -f run.bc -o run.s
> llc: bitcode didn't read correctly.
> Reason: Invalid instruction with no BB
Try 'Llvm_analysis.assert_valid_module m;' before you write bitcode to
figure out where things went awry. ('dump_module m;' may also help.)
GIGO applies (but garbage-in/segfault-out is more likely).
Unfortunately, even if the bindings were more strongly typed, it would
still be structurally possible to build invalid LLVM code, so you've
just got to take care not to violate the invariants, then use the
verifier as a cross-check.
Using an asserts build of LLVM is also helpful, but you should have
this if you built from SVN.
> I assume I need a real type system that can handle both ints and
> function pointers?
Something like that.
If you're fond of Ocaml's tagged object model (and it does have some
nice properties), you'll want to slam every value to some generic
value type and use lots of bitcasts. The value type is either the
native integer (i32 or i64 depending on target) or a pointer type,
probably the recursive pointer type:
let value_ty =
let temp_ty = opaque_type () in
let h = handle_to_type temp_ty in
refine_type temp_ty (pointer_type temp_ty);
type_of_handle h
Since LLVM does not have an intptr type, you'd need to know your
target and parameterize your codegen accordingly to use the integer
types.
The bitcasts have no runtime cost, and LLVM optimization passes can
simplify them. Your compiler's semantic analysis phase can be
responsible for proving type-safety, so it isn't strictly necessary to
propagate type annotations on expressions into the codegen phase so
long as the AST nodes themselves are not polymorphic.
However, the potential exists to significantly reduce memory pressure
(avoiding boxing) by using a typed object model instead of a tagged
one. In this case, propagating types throughout would be necessary.
This decision has very significant implications for the compilation of
polymorphic functions.
> After that, I'll add compound types and a memory leak. ;-)
Indeed.
— Gordon
More information about the llvm-dev
mailing list