[LLVMdev] Fibonacci example in OCaml
Jon Harrop
jon at ffconsultancy.com
Mon Nov 26 11:18:24 PST 2007
On Monday 26 November 2007 16:21, Gordon Henriksen wrote:
> Try 'Llvm_analysis.assert_valid_module m;' before you write bitcode to
> figure out where things went awry. ('dump_module m;' may also help.)
> GIGO applies (but garbage-in/segfault-out is more likely).
Ok, thanks for the tip.
> Unfortunately, even if the bindings were more strongly typed, it would
> still be structurally possible to build invalid LLVM code, so you've
> just got to take care not to violate the invariants, then use the
> verifier as a cross-check.
I suspect the OCaml bindings to LLVM could express more of these constraints
in the static type system. Has anyone tried to leverage this?
I'd certainly like to do this for a higher-level representation. I have lots
of ideas but I suspect that is OT for this list.
> Using an asserts build of LLVM is also helpful, but you should have
> this if you built from SVN.
I did, yes.
> > I assume I need a real type system that can handle both ints and
> > function pointers?
>
> Something like that.
>
> If you're fond of Ocaml's tagged object model (and it does have some
> nice properties), you'll want to slam every value to some generic
> value type and use lots of bitcasts. The value type is either the
> native integer (i32 or i64 depending on target) or a pointer type,
> probably the recursive pointer type:
>
> let value_ty =
> let temp_ty = opaque_type () in
> let h = handle_to_type temp_ty in
> refine_type temp_ty (pointer_type temp_ty);
> type_of_handle h
>
> Since LLVM does not have an intptr type, you'd need to know your
> target and parameterize your codegen accordingly to use the integer
> types.
>
> The bitcasts have no runtime cost, and LLVM optimization passes can
> simplify them. Your compiler's semantic analysis phase can be
> responsible for proving type-safety, so it isn't strictly necessary to
> propagate type annotations on expressions into the codegen phase so
> long as the AST nodes themselves are not polymorphic.
>
> However, the potential exists to significantly reduce memory pressure
> (avoiding boxing) by using a typed object model instead of a tagged
> one. In this case, propagating types throughout would be necessary.
> This decision has very significant implications for the compilation of
> polymorphic functions.
So this is the part where I start asking really basic questions. :-)
As I understand it, the simplest approach is to box all values, which means
that every value is stored as a struct containing the type (an enum) and the
value itself (inline if it is no bigger than a pointer or as a pointer to a
larger data structure otherwise). Something like this:
enum runtype {
Int;
Float;
Array;
}
struct array {
int length;
box *a;
}
union value {
int n; // 64-bit
float x; // 64-bit
array* u; // >64-bit
}
struct box {
runtype t;
value v;
}
So you'd create an int with:
box make_int(int n) {
box b;
b.t = Int;
b.v.n = n;
return b;
}
Is that right or are values stored as a box*?
I'd rather box everything rather than tag ints to start with. I'll relegate
that to a potential optimization.
What of this can LLVM's optimizer optimize away for me?
So I have to work out how to generate IR that handles those data structures.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e
More information about the llvm-dev
mailing list