[LLVMdev] Fibonacci example in OCaml
Jon Harrop
jon at ffconsultancy.com
Mon Nov 26 18:12:56 PST 2007
On Monday 26 November 2007 20:05, Gordon Henriksen wrote:
> On Nov 26, 2007, at 14:18, Jon Harrop wrote:
> > On Monday 26 November 2007 16:21, Gordon Henriksen wrote:
> >> Unfortunately, even if the bindings were more strongly typed, it
> >> would still be structurally possible to build invalid LLVM code, so
> >> you've just got to take care not to violate the invariants, then
> >> use the verifier as a cross-check.
> >
> > I suspect the OCaml bindings to LLVM could express more of these
> > constraints in the static type system. Has anyone tried to leverage
> > this?
>
> Any specific ideas?
Provide a type enumerating the valid terminators and restrict the last
instruction in a block to be a terminator. Something like this:
type terminator = [ `ret of llvalue | `br of llvalue ]
type instruction =
[ terminator
| `add of llvalue * llvalue
| `sub of llvalue * llvalue ]
type block = instruction list * terminator
If you want to avoid having a data structure representing the input to LLVM
then you can probably achieve the same result using combinators, e.g. by
having the building functions for terminators change an [`incomplete] block
into a [`complete] block. However, that might make the bindings harder to
use.
Use phantom types to track the type of each llvalue:
type 'a llvalue
This could prevent my OCaml code with a function pointer error from compiling.
For example, the "define_function" function would return a value of the type:
[ `function ] llvalue
and the build_call function would require a value of that type, so you could
not accidentally pass it an int.
I would use polymorphic variants more, particularly for enums and types that
are only used once (e.g. "linkage" and "visibility"). So types would be `i32
rather than i32_type and int_predicate and real_predicate would become
overlapping sum types, e.g. `ugt is valid for both. I'd also rather see
structuring than identifier bloat, e.g.:
module Linkage = struct
type linkage =
[ `External
| `Link_once
| `Weak
| `Appending
| `Internal
| `Dllimport
| `Dllexport
| `External_weak
| `Ghost ]
external linkage : llvalue -> linkage = "llvm_linkage"
end
> > I'd rather box everything rather than tag ints to start with. I'll
> > relegate that to a potential optimization.
>
> You could go about it that way, sure. Memory pressure will be very high.
Let's ignore that for now and get something up and running. I think I can
unbox within expressions easily enough.
> > What of this can LLVM's optimizer optimize away for me?
>
> Not much. LLVM won't really change your memory layout. The features
> which did perform dramatic memory reorganization were excised from the
> source tree due to patent infringement issues. (They remain in the
> repository on a branch for UIUC research.) These were also targeted
> toward manual memory management.
That's a shame. I'm not sure how this works but presumably distros are free to
add that functionality back in provided they are not in geographical areas
affected by software patents?
> It should be possible for LLVM to perform intra- or interprocedural
> escape analysis and lower heap allocations to stack allocations, but I
> don't believe this is implemented. Garbage collection would also
> complicate it, since LLVM would need to recognize GC allocator
> functions as such. This would be an excellent project if you are
> interested in contributing to LLVM itself.
Perhaps such optimizations would be better done at a higher-level, above the
level of the GC? So that would be ideal for a higher-level VM on top of LLVM.
> > So I have to work out how to generate IR that handles those data
> > structures.
>
> Should be straightforward. Read the getelementptr FAQ. Unions are
> handled with pointer bitcasts.
>
> http://llvm.org/docs/GetElementPtr.html
Brilliant, thanks.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e
More information about the llvm-dev
mailing list