[LLVMdev] Fibonacci example in OCaml

Jon Harrop jon at ffconsultancy.com
Mon Nov 26 18:12:56 PST 2007


On Monday 26 November 2007 20:05, Gordon Henriksen wrote:
> On Nov 26, 2007, at 14:18, Jon Harrop wrote:
> > On Monday 26 November 2007 16:21, Gordon Henriksen wrote:
> >> Unfortunately, even if the bindings were more strongly typed, it
> >> would still be structurally possible to build invalid LLVM code, so
> >> you've just got to take care not to violate the invariants, then
> >> use the verifier as a cross-check.
> >
> > I suspect the OCaml bindings to LLVM could express more of these
> > constraints in the static type system. Has anyone tried to leverage
> > this?
>
> Any specific ideas?

Provide a type enumerating the valid terminators and restrict the last 
instruction in a block to be a terminator. Something like this:

  type terminator = [ `ret of llvalue | `br of llvalue ]
  type instruction =
      [ terminator
      | `add of llvalue * llvalue
      | `sub of llvalue * llvalue ] 
  type block = instruction list * terminator

If you want to avoid having a data structure representing the input to LLVM 
then you can probably achieve the same result using combinators, e.g. by 
having the building functions for terminators change an [`incomplete] block 
into a [`complete] block. However, that might make the bindings harder to 
use.

Use phantom types to track the type of each llvalue:

  type 'a llvalue

This could prevent my OCaml code with a function pointer error from compiling. 
For example, the "define_function" function would return a value of the type:

  [ `function ] llvalue

and the build_call function would require a value of that type, so you could 
not accidentally pass it an int.

I would use polymorphic variants more, particularly for enums and types that 
are only used once (e.g. "linkage" and "visibility"). So types would be `i32 
rather than i32_type and int_predicate and real_predicate would become 
overlapping sum types, e.g. `ugt is valid for both. I'd also rather see 
structuring than identifier bloat, e.g.:

module Linkage = struct
  type linkage =
      [ `External
      | `Link_once
      | `Weak
      | `Appending
      | `Internal
      | `Dllimport
      | `Dllexport
      | `External_weak
      | `Ghost ]

  external linkage : llvalue -> linkage = "llvm_linkage"
end

> > I'd rather box everything rather than tag ints to start with. I'll
> > relegate that to a potential optimization.
>
> You could go about it that way, sure. Memory pressure will be very high.

Let's ignore that for now and get something up and running. I think I can 
unbox within expressions easily enough.

> > What of this can LLVM's optimizer optimize away for me?
>
> Not much. LLVM won't really change your memory layout. The features
> which did perform dramatic memory reorganization were excised from the
> source tree due to patent infringement issues. (They remain in the
> repository on a branch for UIUC research.) These were also targeted
> toward manual memory management.

That's a shame. I'm not sure how this works but presumably distros are free to 
add that functionality back in provided they are not in geographical areas 
affected by software patents?

> It should be possible for LLVM to perform intra- or interprocedural
> escape analysis and lower heap allocations to stack allocations, but I
> don't believe this is implemented. Garbage collection would also
> complicate it, since LLVM would need to recognize GC allocator
> functions as such. This would be an excellent project if you are
> interested in contributing to LLVM itself.

Perhaps such optimizations would be better done at a higher-level, above the 
level of the GC? So that would be ideal for a higher-level VM on top of LLVM.

> > So I have to work out how to generate IR that handles those data
> > structures.
>
> Should be straightforward. Read the getelementptr FAQ. Unions are
> handled with pointer bitcasts.
>
> http://llvm.org/docs/GetElementPtr.html

Brilliant, thanks.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e



More information about the llvm-dev mailing list