[LLVMdev] dynamic typing system

Mon Aug 16 15:02:18 PDT 2010

On Aug 16, 2010, at 1:26 PM, Alec Benzer wrote:
> This isn't a strictly llvm-related problem, but I thought I'd ask anyway to see if anyone can help.
> 
> I'm trying to write a dynamically typed language on top of llvm. My initial idea was to have a general object type for all objects in my language. I came up with:

This is a huge subject, but briefly:
1.  Unless your type system is really weird, you will need a garbage-collected heap.  The design and quality of your GC will determine a lot of the performance of your language.  However, it is relatively easy to improve your garbage collector and allocator without fundamentally changing your object representation.
2.  You should design your object representation to make it as cheap as possible to create and recognize the values you expect to be working with the most.  If your dynamic programs are going to do a lot of text processing, don't make it expensive to check whether a value is a string.  In particular, booleans and small integers should be easy to recognize.
3.  If you have any control over your language's semantics, I strongly suggest avoiding implicit coercions;  it should be a (dynamic) type error to use a string where an integer was expected.  This can substantially cut down on the amount of code you have to generate.
4.  So-called "fat pointers" (pointers that carry extra bytes of information, like {i8, i8*}) can significantly increase the amount of memory you use:  most architectures require pointers to be aligned on 4- or 8-byte boundaries (or perform better when they are), so the one-byte tag is typically padded out to a full word and each value ends up twice the size of a bare pointer.  Try to see if you can get away with tagged pointers, i.e. using the low bits of a value to determine how to interpret the rest of it.  Getting good low-level performance out of a dynamically-typed language requires a lot of type-unsafe tricks like this.
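
To make points 2-4 concrete, here is a rough sketch in C of one possible tagged-value layout.  Everything in it is an assumption made for illustration, not something from the original question: the name `value`, the 3-bit tag assignments, the `heap_string` layout, and the use of malloc as a stand-in for a real GC allocator.  It assumes heap objects are at least 8-byte aligned, so the low three bits of any object pointer are zero and free to use as a tag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical representation: every value is exactly one machine word. */
typedef uintptr_t value;

enum {
    TAG_MASK = 0x7, /* low 3 bits; zero in any 8-byte-aligned pointer */
    TAG_PTR  = 0x0, /* pointer to a heap object */
    TAG_INT  = 0x1, /* small integer stored in the upper bits */
    TAG_BOOL = 0x2  /* boolean stored in the upper bits */
};

/* Small integers: one mask-and-compare to recognize, no allocation. */
static value make_int(intptr_t n) { return ((uintptr_t)n << 3) | TAG_INT; }
static int is_int(value v) { return (v & TAG_MASK) == TAG_INT; }
static intptr_t int_of(value v) {
    /* Clear the tag, then divide; this avoids right-shifting a negative
       signed number, which C leaves implementation-defined. */
    return (intptr_t)(v - TAG_INT) / 8;
}

static value make_bool(int b) { return ((uintptr_t)(b != 0) << 3) | TAG_BOOL; }
static int is_bool(value v) { return (v & TAG_MASK) == TAG_BOOL; }
static int bool_of(value v) { return (int)(v >> 3); }

/* Heap objects start with a kind field so they can be told apart. */
enum { KIND_STRING = 1 };
typedef struct {
    uint32_t kind;
    uint32_t length;
    char bytes[];
} heap_string;

static value make_string(const char *s) {
    size_t n = strlen(s);
    heap_string *o = malloc(sizeof *o + n + 1); /* stand-in for a GC allocator */
    assert(o != NULL && ((uintptr_t)o & TAG_MASK) == 0); /* alignment assumption */
    o->kind = KIND_STRING;
    o->length = (uint32_t)n;
    memcpy(o->bytes, s, n + 1);
    return (value)o;
}
static int is_string(value v) {
    return v != 0 && (v & TAG_MASK) == TAG_PTR
        && ((heap_string *)v)->kind == KIND_STRING;
}

/* No implicit coercions: returns 0 (a dynamic type error) unless both
   operands are small integers.  Overflow is ignored in this sketch. */
static int add_values(value a, value b, value *out) {
    if (!is_int(a) || !is_int(b)) return 0;
    *out = make_int(int_of(a) + int_of(b));
    return 1;
}
```

With this layout a value is the size of one pointer, checking for an integer or boolean is a single mask and compare, and add_values needs only one failure path because a string operand is rejected rather than converted.  The cost is that small integers lose three bits of range.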

John.