[LLVMdev] My LLVM Project
Talin
talin at acm.org
Mon Sep 14 20:20:42 PDT 2009
It was a little over two years ago that I saw Chris give a tech talk on
LLVM at Google, and that was when I knew that there was a way that I
could actually build the programming language that I'd been thinking
about for so long.
Well, the compiler is still not done, even though I've been plugging
steadily away at it in my free time. However, a lot of progress has been
made recently, and I thought perhaps some folks might be interested in
what I am trying to do.
The language is called "Tart", and the one-sentence description is "Tart
is to C++ as Python is to Perl". Rather than go on about the philosophy
of the language, however, I will show some code samples.
For example, here's what the "Iterator" interface looks like:
interface Iterator[%T] {
  def next -> T or void;
}
'%T' introduces a template parameter named T. The reason for the '%'
prefix is to support partial specialization - you can mix template
parameters and regular type expressions in the template argument list,
as in "Iterator[List[%T]]". This is different from C++ where the
template parameters and the specialization arguments are in separate lists.
Like C++, Tart is a statically-typed language. Unlike C++, however,
Tart's type solver can deduce the template parameters of a class from
the arguments to the constructor or a static method. So for example, a
factory function such as Array.of(1, 2, 3) can deduce that we want an
integer array since the arguments are integers. Of course, we can always
be explicit and say Array[int].of(1, 2, 3). Tart's type solver also
allows function template arguments to be inferred from the type of the
variable that the function's return value is assigned to. (Java does
this as well.) Thus, if you have some function that takes a set of
Strings, you can say "myFunction(EmptySet())", and not have to
explicitly specify a template argument to EmptySet - it knows what kind
of set you want.
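For readers who don't know Tart, here is a rough analogue of the Array.of
example written with Python's optional type annotations. This is only a
sketch: the Array class below is hypothetical, Python performs no deduction
at run time, and it is a static checker such as mypy that would typically
infer Array[int] for the first call.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Array(Generic[T]):
    """Hypothetical stand-in for Tart's Array; not Tart's actual API."""

    def __init__(self, items: list[T]) -> None:
        self.items = items

    @classmethod
    def of(cls, *items: T) -> "Array[T]":
        # A static checker can solve for T from the argument types.
        return cls(list(items))


inferred = Array.of(1, 2, 3)       # checker is expected to infer Array[int]
explicit = Array[int].of(1, 2, 3)  # type argument spelled out by hand
```

Both calls build the same object at run time; the difference is only in
what the type checker has to work out for itself.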
Back to the example, the result of the iterator's 'next' method is a
disjoint type (also called a discriminated union) written as "T or
void". That is, it returns T as long as there are elements in the
sequence, after which it returns void, i.e. nothing.
LLVM's ability to efficiently return small aggregate types is critical
to making this perform well. Ideally, the Tart iterator protocol ought
to be faster than either Java's Iterators (which require two indirect
calls per loop iteration, one for hasNext() and one for next()) or
Python's iterators (which depend on exceptions to signal the end of the
sequence).
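To make the comparison concrete, here is a small Python sketch of a
single-call protocol in the spirit of "T or void". It is not how Tart is
implemented: the Done marker, SeqIterator and total are names invented for
this illustration, and Python gains none of the performance benefit, since
that comes from LLVM returning the small aggregate directly.

```python
from typing import Generic, Sequence, TypeVar, Union

T = TypeVar("T")


class Done:
    """Hypothetical marker standing in for the 'void' arm of 'T or void'."""


DONE = Done()


class SeqIterator(Generic[T]):
    def __init__(self, items: Sequence[T]) -> None:
        self._items = items
        self._pos = 0

    def next(self) -> Union[T, Done]:
        # One call per element: either the next value or the DONE marker.
        if self._pos < len(self._items):
            value = self._items[self._pos]
            self._pos += 1
            return value
        return DONE


def total(nums: Sequence[int]) -> int:
    it = SeqIterator(nums)
    result = 0
    while True:
        item = it.next()
        if isinstance(item, Done):
            break  # end of sequence, no exception and no second call
        result += item
    return result
```

The loop needs neither a separate hasNext() call nor a StopIteration
exception; the end of the sequence is just another return value.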
To use the iterator interface, you can use the convenient "for .. in"
syntax, similar to Python. Here's a snippet from one of my unit tests:
def sum(nums:int...) -> int {
  var sum = 0;
  for i in nums {
    sum += i;
  }
  return sum;
}
The 'sum' function takes a variable number of ints (the '...' signals a
varargs function) and returns an int.
The 'var' keyword declares a variable ('let', by contrast, declares a
constant, which can be local, similar to Java's 'final'). In this case,
the type is omitted (you could say "var sum:int = 0" if you wanted to be
explicit).
You could also write this out longhand:
def sum(nums:int...) -> int {
  var sum = 0;
  let iter = nums.iterate();
  repeat {
    classify iter.next() {
      as i:int {
        sum += i;
      }
      else {
        break;
      }
    }
  }
  return sum;
}
Since 'iter' is never reassigned after it is initialized, we can use
'let' to bind it immutably.
'repeat' is an infinite loop, essentially the same as writing "while true".
'classify' is like a switch statement, except that the cases are types
rather than values. It works with both disjoint types and polymorphic
types, similar to what is seen in Scala and various functional languages
such as OCaml. The variables in the individual 'as' clauses are never in
scope unless the assignment to that variable actually succeeds, so
there's no chance of seeing an uninitialized variable.
In any case, I don't want to go on about this too long - at least not
until the compiler is in better shape to be shown to the world. I still
need to work on closures, reflection, garbage collection, interface
proxies, stack dumps, debug info, and a bunch of other stuff. (There's a
Google code project for it, but I'm not encouraging people to go there
until I have more of the language finished.)
-- Talin