[LLVMdev] Proposal for TableML, llvmc2 configuration language
Patrick Walton
pcwalton at cs.ucla.edu
Wed Nov 26 13:34:04 PST 2008
Hi,
I've been working on a proof of concept for a new configuration language
for LLVM: specifically for my needs in llvmc2, but I have tried to make
it as generic as possible for use throughout LLVM if other projects
would like to make use of it. It's a compiler that compiles a
near-subset of Standard ML to C++, with an architecture deliberately
very similar to TableGen.
The code is not yet ready to be merged by any means - it has many
failure cases and may not compile at any given time - but I thought that
before I go further I should send a proposal to the list. The WIP code,
for the curious, is here:
http://github.com/pcwalton/llvm-nw/tree/miniml
If TableGen is a language that allows users to specify records of
domain-specific information, TableML is a configuration language that
allows users to specify how to *construct* records of domain-specific
information. TableML has a plugin
architecture in which at any given time one of several backends is in
use, just as in TableGen. The backends specify one or more record types
and definitions. TableML then reads a configuration file, evaluates the
definitions, and passes the results to the backend for serialization.
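As a rough illustration of that flow, here is a minimal C++ sketch of what a backend could look like. It is not taken from the WIP tree: the names Backend, requiredDefs, emit, EvaluatedDefs and RegisterInfoBackend are all hypothetical, and only "string list" definitions are modelled. The backend declares which definitions it needs and serializes the evaluated values it is handed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical: the evaluated value of a "string list" definition.
using StringList = std::vector<std::string>;

// Hypothetical: evaluated definitions, keyed by definition name.
using EvaluatedDefs = std::map<std::string, StringList>;

// Hypothetical backend interface (an assumption, not the real TableML API).
class Backend {
public:
  virtual ~Backend() = default;
  // Names of the "string list" definitions this backend requires.
  virtual std::vector<std::string> requiredDefs() const = 0;
  // Serializes the evaluated definitions as C++ source text.
  virtual std::string emit(const EvaluatedDefs &defs) const = 0;
};

class RegisterInfoBackend : public Backend {
public:
  std::vector<std::string> requiredDefs() const override {
    return {"RegisterNames"};
  }

  std::string emit(const EvaluatedDefs &defs) const override {
    auto it = defs.find("RegisterNames");
    assert(it != defs.end() && "every required definition must be supplied");
    std::ostringstream out;
    out << "static const char *const RegisterNames[] = {";
    for (std::size_t i = 0; i < it->second.size(); ++i)
      out << (i ? ", " : " ") << '"' << it->second[i] << '"';
    out << " };\n";
    return out.str();
  }
};
```

In this sketch the driver would evaluate the input file, check that every name in requiredDefs() is present with the right type, and then call emit() once.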
For instance, we might have a RegisterInfo backend that declares a
definition of "RegisterNames : string list". Then we could have a
TableML input file like this:
def val RegisterInfo = [ "eax", "ebx", "ecx", "edx" ]
Or we could have a more complex one that performs computation to produce
the result.
val make32bit = (fn x => strcat("e", x))
def val RegisterInfo = map make32bit [ "ax", "bx", "cx", "dx" ]
Obviously, this example is somewhat contrived, but it's just to
illustrate that arbitrary computation is allowed (and is performed at
compile time), as long as the definitions end up with the correct types.
This could be thought of as a generalization of the "class" and
"multiclass" concepts in TableGen. Also notice that, like all ML-based
languages, TableML is strongly typed, and it makes heavy use of
Hindley-Milner type inference. (The parser, lexer, and typechecker are
all coded already, by the way, just not very well tested at the moment.)
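To make the equivalence of the two input files concrete, here is a small C++ rendering of the second one. It is only a sketch: in TableML the computation would run when the configuration is compiled, whereas this C++ runs at run time, and the function name registerNames is made up for the illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Counterpart of: val make32bit = (fn x => strcat("e", x))
std::string make32bit(const std::string &name) { return "e" + name; }

// Counterpart of: map make32bit [ "ax", "bx", "cx", "dx" ]
// (registerNames is a hypothetical name chosen for this sketch.)
std::vector<std::string> registerNames() {
  const std::vector<std::string> base = {"ax", "bx", "cx", "dx"};
  std::vector<std::string> result;
  result.reserve(base.size());
  // Apply make32bit to every element, preserving order, as map does.
  std::transform(base.begin(), base.end(), std::back_inserter(result),
                 make32bit);
  return result;
}
```

Calling registerNames() yields the list "eax", "ebx", "ecx", "edx", which is the value written out literally in the first input file.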
The subset of Standard ML that TableML supports is essentially the one
shown here:
http://www.macs.hw.ac.uk/ultra/compositional-analysis/type-error-slicing/slicing.cgi
Now the upshot of this for the compiler driver is that function types
are acceptable types for definitions. This means that, unlike with
TableGen, backends that want to allow scripting (currently just llvmc2)
don't have to define their own programming languages. Instead, they can
simply request a definition with a function type (e.g. SomeFunction :
int -> int). TableML will hand the AST for the function, as well as its
values, over to the backend for emission as C++ code. The backend is
free to generate any C++ code it wants for the typed ASTs (of course,
some support routines could be added to the base to make this easier).
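A minimal sketch of what such emission might look like, assuming a definition along the lines of "def val SomeFunction = fn x => x + 1". The AST classes here (Expr, Var, IntLit, Add, IntFunction) and emitFunction are hypothetical stand-ins, not the data structures in the WIP code, and only int-typed expressions are covered.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical typed AST for a tiny expression language over int.
struct Expr {
  virtual ~Expr() = default;
  virtual std::string emit() const = 0; // C++ source for this expression
};

struct Var : Expr {
  std::string name;
  explicit Var(std::string n) : name(std::move(n)) {}
  std::string emit() const override { return name; }
};

struct IntLit : Expr {
  int value;
  explicit IntLit(int v) : value(v) {}
  std::string emit() const override { return std::to_string(value); }
};

struct Add : Expr {
  std::unique_ptr<Expr> lhs, rhs;
  Add(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
      : lhs(std::move(l)), rhs(std::move(r)) {}
  std::string emit() const override {
    return "(" + lhs->emit() + " + " + rhs->emit() + ")";
  }
};

// A definition of type int -> int: one parameter and a body.
struct IntFunction {
  std::string name, param;
  std::unique_ptr<Expr> body;
};

// Emits the function as C++ source; a real backend could emit anything.
std::string emitFunction(const IntFunction &f) {
  return "int " + f.name + "(int " + f.param + ") { return " +
         f.body->emit() + "; }\n";
}

// Builds the AST assumed above: fn x => x + 1, bound to SomeFunction.
IntFunction exampleFunction() {
  return IntFunction{"SomeFunction", "x",
                     std::make_unique<Add>(std::make_unique<Var>("x"),
                                           std::make_unique<IntLit>(1))};
}
```

Because the backend receives a typed AST rather than text, it can choose the C++ it generates per node; this sketch simply parenthesizes every addition.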
So, in summary, there are two main benefits to TableML that I see,
depending on the backend/use case:
(1) Users of backends that don't need scripting support can use
arbitrary computation to express the records, which goes beyond the
macro facility that TableGen provides.
(2) Users of backends that do need scripting support don't have to
define their own programming languages, and they incur no run-time
performance loss compared to TableGen.
I'd definitely appreciate any comments on this proposal! I'd also be
happy to clarify any issues with this explanation.
Patrick