[LLVMdev] Re: Syn
Simon Funk
simon at interstice.com
Fri Oct 22 19:54:01 PDT 2004
>How is this different from the LISP and scheme macro system? The
>program source is available to the programmer at both compile and
>run-time and may be operated on arbitrarily (transforming code, adding
>code, removing code, specializing code, making new "primitives",
>modifying other macros, etc). There is a reason for LISP's syntax, it
>is so you can program arbitrary semantics on top of it. [...]
>
>Andrew
I'm not familiar with Scheme specifically, but just from my general
knowledge of Lisp:
Syn essentially _is_ Lisp, with a couple of minor exceptions:
- The universal datatypes, rather than being car/cdr tuples and
atoms, are parse-tree nodes and atoms. This difference is
almost meaningless, since, for instance, many parsers
represent their parse trees in Lisp-like lists already...
- Unrecognized constructs are expanded as if they were functions
that evaled to themselves -- which is to say, their children
_are_ evaled, and then the results are re-assembled as a clone
of the parent using the evaled children. This means you can
send in a quite large and complicated data structure, and only
the "recognized" functions, wherever they lie, will be evaled,
while the rest is left alone. (Does Lisp or Scheme do this? I'm
not sufficiently familiar.)
- There's a stateful primitive which makes it easy to generate
unique names that are consistent within the scope of a single
macro expansion/function invocation. (This isn't really
possible in a pure functional language, nor would it be
necessary if not for the fact that the parse tree
itself--largely unevaluated--is destined to be the final
output.)
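
To make the last two points concrete, here is a minimal sketch in Python. It is not Syn's actual implementation: the tuple layout for parse-tree nodes, the `Scope` class, the `expand` function, and the `swap` macro are all assumptions made up for illustration. Unrecognized nodes are rebuilt from their evaluated children, and each macro invocation gets its own scope in which a requested name always maps to the same unique identifier.

```python
from itertools import count

# Assumption: a parse-tree node is a tuple (head, child, ...);
# anything else (here, plain strings) is an atom.

class Scope:
    """Unique names for ONE macro invocation: same base, same name."""
    def __init__(self, counter):
        self._counter = counter   # shared, stateful counter
        self._names = {}          # names already handed out in this scope

    def unique(self, base):
        if base not in self._names:
            self._names[base] = f"{base}_{next(self._counter)}"
        return self._names[base]

def expand(node, macros, counter=None):
    """Evaluate children first; apply a macro if the head is recognized,
    otherwise return a clone of the node built from the evaluated children."""
    counter = count() if counter is None else counter
    if not isinstance(node, tuple) or not node:
        return node  # atoms evaluate to themselves
    head, *children = node
    children = [expand(child, macros, counter) for child in children]
    if head in macros:
        return macros[head](Scope(counter), *children)
    return (head, *children)  # pass-through for unrecognized constructs

def swap(scope, a, b):
    """Hypothetical macro: needs a temporary that cannot clash with user names."""
    tmp = scope.unique("tmp")
    return ("block",
            ("assign", tmp, a),
            ("assign", a, b),
            ("assign", b, scope.unique("tmp")))  # same name as `tmp` above

tree = ("function", "f", ("swap", "x", "y"), ("swap", "p", "q"))
result = expand(tree, {"swap": swap})
```

Only the two `swap` nodes are rewritten; `function` is unknown to the expander and comes back as a clone. Each expansion gets its own temporary, so the two blocks cannot interfere. Note that in this sketch the output of a macro is not expanded again, which matches treating constructs as functions rather than as Lisp-style macros.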
Really, that's about it for differences. But the point isn't meant
to be that Syn is better than or different from Lisp -- rather, it's just a
way of using a Lisp-like language to do something neat. I could pitch
the whole thing as: here's something cool that can be done with Lisp,
and just leave Syn out of it. And that pitch would go something like:
- Make it easy for people to specify their own syntax (or modify/
extend existing ones), and have the parser just output a parse
tree as a Lisp data structure.
- Allow a single project to support multiple syntaxes (which is easy
since they're all just converted to a universal parse tree
format by the parser).
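
As a rough illustration of those two points, the sketch below parses two different surface syntaxes into one shared tree format. Both parsers, the operator set `BINARY_OPS`, and the tuple representation are assumptions for the sake of the example; a real system would presumably drive the parser from a user-supplied grammar.

```python
# Assumption: the universal parse tree is a nested tuple (head, child, ...)
# with strings as atoms.

def parse_sexpr(text):
    """Parse a Lisp-like prefix syntax, e.g. "(add x (mul y 2))"."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return tuple(items), pos + 1  # skip the closing ")"
        return tokens[pos], pos + 1       # an atom

    tree, _ = read(0)
    return tree

BINARY_OPS = {"add", "mul"}  # hypothetical operators known to the postfix syntax

def parse_postfix(text):
    """Parse a postfix syntax, e.g. "x y 2 mul add", using a stack."""
    stack = []
    for token in text.split():
        if token in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append((token, left, right))
        else:
            stack.append(token)
    (tree,) = stack  # a well-formed input leaves exactly one tree
    return tree

from_prefix = parse_sexpr("(add x (mul y 2))")
from_postfix = parse_postfix("x y 2 mul add")
```

Both calls produce the same tree, so everything downstream of the parser can ignore which syntax a given source file was written in.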
Surprisingly, at this point you've actually got quite a powerful
system, assuming you're ultimately programming to Lisp as your back end,
because you can now introduce new syntactic constructs that express
high-level concepts well, and can specify their semantics in any lower-level
syntax. This can be layered to good effect. But one final
consideration is:
- Either allow pass-through of non-recognized functions, or write a
set of functions which are effectively pass-through for the parse
tree components of some other target language (i.e., not Lisp).
Now you can use Lisp to implement layers of higher-level syntax and
associated semantics for some other, non-Lisp target language, like
LLVM. Normally one wouldn't think of coding directly to LLVM, but with
this approach it's actually quite quick and easy to implement each
successive level of abstraction, so you're programming in some
high-level language next thing you know, even though it's all really
just "macros" up from LLVM.
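
The layering can be sketched as a series of expansion passes, each one lowering the tree a step closer to the target. Everything named here is hypothetical: `cond_br`, `not`, `call` and `nop` stand in for target-language nodes and are not real LLVM instructions, and `unless` and `if` are made-up higher-level constructs.

```python
# Assumption: nodes are tuples (head, child, ...); strings are atoms.

def expand(node, macros):
    """Expand recognized heads; pass everything else through unchanged."""
    if not isinstance(node, tuple) or not node:
        return node
    head, *children = node
    children = [expand(child, macros) for child in children]
    if head in macros:
        return macros[head](*children)
    return (head, *children)

# Lowest layer: "if" is defined directly in terms of target nodes.
LOW = {"if": lambda cond, body: ("cond_br", cond, body, ("nop",))}

# Next layer up: "unless" is defined in terms of "if", not the target.
HIGH = {"unless": lambda cond, body: ("if", ("not", cond), body)}

def lower(tree, layers):
    """Apply each macro layer in turn, highest level first."""
    for macros in layers:
        tree = expand(tree, macros)
    return tree

source = ("unless", ("call", "ready"), ("call", "wait"))
lowered = lower(source, [HIGH, LOW])
```

After both passes only target nodes remain. The `call` and `not` nodes were never recognized by either layer, so they pass straight through to the output, which is what lets the target language's own constructs appear anywhere in the source.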
What I'm describing ultimately is effectively an approach to writing
a compiler, but it's an approach that's sufficiently simple and
transparent as to make it practical to expose it at the individual project
level--i.e., such that it's normal and reasonable to, effectively,
modify the compiler itself in order to best support the needs of one
particular project (or, more likely, one particular class of projects).
And it also has the side effect of keeping all run-time semantic support
implemented in the target language (LLVM) in a way that opens up a lot
more optimization opportunities than one would normally get without some
specific effort.
What it all comes down to is: you shouldn't ever have to envy
another language's higher-level constructs; you should be able to
implement whatever you like. There are always limitations imposed
by the core execution model (imperative vs. declarative vs. relational,
for instance), but within a single family, the Syn approach should pretty
much let you have it all. LLVM plus the Syn approach seems like a natural
"universal imperative language". The Syn approach straight down onto
Lisp might be a good universal functional language. And so on. I'm
pretty sure you could even use it meaningfully on top of Prolog...
Anyway, hope this answers your Q.
-Simon