[LLVMdev] invoke/unwind

Wed Jan 13 15:03:30 PST 2010

On 01/13/2010 02:28 PM, Garrison Venn wrote:

> I personally view LLVM as a term re-writing system where the rules are
> controlled by the developer a priori.

Hopefully I'll remember that comment when I understand its significance
better. :-)

> Funny I was thinking the same thing. Implementing MIX would be a cool
> way to learn the other side of LLVM (backends).

It seemed appropriate, especially since I've always been too lazy to
really learn MIX and that's unfortunate when one wants to go to the
source instead of reading one of Knuth's interpreters.  I haven't needed to
do that often, but one should always have the option.  Also, I have
common hardware so I have no real motivation to target a real machine
(the only possible reason I could see is if I wanted to buy a board and
do robotics with my boy, and at five he's not ready for that yet).  So
doing an (M)MIX backend would have the salutary effect of making me able
to read Knuth better and that's more motivating than real hardware I'm
not actually using myself.

Plus, a priori I'd guess that (M)MIX is very likely more consistent and
less quirky than any real architecture, as it has no practical
constraints or opportunities to exploit.

(M)MIX would probably be a good choice for a backend-writing tutorial.
I think expectations would be suitably modest--I don't think anyone is
going to port the Linux kernel to MIX or anything, so presumably one
wouldn't get endless requests to tweak the code gen to within an inch of
its life.  The existence of both MIX and MMIX could even be an advantage
if both were supported, as one would have examples of both CISC and RISC
style architectures.

> ...I didn't even know
> there was a MMIX until your email forced me to query.

I guess MMIX is to MIX as x86-64 is to 16-bit x86.  Hopefully it's that,
rather than what ia64 is to x86. :-)

Of course, the usefulness of MMIX more or less depends on Knuth
finishing stuff. :-)

> Well, even though I did not take your route, I still use the IR ref.
> doc as my true documentation. It is fairly isomorphic to C++ IR API.
> So I think your approach is worth while.

I hope so.  Though of course I have an agenda for learning LLVM too, and
if that pans out I won't be able to escape doing things normally.  I do
not envision writing interpreters for anything more complex than Forth
or Lisp in IR.

One advantage this backwards approach has is exposing more of the real
machine nature than even C.  I'd like to think that makes one a better
compiler user in the end.  It's nice to know what all those nice
high-level semantics are really costing you.  I think part of my
motivation, besides just doing the unexpected, is that long ago someone
told me they took a class in "assembly and lisp"; basically, they taught
programming by teaching you how to implement a higher-level language.
Being young and stupid, I didn't see the point, but eventually I figured
it out.  It's never too late to re-do your childhood right, is it?

I also think that the effort to write good code at such a low level is
very good discipline.  At least, I find it so, because the consequences
of good and bad design become magnified.  The absence of scope nesting
and the difficulty of doing many simple operations really makes
factoring out a vocabulary of small toolkit functions useful, for
example, and that's not a bad discipline to reinforce.  I just created a
couple of functions whose body is a single shift simply because it
enforced some abstraction and the names are documentation.  I can always
move the body into the header and let LLVM inline them if I want to
optimize away the function call overhead (not that there is any great
need to do that in a learning tool).  (It's easy to tell which parts of
the code I care about.  The expression representation is pretty cleanly
divided into a toolkit.  The user interaction loop is a big fat function
I didn't take the time to decompose.)
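
To illustrate the single-shift helpers described above, here is a
minimal sketch in C rather than LLVM IR.  The tag layout (a 3-bit type
tag in the low bits of each word) and the names expr_payload and
expr_from_payload are assumptions made up for this example; the actual
expression representation is not shown in this message.

```c
#include <stdint.h>

/* Hypothetical layout: the low TAG_BITS bits of a word hold a type tag,
   and the remaining high bits hold the payload. */
enum { TAG_BITS = 3 };

/* Strip the tag and return the payload.  The body is a single shift;
   the function exists so that call sites name the operation. */
static inline uint64_t expr_payload(uint64_t word)
{
    return word >> TAG_BITS;
}

/* Make room for a tag below the payload.  Again a single shift. */
static inline uint64_t expr_from_payload(uint64_t payload)
{
    return payload << TAG_BITS;
}
```

Marking the helpers static inline in a header plays roughly the role of
moving the body into the header and letting the optimizer inline it, so
the abstraction need not cost a call.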

Apropos of nothing, learning that I'm not going to use invoke/unwind
puts me back a bit while I bloat and uglify the evaluator code with
exception tests and unwinding, but I'm not that far from Turing
completeness now and that's kind of a good feeling. :-)  I probably
haven't obliged myself to go further than that unless I just want to.
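
For a sense of what "exception tests and unwinding" by hand can look
like, here is a minimal sketch in C, not the actual evaluator.  The
expr type, the eval_status codes and the choice of division as the only
operator are all hypothetical, picked only so there is one error
(division by zero) to propagate.

```c
#include <stddef.h>

/* Hypothetical status codes returned by every evaluator function. */
typedef enum { EVAL_OK = 0, EVAL_ERROR = 1 } eval_status;

/* Hypothetical expression node: either a leaf holding a value, or an
   interior node meaning "lhs divided by rhs". */
typedef struct expr {
    int is_leaf;
    unsigned long value;          /* used when is_leaf != 0 */
    const struct expr *lhs, *rhs; /* used when is_leaf == 0 */
} expr;

/* Evaluate e into *out.  Every recursive call is followed by an explicit
   test, and a failure is passed straight up to the caller; this is the
   hand-written stand-in for unwinding.  *out is written only on success. */
static eval_status eval(const expr *e, unsigned long *out)
{
    unsigned long a, b;
    eval_status st;

    if (e == NULL)
        return EVAL_ERROR;
    if (e->is_leaf) {
        *out = e->value;
        return EVAL_OK;
    }
    if ((st = eval(e->lhs, &a)) != EVAL_OK)
        return st;                /* propagate the failure upward */
    if ((st = eval(e->rhs, &b)) != EVAL_OK)
        return st;
    if (b == 0)
        return EVAL_ERROR;        /* the "raise" */
    *out = a / b;
    return EVAL_OK;
}
```

The bloat is visible even at this size: each call site grows a test and
an early return, which is the bookkeeping that an invoke/unwind pair
would otherwise have hidden.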

Dustin