[LLVMdev] nonlocal go to -- how?

Sun May 4 16:56:22 PDT 2008

On Sun, 04 May 2008 16:05:44 +0000, Hendrik Boom wrote:

> The languages I'm faced with compiling in the near future have nonlocal
> go to statements and nested procedures.
> 
> A procedure gets implemented as a structure containing its entry point
> and an environment pointer.  It is easy enough to call its entry point
> and pass the environment pointer as an extra argument (rather like the
> pointer to this or self in object-oriented code).  It's no problem.  The
> trampoline intrinsic is a neat way of packaging this so as to retrofit
> into C's one-address-only function pointer, but that's not necessary in
> a language where procedures are all known to have this behaviour.
> 
> The problem is with the go to statement.  Again, local go to's, that go
> somewhere within the same function are no particular problem -- though I
> haven't studied the interaction with alloca yet; that might offer a few
> surprises.  The questions I have are about goto's that exit from a
> function.  The traditional mechanism is to implement a label as an
> instruction-pointer/environment-pointer pair, just as for procedures.
> You just load the environment-pointer into the appropriate register, and
> jump to the address.  (again there are technical details about the size
> of the local stack at the destination, disposition of alloca storage,
> and the like, and nowadays, unwinding the stack).
> 
> I don't see a mechanism for this.
> 
> The closest I see is the mechanism for exceptions.  I cannot tell by
> reading the documentation for exception-handling whether it is
> sufficiently flexible to accommodate nonlocal goto's.  It's plain that on
> modern systems (which dribble saved registers all over the stack) some
> kind of unwinding is necessary, and that the traditional model of
> loading a register and jumping is no longer sufficient.  What's not
> clear is whether the exception handling mechanism is sufficiently
> dynamic to permit an accurate specification of the jump target.  It's
> not necessarily the most local version of the label on the stack that's
> the target; it's the one whose stack frame is reached by the jumping
> procedure's array of static links.
> 
> -- hendrik

Well, I suppose it could be kind of done in not-quite C++, and therefore 
it could probably be made to work in LLVM, like this.

Labels are values, like:

struct Label{ void * level, *target; };

The level identifies the activation record in which the jump target 
occurs; the target is the actual address of the instruction to be jumped 
to when the label is used.

Suppose the high-level language defined a label L in function foo.  Then 
we build a Label object LL to represent it.

foo()
{
	Label LL;
	LL.level = &LL;		// identifies this activation record
	LL.target = &L;		// address of label L (not legal C++)

	...
	...

L:	;
}

Someplace else, syntactically nested within foo, there's another function 
bar, which contains a goto L.

This goto gets translated into something like:

  throw appropriate_static_link->LL;

where appropriate_static_link is whatever is needed to access the proper 
variable in the enclosing scope.

Around every call *from* any function that contains a Label, we have code 
like:

  try {
	the_call
  }
  catch(Label e)
  { if(e.level == LL.level)
	goto e.target;
    else throw e;
  }

Of course this coding requires that you can take addresses of labels and 
jump to them later in C++, which as far as I can recall is not one of its 
language features.
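
For what it's worth, the scheme can be approximated in standard C++ if the label address is replaced by a small integer tag and the catch handler dispatches on it with a switch.  What follows is only a rough sketch under that assumption, not a claim about how LLVM would do it.  FooEnv, the integer tag, the depth/outer/log parameters and the logging are all made up for illustration; foo recurses so that several copies of LL are on the stack, and bar jumps to the outermost one rather than the nearest.

```cpp
#include <string>
#include <vector>

// A label value.  Standard C++ cannot store a label's address, so
// `target` is an integer tag here (an assumption of this sketch).
struct Label {
    const void* level;  // identifies the activation record
    int target;         // hypothetical tag standing in for the code address
};

// Hypothetical environment of one activation of foo; this is what a
// static link points at.
struct FooEnv {
    Label LL;
};

// The "nested" procedure: it receives its static link explicitly.
void bar(FooEnv* static_link) {
    throw static_link->LL;  // the translated `goto L`
}

// foo recurses so that several activations are live when bar runs.
// `outer` carries the static link of the outermost activation.
void foo(int depth, FooEnv* outer, std::vector<std::string>* log) {
    FooEnv env;
    env.LL.level = &env.LL;  // unique per activation
    env.LL.target = 1;       // tag for label L
    if (outer == nullptr) outer = &env;

    try {
        if (depth > 0) foo(depth - 1, outer, log);
        else bar(outer);
        log->push_back("fell through at depth " + std::to_string(depth));
    } catch (const Label& e) {
        if (e.level != env.LL.level) {
            // Not our activation: keep unwinding.
            log->push_back("rethrow at depth " + std::to_string(depth));
            throw;
        }
        switch (e.target) {  // stands in for `goto e.target`
        case 1: goto L;
        }
    }
    return;
L:
    log->push_back("landed at L, depth " + std::to_string(depth));
}
```

Calling foo(2, nullptr, &log) with an empty log should record a rethrow in each of the two inner activations and then the landing at L in the outermost one.  The switch is the weak point: every function needs its own table of tags, which is exactly what a real label address would avoid.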

But LLVM is a lower-level language, so I'm hoping that something similar 
can be accomplished.  But I haven't noticed dynamic arguments to the br 
instruction.

-- hendrik