[LLVMdev] nonlocal go to -- how?

Sun May 4 19:16:37 PDT 2008

Hi Hendrick,

LLVM's CFG is purely static. This is necessary for efficient register  
allocation behavior. Your front-end could implement a nonlocal goto by  
using a tail call and a switch in the entry block of your branched-to  
function. This is essentially how LLVM implements the address-of-label  
extension, transforming 'goto x' into a switch statement.

// 'foo' makes a nonlocal goto into 'bar'.
int foo() {
   return bar_impl(2 /* non-default entry point */, undef, undef /*  
who knows? */);
}

// Invoke for a 'normal' call.
int bar(int x, int y) {
   return bar_impl(0 /* default entry point */, x, y);
}

// Implementation detail.
int bar_impl(int label, int x, int y) {
   switch (label) {
   case 1: goto L1;
   case 2: goto L2;
   default: goto entry;
   }
entry:
   ...
L1:
   ...
L2:
   ...
}

Your front-end would need to track whether it is necessary to emit  
such a shim (i.e., whether there are any incoming gotos).

If you use a 'tail call' instruction to invoke 'bar_impl', LLVM will  
use proper tail calls in some cases.

Of course, this doesn't support the full generality of indirect  
nonlocal goto, so may not be suitable for your application.

On 2008-05-04, at 19:56, Hendrik Boom wrote:

> On Sun, 04 May 2008 16:05:44 +0000, Hendrik Boom wrote:
>
>> The languages I'm faced with compiling in the near future have  
>> nonlocal
>> go to statements and nested procedures.
>>
>> A procedure gets implemented as a structure containing its entry  
>> point
>> and an environment pointer.  It is easy enough to call its entry  
>> point
>> and pass the environment pointer as an extra argument (rather like  
>> the
>> pointer to this or self in object-oriented code).  It's no  
>> problem.  The
>> trampoline intrinsic is a neat way of packaging this so as to  
>> retrofit
>> into C's one-address-only function pointer, but that's not  
>> necessary in
>> a language where procedures are all known to have this behaviour.
>>
>> The problem is with the go to statement.  Again, local go to's,  
>> that go
>> somewhere within the same function are no particular problem --  
>> though I
>> haven't studied the interaction with alloca yet; that might offer a  
>> few
>> surprises.  The questions I have are about goto's that exit from a
>> function.  The traditional mechanism is to implement a label as an
>> instruction-pointer/environment-pointer pair, just as for procedures.
>> You just load the environment-pointer into the appropriate  
>> register, and
>> jump to the address.  (again there are technical details about the  
>> size
>> of the local stack at the destination, disposition of alloca storage,
>> and the like, and nowadays, unwinding the stack).
>>
>> I don't see a mechanism for this.
>>
>> The closest I see is the mechanism for exceptions.  I cannot tell by
>> reading the documentation for exception-handling whether it is
>> sufficiently flexible to accomodate nonlocal goto's.  It's plain  
>> that on
>> modern systems (which dribble saved registers all over the sack) some
>> kind of unwinding is necessary, and that the traditional model of
>> loading a register and jumping is no longer sufficient.  What's not
>> clear is whether the exception handling mechanism is sufficiently
>> dynamic to permit an accurate specification of the jump target.  It's
>> not necessarily the most local version of the label on the stack  
>> that's
>> the target; it's the one whose stack frame is reached by the jumping
>> procedure's array of static links.
>>
>> -- hendrik
>
> Well, I suppose it could be kind of done in not-quite C++, and  
> therefore
> it could probably be made to work in LLVM, like this.
>
> Labels are values, like:
>
> struct Label{ void * level, *target; }
>
> The level identifies the activation record in which the jump target
> occurs; the target  is the actual address of the instruction to be  
> jumped
> to when the label is used.
>
> Suppose the high-level language defined a label L in functino foo.   
> Then
> we build a Label object LL to represent it.
>
> foo()
> {
> Label LL;
> LL.level = &LL;
> LL.target = &L;
>
> ...
> ...
>
> L:
> }
>
>
> Someplace else, syntactically nested within foo, there's another  
> function
> bar, which contains a goto L.
>
> This goto gets translated into something like:
> Translate the goto inso something like
>
>  throw appropriate_static_link->LL;
>
> where appropriate static link is whatever is needed to access the  
> proper
> variable in the enclosing scope.
>
> Around every call *from* any function that contains a Label, we have  
> code
> like:
>
>  try(
> 	the_call
>  }
>  catch(Label e)
>  { if(e.level == LL.level)
> 	goto e.label;
>    else throw e;
>  }
>
> Of course this coding requires that you can take addresses of labels  
> and
> jump to them later in C++, which as far as I can recall is not one  
> of its
> language features.
>
> But LLVM is a lower-level language, so I'm hoping that something  
> similar
> can be accomplished.  But I haven't noticed dynamic arguments to the  
> br
> instruction.
>
> -- hendrik
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

— Gordon