[LLVMdev] Implementing try/catch/finally

Sun Apr 20 21:50:57 PDT 2008

I'd be interested if anyone has some advice on the best way to represent 
a try/catch/finally statement in LLVM IR.

Assume for the moment that we're using the Python semantics for 
try/catch/finally (spelled try/except/finally in Python). According to 
the Python language specification, the 'finally' clause is executed 
whenever the flow of control leaves the 'try' block.

After the 'finally' clause has finished, the flow of control will 
continue at different points depending on how the 'finally' block was 
entered. There are basically 5 different cases:

  -- The flow of control fell off the end of the try body, or the 
exception was caught. After 'finally', execution continues at the 
statement following the try statement.
  -- The exception was not handled. After 'finally', the exception is 
re-thrown.
  -- A return statement was executed within the try block. After 
'finally', the function returns.
  -- A break statement was executed within the try block, and the 
innermost enclosing loop is outside of the try block. After 'finally', 
execution continues after that loop.
  -- As above, but a continue statement. After 'finally', execution 
continues with the next iteration of that loop.
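
To make the five cases concrete, here is a small Python sketch that 
drives each exit path through the same try statement and records the 
order of events. The names run, body, and log, and the case strings, are 
made up for illustration; the exception types are arbitrary.

```python
def run(case):
    """Exercise one exit path; return (result, order of events)."""
    log = []

    def body():
        for i in range(2):
            try:
                log.append(f"try{i}")
                if case == "caught":
                    raise KeyError("handled below")
                if case == "uncaught":
                    raise ValueError("not handled here")
                if case == "return":
                    return "returned"
                if case == "break":
                    break
                if case == "continue":
                    continue
            except KeyError:
                log.append("except")
            finally:
                # Runs on every exit from the try statement.
                log.append("finally")
            # Only reached on fall-through or after a caught exception.
            log.append(f"after{i}")
        return "fell off"

    try:
        result = body()
    except ValueError:
        # The unhandled exception was re-raised after 'finally' ran.
        result = "re-raised"
    return result, log
```

Calling run("break"), for instance, shows 'finally' running once before 
control leaves the loop, while run("continue") shows it running on both 
iterations and the code after the try statement never being reached.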

So in the most complex case, the basic block at the end of the 'finally' 
statement may have as many as 5 possible successors (or more -- if there 
are multiple return statements within the try body, and you don't feel 
like messing with phi nodes, then it may be easier to treat each 
return statement as a separate exit.)

One approach would be to simply duplicate the code in the 'finally' 
block for each exit, but that seems sub-optimal since it bloats the 
generated code. It would be better, I think, to set a state variable 
before entering the 'finally' block, and then have a switch instruction 
at the end of the block transfer to the appropriate successor.
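
As a rough sketch of the state-variable idea, the Python below simulates 
the control-flow graph by hand: each string is a stand-in for a basic 
block, and a dictionary lookup plays the role of the switch at the end 
of the single shared 'finally' block. It is not LLVM IR, and every name 
in it (Exit, FINALLY_SUCCESSOR, lowered, the block labels) is 
hypothetical. A caught exception is folded into the fall-through case, 
and the re-throw is modelled as a plain return value to keep it short.

```python
from enum import Enum


class Exit(Enum):
    FALLTHROUGH = 0
    RETHROW = 1
    RETURN = 2
    BREAK = 3
    CONTINUE = 4


# The "switch": successor block for each value of the state variable.
FINALLY_SUCCESSOR = {
    Exit.FALLTHROUGH: "after_try",
    Exit.RETHROW: "rethrow",
    Exit.RETURN: "do_return",
    Exit.BREAK: "after_loop",
    Exit.CONTINUE: "loop_next",
}


def lowered(actions):
    """actions[i] says how the try body exits on loop iteration i."""
    log, i, state, block = [], 0, None, "loop_head"
    while True:
        if block == "loop_head":
            block = "try_body" if i < len(actions) else "after_loop"
        elif block == "try_body":
            log.append(f"try{i}")
            state = actions[i]       # set the state variable on exit
            block = "finally"
        elif block == "finally":
            log.append("finally")    # the only copy of the finally code
            block = FINALLY_SUCCESSOR[state]
        elif block == "after_try":
            log.append(f"after{i}")
            block = "loop_next"
        elif block == "loop_next":
            i += 1
            block = "loop_head"
        elif block == "after_loop":
            return "fell off", log
        elif block == "do_return":
            return "returned", log
        elif block == "rethrow":
            return "re-raised", log  # stand-in for propagating the exception
```

For example, lowered([Exit.CONTINUE, Exit.BREAK]) runs the shared 
'finally' block twice and leaves through a different successor each 
time.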

But I wonder whether we could do better than that. It would seem more 
efficient to pass into the finally block the address at which to resume 
execution, or to treat it as a kind of local subroutine (i.e. not a 
full-fledged function with its own local variables and stack frame, but 
more like a simple jsr instruction).
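
A loose Python analogy for the resume-address idea, under the assumption 
that a closure can stand in for a block address: the 'finally' code is a 
nested function that shares the enclosing frame's variables, and each 
exit hands it the place to continue. All names here (lowered, 
finally_block, the resume functions, outcome) are invented for the 
sketch, and a real code generator would use a branch rather than a 
Python call.

```python
def lowered(action, log):
    """One try/finally, with the resume point passed to 'finally'."""

    # Possible resume points, each standing in for a block address.
    def after_try():
        log.append("after")
        return "fell off"

    def do_return():
        return "returned"

    def rethrow():
        raise ValueError("not handled")

    def finally_block(resume):
        # Shares `log` with the enclosing frame, like a local subroutine.
        log.append("finally")
        return resume()  # continue at the address that was passed in

    log.append("try")
    if action == "return":
        return finally_block(do_return)
    if action == "raise":
        return finally_block(rethrow)
    return finally_block(after_try)


def outcome(action):
    """Helper: run lowered() and report (result, order of events)."""
    log = []
    try:
        result = lowered(action, log)
    except ValueError:
        result = "re-raised"
    return result, log
```

Compared with a state variable, no dispatch is needed at the end of the 
'finally' code; the cost is an indirect transfer of control instead of a 
switch.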

Anyway, just musing on different possibilities and wondering if anyone 
has any suggestions...