[cfe-dev] "Blocks" in Clang (aka closures)

Chris Lattner clattner at apple.com
Wed Aug 27 20:00:04 PDT 2008


Hi All,

Steve has started working on an implementation of a language feature  
named 'Blocks'.  The back story on this was that it was prototyped in  
a private Clang fork (because it is much easier to experiment with
clang than with GCC), then implemented in GCC (where it evolved a  
lot), and now we're re-implementing it in Clang.  The language feature  
is already supported by mainline llvm-gcc, but we don't have
up-to-date documentation for it.  When that documentation is updated, it
will definitely be checked into the main clang repo (in clang/docs).   
Note that llvm-gcc supports a bunch of deprecated syntax from the  
evolution of Blocks, but we don't plan to support that old stuff in  
Clang.

Until there is more real documentation, here is the basic idea of
Blocks: they are closures for C.  They let you pass around units of
computation that can be executed later.  For example:

void call_a_block(void (^blockptr)(int)) {
   blockptr(4);
}

void test() {
   int X = ...
   call_a_block(^(int y){ print(X+y); });  // references stack var snapshot

   call_a_block(^(int y){ print(y*y); });
}

In this example, when the first block is formed, it snapshots the  
value of X into the block and builds a small structure on the stack.   
Passing the block pointer down to call_a_block passes a pointer to  
this stack object.  Invoking a block (with function call syntax) loads  
the relevant info out of the struct and calls it.  call_a_block can  
obviously be passed different blocks as long as they have the same type.
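
To make the "small structure on the stack" concrete, here is a rough sketch in plain C (no Blocks support needed) of what such a lowering could look like.  The struct layout and the names block_literal, add_body, square_body, demo_snapshot and demo_square are made up for illustration, and print is a hypothetical helper; this is not the real layout used by llvm-gcc or Clang.

```c
#include <stdio.h>

/* Hypothetical print helper; it also records the value so it can be checked. */
int last_printed;
void print(int v) { last_printed = v; printf("%d\n", v); }

/* Assumed layout: a code pointer followed by the captured snapshot. */
struct block_literal {
    void (*invoke)(struct block_literal *self, int y);
    int X;  /* snapshot of the enclosing function's X */
};

/* Body of ^(int y){ print(X+y); } with the capture made explicit. */
void add_body(struct block_literal *self, int y) { print(self->X + y); }

/* Body of ^(int y){ print(y*y); }, which captures nothing. */
void square_body(struct block_literal *self, int y) { (void)self; print(y * y); }

/* Invoking a block loads the code pointer out of the struct and calls it. */
void call_a_block(struct block_literal *blockptr) { blockptr->invoke(blockptr, 4); }

int demo_snapshot(int X) {
    struct block_literal b = { add_body, X };  /* built on the stack */
    X = 100;   /* later changes to X do not affect the snapshot */
    (void)X;
    call_a_block(&b);  /* passes a pointer to the stack object */
    return last_printed;
}

int demo_square(void) {
    struct block_literal b = { square_body, 0 };
    call_a_block(&b);  /* same type, different computation */
    return last_printed;
}
```

With this sketch, demo_snapshot(10) prints 14 and demo_square() prints 16, since call_a_block always passes 4.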

From a technical perspective, blocks fit into C in a couple of places:
1) a new declaration type (the caret), which works very much like a
magic kind of pointer that can only point to function types, 2) block
literals, which capture the computation, 3) a new storage class,
__block, and 4) a really tiny runtime library.

The new storage class comes into play when you want to get mutable  
access to variables on the stack.  Basically you can mark an
otherwise-auto variable with __block (which is currently a macro that expands to
an attribute), for example:


void test() {
   int X = ...
   __block int Y = ...
   ^{ X = 4; };  // error, can't modify a const snapshot.
   ^{ Y = 4; };  // ok!
}

From the implementation standpoint, the block captures (roughly) the
address of a __block object instead of its value.
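
The difference between the two kinds of capture can be sketched in plain C like this.  The struct and function names are invented for illustration and the layouts are an assumption, not what the compiler actually emits:

```c
/* Assumed layouts, for illustration only. */
struct value_capture { int X; };   /* ordinary variable: holds a copy of X */
struct ref_capture   { int *Y; };  /* __block variable: holds the address of Y */

int read_X(const struct value_capture *self) { return self->X; }

/* Body of ^{ Y = 4; }: writes through the captured address. */
void set_Y(struct ref_capture *self) { *self->Y = 4; }

int demo_value_capture(void) {
    int X = 1;
    struct value_capture b = { X };
    X = 10;             /* the snapshot inside b is unaffected */
    (void)X;
    return read_X(&b);  /* still sees the value at capture time */
}

int demo_ref_capture(void) {
    int Y = 2;
    struct ref_capture b = { &Y };
    set_Y(&b);          /* the block body modifies the original Y */
    return Y;
}
```

Here demo_value_capture() returns 1 and demo_ref_capture() returns 4.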

This is tricky though because blocks are on the stack, and you may want
to refer to some computation (and its __block captured variables)  
after the function returns.  To do this, we have a simple form of  
reference counting to manage the lifetimes of these.  For example, in  
this case:

void (^P)(int);  // global var

void gets_a_block(void (^blockptr)(int)) {
   P = blockptr;
}

void called_sometime_later() {
   P(4);
}

If gets_a_block is called with a block on the stack, and
called_sometime_later is called after that stack frame is popped,  
badness happens (yay for C!).  Instead, we use:

void (^P)(int);  // global var

void gets_a_block(void (^blockptr)(int)) {
   // Copies to the heap with refcount +1 if the block is on the stack,
   // otherwise increments the refcount.
   P = _Block_copy(blockptr);
}

void called_sometime_later() {
   P(4);
   _Block_release(P);  // decrements refcount.
   P = 0;
}

The semantics of this are that it copies the block off the stack *as
well as any __block variables it references*, and the shared __block  
variables are themselves freed when all referencing blocks go away.   
The really tiny runtime library implements things like _Block_copy and  
friends.
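
For intuition only, the copy/release behavior could be modeled in plain C roughly as follows.  This is a simplified sketch under my own assumptions: sketch_copy and sketch_release are hypothetical stand-ins (not the real _Block_copy and _Block_release), the on_heap flag and struct layout are invented, and the copying of shared __block variables is left out entirely.

```c
#include <stdlib.h>

/* Assumed layout: code pointer, bookkeeping, then one captured value. */
struct sketch_block {
    int (*invoke)(struct sketch_block *self, int y);
    int on_heap;   /* 0 while the block still lives on the stack */
    int refcount;  /* only meaningful for heap copies */
    int X;
};

/* Stack block: copy to the heap with refcount 1.  Heap block: bump the count. */
struct sketch_block *sketch_copy(struct sketch_block *b) {
    if (b->on_heap) { b->refcount++; return b; }
    struct sketch_block *h = malloc(sizeof *h);
    if (!h) return NULL;
    *h = *b;
    h->on_heap = 1;
    h->refcount = 1;
    return h;
}

/* Drop one reference and free the heap copy when the count reaches zero. */
void sketch_release(struct sketch_block *b) {
    if (b && b->on_heap && --b->refcount == 0) free(b);
}

int add_X(struct sketch_block *self, int y) { return self->X + y; }

struct sketch_block *P;  /* global var */

void gets_a_block(struct sketch_block *blockptr) { P = sketch_copy(blockptr); }

int called_sometime_later(void) {
    if (!P) return -1;   /* nothing stored (or the copy failed) */
    int r = P->invoke(P, 4);
    sketch_release(P);
    P = NULL;
    return r;
}

void make_and_hand_off(int X) {
    struct sketch_block b = { add_X, 0, 0, X };  /* lives on this stack frame */
    gets_a_block(&b);    /* P now points at a heap copy, not at b */
}
```

Calling make_and_hand_off(10) and then called_sometime_later() is safe in this model, because P refers to the heap copy after the original stack frame is gone.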

Another interesting thing is that the blocks themselves do
limited/optional type inference of the result type:

   foo(^int(){ return 4; });   // takes nothing, returns int.
   foo(^(){ return 4; });      // same thing, inferred to return int.


If you're interested in some more low-level details, it looks like
gcc/testsuite/gcc.apple/block-blocks-test-8.c in the llvm-gcc testsuite
has some of the underlying layout info, though I have no idea if it is
up-to-date.

To head off the obvious question: this syntax and implementation have
nothing to do with C++ lambdas.  Blocks are designed to work well with
C and Objective-C, and unfortunately C++ lambdas really require a
language with templates to be very useful.  The syntaxes of blocks and
C++ lambdas are completely different, so we expect to eventually support
both in the same compiler.

In any case, more detailed documentation will be forthcoming, but I  
would be happy to answer specific questions (before Friday, at which  
point I disappear for two weeks on vacation, woo!)

-Chris


