[cfe-dev] [RFC] Captured Statements

Pan, Wei wei.pan at intel.com
Wed Jan 23 07:30:22 PST 2013


Hi Doug and clang-dev,

We think this could answer Doug's question about how function outlining is handled: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-January/027311.html

Thanks!

Ben Langmuir, Wei Pan and Andy Zhang
Intel of Canada, Waterloo

*BEGIN*

[RFC] Captured Statements 

We are proposing a feature that we have called 'Captured Statements', to
support outlining statements into functions in Clang's AST.  This
includes capturing references to variables - similar to C++11 lambdas'
automatic capture.  However, the feature is more "primitive" than lambdas: it
carries less complexity and baggage, so it can be used to implement other
features unrelated to lambdas.

We used Captured Statements to support the Cilk Plus C/C++ extension in Clang.
However, we believe that Captured Statements will be useful to others, and are
seeking feedback on the proposed design.  In particular, Captured Statements
should be useful in the implementation of OpenMP parallel regions.  They may
also be useful in implementing some of the new features (e.g. in-kernel spawning)
being considered for OpenCL 2.0, and for nested functions as in GCC.

There is a set of requirements for function outlining:

(1) Must work for both C and C++ programs
(2) Should be nestable
(3) Should be able to capture most types of variables, including arrays and 'this'
(4) Should be able to customize the capturing and codegen behavior

The primary use case is to support outlining parallel regions into functions
so that they may be passed to a runtime library.  Both OpenMP and Cilk Plus
require this kind of outlining to run parallel regions on multiple threads
using a runtime library.

E.g.

#pragma omp parallel
{
  ... // parallel region is outlined, some variable references are captured
}

cilk_spawn foo(a, b, c); // call to foo is outlined into a helper function and
                         // references to a, b, and c are captured
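
As a rough illustration of what this outlining amounts to for the spawn case,
here is a hand-written sketch in plain C.  It is not the Cilk Plus or OpenMP
runtime interface: toy_runtime_run, spawn_capture, spawn_helper,
spawn_example and the body of foo are all assumptions made up for this
example, and the stand-in "runtime" just calls the helper on the current
thread.

```c
/* Hypothetical runtime entry point. A real runtime would schedule fn on a
   worker thread; this stand-in calls it directly. */
typedef void (*outlined_fn)(void *context);

static void toy_runtime_run(outlined_fn fn, void *context) {
  fn(context);
}

/* Placeholder body for foo, invented for this example. */
static int foo(int a, int b, int c) { return a + b * c; }

/* Hypothetical capture record: references to a, b, c and to the variable
   that receives the result of the call. */
struct spawn_capture {
  int *a;
  int *b;
  int *c;
  int *result;
};

/* Outlined helper: performs the call to foo through the captured references. */
static void spawn_helper(void *context) {
  struct spawn_capture *cap = (struct spawn_capture *)context;
  *cap->result = foo(*cap->a, *cap->b, *cap->c);
}

int spawn_example(int a, int b, int c) {
  int result = 0;
  struct spawn_capture cap = { &a, &b, &c, &result };
  toy_runtime_run(spawn_helper, &cap); /* hand the helper to the "runtime" */
  return result;
}
```

Passing the context as a void pointer keeps the runtime entry point
independent of the layout of any particular capture record.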


There are two existing AST constructs closely related to Captured Statements:
Objective-C/C++ blocks and C++11 lambda expressions.

The code generation of "block" calls contains quite a few Objective-C specific
runtime calls.  There are also constraints for blocks that do not apply to
Captured Statements, e.g. arrays cannot be captured in blocks.
C++11 lambda expressions work for C++ only, where the context is captured in a
CXXRecordDecl.

As far as we know, neither construct can satisfy all the above requirements,
and a new Captured Statement seems necessary. The proposed AST changes are based
on the AST for blocks, but the codegen is closer to lambdas.

Most existing routines for variable capturing will be shared among blocks,
lambdas and Captured Statements. We still need to extend the current Clang
implementation. For example, an OpenMP 'threadprivate' variable should also be
captured, although it may be a static variable or static class member.

AST
===

We propose adding a new abstract AST class CapturedStmt to represent a Captured Statement:

- CapturedStmt derives from Stmt
- CapturedStmt is an abstract class and each kind of outlining
  (e.g. for OpenMP, Cilk Plus, etc.) will create a separate AST class that
  derives from CapturedStmt
- CapturedStmt will contain "captures", a list of variables referenced within
  the Captured Statement that are declared outside the scope of the statement
- The CapturedStmt node will hold a Stmt that is the statement to be outlined.

We have prototyped Captured Statements and created an example of their use.
In our prototype, the "#pragma captured" directive is used to mark a compound statement
as a Captured Statement, which will be outlined into a separate function, and the
compound statement will be replaced by an immediate call to the outlined function.

Take the following example:

int foo(int x) {
  int y = 7;
  #pragma captured
  { y *= x; }

  return y;
}

This is equivalent to

int foo(int x) {
  __block int y = 7;
  ^{ y *= x; }();

  return y;
}

using a block or

int foo(int x) {
  int y = 7;
  [&](){ y *= x; }();

  return y;
}

using a lambda expression. With the Captured Statement, its AST looks like

(FunctionDecl 0x5b272e0 <captured.c:3:1, line:12:1> foo 'int (int)'
    (ParmVarDecl 0x5b27220 <line:3:9, col:13> x 'int')
    (CompoundStmt 0x5b52d10 <col:16, line:12:1>
      (DeclStmt 0x5b27418 <line:4:3, col:12>
        (VarDecl 0x5b273a0 <col:3, col:11> y 'int'
          (IntegerLiteral 0x5b273f8 <col:11> 'int' 7)))
      (CapturedStmt 0x5b52c60 <line:7:3, line:9:3>
        (Capture (Var 0x5b273a0 'y' 'int'))
        (Capture (ParmVar 0x5b27220 'x' 'int'))
        (CompoundStmt 0x5b52c40 <line:7:3, line:9:3>
          (CompoundAssignOperator 0x5b52c08 <line:8:5, col:10> 'int' '*=' ComputeLHSTy='int' ComputeResultTy='int'
            (DeclRefExpr 0x5b52a88 <col:5> 'int' lvalue Var 0x5b273a0 'y' 'int')
            (ImplicitCastExpr 0x5b52bf0 <col:10> 'int' <LValueToRValue>
              (DeclRefExpr 0x5b52b40 <col:10> 'int' lvalue ParmVar 0x5b27220 'x' 'int')))))
      (ReturnStmt 0x5b52cf0 <line:11:3, col:10>
        (ImplicitCastExpr 0x5b52cd8 <col:10> 'int' <LValueToRValue>
          (DeclRefExpr 0x5b52cb0 <col:10> 'int' lvalue Var 0x5b273a0 'y' 'int'))))))

which is almost the same as the AST for the block example above.

An implicit RecordDecl (not CXXRecordDecl) will be created to hold all the capture fields,
and variables are captured by reference by default. The statement to be captured will
be the body of an implicit FunctionDecl.
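
To illustrate what by-reference capture in such a record means, here is a
hand-written C sketch that contrasts a by-reference field with a by-copy
field.  The by-copy field is only an assumption about how capture behavior
might be customized, and all names (capture_mixed, helper_mixed, accumulate)
are hypothetical.

```c
/* Hypothetical capture record mixing both capture kinds. */
struct capture_mixed {
  int *total; /* captured by reference (the default described above) */
  int step;   /* captured by copy (an assumed customization) */
};

/* Stand-in for the implicit function that holds the captured statement. */
static void helper_mixed(struct capture_mixed *ctx) {
  *ctx->total += ctx->step; /* visible in the enclosing function */
  ctx->step = 0;            /* only changes the copy stored in the record */
}

int accumulate(int step) {
  int total = 10;
  struct capture_mixed ctx = { &total, step };
  helper_mixed(&ctx);
  /* total was updated through the reference; step is unchanged here */
  return total + step;
}
```

With by-reference capture, writes inside the outlined body reach the original
variable, which is what the "y *= x" example relies on.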

Semantic analysis
=================

There are a number of common constraints on statements to be captured, and these
need to be elaborated further. A general rule is to treat a Captured Statement as
a function body. For example, jump statements into and out of the statement are
restricted.

Some refactoring may be required to accommodate the needs of derived Captured Statements.
For example, one Captured Statement may allow throw expressions but another may not.

Code generation
===============

The Captured Statement AST is close to blocks, but the code generation is completely
different. In fact, for straight Captured Statements (those without additional
language extension runtime calls inserted), both emission of the outlined function
and its invocation are much closer to lambdas. The only difference is that the
capture context is explicitly passed as the first argument.

The code emitted for the outlined function looks like:

%struct.capture = type { i32*, i32* }

define internal void @__captured_stmt_helper(%struct.capture* %this) nounwind {
entry:
  %this.addr = alloca %struct.capture*, align 8
  store %struct.capture* %this, %struct.capture** %this.addr, align 8
  %0 = load %struct.capture** %this.addr
  %1 = getelementptr inbounds %struct.capture* %0, i32 0, i32 1
  %ref = load i32** %1, align 8
  %2 = load i32* %ref, align 4
  %3 = getelementptr inbounds %struct.capture* %0, i32 0, i32 0
  %ref1 = load i32** %3, align 8
  %4 = load i32* %ref1, align 4
  %mul = mul nsw i32 %4, %2
  store i32 %mul, i32* %ref1, align 4
  ret void
}
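
For readers less used to LLVM IR, the helper above corresponds roughly to the
following hand-written C.  This is a sketch for reading the IR, not something
Clang emits; the field names and the identifier captured_stmt_helper are our
own, and the field order (y first, then x) follows the getelementptr indices
in the IR.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for %struct.capture = type { i32*, i32* }.
   Field 0 refers to y and field 1 refers to x. */
struct capture {
  int *y;
  int *x;
};

/* Stand-in for @__captured_stmt_helper: the context is the first argument. */
static void captured_stmt_helper(struct capture *ctx) {
  assert(ctx != NULL);
  *ctx->y *= *ctx->x; /* the captured statement: y *= x */
}

int foo(int x) {
  int y = 7;
  struct capture ctx = { &y, &x }; /* capture both variables by reference */
  captured_stmt_helper(&ctx);      /* the compound statement becomes a call */
  return y;
}
```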

*END*




