[cfe-dev] [RFC] Captured Statements
Douglas Gregor
dgregor at apple.com
Tue Jan 29 17:31:27 PST 2013
Hello,
On Jan 23, 2013, at 7:30 AM, "Pan, Wei" <wei.pan at intel.com> wrote:
> Hi Doug and clang-dev,
>
> We think this could answer Doug's question about "How function outlining is handled?" http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-January/027311.html
>
> Thanks!
>
> Ben Langmuir, Wei Pan and Andy Zhang
> Intel of Canada, Waterloo
>
> *BEGIN*
>
> [RFC] Captured Statements
>
> We are proposing a feature that we have called 'Captured Statements', to
> support outlining statements into functions in Clang's AST. This
> includes capturing references to variables - similar to C++11 lambdas'
> automatic capture. However, the feature is more "primitive" than lambdas, with
> less complexity and baggage, so it can be used to implement other
> features unrelated to lambdas.
>
> We used Captured Statements to support the Cilk Plus C/C++ extension in Clang.
> However, we believe that Captured Statements will be useful to others, and are
> seeking feedback on the proposed design. In particular, Captured Statements
> should be useful in the implementation of OpenMP parallel regions. They may
> also be useful in implementing some of the new features (e.g. in-kernel spawning)
> being considered for OpenCL 2.0, and for nested functions as in GCC.
>
> There is a set of requirements for function outlining:
>
> (1) Must work for both C and C++ programs
> (2) Should be nestable
> (3) Should be able to capture most types of variables, including arrays and 'this'
> (4) Should be able to customize the capturing and codegen behavior
>
> The primary use case is to support outlining parallel regions into functions
> so that they may be passed to a runtime library. Both OpenMP and Cilk Plus
> require this kind of outlining to run parallel regions on multiple threads
> using a runtime library.
>
> E.g.
>
> #pragma omp parallel
> {
> ... // parallel region is outlined, some variable references are captured
> }
>
> cilk_spawn foo(a, b, c); // call to foo is outlined into a helper function and
> // references to a, b, and c are captured
>
>
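To make the "pass the outlined function to a runtime library" use case concrete, here is a minimal C sketch. Everything in it is hypothetical: `toy_runtime_run`, `region_fn`, `region_context`, and `scale_all` are names invented for illustration, and the toy runtime calls the workers sequentially where a real OpenMP or Cilk Plus runtime would use multiple threads.

```c
/* Hypothetical signature for an outlined region: the runtime only sees a
   function pointer plus an opaque context pointer. */
typedef void (*region_fn)(void *context, int worker_id);

/* Hypothetical stand-in for a runtime entry point. A real runtime would run
   the workers on separate threads; this one runs them one after another. */
static void toy_runtime_run(region_fn fn, void *context, int num_workers) {
    for (int id = 0; id < num_workers; ++id)
        fn(context, id);
}

/* Variables referenced inside the region, captured by reference. */
struct region_context {
    const int *input;
    int *output;
    int *scale;
};

/* Outlined body of the region: each worker handles one element. */
static void region_body(void *context, int worker_id) {
    struct region_context *ctx = context;
    ctx->output[worker_id] = ctx->input[worker_id] * *ctx->scale;
}

/* The function that originally contained the parallel region. */
void scale_all(const int *input, int *output, int n, int scale) {
    struct region_context ctx = { input, output, &scale };
    toy_runtime_run(region_body, &ctx, n);
}
```

The point of the sketch is only the shape of the interface: the region body becomes an ordinary function, and the captured variables travel through a single context pointer.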
> There are two existing AST constructs closely related to Captured Statements:
> Objective-C/C++ blocks and C++11 lambda expressions.
>
> The code generation of "block" calls contains quite a few Objective-C specific
> runtime calls. There are also constraints for blocks that do not apply to
> Captured Statements, e.g. arrays cannot be captured in blocks.
> C++11 lambda expressions work for C++ only, where the context is captured in a
> CXXRecordDecl.
>
> As far as we know, neither construct can satisfy all the above requirements,
> and a new Captured Statement seems necessary. The proposed AST changes are based
> on the AST for blocks, but the codegen is closer to lambdas.
>
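As a rough illustration of requirement (3), an array can be captured by reference in plain C by storing a pointer to the whole array in the capture record. This is only a sketch under assumed names (`array_capture`, `sum_helper`, `sum_four`); it does not claim to show how Clang would lay out such a capture.

```c
/* Hypothetical capture record holding a reference to a whole array.
   A pointer-to-array keeps the element count in the type. */
struct array_capture {
    int (*values)[4];
    int *sum;
};

/* Hypothetical outlined function: reads the array through the capture. */
static void sum_helper(struct array_capture *ctx) {
    for (int i = 0; i < 4; ++i)
        *ctx->sum += (*ctx->values)[i];
}

int sum_four(void) {
    int values[4] = { 1, 2, 3, 4 };
    int sum = 0;
    struct array_capture ctx = { &values, &sum };
    sum_helper(&ctx);   /* stands in for the captured statement */
    return sum;
}
```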
> Most existing routines for variable capturing will be shared among blocks,
> lambdas and Captured Statements. We still need to extend the current clang
> implementation. For example, an OpenMP 'threadprivate' variable should also be
> captured, although it may be a static variable or static class member.
> AST
> ===
>
> We propose adding a new abstract AST class CapturedStmt to represent a Captured Statement:
>
> - CapturedStmt derives from Stmt
> - CapturedStmt is an abstract class, and each kind of outlining
> (e.g., for OpenMP, Cilk Plus, etc.) will have a separate AST class that
> derives from CapturedStmt
This part surprises me a little bit. I would have expected that CapturedStmt would be the same across the various consumers of outlining, and that it's the consumers that would differ. An OpenMP parallel for loop would store a CapturedStmt, as might a Cilk spawn expression.
> - CapturedStmt will contain "captures", a list of variables referenced within
> the Captured Statement that are declared outside the scope of the statement
> - The CapturedStmt node will hold a Stmt that is the statement to be outlined.
>
> We have prototyped Captured Statements and created an example for its use.
> In our prototype, the "#pragma captured" directive is used to mark a compound statement
> as a Captured Statement. The statement is outlined into a separate function, and the
> compound statement is replaced by an immediate call to the outlined function.
Please put this under "#pragma clang __debug captured", and we'll remove it as soon as we get our first "real" client in-tree.
> Take the following example,
>
> int foo(int x) {
> int y = 7;
> #pragma captured
> { y *= x; }
>
> return y;
> }
>
> This is equivalent to
>
> int foo(int x) {
> __block int y = 7;
> ^{ y *= x; }();
>
> return y;
> }
>
> using a block or
>
> int foo(int x) {
> int y = 7;
> [&](){ y *= x; }();
>
> return y;
> }
>
> using a lambda expression. With the Captured Statement, its AST looks like
>
> (FunctionDecl 0x5b272e0 <captured.c:3:1, line:12:1> foo 'int (int)'
> (ParmVarDecl 0x5b27220 <line:3:9, col:13> x 'int')
> (CompoundStmt 0x5b52d10 <col:16, line:12:1>
> (DeclStmt 0x5b27418 <line:4:3, col:12>
> (VarDecl 0x5b273a0 <col:3, col:11> y 'int'
> (IntegerLiteral 0x5b273f8 <col:11> 'int' 7)))
> (CapturedStmt 0x5b52c60 <line:7:3, line:9:3>
> (Capture (Var 0x5b273a0 'y' 'int'))
> (Capture (ParmVar 0x5b27220 'x' 'int'))
> (CompoundStmt 0x5b52c40 <line:7:3, line:9:3>
> (CompoundAssignOperator 0x5b52c08 <line:8:5, col:10> 'int' '*=' ComputeLHSTy='int' ComputeResultTy='int'
> (DeclRefExpr 0x5b52a88 <col:5> 'int' lvalue Var 0x5b273a0 'y' 'int')
> (ImplicitCastExpr 0x5b52bf0 <col:10> 'int' <LValueToRValue>
> (DeclRefExpr 0x5b52b40 <col:10> 'int' lvalue ParmVar 0x5b27220 'x' 'int')))))
> (ReturnStmt 0x5b52cf0 <line:11:3, col:10>
> (ImplicitCastExpr 0x5b52cd8 <col:10> 'int' <LValueToRValue>
> (DeclRefExpr 0x5b52cb0 <col:10> 'int' lvalue Var 0x5b273a0 'y' 'int'))))))
>
> which is almost the same as the AST for the block example above.
>
> An implicit RecordDecl (not CXXRecordDecl) will be created to hold all the capture fields,
> and the capture type is by reference by default. The statement to be captured will
> be the body of an implicit FunctionDecl.
Just FWIW, you'll almost certainly need to build a CXXRecordDecl in C++ mode, but that shouldn't make what you're doing any harder.
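As a rough illustration of the implicit record and implicit function described above, the `#pragma captured` example could be written by hand in plain C along these lines. The names `struct capture` (borrowed from the IR type name used in this thread), `captured_stmt_helper`, and `ctx` are stand-ins chosen for the sketch, not names Clang is known to generate.

```c
/* Hypothetical stand-in for the implicit record: one pointer field per
   by-reference capture, in the order shown in the AST dump (y, then x). */
struct capture {
    int *y;
    int *x;
};

/* Hypothetical stand-in for the implicit function: the captured compound
   statement becomes the body, and the context is the first argument. */
static void captured_stmt_helper(struct capture *ctx) {
    *ctx->y *= *ctx->x;
}

int foo(int x) {
    int y = 7;
    struct capture ctx = { &y, &x };
    captured_stmt_helper(&ctx);   /* replaces the original { y *= x; } */
    return y;
}
```

With this sketch, `foo(3)` returns 21, the same result the original function would produce without outlining.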
> Semantic analysis
> =================
>
> There are a number of common constraints on statements to be captured, and this needs
> to be elaborated further. A general rule is to treat a Captured Statement as
> a function body. For example, the use of jump statements into and out of the
> statement is limited.
>
> Some refactoring may be required to accommodate needs for derived Captured Statements.
> For example, one Captured Statement may allow throw expressions but another may not.
I guess that's one reason to have different Captured Statement subclasses, but even that's just contextual information that we can easily encode in the single Captured Statement node.
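One way to picture the single-node alternative is a node that carries a "kind" tag, with each client construct storing the node instead of subclassing it. The C sketch below is purely illustrative: the enum values, struct names, and the rule for which kinds permit `throw` are all assumptions made up for the example, not a proposed Clang API.

```c
#include <stdbool.h>

/* Hypothetical region kinds; the names are illustrative only. */
enum captured_region_kind {
    CR_DEFAULT,
    CR_OPENMP,
    CR_CILK_SPAWN
};

/* One node type shared by every client of outlining.
   Captures and the outlined statement are omitted for brevity. */
struct captured_stmt {
    enum captured_region_kind kind;
};

/* A client construct stores the shared node rather than deriving from it. */
struct omp_parallel_directive {
    struct captured_stmt *region;
};

/* Semantic checks consult the kind. Which kinds permit 'throw' is an
   assumption invented for this example. */
static bool allows_throw(const struct captured_stmt *stmt) {
    switch (stmt->kind) {
    case CR_OPENMP:
        return false;
    case CR_DEFAULT:
    case CR_CILK_SPAWN:
        return true;
    }
    return true;
}
```

The contextual difference between clients then lives in one field of the node, and the checks that differ per client become a switch over that field.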
> Code generation
> ===============
>
> The Captured Statement AST is close to blocks, but the code generation is completely
> different. In fact, for straight Captured Statements (those without additional
> language extension runtime calls inserted), both emission of the outlined function
> and its invocation are much closer to lambdas. The only difference is that the
> capture context is explicitly passed as the first argument.
>
> The code emitted for the outlined function looks like:
>
> %struct.capture = type { i32*, i32* }
>
> define internal void @__captured_stmt_helper(%struct.capture* %this) nounwind {
> entry:
> %this.addr = alloca %struct.capture*, align 8
> store %struct.capture* %this, %struct.capture** %this.addr, align 8
> %0 = load %struct.capture** %this.addr
> %1 = getelementptr inbounds %struct.capture* %0, i32 0, i32 1
> %ref = load i32** %1, align 8
> %2 = load i32* %ref, align 4
> %3 = getelementptr inbounds %struct.capture* %0, i32 0, i32 0
> %ref1 = load i32** %3, align 8
> %4 = load i32* %ref1, align 4
> %mul = mul nsw i32 %4, %2
> store i32 %mul, i32* %ref1, align 4
> ret void
> }
>
> *END*
Looks reasonable. I think this is a great approach, and I look forward to seeing the patches.
- Doug