[cfe-dev] CFG blocks and variable scope

Mon Mar 30 11:17:27 PDT 2009

On Mar 29, 2009, at 3:05 AM, Martin Doucha wrote:

> Chris Lattner wrote:
>> Would it make sense for the CFG to contain a "virtual statement"
>> saying "variable x destroyed here"?  This would be a natural way to
>> handle C++ dtors and would also be useful in C, because you'd know  
>> the
>> end of the variable's life.  CFG construction would handle this as it
>> is walking the scopes.
>
> It would if you didn't have to destroy variables in transition between
> CFG blocks. Look at C99/C++ for loop:
> for (int i = 0; i < 10; i++) { int x =i; ... }
> Here, 'x' is destroyed on each pass while 'i' is destroyed in  
> transition
> to the block directly after the loop. And destroying it in the  
> following
> block doesn't work because other blocks which have no variable
> corresponding to 'i' may point at this block as well. So you either  
> have
> to start making virtual basic blocks for variable destruction or you  
> can
> list variables to be destroyed at the edge between basic blocks.

Hi Martin,

I think having "virtual" blocks is a much better solution (although  
all basic blocks are "virtual" because they are implied by the control- 
flow of the code).  Unlike the other approach, it doesn't introduce a  
new concept that clients of CFGs would have to understand.  Besides,  
CFG blocks are just ordered lists; there is no real advantage of  
having lists attached to edges versus having a basic block  
representing a list of destroyed variables.  Moreover, there is  
control-flow between variable destruction; the ordering between object  
destruction is particularly important in C++, and that control-flow  
should be captured explicitly and clearly in the CFG itself.  By  
having virtual basic blocks, clients are isolated from having to  
reason about the high-level control-flow rules of the language and  
instead just think about evaluating individual expressions.

> Yes, virtual statements could be used for destroying variables in the
> middle of CFG block. But that's not enough. And each of those  
> statements
> should point to a group of variables rather than list each of them
> separately. Listing each of them separately will take a lot of  
> memory in
> more complex code structures.

I don't follow this argument.  First, I don't understand the memory  
concerns.  Having statements point to group of N variables still  
requires N+1 pointers versus having N pointers in a basic block to  
virtual statements.  A basic block that contains virtual statements  
(which would be a format for this information that existing clients  
would already understand) would require space for the basic block  
(which is insignificant) and N pointers.  To me the solution of having  
virtual statements in basic blocks is strictly better because it is  
much simpler from the perspective of clients and has almost exactly  
the same memory requirements.

Note that virtual statements don't necessarily require allocating new  
objects; we could use bit-swizzinling in the pointer values to  
distinguish between regular Stmt* and virtual statements (which could  
be Decl* mangled with a specific bit).

Second, even if there were memory concerns, CFGs are transient data  
structures whose primary role is to greatly simplify the reasoning  
about control-flow for clients.  Typically we only construct a CFG  
when analyzing a specific function or method.  Once we are done  
analyzing a function we release the memory associated with the CFG.   
Even for functions with a ton of control-flow (switches with nested  
loops, etc), I have yet to seen where the size of the CFG causes any  
real memory concerns.

>
>> For declstmt's?  I think the AST has enough information to know
>> scopes. What is missing?
>
> Something like mapping between CFG blocks and complete AST?

I think this is already present.  CFG blocks contain Stmt* which  
directly reference the AST.  This seems more than adequate to  
reference the complete AST (although I'm not certain if "complete"  
implies something more specific).