[cfe-dev] CFG blocks and variable scope
Ted Kremenek
kremenek at apple.com
Mon Mar 30 11:17:27 PDT 2009
On Mar 29, 2009, at 3:05 AM, Martin Doucha wrote:
> Chris Lattner wrote:
>> Would it make sense for the CFG to contain a "virtual statement"
>> saying "variable x destroyed here"? This would be a natural way to
>> handle C++ dtors and would also be useful in C, because you'd know
>> the
>> end of the variable's life. CFG construction would handle this as it
>> is walking the scopes.
>
> It would if you didn't have to destroy variables in transition between
> CFG blocks. Look at C99/C++ for loop:
> for (int i = 0; i < 10; i++) { int x =i; ... }
> Here, 'x' is destroyed on each pass while 'i' is destroyed in
> transition
> to the block directly after the loop. And destroying it in the
> following
> block doesn't work because other blocks which have no variable
> corresponding to 'i' may point at this block as well. So you either
> have
> to start making virtual basic blocks for variable destruction or you
> can
> list variables to be destroyed at the edge between basic blocks.
Hi Martin,
I think having "virtual" blocks is a much better solution (although
all basic blocks are "virtual" because they are implied by the control-
flow of the code). Unlike the other approach, it doesn't introduce a
new concept that clients of CFGs would have to understand. Besides,
CFG blocks are just ordered lists; there is no real advantage of
having lists attached to edges versus having a basic block
representing a list of destroyed variables. Moreover, there is
control-flow between variable destruction; the ordering between object
destruction is particularly important in C++, and that control-flow
should be captured explicitly and clearly in the CFG itself. By
having virtual basic blocks, clients are isolated from having to
reason about the high-level control-flow rules of the language and
instead just think about evaluating individual expressions.
> Yes, virtual statements could be used for destroying variables in the
> middle of CFG block. But that's not enough. And each of those
> statements
> should point to a group of variables rather than list each of them
> separately. Listing each of them separately will take a lot of
> memory in
> more complex code structures.
I don't follow this argument. First, I don't understand the memory
concerns. Having statements point to group of N variables still
requires N+1 pointers versus having N pointers in a basic block to
virtual statements. A basic block that contains virtual statements
(which would be a format for this information that existing clients
would already understand) would require space for the basic block
(which is insignificant) and N pointers. To me the solution of having
virtual statements in basic blocks is strictly better because it is
much simpler from the perspective of clients and has almost exactly
the same memory requirements.
Note that virtual statements don't necessarily require allocating new
objects; we could use bit-swizzinling in the pointer values to
distinguish between regular Stmt* and virtual statements (which could
be Decl* mangled with a specific bit).
Second, even if there were memory concerns, CFGs are transient data
structures whose primary role is to greatly simplify the reasoning
about control-flow for clients. Typically we only construct a CFG
when analyzing a specific function or method. Once we are done
analyzing a function we release the memory associated with the CFG.
Even for functions with a ton of control-flow (switches with nested
loops, etc), I have yet to seen where the size of the CFG causes any
real memory concerns.
>
>> For declstmt's? I think the AST has enough information to know
>> scopes. What is missing?
>
> Something like mapping between CFG blocks and complete AST?
I think this is already present. CFG blocks contain Stmt* which
directly reference the AST. This seems more than adequate to
reference the complete AST (although I'm not certain if "complete"
implies something more specific).
More information about the cfe-dev
mailing list