[cfe-dev] CFG blocks and variable scope
Ted Kremenek
kremenek at apple.com
Sat Mar 28 13:46:40 PDT 2009
On Mar 28, 2009, at 12:19 PM, Martin Doucha wrote:
>> There is currently no scope information in the CFG (or the AST for
>> that matter). Adding this information would be extremely useful, and
>> would probably tie in for eventual support for encoding calls to C++
>> destructors in the CFG as well.
>>
>
> Great, so what's the preferable way of doing this? My idea is to have a
> tree of scopes (corresponding to CompoundStmt), each scope containing a
> complete list of variables declared inside it (not including
> declarations in nested scopes) regardless of control flow.
Hi Martin,
I haven't given a lot of thought to this yet, but I will comment on
this point. Scopes can be introduced in many places, especially in C++.
While I'm not certain if you suggested this, we wouldn't want to
reconstruct the work done by Sema in generating scope information;
ideally this information would still be accessible (when desired) when
one has the ASTs.
In the CFG, my thought was that *potentially* destructor calls could
be explicitly modeled. The lifetimes of regular stack variables could
also be modeled using the same mechanism. Since we haven't resolved
how we want to represent destructors in the AST or CFG, I think that
should probably be addressed first.
> Then each CFG block would have a single parent scope (the one directly
> above it) and a list of scopes inside it with a statement iterator pair
> designating the start and end of the scope in the block. Now the
> question is, can different edges leaving the block leave different sets
> of scopes?
Within a single basic block multiple scopes may be "pushed" and
"popped". The CFG only corresponds to control-flow, and thus nested
compound statements are flattened. Note that C++ also introduces
scopes in many places that C does not. For example:
  int y = 0;
  if (int x = y + 1) { ... }
There are three scopes here: the scope containing the 'if' statement
and 'int y = 0', the scope containing 'int x = y + 1', and the scope
within the { ... }. The statements 'int y = 0' and 'int x = y + 1'
occur within the same basic block. The successors of that basic block
will have entirely different scopes.
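To make the three scopes visible, here is a minimal C++ sketch that logs
when an object in each scope is constructed and destroyed. The Tracer
type, the events log, and the demo/run_demo functions are hypothetical
helpers introduced only for this illustration; none of them are part of
Clang.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical log of construction/destruction events (illustration only).
static std::vector<std::string> events;

// Hypothetical stand-in for a variable whose lifetime we want to observe.
struct Tracer {
    std::string name;
    int value;
    Tracer(std::string n, int v) : name(std::move(n)), value(v) {
        events.push_back("ctor " + name);
    }
    ~Tracer() { events.push_back("dtor " + name); }
    // Lets a Tracer be declared in an 'if' condition, like 'int x = y + 1'.
    explicit operator bool() const { return value != 0; }
};

void demo() {
    Tracer y("y", 0);                    // scope 1: encloses the 'if'
    if (Tracer x{"x", y.value + 1}) {    // scope 2: the condition variable
        Tracer z("z", 0);                // scope 3: the { ... } body
        events.push_back("body");
    }                                    // z is destroyed, then x
    events.push_back("after if");
}                                        // y is destroyed

// Runs the demo and returns the recorded event order.
std::vector<std::string> run_demo() {
    events.clear();
    demo();
    return events;
}
```

Calling run_demo() records: ctor y, ctor x, ctor z, body, dtor z, dtor x,
after if, dtor y. The constructions of y and x happen back to back, with
no branch between them, even though they belong to different scopes.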
At a high level, I don't think there is much value in modeling the
notion of "scope" at all within a CFG, and the complexity cost would
be high. Scope is a concept of the language and its syntax, and thus
it relates much more directly to the AST than the CFG. The CFG
encodes control-flow between expressions. I really think that all
that you are interested in here is the *effects* of scope on object
lifetime rather than scope itself. Since an object getting destroyed
(and here an object can be anything that is stack allocated, not just
a C++ object) is an actual event with potential side-effects, modeling
that in the CFG makes sense. To me, trying to have CFGs model scoping
muddles up their conceptual clarity (it would make CFGs a mongrel of
two orthogonal concepts).
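As a rough sketch of what "modeling the event rather than the scope"
could look like, the toy CFG below stores end-of-lifetime events as
ordinary block elements. Every type and function here (ElemKind, CFGElem,
Block, build_if_example, ends_along) is hypothetical and is not Clang's
CFG API; the block layout for the 'if' example, including the extra body
variable z, is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types; not Clang's CFG classes.
enum class ElemKind { Stmt, EndLifetime };

struct CFGElem {
    ElemKind kind;
    std::string text;  // statement text, or the variable whose lifetime ends
};

struct Block {
    std::vector<CFGElem> elems;
    std::vector<std::size_t> succs;  // indices of successor blocks
};

CFGElem stmt(std::string s) { return {ElemKind::Stmt, std::move(s)}; }
CFGElem end_lifetime(std::string v) { return {ElemKind::EndLifetime, std::move(v)}; }

// Assumed layout for: int y = 0; if (int x = y + 1) { int z = x; ... }
std::vector<Block> build_if_example() {
    std::vector<Block> cfg(3);
    // B0: both declarations share one block, then branch on x.
    cfg[0].elems = {stmt("int y = 0"), stmt("int x = y + 1")};
    cfg[0].succs = {1, 2};
    // B1: the body; z's lifetime ends when the body is left.
    cfg[1].elems = {stmt("int z = x"), stmt("..."), end_lifetime("z")};
    cfg[1].succs = {2};
    // B2: join point; x ends after the 'if', y at the end of its scope.
    cfg[2].elems = {end_lifetime("x"), end_lifetime("y")};
    return cfg;
}

// Collects the end-of-lifetime events seen along one path through the CFG.
std::vector<std::string> ends_along(const std::vector<Block>& cfg,
                                    const std::vector<std::size_t>& path) {
    std::vector<std::string> out;
    for (std::size_t b : path)
        for (const CFGElem& e : cfg[b].elems)
            if (e.kind == ElemKind::EndLifetime) out.push_back(e.text);
    return out;
}
```

An analysis walking the path B0, B1, B2 sees z, x, y end in that order;
on the path B0, B2 it sees only x and y. No scope object appears anywhere
in the graph.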
Don't get me wrong: there is still value in having a way to query the
scope of a variable, but I don't think that belongs in the CFG.
Modeling scope information (which is done in Sema but not in the ASTs)
means having some object or handle that represents a particular scope,
being able to query what objects are in a scope and where a scope
begins and ends, etc. Ultimately analyses based on CFGs probably
don't care about that information at all but rather about the
ramifications of scope in terms of object lifetime. This information
could be captured during CFG construction (which could inspect the
scope information) but the notion of scope shouldn't be in the CFGs
themselves.
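One way to picture "captured during CFG construction" is a builder that
tracks scopes only while it walks the syntax tree and emits lifetime
events when a scope closes. The sketch below is a simplified assumption:
it handles straight-line code only, and the Node type and the flatten and
flatten_example functions are hypothetical, not Sema or Clang CFG
builder interfaces. It assumes C++17, since Node holds a vector of Node.

```cpp
#include <string>
#include <vector>

// Hypothetical toy syntax tree; not Clang's AST.
struct Node {
    enum Kind { Decl, Expr, Compound } kind;
    std::string text;            // variable name (Decl) or expression text (Expr)
    std::vector<Node> children;  // used only by Compound
};

// Flattens the tree into a linear event list. 'declared' collects the
// variables declared directly in the scope currently being built; it
// exists only during construction and is not stored in the output.
void flatten(const Node& n, std::vector<std::string>& declared,
             std::vector<std::string>& out) {
    switch (n.kind) {
    case Node::Decl:
        declared.push_back(n.text);
        out.push_back("decl " + n.text);
        break;
    case Node::Expr:
        out.push_back("expr " + n.text);
        break;
    case Node::Compound: {
        std::vector<std::string> inner;  // a new scope opens here
        for (const Node& c : n.children) flatten(c, inner, out);
        // The scope closes: end lifetimes in reverse declaration order.
        for (auto it = inner.rbegin(); it != inner.rend(); ++it)
            out.push_back("end " + *it);
        break;
    }
    }
}

// Example input: { int y; { int z; use(z); } use(y); }
std::vector<std::string> flatten_example() {
    Node inner{Node::Compound, "",
               {Node{Node::Decl, "z", {}}, Node{Node::Expr, "use(z)", {}}}};
    Node outer{Node::Compound, "",
               {Node{Node::Decl, "y", {}}, inner, Node{Node::Expr, "use(y)", {}}}};
    std::vector<std::string> top, out;
    flatten(outer, top, out);
    return out;
}
```

The output of flatten_example() is: decl y, decl z, expr use(z), end z,
expr use(y), end y. The nested compound statement has been flattened, and
only its effect on object lifetime remains.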
Ted