[cfe-dev] Reference-counted statements
Douglas Gregor
dgregor at apple.com
Fri Aug 7 09:26:37 PDT 2009
Daniel, Ted, John and I had a design discussion yesterday about the
ownership model for expressions and statements in Clang. The catalyst
for this discussion is the generic tree transformation that went into
Clang earlier this week, and on the problem of transforming expression
trees. In particular, given the expression
x + y
let's say that the tree transformer transforms x into x' but leaves y
alone. It will then attempt to construct a new addition operation
x' +' y
With today's strict ownership model for expressions, we have a
problem: both the + and the transformed +' believe that they own the
subexpression y, which will lead to double-frees, use-after-free
errors, etc.
One immediate possibility would be to clone the subexpression y (into
a y' that is exactly similar to y); then the +' expression will own x'
and y' and the strict ownership model is retained. This solution is
relatively easy to implement, and is being used already (in a limited
way) during template instantiation. However, it has performance costs
in both time and memory. Moreover, it might not be the right answer
for other clients of Clang's ASTs, which may need to refer to an
expression within the the AST that might be prematurely destroyed.
We think that the addition of reference-counting to statements and
expressions is the right answer for Clang, and I've attached a patch
that adds such reference counting. Essentially, each statement (and,
therefore, expression) gets a an embedded reference count; the Destroy
method of a statement decrements the reference count, then deallocates
the expression and destroys its children if the reference count hits
zero.
Most AST clients will still use simple Stmt or Expr pointers and will
not need to increase or decrease the reference counts. In those few
places where we are explicitly sharing statements or expressions
(e.g., as part of a tree transform), we will call the statement's
Retain operation to increase the reference count. In the attached
patch, I've introduced Retain calls in a few places to illustrate the
idea:
1) When performing template instantiation for non-dependent leaf
nodes (e.g., literals), we retain the literal as written in the source
code bump its reference count by one. Previously, we were cloning.
2) When adding a SwitchCase (for a "case" or a "default" statement)
to a switch, we retain that SwitchCase. The switch statement itself
then Destroys its SwitchCases as part of its own destruction. This
fixes a bunch of crash-on-invalid problems we've had with switch
statements (see PR3444).
Two performance-related notes: first, the size of statements and
expressions did not get any bigger. We reduced the size of the
statement class field in Stmt from 32 bits to 8 bits [*] and used the
remaining 24 bits for the reference count. Second, the reference
counts are modified very, very rarely, and only in response to
explicit sharing (as in the two cases I've shown), so we are not
concerned about additional overhead introduced by the counting.
Comments, questions, etc. all welcome!
- Doug
[*] 256 statement/expression types should be enough for anyone! ;)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ref-counted-stmts.patch
Type: application/octet-stream
Size: 8205 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20090807/1d4485df/attachment.obj>
More information about the cfe-dev
mailing list