[cfe-dev] Reference-counted statements

Douglas Gregor dgregor at apple.com
Fri Aug 7 09:26:37 PDT 2009


Daniel, Ted, John and I had a design discussion yesterday about the  
ownership model for expressions and statements in Clang. The catalyst  
for this discussion is the generic tree transformation that went into  
Clang earlier this week, and on the problem of transforming expression  
trees. In particular, given the expression

	x + y

let's say that the tree transformer transforms x into x' but leaves y  
alone. It will then attempt to construct a new addition operation

	x' +' y

With today's strict ownership model for expressions, we have a  
problem: both the + and the transformed +' believe that they own the  
subexpression y, which will lead to double-frees, use-after-free  
errors, etc.

One immediate possibility would be to clone the subexpression y (into  
a y' that is exactly similar to y); then the +' expression will own x'  
and y' and the strict ownership model is retained. This solution is  
relatively easy to implement, and is being used already (in a limited  
way) during template instantiation. However, it has performance costs  
in both time and memory. Moreover, it might not be the right answer  
for other clients of Clang's ASTs, which may need to refer to an  
expression within the the AST that might be prematurely destroyed.

We think that the addition of reference-counting to statements and  
expressions is the right answer for Clang, and I've attached a patch  
that adds such reference counting. Essentially, each statement (and,  
therefore, expression) gets a an embedded reference count; the Destroy  
method of a statement decrements the reference count, then deallocates  
the expression and destroys its children if the reference count hits  
zero.

Most AST clients will still use simple Stmt or Expr pointers and will  
not need to increase or decrease the reference counts. In those few  
places where we are explicitly sharing statements or expressions  
(e.g., as part of a tree transform), we will call the statement's  
Retain operation to increase the reference count. In the attached  
patch, I've introduced Retain calls in a few places to illustrate the  
idea:

	1) When performing template instantiation for non-dependent leaf  
nodes (e.g., literals), we retain the literal as written in the source  
code bump its reference count by one. Previously, we were cloning.

	2) When adding a SwitchCase (for a "case" or a "default" statement)  
to a switch, we retain that SwitchCase. The switch statement itself  
then Destroys its SwitchCases as part of its own destruction. This  
fixes a bunch of crash-on-invalid problems we've had with switch  
statements (see PR3444).

Two performance-related notes: first, the size of statements and  
expressions did not get any bigger. We reduced the size of the  
statement class field in Stmt from 32 bits to 8 bits [*] and used the  
remaining 24 bits for the reference count. Second, the reference  
counts are modified very, very rarely, and only in response to  
explicit sharing (as in the two cases I've shown), so we are not  
concerned about additional overhead introduced by the counting.

Comments, questions, etc. all welcome!

	- Doug

[*] 256 statement/expression types should be enough for anyone! ;)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ref-counted-stmts.patch
Type: application/octet-stream
Size: 8205 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20090807/1d4485df/attachment.obj>


More information about the cfe-dev mailing list