[cfe-dev] Is TreeTransform over-eager in dropping stuff?

Wed Feb 22 08:43:22 PST 2012

On Feb 22, 2012, at 1:36 AM, Sebastian Redl wrote:

> Hi,
> 
> I've found two instances recently where TreeTransform simply skips intermediate expressions: single-argument CXXConstructExprs and ImplicitCastExprs. Not sure why this is done - could be for efficiency reasons.

> However, this isn't necessarily efficient; consider:
> 
> struct WithDefault { WithDefault(int i = 0); }
> template <typename T>
> void fn(T t) {
>  new WithDefault;
> }
> 
> Note the new expression, which is completely non-dependent. The AST for the expression in the template is
> (CXXNewExpr)
>  (CXXConstructExpr WithDefault)
>    (CXXDefaultArgExpr)
>      (IntegerLiteral 0)
> 
> When instantiating the CXXNewExpr, the initializer is instantiated. TreeTransform looks at the CXXConstructExpr, sees that it only has one argument, and skips it, returning the instantiation of the single argument instead. This will return the CXXDefaultArgExpr unchanged (because it is non-dependent).
> The CXXNewExpr instantiation now compares the old initializer (CXXConstructExpr) to the new initializer (the CXXDefaultArgExpr contained within), sees that they are different, and thus decides that it has to rebuild the CXXNewExpr.
> Which is *expensive*.
> 
> I think all the places in TreeTransform that just forward to the subexpression should check if the inner expression actually changed before blindly returning it, so as to trigger fewer redundant rebuilds.

That would be more efficient, certainly. 

> Am I right, or is there a deeper reason to the skipping?

It's done for semantic reasons. Essentially, we're skipping through implicitly-generated AST nodes so that when we do end up rebuilding, we start with something that resembles the source and then compute implicit conversions from there. If we didn't skip over these implicitly-generated AST nodes, we'd end up allowing (for example) multiple user-defined conversions after transformation.

That said, this can certainly be optimized, and we we *love* to be able to get to the point where only instantiation-dependent subexpressions are rebuilt during instantiation, to save both memory and time.

	- Doug