[cfe-dev] Ownership of Stmts and Exprs, and destroying them

Sat Nov 22 23:54:28 PST 2008

On Nov 22, 2008, at 3:36 PM, Sebastian Redl wrote:

> Hi,
>
> As I understand the ownership, any given Expr is owned by the Expr  
> it's
> an argument for, which is owned by the Stmt it makes up, which is  
> owned
> by the function declaration, and so on, up to the translation unit.
> Except for some rare cases, like VLA sizes being owned by the
> VariableLengthArrayType, and function default argument expressions  
> being
> owned by their FunctionDecl.

Yep.

> Expressions that are currently being built - e.g. the argument
> parameters of ActOnBinOp - are essentially unowned.

They are owned, but the ownership is transfered to the ActOn method  
that gets them.  It is up to that method to deallocate them or  
transfer ownership to the return value.

> As such, if the Sema
> function returns early, without building some object to wrap them,  
> they
> leak. In addition, they leak when the parser returns early and  
> abandons
> the ExprTys it was carrying around. This is seen as acceptable,  
> because
> the program will soon end anyway if there was an error.

This is not acceptable, Clang can be used by other clients that are  
long lived, and leaking memory is a really bad thing.

> Still, not everyone seems to think so. Doug goes out of his way to not
> leak these objects in the Sema routines, and asked me not to leak  
> them,
> either. Chris thought it great that Doug wasn't leaking, too.

I don't know that anyone thinks it good that we leak memory :).  The  
history here is that much of Sema was built before there *was* a clear  
ownership model.  This means that there is a lot of "bad" code in it  
that doesn't obey the newly defined ownership model.  Also, the parser  
has a couple of places where it leaks as well.  If the parser does  
error recovery and ends up obviating a parsed expr/statement/etc, it  
should call the Actions::DeleteExpr/DeleteStmt methods to free them.   
This happens in some cases, but not all of them.

> To destroy a Stmt or its derived classes, you don't use delete, but  
> the
> Destroy member. This may be overridden in subclasses, and the base
> version first recursively calls Destroy for every child statement, and
> then commits suicide (delete this). Calling delete yourself is *not*
> sufficient.

I think that statements (exprs are statements) are freed with delete  
currently:

void Sema::DeleteExpr(ExprTy *E) {
   delete static_cast<Expr*>(E);
}
void Sema::DeleteStmt(StmtTy *S) {
   delete static_cast<Stmt*>(S);
}

We're slowly migrating decls to use a Create/Destroy interface, in  
order to let them be pool allocated or use some other allocation  
mechanism.  It would be good to also move statements this way  
eventually, but decls should be finished first.

> Is this correct? Because I want to clean this up. I want to make the
> Sema routines (and later the parser, too) not leak. So I need to know
> exactly how this works.

This would be great.  Probably the first place to start is to use  
audit the parser for places where early exits cause parsed stmts to be  
leaked.  Once that is done, start going through sema.  The pattern we  
want to use in Sema for statements is using OwningPtr to hold incoming  
arguments.  For example, ActOnBinOp should look like:

                                     ExprTy *LHS, ExprTy *RHS) {
   BinaryOperator::Opcode Opc = ConvertTokenKindToBinaryOpcode(Kind);
   OwningPtr<Expr> lhs = (Expr *)LHS, rhs = (Expr*)RHS;

   ...
   // Build a built-in binary operation.
   return CreateBuiltinBinOp(TokLoc, Opc, lhs.take(), rhs.take());

For routines that have more incoming values or are variadic, the  
pattern in ActOnCallExpr should be followed.

Incidentally, I'd like for C++ specific code to be moved to one of the  
C++ Sema files where it makes sense.  For example, the  
Sema::ActOnBinOp contains a bunch of code for binary operator  
overloading.  I'd prefer the code to look like this:

     if (getLangOptions().CPlusPlus &&
       (lhs->getType()->isRecordType() || lhs->getType()- 
 >isEnumeralType() ||
        rhs->getType()->isRecordType() || rhs->getType()- 
 >isEnumeralType())) {
       if (HandleCXXBinOpOverloading(...))
         return result;
     }

     // Non overloading code.

-Chris