[cfe-commits] Support for C++ new and delete

Thu Nov 20 11:56:42 PST 2008

Douglas Gregor wrote:
> Hi Sebastian,
>
> On Nov 19, 2008, at 6:37 PM, Sebastian Redl wrote:
>> I've enhanced and updated my patch to support C++ new and delete. It 
>> doesn't support everything yet, but it's a pretty good start, and I'd 
>> like to get the current state checked in.
>
> Very nice. I'd like to discuss the VLAs issue a bit (see below), but 
> otherwise I think it's good to commit this as the first (big) step. 
> Thanks!
Thanks for the review. I think I'll discuss the mess I've found myself 
in with the whole VLA issue first instead of spread over your individual 
comments.

The basic problem stems from the fact that I need to reuse as much code 
as possible, or else go insane from duplication with minimal changes. So 
I wanted to use ParseTypeName for the 'new (type-id)' form. Now, 
ParseTypeName gives me back a TypeTy, which is opaque to the parser. So 
whatever I get from this function, it has to go into Sema, because only 
Sema can handle it.

My original design called for an 'Expr *ArraySize' parameter to 
ActOnCXXNew. It was supposed to be null for non-arrays and non-null for 
arrays. Then I discovered that I couldn't get the ArraySize expression 
out of the result of ParseTypeName in the parser, and modified it so 
that the '(type-id)' form was recognized in ActOnCXXNew, and on 
array-level unwrapped if the type was an array.
Then I discovered that I can't recover the Expr from a constant-sized 
array (a pity, by the way). So I implemented a terrible hack that 
created a LiteralExpr from the recovered size. But of course, that 
didn't even have proper source location information. I finally decided 
that I should really use the existing VLA functionality for this.
Even then, I was unwrapping the VLA at first. But a VLA won't give up 
its size expression once created. My build crashed because the size 
expression was freed twice. For a bit, I had the CXXNewExpr not own the 
size expression, but that was such a horrible hack that I gave it up and 
implemented what I have now instead.

I really like the idea of simply passing a Declarator to ActOnCXXNew, 
though. Ideally, I'd kill the outermost array there and keep its 
expression, then pass it on to ActOnTypeName to create the type for me. 
However, for that I'd have to make the chunks removable in Declarator, 
which they aren't now.
I'll still check this patch in with the VLA solution - I want a 
reversion point if this approach doesn't work out.

>
> In Class CXXNewExpr:
>
> +  // Points to the allocation function used.
> +  Decl *OperatorNew;
> +  // Points to the deallocation function used in case of error. May 
> be null.
> +  Decl *OperatorDelete;
>
> Shouldn't these be FunctionDecl pointers, rather than Decl pointers?
>
> +  // Points to the constructor used. Cannot be null if AllocType is a 
> record;
> +  // it would still point at the default constructor (even an 
> implicit one).
> +  // Must be null for all other types.
> +  Decl *Constructor;
>
> Same question here: why not make it a CXXConstructorDecl pointer?
Yes, and yes. I made them Decls initially because I didn't want to think 
about the hierarchy at that time, then forgot about it.
> +  ~CXXNewExpr() {
> +    delete[] SubExprs;
> +  }
>
> You'll need to delete each of the SubExprs, too. I suggest moving the 
> destructor out-of-line, since it'll end up having a loop.
The individual SubExprs are destroyed by Stmt::DestroyChildren.
>
> +  // Points to the operator delete overload that is used. Could be a 
> member.
> +  Decl *OperatorDelete;
>
> This will always be a FunctionDecl *, right? (Or a subclass)
Again, yes.
>
> +  // The pointer expression to be deleted.
> +  Stmt *Argument;
>
> Could this be an Expr * rather than a Stmt *?
Conceptually, yes. However, I can't initialize a Stmt::child_iterator 
from an Expr**, so I'll have an ugly cast in that function. It seemed 
easier to just make it a Stmt*.
>
> +  // Location of the expression.
> +  SourceLocation Loc;
>
> Is this the location of the "delete" operator? It should probably be 
> called OpLoc or DeleteLoc.
It's the location of the 'delete' operator, or the '::' if the 
expression was "::delete".
> I think it makes sense in the parser to deal with the type in the new 
> expression like a VLA,
Note, though, that the parser doesn't know what types are.
> +  // Validate the type, and unwrap an array if any.
> +  if (CheckAllocatedType(CheckType, StartLoc, SourceRange(TyStart, 
> TyEnd)))
> +    return true;
>
> If we exit early here, we'll leak the PlacementArgs and 
> ConstructorArgs. It might make sense to do what ActOnCallExpr does, 
> which is to create an llvm::OwningPtr of the final expression in the 
> beginning, which will be responsible for deallocating all of the 
> subexpressions even if we exit early.
The thing is, I don't think what ActOnCallExpr does actually works. Like 
CXXNewExpr, CallExpr only deletes the array it holds in the destructor, 
and relies on Stmt::DestroyChildren to delete the subexpressions. But 
llvm::OwningPtr won't call Destroy() (which calls DestroyChildren()).
Since the memory of the arrays passed in is owned by the SmallVectors, 
nothing is freed that wouldn't be freed otherwise too.
So yes, we need a way to avoid the leak, but ActOnCallExpr is doing it 
wrong. (So we need that way and apply it there, too.)
>
> +    // FIXME: This is incorrect for when there is an empty 
> initializer and
> +    // no user-defined constructor. Must zero-initialize, not 
> default-construct.
> +    Constructor = PerformInitializationByConstructor(
> +                      CheckType, ConsArgs, NumConsArgs,
> +                      TyStart, SourceRange(TyStart, ConstructorRParen),
> +                      CheckType.getAsString(),
> +                      NumConsArgs != 0 ? IK_Direct : IK_Default);
>
> We'll deal with this once we've gone through and updated all of the 
> initialization routines to deal with C++ semantics. As you've noted, 
> we don't have all of the various cases implemented.
And to make things worse, the wording has changed significantly in 
C++0x. I *think* it still says the same overall, but since some wording 
has been dragged from all over the standard and centralized in 8.5, it's 
a bit tricky to prove.
> I hadn't realized that isObjectType was false for incomplete types. 
> That's correct for C99, but not for C++. It's not your problem to deal 
> with (unless you want to <g>), but could you please put a
>
>     FIXME: check incomplete type before isObjectType
>
> or something similar into CheckAllocatedType? It'll make it easier to 
> fix the problem of isObjectType having a different behavior in C vs. C++.
OK.
> +  case CXXNewExprClass:
> +    // In theory, there might be new expressions that don't have side 
> effects
> +    // (e.g. a placement new with an uninitialized POD), but those 
> are obscure.
>
> It's an obscure case, but I think it's worthy of a FIXME.  Whether we 
> ever get to fix it or not is the real question :)
OK.
>
> -  case CXXThisExprClass:
> -    return LV_InvalidExpression;
>
> ?
>
> "this" is an rvalue according to C++ [expr.prim]p3.
The default behavior of the switch is to return LV_InvalidExpression. I 
don't see why CXXThisExpr should be the one case that does so explicitly 
and unconditionally.
>
> +void CXXNewExpr::EmitImpl(Serializer& S) const {
> +  S.Emit(getType());
> +  S.Emit(Initializer);
> +  S.Emit(NumPlacementArgs);
> +  S.Emit(NumConstructorArgs);
> +  S.BatchEmitOwnedPtrs(NumPlacementArgs + NumConstructorArgs, SubExprs);
> +  // FIXME: What about the default allocation functions? We don't 
> want to
> +  // own them to avoid the mess DeclRefExpr is in, but who would?
>
> For new, it seems to me that the operator new and operator delete will 
> either be owned by the class (if we found them in the class) or owned 
> by the translation unit  (whether they are implicit or not); the 
> constructor will always be owned by the class. So I don't think the 
> FIXME needs any fixing.
OK.
>
> +void StmtPrinter::VisitCXXNewExpr(CXXNewExpr *E) {
> +  // FIXME: How to recover the '::'?
> +  OS << "new ";
>
> Why not just put a bit into CXXNewExpr/CXXDeleteExpr that says whether 
> the user explicitly wrote the "::"?
> (We could do something similar to keep track of whether we parsed a 
> new-type-id or a (type-id).)
Yeah, probably the best solution.
>
> -void Parser::ParseDeclaratorInternal(Declarator &D, bool PtrOperator) {
> +void Parser::ParseDeclaratorInternal(Declarator &D,
> +                                     DirectDeclParseFunction 
> DirectDeclParser) {
>    tok::TokenKind Kind = Tok.getKind();
>
>    // Not a pointer, C++ reference, or block.
>    if (Kind != tok::star && (Kind != tok::amp || !getLang().CPlusPlus) &&
>        (Kind != tok::caret || !getLang().Blocks)) {
> -    if (!PtrOperator)
> -      ParseDirectDeclarator(D);
> +    if (DirectDeclParser)
> +      (this->*DirectDeclParser)(D);
>      return;
>    }
>
> This is cool.
Thanks.
>
> +void Parser::ParseNewDirectDeclarator(Declarator &D)
> +{
> +  // Parse the array dimensions.
>
> Please add a comment for this routine.
OK.
>
> +bool Parser::ParseExpressionListOrTypeId(ExprListTy &PlacementArgs, 
> TypeTy *&Ty)
> +{
> +  // The '(' was alrady consumed.
>
> Typo "alrady".
OK.

Sebastian