[cfe-commits] [Review] Partial fix for PR 11645: put destructor calls before a return statement.
Erik Verbruggen
erik.verbruggen at me.com
Mon Jan 23 05:44:17 PST 2012
On 18-1-12 19:14, Chandler Carruth wrote:
> On Wed, Jan 18, 2012 at 3:04 AM, Erik Verbruggen <erik.verbruggen at me.com
> <mailto:erik.verbruggen at me.com>> wrote:
>
> On 17-1-12 17:37, Ted Kremenek wrote:
>
> On Tuesday, January 17, 2012 at 8:22 AM, Erik Verbruggen wrote:
>
> On 17-1-12 7:02, Chandler Carruth wrote:
>
> On Mon, Jan 16, 2012 at 9:29 PM, Ted Kremenek
> <kremenek at apple.com <mailto:kremenek at apple.com>
> <mailto:kremenek at apple.com <mailto:kremenek at apple.com>>
> <mailto:kremenek at apple.com <mailto:kremenek at apple.com>>>
> wrote:
>
> Hi Erik,
>
> Thanks for working on this. I've commented in the PR.
>
> My concern is that I am having second thoughts about
> whether or not
> this is the right direction. If there are temporaries in
> the return
> expression, should those destructors appear before the
> 'return'
> statement (but after the subexpression of 'return' gets
> evaluated)?
> I can see it both ways.
>
> For example:
>
> return A();
>
> where 'A' converts to whatever the return value type
> should be, and
> ~A() is non-trivial. Technically, ~A() evaluates after the
> enclosing statement. We're giving 'return' special
> treatment in
> your patch.
>
> One clean representation of this is to put all the
> destructors after
> the 'return' in the CFGBlock. The other way is to have the
> destructors appear after the subexpression of 'return',
> but before
> the 'return' itself. The former requires clients of the
> CFG to
> rethink where they expect 'return' to be in a CFGBlock.
>
> What do you think?
>
>
> If I can jump in from the peanut gallery...
>
> This doesn't seem a necessarily unique problem to return
> statements. Any
> statement which forms a CFGBlock terminator and which
> contains
> temporaries with destructors should hit the same issue.
> The other
> example is I suspect throw. It may even be possible to
> trigger this with
> an indirect goto if the index into the label-address
> array involves
> temporary objects.
>
>
> For a throw it is actually slightly worse: no implicit
> destructors calls
> get added to the CFG. I did not check the indirect goto,
> because I could
> not come up with a good example..
>
> For now, just assume that everything with EH in the CFG is wrong or
> incomplete. That's a discussion on its own.
>
> I think the C++ standard* provides some help in making a
> choice about
> how to represent this. It doesn't actually say that ~A()
> evaluates after
> the enclosing statement. Instead it defines
> 'full-expressions' as an
> expression which is not a subexpression of any other
> expression
> ([intro.execution] 1.9p10). As *part* of this
> full-expression, all
> temporary objects' non-trivial destructors must be called
> ([class.temporary] 12.2p3). To me, that indicates that
> the destructor
> should be placed after the complete subexpression in the
> return
> statement, but within the return statement itself. This
> has the nice
> property of preserving the (very useful!) invariant that
> terminator
> statements are the last statement in the CFG block.
>
>
> This would mean that a full expression of "return
> SomeClass();" would
> get split in two: all the implicit destructors calls for the
> scope get
> placed between the return value calculation and the actual
> return of
> that value.
>
> Yes. Exactly. That's the correct semantics
>
> If you would want to base code generation on the CFG, it
> would force an extra pass to identify whether calculated
> values should
> be constructed in a return value slot or not. So I think
> that using
> return/throw (and possibly goto) as markers, would be a
> simplification.
>
> Concerning "goto" and "throw"; those are terminators and already can
> serve as markers. I'll elaborate with the case of "return". I'm
> also not
> certain what you mean by an "extra pass", but it *might* require
> some work.
>
> My concern is that, unlike "goto", the "return" in the CFG is not a
> terminator; it doesn't represent a transfer of control. That's what
> terminators in CFGBlocks are for. Instead, it appears in the
> CFGBlock as
> an element, which means (in that context) it represents a
> statement that
> gets executed. Since it doesn't represent the transfer of control in
> that context, the sole role of the "return" in the CFG is to
> represent
> the binding of the return value.
>
> That said, we could place the "return" as a terminator, in
> *addition* to
> what we do now, and have that terminator terminate the CFGBlock
> with the
> destructor calls in it. That terminator would be on a CFGBlock that
> transitioned to the Exit block. Doing this isn't absolutely
> necessary,
> but it would annotate the end of the CFGBlock with the return
> statement,
> possibly making it more amendable to code generation while still
> preserving the invariants of the CFG.
>
>
> So if I understand it correctly, the following example should
> generate the CFG below:
>
>
> class A {
> public:
> A();
> ~A();
> };
>
> class B {
> public:
> B();
> ~B();
> };
>
> B test() {
> A a;
> return B();
>
> }
>
> [B3 (ENTRY)]
> Succs (1): B2
>
> [B1]
> 1: [B2.2].~A() (Implicit destructor)
> Succs (1): B0
>
> [B2]
> 1: (CXXConstructExpr, class A)
> 2: A a;
> 3: B() (CXXConstructExpr, class B)
> 4: [B2.3] (BindTemporary)
> 5: [B2.4] (ImplicitCastExpr, NoOp, const class B)
> 6: [B2.5]
> 7: [B2.6] (CXXConstructExpr, class B)
> 8: ~B() (Temporary object destructor)
> 9: return [B2.7];
> Preds (1): B3
> Succs (1): B1
>
>
> [B0 (EXIT)]
> Preds (2): B1
>
> (Note: this is manually "corrected" CFG.)
>
> Is that right?
>
>
> FWIW, this what I had in mind. I'll also give a slightly more complex
> example to further illustrate my idea, but again this is a manually
> "corrected" CFG:
>
> B test2() {
> A a1;
> {
> A a2, a3;
> A a4;
> return B();
> }
> }
>
> [B5 (ENTRY)]
> Succs (1): B4
>
> [B4]
> 1: (CXXConstructExpr, class A)
> 2: A a1,
> Preds (1): B5
> Succs (1): B4
>
> [B3]
> 1: (CXXConstructExpr, class A)
> 2: A a2,
> 3: (CXXConstructExpr, class A)
> 4: a3;
> 5: (CXXConstructExpr, class A)
> 6: A a4;
> 7: B() (CXXConstructExpr, class B)
> 8: [B3.7] (BindTemporary)
> 9: [B3.8] (ImplicitCastExpr, NoOp, const class B)
> 10: [B3.9]
> 11: [B3.10] (CXXConstructExpr, class B)
> 12: ~B() (Temporary object destructor)
> 13: return [B3.11];
> Preds (1): B4
> Succs (1): B1
>
> [B2]
> 1: [B3.6].~A() (Implicit destructor)
> 2: [B3.4].~A() (Implicit destructor)
> 3: [B3.2].~A() (Implicit destructor)
> Preds (1): B3
> Succs (1): B1
>
> [B1]
> 1: [B4.2].~A() (Implicit destructor)
> Preds (1): B2
> Succs (1): B0
>
> [B0 (EXIT)]
> Preds (1): B1
So that comes down to one block with destructors for every scope. What
would be the benefit for that compared to one block for a return that
holds all destructors?
-- Erik.
More information about the cfe-commits
mailing list