[cfe-commits] [Review] Partial fix for PR 11645: put destructor calls before a return statement.

Mon Jan 23 05:44:17 PST 2012

On 18-1-12 19:14, Chandler Carruth wrote:
> On Wed, Jan 18, 2012 at 3:04 AM, Erik Verbruggen <erik.verbruggen at me.com
> <mailto:erik.verbruggen at me.com>> wrote:
>
>     On 17-1-12 17:37, Ted Kremenek wrote:
>
>         On Tuesday, January 17, 2012 at 8:22 AM, Erik Verbruggen wrote:
>
>             On 17-1-12 7:02, Chandler Carruth wrote:
>
>                 On Mon, Jan 16, 2012 at 9:29 PM, Ted Kremenek
>                 <kremenek at apple.com <mailto:kremenek at apple.com>
>                 <mailto:kremenek at apple.com <mailto:kremenek at apple.com>>
>                 <mailto:kremenek at apple.com <mailto:kremenek at apple.com>>>
>                 wrote:
>
>                 Hi Erik,
>
>                 Thanks for working on this. I've commented in the PR.
>
>                 My concern is that I am having second thoughts about
>                 whether or not
>                 this is the right direction. If there are temporaries in
>                 the return
>                 expression, should those destructors appear before the
>                 'return'
>                 statement (but after the subexpression of 'return' gets
>                 evaluated)?
>                 I can see it both ways.
>
>                 For example:
>
>                 return A();
>
>                 where 'A' converts to whatever the return value type
>                 should be, and
>                 ~A() is non-trivial. Technically, ~A() evaluates after the
>                 enclosing statement. We're giving 'return' special
>                 treatment in
>                 your patch.
>
>                 One clean representation of this is to put all the
>                 destructors after
>                 the 'return' in the CFGBlock. The other way is to have the
>                 destructors appear after the subexpression of 'return',
>                 but before
>                 the 'return' itself. The former requires clients of the
>                 CFG to
>                 rethink where they expect 'return' to be in a CFGBlock.
>
>                 What do you think?
>
>
>                 If I can jump in from the peanut gallery...
>
>                 This doesn't seem a necessarily unique problem to return
>                 statements. Any
>                 statement which forms a CFGBlock terminator and which
>                 contains
>                 temporaries with destructors should hit the same issue.
>                 The other
>                 example is I suspect throw. It may even be possible to
>                 trigger this with
>                 an indirect goto if the index into the label-address
>                 array involves
>                 temporary objects.
>
>
>             For a throw it is actually slightly worse: no implicit
>             destructors calls
>             get added to the CFG. I did not check the indirect goto,
>             because I could
>             not come up with a good example..
>
>         For now, just assume that everything with EH in the CFG is wrong or
>         incomplete. That's a discussion on its own.
>
>                 I think the C++ standard* provides some help in making a
>                 choice about
>                 how to represent this. It doesn't actually say that ~A()
>                 evaluates after
>                 the enclosing statement. Instead it defines
>                 'full-expressions' as an
>                 expression which is not a subexpression of any other
>                 expression
>                 ([intro.execution] 1.9p10). As *part* of this
>                 full-expression, all
>                 temporary objects' non-trivial destructors must be called
>                 ([class.temporary] 12.2p3). To me, that indicates that
>                 the destructor
>                 should be placed after the complete subexpression in the
>                 return
>                 statement, but within the return statement itself. This
>                 has the nice
>                 property of preserving the (very useful!) invariant that
>                 terminator
>                 statements are the last statement in the CFG block.
>
>
>             This would mean that a full expression of "return
>             SomeClass();" would
>             get split in two: all the implicit destructors calls for the
>             scope get
>             placed between the return value calculation and the actual
>             return of
>             that value.
>
>         Yes. Exactly. That's the correct semantics
>
>             If you would want to base code generation on the CFG, it
>             would force an extra pass to identify whether calculated
>             values should
>             be constructed in a return value slot or not. So I think
>             that using
>             return/throw (and possibly goto) as markers, would be a
>             simplification.
>
>         Concerning "goto" and "throw"; those are terminators and already can
>         serve as markers. I'll elaborate with the case of "return". I'm
>         also not
>         certain what you mean by an "extra pass", but it *might* require
>         some work.
>
>         My concern is that, unlike "goto", the "return" in the CFG is not a
>         terminator; it doesn't represent a transfer of control. That's what
>         terminators in CFGBlocks are for. Instead, it appears in the
>         CFGBlock as
>         an element, which means (in that context) it represents a
>         statement that
>         gets executed. Since it doesn't represent the transfer of control in
>         that context, the sole role of the "return" in the CFG is to
>         represent
>         the binding of the return value.
>
>         That said, we could place the "return" as a terminator, in
>         *addition* to
>         what we do now, and have that terminator terminate the CFGBlock
>         with the
>         destructor calls in it. That terminator would be on a CFGBlock that
>         transitioned to the Exit block. Doing this isn't absolutely
>         necessary,
>         but it would annotate the end of the CFGBlock with the return
>         statement,
>         possibly making it more amendable to code generation while still
>         preserving the invariants of the CFG.
>
>
>     So if I understand it correctly, the following example should
>     generate the CFG below:
>
>
>     class A {
>     public:
>         A();
>         ~A();
>     };
>
>     class B {
>     public:
>         B();
>         ~B();
>     };
>
>     B test() {
>         A a;
>         return B();
>
>     }
>
>       [B3 (ENTRY)]
>        Succs (1): B2
>
>       [B1]
>        1: [B2.2].~A() (Implicit destructor)
>        Succs (1): B0
>
>       [B2]
>        1:  (CXXConstructExpr, class A)
>        2: A a;
>        3: B() (CXXConstructExpr, class B)
>        4: [B2.3] (BindTemporary)
>        5: [B2.4] (ImplicitCastExpr, NoOp, const class B)
>        6: [B2.5]
>        7: [B2.6] (CXXConstructExpr, class B)
>        8: ~B() (Temporary object destructor)
>        9: return [B2.7];
>        Preds (1): B3
>        Succs (1): B1
>
>
>       [B0 (EXIT)]
>        Preds (2): B1
>
>     (Note: this is manually "corrected" CFG.)
>
>     Is that right?
>
>
> FWIW, this what I had in mind. I'll also give a slightly more complex
> example to further illustrate my idea, but again this is a manually
> "corrected" CFG:
>
> B test2() {
>    A a1;
>    {
>      A a2, a3;
>      A a4;
>      return B();
>    }
> }
>
>   [B5 (ENTRY)]
>    Succs (1): B4
>
>   [B4]
>    1:  (CXXConstructExpr, class A)
>    2: A a1,
>    Preds (1): B5
>    Succs (1): B4
>
>   [B3]
>    1:  (CXXConstructExpr, class A)
>    2: A a2,
>    3:  (CXXConstructExpr, class A)
>    4: a3;
>    5:  (CXXConstructExpr, class A)
>    6: A a4;
>    7: B() (CXXConstructExpr, class B)
>    8: [B3.7] (BindTemporary)
>    9: [B3.8] (ImplicitCastExpr, NoOp, const class B)
>    10: [B3.9]
>    11: [B3.10] (CXXConstructExpr, class B)
>    12: ~B() (Temporary object destructor)
>    13: return [B3.11];
>    Preds (1): B4
>    Succs (1): B1
>
>   [B2]
>    1: [B3.6].~A() (Implicit destructor)
>    2: [B3.4].~A() (Implicit destructor)
>    3: [B3.2].~A() (Implicit destructor)
>    Preds (1): B3
>    Succs (1): B1
>
>   [B1]
>    1: [B4.2].~A() (Implicit destructor)
>    Preds (1): B2
>    Succs (1): B0
>
>   [B0 (EXIT)]
>    Preds (1): B1

So that comes down to one block with destructors for every scope. What 
would be the benefit for that compared to one block for a return that 
holds all destructors?

-- Erik.