[cfe-dev] CFG support for temporary object desctructors

Tue Jul 2 06:20:24 PDT 2013

On 2 July 2013 03:47, Jordan Rose <jordan_rose at apple.com> wrote:

>
> On Jun 26, 2013, at 8:11 , Pavel Labath <labath at google.com> wrote:
>
> I don't think so. "implicit" destructors are really more like "scoped"
> destructors: they fire at the end of the scope. In this case, the temporary
> is materialized, but there is still no lifetime extension—it's still
> destroyed at the end of the full-expression. (Which makes the code really
> really bad, of course, since you're returning a reference to stack memory.
> Do we not warn about this?)
>
The warning is there and is working ok.  I was thrown off, because I was
expecting a different warning from the destructor call. You are right, this
does not need to be changed.

The second issue happens with code patterns like:
> Object *o;
> ...
> foo(&o);
> ...
> o->bar();
>
> Here, I get a "dereferencing undefined pointer" warning on the last line,
> but it only happens for some values of "...". I am still investigating how
> is this related to the destructor problem and what exactly is going on
> there.
>
>
> Without seeing the body of foo() or the contents of the various ..., I
> can't really offer any advice.
>
This turned out to be related to the noreturn destructor problem in the
last patch, so this is covered also.

 The second is a lot more difficult. When you pass temporaries to a
>> function, what happens when they are destroyed?
>>
>> void byRef(const Object &ref);
>> void byVal(Object val);
>>
>> byRef(Object())
>> byVal(Object())
>>
>> The destructor here has to run at the end of the full-expression
>> containing the call (roughly, at the outermost expression). However, the
>> function call may have changed some of the object's contents (even 'byRef',
>> if the object has fields marked 'mutable'). We need to make sure that's
>> reflected when the chosen destructor is inlined.
>>
>> The inlining problem is compounded by the fact that we decay structs to a
>> collection of values (nonloc::LazyCompoundVal) whenever we need to treat
>> them as rvalues. This is usually the right thing to do, but has very
>> confusing results for temporaries being copied in and out of functions.
>> According to the standard, the copy constructor happens in the caller (and
>> that's how it appears in the AST), but the region it's being copied into is
>> based on a ParmVarDecl that's part of a StackFrameContext for inlining the
>> function...which we may not decide to inline after all. Ignoring temporary
>> destructors our current behavior is indistinguishable from the standard,
>> but as soon as we start claiming to support temporary destructors we're
>> going to hit this problem.
>>
> I find this a bit strange. I can't say I know all the quirks of the c++,
> but to me the first case looks very similar to:
> { Object foo; byRef(foo); }
> You still need to construct the object, pass a reference to the function
> and then call the destructor. Obviously, the AST for the two fragments
> looks very differently, and we need to take care when parsing it to
> construct a CFG. However, once we already have a CFG, processing it should
> not depend (too much) on whether we are working with a temporary or a
> normal object. Specifically, I do not see why internal implementation
> details (LazyCompoundVal, et al.) should cause problems in one case but not
> the other. Even in case of normal objects, you still need to pass the
> reference to a function and then reflect those changes when you inline the
> destructor.
>
> What am I missing?
>
>
> The poor AST representation. For this code:
>
> struct Object { ~Object(); };
>
> void test() {
> extern void byRef(const Object &ref);
> byRef(Object());
> }
>
>
> Clang generates this AST:
>
> `-FunctionDecl 0x7fb85b830b60 <line:3:1, line:6:1> test 'void (void)'
>   `-CompoundStmt 0x7fb85b8619c0 <line:3:13, line:6:1>
>     |-DeclStmt 0x7fb85b8615b8 <line:4:2, col:35>
>     | `-FunctionDecl 0x7fb85b861510 <col:2, col:34> byRef 'void (const
> struct Object &)' extern
>     |   `-ParmVarDecl 0x7fb85b861450 <col:20, col:34> 'const struct Object
> &'
>     `-ExprWithCleanups 0x7fb85b8619a8 <line:5:2, col:16> 'void'
>       `-CallExpr 0x7fb85b861940 <col:2, col:16> 'void'
>         |-ImplicitCastExpr 0x7fb85b861928 <col:2> 'void (*)(const struct
> Object &)' <FunctionToPointerDecay>
>         | `-DeclRefExpr 0x7fb85b8615d0 <col:2> 'void (const struct Object
> &)' lvalue Function 0x7fb85b861510 'byRef' 'void (const struct Object &)'
>         `-MaterializeTemporaryExpr 0x7fb85b861988 <col:8, col:15> 'const
> struct Object' lvalue
>           `-ImplicitCastExpr 0x7fb85b861970 <col:8, col:15> 'const struct
> Object' <NoOp>
>             `-CXXBindTemporaryExpr 0x7fb85b8618d8 <col:8, col:15> 'struct
> Object' (CXXTemporary 0x7fb85b8618d0)
>               `-CXXTemporaryObjectExpr 0x7fb85b861890 <col:8, col:15>
> 'struct Object' 'void (void)' zeroing
>
> Notice how the CXXTemporaryObjectExpr is a prvalue (it's not marked
> "lvalue" or "xvalue"). In the analyzer, that means it's not representing a
> region but rather values that could be stored in a region. The
> MaterializeTemporaryExpr is then an lvalue, which will cause the analyzer
> to create a region to store the temporary value in.
>
> That's actually all fine, as long as we destroy the newly-created region
> rather than what we originally got from the CXXTemporaryObjectExpr. But
> that's not what the CXXTemporary will lead us to do.
>
> The byVal case is interesting in a different way:
>
> `-FunctionDecl 0x7f8be1030b60 <line:3:1, line:6:1> test 'void (void)'
>   `-CompoundStmt 0x7f8be1061ab0 <line:3:13, line:6:1>
>     |-DeclStmt 0x7f8be1061588 <line:4:2, col:20>
>     | `-FunctionDecl 0x7f8be10614e0 <col:2, col:19> byVal 'void (struct
> Object)'
>     |   `-ParmVarDecl 0x7f8be1061420 <col:13, col:19> 'struct Object'
>     `-ExprWithCleanups 0x7f8be1061a98 <line:5:2, col:16> 'void'
>       `-CallExpr 0x7f8be1061980 <col:2, col:16> 'void'
>         |-ImplicitCastExpr 0x7f8be1061968 <col:2> 'void (*)(struct
> Object)' <FunctionToPointerDecay>
>         | `-DeclRefExpr 0x7f8be10615a0 <col:2> 'void (struct Object)'
> lvalue Function 0x7f8be10614e0 'byVal' 'void (struct Object)'
>         `-CXXBindTemporaryExpr 0x7f8be1061a78 <col:8, col:15> 'struct
> Object' (CXXTemporary 0x7f8be1061a70)
>           `-CXXConstructExpr 0x7f8be1061a38 <col:8, col:15> 'struct
> Object' 'void (const struct Object &) throw()' elidable
>             `-MaterializeTemporaryExpr 0x7f8be10619c8 <col:8, col:15>
> 'const struct Object' lvalue
>               `-ImplicitCastExpr 0x7f8be10619b0 <col:8, col:15> 'const
> struct Object' <NoOp>
>                 `-CXXBindTemporaryExpr 0x7f8be1061918 <col:8, col:15>
> 'struct Object' (CXXTemporary 0x7f8be1061910)
>                   `-CXXTemporaryObjectExpr 0x7f8be10618d0 <col:8, col:15>
> 'struct Object' 'void (void)' zeroing
>
> Here we get an extra CXXConstructExpr and CXXBindTemporaryExpr, which is
> copying the temporary into the argument. However, at the time we evaluate
> the CXXConstructExpr, we don't yet know if we're going to be able to inline
> the whole call. If we do, we need to make sure the VarRegion for the
> by-value parameter contains the constructed object...once we've built the
> stack frame it's in. We probably want to run the destructor within the
> inlined function as well. If we *don't* inline the function, we either
> need to run the destructor *outside* the function, or elide the
> copy-constructor altogether and just destroy the original temporary.
>
> Certainly you could say our representation of rvalue structs is what's
> causing the problem, or at least part of the problem. Certainly *knowing
> which region to destroy* is a big part of this. But we haven't thought
> about what it would take to solve the problem, or if changing the
> representation would actually help solve everything.
>
>
>
>>
>> The third problem is that we simply haven't put time into qualifying and
>> validating the temporary destructor logic. Marcin implemented it a long
>> time ago in the CFG, then Chandler and Doug made sure it was in good enough
>> condition to use for the analysis-based warnings, but we haven't actually
>> tested it in the analyzer.
>>
>
>> Now, all of that said, you're only interested in making 'noreturn' work
>> right now, so for that it seems reasonable to treat the destructors for
>> temporary objects as opaque in the analyzer. It may turn out that we were
>> properly conservative in everything we've done so far that turning it on
>> will just work, but I'd like to see a fair number of test cases before we
>> start shipping that.
>>
>> 'noreturn' is actually the simplest thing here: that's just a change in
>> the shape of the CFG, not necessarily in its use. It's all the other
>> destructors that come along for the ride that worry me. (And the setup of
>> the call to the noreturn destructor as well.)
>>
>> What do you think?
>> Jordan
>>
>> P.S. I asked "what do you think?", but I'm going to be on vacation for
>> the next week, so please don't expect an immediate response. Sorry about
>> the time for *this* response.
>>
>>
> After reading this email, I took a look if it is possible to make the
> analyzer not inline temporary destructors. But I couldn't find a _nice_ way
> to do that. At the point where the inlining decision is made, I don't have
> access to the type of the destructor call. Obviously, I could  push
> information through somehow, but instead I decided to take another look at
> what happens when I actually do inline the destructors, since it already
> seems to be mostly working. I will try to fix the two issues I mentioned
> above (which means the CFG will get more testing and validation) and then
> you can decide whether the code is stable enough to be switched on. If I
> fail at that, I can always make the destructors opaque, as you suggested.
>
>
> I would cheat a lot by asking if the destructor's region is a
> CXXTempObjectRegion, or perhaps "stack memory space but not a VarRegion".
> ;-)
>
> On the other hand, your patch to NoReturnFunctionChecker is giving me more
> hope that it may be possible to solve the noreturn problem without fully
> implementing this feature.
>
> Jordan
>

After a couple of days of plowing through the code I'm not much smarter,
only more convinced that this is really not easy, so I'm starting to agree
with you that we should just not inline these calls (for now). I have tried
adding a inlining test like:
  if (const CXXDestructorCall *DC = dyn_cast<CXXDestructorCall>(&Call)) {
    if (isa<CXXTempObjectRegion>(DC->getCXXThisVal().getAsRegion())) return
false;
  }

but this also prevents inlining of lifetime-extended temporaries
destructors, even though they have an Implicit destructor call in the CFG.
Since they seem to work fine, it would be bad to regress here. I'm not sure
what you meant by "stack memory space but not a VarRegion". Would that
filter out cases like these? And it seems a bit hacky in any case. :)
I guess the best solution would be to actually store the destructor call
type in the CXXDestructorCall object (it already stores whether it is a
base destructor call anyway). Would you accept a patch like that?

regards,
Pavel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130702/850a3a90/attachment.html>