[cfe-dev] CFG support for temporary object desctructors

Jordan Rose jordan_rose at apple.com
Mon Jul 1 18:47:34 PDT 2013


On Jun 26, 2013, at 8:11 , Pavel Labath <labath at google.com> wrote:

> Hello,
> 
> thank you for the response. It makes things much clearer. However, I still have some questions.
> 
> 
> On 25 June 2013 04:03, Jordan Rose <jordan_rose at apple.com> wrote:
> Hi, Pavel. It's great to hear that you're interested in this. 
> 
> As I remember from last time I looked at this, there are three things holding us back from turning on support for temporary destructors in the analyzer. The first is that the analyzer is not wired up to handle destructor CFG nodes for arbitrary regions. I don't think this is a major blocking issue, but it might take a bit of threading through all the code.
> After turning on the desctructors i've found some unimplemented functions and asserts in the code which were stopping me. I have replaced them with some code, which I cargo-culted from functions dealing with other types of destructors.
>  
> And this actually seems to work fairly well. All my null pointer dereferences have disappeared. At this moment, I am having only two issues with this fix.
> 
> The first happens in patterns like:
>  const Object& foo() { return Object(); }
> Here, the constructed CFG looks like:
> 
> ConstructCXXExpr
> BindTemporary
> ImplicitCastExpr
> MaterializeTemporaryExpr
> Temporary destructor call
> 
> I believe here we should put an Implicit destructor call, given that the object is materialized, as it is done in the case of
> const Object &foo = Object();
> Is that correct? If so, I can make a patch for that. It shouldn't be too difficult.

I don't think so. "implicit" destructors are really more like "scoped" destructors: they fire at the end of the scope. In this case, the temporary is materialized, but there is still no lifetime extension—it's still destroyed at the end of the full-expression. (Which makes the code really really bad, of course, since you're returning a reference to stack memory. Do we not warn about this?)


> The second issue happens with code patterns like:
> Object *o;
> ...
> foo(&o);
> ...
> o->bar();
> 
> Here, I get a "dereferencing undefined pointer" warning on the last line, but it only happens for some values of "...". I am still investigating how is this related to the destructor problem and what exactly is going on there.

Without seeing the body of foo() or the contents of the various ..., I can't really offer any advice.


> 
> The second is a lot more difficult. When you pass temporaries to a function, what happens when they are destroyed?
> 
> void byRef(const Object &ref);
> void byVal(Object val);
> 
> byRef(Object())
> byVal(Object())
> 
> The destructor here has to run at the end of the full-expression containing the call (roughly, at the outermost expression). However, the function call may have changed some of the object's contents (even 'byRef', if the object has fields marked 'mutable'). We need to make sure that's reflected when the chosen destructor is inlined.
> 
> The inlining problem is compounded by the fact that we decay structs to a collection of values (nonloc::LazyCompoundVal) whenever we need to treat them as rvalues. This is usually the right thing to do, but has very confusing results for temporaries being copied in and out of functions. According to the standard, the copy constructor happens in the caller (and that's how it appears in the AST), but the region it's being copied into is based on a ParmVarDecl that's part of a StackFrameContext for inlining the function...which we may not decide to inline after all. Ignoring temporary destructors our current behavior is indistinguishable from the standard, but as soon as we start claiming to support temporary destructors we're going to hit this problem.
> I find this a bit strange. I can't say I know all the quirks of the c++, but to me the first case looks very similar to:
> { Object foo; byRef(foo); }
> You still need to construct the object, pass a reference to the function and then call the destructor. Obviously, the AST for the two fragments looks very differently, and we need to take care when parsing it to construct a CFG. However, once we already have a CFG, processing it should not depend (too much) on whether we are working with a temporary or a normal object. Specifically, I do not see why internal implementation details (LazyCompoundVal, et al.) should cause problems in one case but not the other. Even in case of normal objects, you still need to pass the reference to a function and then reflect those changes when you inline the destructor.
> 
> What am I missing?

The poor AST representation. For this code:

> struct Object { ~Object(); };
> 
> void test() {
> 	extern void byRef(const Object &ref);
> 	byRef(Object());
> }

Clang generates this AST:

`-FunctionDecl 0x7fb85b830b60 <line:3:1, line:6:1> test 'void (void)'
  `-CompoundStmt 0x7fb85b8619c0 <line:3:13, line:6:1>
    |-DeclStmt 0x7fb85b8615b8 <line:4:2, col:35>
    | `-FunctionDecl 0x7fb85b861510 <col:2, col:34> byRef 'void (const struct Object &)' extern
    |   `-ParmVarDecl 0x7fb85b861450 <col:20, col:34> 'const struct Object &'
    `-ExprWithCleanups 0x7fb85b8619a8 <line:5:2, col:16> 'void'
      `-CallExpr 0x7fb85b861940 <col:2, col:16> 'void'
        |-ImplicitCastExpr 0x7fb85b861928 <col:2> 'void (*)(const struct Object &)' <FunctionToPointerDecay>
        | `-DeclRefExpr 0x7fb85b8615d0 <col:2> 'void (const struct Object &)' lvalue Function 0x7fb85b861510 'byRef' 'void (const struct Object &)'
        `-MaterializeTemporaryExpr 0x7fb85b861988 <col:8, col:15> 'const struct Object' lvalue
          `-ImplicitCastExpr 0x7fb85b861970 <col:8, col:15> 'const struct Object' <NoOp>
            `-CXXBindTemporaryExpr 0x7fb85b8618d8 <col:8, col:15> 'struct Object' (CXXTemporary 0x7fb85b8618d0)
              `-CXXTemporaryObjectExpr 0x7fb85b861890 <col:8, col:15> 'struct Object' 'void (void)' zeroing

Notice how the CXXTemporaryObjectExpr is a prvalue (it's not marked "lvalue" or "xvalue"). In the analyzer, that means it's not representing a region but rather values that could be stored in a region. The MaterializeTemporaryExpr is then an lvalue, which will cause the analyzer to create a region to store the temporary value in.

That's actually all fine, as long as we destroy the newly-created region rather than what we originally got from the CXXTemporaryObjectExpr. But that's not what the CXXTemporary will lead us to do.

The byVal case is interesting in a different way:

`-FunctionDecl 0x7f8be1030b60 <line:3:1, line:6:1> test 'void (void)'
  `-CompoundStmt 0x7f8be1061ab0 <line:3:13, line:6:1>
    |-DeclStmt 0x7f8be1061588 <line:4:2, col:20>
    | `-FunctionDecl 0x7f8be10614e0 <col:2, col:19> byVal 'void (struct Object)'
    |   `-ParmVarDecl 0x7f8be1061420 <col:13, col:19> 'struct Object'
    `-ExprWithCleanups 0x7f8be1061a98 <line:5:2, col:16> 'void'
      `-CallExpr 0x7f8be1061980 <col:2, col:16> 'void'
        |-ImplicitCastExpr 0x7f8be1061968 <col:2> 'void (*)(struct Object)' <FunctionToPointerDecay>
        | `-DeclRefExpr 0x7f8be10615a0 <col:2> 'void (struct Object)' lvalue Function 0x7f8be10614e0 'byVal' 'void (struct Object)'
        `-CXXBindTemporaryExpr 0x7f8be1061a78 <col:8, col:15> 'struct Object' (CXXTemporary 0x7f8be1061a70)
          `-CXXConstructExpr 0x7f8be1061a38 <col:8, col:15> 'struct Object' 'void (const struct Object &) throw()' elidable
            `-MaterializeTemporaryExpr 0x7f8be10619c8 <col:8, col:15> 'const struct Object' lvalue
              `-ImplicitCastExpr 0x7f8be10619b0 <col:8, col:15> 'const struct Object' <NoOp>
                `-CXXBindTemporaryExpr 0x7f8be1061918 <col:8, col:15> 'struct Object' (CXXTemporary 0x7f8be1061910)
                  `-CXXTemporaryObjectExpr 0x7f8be10618d0 <col:8, col:15> 'struct Object' 'void (void)' zeroing

Here we get an extra CXXConstructExpr and CXXBindTemporaryExpr, which is copying the temporary into the argument. However, at the time we evaluate the CXXConstructExpr, we don't yet know if we're going to be able to inline the whole call. If we do, we need to make sure the VarRegion for the by-value parameter contains the constructed object...once we've built the stack frame it's in. We probably want to run the destructor within the inlined function as well. If we don't inline the function, we either need to run the destructor outside the function, or elide the copy-constructor altogether and just destroy the original temporary.

Certainly you could say our representation of rvalue structs is what's causing the problem, or at least part of the problem. Certainly knowing which region to destroy is a big part of this. But we haven't thought about what it would take to solve the problem, or if changing the representation would actually help solve everything.

>  
> 
> The third problem is that we simply haven't put time into qualifying and validating the temporary destructor logic. Marcin implemented it a long time ago in the CFG, then Chandler and Doug made sure it was in good enough condition to use for the analysis-based warnings, but we haven't actually tested it in the analyzer. 
> 
> Now, all of that said, you're only interested in making 'noreturn' work right now, so for that it seems reasonable to treat the destructors for temporary objects as opaque in the analyzer. It may turn out that we were properly conservative in everything we've done so far that turning it on will just work, but I'd like to see a fair number of test cases before we start shipping that.
> 
> 'noreturn' is actually the simplest thing here: that's just a change in the shape of the CFG, not necessarily in its use. It's all the other destructors that come along for the ride that worry me. (And the setup of the call to the noreturn destructor as well.)
> 
> What do you think?
> Jordan
> 
> P.S. I asked "what do you think?", but I'm going to be on vacation for the next week, so please don't expect an immediate response. Sorry about the time for this response.
> 
> 
> After reading this email, I took a look if it is possible to make the analyzer not inline temporary destructors. But I couldn't find a _nice_ way to do that. At the point where the inlining decision is made, I don't have access to the type of the destructor call. Obviously, I could  push information through somehow, but instead I decided to take another look at what happens when I actually do inline the destructors, since it already seems to be mostly working. I will try to fix the two issues I mentioned above (which means the CFG will get more testing and validation) and then you can decide whether the code is stable enough to be switched on. If I fail at that, I can always make the destructors opaque, as you suggested.

I would cheat a lot by asking if the destructor's region is a CXXTempObjectRegion, or perhaps "stack memory space but not a VarRegion". ;-)

On the other hand, your patch to NoReturnFunctionChecker is giving me more hope that it may be possible to solve the noreturn problem without fully implementing this feature.

Jordan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130701/b9f17371/attachment.html>


More information about the cfe-dev mailing list