[cfe-dev] Inlining temporary destructors in the analyzer

Manuel Klimek klimek at google.com
Thu Jul 31 12:54:38 PDT 2014


On Thu, Jul 31, 2014 at 9:51 PM, Jordan Rose <jordan_rose at apple.com> wrote:

>
> On Jul 31, 2014, at 12:46 , Manuel Klimek <klimek at google.com> wrote:
>
> On Thu, Jul 31, 2014 at 9:36 PM, Jordan Rose <jordan_rose at apple.com>
> wrote:
>
>>
>> On Jul 31, 2014, at 1:20 , Manuel Klimek <klimek at google.com> wrote:
>>
>> On Thu, Jul 31, 2014 at 5:19 AM, Jordan Rose <jordan_rose at apple.com>
>> wrote:
>>
>>>
>>> On Jul 30, 2014, at 8:14 , Manuel Klimek <klimek at google.com> wrote:
>>>
>>> Hi Jordan,
>>>
>>> I made some progress on inlining temporary destructors. My main problem
>>> currently is that somehow the initializer values in my constructor get lost
>>> if I don't also reference 'this' via some other way.
>>>
>>> An examples shows that best. Consider:
>>> struct A {
>>>   A(int* y) : z(y) {}
>>>   ~A() { *z = 23; }
>>>   int *z;
>>> };
>>> void f() {
>>>   int x = 0;
>>>   {(A(&x));}
>>> }
>>>
>>> This leads to a "Dereference of undefined pointer value" warning on *z =
>>> 23.
>>> Funnily enough this warning goes away (and all values get correctly
>>> propagated) if change the constructor to be:
>>> A(int* y) : z(y) { (void)this; }
>>>
>>> Obviously this also works fine if I use a local variable instead of a
>>> temporary.
>>>
>>> Any ideas what might be going wrong here would be greatly appreciated.
>>>
>>>
>>> My guess is that the temporary is an rvalue, and in the analyzer a
>>> NonLoc, and that the only reason this even appears to work is because we
>>> needed to be able to call *methods* on temporaries and so will create a
>>> temporary region for them on demand to serve as the 'this' pointer.
>>> However, I don't think we expect that to happen for *constructors,* and
>>> so we don't keep the region around after the constructor has finished
>>> executing. (That is, when it's just a method being called—and neither the
>>> constructor nor destructor are inlined—it doesn't actually matter what you
>>> do to the contents of the region, because nothing else can access it.)
>>>
>>>
>> So, funnily enough if I remove this code:
>>   if (isTemporaryPRValue(CCE, ThisV))
>>   ThisV = state->getSVal(ThisV.castAs<Loc>());
>> which puts the temp region into a lazycompoundregion, the memory seems to
>> be correctly handled. I have not yet looked what else this breaks though.
>>
>>
>> Okay, analyzer design time. :-)
>>
>> C++ has a distinction between lvalues and rvalues (actually glvalues and
>> prvalues). There are a number of things that go into this distinction, but
>> the part the analyzer cares about is that glvalues refer to true objects
>> and prvalues do not. (In C++ parlance, "object" refers to anything with
>> storage, a type, and lifetime, not just class instances. In the analyzer,
>> we call such things "regions".)
>>
>> If you were to build a very literal C++ compiler, every glvalue would be
>> represented by an address. PRValues*, on the other hand, don't really need
>> addresses; they only exist to be used transiently.
>>
>
> I'm not sure how "exists transiently" equals "doesn't need an address". At
> least in C++, this seems incorrect (?) for objects.
>
>>
> IMHO it's a problem with the standard, and this extra invention of
> "modifiable" seems weird to me.
>
>
> ** How do you capitalize "prvalue"?*
>>
>> The analyzer pretty much always believes the compiler when it comes to
>> glvalues vs. prvalues. That means that the value of a glvalue expression is
>> *always* a Loc. A prvalue expression, on the other hand, will only be a
>> Loc if it's a pointer type*. If you're talking about a struct prvalue, what
>> you're really talking about is the bindings in its memory region—it has no
>> identity. This is represented by a CompoundVal or LazyCompoundVal. This
>> model worked great for C, but it's starting to show strain in C++.
>>
>> ** or one of the other types mentioned in Loc::isLocType. But not all of
>> those can be expression values.*
>>
>> Besides "regular" temporaries, this is also a problem for prvalue
>> function arguments. When the analyzer inlines a function, it just blithely
>> binds the arguments to the proper VarRegions created for each ParmVarDecl
>> in the current stack frame. That doesn't really fly if we're supposed to be
>> calling the copy constructor. Worse, the copy constructor call happens
>> *outside* of the function call, which means it happens before we've even
>> decided whether or not to inline, which means there are no VarRegions yet
>> to construct into. We've/I've been putting off dealing with this one for a
>> while; we get around it mostly because we *don't inline constructors for
>> temporaries.*
>>
>> The general temporary problem goes to what Richard was saying: a
>> temporary expression is an rvalue, but if we have to run a constructor then
>> clearly there *was* a region at some point. The C++11 standard (at least
>> the version I have) is not wonderful about classifying temporaries
>> ([basic.lval]p1):
>>
>> - An *lvalue*...designates a function or an object.
>>
>> - An *xvalue* (an “eXpiring” value) also refers to an object, usually
>> near the end of its lifetime.
>>
>> - A *glvalue* (“generalized” lvalue) is an lvalue or an xvalue.
>>
>> - An *rvalue...*is an xvalue, a temporary object (12.2) or subobject
>> thereof, or a value that is not associated with an object.
>>
>> - A *prvalue* (“pure” rvalue) is an rvalue that is not an xvalue.
>>
>>
>> So temporary objects are prvalues. Even though they are modifiable
>> (defined later in the same section). Even though with construction and
>> destruction they end up needing a fixed region, or at least consistency in
>> region contents.
>>
>> I don't have a great answer here. My guess is we'll probably end up
>> turning on *some* inlining for temporaries to start, but not all.
>>
>> Oh, right, so what's that 'isTemporaryPRValue' doing? That's taking the
>> few cases where the AST wants a prvalue but the analyzer really needs a
>> valid Loc, and conjuring a region on the fly. That's important for things
>> that need to have referential identity, like the 'this' pointer in a method
>> called on a temporary, or (say) a temporary that's been lifetime-extended.
>>
>
> Since this is right at the end of the constructor call, this always
> creates and binds the value if we hit the constructor. See my patch - it
> seems to work fine :) What am I missing?
>
>
> That breaks the invariant that a non-Loc-type prvalue is not a Loc. If you
> tried to assign this directly into another struct, you'd effectively be
> assigning an address instead of a set of bindings. IIRC there's some code
> in the analyzer that tries to handle this case anyway, so it may not break
> down immediately, but I'd rather not lose that assumption.
>

I'm confused - isn't that already the case currently? Have you looked at
http://reviews.llvm.org/D4740? It's very small, and I don't see where I
regress an assumption that's not already violated (if I remove the
isTemporaryPRValue section loads of tests break, so I assume changing the
layout of a temporary section will make significant changes to the analyzer
necessary).


>
> Jordan
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140731/a80456a8/attachment.html>


More information about the cfe-dev mailing list