[cfe-dev] CFRefCount Problem #2: Region Invalidation

Tue Aug 23 22:08:09 PDT 2011

On Aug 23, 2011, at 16:43, Ted Kremenek wrote:

> On Aug 22, 2011, at 9:58 PM, Jordy Rose wrote:
> 
>> One of those couplings, however, is that the /reference counts/ of arguments and message receivers should /not/ be invalidated on a call...but the reference counts of any /members/ should be. This is currently accomplished by use of a private shared "whitelist" of top-level symbols in a function call, set up by the transfer functions and then used by the RetainReleaseChecker. If the RetainReleaseChecker is to lose its privileged status, it won't be able to know which symbols should have their bindings preserved.
>> 
>> I'm pretty stuck on this one, but I came up with a few possibilities:
>> 
>> 1. Restore bindings in a post-call check by walking backwards in the graph to before the call's return value is set.
>> 
>> 2. Add the current statement to the checkRegionChanges callback, which would enable RetainReleaseChecker to skip arguments and receivers.
>> 
>> 3. Instead of using checkRegionChanges to handle bindings of objects passed by reference, just explicitly stop tracking anything passed by reference in a post-call check. Needs fleshing out for structs, C++ constructors, etc though.
>> 
>> I think the first way is the cleanest, but it's certainly extra work.
> 
> I think (1) goes against the spirit of what we want to accomplish.  The region invalidation logic is driven by core logic for handling the "unknown effects" of a function/method call.  That should be preserved unless we have a reason not to do so.  Tampering with the bindings is against the spirit of doing that.  I also don't think this is simple because region invalidation can tamper with the internal representation of a Store in many ways.  I also think invalidation of the values of the ivars/field of an object is separate from the invalidation of the reference count.
> 
> I'm not certain what (3) means.  checkRegionChanges() includes the transitive closure of everything that get's invalidated.  That's the whole point of this callback.

Yeah, never mind. It sounded better in my head and I didn't manage to think it through before writing the e-mail.

> I think (2) is reasonable, although not complete.  It allows checkers to see what values were passed as parameters/receivers, and do anything special that they'd like.  That seems like a very clean solution, and it leaves checkers in charge of deciding the policy they want to have.  However, it doesn't handle the following case:
> 
> ...
> Foo *x = ... // now tracking 'x'
> ...
> x-> fooRef = x;
> ...
> unknownFunction(x);
> 
> 
> In this example, 'Foo' is a recursive data structure.  [snip]
> 
> So, (2) is not enough.  In order to make (2) actionable, we will also need to keep a count of the number of times a region was invalidated, and report that in the callback.  If a region was invalidated once, and appears as parameter in the argument list, then we known unambiguously that the function argument was the sole reference to the object and thus we don't need to stop tracking it's reference count.  The logic on the client side for reasoning about this is simple.
> 
> That said, I think (2) *without* the invalidation counts is good enough to get started.

It's worth noting that the current implementation doesn't handle this either, so at least it won't be a regression.

> Is there a particular reason you were advocating (1)?

(1) is the least intrusive implementation, and it's an implementation that doesn't impose additional semantics on invalidateRegion(s). What I didn't realize is that invalidateRegions is /only/ used for parameters. That means that passing the top-level parameters isn't really changing any semantics. So I'm good with (2).

Is it best to pass a list of SVals? A list of top-level regions? A list of expressions? A list of symbols? A CallOrObjCMessage? The last is clearly most flexible, but I'm not sure that's what we need. I feel like the list of SVals (ArrayRef<SVal>) is probably the way to go.