[PATCH] Fix crash when resolving branch conditions for temporary destructor condition blocks.

Mon May 5 10:38:30 PDT 2014

On May 5, 2014, at 5:55 , Manuel Klimek <klimek at google.com> wrote:

> On Thu, May 1, 2014 at 6:27 PM, Jordan Rose <jordan_rose at apple.com> wrote:
> 
> On Apr 30, 2014, at 10:23 , Manuel Klimek <klimek at google.com> wrote:
> 
>> On Wed, Apr 30, 2014 at 7:16 PM, Jordan Rose <jordan_rose at apple.com> wrote:
>> 
>> On Apr 30, 2014, at 1:35 , Manuel Klimek <klimek at google.com> wrote:
>> 
>>> On Wed, Apr 30, 2014 at 5:34 AM, Jordan Rose <jordan_rose at apple.com> wrote:
>>> What this is basically saying is that we will never be in a situation where the condition (or the rightmost branch in the condition) was not evaluated in the current CFG block. That seems a bit questionable in general but reasonable given the way the CFG is currently constructed.
>>> 
>>> Well, all I've been trying to do here is to a) document and b) encode in the assert the actual current CFG invariant the analyzer relies on for correctness.
>>> The problem is that if what is returned from ResolveCondition is *not* evaluated in the block, we have no guarantee that it was evaluated at all - but the correctness of all the path analyses relies on it being in the SVal cache of the state. Otherwise we will assume branches are reachable that are clearly unreachable because of which parts of the logical operator tree we *assumed* to be true (so we know the condition must be statically knowable in the current state).
>>> 
>>> I have a hard time putting that into words though, so any help for how to wordsmith this to be easier to understand would be highly appreciated...
>> 
>> "in the SVal cache of the state" is "in the Environment" using analyzer terminology (it's not really a cache, since it's not reproducible)
>> 
>> exactly :) I learned that today while doing more digging through the code...
>>  
>> , but since things regularly get cleaned out of the Environment there aren't that many cases where a condition would be present but wouldn't have been evaluated in the previous block.
>> 
>> I don't yet understand how exactly things get cleaned out of the environment (other than losing scope).
>> It seems to me like a lot of the warnings rely on branch conditions always being in the environment at the point of the terminator, otherwise it's easy to produce false positives in unreachable code paths.
> 
> The presence of bindings in the Environment is controlled by the LiveVariables analysis. Essentially, an expression is live from the point of its evaluation to the point it is consumed.
> 
> Can you elaborate on what "consumed" means in this context?

"conusmed" = "used to evaluate another statement". So in "foo(a(), b())", the expressions 'a()' and 'b()' (and 'foo') are consumed by the CallExpr, because their values must be used to evaluate the next statement. Until temporary destructors, an expression was never consumed more than once within a CFG. (It's close to "is a child of", but not quite the same. The LHS of the comma operator is not consumed by the full BinaryOperator expression, because the LHS's value is not needed to get the final result.)

>  
> Unfortunately, it can be a little more complicated than that sometimes...especially with LiveVariables being a flow-sensitive analysis but not a path-sensitive one.
> 
> What is the difference of flow vs. path sensitive analysis here?
>  
> (This is one of the reasons for the liveness bug that Pavel uncovered.)
> 
> I'm not sure yet where there was a liveness bug - every instance of bug I've seen was due to expression not having yet been evaluated (due to the logical operator optimization that ResolveCondition relies on). I've not seen a case yet where an expression was removed from the Environment after it had been in. Do you have an example for that?

A flow-sensitive algorithm is basically a graph search through the CFG: all possible paths through the function assuming that all branches are feasible. (Or nearly all. Our CFG builder prunes out trivially unreachable branches, like "if (false) { ... }" or "do { ... } while (false);".) A path-sensitive algorithm, on the other hand, attempts to only consider valid paths through the CFG; if you see the same condition repeated twice, the results will be consistent.

The issue in PR18159 illustrates this difference for liveness. Consider the expression "A || (B || C)". A fairly simple liveness algorithm would say that if you're trying to get the value of the outer || expression, both "A" and "B || C" need to be live. But that's not actually true at runtime—if "A" is always true, "B || C" will never get evaluated. So saying that the value for "B || C" should be considered live is a valid flow-sensitive statement (because there exists a path where it is needed), but not a path-sensitive one (because it doesn't tell you on which paths it is needed, and it isn't "all of them"). That's not exactly a bug; it's just more conservative than we need to be.

I'm not going to explain this next bit well because it's been a long time since we investigated this, but the (current) temporary destructors branches attempt to keep "A", "B", and "C" live a bit longer than they otherwise would be (because we try to evaluate the "A || B || ..." branch more than once). When Pavel put this in, though, we eventually found that on paths where A was false, the value for "B" and "C" would stick around too long—all the way to the next iteration of a loop. Then, even when A was true, we'd look at the value for "B || C", which would of course be the value from last time through the loop.

Somewhere in there is a bug—does it make sense to use flow-sensitive liveness analysis for a path-sensitive algorithm? Should we have made sure to recursively kill (mark non-live) the leaf descendents of binary operators, which are treated specially, after the top-level operator has been evaluated? Should we kill everything that is syntactically within a loop (not good enough; doesn't handle goto)?

Or, should we not try to change how BinaryOperators are evaluated (i.e. revert Pavel's change), and continue on as before, and come up with another solution? That's what we ended up with.

For the actual question, "when are things removed from the Environment", it happens pretty consistently: roughly at the next full-statement, or at the end of a function (inlined or top-level). Things don't just stick around. But it shouldn't ever hit the cases you care about, since those are within a single expression and everything important has been marked live.

Anyway, I hope that helps somewhat.
Jordan

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140505/4aa8f333/attachment.html>