[PATCH] [analyzer] Refactor conditional expression evaluating code

Sat Aug 17 22:39:33 PDT 2013

On Aug 16, 2013, at 1:19 AM, Pavel Labath <labath at google.com> wrote:

> Although it may be debatable, I actually find this implementation of EvaluateLogicalExpressions more sensible. My impression of the ProgramState structure is that it is supposed to contain everything needed for further program evaluation. The current implementation seems to bypass that, because it is looking at the exploded graph history. I agree that it is a sensitive part and understand your caution. If you have any further questions, I would be happy to answer them.
> 

Hi Pavel,

(I’m reviewing D1340.3.patch)

While I agree that your assessment of EvaluateLogicalExpressions is more sensible, I think it depends on an assumption that isn’t true (yet).

Specifically, I’m concerned about the following snippet in EvaluateLogicalExpression:

+  assert(!X.isUnknownOrUndef() && "Value should have already been computed.");
+  ProgramStateRef StTrue, StFalse;
+  llvm::tie(StTrue, StFalse) = State->assume(X.castAs<DefinedOrUnknownSVal>());
+
+  assert(!StTrue != !StFalse && "Value should be evaluate to true or false.”);

This snippet assumes that the left-hand expression of a ‘&&’ or ‘||’ evaluated to a fully-constrained value.  That is, the value cannot be both true and false.  This is an invalid assumption.

To be sure, earlier logic in your patch does handle the case of UnknownVal:

     // If the condition is still unknown, give up.
     if (X.isUnknownOrUndef()) {
-      builder.generateNode(PrevState, true, PredI);
-      builder.generateNode(PrevState, false, PredI);
+      SValBuilder &SVB = PrevState->getStateManager().getSValBuilder();
+
+      StTrue = PrevState->BindExpr(Condition, BldCtx.LC, SVB.makeTruthVal(true));
+      StFalse = PrevState->BindExpr(Condition, BldCtx.LC, SVB.makeTruthVal(false));
+
+      builder.generateNode(StTrue, true, PredI);
+      builder.generateNode(StFalse, false, PredI);
       continue;
     }

Thus you are explicitly recording the truth value in the state for UnknownVal.  However, UnknownVal is not the only under constrained value.  It is also possible to get an under constrained value if the constraint solver can’t model the constraint for a symbol.  In such cases it is possible for a symbol to be both true or false, simply because that distinction cannot be represented in ProgramState.  That scenario would cause the first assertion in the first code snippet to succeed (checking for UnknownVal), but the second assertion to fail.  This undermines the entire approach.

This case is actually handled by the current implementation:

-          // We can't constrain the value to 0 or 1.
-          // The best we can do is a cast.
-          X = getSValBuilder().evalCast(RHSVal, B->getType(), RHS->getType());

However, I still think it should be possible to make your approach work.  To do so, you’ll need to further modify ExprEngine::processBranch(), notably the following logic:

    // Process the true branch.
    if (builder.isFeasible(true)) {
      if (StTrue)
        builder.generateNode(StTrue, true, PredI);
      else
        builder.markInfeasible(true);
    }

    // Process the false branch.
    if (builder.isFeasible(false)) {
      if (StFalse)
        builder.generateNode(StFalse, false, PredI);
      else
        builder.markInfeasible(false);
    }

Just like you did for the UnknownVal case, you’ll need to record what branch you took, e.g.:

      if (StTrue) {
	StTrue = StTrue->BindExpr(Condition, BldCtx.LC, SVB.makeTruthVal(true));
        builder.generateNode(StTrue, true, PredI);
      }

If you do that, THEN you are guaranteed to have a fully constrained value.  Not only that, you are guaranteed to have an actual truth value that is 0 or 1, not just any random symbolic value.  You could then optimize your logic in EvaluateLogicalExpressions to not use ProgramStateRef::assume() but just look at the actual truth value.  The reason this is important is because ProgramStateRef::assume() is a query to the constraint solver.  That’s not guaranteed to be fast.  One advantage of crawling the path (which is a bit gross) is that it’s a fairly fast operation.  Querying the constraint solver just to determine which branch we took is not.  But just checking for 0 or 1 is VERY fast.

If your patch was modified to handle this extra case (thus making it correct), and it did not need to call ProgramStateRef::assume() (which is slow) then I think this would be a nice cleanup.  Moreover, by explicitly marking in the ProgramState what branch was taken that seems amendable for better modeling of cleanups of temporaries, etc. (per your original motivation for this change).

As for the rest of the patch, the logic in LiveVariables looks fine to me.

Ted
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130817/12129aa4/attachment.html>