[cfe-commits] [patch] Tracking simple arithmetic constraints (PR2695) (resubmitted)

Tue Jun 8 16:14:43 PDT 2010

Response inline!

On Jun 6, 2010, at 9:41 PM, Jordy Rose wrote:

> On Sat, 5 Jun 2010 23:43:42 -0700, Ted Kremenek <kremenek at apple.com>
> wrote:
>> On Jun 5, 2010, at 6:01 PM, Jordy Rose wrote:
>>> public:
>>>  enum OverflowKind {
>>>    NoOverflow = 0,
>>>    AdditionOverflow = 1,
>>>    AdditionUnderflow = -AdditionOverflow
>>>  };
>> 
>> Is there a reason we want to make AdditionUnderflow negative?  Why not
>> just make it 2?  From the algorithm in SimpleConstraintManager, this
> looks
>> intended.  If so, please comment.
> 
> Yes; the idea is that a negative overflow followed by a positive overflow
> is no overflow at all, and so the two should cancel out. This occurs in
> situations like these:
> 
> void doSomething (unsigned x, unsigned length) {
>  // pretend this is a branch being Assumed true
>  assert(length == 0);
>  --x;
>  if (x+1 <= length) {
>    // actually do work
>  }
> }
> 
> Here, the condition reduces to "(($x-1)+1) <= 0", which has an
> intermediate step of "($x-1) <= -1".

Okay, makes sense!

>> @@ -316,8 +304,18 @@
>>             assert(symIntExpr->getType(ValMgr.getContext()) ==
> resultTy);
>>             return ValMgr.makeNonLoc(symIntExpr->getLHS(), opc,
>>                                      symIntExpr->getRHS(), resultTy);
>> +          }
>>         }
>>       }
>> +      
>> +      // for now, only handle ($sym op constant) here as well
>> 
>> Minor nit: Please keep this comment in the LLVM coding style (make it a
>> sentence).  It's also probably worth pointing out what cases these
> handle,
>> and why Mul, Div, and friends are handled here.  Brief comments like
> this
>> are helpful, especially when the logic is broken up so recursively. 
> Also,
>> can we really handle 'Mul', 'Div', 'Shl', etc., in this case?  In
>> SimpleConstraintManager::canReasonAbout, we still return false for these
>> cases for SymIntExpr.
> 
> I wasn't reading the section quite right at first, but actually it seems
> correct. There are two things going on here:
> 
> - !(expr) is represented as (expr) == 0. For comparison ops, we can
> simplify the expression by negating the comparison and returning that. The
> original code only allowed this and returned UnknownVal otherwise,
> disallowing things like ($a*3 == 0)
> - The new code also allows "(expr) op constant". Previously, this would
> have been UnknownVal(). Here, it's a SymExprVal that probably still can't
> be evaluated. It's less efficent, but there's no harm done.

Okay, got it.

But don't we still have the problem that 'canReasonAbout' is not accurate? (e.g., for the opcodes not handled)

> 
> It's worth noting that $sym*3 is not currently considered an UnknownVal;
> the case for a SymbolVal LHS and ConcreteInt RHS also just makes a
> SymExprVal.

Right.

> Do you think it's worth checking the expression type in the SymbolVal
> case? That would let the UnknownVal bubble up and probably save a bit of
> work later, but it'd be more stuff to keep in sync with the constraint
> manager.

Are you specifically referring to removing the following code?

      // Does the symbol simplify to a constant?  If so, "fold" the constant
      // by setting 'lhs' to a ConcreteInt and try again.
      if (Sym->getType(ValMgr.getContext())->isIntegerType())
        if (const llvm::APSInt *Constant = state->getSymVal(Sym)) {
          // The symbol evaluates to a constant. If necessary, promote the
          // folded constant (LHS) to the result type.

> 
> I'm going back to clean up this section. Wondering if I can actually push
> all this simplification into SimpleSValuator -- it's all arithmetic stuff
> anyway. Going to try tonight.

Yes; keeping this all in SimpleSValuator makes the most sense to me, and probably would be more efficient.

> 
> 
>> +      const llvm::APSInt *RHS = &SE->getRHS();
>> +      int TotalOverflow = 0;
>> 
>> Should TotalOverflow by an APInt, or a uint64_t?  Seems like this could
>> overflow itself in some cases.
> 
> TotalOverflow is not measuring a quantity, but the net number of
> overflows, where positive overflows cancel out negative overflows.
> Commented and renamed to NetOverflow in my local version.
> 
> John brought up an important point, though: "(x+1) <= 0" has a solution at
> x = UINT_MAX, since unsigned overflow is defined in C and C++, and people
> are used to signed wraparound as well.

Do they depend on it?  If not, and if it is truly undefined for most compilers, we should eventually check this case and issue a warning.

> Worse, "(x+1) <= 1" has two
> solutions: x = UINT_MAX and x = 0. How can we handle these wraparound
> cases?

This requires modeling a disjunction over the values of x.  We could extend the ConstraintManager API to allow one to specify such constraints without bifurcating the state.  RangeConstraintManager, for example, should be able to model this case.

> 
> Until that's resolved, I'm not sure it's a good idea to commit, though
> leaving it as is (pretending wraparound doesn't exist) might be the least
> troublesome mechanism for now.

What you have seems monotonically better than what we have now, and the general approach seems okay to me.  When we consider doing precise integer overflow checking, we will need to be more precise.  For now, just add a big note to the code documenting the caveats/assumptions about the semantics.

Our requirements with the static analyzer are different than the compiler.  For the compiler we need conservative semantics.  For the analyzer we would like that too, but can slide on that when it benefits us to find more bugs/less false positives.

Zhongxing: What do you think?