[cfe-commits] [patch] Tracking simple arithmetic constraints (PR2695) (resubmitted)

Sat Jun 5 23:43:42 PDT 2010

On Jun 5, 2010, at 6:01 PM, Jordy Rose wrote:

> *ping*
> 
> Very basic support for handling conditions involving addition and
> subtraction, such as this:


Jordy,

This is very cool stuff.  Comments inline:

--- include/clang/Checker/PathSensitive/BasicValueFactory.h	(revision 104412)
+++ include/clang/Checker/PathSensitive/BasicValueFactory.h	(working copy)
@@ -76,6 +76,12 @@
   llvm::FoldingSet<LazyCompoundValData> LazyCompoundValDataSet;
 
 public:
+  enum OverflowKind {
+    NoOverflow = 0,
+    AdditionOverflow = 1,
+    AdditionUnderflow = -AdditionOverflow
+  };


Is there a reason we want to make AdditionUnderflow negative?  Why not just make it 2?  From the algorithm in SimpleConstraintManager, this looks intended.  If so, please comment.

I am also not certain if this even needs to be in BasicValueFactory.  It seems specific to SimpleConstraintManager, and the places where the underflow/overflow values are used are only for temporary APSInt values.  BasicValueFactory is meant to created values that are permanently resident in a GRState.



+
   BasicValueFactory(ASTContext& ctx, llvm::BumpPtrAllocator& Alloc)
   : Ctx(ctx), BPAlloc(Alloc), PersistentSVals(0), PersistentSValPairs(0),
     SValListFactory(Alloc) {}
@@ -181,7 +187,8 @@
 
   const llvm::APSInt* EvaluateAPSInt(BinaryOperator::Opcode Op,
                                      const llvm::APSInt& V1,
-                                     const llvm::APSInt& V2);
+                                     const llvm::APSInt& V2,
+                                     OverflowKind* OverflowDirection = NULL);
 
   const std::pair<SVal, uintptr_t>&
   getPersistentSValWithData(const SVal& V, uintptr_t Data);


Please add comments indicating what the 'OverflowDirection' argument does.


===================================================================
--- lib/Checker/SimpleSValuator.cpp	(revision 104412)
+++ lib/Checker/SimpleSValuator.cpp	(working copy)
@@ -261,17 +261,17 @@
       }
     }
     case nonloc::SymExprValKind: {
-      // Logical not?
-      if (!(op == BinaryOperator::EQ && rhs.isZeroConstant()))
-        return UnknownVal();
-
       const SymExpr *symExpr =
         cast<nonloc::SymExprVal>(lhs).getSymbolicExpression();
-
-      // Only handle ($sym op constant) for now.
-      if (const SymIntExpr *symIntExpr = dyn_cast<SymIntExpr>(symExpr)) {
-        BinaryOperator::Opcode opc = symIntExpr->getOpcode();
-        switch (opc) {
+        
+      if (op == BinaryOperator::EQ && rhs.isZeroConstant()) {
+        // Logical not. We can special case some of this

This comment looks stale.  Please remove it.

+        
+        // Only handle ($sym op constant) for now.
+        if (const SymIntExpr *symIntExpr = dyn_cast<SymIntExpr>(symExpr)) {
+          BinaryOperator::Opcode opc = symIntExpr->getOpcode();
+          switch (opc) {
+          default: break; // and try to handle it the normal way

Minor nit: Please keep your comment style consistent with the LLVM coding style.  Please try to have sentences that start capitalized and end with a period.  I'd also prefer that the break appear on the following line with the comment above it, but that's minor.  We tend to prefer comments above statements then on the side of them.

           case BinaryOperator::LAnd:
           case BinaryOperator::LOr:
             assert(false && "Logical operators handled by branching logic.");
@@ -294,18 +294,6 @@
           case BinaryOperator::PtrMemI:
             assert(false && "Pointer arithmetic not handled here.");
             return UnknownVal();
-          case BinaryOperator::Mul:
-          case BinaryOperator::Div:
-          case BinaryOperator::Rem:
-          case BinaryOperator::Add:
-          case BinaryOperator::Sub:
-          case BinaryOperator::Shl:
-          case BinaryOperator::Shr:
-          case BinaryOperator::And:
-          case BinaryOperator::Xor:
-          case BinaryOperator::Or:
-            // Not handled yet.
-            return UnknownVal();
           case BinaryOperator::LT:
           case BinaryOperator::GT:
           case BinaryOperator::LE:
@@ -316,8 +304,18 @@
             assert(symIntExpr->getType(ValMgr.getContext()) == resultTy);
             return ValMgr.makeNonLoc(symIntExpr->getLHS(), opc,
                                      symIntExpr->getRHS(), resultTy);
+          }
         }
       }
+      
+      // for now, only handle ($sym op constant) here as well

Minor nit: Please keep this comment in the LLVM coding style (make it a sentence).  It's also probably worth pointing out what cases these handle, and why Mul, Div, and friends are handled here.  Brief comments like this are helpful, especially when the logic is broken up so recursively.  Also, can we really handle 'Mul', 'Div', 'Shl', etc., in this case?  In SimpleConstraintManager::canReasonAbout, we still return false for these cases for SymIntExpr.

+      if (isa<nonloc::ConcreteInt>(rhs)) {
+        return ValMgr.makeNonLoc(symExpr, op,
+                                 cast<nonloc::ConcreteInt>(rhs).getValue(),
+                                 resultTy);
+      }
+      
+      return UnknownVal();
     }


Index: lib/Checker/SimpleConstraintManager.cpp
===================================================================
--- lib/Checker/SimpleConstraintManager.cpp	(revision 104412)
+++ lib/Checker/SimpleConstraintManager.cpp	(working copy)

+      const llvm::APSInt *RHS = &SE->getRHS();
+      int TotalOverflow = 0;

Should TotalOverflow by an APInt, or a uint64_t?  Seems like this could overflow itself in some cases.
+      
+      while (const SymIntExpr *LSE = dyn_cast<SymIntExpr>(LHS)) {

Please add a comment above this loop that explains the algorithm.  Basically, you are converting:

  (sym expr) op1 constant1 == constant2

into

  (sym expr) == constant (op1') constant2

and repeating until you can't do that anymore.

+        LHS = LSE->getLHS();
+        BinaryOperator::Opcode op = LSE->getOpcode();
+        
+        // TODO: wrap me in a procedure?
+        // FIXME: only handles plus and minus!

Nit: Please keep comment style consistent with LLVM comment style.

+        switch (op) {
+        default:
+          // can't reverse the expression, so give up and assume feasible

Nit: Please keep comment style consistent with LLVM comment style.

I am also not certain what "can't reverse the expression" means; it probably would be clearer with a meta-level comment above the loop that explained the algorithm.

+          return state;
+        case BinaryOperator::Add:
+          op = BinaryOperator::Sub;
+          break;
+        case BinaryOperator::Sub:
+          op = BinaryOperator::Add;
+          break;
+        }
 
+        BasicValueFactory::OverflowKind Overflow =
+          BasicValueFactory::NoOverflow;
+        const llvm::APSInt *newRHS = BasicVals.EvaluateAPSInt(op,
+                                                              *RHS,
+                                                              LSE->getRHS(),
+                                                              &Overflow);

I'm wondering if we really want to be using EvaluateAPSInt here at all.  The problem is that EvaluateAPSInt creates a new APSInt that goes into a FoldingSet.  That FoldingSet gets populated with every value that EvaluateAPSInt creates until the BasicValueFactory is destroyed.  Since this is an intermediate value, it's probably worth just doing the operation directly, and then using BasicValueFactory to get the final persistent APSInt value once you have finished folding values.  Then the overflow logic stays local to SimpleConstraintManager.

+        TotalOverflow += Overflow;
+        RHS = newRHS;
+      }
+      
+      // Overflow may have canceled out if we wrap back and forth
+      // Ex: (n + 1) - 1

Nit: Please keep comment style consistent with LLVM comment style.

+      if (TotalOverflow) {

Note that 'TotalOverflow' will be true if it is negative.  Is this intended?

+        switch (SE->getOpcode()) {
+        default:
+          return state;
+        case BinaryOperator::EQ:
+          return Assumption ? NULL : state;
+        case BinaryOperator::GE:
+        case BinaryOperator::GT:
+          return (Assumption ^ (TotalOverflow > 0)) ? state : NULL;
+        case BinaryOperator::LE:
+        case BinaryOperator::LT:
+          return (Assumption ^ (TotalOverflow < 0)) ? state : NULL;
+        }
+      }
+
+      const llvm::APSInt &finalRHS = BasicVals.Convert(LHSType, *RHS);
+      SymIntExpr SENew(LHS, SE->getOpcode(), finalRHS, SE->getType(Ctx));
+

The comment:

"      // FIXME: This is a hack.  It silently converts the RHS integer to be
      // of the same type as on the left side.  This should be removed once
      // we support truncation/extension of symbolic values.     "

belongs right above the call to Convert.  It accidentally was left above.


       return AssumeSymInt(state, Assumption, &SENew);
     }
 
@@ -186,9 +237,12 @@
                                                      bool Assumption,
                                                      const SymIntExpr *SE) {
 
+  // If we failed to symplify the expression, just assume feasible.
+  if (!isa<SymbolData>(SE->getLHS()))
+    return state;
 
-  // Here we assume that LHS is a symbol.  This is consistent with the
-  // rest of the constraint manager logic.
+  // At this point we can assume that LHS is a symbol.
+  // This is consistent with the rest of the constraint manager logic.
   SymbolRef Sym = cast<SymbolData>(SE->getLHS());
   const llvm::APSInt &Int = SE->getRHS();