[PATCH] D86465: [analyzer][solver] Redesign constraint ranges data structure

Valeriy Savchenko via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Aug 26 01:47:50 PDT 2020


vsavchenko added inline comments.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:110
+  iterator end() const { return Impl->end(); }
+  size_t size() const { return Impl->size(); }
+
----------------
steakhal wrote:
> I think it should be `std::size_t`.
Yeah, I didn't see a unified stand of the community on this matter.  It looks like both options are pretty widespread.  I mimicked `SmallVector.h`, but I can easily change it.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:118-123
+    /// Create a new set with all ranges from both LHS and RHS.
+    /// Possible intersections are not checked here.
+    ///
+    /// Complexity: O(N + M)
+    ///             where N = size(LHS), M = size(RHS)
+    RangeSet add(RangeSet LHS, RangeSet RHS);
----------------
steakhal wrote:
> What about `merge` or `combine`? I know that `union` is a keyword thus we can not use.
Maybe, but the real `union` function is not present.  And I think that in the original code it is `add` to hint that it's not really a union and nothing complex is done here.  Maybe something like `mergeUnchecked` would work.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:149-160
+    /// Intersect the given set with the closed range [Lower, Upper].
+    ///
+    /// Unlike the Range type, this range uses modular arithmetic, corresponding
+    /// to the common treatment of C integer overflow. Thus, if the Lower bound
+    /// is greater than the Upper bound, the range is taken to wrap around. This
+    /// is equivalent to taking the intersection with the two ranges [Min,
+    /// Upper] and [Lower, Max], or, alternatively, /removing/ all integers
----------------
steakhal wrote:
> I think an example would be awesome here.
Agree, I simply copy-pasted the comment, but it would be better to expand it a bit because it might be confusing.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:170-174
+    /// Delete the given point from the range set.
+    ///
+    /// Complexity: O(N)
+    ///             where N = size(From)
+    RangeSet deletePoint(const llvm::APSInt &Point, RangeSet From);
----------------
steakhal wrote:
> Why does this operation take `O(N)` time? Shouldn't be `O(logN)` instead?
We still create a copy of an old vector to maintain persistence, that's why it's `O(N)`.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:190-193
+    /// And it makes us to separate the range
+    /// like [MIN, N] to [MIN, MIN] U [-N, MAX].
+    /// For instance, whole range is {-128..127} and subrange is [-128,-126],
+    /// thus [-128,-127,-126,...] negates to [-128,...,126,127].
----------------
steakhal wrote:
> 
The same thing with this comment, I think I'll rework it even more


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:195-196
+    ///
+    /// Negate restores disrupted ranges on bounds,
+    /// e.g. [MIN, B] => [MIN, MIN] U [-B, MAX] => [MIN, B].
+    ///
----------------
steakhal wrote:
> 
+1


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:203
+  private:
+    RangeSet makePersistent(ContainerType &&From);
+    ContainerType *construct(ContainerType &&From);
----------------
steakhal wrote:
> As commented later, this function worth to be documented.
I didn't think that it's necessary because of the fact that it's `private`, but it's pretty important, so I'll do it.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:207-208
+
+    RangeSet intersect(const RangeSet::ContainerType &LHS,
+                       const RangeSet::ContainerType &RHS);
+
----------------
steakhal wrote:
> Why do we need to explicitly spell the  `RangeSet` here?
Good point!


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:210-211
+
+    // Many operations inlcude producing new APSInt values and that's why
+    // we need this factory.
+    BasicValueFactory &ValueFactory;
----------------
steakhal wrote:
> 
+1


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:213-215
+    // Allocator for all the created containers.
+    // Containers might own their own memory and that's why it is specific
+    // for the type, so it calls containter destructors upon deletion.
----------------
steakhal wrote:
> 
+1


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:225-228
+  RangeSet(const RangeSet &) = default;
+  RangeSet &operator=(const RangeSet &) = default;
+  RangeSet(RangeSet &&) = default;
+  RangeSet &operator=(RangeSet &&) = default;
----------------
steakhal wrote:
> If this is the Rule of 5, where is the defaulted destructor?
It is, and it is forgotten, good catch!


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:269
 
-public:
-  RangeSet Intersect(BasicValueFactory &BV, Factory &F, llvm::APSInt Lower,
-                     llvm::APSInt Upper) const;
-  RangeSet Intersect(BasicValueFactory &BV, Factory &F,
-                     const RangeSet &Other) const;
-  RangeSet Negate(BasicValueFactory &BV, Factory &F) const;
-  RangeSet Delete(BasicValueFactory &BV, Factory &F,
-                  const llvm::APSInt &Point) const;
+  bool operator==(const RangeSet &Other) const { return *Impl == *Other.Impl; }
 
----------------
steakhal wrote:
> If we implement equality, we should also support inequality.
Sure!


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:272-273
+private:
+  RangeSet(ContainerType *RawContainer) : Impl(RawContainer) {}
+  RangeSet(UnderlyingType Ptr) : Impl(Ptr) {}
 
----------------
steakhal wrote:
> Missing `explicit`.
More like missing `/* implicit */` because it is intentional


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:277
+  bool pin(llvm::APSInt &Point) const;
+  bool containsImpl(llvm::APSInt &Point) const;
+
----------------
steakhal wrote:
> If it's a //read-only// operation why don't we take by value? Just like `contains` does.
> If it has side-effects which modify the Point, why is that not documented here?
Good point.  Essentially it modifies the argument the same way `pin` functions do - by changing its type.  I'll add the comments


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:146-149
+RangeSet::ContainerType *RangeSet::Factory::construct(ContainerType &&From) {
+  void *Buffer = Arena.Allocate();
+  return new (Buffer) ContainerType(std::move(From));
+}
----------------
steakhal wrote:
> Off-topic: Why do the Allocate and the similar //allocator// functions return `T*` if there is NO living object there (yet).
> There will be, only if the placement constructor called...
> I highly agree with you using `void *` there.
100% agree.  It might be a source of mistakes where the type system doesn't help.


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:177-178
+
+  --It;
+  return It->Includes(Point);
+}
----------------
steakhal wrote:
> 
I don't really see the benefits of one way over another and as I said in another comment, I don't really like to pay for an extra copy with iterators as just a rule of thumb.


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:181-188
+bool RangeSet::pin(llvm::APSInt &Point) const {
+  APSIntType Type(getMinValue());
+  if (Type.testInRange(Point, true) != APSIntType::RTR_Within)
+    return false;
+
+  Type.apply(Point);
+  return true;
----------------
steakhal wrote:
> Previously we had an early `empty` check here.
> Why don't we have that now?
> Same question for the other overload.
I'll add assertions checking for non-emptiness.
Essentially, it is a private function and the callers ensure this precondition.


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:322-325
+  auto swap = [&First, &FirstEnd, &Second, &SecondEnd]() {
+    std::swap(First, Second);
+    std::swap(FirstEnd, SecondEnd);
+  };
----------------
steakhal wrote:
> It could definitely bear a longer name.
I think that there is nothing wrong in spelling out `std::swap` twice.
And do we have a naming convention for lambdas, is it a variable or a function from a naming perspective?


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:367-370
+      if (Second->To() > First->To())
+        // Here we make a decision to keep First as the "longer"
+        // range.
+        swap();
----------------
steakhal wrote:
> 
OK


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:375
+      //    ---- First      ]-------------------->
+      //    ---- Second ]--[  ++Second ---------->
+      //
----------------
steakhal wrote:
> It's unfortunate to refer to the next subrange by the `++Second`.
> Since `operator++` has a //side-effect// I would recommend the `Second+1` instead.
NP


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:384
+      Result.push_back(Range(IntersectionStart, Second->To()));
+      ++Second;
+      // ...and that the invariant will hold for a valid ++Second
----------------
steakhal wrote:
> It's probably a personal taste though.
> I think better to use the more verbose way incrementing an iterator if you are in the body of the loop.
> It catches more attention.
It basically makes it `Second++`, and even though for these iterators it doesn't really matter I prefer not to pay extra copy when I don't need it.


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:498
   os << "{ ";
   for (iterator i = begin(), e = end(); i != e; ++i) {
     if (isFirst)
----------------
steakhal wrote:
> Can we use range-based for loops?
We sure can, I wanted to change it but forgot


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86465/new/

https://reviews.llvm.org/D86465



More information about the cfe-commits mailing list