[PATCH] D99797: [analyzer] Implemented RangeSet::Factory::unite function to handle intersections and adjacency

Valeriy Savchenko via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 12 04:21:12 PDT 2021


vsavchenko added inline comments.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:147
+    ///             where N = size(LHS), M = size(RHS)
+    RangeSet unite(RangeSet Original, RangeSet RHS);
+    /// Create a new set by uniting given range set with the given range.
----------------
`LHS`


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:247
     RangeSet intersect(const ContainerType &LHS, const ContainerType &RHS);
+    /// NOTE: This function relies that all values in the containers are
+    /// persistent (created via BasicValueFactory::getValue). User shall
----------------
nit: "...on the fact that..."


================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:250
+    /// guarantee this.
+    ContainerType unite(const ContainerType &LHS, const ContainerType &RHS);
 
----------------
`ContainerType` is basically a mutable version of `RangeSet`, so there is only one reason to return it: you believe that users might want to modify the result after calling this `unite`.  But as long as this `unite` is just a generalized version of the user-facing `unite`s, it can simply return `RangeSet`.


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:112
+RangeSet RangeSet::Factory::add(RangeSet LHS, RangeSet RHS) {
+  ContainerType Result;
+  std::merge(LHS.begin(), LHS.end(), RHS.begin(), RHS.end(),
----------------
Let's reserve some space here.  Because `LHS` and `RHS` don't have intersections, the result always has `size(LHS) + size(RHS)` elements.
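A minimal sketch of what I mean, outside of the analyzer.  `SimpleRange` and `addDisjoint` are hypothetical names, and `std::vector` merely stands in for the container used in the patch; the only point is the `reserve` call before `std::merge`.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a range: a closed interval [From, To].
struct SimpleRange {
  int From;
  int To;
};

// Merges two sorted sequences of ranges that are assumed not to intersect.
// Because nothing gets coalesced, the result has exactly
// LHS.size() + RHS.size() elements, so one up-front reservation suffices.
std::vector<SimpleRange> addDisjoint(const std::vector<SimpleRange> &LHS,
                                     const std::vector<SimpleRange> &RHS) {
  std::vector<SimpleRange> Result;
  Result.reserve(LHS.size() + RHS.size());
  std::merge(LHS.begin(), LHS.end(), RHS.begin(), RHS.end(),
             std::back_inserter(Result),
             [](const SimpleRange &A, const SimpleRange &B) {
               return A.From < B.From;
             });
  return Result;
}
```

With the reservation in place, `std::back_inserter` never triggers a reallocation while merging.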


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:221
+  Result.append(B, E);
+
+  return Result;
----------------
Oof, I don't know about this algorithm.  I mean, it does its job.  But IMO it lacks a good description of what the invariants are and which situations we are handling.
Also, you re-check certain conditions multiple times.  One example is the check for `Min` and `Max`.  Those situations are super rare, but we check for them on every single iteration.  `std::min` and `std::max` are additional comparisons.  As I mentioned before, the constant factor is the key here, and doing fewer comparisons is way more important than doing binary search at some point.
Just make a benchmark if you don't believe me (with google-benchmark, for example).  The version with fewer comparisons will dominate the one with more on `RangeSet`s under 20 elements (and they'll be even smaller in practice).
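To illustrate the kind of description I'm after, here is a rough standalone sketch.  It is not the algorithm from this patch and not a proposal for the final code: `Interval`, `IntervalSet` and `uniteSketch` are hypothetical names, plain `int64_t` bounds stand in for `APSInt`, and I have not benchmarked it.  It states the invariant up front and checks adjacency without a separate `Min`/`Max` test.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a range: a closed interval [From, To].
struct Interval {
  int64_t From;
  int64_t To;
};

using IntervalSet = std::vector<Interval>;

// Invariant for both inputs and for the output: intervals are sorted by From,
// pairwise disjoint, and not adjacent to one another.
IntervalSet uniteSketch(const IntervalSet &LHS, const IntervalSet &RHS) {
  IntervalSet Result;
  Result.reserve(LHS.size() + RHS.size()); // upper bound on the result size

  auto Append = [&Result](const Interval &Next) {
    if (!Result.empty()) {
      Interval &Last = Result.back();
      // Intervals arrive in non-decreasing order of From.
      assert(Next.From >= Last.From);
      // Overlap, or adjacency. The subtraction is evaluated only when
      // Next.From > Last.To, so Next.From is not the minimum value and
      // Next.From - 1 cannot overflow.
      if (Next.From <= Last.To || Next.From - 1 == Last.To) {
        if (Next.To > Last.To)
          Last.To = Next.To;
        return;
      }
    }
    Result.push_back(Next);
  };

  auto L = LHS.begin(), LE = LHS.end();
  auto R = RHS.begin(), RE = RHS.end();
  // Always take the interval with the smaller From next.
  while (L != LE && R != RE) {
    if (L->From <= R->From)
      Append(*L++);
    else
      Append(*R++);
  }
  for (; L != LE; ++L)
    Append(*L);
  for (; R != RE; ++R)
    Append(*R);
  return Result;
}
```

For example, uniting `{[1, 3], [10, 12]}` with `{[4, 5], [11, 20]}` gives `{[1, 5], [10, 20]}`: the first pair is adjacent and the second pair overlaps.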


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99797/new/

https://reviews.llvm.org/D99797


