[PATCH] D99797: [analyzer] Implemented RangeSet::Factory::unite function to handle intersections and adjacency
Valeriy Savchenko via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed May 12 04:21:12 PDT 2021
vsavchenko added inline comments.
================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:147
+ /// where N = size(LHS), M = size(RHS)
+ RangeSet unite(RangeSet Original, RangeSet RHS);
+ /// Create a new set by uniting given range set with the given range.
----------------
`LHS`
================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:247
RangeSet intersect(const ContainerType &LHS, const ContainerType &RHS);
+ /// NOTE: This function relies that all values in the containers are
+ /// persistent (created via BasicValueFactory::getValue). User shall
----------------
nit: "...on the fact that..."
================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:250
+ /// guarantee this.
+ ContainerType unite(const ContainerType &LHS, const ContainerType &RHS);
----------------
`ContainerType` is basically a mutable version of `RangeSet`, so there is only one reason to return it - you believe that the users might want to modify it after they called this `unite`. But as long as this `unite` is just a generalized version of user-facing `unites, it can totally return `RangeSet`.
================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:112
+RangeSet RangeSet::Factory::add(RangeSet LHS, RangeSet RHS) {
+ ContainerType Result;
+ std::merge(LHS.begin(), LHS.end(), RHS.begin(), RHS.end(),
----------------
Let's reserve some place here. Because `LHS` and `RHS` don't have intersections, the result always has `size(LHS) + size(RHS)` elements
================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:221
+ Result.append(B, E);
+
+ return Result;
----------------
Oof, I don't know about this algorithm. I mean it does its job. But IMO it lacks a good description of what are the invariants and what are the different situations we are looking for.
Aaaand you kind of re-check certain conditions multiple times. One example here is the check for `Min` and `Max`. Those situations are super rare, but we check for them on every single iteration. `std::min` and `std::max` are additional comparisons. As I mentioned before, constant factor is the key here and less comparisons we do is way more important than doing binary search at some point.
Just make a benchmark if you don't believe me (with google-benchmark, for example). The version with less comparisons will dominate one with more on `RangeSet` under 20 (and they'll be even smaller in practice).
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99797/new/
https://reviews.llvm.org/D99797
More information about the cfe-commits
mailing list