[PATCH] D99797: [analyzer] Implemented RangeSet::Factory::unite function to handle intersections and adjacency
Valeriy Savchenko via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Apr 8 04:06:49 PDT 2021
vsavchenko added a comment.
Well, that is a nice exercise for "two pointer" problems, but can we please talk about the actual use case for it?
================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:136
+
+RangeSet RangeSet::Factory::unite(RangeSet LHS, RangeSet RHS) {
+ if (LHS.isEmpty())
----------------
I'd prefer `merge`
================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:217
+
+ while (BIt1 || BIt2) {
+
----------------
This algorithm is indeed `O(N +M)`, good job!
But let me get picky again 😅
It looks like we do too many comparisons in this loop. As I said earlier, constant factor is the key here and while this iterator idea is good it is a tradeoff. There is a piece of knowledge that would've been obvious with regular iterators, now is a run-time unknown that we constantly need to check (`isTo` and `isFrom`).
What we are trying to achieve here is not harder than what we do in `intersect` and there it is way less comparisons. While iterating through sets we can keep a set of invariants, so we don't need to re-check something that we already know.
Here is a sketch for the algorithm:
```
{
assert(First != FirstEnd && Second != SecondEnd);
// We have && here because one of them reaches its end,
// we should not check for it again and simply add the rest
// of the other set.
while (true) {
// We need to find the beginning of the next range to add
// First should be the iterator which To is a candidate to be
// an and for the merged range.
if (First.From() > Second.From()) {
swap();
}
auto From = First.From();
// That's it, we found our start, and we need to go through
// other ranges and look for the end.
// After this point, First.From() shouldn'y be accessed.
while (true) {
// We can compress all the checks into just one.
// This essentially means that Second should not get merged with First.
if (Second.From() > First.To() + 1) {
break;
}
if (Second.From() > First.From()) {
// First is maintained as a candidate for the end.
swap();
}
// At this point we know that Second lives fully inside
// of the new range and we can skip it.
++Second;
// If we have nothing else in the second set...
if (Second == SecondEnd) {
// ...let's finish the current range first...
Result.emplace_back(From, First.To());
while (++First != FirstEnd) {
// ...and copy the rest of the ranges
Result.push_back(*First);
}
// The range is ready.
return makePersistent(Result);
}
};
// Second is outside of the range, and we can
// safely add a new range.
Result.emplace_back(From, First.To());
++First;
// First set can be over at this point and we should...
if (First == FirstEnd) {
// ...copy the rest if the second set's ranges.
while (Second != SecondEnd) {
Result.push_back(*(Second++));
}
// Nothing left to do.
return makePersistent(Result);
}
};
// No way for us to get here!
llvm_unreachable("...");
}
```
It's trickier in the way we end things than the intersection because we still need to add the rest of the other set.
================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:279-293
+ auto It = std::lower_bound(
+ B, E, Range(Point), [&Max, &One](const Range &OrigR, const Range &R) {
+ return OrigR.To() != Max && (OrigR.To() + One) < R.From();
+ });
+
+ if (It != E && It->Includes(Point))
+ return Original;
----------------
I don't see any practical reasons to keep this version over the more generic `RangeSet` + `RangeSet`
================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:321-335
+
+ const APSInt &F = (It == E) ? R.From() : std::min(R.From(), It->From());
+
+ // Find a right part of disjoint ranges from the new range.
+ It = std::lower_bound(
+ It, E, R, [&One, &Max](const Range &OrigR, const Range &R) {
+ return OrigR.From() == Max || (OrigR.From() - One) <= R.To();
----------------
Same here, it is just way more code to maintain.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99797/new/
https://reviews.llvm.org/D99797
More information about the cfe-commits
mailing list