[PATCH] D125776: [llvm-dva] 01 - Interval tree

Pavel Samolysov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 18 04:59:19 PDT 2022


psamolysov added a comment.

I'm not familiar with the discussion of the class design. From the architecture point of view, the class can be divided into a builder that accumulates the intervals and build the tree and an immutable tree class that contains the query methods only. This design can be an overengineering but on the other hand the implemented algorithm (in the `find_iterator` for example) is not short and it deserves its own class.



================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:119
+//   IL: Intervals to the left (in ascending order by beginning point).
+//   IR: Intervals to the right (in ascending order by ending point).
+//
----------------
It is described as the intervals are stored above but I see in the example they are printed in the same order, for IR - in the descending order by ending point.


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:151
+// contain the intervals described by IL and IR. Each node will contain an
+// index into that vector (global bucket), to indicate the beginning of the
+// intervals assigned to the node.
----------------
Just a nit. Personally I didn't manage to get how the intervals actually stored in the node until I had a look at the source code. Maybe the "that vector (global bucket)" should be introduced a little bit before?


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:175
+//    Point: a target point to be tested for inclusion in any interval.
+//    Returns vector will all the intervals containing the target point.
+//
----------------



================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:208
+  IntervalData(PointType Left, PointType Right, ValueType Value)
+      : Left(Left), Right(Right), Value(Value) {}
+  virtual ~IntervalData() = default;
----------------



================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:224
+  /// This is Left <= Point <= Right for closed intervals.
+  bool contain(const PointType &Point) const {
+    return left(Point) && right(Point);
----------------
Maybe to use `contains` (with S) as the method's name? For example, the standard library contains classes with methods with this name: https://en.cppreference.com/w/cpp/string/basic_string_view/contains


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:256-258
+    PointType middle() { return MiddlePoint; }
+    unsigned start() { return BucketIntervalsStart; }
+    unsigned size() { return BucketIntervalsSize; }
----------------
Should these three methods be marked as `const`?


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:261
+    IntervalNode(PointType Point, unsigned Start)
+        : MiddlePoint(Point), BucketIntervalsStart(Start) {}
+
----------------
We have no idea what the type will be used for the `PointType`, theoretically it can be heavy, so `move` could make sense.


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:298
+  /// Print the interval list (left and right) for a given \a Node.
+  void printList(raw_ostream &OS, IntervalReferences &IntervalSet,
+                 unsigned Start, unsigned Size, bool HexFormat = true) {
----------------
This method looks as it could be `static`.


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:300
+                 unsigned Start, unsigned Size, bool HexFormat = true) {
+    const char *Format = HexFormat ? "[0x%08x,0x%08x] " : "[%2d,%2d] ";
+    if (Size) {
----------------



================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:315
+    const char *Format = HexFormat ? "MP:0x%08x " : "MP:%2d ";
+    auto printNodeData = [&](StringRef Text, IntervalReferences &IntervalSet) {
+      OS << format("%5d: ", Level);
----------------
Just a nit. By the coding convention all names of the variables should start from a capital letter, I'm not sure whether lambdas are exception here?


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:355
+    // The intervals in S_left and S_right are recursively divided in the
+    // same manner until there are no intervals left.
+
----------------
The word `left` has different meaning in the comment: as opposite to the right and `while there is at least one interval`. English is not my native language, and the first time I read the phrase as "there are no intervals in the left side". Could you rewrite the comment a bit?


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:362
+    int MiddleIndex = (PointsBeginIndex + PointsEndIndex) / 2;
+    PointType MiddlePoint = EndPoints[MiddleIndex];
+
----------------
To not copy the `PointType` of unknown size?


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:430
+  class find_iterator
+      : public std::iterator<std::forward_iterator_tag, DataType> {
+
----------------
`std::iterator` is deprecated in C++17, because the project will use C++17 soon it might make sense to not use a deprecated class for the new code.


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:521
+        : AscendingBuckets(Left), DescendingBuckets(Right), Node(Node),
+          Point(Point), Index(0) {
+      initNode();
----------------



================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:539
+    const DataType *operator->() const {
+      return (Point <= Node->middle())
+                 ? (*AscendingBuckets)[Node->start() + Index]
----------------
There is a code duplication between `operator->` and `operator*`. It may make sense to extract the body of the `operator->` into a private method (`current()`, for example) that returns a pointer and dereference this method's return value in the `operator*`.


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:580
+    IntervalsRight.clear();
+    EndPoints.clear();
+  }
----------------
Maybe the `References` vector also should be cleared?


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:592
+  IntervalReferences getContaining(PointType Point) const {
+    IntervalReferences IntervalSet;
+    for (find_iterator Iter = find(Point), E = find_end(); Iter != E; ++Iter)
----------------
Should all the query methods contain an assert that the tree has been actually created?


================
Comment at: llvm/include/llvm/ADT/IntervalTree.h:601
+  /// Descending: return the intervals with the biggest at the front.
+  void sortIntervals(IntervalReferences &IntervalSet, Sorting Sort) const {
+    std::sort(IntervalSet.begin(), IntervalSet.end(),
----------------
This method is for sorting already returned interval set, so this is just a helper and it looks like the method can be a static member of the class.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125776/new/

https://reviews.llvm.org/D125776



More information about the llvm-commits mailing list