[PATCH] D152263: [clang][CFG] Add support for partitioning CFG into intervals.

Tue Jun 27 08:49:37 PDT 2023

ymandel marked 3 inline comments as done.
ymandel added inline comments.

================
Comment at: clang/lib/Analysis/IntervalPartition.cpp:26
+
+  std::queue<const CFGBlock *> Worklist;
+  for (const CFGBlock *S : Header.succs())
----------------
xazax.hun wrote:
> ymandel wrote:
> > xazax.hun wrote:
> > > Is it possible we end up adding the same node to the queue multiple times? Is that desirable or do we want to make sure we only have each node at most once?
> > Added an answer in comments, but repeating here in case you want to discuss further:
> > 
> >   // The worklist *may* contain duplicates. We guard against this possibility by
> >   // checking each popped element for completion (that is, presence in
> >   // `Partitioned`). We choose this approach over using a set because the queue
> >   // allows us to flexibly insert and delete elements while iterating through
> >   // the list. A set would require two separate phases for iterating and
> >   // mutation.
> I see, thanks! This addresses my concerns. I think in some cases we use a bitset with the blocks' ids to more efficiently track things like that.
Thanks. I experimented with that and added some use of bitsets here (`llvm::BitVector`). I didn't go all out though since the CFG doesn't provide a reverse mapping from block ID to block pointer, so I still used `SmallDenseSet` where we'll need the block pointers.

================
Comment at: clang/lib/Analysis/IntervalPartition.cpp:46-47
+
+    // Check whether all predecessors are in the interval, in which case `B`
+    // is included as well.
+    bool AllInInterval = true;
----------------
xazax.hun wrote:
> ymandel wrote:
> > xazax.hun wrote:
> > > I wonder if this approach is correct. Consider the following scenario:
> > > 
> > > ```
> > >    A
> > >   / \
> > >  B   C
> > >  |   |
> > >  |   D
> > >   \ /
> > >    E
> > > ```
> > > 
> > > In the BFS, we might visit: ABCED. Since we visit `E` before `D`, we might not recognize that `E` is part of the interval. Do I miss something?
> > > I wonder if this approach is correct. Consider the following scenario:
> > > 
> > > ```
> > >    A
> > >   / \
> > >  B   C
> > >  |   |
> > >  |   D
> > >   \ /
> > >    E
> > > ```
> > > 
> > > In the BFS, we might visit: ABCED. Since we visit `E` before `D`, we might not recognize that `E` is part of the interval. Do I miss something?
> > 
> > When we add `D` to the interval, we'll push `E` onto the queue again (lines 58-59). The second time that `E` is considered it will have both successors in the interval and will be added as well.
> Ah, I see, makse sense, thanks! This also makes me wonder, would this algorithm be more efficient if we used RPO (in case we can use that without recalculating the order)? :D But we do not need to benchmark this for the first version. 
I suspect it would be though I think that any traversal based ordering vs. random will still be pretty good, especially since the queue already pushes us towards breadth-first. I didn't use RPO because I figured that the cost of computing it would be greater than any potential benefit.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152263/new/

https://reviews.llvm.org/D152263