[PATCH] D127874: [analyzer] Reimplement UnreachableCodeChecker using worklists

Thu Jun 16 09:49:04 PDT 2022

steakhal added a comment.

In D127874#3588786 <https://reviews.llvm.org/D127874#3588786>, @martong wrote:

>> If some checker placed a sink node into some block, the successors of the sink block was marked unreachable previously. This leads to an unpleasant situation from the user's point of view since a different bugreport gets correlated with the FPs caused by the sink node of that. This effectively means that for each unconditional sink, we will also have (preferable) an unreachable code report as well.
>
> An example and/or a visualization would be useful for this.

It's quite hard. I could add step numbers so that I could mark which block is being entered and exited the visitation in the recursion.

>> Now, we ignore unreachable nodes if the predecessor is reachable, but none of its successors are.
>
> Isn't there a contradiction here? The node's predecessor must have at least one reachable successor, the node itself; otherwise the node itself would be unreachable.

I tried to cheap out the mouthful definition. I'm not sure if it helps more than hinders this way.

>> the sweep-back part cannot be implemented correctly via a DFS visitation
>
> What is the purpose of this sweep-back? You should summarize that here.

Yea, done.

>> ... Do a backward DFS to find the start of each unreachable CFG block chain ...
>
> How does it find the //start// if there is a cycle in the graph? Isn't it rather about finding strongly connected components?

I left circle detection to the reader. Now, I'm explicitly mentioning the maintenance of the //visited// block set.

>> The sweep-back part cannot be implemented correctly via a DFS visitation ...
>
> Why do we need to do the sweep-back part in BFS order?

Due to the recursive visitation order, the basic blocks get inserted to the //reachable// blocks set sooner, than another unreachable block would get visited.
In the `magic_clamp` example the recursive (DFS) visitation starts from `B0`, enters `B1`, recurses further up to `B2` then to `B4` and `B4`. This is when the algorithm starts popping frames (backtracking).
It backtracks, backtracks ... back to `B1`, where it has another //predecessor// (`B3`), hence it recurses on that path as well. But this time, it will find that the only predecessor of `B3` is **reachable**, hence it will leave `B3` as unreachable. And this is the problem, because we should not report a dead code bug there. Consequently, we should not extend the //reachable// blocks set in DFS visitation order. However, if we were using BFS visitation order, we can never end up in this situation.

> Connected components can be found by both DFS or BFS. Please elaborate why should we prefer BFS? The problem you are trying to solve seems to be an unnecessarily convoluted problem to me. Shouldn't it be enough to simply pick one node (or all of them) from the connected component? It should not matter if we use DFS or BFS for finding the nodes in a connected component.

I don't think any of the issues mentioned in this patch relates to //strongly connected components//, thus I don't think I can answer to this question.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127874/new/

https://reviews.llvm.org/D127874