[PATCH] D88307: [DON'T MERGE] Jump-threading for finite state automata

Ehsan Amiri via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 16 04:17:30 PST 2020


amehsan added a comment.

Sorry for the delay. I will describe the algorithm in two steps. In step 1 (this comment) I describe the high-level structure and the main steps of the algorithm. In the next step I will go into the details that are missing here. I’d be glad to get some feedback and to know what people think about the approach.

The algorithm was developed with coremark in mind, but I think it is generalizable, and I will try to point out the generalization opportunities.

There are three main components in the algorithm.

1-	Choosing an SSA variable in the loop that is interesting (“the variable” hereafter).
2-	Proving that the branches that depend on the value of the variable offer an opportunity for jump threading.
3-	Deciding which blocks need to be replicated, how much code growth we will have, and which branches we can remove.
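
To make the target pattern concrete, here is a minimal sketch of the kind of loop these components look for. It is a hypothetical example written for illustration, not code taken from coremark; the names `State` and `countNumbers` are assumptions. The variable `state` plays the role of “the variable”: the switch branches on it, and every path through the loop body either assigns it a constant or leaves it unchanged, so the switch target of the next iteration is known at the end of each path.

```cpp
#include <string>

// Hypothetical states, for illustration only.
enum class State { Start, InNumber, Invalid };

// Counts the space-separated runs of digits seen before any invalid character.
int countNumbers(const std::string &input) {
  State state = State::Start;  // "the variable"
  int numbers = 0;
  for (char c : input) {
    const bool isDigit = c >= '0' && c <= '9';
    switch (state) {  // the branch that depends on "the variable"
    case State::Start:
      if (isDigit) {
        state = State::InNumber;  // constant chosen for the next iteration
        ++numbers;
      } else if (c != ' ') {
        state = State::Invalid;
      }
      break;
    case State::InNumber:
      if (c == ' ')
        state = State::Start;
      else if (!isDigit)
        state = State::Invalid;
      break;
    case State::Invalid:
      break;  // state is unchanged, so it is still the constant Invalid
    }
  }
  return numbers;
}
```

After the assignment `state = State::InNumber`, for example, the next iteration is guaranteed to take the `InNumber` case, so that path could jump directly to a copy of that case instead of going through the switch again.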

Parts (1) and (2) are the easier ones, as I describe below. Part (3) has some details that I will describe in a subsequent comment.

For (1) we can rely on some heuristics. One current heuristic is to look inside the loop for switch statements with certain properties. For example, if we focus on the CFG subgraph induced by the BBs inside the loop, then the switch statement dominates or post-dominates every BB (note that this is with respect to the subgraph, not the whole CFG). There is probably a lot of room for experimenting with different heuristics here and for generalizing the algorithm.
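
The dominance property in this heuristic can be sketched on a toy graph as follows. This is a standalone illustration under stated assumptions, not the code in the patch: blocks are plain integer ids, the back edge is dropped so that the loop subgraph has the header as its entry and the latch as its exit, and every block is assumed to be reachable from the header and to reach the latch. The names `Graph`, `reachableAvoiding`, `reversed`, and `switchCoversLoop` are hypothetical. It uses the fact that a block S dominates B exactly when B cannot be reached from the entry once S is removed, and symmetrically for post-dominance on the reversed graph.

```cpp
#include <cstddef>
#include <vector>

// Adjacency list over block ids 0..n-1 (loop subgraph, back edge removed).
using Graph = std::vector<std::vector<int>>;

// Marks every block reachable from `start` without passing through `removed`.
std::vector<bool> reachableAvoiding(const Graph &g, int start, int removed) {
  std::vector<bool> seen(g.size(), false);
  if (start == removed)
    return seen;
  std::vector<int> worklist{start};
  seen[start] = true;
  while (!worklist.empty()) {
    int block = worklist.back();
    worklist.pop_back();
    for (int succ : g[block]) {
      if (succ != removed && !seen[succ]) {
        seen[succ] = true;
        worklist.push_back(succ);
      }
    }
  }
  return seen;
}

// Builds the graph with every edge reversed.
Graph reversed(const Graph &g) {
  Graph r(g.size());
  for (std::size_t b = 0; b < g.size(); ++b)
    for (int succ : g[b])
      r[succ].push_back(static_cast<int>(b));
  return r;
}

// True if block `sw` dominates or post-dominates every other block.
bool switchCoversLoop(const Graph &g, int header, int latch, int sw) {
  const std::vector<bool> fromHeader = reachableAvoiding(g, header, sw);
  const std::vector<bool> toLatch = reachableAvoiding(reversed(g), latch, sw);
  for (std::size_t b = 0; b < g.size(); ++b) {
    if (static_cast<int>(b) == sw)
      continue;
    const bool dominated = !fromHeader[b];   // header cannot reach b without sw
    const bool postDominated = !toLatch[b];  // b cannot reach latch without sw
    if (!dominated && !postDominated)
      return false;
  }
  return true;
}
```

For the graph `{{1}, {2, 3}, {4}, {4}, {}}` with header 0 and latch 4, block 1 satisfies the property (it post-dominates block 0 and dominates blocks 2, 3 and 4), while block 2 does not, because block 3 lies on a path that bypasses it. A real implementation would more likely query dominator and post-dominator trees than recompute reachability per candidate.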

(2) is fairly straightforward, although it also depends on (1). The main idea is that we start from the definition of “the variable”, follow use-def chains backward, and prove that in every iteration of the loop a constant value is chosen for this variable that will be used in the next iteration. Again, this condition works for coremark, but it should be generalizable.
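
The backward walk over use-def chains can be sketched on a toy value graph like this. It is an illustration under assumptions, not the implementation in the patch: `Value` is a hypothetical stand-in for an SSA value with only four kinds, where a `Phi` or `Select` lists the values it may forward and `Unknown` stands for anything the walk cannot see through. The names `allSourcesConstant` and `isStateVariableCandidate` are also hypothetical.

```cpp
#include <unordered_set>
#include <vector>

// Toy stand-in for an SSA value (hypothetical, for illustration only).
struct Value {
  enum Kind { Constant, Phi, Select, Unknown } kind;
  // Incoming values of a Phi, or the two value arms of a Select.
  std::vector<const Value *> operands;
};

// Follows use-def chains backward from `v`. Returns true if every chain
// ends in a constant. A value that is already being explored is treated
// optimistically, which is what lets the walk pass through the cycle
// formed by the loop-header phi.
bool allSourcesConstant(const Value *v,
                        std::unordered_set<const Value *> &visited) {
  if (!visited.insert(v).second)
    return true;  // already on the walk; contributes no new source
  switch (v->kind) {
  case Value::Constant:
    return true;
  case Value::Unknown:
    return false;
  case Value::Phi:
  case Value::Select:
    for (const Value *op : v->operands)
      if (!allSourcesConstant(op, visited))
        return false;
    return true;
  }
  return false;
}

// Entry point: start from the definition of "the variable".
bool isStateVariableCandidate(const Value *variable) {
  std::unordered_set<const Value *> visited;
  return allSourcesConstant(variable, visited);
}
```

A header phi whose incoming values are an initial constant and an inner phi over constants (plus the header phi itself, for paths that leave the variable unchanged) passes the check; a single incoming value of unknown origin makes it fail.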

I leave (3) for another comment, but the main idea is to enumerate all paths in the loop. Then, for each path, we identify the CFG edge over which the value of “the variable” for the next iteration is determined. For each BB, we make a different decision depending on whether it appears before or after this edge in the path.
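
As a rough sketch of the first half of (3), the path enumeration and the search for the determining edge could look like this. It rests on several assumptions: the loop body is modeled as an acyclic graph from header to latch (back edge removed), blocks are integer ids, and the set of edges on which the next value becomes known is simply given as input. The names `enumeratePaths` and `findDefiningEdge` are hypothetical. The per-block decision itself is not described in this comment, so the sketch stops at locating the edge.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // successors per block id
using Path = std::vector<int>;                // sequence of block ids
using Edge = std::pair<int, int>;             // (from, to)

// Depth-first extension of `current` until the latch is reached.
void extendPath(const Graph &g, int latch, Path &current,
                std::vector<Path> &out) {
  const int block = current.back();
  if (block == latch) {
    out.push_back(current);
    return;
  }
  for (int succ : g[block]) {
    current.push_back(succ);
    extendPath(g, latch, current, out);
    current.pop_back();
  }
}

// Enumerates every header-to-latch path of an acyclic loop body.
std::vector<Path> enumeratePaths(const Graph &g, int header, int latch) {
  std::vector<Path> out;
  Path current{header};
  extendPath(g, latch, current, out);
  return out;
}

// Returns the index i of the first edge path[i] -> path[i + 1] that is in
// `definingEdges`, or -1 if the path contains no such edge. Blocks at
// indices <= i lie before the edge, blocks at indices > i lie after it.
int findDefiningEdge(const Path &path, const std::set<Edge> &definingEdges) {
  for (std::size_t i = 0; i + 1 < path.size(); ++i)
    if (definingEdges.count(Edge{path[i], path[i + 1]}))
      return static_cast<int>(i);
  return -1;
}
```

Note that the number of paths can grow exponentially with the number of branches in the loop body, so a practical implementation would presumably need some bound on it.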

It would be great to get some feedback on this approach. I will provide the details of (3) soon.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88307/new/

https://reviews.llvm.org/D88307


