[PATCH] D35918: [GVNHoist] Factor out reachability to search for anticipable instructions quickly

Tue Aug 1 09:04:10 PDT 2017

hiraditya added a subscriber: kuhar.
hiraditya added a comment.

> Such a mapping is by definition n^2 space, and I'm having trouble seeing why it is necessary.
> 
> Here is what you are doing:
>  For each instruction with the same VN:
> 
>   find the post-dominance frontier of the block of the instruction
>   Insert a chi there, with certain arguments.
>    
> 
> This is unnecessarily wasteful (you may compute the same PDF again and again).
> 
> Here is what SSUPRE (and you) should do:
> 
> Collect all the blocks of the instructions with the same VN into defining blocks.
>  Compute the PDF using IDFCalculator.
>  Place empty chis in the PDF.
> 
> At this point, you have two options:
> 
>   Walk post-dominator tree top-down and use a stack to store the last value you see.
>   When you hit a chi from a given edge, the value to use as the argument is at the top of the stack.
>   
> 
> 
> This is O(Basic Blocks)

I tried this based on your suggestion, but the post-dominator tree does not work well with infinite loops or CFGs with multiple exits.
I can wait for @kuhar's patch to be merged and stabilized, and then work on this idea.
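
For reference, here is a minimal sketch of the top-down walk as I understand it, written against a toy CFG rather than LLVM's data structures. `ToyBlock`, `fillChiArgs`, `ChiArgs` and `kNullOperand` are hypothetical names used only for illustration. Blocks are plain indices, a single value number is assumed, and the post-dominator tree and chi placement are assumed to be already computed and correct.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical toy CFG node; not an LLVM type.
struct ToyBlock {
  std::vector<int> preds;         // CFG predecessors (block indices)
  std::vector<int> pdomChildren;  // children in the post-dominator tree
  bool hasOccurrence = false;     // contains an instruction with the VN
  bool hasChi = false;            // block is in the PDF, so it holds a chi
};

constexpr int kNullOperand = -1;

// Key: (chi block, CFG successor reached by the edge).
// Value: block of the occurrence seen along that edge, or kNullOperand.
using ChiArgs = std::map<std::pair<int, int>, int>;

// Walk the post-dominator tree top-down from `node`, keeping the last
// occurrence seen on a stack.
void fillChiArgs(const std::vector<ToyBlock> &cfg, int node,
                 std::vector<int> &stack, ChiArgs &args) {
  bool pushed = false;
  if (cfg[node].hasOccurrence) {
    stack.push_back(node);
    pushed = true;
  }
  // Every chi in a CFG predecessor has an edge into this block; its
  // operand for that edge is whatever is on top of the stack right now.
  for (int pred : cfg[node].preds)
    if (cfg[pred].hasChi)
      args[{pred, node}] = stack.empty() ? kNullOperand : stack.back();
  for (int child : cfg[node].pdomChildren)
    fillChiArgs(cfg, child, stack, args);
  if (pushed)
    stack.pop_back();  // leaving the subtree this occurrence post-dominates
}
```

On a diamond CFG with the occurrence in only one arm, the chi in the entry block ends up with one real operand and one null operand, so the value is not anticipable at the entry. Each block is visited once, which matches the O(Basic Blocks) bound for the tree walk (the predecessor loop adds the CFG edges).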

> The O(instructions+chis) way to do it is:
> 
> Make a vector of instructions and chi argument uses.  Each should be given the DFS in/out numbers from the post-dominator tree node of the basic block it comes from and, in the case of instructions, a local DFS number (i.e., order in block).
>  A "chi argument use" is created for each incoming edge to the chi, but is empty/fake.  These should assume the basic block on the other side of the edge (i.e., not the chi block, but the block at the other end of the edge to the chi block).
>  Sort by DFS in/out, then local number.
> 
>   Walk vector with a stack.
>   At each element of vector:
>     while (stack is not empty && DFS in/out of current element is not inside DFS in/out of top of stack)
>       pop stack
>     if element you are looking at is a chi use:
>       if stack is empty, chi has null operand
>       if stack is not empty, set chi argument for the edge to top of stack
>     else: // must be an instruction
>       if stack is empty, push onto stack
>       if stack is not empty, the thing on the stack post-dominates you and you are redundant :)
>   
>   
> 
> 
> The forwards version of this algorithm is used by predicateinfo to do SSA renaming.
>  Your algorithm is the same on the reverse graph, except the chi arguments are virtual :)

Because this patch precomputes the ANTIC points based on your suggestions, it is already faster than the previous fix, which iterated over the dominator tree for each instruction to be hoisted.
If you think this patch is good to go, can we merge it? I'll then work on this idea once the post-dominator patch by @kuhar is ready.
Thank you for writing out the algorithm here; it helped me realize how close it is to PHI insertion.
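
To check my understanding of the O(instructions+chis) version, here is a minimal sketch of the sorted walk with a stack, following the quoted pseudocode. `Entry`, `WalkResult`, `walkSorted` and `kNoValue` are hypothetical names, not existing LLVM APIs. The DFS in/out numbers are assumed to be supplied by the caller from the post-dominator tree, and giving chi uses a large local number so that they sort after the instructions of their block is my own assumption.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

constexpr int kNoValue = -1;

// One element of the vector: an instruction or a (fake) chi argument use.
struct Entry {
  int dfsIn;      // DFS-in number of the owning block's post-dom tree node
  int dfsOut;     // DFS-out number of the same node
  int local;      // order within the block; chi uses get a large value
  bool isChiUse;  // true for a chi argument use, false for an instruction
  int id;         // instruction id, or chi-use id
};

struct WalkResult {
  std::vector<int> chiArg;       // per chi use: instruction id or kNoValue
  std::vector<int> redundantTo;  // per instruction: post-dominating
                                 // instruction id, or kNoValue
};

WalkResult walkSorted(std::vector<Entry> entries, int numChiUses,
                      int numInsts) {
  // Sort by DFS in/out, then by local number.
  std::sort(entries.begin(), entries.end(),
            [](const Entry &a, const Entry &b) {
              return std::tie(a.dfsIn, a.dfsOut, a.local) <
                     std::tie(b.dfsIn, b.dfsOut, b.local);
            });
  WalkResult r{std::vector<int>(numChiUses, kNoValue),
               std::vector<int>(numInsts, kNoValue)};
  std::vector<Entry> stack;
  for (const Entry &e : entries) {
    // Pop until the top of the stack's DFS interval contains the element.
    while (!stack.empty() && !(stack.back().dfsIn <= e.dfsIn &&
                               e.dfsOut <= stack.back().dfsOut))
      stack.pop_back();
    if (e.isChiUse) {
      if (!stack.empty())
        r.chiArg[e.id] = stack.back().id;  // else: operand stays null
    } else if (stack.empty()) {
      stack.push_back(e);
    } else {
      r.redundantTo[e.id] = stack.back().id;  // post-dominated, redundant
    }
  }
  return r;
}
```

After the sort, the walk itself touches each element once and pushes each instruction at most once, so the cost is dominated by the sort over the instructions and chi uses for one value number.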

https://reviews.llvm.org/D35918