[PATCH] D65350: [DDG] Data Dependence Graph Basics

Wed Aug 21 09:02:08 PDT 2019

bmahjour added a comment.

In D65350#1636205 <https://reviews.llvm.org/D65350#1636205>, @fhahn wrote:

> > This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges. The idea behind the DependenceGraphBuilder and why we need it are summarized in https://ibm.ent.box.com/v/directed-graph-and-ddg.
>
> I think it would be good to summarize the information in-tree as well, to ensure the information is accessible later on as well. Some of the docs fit in the headers, for some of it a new documentation page might be worth adding. Ideally it would include some info about  design decisions, the intended/example uses cases and how the DDG helps and the benefits over the existing infrastructure.

Sure I can create a page or two of documentation, however I'm not very familiar with the doc infrastructure in LLVM. Could you point me to some examples to follow? Would an rts file under llvm/docs/DDG be sufficient? Are they rendered by any tool and if so how can I test it?

> A few additional questions:
> 
> I am not sure I see the direct benefit of duplicating the def-use edges in the DDG? Given a User, we already have access to its uses throughout the User itself and LLVM tries hard to maintain that relation very efficiently.

The def-use dependencies are important as they carry scalar dependencies but it is possible to only consider memory access instructions in the DDG and only follow def-use edges during graph construction to establish reaching defs from `load`-like instructions to `store`-like instructions. However doing that reduces the generality of DDG as transformations would need to do extra work during codegen to pull in all required instructions. If the def-use edges are explicitly represented in the graph, then codegen is simplified because a topological sort of the graph fully represents the whole program. Please note that, the DDG can potentially be used in many different transformations. Many of those transformations, such as instruction scheduling, care about all instructions (not just memory access instructions), and would not benefit from using this implementation of a DDG if a minimalistic approach is to be taken.

I actually implemented a prototype where I did the "minimal" implementation only considering memory instructions. I measured the difference in compile-time for a number of benchmarks with and without this approach, and I only noticed a small improvement. From what I observed and the feedback from several people, the gain is too small to justify the loss of generality and convenience of a full DDG.

> IIUC the plan is to build the DDG up front and then check the legality of a transformation on top of it. Currently, most passes bail out early once they detect a transformation cannot be applied and this helps to limit compile time. Could we do something similar with checks dependent on the DDG?

The DDG would certainly help analyze legality of various transformations. It can go beyond answering the question of "whether a transformation is legal or not". It can actually help determine how to transform the code so that it preserves original program dependencies, a good example of this is loop distribution and loop vectorization. Other transformations, which only care about existence of a certain data dependency pattern, can use the DDG as well, but they would have to consider the benefits versus the compile-time cost of building it.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65350/new/

https://reviews.llvm.org/D65350