[PATCH] D65350: [DDG] Data Dependence Graph Basics

Wed Aug 21 08:29:43 PDT 2019

bmahjour added inline comments.

================
Comment at: llvm/include/llvm/Analysis/DDG.h:101
+  /// Get the list of instructions in this node.
+  const InstructionList &getInstructions() const {
+    assert(!InstList.empty() && "Instruction List is empty.");
----------------
fhahn wrote:
> Unless there is a good reason, I think it would be better to return an ArrayRef/SmallVectorImpl instead of a concrete SmallVector here (and other places, same for arguments).
Ok, I'll change functions to work with SmallVectorImpl.

================
Comment at: llvm/include/llvm/Analysis/DDG.h:107
+    return const_cast<InstructionList &>(
+        static_cast<const SimpleDDGNode *>(this)->getInstructions());
+  }
----------------
Meinersbur wrote:
> [style] Why not
> ```
> return InstrList;
> ```
> instead of those casts?
> 
> Additionally, I don't see the harm of returning and empty instruction list. Especially if the function is used to add instructions to the node.
We could just return InstList, but since the const version is already doing this, we use this trick to avoid code duplication. It's common practice in C++ codes, and I see instances of it in LLVM as well (eg. `getInlineBuckets` in `DenseMap.h`) . 

Scott Meyers encourages this practice in his book "Effective C++" in section //Avoiding Duplication in const and Non-const Member Functions//. 

The assert is useful because we intentionally avoid constructing a `SimpleDDGNode` node with no instruction and none of the member functions allow removing instructions from the node. 

================
Comment at: llvm/include/llvm/Analysis/DDG.h:199
+  // queried it is recomputed using @DI.
+  const DependenceInfo DI;
+};
----------------
fhahn wrote:
> Why do we need a copy here? Wouldn't a reference be enough?
The reason is that the target of the reference may go out of scope, while the DDG (as an analysis result) lives on. For instance if you look at `DDGAnalysis::run`, an object of `DependenceInfo` is created inside the function which is used to construct the DDG. The function returns a unique_ptr to that DDG. The DDG lives on and needs to answer queries about the dependencies, while the `DependenceInfo` object is local to the function and gets destroyed upon return of the `run` function. 

================
Comment at: llvm/include/llvm/Analysis/DDG.h:233
+  DDGNode &createFineGrainedNode(Instruction &I) final override {
+    auto *SN = new SimpleDDGNode(I);
+    assert(SN && "Failed to allocate memory for simple DDG node.");
----------------
fhahn wrote:
> Do we need dynamic memory allocation here? Can the DDG just own the nodes?
> 
I don't see how DDG can be implemented without using dynamic memory allocation. Having the DDG own the nodes also creates problems for the builder design pattern, since the creation of the objects is meant to be delegated to the builder class.

================
Comment at: llvm/include/llvm/Analysis/DependenceGraphBuilder.h:25-26
+
+using BasicBlockList = SmallVector<BasicBlock *, 2>;
+using InstructionList = SmallVector<Instruction *, 8>;
+
----------------
Meinersbur wrote:
> [style] The names "BasicBlockList" and "InstructionList" a very general to be put into the entire `llvm::` namespace and could conflict with other headers that use lists of instructions and BBs. Maybe move them into one of the DDG classes?
> 
> Also the `Ty` suffix for typedefs would apply here.
Good point. I'll fix it.

================
Comment at: llvm/include/llvm/Analysis/DependenceGraphBuilder.h:26
+using BasicBlockList = SmallVector<BasicBlock *, 2>;
+using InstructionList = SmallVector<Instruction *, 8>;
+
----------------
fhahn wrote:
> The default seems a bit excessive. Do you expect most nodes to have multiple instructions?
> The default seems a bit excessive. Do you expect most nodes to have multiple instructions?
It depends on whether the builder is used to build a DDG or a PDG. Program Dependence Graphs tend to contain coarse-grained nodes containing many instructions. Merged DDG nodes after graph simplification (not shown in this patch) usually contain 2 or 3 instructions and based on my observation they constitute about less than a third of all the nodes. PDG nodes on the other hand could easily have in the order of tens or hundreds of instructions.

I'll change it to 2 instructions for now, since PDG is farther down the road.

================
Comment at: llvm/include/llvm/Analysis/DependenceGraphBuilder.h:89
+  /// Map types to map instructions to nodes used when populating the graph.
+  using InstToNodeMap = std::unordered_map<Instruction *, NodeType *>;
+
----------------
fhahn wrote:
> Is there a reason to not use DenseMap here?	
Originally I thought DenseMap was implemented as a binary search tree, but looks like it's a true hash table. I'll switch to DenseMap.

================
Comment at: llvm/lib/Analysis/DependenceGraphBuilder.cpp:119
+        for (auto &IDst : DstIList) {
+          auto D = const_cast<DependenceInfo &>(DI).depends(ISrc, IDst, true);
+          if (!D)
----------------
Meinersbur wrote:
> bmahjour wrote:
> > Meinersbur wrote:
> > > Why this `const_cast`? Can we avoid it?
> > `AbstractDependenceGraphBuilder` holds  a constant reference to the result of the dependence analysis (`DI`). Unfortunately `DependenceInfo::depends` is not marked as constant although it doesn't modify anything. A few ways to solve this:
> > 
> > 1. Make `DI` a non-const reference.
> > 2. Do a const_cast when calling `depends`. 
> > 3. Make depends constant (can go viral and is beyond the scope of this work)
> > 
> > I personally prefer (2) as it keeps the const_cast where the const chain broke down. If `depends` is improved and made `const` we just need to drop this const_cast, and in the meantime we prevent code changes that inadvertently modify `DI` in this implementation.
> DI is created by this patch in `DDGAnalysis::run` and don't see a reason to pass it as `const &` if the class was not designed for it. That is, I'd prefer (1).
Ok, I'll make DI non-const.

================
Comment at: llvm/lib/Analysis/DependenceGraphBuilder.cpp:102
+  };
+  for (DGIterator SrcIt = Graph.begin(); SrcIt != Graph.end(); ++SrcIt) {
+    InstructionList SrcIList;
----------------
Meinersbur wrote:
> [style] By [[ https://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop | coding standard ]] , the end-iterator should be cached (the list of nodes is not modified within the body, right)?
Right, I'll use that style for the this and the `DstIt` loop.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65350/new/

https://reviews.llvm.org/D65350