[PATCH] D40333: Separate LoopTraversal and BreakFalseDeps out of ExecutionDomainFix into their own files

Fri Dec 1 11:55:51 PST 2017

MatzeB added inline comments.

================
Comment at: include/llvm/CodeGen/BreakFalseDeps.h:1-18
+//==- llvm/CodeGen/BreakFalseDeps.h - Break False Dependency Fix -*- C++ -*-==//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
----------------
Is there a need to make the whole class public in a header? It feels to me like a `FunctionPass *createFalseDependencyPass();` in include/llvm/CodeGen/Passes.h should be enough here so the class definition and all the details can remain in BreakFalseDeps.cpp.

================
Comment at: include/llvm/CodeGen/BreakFalseDeps.h:36
+
+class BreakFalseDeps : public MachineFunctionPass, LoopTraversal {
+private:
----------------
myatsina wrote:
> MatzeB wrote:
> > This is probably a bad use of inheritance. Saying "BreakFalseDeps is a LoopTraversal" feels odd. I believe this could be rewritten so that LoopTraversal is subclassed separately inside BreakFalseDeps.cpp without making it a supertype of BreakFalseDeps itself.
> 
> Both BreakFalseDeps and ExecutionDomainFix traverse the basic blocks in a special way.
> The LoopTraversal provides the traversal algorithm and methods that allow inheriting passes to handle instructions, handle BBs, etc.
> Do you have a better suggestion on how to do it without duplicating the traversal logic in both passes?
Factoring out the common logic makes a lot of sense and is a good thing.

I was mainly wondering about using subclassing as a tool to get there.

In this case it could be as simple as a single function as the public interface:
```
void dominanceLoopTraversal(MachineFunction &MF,
                            std::function<void(MachineBasicBlock *, bool)> runOnBasicBlock);
```

For the implementation you could still use a class (in the .cpp file) if that feels easier:

```
class LoopTraversalImpl {
  std::function... runOnBasicBlock;
  LoopTraversalImpl(std::function... runOnBasicBlock) : runOnBasicBlock(runOnBasicBlock) {}
  // ...
  void traverse(MachineFunction &MF) { /* as existing */ }
};

void dominanceLoopTraversal(MachineFunction &MF,
                            std::function<void(MachineBasicBlock *, bool)> runOnBasicBlock) {
  LoopTraversalImpl Traversal(runOnBasicBlock);
  Traversal.traverse(MF);
}
```

================
Comment at: include/llvm/CodeGen/BreakFalseDeps.h:82
+private:
+  // Dependency breaking.
+  bool pickBestRegisterForUndef(MachineInstr *MI, unsigned OpIdx,
----------------
myatsina wrote:
> MatzeB wrote:
> > It's usually easiest to just document the functions with `///` doxygen comments instead of forming groups of methods.
> > 
> > If you really want to form groups, the correct doxygen syntax would be:
> > ```
> > /// \name Dependency Breaking
> > /// \{
> > ...
> > /// \}
> > ```
> > 
> Some of these methods are documented in the cpp file.
> This is how they were originally documented when they were part of the ExecutionDepsFix pass.
> Do you think we take this refactoring a a chance to move all method documentation to the header file?
Having the documentation for private methods in the cpp file is fine. But then I'd be consistent and not have something like `// Instruction processing.` here which looks like documentation about the function.

On the other hand I suppose the `// LoopTraversal overrides.` comment is about the 3 methods following the comment. It is at least not very typical to document 3 methods with a single comment, while easy in this case in general it probably won't be obvious what groups of methods a comment is meant for. That's why I was talking about the doxygen groups. (Though in this specific case I'd tend to just not have a comment...)

================
Comment at: include/llvm/CodeGen/LoopTraversal.h:1
+//==- llvm/CodeGen/LoopTraversal.h - Execution Dependency Fix -*- C++ -*-==//
+//
----------------
Could we move it to `lib/CodeGen/LoopTraversal.h` and keep the API private to CodeGen for now?

================
Comment at: include/llvm/CodeGen/LoopTraversal.h:12-39
+/// This class implements the basic blocks traversal order used by passes like
+/// BreakFalseDeps and ExecutionDomainFix.
+///
+/// We want to visit every instruction in every basic block in order to update
+/// it's execution domain or break any false dependencies. However, for the
+/// dependency breaking, we need to know clearances from all predecessors
+/// (including any backedges). One way to do so would be to do two complete
----------------
myatsina wrote:
> MatzeB wrote:
> > - The comment (except for the first `\file` sentence) would better fit in front of `class LoopTraversal`.
> > 
> > - It's good to see an example trying to explain the goals of the algorithm!
> > 
> > - You can simplify the explanation by removing descriptions of how things are not done (i.e. the "it's not using the two-pass or naive worklist approaches" parts). Instead you should describe what actually is done: A reverse postorder traversal with an intertwined worklist algorithm for backedges.
> I think that mentioning the naive approach and what we are trying to optimize is actually making this traversal algorithm (and the motivation for it) much clearer.
> I can refactor the comment so that we have "The algorithm does ..." and at the end add a few sentences like "A naive way of implementing this would have been ..., but the downside of this is ..., this is why the optimized way was chosen", or something of that sort.
> Do you think this way will be better?
Yes that sounds good.

================
Comment at: include/llvm/CodeGen/LoopTraversal.h:52
+  struct MBBInfo {
+    // Whether we have gotten to this block in primary processing yet.
+    bool PrimaryCompleted = false;
----------------
This should be `/// Whether ...` in doxygen syntax.

================
Comment at: include/llvm/CodeGen/LoopTraversal.h:73-81
+  // Overridden by the derived classes.
+  virtual void enterBasicBlock(MachineBasicBlock *) = 0;
+  virtual void leaveBasicBlock(MachineBasicBlock *) = 0;
+  virtual void processBasicBlock(MachineBasicBlock *MBB, bool PrimaryPass) = 0;
+
+  // Traversal operations
+  void runOnBasicBlock(MachineBasicBlock *MBB, bool PrimaryPass);
----------------
myatsina wrote:
> MatzeB wrote:
> > This API seems too complicated to me. Given that this is just a loop traversal API I would expect that `traverse()` and `runOnBasicBlock()` is enough.
> > 
> > See also my comment inside `runOnBasicBlock()`; `isBlockDone()` can be private so it is not part of the API I believe.
> I was trying to avoid duplicating runOnBasicBlock() logic for the inheriting classes.
> It is not a lot of duplication, it basically just activates the other 3 methodes (enterBB, processBB, leaveBB) and that's it, so we can simplify the API as you suggested and have both inheriting class implement runOnBB the same way.
> 
The downside of having 3 functions IMO is that splitting into 3 function is in no way motivated by the fact that this is a loop traversal algorithm. It's just a small convenience that happens to match the two existing users. The downside from splitting is that a reader of the code has to wonder when each of them gets called and at least I would intuititvely assume there to be complicated rules, while if there is only a single function it is quite obvious what will happen.

Repository:
  rL LLVM

https://reviews.llvm.org/D40333