[PATCH] D26872: Outliner: Add MIR-level outlining pass

Matthias Braun via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 9 04:00:10 PST 2016


MatzeB added a comment.

Hi Jessica,

this is great progress from the last patch! I think we need one more roundtrip until we can commit the code and keep improving it inside the repository (as this is an independent pass it shouldn't affect the rest of the compiler).

For the whole Outliner.cpp file: Internal types and functions should be marked as such. In this case it's probably best to open an anonymous namespace (except for the externally visible functions createOutlinerPass(), and InitializerMachineOutliner())



================
Comment at: include/llvm/CodeGen/Passes.h:401
+
+  /// This pass performs outlining on machine instructions directly before printing assembly.
+   ModulePass *createOutlinerPass();
----------------
Indent.


================
Comment at: include/llvm/Target/TargetInstrInfo.h:1517-1563
+
+  public:
+
+  /// Return true if the instruction is legal to outline.
+  virtual bool isLegalToOutline(MachineInstr &MI) const {
+    llvm_unreachable(
+        "Target didn't implement TargetInstrInfo::isLegalToOutline!");
----------------
You should move this block a bit upwards (We usually try to keep private fields at the beginning or end of a class, this would move them in the middle).


================
Comment at: include/llvm/Target/TargetInstrInfo.h:1559
+  /// By default, this means that the function has no red zone.
+  virtual bool functionIsSafeToOutlineFrom(Function &F) const {
+    llvm_unreachable(
----------------
As this is a codegen API passing in a MachineFunction& parameter is more natural. (Implementations can always use MF.getFunction() to get back to the llvm::Function)


================
Comment at: lib/CodeGen/MachineOutliner.cpp:1
+//===---- MachineOutliner.h - Outline instructions -------------*- C++ -*-===//
+//
----------------
MachineOutliner.cpp


================
Comment at: lib/CodeGen/MachineOutliner.cpp:108
+/// \p QueryIndex maps to.
+std::pair<std::vector<unsigned> *, size_t>
+stringContainingIndex(const StringCollection &StringList, size_t QueryIndex) {
----------------
`std::pair<String *, size_t>`?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:178-186
+  /// Create a node starting at \p S, ending at \p E, and with suffix link \p L.
+  SuffixTreeNode *createSuffixTreeNode(size_t S, size_t *E, SuffixTreeNode *L) {
+    SuffixTreeNode *Node = new SuffixTreeNode;
+    Node->Start = S;
+    Node->End = E;
+    Node->Link = L;
+
----------------
Looks like this could be a constructor.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:179
+  /// Create a node starting at \p S, ending at \p E, and with suffix link \p L.
+  SuffixTreeNode *createSuffixTreeNode(size_t S, size_t *E, SuffixTreeNode *L) {
+    SuffixTreeNode *Node = new SuffixTreeNode;
----------------
This looks like it could just be a constructor on the SuffixTreeNode struct.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:189
+  /// Delete the node \p N.
+  void deleteSuffixTreeNode(SuffixTreeNode *N) {
+    if (N == nullptr)
----------------
Should this be called `deleteSuffixTreeSubTree` or similar as it deletes more than 1 node?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:205-206
+    }
+
+    N = nullptr;
+  }
----------------
this has no effect


================
Comment at: lib/CodeGen/MachineOutliner.cpp:391
+  /// The sum of the lengths of the strings contained in \p InputString.
+  size_t size() {
+    return InputLen; 
----------------
can be `size_t size() const`


================
Comment at: lib/CodeGen/MachineOutliner.cpp:397
+  ///
+  /// /param NewStr The string to append to the tree.
+  void append(String *NewStr) {
----------------
`\param`


================
Comment at: lib/CodeGen/MachineOutliner.cpp:416-417
+      NeedsLink = nullptr;
+      LeafEnd = (size_t)EndIdx;
+      extend((size_t)EndIdx, NeedsLink, SuffixesToAdd);
+    }
----------------
No need for to cast `EndIdx`.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:468
+    size_t FirstChar = 0;
+    SuffixTreeNode *N = Root;
+    size_t NumOccurrences = 0;
----------------
use a reference instead of a pointer?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:476
+    // at least once
+    if (MaxHeight > 0) {
+      Longest = new String();
----------------
Could do an early exit:
```
if (MaxHeight == 0)
  return nullptr;
```


================
Comment at: lib/CodeGen/MachineOutliner.cpp:551
+      switch (Found) {
+      case (ExactMatch):
+        RetSuffixTreeNode = CurrSuffixTreeNode;
----------------
the braces around the cases are unnecessary


================
Comment at: lib/CodeGen/MachineOutliner.cpp:582
+    // FIXME: Pruning should happen in a separate function.
+    if (N != nullptr && N->Valid) {
+      N->Valid = false;
----------------
This can use an early exit so you do not need to indent everything inside the if.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:595
+      // There are no occurrences, so return null.
+      else if (N == nullptr) {
+        delete Occurrences;
----------------
This seems to be an impossible case as you already tested for `N != nullptr`.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:617-621
+      N = N->Link;
+      while (N && N != Root) {
+        N->Valid = false;
+        N = N->Link;
+      }
----------------
This could be
```
for (SuffixTreeNode *T = N->Link; T && T != Root; T = T->Link)
  T->Valid = false;
```
(It is usually better to go with a for loop instead of `while() { ... increment; }` on principle, because you do not have to remember to duplicate the increment code when you want to use `continue` somewhere inside the loop)


================
Comment at: lib/CodeGen/MachineOutliner.cpp:630-631
+  size_t numOccurrences(const String &QueryString) {
+    size_t Dummy = 0;
+    size_t NumOccurrences = 0;
+    SuffixTreeNode *N = findString(QueryString, Dummy, Root);
----------------
Maybe do not initialize these to give a hint to the reader that they will be set by `findString()` anyway (If findString() doesn't always set them, then you should make it).


================
Comment at: lib/CodeGen/MachineOutliner.cpp:634
+
+    if (N != nullptr) {
+      if (N->SuffixIndex != EmptyIndex)
----------------
Could be an early exit.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:647
+  /// as a flat string.
+  SuffixTree(StringCollection Strings) : InputLen(0), LeafEnd(EmptyIndex) {
+    size_t *RootEnd = new size_t(EmptyIndex);
----------------
Use `const StringCollection &Strings`


================
Comment at: lib/CodeGen/MachineOutliner.cpp:666
+    Active.Node = nullptr;
+    if (Root != nullptr)
+      deleteSuffixTreeNode(Root);
----------------
The deleteSuffixTreeNode() implementation already protects against nullptrs.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:713
+  /// This is initialized after we go through and create the actual function.
+  llvm::MachineFunction *MF;
+
----------------
the llvm:: prefix should not be necessary (same for OccBB)


================
Comment at: lib/CodeGen/MachineOutliner.cpp:741
+
+namespace llvm {
+
----------------
You probably only need to put INITIALIZE_PASS and createOutlinerPass() in the llvm namespace.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:772
+
+  StringRef getPassName() const override { return "Outliner"; }
+
----------------
Maybe use `"MIR Function Outlining"` to be in sync with INITIALIZE_PASS. Most llvm pass 'names' read more like short descriptions.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:799-801
+                        llvm::MachineBasicBlock *BB,
+                        const TargetRegisterInfo *TRI,
+                        const TargetInstrInfo *TII);
----------------
Could use references instead of pointers.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:843-845
+// Keep the function names from the outliner around to keep the ASMPrinter
+// happy...
+std::vector<std::string *> *OutlinerFunctionNames;
----------------
This makes no sense.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:853
+
+INITIALIZE_PASS(MachineOutliner, "outliner", "MIR Function Outlining", false, false)
+char MachineOutliner::ID = 0;
----------------
Maybe use `"machine-outliner"` to be in sync with DEBUG_TYPE.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:868
+      Container.push_back(CurrIllegalInstrMapping);
+      CurrIllegalInstrMapping--;
+    }
----------------
Maybe add `assert(CurrLegalInstrMapping < CurrIllegalInstrMapping && "Overflow");` The same at the place where you increment CurrLegalInstrMapping.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:875-877
+      auto Mapping = InstructionIntegerMap.find(&MI);
+
+      if (Mapping != InstructionIntegerMap.end()) {
----------------
The `find(), if (I != end()) ... else insert()` sequence walks over the datastructure in the find() and again in the insert() step.

Instead you can simply always `insert()` and check the return value to see whether an element was actually inserted or an existing one reused:
```
auto I = map.insert(make_pair(MI, CurrLegalInstrMapping));
// Newly inserted?
if (I.second)
  CurrLegalInstrMapping++;
unsigned MINumber = I.first;
```


================
Comment at: lib/CodeGen/MachineOutliner.cpp:895
+    std::vector<OutlinedFunction> &FunctionList,
+    std::vector<MachineBasicBlock *> Worklist,
+    SuffixTree &ST) {
----------------
Use a reference.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:990
+  std::string *Name = new std::string(NameStream.str());
+  FunctionNames->push_back(Name);
+
----------------
I don't think the value names need to be kept around.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:1019
+
+  auto It = OF.OccBB->instr_begin();
+  std::advance(It, OF.EndIdxInBB);
----------------
Can be `begin()` instead of `instr_begin()`.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:1024
+    MachineInstr* MI = MF.CloneMachineInstr(&*It);
+    MI->dropMemRefs();
+    MBB->insert(MBB->instr_begin(), MI);
----------------
Should probably document this with something like
``// The cloned memory operands reference the old function. Drop them.`` 


================
Comment at: lib/CodeGen/MachineOutliner.cpp:1025
+    MI->dropMemRefs();
+    MBB->insert(MBB->instr_begin(), MI);
+    It--;
----------------
can be `MBB->begin()`


================
Comment at: lib/CodeGen/MachineOutliner.cpp:1047-1050
+  for (size_t i = 0, e = FunctionList.size(); i < e; i++) {
+    OutlinedFunction OF = FunctionList[i];
+    FunctionList[i].MF = createOutlinedFunction(M, OF);
+  }
----------------
Could use a range based for:
```
for (OutlinedFunction &OF : FunctionList)
  OF.MF = createOutlinedFunction(M, OF);
```


================
Comment at: lib/CodeGen/MachineOutliner.cpp:1108-1112
+    // We need the function name to match up with the internal symbol
+    // build for it. There's no nice way to do this, so we'll just stick
+    // an l_ in front of it manually.
+    Twine InternalName = Twine("l_", MF->getName());
+    MCSymbol *Name = Ctx.getOrCreateSymbol(InternalName);
----------------
Constructing names should not be necessary here. You should be able to add a GlobalValue MachineOperand instead of a Symbol one when you construct the call instruction.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:1147-1149
+    const TargetSubtargetInfo *STI = &(MF.getSubtarget());
+    const TargetRegisterInfo *TRI = STI->getRegisterInfo();
+    const TargetInstrInfo *TII = STI->getInstrInfo();
----------------
You can move those out of the loop.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:1158
+      buildProxyString(Container, &MBB, TRI, TII);
+      String *BBString = new String(Container);
+      Collection.push_back(BBString);
----------------
There seems to be an unnecessary duplication of the string.


Repository:
  rL LLVM

https://reviews.llvm.org/D26872





More information about the llvm-commits mailing list