[PATCH] D26872: Outliner: Add MIR-level outlining pass

Sean Silva via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 9 04:00:20 PST 2016


silvas added a comment.

Some stylistic comments.

Also, the amount of testing seems small compared to the amount of code being added. Can you beef it up? Especially testing for the handling of "unsafe" cases would be good.



================
Comment at: lib/CodeGen/MachineOutliner.cpp:16
+/// instructions. If a sequence of instructions appears often, then it ought
+/// to be beneficial to pull out into a function.
+///
----------------
A link to your devmtg talk might be good (together with some explanation about how it relates to what is implemented here).


================
Comment at: lib/CodeGen/MachineOutliner.cpp:54
+/// Convenience typedef for a "two-dimensional string".
+// NOTE: This is used as a 2D array *and* as a 2D string.
+// For example, say we have these strings: [0,1,2] [3,4] [5,6,7]
----------------
This seems to have some nontrivial invariants, so you may want something a bit stronger than a typedef. Maybe a lightweight struct around this would be better? You already have a couple helpers that seems like they could quite naturally be methods of such a struct.

I guess the question is: does it ever make sense for a random piece of code that is using this typedef to actually operate on it using the std::vector API? Or would such code inherently risk violating some sort of invariant? If the latter, a lightweight struct encapsulating the underlying vector is likely to be a good choice.




================
Comment at: lib/CodeGen/MachineOutliner.cpp:60
+//
+// And we would have SC[0] = 0, SC[2] = 2, SC[7] = 7, etc.
+typedef std::vector<String *> StringCollection;
----------------
This doesn't really make much sense to me. Can you explain this data structure a bit better?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:73
+/// \returns The index of the string that \p Offset appears in.
+size_t indexAndOffsetHelper(const StringCollection &StringList,
+                            size_t &Offset) {
----------------
This could use a better name.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:124
+/// 
+/// The suffix tree is implemented using Ukkonen's algorithm for linear-time
+/// suffix tree construction.
----------------
Is there a good online resource you can link to about Ukkonen's algorithm? If this implementation is patterned off of a particular resource that might be good to link to if possible.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:126
+/// suffix tree construction.
+class SuffixTree {
+
----------------
Small coding standard nit: use static/anonymous namespace on all the helper stuff here and elsewhere: http://llvm.org/docs/CodingStandards.html#anonymous-namespaces


================
Comment at: lib/CodeGen/MachineOutliner.cpp:141
+  /// the same string, but with the first character chopped off. This is stored
+  /// in \p Link. Each leaf node stores the star index of its respective suffix
+  /// in \p SuffixIndex.
----------------
I assume "start" was meant here. But "star index" (as in Kleene) sounds vaguely plausible in the context of a string algorithm so this typo is worth fixing.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:152
+    /// string by tacking that character on the end of the current string.
+    std::map<unsigned, SuffixTreeNode *> Children;
+
----------------
Do you really need a std::map? http://llvm.org/docs/ProgrammersManual.html#map-like-containers-std-map-densemap-etc

Also, if the number of outgoing edges is small, a small-type container is probably better here.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:157
+    /// If a node is invalid, then it is not considered in any future queries.
+    bool Valid = true;
+
----------------
As per the comment, maybe call this "is in tree" or "already pruned" or something like that?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:163
+    /// This is used as a shortcut in the construction algorithm. For more
+    /// information, look into Ukkonen's algorithm.
+    /// It is also used during the tree pruning process to let us quickly throw
----------------
Can you be a bit more specific about where this shortcut link comes in in Ukkonen's algorithm?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:172
+    /// End index of this node's substring in the main string.
+    size_t *End = nullptr;
+
----------------
Both `Start` and `End` here are described as "index" but one is a pointer. That's worth explaining in the comment.

Also, SuffixIndex is also an "index" and gets an "Index" suffix to its variable name. Maybe these should be `StartIndex` and `EndIndex`?
Also, other places in this patch use `Idx` for variable names to mean "index". Maybe these should be `StartIdx` etc.?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:306
+
+      // There *is* a match, so we have to traverse the tree and find out where
+      // to put the node.
----------------
Move the comment inside the braces so that this can use the more common `} else {` style, here and elsewhere.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:337
+        // choices: the old string we found, or the string on the mismatch.
+        size_t *SplitEnd = new size_t(NextNode->Start + Active.Len - 1);
+        SuffixTreeNode *SplitNode =
----------------
There's quite a few naked new/delete in this code. Can you encapsulate the ownership better? If not, can you centralize the new/delete in helpers and add some comments about lifetime/ownership?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:770
+  /// The names of each function created. 
+  std::vector<std::string *> *FunctionNames;
+
----------------
Can you hold this by value?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:1175
+  for (size_t i = 0, e = Collection.size(); i != e; i++)
+    delete Collection[i];
+
----------------
Can you improve the ownership here? E.g. use a std::unique_ptr to manage the lifetime?


================
Comment at: lib/CodeGen/TargetPassConfig.cpp:95
     cl::ZeroOrMore);
+static cl::opt<bool> EnableMIROutliner("enable-machine-outliner", cl::Hidden,
+    cl::desc("Enable machine outliner"));
----------------
What is the official name? "machine outliner" or "MIR outliner". Please be consistent here and elsewhere.


Repository:
  rL LLVM

https://reviews.llvm.org/D26872





More information about the llvm-commits mailing list