[PATCH] D26872: Outliner: Add MIR-level outlining pass

Matthias Braun via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 18 17:30:35 PST 2016


MatzeB added a comment.

Hi Jessica,

This looks good on a higher level. So just a bunch of coding style issues:



================
Comment at: include/llvm/CodeGen/Passes.h:403
+  ModulePass *createOutlinerPass();
+
 } // End llvm namespace
----------------
no newline.


================
Comment at: include/llvm/Target/TargetInstrInfo.h:1501
+public:
+  /// isLegalToOutline - Return true if the instruction is legal to outline.
+  virtual bool isLegalToOutline(const MachineInstr &MI) const {
----------------
Don't repeat function name, similar in other functions.


================
Comment at: include/llvm/Target/TargetInstrInfo.h:1510
+  /// emitted.
+  virtual void insertOutlinerEpilog(MachineBasicBlock *MBB,
+                                    MachineFunction &MF) const {
----------------
`MBB` can probably be a reference.


================
Comment at: include/llvm/Target/TargetInstrInfo.h:1512
+                                    MachineFunction &MF) const {
+    return;
+  }
----------------
unnecessary


================
Comment at: include/llvm/Target/TargetInstrInfo.h:1519
+  /// by the target.
+  virtual MachineBasicBlock::instr_iterator
+  insertOutlinedCall(MachineBasicBlock *MBB,
----------------
This probably won't work inside bundles so can use `MachineBasicBlock::iterator`.


================
Comment at: include/llvm/Target/TargetInstrInfo.h:1520-1522
+  insertOutlinedCall(MachineBasicBlock *MBB,
+                     MachineBasicBlock::instr_iterator &It, MachineFunction *MF,
+                     MCSymbol *Name) const {
----------------
Use references for things that cannot be nullptr. Similar in other functions.


================
Comment at: include/llvm/Target/TargetInstrInfo.h:1531
+                                    MachineFunction &MF) const {
+    return;
+  }
----------------
unnecessary


================
Comment at: include/llvm/Target/TargetInstrInfo.h:1537
+  virtual bool functionIsSafeToOutlineFrom(Function &F) const {
+    return F.hasFnAttribute(Attribute::NoRedZone);
+  }
----------------
We probably have to leave this to the targets. The ones without a concept of a red zone probably never have NoRedZone set.


================
Comment at: lib/CodeGen/CMakeLists.txt:144
   XRayInstrumentation.cpp
+  MachineOutliner.cpp
 
----------------
insert alphabetically.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:10
+//
+// Implementation of MachineOutliner.h
+//
----------------
`/// \file ...`


================
Comment at: lib/CodeGen/MachineOutliner.cpp:15
+#include "MachineOutliner.h"
+
+STATISTIC(NumOutlinedStat, "Number of candidates outlined");
----------------
Should have INITIALIZE_PASS somewhere (so we can test it with llc -run-pass).


================
Comment at: lib/CodeGen/MachineOutliner.cpp:28
+                                       const TargetInstrInfo *TII) {
+  for (auto BBI = BB->instr_begin(), BBE = BB->instr_end(); BBI != BBE; BBI++) {
+
----------------
Use range based for. Use begin()/end(), instr_begin()/instr_end() will move into machine instruction bundles which is probably not wanted here. A few similar cases follow.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:30
+
+    // First, check if the current instruction is legal to outline at all
+    bool IsSafeToOutline = TII->isLegalToOutline(*BBI);
----------------
aprantl wrote:
> The LLVM coding guidelines prefer full sentences with a trailing "." at the end.
Comments should be full sentences (ending in a dot). Similar in the following comments.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:52
+        InstructionIntegerMap.insert(
+            std::pair<MachineInstr *, int>(&*BBI, CurrLegalInstrMapping));
+        Container.push_back(CurrLegalInstrMapping);
----------------
You could use `make_pair(...)`


================
Comment at: lib/CodeGen/MachineOutliner.cpp:63
+size_t MachineOutliner::removeOutsideSameBB(
+    std::vector<std::pair<String *, size_t>> &Occurrences, const size_t &Length,
+    StringCollection &SC) {
----------------
pass basic types as value.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:92
+  if (CandidateString != nullptr && CandidateString->length() >= 2) {
+    size_t FunctionsCreated = 0;
+    StringCollection SC = ST->SC;
----------------
If I see this correctly, uses of this variable could be replaced with `FunctionList.size()`?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:106
+      if (Occurrences->size() >= 2) {
+        auto FirstOcc = (*Occurrences)[0];
+        size_t IdxInSC = FirstOcc.second;
----------------
maybe repeat the type instead of `auto` to make it reader friendly.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:136
+
+    std::sort(CandidateList.begin(), CandidateList.end());
+
----------------
Could this create indeterminism (Candidate::operator< looks like a partial ordering to me). Possibly use `stable_sort` or better enhance operator< to a full ordering.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:151-153
+  std::ostringstream NameStream;
+  NameStream << "OUTLINED_FUNCTION" << OF.Name;
+  std::string *Name = new std::string(NameStream.str());
----------------
You can probably use a Twine here to avoid some of the heap allocations.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:191
+  /// the special epilogue and prologue for the outliner.
+  MF.insert(MF.begin(), MBB);
+  TII->insertOutlinerEpilog(MBB, MF);
----------------
can be push_front().


================
Comment at: lib/CodeGen/MachineOutliner.cpp:194
+
+  MachineInstr *MI;
+
----------------
The MI usage inside the while loop could use a separate declaration and this one moved down to the first assignment.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:198-207
+  while (EndIt != StartIt) {
+    MI = MF.CloneMachineInstr(&*EndIt);
+    MI->dropMemRefs();
+    MBB->insert(MBB->instr_begin(), MI);
+    EndIt--;
+  }
+
----------------
Can this be rewritten to a `do {} while` loop to avoid the code duplication?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:213
+        for (auto MBB = MF.begin(), EBB = MF.end(); MBB != EBB;
+             MBB++) { (&(*MBB))->dump(); });
+
----------------
`&*` should not be necessary here, but can use a range based for loop anyway.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:228-231
+  for (size_t i = 0, e = FunctionList.size(); i < e; i++) {
+    OutlinedFunction OF = FunctionList[i];
+    FunctionList[i].MF = createOutlinedFunction(M, OF);
+  }
----------------
range based for.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:247
+    if (!AlreadyOutlinedFrom) {
+      size_t j = 0;
+      for (size_t i = OffsetedStringStart; i < OffsetedStringEnd; i++) {
----------------
Maybe replace this with `size_t j = i - OffsetedStringStart;` inside the loop?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:275
+    MachineBasicBlock *MBB = Worklist[StringLocation.first];
+    const TargetSubtargetInfo *target = &(MF->getSubtarget());
+    const TargetInstrInfo *TII = target->getInstrInfo();
----------------
`Target` or rather `STI` like most codegen code.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:287
+    auto It = MBB->instr_begin();
+    auto StartIt = It;
+    auto EndIt = It;
----------------
Couldn't you rather do `auto StartIt = It;` after the loop?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:288
+    auto StartIt = It;
+    auto EndIt = It;
+
----------------
Can be moved after the loop.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:290-293
+    for (i = 0; i < StringLocation.second; ++i) {
+      ++StartIt;
+      ++It;
+    }
----------------
Maybe `It = std::advance(It, StringLocation.second);`? Similar in the next loop.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:299
+
+    for (; i < StringLocation.second + C.Length; ++i)
+      ++It;
----------------
Use a new iteration variable and count from 0 to C.Length?


================
Comment at: lib/CodeGen/MachineOutliner.cpp:320
+  // Set up the suffix tree.
+  for (auto MI = M.begin(), ME = M.end(); MI != ME; MI++) {
+    Function *F = &*MI;
----------------
range based for.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:323
+    MachineFunction &MF = MMI.getMachineFunction(*F);
+    const TargetSubtargetInfo *target = &(MF.getSubtarget());
+    const TargetRegisterInfo *TRI = target->getRegisterInfo();
----------------
Usually called `STI`


================
Comment at: lib/CodeGen/MachineOutliner.cpp:330
+
+    for (auto MFI = MF.begin(), MFE = MF.end(); MFI != MFE; ++MFI) {
+      MachineBasicBlock *MBB = &*MFI;
----------------
range based for.


================
Comment at: lib/CodeGen/MachineOutliner.cpp:348
+
+  delete ST;
+  return OutlinedSomething;
----------------
aprantl wrote:
> Maybe use a unique_ptr for this?
runOnModule() may be called multiple times if we have multiple modules. With the suffixtree being allocated in the constructor this would lead to multiple deletions of the same thing.


================
Comment at: lib/CodeGen/MachineOutliner.h:1
+//===---- MachineOutliner.h - Outline instructions -------------*- C++ -*-===//
+//
----------------
If MachineOutliner.cpp is the only one including this, then you can just as well copy everything into it and don't need a separate header.

That way you can also push a bunch of stuff into an anonymous namespace to avoid unnecessary symbol exporting.


================
Comment at: lib/CodeGen/MachineOutliner.h:10-12
+// The MachineOutliner is a code size reduction pass. It finds repeated
+// sequences of instructions and pulls them out into their own functions.
+// By pulling out such repeated sequences, we can reduce code size.
----------------
`/// \file ...`
The last sentence seems obvious.


================
Comment at: lib/CodeGen/MachineOutliner.h:19
+
+#define DEBUG_TYPE "machine-outliner"
+
----------------
This should only reside in .cpp files.


================
Comment at: lib/CodeGen/MachineOutliner.h:44-45
+
+/// Helper struct that stores the basic block, the function, the string, and
+/// location of a Candidate.
+struct Candidate {
----------------
doxygen is silly and requires adding a `\brief` prefix if the first sentence is more than 1 line.


================
Comment at: lib/CodeGen/MachineOutliner.h:47
+struct Candidate {
+  MachineBasicBlock *BB;     // BB containing this Candidate
+  MachineFunction *ParentMF; // Function containing bb
----------------
`///< ...`
You can probably use a reference here to indicate it cannot be nullptr.
Similar with the other fields.


================
Comment at: lib/CodeGen/MachineOutliner.h:50
+  String *Str;               // The actual string to outline
+  size_t Length;             // str->length()
+  size_t StartIdxInBB;       // Start index in the string
----------------
Doesn't String have a size() function that you could call isntead? Duplicating the info in a separate field risks data getting out of sync.


================
Comment at: lib/CodeGen/MachineOutliner.h:51-53
+  size_t StartIdxInBB;       // Start index in the string
+  size_t EndIdxInBB;         // End index in the string
+  size_t FunctionIdx; // Index of the candidate's function in the function list
----------------
`unsigned` should be enough. If we ever grow a machine basic block to 2**32 instructions we will have other problems/limits first before we will ever have a problem with the outliner.


================
Comment at: lib/CodeGen/MachineOutliner.h:55
+
+  Candidate(MachineBasicBlock *bb_, MachineFunction *bb_BBParent_, String *Str_,
+            const size_t &Length_, const size_t &StartIdxInBB_,
----------------
Maybe use variables named `MBB`, `MF` to be consistent with the majority of codegen.


================
Comment at: lib/CodeGen/MachineOutliner.h:56
+  Candidate(MachineBasicBlock *bb_, MachineFunction *bb_BBParent_, String *Str_,
+            const size_t &Length_, const size_t &StartIdxInBB_,
+            const size_t &EndIdxInBB_, const size_t &fn_Id_)
----------------
It is more efficient to pass basic types like `size_t` around by value.


================
Comment at: lib/CodeGen/MachineOutliner.h:67-68
+
+/// Output for candidates for debugging purposes.
+raw_ostream &operator<<(raw_ostream &os, const Candidate &c) {
+  os << *(c.Str) << "\n";
----------------
Should be `static inline` (when it stays in a header) or `static`/part of an anonymous namespace when moved to a .cpp file. Similar with other functions.


================
Comment at: lib/CodeGen/MachineOutliner.h:106
+  // Target information
+  size_t FunctionCallOverhead; // TODO
+  int CurrIllegalInstrMapping;
----------------
describe what is left to do.


================
Comment at: lib/CodeGen/MachineOutliner.h:112
+
+  bool runOnModule(Module &M) override;
+  StringRef getPassName() const override { return "Outliner"; }
----------------
Documentation comments should be in the header instead of the .cpp file. (Though that point is mood when you merge the header in this case).


================
Comment at: lib/CodeGen/MachineOutliner.h:113
+  bool runOnModule(Module &M) override;
+  StringRef getPassName() const override { return "Outliner"; }
+
----------------
Use the same name as the DEBUG_TYPE for consistency (changing one of the two).


================
Comment at: lib/CodeGen/MachineOutliner.h:123
+  MachineOutliner() : ModulePass(ID) {
+    ST = new STree();
+
----------------
Can ST simply be a member (instead of a pointer) if you always allocate 1 anyway?
Though maybe you rather want to move this into runOnModule()?


================
Comment at: lib/CodeGen/MachineOutliner.h:125-126
+
+    // FIXME: Release function names
+    FunctionNames = new std::vector<std::string *>;
+  }
----------------
see below.


================
Comment at: lib/CodeGen/MachineOutliner.h:145-146
+
+// FIXME: Free after printing
+std::vector<std::string *> *OutlinerFunctionNames;
+ModulePass *createOutlinerPass() {
----------------
Shouldn't this be per pass data? Or rather per runOnModule() data?

This appears to own the strings so would be safest with `unique_ptr<>`, however I think std::string is small enough to be better stored as value.


================
Comment at: lib/Target/X86/X86InstrInfo.cpp:8964
+
+  // Check if the outliner has any CPI junk-- we can't move around stuff
+  // which depends on the offsets between two instructions
----------------
Luckily CPIs aren't human and can't feel insulted :)


================
Comment at: lib/Target/X86/X86InstrInfo.cpp:8967-8968
+  else {
+    for (auto It = MI.operands_begin(), Et = MI.operands_end(); It != Et;
+         It++) {
+      if ((*It).isCPI() || (*It).isJTI() || (*It).isCFIIndex() ||
----------------
range based for.


================
Comment at: lib/Target/X86/X86InstrInfo.h:583
+public:
+  /// Returns true if a MachineInstr is safe to outline.
+  bool isLegalToOutline(const MachineInstr &MI) const override;
----------------
May be left out if it is just repeating the comment from the parent class. That's less risk of the comment becoming out of date.


================
Comment at: test/CodeGen/X86/machineoutliner.ll:5
+; Function Attrs: noredzone nounwind ssp uwtable
+define void @foo() #0 {
+entry:
----------------
It is good style to add `CHECK-LABEL: foo:` here so filecheck will have a better idea what function we are in, which helps in error reporting and recovery. Same in the next function.


================
Comment at: test/CodeGen/X86/machineoutliner.ll:39
+
+attributes #0 = { noredzone nounwind ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core2" "target-features"="+cx16,+fxsr,+mmx,+sse,+sse2,+sse3,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
+
----------------
This can probably be reduced to just `noredzone`


================
Comment at: test/CodeGen/X86/machineoutliner.ll:41-43
+; CHECK: l_OUTLINED_FUNCTION0:                   ## @OUTLINED_FUNCTION0
+; CHECK:	.cfi_startproc
+; CHECK: ## BB#0:
----------------
I'd probably leave out the comments and possible the .cfi_startproc to make the test more robust (they are not the thing we are testing for).


================
Comment at: test/CodeGen/X86/machineoutliner.ll:48
+; CHECK:	retq
\ No newline at end of file

----------------
"no newline at end of file"


Repository:
  rL LLVM

https://reviews.llvm.org/D26872





More information about the llvm-commits mailing list