[llvm-dev] [RFC] mir-canon: A new tool for canonicalizing MIR for cleaner diffing.

Wed Sep 6 08:08:25 PDT 2017

On Tue, Aug 29, 2017 at 2:17 PM, Justin Bogner <mail at justinbogner.com>
wrote:

> Puyan Lotfi via llvm-dev <llvm-dev at lists.llvm.org> writes:
> > mir-canon: A new tool for canonicalizing MIR for cleaner diffing.
> >
> > Problem Statement and Context:
> >
> > Diff-tools are regularly used for comparing IR and assembly files. This
> often
> > involves reasoning through differences that are semantically equivalent
> and it
> > can be very time consuming for a person to do said reasoning.
> >
> > Specifically in the context of GlobalISel development there are problems
> of
> > correctness verification. There is a need to compare two programs,
> compiled
> > from identical IR by two different instruction selectors in a way where
> the
> > true differences stand out.
> >
> > Proposal:
> >
> > We propose a new tool that we have tentatively named 'mir-canon' that
> performs
> > canonical transformations on MIR. The goal is for MIR pre-processed with
> > mir-canon to show fewer differences than if it were not pre-processed.
>
> This sounds great. One place I'm hoping to use this for is in ISel
> fuzzing - we can currently catch crashes and machine verifier errors,
> but comparing SDAG to GlobalISel is a space that I'd really like to
> explore.
>

Thats exactly what I'm trying to do. I'm still trying to think of good ways
to narrow down the set of functions and basic blocks that get
canonicalized. I have some opt flags to control that but I think the
control could be better.

>
> > At the time of this writing we have a prototype canonicalization tool.
> We’ve
> > come up with some techniques that show promise and would like to open
> > discussion with the community to get feedback and suggestions on
> refining said
> > techniques. Currently we think of this as an open ended project.
> >
> > Techniques:
> >
> > Our prototype does the following for each basic block in a Reverse Post
> > Ordering:
> >
> >   * Canonical instruction ordering is done by moving a given instruction
> as
> >     close to the nearest use of its def as possible.
> >
> >   * Next, canonical VReg renaming is done by building a collection of
> >     candidate instructions that can be thought of as sinks in the def-use
> >     graph: they are typically instructions that write to physical
> registers or
> >     store to memory. These candidates are used as the root of a breadth
> first
> >     walk over the vreg operand def-use graph that determines a canonical
> vreg
> >     ordering.
> >
> >   * Using said canonical vreg ordering we rename monotonically, but
> before we
> >     do this we skip several vreg values in order to increase the chance
> that
> >     we land on the same vreg number for two different input MIR files.
>
> This is one place where handling multiple files at once would help,
> correct?
>
>
I think so. I think if I were processing two files at once I could
increment the vreg count by a more exact number between basic blocks
processed and between def-use walks.

> >     We also do this to reduce the chances that a difference in
> >     previously def-use walks will affect the vreg renaming for
> >     subsequent walks. This skipping step could be thought of as a kind
> >     of vreg number reckoning: we skip modulo n vregs so that we are
> >     likely to land on the same vreg for two different files.
> >
> > This approach is completely agnostic of ISA specific semantics so it
> should
> > work for any target.
> >
> > Current status:
> >
> > At the moment we have a standalone llvm tool that uses a single pass to
> do
> > the above described transformations. We have test inputs that show
> promise
> > but we still need a wider set of tests as well as hard metrics.
>
> This sounds like a great start, but I feel like this design won't quite
> scale to all the ways we want to use this.
>
> Consider, if this was a pass in the LLVM library rather than only
> accessible in a tool, you could potentially set it up to run at the end
> of an `llc` run. Similarly, if I want to use this in a fuzzer it would
> be nice to be able to do the canonicalization in memory.
>
>
 I used to have it as an LLVM pass but a while ago I thought it could be
more prudent to have contained in a separate tool as it wasn't actually for
optimization or codegen. I can easily reintegrate it as an LLVM pass. I'll
send out another patch.

> > Our approach processes a single file at a time. The primary benefit to
> this
> > approach is lower complexity in initial implementation and exploration of
> > building such a tool. We are open to other approaches such as an
> llvm-diff
> > like (two file at a time) approach, but we have not explored that avenue
> > fully yet.
> >
> > We’re eager to hear community feedback and will be ready to share patches
> > for review soon.
>  ...
> > Patch for review.
>
> Please send the patch to llvm-commits (and refer back to this thread).
> Patch review is too much traffic for llvm-dev.
>
>
Great, thanks!

> > diff --git a/tools/mir-canon/CMakeLists.txt
> b/tools/mir-canon/CMakeLists.txt
> > new file mode 100644
> > index 00000000000..eb6722e4bd4
> > --- /dev/null
> > +++ b/tools/mir-canon/CMakeLists.txt
> > @@ -0,0 +1,28 @@
> > +set(LLVM_LINK_COMPONENTS
> > +  ${LLVM_TARGETS_TO_BUILD}
> > +  Analysis
> > +  AsmPrinter
> > +  CodeGen
> > +  Core
> > +  IRReader
> > +  MC
> > +  MIRParser
> > +  ScalarOpts
> > +  SelectionDAG
> > +  Support
> > +  Target
> > +  TransformUtils
> > +  Vectorize
> > +  )
> > +
> > +# Support plugins.
> > +set(LLVM_NO_DEAD_STRIP 1)
> > +
> > +add_llvm_tool(mir-canon
> > +  mir-canon.cpp
> > +  MIRCanonicalizerPass.cpp
> > +
> > +  DEPENDS
> > +  intrinsics_gen
> > +  )
> > +export_executable_symbols(mir-canon)
> > diff --git a/tools/mir-canon/Common.h b/tools/mir-canon/Common.h
> > new file mode 100644
> > index 00000000000..3e6be7ef643
> > --- /dev/null
> > +++ b/tools/mir-canon/Common.h
> > @@ -0,0 +1,21 @@
> > +#include <cassert>
> > +
> > +namespace llvm {
> > +/// The purpose of this pass is to establish a canonical operand naming
> so
> > +/// that code compiled with slightly different IR passes can be diffed
> more
> > +/// effectively than otherwise. This is done by renaming vregs in a
> given
> > +/// LiveRange in a canonical way.
> > +extern char &MIRCanonicalizerID;
> > +
> > +void initializeMIRCanonicalizerPass(PassRegistry &);
> > +
> > +} // namespace llvm
> > +
> > +#define CANON_CHECK(CHK, MSG) \
> > +  do { \
> > +    if ((CHK)) break; \
> > +    errs() << (std::string("") + MSG) << '\n'; \
> > +    assert(false && "CANON_CHECK FAILURE."); \
> > +    exit(EXIT_FAILURE); \
> > +  } while (false)
> > +
> > diff --git a/tools/mir-canon/LLVMBuild.txt b/tools/mir-canon/LLVMBuild.
> txt
> > new file mode 100644
> > index 00000000000..5d3e6bb5a91
> > --- /dev/null
> > +++ b/tools/mir-canon/LLVMBuild.txt
> > @@ -0,0 +1,22 @@
> > +;===- ./tools/mir-canon/LLVMBuild.txt --------------------------*-
> Conf -*--===;
> > +;
> > +;                     The LLVM Compiler Infrastructure
> > +;
> > +; This file is distributed under the University of Illinois Open Source
> > +; License. See LICENSE.TXT for details.
> > +;
> > +;===-------------------------------------------------------
> -----------------===;
> > +;
> > +; This is an LLVMBuild description file for the components in this
> subdirectory.
> > +;
> > +; For more information on the LLVMBuild system, please see:
> > +;
> > +;   http://llvm.org/docs/LLVMBuild.html
> > +;
> > +;===-------------------------------------------------------
> -----------------===;
> > +
> > +[component_0]
> > +type = Tool
> > +name = mir-canon
> > +parent = Tools
> > +required_libraries = AsmParser BitReader IRReader MIRParser
> TransformUtils Scalar Vectorize all-targets
> > diff --git a/tools/mir-canon/MIRCanonicalizerPass.cpp b/tools/mir-canon/
> MIRCanonicalizerPass.cpp
> > new file mode 100644
> > index 00000000000..a4a0a7a0e5f
> > --- /dev/null
> > +++ b/tools/mir-canon/MIRCanonicalizerPass.cpp
> > @@ -0,0 +1,588 @@
> > +//===-------------- MIRCanonicalizer.cpp - MIR Canonicalizer
> --------------===//
> > +//
> > +//                     The LLVM Compiler Infrastructure
> > +//
> > +// This file is distributed under the University of Illinois Open Source
> > +// License. See LICENSE.TXT for details.
> > +//
> > +//===------------------------------------------------------
> ----------------===//
> > +//
> > +// Reorders instructions canonically.
> > +// Renames virtual register operands canonically.
> > +// Strips certain MIR artifacts (optionally).
> > +//
> > +//===------------------------------------------------------
> ----------------===//
> > +
> > +#include "llvm/CodeGen/Passes.h"
> > +#include "llvm/CodeGen/MachineInstrBuilder.h"
> > +#include "llvm/CodeGen/MachineFunctionPass.h"
> > +#include "llvm/CodeGen/MachineRegisterInfo.h"
> > +
> > +#include "llvm/Support/raw_ostream.h"
> > +#include "llvm/ADT/PostOrderIterator.h"
> > +#include "llvm/Target/TargetInstrInfo.h"
> > +
> > +#include <tuple>
> > +#include <queue>
> > +
> > +using namespace llvm;
> > +
> > +#include "Common.h"
> > +
> > +#define DEBUG_TYPE "mir-canonicalizer"
> > +
> > +static cl::opt<unsigned>
> > +CanonicalizeFunctionNumber("canon-nth-function", cl::Hidden,
> cl::init(~0u),
> > +                           cl::value_desc("N"),
> > +                           cl::desc("Function number to
> canonicalize."));
> > +
> > +static cl::opt<unsigned>
> > +CanonicalizeBasicBlockNumber("canon-nth-basicblock", cl::Hidden,
> cl::init(~0u),
> > +                             cl::value_desc("N"),
> > +                             cl::desc("BasicBlock number to
> canonicalize."));
> > +
> > +namespace {
> > +
> > +class MIRCanonicalizer : public MachineFunctionPass {
> > +public:
> > +  static char ID;
> > +  MIRCanonicalizer() : MachineFunctionPass(ID) {}
> > +
> > +  StringRef getPassName() const override {
> > +    return "Rename register operands in a canonical ordering.";
> > +  }
> > +
> > +  void getAnalysisUsage(AnalysisUsage &AU) const override {
> > +    AU.setPreservesCFG();
> > +    MachineFunctionPass::getAnalysisUsage(AU);
> > +  }
> > +
> > +  bool runOnMachineFunction(MachineFunction &MF) override;
> > +};
> > +
> > +} // end anonymous namespace
> > +
> > +enum RSType { RSE_Reg = 0, RSE_FrameIndex, RSE_NewCandidate };
> > +typedef std::tuple<RSType, unsigned> typedRegStackEntry;
> > +
> > +char MIRCanonicalizer::ID;
> > +
> > +char &llvm::MIRCanonicalizerID = MIRCanonicalizer::ID;
> > +
> > +INITIALIZE_PASS_BEGIN(MIRCanonicalizer, "mir-canonicalizer",
> > +                      "Rename Register Operands Canonically", false,
> false);
> > +
> > +INITIALIZE_PASS_END(MIRCanonicalizer, "mir-canonicalizer",
> > +                    "Rename Register Operands Canonically", false,
> false);
> > +
> > +static std::vector<MachineBasicBlock *> GetRPOList(MachineFunction &MF)
> {
> > +  MachineBasicBlock *Entry = &*MF.begin();
> > +  std::vector<MachineBasicBlock*> RPOList;
> > +  ReversePostOrderTraversal<MachineBasicBlock*> RPOT(Entry);
> > +  for (ReversePostOrderTraversal<MachineBasicBlock*>::rpo_iterator
> > +       MBBI = RPOT.begin(), MBBE = RPOT.end(); MBBI != MBBE; ++MBBI) {
> > +    MachineBasicBlock *MBB = *MBBI;
> > +    RPOList.push_back(MBB);
> > +  }
> > +
> > +  return RPOList;
> > +}
> > +
> > +// Set a dummy vreg. We use this vregs register class to generate
> throw-away
> > +// vregs that are used to skip vreg numbers so that vreg numbers line
> up.
> > +static unsigned GetDummyVReg(const MachineFunction &MF) {
> > +  for (auto &MBB : MF) {
> > +    for (auto II = MBB.begin(), IE = MBB.end(); II != IE; ++II) {
> > +      const MachineInstr *MI = &*II;
> > +      for (unsigned i = 0; i < MI->getNumOperands(); i++) {
> > +        const MachineOperand &MO = MI->getOperand(i);
> > +        if (!MO.isReg() || !TargetRegisterInfo::
> isVirtualRegister(MO.getReg()))
> > +          continue;
> > +        return MO.getReg();
> > +      }
> > +    }
> > +  }
> > +
> > +  return ~0U;
> > +}
> > +
> > +static bool rescheduleCanoncally(MachineBasicBlock *MBB) {
> > +
> > +  bool Changed = false;
> > +
> > +  // Calculates the distance of MI from the begining of its parent BB.
> > +  auto getInstrIdx = [](const MachineInstr &MI) {
> > +    unsigned i = 0;
> > +    for (auto &I : *MI.getParent()) {
> > +      if (&I == &MI)
> > +        return i;
> > +      i++;
> > +    }
> > +    return ~0U;
> > +  };
> > +
> > +  // Pre-Populate vector of instructions to reschedule so that we don't
> > +  // clobber the iterator.
> > +  std::vector<MachineInstr *> instructions;
> > +  for (auto II = MBB->begin(); II != MBB->end(); ++II) {
> > +    instructions.push_back(&*II);
> > +  }
> > +
> > +  for (auto II : instructions) {
> > +    if (II->getNumOperands() == 0)
> > +      continue;
> > +
> > +    MachineOperand &MO = II->getOperand(0);
> > +    if (!MO.isReg() || !TargetRegisterInfo::
> isVirtualRegister(MO.getReg()))
> > +      continue;
> > +
> > +    DEBUG(dbgs() << "Operand " << 0 << " of "; II->dump(); MO.dump(););
> > +
> > +    MachineInstr *def = II;
> > +    unsigned distance = ~0U;
> > +    MachineInstr *useToBringDefCloserTo = nullptr;
> > +    MachineRegisterInfo *MRI = &MBB->getParent()->getRegInfo();
> > +    for (auto UI = MRI->use_begin(MO.getReg()), UE = MRI->use_end();
> > +         UI != UE; ++UI) {
> > +      MachineInstr *useInst = UI->getParent();
> > +
> > +      const unsigned defLoc = getInstrIdx(*def);
> > +      const unsigned useLoc = getInstrIdx(*useInst);
> > +      const unsigned delta = (useLoc - defLoc);
> > +
> > +      if (useInst->getParent() != def->getParent())
> > +        continue;
> > +      if (defLoc >= useLoc)
> > +        continue;
> > +
> > +      if (delta < distance) {
> > +        distance = delta;
> > +        useToBringDefCloserTo = useInst;
> > +      }
> > +    }
> > +
> > +    MachineBasicBlock::iterator defI = MBB->instr_end();
> > +    MachineBasicBlock::iterator useI = MBB->instr_end();
> > +
> > +    for (auto BBI = MBB->instr_begin(); BBI != MBB->instr_end(); ++BBI)
> {
> > +
> > +      if (defI != MBB->instr_end() && useI != MBB->instr_end())
> > +        break;
> > +
> > +      if ((&*BBI != def) && (&*BBI != useToBringDefCloserTo))
> > +        continue;
> > +
> > +      if (&*BBI == def) {
> > +        defI = BBI;
> > +        continue;
> > +      }
> > +
> > +      if (&*BBI == useToBringDefCloserTo) {
> > +        useI = BBI;
> > +        continue;
> > +      }
> > +    }
> > +
> > +    if (defI == MBB->instr_end() || useI == MBB->instr_end())
> > +      continue;
> > +
> > +    DEBUG(dbgs() << "Splicing "; defI->dump(); dbgs() << " right
> before: ";
> > +          useI->dump(););
> > +
> > +    Changed = true;
> > +    MBB->splice(useI, MBB, defI);
> > +  }
> > +
> > +  return Changed;
> > +}
> > +
> > +static std::vector<MachineInstr *> populateCandidates(MachineBasicBlock
> *MBB) {
> > +  std::vector<MachineInstr *> candidates;
> > +  MachineRegisterInfo &MRI = MBB->getParent()->getRegInfo();
> > +
> > +  for (auto II = MBB->begin(), IE = MBB->end(); II != IE; ++II) {
> > +    MachineInstr *MI = &*II;
> > +
> > +    bool doesMISideEffect = false;
> > +
> > +    if (MI->getOperand(0).isReg()) {
> > +      const unsigned dst = MI->getOperand(0).getReg();
> > +      doesMISideEffect |= !TargetRegisterInfo::isVirtualRegister(dst);
> > +
> > +      for (auto UI = MRI.use_begin(dst); UI != MRI.use_end(); ++UI) {
> > +        if (doesMISideEffect) break;
> > +        doesMISideEffect |= (UI->getParent()->getParent() !=
> MI->getParent());
> > +      }
> > +    }
> > +
> > +    if (!MI->mayStore() && !MI->isBranch() && !doesMISideEffect)
> > +      continue;
> > +
> > +    DEBUG(dbgs() << "Found Candidate:  "; MI->dump(););
> > +    candidates.push_back(MI);
> > +  }
> > +
> > +  return candidates;
> > +}
> > +
> > +void doCandidateWalk(std::vector<typedRegStackEntry> &VRegs,
> > +                     std::queue <typedRegStackEntry> &reg_queue,
> > +                     std::vector<MachineInstr *> &visitedMIs,
> > +                     const MachineBasicBlock *MBB) {
> > +
> > +  const MachineFunction &MF = *MBB->getParent();
> > +  const MachineRegisterInfo &MRI = MF.getRegInfo();
> > +
> > +  while (reg_queue.size()) {
> > +    auto regType = std::get<0>(reg_queue.front());
> > +    auto reg = std::get<1>(reg_queue.front());
> > +    reg_queue.pop();
> > +
> > +    if (regType == RSE_FrameIndex) {
> > +      DEBUG(dbgs() << "Popping frame index.\n";);
> > +      VRegs.push_back(std::make_tuple(RSE_FrameIndex, ~0U));
> > +      continue;
> > +    }
> > +
> > +    assert(regType == RSE_Reg && "Expected vreg or physreg.");
> > +
> > +    if (TargetRegisterInfo::isVirtualRegister(reg)) {
> > +      DEBUG(dbgs() << "Popping vreg ";
> > +          MRI.def_begin(reg)->dump();
> > +          dbgs() << "\n";);
> > +
> > +      bool hasVisitedVReg = false;
> > +      for (auto vreg : VRegs) {
> > +        if (std::get<1>(vreg) == reg) {
> > +          hasVisitedVReg = true;
> > +          break;
> > +        }
> > +      }
> > +
> > +      if (!hasVisitedVReg)
> > +        VRegs.push_back(std::make_tuple(RSE_Reg, reg));
> > +    } else {
> > +      DEBUG(dbgs() << "Popping physreg.\n";);
> > +      VRegs.push_back(std::make_tuple(RSE_Reg, reg));
> > +      continue;
> > +    }
> > +
> > +    for (auto RI = MRI.def_begin(reg), RE = MRI.def_end(); RI != RE;
> ++RI) {
> > +      MachineInstr *def = RI->getParent();
> > +
> > +      if (def->getParent() != MBB)
> > +        continue;
> > +
> > +      bool hasVisited = false;
> > +      for (auto VMI : visitedMIs) {
> > +        if (def == VMI) {
> > +          hasVisited = true;
> > +          break;
> > +        }
> > +      }
> > +
> > +      if (hasVisited)
> > +        break;
> > +
> > +      DEBUG(dbgs() << "\n========================\n";
> > +          dbgs() << "Visited MI: "; def->dump();
> > +          dbgs() << "BB Name: " << def->getParent()->getName() << "\n";
> > +          dbgs() << "\n========================\n";);
> > +      visitedMIs.push_back(def);
> > +      for (unsigned I = 1, E = def->getNumOperands(); I != E; ++I) {
> > +
> > +        MachineOperand &MO = def->getOperand(I);
> > +        if (MO.isFI()) {
> > +          DEBUG(dbgs() << "Pushing frame index.\n";);
> > +          reg_queue.push(std::make_tuple(RSE_FrameIndex, ~0U));
> > +        }
> > +
> > +        if (!MO.isReg())
> > +          continue;
> > +        reg_queue.push(std::make_tuple(RSE_Reg, MO.getReg()));
> > +      }
> > +    }
> > +  }
> > +}
> > +
> > +static void SkipVRegs(unsigned &VRegGapIndex, MachineRegisterInfo &MRI,
> > +                      const TargetRegisterClass *RC) {
> > +  const unsigned VR_GAP = (++VRegGapIndex * 1000);
> > +
> > +  DEBUG(dbgs() << "Adjusting per-BB VR_GAP for BB" << VRegGapIndex << "
> to "
> > +               << VR_GAP << "\n";);
> > +
> > +  unsigned I = MRI.createVirtualRegister(RC);
> > +  const unsigned E = (((I + VR_GAP) / VR_GAP) + 1) * VR_GAP;
> > +  while (I != E) {
> > +    I = MRI.createVirtualRegister(RC);
> > +  }
> > +
> > +  CANON_CHECK(I != ~0U && I != 0, "Invalid VReg");
> > +}
> > +
> > +static void GetVRegRenameMap(std::map<unsigned, unsigned>
> &vregRenameMap,
> > +                             const std::vector<typedRegStackEntry>
> &VRegs,
> > +                             const std::vector<unsigned>
> &renamedInOtherBB,
> > +                             MachineRegisterInfo &MRI,
> > +                             const TargetRegisterClass *RC) {
> > +
> > +  unsigned lastRenameReg = MRI.createVirtualRegister(RC);
> > +
> > +  bool firstCandidate = true;
> > +  for (auto vreg : VRegs) {
> > +
> > +    auto regType = std::get<0>(vreg);
> > +    auto reg = std::get<1>(vreg);
> > +
> > +    if (regType == RSE_FrameIndex) {
> > +      lastRenameReg = MRI.createVirtualRegister(RC);
> > +      DEBUG(dbgs() << "Skipping rename for FI " << lastRenameReg <<
> "\n";);
> > +      continue;
> > +    } else if (regType == RSE_NewCandidate) {
> > +      if (!firstCandidate) {
> > +        while (lastRenameReg % 10) {
> > +          lastRenameReg = MRI.createVirtualRegister(RC);
> > +
> > +          DEBUG(dbgs() << "Skipping rename for new candidate " <<
> lastRenameReg
> > +                       << "\n";);
> > +        }
> > +      }
> > +      firstCandidate = false;
> > +      continue;
> > +    } else if (!TargetRegisterInfo::isVirtualRegister(reg)) {
> > +      lastRenameReg = MRI.createVirtualRegister(RC);
> > +      DEBUG(dbgs() << "Skipping rename for Phys Reg " << lastRenameReg
> << "\n";);
> > +      continue;
> > +    }
> > +
> > +    if (std::find(renamedInOtherBB.begin(), renamedInOtherBB.end(),
> reg) !=
> > +        renamedInOtherBB.end()) {
> > +      DEBUG(dbgs() << "Vreg " << reg << " already renamed in other
> BB.\n";);
> > +      continue;
> > +    }
> > +
> > +    auto rename = MRI.createVirtualRegister(MRI.getRegClass(reg));
> > +    lastRenameReg = rename;
> > +
> > +    if (vregRenameMap.find(reg) == vregRenameMap.end()) {
> > +      DEBUG(dbgs() << "Mapping vreg ";);
> > +      if (MRI.reg_begin(reg) != MRI.reg_end()) {
> > +        DEBUG(auto foo = &*MRI.reg_begin(reg); foo->dump(););
> > +      } else {
> > +        DEBUG(dbgs() << reg;);
> > +      }
> > +      DEBUG(dbgs() << " to ";);
> > +      if (MRI.reg_begin(rename) != MRI.reg_end()) {
> > +        DEBUG(auto foo = &*MRI.reg_begin(rename); foo->dump(););
> > +      } else {
> > +        DEBUG(dbgs() << rename;);
> > +      }
> > +      DEBUG(dbgs() << "\n";);
> > +
> > +      vregRenameMap.insert(std::pair<unsigned, unsigned>(reg, rename));
> > +    }
> > +  }
> > +}
> > +
> > +static bool doVRegRenaming(std::vector<unsigned> &renamedInOtherBB,
> > +                           const std::map<unsigned, unsigned>
> &vregRenameMap,
> > +                           MachineRegisterInfo &MRI) {
> > +  bool Changed = false;
> > +  for (auto I = vregRenameMap.begin(), E = vregRenameMap.end(); I != E;
> ++I) {
> > +
> > +    auto vreg = I->first;
> > +    auto rename = I->second;
> > +
> > +    renamedInOtherBB.push_back(rename);
> > +
> > +    std::vector<MachineOperand *> renameMOs;
> > +    for (MachineOperand &MO : MRI.reg_operands(vreg)) {
> > +      renameMOs.push_back(&MO);
> > +    }
> > +
> > +    for (auto MO : renameMOs) {
> > +      Changed = true;
> > +      MO->setReg(rename);
> > +
> > +      if (!MO->isDef())
> > +        MO->setIsKill(false);
> > +    }
> > +  }
> > +
> > +  return Changed;
> > +}
> > +
> > +static bool doDefKillClear(MachineBasicBlock *MBB) {
> > +  bool Changed = false;
> > +
> > +  for (auto II = MBB->begin(), IE = MBB->end(); II != IE; ++II) {
> > +    MachineInstr *MI = &*II;
> > +
> > +    for (unsigned i = 0; i < MI->getNumOperands(); i++) {
> > +      MachineOperand &MO = MI->getOperand(i);
> > +      if (!MO.isReg())
> > +        continue;
> > +      if (!MO.isDef() && MO.isKill()) {
> > +        Changed = true;
> > +        MO.setIsKill(false);
> > +      }
> > +
> > +      if (MO.isDef() && MO.isDead()) {
> > +        Changed = true;
> > +        MO.setIsDead(false);
> > +      }
> > +    }
> > +  }
> > +
> > +  return Changed;
> > +}
> > +
> > +static bool runOnBasicBlock(MachineBasicBlock *MBB,
> > +                            std::vector<StringRef> &bbNames,
> > +                            std::vector<unsigned> &renamedInOtherBB,
> > +                            unsigned &basicBlockNum, unsigned
> &VRegGapIndex) {
> > +
> > +  if (CanonicalizeBasicBlockNumber != ~0U) {
> > +    if (CanonicalizeBasicBlockNumber != basicBlockNum++)
> > +      return false;
> > +    DEBUG(dbgs() << "\n Canonicalizing BasicBlock " << MBB->getName()
> << "\n";);
> > +  }
> > +
> > +  if (std::find(bbNames.begin(), bbNames.end(), MBB->getName()) !=
> > +      bbNames.end()) {
> > +    DEBUG(dbgs() << "Found potentially duplicate BasicBlocks: "
> > +                 << MBB->getName() << "\n";);
> > +    return false;
> > +  }
> > +
> > +  DEBUG(
> > +    dbgs() << "\n\n  NEW BASIC BLOCK: " << MBB->getName() << "  \n\n";
> > +    dbgs() << "\n\n=========================
> =======================\n\n";
> > +  );
> > +
> > +  bool Changed = false;
> > +  MachineFunction &MF = *MBB->getParent();
> > +  MachineRegisterInfo &MRI = MF.getRegInfo();
> > +
> > +  const TargetRegisterClass *dummyRC = [] (const MachineFunction &MF) {
> > +    const unsigned dummyVReg = GetDummyVReg(MF);
> > +    CANON_CHECK(dummyVReg != ~0U, "Could not find a dummy VReg.");
> > +    const MachineRegisterInfo &MRI = MF.getRegInfo();
> > +    return MRI.getRegClass(dummyVReg);
> > +  } (MF);
> > +
> > +  bbNames.push_back(MBB->getName());
> > +  DEBUG(dbgs() << "\n\n NEW BASIC BLOCK: " << MBB->getName() <<
> "\n\n";);
> > +
> > +  DEBUG(dbgs() << "MBB Before Scheduling:\n"; MBB->dump(););
> > +  Changed |= rescheduleCanoncally(MBB);
> > +  DEBUG(dbgs() << "MBB After Scheduling:\n"; MBB->dump(););
> > +
> > +  std::vector<MachineInstr *> candidates = populateCandidates(MBB);
> > +  std::vector<MachineInstr *> visitedMIs;
> > +  std::copy(candidates.begin(), candidates.end(),
> > +            std::back_inserter(visitedMIs));
> > +
> > +  std::vector<typedRegStackEntry> VRegs;
> > +  for (auto candidate : candidates) {
> > +    VRegs.push_back(std::make_tuple(RSE_NewCandidate, ~0U));
> > +
> > +    std::queue<typedRegStackEntry> reg_queue;
> > +
> > +    // Here we walk the vreg operands of a non-root node along our walk.
> > +    // The root nodes are the original candidates (stores normally).
> > +    // These are normally not the root nodes (except for the case of
> copies to
> > +    // physical registers).
> > +    for (unsigned i = 1; i < candidate->getNumOperands(); i++) {
> > +      if (candidate->mayStore() || candidate->isBranch())
> > +        break;
> > +
> > +      MachineOperand &MO = candidate->getOperand(i);
> > +      if (!(MO.isReg() && TargetRegisterInfo::
> isVirtualRegister(MO.getReg())))
> > +        continue;
> > +
> > +      DEBUG(dbgs() << "Enqueue register"; MO.dump(); dbgs() << "\n";);
> > +      reg_queue.push(std::make_tuple(RSE_Reg, MO.getReg()));
> > +    }
> > +
> > +    // Here we walk the root candidates. We start from the 0th operand
> because
> > +    // the root is normally a store to a vreg.
> > +    for (unsigned i = 0; i < candidate->getNumOperands(); i++) {
> > +
> > +      if (!candidate->mayStore() && !candidate->isBranch())
> > +        break;
> > +
> > +      MachineOperand &MO = candidate->getOperand(i);
> > +
> > +      // TODO: Do we want to only add vregs here?
> > +      if (!MO.isReg() && !MO.isFI())
> > +        continue;
> > +
> > +      DEBUG(dbgs() << "Enqueue Reg/FI"; MO.dump(); dbgs() << "\n";);
> > +
> > +      reg_queue.push(MO.isReg() ? std::make_tuple(RSE_Reg, MO.getReg())
> :
> > +                                  std::make_tuple(RSE_FrameIndex,
> ~0U));
> > +    }
> > +
> > +    // Now that we have pushed the operands of the candidate here, we
> do the
> > +    // full breadth first walk.
> > +    doCandidateWalk(VRegs, reg_queue, visitedMIs, MBB);
> > +  }
> > +
> > +  // If we have populated no vregs to rename then bail.
> > +  // The rest of this function does the vreg remaping.
> > +  if (VRegs.size() == 0)
> > +    return Changed;
> > +
> > +  // Skip some vregs, so we can recon where we'll land next.
> > +  SkipVRegs(VRegGapIndex, MRI, dummyRC);
> > +
> > +  std::map<unsigned, unsigned> vregRenameMap;
> > +  GetVRegRenameMap(vregRenameMap, VRegs, renamedInOtherBB, MRI,
> dummyRC);
> > +
> > +  Changed |= doVRegRenaming(renamedInOtherBB, vregRenameMap, MRI);
> > +  Changed |= doDefKillClear(MBB);
> > +
> > +  DEBUG(dbgs() << "Updated MachineBasicBlock:\n"; MBB->dump(); dbgs()
> << "\n";);
> > +  DEBUG(dbgs() << "\n\n=========================
> =======================\n\n");
> > +  return Changed;
> > +}
> > +
> > +bool MIRCanonicalizer::runOnMachineFunction(MachineFunction &MF) {
> > +
> > +  static unsigned functionNum = 0;
> > +  if (CanonicalizeFunctionNumber != ~0U) {
> > +    if (CanonicalizeFunctionNumber != functionNum++)
> > +      return false;
> > +    DEBUG(dbgs() << "\n Canonicalizing Function " << MF.getName() <<
> "\n";);
> > +  }
> > +
> > +  // we need a valid vreg to create a vreg type for skipping all those
> > +  // stray vreg numbers so reach alignment/canonical vreg values.
> > +  std::vector<MachineBasicBlock*> RPOList = GetRPOList(MF);
> > +
> > +  DEBUG(
> > +    dbgs() << "\n\n  NEW MACHINE FUNCTION: " << MF.getName() << "
> \n\n";
> > +    dbgs() << "\n\n=========================
> =======================\n\n";
> > +    dbgs() << "Total Basic Blocks: " << RPOList.size() << "\n";
> > +    for (auto MBB : RPOList) {
> > +      dbgs() << MBB->getName() << "\n";
> > +    }
> > +    dbgs() << "\n\n=========================
> =======================\n\n";
> > +  );
> > +
> > +  std::vector<StringRef> bbNames;
> > +  std::vector<unsigned> renamedInOtherBB;
> > +
> > +  unsigned gapIdx = 0;
> > +  unsigned bbNum = 0;
> > +
> > +  bool Changed = false;
> > +
> > +  for (auto MBB : RPOList)
> > +    Changed |= runOnBasicBlock(MBB, bbNames, renamedInOtherBB, bbNum,
> gapIdx);
> > +
> > +  return Changed;
> > +}
> > +
> > diff --git a/tools/mir-canon/mir-canon.cpp b/tools/mir-canon/mir-canon.
> cpp
> > new file mode 100644
> > index 00000000000..bfe53c01906
> > --- /dev/null
> > +++ b/tools/mir-canon/mir-canon.cpp
> > @@ -0,0 +1,175 @@
> > +//===------ mir-canon.cpp - Implement the LLVM Native Code Generator
> ------===//
> > +//
> > +//                     The LLVM Compiler Infrastructure
> > +//
> > +// This file is distributed under the University of Illinois Open Source
> > +// License. See LICENSE.TXT for details.
> > +//
> > +//===------------------------------------------------------
> ----------------===//
> > +//
> > +// This is the mir-canon driver. It invokes the MIR canonicalization
> pass.
> > +//
> > +//===------------------------------------------------------
> ----------------===//
> > +
> > +#include "llvm/CodeGen/CommandFlags.h"
> > +#include "llvm/CodeGen/TargetPassConfig.h"
> > +#include "llvm/CodeGen/MachineModuleInfo.h"
> > +#include "llvm/CodeGen/MachineFunctionPass.h"
> > +#include "llvm/CodeGen/MIRParser/MIRParser.h"
> > +
> > +#include "llvm/IR/LLVMContext.h"
> > +#include "llvm/IR/DiagnosticInfo.h"
> > +#include "llvm/IR/IRPrintingPasses.h"
> > +#include "llvm/IR/DiagnosticPrinter.h"
> > +#include "llvm/IR/LegacyPassManager.h"
> > +
> > +#include "llvm/Support/Signals.h"
> > +#include "llvm/Support/FileSystem.h"
> > +#include "llvm/Support/TargetSelect.h"
> > +#include "llvm/Support/TargetRegistry.h"
> > +#include "llvm/Support/ToolOutputFile.h"
> > +#include "llvm/Support/FormattedStream.h"
> > +#include "llvm/Support/PrettyStackTrace.h"
> > +
> > +#include "llvm/Target/TargetMachine.h"
> > +#include "llvm/Target/TargetSubtargetInfo.h"
> > +
> > +#include "Common.h"
> > +
> > +using namespace llvm;
> > +
> > +static cl::opt<std::string>
> > +    InputFilename(cl::Positional, cl::desc("<input.mir>"),
> cl::init("-"));
> > +
> > +static cl::opt<std::string> OutputFilename("o", cl::desc("Output
> filename"),
> > +                                           cl::value_desc("canon.mir"));
> > +
> > +static cl::opt<std::string> TargetTriple("mtriple",
> > +                                         cl::desc("Override target
> triple."));
> > +
> > +int main(int argc, char **argv) {
> > +  sys::PrintStackTraceOnErrorSignal(argv[0]);
> > +  PrettyStackTraceProgram X(argc, argv);
> > +
> > +  EnableDebugBuffering = true;
> > +
> > +  LLVMContext Context;
> > +  llvm_shutdown_obj Y;
> > +
> > +  // Initialize targets first, so that --version shows registered
> targets.
> > +  InitializeAllTargets();
> > +  InitializeAllTargetMCs();
> > +  InitializeAllAsmPrinters();
> > +  InitializeAllAsmParsers();
> > +
> > +  PassRegistry *Registry = PassRegistry::getPassRegistry();
> > +  initializeCore(*Registry);
> > +  initializeCodeGen(*Registry);
> > +
> > +  initializeMIRCanonicalizerPass(*Registry);
> > +
> > +  // Register the target printer for --version.
> > +  cl::AddExtraVersionPrinter(TargetRegistry::
> printRegisteredTargetsForVersion);
> > +
> > +  cl::ParseCommandLineOptions(argc, argv, "llvm system compiler\n");
> > +
> > +  CANON_CHECK(StringRef(InputFilename).endswith_lower(".mir"),
> > +              "Invalid Filename: " + InputFilename + ". " +
> > +              "Expected a .mir file extension.");
> > +
> > +  // Set a diagnostic handler that doesn't exit on the first error
> > +  bool HasError = false;
> > +  Context.setDiagnosticHandler(
> > +    [](const DiagnosticInfo &DI, void *Context) {
> > +      if (DI.getSeverity() == DS_Error) {
> > +        bool *HasError = static_cast<bool *>(Context);
> > +        *HasError = true;
> > +      }
> > +
> > +      auto *Remark = dyn_cast<DiagnosticInfoOptimizationBase>(&DI);
> > +      if (Remark && !Remark->isEnabled())
> > +        return;
> > +
> > +      DiagnosticPrinterRawOStream DP(errs());
> > +      errs() << LLVMContext::getDiagnosticMessagePrefix(DI.
> getSeverity());
> > +      errs() << ": ";
> > +      DI.print(DP);
> > +      errs() << "\n";
> > +    },
> > +    &HasError);
> > +
> > +  SMDiagnostic Err;
> > +  auto MIR = createMIRParserFromFile(InputFilename, Err, Context);
> > +  auto M = MIR ? MIR->parseIRModule() : nullptr;
> > +
> > +  CANON_CHECK(M, argv[0] + "Invalid Module or MIR.");
> > +
> > +  if (!TargetTriple.empty())
> > +    M->setTargetTriple(Triple::normalize(TargetTriple));
> > +
> > +  Triple TheTriple = Triple(M->getTargetTriple());
> > +  if (TheTriple.getTriple().empty())
> > +    TheTriple.setTriple(sys::getDefaultTargetTriple());
> > +
> > +  // Get the target specific parser.
> > +  std::string Error;
> > +  const Target *TheTarget =
> > +      TargetRegistry::lookupTarget(MArch, TheTriple, Error);
> > +
> > +  CANON_CHECK(TheTarget, argv[0] + ": " + Error);
> > +
> > +  std::string CPUStr = getCPUStr();
> > +  std::string FeaturesStr = getFeaturesStr();
> > +
> > +  std::unique_ptr<TargetMachine> Target(TheTarget->createTargetMachine(
> > +      TheTriple.getTriple(), CPUStr, FeaturesStr,
> > +      InitTargetOptionsFromCodeGenFlags(), getRelocModel(),
> getCodeModel(),
> > +      CodeGenOpt::Default));
> > +
> > +  CANON_CHECK(Target, "Could not allocate target machine!");
> > +
> > +  std::error_code EC;
> > +  auto Out = llvm::make_unique<tool_output_file>(OutputFilename, EC,
> > +                                                 sys::fs::F_Text);
> > +  CANON_CHECK(!EC, EC.message());
> > +  raw_pwrite_stream *OS = &Out->os();
> > +
> > +  legacy::PassManager PM;
> > +  TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple()));
> > +  PM.add(new TargetLibraryInfoWrapperPass(TLII));
> > +  M->setDataLayout(Target->createDataLayout());
> > +  setFunctionAttributes(CPUStr, FeaturesStr, *M);
> > +
> > +  LLVMTargetMachine &TM = static_cast<LLVMTargetMachine &>(*Target);
> > +  TargetPassConfig &TPC = *TM.createPassConfig(PM);
> > +  PM.add(&TPC);
> > +  MachineModuleInfo *MMI = new MachineModuleInfo(&TM);
> > +  PM.add(MMI);
> > +  TPC.printAndVerify("");
> > +
> > +  StringRef PassName = "mir-canonicalizer";
> > +  const PassRegistry *PR = PassRegistry::getPassRegistry();
> > +  const PassInfo *PI = PR->getPassInfo(PassName);
> > +
> > +  CANON_CHECK(PI, argv[0] + ": " + PI->getPassName() + " not
> registered.");
> > +  CANON_CHECK(PI->getNormalCtor(),
> > +              argv[0] + ": can't create pass: " + PI->getPassName());
> > +
> > +  Pass *P = PI->getNormalCtor()();
> > +  PM.add(P);
> > +  TPC.printAndVerify(std::string("After ") +
> std::string(P->getPassName()));
> > +
> > +  PM.add(createPrintMIRPass(*OS));
> > +
> > +  // Before executing passes, print the final values of the LLVM
> options.
> > +  cl::PrintOptionValues();
> > +
> > +  PM.run(*M);
> > +
> > +  CANON_CHECK(!(*static_cast<bool *>(Context.getDiagnosticContext())),
> > +              argv[0] + " Error: exiting...");
> > +
> > +  Out->keep();
> > +  return 0;
> > +}
> > +
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170906/b1e1642c/attachment-0001.html>