[llvm-dev] [RFC] mir-canon: A new tool for canonicalizing MIR for cleaner diffing.
Puyan Lotfi via llvm-dev
llvm-dev at lists.llvm.org
Wed Sep 6 08:08:25 PDT 2017
On Tue, Aug 29, 2017 at 2:17 PM, Justin Bogner <mail at justinbogner.com>
wrote:
> Puyan Lotfi via llvm-dev <llvm-dev at lists.llvm.org> writes:
> > mir-canon: A new tool for canonicalizing MIR for cleaner diffing.
> >
> > Problem Statement and Context:
> >
> > Diff-tools are regularly used for comparing IR and assembly files. This
> often
> > involves reasoning through differences that are semantically equivalent
> and it
> > can be very time consuming for a person to do said reasoning.
> >
> > Specifically in the context of GlobalISel development there are problems
> of
> > correctness verification. There is a need to compare two programs,
> compiled
> > from identical IR by two different instruction selectors in a way where
> the
> > true differences stand out.
> >
> > Proposal:
> >
> > We propose a new tool that we have tentatively named 'mir-canon' that
> performs
> > canonical transformations on MIR. The goal is for MIR pre-processed with
> > mir-canon to show fewer differences than if it were not pre-processed.
>
> This sounds great. One place I'm hoping to use this for is in ISel
> fuzzing - we can currently catch crashes and machine verifier errors,
> but comparing SDAG to GlobalISel is a space that I'd really like to
> explore.
>
Thats exactly what I'm trying to do. I'm still trying to think of good ways
to narrow down the set of functions and basic blocks that get
canonicalized. I have some opt flags to control that but I think the
control could be better.
>
> > At the time of this writing we have a prototype canonicalization tool.
> We’ve
> > come up with some techniques that show promise and would like to open
> > discussion with the community to get feedback and suggestions on
> refining said
> > techniques. Currently we think of this as an open ended project.
> >
> > Techniques:
> >
> > Our prototype does the following for each basic block in a Reverse Post
> > Ordering:
> >
> > * Canonical instruction ordering is done by moving a given instruction
> as
> > close to the nearest use of its def as possible.
> >
> > * Next, canonical VReg renaming is done by building a collection of
> > candidate instructions that can be thought of as sinks in the def-use
> > graph: they are typically instructions that write to physical
> registers or
> > store to memory. These candidates are used as the root of a breadth
> first
> > walk over the vreg operand def-use graph that determines a canonical
> vreg
> > ordering.
> >
> > * Using said canonical vreg ordering we rename monotonically, but
> before we
> > do this we skip several vreg values in order to increase the chance
> that
> > we land on the same vreg number for two different input MIR files.
>
> This is one place where handling multiple files at once would help,
> correct?
>
>
I think so. I think if I were processing two files at once I could
increment the vreg count by a more exact number between basic blocks
processed and between def-use walks.
> > We also do this to reduce the chances that a difference in
> > previously def-use walks will affect the vreg renaming for
> > subsequent walks. This skipping step could be thought of as a kind
> > of vreg number reckoning: we skip modulo n vregs so that we are
> > likely to land on the same vreg for two different files.
> >
> > This approach is completely agnostic of ISA specific semantics so it
> should
> > work for any target.
> >
> > Current status:
> >
> > At the moment we have a standalone llvm tool that uses a single pass to
> do
> > the above described transformations. We have test inputs that show
> promise
> > but we still need a wider set of tests as well as hard metrics.
>
> This sounds like a great start, but I feel like this design won't quite
> scale to all the ways we want to use this.
>
> Consider, if this was a pass in the LLVM library rather than only
> accessible in a tool, you could potentially set it up to run at the end
> of an `llc` run. Similarly, if I want to use this in a fuzzer it would
> be nice to be able to do the canonicalization in memory.
>
>
I used to have it as an LLVM pass but a while ago I thought it could be
more prudent to have contained in a separate tool as it wasn't actually for
optimization or codegen. I can easily reintegrate it as an LLVM pass. I'll
send out another patch.
> > Our approach processes a single file at a time. The primary benefit to
> this
> > approach is lower complexity in initial implementation and exploration of
> > building such a tool. We are open to other approaches such as an
> llvm-diff
> > like (two file at a time) approach, but we have not explored that avenue
> > fully yet.
> >
> > We’re eager to hear community feedback and will be ready to share patches
> > for review soon.
> ...
> > Patch for review.
>
> Please send the patch to llvm-commits (and refer back to this thread).
> Patch review is too much traffic for llvm-dev.
>
>
Great, thanks!
> > diff --git a/tools/mir-canon/CMakeLists.txt
> b/tools/mir-canon/CMakeLists.txt
> > new file mode 100644
> > index 00000000000..eb6722e4bd4
> > --- /dev/null
> > +++ b/tools/mir-canon/CMakeLists.txt
> > @@ -0,0 +1,28 @@
> > +set(LLVM_LINK_COMPONENTS
> > + ${LLVM_TARGETS_TO_BUILD}
> > + Analysis
> > + AsmPrinter
> > + CodeGen
> > + Core
> > + IRReader
> > + MC
> > + MIRParser
> > + ScalarOpts
> > + SelectionDAG
> > + Support
> > + Target
> > + TransformUtils
> > + Vectorize
> > + )
> > +
> > +# Support plugins.
> > +set(LLVM_NO_DEAD_STRIP 1)
> > +
> > +add_llvm_tool(mir-canon
> > + mir-canon.cpp
> > + MIRCanonicalizerPass.cpp
> > +
> > + DEPENDS
> > + intrinsics_gen
> > + )
> > +export_executable_symbols(mir-canon)
> > diff --git a/tools/mir-canon/Common.h b/tools/mir-canon/Common.h
> > new file mode 100644
> > index 00000000000..3e6be7ef643
> > --- /dev/null
> > +++ b/tools/mir-canon/Common.h
> > @@ -0,0 +1,21 @@
> > +#include <cassert>
> > +
> > +namespace llvm {
> > +/// The purpose of this pass is to establish a canonical operand naming
> so
> > +/// that code compiled with slightly different IR passes can be diffed
> more
> > +/// effectively than otherwise. This is done by renaming vregs in a
> given
> > +/// LiveRange in a canonical way.
> > +extern char &MIRCanonicalizerID;
> > +
> > +void initializeMIRCanonicalizerPass(PassRegistry &);
> > +
> > +} // namespace llvm
> > +
> > +#define CANON_CHECK(CHK, MSG) \
> > + do { \
> > + if ((CHK)) break; \
> > + errs() << (std::string("") + MSG) << '\n'; \
> > + assert(false && "CANON_CHECK FAILURE."); \
> > + exit(EXIT_FAILURE); \
> > + } while (false)
> > +
> > diff --git a/tools/mir-canon/LLVMBuild.txt b/tools/mir-canon/LLVMBuild.
> txt
> > new file mode 100644
> > index 00000000000..5d3e6bb5a91
> > --- /dev/null
> > +++ b/tools/mir-canon/LLVMBuild.txt
> > @@ -0,0 +1,22 @@
> > +;===- ./tools/mir-canon/LLVMBuild.txt --------------------------*-
> Conf -*--===;
> > +;
> > +; The LLVM Compiler Infrastructure
> > +;
> > +; This file is distributed under the University of Illinois Open Source
> > +; License. See LICENSE.TXT for details.
> > +;
> > +;===-------------------------------------------------------
> -----------------===;
> > +;
> > +; This is an LLVMBuild description file for the components in this
> subdirectory.
> > +;
> > +; For more information on the LLVMBuild system, please see:
> > +;
> > +; http://llvm.org/docs/LLVMBuild.html
> > +;
> > +;===-------------------------------------------------------
> -----------------===;
> > +
> > +[component_0]
> > +type = Tool
> > +name = mir-canon
> > +parent = Tools
> > +required_libraries = AsmParser BitReader IRReader MIRParser
> TransformUtils Scalar Vectorize all-targets
> > diff --git a/tools/mir-canon/MIRCanonicalizerPass.cpp b/tools/mir-canon/
> MIRCanonicalizerPass.cpp
> > new file mode 100644
> > index 00000000000..a4a0a7a0e5f
> > --- /dev/null
> > +++ b/tools/mir-canon/MIRCanonicalizerPass.cpp
> > @@ -0,0 +1,588 @@
> > +//===-------------- MIRCanonicalizer.cpp - MIR Canonicalizer
> --------------===//
> > +//
> > +// The LLVM Compiler Infrastructure
> > +//
> > +// This file is distributed under the University of Illinois Open Source
> > +// License. See LICENSE.TXT for details.
> > +//
> > +//===------------------------------------------------------
> ----------------===//
> > +//
> > +// Reorders instructions canonically.
> > +// Renames virtual register operands canonically.
> > +// Strips certain MIR artifacts (optionally).
> > +//
> > +//===------------------------------------------------------
> ----------------===//
> > +
> > +#include "llvm/CodeGen/Passes.h"
> > +#include "llvm/CodeGen/MachineInstrBuilder.h"
> > +#include "llvm/CodeGen/MachineFunctionPass.h"
> > +#include "llvm/CodeGen/MachineRegisterInfo.h"
> > +
> > +#include "llvm/Support/raw_ostream.h"
> > +#include "llvm/ADT/PostOrderIterator.h"
> > +#include "llvm/Target/TargetInstrInfo.h"
> > +
> > +#include <tuple>
> > +#include <queue>
> > +
> > +using namespace llvm;
> > +
> > +#include "Common.h"
> > +
> > +#define DEBUG_TYPE "mir-canonicalizer"
> > +
> > +static cl::opt<unsigned>
> > +CanonicalizeFunctionNumber("canon-nth-function", cl::Hidden,
> cl::init(~0u),
> > + cl::value_desc("N"),
> > + cl::desc("Function number to
> canonicalize."));
> > +
> > +static cl::opt<unsigned>
> > +CanonicalizeBasicBlockNumber("canon-nth-basicblock", cl::Hidden,
> cl::init(~0u),
> > + cl::value_desc("N"),
> > + cl::desc("BasicBlock number to
> canonicalize."));
> > +
> > +namespace {
> > +
> > +class MIRCanonicalizer : public MachineFunctionPass {
> > +public:
> > + static char ID;
> > + MIRCanonicalizer() : MachineFunctionPass(ID) {}
> > +
> > + StringRef getPassName() const override {
> > + return "Rename register operands in a canonical ordering.";
> > + }
> > +
> > + void getAnalysisUsage(AnalysisUsage &AU) const override {
> > + AU.setPreservesCFG();
> > + MachineFunctionPass::getAnalysisUsage(AU);
> > + }
> > +
> > + bool runOnMachineFunction(MachineFunction &MF) override;
> > +};
> > +
> > +} // end anonymous namespace
> > +
> > +enum RSType { RSE_Reg = 0, RSE_FrameIndex, RSE_NewCandidate };
> > +typedef std::tuple<RSType, unsigned> typedRegStackEntry;
> > +
> > +char MIRCanonicalizer::ID;
> > +
> > +char &llvm::MIRCanonicalizerID = MIRCanonicalizer::ID;
> > +
> > +INITIALIZE_PASS_BEGIN(MIRCanonicalizer, "mir-canonicalizer",
> > + "Rename Register Operands Canonically", false,
> false);
> > +
> > +INITIALIZE_PASS_END(MIRCanonicalizer, "mir-canonicalizer",
> > + "Rename Register Operands Canonically", false,
> false);
> > +
> > +static std::vector<MachineBasicBlock *> GetRPOList(MachineFunction &MF)
> {
> > + MachineBasicBlock *Entry = &*MF.begin();
> > + std::vector<MachineBasicBlock*> RPOList;
> > + ReversePostOrderTraversal<MachineBasicBlock*> RPOT(Entry);
> > + for (ReversePostOrderTraversal<MachineBasicBlock*>::rpo_iterator
> > + MBBI = RPOT.begin(), MBBE = RPOT.end(); MBBI != MBBE; ++MBBI) {
> > + MachineBasicBlock *MBB = *MBBI;
> > + RPOList.push_back(MBB);
> > + }
> > +
> > + return RPOList;
> > +}
> > +
> > +// Set a dummy vreg. We use this vregs register class to generate
> throw-away
> > +// vregs that are used to skip vreg numbers so that vreg numbers line
> up.
> > +static unsigned GetDummyVReg(const MachineFunction &MF) {
> > + for (auto &MBB : MF) {
> > + for (auto II = MBB.begin(), IE = MBB.end(); II != IE; ++II) {
> > + const MachineInstr *MI = &*II;
> > + for (unsigned i = 0; i < MI->getNumOperands(); i++) {
> > + const MachineOperand &MO = MI->getOperand(i);
> > + if (!MO.isReg() || !TargetRegisterInfo::
> isVirtualRegister(MO.getReg()))
> > + continue;
> > + return MO.getReg();
> > + }
> > + }
> > + }
> > +
> > + return ~0U;
> > +}
> > +
> > +static bool rescheduleCanoncally(MachineBasicBlock *MBB) {
> > +
> > + bool Changed = false;
> > +
> > + // Calculates the distance of MI from the begining of its parent BB.
> > + auto getInstrIdx = [](const MachineInstr &MI) {
> > + unsigned i = 0;
> > + for (auto &I : *MI.getParent()) {
> > + if (&I == &MI)
> > + return i;
> > + i++;
> > + }
> > + return ~0U;
> > + };
> > +
> > + // Pre-Populate vector of instructions to reschedule so that we don't
> > + // clobber the iterator.
> > + std::vector<MachineInstr *> instructions;
> > + for (auto II = MBB->begin(); II != MBB->end(); ++II) {
> > + instructions.push_back(&*II);
> > + }
> > +
> > + for (auto II : instructions) {
> > + if (II->getNumOperands() == 0)
> > + continue;
> > +
> > + MachineOperand &MO = II->getOperand(0);
> > + if (!MO.isReg() || !TargetRegisterInfo::
> isVirtualRegister(MO.getReg()))
> > + continue;
> > +
> > + DEBUG(dbgs() << "Operand " << 0 << " of "; II->dump(); MO.dump(););
> > +
> > + MachineInstr *def = II;
> > + unsigned distance = ~0U;
> > + MachineInstr *useToBringDefCloserTo = nullptr;
> > + MachineRegisterInfo *MRI = &MBB->getParent()->getRegInfo();
> > + for (auto UI = MRI->use_begin(MO.getReg()), UE = MRI->use_end();
> > + UI != UE; ++UI) {
> > + MachineInstr *useInst = UI->getParent();
> > +
> > + const unsigned defLoc = getInstrIdx(*def);
> > + const unsigned useLoc = getInstrIdx(*useInst);
> > + const unsigned delta = (useLoc - defLoc);
> > +
> > + if (useInst->getParent() != def->getParent())
> > + continue;
> > + if (defLoc >= useLoc)
> > + continue;
> > +
> > + if (delta < distance) {
> > + distance = delta;
> > + useToBringDefCloserTo = useInst;
> > + }
> > + }
> > +
> > + MachineBasicBlock::iterator defI = MBB->instr_end();
> > + MachineBasicBlock::iterator useI = MBB->instr_end();
> > +
> > + for (auto BBI = MBB->instr_begin(); BBI != MBB->instr_end(); ++BBI)
> {
> > +
> > + if (defI != MBB->instr_end() && useI != MBB->instr_end())
> > + break;
> > +
> > + if ((&*BBI != def) && (&*BBI != useToBringDefCloserTo))
> > + continue;
> > +
> > + if (&*BBI == def) {
> > + defI = BBI;
> > + continue;
> > + }
> > +
> > + if (&*BBI == useToBringDefCloserTo) {
> > + useI = BBI;
> > + continue;
> > + }
> > + }
> > +
> > + if (defI == MBB->instr_end() || useI == MBB->instr_end())
> > + continue;
> > +
> > + DEBUG(dbgs() << "Splicing "; defI->dump(); dbgs() << " right
> before: ";
> > + useI->dump(););
> > +
> > + Changed = true;
> > + MBB->splice(useI, MBB, defI);
> > + }
> > +
> > + return Changed;
> > +}
> > +
> > +static std::vector<MachineInstr *> populateCandidates(MachineBasicBlock
> *MBB) {
> > + std::vector<MachineInstr *> candidates;
> > + MachineRegisterInfo &MRI = MBB->getParent()->getRegInfo();
> > +
> > + for (auto II = MBB->begin(), IE = MBB->end(); II != IE; ++II) {
> > + MachineInstr *MI = &*II;
> > +
> > + bool doesMISideEffect = false;
> > +
> > + if (MI->getOperand(0).isReg()) {
> > + const unsigned dst = MI->getOperand(0).getReg();
> > + doesMISideEffect |= !TargetRegisterInfo::isVirtualRegister(dst);
> > +
> > + for (auto UI = MRI.use_begin(dst); UI != MRI.use_end(); ++UI) {
> > + if (doesMISideEffect) break;
> > + doesMISideEffect |= (UI->getParent()->getParent() !=
> MI->getParent());
> > + }
> > + }
> > +
> > + if (!MI->mayStore() && !MI->isBranch() && !doesMISideEffect)
> > + continue;
> > +
> > + DEBUG(dbgs() << "Found Candidate: "; MI->dump(););
> > + candidates.push_back(MI);
> > + }
> > +
> > + return candidates;
> > +}
> > +
> > +void doCandidateWalk(std::vector<typedRegStackEntry> &VRegs,
> > + std::queue <typedRegStackEntry> ®_queue,
> > + std::vector<MachineInstr *> &visitedMIs,
> > + const MachineBasicBlock *MBB) {
> > +
> > + const MachineFunction &MF = *MBB->getParent();
> > + const MachineRegisterInfo &MRI = MF.getRegInfo();
> > +
> > + while (reg_queue.size()) {
> > + auto regType = std::get<0>(reg_queue.front());
> > + auto reg = std::get<1>(reg_queue.front());
> > + reg_queue.pop();
> > +
> > + if (regType == RSE_FrameIndex) {
> > + DEBUG(dbgs() << "Popping frame index.\n";);
> > + VRegs.push_back(std::make_tuple(RSE_FrameIndex, ~0U));
> > + continue;
> > + }
> > +
> > + assert(regType == RSE_Reg && "Expected vreg or physreg.");
> > +
> > + if (TargetRegisterInfo::isVirtualRegister(reg)) {
> > + DEBUG(dbgs() << "Popping vreg ";
> > + MRI.def_begin(reg)->dump();
> > + dbgs() << "\n";);
> > +
> > + bool hasVisitedVReg = false;
> > + for (auto vreg : VRegs) {
> > + if (std::get<1>(vreg) == reg) {
> > + hasVisitedVReg = true;
> > + break;
> > + }
> > + }
> > +
> > + if (!hasVisitedVReg)
> > + VRegs.push_back(std::make_tuple(RSE_Reg, reg));
> > + } else {
> > + DEBUG(dbgs() << "Popping physreg.\n";);
> > + VRegs.push_back(std::make_tuple(RSE_Reg, reg));
> > + continue;
> > + }
> > +
> > + for (auto RI = MRI.def_begin(reg), RE = MRI.def_end(); RI != RE;
> ++RI) {
> > + MachineInstr *def = RI->getParent();
> > +
> > + if (def->getParent() != MBB)
> > + continue;
> > +
> > + bool hasVisited = false;
> > + for (auto VMI : visitedMIs) {
> > + if (def == VMI) {
> > + hasVisited = true;
> > + break;
> > + }
> > + }
> > +
> > + if (hasVisited)
> > + break;
> > +
> > + DEBUG(dbgs() << "\n========================\n";
> > + dbgs() << "Visited MI: "; def->dump();
> > + dbgs() << "BB Name: " << def->getParent()->getName() << "\n";
> > + dbgs() << "\n========================\n";);
> > + visitedMIs.push_back(def);
> > + for (unsigned I = 1, E = def->getNumOperands(); I != E; ++I) {
> > +
> > + MachineOperand &MO = def->getOperand(I);
> > + if (MO.isFI()) {
> > + DEBUG(dbgs() << "Pushing frame index.\n";);
> > + reg_queue.push(std::make_tuple(RSE_FrameIndex, ~0U));
> > + }
> > +
> > + if (!MO.isReg())
> > + continue;
> > + reg_queue.push(std::make_tuple(RSE_Reg, MO.getReg()));
> > + }
> > + }
> > + }
> > +}
> > +
> > +static void SkipVRegs(unsigned &VRegGapIndex, MachineRegisterInfo &MRI,
> > + const TargetRegisterClass *RC) {
> > + const unsigned VR_GAP = (++VRegGapIndex * 1000);
> > +
> > + DEBUG(dbgs() << "Adjusting per-BB VR_GAP for BB" << VRegGapIndex << "
> to "
> > + << VR_GAP << "\n";);
> > +
> > + unsigned I = MRI.createVirtualRegister(RC);
> > + const unsigned E = (((I + VR_GAP) / VR_GAP) + 1) * VR_GAP;
> > + while (I != E) {
> > + I = MRI.createVirtualRegister(RC);
> > + }
> > +
> > + CANON_CHECK(I != ~0U && I != 0, "Invalid VReg");
> > +}
> > +
> > +static void GetVRegRenameMap(std::map<unsigned, unsigned>
> &vregRenameMap,
> > + const std::vector<typedRegStackEntry>
> &VRegs,
> > + const std::vector<unsigned>
> &renamedInOtherBB,
> > + MachineRegisterInfo &MRI,
> > + const TargetRegisterClass *RC) {
> > +
> > + unsigned lastRenameReg = MRI.createVirtualRegister(RC);
> > +
> > + bool firstCandidate = true;
> > + for (auto vreg : VRegs) {
> > +
> > + auto regType = std::get<0>(vreg);
> > + auto reg = std::get<1>(vreg);
> > +
> > + if (regType == RSE_FrameIndex) {
> > + lastRenameReg = MRI.createVirtualRegister(RC);
> > + DEBUG(dbgs() << "Skipping rename for FI " << lastRenameReg <<
> "\n";);
> > + continue;
> > + } else if (regType == RSE_NewCandidate) {
> > + if (!firstCandidate) {
> > + while (lastRenameReg % 10) {
> > + lastRenameReg = MRI.createVirtualRegister(RC);
> > +
> > + DEBUG(dbgs() << "Skipping rename for new candidate " <<
> lastRenameReg
> > + << "\n";);
> > + }
> > + }
> > + firstCandidate = false;
> > + continue;
> > + } else if (!TargetRegisterInfo::isVirtualRegister(reg)) {
> > + lastRenameReg = MRI.createVirtualRegister(RC);
> > + DEBUG(dbgs() << "Skipping rename for Phys Reg " << lastRenameReg
> << "\n";);
> > + continue;
> > + }
> > +
> > + if (std::find(renamedInOtherBB.begin(), renamedInOtherBB.end(),
> reg) !=
> > + renamedInOtherBB.end()) {
> > + DEBUG(dbgs() << "Vreg " << reg << " already renamed in other
> BB.\n";);
> > + continue;
> > + }
> > +
> > + auto rename = MRI.createVirtualRegister(MRI.getRegClass(reg));
> > + lastRenameReg = rename;
> > +
> > + if (vregRenameMap.find(reg) == vregRenameMap.end()) {
> > + DEBUG(dbgs() << "Mapping vreg ";);
> > + if (MRI.reg_begin(reg) != MRI.reg_end()) {
> > + DEBUG(auto foo = &*MRI.reg_begin(reg); foo->dump(););
> > + } else {
> > + DEBUG(dbgs() << reg;);
> > + }
> > + DEBUG(dbgs() << " to ";);
> > + if (MRI.reg_begin(rename) != MRI.reg_end()) {
> > + DEBUG(auto foo = &*MRI.reg_begin(rename); foo->dump(););
> > + } else {
> > + DEBUG(dbgs() << rename;);
> > + }
> > + DEBUG(dbgs() << "\n";);
> > +
> > + vregRenameMap.insert(std::pair<unsigned, unsigned>(reg, rename));
> > + }
> > + }
> > +}
> > +
> > +static bool doVRegRenaming(std::vector<unsigned> &renamedInOtherBB,
> > + const std::map<unsigned, unsigned>
> &vregRenameMap,
> > + MachineRegisterInfo &MRI) {
> > + bool Changed = false;
> > + for (auto I = vregRenameMap.begin(), E = vregRenameMap.end(); I != E;
> ++I) {
> > +
> > + auto vreg = I->first;
> > + auto rename = I->second;
> > +
> > + renamedInOtherBB.push_back(rename);
> > +
> > + std::vector<MachineOperand *> renameMOs;
> > + for (MachineOperand &MO : MRI.reg_operands(vreg)) {
> > + renameMOs.push_back(&MO);
> > + }
> > +
> > + for (auto MO : renameMOs) {
> > + Changed = true;
> > + MO->setReg(rename);
> > +
> > + if (!MO->isDef())
> > + MO->setIsKill(false);
> > + }
> > + }
> > +
> > + return Changed;
> > +}
> > +
> > +static bool doDefKillClear(MachineBasicBlock *MBB) {
> > + bool Changed = false;
> > +
> > + for (auto II = MBB->begin(), IE = MBB->end(); II != IE; ++II) {
> > + MachineInstr *MI = &*II;
> > +
> > + for (unsigned i = 0; i < MI->getNumOperands(); i++) {
> > + MachineOperand &MO = MI->getOperand(i);
> > + if (!MO.isReg())
> > + continue;
> > + if (!MO.isDef() && MO.isKill()) {
> > + Changed = true;
> > + MO.setIsKill(false);
> > + }
> > +
> > + if (MO.isDef() && MO.isDead()) {
> > + Changed = true;
> > + MO.setIsDead(false);
> > + }
> > + }
> > + }
> > +
> > + return Changed;
> > +}
> > +
> > +static bool runOnBasicBlock(MachineBasicBlock *MBB,
> > + std::vector<StringRef> &bbNames,
> > + std::vector<unsigned> &renamedInOtherBB,
> > + unsigned &basicBlockNum, unsigned
> &VRegGapIndex) {
> > +
> > + if (CanonicalizeBasicBlockNumber != ~0U) {
> > + if (CanonicalizeBasicBlockNumber != basicBlockNum++)
> > + return false;
> > + DEBUG(dbgs() << "\n Canonicalizing BasicBlock " << MBB->getName()
> << "\n";);
> > + }
> > +
> > + if (std::find(bbNames.begin(), bbNames.end(), MBB->getName()) !=
> > + bbNames.end()) {
> > + DEBUG(dbgs() << "Found potentially duplicate BasicBlocks: "
> > + << MBB->getName() << "\n";);
> > + return false;
> > + }
> > +
> > + DEBUG(
> > + dbgs() << "\n\n NEW BASIC BLOCK: " << MBB->getName() << " \n\n";
> > + dbgs() << "\n\n=========================
> =======================\n\n";
> > + );
> > +
> > + bool Changed = false;
> > + MachineFunction &MF = *MBB->getParent();
> > + MachineRegisterInfo &MRI = MF.getRegInfo();
> > +
> > + const TargetRegisterClass *dummyRC = [] (const MachineFunction &MF) {
> > + const unsigned dummyVReg = GetDummyVReg(MF);
> > + CANON_CHECK(dummyVReg != ~0U, "Could not find a dummy VReg.");
> > + const MachineRegisterInfo &MRI = MF.getRegInfo();
> > + return MRI.getRegClass(dummyVReg);
> > + } (MF);
> > +
> > + bbNames.push_back(MBB->getName());
> > + DEBUG(dbgs() << "\n\n NEW BASIC BLOCK: " << MBB->getName() <<
> "\n\n";);
> > +
> > + DEBUG(dbgs() << "MBB Before Scheduling:\n"; MBB->dump(););
> > + Changed |= rescheduleCanoncally(MBB);
> > + DEBUG(dbgs() << "MBB After Scheduling:\n"; MBB->dump(););
> > +
> > + std::vector<MachineInstr *> candidates = populateCandidates(MBB);
> > + std::vector<MachineInstr *> visitedMIs;
> > + std::copy(candidates.begin(), candidates.end(),
> > + std::back_inserter(visitedMIs));
> > +
> > + std::vector<typedRegStackEntry> VRegs;
> > + for (auto candidate : candidates) {
> > + VRegs.push_back(std::make_tuple(RSE_NewCandidate, ~0U));
> > +
> > + std::queue<typedRegStackEntry> reg_queue;
> > +
> > + // Here we walk the vreg operands of a non-root node along our walk.
> > + // The root nodes are the original candidates (stores normally).
> > + // These are normally not the root nodes (except for the case of
> copies to
> > + // physical registers).
> > + for (unsigned i = 1; i < candidate->getNumOperands(); i++) {
> > + if (candidate->mayStore() || candidate->isBranch())
> > + break;
> > +
> > + MachineOperand &MO = candidate->getOperand(i);
> > + if (!(MO.isReg() && TargetRegisterInfo::
> isVirtualRegister(MO.getReg())))
> > + continue;
> > +
> > + DEBUG(dbgs() << "Enqueue register"; MO.dump(); dbgs() << "\n";);
> > + reg_queue.push(std::make_tuple(RSE_Reg, MO.getReg()));
> > + }
> > +
> > + // Here we walk the root candidates. We start from the 0th operand
> because
> > + // the root is normally a store to a vreg.
> > + for (unsigned i = 0; i < candidate->getNumOperands(); i++) {
> > +
> > + if (!candidate->mayStore() && !candidate->isBranch())
> > + break;
> > +
> > + MachineOperand &MO = candidate->getOperand(i);
> > +
> > + // TODO: Do we want to only add vregs here?
> > + if (!MO.isReg() && !MO.isFI())
> > + continue;
> > +
> > + DEBUG(dbgs() << "Enqueue Reg/FI"; MO.dump(); dbgs() << "\n";);
> > +
> > + reg_queue.push(MO.isReg() ? std::make_tuple(RSE_Reg, MO.getReg())
> :
> > + std::make_tuple(RSE_FrameIndex,
> ~0U));
> > + }
> > +
> > + // Now that we have pushed the operands of the candidate here, we
> do the
> > + // full breadth first walk.
> > + doCandidateWalk(VRegs, reg_queue, visitedMIs, MBB);
> > + }
> > +
> > + // If we have populated no vregs to rename then bail.
> > + // The rest of this function does the vreg remaping.
> > + if (VRegs.size() == 0)
> > + return Changed;
> > +
> > + // Skip some vregs, so we can recon where we'll land next.
> > + SkipVRegs(VRegGapIndex, MRI, dummyRC);
> > +
> > + std::map<unsigned, unsigned> vregRenameMap;
> > + GetVRegRenameMap(vregRenameMap, VRegs, renamedInOtherBB, MRI,
> dummyRC);
> > +
> > + Changed |= doVRegRenaming(renamedInOtherBB, vregRenameMap, MRI);
> > + Changed |= doDefKillClear(MBB);
> > +
> > + DEBUG(dbgs() << "Updated MachineBasicBlock:\n"; MBB->dump(); dbgs()
> << "\n";);
> > + DEBUG(dbgs() << "\n\n=========================
> =======================\n\n");
> > + return Changed;
> > +}
> > +
> > +bool MIRCanonicalizer::runOnMachineFunction(MachineFunction &MF) {
> > +
> > + static unsigned functionNum = 0;
> > + if (CanonicalizeFunctionNumber != ~0U) {
> > + if (CanonicalizeFunctionNumber != functionNum++)
> > + return false;
> > + DEBUG(dbgs() << "\n Canonicalizing Function " << MF.getName() <<
> "\n";);
> > + }
> > +
> > + // we need a valid vreg to create a vreg type for skipping all those
> > + // stray vreg numbers so reach alignment/canonical vreg values.
> > + std::vector<MachineBasicBlock*> RPOList = GetRPOList(MF);
> > +
> > + DEBUG(
> > + dbgs() << "\n\n NEW MACHINE FUNCTION: " << MF.getName() << "
> \n\n";
> > + dbgs() << "\n\n=========================
> =======================\n\n";
> > + dbgs() << "Total Basic Blocks: " << RPOList.size() << "\n";
> > + for (auto MBB : RPOList) {
> > + dbgs() << MBB->getName() << "\n";
> > + }
> > + dbgs() << "\n\n=========================
> =======================\n\n";
> > + );
> > +
> > + std::vector<StringRef> bbNames;
> > + std::vector<unsigned> renamedInOtherBB;
> > +
> > + unsigned gapIdx = 0;
> > + unsigned bbNum = 0;
> > +
> > + bool Changed = false;
> > +
> > + for (auto MBB : RPOList)
> > + Changed |= runOnBasicBlock(MBB, bbNames, renamedInOtherBB, bbNum,
> gapIdx);
> > +
> > + return Changed;
> > +}
> > +
> > diff --git a/tools/mir-canon/mir-canon.cpp b/tools/mir-canon/mir-canon.
> cpp
> > new file mode 100644
> > index 00000000000..bfe53c01906
> > --- /dev/null
> > +++ b/tools/mir-canon/mir-canon.cpp
> > @@ -0,0 +1,175 @@
> > +//===------ mir-canon.cpp - Implement the LLVM Native Code Generator
> ------===//
> > +//
> > +// The LLVM Compiler Infrastructure
> > +//
> > +// This file is distributed under the University of Illinois Open Source
> > +// License. See LICENSE.TXT for details.
> > +//
> > +//===------------------------------------------------------
> ----------------===//
> > +//
> > +// This is the mir-canon driver. It invokes the MIR canonicalization
> pass.
> > +//
> > +//===------------------------------------------------------
> ----------------===//
> > +
> > +#include "llvm/CodeGen/CommandFlags.h"
> > +#include "llvm/CodeGen/TargetPassConfig.h"
> > +#include "llvm/CodeGen/MachineModuleInfo.h"
> > +#include "llvm/CodeGen/MachineFunctionPass.h"
> > +#include "llvm/CodeGen/MIRParser/MIRParser.h"
> > +
> > +#include "llvm/IR/LLVMContext.h"
> > +#include "llvm/IR/DiagnosticInfo.h"
> > +#include "llvm/IR/IRPrintingPasses.h"
> > +#include "llvm/IR/DiagnosticPrinter.h"
> > +#include "llvm/IR/LegacyPassManager.h"
> > +
> > +#include "llvm/Support/Signals.h"
> > +#include "llvm/Support/FileSystem.h"
> > +#include "llvm/Support/TargetSelect.h"
> > +#include "llvm/Support/TargetRegistry.h"
> > +#include "llvm/Support/ToolOutputFile.h"
> > +#include "llvm/Support/FormattedStream.h"
> > +#include "llvm/Support/PrettyStackTrace.h"
> > +
> > +#include "llvm/Target/TargetMachine.h"
> > +#include "llvm/Target/TargetSubtargetInfo.h"
> > +
> > +#include "Common.h"
> > +
> > +using namespace llvm;
> > +
> > +static cl::opt<std::string>
> > + InputFilename(cl::Positional, cl::desc("<input.mir>"),
> cl::init("-"));
> > +
> > +static cl::opt<std::string> OutputFilename("o", cl::desc("Output
> filename"),
> > + cl::value_desc("canon.mir"));
> > +
> > +static cl::opt<std::string> TargetTriple("mtriple",
> > + cl::desc("Override target
> triple."));
> > +
> > +int main(int argc, char **argv) {
> > + sys::PrintStackTraceOnErrorSignal(argv[0]);
> > + PrettyStackTraceProgram X(argc, argv);
> > +
> > + EnableDebugBuffering = true;
> > +
> > + LLVMContext Context;
> > + llvm_shutdown_obj Y;
> > +
> > + // Initialize targets first, so that --version shows registered
> targets.
> > + InitializeAllTargets();
> > + InitializeAllTargetMCs();
> > + InitializeAllAsmPrinters();
> > + InitializeAllAsmParsers();
> > +
> > + PassRegistry *Registry = PassRegistry::getPassRegistry();
> > + initializeCore(*Registry);
> > + initializeCodeGen(*Registry);
> > +
> > + initializeMIRCanonicalizerPass(*Registry);
> > +
> > + // Register the target printer for --version.
> > + cl::AddExtraVersionPrinter(TargetRegistry::
> printRegisteredTargetsForVersion);
> > +
> > + cl::ParseCommandLineOptions(argc, argv, "llvm system compiler\n");
> > +
> > + CANON_CHECK(StringRef(InputFilename).endswith_lower(".mir"),
> > + "Invalid Filename: " + InputFilename + ". " +
> > + "Expected a .mir file extension.");
> > +
> > + // Set a diagnostic handler that doesn't exit on the first error
> > + bool HasError = false;
> > + Context.setDiagnosticHandler(
> > + [](const DiagnosticInfo &DI, void *Context) {
> > + if (DI.getSeverity() == DS_Error) {
> > + bool *HasError = static_cast<bool *>(Context);
> > + *HasError = true;
> > + }
> > +
> > + auto *Remark = dyn_cast<DiagnosticInfoOptimizationBase>(&DI);
> > + if (Remark && !Remark->isEnabled())
> > + return;
> > +
> > + DiagnosticPrinterRawOStream DP(errs());
> > + errs() << LLVMContext::getDiagnosticMessagePrefix(DI.
> getSeverity());
> > + errs() << ": ";
> > + DI.print(DP);
> > + errs() << "\n";
> > + },
> > + &HasError);
> > +
> > + SMDiagnostic Err;
> > + auto MIR = createMIRParserFromFile(InputFilename, Err, Context);
> > + auto M = MIR ? MIR->parseIRModule() : nullptr;
> > +
> > + CANON_CHECK(M, argv[0] + "Invalid Module or MIR.");
> > +
> > + if (!TargetTriple.empty())
> > + M->setTargetTriple(Triple::normalize(TargetTriple));
> > +
> > + Triple TheTriple = Triple(M->getTargetTriple());
> > + if (TheTriple.getTriple().empty())
> > + TheTriple.setTriple(sys::getDefaultTargetTriple());
> > +
> > + // Get the target specific parser.
> > + std::string Error;
> > + const Target *TheTarget =
> > + TargetRegistry::lookupTarget(MArch, TheTriple, Error);
> > +
> > + CANON_CHECK(TheTarget, argv[0] + ": " + Error);
> > +
> > + std::string CPUStr = getCPUStr();
> > + std::string FeaturesStr = getFeaturesStr();
> > +
> > + std::unique_ptr<TargetMachine> Target(TheTarget->createTargetMachine(
> > + TheTriple.getTriple(), CPUStr, FeaturesStr,
> > + InitTargetOptionsFromCodeGenFlags(), getRelocModel(),
> getCodeModel(),
> > + CodeGenOpt::Default));
> > +
> > + CANON_CHECK(Target, "Could not allocate target machine!");
> > +
> > + std::error_code EC;
> > + auto Out = llvm::make_unique<tool_output_file>(OutputFilename, EC,
> > + sys::fs::F_Text);
> > + CANON_CHECK(!EC, EC.message());
> > + raw_pwrite_stream *OS = &Out->os();
> > +
> > + legacy::PassManager PM;
> > + TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple()));
> > + PM.add(new TargetLibraryInfoWrapperPass(TLII));
> > + M->setDataLayout(Target->createDataLayout());
> > + setFunctionAttributes(CPUStr, FeaturesStr, *M);
> > +
> > + LLVMTargetMachine &TM = static_cast<LLVMTargetMachine &>(*Target);
> > + TargetPassConfig &TPC = *TM.createPassConfig(PM);
> > + PM.add(&TPC);
> > + MachineModuleInfo *MMI = new MachineModuleInfo(&TM);
> > + PM.add(MMI);
> > + TPC.printAndVerify("");
> > +
> > + StringRef PassName = "mir-canonicalizer";
> > + const PassRegistry *PR = PassRegistry::getPassRegistry();
> > + const PassInfo *PI = PR->getPassInfo(PassName);
> > +
> > + CANON_CHECK(PI, argv[0] + ": " + PI->getPassName() + " not
> registered.");
> > + CANON_CHECK(PI->getNormalCtor(),
> > + argv[0] + ": can't create pass: " + PI->getPassName());
> > +
> > + Pass *P = PI->getNormalCtor()();
> > + PM.add(P);
> > + TPC.printAndVerify(std::string("After ") +
> std::string(P->getPassName()));
> > +
> > + PM.add(createPrintMIRPass(*OS));
> > +
> > + // Before executing passes, print the final values of the LLVM
> options.
> > + cl::PrintOptionValues();
> > +
> > + PM.run(*M);
> > +
> > + CANON_CHECK(!(*static_cast<bool *>(Context.getDiagnosticContext())),
> > + argv[0] + " Error: exiting...");
> > +
> > + Out->keep();
> > + return 0;
> > +}
> > +
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170906/b1e1642c/attachment-0001.html>
More information about the llvm-dev
mailing list