[llvm-dev] [RFC] mir-canon: A new tool for canonicalizing MIR for cleaner diffing.
Justin Bogner via llvm-dev
llvm-dev at lists.llvm.org
Tue Aug 29 11:17:54 PDT 2017
Puyan Lotfi via llvm-dev <llvm-dev at lists.llvm.org> writes:
> mir-canon: A new tool for canonicalizing MIR for cleaner diffing.
>
> Problem Statement and Context:
>
> Diff-tools are regularly used for comparing IR and assembly files. This often
> involves reasoning through differences that are semantically equivalent and it
> can be very time consuming for a person to do said reasoning.
>
> Specifically in the context of GlobalISel development there are problems of
> correctness verification. There is a need to compare two programs, compiled
> from identical IR by two different instruction selectors in a way where the
> true differences stand out.
>
> Proposal:
>
> We propose a new tool that we have tentatively named 'mir-canon' that performs
> canonical transformations on MIR. The goal is for MIR pre-processed with
> mir-canon to show fewer differences than if it were not pre-processed.
This sounds great. One place I'm hoping to use this for is in ISel
fuzzing - we can currently catch crashes and machine verifier errors,
but comparing SDAG to GlobalISel is a space that I'd really like to
explore.
> At the time of this writing we have a prototype canonicalization tool. We’ve
> come up with some techniques that show promise and would like to open
> discussion with the community to get feedback and suggestions on refining said
> techniques. Currently we think of this as an open ended project.
>
> Techniques:
>
> Our prototype does the following for each basic block in a Reverse Post
> Ordering:
>
> * Canonical instruction ordering is done by moving a given instruction as
> close to the nearest use of its def as possible.
>
> * Next, canonical VReg renaming is done by building a collection of
> candidate instructions that can be thought of as sinks in the def-use
> graph: they are typically instructions that write to physical registers or
> store to memory. These candidates are used as the root of a breadth first
> walk over the vreg operand def-use graph that determines a canonical vreg
> ordering.
>
> * Using said canonical vreg ordering we rename monotonically, but before we
> do this we skip several vreg values in order to increase the chance that
> we land on the same vreg number for two different input MIR files.
This is one place where handling multiple files at once would help, correct?
> We also do this to reduce the chances that a difference in
> previously def-use walks will affect the vreg renaming for
> subsequent walks. This skipping step could be thought of as a kind
> of vreg number reckoning: we skip modulo n vregs so that we are
> likely to land on the same vreg for two different files.
>
> This approach is completely agnostic of ISA specific semantics so it should
> work for any target.
>
> Current status:
>
> At the moment we have a standalone llvm tool that uses a single pass to do
> the above described transformations. We have test inputs that show promise
> but we still need a wider set of tests as well as hard metrics.
This sounds like a great start, but I feel like this design won't quite
scale to all the ways we want to use this.
Consider, if this was a pass in the LLVM library rather than only
accessible in a tool, you could potentially set it up to run at the end
of an `llc` run. Similarly, if I want to use this in a fuzzer it would
be nice to be able to do the canonicalization in memory.
> Our approach processes a single file at a time. The primary benefit to this
> approach is lower complexity in initial implementation and exploration of
> building such a tool. We are open to other approaches such as an llvm-diff
> like (two file at a time) approach, but we have not explored that avenue
> fully yet.
>
> We’re eager to hear community feedback and will be ready to share patches
> for review soon.
...
> Patch for review.
Please send the patch to llvm-commits (and refer back to this thread).
Patch review is too much traffic for llvm-dev.
> diff --git a/tools/mir-canon/CMakeLists.txt b/tools/mir-canon/CMakeLists.txt
> new file mode 100644
> index 00000000000..eb6722e4bd4
> --- /dev/null
> +++ b/tools/mir-canon/CMakeLists.txt
> @@ -0,0 +1,28 @@
> +set(LLVM_LINK_COMPONENTS
> + ${LLVM_TARGETS_TO_BUILD}
> + Analysis
> + AsmPrinter
> + CodeGen
> + Core
> + IRReader
> + MC
> + MIRParser
> + ScalarOpts
> + SelectionDAG
> + Support
> + Target
> + TransformUtils
> + Vectorize
> + )
> +
> +# Support plugins.
> +set(LLVM_NO_DEAD_STRIP 1)
> +
> +add_llvm_tool(mir-canon
> + mir-canon.cpp
> + MIRCanonicalizerPass.cpp
> +
> + DEPENDS
> + intrinsics_gen
> + )
> +export_executable_symbols(mir-canon)
> diff --git a/tools/mir-canon/Common.h b/tools/mir-canon/Common.h
> new file mode 100644
> index 00000000000..3e6be7ef643
> --- /dev/null
> +++ b/tools/mir-canon/Common.h
> @@ -0,0 +1,21 @@
> +#include <cassert>
> +
> +namespace llvm {
> +/// The purpose of this pass is to establish a canonical operand naming so
> +/// that code compiled with slightly different IR passes can be diffed more
> +/// effectively than otherwise. This is done by renaming vregs in a given
> +/// LiveRange in a canonical way.
> +extern char &MIRCanonicalizerID;
> +
> +void initializeMIRCanonicalizerPass(PassRegistry &);
> +
> +} // namespace llvm
> +
> +#define CANON_CHECK(CHK, MSG) \
> + do { \
> + if ((CHK)) break; \
> + errs() << (std::string("") + MSG) << '\n'; \
> + assert(false && "CANON_CHECK FAILURE."); \
> + exit(EXIT_FAILURE); \
> + } while (false)
> +
> diff --git a/tools/mir-canon/LLVMBuild.txt b/tools/mir-canon/LLVMBuild.txt
> new file mode 100644
> index 00000000000..5d3e6bb5a91
> --- /dev/null
> +++ b/tools/mir-canon/LLVMBuild.txt
> @@ -0,0 +1,22 @@
> +;===- ./tools/mir-canon/LLVMBuild.txt --------------------------*- Conf -*--===;
> +;
> +; The LLVM Compiler Infrastructure
> +;
> +; This file is distributed under the University of Illinois Open Source
> +; License. See LICENSE.TXT for details.
> +;
> +;===------------------------------------------------------------------------===;
> +;
> +; This is an LLVMBuild description file for the components in this subdirectory.
> +;
> +; For more information on the LLVMBuild system, please see:
> +;
> +; http://llvm.org/docs/LLVMBuild.html
> +;
> +;===------------------------------------------------------------------------===;
> +
> +[component_0]
> +type = Tool
> +name = mir-canon
> +parent = Tools
> +required_libraries = AsmParser BitReader IRReader MIRParser TransformUtils Scalar Vectorize all-targets
> diff --git a/tools/mir-canon/MIRCanonicalizerPass.cpp b/tools/mir-canon/MIRCanonicalizerPass.cpp
> new file mode 100644
> index 00000000000..a4a0a7a0e5f
> --- /dev/null
> +++ b/tools/mir-canon/MIRCanonicalizerPass.cpp
> @@ -0,0 +1,588 @@
> +//===-------------- MIRCanonicalizer.cpp - MIR Canonicalizer --------------===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// Reorders instructions canonically.
> +// Renames virtual register operands canonically.
> +// Strips certain MIR artifacts (optionally).
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#include "llvm/CodeGen/Passes.h"
> +#include "llvm/CodeGen/MachineInstrBuilder.h"
> +#include "llvm/CodeGen/MachineFunctionPass.h"
> +#include "llvm/CodeGen/MachineRegisterInfo.h"
> +
> +#include "llvm/Support/raw_ostream.h"
> +#include "llvm/ADT/PostOrderIterator.h"
> +#include "llvm/Target/TargetInstrInfo.h"
> +
> +#include <tuple>
> +#include <queue>
> +
> +using namespace llvm;
> +
> +#include "Common.h"
> +
> +#define DEBUG_TYPE "mir-canonicalizer"
> +
> +static cl::opt<unsigned>
> +CanonicalizeFunctionNumber("canon-nth-function", cl::Hidden, cl::init(~0u),
> + cl::value_desc("N"),
> + cl::desc("Function number to canonicalize."));
> +
> +static cl::opt<unsigned>
> +CanonicalizeBasicBlockNumber("canon-nth-basicblock", cl::Hidden, cl::init(~0u),
> + cl::value_desc("N"),
> + cl::desc("BasicBlock number to canonicalize."));
> +
> +namespace {
> +
> +class MIRCanonicalizer : public MachineFunctionPass {
> +public:
> + static char ID;
> + MIRCanonicalizer() : MachineFunctionPass(ID) {}
> +
> + StringRef getPassName() const override {
> + return "Rename register operands in a canonical ordering.";
> + }
> +
> + void getAnalysisUsage(AnalysisUsage &AU) const override {
> + AU.setPreservesCFG();
> + MachineFunctionPass::getAnalysisUsage(AU);
> + }
> +
> + bool runOnMachineFunction(MachineFunction &MF) override;
> +};
> +
> +} // end anonymous namespace
> +
> +enum RSType { RSE_Reg = 0, RSE_FrameIndex, RSE_NewCandidate };
> +typedef std::tuple<RSType, unsigned> typedRegStackEntry;
> +
> +char MIRCanonicalizer::ID;
> +
> +char &llvm::MIRCanonicalizerID = MIRCanonicalizer::ID;
> +
> +INITIALIZE_PASS_BEGIN(MIRCanonicalizer, "mir-canonicalizer",
> + "Rename Register Operands Canonically", false, false);
> +
> +INITIALIZE_PASS_END(MIRCanonicalizer, "mir-canonicalizer",
> + "Rename Register Operands Canonically", false, false);
> +
> +static std::vector<MachineBasicBlock *> GetRPOList(MachineFunction &MF) {
> + MachineBasicBlock *Entry = &*MF.begin();
> + std::vector<MachineBasicBlock*> RPOList;
> + ReversePostOrderTraversal<MachineBasicBlock*> RPOT(Entry);
> + for (ReversePostOrderTraversal<MachineBasicBlock*>::rpo_iterator
> + MBBI = RPOT.begin(), MBBE = RPOT.end(); MBBI != MBBE; ++MBBI) {
> + MachineBasicBlock *MBB = *MBBI;
> + RPOList.push_back(MBB);
> + }
> +
> + return RPOList;
> +}
> +
> +// Set a dummy vreg. We use this vregs register class to generate throw-away
> +// vregs that are used to skip vreg numbers so that vreg numbers line up.
> +static unsigned GetDummyVReg(const MachineFunction &MF) {
> + for (auto &MBB : MF) {
> + for (auto II = MBB.begin(), IE = MBB.end(); II != IE; ++II) {
> + const MachineInstr *MI = &*II;
> + for (unsigned i = 0; i < MI->getNumOperands(); i++) {
> + const MachineOperand &MO = MI->getOperand(i);
> + if (!MO.isReg() || !TargetRegisterInfo::isVirtualRegister(MO.getReg()))
> + continue;
> + return MO.getReg();
> + }
> + }
> + }
> +
> + return ~0U;
> +}
> +
> +static bool rescheduleCanoncally(MachineBasicBlock *MBB) {
> +
> + bool Changed = false;
> +
> + // Calculates the distance of MI from the begining of its parent BB.
> + auto getInstrIdx = [](const MachineInstr &MI) {
> + unsigned i = 0;
> + for (auto &I : *MI.getParent()) {
> + if (&I == &MI)
> + return i;
> + i++;
> + }
> + return ~0U;
> + };
> +
> + // Pre-Populate vector of instructions to reschedule so that we don't
> + // clobber the iterator.
> + std::vector<MachineInstr *> instructions;
> + for (auto II = MBB->begin(); II != MBB->end(); ++II) {
> + instructions.push_back(&*II);
> + }
> +
> + for (auto II : instructions) {
> + if (II->getNumOperands() == 0)
> + continue;
> +
> + MachineOperand &MO = II->getOperand(0);
> + if (!MO.isReg() || !TargetRegisterInfo::isVirtualRegister(MO.getReg()))
> + continue;
> +
> + DEBUG(dbgs() << "Operand " << 0 << " of "; II->dump(); MO.dump(););
> +
> + MachineInstr *def = II;
> + unsigned distance = ~0U;
> + MachineInstr *useToBringDefCloserTo = nullptr;
> + MachineRegisterInfo *MRI = &MBB->getParent()->getRegInfo();
> + for (auto UI = MRI->use_begin(MO.getReg()), UE = MRI->use_end();
> + UI != UE; ++UI) {
> + MachineInstr *useInst = UI->getParent();
> +
> + const unsigned defLoc = getInstrIdx(*def);
> + const unsigned useLoc = getInstrIdx(*useInst);
> + const unsigned delta = (useLoc - defLoc);
> +
> + if (useInst->getParent() != def->getParent())
> + continue;
> + if (defLoc >= useLoc)
> + continue;
> +
> + if (delta < distance) {
> + distance = delta;
> + useToBringDefCloserTo = useInst;
> + }
> + }
> +
> + MachineBasicBlock::iterator defI = MBB->instr_end();
> + MachineBasicBlock::iterator useI = MBB->instr_end();
> +
> + for (auto BBI = MBB->instr_begin(); BBI != MBB->instr_end(); ++BBI) {
> +
> + if (defI != MBB->instr_end() && useI != MBB->instr_end())
> + break;
> +
> + if ((&*BBI != def) && (&*BBI != useToBringDefCloserTo))
> + continue;
> +
> + if (&*BBI == def) {
> + defI = BBI;
> + continue;
> + }
> +
> + if (&*BBI == useToBringDefCloserTo) {
> + useI = BBI;
> + continue;
> + }
> + }
> +
> + if (defI == MBB->instr_end() || useI == MBB->instr_end())
> + continue;
> +
> + DEBUG(dbgs() << "Splicing "; defI->dump(); dbgs() << " right before: ";
> + useI->dump(););
> +
> + Changed = true;
> + MBB->splice(useI, MBB, defI);
> + }
> +
> + return Changed;
> +}
> +
> +static std::vector<MachineInstr *> populateCandidates(MachineBasicBlock *MBB) {
> + std::vector<MachineInstr *> candidates;
> + MachineRegisterInfo &MRI = MBB->getParent()->getRegInfo();
> +
> + for (auto II = MBB->begin(), IE = MBB->end(); II != IE; ++II) {
> + MachineInstr *MI = &*II;
> +
> + bool doesMISideEffect = false;
> +
> + if (MI->getOperand(0).isReg()) {
> + const unsigned dst = MI->getOperand(0).getReg();
> + doesMISideEffect |= !TargetRegisterInfo::isVirtualRegister(dst);
> +
> + for (auto UI = MRI.use_begin(dst); UI != MRI.use_end(); ++UI) {
> + if (doesMISideEffect) break;
> + doesMISideEffect |= (UI->getParent()->getParent() != MI->getParent());
> + }
> + }
> +
> + if (!MI->mayStore() && !MI->isBranch() && !doesMISideEffect)
> + continue;
> +
> + DEBUG(dbgs() << "Found Candidate: "; MI->dump(););
> + candidates.push_back(MI);
> + }
> +
> + return candidates;
> +}
> +
> +void doCandidateWalk(std::vector<typedRegStackEntry> &VRegs,
> + std::queue <typedRegStackEntry> ®_queue,
> + std::vector<MachineInstr *> &visitedMIs,
> + const MachineBasicBlock *MBB) {
> +
> + const MachineFunction &MF = *MBB->getParent();
> + const MachineRegisterInfo &MRI = MF.getRegInfo();
> +
> + while (reg_queue.size()) {
> + auto regType = std::get<0>(reg_queue.front());
> + auto reg = std::get<1>(reg_queue.front());
> + reg_queue.pop();
> +
> + if (regType == RSE_FrameIndex) {
> + DEBUG(dbgs() << "Popping frame index.\n";);
> + VRegs.push_back(std::make_tuple(RSE_FrameIndex, ~0U));
> + continue;
> + }
> +
> + assert(regType == RSE_Reg && "Expected vreg or physreg.");
> +
> + if (TargetRegisterInfo::isVirtualRegister(reg)) {
> + DEBUG(dbgs() << "Popping vreg ";
> + MRI.def_begin(reg)->dump();
> + dbgs() << "\n";);
> +
> + bool hasVisitedVReg = false;
> + for (auto vreg : VRegs) {
> + if (std::get<1>(vreg) == reg) {
> + hasVisitedVReg = true;
> + break;
> + }
> + }
> +
> + if (!hasVisitedVReg)
> + VRegs.push_back(std::make_tuple(RSE_Reg, reg));
> + } else {
> + DEBUG(dbgs() << "Popping physreg.\n";);
> + VRegs.push_back(std::make_tuple(RSE_Reg, reg));
> + continue;
> + }
> +
> + for (auto RI = MRI.def_begin(reg), RE = MRI.def_end(); RI != RE; ++RI) {
> + MachineInstr *def = RI->getParent();
> +
> + if (def->getParent() != MBB)
> + continue;
> +
> + bool hasVisited = false;
> + for (auto VMI : visitedMIs) {
> + if (def == VMI) {
> + hasVisited = true;
> + break;
> + }
> + }
> +
> + if (hasVisited)
> + break;
> +
> + DEBUG(dbgs() << "\n========================\n";
> + dbgs() << "Visited MI: "; def->dump();
> + dbgs() << "BB Name: " << def->getParent()->getName() << "\n";
> + dbgs() << "\n========================\n";);
> + visitedMIs.push_back(def);
> + for (unsigned I = 1, E = def->getNumOperands(); I != E; ++I) {
> +
> + MachineOperand &MO = def->getOperand(I);
> + if (MO.isFI()) {
> + DEBUG(dbgs() << "Pushing frame index.\n";);
> + reg_queue.push(std::make_tuple(RSE_FrameIndex, ~0U));
> + }
> +
> + if (!MO.isReg())
> + continue;
> + reg_queue.push(std::make_tuple(RSE_Reg, MO.getReg()));
> + }
> + }
> + }
> +}
> +
> +static void SkipVRegs(unsigned &VRegGapIndex, MachineRegisterInfo &MRI,
> + const TargetRegisterClass *RC) {
> + const unsigned VR_GAP = (++VRegGapIndex * 1000);
> +
> + DEBUG(dbgs() << "Adjusting per-BB VR_GAP for BB" << VRegGapIndex << " to "
> + << VR_GAP << "\n";);
> +
> + unsigned I = MRI.createVirtualRegister(RC);
> + const unsigned E = (((I + VR_GAP) / VR_GAP) + 1) * VR_GAP;
> + while (I != E) {
> + I = MRI.createVirtualRegister(RC);
> + }
> +
> + CANON_CHECK(I != ~0U && I != 0, "Invalid VReg");
> +}
> +
> +static void GetVRegRenameMap(std::map<unsigned, unsigned> &vregRenameMap,
> + const std::vector<typedRegStackEntry> &VRegs,
> + const std::vector<unsigned> &renamedInOtherBB,
> + MachineRegisterInfo &MRI,
> + const TargetRegisterClass *RC) {
> +
> + unsigned lastRenameReg = MRI.createVirtualRegister(RC);
> +
> + bool firstCandidate = true;
> + for (auto vreg : VRegs) {
> +
> + auto regType = std::get<0>(vreg);
> + auto reg = std::get<1>(vreg);
> +
> + if (regType == RSE_FrameIndex) {
> + lastRenameReg = MRI.createVirtualRegister(RC);
> + DEBUG(dbgs() << "Skipping rename for FI " << lastRenameReg << "\n";);
> + continue;
> + } else if (regType == RSE_NewCandidate) {
> + if (!firstCandidate) {
> + while (lastRenameReg % 10) {
> + lastRenameReg = MRI.createVirtualRegister(RC);
> +
> + DEBUG(dbgs() << "Skipping rename for new candidate " << lastRenameReg
> + << "\n";);
> + }
> + }
> + firstCandidate = false;
> + continue;
> + } else if (!TargetRegisterInfo::isVirtualRegister(reg)) {
> + lastRenameReg = MRI.createVirtualRegister(RC);
> + DEBUG(dbgs() << "Skipping rename for Phys Reg " << lastRenameReg << "\n";);
> + continue;
> + }
> +
> + if (std::find(renamedInOtherBB.begin(), renamedInOtherBB.end(), reg) !=
> + renamedInOtherBB.end()) {
> + DEBUG(dbgs() << "Vreg " << reg << " already renamed in other BB.\n";);
> + continue;
> + }
> +
> + auto rename = MRI.createVirtualRegister(MRI.getRegClass(reg));
> + lastRenameReg = rename;
> +
> + if (vregRenameMap.find(reg) == vregRenameMap.end()) {
> + DEBUG(dbgs() << "Mapping vreg ";);
> + if (MRI.reg_begin(reg) != MRI.reg_end()) {
> + DEBUG(auto foo = &*MRI.reg_begin(reg); foo->dump(););
> + } else {
> + DEBUG(dbgs() << reg;);
> + }
> + DEBUG(dbgs() << " to ";);
> + if (MRI.reg_begin(rename) != MRI.reg_end()) {
> + DEBUG(auto foo = &*MRI.reg_begin(rename); foo->dump(););
> + } else {
> + DEBUG(dbgs() << rename;);
> + }
> + DEBUG(dbgs() << "\n";);
> +
> + vregRenameMap.insert(std::pair<unsigned, unsigned>(reg, rename));
> + }
> + }
> +}
> +
> +static bool doVRegRenaming(std::vector<unsigned> &renamedInOtherBB,
> + const std::map<unsigned, unsigned> &vregRenameMap,
> + MachineRegisterInfo &MRI) {
> + bool Changed = false;
> + for (auto I = vregRenameMap.begin(), E = vregRenameMap.end(); I != E; ++I) {
> +
> + auto vreg = I->first;
> + auto rename = I->second;
> +
> + renamedInOtherBB.push_back(rename);
> +
> + std::vector<MachineOperand *> renameMOs;
> + for (MachineOperand &MO : MRI.reg_operands(vreg)) {
> + renameMOs.push_back(&MO);
> + }
> +
> + for (auto MO : renameMOs) {
> + Changed = true;
> + MO->setReg(rename);
> +
> + if (!MO->isDef())
> + MO->setIsKill(false);
> + }
> + }
> +
> + return Changed;
> +}
> +
> +static bool doDefKillClear(MachineBasicBlock *MBB) {
> + bool Changed = false;
> +
> + for (auto II = MBB->begin(), IE = MBB->end(); II != IE; ++II) {
> + MachineInstr *MI = &*II;
> +
> + for (unsigned i = 0; i < MI->getNumOperands(); i++) {
> + MachineOperand &MO = MI->getOperand(i);
> + if (!MO.isReg())
> + continue;
> + if (!MO.isDef() && MO.isKill()) {
> + Changed = true;
> + MO.setIsKill(false);
> + }
> +
> + if (MO.isDef() && MO.isDead()) {
> + Changed = true;
> + MO.setIsDead(false);
> + }
> + }
> + }
> +
> + return Changed;
> +}
> +
> +static bool runOnBasicBlock(MachineBasicBlock *MBB,
> + std::vector<StringRef> &bbNames,
> + std::vector<unsigned> &renamedInOtherBB,
> + unsigned &basicBlockNum, unsigned &VRegGapIndex) {
> +
> + if (CanonicalizeBasicBlockNumber != ~0U) {
> + if (CanonicalizeBasicBlockNumber != basicBlockNum++)
> + return false;
> + DEBUG(dbgs() << "\n Canonicalizing BasicBlock " << MBB->getName() << "\n";);
> + }
> +
> + if (std::find(bbNames.begin(), bbNames.end(), MBB->getName()) !=
> + bbNames.end()) {
> + DEBUG(dbgs() << "Found potentially duplicate BasicBlocks: "
> + << MBB->getName() << "\n";);
> + return false;
> + }
> +
> + DEBUG(
> + dbgs() << "\n\n NEW BASIC BLOCK: " << MBB->getName() << " \n\n";
> + dbgs() << "\n\n================================================\n\n";
> + );
> +
> + bool Changed = false;
> + MachineFunction &MF = *MBB->getParent();
> + MachineRegisterInfo &MRI = MF.getRegInfo();
> +
> + const TargetRegisterClass *dummyRC = [] (const MachineFunction &MF) {
> + const unsigned dummyVReg = GetDummyVReg(MF);
> + CANON_CHECK(dummyVReg != ~0U, "Could not find a dummy VReg.");
> + const MachineRegisterInfo &MRI = MF.getRegInfo();
> + return MRI.getRegClass(dummyVReg);
> + } (MF);
> +
> + bbNames.push_back(MBB->getName());
> + DEBUG(dbgs() << "\n\n NEW BASIC BLOCK: " << MBB->getName() << "\n\n";);
> +
> + DEBUG(dbgs() << "MBB Before Scheduling:\n"; MBB->dump(););
> + Changed |= rescheduleCanoncally(MBB);
> + DEBUG(dbgs() << "MBB After Scheduling:\n"; MBB->dump(););
> +
> + std::vector<MachineInstr *> candidates = populateCandidates(MBB);
> + std::vector<MachineInstr *> visitedMIs;
> + std::copy(candidates.begin(), candidates.end(),
> + std::back_inserter(visitedMIs));
> +
> + std::vector<typedRegStackEntry> VRegs;
> + for (auto candidate : candidates) {
> + VRegs.push_back(std::make_tuple(RSE_NewCandidate, ~0U));
> +
> + std::queue<typedRegStackEntry> reg_queue;
> +
> + // Here we walk the vreg operands of a non-root node along our walk.
> + // The root nodes are the original candidates (stores normally).
> + // These are normally not the root nodes (except for the case of copies to
> + // physical registers).
> + for (unsigned i = 1; i < candidate->getNumOperands(); i++) {
> + if (candidate->mayStore() || candidate->isBranch())
> + break;
> +
> + MachineOperand &MO = candidate->getOperand(i);
> + if (!(MO.isReg() && TargetRegisterInfo::isVirtualRegister(MO.getReg())))
> + continue;
> +
> + DEBUG(dbgs() << "Enqueue register"; MO.dump(); dbgs() << "\n";);
> + reg_queue.push(std::make_tuple(RSE_Reg, MO.getReg()));
> + }
> +
> + // Here we walk the root candidates. We start from the 0th operand because
> + // the root is normally a store to a vreg.
> + for (unsigned i = 0; i < candidate->getNumOperands(); i++) {
> +
> + if (!candidate->mayStore() && !candidate->isBranch())
> + break;
> +
> + MachineOperand &MO = candidate->getOperand(i);
> +
> + // TODO: Do we want to only add vregs here?
> + if (!MO.isReg() && !MO.isFI())
> + continue;
> +
> + DEBUG(dbgs() << "Enqueue Reg/FI"; MO.dump(); dbgs() << "\n";);
> +
> + reg_queue.push(MO.isReg() ? std::make_tuple(RSE_Reg, MO.getReg()) :
> + std::make_tuple(RSE_FrameIndex, ~0U));
> + }
> +
> + // Now that we have pushed the operands of the candidate here, we do the
> + // full breadth first walk.
> + doCandidateWalk(VRegs, reg_queue, visitedMIs, MBB);
> + }
> +
> + // If we have populated no vregs to rename then bail.
> + // The rest of this function does the vreg remaping.
> + if (VRegs.size() == 0)
> + return Changed;
> +
> + // Skip some vregs, so we can recon where we'll land next.
> + SkipVRegs(VRegGapIndex, MRI, dummyRC);
> +
> + std::map<unsigned, unsigned> vregRenameMap;
> + GetVRegRenameMap(vregRenameMap, VRegs, renamedInOtherBB, MRI, dummyRC);
> +
> + Changed |= doVRegRenaming(renamedInOtherBB, vregRenameMap, MRI);
> + Changed |= doDefKillClear(MBB);
> +
> + DEBUG(dbgs() << "Updated MachineBasicBlock:\n"; MBB->dump(); dbgs() << "\n";);
> + DEBUG(dbgs() << "\n\n================================================\n\n");
> + return Changed;
> +}
> +
> +bool MIRCanonicalizer::runOnMachineFunction(MachineFunction &MF) {
> +
> + static unsigned functionNum = 0;
> + if (CanonicalizeFunctionNumber != ~0U) {
> + if (CanonicalizeFunctionNumber != functionNum++)
> + return false;
> + DEBUG(dbgs() << "\n Canonicalizing Function " << MF.getName() << "\n";);
> + }
> +
> + // we need a valid vreg to create a vreg type for skipping all those
> + // stray vreg numbers so reach alignment/canonical vreg values.
> + std::vector<MachineBasicBlock*> RPOList = GetRPOList(MF);
> +
> + DEBUG(
> + dbgs() << "\n\n NEW MACHINE FUNCTION: " << MF.getName() << " \n\n";
> + dbgs() << "\n\n================================================\n\n";
> + dbgs() << "Total Basic Blocks: " << RPOList.size() << "\n";
> + for (auto MBB : RPOList) {
> + dbgs() << MBB->getName() << "\n";
> + }
> + dbgs() << "\n\n================================================\n\n";
> + );
> +
> + std::vector<StringRef> bbNames;
> + std::vector<unsigned> renamedInOtherBB;
> +
> + unsigned gapIdx = 0;
> + unsigned bbNum = 0;
> +
> + bool Changed = false;
> +
> + for (auto MBB : RPOList)
> + Changed |= runOnBasicBlock(MBB, bbNames, renamedInOtherBB, bbNum, gapIdx);
> +
> + return Changed;
> +}
> +
> diff --git a/tools/mir-canon/mir-canon.cpp b/tools/mir-canon/mir-canon.cpp
> new file mode 100644
> index 00000000000..bfe53c01906
> --- /dev/null
> +++ b/tools/mir-canon/mir-canon.cpp
> @@ -0,0 +1,175 @@
> +//===------ mir-canon.cpp - Implement the LLVM Native Code Generator ------===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This is the mir-canon driver. It invokes the MIR canonicalization pass.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#include "llvm/CodeGen/CommandFlags.h"
> +#include "llvm/CodeGen/TargetPassConfig.h"
> +#include "llvm/CodeGen/MachineModuleInfo.h"
> +#include "llvm/CodeGen/MachineFunctionPass.h"
> +#include "llvm/CodeGen/MIRParser/MIRParser.h"
> +
> +#include "llvm/IR/LLVMContext.h"
> +#include "llvm/IR/DiagnosticInfo.h"
> +#include "llvm/IR/IRPrintingPasses.h"
> +#include "llvm/IR/DiagnosticPrinter.h"
> +#include "llvm/IR/LegacyPassManager.h"
> +
> +#include "llvm/Support/Signals.h"
> +#include "llvm/Support/FileSystem.h"
> +#include "llvm/Support/TargetSelect.h"
> +#include "llvm/Support/TargetRegistry.h"
> +#include "llvm/Support/ToolOutputFile.h"
> +#include "llvm/Support/FormattedStream.h"
> +#include "llvm/Support/PrettyStackTrace.h"
> +
> +#include "llvm/Target/TargetMachine.h"
> +#include "llvm/Target/TargetSubtargetInfo.h"
> +
> +#include "Common.h"
> +
> +using namespace llvm;
> +
> +static cl::opt<std::string>
> + InputFilename(cl::Positional, cl::desc("<input.mir>"), cl::init("-"));
> +
> +static cl::opt<std::string> OutputFilename("o", cl::desc("Output filename"),
> + cl::value_desc("canon.mir"));
> +
> +static cl::opt<std::string> TargetTriple("mtriple",
> + cl::desc("Override target triple."));
> +
> +int main(int argc, char **argv) {
> + sys::PrintStackTraceOnErrorSignal(argv[0]);
> + PrettyStackTraceProgram X(argc, argv);
> +
> + EnableDebugBuffering = true;
> +
> + LLVMContext Context;
> + llvm_shutdown_obj Y;
> +
> + // Initialize targets first, so that --version shows registered targets.
> + InitializeAllTargets();
> + InitializeAllTargetMCs();
> + InitializeAllAsmPrinters();
> + InitializeAllAsmParsers();
> +
> + PassRegistry *Registry = PassRegistry::getPassRegistry();
> + initializeCore(*Registry);
> + initializeCodeGen(*Registry);
> +
> + initializeMIRCanonicalizerPass(*Registry);
> +
> + // Register the target printer for --version.
> + cl::AddExtraVersionPrinter(TargetRegistry::printRegisteredTargetsForVersion);
> +
> + cl::ParseCommandLineOptions(argc, argv, "llvm system compiler\n");
> +
> + CANON_CHECK(StringRef(InputFilename).endswith_lower(".mir"),
> + "Invalid Filename: " + InputFilename + ". " +
> + "Expected a .mir file extension.");
> +
> + // Set a diagnostic handler that doesn't exit on the first error
> + bool HasError = false;
> + Context.setDiagnosticHandler(
> + [](const DiagnosticInfo &DI, void *Context) {
> + if (DI.getSeverity() == DS_Error) {
> + bool *HasError = static_cast<bool *>(Context);
> + *HasError = true;
> + }
> +
> + auto *Remark = dyn_cast<DiagnosticInfoOptimizationBase>(&DI);
> + if (Remark && !Remark->isEnabled())
> + return;
> +
> + DiagnosticPrinterRawOStream DP(errs());
> + errs() << LLVMContext::getDiagnosticMessagePrefix(DI.getSeverity());
> + errs() << ": ";
> + DI.print(DP);
> + errs() << "\n";
> + },
> + &HasError);
> +
> + SMDiagnostic Err;
> + auto MIR = createMIRParserFromFile(InputFilename, Err, Context);
> + auto M = MIR ? MIR->parseIRModule() : nullptr;
> +
> + CANON_CHECK(M, argv[0] + "Invalid Module or MIR.");
> +
> + if (!TargetTriple.empty())
> + M->setTargetTriple(Triple::normalize(TargetTriple));
> +
> + Triple TheTriple = Triple(M->getTargetTriple());
> + if (TheTriple.getTriple().empty())
> + TheTriple.setTriple(sys::getDefaultTargetTriple());
> +
> + // Get the target specific parser.
> + std::string Error;
> + const Target *TheTarget =
> + TargetRegistry::lookupTarget(MArch, TheTriple, Error);
> +
> + CANON_CHECK(TheTarget, argv[0] + ": " + Error);
> +
> + std::string CPUStr = getCPUStr();
> + std::string FeaturesStr = getFeaturesStr();
> +
> + std::unique_ptr<TargetMachine> Target(TheTarget->createTargetMachine(
> + TheTriple.getTriple(), CPUStr, FeaturesStr,
> + InitTargetOptionsFromCodeGenFlags(), getRelocModel(), getCodeModel(),
> + CodeGenOpt::Default));
> +
> + CANON_CHECK(Target, "Could not allocate target machine!");
> +
> + std::error_code EC;
> + auto Out = llvm::make_unique<tool_output_file>(OutputFilename, EC,
> + sys::fs::F_Text);
> + CANON_CHECK(!EC, EC.message());
> + raw_pwrite_stream *OS = &Out->os();
> +
> + legacy::PassManager PM;
> + TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple()));
> + PM.add(new TargetLibraryInfoWrapperPass(TLII));
> + M->setDataLayout(Target->createDataLayout());
> + setFunctionAttributes(CPUStr, FeaturesStr, *M);
> +
> + LLVMTargetMachine &TM = static_cast<LLVMTargetMachine &>(*Target);
> + TargetPassConfig &TPC = *TM.createPassConfig(PM);
> + PM.add(&TPC);
> + MachineModuleInfo *MMI = new MachineModuleInfo(&TM);
> + PM.add(MMI);
> + TPC.printAndVerify("");
> +
> + StringRef PassName = "mir-canonicalizer";
> + const PassRegistry *PR = PassRegistry::getPassRegistry();
> + const PassInfo *PI = PR->getPassInfo(PassName);
> +
> + CANON_CHECK(PI, argv[0] + ": " + PI->getPassName() + " not registered.");
> + CANON_CHECK(PI->getNormalCtor(),
> + argv[0] + ": can't create pass: " + PI->getPassName());
> +
> + Pass *P = PI->getNormalCtor()();
> + PM.add(P);
> + TPC.printAndVerify(std::string("After ") + std::string(P->getPassName()));
> +
> + PM.add(createPrintMIRPass(*OS));
> +
> + // Before executing passes, print the final values of the LLVM options.
> + cl::PrintOptionValues();
> +
> + PM.run(*M);
> +
> + CANON_CHECK(!(*static_cast<bool *>(Context.getDiagnosticContext())),
> + argv[0] + " Error: exiting...");
> +
> + Out->keep();
> + return 0;
> +}
> +
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
More information about the llvm-dev
mailing list