<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">2015-06-17 15:25 GMT-07:00 Duncan P. N. Exon Smith <span dir="ltr"><<a href="mailto:dexonsmith@apple.com" target="_blank">dexonsmith@apple.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class=""><div class="h5"><br>
> On 2015 Jun 16, at 10:42, Alex Lorenz <<a href="mailto:arphaman@gmail.com">arphaman@gmail.com</a>> wrote:<br>
><br>
> Hi dexonsmith, bob.wilson, bogner,<br>
><br>
> This patch is based on a patch that implements initial serialization of machine basic blocks (<a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_D10465&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=DS4o_1s3kqEnyspPjdTbzmhSW3YdM2t5UBVGotU0Qyo&s=vVpfA1QgG8jD9Ga5qtmr2oa5EwXAGnqLsqhGtpxxbi8&e=" rel="noreferrer" target="_blank">http://reviews.llvm.org/D10465</a>).<br>
><br>
> This patch is the initial patch for machine instruction serialization. Only the machine instruction names are serialized by this patch. The instructions are represented using a YAML sequence of string literals and are a part of machine basic block YAML mapping. An example of such MBB mapping is shown below:<br>
><br>
> name: entry<br>
> instructions:<br>
> - mov32r0<br>
> - retq<br>
><br>
> A machine instruction parser class 'MIParser' is introduced by this patch. This class will be used to parse the machine instructions. A supporting machine instruction lexing class will be added in the upcoming patch.<br>
><br>
> REPOSITORY<br>
> rL LLVM<br>
><br>
> <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_D10481&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=DS4o_1s3kqEnyspPjdTbzmhSW3YdM2t5UBVGotU0Qyo&s=45SvbrZI4eMre4cPJVZXoq8CFsrk1jaJwQzKQzg4FBw&e=" rel="noreferrer" target="_blank">http://reviews.llvm.org/D10481</a><br>
><br>
> Files:<br>
> include/llvm/CodeGen/MIRYamlMapping.h<br>
> lib/CodeGen/MIRParser/CMakeLists.txt<br>
> lib/CodeGen/MIRParser/MIParser.cpp<br>
> lib/CodeGen/MIRParser/MIParser.h<br>
> lib/CodeGen/MIRParser/MIRParser.cpp<br>
> lib/CodeGen/MIRPrinter.cpp<br>
> test/CodeGen/MIR/X86/lit.local.cfg<br>
> test/CodeGen/MIR/X86/machine-instructions.mir<br>
> test/CodeGen/MIR/X86/unknown-instruction.mir<br>
><br>
> EMAIL PREFERENCES<br>
> <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_settings_panel_emailpreferences_&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=DS4o_1s3kqEnyspPjdTbzmhSW3YdM2t5UBVGotU0Qyo&s=tS9D0Fg0dxXPT_Rte_ykE0icZ8Ke2opsECOW0TWHY_M&e=" rel="noreferrer" target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a><br>
</div></div>> <D10481.27769.patch><br>
<br>
LGTM with a few minor nitpicks, although I know you're waiting on the<br>
MBB parsing.<br>
<br>
> Index: lib/CodeGen/MIRParser/MIParser.cpp<br>
> ===================================================================<br>
> --- /dev/null<br>
> +++ lib/CodeGen/MIRParser/MIParser.cpp<br>
> @@ -0,0 +1,105 @@<br>
> +//===- MIParser.cpp - Machine instructions parser implementation ----------===//<br>
> +//<br>
> +// The LLVM Compiler Infrastructure<br>
> +//<br>
> +// This file is distributed under the University of Illinois Open Source<br>
> +// License. See LICENSE.TXT for details.<br>
> +//<br>
> +//===----------------------------------------------------------------------===//<br>
> +//<br>
> +// This file implements the parsing of machine instructions.<br>
> +//<br>
> +//===----------------------------------------------------------------------===//<br>
> +<br>
> +#include "MIParser.h"<br>
> +#include "llvm/ADT/StringMap.h"<br>
> +#include "llvm/CodeGen/MachineBasicBlock.h"<br>
> +#include "llvm/CodeGen/MachineFunction.h"<br>
> +#include "llvm/CodeGen/MachineInstr.h"<br>
> +#include "llvm/Support/raw_ostream.h"<br>
> +#include "llvm/Support/SourceMgr.h"<br>
> +#include "llvm/Target/TargetSubtargetInfo.h"<br>
> +#include "llvm/Target/TargetInstrInfo.h"<br>
> +<br>
> +using namespace llvm;<br>
> +<br>
> +namespace {<br>
> +<br>
> +class MIParser {<br>
> + SourceMgr &SM;<br>
> + MachineFunction &MF;<br>
> + SMDiagnostic &Error;<br>
> + StringRef Source;<br>
> + /// Maps from instruction names to op codes.<br>
> + StringMap<unsigned> Names2InstrOpCodes;<br>
> +<br>
> +public:<br>
> + MIParser(SourceMgr &SM, MachineFunction &MF, SMDiagnostic &Error,<br>
> + StringRef Source);<br>
> +<br>
> + /// Report an error at the current location with the given message.<br>
> + ///<br>
> + /// This function always return true.<br>
> + bool error(const Twine &Msg);<br>
> +<br>
> + MachineInstr *parse();<br>
> +<br>
> +private:<br>
> + void initNames2InstrOpCodes();<br>
> +<br>
> + /// Try to convert an instruction name to an opcode. Return true if the<br>
> + /// instruction name is invalid.<br>
> + bool parseInstrName(StringRef InstrName, unsigned &OpCode);<br>
> +};<br>
> +<br>
> +} // end anonymous namespace<br>
> +<br>
> +MIParser::MIParser(SourceMgr &SM, MachineFunction &MF, SMDiagnostic &Error,<br>
> + StringRef Source)<br>
> + : SM(SM), MF(MF), Error(Error), Source(Source) {}<br>
> +<br>
> +bool MIParser::error(const Twine &Msg) {<br>
> + // TODO: Get the proper location in the MIR file, not just a location inside<br>
> + // the string.<br>
> + Error =<br>
> + SMDiagnostic(SM, SMLoc(), SM.getMemoryBuffer(SM.getMainFileID())<br>
> + ->getBufferIdentifier(),<br>
> + 1, 0, SourceMgr::DK_Error, Msg.str(), Source, None, None);<br>
> + return true;<br>
> +}<br>
> +<br>
> +MachineInstr *MIParser::parse() {<br>
> + StringRef InstrName = Source;<br>
> + unsigned OpCode;<br>
> + if (parseInstrName(InstrName, OpCode)) {<br>
> + error(Twine("unknown machine instruction name '") + InstrName + "'");<br>
> + return nullptr;<br>
> + }<br>
<br>
We need a TODO somewhere (maybe here?) for the rest of the instruction ;).<br>
<br>
> +<br>
> + const auto &MCID = MF.getSubtarget().getInstrInfo()->get(OpCode);<br>
> + auto *MI = MF.CreateMachineInstr(MCID, DebugLoc());<br>
> + return MI;<br>
> +}<br>
> +<br>
> +void MIParser::initNames2InstrOpCodes() {<br>
> + if (!Names2InstrOpCodes.empty())<br>
> + return;<br>
> + const auto *TII = MF.getSubtarget().getInstrInfo();<br>
<br>
Something like:<br>
<br>
assert(TII && "Expected target instruction info");<br>
<br>
might be nice.<br>
<br>
> + for (unsigned I = 0, E = TII->getNumOpcodes(); I < E; ++I)<br>
> + Names2InstrOpCodes.insert(<br>
> + std::make_pair(StringRef(TII->getName(I)).lower(), I));<br>
> +}<br>
> +<br>
> +bool MIParser::parseInstrName(StringRef InstrName, unsigned &OpCode) {<br>
> + initNames2InstrOpCodes();<br>
> + auto InstrInfo = Names2InstrOpCodes.find(InstrName);<br>
<br>
Should we support other cases, using `InstrName.lower()`, here? (I'm<br>
not sure we should; really just a question. I'm happy to go ahead with<br>
what you have, since it's easy to loosen later.)<br></blockquote><div><br></div><div>I'm not sure really, I just print and parse <span style="line-height:normal">everything</span><span style="line-height:normal"> </span><span style="line-height:normal">using lowercase name, like LLVM IR.</span></div><div><span style="line-height:normal">I think this can be changed later if other </span>people<span style="line-height:normal"> want case </span>insensitive format.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
> + if (InstrInfo == Names2InstrOpCodes.end())<br>
> + return true;<br>
> + OpCode = InstrInfo->getValue();<br>
> + return false;<br>
> +}<br>
> +<br>
> +MachineInstr *llvm::parseMachineInstr(SourceMgr &SM, MachineFunction &MF,<br>
> + StringRef Src, SMDiagnostic &Error) {<br>
> + return MIParser(SM, MF, Error, Src).parse();<br>
> +}<br>
<br>
</blockquote></div><br></div></div>