<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><base href="x-msg://10814/"><style><!--
/* Font Definitions */
@font-face
{font-family:Helvetica;
panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
{font-family:Helvetica;
panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.apple-converted-space
{mso-style-name:apple-converted-space;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Hi Jim,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Thanks for the quick feedback!<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>I have 2 thoughts along the direction that you suggest.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Currently the parser seems to be sorting/searching the instruction table by Mnemonic as the first pass. Is it possible to sort by ConvertFn, and add a generic mnemonic field to this? Then we could convert the assembly parser to search first for the format of the instruction (after first parsing out the formatting fields), and then see if the format is supported by the mnemonic.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Just for my own reference, the Table elements are defined:<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:9.5pt;font-family:Consolas'>static const char *const MnemonicTable;<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:9.5pt;font-family:Consolas'> uint32_t Mnemonic;<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:9.5pt;font-family:Consolas'> uint16_t Opcode;<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:9.5pt;font-family:Consolas'> uint16_t ConvertFn;<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:9.5pt;font-family:Consolas'> uint32_t RequiredFeatures;<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:9.5pt;font-family:Consolas'> uint16_t Classes[18];<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>This may benefit the matching generally by evening out the what the std::equal_range function must search through, and may make my task easier since there is no clearly defined mnemonics in much of Hexagon assembly grammar.<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>The other thought is whether or not we night be able to use stl::map (ordered tree) or std::tr1::unordered_map for a hash implementation when doing the matching. It seems that most of the hard work of turning the format into a ConvertFn index has already been done by the current implementation. Maybe a few Hashes can get you to a match directly?<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Sorry for any dumb questions. I’m a LLVM newbie.<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Regards,<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>David<o:p></o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal style='text-autospace:none'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Jim Grosbach [mailto:grosbach@apple.com] <br><b>Sent:</b> Wednesday, October 17, 2012 6:55 PM<br><b>To:</b> David Young<br><b>Cc:</b> 'LLVM Developers Mailing List'<br><b>Subject:</b> Re: [LLVMdev] Hexagon Assembly parser question<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div><div><p class=MsoNormal>On Oct 17, 2012, at 3:29 PM, David Young <<a href="mailto:davidy@codeaurora.org">davidy@codeaurora.org</a>> wrote:<o:p></o:p></p></div><p class=MsoNormal><br><br><o:p></o:p></p><div><div><p class=MsoNormal>Hi,<o:p></o:p></p></div><div><p class=MsoNormal>I’m trying to enable the hexagon LLVM assembly parser. It seem like there is a lot of work that has been done to make this parsing straightforward. <o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal>But….<o:p></o:p></p></div><div><p class=MsoNormal>Hexagon assembly does not follow the “Mnemonic Rx Rx …” format that is expected by the assembly parsing infrastructure, represented by:<o:p></o:p></p></div><div><p class=MsoNormal><span style='color:#4F81BD'>StringRef Mnemonic = ((ARMOperand*)Operands[0])->getToken();</span><o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal>This Mnemonic location assumption applies to both the Tablegen Backend AsmMatcherEmitter processing, and the .inc file it produces where MatchInstructionImpl is the entry point by which the assembly input is parsed.<o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal>However Hexagon assembly has some features that make it more readable, such as r1 = r2, or if(r1) r2 = mem(#immediate). This makes taking advantage of the existing LLVM code difficult.<o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal>Currently, I see two options. <o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal>One is to preformat the assembly string(s) obtained from the td files so that it is matches the format that the tablegen backend expects, and also preformat the assembly input so that it can be matched against the preformatted assembly strings.<o:p></o:p></p></div></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>I agree this sounds pretty hacky.<o:p></o:p></p></div><p class=MsoNormal><br><br><o:p></o:p></p><div><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal>The other is to write a whole new TD backend that doesn’t rely on the Mnemonic location assumption. And hope someday to merge this backend with the current AsmMatcherEmitter.<o:p></o:p></p></div></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>The table is sorted by mnemonic (more abstractly, by operator). That's pretty fundamental to how it works, so sticking with that would be good unless you want to write an entirely new algorithm. You could probably stick with the current basic stuff with some fiddling in tablegen where the asm string gets split apart when building up matchables to re-order things appropriately. Then your ParseInstruction() implementation would do similar tricks. The printer should "just work," thankfully.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><div><p class=MsoNormal>That said, you'll also likely have to do a bit of work in the generic AsmParser code, as it'll likely look at statements like these and not realize they're instruction sequences. The "mnemonic <whitespace> operands" format is pretty strongly imprinted on everything. That's not completely unfixable, of course, but it may be a bit tricky to avoid syntactic ambiguities. Not impossible, mind, just tricky and something to pay very close attention to in your design.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>-Jim<o:p></o:p></p></div></div><p class=MsoNormal><br><br><o:p></o:p></p><div><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal>I am leaning toward the latter. The other seems like it will create many more problems in the long run. Any thoughts, comments, or recommended directions are appreciated.<o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><div><p class=MsoNormal>Regards,<o:p></o:p></p></div><div><p class=MsoNormal>David<o:p></o:p></p></div><p class=MsoNormal><span style='font-size:13.5pt;font-family:"Helvetica","sans-serif"'>_______________________________________________<br>LLVM Developers mailing list<br><a href="mailto:LLVMdev@cs.uiuc.edu"><span style='color:purple'>LLVMdev@cs.uiuc.edu</span></a><span class=apple-converted-space> </span> <a href="http://llvm.cs.uiuc.edu"><span style='color:purple'>http://llvm.cs.uiuc.edu</span></a><br><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev"><span style='color:purple'>http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</span></a><o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p></div></div></body></html>