<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=iso-8859-1"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-AU link="#0563C1" vlink="#954F72"><div class=WordSection1><p class=MsoNormal>Welcome to all<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Questions from a veteran programmer with no LLVM backend experience who is evaluating<o:p></o:p></p><p class=MsoNormal>LLVM for creating a Hitachi 6309 backend.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>This post is about finding out more about machine instruction operands.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The documentation I have read so far includes:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>- the online manuals<o:p></o:p></p><p class=MsoNormal>- Building an LLVM Backend. Fraser Cormack, Pierre-André Saulais<o:p></o:p></p><p class=MsoNormal>- The Design of a Custom 32-bit RISC CPU and LLVM Compiler Backend. Connor Jan Goldberg<o:p></o:p></p><p class=MsoNormal>- Design and Implementation of a TriCore Backend for the LLVM Compiler Framework. Christoph Erhardt<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I have also cloned LLVM 9.0.1 and started looking at some of the targets. 
A little overwhelming!<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>At this point I'm at information overload!<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>From "The LLVM Target-Independent Code Generator":<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The MachineInstr class<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The operands of a machine instruction can be of several different types: a register reference, a constant integer, a basic block reference, etc.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Where are these operand types defined or documented (especially the ones covered by "etc.")?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>How do these operand types relate to the operands specified in the instruction selection patterns?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>A concern I have is raised in "Design and Implementation of a TriCore Backend for the LLVM Compiler Framework", where<o:p></o:p></p><p class=MsoNormal>the instruction set is non-orthogonal (it contains special-purpose address registers): <o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The strict distinction between pointers and integers is highly problematic because LLVM’s<o:p></o:p></p><p class=MsoNormal>code generator implicitly converts all pointers to integers of the same width ... 
upon <o:p></o:p></p><p class=MsoNormal>construction of the SelectionDAG.<o:p></o:p></p><p class=MsoNormal>.<o:p></o:p></p><p class=MsoNormal>.<o:p></o:p></p><p class=MsoNormal>.<o:p></o:p></p><p class=MsoNormal>As mentioned above, LLVM’s agnosticism regarding pointers initially makes it impossible<o:p></o:p></p><p class=MsoNormal>to comply with the EABI as there is no way to tell whether an integer argument<o:p></o:p></p><p class=MsoNormal>should go into an address register or a data register.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>However, this document dates from circa 2008/2009, and I would like to know whether the situation remains the same<o:p></o:p></p><p class=MsoNormal>today.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I ask because the backend I would like to write targets the Hitachi/Motorola 6309/6809, which also<o:p></o:p></p><p class=MsoNormal>provides dedicated indexing (addressing) registers. In fact, in all binary operations the second<o:p></o:p></p><p class=MsoNormal>operand is either an immediate or some kind of memory reference via an index/address register.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The syntax is:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal> {[}{OffsetReg | Disp{5,8,16}},{- | --}IndexReg{+ | ++ | ]}<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>OffsetReg can be an 8-bit or 16-bit accumulator (so only certain registers are allowed)<o:p></o:p></p><p class=MsoNormal>Displacement can be 5, 8 or 16 bits, signed<o:p></o:p></p><p class=MsoNormal>IndexReg can only be one of the special index registers, the PC, or a stack pointer<o:p></o:p></p><p class=MsoNormal>+ ++ is post-increment by 1, 2 respectively<o:p></o:p></p><p class=MsoNormal>- -- is pre-decrement by 1, 2 respectively<o:p></o:p></p><p class=MsoNormal>[ ] the entire effective address is a pointer to a pointer<o:p></o:p></p><p class=MsoNormal>[] and any incrementors/decrementors are mutually exclusive<o:p></o:p></p><p 
class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>So, given the machine instruction:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal> add d ,x # add to the d register what the x register points at<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Further examples of the second argument are:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>,x+ # what register x points to, then post-increment x, i.e. *x++<o:p></o:p></p><p class=MsoNormal>10,y # what register y + 10 points to, i.e. *(y+10)<o:p></o:p></p><p class=MsoNormal>[20,u] # what the pointer stored at register u + 20 points to, i.e. **(u+20)<o:p></o:p></p><p class=MsoNormal>w,y # what register y + register w points to, i.e. *(y+w)<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Is there a way to pattern match these kinds of operands?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>In MachineOperand.h I see this operand type. I assume I can match against it?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal> MO_TargetIndex, ///< Target-dependent index+offset operand.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>From https://llvm.org/docs/CodeGenerator.html#x86-addressing-mode:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>The x86 has a very flexible way of accessing memory. It is capable of forming memory addresses of the following <o:p></o:p></p><p class=MsoNormal>expression directly in integer instructions (which use ModR/M addressing):<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>SegmentReg: Base + [1,2,4,8] * IndexReg + Disp32<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>In order to represent this, LLVM tracks no less than 5 operands for each memory operand of this form. 
This means <o:p></o:p></p><p class=MsoNormal>that the “load” form of ‘mov’ has the following MachineOperands in this order:<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Index: 0 | 1 2 3 4 5<o:p></o:p></p><p class=MsoNormal>Meaning: DestReg, | BaseReg, Scale, IndexReg, Displacement Segment<o:p></o:p></p><p class=MsoNormal>OperandTy: VirtReg, | VirtReg, UnsImm, VirtReg, SignExtImm PhysReg<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Stores, and all other instructions, treat the four memory operands in the same way and in the same order. If the <o:p></o:p></p><p class=MsoNormal>segment register is unspecified (regno = 0), then no segment override is generated. “Lea” operations do not have <o:p></o:p></p><p class=MsoNormal>a segment register specified, so they only have 4 operands for their memory reference.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>I then looked at the files in lib/Target/X86, and I have to admit I got lost trying to find where and<o:p></o:p></p><p class=MsoNormal>how this is implemented.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>At this (learning) stage I would appreciate any input or pointers, including any other documentation or<o:p></o:p></p><p class=MsoNormal>tutorials that might help me implement indexed memory addressing operands.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Any comments are welcome.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Walter<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p></div></body></html>