<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Dear LLVM-Dev,<br>
<br>
I'm writing a system that does analysis on x86 machine code in the
LLVM backend (i.e., MachineFunctionPasses). Part of this involves
data-flow analysis (reaching definitions, to be exact) on machine
instructions, handling data flow through both registers and memory
locations. This analysis provides an interface whereby I can query
it to determine the set of definitions that reach a particular
register use-operand (MachineOperand) of a machine instruction, or a
memory load operand (MachineMemOperand).<br>
<br>
Given this, my goal is to determine whether each variable input to
an instruction - whether a register use or a memory load - is
reached by some definition in a particular (known) set. However, the
relationship between MachineOperands and MachineMemOperands
complicates this.<br>
<br>
Whenever a machine instruction does a memory load/store, it has
both:<br>
<ul>
<li>A MachineMemOperand, which specifies the details of the
load/store at a high level; and</li>
<li>A sequence of register and immediate MachineOperands, which
represent the low-level encoding of the memory address in the
instruction. For x86, this sequence consists of five operands,
specifying the base register, scale constant, index register,
offset constant, and segment register respectively. (In cases
where the full 5-part addressing mode is not needed, some of the
registers can be set to %noreg and the constants to identity
values, e.g. scale=1 and offset=0. This convention is detailed
in <a
href="http://llvm.org/docs/CodeGenerator.html#representing-x86-addressing-modes-in-machineinstrs">the
code generator documentation</a>.)</li>
</ul>
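To make the layout in the second bullet concrete, here is a minimal sketch of the five-part sequence. The names <code>Reg</code>, <code>NoReg</code>, <code>X86Address</code> and <code>addressRegisterUses</code> are hypothetical stand-ins for illustration, not LLVM's API; the sketch only assumes the operand order described above and that register id 0 plays the role of %noreg.<br>

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical register id; 0 stands in for %noreg (an assumption of this sketch).
using Reg = unsigned;
constexpr Reg NoReg = 0;

// Hypothetical model of the five address operands, in the order listed above.
struct X86Address {
    Reg base;             // base register, or NoReg
    std::int64_t scale;   // scale constant (identity value: 1)
    Reg index;            // index register, or NoReg
    std::int64_t offset;  // offset constant (identity value: 0)
    Reg segment;          // segment register, or NoReg
};

// Collect the registers that the address computation actually reads,
// skipping slots that are set to NoReg.
std::vector<Reg> addressRegisterUses(const X86Address &addr) {
    // x86 only encodes scale factors of 1, 2, 4 and 8.
    assert(addr.scale == 1 || addr.scale == 2 ||
           addr.scale == 4 || addr.scale == 8);
    std::vector<Reg> uses;
    for (Reg r : {addr.base, addr.index, addr.segment})
        if (r != NoReg)
            uses.push_back(r);
    return uses;
}
```

Under these assumptions, a stack-frame-relative access such as -8(%rbp) would be modeled as <code>X86Address{RBP, 1, NoReg, -8, NoReg}</code> (with <code>RBP</code> being whatever id the sketch assigns to that register), and only the base register shows up as a register use.<br>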
The problem I'm having is that there's no way to tell from the
MachineOperands themselves whether they were generated as part of a
memory address specification sequence, or as a "real" register use
that provides a value to be computed on by the instruction. Thus
when I go to query my reaching-definitions interface, I don't know
which register operands I should be querying <i>as registers</i>
and which I should be skipping to instead query <i>as memory
accesses</i> (i.e., via their MachineMemOperands). Although it's
certainly <i>valid</i> to ask the question "which definitions reach
this register operand" when the operand is part of an address
specification, it's not particularly <i>useful</i> - I'm interested
in the flow of data in the logical computation, not "the value of
RBP used in this stack-frame-relative load was defined in the 'mov
%rsp, %rbp' instruction at the beginning of the function". That is
why I want to skip these operands and instead look at the
corresponding MachineMemOperands.<br>
<br>
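The split I have in mind can be sketched as follows. This is only an illustration with hypothetical types (<code>Operand</code>, <code>valueRegisterUses</code>, and the <code>memStart</code> index are all made up for this sketch and are not LLVM's API); it assumes the position of the first address operand is somehow already known, which is exactly the piece of information I am missing.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified operand: either a register or an immediate.
struct Operand {
    bool isReg;    // true for a register operand, false for an immediate
    unsigned reg;  // register id; 0 stands in for %noreg (assumption)
    bool isUse;    // true if the register is read by the instruction
};

// Length of the x86 address sequence: base, scale, index, offset, segment.
constexpr std::size_t kAddrOperands = 5;

// Return the indices of register uses that are NOT part of the address
// sequence. memStart is the index of the first address operand, or -1 if
// the instruction has no memory reference.
std::vector<std::size_t> valueRegisterUses(const std::vector<Operand> &ops,
                                           int memStart) {
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < ops.size(); ++i) {
        bool inAddress = memStart >= 0 &&
            i >= static_cast<std::size_t>(memStart) &&
            i < static_cast<std::size_t>(memStart) + kAddrOperands;
        if (inAddress)
            continue;  // query these via the memory operand instead
        if (ops[i].isReg && ops[i].isUse && ops[i].reg != 0)
            result.push_back(i);  // query these as registers
    }
    return result;
}
```

For a store whose address sequence occupies operands 0 through 4 and whose stored value is operand 5, only index 5 would be returned; the five address operands would be left to the memory-side query.<br>
<br>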
So, my question is: <b>is there any good way to identify whether a
MachineOperand was generated as part of a memory-addressing sequence?</b><br>
<br>
I looked through the MachineOperand and MachineMemOperand Doxygen
trying to find some link between the two, but to no avail. I also
read through a lot of the CodeGen and X86 backend code learning how
these operand sequences are generated, but I didn't see a <i>single</i>
place where this <i>consistently</i> happens that I could (for
instance) modify to note which operands are generated this way. As a
last resort I could try to guess which operands are the memory
addressing sequence by position (e.g., for stores, the
memory-addressing operands seem to always come first), but I would <i>really</i>
prefer not to do that because there are so many memory instructions
in x86 that it would be a lot of work to comprehensively account for
all of them. :-)<br>
<br>
Thank you,<br>
Ethan Johnson<br>
<pre class="moz-signature" cols="72">--
Ethan J. Johnson
Computer Science PhD student, Systems group, University of Rochester
<a class="moz-txt-link-abbreviated" href="mailto:ejohns48@cs.rochester.edu">ejohns48@cs.rochester.edu</a>
<a class="moz-txt-link-abbreviated" href="mailto:ethanjohnson@acm.org">ethanjohnson@acm.org</a>
PGP public key available from public directory or on request</pre>
</body>
</html>