[llvm-dev] Identifying MachineOperands that are part of an address specification
Ethan J. Johnson via llvm-dev
llvm-dev at lists.llvm.org
Thu Jul 20 09:52:38 PDT 2017
Dear LLVM-Dev,
I'm writing a system that does analysis on x86 machine code in the LLVM
backend (i.e., MachineFunctionPasses). Part of this involves data-flow
analysis (reaching definitions, to be exact) on machine instructions,
handling data flow through both registers and memory locations. This
analysis provides an interface whereby I can query it to determine the
set of definitions that reach a particular register use-operand
(MachineOperand) of a machine instruction, or a memory load operand
(MachineMemOperand).
Given this, my goal is to determine whether each variable input to an
instruction - whether a register use or a memory load - is reached by
some definition in a particular (known) set. However, the relationship
between MachineOperands and MachineMemOperands complicates this.
Whenever a machine instruction does a memory load/store, it has both:
  * A MachineMemOperand, which specifies the details of the load/store
    at a high level; and
  * A sequence of register and immediate MachineOperands, which
    represent the low-level encoding of the memory address in the
    instruction. For x86, this sequence consists of five operands,
    specifying the base register, scale constant, index register, offset
    constant, and segment register respectively. (In cases where the
    full 5-part addressing mode is not needed, some of the registers can
    be set to %noreg and the constants to identity values, e.g. scale=1
    and offset=0. This convention is detailed in the code generator
    documentation
    <http://llvm.org/docs/CodeGenerator.html#representing-x86-addressing-modes-in-machineinstrs>.)
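To make the five-operand layout concrete, here is a toy model in plain
C++. It deliberately does not use the LLVM classes: X86AddrModel, NoReg,
baseOffset and liveRegisterCount are hypothetical names made up for
illustration, and register ids are just small integers.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for %noreg. Hypothetical; not an LLVM constant.
constexpr unsigned NoReg = 0;

// The five address operands, in the order described above.
struct X86AddrModel {
    unsigned base;     // operand 0: base register
    int64_t  scale;    // operand 1: scale constant
    unsigned index;    // operand 2: index register
    int64_t  offset;   // operand 3: offset constant
    unsigned segment;  // operand 4: segment register
};

// A plain [base + offset] access: the unused registers become NoReg and
// the scale takes its identity value of 1.
inline X86AddrModel baseOffset(unsigned base, int64_t offset) {
    return {base, 1, NoReg, offset, NoReg};
}

// How many of the three register slots actually name a register.
inline int liveRegisterCount(const X86AddrModel &a) {
    return (a.base != NoReg) + (a.index != NoReg) + (a.segment != NoReg);
}
```

Even a simple frame-relative load still carries all five slots in this
model; only the base register and the offset hold meaningful values.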
The problem I'm having is that there's no way to tell from the
MachineOperands themselves whether they were generated as part of a
memory address specification sequence, or as a "real" register use that
provides a value for the instruction to compute on. Thus when I go to
query my reaching-definitions interface, I don't know which register
operands I should be querying /as registers/ and which I should be
skipping to instead query /as memory accesses/ (i.e., via their
MachineMemOperands). Although it's certainly /valid/ to ask the question
"which definitions reach this register operand" when the operand is part
of an address specification, it's not particularly /useful/ - I'm
interested in the flow of data in the logical computation, not "the
value of RBP used in this stack-frame-relative load was defined in the
'mov %rsp, %rbp' instruction at the beginning of the function". That is
why I want to skip these operands and look at the respective
MachineMemOperands instead.
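The split I am after can be sketched as follows, again with toy types
rather than LLVM's. ToyOperand, QueryPlan and planQueries are
hypothetical names, and the sketch assumes the start index of the
address sequence (addrStart) is already known, which is exactly the
piece of information I don't have a reliable way to obtain.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy operand: just enough state to decide how to query it.
struct ToyOperand {
    bool isReg;  // register operand (as opposed to an immediate)
    bool isUse;  // read by the instruction (as opposed to defined)
};

struct QueryPlan {
    std::vector<std::size_t> registerQueries;  // operand indices to query as registers
    bool memoryQuery;                          // query the memory access instead
};

// Length of the x86 address sequence described above.
constexpr std::size_t kAddrOperands = 5;

// addrStart is the index of the first address operand, or -1 if the
// instruction has no memory reference. ASSUMPTION: the caller knows it.
inline QueryPlan planQueries(const std::vector<ToyOperand> &ops, int addrStart) {
    QueryPlan plan{{}, addrStart >= 0};
    for (std::size_t i = 0; i < ops.size(); ++i) {
        bool inAddr = false;
        if (addrStart >= 0) {
            std::size_t start = static_cast<std::size_t>(addrStart);
            inAddr = i >= start && i < start + kAddrOperands;
        }
        // Register uses outside the address sequence are "real" inputs.
        if (!inAddr && ops[i].isReg && ops[i].isUse)
            plan.registerQueries.push_back(i);
    }
    return plan;
}
```

Without a known addrStart, every register use in the address sequence
ends up in registerQueries, which is the unhelpful behavior described
above.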
So, my question is: *is there any good way to identify whether a
MachineOperand was generated as part of a memory-addressing sequence?*
I looked through the MachineOperand and MachineMemOperand Doxygen trying
to find some link between the two, but to no avail. I also read through
a lot of the CodeGen and X86 backend code learning how these operand
sequences are generated, but I didn't see a /single/ place where this
/consistently/ happens that I could (for instance) modify to note which
operands are generated this way. As a last resort I could try to guess
which operands are the memory addressing sequence by position (e.g., for
stores, the memory-addressing operands seem to always come first), but I
would /really/ prefer not to do that because there are so many memory
instructions in x86 that it would be a lot of work to comprehensively
account for all of them. :-)
Thank you,
Ethan Johnson
--
Ethan J. Johnson
Computer Science PhD student, Systems group, University of Rochester
ejohns48 at cs.rochester.edu
ethanjohnson at acm.org
PGP public key available from public directory or on request