[LLVMdev] Generating different assembly code for the same LLVM instruction depending on the metadata.
Alexander Potapenko
ramosian.glider at gmail.com
Tue Jun 28 02:50:12 PDT 2011
Hi LLVM devs,
suppose I have an instrumentation pass that inserts some code (say,
function calls) before certain memory access instructions and marks
those calls with special metadata.
I want the compiler to lower the instrumentation code to a sequence of
no-ops of the same length when generating the object code.
For example, the assembly for the following code:
void _instr();  // the instrumentation function

void foo(int *x) {
  _instr();
  *x = *x + 1;
  _instr();
}
should look like:
0000000000000000 <foo>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   c:   b8 00 00 00 00          mov    $0x0,%eax
  11:   90                      nop
  12:   90                      nop
  13:   90                      nop
  14:   90                      nop
  15:   90                      nop
  16:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  1a:   8b 00                   mov    (%rax),%eax
  1c:   8d 50 01                lea    0x1(%rax),%edx
  1f:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  23:   89 10                   mov    %edx,(%rax)
  25:   b8 00 00 00 00          mov    $0x0,%eax
  2a:   90                      nop
  2b:   90                      nop
  2c:   90                      nop
  2d:   90                      nop
  2e:   90                      nop
  2f:   c9                      leaveq
  30:   c3                      retq
instead of:
0000000000000000 <foo>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 83 ec 10             sub    $0x10,%rsp
   8:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   c:   b8 00 00 00 00          mov    $0x0,%eax
  11:   e8 00 00 00 00          callq  16 <foo+0x16>
  16:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  1a:   8b 00                   mov    (%rax),%eax
  1c:   8d 50 01                lea    0x1(%rax),%edx
  1f:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  23:   89 10                   mov    %edx,(%rax)
  25:   b8 00 00 00 00          mov    $0x0,%eax
  2a:   e8 00 00 00 00          callq  2f <foo+0x2f>
  2f:   c9                      leaveq
  30:   c3                      retq
What's the easiest/best way to do so?
I'm considering the following approaches:
-- use some uncommon DWARF tags to mark the instrumentation
instructions (to make sure the marks appear in the resulting binary),
and then post-process the .o file, replacing the necessary bytes
-- hack the code generator so that it emits different assembly
sequences depending on the metadata (am I right that this is not
currently supported?)
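To illustrate the byte-replacement step of the first approach, here is a
minimal sketch in C. It assumes x86-64, that the offsets of the
instrumentation calls have already been recovered (for example from the
debug info), and that each call is a 5-byte `call rel32` (opcode 0xE8), as
in the listing above. The helper name nop_out_calls is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: overwrite the 5-byte `call rel32` (opcode 0xE8)
 * found at each listed offset with five one-byte NOPs (0x90).
 * Offsets that are out of range or do not start with 0xE8 are skipped.
 * Returns the number of call sites actually patched. */
size_t nop_out_calls(uint8_t *code, size_t len,
                     const size_t *offsets, size_t n_offsets)
{
    enum { CALL_LEN = 5 };  /* 1 opcode byte + 4 displacement bytes */
    size_t patched = 0;

    for (size_t i = 0; i < n_offsets; i++) {
        size_t off = offsets[i];

        /* The whole instruction must lie inside the buffer. */
        if (off > len || len - off < CALL_LEN)
            continue;
        /* Only touch bytes that really look like a near call. */
        if (code[off] != 0xE8)
            continue;

        memset(code + off, 0x90, CALL_LEN);
        patched++;
    }
    return patched;
}
```

For foo above, the offsets would be 0x11 and 0x2a. Note that in a .o file
the zeroed displacement of each call is normally covered by a relocation
entry, so a real post-processor would presumably also have to drop those
relocations; otherwise the linker would write the call target over the
no-ops.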
My goal is to produce two versions of the code, identical in layout,
that can be hot-swapped at runtime using a single mmap() call. This
would make it possible to turn the instrumentation on and off at
runtime.
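The swap itself could be sketched roughly as follows. This assumes a
POSIX system, that both versions of the code are stored in files starting
at offset 0, and that they have identical layout so that every address
stays valid after the swap. The function name map_code_version and its
parameters are hypothetical; for real code the protection would be
PROT_READ | PROT_EXEC.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map the first `len` bytes of `fd` with protection
 * `prot`. If `addr` is NULL the kernel chooses the address. Otherwise
 * MAP_FIXED is used, so the new mapping replaces whatever was mapped at
 * `addr` before; `addr` must then be page-aligned.
 * Returns the mapped address, or MAP_FAILED on error. */
void *map_code_version(void *addr, int fd, size_t len, int prot)
{
    int flags = MAP_PRIVATE;

    if (addr != NULL)
        flags |= MAP_FIXED;

    return mmap(addr, len, prot, flags, fd, 0);
}
```

The first call would pass NULL to establish the region with one version;
each later call would pass the returned address together with the file
descriptor of the other version, so that switching the instrumentation on
or off is a single mmap() call.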