[lldb-dev] Redefining functions

Filipe Cabecinhas filcab+lldb-dev at gmail.com
Thu Aug 11 18:11:18 PDT 2011


I've been toying around with loading libraries and what I can do with lldb,
but it seems some of the support isn't there:

  - I can load a library from a command, but the only thing I get back is a
"token" (the return value of dlopen());
  - I can't (as far as I can tell) find the address of the GOT entry for a
function (the one that the dynamic linker changes on first invocation; they
seem to be in the __DATA,__la_symbol_ptr section);
  - Substituting the address in the GOT wouldn't work. I'll have to turn the
original function into a jump to the new one. Nothing is in place for that;
  - I found one email from Jason Molenda where he explained how they
implemented Fix and Continue (F&C) on gdb
(http://www.cygwin.com/ml/gdb/2003-06/msg00531.html), and am trying to
do something similar. But it seems that the current dyld
implementation doesn't have a flag to skip running global constructors (or
re-registering ObjC classes), and NSLinkModule was deprecated, so these cases
would not work.

I want to continue this work, but I have some questions:

How could I get a handle (in my CommandObject) to the library loaded with
dlopen? (It can have the same file name as an already loaded library; how
can I tell which is which?)
If that is impossible, any ideas on how to add that feature?
After that, the easy way to replace the functions would be to get the
symbols (at least for functions) that are defined in the recently loaded
image and turn the current functions into jumps to the new functions.
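
As a minimal sketch of that last step, assuming an i386 target like the
a.out in the quoted session below and a new function within rel32 range of
the old one, the bytes to write over the start of the old function could be
computed like this. The helper name encode_jmp_rel32 is hypothetical; it only
builds the 5-byte x86 "jmp rel32" encoding (opcode 0xE9 followed by a
little-endian displacement measured from the end of the instruction) and does
not touch the process:

```python
import struct

JMP_REL32_OPCODE = 0xE9  # x86 near jump with a 32-bit relative displacement
JMP_REL32_SIZE = 5       # 1 opcode byte + 4 displacement bytes


def encode_jmp_rel32(old_addr: int, new_addr: int) -> bytes:
    """Return the bytes of a `jmp rel32` placed at old_addr that lands on new_addr.

    Hypothetical helper: the displacement is relative to the address of the
    instruction that follows the jump, i.e. old_addr + 5.
    """
    disp = new_addr - (old_addr + JMP_REL32_SIZE)
    if not -2**31 <= disp < 2**31:
        raise ValueError("new function is out of rel32 range of the old one")
    # "<i" packs a signed 32-bit little-endian integer.
    return bytes([JMP_REL32_OPCODE]) + struct.pack("<i", disp)


if __name__ == "__main__":
    patch = encode_jmp_rel32(0x1000, 0x2000)
    print(patch.hex())
```

Writing those bytes into the inferior, and making sure no thread is
executing inside the overwritten bytes at that moment, would still be up to
the command.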



On Mon, Aug 8, 2011 at 17:08, Filipe Cabecinhas
<filcab+lldb-dev at gmail.com> wrote:

> Hi!
> On Mon, Jul 18, 2011 at 18:13, Greg Clayton <gclayton at apple.com> wrote:
>> On Jul 18, 2011, at 1:32 PM, Filipe Cabecinhas wrote:
>> > Hi,
>> >
>> > I'm trying to create an LLDB command that sets an internal breakpoint
>> for a function, and then executes some code, but I'm having some
>> difficulties...
>> >
>> > I've seen the expression command, which does something close to what I
>> want to do after the breakpoint, but I have some doubts. I want the code to
>> be able to return from the function where it's called, but the
>> "target->EvaluateExpression" doesn't let the code return from it (while I
>> would like to execute code with something like "if (condition) return NULL;
>> more code…"). Is there a way to compile arbitrary code (with return
>> statements) and execute it?
>> Not currently.
>> >
>> > Is there a way to create something like an anonymous function (with
>> certain parameters), and have it compiled and linked, while looking up
>> global variables?
>> Current expressions can do the lookups, but as you already know they don't
>> live beyond the first invocation.
>> > ClangUtilityFunction doesn't look up any variables, and I can't seem to
>> find a way to look up global variables without a Frame object.
>> For globals you shouldn't need the frame. If the globals are in your
>> symbol table and are external you might be able to use dlsym().
>> > Is there a way to know a function (or method)'s address from its
>> prototype?
>> A normal function that was compiled into your code or an expression
>> function?
> For my first try (a command like "expr" but that would re-define functions)
> I wanted to find out the location of some function/method, given the
> prototype (e.g: "ProcessGDBRemote::StartDebugserverProcess(char const*)"). I
> would suppose we could mangle the name and try to find the symbol. I haven't
> seen any way to do that in lldb, but I suppose it's possible to do. Maybe
> I'm looking at it wrong.
>> > My final purpose is to be able to redefine functions on-the-fly (with
>> caveats for inlined functions, etc). The only way I saw that could work was
>> creating a (similar) function and making the other function a trampoline
>> (either using breakpoints, or writing a jmp expression at its address)… Did
>> I miss another easier way?
>> We do want the ability to just compile up something in an LLDB command but
>> we don't have that yet. You currently can do this via python if you really
>> want to by making a source file, invoking the compiler on it, and then
>> making a dylib. You can then use the "process load" command to load the
>> shared library:
>> (lldb) process load foo.so
>> So if you have your python code do the global variable lookups and create
>> the source code, you could hack something together.
>> When/if you are ready to try and take over the function, you can look for
>> any "Trampoline" symbols. For a simple a.out program on darwin we see:
>> (lldb) file ~/Documents/src/args/a.out
>> Current executable set to '~/Documents/src/args/a.out' (i386).
>> (lldb) image dump symtab a.out
>> Symtab, file = /Volumes/work/gclayton/Documents/src/args/a.out, num_symbols = 18:
>>               Debug symbol
>>               |Synthetic symbol
>>               ||Externally Visible
>>               |||
>> Index   UserID DSX Type         File Address/Value Load Address       Size               Flags      Name
>> ------- ------ --- ------------ ------------------ ------------------ ------------------ ---------- ----------------------------------
>> ....
>> [   10]     16     Trampoline   0x0000000000001e76                    0x0000000000000006 0x00010100 __stack_chk_fail
>> ...
>> [   12]     18     Trampoline   0x0000000000001e7c                    0x0000000000000006 0x00010100 exit
>> [   13]     19     Trampoline   0x0000000000001e82                    0x0000000000000006 0x00010100 getcwd
>> [   14]     20     Trampoline   0x0000000000001e88                    0x0000000000000006 0x00010100 perror
>> [   15]     21     Trampoline   0x0000000000001e8e                    0x0000000000000006 0x00010100 printf
>> [   16]     22     Trampoline   0x0000000000001e94                    0x0000000000000006 0x00010100 puts
>> On MacOSX, you could then easily patch the trampoline code to call your
>> own function for say "printf" by modifying the function address in the PLT
>> entry.
> That would be a good solution, at least to substitute functions that are
> accessed with the PLT. But are the trampolines reified (I don't think so)?
> Or should I just write to the process' PLT directly, after loading the
> function?
> What about replacing other functions? Let's say that I want to replace a
> random function (that I can't replace by changing the PLT). If I have
> information about which functions call it, I can replace the definition of
> the function by a jump and, if necessary, get the new versions of the
> functions that call the replaced function (doing the same to them, for a
> maximum of X iterations, for example). Though I would suppose clang won't
> give us that information (at least for now).
> Thanks for the help,
>   Filipe
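
For reference, the Python route Greg describes above (write a source file,
invoke the compiler on it, then load the result with "process load") could
be sketched roughly as follows. The function names, the lib<name>.so naming,
and the clang flags are assumptions made for illustration, not anything lldb
provides:

```python
import subprocess
from pathlib import Path


def dylib_build_command(source, output, compiler="clang"):
    """Return the argv that compiles one C source file into a shared library.

    Assumption: a clang-compatible driver that accepts -shared and -fPIC.
    """
    return [compiler, "-shared", "-fPIC", "-o", str(output), str(source)]


def write_and_build(code, workdir, name="patch"):
    """Write `code` to <workdir>/<name>.c and build <workdir>/lib<name>.so.

    Hypothetical helper: raises CalledProcessError if the compiler fails.
    """
    workdir = Path(workdir)
    source = workdir / f"{name}.c"
    output = workdir / f"lib{name}.so"
    source.write_text(code)
    subprocess.run(dylib_build_command(source, output), check=True)
    return output


if __name__ == "__main__":
    # Only prints the command; call write_and_build() to actually compile.
    print(" ".join(dylib_build_command("patch.c", "libpatch.so")))
```

The path returned by write_and_build() is what would then be handed to
"process load", as in the quoted "(lldb) process load foo.so" example.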
