[Lldb-commits] [PATCH] D87868: [RFC] When calling the process mmap try to call all found instead of just the first one

Thu Oct 1 01:00:44 PDT 2020

labath added a comment.

In D87868#2304680 <https://reviews.llvm.org/D87868#2304680>, @clayborg wrote:

> That is possible, though how do we figure out the syscall enumeration for the mmap syscall reliably for a given system? And if we are remote debugging? Seems like we are signing up for architecture specific syscalls on every different architecture. Not sure I would go that route, but I am open to seeing a solution before passing judgement.

Well.. we already need to "know" the right target-specific values for the various PROT_ and MAP_ flags, so I don't think including the syscall number (and the syscall opcode) would be too much of a stretch.
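
To make the "target-specific values" point concrete, here is a minimal sketch of what a per-architecture table might look like. The struct and function names (`mmap_abi`, `find_mmap_abi`) are hypothetical and not existing lldb code; the numbers are the Linux values for x86_64 and aarch64 as found in the kernel headers, and should be checked against the headers of the actual target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-target description of what is needed to issue mmap. */
struct mmap_abi {
    const char *arch;  /* architecture name used as the lookup key */
    long sys_mmap;     /* mmap syscall number for this target */
    int map_private;   /* MAP_PRIVATE value */
    int map_anon;      /* MAP_ANONYMOUS value */
};

/* PROT_ flags have the same values on both Linux targets listed below. */
enum { kProtRead = 0x1, kProtWrite = 0x2, kProtExec = 0x4 };

/* Linux values only; other OSes (and some Linux arches) differ. */
static const struct mmap_abi kLinuxMmapAbi[] = {
    {"x86_64", 9, 0x02, 0x20},
    {"aarch64", 222, 0x02, 0x20},
};

/* Returns the entry for `arch`, or NULL if the target is not in the table. */
static const struct mmap_abi *find_mmap_abi(const char *arch) {
    assert(arch != NULL);
    for (size_t i = 0; i < sizeof kLinuxMmapAbi / sizeof kLinuxMmapAbi[0]; ++i) {
        if (strcmp(kLinuxMmapAbi[i].arch, arch) == 0)
            return &kLinuxMmapAbi[i];
    }
    return NULL;
}
```

A caller would fall back to the existing function-call approach whenever the lookup returns NULL.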

>> Note that this would not need to be implemented in the lldb client. This sort of thing would be natural to implement in lldb server in response to the `_M` packet. There it would be easy to encode the abi details needed to issue a syscall. The client already prefers this packet, and the existing code could remain as a fallback for platforms not implementing it.
>
> There is no code currently that does any run control down inside of lldb-server and I would like to keep it that way if possible. debugserver can do it only because we have the task port to the program we are debugging and we can call a function in the debugserver process that does the memory allocation in the debug process. Having this done in lldb-server would require lldb-server to perform run control by changing register values, running the syscall, stopping at a breakpoint after the syscall has run, removing that breakpoint only if it didn't exist already. Seems like a lot of dangerous flow control that we won't be able to see if anything goes wrong. Right now if we are doing it all through the GDB remote protocol, we can see exactly how things go wrong in the packet log, but now it would be a mystery if things go wrong.

I do share that sentiment, and if the setup needed for this would be anything like the mmap dance, I wouldn't even try it. The only reason I brought this up is because I expect the syscall routine to be much simpler (like, maybe simpler than the instruction emulation routine that we do on arm). There's no need to set breakpoints, as we're just PTRACE_SINGLESTEPing over a single instruction (unless we're on arm of course). So the implementation would be something like:

- save all registers
- find first executable page
- write "int 0x80" to the first two bytes
- setup registers for a syscall (including pointing pc to the int 0x80 instruction)
- PTRACE_SINGLESTEP+waitpid
- fetch result from %rax
- restore bytes overwritten by the int 0x80
- restore all registers
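
The ordering of the steps above can be sketched as follows. This is only a model, not a working injector: the tracee is a plain in-memory struct (`fake_tracee`, a hypothetical name), `pc` is an offset into a fake page, and `fake_single_step` stands in for PTRACE_SINGLESTEP+waitpid by pretending the "kernel" returns `args[0] + 1`. In a real implementation the struct accesses would become ptrace register and memory requests, and which registers carry the arguments depends on the ABI.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified register set of a stopped tracee. */
struct fake_regs {
    uint64_t pc;       /* program counter, here an offset into `page` */
    uint64_t rax;      /* syscall number on entry, result on exit */
    uint64_t args[6];  /* argument registers; real mapping is ABI-specific */
};

/* Hypothetical stand-in for the traced process. */
struct fake_tracee {
    struct fake_regs regs;
    uint8_t page[16];  /* stand-in for the first executable page */
};

/* x86 encoding of "int 0x80": two bytes. */
static const uint8_t kInt80[2] = {0xCD, 0x80};

/* Stand-in for PTRACE_SINGLESTEP + waitpid. Fails unless pc points at
   int 0x80; otherwise fakes a result and advances pc past the opcode. */
static int fake_single_step(struct fake_tracee *t) {
    uint64_t pc = t->regs.pc;
    if (pc > sizeof t->page - 2 || memcmp(&t->page[pc], kInt80, 2) != 0)
        return -1;
    t->regs.rax = t->regs.args[0] + 1;  /* pretend syscall result */
    t->regs.pc = pc + 2;
    return 0;
}

/* Runs one injected syscall and leaves the tracee as it was found.
   Returns 0 on success and stores the result in *result. */
static int inject_syscall(struct fake_tracee *t, uint64_t nr,
                          const uint64_t args[6], uint64_t *result) {
    assert(t != NULL && args != NULL && result != NULL);

    struct fake_regs saved_regs = t->regs;        /* save all registers */
    const uint64_t insn_addr = 0;                 /* "first executable page" */
    uint8_t saved_bytes[2];
    memcpy(saved_bytes, &t->page[insn_addr], 2);  /* remember original bytes */
    memcpy(&t->page[insn_addr], kInt80, 2);       /* write int 0x80 */

    t->regs.pc = insn_addr;                       /* point pc at the opcode */
    t->regs.rax = nr;                             /* syscall number */
    memcpy(t->regs.args, args, sizeof t->regs.args);

    int rc = fake_single_step(t);                 /* step over one instruction */
    if (rc == 0)
        *result = t->regs.rax;                    /* fetch result */

    memcpy(&t->page[insn_addr], saved_bytes, 2);  /* restore overwritten bytes */
    t->regs = saved_regs;                         /* restore all registers */
    return rc;
}
```

Note that the restore steps run even when the step fails, so the tracee is never left with a patched page or clobbered registers.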

I don't think this would be too complicated, though it would obviously have to be done with a steady hand. In fact, one of the original use cases for the linux ptrace syscall was the ability for the tracer to track/replace/emulate syscalls done by the traced process. Unfortunately, the intended use was a bit different (things like qemu and user-mode linux) and it assumes the tracer wants to modify a syscall that the tracee already wanted to make (and not inject a new one "out of the blue"). This means we cannot use the existing support for that. That said, I think I've managed to convince the NetBSD folks to include out-of-blue syscall support in their ptrace implementation. Maybe I should try the same for linux... :P

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87868/new/

https://reviews.llvm.org/D87868