[lldb-dev] lldb on linux -- immediate crash upon attaching to process

Wed Mar 16 10:25:49 PDT 2011

On Mar 16, 2011, at 7:40 AM, Jason E. Aten wrote:

> On Mon, Mar 14, 2011 at 11:11 PM, Stephen Wilson <wilsons at start.ca> wrote:
> Well, in a nutshell you would need to implement something similar to
> what ProcessLinux::DoLaunch does, but in this case you want things to
> boil down to a ptrace(ATTACH) instead of a fork() + ptrace(TRACEME).
> The basic sketch would be:
>   - Define a new ProcessMonitor ctor that takes a pid as argument.
>   - Define ProcessMonitor::Attach which does the actual ptrace magic.
>   - Write another StartOperationThread method that takes a (new)
>     AttachArgs struct as argument (could just contain the pid for now)
>     and sets up the monitoring business in essentially the same way as
>     the current launch-based code does.  Probably rename
>     OperationThread to LaunchOpThread or similar and write your own
>     AttachOpThread analog.
> It would certainly be nice to have that implemented.  I do not see
> anything that would cause any complications off hand, and it should
> remain fairly isolated from all the other work that needs to happen wrt
> linux support.
> 
> 
> Thanks Steve!  I scoped out the work a little bit, mostly by stepping through both the Xcode version and the current Linux version in the debugger.  Btw it looks like the current version of lldb has been incremented (now r127600), which is very good news.

Yes, we recently updated to pick up some needed disassembler fixes.

> I note that the main contrast is this: the darwin built lldb uses the ProcessGDBRemote class, implemented in the "llvm/tools/lldb/source/Plugins/Process/gdb-remote" directory, rather than ProcessLinux.
> 
> The curious thing is: when I look through the gdb-remote code, there are only two lines that are #ifdef APPLE.  It seems fairly reusable.

It is very reusable and can be used for just about any debugging. It can probably even be used with the gdbserver binary produced by GDB, but that binary would need to be modified to support some of the new packets we added to the GDB remote protocol for register set discovery (qRegisterInfo) and host info (qHostInfo). The register info packets allow complete discovery of process registers and include all of the DWARF and GCC register numberings, as well as the generic registers: "pc" (rip/eip on x86), "sp" (rsp/esp on x86), "fp" (rbp/ebp on x86), "ra" (return address, n/a on x86), and "flags" (rflags/eflags on x86). Example packets for an x86_64 register context on MacOSX look like:

-> $qRegisterInfo0#00
<- $name:rax;bitsize:64;offset:0;encoding:uint;format:hex;set:General Purpose Registers;gcc:0;dwarf:0;#00
-> $qRegisterInfo1#00
<- $name:rbx;bitsize:64;offset:8;encoding:uint;format:hex;set:General Purpose Registers;gcc:3;dwarf:3;#00
-> $qRegisterInfo2#00
<- $name:rcx;bitsize:64;offset:16;encoding:uint;format:hex;set:General Purpose Registers;gcc:2;dwarf:2;#00
-> $qRegisterInfo3#00
<- $name:rdx;bitsize:64;offset:24;encoding:uint;format:hex;set:General Purpose Registers;gcc:1;dwarf:1;#00
-> $qRegisterInfo4#00
<- $name:rdi;bitsize:64;offset:32;encoding:uint;format:hex;set:General Purpose Registers;gcc:5;dwarf:5;#00
-> $qRegisterInfo5#00
<- $name:rsi;bitsize:64;offset:40;encoding:uint;format:hex;set:General Purpose Registers;gcc:4;dwarf:4;#00
-> $qRegisterInfo6#00
<- $name:rbp;alt-name:fp;bitsize:64;offset:48;encoding:uint;format:hex;set:General Purpose Registers;gcc:6;dwarf:6;generic:fp;#00
-> $qRegisterInfo7#00
<- $name:rsp;alt-name:sp;bitsize:64;offset:56;encoding:uint;format:hex;set:General Purpose Registers;gcc:7;dwarf:7;generic:sp;#00
-> $qRegisterInfo8#00
<- $name:r8;bitsize:64;offset:64;encoding:uint;format:hex;set:General Purpose Registers;gcc:8;dwarf:8;#00
-> $qRegisterInfo9#00
<- $name:r9;bitsize:64;offset:72;encoding:uint;format:hex;set:General Purpose Registers;gcc:9;dwarf:9;#00
-> $qRegisterInfoa#00
<- $name:r10;bitsize:64;offset:80;encoding:uint;format:hex;set:General Purpose Registers;gcc:10;dwarf:10;#00
-> $qRegisterInfob#00
<- $name:r11;bitsize:64;offset:88;encoding:uint;format:hex;set:General Purpose Registers;gcc:11;dwarf:11;#00
-> $qRegisterInfoc#00
<- $name:r12;bitsize:64;offset:96;encoding:uint;format:hex;set:General Purpose Registers;gcc:12;dwarf:12;#00
-> $qRegisterInfod#00
<- $name:r13;bitsize:64;offset:104;encoding:uint;format:hex;set:General Purpose Registers;gcc:13;dwarf:13;#00
-> $qRegisterInfoe#00
<- $name:r14;bitsize:64;offset:112;encoding:uint;format:hex;set:General Purpose Registers;gcc:14;dwarf:14;#00
-> $qRegisterInfof#00
<- $name:r15;bitsize:64;offset:120;encoding:uint;format:hex;set:General Purpose Registers;gcc:15;dwarf:15;#00
-> $qRegisterInfo10#00
<- $name:rip;alt-name:pc;bitsize:64;offset:128;encoding:uint;format:hex;set:General Purpose Registers;gcc:16;dwarf:16;generic:pc;#00
-> $qRegisterInfo11#00
<- $name:rflags;alt-name:flags;bitsize:64;offset:136;encoding:uint;format:hex;set:General Purpose Registers;#00
-> $qRegisterInfo12#00
<- $name:cs;bitsize:64;offset:144;encoding:uint;format:hex;set:General Purpose Registers;#00
-> $qRegisterInfo13#00
<- $name:fs;bitsize:64;offset:152;encoding:uint;format:hex;set:General Purpose Registers;#00
-> $qRegisterInfo14#00
<- $name:gs;bitsize:64;offset:160;encoding:uint;format:hex;set:General Purpose Registers;#00
-> $qRegisterInfo15#00
<- $name:fctrl;bitsize:16;offset:176;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo16#00
<- $name:fstat;bitsize:16;offset:178;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo17#00
<- $name:ftag;bitsize:8;offset:180;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo18#00
<- $name:fop;bitsize:16;offset:182;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo19#00
<- $name:fioff;bitsize:32;offset:184;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo1a#00
<- $name:fiseg;bitsize:16;offset:188;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo1b#00
<- $name:fooff;bitsize:32;offset:192;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo1c#00
<- $name:foseg;bitsize:16;offset:196;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo1d#00
<- $name:mxcsr;bitsize:32;offset:200;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo1e#00
<- $name:mxcsrmask;bitsize:32;offset:204;encoding:uint;format:hex;set:Floating Point Registers;#00
-> $qRegisterInfo1f#00
<- $name:stmm0;bitsize:80;offset:208;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:33;dwarf:33;#00
-> $qRegisterInfo20#00
<- $name:stmm1;bitsize:80;offset:224;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:34;dwarf:34;#00
-> $qRegisterInfo21#00
<- $name:stmm2;bitsize:80;offset:240;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:35;dwarf:35;#00
-> $qRegisterInfo22#00
<- $name:stmm3;bitsize:80;offset:256;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:36;dwarf:36;#00
-> $qRegisterInfo23#00
<- $name:stmm4;bitsize:80;offset:272;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:37;dwarf:37;#00
-> $qRegisterInfo24#00
<- $name:stmm5;bitsize:80;offset:288;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:38;dwarf:38;#00
-> $qRegisterInfo25#00
<- $name:stmm6;bitsize:80;offset:304;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:39;dwarf:39;#00
-> $qRegisterInfo26#00
<- $name:stmm7;bitsize:80;offset:320;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:40;dwarf:40;#00
-> $qRegisterInfo27#00
<- $name:xmm0;bitsize:128;offset:336;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:17;dwarf:17;#00
-> $qRegisterInfo28#00
<- $name:xmm1;bitsize:128;offset:352;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:18;dwarf:18;#00
-> $qRegisterInfo29#00
<- $name:xmm2;bitsize:128;offset:368;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:19;dwarf:19;#00
-> $qRegisterInfo2a#00
<- $name:xmm3;bitsize:128;offset:384;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:20;dwarf:20;#00
-> $qRegisterInfo2b#00
<- $name:xmm4;bitsize:128;offset:400;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:21;dwarf:21;#00
-> $qRegisterInfo2c#00
<- $name:xmm5;bitsize:128;offset:416;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:22;dwarf:22;#00
-> $qRegisterInfo2d#00
<- $name:xmm6;bitsize:128;offset:432;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:23;dwarf:23;#00
-> $qRegisterInfo2e#00
<- $name:xmm7;bitsize:128;offset:448;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:24;dwarf:24;#00
-> $qRegisterInfo2f#00
<- $name:xmm8;bitsize:128;offset:464;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:25;dwarf:25;#00
-> $qRegisterInfo30#00
<- $name:xmm9;bitsize:128;offset:480;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:26;dwarf:26;#00
-> $qRegisterInfo31#00
<- $name:xmm10;bitsize:128;offset:496;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:27;dwarf:27;#00
-> $qRegisterInfo32#00
<- $name:xmm11;bitsize:128;offset:512;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:28;dwarf:28;#00
-> $qRegisterInfo33#00
<- $name:xmm12;bitsize:128;offset:528;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:29;dwarf:29;#00
-> $qRegisterInfo34#00
<- $name:xmm13;bitsize:128;offset:544;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:30;dwarf:30;#00
-> $qRegisterInfo35#00
<- $name:xmm14;bitsize:128;offset:560;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:31;dwarf:31;#00
-> $qRegisterInfo36#00
<- $name:xmm15;bitsize:128;offset:576;encoding:vector;format:vector-uint8;set:Floating Point Registers;gcc:32;dwarf:32;#00
-> $qRegisterInfo37#00
<- $name:trapno;bitsize:32;offset:696;encoding:uint;format:hex;set:Exception State Registers;#00
-> $qRegisterInfo38#00
<- $name:err;bitsize:32;offset:700;encoding:uint;format:hex;set:Exception State Registers;#00
-> $qRegisterInfo39#00
<- $name:faultvaddr;bitsize:64;offset:704;encoding:uint;format:hex;set:Exception State Registers;#00
-> $qRegisterInfo3a#00
<- $E45#00
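
As a rough illustration of how a client might consume these replies, here is a minimal Python sketch (not code from LLDB). It splits a reply payload into key/value pairs and keeps asking for the next register index, written in hex as in the log above, until an error reply such as E45 comes back. The names parse_register_info, enumerate_registers and send_query are made up for this example, and the canned replies stand in for a real connection.

```python
def parse_register_info(payload):
    """Split a qRegisterInfo reply payload into a dict of key -> value."""
    info = {}
    for field in payload.split(";"):
        if not field:
            continue  # the payload ends with a trailing ';'
        key, _, value = field.partition(":")
        info[key] = value
    # These fields are printed in decimal in the log above.
    for key in ("bitsize", "offset", "gcc", "dwarf"):
        if key in info:
            info[key] = int(info[key])
    return info


def enumerate_registers(send_query):
    """Query register 0, 1, 2, ... until the server answers with an error.

    send_query is a hypothetical callable: request payload in, reply payload out.
    """
    registers = []
    index = 0
    while True:
        reply = send_query("qRegisterInfo%x" % index)  # index is sent in hex
        if not reply or reply.startswith("E"):
            break  # e.g. "E45" once the index runs past the last register
        registers.append(parse_register_info(reply))
        index += 1
    return registers


# Canned replies copied from the log above, standing in for a real server.
CANNED_REPLIES = {
    "qRegisterInfo0": "name:rax;bitsize:64;offset:0;encoding:uint;format:hex;"
                      "set:General Purpose Registers;gcc:0;dwarf:0;",
    "qRegisterInfo1": "name:rbx;bitsize:64;offset:8;encoding:uint;format:hex;"
                      "set:General Purpose Registers;gcc:3;dwarf:3;",
    "qRegisterInfo2": "name:rcx;bitsize:64;offset:16;encoding:uint;format:hex;"
                      "set:General Purpose Registers;gcc:2;dwarf:2;",
}


def fake_send_query(query):
    """Return the canned reply, or the error payload for unknown indexes."""
    return CANNED_REPLIES.get(query, "E45")


registers = enumerate_registers(fake_send_query)
for reg in registers:
    print(reg["name"], reg["bitsize"], reg["offset"])
```

Optional keys such as alt-name and generic simply show up in the dict when the server sends them, so a client can look up the generic "pc", "sp" and "fp" registers without hard-coding any architecture.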

The host info gives us the target triple information. On Apple it looks like:

-> $qHostInfo#00
<- $cputype:16777223;cpusubtype:3;ostype:Darwin;vendor:apple;endian:little;ptrsize:8;#00
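
To make the numbers in that reply easier to read, here is a small Python sketch (again, not LLDB code) that parses the payload and decodes cputype using the Mach-O constants CPU_ARCH_ABI64 (0x01000000) and CPU_TYPE_X86 (7). The function name parse_host_info is invented for this example.

```python
CPU_ARCH_ABI64 = 0x01000000  # Mach-O flag bit marking a 64-bit ABI
CPU_TYPE_X86 = 7             # Mach-O CPU_TYPE_X86


def parse_host_info(payload):
    """Turn a qHostInfo reply payload into a dict, converting numeric fields."""
    info = dict(field.split(":", 1) for field in payload.split(";") if field)
    for key in ("cputype", "cpusubtype", "ptrsize"):
        if key in info:
            info[key] = int(info[key])  # printed in decimal in the reply above
    return info


host = parse_host_info(
    "cputype:16777223;cpusubtype:3;ostype:Darwin;vendor:apple;"
    "endian:little;ptrsize:8;"
)

is_64bit = bool(host["cputype"] & CPU_ARCH_ABI64)
base_cpu = host["cputype"] & ~CPU_ARCH_ABI64

print(hex(host["cputype"]), is_64bit, base_cpu == CPU_TYPE_X86)
```

So 16777223 is 0x01000007, i.e. the x86 CPU type with the 64-bit ABI bit set, which together with ostype and vendor is enough to build an x86_64 Apple Darwin triple.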

We have also added a few packets that allow for memory allocation and deallocation with read/write/execute permissions. So the GDB remote plugin can be used pretty easily with many pre-existing GDB server source bases, given a few additions to them.
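
To give a feel for how small such an addition to an existing server can be, here is a hedged Python sketch of the server side. It uses the standard GDB remote framing, "$" + payload + "#" + two hex digits of checksum, where the checksum is the sum of the payload bytes modulo 256, and the standard convention that an empty reply means "packet not supported". The logs above show #00 in the checksum position. The names frame, unframe and handle_packet are hypothetical, and payload escaping is left out to keep the example short.

```python
def checksum(payload):
    """Sum of the payload bytes modulo 256, as in the GDB remote protocol."""
    return sum(payload.encode("ascii")) % 256


def frame(payload):
    """Wrap a payload as $payload#xx with a two-digit hex checksum."""
    return "$%s#%02x" % (payload, checksum(payload))


def unframe(packet):
    """Strip the framing from a packet and verify its checksum."""
    if len(packet) < 4 or packet[0] != "$" or packet[-3] != "#":
        raise ValueError("malformed packet: %r" % packet)
    payload = packet[1:-3]
    if int(packet[-2:], 16) != checksum(payload):
        raise ValueError("bad checksum in packet: %r" % packet)
    return payload


# Reply payload copied from the qHostInfo example above.
HOST_INFO = ("cputype:16777223;cpusubtype:3;ostype:Darwin;vendor:apple;"
             "endian:little;ptrsize:8;")


def handle_packet(packet):
    """Answer qHostInfo; reply with an empty packet for anything unknown."""
    payload = unframe(packet)
    if payload == "qHostInfo":
        return frame(HOST_INFO)
    return frame("")  # empty reply: this packet is not supported


print(frame("OK"))
print(handle_packet(frame("qHostInfo")))
print(handle_packet(frame("qSomethingElse")))
```

A client that gets the empty reply can fall back to whatever it did before the new packets existed, which is what keeps these extensions compatible with older servers.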

> My naive question then is, why not just reuse the ProcessGDBRemote code for Linux as well?

There is something to be said for native debugging and getting the fastest debugging speeds. Debugging over the remote protocol slows things down a little, though not by much, so speed would be the main reason to oppose using the remote protocol.

The nice thing about using ProcessGDBRemote, and the reason we use it on MacOSX, is that it gets us remote debugging almost for free. Very few changes need to happen since we just connect to a different IP address and port number.

> There are probably higher-level design issues that I'm not familiar with, so anyone on lldb-dev should feel free to chime in here.  The second lazy inclination is to just port that code to linux if it must go in its own directory.

The source code for "debugserver" is a mix of old code and some new code. It was written long before LLDB was around and has been slowly modernized, though it remains crufty. I would like to rewrite a lot of it to take advantage of many of the classes that exist inside LLDB (StringStream, Communication, Connection and GDBRemoteCommunication, to name a few). It is currently tailored to MacOSX only. That said, a new source base could re-use the lldb_private communication classes: GDBRemoteCommunication.cpp/h can be reused in a newer binary very easily and adapted to be the server side instead of the client side with a few modifications.

There is also a nice abstraction that can be used with debugserver: check out the DNB.h file. It is designed to be a C interface to a native debugger on a native host. I have adapted it to do pretty much all we need for MacOSX, and the current plan is to eventually re-use this DNB.h/DNB.cpp (the MacOSX version) in ProcessMacOSX. So we might be able to do the same for linux. Steps would involve:

1 - Modifying DNB.h to make sure it does all we need it to for all platforms
2 - Grab the code from within ProcessLinux and put it in a new DNB.cpp implementation for linux
3 - Make the ProcessLinux code use the DNB.h interface
4 - Reuse DNB.cpp/.h for linux in a new or modified version of "debugserver"

This way we maintain one codebase for native debugging, and then can leverage that code between the native and GDB remote debugserver binaries.
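
The shape of that layering can be sketched in a few lines of Python. Everything here is hypothetical and only illustrates the idea: NativeDebugNub stands in for a DNB.h-style interface, FakeLinuxNub for a platform implementation, and the two front ends for a native plugin and a debugserver-like binary. None of these names or the toy request format come from the LLDB sources.

```python
from abc import ABC, abstractmethod


class NativeDebugNub(ABC):
    """Hypothetical stand-in for a DNB.h-style interface (names invented here)."""

    @abstractmethod
    def attach(self, pid):
        """Attach to a running process; return True on success."""

    @abstractmethod
    def read_register(self, pid, name):
        """Return the current value of the named register."""


class FakeLinuxNub(NativeDebugNub):
    """Toy backend; a real one would hold the ptrace code now in ProcessLinux."""

    def __init__(self):
        self.attached = set()

    def attach(self, pid):
        self.attached.add(pid)
        return True

    def read_register(self, pid, name):
        if pid not in self.attached:
            raise RuntimeError("not attached to pid %d" % pid)
        return {"rip": 0x400000, "rsp": 0x7FFF0000}[name]


class InProcessClient:
    """Plays the role of a native plugin calling the interface directly."""

    def __init__(self, nub):
        self.nub = nub

    def program_counter(self, pid):
        self.nub.attach(pid)
        return self.nub.read_register(pid, "rip")


class RemoteServer:
    """Plays the role of a debugserver-like binary answering text requests."""

    def __init__(self, nub):
        self.nub = nub

    def handle(self, request):
        # Toy format "attach:<pid>" or "reg:<pid>:<name>", not real GDB packets.
        parts = request.split(":")
        if parts[0] == "attach":
            return "OK" if self.nub.attach(int(parts[1])) else "E01"
        if parts[0] == "reg":
            return "%x" % self.nub.read_register(int(parts[1]), parts[2])
        return ""


nub = FakeLinuxNub()
native_pc = InProcessClient(nub).program_counter(1234)
server = RemoteServer(nub)
remote_pc = server.handle("reg:1234:rip")
print(hex(native_pc), remote_pc)
```

Both front ends go through the same backend object, so a fix to the platform code is picked up by native and remote debugging alike.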

> Let me know what you think.  I'm probably asking silly questions, but I'm just trying to get my bearings. Please bear with me! :-)

Everyone, let me know what you think about the above proposal. Another nice thing is that once debugging for a native platform is abstracted in DNB.h (yes, we can rename the header file; this used to stand for "debug nub"), others can download the LLDB source code and run a native debug session without needing any of the LLDB engine overhead, which makes it very portable.

> 
> Thanks,
> Jason
> 
> p.s. the one thing that kept me from trying this directly was figuring out where in the Makefile system this got chosen, because by default on linux, the gdb-remote directory isn't built.  If anyone knows where this is controlled, please point it out. Thank you!

I will defer to Stephen Wilson on this one.

Greg Clayton