[lldb-dev] Attaching to a stopped (cored) process hangs lldb-server
Pavel Labath via lldb-dev
lldb-dev at lists.llvm.org
Tue Nov 3 17:41:07 PST 2015
I'm following this discussion, but I don't yet completely understand
what is going on here. What I am sure of is that the problem here is
not the S+ state, as that just means "interruptible sleep, foreground
process", and a lot of processes have that state and we attach to them
just fine. I would need to investigate what the exact properties
of this cored state are. I'll try to take a look when I get some spare
cycles, but that might not happen very soon.
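For reference, here is a minimal Python sketch of how a ps-style state such as "S+" can be derived on Linux from /proc/<pid>/stat. The state letter is the field right after the parenthesised command name, and the "+" is shown when the process group equals the terminal's foreground process group. The helper names (parse_stat, describe) are made up for illustration and are not part of lldb. Nothing here detects the "cored" condition itself; it only shows why "S+" alone says little.

```python
# Meaning of the single-letter state codes used in /proc/<pid>/stat.
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "T": "stopped by a signal",
    "t": "tracing stop",
    "Z": "zombie",
    "X": "dead",
}


def parse_stat(stat_line):
    """Return (state_letter, is_foreground) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the last ')' instead of on whitespace.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    state = rest[0]       # field 3: state
    pgrp = int(rest[2])   # field 5: process group id
    tpgid = int(rest[5])  # field 8: foreground process group of the tty
    return state, pgrp == tpgid


def describe(pid):
    """Hypothetical helper: ps-style description of a live process."""
    with open(f"/proc/{pid}/stat") as f:
        state, foreground = parse_stat(f.read())
    suffix = "+" if foreground else ""
    return f"{state}{suffix} ({STATE_NAMES.get(state, 'unknown')})"
```

Calling describe() on the crashing process's pid from inside the core handler would show whether it reports anything other than an ordinary sleep state.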
Mark, have you investigated what the next thing to fail is after you
remove the waitpid call?
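While experimenting, a bounded wait can make this kind of hang easier to diagnose than a blocking waitpid. The following is only a sketch in Python, not what lldb-server does: wait_with_timeout is a hypothetical helper, and the timeout and poll interval are arbitrary. It polls with WNOHANG and gives up after a deadline instead of waiting forever. Like waitpid itself, it only works on children of the caller or on processes the caller is tracing.

```python
import os
import time


def wait_with_timeout(pid, timeout=5.0, poll_interval=0.05):
    """Poll waitpid() without blocking; return the raw status, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        # With WNOHANG, waitpid returns (0, 0) immediately when the process
        # has no state change to report. WUNTRACED also reports stops.
        waited_pid, status = os.waitpid(pid, os.WNOHANG | os.WUNTRACED)
        if waited_pid == pid:
            return status
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

A None result means that no stop or exit notification arrived within the deadline, which is the situation that shows up as a hang when the wait is unbounded.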
pl
On 3 November 2015 at 16:48, Greg Clayton via lldb-dev
<lldb-dev at lists.llvm.org> wrote:
> Can someone with linux experience chime in here? It shouldn't be too hard to figure out which flag 'S' is in. On MacOS we can get a process info structure from a pid and that will have bits set that indicate 'S'...
>
> If you want to check this tool into the LLDB source tree at trunk/tools/core_tool then we can get more people to work on it and improve it. It would be nice to have this available for all linux users. I would love to see a JSON output mode that is parseable by automated tools, instead of people saving text formats that must be scraped.
>
> If you can get this into a tool, others can help get this working. Any interest in this?
>
> Greg
>
>> On Nov 3, 2015, at 4:41 PM, Mark Chandler <mchandler at blizzard.com> wrote:
>>
>> The biggest tell is that the process state is already 'S' or stopped. I don’t know lldb well enough to make a change to fix this, though.
>>
>>
>> Mark Chandler
>> Battle.Net Engineering Systems | Blizzard Entertainment
>> (P) 949-955-1380 x15353
>>
>> -----Original Message-----
>> From: Greg Clayton [mailto:gclayton at apple.com]
>> Sent: Tuesday, November 03, 2015 4:39 PM
>> To: Mark Chandler <mchandler at blizzard.com>
>> Cc: lldb-dev at lists.llvm.org
>> Subject: Re: [lldb-dev] Attaching to a stopped (cored) process hangs lldb-server
>>
>> Makes sense about not writing the core file to disk.
>>
>> Is there a way you can detect this "core" mode where we don't have to waitpid? Seems like that www.sourceware.org message had ideas on how to detect this case?
>>
>> Greg
>>
>>> On Nov 3, 2015, at 4:36 PM, Mark Chandler <mchandler at blizzard.com> wrote:
>>>
>>> Not able to do that, as the servers have no hard drives (they use a RAM disk and net boot) and the tool is trying to avoid a core storm that takes down the network file share. I found out what is causing it to hang: there is a call to waitpid in NativeProcessLinux.cpp that waits forever. As the process is already stopped, I disabled that call and it looks to be working.
>>>
>>> Mark Chandler
>>> Battle.Net Engineering Systems | Blizzard Entertainment
>>> (P) 949-955-1380 x15353
>>>
>>> -----Original Message-----
>>> From: Greg Clayton [mailto:gclayton at apple.com]
>>> Sent: Tuesday, November 03, 2015 4:34 PM
>>> To: Mark Chandler <mchandler at blizzard.com>
>>> Cc: lldb-dev at lists.llvm.org
>>> Subject: Re: [lldb-dev] Attaching to a stopped (cored) process hangs lldb-server
>>>
>>> A different approach is to have your tool write all of STDIN to a file (the core file comes into the tool as STDIN bytes), then hand LLDB the core file and do any needed backtracing and data gathering from the core file instead of actually attaching to the process for real. All executable and shared library object files (ELF files) referenced by the core file are still on disk, so you can get symbols and use the debug info; LLDB should be able to load all frames and symbolicate the crash location. It should be just as good as having the process around, without any bad side effects. Core files are less useful if they must be archived and symbolicated later, because the executable files might not be around anymore: things like test suites might produce binaries for testing and remove them after the test has run or crashed.
>>>
>>> What do you think about this approach?
>>>
>>> Greg Clayton
>>>
>>>
>>>> On Nov 2, 2015, at 5:54 PM, Mark Chandler via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>>>>
>>>> So I'm trying to write a core handler program and use lldb to attach to the crashed process and dump important information about it. This works if I use my tool to attach to an existing process, but I found that lldb-server will hang in a waitpid call if the kernel has invoked the tool after another process has cored.
>>>>
>>>> Example:
>>>> · /proc/sys/kernel/core_pattern is set to |/opt/core_tool
>>>> · Run a.out and it segfaults
>>>> · Kernel invokes core_tool that uses lldb AttachToProcess and a.out is in state “S+”
>>>> · lldb-server hangs in source\Plugins\Process\Linux\NativeProcessLinux.cpp:867
>>>> · If I remove the waitpid it doesn’t hang, but it fails to attach
>>>>
>>>> Looks like gdb had a similar problem as well: http://www.sourceware.org/ml/gdb-patches/2008-04/msg00224.html
>>>> Any ideas on how to fix this?
>>>>
>>>> Mark Chandler
>>>> Battle.Net Engineering Systems | Blizzard Entertainment
>>>> (P) 949-955-1380 x15353
>>>>
>>>> _______________________________________________
>>>> lldb-dev mailing list
>>>> lldb-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>>
>>
>