[lldb-dev] Attaching to a stopped (cored) process hangs lldb-server

Todd Fiala via lldb-dev lldb-dev at lists.llvm.org
Wed Nov 4 09:44:40 PST 2015


Hey Pavel,

I think Mark is also on RHEL 5-era, so this is going *way* back in the kernel
space.  It is entirely possible he is seeing different behavior based on
that.  We only recently started working on RHEL 7 and (I've heard reports
of) 6.  So this could just be a legitimate behavioral difference that we
won't see on much newer Ubuntu kernels and/or configuration differences
between RHEL and Debian-based kernels.

On Tue, Nov 3, 2015 at 5:47 PM, Mark Chandler via lldb-dev <
lldb-dev at lists.llvm.org> wrote:

> The ptrace options per thread id also fail so I removed that as well. At the
> moment lldb-server is segfaulting in ThreadAttach, and I'm trying to work
> out why.
>
>
>
> Mark Chandler
> Battle.Net Engineering Systems | Blizzard Entertainment
> (P) 949-955-1380 x15353
>
> -----Original Message-----
> From: Pavel Labath [mailto:labath at google.com]
> Sent: Tuesday, November 03, 2015 5:41 PM
> To: Greg Clayton <gclayton at apple.com>
> Cc: Mark Chandler <mchandler at blizzard.com>; lldb-dev at lists.llvm.org
> Subject: Re: [lldb-dev] Attaching to a stopped (cored) process hangs
> lldb-server
>
> I'm following this discussion, but I don't yet completely understand what
> is going on here. What I am sure of is that the problem here is not the S+
> state, as that just means "interruptible sleep, foreground process"; a
> lot of processes have that state and we attach to them just fine. I would
> need to investigate the exact properties of this cored state. I'll
> try to take a look when I get some spare cycles, but that might not happen
> very soon.
>
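For anyone who wants to check the state outside of ps, here is a minimal Python sketch, assuming a Linux /proc filesystem, that pulls the one-letter state field out of /proc/<pid>/stat. The helper names are made up for illustration and are not part of lldb. Note that the "+" shown by ps is ps's own foreground-process-group marker and does not appear in this field.

```python
def parse_proc_stat_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The second field (comm) is wrapped in parentheses and may itself
    # contain spaces or ')', so split after the LAST ')' instead of
    # splitting the whole line on whitespace.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def read_state(pid: int) -> str:
    """Read the current state letter of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return parse_proc_stat_state(stat_file.read())
```

Calling read_state(pid) on the crashed a.out from inside the core handler would show whether the kernel reports it as S (interruptible sleep), T (stopped), or something else.
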
> Mark, have you investigated what the next thing to fail is after you
> remove the waitpid call?
>
> pl
>
> On 3 November 2015 at 16:48, Greg Clayton via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> > Can someone with Linux experience chime in here? It shouldn't be too
> > hard to figure out which flag 'S' is in. On MacOS we can get a process info
> > structure from a pid and that will have bits set that indicate 'S'...
> >
> > If you want to check this tool into the LLDB source tree at
> > trunk/tools/core_tool then we can get more people to work on it and improve
> > it. It would be nice to have this available for all Linux users. I would
> > love to see a JSON output mode that is parseable by automated tools
> > instead of people saving text formats that must be text scraped.
> >
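To make the JSON idea concrete, here is a rough Python sketch of what a machine-readable report could look like. The field names (pid, signal, frames) and the function name are assumptions for illustration only; nothing in this thread defines an actual schema.

```python
import json


def crash_report_json(pid: int, signal_number: int, frames: list) -> str:
    """Serialize a crash summary as JSON (hypothetical schema)."""
    report = {
        "pid": pid,
        "signal": signal_number,
        # One entry per stack frame, innermost first.
        "frames": [
            {"index": index, "function": name}
            for index, name in enumerate(frames)
        ],
    }
    # sort_keys keeps the output stable so automated tools can diff it.
    return json.dumps(report, indent=2, sort_keys=True)
```

A consumer can then call json.loads() on the output instead of scraping text.
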
> > If you can get this into a tool, others can help get this working. Any
> > interest in this?
> >
> > Greg
> >
> >> On Nov 3, 2015, at 4:41 PM, Mark Chandler <mchandler at blizzard.com>
> >> wrote:
> >>
> >> The biggest tell is that the process state is already 'S' or stopped. I
> >> don’t know lldb well enough to make a change to fix this though.
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Greg Clayton [mailto:gclayton at apple.com]
> >> Sent: Tuesday, November 03, 2015 4:39 PM
> >> To: Mark Chandler <mchandler at blizzard.com>
> >> Cc: lldb-dev at lists.llvm.org
> >> Subject: Re: [lldb-dev] Attaching to a stopped (cored) process hangs
> >> lldb-server
> >>
> >> Makes sense about not writing the core file to disk.
> >>
> >> Is there a way you can detect this "core" mode where we don't have to
> >> waitpid? Seems like that www.sourceware.org message had ideas on how to
> >> detect this case?
> >>
> >> Greg
> >>
> >>> On Nov 3, 2015, at 4:36 PM, Mark Chandler <mchandler at blizzard.com>
> >>> wrote:
> >>>
> >>> Not able to do that as the servers have no hard drives (use ram disk
> >>> and net boot) and the tool is trying to avoid a core storm that
> >>> takes down the network file share. I found out what is causing it to
> >>> hang: there is a call to waitpid in NativeProcessLinux.cpp that
> >>> waits forever. As the process is already stopped, I disabled that
> >>> and it looks to be working.
> >>>
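As an alternative to removing the wait outright, the general pattern for avoiding an unbounded block is to poll waitpid with WNOHANG against a deadline. The Python sketch below only shows that pattern on an ordinary child process; it is not what lldb-server does, the function name is invented, and whether a stop notification ever arrives for a process that is mid-core-dump is exactly the open question in this thread.

```python
import os
import time


def wait_with_timeout(pid: int, timeout: float, poll_interval: float = 0.05):
    """Poll waitpid() with WNOHANG; return the raw status, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        # With WNOHANG, waitpid returns (0, 0) when no status is available
        # yet instead of blocking.
        waited_pid, status = os.waitpid(pid, os.WNOHANG)
        if waited_pid == pid:
            return status
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

A None result lets the caller report "no stop event within N seconds" and fall back to some other strategy instead of hanging forever.
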
> >>>
> >>> -----Original Message-----
> >>> From: Greg Clayton [mailto:gclayton at apple.com]
> >>> Sent: Tuesday, November 03, 2015 4:34 PM
> >>> To: Mark Chandler <mchandler at blizzard.com>
> >>> Cc: lldb-dev at lists.llvm.org
> >>> Subject: Re: [lldb-dev] Attaching to a stopped (cored) process hangs
> >>> lldb-server
> >>>
> >>> One different approach is to have your tool write all STDIN to a file
> >>> (the core file comes into the tool as STDIN bytes) and then hand LLDB the
> >>> core file and do any needed backtracing and data gathering from the core
> >>> file instead of actually attaching to the process for real. All executable
> >>> and shared library object files (ELF files) referenced by the core file
> >>> are still on disk so you can get symbols and use the debug info, so LLDB
> >>> should be able to load all frames and symbolicate the crash location. It
> >>> should be just as good as having the process around, without any bad side
> >>> effects. Core files are less useful if they must be archived and
> >>> symbolicated later because the executable files might not be around
> >>> anymore, since things like test suites might produce binaries for testing
> >>> and remove them after the test has run or crashed.
> >>>
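The "write all STDIN to a file" step could be sketched in Python roughly as follows. The function name and chunk size are assumptions; the only thing taken from the thread is that the kernel pipes the core image to the handler's standard input.

```python
import shutil


def save_core(stream, path: str, chunk_size: int = 1 << 20) -> int:
    """Copy a core image from a binary stream to path; return bytes written."""
    with open(path, "wb") as out:
        # Copy in fixed-size chunks so a multi-gigabyte core is never
        # held in memory all at once.
        shutil.copyfileobj(stream, out, chunk_size)
        return out.tell()
```

In a handler this would be called as save_core(sys.stdin.buffer, some_path), after which the saved file can be opened with lldb's --core option alongside the original executable.
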
> >>> What do you think about this approach?
> >>>
> >>> Greg Clayton
> >>>
> >>>
> >>>> On Nov 2, 2015, at 5:54 PM, Mark Chandler via lldb-dev
> >>>> <lldb-dev at lists.llvm.org> wrote:
> >>>>
> >>>> I'm trying to write a core handler program and use lldb to attach to
> >>>> the crashed process and dump important information about it. This works
> >>>> if I use my tool to attach to an existing process, but I found that
> >>>> lldb-server will hang in a waitpid call if the kernel has invoked the
> >>>> tool after another process has cored.
> >>>>
> >>>> Example:
> >>>>   * /proc/sys/kernel/core_pattern is set to |/opt/core_tool
> >>>>   * Run a.out and it segfaults
> >>>>   * Kernel invokes core_tool that uses lldb AttachToProcess and a.out is in state “S+”
> >>>>   * lldb-server hangs in source\Plugins\Process\Linux\NativeProcessLinux.cpp:867
> >>>>   * If I remove the waitpid it doesn’t hang but fails to attach
> >>>>
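The list above does not say how core_tool learns which process crashed. One common arrangement, on kernels that support arguments to a piped core_pattern program, is to add specifiers such as %p (PID) and %s (signal number), e.g. "|/opt/core_tool %p %s". That pattern and the names below are assumptions for illustration, not Mark's actual setup. A minimal Python sketch of the argument handling:

```python
from dataclasses import dataclass


@dataclass
class CrashInfo:
    pid: int
    signal_number: int


def parse_handler_args(argv: list) -> CrashInfo:
    """Parse the [pid, signal] arguments from a '%p %s' core_pattern."""
    if len(argv) != 2:
        raise ValueError("expected exactly two arguments: <pid> <signal>")
    # int() raises ValueError itself if either argument is not numeric.
    return CrashInfo(pid=int(argv[0]), signal_number=int(argv[1]))
```

The handler would call parse_handler_args(sys.argv[1:]) and pass the resulting pid to whatever attach call it uses.
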
> >>>> Looks like gdb had a similar problem as well:
> >>>> http://www.sourceware.org/ml/gdb-patches/2008-04/msg00224.html
> >>>> Any ideas on how to fix this?
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> lldb-dev mailing list
> >>>> lldb-dev at lists.llvm.org
> >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> >>>
> >>
> >
>



-- 
-Todd