<div dir="ltr">Although doing any kind of waitpid() in the case of a core file doesn't make sense.</div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Nov 4, 2015 at 9:44 AM, Todd Fiala <span dir="ltr"><<a href="mailto:todd.fiala@gmail.com" target="_blank">todd.fiala@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hey Pavel,<div><br></div><div>I think Mark is also on RHEL 5-era, so this is going *way* back in the kernel space. It is entirely possible he is seeing different behavior based on that. We only recently started working on RHEL 7 and (I've heard reports of) 6. So this could just be a legitimate behavioral difference that we won't see on much newer Ubuntu kernels and/or configuration differences between RHEL and Debian-based kernels.</div></div><div class="gmail_extra"><div><div class="h5"><br><div class="gmail_quote">On Tue, Nov 3, 2015 at 5:47 PM, Mark Chandler via lldb-dev <span dir="ltr"><<a href="mailto:lldb-dev@lists.llvm.org" target="_blank">lldb-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The ptrace options per thread id also fail, so I removed those as well. At the moment lldb-server is seg-faulting in ThreadAttach, and I'm trying to work out why.<br>
<span><br>
<br>
<br>
Mark Chandler<br>
Battle.Net Engineering Systems | Blizzard Entertainment<br>
(P) <a href="tel:949-955-1380%20x15353" value="+19499551380" target="_blank">949-955-1380 x15353</a><br>
<br>
-----Original Message-----<br>
</span><div><div>From: Pavel Labath [mailto:<a href="mailto:labath@google.com" target="_blank">labath@google.com</a>]<br>
Sent: Tuesday, November 03, 2015 5:41 PM<br>
To: Greg Clayton <<a href="mailto:gclayton@apple.com" target="_blank">gclayton@apple.com</a>><br>
Cc: Mark Chandler <<a href="mailto:mchandler@blizzard.com" target="_blank">mchandler@blizzard.com</a>>; <a href="mailto:lldb-dev@lists.llvm.org" target="_blank">lldb-dev@lists.llvm.org</a><br>
Subject: Re: [lldb-dev] Attaching to a stopped (cored) process hangs lldb-server<br>
<br>
I'm following this discussion, but I don't yet completely understand what is going on here. What I am sure of is that the problem here is not the S+ state, as that just means "interruptible sleep, foreground process", and a lot of processes have that state and we attach to them just fine. I would need to investigate what the exact properties of this cored state are. I'll try to take a look when I get some spare cycles, but that might not happen very soon.<br>
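<br>
For readers who want to check the state themselves, here is a minimal sketch of reading it from /proc/PID/stat. The function names are hypothetical and this is not what lldb does. The assumption that the "+" shown by ps means the process group equals the terminal's foreground process group reflects procps behaviour as I understand it.<br>

```python
def parse_proc_stat(stat_line):
    """Return (state, in_foreground) from the text of /proc/PID/stat."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')' characters, so split after the LAST ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    state = rest[0]        # field 3: R, S, D, T, Z, ...
    pgrp = int(rest[2])    # field 5: process group id
    tpgid = int(rest[5])   # field 8: foreground process group of the tty
    # Assumption: ps prints "+" when these two ids match.
    return state, pgrp == tpgid


def read_state(pid):
    """Read and parse /proc/PID/stat for a live process (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        return parse_proc_stat(f.read())
```

A state of "S" here only says the task is in interruptible sleep, so on its own it would not distinguish a process that is in the middle of dumping core from any other sleeping process.<br>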
<br>
Mark, have you investigated what the next thing to fail is after you remove the waitpid call?<br>
<br>
pl<br>
<br>
On 3 November 2015 at 16:48, Greg Clayton via lldb-dev <<a href="mailto:lldb-dev@lists.llvm.org" target="_blank">lldb-dev@lists.llvm.org</a>> wrote:<br>
> Can someone with linux experience chime in here? It shouldn't be too hard to figure out which flag 'S' is in. On MacOS we can get a process info structure from a pid and that will have bits set that indicate 'S'...<br>
><br>
> If you want to check this tool into the LLDB source tree at trunk/tools/core_tool then we can get more people to work on it and improve it. It would be nice to have this available for all linux users. I would love to see a JSON output mode that is parseable by automated tools instead of people saving text formats that must be text scraped.<br>
><br>
> If you can get this into a tool, others can help get this working. Any interest in this?<br>
><br>
> Greg<br>
><br>
>> On Nov 3, 2015, at 4:41 PM, Mark Chandler <<a href="mailto:mchandler@blizzard.com" target="_blank">mchandler@blizzard.com</a>> wrote:<br>
>><br>
>> The biggest tell is that the process state is already 'S' or stopped. I don’t know lldb well enough to make a change to fix this, though.<br>
>><br>
>><br>
>> Mark Chandler<br>
>> Battle.Net Engineering Systems | Blizzard Entertainment<br>
>> (P) <a href="tel:949-955-1380%20x15353" value="+19499551380" target="_blank">949-955-1380 x15353</a><br>
>><br>
>> -----Original Message-----<br>
>> From: Greg Clayton [mailto:<a href="mailto:gclayton@apple.com" target="_blank">gclayton@apple.com</a>]<br>
>> Sent: Tuesday, November 03, 2015 4:39 PM<br>
>> To: Mark Chandler <<a href="mailto:mchandler@blizzard.com" target="_blank">mchandler@blizzard.com</a>><br>
>> Cc: <a href="mailto:lldb-dev@lists.llvm.org" target="_blank">lldb-dev@lists.llvm.org</a><br>
>> Subject: Re: [lldb-dev] Attaching to a stopped (cored) process hangs<br>
>> lldb-server<br>
>><br>
>> Makes sense about not writing the core file to disk.<br>
>><br>
>> Is there a way you can detect this "core" mode where we don't have to waitpid? Seems like that <a href="http://www.sourceware.org" rel="noreferrer" target="_blank">www.sourceware.org</a> message had ideas on how to detect this case?<br>
>><br>
>> Greg<br>
>><br>
>>> On Nov 3, 2015, at 4:36 PM, Mark Chandler <<a href="mailto:mchandler@blizzard.com" target="_blank">mchandler@blizzard.com</a>> wrote:<br>
>>><br>
>>> Not able to do that as the servers have no hard drives (use ram disk<br>
>>> and net boot) and the tool is trying to avoid a core storm that<br>
>>> takes down the network file share. I found out what is causing it to<br>
>>> hang: there is a call to waitpid in NativeProcessLinux.cpp that<br>
>>> waits forever. As the process is already stopped, I disabled that<br>
>>> and it looks to be working.<br>
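<br>
As an illustration of how a caller can avoid blocking forever in waitpid without deleting the call, here is a generic sketch that polls with WNOHANG until a deadline. The helper name and its parameters are hypothetical. It is not the code in lldb-server, and it only bounds the wait. It does not make an attach to a cored process succeed.<br>

```python
import os
import time


def wait_with_timeout(pid, timeout, poll_interval=0.05):
    """Poll waitpid() without blocking; return (pid, status) or None."""
    deadline = time.monotonic() + timeout
    while True:
        # With WNOHANG, waitpid returns (0, 0) when no status is available
        # yet instead of blocking the caller.
        reaped, status = os.waitpid(pid, os.WNOHANG)
        if reaped != 0:
            return reaped, status
        if time.monotonic() >= deadline:
            return None  # gave up: the process never reported a status
        time.sleep(poll_interval)
```

A None result lets the caller report an error or try a different strategy rather than hang.<br>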
>>><br>
>>> Mark Chandler<br>
>>> Battle.Net Engineering Systems | Blizzard Entertainment<br>
>>> (P) <a href="tel:949-955-1380%20x15353" value="+19499551380" target="_blank">949-955-1380 x15353</a><br>
>>><br>
>>> -----Original Message-----<br>
>>> From: Greg Clayton [mailto:<a href="mailto:gclayton@apple.com" target="_blank">gclayton@apple.com</a>]<br>
>>> Sent: Tuesday, November 03, 2015 4:34 PM<br>
>>> To: Mark Chandler <<a href="mailto:mchandler@blizzard.com" target="_blank">mchandler@blizzard.com</a>><br>
>>> Cc: <a href="mailto:lldb-dev@lists.llvm.org" target="_blank">lldb-dev@lists.llvm.org</a><br>
>>> Subject: Re: [lldb-dev] Attaching to a stopped (cored) process hangs<br>
>>> lldb-server<br>
>>><br>
>>> One different approach is to have your tool write all STDIN to a file (the core file comes into the tool as STDIN bytes) and then hand LLDB the core file and do any needed backtracing and data gathering from the core file instead of actually attaching to the process for real. All executable and shared library object files (ELF files) from the core file are still on disk so you can get symbols and use the debug info, so LLDB should be able to load all frames up and symbolicate up the crash location. It should be just as good as having the process around without any bad side effects. Core files are less useful if they must be archived and symbolicated later because the executable files might not be around anymore since things like test suites might produce binaries for testing and remove them after the test is run or crashed.<br>
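<br>
The first half of that approach, saving the piped core to a file, could be sketched as below. The names save_core and main and the output-path argument are assumptions for illustration, not part of any existing tool.<br>

```python
import shutil
import sys


def save_core(stream, path, chunk_size=1024 * 1024):
    """Copy a binary stream to path in chunks; return the bytes written."""
    with open(path, "wb") as out:
        # copyfileobj reads chunk_size bytes at a time, so the whole core
        # is never held in memory at once.
        shutil.copyfileobj(stream, out, chunk_size)
        return out.tell()


def main(argv):
    # Hypothetical invocation by the kernel: core_tool OUTPUT_PATH
    # with the core image arriving on standard input.
    written = save_core(sys.stdin.buffer, argv[1])
    sys.stderr.write("saved %d bytes to %s\n" % (written, argv[1]))
```

The saved file could then be opened with lldb's --core option together with the original executable.<br>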
>>><br>
>>> What do you think about this approach?<br>
>>><br>
>>> Greg Clayton<br>
>>><br>
>>><br>
>>>> On Nov 2, 2015, at 5:54 PM, Mark Chandler via lldb-dev <<a href="mailto:lldb-dev@lists.llvm.org" target="_blank">lldb-dev@lists.llvm.org</a>> wrote:<br>
>>>><br>
>>>> So I'm trying to write a core handler program and use lldb to attach and dump important information about it. This works if I use my tool to attach to an existing process, but I found that lldb-server will hang in a waitpid call if the kernel has invoked the tool after another process has cored.<br>
>>>><br>
>>>> Example:<br>
>>>> · /proc/sys/kernel/core_pattern is set to |/opt/core_tool<br>
>>>> · Run a.out and it segfaults<br>
>>>> · Kernel invokes core_tool that uses lldb AttachToProcess and a.out is in state “S+”<br>
>>>> · lldb-server hangs in source\Plugins\Process\Linux\NativeProcessLinux.cpp:867<br>
>>>> · if I remove the waitpid it doesn’t hang but fails to attach<br>
>>>><br>
>>>> Looks like gdb had a similar problem as well:<br>
>>>> <a href="http://www.sourceware.org/ml/gdb-patches/2008-04/msg00224.html" rel="noreferrer" target="_blank">http://www.sourceware.org/ml/gdb-patches/2008-04/msg00224.html</a><br>
>>>> Any ideas on how to fix this?<br>
>>>><br>
>>>> Mark Chandler<br>
>>>> Battle.Net Engineering Systems | Blizzard Entertainment<br>
>>>> (P) <a href="tel:949-955-1380%20x15353" value="+19499551380" target="_blank">949-955-1380 x15353</a><br>
>>>><br>
>>>> _______________________________________________<br>
>>>> lldb-dev mailing list<br>
>>>> <a href="mailto:lldb-dev@lists.llvm.org" target="_blank">lldb-dev@lists.llvm.org</a><br>
>>>> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev</a><br>
>>><br>
>><br>
><br>
</div></div></blockquote></div><br><br clear="all"><div><br></div></div></div><span class="HOEnZb"><font color="#888888">-- <br><div><div dir="ltr">-Todd</div></div>
</font></span></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">-Todd</div></div>
</div>