[Lldb-commits] [PATCH] Prevent LLGS from crashing when exiting - make NativeProcessLinux to wait until ThreadStateCoordinator is fully stopped before entering ~NativeProcessLinux.

Wed Feb 18 11:49:56 PST 2015

Hi,

please see my comments inline:

On Wed, Feb 18, 2015 at 3:46 AM, Pavel Labath <labath at google.com> wrote:

> Greetings,
>
> I have been following this discussion, and would like to add my 2 cents.
> I'll start with my thoughts on virtual functions.
>
> > On 17 February 2015 at 18:44, Oleksiy Vyalov <ovyalov at google.com> wrote:
>
> >
>
> > > On Tue, Feb 17, 2015 at 10:01 AM, Tamas Berghammer <
> tberghammer at google.com> wrote:
>
> >  > As far as I know NativeProcessLinux is still fully functional during
> the execution of it's destructor. The only difference is that the members in
>
> >  > GDBRemoteCommunicationServerLLGS >defined after the
> NativeProcessLinux smart pointer are already destructed. If this ordering
> is the problem then it can be solved with
>
> >  > changing the order of the member declarations >inside
> GDBRemoteCommunicationServerLLGS.
>
> >
>
> > Yes, an instance is functional when its destructor is being called but
> there some limitations:
>
> > You may hit undefined behavior if virtual function are called when
> destruction is in progress -
> http://www.artima.com/cppsource/nevercall.html, i.e. if some virtual
> function of >NativeProcessLinux are called by TSC when ~NativeProcessLinux
> is in progress - it might be a problem.
>
>
> And a more quote from the standard.
>
> > Member functions, including virtual functions (10.3), can be called
> during construction or destruction (12.6.2). When a virtual function is
> called directly or indirectly from a
>
> >  constructor or from a destructor, including during the construction or
> destruction of the class’s non-static data members, and the object to which
> the call applies is the object (call it
>
> >  x) under construction or destruction, the function called is the final
> overrider in the constructor’s or destructor’s class and not one overriding
> it in a more-derived class.
>
>
> Judging from this, there is nothing undefined about the situation you
> mentioned. While ~NativeProcessLinux is in progress any call to its virtual
> functions will resolve "as expected". This is especially true if you are
> still executing the body of the destructor (as would be the case if you
> placed the Join() call in ~NativeProcessLinux), as all the member variables
> are still fully constructed, so you cannot get undefined behavior if the
> virtual functions do member access.
>
> A different case would be if you were calling virtual functions of the
> (now non-existing) NativeProcessLinux object after its destructor has
> completed, and the destruction has moved on to its Base classes. In some
> cases the behavior would be undefined, in others "defined, but surprising".
> And if this were to happen (which, as far as i can see, is not the case),
> then I would argue that the bug is in the fact that ThreadStateCoordinator
> was still alive after the destruction of NativeProcessLinux  -- the
> coordinator is owned by nativeprocess, so it should never outlive it
>
>
Thank you for looking into this. You brought up good points and my attitude
here for calling virtual methods from constructor/destructor is rather
maintaining a safety net - calling virtual methods on an instance after
last line of constructor and before first line of destructor.

> Therefore I think it is a bad idea to create a "destructor" function just
> to avoid doing something in a destructor. The presence of Terminate()
> defeats the purpose of having a shared pointer to the process: a shared
> pointer should delete an object, once the last pointer to it goes out of
> scope, but right now you cannot properly delete the process object in any
> other way except by calling ~GDBRemoteCommunicationServerLLGS(). If someone
> happened to be holding a NativeProcessProtocolSP expecting it to be a
> functional process, it will be surprised that it is non functional after
> GDBRemoteCommunicationServerLLGS has been gone.
>
> Since it seems that the root of the problem was something else (a null ptr
> dereference in NativeThreadLinux), and this has already been addressed, I
> would recommend moving the Join() back to the destructor. Unless there are
> other issues which would require its presence... (?)
>
>
 I propose to leave Terminate call outside of ~NativeThreadLinux while the
ownership issue in Threads/Processes/TSC will be fully addressed. Putting
Join() in ~NativeThreadLinux may lead to the deadlock and potential SIGSEGV:

   - Let's imagine NativeThreadLinux::SetRunning in the middle of execution
   when LLGS is about to shutdown. As part of GDBRemoteCommunicationServerLLGS
   destruction it tries to delete m_debugged_process_sp.
   Since NativeThreadLinux::SetRunning has locked this instance of
   NativeProcessProtocolSP,
GDBRemoteCommunicationServerLLGS::m_debugged_process_sp
   just decrements its reference counter and eventually ~NativeProcessLinux
   will be called in TSC thread context from NativeThreadLinux::SetRunning -
   so, it will be waiting for TSC thread to join from TSC thread itself.
   - If  ~NativeProcessLinux might be called from NativeThreadLinux call it
   may lead to situation when ~NativeProcessLinux is destroying threads
   from NativeProcessProtocol::m_threads, i.e. destroying NativeThreadLinux
   instance that running SetRunning right now

Once ownership issue is resolved, it will be safe to remove Terminate
method and call StopMonitor from ~NativeProcessLinux.

PS:
> After my discussion with Tamas, I have come to think that the root cause
> is unclear ownership semantics between Threads, Processes and TSC: a
> process holds shared_ptrs to threads, which seems to imply that it is
> sharing the ownership of them with someone else. However, from the code it
> would seem that the threads should never outlive the process (or the TSC
> for that matter). Therefore, it would seem that a Process (or maybe the
> TSC) is the perfect candidate for the "owner" of it's threads, which would
> have the sole responsibility for destroying them (which could be
> represented by a unique_ptr). And then, when the lifetime of threads gets
> tied to the lifetime of their process, they no longer need to hold a shared
> (or weak) pointer, they can be happy with a regular pointer. However, this
> is definitely an issue far out of scope of this CL.
>
>
Yes, it  should be enough for thread class just to keep a reference to its
process and process by itself just maintains sequence of threads'
unique_ptrs.

> This completes my braindump. Sorry about the length, I did not expect it
> to be this long when I started. I realize some of the claims might be too
> strong for someone who just started and is getting to know the codebase,
> but I figure I might as well send it, since I spent so much time thinking
> about it.
>
>
> http://reviews.llvm.org/D7692
>
> EMAIL PREFERENCES
>   http://reviews.llvm.org/settings/panel/emailpreferences/
>
>
>

-- 
Oleksiy Vyalov | Software Engineer | ovyalov at google.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-commits/attachments/20150218/72b1a8ad/attachment.html>