[lldb-dev] Process spawning and shutdown

Wed Sep 17 15:11:24 PDT 2014

> On Sep 17, 2014, at 1:20 PM, Zachary Turner <zturner at google.com> wrote:
> 
> +Jim, since he looked at my previous HostThread patch and some others, and may have additional context.
> 
> Just to be explicit, are you saying that I should not refactor any of this code, or that it's fine as long as the Apple stuff remains functionally equivalent?

I was out of of the office during the last patch for the threads. I know Jim did work with you on that one. A few things from the last patch that made me worry about a Process equivalent:

class HostThread
{
  public:
    HostThread();
    HostThread(lldb::thread_t thread);

    Error Join(lldb::thread_result_t *result);
    Error Cancel();
    void Reset();
    lldb::thread_t Release();

    void SetState(ThreadState state);
    ThreadState GetState() const;
    HostNativeThreadBase &GetNativeThread();
    const HostNativeThreadBase &GetNativeThread() const;
    lldb::thread_result_t GetResult() const;

    bool EqualsThread(lldb::thread_t thread) const;

  private:
    std::shared_ptr<HostNativeThreadBase> m_native_thread;
};

Why is there a separate HostNativeThreadBase class? Why not just make have each host implement the calls required in HostThread in their own respective HostThread.cpp class? Now we just indirect all calls through another class when there really isn't a valid reason to do so IMHO.

Why is there a SetState() and GetState() here? We set this to running when the thread handle is valid, even though it has no bearing on if the thread is actually running or suspended? Can we remove this?

Everything else is ok. 

The fallout from this was now any code that used to just check if a thread was valid, are now doing:

bool
Communication::JoinReadThread (Error *error_ptr)
{
    if (m_read_thread.GetState() != eThreadStateRunning)
        return true;

    Error error = m_read_thread.Join(nullptr);
    m_read_thread.Reset();
    return error.Success();
}

Since this is now a class, we should probably just do:

bool
Communication::JoinReadThread (Error *error_ptr)
{
    Error error = m_read_thread.Join(nullptr);
    m_read_thread.Reset();
    return error.Success();
}

And have join return success if the thread handle wasn't valid. The other reason I don't like the state in the HostThread is that it seems to indicate that this thread is actually running. I could have already exited, but if you ask the HostThread its state it will tell you "running" when it really isn't. Tracking when a thread goes away can get tricky so it would be hard to keep this state up to date. Can we change it back to a "bool IsValid() const" that returns true if the thread handle is valid? I am not sure if you were thinking of adding more to this class. But this class isn't really designed to represent a thread from another process, so it isn't very useful in that respect. I seem to remember Jim saying that you thought that this class might be able to be reused in for other threads in other processes, and then backed back to your design, so there might just be stuff left over?

This is the main reason for my concern with the next patch.

Greg

> 
> Zach
> 
> On Wed, Sep 17, 2014 at 12:51 AM, Zachary Turner <zturner at google.com> wrote:
> Sorry, hit send too soon.  
> 
> Exposing native OS primitive types like lldb::thread_t, lldb::process_t, and raw file descriptors to generic code means that people will use them in ways that aren't actually generic, and this can already been seen in many places.  The biggest example is currently llgs, which has posix-y stuff all over and which will be quite difficult to port to Windows as a result, if and when we get there.  There are other examples too though.  Platform-specific ideas have made it into the public API, such as SBHostOS::ThreadDetach, and are used in other places as well, such as a reliance on TLS destructors (Timer) and thread cancellation.  There's also select() being used on file descriptors, and probably many things I haven't even found yet. 
> 
> I can go fix all of these things on a case-by-case basis, and originally that was my strategy.  But as I found more and more examples of it, I started thinking that I want to prevent this type of code from showing up in the future.  I searched for the thread but was unable to find it, where Jim (apologies if I'm misquoting or misremembering) said that when the Host layer was originally written, there was not sufficient time to sit down and design something future-proof, and that you guys just had to get it working.  So what I was (and have been) trying to accomplish was exactly that.  Obviously, such large changes are not without risk, and can create headaches and introduce bugs, although I think that once the bugs are resolved the entire codebase and all platforms will benefit from improved code health as a result.  
> 
> Ultimately if you don't think these type of changes add value, or you don't think it's an improvement, then that's that.  It's much easier for me to write code just for Windows and not have to worry about getting stuff working on 3 different platforms that I have varying levels of familiarity with.  I don't think it's necessarily easier in the long term though, as there will still be no clear separation between generic and platform specific code, and new things will continue to turn up where an assumption was made that something was generic when it wasn't.
> 
> 
> 
> On Wed, Sep 17, 2014 at 12:27 AM, Zachary Turner <zturner at google.com> wrote:
> On Tue, Sep 16, 2014 at 3:17 PM, Greg Clayton <gclayton at apple.com> wrote:
> 
> > On Sep 16, 2014, at 12:55 PM, Zachary Turner <zturner at google.com> wrote:
> >
> > Just a follow-up.  It seems like there are three use cases for the StartMonitoringChildProcess code-path (poorly named, since FreeBSD and linux process plugins also have a ProcessMonitor class leading to great confusion).
> >
> > 1) Some places want to Join() on a process exiting.  They currently do this by calling Join() on a HostThread returned by StartMonitoringChildProcess, but the important thing is just that they want to join, not how it's implemented.
> 
> Join is not the right word. Reap() is the correct word. Join is just seen because you might spawn a thread whose sole purpose in life is to reap a child process. When you launch a shell command, you want to spawn the process and wait for it to get reaped. If you debug a process, you need to reap your child process if you launch it, but you won't ever call join on a thread that is waiting for it.
> 
> I know Join isn't really the right word, but there's no concept of reaping a process on Windows.  My understanding is that on posix, "reaping" specifically refers to ensuring that zombie processes don't linger and waste system resources which is a logically different operation than "wait for this process to exit", even though certain operations that reap also wait for the process to exit as a means to an end.  So I'm using join for lack of a better term to refer specifically to "wait for this process to exit".
> 
> 
> On Tue, Sep 16, 2014 at 3:21 PM, Greg Clayton <gclayton at apple.com> wrote:
> I would really like to see the Host::LaunchProcess fixed for windows and fix the reaping to just work for windows. I would like to avoid large changes that aren't needed. All of the AppleScript stuff and other details are fine to stay hidden within the MacOSX specific version of Host::LaunchProcess() as long as the contents of the ProcessLaunchInfo are obeyed.
> 
> I had been working on creating a HostProcess abstraction similar to the HostThread abstraction that I created.  I think that there's alot of value in having a robust set of OS abstractions that can be used by lldb, debugserver, and other processes alike.  Exposing native OS primitive types like lldb::thread_t and lldb::process_t to generic code  
> 
>