[lldb-dev] changing default test runner from multiprocessing-based to threading-based

Tue Sep 22 08:53:16 PDT 2015

On Tue, Sep 22, 2015 at 8:49 AM, Todd Fiala <todd.fiala at gmail.com> wrote:

> Hey guys,
>
> I think you're misunderstanding the process structure here.
>
> The threading-based parallel test runner still execs out to a child
> *process* for the inferior dotest.py run.  So suggesting we move to
> threading is not going to put all tests in a single LLDB process.  We will
> always want to be testing lldb with process isolation per dotest.py
> inferior call.
>
> The multiprocessing-based parallel test runner really has an extra layer
> of process involved.  Each worker is a separate process (per the
> multiprocessing model), which then execs a child process which is
> dotest.py.
>

I'm being fast and loose with "execs" in the sentence above.  It creates a
child process (not execs into it) from the multiprocessing worker process.
The same worker multiprocessing process hangs around and services all items
for that worker queue.  That process is the extra bloat we get rid of and
convert to a thread in the threading model.  That also allows us to use
lighter primitives for main-test-runner / worker communication, which no
longer need one or more processes just to manage communication between them
(implementation details of multiprocessing.Queue and
multiprocessing.Manager, for example).
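
To make the threading layout concrete, here is a minimal sketch of it, not the actual dotest.py code.  Each worker is a thread in the main runner process, and every work item still runs in its own child process.  The names run_inferior, worker and run_all are hypothetical, and the child command is a stand-in that just exits with a chosen code; the real inferior command line is not shown in this thread.

```python
import queue
import subprocess
import sys
import threading


def run_inferior(test_name, simulated_exit_code):
    """Run one work item in its own child process and return (name, exit code).

    The command below is a stand-in (an assumption for illustration): it only
    exits with the requested code instead of running a real test file.
    """
    cmd = [sys.executable, "-c",
           "import sys; sys.exit(%d)" % simulated_exit_code]
    completed = subprocess.run(cmd)
    return test_name, completed.returncode


def worker(jobs, results):
    """Worker thread: service queue items until the queue is drained."""
    while True:
        try:
            test_name, simulated_exit_code = jobs.get_nowait()
        except queue.Empty:
            return
        results.put(run_inferior(test_name, simulated_exit_code))


def run_all(work_items, num_workers=4):
    """Drive all work items from worker threads in this one process."""
    jobs = queue.Queue()      # plain in-process queues, no helper processes
    results = queue.Queue()
    for item in work_items:
        jobs.put(item)

    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Every item produced exactly one result before the workers exited.
    return dict(results.get() for _ in range(len(work_items)))
```

Calling run_all([("TestA", 0), ("TestB", 3)]) returns a dict mapping each name to its child's exit code.  A child that exits abnormally only shows up as a non-zero code for that one item, since the failure happens in the child process rather than in the runner.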

> So every inferior dotest.py has a process (the inferior dotest.py process)
> and the multiprocess-based worker process.  With the threading-based test
> runner, every dotest.py inferior test process is its own isolated process,
> and it's driven by a test runner thread in the main dotest.py process.
>
> We are not changing anything about the semantics of the test execution
> itself when we do this, nor do we impact the reporting in any way.  It's
> purely a test running infrastructural change that happens to be more
> efficient on most OSes due to the lighter weight of a thread in most
> places vs. a full-blown process.
>
> On Tue, Sep 22, 2015 at 2:32 AM, Tamas Berghammer <tberghammer at google.com>
> wrote:
>
>> One more point to Zachary's comment is that currently if LLDB crashes for
>> a test we report the test failure somewhat correctly (not perfectly). With
>> a multi threaded approach I would expect an LLDB crash to take down the
>> full test run, which isn't something we want.
>>
>> On Tue, Sep 22, 2015 at 12:03 AM Zachary Turner via lldb-dev <
>> lldb-dev at lists.llvm.org> wrote:
>>
>>> After our last discussion, I thought about it some more and there are at
>>> least some problems with this.  The biggest problem is that with only a
>>> single process, you are doing all tests from effectively a single instance
>>> of LLDB.  There's a TestMultipleDebuggers.py for example, and whether or
>>> not that test passes is equivalent to whether or not the test suite can
>>> even work without dying horribly.  In other words, you are inherently
>>> relying on multiple debuggers working to even run the test suite.
>>>
>>> I don't know if that's a problem, but at the very least, it's kind of
>>> unfortunate.  And of course the problem grows to other areas.  What other
>>> things fail horribly when a single instance of LLDB is debugging 100
>>> processes at the same time?
>>>
>>> It's worth adding this as an alternate run mode, but I don't think we
>>> should make it default until it's more battle-tested.
>>>
>>> On Mon, Sep 21, 2015 at 12:49 PM Todd Fiala via lldb-dev <
>>> lldb-dev at lists.llvm.org> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'm considering changing the default lldb test runner from
>>>> multiprocessing-based to threading-based.  Long ago I switched it from
>>>> threading to multiprocessing.  The only reason I did this was because OS X
>>>> was failing to allow more than one exec at a time in the worker threads -
>>>> way down in the Python Global Interpreter Lock (GIL).  And, at the time, I
>>>> didn't have the time to break out the test runner strategies.
>>>>
>>>> We have verified the threading-based issue is no longer manifesting on
>>>> OS X 10.10 and 10.11 beta.  That being the case, I'd like to convert us
>>>> back to being threading-based by default.  Specifically, this will have the
>>>> same effect as doing the following:
>>>> (non-Windows): --test-runner-name threading
>>>> (Windows): --test-runner-name threading-pool
>>>>
>>>> There are a couple benefits here:
>>>> 1. We'll remove a fork for creating the worker queues.  Each of those
>>>> is just a thread when using threading, rather than a forked process.
>>>> Depending on the underlying OS, a thread is typically cheaper.  Also, some
>>>> of the inter-worker communication now becomes cheap intra-process
>>>> communication instead of heavier multiprocessing constructs.
>>>> 2. Debugging is a bit easier.  The worker queues make a lot of noise in
>>>> 'ps aux'-style greps, and are a pain to debug relatively speaking vs. the
>>>> threaded version.
>>>>
>>>> I'm not yet looking to remove the multiprocessing support.  It is
>>>> likely I'll check the OS X version and default to the multiprocessing test
>>>> runner if it wasn't explicitly specified and the OS X version is < 10.10 as
>>>> I'm pretty sure I hit the issue on 10.9's python.
>>>>
>>>> Thoughts?
>>>> --
>>>> -Todd
>>>> _______________________________________________
>>>> lldb-dev mailing list
>>>> lldb-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>>>
>>>
>>
>
>
> --
> -Todd
>

-- 
-Todd