[lldb-dev] changing default test runner from multiprocessing-based to threading-based

Tue Sep 22 14:23:00 PDT 2015

I went ahead and changed this here:

Sending        test/dosep.py
Transmitting file data .
Committed revision 248323.

I'll be watching the Linux test runner.  I've been using the threading-based
runner on Linux and OS X for a while now.

Zachary, if you have trouble on Windows, we can have the Windows one go
back to defaulting to multiprocessing-pool.  But unless it's a problem, it
would be great to keep it on the simpler threading-pool.  It should be
easier for you to watch it in a debugger, too.
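
For anyone skimming the thread below, here is a rough sketch of the process
structure the threading-based runner ends up with.  This is not the dosep.py
code: it assumes Python 3, and FAKE_TESTS, run_inferior and run_all are
hypothetical names that stand in for the real per-test dotest.py invocations.
The point it illustrates is that each worker is only a thread in the runner,
while every test still gets its own child process, so a crashing child shows
up as a non-zero exit status instead of taking down the whole run.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for inferior dotest.py invocations: each entry is a
# tiny Python program that passes, fails, or aborts like a crashing debugger.
FAKE_TESTS = {
    "TestPasses": "import sys; sys.exit(0)",
    "TestFails": "import sys; sys.exit(1)",
    "TestCrashes": "import os; os.abort()",
}


def run_inferior(name):
    """Run one test in its own child process; return (name, exit status)."""
    proc = subprocess.run(
        [sys.executable, "-c", FAKE_TESTS[name]],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return name, proc.returncode


def run_all(names, workers=4):
    """Drive the child processes from worker threads in this one process."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_inferior, names))


if __name__ == "__main__":
    for name, status in sorted(run_all(list(FAKE_TESTS)).items()):
        print(name, "ok" if status == 0 else "failed (status %d)" % status)
```

The multiprocessing flavor would put one more process between run_all and
each child; swapping that process for a thread is the only thing that changes.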

-Todd

On Tue, Sep 22, 2015 at 10:04 AM, Zachary Turner <zturner at google.com> wrote:

> Ahh right, of course.  Disregard my comment then, I forgot about that
> extra layer
>
> On Tue, Sep 22, 2015 at 8:53 AM Todd Fiala <todd.fiala at gmail.com> wrote:
>
>> On Tue, Sep 22, 2015 at 8:49 AM, Todd Fiala <todd.fiala at gmail.com> wrote:
>>
>>> Hey guys,
>>>
>>> I think you're misunderstanding the process structure here.
>>>
>>> The threading-based parallel test runner still execs out to a child
>>> *process* for the inferior dotest.py run.  So suggesting we move to
>>> threading is not going to put all tests in a single LLDB process.  We will
>>> always want to be testing lldb with process isolation per dotest.py
>>> inferior call.
>>>
>>> The multiprocessing-based parallel test runner really has an extra layer
>>> of process involved.  Each worker is a separate process (per the
>>> multiprocessing model), which then execs a child process which is
>>> dotest.py.
>>>
>>
>> I'm being fast and loose with "execs" in the sentence above.  It creates
>> a child process (not execs into it) from the multiprocessing worker
>> process.  The same worker multiprocessing process hangs around and services
>> all items for that worker queue.  That process is the extra bloat we get
>> rid of and convert to a thread in the threading model.  That also allows us
>> to use lighter primitives for main-test-runner / worker communication,
>> which no longer need one or more processes just to manage communication
>> between them (implementation details of multiprocessing.Queue and
>> multiprocessing.Manager, for example).
>>
>>
>>> So with the multiprocessing-based test runner, every inferior dotest.py
>>> run involves two processes: the inferior dotest.py process itself and the
>>> multiprocessing worker process that launched it.  With the
>>> threading-based test runner, every dotest.py inferior test process is its
>>> own isolated process, and it's driven by a test runner thread in the main
>>> dotest.py process.
>>>
>>> We are not changing anything about the semantics of the test execution
>>> itself when we do this, nor do we impact the reporting in any way.  It's
>>> purely a test running infrastructural change that happens to be more
>>> efficient on most OSes due to the lighter weight of a thread in most
>>> places vs. a full-blown process.
>>>
>>> On Tue, Sep 22, 2015 at 2:32 AM, Tamas Berghammer <
>>> tberghammer at google.com> wrote:
>>>
>>>> One more point to Zachary's comment is that currently if LLDB crashes
>>>> for a test we report the test failure somewhat correctly (not perfectly).
>>>> With a multi-threaded approach I would expect an LLDB crash to take down
>>>> the full test run, which isn't something we want.
>>>>
>>>> On Tue, Sep 22, 2015 at 12:03 AM Zachary Turner via lldb-dev <
>>>> lldb-dev at lists.llvm.org> wrote:
>>>>
>>>>> After our last discussion, I thought about it some more and there are
>>>>> at least some problems with this.  The biggest problem is that with only a
>>>>> single process, you are doing all tests from effectively a single instance
>>>>> of LLDB.  There's a TestMultipleDebuggers.py for example, and whether or
>>>>> not that test passes is equivalent to whether or not the test suite can
>>>>> even work without dying horribly.  In other words, you are inherently
>>>>> relying on multiple debuggers working to even run the test suite.
>>>>>
>>>>> I don't know if that's a problem, but at the very least, it's kind of
>>>>> unfortunate.  And of course the problem grows to other areas.  What other
>>>>> things fail horribly when a single instance of LLDB is debugging 100
>>>>> processes at the same time?
>>>>>
>>>>> It's worth adding this as an alternate run mode, but I don't think we
>>>>> should make it default until it's more battle-tested.
>>>>>
>>>>> On Mon, Sep 21, 2015 at 12:49 PM Todd Fiala via lldb-dev <
>>>>> lldb-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'm considering changing the default lldb test runner from
>>>>>> multiprocessing-based to threading-based.  Long ago I switched it from
>>>>>> threading to multiprocessing.  The only reason I did this was because OS X
>>>>>> was failing to allow more than one exec at a time in the worker threads -
>>>>>> way down in the Python Global Interpreter Lock (GIL).  And, at the time, I
>>>>>> didn't have the time to break out the test runner strategies.
>>>>>>
>>>>>> We have verified the threading-based issue is no longer manifesting
>>>>>> on OS X 10.10 and 10.11 beta.  That being the case, I'd like to convert us
>>>>>> back to being threading-based by default.  Specifically, this will have the
>>>>>> same effect as doing the following:
>>>>>> (non-Windows): --test-runner-name threading
>>>>>> (Windows): --test-runner-name threading-pool
>>>>>>
>>>>>> There are a couple benefits here:
>>>>>> 1. We'll remove a fork for creating the worker queues.  Each of those
>>>>>> is just a thread when using threading, rather than a forked process.
>>>>>> Depending on the underlying OS, a thread is typically cheaper.  Also, some
>>>>>> of the inter-worker communication now becomes cheap intra-process
>>>>>> communication instead of heavier multiprocessing constructs.
>>>>>> 2. Debugging is a bit easier.  The worker queues make a lot of noise
>>>>>> in 'ps aux'-style greps, and are a pain to debug relatively speaking vs.
>>>>>> the threaded version.
>>>>>>
>>>>>> I'm not yet looking to remove the multiprocessing support.  It is
>>>>>> likely I'll check the OS X version and default to the multiprocessing test
>>>>>> runner if it wasn't explicitly specified and the OS X version is < 10.10 as
>>>>>> I'm pretty sure I hit the issue on 10.9's python.
>>>>>>
>>>>>> Thoughts?
>>>>>> --
>>>>>> -Todd
>>>>>> _______________________________________________
>>>>>> lldb-dev mailing list
>>>>>> lldb-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> -Todd
>>>
>>
>>
>>
>> --
>> -Todd
>>
>

-- 
-Todd