[Lldb-commits] [PATCH] D12651: Add ctrl-c support to parallel dotest.py.

Todd Fiala via lldb-commits lldb-commits at lists.llvm.org
Fri Sep 4 22:04:36 PDT 2015


That'll also let me set it up so Greg can poke around with the threading
version on OS X.
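
For anyone following along, here is a rough sketch of the general shape of a threading version: a few worker threads each exec one test command at a time and push the exit status onto a synchronized queue for the parent to drain. It is only an illustration, not the code in the patch. The names worker and run_threaded are made up, and the commands are placeholders rather than real dotest.py invocations.

```python
import queue
import subprocess
import sys
import threading


def worker(jobs, results):
    """Pull commands off the job queue until it is empty (hypothetical helper)."""
    while True:
        try:
            cmd = jobs.get_nowait()
        except queue.Empty:
            return
        # Each thread execs a separate process, so tests stay process-isolated.
        status = subprocess.call(cmd)
        results.put((cmd, status))


def run_threaded(commands, num_threads=2):
    """Run every command in a small thread pool and collect (cmd, status) pairs."""
    jobs = queue.Queue()
    results = queue.Queue()
    for cmd in commands:
        jobs.put(cmd)
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All workers are done, so exactly one result per command is queued.
    return [results.get() for _ in range(len(commands))]


if __name__ == "__main__":
    # Placeholder commands standing in for per-directory test invocations.
    demo = [[sys.executable, "-c", "pass"] for _ in range(4)]
    for cmd, status in run_threaded(demo):
        print(status, " ".join(cmd))
```

The results never leave the parent process, which is the part that makes output handling simpler than with multiprocessing.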

On Fri, Sep 4, 2015 at 10:04 PM, Todd Fiala <todd.fiala at gmail.com> wrote:

> Yep, I'm thinking that's right.
>
> On Fri, Sep 4, 2015 at 10:02 PM, Zachary Turner <zturner at google.com>
> wrote:
>
>> The pluggable method would at least allow everyone to continue working
>> until someone has time to dig into what's wrong with multiprocess on Windows.
>>
>> On Fri, Sep 4, 2015 at 9:56 PM Todd Fiala <todd.fiala at gmail.com> wrote:
>>
>>> On Fri, Sep 4, 2015 at 5:40 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Sep 4, 2015 at 5:10 PM Todd Fiala <todd.fiala at gmail.com> wrote:
>>>>
>>>>> tfiala added a comment.
>>>>>
>>>>> In http://reviews.llvm.org/D12651#240480, @zturner wrote:
>>>>>
>>>>> > Tried out this patch, unfortunately I'm seeing the same thing.  The
>>>>> > very first call to worker.join() is never returning.
>>>>> >
>>>>> > It's unfortunate that it's so hard to debug this stuff, do you have
>>>>> > any suggestions for how I can try to nail down what the child dotest
>>>>> > instance is actually doing?  I wonder if it's blocking somewhere in
>>>>> > its script, or if this is some quirk of the multiprocessing library's
>>>>> > dynamic invocation / whatever magic it does.
>>>>> >
>>>>> > How much of an effort would it be to make the switch to threads
>>>>> > now?  The main thing we'd have to do is get rid of all of the globals
>>>>> > in dotest, and make a DoTest class or something.
>>>>>
>>>>>
>>>>> It's a bit more work than I want to take on right now.  I think we
>>>>> really may want to keep the multiprocessing and just not exec out to
>>>>> dotest.py for a third-ish time for each inferior.
>>>>>
>>>>
>>>> Just to clarify, are you saying we may want to keep multiprocessing
>>>> over threads even if you can solve the exec problem?  Any particular reason?
>>>>
>>>
>>> Yes, you understood me correctly.
>>>
>>> Prior to me getting into it, dosep.py was designed to isolate each test
>>> into its own process (via the subprocess exec call) so that each test
>>> directory or file got its own lldb processor and there was process-level
>>> isolation, less contention on the Python global interpreter lock, etc.
>>>
>>> Then, when Steve Pucci and later I got to making it multithreaded, we
>>> wrapped the exec call in an "import threading" style thread pool.  That
>>> maintained the process isolation property by having each thread just do an
>>> exec (i.e. multiple execs in parallel).  Except, this didn't work on
>>> MacOSX.  The exec calls grab the Python GIL on OS X (and not anywhere else,
>>> as far as I could find).  But multithreading + exec is a valid option for
>>> everything not OS X.
>>>
>>> The way I solved it to work for everyone was to drop the "import
>>> threading" approach and switch to the "import multiprocessing" approach.
>>> This worked everywhere, including on OS X (although with a few hiccups
>>> initially, as it exposed occasional hangs at the time with what looked like
>>> socket handling under Darwin).  What I failed to see in my haste was that I
>>> then had two levels of fork/exec-like behavior (i.e. we had two process
>>> firewalls where we only needed one, at the cost of an extra exec): the
>>> multiprocessing works by effectively forking/creating a new process that is
>>> now isolated.  But then we turn around and still create a subprocess to
>>> exec out to dotest.py.
>>>
>>> What I'm suggesting in the near future is if we stick with the
>>> multiprocessing approach, and eliminate the subprocess exec and instead
>>> just have the multiprocess worker call directly into a methodized entry
>>> point in dotest.py, we can skip the subprocess call within the multiprocess
>>> worker.  It is already isolated and a separate process, so it is already
>>> fulfilling the isolation requirement.  And it reduces the doubled processes
>>> created.  And it works on OS X in addition to everywhere else.  It does
>>> become more difficult to debug, but then again the majority of the logic is
>>> in dotest.py and can be debugged --no-multiprocess (or with logging).
>>>
>>> This is all separate somewhat from the Ctrl-C issue, but it is the
>>> backstory on what I'm referring to with the parallel test runner.
>>>
>>> Completely as an aside, I did ask Greg Clayton to see if he can poke
>>> into why OS X is hitting the Python GIL on execs in "import
>>> threading"-style execs from multiple threads.  But assuming nothing magic
>>> changes there and it wasn't easily solved (I tried and failed after several
>>> attempts to diagnose last year), I'd prefer to keep a strategy that is the
>>> same unless there's a decent win on the execution front.
>>>
>>> That all said, I'm starting to think a pluggable strategy for the actual
>>> mechanic of the parallel test run might end up being best anyway since I'd
>>> really like the Ctrl-C working and I'm not able to diagnose what's
>>> happening on the Windows scenario.
>>>
>>>
>>>>   Multi-threaded is much easier to debug, for starters, because you can
>>>> just attach your debugger to a single process.  It also solves a lot of
>>>> race conditions and makes output processing easier (not to mention higher
>>>> performance), because you don't even need a way to have the sub-processes
>>>> communicate their results back to the parent because the results are just
>>>> in memory.  Stick them in a synchronized queue and the parent can just
>>>> process it.  So it would probably even speed up the test runner.
>>>>
>>>> I think if there's not a very good reason to keep multiprocessing
>>>> around, we should aim for a threaded approach.  My understanding is that
>>>> lit already does this, so there's no fundamental reason it shouldn't work
>>>> correctly on MacOSX; we just have to solve the exec problem like you mentioned.
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> -Todd
>>>
>>
>
>
> --
> -Todd
>
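
To make the "skip the subprocess exec" idea from the quoted discussion concrete, here is a minimal sketch, assuming dotest.py were refactored to expose a callable entry point. The function run_one_test_dir below is a hypothetical stand-in for that entry point, not something dotest.py provides today, and run_all is likewise just an illustrative name. The multiprocessing worker is already a separate process, so it calls the function directly rather than exec'ing a second one.

```python
import multiprocessing
import os


def run_one_test_dir(test_dir):
    """Hypothetical stand-in for a methodized dotest.py entry point.

    Runs inside the pool worker process, which already provides the
    process-level isolation, so no further subprocess exec is needed.
    """
    exit_status = 0  # placeholder; a real entry point would run the tests
    return (test_dir, os.getpid(), exit_status)


def run_all(test_dirs, num_workers=2):
    """Fan the test directories out to worker processes and gather results."""
    pool = multiprocessing.Pool(num_workers)
    try:
        return pool.map(run_one_test_dir, test_dirs)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    # Placeholder directory names, for illustration only.
    for test_dir, pid, status in run_all(["dir_a", "dir_b", "dir_c"]):
        print(test_dir, pid, status)
```

The trade-off mentioned in the thread still applies: the worker body is harder to attach a debugger to, which is why keeping most of the logic runnable in a single process matters.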



-- 
-Todd