[Lldb-commits] [PATCH] D12651: Add ctrl-c support to parallel dotest.py.
Zachary Turner via lldb-commits
lldb-commits at lists.llvm.org
Fri Sep 4 22:02:39 PDT 2015
The pluggable method would at least allow everyone to continue working
until someone has time to dig into what's wrong with multiprocess on Windows
On Fri, Sep 4, 2015 at 9:56 PM Todd Fiala <todd.fiala at gmail.com> wrote:
> On Fri, Sep 4, 2015 at 5:40 PM, Zachary Turner <zturner at google.com> wrote:
>> On Fri, Sep 4, 2015 at 5:10 PM Todd Fiala <todd.fiala at gmail.com> wrote:
>>> tfiala added a comment.
>>> In http://reviews.llvm.org/D12651#240480, @zturner wrote:
>>> > Tried out this patch, unfortunately I'm seeing the same thing. The
>>> > first call to worker.join() is never returning.
>>> > It's unfortunate that it's so hard to debug this stuff, do you have any
>>> > suggestions for how I can try to nail down what the child dotest
>>> > is actually doing? I wonder if it's blocking somewhere in its
>>> script, or
>>> > if this is some quirk of the multiprocessing library's dynamic
>>> invocation /
>>> > whatever magic is does.
>>> > How much of an effort would it be to make the switch to threads now?
>>> > main thing we'd have to do is get rid of all of the globals in
>>> dotest, and
>>> > make a DoTest class or something.
>>> It's a bit more work than I want to take on right now. I think we
>>> really may want to keep the multiprocessing and just not exec out to
>>> dotest.py for a third-ish time for each inferior.
>> Just to clarify, are you saying we may want to keep multiprocessing over
>> threads even if you can solve the exec problem? Any particular reason?
> Yes, you understood me correctly.
> Prior to me getting into it, dosep.py was designed to isolate each test
> into its own process (via the subprocess exec call) so that each test
> directory or file got its own lldb processor and there was process-level
> isolation, less contention on the Python global interpreter lock, etc.
> Then, when Steve Pucci and later I got to making it multithreaded, we
> wrapped the exec call in a "import threading" style thread pool. That
> maintained the process isolation property by having each thread just do an
> exec (i.e. multiple execs in parallel). Except, this didn't work on
> MacOSX. The exec calls grab the Python GIL on OS X (and not anywhere as as
> far as I could find). But multithreading + exec is a valid option for
> everything not OS X.
> The way I solved it to work for everyone was to drop the "import
> threading" approach and switch to the "import multiprocessing" approach.
> This worked everywhere, including on OS X (although with a few hiccups
> initially, as it exposed occasional hangs at the time with what looked like
> socket handling under Darwin). What I failed to see in my haste was that I
> then had two levels of fork/exec-like behavior (i.e. we had two process
> firewalls where we only needed one, at the cost of an extra exec): the
> multiprocessing works by effectively forking/creating a new process that is
> now isolated. But then we turn around and still create a subprocess to
> exec out to dotest.py.
> What I'm suggesting in the near future is if we stick with the
> multiprocessing approach, and eliminate the subprocess exec and instead
> just have the multiprocess worker call directly into a methodized entry
> point in dotest.py, we can skip the subprocess call within the multiprocess
> worker. It is already isolated and a separate process, so it is already
> fulfilling the isolation requirement. And it reduces the doubled processes
> created. And it works on OS X in addition to everywhere else. It does
> become more difficult to debug, but then again the majority of the logic is
> in dotest.py and can be debugged --no-multiprocess (or with logging).
> This is all separate somewhat from the Ctrl-C issue, but it is the
> backstory on what I'm referring to with the parallel test runner.
> Completely as an aside, I did ask Greg Clayton to see if he can poke into
> why OS X is hitting the Python GIL on execs in "import threading"-style
> execs from multiple threads. But assuming nothing magic changes there and
> it wasn't easily solved (I tried and failed after several attempts to
> diagnose last year), I'd prefer to keep a strategy that is the same unless
> there's a decent win on the execution front.
> That all said, I'm starting to think a pluggable strategy for the actual
> mechanic of the parallel test run might end up being best anyway since I'd
> really like the Ctrl-C working and I'm not able to diagnose what's
> happening on the Windows scenario.
>> Multi-threaded is much easier to debug, for starters, because you can
>> just attach your debugger to a single process. It also solves a lot of
>> race conditions and makes output processing easier (not to mention higher
>> performance), because you don't even need a way to have the sub-processes
>> communicate their results back to the parent because the results are just
>> in memory. stick them in a synchronized queue and the parent can just
>> process it. So it would probably even speed up the test runner.
>> I think if there's not a very good reason to keep multiprocessing around,
>> we should aim for a threaded approach. My understanding is that lit
>> already does this, so there's no fundamental reason it shouldn't work
>> correctly on MacOSX, just have to solve the exec problem like you mentioned.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the lldb-commits