[lldb-dev] Too many open files

Todd Fiala via lldb-dev lldb-dev at lists.llvm.org
Mon Oct 5 11:20:27 PDT 2015


Interesting, okay.

This does look like an accumulation issue: you made it most of the way
through before it hit.  I suspect we're leaking file handles.  The
multiprocessing runner probably stays under the per-process limit because
the leaked handles are spread across many worker processes instead of
piling up in a single one.

(All speculation but does fit the results).
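
To make the speculation above concrete, here is a rough back-of-the-envelope
sketch.  The limit of 512 and the leak rate of 2 handles per test suite are
made-up numbers for illustration only, and the model assumes each leak stays
in whichever worker ran the test.  It is not a measurement of the real runner.

```python
def peak_leaked_per_process(total_leaked, workers, one_process):
    """Worst-case leaked handles held by any single process after a run.

    total_leaked: handles leaked over the whole run (hypothetical figure).
    workers:      number of worker threads or worker processes.
    one_process:  True for a thread pool, where all workers share one process.
    """
    if one_process:
        # Every leak lands in the same process, so they all add up there.
        return total_leaked
    # Ceiling division: leaks are split roughly evenly across processes.
    return -(-total_leaked // workers)


LIMIT = 512        # hypothetical per-process cap, illustration only
LEAKED = 400 * 2   # hypothetical: 400 test suites, 2 handles leaked each

# Thread pool: all 800 leaked handles sit in one process.
print(peak_leaked_per_process(LEAKED, 38, one_process=True) > LIMIT)
# Process pool: each of the 38 processes holds only a small share.
print(peak_leaked_per_process(LEAKED, 38, one_process=False) > LIMIT)
```

Under those assumptions the thread-pool case crosses the cap and the
process-pool case stays far below it, which is the pattern reported.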

I'll see if I can look into what's there - if we've got an obvious leak,
I'll take care of it.
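
One cheap way to look for an obvious leak, sketched below, is to watch how
far the lowest free file descriptor moves across a piece of work.  This
relies on the OS handing out the lowest free descriptor number, which POSIX
guarantees; I'm assuming the Windows C runtime behaves similarly.  The
helper names (lowest_free_fd, fd_drift, leaky_job, clean_job) are
hypothetical and not part of dosep.py, and the probe is only meaningful
when no other thread is opening or closing files at the same time.

```python
import os


def lowest_free_fd():
    """Return the descriptor number the next open would likely receive."""
    fd = os.open(os.devnull, os.O_RDONLY)
    os.close(fd)
    return fd


def fd_drift(job):
    """Run job() and report how far the lowest free descriptor moved."""
    before = lowest_free_fd()
    job()
    return lowest_free_fd() - before


def leaky_job():
    # Hypothetical stand-in for a test that forgets to close a file.
    os.open(os.devnull, os.O_RDONLY)


def clean_job():
    # Same work, but the descriptor is released before returning.
    fd = os.open(os.devnull, os.O_RDONLY)
    os.close(fd)


print("leaky drift:", fd_drift(leaky_job))
print("clean drift:", fd_drift(clean_job))
```

A drift that stays positive after each test suite would point at the code
that ran in between; a drift of zero says that piece of work cleaned up
after itself.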

On Mon, Oct 5, 2015 at 9:58 AM, Adrian McCarthy <amccarth at google.com> wrote:

> Thanks for the ideas.
>
> With `--test-runner-name threading-pool`, I get too many open files.
>
> With `--test-runner-name multiprocessing-pool`, the suite runs fine.
>
> My machine has 40 logical cores.
>
> With `--threads=20`:  SUCCESS (and perhaps _faster_).
>
> With `--threads=30`:  SUCCESS.
>
> With `--threads=36`:  SUCCESS.
>
> With `--threads=38`:  TOO MANY OPEN FILES.
>
> So we're right at the edge.  I'll keep investigating.
>
>
> On Fri, Oct 2, 2015 at 5:38 PM, Todd Fiala <todd.fiala at gmail.com> wrote:
>
>> (swapped out the lldb list for the newer one)
>>
>> On Fri, Oct 2, 2015 at 5:37 PM, Todd Fiala <todd.fiala at gmail.com> wrote:
>>
>>> Hmm, sounds suspicious.
>>>
>>> Can you try running the tests with two options and see if you get
>>> different results?
>>>
>>> # Should be equivalent to the default on Windows, so it should match
>>> your results above.  This one uses a thread per worker queue.
>>> --test-runner-name threading-pool
>>>
>>> # should use a different test runner.  This one uses a process per
>>> worker queue.
>>> --test-runner-name multiprocessing-pool
>>>
>>> Aside from that, it looks like the total number of open files is
>>> exceeding some per-process or system maximum, which suggests (maybe) we're
>>> leaking files somewhere.  There's not enough info yet to guess where, but
>>> maybe some part of the test runner isn't closing its files.
>>>
>>> The other thing you can try is reducing the total number of threads,
>>> with:
>>> --threads {some-number-lower-than-your-total-number-of-logical-cores}
>>>
>>> in case your machine has a very large number of logical cores and the
>>> runner is trying to do too much at once.  (In that case, the
>>> multiprocessing-pool runner might actually help).
>>>
>>> Thanks!
>>>
>>> -Todd
>>>
>>> On Fri, Oct 2, 2015 at 5:31 PM, Adrian McCarthy <amccarth at google.com>
>>> wrote:
>>>
>>>> When running LLDB tests on Windows, I started getting a "too many open
>>>> files" error from Python.  I used git bisect to narrow it down to this
>>>> revision:
>>>>
>>>> http://llvm.org/viewvc/llvm-project?view=revision&revision=249182
>>>>
>>>> The error output is:
>>>>
>>>> Command invoked: D:\src\Python-2.7.9\PCbuild\python_d.exe
>>>> D:\src\llvm\llvm\tools\lldb\test\dotest.py -q --arch=i686 --executable
>>>> D:/src/llvm/build/ninja/bin/lldb.exe -s
>>>> D:/src/llvm/build/ninja/lldb-test-traces -u CXXFLAGS -u CFLAGS
>>>> --enable-crash-dialog -C D:\src\llvm\build\ninja_release\bin\clang.exe
>>>> --inferior -p TestRecursiveTypes.py D:\src\llvm\llvm\tools\lldb\test
>>>> --event-add-entries worker_index=7:int
>>>> 384 out of 400 test suites processed - TestRecursiveTypes.py
>>>>         Traceback (most recent call last):
>>>>   File "D:/src/llvm/llvm/tools/lldb/test/dotest.py", line 1457, in
>>>> <module>
>>>>   File "D:\src\llvm\llvm\tools\lldb\test\dosep.py", line 1355, in main
>>>>   File "D:\src\llvm\llvm\tools\lldb\test\dosep.py", line 968, in
>>>> walk_and_invoke
>>>>   File "D:\src\llvm\llvm\tools\lldb\test\dosep.py", line 1095, in
>>>> <lambda>
>>>>   File "D:\src\llvm\llvm\tools\lldb\test\dosep.py", line 889, in
>>>> threading_test_runner_pool
>>>>   File "D:\src\llvm\llvm\tools\lldb\test\dosep.py", line 774, in
>>>> map_async_run_loop
>>>>   File "D:\src\Python-2.7.9\Lib\multiprocessing\pool.py", line 558, in
>>>> get
>>>> OSError: [Errno 24] Too many open files
>>>> [77809 refs]
>>>> ninja: build stopped: subcommand failed.
>>>>
>>>>
>>>> Any clue what might have caused this or what can be done to fix it?
>>>>
>>>> It's Friday afternoon, so there's no urgency from my perspective.  I'll
>>>> probably get back to this on Monday morning.
>>>>
>>>> Thanks,
>>>> Adrian McCarthy
>>>>
>>>
>>>
>>>
>>> --
>>> -Todd
>>>
>>
>>
>>
>> --
>> -Todd
>>
>
>


-- 
-Todd