[PATCH] D13389: [lit] Raise the default soft process limit when possible

Fri Oct 2 10:31:38 PDT 2015

Patch LGTM. Did this fix your problem on your machine?

On Fri, Oct 2, 2015 at 10:26 AM, hfinkel at anl.gov <hfinkel at anl.gov> wrote:

> hfinkel created this revision.
> hfinkel added reviewers: ruiu, chandlerc, cmatthews.
> hfinkel added a subscriber: llvm-commits.
> hfinkel set the repository for this revision to rL LLVM.
>
> It is common to have a default soft process limit, at least on some
> families of Linux distributions, of 1024. This is normally more than
> enough, but if you have many cores, and you're running tests that create
> many threads, this can become a problem. My POWER7 development machine has
> 48 cores, and when running the lld regression tests, which often want to
> create up to 48 threads, I run into problems. lit, by default, will want to
> run 48 tests in parallel, and 48*48 (2304) exceeds 1024, so many tests fail like
> this:
>
> terminate called after throwing an instance of 'std::system_error'
>   what():  Resource temporarily unavailable
>
> or lit fails like this when launching a test:
>
>   OSError: [Errno 11] Resource temporarily unavailable
>
> lit can easily detect this situation and attempt to repair it before
> launching tests (by raising the soft process limit to something that will
> allow ncpus^2 threads to be created), and should do so to prevent spurious
> test failures.
>
> This is the follow-up to this thread:
> http://lists.llvm.org/pipermail/llvm-dev/2015-October/090942.html
>
>
> Repository:
>   rL LLVM
>
> http://reviews.llvm.org/D13389
>
> Files:
>   utils/lit/lit/run.py
>
> Index: utils/lit/lit/run.py
> ===================================================================
> --- utils/lit/lit/run.py
> +++ utils/lit/lit/run.py
> @@ -228,6 +228,28 @@
>              canceled_flag = LockedValue(0)
>              consumer = ThreadResultsConsumer(display)
>
> +        # Because some tests use threads internally, and at least on Linux each
> +        # of these threads counts toward the current process limit, try to
> +        # raise the (soft) process limit so that tests don't fail due to
> +        # resource exhaustion.
> +        try:
> +          cpus = lit.util.detectCPUs()
> +          desired_limit = jobs * cpus * 2 # the 2 is a safety factor
> +
> +          # Import the resource module here inside this try block because it
> +          # will likely fail on Windows.
> +          import resource
> +
> +          max_procs_soft, max_procs_hard = resource.getrlimit(resource.RLIMIT_NPROC)
> +          desired_limit = min(desired_limit, max_procs_hard)
> +
> +          if max_procs_soft < desired_limit:
> +            self.lit_config.note('raising the process limit from %d to %d' % \
> +                                 (max_procs_soft, desired_limit))
> +            resource.setrlimit(resource.RLIMIT_NPROC, (desired_limit, max_procs_hard))
> +        except:
> +          pass
> +
>          # Create the test provider.
>          provider = TestProvider(self.tests, jobs, queue_impl, canceled_flag)
>
>
>
>
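For readers who want to try the idea outside of lit, the logic in the quoted patch can be sketched as a standalone script. This is only an illustration under some assumptions, not the code that was committed: `desired_soft_limit` and `raise_process_limit` are hypothetical names, `os.cpu_count()` stands in for `lit.util.detectCPUs()`, and the handling of an unlimited hard limit (`resource.RLIM_INFINITY`) is an addition that the patch itself does not make.

```python
import os

try:
    import resource  # POSIX-only; the import fails on Windows
except ImportError:
    resource = None


def desired_soft_limit(jobs, cpus, soft, hard, unlimited=-1, safety=2):
    """Return the soft limit to request, or None if no change is needed.

    Hypothetical helper: `unlimited` is the sentinel the platform uses for
    "no limit" (resource.RLIM_INFINITY in practice).
    """
    if soft == unlimited:
        return None  # already unlimited, nothing to raise
    wanted = jobs * cpus * safety  # same formula as the patch
    if hard != unlimited:
        wanted = min(wanted, hard)  # the soft limit cannot exceed the hard limit
    return wanted if soft < wanted else None


def raise_process_limit(jobs):
    """Try to raise RLIMIT_NPROC; return the new soft limit or None."""
    if resource is None or not hasattr(resource, "RLIMIT_NPROC"):
        return None
    cpus = os.cpu_count() or 1  # assumption: stand-in for lit.util.detectCPUs()
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    new_soft = desired_soft_limit(jobs, cpus, soft, hard,
                                  unlimited=resource.RLIM_INFINITY)
    if new_soft is None:
        return None
    try:
        resource.setrlimit(resource.RLIMIT_NPROC, (new_soft, hard))
    except (ValueError, OSError):
        return None  # best effort only, as in the patch
    return new_soft
```

With the numbers from the description (48 jobs, 48 cores, a soft limit of 1024), the helper asks for 48 * 48 * 2 = 4608 processes, or for the hard limit if that is lower. Catching only `ValueError` and `OSError` is a deliberate difference from the bare `except:` in the patch, so that unrelated errors are not silently swallowed.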