[llvm-commits] [PATCH] [Lit] Use multiprocessing instead of threading

Fri Nov 30 13:37:20 PST 2012

Hi Takumi,

Thanks for working on this!

1. I agree with Eli's comments about the function level import.

2. There should be no --enable-mp-whatever flag, things should just work
for users.

3. This patch can be cleaned up a lot:

a. Change the threading case to use an output queue for the test results
and have a separate display thread that consumes it. This drops the need
for the locking in the display code, and means the MP and thread paths can
use the same run() method. Please use a small helper function to dispatch
to that from a free function (for pickling) instead of duplicating the body.

This requires dropping the special case in runTests() for numThreads == 1,
but I am ok with that.

b. The TestProvider object for the threading case can be changed to use a
Queue just like the MP case. Then that object can be shared by both
implementations. Note that TestProvider currently handles the --max-time
argument which your patch drops support for in the MP case.

c. runMPTests() has no comments. :)

With these changes you should be able to make the dispatch code
(runMPTests()) identical for both threading and non-threading cases. Note
that you are putting more sentinel markers in your processing queue than
you need (numThreads**2 instead of numThreads).

4. I just added a trivial test suite in
utils/lit/lit/ExampleTests/ManyTests which you can use to test
multiprocessing performance, if you like. On my machine I get:

TOT:
--
ManyTests (master)$ lit -sv .
-- Testing: 10000 tests, 8 threads --
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 6.83s
  Expected Passes    : 10000
--

With MP patch:
--
ddunbar at ozzy:ManyTests (master)$ lit -sv .
-- Testing: 10000 tests, 8 threads --
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 1.74s
  Expected Passes    : 10000
--

 - Daniel

On Fri, Nov 23, 2012 at 12:03 AM, NAKAMURA Takumi <geek4civic at gmail.com>wrote:

> Hi chandlerc, ddunbar,
>
> Threading is too slow possibly due to GIL.
> multiprocessing brings 95% of processor usage. I believe it would scale up!
>
>
> * For me with lit -j8 llvm/test clang/test :
>
> AMD FX-8150 8 cores, 3.6GHz (same as bb.pgr.jp's buildslave)
>
> - threading (trunk)
>
>   50s to 59s (unstable), with -sv
>   48s to 58s (unstable), with -q
>
> - multiprocessing
>
>   36s -sv
>   35s -q
>
>
> * Considerations:
>
>   - Need the switch "--enable(disable)-mp"?
>   - MP is disabled on Win32.
>     - MP cannot be invoked via utils/llvm-lit. Investigating.
>     - Starting MP's subprocess takes several seconds, due to pickling
> instead of fork().
>
> http://llvm-reviews.chandlerc.com/D134
>
> Files:
>   llvm/utils/lit/lit/Test.py
>   llvm/utils/lit/lit/main.py
>
> Index: llvm/utils/lit/lit/Test.py
> ===================================================================
> --- llvm/utils/lit/lit/Test.py
> +++ llvm/utils/lit/lit/Test.py
> @@ -7,12 +7,14 @@
>          self.name = name
>          self.isFailure = isFailure
>
> -PASS        = TestResult('PASS', False)
> -XFAIL       = TestResult('XFAIL', False)
> -FAIL        = TestResult('FAIL', True)
> -XPASS       = TestResult('XPASS', True)
> -UNRESOLVED  = TestResult('UNRESOLVED', True)
> -UNSUPPORTED = TestResult('UNSUPPORTED', False)
> +TestResults = dict()
> +
> +PASS        = TestResults['PASS']        = TestResult('PASS',
>  False)
> +XFAIL       = TestResults['XFAIL']       = TestResult('XFAIL',
> False)
> +FAIL        = TestResults['FAIL']        = TestResult('FAIL',        True)
> +XPASS       = TestResults['XPASS']       = TestResult('XPASS',       True)
> +UNRESOLVED  = TestResults['UNRESOLVED']  = TestResult('UNRESOLVED',  True)
> +UNSUPPORTED = TestResults['UNSUPPORTED'] = TestResult('UNSUPPORTED',
> False)
>
>  # Test classes.
>
> Index: llvm/utils/lit/lit/main.py
> ===================================================================
> --- llvm/utils/lit/lit/main.py
> +++ llvm/utils/lit/lit/main.py
> @@ -112,7 +112,9 @@
>              item = self.provider.get()
>              if item is None:
>                  break
> -            self.runTest(item)
> +            result, output, elapsed = self.runTest(item)
> +            item.setResult(result, output, elapsed)
> +            self.display.update(item)
>
>      def runTest(self, test):
>          result = None
> @@ -134,8 +136,22 @@
>              output += '\n'
>          elapsed = time.time() - startTime
>
> -        test.setResult(result, output, elapsed)
> -        self.display.update(test)
> +        return ((result, output, elapsed))
> +
> +# It should not be @staticmethod for pickling on Win32.
> +def runMP(litConfig, tests, testids, results):
> +    tester = Tester(litConfig, None, None)
> +    while True:
> +        n = testids.get()
> +        if n is None:
> +            break
> +        result, output, elapsed = tester.runTest(tests[n])
> +
> +        # Don't pass Test.*(s) into Queue. They might become non-identical
> +        # objects against parent's context. Pass its name instead.
> +        results.put((n, result.name, output, elapsed))
> +
> +    results.put((None, None, None, None))
>
>  def dirContainsTestSuite(path):
>      cfgpath = os.path.join(path, gSiteConfigName)
> @@ -297,15 +313,60 @@
>          if sub_ts and not N:
>              litConfig.warning('test suite %r contained no tests' %
> sub_ts.name)
>
> -def runTests(numThreads, litConfig, provider, display):
> +def runMPTests(numThreads, litConfig, tests, display):
> +
> +    import multiprocessing
> +
> +    testids = multiprocessing.Queue()
> +    results = multiprocessing.Queue()
> +    ps = [multiprocessing.Process(
> +            target = runMP,
> +            args = (litConfig, tests, testids, results),
> +            )
> +          for i in range(numThreads)]
> +    for p in ps:
> +        p.start();
> +
> +    for n in range(len(tests)):
> +        testids.put(n)
> +
> +    for i in range(numThreads):
> +        for test in range(numThreads):
> +            testids.put(None)
> +        while True:
> +            n, resultname, output, elapsed = results.get()
> +            if n is None:
> +                break
> +            result = Test.TestResults[resultname]
> +            test = tests[n]
> +            test.setResult(result, output, elapsed)
> +            display.update(test)
> +
> +    for p in ps:
> +        p.join();
> +
> +def runTests(numThreads, litConfig, tests, maxTime, display):
>      # If only using one testing thread, don't use threads at all; this
> lets us
>      # profile, among other things.
>      if numThreads == 1:
> +        provider = TestProvider(tests, maxTime)
>          t = Tester(litConfig, provider, display)
>          t.run()
>          return
>
> +    # Try to run MPTests.
> +    try:
> +        if platform.system()=='Windows':
> +            # It's still slow to invoke children due to cost of pickling.
> +            pass
> +        else:
> +            runMPTests(numThreads, litConfig, tests, display)
> +            return
> +    except ImportError:
> +        pass
> +
>      # Otherwise spin up the testing threads and wait for them to finish.
> +    provider = TestProvider(tests, maxTime)
>      testers = [Tester(litConfig, provider, display)
>                 for i in range(numThreads)]
>      for t in testers:
> @@ -594,8 +655,7 @@
>
>      startTime = time.time()
>      display = TestingProgressDisplay(opts, len(tests), progressBar)
> -    provider = TestProvider(tests, opts.maxTime)
> -    runTests(opts.numThreads, litConfig, provider, display)
> +    runTests(opts.numThreads, litConfig, tests, opts.maxTime, display)
>      display.finish()
>
>      if not opts.quiet:
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20121130/1f280d26/attachment.html>