[PATCH] [Lit] Use multiprocessing instead of threading

Daniel Dunbar daniel at zuster.org
Thu Oct 24 17:14:43 PDT 2013


On Thu, Oct 24, 2013 at 5:10 PM, Sean Silva <silvas at purdue.edu> wrote:

>
>
>
> On Thu, Oct 24, 2013 at 8:01 PM, Daniel Dunbar <daniel at zuster.org> wrote:
>
>> On Thu, Oct 24, 2013 at 4:50 PM, Sean Silva <silvas at purdue.edu> wrote:
>>
>>>
>>>
>>>
>>> On Wed, Oct 23, 2013 at 3:44 PM, Rafael Espíndola <
>>> rafael.espindola at gmail.com> wrote:
>>>
>>>> On 23 October 2013 00:26, Daniel Dunbar <daniel at zuster.org> wrote:
>>>> > I wanted to have a little bit of time to check out how things were on
>>>> > Windows, but I haven't found it.
>>>> >
>>>> > Off the top of my head, I think that was the main thing, other than
>>>> also
>>>> > just giving it a little bake time.
>>>> >
>>>> > We could go ahead and switch the default for non-Windows and see what
>>>> > happens if you are interested...
>>>>
>>>> Probably a good idea. With check-all the time goes from 1m10.644s to
>>>> 0m50.506s on my machine :-)
>>>>
>>>
>>> Wow, that means that 20s of the check-all time were just lit overhead,
>>> which is insane.
>>>
>>
>>  No, it doesn't, it means that 20s of check-all time could be saved by
>> increased parallelism.
>>
>
> Oh, I thought that the change to multiprocessing didn't affect the number
> of jobs spawned? Is multiprocessing automatically selecting a better job
> count, or what is the source of this huge performance discrepancy?
>

Previously we used Python-thread based parallelism. Python has a "global
interpreter lock" (GIL) protecting certain core interpreter operations.
While Python threads are backed by native threads, whenever a thread is
actually executing Python bytecode it must hold the GIL, which means only
one thread can execute bytecode at a time.

As you can imagine, if you run 8 threads but they all share a single lock
to do anything, you don't get much parallelism. In practice it's not quite
as bad as that for lit, because lit is often shelling out to other
processes (which can happen with the GIL released), and that is where its
parallelism came from. However, the GIL still greatly reduces the
available parallelism.

With multiprocessing we should effectively be able to keep all of the CPUs
busy for the duration of testing.
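A minimal sketch of the difference (not lit's actual code; the test
function and job counts are made up for illustration): the same CPU-bound
work dispatched through a thread pool is serialized by the GIL, while a
process pool can run it on all cores at once.

```python
# Hypothetical sketch, not lit's implementation: thread pool vs. process
# pool for pure-Python (GIL-bound) work.
import multiprocessing
from multiprocessing.pool import ThreadPool

def run_test(n):
    # Stand-in for running one test: bytecode-heavy work that must
    # hold the GIL the whole time.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    jobs = [50_000] * 8
    # Thread pool: native threads, but only one executes bytecode at
    # a time, so this is effectively serial.
    with ThreadPool(4) as tp:
        thread_results = tp.map(run_test, jobs)
    # Process pool: each worker has its own interpreter and GIL, so
    # the work genuinely runs in parallel.
    with multiprocessing.Pool(4) as pp:
        proc_results = pp.map(run_test, jobs)
    assert thread_results == proc_results
```

The results are identical; only the wall-clock time differs. This is why
the switch helps even though the number of jobs spawned is unchanged.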

 - Daniel


> -- Sean Silva
>
>
>>
>> The lit overhead is fairly small; you can check it by running with
>> --no-execute. On my machine for just LLVM, lit can "run" all the tests in
>> <3s if it doesn't actually execute them -- this includes the time for
>> discovery, and reading and parsing every single test script. It takes < .5s
>> to handle just discovery and processing of all the tests without reading
>> and parsing the test scripts, so you can see that most of the time is in
>> parsing the "ShTest" format, but even that is a small percentage of the
>> overall testing time.
>>
>>  - Daniel
>>
>>  Can you estimate the total lit overhead after switching to
>>> multiprocessing? E.g., can we gain another 20s of check-all speed by
>>> reducing lit overhead?
>>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>> Cheers,
>>>> Rafael
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>
>>>
>>
>