<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 19, 2016 at 7:27 AM, Dan Liew <span dir="ltr"><<a href="mailto:dan@su-root.co.uk" target="_blank">dan@su-root.co.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 18 August 2016 at 19:48, Kostya Serebryany <<a href="mailto:kcc@google.com">kcc@google.com</a>> wrote:<br>

><br>

><br>

> On Thu, Aug 18, 2016 at 11:29 AM, Dan Liew <<a href="mailto:dan@su-root.co.uk">dan@su-root.co.uk</a>> wrote:<br>

>><br>

>> >> Perhaps the 1 second time is<br>

>> >> too short for some buildbots if they are under load? We could extend<br>

>> >> the time we wait (to something > 1 second) but we'd probably have to<br>

>> >> increase `-max_total_time=4` to something larger. This wouldn't be a<br>

>> >> real fix though, as the test would still be racy.<br>

>> ><br>

>> ><br>

>> > I think it can be rewritten to be non-racy.<br>

>> > Just run it with -max_total_time=2 -jobs=2 and let it and all children<br>

>> > exit<br>

>> > -- then check that the files exist and are good.<br>

>><br>

>> I agree that doing that would be non-racy; however, that would be<br>

>> testing a slightly different property (that all jobs eventually run)<br>

>> rather than what the test is currently trying to check (that LibFuzzer<br>

>> can spawn multiple copies of itself running in parallel). The change<br>

>> you are proposing to the test would fail to catch the problem that I fixed<br>

>> on macOS (LibFuzzer would not run multiple jobs in parallel).<br>

><br>

><br>

> Ok...<br>

> You can probably modify your test to do grep "Running 2 workers" instead of<br>

<br>

</span>That's actually slightly worse. LibFuzzer currently only prints that<br>

message when `Flags.jobs > 0 && Flags.workers == 0` and that is false<br>

for the test (we need to be independent from the number of CPUs on the<br>

host so the number of workers and jobs are set to be the same, i.e.<br>

2).<br>

<br>

Even if we modified LibFuzzer to print the message in all cases, it would have<br>

the same problem as the previous solution you proposed, namely that<br>

the test would fail to detect the problem I fixed on macOS. This<br>

message would print before calling `ExecuteCommand()` and we wouldn't<br>

be actually checking that the child processes were launched in<br>

parallel.<br>

<br>

I think we either need to<br>

<br>

* Abandon testing that the jobs actually run in parallel (what I<br>

wanted to test) and instead check only that all jobs run and eventually<br>

finish.<br>

OR<br>

* Make the times used in the existing test more tolerant to the system<br>

being under load by sleeping for longer and running the jobs for<br>

longer.<br></blockquote><div><br></div><div>Let's start by increasing the sleep to 2 seconds. </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Yes, this does suck. I don't know of a way to observe that two<br>

LibFuzzer jobs are actually running in parallel in a way that isn't<br>

racy.<br>

</blockquote></div><br></div></div>