[Lldb-commits] [PATCH] 5 minute timeout for tests

Mon Dec 1 16:05:04 PST 2014

>
> "timeout 5m %s %s/dotest.py %s -p %s %s" will kill python after 5
> minutes, but will it also kill any inferiors, and descendants of those?
> And what if you have A > B > C, and B dies, then you kill A's tree?

As far as I can tell with my experimentation, timeout actually handles all
of those cases perfectly.

> launching the process in a process group ("job" on windows)

Could we do that portably?

> If it's not important then it seems like just running the process in a
> separate thread with a timeout would be sufficient.

I think it's important, since otherwise you'll finish the test but still
end up with a bunch of processes taking up resources in the background.
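
For reference, a minimal sketch of the process-group approach on POSIX. This
is not the patch under review, just an illustration under some assumptions:
Python 3.3+ (for start_new_session and wait(timeout=...)), and a hypothetical
helper name run_with_timeout. It does not cover Windows job objects.

```python
import os
import signal
import subprocess


def run_with_timeout(argv, timeout_seconds):
    """Run argv in its own process group and kill the group on timeout.

    Hypothetical helper, POSIX only. Returns (returncode, timed_out).
    """
    # start_new_session=True makes the child call setsid(), so it becomes
    # the leader of a new process group whose ID equals its PID.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        proc.wait(timeout=timeout_seconds)
        return proc.returncode, False
    except subprocess.TimeoutExpired:
        # Signal the whole group rather than just the direct child.
        # Descendants stay in the group even if an intermediate parent
        # has already exited (the A > B > C case), unless they moved
        # themselves to a different group or session.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the child so it does not linger as a zombie
        return proc.returncode, True
```

The caveat is that any descendant which calls setsid() or setpgid() on its
own leaves the group and would survive the kill, so whether this is enough
depends on how the inferiors are launched.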

On Mon, Dec 1, 2014 at 3:50 PM, Zachary Turner <zturner at google.com> wrote:

> I looked at this some more, and I'm not sure if there's really a good
> solution.  I looked at the source code for psutil, and it also doesn't work
> correctly in case a process in the middle of the chain dies.  I don't think
> TASKKILL /T does either.  Does the original patch work with this case?  "timeout
> 5m %s %s/dotest.py %s -p %s %s" will kill python after 5 minutes, but will
> it also kill any inferiors, and descendants of those?  And what if you have
> A > B > C, and B dies, then you kill A's tree?
>
> If it's actually important to kill the tree then I think the only way to
> really do it correctly is by launching the process in a process group
> ("job" on windows).  If it's not important then it seems like just running
> the process in a separate thread with a timeout would be sufficient.
>
> On Mon Dec 01 2014 at 3:27:45 PM Zachary Turner <zturner at google.com>
> wrote:
>
>> On Mon Dec 01 2014 at 3:19:09 PM Vince Harron <vharron at google.com> wrote:
>>
>>> Currently when the tests lock up on the build server, the build script
>>> kills the tests and gets *no* test results.  That means that if a bug goes
>>> in that causes a test to hang, we don't get results for *any* tests.
>>>
>>> Killing the individual test causes it to show up as a FAIL (Chaoren
>>> might change this to show result as "TIMEOUT")
>>>
>>
>> I see, this wasn't clear before.  I understood that the bots would
>> already kill individual tests that hung, and not fail the entire test
>> suite.
>>
>>
>>
>>>
>>> It's not really about what tests are hanging today, it's that some day a
>>> test will hang.  When that happens, do we want to lose all test results as
>>> a result?  Look at the Linux buildbot history.  Many runs have no test
>>> results at all.  Which test caused the lockup?  I have no idea.
>>>
>>> > Both of those options are really undesirable in my opinion.
>>>
>>> We can make the test timeout be an OSX/Linux thing only for now...
>>>
>> How bad would it be to just use psutil?  We could integrate it with the
>> build so that it builds psutil when you build lldb, and deploys it to the
>> same place that lldb's python module goes so that it's automatically in the
>> PYTHONPATH.
>>
>> Eventually we're going to need this on other platforms too, and it
>> benefits everyone if we can share code instead of implementing things
>> differently on each platform.
>>
>