[lldb-dev] increase timeout for tests?
Jim Ingham via lldb-dev
lldb-dev at lists.llvm.org
Mon Mar 12 19:01:33 PDT 2018
The problem with having no timeouts is that you then have to be fairly careful about how you write tests. You can't do:
while True:
    print("Set a breakpoint here and hit it a few times then stop the test")
because if setting the breakpoint fails, the test can run forever. And lldb internally waits forever when launching or attaching to a process, so if you mess up attaching or launching in some situation, that again makes the test run forever. The timeouts are a backstop so that you still get useful results from the test suite when this sort of error occurs.
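
A bounded variant of the same inferior loop (just a sketch; the iteration count is arbitrary) avoids that failure mode, because the test eventually terminates on its own even if the breakpoint is never set:

# Self-terminating: if the breakpoint is never hit, the loop still ends,
# so a missing timeout cannot hang the whole suite.
for _ in range(100):
    print("Set a breakpoint here and hit it a few times then stop the test")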
Jim
> On Mar 12, 2018, at 6:13 PM, Davide Italiano via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>
> On Fri, Mar 9, 2018 at 3:45 AM, Pavel Labath <labath at google.com> wrote:
>>
>>
>>
>> On Thu, 8 Mar 2018 at 18:40, Davide Italiano <dccitaliano at gmail.com> wrote:
>>>
>>> On Thu, Mar 8, 2018 at 10:29 AM, Greg Clayton <clayborg at gmail.com> wrote:
>>>> It would be great to look into these and see what is taking so long.
>>>> Some tests are doing way too much work and should be split up. But it would
>>>> be great to see if we have any tests that are not good tests and are just
>>>> taking too much time for no reason (like the watchpoint tests were that
>>>> Pavel fixed).
>>>>
>>>
>>> I tend to agree with you. This is important. Pavel, is there a
>>> systematic way you use to profile test cases?
>>> Maybe we should consider having a testsuite mode where we warn about
>>> slow tests (or revamp the profiling mechanism, if there is one).
>>>
>>> Thanks!
>>
>>
>> I don't have any fancy mechanism for profiling this. I just hacked the
>> process_file function in dosep.py to measure the time each test takes and
>> report it if it exceeds a certain threshold. Then I just looked at the tests
>> that seemed to be taking more time than they ought to.
>>
>> This is the patch, in its entirety:
>> --- a/packages/Python/lldbsuite/test/dosep.py
>> +++ b/packages/Python/lldbsuite/test/dosep.py
>> @@ -489,8 +489,13 @@ def process_file(test_file, dotest_argv, inferior_pid_events):
>>      timeout = (os.getenv("LLDB_%s_TIMEOUT" % timeout_name) or
>>                 getDefaultTimeout(dotest_options.lldb_platform_name))
>>
>> +    import time
>> +    t = time.time()
>>      results.append(call_with_timeout(
>>          command, timeout, base_name, inferior_pid_events, test_file))
>> +    t = time.time() - t
>> +    if t > 20:
>> +        print("XXXXXXXXXXXXXXXXXX %s: %f" % (base_name, t))
>>
>>      # result = (name, status, passes, failures, unexpected_successes)
>>      timed_out = [name for name, status, _, _, _ in results
>>
>> We could try to make this fancier, but I'm not sure it's worth it. It just
>> occurred to me that you can achieve a similar effect by running the test
>> suite with a reduced timeout threshold. Something like "LLDB_TEST_TIMEOUT=20s
>> ninja check-lldb" should do the trick. At one point Todd also implemented a
>> mechanism for recording what the test is doing before we time out and kill
>> it. The idea was that this would help us determine what the test was doing
>> when it was killed (and whether it was really stuck, or just slow). However,
>> I've never used it, so I have no idea what state it's in.
>>
>> I am not opposed to raising the timeout threshold, but before we do that, we
>> should check whether it will actually help. I.e., whether the test is just
>> slow, or whether it sometimes hangs (I think TestMultithreaded is a good
>> candidate for the latter). Can you check how long those tests take to run
>> individually? The current osx timeout (dosep.py:getDefaultTimeout) is set to
>> 6 minutes, so if the tests normally take much less than that (under 2
>> minutes?), I think there is something else going on, and increasing the
>> timeout will not help.
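>>
>> For a quick individual measurement, something along these lines works
>> (just a sketch; the dotest.py invocation and the -p filename filter are
>> assumptions about the local checkout):
>>
>> import subprocess
>> import time
>>
>> start = time.time()
>> # Run a single test file through dotest.py and report the wall time.
>> subprocess.call(["python", "dotest.py", "-p", "TestMultithreaded.py"])
>> print("TestMultithreaded.py took %.2f seconds" % (time.time() - start))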
>>
>> Another interesting thing would be to check the system load while running the
>> test suite. Nowadays, lldb will sometimes spawn N (where N is the number of
>> cpus) threads to do debug info parsing, so when we run N tests in parallel,
>> we may end up with N^2 threads competing for N cpus. I've never seen this be
>> an issue, as our tests generally don't have that much debug info to begin
>> with, but I don't think we've ever investigated it either...
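>>
>> A crude way to check would be to sample the load average while the suite
>> runs, e.g. (a POSIX-only sketch; the 5-second interval is arbitrary):
>>
>> import os
>> import time
>>
>> while True:
>>     # 1/5/15-minute load averages; values far above N while the suite
>>     # runs would support the oversubscription theory.
>>     print("load: %.2f %.2f %.2f" % os.getloadavg())
>>     time.sleep(5)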
>>
>> BTW: This is my list of tests that take over 20 seconds (I ran it on linux
>> so it does not include any osx-specific ones):
>> 20.729438 TestWatchpointCommands.py
>> 21.158373 TestMiData.py
>> 22.018853 TestRecursiveInferior.py
>> 22.692744 TestInferiorCrashing.py
>> 22.986511 TestLinuxCore.py
>> 23.073790 TestInferiorAssert.py
>> 23.193351 TestAnonymous.py
>> 24.539430 TestMiStack.py
>> 25.232940 TestUnwindExpression.py
>> 26.608233 TestSetWatchlocation.py
>> 28.977390 TestWatchLocation.py
>> 29.057234 TestMultithreaded.py
>> 29.813142 TestCPP11EnumTypes.py
>> 30.830618 TestWatchLocationWithWatchSet.py
>> 32.979785 TestThreadStates.py
>> 33.435421 TestTargetWatchAddress.py
>> 36.902618 TestNamespaceLookup.py
>> 37.705945 TestMiSymbol.py
>> 38.735631 TestValueObjectRecursion.py
>> 46.870848 TestLldbGdbServer.py
>> 49.594663 TestLoadUnload.py
>> 52.186429 TestIntegerTypes.py
>> 55.340360 TestMiExec.py
>> 56.148259 TestIntegerTypesExpr.py
>> 110.121061 TestMiGdbSetShow.py
>>
>> The top one is an lldb-mi test, and it takes this long for a very silly
>> reason: when a test is xfailed because lldb-mi does not produce the expected
>> output, pexpect will keep waiting, hoping that it will eventually produce a
>> line that matches its regex. This times out after 30 seconds, so any
>> xfailed lldb-mi test will take at least that long. I'm not sure how to fix
>> this without revamping the way we do lldb-mi testing (which would probably
>> be a good thing, but I don't see anyone queuing up to do it).
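>>
>> To illustrate (a sketch, not the actual lldb-mi test harness; the binary
>> name and expected pattern are placeholders):
>>
>> import pexpect
>>
>> child = pexpect.spawn("lldb-mi")
>> try:
>>     # If lldb-mi never emits a matching line, expect() blocks until the
>>     # timeout fires -- so every such xfail costs at least 30 seconds.
>>     child.expect(r"\^done", timeout=30)
>> except pexpect.TIMEOUT:
>>     pass  # the test is then recorded as an expected failure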
>>
>>
>
> I thought about this a little more, and I think the best solution would
> be to remove the timeout altogether.
> This is, among others, what llvm does; timeouts can be opt-in. Setting
> a timeout inherently means making an assumption about the speed at
> which the machine will run the tests, so it is, by definition,
> impossible to set an "optimal" value.
> That said, slow tests should be analyzed independently (we don't want
> people tapping their fingers while waiting for tests to complete).
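>
> Roughly like this (a sketch; using coreutils timeout(1) as the opt-in
> mechanism is my assumption, not what dosep.py does today):
>
> import os
> import subprocess
>
> def run_test(command):
>     # command is an argv list; timeouts apply only if explicitly requested.
>     timeout = os.getenv("LLDB_TEST_TIMEOUT")  # opt-in, e.g. "20s"
>     if timeout is None:
>         return subprocess.call(command)  # no limit by default, as in llvm
>     return subprocess.call(["timeout", timeout] + command)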
>
> Best,
>
> --
> Davide