[lldb-dev] increase timeout for tests?
Jim Ingham via lldb-dev
lldb-dev at lists.llvm.org
Mon Mar 12 19:01:33 PDT 2018
The problem with having no timeouts is that you then have to be fairly careful about how you write tests. You can't do:
while True:
    print("Set a breakpoint here and hit it a few times then stop the test")
because if setting the breakpoint fails, the test can run forever. And lldb internally waits forever when launching or attaching to a process, so if you mess up attaching or launching in some situation, that again makes the test run forever. The timeouts are a backstop so that you still get useful results from the test suite when this sort of error occurs.
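
A bounded variant of the same inferior loop (just a sketch; the iteration count is arbitrary) avoids that failure mode, because the test eventually terminates on its own even if the breakpoint is never set:

# Self-terminating: if the breakpoint is never hit, the loop still ends,
# so a missing timeout cannot hang the whole suite.
for _ in range(100):
    print("Set a breakpoint here and hit it a few times then stop the test")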
Jim
> On Mar 12, 2018, at 6:13 PM, Davide Italiano via lldb-dev <lldb-dev at lists.llvm.org> wrote:
>
> On Fri, Mar 9, 2018 at 3:45 AM, Pavel Labath <labath at google.com> wrote:
>>
>>
>>
>> On Thu, 8 Mar 2018 at 18:40, Davide Italiano <dccitaliano at gmail.com> wrote:
>>>
>>> On Thu, Mar 8, 2018 at 10:29 AM, Greg Clayton <clayborg at gmail.com> wrote:
>>>> It would be great to look into these and see what is taking so long.
>>>> Some tests are doing way too much work and should be split up. But it would
>>>> be great to see if we have any tests that are not good tests and are just
>>>> taking too much time for no reason (like the watchpoint tests were that
>>>> Pavel fixed).
>>>>
>>>
>>> I tend to agree with you. This is important. Pavel, is there a
>>> systematic way you use to profile test cases?
>>> Maybe we should consider having a testsuite mode where we warn about
>>> slow tests (or revamp the profiling mechanism, if there is one).
>>>
>>> Thanks!
>>
>>
>> I don't have any fancy mechanism for profiling this. I just hacked the
>> process_file function in dosep.py to measure the time each test takes and
>> report it if it exceeds a certain threshold. Then I just looked at the tests
>> that seemed to be taking more time than they ought to.
>>
>> This is the patch, in its entirety:
>> --- a/packages/Python/lldbsuite/test/dosep.py
>> +++ b/packages/Python/lldbsuite/test/dosep.py
>> @@ -489,8 +489,13 @@ def process_file(test_file, dotest_argv, inferior_pid_events):
>>      timeout = (os.getenv("LLDB_%s_TIMEOUT" % timeout_name) or
>>                 getDefaultTimeout(dotest_options.lldb_platform_name))
>>
>> +    import time
>> +    t = time.time()
>>      results.append(call_with_timeout(
>>          command, timeout, base_name, inferior_pid_events, test_file))
>> +    t = time.time() - t
>> +    if t > 20:
>> +        print("XXXXXXXXXXXXXXXXXX %s: %f" % (base_name, t))
>>
>>      # result = (name, status, passes, failures, unexpected_successes)
>>      timed_out = [name for name, status, _, _, _ in results
>>
>> We could try to make this fancier, but I'm not sure it's worth it. It just
>> occurred to me that you can achieve a similar effect by running the test
>> suite with a reduced timeout threshold. Something like "LLDB_TEST_TIMEOUT=20s
>> ninja check-lldb" should do the trick. At one point Todd also implemented a
>> mechanism for recording what the test is doing before we time out and kill
>> it. The idea was that this would help us determine what the test was doing
>> when it was killed (and whether it was really stuck, or just slow). However,
>> I've never used it, so I have no idea what state it's in.
>>
>> I am not opposed to raising the timeout threshold, but before we do that, we
>> should check whether it will actually help. I.e., whether the test is just
>> slow, or whether it sometimes hangs (I think TestMultithreaded is a good
>> candidate for the latter). Can you check how long those tests take to run
>> individually? The current osx timeout (dosep.py:getDefaultTimeout) is set to
>> 6 minutes, so if the tests normally take much less than that (under 2
>> minutes?), I think there is something else going on, and increasing the
>> timeout will not help.
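>>
>> For a quick individual measurement, something along these lines works
>> (just a sketch; the dotest.py invocation and the -p filename filter are
>> assumptions about the local checkout):
>>
>> import subprocess
>> import time
>>
>> start = time.time()
>> # Run a single test file through dotest.py and report the wall time.
>> subprocess.call(["python", "dotest.py", "-p", "TestMultithreaded.py"])
>> print("TestMultithreaded.py took %.2f seconds" % (time.time() - start))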
>>
>> Another interesting thing would be to check the system load while running the
>> test suite. Nowadays, lldb will sometimes spawn N (where N is the number of
>> cpus) threads to do debug info parsing, so when we run N tests in parallel,
>> we may end up with N^2 threads competing for N cpus. I've never seen this be
>> an issue, as our tests generally don't have that much debug info to begin
>> with, but I don't think we've ever investigated it either...
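>>
>> A crude way to check would be to sample the load average while the suite
>> runs, e.g. (a POSIX-only sketch; the 5-second interval is arbitrary):
>>
>> import os
>> import time
>>
>> while True:
>>     # 1/5/15-minute load averages; values far above N while the suite
>>     # runs would support the oversubscription theory.
>>     print("load: %.2f %.2f %.2f" % os.getloadavg())
>>     time.sleep(5)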
>>
>> BTW: This is my list of tests that take over 20 seconds (I ran it on linux
>> so it does not include any osx-specific ones):
>> 20.729438 TestWatchpointCommands.py
>> 21.158373 TestMiData.py
>> 22.018853 TestRecursiveInferior.py
>> 22.692744 TestInferiorCrashing.py
>> 22.986511 TestLinuxCore.py
>> 23.073790 TestInferiorAssert.py
>> 23.193351 TestAnonymous.py
>> 24.539430 TestMiStack.py
>> 25.232940 TestUnwindExpression.py
>> 26.608233 TestSetWatchlocation.py
>> 28.977390 TestWatchLocation.py
>> 29.057234 TestMultithreaded.py
>> 29.813142 TestCPP11EnumTypes.py
>> 30.830618 TestWatchLocationWithWatchSet.py
>> 32.979785 TestThreadStates.py
>> 33.435421 TestTargetWatchAddress.py
>> 36.902618 TestNamespaceLookup.py
>> 37.705945 TestMiSymbol.py
>> 38.735631 TestValueObjectRecursion.py
>> 46.870848 TestLldbGdbServer.py
>> 49.594663 TestLoadUnload.py
>> 52.186429 TestIntegerTypes.py
>> 55.340360 TestMiExec.py
>> 56.148259 TestIntegerTypesExpr.py
>> 110.121061 TestMiGdbSetShow.py
>>
>> The top one is an lldb-mi test, and it takes this long for a very silly
>> reason: when a test is xfailed because lldb-mi does not produce the expected
>> output, pexpect will keep waiting, hoping that it will eventually produce a
>> line that matches its regex. This times out after 30 seconds, so any
>> xfailed lldb-mi test will take at least that long. I'm not sure how to fix
>> this without revamping the way we do lldb-mi testing (which would probably
>> be a good thing, but I don't see anyone queuing up to do it).
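>>
>> To illustrate (a sketch, not the actual lldb-mi test harness; the binary
>> name and expected pattern are placeholders):
>>
>> import pexpect
>>
>> child = pexpect.spawn("lldb-mi")
>> try:
>>     # If lldb-mi never emits a matching line, expect() blocks until the
>>     # timeout fires -- so every such xfail costs at least 30 seconds.
>>     child.expect(r"\^done", timeout=30)
>> except pexpect.TIMEOUT:
>>     pass  # the test is then recorded as an expected failure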
>>
>>
>
> I thought about this a little more, and I think the best solution would
> be to remove the timeout altogether.
> This is, among others, what llvm does; timeouts can be opt-in. Setting
> a timeout inherently means making an assumption about the speed at
> which the machine will run the tests, so it is, by definition,
> impossible to set an "optimal" value.
> That said, slow tests should be analyzed independently (we don't want
> people tapping their fingers while waiting for tests to complete).
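>
> Roughly like this (a sketch; using coreutils timeout(1) as the opt-in
> mechanism is my assumption, not what dosep.py does today):
>
> import os
> import subprocess
>
> def run_test(command):
>     # command is an argv list; timeouts apply only if explicitly requested.
>     timeout = os.getenv("LLDB_TEST_TIMEOUT")  # opt-in, e.g. "20s"
>     if timeout is None:
>         return subprocess.call(command)  # no limit by default, as in llvm
>     return subprocess.call(["timeout", timeout] + command)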
>
> Best,
>
> --
> Davide