[Lldb-commits] [Diffusion] rL238467: Refactor test runner to print sub-test-case pass/fail rate.

Mon Aug 24 14:15:38 PDT 2015

dawn accepted this commit.
dawn added a comment.

After some investigation, it appears your patch may have simply exposed an existing bug, so in one sense I owe you an apology, but in another, your patch made that bug impossible to work around. :) Before your change, it was possible to count the total tests run by adding up all the Ns in the lines:

  Ran N tests in .*

and this would give the correct total.  But after your change, these lines were no longer getting printed, forcing one to depend on the final count from:

  Ran N test cases .*

which is wrong, as I'll explain.  Below is a comparison between dosep and dotest on a narrowed subset of tests, showing how dosep can omit an entire test suite's test cases from its count.

Tested on a subset of lldb/test with just the following directories/files (i.e. all other directories/files were removed):

      test/make
      test/pexpect-2.4
      test/plugins
      test/types
      test/unittest2
  # The .py files kept in test/types are as follows (so test/types/TestIntegerTypes.py* was removed):
      test/types/AbstractBase.py
      test/types/HideTestFailures.py
      test/types/TestFloatTypes.py
      test/types/TestFloatTypesExpr.py
      test/types/TestIntegerTypesExpr.py
      test/types/TestRecursiveTypes.py

Tests were run in the lldb/test directory using the following commands:

  dotest:
      ./dotest.py -v
  dosep:
      ./dosep.py -s --options "-v"

Comparing the test case totals, dotest correctly counts 46, but dosep counts only 16:

  dotest:
      Ran 46 tests in 75.934s
  dosep:
      Testing: 23 tests, 4 threads ## note: this number changes randomly
      Ran 6 tests in 7.049s
      [PASSED TestFloatTypes.py] - 1 out of 23 test suites processed
      Ran 6 tests in 11.165s
      [PASSED TestFloatTypesExpr.py] - 2 out of 23 test suites processed
      Ran 30 tests in 54.581s ## FIXME: not counted?
      [PASSED TestIntegerTypesExpr.py] - 3 out of 23 test suites processed
      Ran 4 tests in 3.212s
      [PASSED TestRecursiveTypes.py] - 4 out of 23 test suites processed
      Ran 4 test suites (0 failed) (0.000000%)
      Ran 16 test cases (0 failed) (0.000000%)
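
To make the discrepancy concrete, here is a minimal Python sketch (not part of dosep or dotest) that pairs each "Ran N tests in" line with the "[PASSED ...]" line that follows it and compares the sum with the final "Ran N test cases" line. The names SAMPLE and per_suite_counts are made up for illustration, and the pairing assumes each count line is immediately followed by its suite line, which may not hold when output from several threads interleaves.

```python
import re

# Output lines copied from the dosep run above (inline annotations removed).
SAMPLE = """\
Testing: 23 tests, 4 threads
Ran 6 tests in 7.049s
[PASSED TestFloatTypes.py] - 1 out of 23 test suites processed
Ran 6 tests in 11.165s
[PASSED TestFloatTypesExpr.py] - 2 out of 23 test suites processed
Ran 30 tests in 54.581s
[PASSED TestIntegerTypesExpr.py] - 3 out of 23 test suites processed
Ran 4 tests in 3.212s
[PASSED TestRecursiveTypes.py] - 4 out of 23 test suites processed
Ran 4 test suites (0 failed) (0.000000%)
Ran 16 test cases (0 failed) (0.000000%)
"""

RAN_RE = re.compile(r"^\s*Ran (\d+) tests? in ")
# Only "[PASSED name]" appears in the output above; matching any
# upper-case status word is an assumption.
SUITE_RE = re.compile(r"^\s*\[[A-Z]+ (\S+)\]")
TOTAL_RE = re.compile(r"^\s*Ran (\d+) test cases ")


def per_suite_counts(text):
    """Return ({suite name: test count}, total reported by the summary line)."""
    counts = {}
    pending = None   # count from the most recent "Ran N tests in" line
    reported = None  # N from the final "Ran N test cases" line, if present
    for line in text.splitlines():
        m = RAN_RE.match(line)
        if m:
            pending = int(m.group(1))
            continue
        m = SUITE_RE.match(line)
        if m and pending is not None:
            counts[m.group(1)] = pending
            pending = None
            continue
        m = TOTAL_RE.match(line)
        if m:
            reported = int(m.group(1))
    return counts, reported


counts, reported = per_suite_counts(SAMPLE)
print("sum of per-suite counts:", sum(counts.values()))
print("reported by dosep:", reported)
```

On the output above, the per-suite counts sum to 46 while dosep reports 16; the difference of 30 is exactly the count for TestIntegerTypesExpr.py.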

With test/types/TestIntegerTypesExpr.py* removed, both correctly count 16 test cases:

  dosep:
      Testing: 16 tests, 4 threads
      Ran 6 tests in 7.059s
      Ran 6 tests in 11.186s
      Ran 4 tests in 3.241s
      Ran 3 test suites (0 failed) (0.000000%)
      Ran 16 test cases (0 failed) (0.000000%)

In rev.238454 (before your change), the results didn't report the number of
test cases, but the test suite count was already wrong.  Running dosep on the
above test subset but with all tests in types (i.e. adding back
TestIntegerTypes.py so we have 5 tests in types), we get:

  dosep:
      Ran 6 tests in 7.871s
      Ran 6 tests in 13.812s
      Ran 30 tests in 36.102s
      Ran 30 tests in 64.063s
      Ran 4 tests.

It seems now that, with dosep's -s option, we can once again see the output:

  Ran N tests in .*

So counting the totals via:

  ./dosep.py -s --options "-v --executable $INSTALLDIR/bin/lldb" 2>&1 | tee test_out.log || true
  export total=`grep -E "^Ran [0-9]+ tests? in" test_out.log | awk '{count+=$2} END {print count}'`

works once again.
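
For anyone who prefers not to depend on grep and awk, a rough Python equivalent of the counting step might look like the sketch below. The function name total_tests is hypothetical; the pattern mirrors the grep expression above, and unlike the awk version it yields 0 rather than an empty string when no lines match.

```python
import re

# Same pattern as the grep above: "Ran N test(s) in ..." at the start of a line.
RAN_LINE = re.compile(r"^Ran (\d+) tests? in")


def total_tests(lines):
    """Sum N over every line of the form 'Ran N tests in ...'."""
    total = 0
    for line in lines:
        m = RAN_LINE.match(line)
        if m:
            total += int(m.group(1))
    return total


# Usage, assuming the log was written with tee as shown above:
#     with open("test_out.log") as log:
#         print(total_tests(log))
```

The summary lines "Ran N test suites" and "Ran N test cases" are deliberately not matched, so only the per-suite counts contribute to the total.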

BTW, what about tests that time out?  I don't see where dosep will report any information about tests which time out.

Note: I couldn't compare the test counts on all the tests because of the concern raised in http://reviews.llvm.org/rL237053.  That is, dotest can no longer complete the tests: every test suite after test case 898 (test_disassemble_invalid_vst_1_64_raw_data) gets ERRORs.  I don't think that issue is related to the problems in dosep, but I could be wrong.

Note also: When running dotest on earlier sources, it can hang on several tests.
To work around this, delete these tests from lldb/test:

  rm -rf ./functionalities/thread/create_after_attach/TestCreateAfterAttach.py*
  rm -rf ./functionalities/plugins/python_os_plugin/TestPythonOSPlugin.py*
  rm -rf ./functionalities/unwind/sigtramp/TestSigtrampUnwind.py*
  rm -rf ./test/macosx/queues/TestQueues.py*
  rm -rf ./test/functionalities/inferior-crashing/TestInferiorCrashing.py*

In summary, dosep is unable to count the test cases correctly, but this problem existed before your change, and I'm happy that I'm able to use my workaround again.  It would be nice to get that fixed someday, and also to see information about the tests that timed out.

Thanks,
-Dawn

Users:
  zturner (Author)
  dawn (Auditor)

http://reviews.llvm.org/rL238467