[LNT] r263445 - [test-suite] Implement a new option: --single-result

Mon Mar 14 09:50:56 PDT 2016

Author: jamesm
Date: Mon Mar 14 11:50:55 2016
New Revision: 263445

URL: http://llvm.org/viewvc/llvm-project?rev=263445&view=rev
Log:
[test-suite] Implement a new option: --single-result

The idea behind this option is to ease bisecting using tools like 'llvmlab bisect'. With --only-test we can already narrow the run down to a single test; however, LNT still produces a report.json and exits with success regardless of whether the test passed or failed.

With --single-result, LNT runs only one test (the option implies --only-test), and the exit status of LNT is determined by a predicate, which allows it to be used directly with llvmlab exec / llvmlab bisect.

The predicate is set with --single-result-predicate, and is a Python expression. It is evaluated in a context that contains 'status', a boolean representing the pass or fail status of the test, and all of the metrics exposed by LIT. Where metrics have different names in LIT and LNT parlance (for example 'exec' in LNT and 'exec_time' in LIT), both names are exposed. This has the side effect of working around a feature of Python: 'exec' is a keyword, so it cannot be used as a variable!

The default predicate is simply "status". This causes the exit status of LNT to correspond to the pass/fail status of the test - useful for conformance testing. Predicates such as "status and exec_time > 6.0" allow for simple performance bisections.
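
As a rough illustration of the evaluation described above, here is a minimal standalone sketch. It is not the patch itself: the function name predicate_exit_status is hypothetical, and the two-entry LIT_METRIC_TO_LNT table is an assumed stand-in for the mapping LNT defines in test_suite.py.

```python
# Assumed stand-in for LNT's LIT-to-LNT metric name mapping (illustrative only).
LIT_METRIC_TO_LNT = {"compile_time": "compile", "exec_time": "exec"}


def predicate_exit_status(predicate, is_pass, metrics):
    """Return the process exit status (0 or 1) for one test result.

    `predicate` is a Python expression evaluated with `status` and every
    metric bound as a variable. This helper name is hypothetical.
    """
    env = {"status": is_pass}
    for name, value in metrics.items():
        # Expose the metric under its LIT name ...
        env[name] = value
        # ... and under its LNT name when the two differ.
        if name in LIT_METRIC_TO_LNT:
            env[LIT_METRIC_TO_LNT[name]] = value
    # A truthy predicate maps to exit status 0, anything else to 1.
    return 0 if eval(predicate, {}, env) else 1


if __name__ == "__main__":
    print(predicate_exit_status("status", True, {}))
    print(predicate_exit_status("status and exec_time > 6.0", True,
                                {"exec_time": 7.5}))
```

Because the predicate is passed to eval(), it should only ever come from a trusted command line.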

We've been using this feature internally for a few weeks now and have found our bisections have become a lot easier.

Modified:
    lnt/trunk/docs/tests.rst
    lnt/trunk/lnt/tests/test_suite.py

Modified: lnt/trunk/docs/tests.rst
URL: http://llvm.org/viewvc/llvm-project/lnt/trunk/docs/tests.rst?rev=263445&r1=263444&r2=263445&view=diff
==============================================================================

--- lnt/trunk/docs/tests.rst (original)
+++ lnt/trunk/docs/tests.rst Mon Mar 14 11:50:55 2016
@@ -237,18 +237,50 @@ metrics than the Make system, for exampl
 
 Running the test-suite via CMake and lit uses a different LNT test::
 
-$ rm -rf /tmp/BAR
-$ lnt runtest test-suite \
-     --sandbox /tmp/BAR \
-     --cc ~/llvm.obj.64/Release+Asserts/bin/clang \
-     --cxx ~/llvm.obj.64/Release+Asserts/bin/clang++ \
-     --use-cmake=/usr/local/bin/cmake \
-     --use-lit=~/llvm/utils/lit/lit.py \
-     --test-suite ~/llvm-test-suite \
-     --cmake-cache Release
+  $ rm -rf /tmp/BAR
+  $ lnt runtest test-suite \
+       --sandbox /tmp/BAR \
+       --cc ~/llvm.obj.64/Release+Asserts/bin/clang \
+       --cxx ~/llvm.obj.64/Release+Asserts/bin/clang++ \
+       --use-cmake=/usr/local/bin/cmake \
+       --use-lit=~/llvm/utils/lit/lit.py \
+       --test-suite ~/llvm-test-suite \
+       --cmake-cache Release
      
 Since the CMake test-suite uses lit to run the tests and compare their output,
 LNT needs to know the path to your LLVM lit installation.  The test-suite Holds
 some common common configurations in CMake caches. The ``--cmake-cache`` flag
 and the ``--cmake-define`` flag allow you to change how LNT configures cmake
 for the test-suite run.
+
+Bisecting: ``--single-result`` and ``--single-result-predicate``
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+The LNT driver for the CMake-based test suite comes with helpers for bisecting conformance and performance changes with ``llvmlab bisect``.
+
+``llvmlab bisect`` is part of the ``zorg`` repository and allows easy bisection of some predicate through a build cache. The key to using ``llvmlab`` effectively is to design a good predicate command - one which exits with zero on 'pass' and nonzero on 'fail'.
+
+LNT normally runs one or more tests then produces a test report. It always exits with status zero unless an internal error occurred. The ``--single-result`` argument changes LNT's behaviour - it will only run one specific test and will apply a predicate to the result of that test to determine LNT's exit status.
+
+The ``--single-result-predicate`` argument defines the predicate to use. This is a Python expression that is executed in a context containing several pre-set variables:
+
+  * ``status`` - Boolean passed or failed (True for passed, False for failed).
+  * ``exec_time`` - Execution time (note that ``exec`` is a reserved keyword in Python!)
+  * ``compile`` (or ``compile_time``) - Compilation time
+
+Any metrics returned from the test, such as "score" or "hash" are also added to the context.
+
+The default predicate is simply ``status`` - so this can be used to debug correctness regressions out of the box. More complex predicates are possible; for example ``exec_time < 3.0`` would bisect assuming that a 'good' result takes less than 3 seconds.
+
+Full example using ``llvmlab`` to debug a performance improvement::
+
+  $ llvmlab bisect --min-rev=261265 --max-rev=261369 \
+    lnt runtest test-suite \
+      --cc '%(path)s/bin/clang' \
+      --sandbox SANDBOX \
+      --test-suite /work/llvm-test-suite \
+      --use-lit lit \
+      --run-under 'taskset -c 5' \
+      --cflags '-O3 -mthumb -mcpu=cortex-a57' \
+      --single-result MultiSource/Benchmarks/TSVC/Expansion-flt/Expansion-flt \
+      --single-result-predicate 'exec_time > 8.0'

Modified: lnt/trunk/lnt/tests/test_suite.py
URL: http://llvm.org/viewvc/llvm-project/lnt/trunk/lnt/tests/test_suite.py?rev=263445&r1=263444&r2=263445&view=diff
==============================================================================
--- lnt/trunk/lnt/tests/test_suite.py (original)
+++ lnt/trunk/lnt/tests/test_suite.py Mon Mar 14 11:50:55 2016
@@ -1,4 +1,4 @@
-import subprocess, tempfile, json, os, shlex, platform, pipes
+import subprocess, tempfile, json, os, shlex, platform, pipes, sys
 
 from optparse import OptionParser, OptionGroup
 
@@ -156,6 +156,15 @@ class TestSuiteTest(BuiltinTest):
                          help="Do not submit the stat of this type [%default]",
                          action='append', choices=KNOWN_SAMPLE_KEYS,
                          type='choice', default=[])
+        group.add_option("", "--single-result", dest="single_result",
+                         help=("only execute this single test and apply "
+                               "--single-result-predicate to calculate the "
+                               "exit status"))
+        group.add_option("", "--single-result-predicate",
+                         dest="single_result_predicate",
+                         help=("the predicate to apply to calculate the exit "
+                               "status (with --single-result)"),
+                         default="status")
         parser.add_option_group(group)
 
         group = OptionGroup(parser, "Test tools")
@@ -217,7 +226,7 @@ class TestSuiteTest(BuiltinTest):
                 parser.error(
                     "invalid --test-externals argument, does not exist: %r" % (
                         opts.test_suite_externals,))
-
+                
         opts.cmake = resolve_command_path(opts.cmake)
         if not isexecfile(opts.cmake):
             parser.error("CMake tool not found (looked for %s)" % opts.cmake)
@@ -234,6 +243,10 @@ class TestSuiteTest(BuiltinTest):
                 parser.error("Run under wrapper not found (looked for %s)" %
                              opts.run_under)
 
+        if opts.single_result:
+            # --single-result implies --only-test
+            opts.only_test = opts.single_result
+                
         if opts.only_test:
             # --only-test can either point to a particular test or a directory.
             # Therefore, test_suite_root + opts.only_test or
@@ -248,6 +261,10 @@ class TestSuiteTest(BuiltinTest):
             else:
                 parser.error("--only-test argument not understood (must be a " +
                              " test or directory name)")
+
+        if opts.single_result and not opts.only_test[1]:
+            parser.error("--single-result must be given a single test name, not a " +
+                         "directory name")
                 
         opts.cppflags = ' '.join(opts.cppflags)
         opts.cflags = ' '.join(opts.cflags)
@@ -506,6 +523,19 @@ class TestSuiteTest(BuiltinTest):
             raw_name = test_data['name'].split(' :: ', 1)[1]
             name = 'nts.' + raw_name.rsplit('.test', 1)[0]
             is_pass = self._is_pass_code(test_data['code'])
+
+            # If --single-result is given, exit based on --single-result-predicate
+            if self.opts.single_result and \
+               raw_name == self.opts.single_result+'.test':
+                env = {'status': is_pass}
+                if 'metrics' in test_data:
+                    for k,v in test_data['metrics'].items():
+                        env[k] = v
+                        if k in LIT_METRIC_TO_LNT:
+                            env[LIT_METRIC_TO_LNT[k]] = v
+                status = eval(self.opts.single_result_predicate, {}, env)
+                sys.exit(0 if status else 1)
+
             if 'metrics' in test_data:
                 for k,v in test_data['metrics'].items():
                     if k not in LIT_METRIC_TO_LNT or LIT_METRIC_TO_LNT[k] in ignore:
@@ -528,6 +558,10 @@ class TestSuiteTest(BuiltinTest):
                                             [self._get_lnt_code(test_data['code'])],
                                             test_info))
 
+        if self.opts.single_result:
+            # If we got this far, the result we were looking for didn't exist.
+            raise RuntimeError("Result %s did not exist!" %
+                               self.opts.single_result)
 
         # FIXME: Add more machine info!
         run_info = {