[llvm] f951b0f - [lit] Add builtin support for flaky tests in lit

Wed Mar 18 20:46:51 PDT 2020

Libc++ has some tests that are timing-dependent, like `std::future::wait_for`. Those are inherently dependent on timing, and pretty much the only way to make them non-dependent is to mock the system clock (not easy to do). If someone knows other ways to improve the situation, I’m eager to learn (really!).

We already reimplement that feature in libc++’s complicated test format, and it looks like at least 2 other projects had similar needs since they added `config.test_retry_attempts` (I could find that used twice in LLVM).

I have no more love for flaky tests than anyone else, and I agree they’re a huge problem. But when you’re stuck with one, now, at least it’s easy to make it not break every other build without a custom test format (or making every test flaky, which is what config.test_retry_attempts does).

Louis

> On Mar 18, 2020, at 19:03, Philip Reames <listmail at philipreames.com> wrote:
> 
> Er, I'm really not sure this is a good idea.  Can you explain why we have tests which are *fundamentally* flaky and can't simply be fixed to be non-flaky?
> 
> Philip
> 
> On 3/18/20 3:04 PM, Louis Dionne via llvm-commits wrote:
>> Author: Louis Dionne
>> Date: 2020-03-18T18:04:01-04:00
>> New Revision: f951b0f82dff97f91d9b89c67886740acf3373cd
>> 
>> URL: https://github.com/llvm/llvm-project/commit/f951b0f82dff97f91d9b89c67886740acf3373cd
>> DIFF: https://github.com/llvm/llvm-project/commit/f951b0f82dff97f91d9b89c67886740acf3373cd.diff
>> 
>> LOG: [lit] Add builtin support for flaky tests in lit
>> 
>> This commit adds a new keyword in lit called ALLOW_RETRIES. This keyword
>> takes a single integer as an argument, and it allows the test to fail that
>> number of times before it first succeeds.
>> 
>> This work attempts to make the existing test_retry_attempts more flexible
>> by allowing by-test customization, as well as eliminate libc++'s FLAKY_TEST
>> custom logic.
>> 
>> Differential Revision: https://reviews.llvm.org/D76288
>> 
>> Added:
>>     llvm/utils/lit/tests/Inputs/allow-retries/does-not-succeed-within-limit.py
>>     llvm/utils/lit/tests/Inputs/allow-retries/lit.cfg
>>     llvm/utils/lit/tests/Inputs/allow-retries/more-than-one-allow-retries-lines.py
>>     llvm/utils/lit/tests/Inputs/allow-retries/not-a-valid-integer.py
>>     llvm/utils/lit/tests/Inputs/allow-retries/succeeds-within-limit.py
>>     llvm/utils/lit/tests/Inputs/test_retry_attempts/lit.cfg
>>     llvm/utils/lit/tests/Inputs/test_retry_attempts/test.py
>>     llvm/utils/lit/tests/allow-retries.py
>> 
>> Modified:
>>     llvm/utils/lit/lit/Test.py
>>     llvm/utils/lit/lit/TestRunner.py
>>     llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt
>>     llvm/utils/lit/tests/unit/TestRunner.py
>>     llvm/utils/vim/syntax/llvm.vim
>>     llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
>> 
>> Removed:
>>     
>> 
>> ################################################################################
>> diff  --git a/llvm/utils/lit/lit/Test.py b/llvm/utils/lit/lit/Test.py
>> index 62f1bbf1f03a..000bcf8fc38f 100644
>> --- a/llvm/utils/lit/lit/Test.py
>> +++ b/llvm/utils/lit/lit/Test.py
>> @@ -220,6 +220,10 @@ def __init__(self, suite, path_in_suite, config, file_path = None):
>>          # triple parts. All of them must be False for the test to run.
>>          self.unsupported = []
>>  +        # An optional number of retries allowed before the test finally succeeds.
>> +        # The test is run at most once plus the number of retries specified here.
>> +        self.allowed_retries = getattr(config, 'test_retry_attempts', 0)
>> +
>>          # The test result, once complete.
>>          self.result = None
>>  
>> diff  --git a/llvm/utils/lit/lit/TestRunner.py b/llvm/utils/lit/lit/TestRunner.py
>> index 96411c98ee33..a9518b2b5a0b 100644
>> --- a/llvm/utils/lit/lit/TestRunner.py
>> +++ b/llvm/utils/lit/lit/TestRunner.py
>> @@ -1182,13 +1182,15 @@ class ParserKind(object):
>>      LIST: A keyword taking a comma-separated list of values.
>>      BOOLEAN_EXPR: A keyword taking a comma-separated list of
>>          boolean expressions. Ex 'XFAIL:'
>> +    INTEGER: A keyword taking a single integer. Ex 'ALLOW_RETRIES:'
>>      CUSTOM: A keyword with custom parsing semantics.
>>      """
>>      TAG = 0
>>      COMMAND = 1
>>      LIST = 2
>>      BOOLEAN_EXPR = 3
>> -    CUSTOM = 4
>> +    INTEGER = 4
>> +    CUSTOM = 5
>>        @staticmethod
>>      def allowedKeywordSuffixes(value):
>> @@ -1196,6 +1198,7 @@ def allowedKeywordSuffixes(value):
>>                   ParserKind.COMMAND:      [':'],
>>                   ParserKind.LIST:         [':'],
>>                   ParserKind.BOOLEAN_EXPR: [':'],
>> +                 ParserKind.INTEGER:      [':'],
>>                   ParserKind.CUSTOM:       [':', '.']
>>                 } [value]
>>  @@ -1205,6 +1208,7 @@ def str(value):
>>                   ParserKind.COMMAND:      'COMMAND',
>>                   ParserKind.LIST:         'LIST',
>>                   ParserKind.BOOLEAN_EXPR: 'BOOLEAN_EXPR',
>> +                 ParserKind.INTEGER:      'INTEGER',
>>                   ParserKind.CUSTOM:       'CUSTOM'
>>                 } [value]
>>  @@ -1247,6 +1251,8 @@ def __init__(self, keyword, kind, parser=None, initial_value=None):
>>              self.parser = self._handleList
>>          elif kind == ParserKind.BOOLEAN_EXPR:
>>              self.parser = self._handleBooleanExpr
>> +        elif kind == ParserKind.INTEGER:
>> +            self.parser = self._handleSingleInteger
>>          elif kind == ParserKind.TAG:
>>              self.parser = self._handleTag
>>          elif kind == ParserKind.CUSTOM:
>> @@ -1311,6 +1317,18 @@ def _handleList(line_number, line, output):
>>          output.extend([s.strip() for s in line.split(',')])
>>          return output
>>  +    @staticmethod
>> +    def _handleSingleInteger(line_number, line, output):
>> +        """A parser for INTEGER type keywords"""
>> +        if output is None:
>> +            output = []
>> +        try:
>> +            n = int(line)
>> +        except ValueError:
>> +            raise ValueError("INTEGER parser requires the input to be an integer (got {})".format(line))
>> +        output.append(n)
>> +        return output
>> +
>>      @staticmethod
>>      def _handleBooleanExpr(line_number, line, output):
>>          """A parser for BOOLEAN_EXPR type keywords"""
>> @@ -1331,8 +1349,8 @@ def _handleBooleanExpr(line_number, line, output):
>>  def parseIntegratedTestScript(test, additional_parsers=[],
>>                                require_script=True):
>>      """parseIntegratedTestScript - Scan an LLVM/Clang style integrated test
>> -    script and extract the lines to 'RUN' as well as 'XFAIL' and 'REQUIRES'
>> -    and 'UNSUPPORTED' information.
>> +    script and extract the lines to 'RUN' as well as 'XFAIL', 'REQUIRES',
>> +    'UNSUPPORTED' and 'ALLOW_RETRIES' information.
>>        If additional parsers are specified then the test is also scanned for the
>>      keywords they specify and all matches are passed to the custom parser.
>> @@ -1353,6 +1371,7 @@ def parseIntegratedTestScript(test, additional_parsers=[],
>>                                      initial_value=test.requires),
>>          IntegratedTestKeywordParser('UNSUPPORTED:', ParserKind.BOOLEAN_EXPR,
>>                                      initial_value=test.unsupported),
>> +        IntegratedTestKeywordParser('ALLOW_RETRIES:', ParserKind.INTEGER),
>>          IntegratedTestKeywordParser('END.', ParserKind.TAG)
>>      ]
>>      keyword_parsers = {p.keyword: p for p in builtin_parsers}
>> @@ -1412,6 +1431,14 @@ def parseIntegratedTestScript(test, additional_parsers=[],
>>              "Test does not support the following features "
>>              "and/or targets: %s" % msg)
>>  +    # Handle ALLOW_RETRIES:
>> +    allowed_retries = keyword_parsers['ALLOW_RETRIES:'].getValue()
>> +    if allowed_retries:
>> +        if len(allowed_retries) > 1:
>> +            return lit.Test.Result(Test.UNRESOLVED,
>> +                                   "Test has more than one ALLOW_RETRIES lines")
>> +        test.allowed_retries = allowed_retries[0]
>> +
>>      # Enforce limit_to_features.
>>      if not test.isWithinFeatureLimits():
>>          msg = ', '.join(test.config.limit_to_features)
>> @@ -1477,10 +1504,8 @@ def executeShTest(test, litConfig, useExternalSh,
>>                                               normalize_slashes=useExternalSh)
>>      script = applySubstitutions(script, substitutions)
>>  -    # Re-run failed tests up to test_retry_attempts times.
>> -    attempts = 1
>> -    if hasattr(test.config, 'test_retry_attempts'):
>> -        attempts += test.config.test_retry_attempts
>> +    # Re-run failed tests up to test.allowed_retries times.
>> +    attempts = test.allowed_retries + 1
>>      for i in range(attempts):
>>          res = _runShTest(test, litConfig, useExternalSh, script, tmpBase)
>>          if res.code != Test.FAIL:
>> 
>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/does-not-succeed-within-limit.py b/llvm/utils/lit/tests/Inputs/allow-retries/does-not-succeed-within-limit.py
>> new file mode 100644
>> index 000000000000..05e3f35b6f81
>> --- /dev/null
>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/does-not-succeed-within-limit.py
>> @@ -0,0 +1,3 @@
>> +# ALLOW_RETRIES: 3
>> +
>> +# RUN: false
>> 
>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/lit.cfg b/llvm/utils/lit/tests/Inputs/allow-retries/lit.cfg
>> new file mode 100644
>> index 000000000000..eed69f389ed0
>> --- /dev/null
>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/lit.cfg
>> @@ -0,0 +1,9 @@
>> +import lit.formats
>> +config.name = 'allow-retries'
>> +config.suffixes = ['.py']
>> +config.test_format = lit.formats.ShTest()
>> +config.test_source_root = None
>> +config.test_exec_root = None
>> +
>> +config.substitutions.append(('%python', lit_config.params.get('python', '')))
>> +config.substitutions.append(('%counter', lit_config.params.get('counter', '')))
>> 
>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/more-than-one-allow-retries-lines.py b/llvm/utils/lit/tests/Inputs/allow-retries/more-than-one-allow-retries-lines.py
>> new file mode 100644
>> index 000000000000..14fb6b26661a
>> --- /dev/null
>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/more-than-one-allow-retries-lines.py
>> @@ -0,0 +1,4 @@
>> +# ALLOW_RETRIES: 3
>> +# ALLOW_RETRIES: 5
>> +
>> +# RUN: true
>> 
>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/not-a-valid-integer.py b/llvm/utils/lit/tests/Inputs/allow-retries/not-a-valid-integer.py
>> new file mode 100644
>> index 000000000000..d624de900b7f
>> --- /dev/null
>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/not-a-valid-integer.py
>> @@ -0,0 +1,3 @@
>> +# ALLOW_RETRIES: not-an-integer
>> +
>> +# RUN: true
>> 
>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/succeeds-within-limit.py b/llvm/utils/lit/tests/Inputs/allow-retries/succeeds-within-limit.py
>> new file mode 100644
>> index 000000000000..45ac9433fc7e
>> --- /dev/null
>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/succeeds-within-limit.py
>> @@ -0,0 +1,24 @@
>> +# ALLOW_RETRIES: 5
>> +
>> +# RUN: "%python" "%s" "%counter"
>> +
>> +import sys
>> +import os
>> +
>> +counter_file = sys.argv[1]
>> +
>> +# The first time the test is run, initialize the counter to 1.
>> +if not os.path.exists(counter_file):
>> +    with open(counter_file, 'w') as counter:
>> +        counter.write("1")
>> +
>> +# Succeed if this is the fourth time we're being run.
>> +with open(counter_file, 'r') as counter:
>> +    num = int(counter.read())
>> +    if num == 4:
>> +        sys.exit(0)
>> +
>> +# Otherwise, increment the counter and fail
>> +with open(counter_file, 'w') as counter:
>> +    counter.write(str(num + 1))
>> +    sys.exit(1)
>> 
>> diff  --git a/llvm/utils/lit/tests/Inputs/test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/test_retry_attempts/lit.cfg
>> new file mode 100644
>> index 000000000000..a3b660fbaef3
>> --- /dev/null
>> +++ b/llvm/utils/lit/tests/Inputs/test_retry_attempts/lit.cfg
>> @@ -0,0 +1,10 @@
>> +import lit.formats
>> +config.name = 'test_retry_attempts'
>> +config.suffixes = ['.py']
>> +config.test_format = lit.formats.ShTest()
>> +config.test_source_root = None
>> +config.test_exec_root = None
>> +
>> +config.test_retry_attempts = 5
>> +config.substitutions.append(('%python', lit_config.params.get('python', '')))
>> +config.substitutions.append(('%counter', lit_config.params.get('counter', '')))
>> 
>> diff  --git a/llvm/utils/lit/tests/Inputs/test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/test_retry_attempts/test.py
>> new file mode 100644
>> index 000000000000..ee8a92cc5d8f
>> --- /dev/null
>> +++ b/llvm/utils/lit/tests/Inputs/test_retry_attempts/test.py
>> @@ -0,0 +1,22 @@
>> +# RUN: "%python" "%s" "%counter"
>> +
>> +import sys
>> +import os
>> +
>> +counter_file = sys.argv[1]
>> +
>> +# The first time the test is run, initialize the counter to 1.
>> +if not os.path.exists(counter_file):
>> +    with open(counter_file, 'w') as counter:
>> +        counter.write("1")
>> +
>> +# Succeed if this is the fourth time we're being run.
>> +with open(counter_file, 'r') as counter:
>> +    num = int(counter.read())
>> +    if num == 4:
>> +        sys.exit(0)
>> +
>> +# Otherwise, increment the counter and fail
>> +with open(counter_file, 'w') as counter:
>> +    counter.write(str(num + 1))
>> +    sys.exit(1)
>> 
>> diff  --git a/llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt b/llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt
>> index e28060320a2e..5809af5477ce 100644
>> --- a/llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt
>> +++ b/llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt
>> @@ -13,6 +13,9 @@
>>  // MY_BOOL: b)
>>  // MY_BOOL: d
>>  //
>> +// MY_INT: 4
>> +// MY_INT: 6
>> +//
>>  // MY_BOOL_UNTERMINATED: a \
>>  //
>>  // END.
>> 
>> diff  --git a/llvm/utils/lit/tests/allow-retries.py b/llvm/utils/lit/tests/allow-retries.py
>> new file mode 100644
>> index 000000000000..3f6cf8f1faa5
>> --- /dev/null
>> +++ b/llvm/utils/lit/tests/allow-retries.py
>> @@ -0,0 +1,41 @@
>> +# Check the behavior of the ALLOW_RETRIES keyword.
>> +
>> +# This test uses a file that's stable across retries of the test to fail and
>> +# only succeed the fourth time it is retried.
>> +#
>> +# RUN: rm -f %t.counter
>> +# RUN: %{lit} -j 1 %{inputs}/allow-retries/succeeds-within-limit.py -Dcounter=%t.counter -Dpython=%{python} | FileCheck --check-prefix=CHECK-TEST1 %s
>> +# CHECK-TEST1: Passes With Retry  : 1
>> +
>> +# Test that a per-file ALLOW_RETRIES overwrites the config-wide test_retry_attempts property, if any.
>> +#
>> +# RUN: rm -f %t.counter
>> +# RUN: %{lit} -j 1 %{inputs}/allow-retries/succeeds-within-limit.py -Dtest_retry_attempts=2 -Dcounter=%t.counter -Dpython=%{python} | FileCheck --check-prefix=CHECK-TEST2 %s
>> +# CHECK-TEST2: Passes With Retry  : 1
>> +
>> +# This test does not succeed within the allowed retry limit
>> +#
>> +# RUN: not %{lit} -j 1 %{inputs}/allow-retries/does-not-succeed-within-limit.py | FileCheck --check-prefix=CHECK-TEST3 %s
>> +# CHECK-TEST3: Failing Tests (1):
>> +# CHECK-TEST3: allow-retries :: does-not-succeed-within-limit.py
>> +
>> +# This test should be UNRESOLVED since it has more than one ALLOW_RETRIES
>> +# lines, and that is not allowed.
>> +#
>> +# RUN: not %{lit} -j 1 %{inputs}/allow-retries/more-than-one-allow-retries-lines.py | FileCheck --check-prefix=CHECK-TEST4 %s
>> +# CHECK-TEST4: Unresolved Tests (1):
>> +# CHECK-TEST4: allow-retries :: more-than-one-allow-retries-lines.py
>> +
>> +# This test does not provide a valid integer to the ALLOW_RETRIES keyword.
>> +# It should be unresolved.
>> +#
>> +# RUN: not %{lit} -j 1 %{inputs}/allow-retries/not-a-valid-integer.py | FileCheck --check-prefix=CHECK-TEST5 %s
>> +# CHECK-TEST5: Unresolved Tests (1):
>> +# CHECK-TEST5: allow-retries :: not-a-valid-integer.py
>> +
>> +# This test checks that the config-wide test_retry_attempts property is used
>> +# when no ALLOW_RETRIES keyword is present.
>> +#
>> +# RUN: rm -f %t.counter
>> +# RUN: %{lit} -j 1 %{inputs}/test_retry_attempts/test.py -Dcounter=%t.counter -Dpython=%{python} | FileCheck --check-prefix=CHECK-TEST6 %s
>> +# CHECK-TEST6: Passes With Retry  : 1
>> 
>> diff  --git a/llvm/utils/lit/tests/unit/TestRunner.py b/llvm/utils/lit/tests/unit/TestRunner.py
>> index ceb7bef34f6a..4f33fce64885 100644
>> --- a/llvm/utils/lit/tests/unit/TestRunner.py
>> +++ b/llvm/utils/lit/tests/unit/TestRunner.py
>> @@ -57,6 +57,7 @@ def custom_parse(line_number, line, output):
>>              IntegratedTestKeywordParser("MY_DNE_TAG.", ParserKind.TAG),
>>              IntegratedTestKeywordParser("MY_LIST:", ParserKind.LIST),
>>              IntegratedTestKeywordParser("MY_BOOL:", ParserKind.BOOLEAN_EXPR),
>> +            IntegratedTestKeywordParser("MY_INT:", ParserKind.INTEGER),
>>              IntegratedTestKeywordParser("MY_RUN:", ParserKind.COMMAND),
>>              IntegratedTestKeywordParser("MY_CUSTOM:", ParserKind.CUSTOM,
>>                                          custom_parse),
>> @@ -112,6 +113,17 @@ def test_boolean(self):
>>          self.assertEqual(value[0].strip(), "a && (b)")
>>          self.assertEqual(value[1].strip(), "d")
>>  +    def test_integer(self):
>> +        parsers = self.make_parsers()
>> +        self.parse_test(parsers)
>> +        int_parser = self.get_parser(parsers, 'MY_INT:')
>> +        value = int_parser.getValue()
>> +        self.assertEqual(len(value), 2)  # there are only two MY_INT: lines
>> +        self.assertEqual(type(value[0]), int)
>> +        self.assertEqual(value[0], 4)
>> +        self.assertEqual(type(value[1]), int)
>> +        self.assertEqual(value[1], 6)
>> +
>>      def test_boolean_unterminated(self):
>>          parsers = self.make_parsers() + \
>>              [IntegratedTestKeywordParser("MY_BOOL_UNTERMINATED:", ParserKind.BOOLEAN_EXPR)]
>> 
>> diff  --git a/llvm/utils/vim/syntax/llvm.vim b/llvm/utils/vim/syntax/llvm.vim
>> index 487a37b4b86b..0a661e82d86a 100644
>> --- a/llvm/utils/vim/syntax/llvm.vim
>> +++ b/llvm/utils/vim/syntax/llvm.vim
>> @@ -203,6 +203,7 @@ syn match   llvmConstant /\<DIFlag[A-Za-z]\+\>/
>>  syn match  llvmSpecialComment /;\s*PR\d*\s*$/
>>  syn match  llvmSpecialComment /;\s*REQUIRES:.*$/
>>  syn match  llvmSpecialComment /;\s*RUN:.*$/
>> +syn match  llvmSpecialComment /;\s*ALLOW_RETRIES:.*$/
>>  syn match  llvmSpecialComment /;\s*CHECK:.*$/
>>  syn match  llvmSpecialComment "\v;\s*CHECK-(NEXT|NOT|DAG|SAME|LABEL):.*$"
>>  syn match  llvmSpecialComment /;\s*XFAIL:.*$/
>> 
>> diff  --git a/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml b/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
>> index 9765cee98df8..117ec134d573 100644
>> --- a/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
>> +++ b/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
>> @@ -319,6 +319,8 @@ patterns:
>>      name: string.regexp
>>    - match: ";\\s*RUN:.*$"
>>      name: string.regexp
>> +  - match: ";\\s*ALLOW_RETRIES:.*$"
>> +    name: string.regexp
>>    - match: ";\\s*CHECK:.*$"
>>      name: string.regexp
>>    - match: ";\\s*CHECK-(NEXT|NOT|DAG|SAME|LABEL):.*$"
>> 
>> 
>>         _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits