[llvm] f951b0f - [lit] Add builtin support for flaky tests in lit

Thu Mar 19 10:15:38 PDT 2020

I agree that first class support is better than hacky support.  I just 
don't like the need.  :)

Your explanation makes sense to me.  Objection dropped.

Philip

On 3/18/20 8:46 PM, Louis Dionne wrote:
> Libc++ has some tests that are timing-dependent, like `std::future::wait_for`. Those are inherently dependent on timing, and pretty much the only way to make them non-dependent is to mock the system clock (not easy to do). If someone knows other ways to improve the situation, I’m eager to learn (really!).
>
> We already reimplement that feature in libc++’s complicated test format, and it looks like at least 2 other projects had similar needs since they added `config.test_retry_attempts` (I could find that used twice in LLVM).
>
> I have no more love for flaky tests than anyone else, and I agree they’re a huge problem. But when you’re stuck with one, now, at least it’s easy to make it not break every other build without a custom test format (or making every test flaky, which is what config.test_retry_attempts does).
>
> Louis
>
>> On Mar 18, 2020, at 19:03, Philip Reames <listmail at philipreames.com> wrote:
>>
>> Er, I'm really not sure this is a good idea.  Can you explain why we have tests which are *fundamentally* flaky and can't simply be fixed to be non-flaky?
>>
>> Philip
>>
>> On 3/18/20 3:04 PM, Louis Dionne via llvm-commits wrote:
>>> Author: Louis Dionne
>>> Date: 2020-03-18T18:04:01-04:00
>>> New Revision: f951b0f82dff97f91d9b89c67886740acf3373cd
>>>
>>> URL: https://github.com/llvm/llvm-project/commit/f951b0f82dff97f91d9b89c67886740acf3373cd
>>> DIFF: https://github.com/llvm/llvm-project/commit/f951b0f82dff97f91d9b89c67886740acf3373cd.diff
>>>
>>> LOG: [lit] Add builtin support for flaky tests in lit
>>>
>>> This commit adds a new keyword in lit called ALLOW_RETRIES. This keyword
>>> takes a single integer as an argument, and it allows the test to fail that
>>> number of times before it first succeeds.
>>>
>>> This work attempts to make the existing test_retry_attempts more flexible
>>> by allowing by-test customization, as well as eliminate libc++'s FLAKY_TEST
>>> custom logic.
>>>
>>> Differential Revision: https://reviews.llvm.org/D76288
>>>
>>> Added:
>>>      llvm/utils/lit/tests/Inputs/allow-retries/does-not-succeed-within-limit.py
>>>      llvm/utils/lit/tests/Inputs/allow-retries/lit.cfg
>>>      llvm/utils/lit/tests/Inputs/allow-retries/more-than-one-allow-retries-lines.py
>>>      llvm/utils/lit/tests/Inputs/allow-retries/not-a-valid-integer.py
>>>      llvm/utils/lit/tests/Inputs/allow-retries/succeeds-within-limit.py
>>>      llvm/utils/lit/tests/Inputs/test_retry_attempts/lit.cfg
>>>      llvm/utils/lit/tests/Inputs/test_retry_attempts/test.py
>>>      llvm/utils/lit/tests/allow-retries.py
>>>
>>> Modified:
>>>      llvm/utils/lit/lit/Test.py
>>>      llvm/utils/lit/lit/TestRunner.py
>>>      llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt
>>>      llvm/utils/lit/tests/unit/TestRunner.py
>>>      llvm/utils/vim/syntax/llvm.vim
>>>      llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
>>>
>>> Removed:
>>>      
>>>
>>> ################################################################################
>>> diff  --git a/llvm/utils/lit/lit/Test.py b/llvm/utils/lit/lit/Test.py
>>> index 62f1bbf1f03a..000bcf8fc38f 100644
>>> --- a/llvm/utils/lit/lit/Test.py
>>> +++ b/llvm/utils/lit/lit/Test.py
>>> @@ -220,6 +220,10 @@ def __init__(self, suite, path_in_suite, config, file_path = None):
>>>           # triple parts. All of them must be False for the test to run.
>>>           self.unsupported = []
>>>   +        # An optional number of retries allowed before the test finally succeeds.
>>> +        # The test is run at most once plus the number of retries specified here.
>>> +        self.allowed_retries = getattr(config, 'test_retry_attempts', 0)
>>> +
>>>           # The test result, once complete.
>>>           self.result = None
>>>   
>>> diff  --git a/llvm/utils/lit/lit/TestRunner.py b/llvm/utils/lit/lit/TestRunner.py
>>> index 96411c98ee33..a9518b2b5a0b 100644
>>> --- a/llvm/utils/lit/lit/TestRunner.py
>>> +++ b/llvm/utils/lit/lit/TestRunner.py
>>> @@ -1182,13 +1182,15 @@ class ParserKind(object):
>>>       LIST: A keyword taking a comma-separated list of values.
>>>       BOOLEAN_EXPR: A keyword taking a comma-separated list of
>>>           boolean expressions. Ex 'XFAIL:'
>>> +    INTEGER: A keyword taking a single integer. Ex 'ALLOW_RETRIES:'
>>>       CUSTOM: A keyword with custom parsing semantics.
>>>       """
>>>       TAG = 0
>>>       COMMAND = 1
>>>       LIST = 2
>>>       BOOLEAN_EXPR = 3
>>> -    CUSTOM = 4
>>> +    INTEGER = 4
>>> +    CUSTOM = 5
>>>         @staticmethod
>>>       def allowedKeywordSuffixes(value):
>>> @@ -1196,6 +1198,7 @@ def allowedKeywordSuffixes(value):
>>>                    ParserKind.COMMAND:      [':'],
>>>                    ParserKind.LIST:         [':'],
>>>                    ParserKind.BOOLEAN_EXPR: [':'],
>>> +                 ParserKind.INTEGER:      [':'],
>>>                    ParserKind.CUSTOM:       [':', '.']
>>>                  } [value]
>>>   @@ -1205,6 +1208,7 @@ def str(value):
>>>                    ParserKind.COMMAND:      'COMMAND',
>>>                    ParserKind.LIST:         'LIST',
>>>                    ParserKind.BOOLEAN_EXPR: 'BOOLEAN_EXPR',
>>> +                 ParserKind.INTEGER:      'INTEGER',
>>>                    ParserKind.CUSTOM:       'CUSTOM'
>>>                  } [value]
>>>   @@ -1247,6 +1251,8 @@ def __init__(self, keyword, kind, parser=None, initial_value=None):
>>>               self.parser = self._handleList
>>>           elif kind == ParserKind.BOOLEAN_EXPR:
>>>               self.parser = self._handleBooleanExpr
>>> +        elif kind == ParserKind.INTEGER:
>>> +            self.parser = self._handleSingleInteger
>>>           elif kind == ParserKind.TAG:
>>>               self.parser = self._handleTag
>>>           elif kind == ParserKind.CUSTOM:
>>> @@ -1311,6 +1317,18 @@ def _handleList(line_number, line, output):
>>>           output.extend([s.strip() for s in line.split(',')])
>>>           return output
>>>   +    @staticmethod
>>> +    def _handleSingleInteger(line_number, line, output):
>>> +        """A parser for INTEGER type keywords"""
>>> +        if output is None:
>>> +            output = []
>>> +        try:
>>> +            n = int(line)
>>> +        except ValueError:
>>> +            raise ValueError("INTEGER parser requires the input to be an integer (got {})".format(line))
>>> +        output.append(n)
>>> +        return output
>>> +
>>>       @staticmethod
>>>       def _handleBooleanExpr(line_number, line, output):
>>>           """A parser for BOOLEAN_EXPR type keywords"""
>>> @@ -1331,8 +1349,8 @@ def _handleBooleanExpr(line_number, line, output):
>>>   def parseIntegratedTestScript(test, additional_parsers=[],
>>>                                 require_script=True):
>>>       """parseIntegratedTestScript - Scan an LLVM/Clang style integrated test
>>> -    script and extract the lines to 'RUN' as well as 'XFAIL' and 'REQUIRES'
>>> -    and 'UNSUPPORTED' information.
>>> +    script and extract the lines to 'RUN' as well as 'XFAIL', 'REQUIRES',
>>> +    'UNSUPPORTED' and 'ALLOW_RETRIES' information.
>>>         If additional parsers are specified then the test is also scanned for the
>>>       keywords they specify and all matches are passed to the custom parser.
>>> @@ -1353,6 +1371,7 @@ def parseIntegratedTestScript(test, additional_parsers=[],
>>>                                       initial_value=test.requires),
>>>           IntegratedTestKeywordParser('UNSUPPORTED:', ParserKind.BOOLEAN_EXPR,
>>>                                       initial_value=test.unsupported),
>>> +        IntegratedTestKeywordParser('ALLOW_RETRIES:', ParserKind.INTEGER),
>>>           IntegratedTestKeywordParser('END.', ParserKind.TAG)
>>>       ]
>>>       keyword_parsers = {p.keyword: p for p in builtin_parsers}
>>> @@ -1412,6 +1431,14 @@ def parseIntegratedTestScript(test, additional_parsers=[],
>>>               "Test does not support the following features "
>>>               "and/or targets: %s" % msg)
>>>   +    # Handle ALLOW_RETRIES:
>>> +    allowed_retries = keyword_parsers['ALLOW_RETRIES:'].getValue()
>>> +    if allowed_retries:
>>> +        if len(allowed_retries) > 1:
>>> +            return lit.Test.Result(Test.UNRESOLVED,
>>> +                                   "Test has more than one ALLOW_RETRIES lines")
>>> +        test.allowed_retries = allowed_retries[0]
>>> +
>>>       # Enforce limit_to_features.
>>>       if not test.isWithinFeatureLimits():
>>>           msg = ', '.join(test.config.limit_to_features)
>>> @@ -1477,10 +1504,8 @@ def executeShTest(test, litConfig, useExternalSh,
>>>                                                normalize_slashes=useExternalSh)
>>>       script = applySubstitutions(script, substitutions)
>>>   -    # Re-run failed tests up to test_retry_attempts times.
>>> -    attempts = 1
>>> -    if hasattr(test.config, 'test_retry_attempts'):
>>> -        attempts += test.config.test_retry_attempts
>>> +    # Re-run failed tests up to test.allowed_retries times.
>>> +    attempts = test.allowed_retries + 1
>>>       for i in range(attempts):
>>>           res = _runShTest(test, litConfig, useExternalSh, script, tmpBase)
>>>           if res.code != Test.FAIL:
>>>
>>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/does-not-succeed-within-limit.py b/llvm/utils/lit/tests/Inputs/allow-retries/does-not-succeed-within-limit.py
>>> new file mode 100644
>>> index 000000000000..05e3f35b6f81
>>> --- /dev/null
>>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/does-not-succeed-within-limit.py
>>> @@ -0,0 +1,3 @@
>>> +# ALLOW_RETRIES: 3
>>> +
>>> +# RUN: false
>>>
>>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/lit.cfg b/llvm/utils/lit/tests/Inputs/allow-retries/lit.cfg
>>> new file mode 100644
>>> index 000000000000..eed69f389ed0
>>> --- /dev/null
>>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/lit.cfg
>>> @@ -0,0 +1,9 @@
>>> +import lit.formats
>>> +config.name = 'allow-retries'
>>> +config.suffixes = ['.py']
>>> +config.test_format = lit.formats.ShTest()
>>> +config.test_source_root = None
>>> +config.test_exec_root = None
>>> +
>>> +config.substitutions.append(('%python', lit_config.params.get('python', '')))
>>> +config.substitutions.append(('%counter', lit_config.params.get('counter', '')))
>>>
>>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/more-than-one-allow-retries-lines.py b/llvm/utils/lit/tests/Inputs/allow-retries/more-than-one-allow-retries-lines.py
>>> new file mode 100644
>>> index 000000000000..14fb6b26661a
>>> --- /dev/null
>>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/more-than-one-allow-retries-lines.py
>>> @@ -0,0 +1,4 @@
>>> +# ALLOW_RETRIES: 3
>>> +# ALLOW_RETRIES: 5
>>> +
>>> +# RUN: true
>>>
>>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/not-a-valid-integer.py b/llvm/utils/lit/tests/Inputs/allow-retries/not-a-valid-integer.py
>>> new file mode 100644
>>> index 000000000000..d624de900b7f
>>> --- /dev/null
>>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/not-a-valid-integer.py
>>> @@ -0,0 +1,3 @@
>>> +# ALLOW_RETRIES: not-an-integer
>>> +
>>> +# RUN: true
>>>
>>> diff  --git a/llvm/utils/lit/tests/Inputs/allow-retries/succeeds-within-limit.py b/llvm/utils/lit/tests/Inputs/allow-retries/succeeds-within-limit.py
>>> new file mode 100644
>>> index 000000000000..45ac9433fc7e
>>> --- /dev/null
>>> +++ b/llvm/utils/lit/tests/Inputs/allow-retries/succeeds-within-limit.py
>>> @@ -0,0 +1,24 @@
>>> +# ALLOW_RETRIES: 5
>>> +
>>> +# RUN: "%python" "%s" "%counter"
>>> +
>>> +import sys
>>> +import os
>>> +
>>> +counter_file = sys.argv[1]
>>> +
>>> +# The first time the test is run, initialize the counter to 1.
>>> +if not os.path.exists(counter_file):
>>> +    with open(counter_file, 'w') as counter:
>>> +        counter.write("1")
>>> +
>>> +# Succeed if this is the fourth time we're being run.
>>> +with open(counter_file, 'r') as counter:
>>> +    num = int(counter.read())
>>> +    if num == 4:
>>> +        sys.exit(0)
>>> +
>>> +# Otherwise, increment the counter and fail
>>> +with open(counter_file, 'w') as counter:
>>> +    counter.write(str(num + 1))
>>> +    sys.exit(1)
>>>
>>> diff  --git a/llvm/utils/lit/tests/Inputs/test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/test_retry_attempts/lit.cfg
>>> new file mode 100644
>>> index 000000000000..a3b660fbaef3
>>> --- /dev/null
>>> +++ b/llvm/utils/lit/tests/Inputs/test_retry_attempts/lit.cfg
>>> @@ -0,0 +1,10 @@
>>> +import lit.formats
>>> +config.name = 'test_retry_attempts'
>>> +config.suffixes = ['.py']
>>> +config.test_format = lit.formats.ShTest()
>>> +config.test_source_root = None
>>> +config.test_exec_root = None
>>> +
>>> +config.test_retry_attempts = 5
>>> +config.substitutions.append(('%python', lit_config.params.get('python', '')))
>>> +config.substitutions.append(('%counter', lit_config.params.get('counter', '')))
>>>
>>> diff  --git a/llvm/utils/lit/tests/Inputs/test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/test_retry_attempts/test.py
>>> new file mode 100644
>>> index 000000000000..ee8a92cc5d8f
>>> --- /dev/null
>>> +++ b/llvm/utils/lit/tests/Inputs/test_retry_attempts/test.py
>>> @@ -0,0 +1,22 @@
>>> +# RUN: "%python" "%s" "%counter"
>>> +
>>> +import sys
>>> +import os
>>> +
>>> +counter_file = sys.argv[1]
>>> +
>>> +# The first time the test is run, initialize the counter to 1.
>>> +if not os.path.exists(counter_file):
>>> +    with open(counter_file, 'w') as counter:
>>> +        counter.write("1")
>>> +
>>> +# Succeed if this is the fourth time we're being run.
>>> +with open(counter_file, 'r') as counter:
>>> +    num = int(counter.read())
>>> +    if num == 4:
>>> +        sys.exit(0)
>>> +
>>> +# Otherwise, increment the counter and fail
>>> +with open(counter_file, 'w') as counter:
>>> +    counter.write(str(num + 1))
>>> +    sys.exit(1)
>>>
>>> diff  --git a/llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt b/llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt
>>> index e28060320a2e..5809af5477ce 100644
>>> --- a/llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt
>>> +++ b/llvm/utils/lit/tests/Inputs/testrunner-custom-parsers/test.txt
>>> @@ -13,6 +13,9 @@
>>>   // MY_BOOL: b)
>>>   // MY_BOOL: d
>>>   //
>>> +// MY_INT: 4
>>> +// MY_INT: 6
>>> +//
>>>   // MY_BOOL_UNTERMINATED: a \
>>>   //
>>>   // END.
>>>
>>> diff  --git a/llvm/utils/lit/tests/allow-retries.py b/llvm/utils/lit/tests/allow-retries.py
>>> new file mode 100644
>>> index 000000000000..3f6cf8f1faa5
>>> --- /dev/null
>>> +++ b/llvm/utils/lit/tests/allow-retries.py
>>> @@ -0,0 +1,41 @@
>>> +# Check the behavior of the ALLOW_RETRIES keyword.
>>> +
>>> +# This test uses a file that's stable across retries of the test to fail and
>>> +# only succeed the fourth time it is retried.
>>> +#
>>> +# RUN: rm -f %t.counter
>>> +# RUN: %{lit} -j 1 %{inputs}/allow-retries/succeeds-within-limit.py -Dcounter=%t.counter -Dpython=%{python} | FileCheck --check-prefix=CHECK-TEST1 %s
>>> +# CHECK-TEST1: Passes With Retry  : 1
>>> +
>>> +# Test that a per-file ALLOW_RETRIES overwrites the config-wide test_retry_attempts property, if any.
>>> +#
>>> +# RUN: rm -f %t.counter
>>> +# RUN: %{lit} -j 1 %{inputs}/allow-retries/succeeds-within-limit.py -Dtest_retry_attempts=2 -Dcounter=%t.counter -Dpython=%{python} | FileCheck --check-prefix=CHECK-TEST2 %s
>>> +# CHECK-TEST2: Passes With Retry  : 1
>>> +
>>> +# This test does not succeed within the allowed retry limit
>>> +#
>>> +# RUN: not %{lit} -j 1 %{inputs}/allow-retries/does-not-succeed-within-limit.py | FileCheck --check-prefix=CHECK-TEST3 %s
>>> +# CHECK-TEST3: Failing Tests (1):
>>> +# CHECK-TEST3: allow-retries :: does-not-succeed-within-limit.py
>>> +
>>> +# This test should be UNRESOLVED since it has more than one ALLOW_RETRIES
>>> +# lines, and that is not allowed.
>>> +#
>>> +# RUN: not %{lit} -j 1 %{inputs}/allow-retries/more-than-one-allow-retries-lines.py | FileCheck --check-prefix=CHECK-TEST4 %s
>>> +# CHECK-TEST4: Unresolved Tests (1):
>>> +# CHECK-TEST4: allow-retries :: more-than-one-allow-retries-lines.py
>>> +
>>> +# This test does not provide a valid integer to the ALLOW_RETRIES keyword.
>>> +# It should be unresolved.
>>> +#
>>> +# RUN: not %{lit} -j 1 %{inputs}/allow-retries/not-a-valid-integer.py | FileCheck --check-prefix=CHECK-TEST5 %s
>>> +# CHECK-TEST5: Unresolved Tests (1):
>>> +# CHECK-TEST5: allow-retries :: not-a-valid-integer.py
>>> +
>>> +# This test checks that the config-wide test_retry_attempts property is used
>>> +# when no ALLOW_RETRIES keyword is present.
>>> +#
>>> +# RUN: rm -f %t.counter
>>> +# RUN: %{lit} -j 1 %{inputs}/test_retry_attempts/test.py -Dcounter=%t.counter -Dpython=%{python} | FileCheck --check-prefix=CHECK-TEST6 %s
>>> +# CHECK-TEST6: Passes With Retry  : 1
>>>
>>> diff  --git a/llvm/utils/lit/tests/unit/TestRunner.py b/llvm/utils/lit/tests/unit/TestRunner.py
>>> index ceb7bef34f6a..4f33fce64885 100644
>>> --- a/llvm/utils/lit/tests/unit/TestRunner.py
>>> +++ b/llvm/utils/lit/tests/unit/TestRunner.py
>>> @@ -57,6 +57,7 @@ def custom_parse(line_number, line, output):
>>>               IntegratedTestKeywordParser("MY_DNE_TAG.", ParserKind.TAG),
>>>               IntegratedTestKeywordParser("MY_LIST:", ParserKind.LIST),
>>>               IntegratedTestKeywordParser("MY_BOOL:", ParserKind.BOOLEAN_EXPR),
>>> +            IntegratedTestKeywordParser("MY_INT:", ParserKind.INTEGER),
>>>               IntegratedTestKeywordParser("MY_RUN:", ParserKind.COMMAND),
>>>               IntegratedTestKeywordParser("MY_CUSTOM:", ParserKind.CUSTOM,
>>>                                           custom_parse),
>>> @@ -112,6 +113,17 @@ def test_boolean(self):
>>>           self.assertEqual(value[0].strip(), "a && (b)")
>>>           self.assertEqual(value[1].strip(), "d")
>>>   +    def test_integer(self):
>>> +        parsers = self.make_parsers()
>>> +        self.parse_test(parsers)
>>> +        int_parser = self.get_parser(parsers, 'MY_INT:')
>>> +        value = int_parser.getValue()
>>> +        self.assertEqual(len(value), 2)  # there are only two MY_INT: lines
>>> +        self.assertEqual(type(value[0]), int)
>>> +        self.assertEqual(value[0], 4)
>>> +        self.assertEqual(type(value[1]), int)
>>> +        self.assertEqual(value[1], 6)
>>> +
>>>       def test_boolean_unterminated(self):
>>>           parsers = self.make_parsers() + \
>>>               [IntegratedTestKeywordParser("MY_BOOL_UNTERMINATED:", ParserKind.BOOLEAN_EXPR)]
>>>
>>> diff  --git a/llvm/utils/vim/syntax/llvm.vim b/llvm/utils/vim/syntax/llvm.vim
>>> index 487a37b4b86b..0a661e82d86a 100644
>>> --- a/llvm/utils/vim/syntax/llvm.vim
>>> +++ b/llvm/utils/vim/syntax/llvm.vim
>>> @@ -203,6 +203,7 @@ syn match   llvmConstant /\<DIFlag[A-Za-z]\+\>/
>>>   syn match  llvmSpecialComment /;\s*PR\d*\s*$/
>>>   syn match  llvmSpecialComment /;\s*REQUIRES:.*$/
>>>   syn match  llvmSpecialComment /;\s*RUN:.*$/
>>> +syn match  llvmSpecialComment /;\s*ALLOW_RETRIES:.*$/
>>>   syn match  llvmSpecialComment /;\s*CHECK:.*$/
>>>   syn match  llvmSpecialComment "\v;\s*CHECK-(NEXT|NOT|DAG|SAME|LABEL):.*$"
>>>   syn match  llvmSpecialComment /;\s*XFAIL:.*$/
>>>
>>> diff  --git a/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml b/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
>>> index 9765cee98df8..117ec134d573 100644
>>> --- a/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
>>> +++ b/llvm/utils/vscode/llvm/syntaxes/ll.tmLanguage.yaml
>>> @@ -319,6 +319,8 @@ patterns:
>>>       name: string.regexp
>>>     - match: ";\\s*RUN:.*$"
>>>       name: string.regexp
>>> +  - match: ";\\s*ALLOW_RETRIES:.*$"
>>> +    name: string.regexp
>>>     - match: ";\\s*CHECK:.*$"
>>>       name: string.regexp
>>>     - match: ";\\s*CHECK-(NEXT|NOT|DAG|SAME|LABEL):.*$"
>>>
>>>
>>>          _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits