[llvm] [lit] add --max-retries-per-test execution option (PR #141851)

Fri May 30 06:09:33 PDT 2025

https://github.com/kwk updated https://github.com/llvm/llvm-project/pull/141851

>From 1e070720fa960a7a23787ddca13fb263c971b95c Mon Sep 17 00:00:00 2001
From: Konrad Kleine <kkleine at redhat.com>
Date: Fri, 30 May 2025 10:45:11 +0200
Subject: [PATCH 1/4] [lit] add --max-retries-per-test execution option

When packaging LLVM we've seen arbitrary tests fail.
The failures are sporadic, and most of the time the tests
pass when they are run a second time on the next day.

The tests themselves were always different and we didn't
know ahead of time which ones we wanted to re-run.
That's why we filter out a lot of `libomp` and `libarcher` tests [1].

This change allows us to set
`LIT_OPTS="--max-retries-per-test=12"`
when running any "check-XXX" build target. Each failing lit test
is then re-run at most 12 times, unless its test script contains an
`ALLOW_RETRIES:` line or its test suite sets `config.test_retry_attempts`,
in which case that value applies instead. `12` is just an example here;
any positive integer will work.

Please note that this only adds the ability to re-run
lit tests. Nothing is re-run until the caller passes
`--max-retries-per-test=<POSITIVE_INT>`, either on a call to `lit` or
in `LIT_OPTS`.
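
As a rough illustration of the opt-in behaviour, here is a minimal
standalone sketch of how such an option can be declared with `argparse`.
It is not the lit source: `positive_int` is a hypothetical stand-in for
lit's `_positive_int` helper, and `build_parser` and the `lit-sketch`
program name exist only for this example.

```python
import argparse


def positive_int(text):
    """Hypothetical stand-in for lit's `_positive_int` argument type."""
    value = int(text)
    if value <= 0:
        raise argparse.ArgumentTypeError(
            f"requires positive integer, but found '{text}'"
        )
    return value


def build_parser():
    # Illustration only: a tiny parser that declares just this one option.
    parser = argparse.ArgumentParser(prog="lit-sketch")
    parser.add_argument(
        "--max-retries-per-test",
        dest="maxRetriesPerTest",
        metavar="N",
        type=positive_int,
    )
    return parser


parser = build_parser()
# Without the option the value stays None, so nothing is retried.
print(parser.parse_args([]).maxRetriesPerTest)
# With the option the caller opts in explicitly.
print(parser.parse_args(["--max-retries-per-test=12"]).maxRetriesPerTest)
```

Because no default is given, the attribute is `None` unless the caller
passes the option, which is what lets later code tell "not requested"
apart from any concrete value.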

Also note that `ALLOW_RETRIES:` can still be used in test scripts,
and it always takes precedence over `--max-retries-per-test`. Likewise,
when `--max-retries-per-test` is set too low but
`config.test_retry_attempts` is high enough, the test is still retried
often enough to pass.

Any option in the list below overrules its predecessor:

* `--max-retries-per-test`
* `config.test_retry_attempts`
* `ALLOW_RETRIES` keyword
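
The precedence can be sketched as a small standalone function. This is
an illustration, not the code in this patch: `effective_retries` and its
parameter names are hypothetical, and the fallback of `0` retries when
nothing is set is an assumption about lit's default.

```python
def effective_retries(cli_max_retries=None, config_retry_attempts=None,
                      allow_retries=None):
    """Return the retry budget; each later source overrides the earlier ones.

    Order (lowest to highest priority):
      1. --max-retries-per-test      (cli_max_retries)
      2. config.test_retry_attempts  (config_retry_attempts)
      3. ALLOW_RETRIES: in the test  (allow_retries)
    """
    retries = 0  # assumed default: no retries unless something asks for them
    for value in (cli_max_retries, config_retry_attempts, allow_retries):
        if value is not None:
            retries = value
    return retries


# Only the command-line option is given.
print(effective_retries(cli_max_retries=3))
# ALLOW_RETRIES wins over a lower command-line value.
print(effective_retries(cli_max_retries=2, allow_retries=3))
# config.test_retry_attempts wins over a lower command-line value.
print(effective_retries(cli_max_retries=2, config_retry_attempts=3))
# ALLOW_RETRIES wins over both other sources.
print(effective_retries(cli_max_retries=1, config_retry_attempts=2,
                        allow_retries=3))
```

All four calls yield a budget of 3, which matches the four new test
scenarios: each test needs three retries to pass on its fourth run.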

[1]: https://src.fedoraproject.org/rpms/llvm/blob/rawhide/f/llvm.spec#_2326

Downstream PR to make use of the `--max-retries-per-test` option: https://src.fedoraproject.org/rpms/llvm/pull-request/434
Downstream ticket: https://issues.redhat.com/browse/LLVM-145
---
 llvm/utils/lit/lit/LitConfig.py               |  2 +
 llvm/utils/lit/lit/TestingConfig.py           |  5 ++
 llvm/utils/lit/lit/cl_arguments.py            |  7 +++
 llvm/utils/lit/lit/main.py                    |  1 +
 .../lit.cfg                                   | 10 ++++
 .../test.py                                   | 23 +++++++++
 .../allow-retries-test_retry_attempts/lit.cfg | 12 +++++
 .../allow-retries-test_retry_attempts/test.py | 23 +++++++++
 .../lit.cfg                                   | 10 ++++
 .../test.py                                   | 22 +++++++++
 .../lit.cfg                                   | 12 +++++
 .../test.py                                   | 22 +++++++++
 llvm/utils/lit/tests/allow-retries.py         | 48 +++++++++++++++++++
 13 files changed, 197 insertions(+)
 create mode 100644 llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/lit.cfg
 create mode 100644 llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py
 create mode 100644 llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg
 create mode 100644 llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py
 create mode 100644 llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/lit.cfg
 create mode 100644 llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py
 create mode 100644 llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg
 create mode 100644 llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py

diff --git a/llvm/utils/lit/lit/LitConfig.py b/llvm/utils/lit/lit/LitConfig.py
index 5dc712ae28370..cb4aef6f72a87 100644
--- a/llvm/utils/lit/lit/LitConfig.py
+++ b/llvm/utils/lit/lit/LitConfig.py
@@ -35,6 +35,7 @@ def __init__(
         params,
         config_prefix=None,
         maxIndividualTestTime=0,
+        maxRetriesPerTest=None,
         parallelism_groups={},
         per_test_coverage=False,
         gtest_sharding=True,
@@ -86,6 +87,7 @@ def __init__(
             self.valgrindArgs.extend(self.valgrindUserArgs)
 
         self.maxIndividualTestTime = maxIndividualTestTime
+        self.maxRetriesPerTest = maxRetriesPerTest
         self.parallelism_groups = parallelism_groups
         self.per_test_coverage = per_test_coverage
         self.gtest_sharding = bool(gtest_sharding)
diff --git a/llvm/utils/lit/lit/TestingConfig.py b/llvm/utils/lit/lit/TestingConfig.py
index c063851b89526..c250838250547 100644
--- a/llvm/utils/lit/lit/TestingConfig.py
+++ b/llvm/utils/lit/lit/TestingConfig.py
@@ -235,6 +235,11 @@ def finish(self, litConfig):
             # files. Should we distinguish them?
             self.test_source_root = str(self.test_source_root)
         self.excludes = set(self.excludes)
+        if (
+            litConfig.maxRetriesPerTest is not None
+            and getattr(self, "test_retry_attempts", None) is None
+        ):
+            self.test_retry_attempts = litConfig.maxRetriesPerTest
 
     @property
     def root(self):
diff --git a/llvm/utils/lit/lit/cl_arguments.py b/llvm/utils/lit/lit/cl_arguments.py
index 1d776e0216a1e..30160a7bd5622 100644
--- a/llvm/utils/lit/lit/cl_arguments.py
+++ b/llvm/utils/lit/lit/cl_arguments.py
@@ -199,6 +199,13 @@ def parse_args():
         "0 means no time limit. [Default: 0]",
         type=_non_negative_int,
     )
+    execution_group.add_argument(
+        "--max-retries-per-test",
+        dest="maxRetriesPerTest",
+        metavar="N",
+        help="Maximum number of allowed retry attempts per test (NOTE: ALLOWED_RETRIES keyword always takes precedence)",
+        type=_positive_int,
+    )
     execution_group.add_argument(
         "--max-failures",
         help="Stop execution after the given number of failures.",
diff --git a/llvm/utils/lit/lit/main.py b/llvm/utils/lit/lit/main.py
index ba80330d22400..0939838b78ceb 100755
--- a/llvm/utils/lit/lit/main.py
+++ b/llvm/utils/lit/lit/main.py
@@ -42,6 +42,7 @@ def main(builtin_params={}):
         config_prefix=opts.configPrefix,
         per_test_coverage=opts.per_test_coverage,
         gtest_sharding=opts.gtest_sharding,
+        maxRetriesPerTest=opts.maxRetriesPerTest,
     )
 
     discovered_tests = lit.discovery.find_tests_for_inputs(
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/lit.cfg
new file mode 100644
index 0000000000000..2f16e95dbe196
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/lit.cfg
@@ -0,0 +1,10 @@
+import lit.formats
+
+config.name = "allow-retries-no-test_retry_attempts"
+config.suffixes = [".py"]
+config.test_format = lit.formats.ShTest()
+config.test_source_root = None
+config.test_exec_root = None
+
+config.substitutions.append(("%python", lit_config.params.get("python", "")))
+config.substitutions.append(("%counter", lit_config.params.get("counter", "")))
\ No newline at end of file
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py
new file mode 100644
index 0000000000000..f2333a7de455a
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py
@@ -0,0 +1,23 @@
+# ALLOW_RETRIES: 3
+# RUN: "%python" "%s" "%counter"
+
+import sys
+import os
+
+counter_file = sys.argv[1]
+
+# The first time the test is run, initialize the counter to 1.
+if not os.path.exists(counter_file):
+    with open(counter_file, "w") as counter:
+        counter.write("1")
+
+# Succeed if this is the fourth time we're being run.
+with open(counter_file, "r") as counter:
+    num = int(counter.read())
+    if num == 4:
+        sys.exit(0)
+
+# Otherwise, increment the counter and fail
+with open(counter_file, "w") as counter:
+    counter.write(str(num + 1))
+    sys.exit(1)
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg
new file mode 100644
index 0000000000000..97e4edb2dfded
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg
@@ -0,0 +1,12 @@
+import lit.formats
+
+config.name = "allow-retries-no-test_retry_attempts"
+config.suffixes = [".py"]
+config.test_format = lit.formats.ShTest()
+config.test_source_root = None
+config.test_exec_root = None
+
+config.substitutions.append(("%python", lit_config.params.get("python", "")))
+config.substitutions.append(("%counter", lit_config.params.get("counter", "")))
+
+config.test_retry_attempts = 2
\ No newline at end of file
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py
new file mode 100644
index 0000000000000..f2333a7de455a
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py
@@ -0,0 +1,23 @@
+# ALLOW_RETRIES: 3
+# RUN: "%python" "%s" "%counter"
+
+import sys
+import os
+
+counter_file = sys.argv[1]
+
+# The first time the test is run, initialize the counter to 1.
+if not os.path.exists(counter_file):
+    with open(counter_file, "w") as counter:
+        counter.write("1")
+
+# Succeed if this is the fourth time we're being run.
+with open(counter_file, "r") as counter:
+    num = int(counter.read())
+    if num == 4:
+        sys.exit(0)
+
+# Otherwise, increment the counter and fail
+with open(counter_file, "w") as counter:
+    counter.write(str(num + 1))
+    sys.exit(1)
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/lit.cfg
new file mode 100644
index 0000000000000..f851ba08002a0
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/lit.cfg
@@ -0,0 +1,10 @@
+import lit.formats
+
+config.name = "no-allow-retries-no-test_retry_attempts"
+config.suffixes = [".py"]
+config.test_format = lit.formats.ShTest()
+config.test_source_root = None
+config.test_exec_root = None
+
+config.substitutions.append(("%python", lit_config.params.get("python", "")))
+config.substitutions.append(("%counter", lit_config.params.get("counter", "")))
\ No newline at end of file
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py
new file mode 100644
index 0000000000000..a139976cc49ec
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py
@@ -0,0 +1,22 @@
+# RUN: "%python" "%s" "%counter"
+
+import sys
+import os
+
+counter_file = sys.argv[1]
+
+# The first time the test is run, initialize the counter to 1.
+if not os.path.exists(counter_file):
+    with open(counter_file, "w") as counter:
+        counter.write("1")
+
+# Succeed if this is the fourth time we're being run.
+with open(counter_file, "r") as counter:
+    num = int(counter.read())
+    if num == 4:
+        sys.exit(0)
+
+# Otherwise, increment the counter and fail
+with open(counter_file, "w") as counter:
+    counter.write(str(num + 1))
+    sys.exit(1)
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg
new file mode 100644
index 0000000000000..2279d293849a8
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg
@@ -0,0 +1,12 @@
+import lit.formats
+
+config.name = "no-allow-retries-test_retry_attempts"
+config.suffixes = [".py"]
+config.test_format = lit.formats.ShTest()
+config.test_source_root = None
+config.test_exec_root = None
+
+config.substitutions.append(("%python", lit_config.params.get("python", "")))
+config.substitutions.append(("%counter", lit_config.params.get("counter", "")))
+
+config.test_retry_attempts = 3
\ No newline at end of file
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py
new file mode 100644
index 0000000000000..a139976cc49ec
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py
@@ -0,0 +1,22 @@
+# RUN: "%python" "%s" "%counter"
+
+import sys
+import os
+
+counter_file = sys.argv[1]
+
+# The first time the test is run, initialize the counter to 1.
+if not os.path.exists(counter_file):
+    with open(counter_file, "w") as counter:
+        counter.write("1")
+
+# Succeed if this is the fourth time we're being run.
+with open(counter_file, "r") as counter:
+    num = int(counter.read())
+    if num == 4:
+        sys.exit(0)
+
+# Otherwise, increment the counter and fail
+with open(counter_file, "w") as counter:
+    counter.write(str(num + 1))
+    sys.exit(1)
diff --git a/llvm/utils/lit/tests/allow-retries.py b/llvm/utils/lit/tests/allow-retries.py
index 45610fb70d348..ba95d34a09a4c 100644
--- a/llvm/utils/lit/tests/allow-retries.py
+++ b/llvm/utils/lit/tests/allow-retries.py
@@ -70,3 +70,51 @@
 #     CHECK-TEST7: # executed command: export LLVM_PROFILE_FILE=
 # CHECK-TEST7-NOT: # executed command: export LLVM_PROFILE_FILE=
 #     CHECK-TEST7: Passed With Retry: 1
+
+# This test only passes on the 4th try. Here we check that a test can be re-run when:
+#  * The "--max-retries-per-test" is specified high enough (3).
+#  * No ALLOW_RETRIES keyword is used in the test script.
+#  * No config.test_retry_attempts is adjusted in the test suite config file.
+# RUN: rm -f %t.counter
+# RUN: %{lit} %{inputs}/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py \
+# RUN:   --max-retries-per-test=3 \
+# RUN:   -Dcounter=%t.counter \
+# RUN:   -Dpython=%{python} \
+# RUN: | FileCheck --check-prefix=CHECK-TEST8 %s
+# CHECK-TEST8: Passed With Retry: 1
+
+# This test only passes on the 4th try. Here we check that a test can be re-run when:
+#  * The "--max-retries-per-test" is specified too low (2).
+#  * ALLOW_RETRIES is specified high enough (3)
+#  * No config.test_retry_attempts is adjusted in the test suite config file.
+# RUN: rm -f %t.counter
+# RUN: %{lit} %{inputs}/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py \
+# RUN:   --max-retries-per-test=2 \
+# RUN:   -Dcounter=%t.counter \
+# RUN:   -Dpython=%{python} \
+# RUN: | FileCheck --check-prefix=CHECK-TEST9 %s
+# CHECK-TEST9: Passed With Retry: 1
+
+# This test only passes on the 4th try. Here we check that a test can be re-run when:
+#  * The "--max-retries-per-test" is specified too low (2).
+#  * No ALLOW_RETRIES keyword is used in the test script.
+#  * config.test_retry_attempts is set high enough (3).
+# RUN: rm -f %t.counter
+# RUN: %{lit} %{inputs}/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py \
+# RUN:   --max-retries-per-test=2 \
+# RUN:   -Dcounter=%t.counter \
+# RUN:   -Dpython=%{python} \
+# RUN: | FileCheck --check-prefix=CHECK-TEST10 %s
+# CHECK-TEST10: Passed With Retry: 1
+
+# This test only passes on the 4th try. Here we check that a test can be re-run when:
+#  * The "--max-retries-per-test" is specified too low (1).
+#  * ALLOW_RETRIES keyword set high enough (3).
+#  * config.test_retry_attempts is set too low (2).
+# RUN: rm -f %t.counter
+# RUN: %{lit} %{inputs}/max-retries-per-test/allow-retries-test_retry_attempts/test.py \
+# RUN:   --max-retries-per-test=1 \
+# RUN:   -Dcounter=%t.counter \
+# RUN:   -Dpython=%{python} \
+# RUN: | FileCheck --check-prefix=CHECK-TEST11 %s
+# CHECK-TEST11: Passed With Retry: 1

>From 42ed87fb263c27e37f2c5d9bbe2e81adcbe645a4 Mon Sep 17 00:00:00 2001
From: Konrad Kleine <kkleine at redhat.com>
Date: Fri, 30 May 2025 12:43:19 +0000
Subject: [PATCH 2/4] Fix test suite name

---
 .../allow-retries-test_retry_attempts/lit.cfg                   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg
index 97e4edb2dfded..2260e2dce838e 100644
--- a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg
@@ -1,6 +1,6 @@
 import lit.formats
 
-config.name = "allow-retries-no-test_retry_attempts"
+config.name = "allow-retries-test_retry_attempts"
 config.suffixes = [".py"]
 config.test_format = lit.formats.ShTest()
 config.test_source_root = None

>From fa4abfd4ac40a3c6c97c6d4162b772099a781fd0 Mon Sep 17 00:00:00 2001
From: Konrad Kleine <kkleine at redhat.com>
Date: Fri, 30 May 2025 12:46:11 +0000
Subject: [PATCH 3/4] Adjust help text of --max-retries-per-test

---
 llvm/utils/lit/lit/cl_arguments.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/llvm/utils/lit/lit/cl_arguments.py b/llvm/utils/lit/lit/cl_arguments.py
index 30160a7bd5622..3292554ab5ff7 100644
--- a/llvm/utils/lit/lit/cl_arguments.py
+++ b/llvm/utils/lit/lit/cl_arguments.py
@@ -203,7 +203,9 @@ def parse_args():
         "--max-retries-per-test",
         dest="maxRetriesPerTest",
         metavar="N",
-        help="Maximum number of allowed retry attempts per test (NOTE: ALLOWED_RETRIES keyword always takes precedence)",
+        help="Maximum number of allowed retry attempts per test "
+        "(NOTE: The config.test_retry_attempts test suite option and "
+        "ALLOW_RETRIES keyword always take precedence)",
         type=_positive_int,
     )
     execution_group.add_argument(

>From 009ce5e312b87e37ed9ef36a7ec4f15082e1daac Mon Sep 17 00:00:00 2001
From: Konrad Kleine <kkleine at redhat.com>
Date: Fri, 30 May 2025 13:08:26 +0000
Subject: [PATCH 4/4] Update command guide

---
 llvm/docs/CommandGuide/lit.rst | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/llvm/docs/CommandGuide/lit.rst b/llvm/docs/CommandGuide/lit.rst
index 2a0ddd0ea04b4..86879f870e06e 100644
--- a/llvm/docs/CommandGuide/lit.rst
+++ b/llvm/docs/CommandGuide/lit.rst
@@ -218,6 +218,19 @@ EXECUTION OPTIONS
 
  Stop execution after the given number of failures.
 
+.. option:: --max-retries-per-test N
+
+ Retry running failed tests at most ``N`` times.
+ Out of the following options to rerun failed tests the
+ :option:`--max-retries-per-test` is the only one that doesn't
+ require a change in the test scripts or the test config:
+
+  * :option:`--max-retries-per-test` lit option
+  * ``config.test_retry_attempts`` test suite option
+  * ``ALLOW_RETRIES:`` annotation in test script
+
+ Any option in the list above overrules its predecessor.
+
 .. option:: --allow-empty-runs
 
  Do not fail the run if all tests are filtered out.