[llvm] [llvm-exegesis] Timeout if subprocess executor hangs (PR #132861)
Stephen Huan via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 24 19:22:54 PDT 2025
https://github.com/stephen-huan created https://github.com/llvm/llvm-project/pull/132861
The call to `llvm_exegesis_exe` in `lit.local.cfg` hangs nondeterministically, which blocks LLVM's tests even before the progress bar `-- Testing: 57793 tests, 28 workers --` appears. Add a timeout to the call so that a hang no longer stalls the test run. Reproduction:
```bash
cmake -S llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=Debug
ninja -C build check-all # hangs, sent ^C
```
```python
^Cllvm-lit: .../llvm-project/llvm/utils/lit/lit/TestingConfig.py:156: fatal: unable to parse config file '.../llvm-project/llvm/test/tools/llvm-exegesis/lit.local.cfg', traceback: Traceback (most recent call last):
  File ".../llvm-project/llvm/utils/lit/lit/TestingConfig.py", line 144, in load_from_path
    exec(compile(data, path, "exec"), cfg_globals, None)
  File ".../llvm-project/llvm/test/tools/llvm-exegesis/lit.local.cfg", line 46, in <module>
    if can_use_perf_counters(
       ^^^^^^^^^^^^^^^^^^^^^^
  File ".../llvm-project/llvm/test/tools/llvm-exegesis/lit.local.cfg", line 22, in can_use_perf_counters
    return_code = subprocess.call(
                  ^^^^^^^^^^^^^^^^
  File "/nix/store/f2krmq3iv5nibcvn4rw7nrnrciqprdkh-python3-3.12.9/lib/python3.12/subprocess.py", line 393, in call
    return p.wait(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/f2krmq3iv5nibcvn4rw7nrnrciqprdkh-python3-3.12.9/lib/python3.12/subprocess.py", line 1266, in wait
    return self._wait(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/f2krmq3iv5nibcvn4rw7nrnrciqprdkh-python3-3.12.9/lib/python3.12/subprocess.py", line 2061, in _wait
    (pid, sts) = self._try_wait(0)
                 ^^^^^^^^^^^^^^^^^
  File "/nix/store/f2krmq3iv5nibcvn4rw7nrnrciqprdkh-python3-3.12.9/lib/python3.12/subprocess.py", line 2019, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
```
Timing the call during `ninja -C build check-all` shows that it is invoked three times: the first two calls take ~0.007 seconds each, and the third (when it doesn't hang) takes ~0.015 to 0.031 seconds, roughly 2-4x slower than the first two. When it hangs, it is always on the third call, which still has not finished after several minutes. Setting `-DLLVM_LIT_ARGS="-j1"` doesn't seem to change anything. In 50 manual invocations it succeeded 33 times and hung 17 times.
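Timings and hang counts like the ones above can be gathered with a small harness along the following lines. This is only a sketch, not the script used for the numbers in this report: `time_invocations` is a hypothetical helper, the timeout values are arbitrary, and the stand-in command should be replaced with the actual `llvm-exegesis` command line from `lit.local.cfg`.

```python
import subprocess
import sys
import time


def time_invocations(cmd, runs, timeout):
    """Run cmd `runs` times; return (durations of completed runs, hang count).

    Hypothetical helper: a run that exceeds `timeout` seconds counts as a hang.
    """
    durations, hangs = [], 0
    for _ in range(runs):
        start = time.perf_counter()
        try:
            subprocess.call(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            durations.append(time.perf_counter() - start)
        except subprocess.TimeoutExpired:
            # subprocess.call kills the child before re-raising.
            hangs += 1
    return durations, hangs


if __name__ == "__main__":
    # Stand-in command that exits immediately; replace with the real call.
    cmd = [sys.executable, "-c", "pass"]
    durations, hangs = time_invocations(cmd, runs=5, timeout=5)
    print(f"completed: {len(durations)}, hung: {hangs}")
```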
cc @boomanaiden154
From 705ba309ccebb0f505107a83542ff13d555008e0 Mon Sep 17 00:00:00 2001
From: Stephen Huan <stephen.huan at cgdct.moe>
Date: Mon, 24 Mar 2025 21:35:42 -0400
Subject: [PATCH] [llvm-exegesis] Timeout if subprocess executor hangs
The call to llvm_exegesis_exe nondeterministically hangs, causing
llvm's tests to block. Introduce a timeout when this happens.
---
llvm/test/tools/llvm-exegesis/lit.local.cfg | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/llvm/test/tools/llvm-exegesis/lit.local.cfg b/llvm/test/tools/llvm-exegesis/lit.local.cfg
index a51a2d73442fa..8b7c13f5eb1b6 100644
--- a/llvm/test/tools/llvm-exegesis/lit.local.cfg
+++ b/llvm/test/tools/llvm-exegesis/lit.local.cfg
@@ -24,9 +24,10 @@ def can_use_perf_counters(mode, extra_options=[]):
             + extra_options,
             stdout=subprocess.DEVNULL,
             stderr=subprocess.DEVNULL,
+            timeout=1,
         )
         return return_code == 0
-    except OSError:
+    except (OSError, subprocess.TimeoutExpired):
         print("could not exec llvm-exegesis")
         return False
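
The behaviour of the patched `try`/`except` can be tried outside lit with a sketch like the one below. It mirrors the pattern in the diff, but `probe` is a hypothetical name and the child command is a stand-in for `llvm-exegesis`, so it illustrates the mechanism rather than the actual config file.

```python
import subprocess
import sys


def probe(cmd, timeout=1):
    """Return True if cmd exits with status 0 within `timeout` seconds."""
    try:
        return_code = subprocess.call(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return return_code == 0
    except (OSError, subprocess.TimeoutExpired):
        # OSError: the binary could not be executed.
        # TimeoutExpired: the child was still running after `timeout`
        # seconds; subprocess.call kills it before raising.
        return False


if __name__ == "__main__":
    # Stand-in for a hanging llvm-exegesis: a child that sleeps for a minute.
    hanging = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(probe(hanging))  # prints False once the one-second timeout expires
```

With this pattern a hang is indistinguishable from a failed probe: both paths return `False`, and in the patch both print "could not exec llvm-exegesis".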