[llvm] [llvm-exegesis] Timeout if subprocess executor hangs (PR #132861)

Stephen Huan via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 24 19:22:54 PDT 2025


https://github.com/stephen-huan created https://github.com/llvm/llvm-project/pull/132861

The call to `llvm_exegesis_exe` nondeterministically hangs, causing LLVM's tests to block (even before the progress bar `-- Testing: 57793 tests, 28 workers --` appears). Introduce a timeout when this happens. Reproduction:

```bash
cmake -S llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=Debug
ninja -C build check-all # hangs, sent ^C
```

```python
^Cllvm-lit: .../llvm-project/llvm/utils/lit/lit/TestingConfig.py:156: fatal: unable to parse config file '.../llvm-project/llvm/test/tools/llvm-exegesis/lit.local.cfg', traceback: Traceback (most recent call last):
  File ".../llvm-project/llvm/utils/lit/lit/TestingConfig.py", line 144, in load_from_path
    exec(compile(data, path, "exec"), cfg_globals, None)
  File ".../llvm-project/llvm/test/tools/llvm-exegesis/lit.local.cfg", line 46, in <module>
    if can_use_perf_counters(
       ^^^^^^^^^^^^^^^^^^^^^^
  File ".../llvm-project/llvm/test/tools/llvm-exegesis/lit.local.cfg", line 22, in can_use_perf_counters
    return_code = subprocess.call(
                  ^^^^^^^^^^^^^^^^
  File "/nix/store/f2krmq3iv5nibcvn4rw7nrnrciqprdkh-python3-3.12.9/lib/python3.12/subprocess.py", line 393, in call
    return p.wait(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/f2krmq3iv5nibcvn4rw7nrnrciqprdkh-python3-3.12.9/lib/python3.12/subprocess.py", line 1266, in wait
    return self._wait(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/f2krmq3iv5nibcvn4rw7nrnrciqprdkh-python3-3.12.9/lib/python3.12/subprocess.py", line 2061, in _wait
    (pid, sts) = self._try_wait(0)
                 ^^^^^^^^^^^^^^^^^
  File "/nix/store/f2krmq3iv5nibcvn4rw7nrnrciqprdkh-python3-3.12.9/lib/python3.12/subprocess.py", line 2019, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
```
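The traceback shows `subprocess.call` blocked in `os.waitpid` with no timeout, so a child that never exits stalls config loading indefinitely. As a minimal sketch of the behavior a timeout gives (not the actual `lit.local.cfg` code): the helper name `can_run` and the `quick`/`stuck` commands are hypothetical stand-ins for the real `llvm-exegesis` probe.

```python
import subprocess
import sys


def can_run(cmd, timeout_s=1.0):
    """Return True only if cmd exits with status 0 within timeout_s seconds."""
    try:
        return_code = subprocess.call(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        return return_code == 0
    except (OSError, subprocess.TimeoutExpired):
        # OSError: the binary could not be executed at all.
        # TimeoutExpired: the child was still running at the deadline;
        # subprocess.call kills it before re-raising.
        return False


# Hypothetical stand-ins: one child that exits at once, one that hangs.
quick = [sys.executable, "-c", "pass"]
stuck = [sys.executable, "-c", "import time; time.sleep(60)"]
```

With these stand-ins, `can_run(stuck, timeout_s=0.5)` returns `False` after roughly half a second instead of blocking for the full minute.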

Timing the call during `ninja -C build check-all` shows that it is invoked three times: the first two calls take ~0.007 seconds each, and the third (when it doesn't hang) takes ~0.015 to 0.031 seconds, roughly 2-4x slower than the first two. When it hangs, it always hangs on the third call, which still hasn't finished after several minutes. Setting `-DLLVM_LIT_ARGS="-j1"` doesn't seem to change anything. In 50 manual invocations it succeeded 33 times and hung 17 times.
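For anyone who wants to collect similar measurements, here is a rough sketch of a timing harness. It is an illustration only: `time_calls` is a hypothetical helper, and the command passed to it would need to be the real `llvm-exegesis` invocation, whose arguments are not reproduced here.

```python
import subprocess
import sys
import time


def time_calls(cmd, runs=3, timeout_s=1.0):
    """Run cmd `runs` times and return a list of (seconds, status) tuples.

    status is "ok" (exit code 0), "failed" (nonzero exit code or the
    command could not be executed), or "timeout" (killed at the deadline).
    """
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        try:
            return_code = subprocess.call(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
            status = "ok" if return_code == 0 else "failed"
        except subprocess.TimeoutExpired:
            status = "timeout"
        except OSError:
            status = "failed"
        results.append((time.perf_counter() - start, status))
    return results


if __name__ == "__main__":
    # Placeholder command; substitute the invocation under investigation.
    for seconds, status in time_calls([sys.executable, "-c", "pass"]):
        print(f"{seconds:.3f}s {status}")
```

Counting the `"timeout"` entries over many runs gives a hang rate comparable to the manual tally above, without having to interrupt a stuck process by hand.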

cc @boomanaiden154

From 705ba309ccebb0f505107a83542ff13d555008e0 Mon Sep 17 00:00:00 2001
From: Stephen Huan <stephen.huan at cgdct.moe>
Date: Mon, 24 Mar 2025 21:35:42 -0400
Subject: [PATCH] [llvm-exegesis] Timeout if subprocess executor hangs

The call to llvm_exegesis_exe nondeterministically hangs, causing
llvm's tests to block. Introduce a timeout when this happens.
---
 llvm/test/tools/llvm-exegesis/lit.local.cfg | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/llvm/test/tools/llvm-exegesis/lit.local.cfg b/llvm/test/tools/llvm-exegesis/lit.local.cfg
index a51a2d73442fa..8b7c13f5eb1b6 100644
--- a/llvm/test/tools/llvm-exegesis/lit.local.cfg
+++ b/llvm/test/tools/llvm-exegesis/lit.local.cfg
@@ -24,9 +24,10 @@ def can_use_perf_counters(mode, extra_options=[]):
             + extra_options,
             stdout=subprocess.DEVNULL,
             stderr=subprocess.DEVNULL,
+            timeout=1,
         )
         return return_code == 0
-    except OSError:
+    except (OSError, subprocess.TimeoutExpired):
         print("could not exec llvm-exegesis")
         return False
 


