[llvm] [Dexter] Work around flaky LLDB DAP stackTrace response (PR #157090)
Orlando Cazalet-Hyams via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 5 05:23:04 PDT 2025
https://github.com/OCHyams created https://github.com/llvm/llvm-project/pull/157090
Buildbot cross-project-tests-sie-ubuntu sees sporadic test failures due to missing "stackTrace" "source" "path". The "path" field is optional for "source" according to DAP, so the response is still well formed. It works _most_ of the time and doesn't fail consistently for any one test, which is strange.
* feature_tests/subtools/test/target_run_args_with_command.c - https://lab.llvm.org/buildbot/#/builders/181/builds/27287/steps/6/logs/stdio
* feature_tests/commands/perfect/dex_declare_address/address_after_ref.cpp - https://lab.llvm.org/buildbot/#/builders/181/builds/27321/steps/6/logs/stdio
* feature_tests/commands/perfect/command_line.c - https://lab.llvm.org/buildbot/#/builders/181/builds/27268/steps/6/logs/stdio
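To make the failure mode concrete, here is a minimal sketch of two well-formed "stackTrace" responses, one with and one without the optional "path". The field names follow DAP; the values, the trimmed response shape, and the `top_frame_path` helper are made up for illustration and are not taken from the bot logs or from Dexter.

```python
# Two well-formed stackTrace responses, trimmed to the fields used here.
# All values are invented for illustration.
with_path = {
    "success": True,
    "body": {
        "stackFrames": [
            {
                "id": 1,
                "name": "main",
                "line": 4,
                "column": 1,
                "source": {"name": "test.cpp", "path": "/tmp/test.cpp"},
                "instructionPointerReference": "0x1000",
            }
        ]
    },
}

without_path = {
    "success": True,
    "body": {
        "stackFrames": [
            {
                "id": 1,
                "name": "main",
                "line": 4,
                "column": 1,
                "source": {"name": "test.cpp"},  # "path" omitted
                "instructionPointerReference": "0x1000",
            }
        ]
    },
}


def top_frame_path(response):
    """Hypothetical helper: top frame's source path, or None if omitted."""
    frame = response["body"]["stackFrames"][0]
    return frame.get("source", {}).get("path")


print(top_frame_path(with_path))
print(top_frame_path(without_path))

# Direct indexing raises KeyError when "path" is absent.
try:
    without_path["body"]["stackFrames"][0]["source"]["path"]
    raised_key_error = False
except KeyError:
    raised_key_error = True
```

Indexing with `["source"]["path"]` is what turns the missing optional field into a `KeyError`.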
I can't replicate the failures locally after running the feature_tests in a loop for 3 hours, and haven't been able to work out why the "source" is sometimes missing by just looking at LLDB code.
So, instead, here is a plaster that I am hoping will improve bot consistency:
* Attempt to get the stack frames with source paths 3 times before giving up.
It would be ideal if we didn't need to do any of this. I think `_post_step_hook` could be removed if the behaviour in gh#156650 was fixed/changed.
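As a standalone sketch of the retry idea described above (not the patch itself): `retry_on_key_error` and `flaky_fetch` are hypothetical names, and the fake fetch stands in for the real DAP round trip by omitting "path" on its first two calls.

```python
import time


def retry_on_key_error(fetch, attempts=3, delay=0.1):
    """Call fetch() until it stops raising KeyError, at most `attempts` times.

    `fetch` is a hypothetical stand-in for a call that requests the top
    stack frame and reads its source path.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()  # Stop at the first successful fetch.
        except KeyError:
            if attempt == attempts:
                raise  # Out of attempts: propagate the original error.
            time.sleep(delay)


# Fake fetch that omits "path" on the first two calls, then includes it.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    source = {"name": "test.cpp"}
    if calls["count"] >= 3:
        source["path"] = "/tmp/test.cpp"
    frame = {"source": source, "instructionPointerReference": "0x1000"}
    return (frame["source"]["path"], frame["instructionPointerReference"])


path, addr = retry_on_key_error(flaky_fetch, delay=0.0)
print(path, addr, calls["count"])
```

With the fake fetch above, the third attempt succeeds; a fetch that never returns a path would re-raise the `KeyError` after the last attempt.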
From 9c6de5b0c35d3b968afa9d0478bbd136c2113334 Mon Sep 17 00:00:00 2001
From: Orlando Cazalet-Hyams <orlando.hyams at sony.com>
Date: Fri, 5 Sep 2025 13:12:00 +0100
Subject: [PATCH] [Dexter] Work around flaky LLDB DAP stackTrace response
Buildbot cross-project-tests-sie-ubuntu sees sporadic test failures due to
missing "stackTrace" "source" "path". The "path" field is optional for "source"
according to DAP, so the response is still well formed. It works _most_ of the
time and doesn't fail consistently for any one test, which is strange.
I can't replicate the failure locally after running the feature_tests in a loop
for 3 hours, and haven't been able to work out why the "source" is sometimes
missing by just looking at LLDB code.
So, instead, here is a plaster that I am hoping will improve bot consistency.
Attempt to get the stack frames with source paths 3 times before giving up.
It would be ideal if we didn't need to do any of this. I think `_post_step_hook`
could be removed if the behaviour in gh#156650 was fixed/changed.
---
.../dexter/dex/debugger/lldb/LLDB.py | 42 ++++++++++++++-----
1 file changed, 31 insertions(+), 11 deletions(-)
diff --git a/cross-project-tests/debuginfo-tests/dexter/dex/debugger/lldb/LLDB.py b/cross-project-tests/debuginfo-tests/dexter/dex/debugger/lldb/LLDB.py
index fa10b4914d45c..dde2a1959c1ea 100644
--- a/cross-project-tests/debuginfo-tests/dexter/dex/debugger/lldb/LLDB.py
+++ b/cross-project-tests/debuginfo-tests/dexter/dex/debugger/lldb/LLDB.py
@@ -11,6 +11,7 @@
 import shlex
 from subprocess import CalledProcessError, check_output, STDOUT
 import sys
+import time

 from dex.debugger.DebuggerBase import DebuggerBase, watch_is_active
 from dex.debugger.DAP import DAP
@@ -419,20 +420,39 @@ def frames_below_main(self):
             "_start",
         ]

+    def _get_current_path_and_addr(self):
+        trace_req_id = self.send_message(
+            self.make_request(
+                "stackTrace", {"threadId": self._debugger_state.thread, "levels": 1}
+            )
+        )
+        trace_response = self._await_response(trace_req_id)
+        if not trace_response["success"]:
+            raise DebuggerException("failed to get stack frames")
+        stackframes = trace_response["body"]["stackFrames"]
+        path = stackframes[0]["source"]["path"]
+        addr = stackframes[0]["instructionPointerReference"]
+        return (path, addr)
+
     def _post_step_hook(self):
         """Hook to be executed after completing a step request."""
         if self._debugger_state.stopped_reason == "step":
-            trace_req_id = self.send_message(
-                self.make_request(
-                    "stackTrace", {"threadId": self._debugger_state.thread, "levels": 1}
-                )
-            )
-            trace_response = self._await_response(trace_req_id)
-            if not trace_response["success"]:
-                raise DebuggerException("failed to get stack frames")
-            stackframes = trace_response["body"]["stackFrames"]
-            path = stackframes[0]["source"]["path"]
-            addr = stackframes[0]["instructionPointerReference"]
+            # Buildbot cross-project-tests-sie-ubuntu sees sporadic test
+            # failures due to missing stackFrames[0].source.path. The "path"
+            # field is optional for "source" according to DAP, so it's not
+            # ill-formed. But it works most of the time, and doesn't
+            # consistently fail for any one test. Attempt to get the stack
+            # frames with source paths 3 times before giving up.
+            # FIXME: It would be ideal if we didn't need to do any of this.
+            # This entire function could be removed if gh#156650 gets resolved.
+            for attempt in range(1, 4):
+                try:
+                    path, addr = self._get_current_path_and_addr()
+                except KeyError as e:
+                    if attempt == 3:
+                        raise e
+                    time.sleep(0.1)
+
             if any(
                 self._debugger_state.bp_addr_map.get(self.dex_id_to_dap_id[dex_bp_id])
                 == addr