<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Sep 19, 2018, at 6:49 PM, Leonard Mosescu <<a href="mailto:mosescu@google.com" class="">mosescu@google.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Sounds like a fantastic idea. <div class=""><br class=""></div><div class="">How would this work when the behavior of the debuggee process is non-deterministic?</div></div></div></blockquote><div><br class=""></div><div><div>All the communication between the debugger and the inferior goes through the</div><div>GDB remote protocol. Because we capture and replay this, we can reproduce</div><div>without running the executable, which is particularly convenient when you were</div><div>originally debugging something on a different device for example. </div><div class=""><br class=""></div></div><blockquote type="cite" class=""><div class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev <span dir="ltr" class=""><<a href="mailto:lldb-dev@lists.llvm.org" target="_blank" class="">lldb-dev@lists.llvm.org</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi everyone,<br class="">
<br class="">
We all know how hard it can be to reproduce an issue or crash in LLDB. There<br class="">
are a lot of moving parts and subtle differences can easily add up. We want to<br class="">
make this easier by generating reproducers in LLDB, similar to what clang does<br class="">
today.<br class="">
<br class="">
The core idea is as follows: during normal operation we capture whatever<br class="">
information is needed to recreate the current state of the debugger. When<br class="">
something goes wrong, this becomes available to the user. Someone else should<br class="">
then be able to reproduce the same issue with only this data, for example on a<br class="">
different machine.<br class="">
<br class="">
It's important to note that we want to replay the debug session from the<br class="">
reproducer, rather than just recreating the current state. This ensures that we<br class="">
have access to all the events leading up to the problem, which are usually far<br class="">
more important than the error state itself.<br class="">
<br class="">
# High Level Design<br class="">
<br class="">
Concretely we want to extend LLDB in two ways:<br class="">
<br class="">
1. We need to add infrastructure to _generate_ the data necessary for<br class="">
reproducing.<br class="">
2. We need to add infrastructure to _use_ the data in the reproducer to replay<br class="">
the debugging session.<br class="">
<br class="">
Different parts of LLDB will have different definitions of what data they need<br class="">
to reproduce their path to the issue. For example, capturing the commands<br class="">
executed by the user is very different from tracking the dSYM bundles on disk.<br class="">
Therefore, we propose to have each component deal with its needs in a localized<br class="">
way. This has the advantage that the functionality can be developed and tested<br class="">
independently.<br class="">
<br class="">
## Providers<br class="">
<br class="">
We'll call a combination of (1) and (2) for a given component a `Provider`. For<br class="">
example, we'd have a provider for user commands and a provider for dSYM files.<br class="">
A provider will know how to keep track of its information, how to serialize it<br class="">
as part of the reproducer as well as how to deserialize it again and use it to<br class="">
recreate the state of the debugger.<br class="">
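<br class="">
A minimal sketch of what such a provider could look like, written in Python<br class="">
for brevity (LLDB itself is C++). The names `Provider`, `CommandProvider`,<br class="">
`record`, `serialize` and `deserialize` are hypothetical and only illustrate<br class="">
the three responsibilities listed above; they are not taken from the patch.<br class="">
<pre class="">
```python
import json
from abc import ABC, abstractmethod


class Provider(ABC):
    """Hypothetical interface: one provider per component."""

    name = "provider"  # hypothetical key identifying this provider's data

    @abstractmethod
    def serialize(self) -> str:
        """Return the captured data in a form that can be stored."""

    @abstractmethod
    def deserialize(self, data: str) -> None:
        """Restore the captured data from its stored form."""


class CommandProvider(Provider):
    """Illustrative provider that keeps track of user commands."""

    name = "commands"

    def __init__(self):
        self.commands = []

    def record(self, command: str) -> None:
        # Called during normal operation, each time the user enters a command.
        self.commands.append(command)

    def serialize(self) -> str:
        return json.dumps(self.commands)

    def deserialize(self, data: str) -> None:
        self.commands = json.loads(data)


# Capture on one side...
capture = CommandProvider()
capture.record("target create a.out")
capture.record("breakpoint set --name main")
stored = capture.serialize()

# ...and restore on the other.
replay = CommandProvider()
replay.deserialize(stored)
```
</pre>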
<br class="">
With one exception, the lifetime of the provider coincides with that of the<br class="">
`SBDebugger`, because that is the scope of what we consider here to be a single<br class="">
debug session. The exception would be the provider for the global module cache,<br class="">
because it is shared between multiple debuggers. Although it would be<br class="">
conceptually straightforward to add a provider for the shared module cache,<br class="">
this significantly increases the complexity of the reproducer framework because<br class="">
of its implication on the lifetime and everything related to that.<br class="">
<br class="">
For now we will ignore this problem which means we will not replay the<br class="">
construction of the shared module cache but rather build it up during<br class="">
replaying, as if the current debug session was the first and only one using it.<br class="">
The impact of doing so is significant, as no issue caused by the shared module<br class="">
cache will be reproducible, but it does not prevent us from reproducing any<br class="">
issue unrelated to it.<br class="">
<br class="">
## Reproducer Framework<br class="">
<br class="">
To coordinate between the data from different components, we'll need to<br class="">
introduce a global reproducer infrastructure. We have a component responsible<br class="">
for reproducer generation (the `Generator`) and for using the reproducer (the<br class="">
`Loader`). They are essentially two ways of looking at the same unit of<br class="">
replayable work.<br class="">
<br class="">
The Generator keeps track of its providers and whether or not we need to<br class="">
generate a reproducer. When a problem occurs, LLDB will request the Generator<br class="">
to generate a reproducer. When LLDB finishes successfully, the Generator cleans<br class="">
up anything it might have created during the session. Additionally, the<br class="">
Generator populates an index, which is part of the reproducer, and used by the<br class="">
Loader to discover what information is available.<br class="">
<br class="">
When a reproducer is passed to LLDB, we want to use its data to replay the<br class="">
debug session. This is coordinated by the Loader. Through the index created by<br class="">
the Generator, different components know what data (Providers) are available,<br class="">
and how to use them.<br class="">
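<br class="">
The interplay between the two can be sketched as follows. This is only an<br class="">
illustration under assumptions: the method names (`add`, `keep`, `discard`,<br class="">
`available`, `load`), the `index.json` file name and the use of JSON are all<br class="">
hypothetical, chosen because JSON ships with Python.<br class="">
<pre class="">
```python
import json
import shutil
import tempfile
from pathlib import Path


class Generator:
    """Hypothetical generator: collects provider data and writes an index."""

    def __init__(self, root: Path):
        self.root = root
        self.data = {}  # provider name -> serialized data

    def add(self, provider_name: str, serialized: str) -> None:
        self.data[provider_name] = serialized

    def keep(self) -> None:
        # A problem occurred: write every provider's data plus the index.
        self.root.mkdir(parents=True, exist_ok=True)
        index = {}
        for name, serialized in self.data.items():
            filename = name + ".json"
            (self.root / filename).write_text(serialized)
            index[name] = filename
        (self.root / "index.json").write_text(json.dumps(index))

    def discard(self) -> None:
        # The session finished successfully: clean up anything we created.
        shutil.rmtree(self.root, ignore_errors=True)


class Loader:
    """Hypothetical loader: uses the index to find what data is available."""

    def __init__(self, root: Path):
        self.root = root
        self.index = json.loads((root / "index.json").read_text())

    def available(self):
        return sorted(self.index)

    def load(self, provider_name: str) -> str:
        return (self.root / self.index[provider_name]).read_text()


root = Path(tempfile.mkdtemp())  # stands in for the reproducer directory
generator = Generator(root)
generator.add("commands", json.dumps(["run", "bt"]))
generator.add("settings", json.dumps({"use-color": "false"}))
generator.keep()

loader = Loader(root)
```
</pre>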
<br class="">
It's important to note that in order to create a complete reproducer, we will<br class="">
require data from our dependencies (llvm, clang, swift) as well. This means<br class="">
that either (a) the infrastructure needs to be accessible from our dependencies<br class="">
or (b) that an API is provided that allows us to query this. We plan to address<br class="">
this issue when it arises for the respective Generator.<br class="">
<br class="">
# Components<br class="">
<br class="">
We have identified a list of minimal components needed to make reproducing<br class="">
possible. We've divided those into two groups: explicit and implicit inputs.<br class="">
<br class="">
Explicit inputs are inputs from the user to the debugger.<br class="">
<br class="">
- Command line arguments<br class="">
- Settings<br class="">
- User commands<br class="">
- Scripting Bridge API<br class="">
<br class="">
In addition to the components listed above, LLDB has a bunch of inputs that are<br class="">
not passed explicitly. It's often these that make reproducing an issue complex.<br class="">
<br class="">
- GDB Remote Packets<br class="">
- Files containing debug information (object files, dSYM bundles)<br class="">
- Clang headers<br class="">
- Swift modules<br class="">
<br class="">
Every component would have its own provider and is free to implement it as it<br class="">
sees fit. For example, as we expect to have a large number of GDB remote<br class="">
packets, the provider might choose to write these to disk as they come in,<br class="">
while the settings can easily be kept in memory until it is decided that we<br class="">
need to generate a reproducer.<br class="">
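<br class="">
To make the difference between the two strategies concrete, here is a small<br class="">
sketch. `PacketRecorder` and `SettingsRecorder` are hypothetical names, and an<br class="">
in-memory stream stands in for a file on disk.<br class="">
<pre class="">
```python
import io


class PacketRecorder:
    """Hypothetical eager provider: writes each packet out as it arrives."""

    def __init__(self, stream):
        self.stream = stream  # any writable text stream, e.g. an open file

    def record(self, packet: str) -> None:
        self.stream.write(packet + "\n")
        self.stream.flush()


class SettingsRecorder:
    """Hypothetical lazy provider: keeps data in memory until asked for it."""

    def __init__(self):
        self.settings = {}

    def record(self, key: str, value: str) -> None:
        self.settings[key] = value

    def dump(self, stream) -> None:
        # Only called once we decide that a reproducer must be generated.
        for key in sorted(self.settings):
            stream.write(key + "=" + self.settings[key] + "\n")


packet_log = io.StringIO()  # stands in for a file on disk
packets = PacketRecorder(packet_log)
packets.record("qSupported")
packets.record("OK")

settings = SettingsRecorder()
settings.record("use-color", "false")
settings_log = io.StringIO()  # nothing is written here until dump() is called
```
</pre>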
<br class="">
# Concerns, Implications & Risks<br class="">
<br class="">
## Performance Impact<br class="">
<br class="">
As the reproducer functionality will have to be always-on, we have to consider<br class="">
performance implications. As mentioned earlier, each provider is free to be<br class="">
implemented in whatever way works best for its respective component.<br class="">
We'll have to measure to know how big the impact is.<br class="">
<br class="">
## Privacy<br class="">
<br class="">
The reproducer might contain sensitive user information. We should make it<br class="">
clear to the user what kind of data is contained in the reproducer. Initially<br class="">
we will focus on the LLDB developer community and the people already filing<br class="">
bugs.<br class="">
<br class="">
## Versions<br class="">
<br class="">
Because the reproducer works by replaying a debug session, the versions of the<br class="">
debugger generating and replaying the session will have to match. Not only is<br class="">
this important for the serialization format, but more importantly a different<br class="">
LLDB might ask different questions in a different order.<br class="">
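<br class="">
One way to enforce this would be to refuse to replay on a mismatch. The sketch<br class="">
below assumes the generating version is stored in the reproducer, which is not<br class="">
something the proposal specifies; `check_version` and the version strings are<br class="">
hypothetical.<br class="">
<pre class="">
```python
class VersionMismatch(Exception):
    pass


def check_version(recorded: str, current: str) -> None:
    """Refuse to replay when the recording and replaying debuggers differ."""
    if recorded != current:
        raise VersionMismatch(
            "reproducer was generated by '" + recorded
            + "' but is being replayed by '" + current + "'"
        )


# Hypothetical version strings, for illustration only.
check_version("lldb-example-1", "lldb-example-1")  # same version: no error

try:
    check_version("lldb-example-1", "lldb-example-2")
    mismatch_detected = False
except VersionMismatch:
    mismatch_detected = True
```
</pre>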
<br class="">
# Implementation<br class="">
<br class="">
I've put up a patch (<<a href="https://reviews.llvm.org/D50254" rel="noreferrer" target="_blank" class="">https://reviews.llvm.org/<wbr class="">D50254</a>>) which contains a minimal<br class="">
implementation of the reproducer framework as well as the GDB remote provider.<br class="">
<br class="">
It records the GDB packets and writes them to a YAML file (we can switch to a<br class="">
more performant encoding down the road). When invoking the LLDB driver and<br class="">
passing the reproducer directory to `--reproducer`, this file is read and a<br class="">
dummy server replies with the next packet from this file, without talking to<br class="">
the executable.<br class="">
<br class="">
It's still pretty rudimentary and only works if you enter the exact same<br class="">
commands (so the server receives the exact same requests from the client).<br class="">
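<br class="">
Conceptually, the replay side behaves like the sketch below. It is not the code<br class="">
from the patch: `ReplayServer` and `ReplayMismatch` are hypothetical names, the<br class="">
packet payloads are only examples, and the explicit check that the request<br class="">
matches the recording is an assumption added to show why a different command<br class="">
sequence breaks replay.<br class="">
<pre class="">
```python
class ReplayMismatch(Exception):
    pass


class ReplayServer:
    """Hypothetical stand-in for the dummy server: answers from a recording."""

    def __init__(self, recorded):
        # recorded is a list of (request, reply) pairs in recording order.
        self.recorded = list(recorded)
        self.position = 0

    def handle(self, request: str) -> str:
        if self.position == len(self.recorded):
            raise ReplayMismatch("no recorded packets left for: " + request)
        expected, reply = self.recorded[self.position]
        if request != expected:
            raise ReplayMismatch("expected " + expected + " but got " + request)
        self.position += 1
        return reply


# Same requests in the same order: every reply comes from the recording.
server = ReplayServer([("qSupported", "PacketSize=20000"), ("?", "S05")])
first = server.handle("qSupported")
second = server.handle("?")

# A different request than the one recorded: replay cannot continue.
diverging = ReplayServer([("qSupported", "PacketSize=20000")])
try:
    diverging.handle("vCont?")
    diverged = False
except ReplayMismatch:
    diverged = True
```
</pre>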
<br class="">
The next steps are (in broad strokes):<br class="">
<br class="">
1. Capturing the debugged binary.<br class="">
2. Recording and replaying user commands and SB-API calls.<br class="">
3. Recording the configuration of the debugger.<br class="">
4. Capturing other files used by LLDB.<br class="">
<br class="">
Please let me know what you think!<br class="">
<br class="">
Thanks,<br class="">
Jonas <br class="">
</blockquote></div><br class=""></div>
</div></blockquote></div><br class=""></body></html>