<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Sep 19, 2018, at 6:49 PM, Leonard Mosescu <<a href="mailto:mosescu@google.com" class="">mosescu@google.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Sounds like a fantastic idea. <div class=""><br class=""></div><div class="">How would this work when the behavior of the debugee process is non-deterministic?</div></div></div></blockquote><div><br class=""></div><div><div>All the communication between the debugger and the inferior goes through the</div><div>GDB remote protocol. Because we capture and replay this, we can reproduce</div><div>without running the executable, which is particularly convenient when you were</div><div>originally debugging something on a different device for example. </div><div class=""><br class=""></div></div><blockquote type="cite" class=""><div class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev <span dir="ltr" class=""><<a href="mailto:lldb-dev@lists.llvm.org" target="_blank" class="">lldb-dev@lists.llvm.org</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi everyone,<br class="">

<br class="">

We all know how hard it can be to reproduce an issue or crash in LLDB. There<br class="">

are a lot of moving parts and subtle differences can easily add up. We want to<br class="">

make this easier by generating reproducers in LLDB, similar to what clang does<br class="">

today.<br class="">

<br class="">

The core idea is as follows: during normal operation we capture whatever<br class="">

information is needed to recreate the current state of the debugger. When<br class="">

something goes wrong, this becomes available to the user. Someone else should<br class="">

then be able to reproduce the same issue with only this data, for example on a<br class="">

different machine.<br class="">

<br class="">

It's important to note that we want to replay the debug session from the<br class="">

reproducer, rather than just recreating the current state. This ensures that we<br class="">

have access to all the events leading up to the problem, which are usually far<br class="">

more important than the error state itself.<br class="">

<br class="">

# High Level Design<br class="">

<br class="">

Concretely we want to extend LLDB in two ways:<br class="">

<br class="">

1.  We need to add infrastructure to _generate_ the data necessary for<br class="">

    reproducing.<br class="">

2.  We need to add infrastructure to _use_ the data in the reproducer to replay<br class="">

    the debugging session.<br class="">

<br class="">

Different parts of LLDB will have different definitions of what data they need<br class="">

to reproduce their path to the issue. For example, capturing the commands<br class="">

executed by the user is very different from tracking the dSYM bundles on disk.<br class="">

Therefore, we propose to have each component deal with its needs in a localized<br class="">

way. This has the advantage that the functionality can be developed and tested<br class="">

independently.<br class="">

<br class="">

## Providers<br class="">

<br class="">

We'll call a combination of (1) and (2) for a given component a `Provider`. For<br class="">

example, we'd have a provider for user commands and a provider for dSYM files.<br class="">

A provider will know how to keep track of its information, how to serialize it<br class="">

as part of the reproducer as well as how to deserialize it again and use it to<br class="">

recreate the state of the debugger.<br class="">
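<br class="">
To make the idea concrete, here is a minimal Python sketch of what a provider<br class="">
for user commands could look like. It is only an illustration: the class and<br class="">
method names (`CommandProvider`, `capture`, `serialize`, `deserialize`) and the<br class="">
JSON encoding are assumptions made up for this example, not part of LLDB.<br class="">
<pre class="">
```python
import json


class CommandProvider:
    """Hypothetical provider that records the commands typed by the user."""

    name = "commands"  # illustrative key, not an LLDB identifier

    def __init__(self):
        self.commands = []

    def capture(self, command):
        # Keep track of the information while the debug session runs.
        self.commands.append(command)

    def serialize(self):
        # Turn the captured state into text that can be stored in the reproducer.
        return json.dumps({"commands": self.commands})

    @classmethod
    def deserialize(cls, data):
        # Rebuild the provider from reproducer data so the session can be replayed.
        provider = cls()
        provider.commands = list(json.loads(data)["commands"])
        return provider


# Round trip: capture two commands, serialize them, and restore them again.
provider = CommandProvider()
provider.capture("breakpoint set --name main")
provider.capture("run")
restored = CommandProvider.deserialize(provider.serialize())
```
</pre>
The point of the round trip is that `restored` holds everything needed to feed<br class="">
the same commands back to the debugger, in the same order.<br class="">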

<br class="">

With one exception, the lifetime of the provider coincides with that of the<br class="">

`SBDebugger`, because that is the scope of what we consider here to be a single<br class="">

debug session. The exception would be the provider for the global module cache,<br class="">

because it is shared between multiple debuggers. Although it would be<br class="">

conceptually straightforward to add a provider for the shared module cache,<br class="">

this significantly increases the complexity of the reproducer framework because<br class="">

of its implication on the lifetime and everything related to that.<br class="">

<br class="">

For now we will ignore this problem which means we will not replay the<br class="">

construction of the shared module cache but rather build it up during<br class="">

replaying, as if the current debug session was the first and only one using it.<br class="">

This is a significant limitation, as no issue caused by the shared module<br class="">

cache will be reproducible, but it does not prevent reproducing any issue<br class="">

unrelated to it.<br class="">

<br class="">

## Reproducer Framework<br class="">

<br class="">

To coordinate between the data from different components, we'll need to<br class="">

introduce a global reproducer infrastructure. We have a component responsible<br class="">

for reproducer generation (the `Generator`) and for using the reproducer (the<br class="">

`Loader`). They are essentially two ways of looking at the same unit of<br class="">

replayable work.<br class="">

<br class="">

The Generator keeps track of its providers and whether or not we need to<br class="">

generate a reproducer. When a problem occurs, LLDB will request the Generator<br class="">

to generate a reproducer. When LLDB finishes successfully, the Generator cleans<br class="">

up anything it might have created during the session. Additionally, the<br class="">

Generator populates an index, which is part of the reproducer, and used by the<br class="">

Loader to discover what information is available.<br class="">
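<br class="">
The Generator's bookkeeping could be sketched roughly as follows. This is<br class="">
Python for brevity and everything in it is hypothetical: the names (`Generator`,<br class="">
`register`, `request_reproducer`, `finish`), the `index.json` file and the<br class="">
one-file-per-provider layout are assumptions for illustration only.<br class="">
<pre class="">
```python
import json
import os
import shutil
import tempfile


class Generator:
    """Hypothetical coordinator for reproducer generation."""

    def __init__(self, root):
        self.root = root
        self.providers = {}  # provider name mapped to a serialize callback
        self.keep = False    # set once a reproducer has been requested
        os.makedirs(root, exist_ok=True)

    def register(self, name, serialize):
        self.providers[name] = serialize

    def request_reproducer(self):
        # Called when a problem occurs.
        self.keep = True

    def finish(self):
        if not self.keep:
            # Successful session: clean up anything created along the way.
            shutil.rmtree(self.root, ignore_errors=True)
            return None
        index = {}
        for name in sorted(self.providers):
            filename = name + ".json"
            with open(os.path.join(self.root, filename), "w") as f:
                f.write(self.providers[name]())
            index[name] = filename
        # The index tells a later reader which provider data is available.
        index_path = os.path.join(self.root, "index.json")
        with open(index_path, "w") as f:
            json.dump(index, f)
        return index_path


root = os.path.join(tempfile.mkdtemp(), "reproducer")
generator = Generator(root)
generator.register("settings", lambda: json.dumps({"target.run-args": ["--verbose"]}))
generator.request_reproducer()  # pretend something went wrong
index_path = generator.finish()
```
</pre>
Without the call to `request_reproducer`, `finish` removes the directory and<br class="">
returns nothing, which corresponds to the clean-exit case.<br class="">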

<br class="">

When a reproducer is passed to LLDB, we want to use its data to replay the<br class="">

debug session. This is coordinated by the Loader. Through the index created by<br class="">

the Generator, different components know what data (Providers) are available,<br class="">

and how to use them.<br class="">
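<br class="">
On the replay side, a Loader could be as small as the Python sketch below. The<br class="">
names (`Loader`, `has`, `load`) and the `index.json` layout, which maps a<br class="">
provider name to a file inside the reproducer directory, are assumptions for<br class="">
this example. The reproducer directory is written by hand here so that the<br class="">
snippet is self-contained.<br class="">
<pre class="">
```python
import json
import os
import tempfile


class Loader:
    """Hypothetical reader for a reproducer directory."""

    def __init__(self, root):
        self.root = root
        with open(os.path.join(root, "index.json")) as f:
            self.index = json.load(f)

    def has(self, name):
        # Components ask whether data for a given provider was captured.
        return name in self.index

    def load(self, name):
        # Return the raw data; interpreting it is left to the component.
        with open(os.path.join(self.root, self.index[name])) as f:
            return f.read()


# Build a tiny reproducer directory by hand (illustrative contents only).
root = tempfile.mkdtemp()
with open(os.path.join(root, "commands.txt"), "w") as f:
    f.write("run\n")
with open(os.path.join(root, "index.json"), "w") as f:
    json.dump({"commands": "commands.txt"}, f)

loader = Loader(root)
commands = loader.load("commands") if loader.has("commands") else ""
```
</pre>
A component whose provider is missing from the index simply finds nothing to<br class="">
replay and can fall back to its normal behavior.<br class="">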

<br class="">

It's important to note that in order to create a complete reproducer, we will<br class="">

require data from our dependencies (llvm, clang, swift) as well. This means<br class="">

that either (a) the infrastructure needs to be accessible from our dependencies<br class="">

or (b) that an API is provided that allows us to query this. We plan to address<br class="">

this issue when it arises for the respective Generator.<br class="">

<br class="">

# Components<br class="">

<br class="">

We have identified a list of minimal components needed to make reproducing<br class="">

possible. We've divided those into two groups: explicit and implicit inputs.<br class="">

<br class="">

Explicit inputs are inputs from the user to the debugger.<br class="">

<br class="">

-   Command line arguments<br class="">

-   Settings<br class="">

-   User commands<br class="">

-   Scripting Bridge API<br class="">

<br class="">

In addition to the components listed above, LLDB has a bunch of inputs that are<br class="">

not passed explicitly. It's often these that make reproducing an issue complex.<br class="">

<br class="">

-   GDB Remote Packets<br class="">

-   Files containing debug information (object files, dSYM bundles)<br class="">

-   Clang headers<br class="">

-   Swift modules<br class="">

<br class="">

Every component would have its own provider and is free to implement it as it<br class="">

sees fit. For example, as we expect to have a large number of GDB remote<br class="">

packets, the provider might choose to write these to disk as they come in,<br class="">

while the settings can easily be kept in memory until it is decided that we<br class="">

need to generate a reproducer.<br class="">
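<br class="">
The difference between the two strategies can be sketched in Python as follows.<br class="">
Both classes, their method names and the JSON-lines file format are hypothetical<br class="">
and only serve to contrast streaming to disk with buffering in memory. The<br class="">
packet payloads are illustrative.<br class="">
<pre class="">
```python
import json
import os
import tempfile


class PacketProvider:
    """Streams each packet to disk as soon as it is seen."""

    def __init__(self, path):
        self.file = open(path, "a")

    def capture(self, direction, payload):
        # One JSON object per line; flushed so nothing is lost on a crash.
        self.file.write(json.dumps({"dir": direction, "data": payload}) + "\n")
        self.file.flush()

    def close(self):
        self.file.close()


class SettingsProvider:
    """Keeps settings in memory and writes them only on request."""

    def __init__(self, path):
        self.path = path
        self.settings = {}

    def capture(self, key, value):
        self.settings[key] = value

    def write(self):
        # Only called once we decide that a reproducer is needed.
        with open(self.path, "w") as f:
            json.dump(self.settings, f)


directory = tempfile.mkdtemp()
packets = PacketProvider(os.path.join(directory, "packets.jsonl"))
packets.capture("send", "?")
packets.capture("recv", "S05")

settings = SettingsProvider(os.path.join(directory, "settings.json"))
settings.capture("target.run-args", ["--verbose"])
written_early = os.path.exists(settings.path)  # nothing on disk yet
settings.write()
packets.close()
```
</pre>
The packet file grows during the session, while the settings file only appears<br class="">
once `write` is called.<br class="">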

<br class="">

# Concerns, Implications & Risks<br class="">

<br class="">

## Performance Impact<br class="">

<br class="">

As the reproducer functionality will have to be always-on, we have to consider<br class="">

performance implications. As mentioned earlier, each provider is free to be<br class="">

implemented in whatever way works best for its respective component.<br class="">

We'll have to measure to know how big the impact is.<br class="">

<br class="">

## Privacy<br class="">

<br class="">

The reproducer might contain sensitive user information. We should make it<br class="">

clear to the user what kind of data is contained in the reproducer. Initially<br class="">

we will focus on the LLDB developer community and the people already filing<br class="">

bugs.<br class="">

<br class="">

## Versions<br class="">

<br class="">

Because the reproducer works by replaying a debug session, the versions of the<br class="">

debugger generating and replaying the session will have to match. Not only is<br class="">

this important for the serialization format, but more importantly a different<br class="">

LLDB might ask different questions in a different order.<br class="">
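<br class="">
One way to enforce this would be to store the version of the generating<br class="">
debugger in the reproducer and compare it before replaying. The Python sketch<br class="">
below assumes such a stored version string exists. The function name and the<br class="">
version strings are made up for illustration.<br class="">
<pre class="">
```python
def check_version(recorded, current):
    """Return None when replay may proceed, else a message explaining why not."""
    if recorded == current:
        return None
    return "reproducer was generated by %s, but this is %s" % (recorded, current)


# Hypothetical version strings that differ only in their revision.
error = check_version("lldb 8.0.0 (rev A)", "lldb 8.0.0 (rev B)")
```
</pre>
An exact string comparison is deliberately strict: even a small change in the<br class="">
debugger can alter the order of the requests it makes.<br class="">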

<br class="">

# Implementation<br class="">

<br class="">

I've put up a patch (<<a href="https://reviews.llvm.org/D50254" rel="noreferrer" target="_blank" class="">https://reviews.llvm.org/<wbr class="">D50254</a>>) which contains a minimal<br class="">

implementation of the reproducer framework as well as the GDB remote provider.<br class="">

<br class="">

It records the GDB packets and writes them to a YAML file (we can switch to a<br class="">

more performant encoding down the road). When invoking the LLDB driver and<br class="">

passing the reproducer directory to `--reproducer`, this file is read and a<br class="">

dummy server replies with the next packet from this file, without talking to<br class="">

the executable.<br class="">

<br class="">

It's still pretty rudimentary and only works if you enter the exact same<br class="">

commands (so the server receives the exact same requests from the client).<br class="">
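<br class="">
The replay behavior can be illustrated with a small Python sketch. It is not<br class="">
the code from the patch: the recording is an in-memory list rather than a YAML<br class="">
file, the class name `ReplayServer` is hypothetical, and the explicit check that<br class="">
each request matches the recorded one is an assumption added here to make a<br class="">
diverging session visible. The packet payloads are illustrative.<br class="">
<pre class="">
```python
class ReplayServer:
    """Hypothetical dummy server that answers from a recorded packet list."""

    def __init__(self, recorded):
        # Each entry is a (request, reply) pair in the order it was captured.
        self.recorded = list(recorded)
        self.position = 0

    def handle(self, request):
        if self.position == len(self.recorded):
            raise RuntimeError("no recorded packets left")
        expected, reply = self.recorded[self.position]
        if request != expected:
            # The client diverged from the recorded session.
            raise RuntimeError(
                "unexpected request %r, recorded %r" % (request, expected)
            )
        self.position += 1
        return reply


server = ReplayServer([("qSupported", "PacketSize=1000"), ("?", "S05")])
first_reply = server.handle("qSupported")

try:
    server.handle("g")  # not the request that was recorded next
    diverged = False
except RuntimeError:
    diverged = True
```
</pre>
Because replies are handed out strictly in recorded order, any difference in<br class="">
the commands entered by the user leads to a mismatch right away.<br class="">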

<br class="">

The next steps are (in broad strokes):<br class="">

<br class="">

1.  Capturing the debugged binary.<br class="">

2.  Recording and replaying user commands and SB-API calls.<br class="">

3.  Recording the configuration of the debugger.<br class="">

4.  Capturing other files used by LLDB.<br class="">

<br class="">

Please let me know what you think!<br class="">

<br class="">

Thanks,<br class="">

Jonas <br class="">


</blockquote></div><br class=""></div>

</div></blockquote></div><br class=""></body></html>