[lldb-dev] [RFC] LLDB Reproducers

Thu Sep 20 04:16:39 PDT 2018

For the first, I think 99% of the time the bug is not caused by the
sequence of gdb remote packets.  The sequence of gdb remote packets just
happens to be the means by which the debugger was put into the state in
which it failed.  If there is another, stable way of getting the debugger
into the same state this part is solvable.

The second issue you raised does seem like something that would require
human intervention to specify the expected state as part of a test, though.

On Wed, Sep 19, 2018 at 11:17 AM Jim Ingham <jingham at apple.com> wrote:

> There are a couple of problems with using these reproducers in the
> testsuite.
>
> The first is that we make no commitments that a future lldb will
> implement the "same" session with the same sequence of gdb-remote packet
> requests.  We often monkey around with lldb's sequences of requests to make
> things go faster.  So some future lldb will end up making a request that
> wasn't in the data from the reproducer, and at that point we won't really
> know what to do.  The Provider for gdb-remote packets should record the
> packets it receives - not just the answers it gives - so it can detect this
> error and not go off the rails.  But I'm pretty sure it isn't worth the
> effort to try to get lldb to maintain all the old sequences it used in the
> past in order to support keeping the reproducers alive.  But this does mean
> that this is an unreliable way to write tests.
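
A minimal sketch of the check described above, in Python for brevity rather than LLDB's C++. It assumes the recording stores each request together with its reply, so the replay side can compare every incoming request with the recorded one. `ReplayServer` and `ReplayDivergence` are hypothetical names, and the packets are illustrative, not taken from a real session.

```python
class ReplayDivergence(Exception):
    """Raised when the client asks something the recording does not contain."""


class ReplayServer:
    """Answers requests from a recorded list of (request, reply) pairs."""

    def __init__(self, recorded):
        self._recorded = list(recorded)
        self._next = 0

    def handle(self, request):
        # Running past the end of the recording is also a divergence.
        if self._next >= len(self._recorded):
            raise ReplayDivergence(f"unexpected extra request: {request!r}")
        expected, reply = self._recorded[self._next]
        # Compare against what was received at record time, not just the reply.
        if request != expected:
            raise ReplayDivergence(
                f"packet {self._next}: expected {expected!r}, got {request!r}"
            )
        self._next += 1
        return reply


# Illustrative packet contents only.
session = [("?", "S05"), ("qC", "QC1")]
server = ReplayServer(session)
```

Here `server.handle("?")` returns the recorded reply, while a request that was never recorded at that position (say `g`) raises `ReplayDivergence` instead of handing back a stale answer.
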
>
> The second is that the reproducers as described have no notion of
> "expected state".  They are meant to go along with a bug report where the
> "x was wrong" part is not contained in the reproducer.  That would be an
> interesting thing to think about adding, but I think the problem space here
> is complicated enough already...  You can't write a test if you don't know
> the correct end state.
>
> Jim
>
>
> > On Sep 19, 2018, at 10:59 AM, Zachary Turner via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> >
> > I assume that reproducing race conditions is out of scope?
> >
> > Also, will it be possible to incorporate these reproducers into the test
> > suite somehow?  It would be nice if we could create a tar file similar to a
> > linkrepro, check in the tar file, and then have a test where you don't have
> > to write any Python code, any Makefile, any source code, or anything else
> > for that matter.  It just enumerates all of these repro tar files in a
> > certain location and runs that test.
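
A rough sketch of what such a test driver could look like, assuming the reproducers are checked in as `.tar` files in one directory. `discover_reproducers`, `run_all` and the `replay` callback are hypothetical; `replay` stands in for whatever would actually launch lldb on an archive and report success.

```python
import tarfile
from pathlib import Path


def discover_reproducers(directory):
    """Return every .tar file in `directory`, sorted by name."""
    return sorted(Path(directory).glob("*.tar"))


def run_all(directory, replay):
    """Replay each archive and map its file name to a pass/fail result."""
    results = {}
    for archive in discover_reproducers(directory):
        # A file that is not a readable tar archive counts as a failure.
        if not tarfile.is_tarfile(str(archive)):
            results[archive.name] = False
            continue
        results[archive.name] = bool(replay(archive))
    return results
```

Adding a test would then amount to dropping another archive into the directory. This does not address the missing notion of expected state; `replay` would still have to decide what counts as a pass.
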
> >
> > On Wed, Sep 19, 2018 at 10:48 AM Leonard Mosescu via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> > Great, thanks. This means that the lldb-server issues are not in scope
> for this feature, right?
> >
> > On Wed, Sep 19, 2018 at 10:09 AM, Jonas Devlieghere <
> jdevlieghere at apple.com> wrote:
> >
> >
> >> On Sep 19, 2018, at 6:49 PM, Leonard Mosescu <mosescu at google.com>
> wrote:
> >>
> >> Sounds like a fantastic idea.
> >>
> >> How would this work when the behavior of the debuggee process is
> >> non-deterministic?
> >
> > All the communication between the debugger and the inferior goes through
> the
> > GDB remote protocol. Because we capture and replay this, we can reproduce
> > without running the executable, which is particularly convenient when
> you were
> > originally debugging something on a different device for example.
> >
> >>
> >> On Wed, Sep 19, 2018 at 6:50 AM, Jonas Devlieghere via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> >> Hi everyone,
> >>
> >> We all know how hard it can be to reproduce an issue or crash in LLDB.
> There
> >> are a lot of moving parts and subtle differences can easily add up. We
> want to
> >> make this easier by generating reproducers in LLDB, similar to what
> clang does
> >> today.
> >>
> >> The core idea is as follows: during normal operation we capture whatever
> >> information is needed to recreate the current state of the debugger.
> When
> >> something goes wrong, this becomes available to the user. Someone else
> should
> >> then be able to reproduce the same issue with only this data, for
> example on a
> >> different machine.
> >>
> >> It's important to note that we want to replay the debug session from the
> >> reproducer, rather than just recreating the current state. This ensures
> that we
> >> have access to all the events leading up to the problem, which are
> usually far
> >> more important than the error state itself.
> >>
> >> # High Level Design
> >>
> >> Concretely we want to extend LLDB in two ways:
> >>
> >> 1.  We need to add infrastructure to _generate_ the data necessary for
> >>     reproducing.
> >> 2.  We need to add infrastructure to _use_ the data in the reproducer
> to replay
> >>     the debugging session.
> >>
> >> Different parts of LLDB will have different definitions of what data
> they need
> >> to reproduce their path to the issue. For example, capturing the
> commands
> >> executed by the user is very different from tracking the dSYM bundles
> on disk.
> >> Therefore, we propose to have each component deal with its needs in a
> localized
> >> way. This has the advantage that the functionality can be developed and
> tested
> >> independently.
> >>
> >> ## Providers
> >>
> >> We'll call a combination of (1) and (2) for a given component a
> `Provider`. For
> >> example, we'd have a provider for user commands and a provider for
> dSYM files.
> >> A provider will know how to keep track of its information, how to
> serialize it
> >> as part of the reproducer as well as how to deserialize it again and
> use it to
> >> recreate the state of the debugger.
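
To make the provider idea concrete, here is a small sketch in Python (the real implementation would be C++ inside LLDB). The `Provider` interface, its method names and the JSON encoding are assumptions made for illustration and are not taken from the patch.

```python
import json
from abc import ABC, abstractmethod


class Provider(ABC):
    """One component's capture and replay logic (hypothetical interface)."""

    name = "provider"

    @abstractmethod
    def serialize(self) -> str:
        """Return the captured data as text for the reproducer."""

    @abstractmethod
    def deserialize(self, text: str) -> None:
        """Restore captured data from the reproducer."""


class CommandProvider(Provider):
    """Keeps user commands in memory, in the order they were entered."""

    name = "commands"

    def __init__(self):
        self.commands = []

    def record(self, command: str) -> None:
        self.commands.append(command)

    def serialize(self) -> str:
        return json.dumps(self.commands)

    def deserialize(self, text: str) -> None:
        self.commands = json.loads(text)


# Capture side: track what the user typed.
capture = CommandProvider()
capture.record("breakpoint set --name main")
capture.record("run")

# Replay side: a fresh provider rebuilt from the serialized form.
replay = CommandProvider()
replay.deserialize(capture.serialize())
```

After the round trip, `replay.commands` holds the same two commands in the same order, which is what a replay would feed back into the command interpreter.
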
> >>
> >> With one exception, the lifetime of the provider coincides with that of
> the
> >> `SBDebugger`, because that is the scope of what we consider here to be
> a single
> >> debug session. The exception would be the provider for the global
> module cache,
> >> because it is shared between multiple debuggers. Although it would be
> >> conceptually straightforward to add a provider for the shared module
> cache,
> >> this significantly increases the complexity of the reproducer framework
> because
> >> of its implication on the lifetime and everything related to that.
> >>
> >> For now we will ignore this problem which means we will not replay the
> >> construction of the shared module cache but rather build it up during
> >> replaying, as if the current debug session was the first and only one
> using it.
> >> The impact of doing so is significant, as no issue caused by the shared
> module
> >> cache will be reproducible, but it does not limit reproducing any issue
> unrelated
> >> to it.
> >>
> >> ## Reproducer Framework
> >>
> >> To coordinate between the data from different components, we'll need to
> >> introduce a global reproducer infrastructure. We have a component
> responsible
> >> for reproducer generation (the `Generator`) and for using the
> reproducer (the
> >> `Loader`). They are essentially two ways of looking at the same unit of
> >> replayable work.
> >>
> >> The Generator keeps track of its providers and whether or not we need to
> >> generate a reproducer. When a problem occurs, LLDB will request the
> Generator
> >> to generate a reproducer. When LLDB finishes successfully, the
> Generator cleans
> >> up anything it might have created during the session. Additionally, the
> >> Generator populates an index, which is part of the reproducer, and used
> by the
> >> Loader to discover what information is available.
> >>
> >> When a reproducer is passed to LLDB, we want to use its data to replay
> the
> >> debug session. This is coordinated by the Loader. Through the index
> created by
> >> the Generator, different components know what data (Providers) are
> available,
> >> and how to use them.
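
The Generator/Loader split can be sketched as follows, again in Python and only as an illustration. The class and method names, the `index.json` file and the one-file-per-provider layout are assumptions, not the format used by the patch.

```python
import json
import shutil
import tempfile
from pathlib import Path


class Generator:
    """Collects provider data in a scratch directory during the session."""

    def __init__(self):
        self.root = Path(tempfile.mkdtemp(prefix="reproducer-"))
        self._files = {}

    def add(self, provider_name, data):
        filename = f"{provider_name}.txt"
        (self.root / filename).write_text(data)
        self._files[provider_name] = filename

    def keep(self):
        """A problem occurred: write the index and hand back the directory."""
        (self.root / "index.json").write_text(json.dumps(self._files))
        return self.root

    def discard(self):
        """The session ended cleanly: remove everything that was created."""
        shutil.rmtree(self.root, ignore_errors=True)


class Loader:
    """Uses the index to find out which provider data is available."""

    def __init__(self, root):
        self.root = Path(root)
        self._files = json.loads((self.root / "index.json").read_text())

    def providers(self):
        return sorted(self._files)

    def load(self, provider_name):
        return (self.root / self._files[provider_name]).read_text()
```

The index is the only contract between the two sides: the Loader never guesses file names, it only reads what the Generator listed.
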
> >>
> >> It's important to note that in order to create a complete reproducer,
> we will
> >> require data from our dependencies (llvm, clang, swift) as well. This
> means
> >> that either (a) the infrastructure needs to be accessible from our
> dependencies
> >> or (b) that an API is provided that allows us to query this. We plan to
> address
> >> this issue when it arises for the respective Generator.
> >>
> >> # Components
> >>
> >> We have identified a list of minimal components needed to make
> reproducing
> >> possible. We've divided those into two groups: explicit and implicit
> inputs.
> >>
> >> Explicit inputs are inputs from the user to the debugger.
> >>
> >> -   Command line arguments
> >> -   Settings
> >> -   User commands
> >> -   Scripting Bridge API
> >>
> >> In addition to the components listed above, LLDB has a bunch of inputs
> that are
> >> not passed explicitly. It's often these that make reproducing an issue
> complex.
> >>
> >> -   GDB Remote Packets
> >> -   Files containing debug information (object files, dSYM bundles)
> >> -   Clang headers
> >> -   Swift modules
> >>
> >> Every component would have its own provider and is free to implement it
> as it
> >> sees fit. For example, as we expect to have a large number of GDB remote
> >> packets, the provider might choose to write these to disk as they come
> in,
> >> while the settings can easily be kept in memory until it is decided
> that we
> >> need to generate a reproducer.
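
The two strategies mentioned above could look like this. It is a sketch with hypothetical class names and a made-up line-based file format: one provider streams to disk as data arrives, the other buffers in memory and writes only on request.

```python
from pathlib import Path


class PacketProvider:
    """Appends each packet to a file as soon as it is seen."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def record(self, packet):
        self._file.write(packet + "\n")
        # Flush so the data survives if the debugger crashes right after.
        self._file.flush()

    def close(self):
        self._file.close()


class SettingsProvider:
    """Keeps settings in memory; writes only if a reproducer is requested."""

    def __init__(self):
        self._settings = {}

    def record(self, key, value):
        self._settings[key] = value

    def write(self, path):
        lines = [f"{k}={v}" for k, v in sorted(self._settings.items())]
        Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Streaming keeps memory use flat for a long session at the cost of I/O on every packet; buffering costs nothing on disk until a reproducer is actually generated.
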
> >>
> >> # Concerns, Implications & Risks
> >>
> >> ## Performance Impact
> >>
> >> As the reproducer functionality will have to be always-on, we have to
> consider
> >> performance implications. As mentioned earlier, the provider gives the
> freedom
> >> to be implemented in such a way that works best for its respective
> component.
> >> We'll have to measure to know how big the impact is.
> >>
> >> ## Privacy
> >>
> >> The reproducer might contain sensitive user information. We should make
> it
> >> clear to the user what kind of data is contained in the reproducer.
> Initially
> >> we will focus on the LLDB developer community and the people already
> filing
> >> bugs.
> >>
> >> ## Versions
> >>
> >> Because the reproducer works by replaying a debug session, the versions
> of the
> >> debugger generating and replaying the session will have to match. Not
> only is
> >> this important for the serialization format, but more importantly a
> different
> >> LLDB might ask different questions in a different order.
> >>
> >> # Implementation
> >>
> >> I've put up a patch (<https://reviews.llvm.org/D50254>) which contains
> a minimal
> >> implementation of the reproducer framework as well as the GDB remote
> provider.
> >>
> >> It records the GDB packets and writes them to a YAML file (we can
> switch to a
> >> more performant encoding down the road). When invoking the LLDB driver
> and
> >> passing the reproducer directory to `--reproducer`, this file is read
> and a
> >> dummy server replies with the next packet from this file, without
> talking to
> >> the executable.
> >>
> >> It's still pretty rudimentary and only works if you enter the exact same
> >> commands (so the server receives the exact same requests from the
> client).
> >>
> >> The next steps are (in broad strokes):
> >>
> >> 1.  Capturing the debugged binary.
> >> 2.  Record and replay user commands and SB-API calls.
> >> 3.  Recording the configuration of the debugger.
> >> 4.  Capturing other files used by LLDB.
> >>
> >> Please let me know what you think!
> >>
> >> Thanks,
> >> Jonas
> >> _______________________________________________
> >> lldb-dev mailing list
> >> lldb-dev at lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> >>
> >
> >
>
>