[lldb-dev] Module Cache improvements - RFC

Zachary Turner via lldb-dev lldb-dev at lists.llvm.org
Mon Feb 22 17:27:59 PST 2016


Can't you just cache the modules locally on disk, so that you only take
that 26 second hit the first time you try to download a module, indexing
each module by some sort of hash? Then, instead of just downloading it,
you check the local cache first and only download if it's not there.
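
Roughly this sort of lookup, sketched with made-up names (GetCachedModule
and the cache layout are hypothetical, untested):

    // The module's hash is the on-disk key: cache hit -> return the local
    // path immediately, miss -> download once and reuse ever after.
    #include <filesystem>
    #include <functional>
    #include <string>

    namespace fs = std::filesystem;

    fs::path GetCachedModule(
        const fs::path &cache_dir, const std::string &hash,
        const std::function<void(const fs::path &)> &download) {
      fs::path local = cache_dir / hash;
      if (!fs::exists(local)) { // pay the transfer cost only once
        fs::create_directories(cache_dir);
        download(local); // e.g. fetch over gdb-remote into `local`
      }
      return local; // later sessions hit this path straight away
    }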

If you already do all this, then disregard.

On Mon, Feb 22, 2016 at 4:39 PM Greg Clayton via lldb-dev <
lldb-dev at lists.llvm.org> wrote:

>
> > On Jan 28, 2016, at 4:21 AM, Pavel Labath <labath at google.com> wrote:
> >
> > Hello all,
> >
> > we are running into limitations of the current module download/caching
> > system. A simple Android application can link to about 46 megabytes
> > worth of modules, and downloading that with our current transfer rates
> > takes about 25 seconds. Much of the data we download this way is never
> > actually accessed, and yet we download everything immediately upon
> > starting the debug session, which makes the first session extremely
> > laggy.
> >
> > We could speed up a lot by only downloading the portions of the module
> > that we really need (in my case this turns out to be about 8
> > megabytes). Also, further speedups could be made by increasing the
> > throughput of the gdb-remote protocol used for downloading these files
> > by using pipelining.
> >
> > I made a proof-of-concept hack of these things, put it into lldb, and
> > was able to get the time for the startup-attach-detach-exit cycle down
> > to 5.4 seconds (for comparison, the current time for the cycle is
> > about 3.6 seconds with a hot module cache, and 28(!) seconds with an
> > empty cache).
> >
> > Now, I would like to implement these things properly in lldb, so this
> > is a request for comments on my plan. What I would like to do is
> > (rough sketches of each point follow the list):
> > - Replace ModuleCache with a SectionCache (actually, more like a cache
> > of arbitrary file chunks). When the cache gets a request for a file
> > that is not already in the cache, it returns a special kind of Module
> > whose fragments are downloaded as we try to access them. These
> > fragments will be cached on disk, so that subsequent requests for the
> > file do not need to re-download them. We can also have the option to
> > short-circuit this logic and download the whole file immediately
> > (e.g., when the file is small, or we have a super-fast way of
> > obtaining the whole file via rsync, etc.).
> > - Add pipelining support to GDBRemoteCommunicationClient for
> > communicating with the platform. This actually does not require any
> > changes to the wire protocol. The only change is adding the ability
> > to send an additional request to the server while waiting for the
> > response to the previous one. Since the protocol is request-response
> > based and we are communicating over a reliable transport stream, each
> > response can be correctly matched to a request even though we have
> > multiple packets in flight. Any packets which need to maintain more
> > complex state (like downloading a single entity using continuation
> > packets) can still lock the stream to get exclusive access, but I am
> > not sure we even have any such packets in the platform flavour of the
> > protocol.
> > - Parallelize the downloading of multiple files, utilizing request
> > pipelining. Currently we get the biggest delay when first attaching
> > to a process (we download file headers and some basic informative
> > sections) and when we try to set the first symbol-level breakpoint
> > (we download symbol tables and string sections). Both of these
> > actions operate on all modules in bulk, which makes them easy
> > parallelization targets. This will provide a big speed boost, as we
> > will be eliminating communication latency. Furthermore, in the case
> > of many files, we will be overlapping file download (I/O) with
> > parsing (CPU), for an even bigger boost.
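> >
> > A rough sketch of the chunk-level caching (hypothetical interface and
> > names, not lldb's actual API; the on-disk persistence is elided):
> >
> >     // Chunks of a remote file are fetched on first access and kept, so
> >     // repeated reads (and later sessions) do not re-download them.
> >     #include <cstdint>
> >     #include <map>
> >     #include <string>
> >     #include <tuple>
> >     #include <vector>
> >
> >     struct ChunkKey {
> >       uint64_t offset;
> >       uint64_t size;
> >       bool operator<(const ChunkKey &rhs) const {
> >         return std::tie(offset, size) < std::tie(rhs.offset, rhs.size);
> >       }
> >     };
> >
> >     class SectionCache {
> >     public:
> >       // Return the requested byte range of the file identified by
> >       // `file_hash`, downloading it only if we have not seen it before.
> >       std::vector<uint8_t> GetChunk(const std::string &file_hash,
> >                                     uint64_t offset, uint64_t size) {
> >         auto &chunks = m_files[file_hash];
> >         auto it = chunks.find({offset, size});
> >         if (it == chunks.end()) {
> >           std::vector<uint8_t> data = DownloadRange(file_hash, offset, size);
> >           // A real implementation would also write `data` to the on-disk
> >           // cache here, so it survives LLDB restarts.
> >           it = chunks.emplace(ChunkKey{offset, size}, std::move(data)).first;
> >         }
> >         return it->second;
> >       }
> >
> >     private:
> >       std::vector<uint8_t> DownloadRange(const std::string &, uint64_t,
> >                                          uint64_t size) {
> >         // Placeholder: this would issue a vFile:pread over the platform
> >         // connection for just the missing range.
> >         return std::vector<uint8_t>(size, 0);
> >       }
> >       std::map<std::string, std::map<ChunkKey, std::vector<uint8_t>>> m_files;
> >     };
> >
> > The special Module would route its section reads through GetChunk, and
> > the whole-file short-circuit is then just a single request for the
> > full file range.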
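> >
> > A minimal sketch of the pipelining idea (hypothetical transport
> > interface; the real change would live in GDBRemoteCommunicationClient):
> >
> >     // Because the platform protocol is strictly request->response and
> >     // runs over a reliable, ordered stream, the k-th response read off
> >     // the wire always belongs to the k-th request sent.
> >     #include <string>
> >
> >     struct Transport {
> >       virtual ~Transport() = default;
> >       virtual void SendPacket(const std::string &payload) = 0;
> >       virtual std::string ReadPacket() = 0; // blocks for one full packet
> >     };
> >
> >     class PipelinedClient {
> >     public:
> >       explicit PipelinedClient(Transport &t) : m_transport(t) {}
> >
> >       // Fire off a request without waiting for its reply.
> >       void Send(const std::string &request) {
> >         m_transport.SendPacket(request);
> >         ++m_in_flight;
> >       }
> >
> >       // Collect the oldest outstanding reply; call once per Send().
> >       std::string Receive() {
> >         std::string response = m_transport.ReadPacket();
> >         --m_in_flight;
> >         return response;
> >       }
> >
> >       unsigned InFlight() const { return m_in_flight; }
> >
> >     private:
> >       Transport &m_transport;
> >       unsigned m_in_flight = 0;
> >     };
> >
> > Sending several requests back-to-back and then reading the replies in
> > order means the round-trip latencies overlap instead of adding up.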
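> >
> > And a sketch of the bulk parallelization (DownloadSymbolSections and
> > ParseSymbols are made-up stand-ins for the real per-module work):
> >
> >     // Issue every download up front, then parse each module as soon as
> >     // its bytes arrive, overlapping network latency (I/O) with symbol
> >     // parsing (CPU).
> >     #include <cstdint>
> >     #include <future>
> >     #include <string>
> >     #include <vector>
> >
> >     std::vector<uint8_t> DownloadSymbolSections(const std::string &module) {
> >       return {}; // placeholder: pipelined range requests for `module`
> >     }
> >
> >     void ParseSymbols(const std::string &module,
> >                       const std::vector<uint8_t> &bytes) {
> >       // placeholder: build the symbol table of `module` from `bytes`
> >     }
> >
> >     void PrefetchAll(const std::vector<std::string> &modules) {
> >       std::vector<std::future<std::vector<uint8_t>>> pending;
> >       pending.reserve(modules.size());
> >       for (const auto &m : modules) // start every download immediately
> >         pending.push_back(
> >             std::async(std::launch::async, DownloadSymbolSections, m));
> >       for (size_t i = 0; i < modules.size(); ++i)
> >         ParseSymbols(modules[i], pending[i].get()); // parse as they land
> >     }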
> >
> > What do you think?
> >
>
> Feel free to implement this in PlatformAndroid and allow others to opt
> in to it. I wouldn't want this by default on any of the Apple platforms:
> with MachO we have our entire image mapped into memory, and we have
> other tricks for getting the information more quickly.
>
> So I would leave the module cache there and not change it, but feel free
> to add the section cache as needed. Maybe, if this goes really well and
> it can be used on arbitrary file types (MachO, ELF, COFF, etc.) and it
> just works seamlessly, we can expand who uses it.
>
> In Xcode, the first time we connect to a device we haven't seen, we take
> the time to download all of the system libraries. Why is the 28 seconds
> considered prohibitive for the first time you connect? The data stays
> cached even after you quit and restart LLDB or your IDE, right?
>
> Greg Clayton
>