[llvm-commits] Proposal/patch: Enable bitcode streaming

Derek Schuff dschuff at google.com
Tue Nov 15 14:25:31 PST 2011


Thanks. Hopefully someone will soon.

I need to add one amendment to the above description:
Since the container for bitcode doesn't really act all that much like a
vector, I renamed it to BitstreamBytes, and the way to get the bitcode
bytes is just a getByte/getWord method rather than operator[]. Otherwise
the description still applies.
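
To make the renamed interface concrete, here is a rough sketch of what the memory-backed case might look like. Only the names BitstreamBytes, getByte and getWord are from the patch description; the constructor, size() and the member names are assumptions for illustration, and getWord assumes the 32-bit little-endian words that the bitstream format uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: only the names BitstreamBytes, getByte and getWord
// come from the message; everything else is assumed for illustration.
class BitstreamBytes {
public:
  BitstreamBytes(const unsigned char *Start, const unsigned char *End)
      : FirstChar(Start), LastChar(End) {}

  // Number of bytes in the backing buffer.
  size_t size() const { return static_cast<size_t>(LastChar - FirstChar); }

  // Return the byte at offset Pos. With a memory buffer this is just a
  // bounds-checked pointer dereference.
  unsigned char getByte(size_t Pos) const {
    assert(Pos < size() && "byte offset out of range");
    return FirstChar[Pos];
  }

  // Return the 32-bit little-endian word starting at byte offset Pos.
  uint32_t getWord(size_t Pos) const {
    assert(Pos + 4 <= size() && "word read past end of buffer");
    return static_cast<uint32_t>(FirstChar[Pos]) |
           (static_cast<uint32_t>(FirstChar[Pos + 1]) << 8) |
           (static_cast<uint32_t>(FirstChar[Pos + 2]) << 16) |
           (static_cast<uint32_t>(FirstChar[Pos + 3]) << 24);
  }

private:
  const unsigned char *FirstChar;
  const unsigned char *LastChar;
};
```

A streaming-backed variant would keep the same two accessors and fetch more bytes before answering, so the cursor code would not need to know which backing is in use.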



On Tue, Nov 15, 2011 at 12:41 PM, Eli Friedman <eli.friedman at gmail.com> wrote:

> On Mon, Nov 14, 2011 at 10:54 AM, Derek Schuff <dschuff at google.com> wrote:
> > Ping?
> > Could someone take a look at this? (or, have I sent it to the wrong
> > place/not marked it properly?)
>
> This is the right place; not sure why nobody has responded.
>
> -Eli
>
> > On Wed, Nov 9, 2011 at 4:57 PM, Derek Schuff <dschuff at google.com> wrote:
> >>
> >> Hello all,
> >> The following is a proposal (and a prototype patch) to enable bitcode
> >> streaming. The overall goal is to overlap bitcode reading/download with
> >> compilation, which is obviously useful for PNaCl and RenderScript but
> >> also potentially for any situation where the interface between the
> >> frontend and backend is something other than a file.
> >> In the current state of the world, at a high level, there are two things
> >> keeping this from happening. The first is that BitcodeReader construction
> >> takes a MemoryBuffer which it expects to be filled with bitcode, and
> >> inside BitcodeReader, the BitstreamCursor (which is the primary interface
> >> to the bitcode itself) gets pointers to the bitcode in memory and does
> >> all of its magic with pointer arithmetic. The second issue is that in
> >> BitcodeReader::ParseModule (which is run right after the Module and
> >> BitcodeReader objects are created), the reader makes a pass over the
> >> entire bitcode file. This step does everything except read the function
> >> bodies, but it records the bit location of each function for future
> >> materialization.
> >> High-level change description:
> >> This patch creates a class called BitcodeStream, which is a very simple
> >> interface with one method (GetBytes), which fetches the requested number
> >> of bytes from the stream and writes them into the requested destination.
> >> This method may block the calling thread if there are not yet enough
> >> bytes available in the stream buffer (similar to a stdin or socket read).
> >> The first issue above is addressed by introducing the BitstreamVector, an
> >> abstraction that wraps the bitcode in memory. Instead of using pointers,
> >> the BitstreamCursor uses indices and gets bitcode bytes by indexing (i.e.
> >> operator[]) the BitstreamVector. When streaming is not used, the
> >> BitstreamVector itself keeps pointers to the start and end of the backing
> >> MemoryBuffer and the indexing operator is just a pointer dereference. For
> >> streaming use, the BitstreamVector has a BitcodeStream object. If a byte
> >> is requested that has not yet been fetched, it calls GetBytes to get
> >> more, until it has enough to return the requested byte.
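
The fetch-on-demand behaviour for the streaming case could look roughly like the following. This is a sketch under assumptions, not the code in the patch: the GetBytes signature, the 4096-byte chunk size, the use of an exception at end of stream, and the ArrayBitcodeStream toy source are all made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Assumed one-method stream interface (the signature is a guess).
struct BitcodeStream {
  virtual ~BitcodeStream() {}
  // Writes up to Len bytes to Buf, blocking if needed; returns 0 at end.
  virtual size_t GetBytes(unsigned char *Buf, size_t Len) = 0;
};

// Toy stream for illustration: hands out at most Step bytes per call,
// imitating data that arrives in pieces.
class ArrayBitcodeStream : public BitcodeStream {
public:
  ArrayBitcodeStream(const std::vector<unsigned char> &Data, size_t Step)
      : Data(Data), Step(Step), Pos(0) {}

  size_t GetBytes(unsigned char *Buf, size_t Len) override {
    size_t N = std::min(std::min(Len, Step), Data.size() - Pos);
    std::copy(Data.begin() + Pos, Data.begin() + Pos + N, Buf);
    Pos += N;
    return N;
  }

private:
  std::vector<unsigned char> Data;
  size_t Step, Pos;
};

// Hypothetical streaming-backed vector: bytes are fetched lazily and cached.
class BitstreamVector {
public:
  explicit BitstreamVector(BitcodeStream &S) : Stream(S) {}

  // Return byte Pos, pulling more data from the stream until it is cached.
  unsigned char operator[](size_t Pos) {
    while (Pos >= Bytes.size()) {
      unsigned char Chunk[4096];
      size_t N = Stream.GetBytes(Chunk, sizeof(Chunk));
      if (N == 0)
        throw std::out_of_range("requested byte is past end of stream");
      Bytes.insert(Bytes.end(), Chunk, Chunk + N);
    }
    return Bytes[Pos];
  }

  // How many bytes have been pulled from the stream so far.
  size_t bytesFetched() const { return Bytes.size(); }

private:
  BitcodeStream &Stream;
  std::vector<unsigned char> Bytes;
};
```

Because the caller simply blocks inside the indexing call, the cursor logic above this layer stays the same whether the bytes come from memory or from a stream.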
> >> This model of allowing any byte to be requested and blocking the caller
> >> has the advantage that there is no structural/architectural change
> >> required at this lowest level, nor at the high level (a
> >> FunctionPassManager is used to iterate over all the functions and compile
> >> each one).
> >> The second issue is solved by two simple changes. The first is in
> >> ParseModule. Instead of making a single pass over all the bitcode,
> >> ParseModule becomes resumable. ParseModule will do its normal handling
> >> for top-level records, type table blocks, metadata, etc., but if
> >> streaming is in use, it will save its state and return as soon as a
> >> function subblock is encountered (rather than saving its location and
> >> skipping over it). Each subsequent time it is called, it bookmarks and
> >> skips one function block. Later, when a function needs to be
> >> materialized, if the function body has been seen already, then
> >> materialization is the same as before. Otherwise, Materialize will keep
> >> calling ParseModule (each time bookmarking and skipping one function
> >> body) until the requested function is found. The one other change
> >> required to make this work simply is that the bitcode writer writes
> >> function bodies as the last subblock (currently the attachment metadata
> >> and value symbol table are written after the function bodies).
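
The resumable parse and the Materialize loop can be sketched with a toy model. Everything here (Block, ToyReader, blocksConsumed) is hypothetical and much simpler than the real BitcodeReader, and for simplicity it bookmarks one function block on every call, including the first. It is only meant to show the control flow: return after each function block, and resume only as far as the requested function.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a module's top-level blocks. All names here are
// hypothetical; real bitcode blocks are not represented this way.
struct Block {
  enum Kind { TypeTable, Metadata, Function } K;
  std::string Name; // Only meaningful for Function blocks.
};

class ToyReader {
public:
  explicit ToyReader(const std::vector<Block> &Blocks)
      : Blocks(Blocks), Next(0) {}

  // Resumable parse: handle non-function blocks as usual, but bookmark and
  // skip a single function block, then return. Returns false once every
  // block has been consumed.
  bool ParseModule() {
    while (Next < Blocks.size()) {
      const Block &B = Blocks[Next];
      size_t Index = Next++;
      if (B.K == Block::Function) {
        Bookmarks[B.Name] = Index; // Remember where the body starts.
        return true;               // The saved state is just Next.
      }
      // Type tables, metadata, etc. would be parsed here.
    }
    return false;
  }

  // Find a function body, resuming the parse only as far as needed.
  // Returns true and sets Index if the function exists.
  bool Materialize(const std::string &Name, size_t &Index) {
    std::map<std::string, size_t>::const_iterator It;
    while ((It = Bookmarks.find(Name)) == Bookmarks.end())
      if (!ParseModule())
        return false; // Reached the end without seeing the function.
    Index = It->second;
    return true;
  }

  // Number of top-level blocks visited so far.
  size_t blocksConsumed() const { return Next; }

private:
  std::vector<Block> Blocks;
  size_t Next;
  std::map<std::string, size_t> Bookmarks;
};
```

In this model, materializing the second of three functions leaves the third block unvisited, which is the property that lets compilation start before the rest of the stream has arrived.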
> >> The prototype patch is attached and can also be viewed online at
> >> http://codereview.chromium.org/8393017/ . Feedback is welcome, as well as
> >> guidance from the relevant code owners/reviewers regarding what the next
> >> step needs to be toward committing this.
> >> Thanks,
> >> -Derek
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
> >
>