[llvm-commits] Proposal/patch: Enable bitcode streaming

Derek Schuff dschuff at google.com
Wed Nov 9 16:57:38 PST 2011


Hello all,
The following is a proposal (and a prototype patch) to enable bitcode
streaming. The overall goal is to overlap bitcode reading/download with
compilation. This is clearly useful for pnacl and renderscript, but also
potentially for any situation where the interface between the frontend and
backend is something other than a file.

In the current state of the world, at a high level, there are 2 things
keeping this from happening. The first is that BitcodeReader construction
takes a MemoryBuffer which it expects to be filled with bitcode, and inside
BitcodeReader, the BitstreamCursor (which is the primary interface to the
bitcode itself) gets pointers to the bitcode in memory, and does all of its
magic with pointer arithmetic. The second issue is that in
BitcodeReader::ParseModule (which is run right after the Module and
BitcodeReader objects are created), the reader makes a pass over the entire
bitcode file. This step does everything except read the function bodies,
but it records the bit locations of each function for future
materialization.

High-level change description:
This patch creates a class called BitcodeStream, which is a very simple
interface with one method (GetBytes), which fetches the requested number of
bytes from the stream, and writes them into the requested destination. This
method may block the calling thread if there are not yet enough bytes
available in the stream buffer (similarly to a stdin or socket read).
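
To make the interface concrete, here is a minimal sketch of what such a
one-method class could look like. The exact signature (the argument order,
and the return value giving the number of bytes written, with 0 meaning end
of stream) is an assumption for illustration, not necessarily what the patch
uses. MemoryBitcodeStream is a hypothetical test source standing in for a
download or pipe; it never blocks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sketch of the one-method interface described above.
class BitcodeStream {
public:
  virtual ~BitcodeStream() {}
  // Copies up to `len` bytes into `buf` and returns how many were written.
  // A real source may block here until data arrives; returning 0 is
  // assumed to mean end of stream.
  virtual size_t GetBytes(unsigned char *buf, size_t len) = 0;
};

// Illustrative in-memory source (an assumption, not part of the proposal).
class MemoryBitcodeStream : public BitcodeStream {
  std::vector<unsigned char> Data;
  size_t Pos;

public:
  explicit MemoryBitcodeStream(const std::vector<unsigned char> &D)
      : Data(D), Pos(0) {}

  size_t GetBytes(unsigned char *buf, size_t len) override {
    assert(buf != nullptr || len == 0);
    // Hand out whatever is left, capped at the requested length.
    size_t n = std::min(len, Data.size() - Pos);
    if (n)
      std::memcpy(buf, Data.data() + Pos, n);
    Pos += n;
    return n;
  }
};
```

A socket- or pipe-backed implementation would have the same shape, with the
blocking read hidden inside GetBytes.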

The first issue above is addressed by introducing the BitstreamVector, an
abstraction that wraps the bitcode in memory. Instead of using pointers,
the BitstreamCursor uses indices and gets bitcode bytes by indexing (i.e.
operator[] ) the BitstreamVector. When streaming is not used, the
BitstreamVector itself keeps pointers to the start and end of the backing
MemoryBuffer and the indexing operator is just a pointer dereference. For
streaming use, the BitstreamVector has a BitcodeStream object. If a byte is
requested that has not yet been fetched, it calls GetBytes to get more,
until it has enough to return the requested byte.
This model of allowing any byte to be requested and blocking the caller has
the advantage that no structural/architectural change is required at
this lowest level, nor at the high level (a FunctionPassManager is used to
iterate over all the functions and compile each one).
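
The two modes of the BitstreamVector can be sketched roughly as follows.
This is a simplified illustration under stated assumptions, not the code in
the patch: FetchFn is a hypothetical callback standing in for
BitcodeStream::GetBytes (returning the number of bytes written, 0 at end of
stream), the 16-byte chunk size is arbitrary, and bytesFetched() exists only
to make the lazy behavior observable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Stand-in for BitcodeStream::GetBytes (assumed convention: fills buf with
// up to len bytes, returns the count written, 0 at end of stream).
typedef std::function<size_t(unsigned char *, size_t)> FetchFn;

class BitstreamVector {
  const unsigned char *First;          // non-streaming: start of the buffer
  const unsigned char *Last;           // non-streaming: one past the end
  FetchFn Fetch;                       // streaming: source of more bytes
  std::vector<unsigned char> Fetched;  // streaming: bytes received so far

public:
  // Non-streaming mode: wrap an existing in-memory buffer.
  BitstreamVector(const unsigned char *Start, const unsigned char *End)
      : First(Start), Last(End) {}

  // Streaming mode: bytes are pulled on demand.
  explicit BitstreamVector(FetchFn F)
      : First(nullptr), Last(nullptr), Fetch(F) {}

  unsigned char operator[](size_t Pos) {
    if (First) {
      // Non-streaming: indexing is just a pointer dereference.
      assert(Pos < static_cast<size_t>(Last - First));
      return First[Pos];
    }
    // Streaming: keep fetching until the requested byte has arrived.
    while (Pos >= Fetched.size()) {
      unsigned char Chunk[16];
      size_t N = Fetch(Chunk, sizeof(Chunk));
      if (N == 0)
        throw std::out_of_range("index past end of bitcode stream");
      Fetched.insert(Fetched.end(), Chunk, Chunk + N);
    }
    return Fetched[Pos];
  }

  size_t bytesFetched() const { return Fetched.size(); }
};
```

Because the cursor only ever sees operator[], it does not need to know which
of the two modes is in use.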

The second issue is solved by 2 simple changes. The first is in
ParseModule. Instead of a single pass over all the bitcode, ParseModule
becomes resumable.  ParseModule will do its normal handling for top-level
records, type table blocks, metadata, etc, but if streaming is in use, it
will save its state and return as soon as a function subblock is
encountered (rather than saving its location and skipping over it). Each
subsequent time it is called, it bookmarks and skips one function block.
Later, when a function needs to be materialized, if the function body has
been seen already, then materialization is the same as before. Otherwise,
Materialize will keep calling ParseModule (each time bookmarking and
skipping one function body) until the requested function is found. The one
other change needed to keep this simple is in the bitcode writer, which must
write function bodies as the last subblocks (currently the attachment
metadata and value symbol table are written after the function bodies).
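
The control flow of the resumable ParseModule and the Materialize loop can
be modeled with a toy reader like the one below. Everything in it is a
hypothetical stand-in: Block, ToyReader, the bool return values and
functionsSeen() are made up for illustration, and a vector of block
descriptors replaces the real bitstream. It also assumes, as proposed above,
that function blocks come last in the module.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy description of a top-level subblock (not LLVM's real data model).
struct Block {
  bool IsFunction;
  std::string Name;
  size_t BitOffset; // where the block starts in the stream
};

class ToyReader {
  std::vector<Block> Blocks;               // what the stream yields, in order
  size_t Next;                             // saved state: next block to look at
  bool SeenFirstFunction;                  // has the first call completed?
  std::map<std::string, size_t> Bookmarks; // function name -> body offset

public:
  explicit ToyReader(const std::vector<Block> &B)
      : Blocks(B), Next(0), SeenFirstFunction(false) {}

  // First call: handle everything up to the first function block, then
  // return. Each later call bookmarks and skips exactly one function block.
  // Returns false once there is nothing left to do.
  bool ParseModule() {
    if (!SeenFirstFunction) {
      while (Next < Blocks.size() && !Blocks[Next].IsFunction)
        ++Next; // top-level records, type tables, metadata, etc.
      SeenFirstFunction = true;
      return Next < Blocks.size();
    }
    if (Next >= Blocks.size())
      return false;
    assert(Blocks[Next].IsFunction && "function bodies are assumed last");
    Bookmarks[Blocks[Next].Name] = Blocks[Next].BitOffset;
    ++Next;
    return true;
  }

  // Finds the bookmark for Name, resuming ParseModule only as far as needed.
  bool Materialize(const std::string &Name, size_t &Offset) {
    std::map<std::string, size_t>::iterator It;
    while ((It = Bookmarks.find(Name)) == Bookmarks.end())
      if (!ParseModule())
        return false; // stream exhausted without seeing Name
    Offset = It->second;
    return true;
  }

  size_t functionsSeen() const { return Bookmarks.size(); }
};
```

In this model, materializing an early function touches only the blocks up to
and including its body, so in the streaming case compilation can start before
later function bodies have arrived.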

The prototype patch is attached and can also be viewed online at
http://codereview.chromium.org/8393017/ . Feedback is welcome, as well as
guidance from the relevant code owners/reviewers regarding what the next
step needs to be toward committing this.

Thanks,
-Derek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bitcode_streaming_r144247.diff
Type: text/x-patch
Size: 40099 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111109/ec901d02/attachment.bin>

