[llvm-commits] Proposal/patch: Enable bitcode streaming

Derek Schuff dschuff at google.com
Tue Nov 15 17:26:02 PST 2011


There's a small difference, about 2.5%.

Here are timings on a 45MB bitcode file. Old:
time bin/llvm-dis -disable-output
/ulg/naclgit/native_client/toolchain/hg-build-newlib/llvm-sb-universal-srpc/Release+Asserts/bin/llc---linked.pre_opt.pexe

real    0m3.055s
user    0m2.750s
sys     0m0.290s

new:
time bin/llvm-dis -disable-output
/ulg/naclgit/native_client/toolchain/hg-build-newlib/llvm-sb-universal-srpc/Release+Asserts/bin/llc---linked.pre_opt.pexe

real    0m3.126s
user    0m2.860s
sys     0m0.260s



On Tue, Nov 15, 2011 at 4:37 PM, Chris Lattner <clattner at apple.com> wrote:

> Hi Derek,
>
> Have you measured the performance impact of this patch on the
> non-streaming case?  How much slower does "llvm-dis -disable-output foo.bc"
> go (with a release build)?
>
> -Chris
>
> On Nov 9, 2011, at 4:57 PM, Derek Schuff wrote:
>
> Hello all,
> The following is a proposal (and a prototype patch) to enable bitcode
> streaming. The overall goal is to be able to overlap bitcode
> reading/download with compilation, which is obviously useful for
> pnacl and renderscript but also potentially for any situation where the
> interface between the frontend and backend is something other than a file.
>
> In the current state of the world, at a high level, there are 2 things
> keeping this from happening. The first is that BitcodeReader construction
> takes a MemoryBuffer which it expects to be filled with bitcode, and inside
> BitcodeReader, the BitstreamCursor (which is the primary interface to the
> bitcode itself) gets pointers to the bitcode in memory, and does all of its
> magic with pointer arithmetic. The second issue is that in
> BitcodeReader::ParseModule (which is run right after the Module and
> BitcodeReader objects are created), the reader makes a pass over the entire
> bitcode file. This step does everything except read the function bodies,
> but it records the bit locations of each function for future
> materialization.
>
> High-level change description:
> This patch creates a class called BitcodeStream, which is a very simple
> interface with one method (GetBytes), which fetches the requested number of
> bytes from the stream, and writes them into the requested destination. This
> method may block the calling thread if there are not yet enough bytes
> available in the stream buffer (similarly to a stdin or socket read).
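
To make the interface above concrete, here is a minimal sketch. The exact GetBytes signature (pointer plus length, returning the byte count, with 0 meaning end of stream) is an assumption for illustration and may not match the patch. ChunkedMemoryStream is a hypothetical in-memory source that never blocks; it only imitates data arriving in small pieces.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical shape of the one-method interface described above.
class BitcodeStream {
public:
  virtual ~BitcodeStream() = default;
  // Copies up to `len` bytes into `buf` and returns how many were copied.
  // A real source (socket, pipe) may block here; 0 is assumed to mean
  // end of stream.
  virtual size_t GetBytes(unsigned char *buf, size_t len) = 0;
};

// Illustrative source: serves an in-memory buffer at most `maxChunk`
// bytes per call, the way a network read might return partial data.
class ChunkedMemoryStream : public BitcodeStream {
  std::vector<unsigned char> Data;
  size_t Pos = 0;
  size_t MaxChunk;

public:
  ChunkedMemoryStream(std::vector<unsigned char> data, size_t maxChunk)
      : Data(std::move(data)), MaxChunk(maxChunk) {
    assert(maxChunk > 0 && "chunk size must be positive");
  }

  size_t GetBytes(unsigned char *buf, size_t len) override {
    size_t n = std::min({len, MaxChunk, Data.size() - Pos});
    if (n != 0)
      std::memcpy(buf, Data.data() + Pos, n);
    Pos += n;
    return n;
  }
};
```

A caller that needs N bytes keeps calling GetBytes until it has them or gets 0 back.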
>
> The first issue above is addressed by introducing the BitstreamVector, an
> abstraction that wraps the bitcode in memory. Instead of using pointers,
> the BitstreamCursor uses indices and gets bitcode bytes by indexing (i.e.
> operator[] ) the BitstreamVector. When streaming is not used, the
> BitstreamVector itself keeps pointers to the start and end of the backing
> MemoryBuffer and the indexing operator is just a pointer dereference. For
> streaming use, the BitstreamVector has a BitcodeStream object. If a byte is
> requested that has not yet been fetched, it calls GetBytes to get more,
> until it has enough to return the requested byte.
> This model of allowing any byte to be requested and blocking the caller
> has the advantage that there is no structural/architectural change required
> at this lowest level, nor at the high level (A FunctionPassManager is used
> to iterate over all the functions and compile each one).
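
The two modes of the BitstreamVector can be sketched roughly as follows. This is not the code from the patch: the member names, the chunk size, the exception on end of stream, and the OneByteStream helper are all assumptions made for illustration. The point is only that operator[] is a plain dereference in the MemoryBuffer case and a fetch-until-available loop in the streaming case.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stream interface (signature assumed, 0 == end of stream).
struct BitcodeStream {
  virtual ~BitcodeStream() = default;
  virtual size_t GetBytes(unsigned char *buf, size_t len) = 0;
};

// Illustrative source that yields one byte per call, to force refetching.
struct OneByteStream : BitcodeStream {
  std::vector<unsigned char> Data;
  size_t Pos = 0;
  explicit OneByteStream(std::vector<unsigned char> d) : Data(std::move(d)) {}
  size_t GetBytes(unsigned char *buf, size_t len) override {
    if (len == 0 || Pos == Data.size())
      return 0;
    *buf = Data[Pos++];
    return 1;
  }
};

class BitstreamVector {
  const unsigned char *First = nullptr; // non-streaming: backing buffer
  const unsigned char *Last = nullptr;
  BitcodeStream *Stream = nullptr;      // streaming: byte source
  std::vector<unsigned char> Fetched;   // bytes pulled from the stream so far

public:
  BitstreamVector(const unsigned char *first, const unsigned char *last)
      : First(first), Last(last) {}
  explicit BitstreamVector(BitcodeStream &s) : Stream(&s) {}

  unsigned char operator[](size_t i) {
    if (!Stream) { // non-streaming: just a pointer dereference
      assert(i < static_cast<size_t>(Last - First));
      return First[i];
    }
    const size_t ChunkSize = 4096; // arbitrary request size for this sketch
    while (i >= Fetched.size()) {  // keep fetching until byte i is present
      size_t old = Fetched.size();
      Fetched.resize(old + ChunkSize);
      size_t got = Stream->GetBytes(Fetched.data() + old, ChunkSize);
      Fetched.resize(old + got);
      if (got == 0)
        throw std::out_of_range("index past end of bitcode stream");
    }
    return Fetched[i];
  }
};
```

Because the cursor only ever sees indices, the same reading code works in both modes; any blocking happens inside operator[].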
>
> The second issue is solved by 2 simple changes. The first is in
> ParseModule. Instead of a single pass over all the bitcode, ParseModule
> becomes resumable.  ParseModule will do its normal handling for top-level
> records, type table blocks, metadata, etc, but if streaming is in use, it
> will save its state and return as soon as a function subblock is
> encountered (rather than saving its location and skipping over it). Each
> subsequent time it is called, it bookmarks and skips one function block.
> Later, when a function needs to be materialized, if the function body has
> been seen already, then materialization is the same as before. Otherwise,
> Materialize will keep calling ParseModule (each time bookmarking and
> skipping one function body) until the requested function is found. The one
> other change required to make this work simply is that the bitcode writer
> writes function bodies as the last subblock (currently the attachment
> metadata and value symbol table are written after the function bodies).
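
A toy model of the resumable ParseModule / Materialize interaction is below. It is a simplified sketch, not the patch: Block, ToyReader, and their members are hypothetical, "blocks" stand in for bitcode subblocks, and every call (including the first) bookmarks and skips one function block, whereas the description above has the first call return as soon as it reaches the first function block.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for one subblock of the module.
struct Block {
  bool IsFunction;
  std::string Name;
};

class ToyReader {
  std::vector<Block> Blocks;               // stands in for the bitstream
  size_t Next = 0;                         // saved state: next unread block
  std::map<std::string, size_t> Bookmarks; // function name -> location

public:
  explicit ToyReader(std::vector<Block> b) : Blocks(std::move(b)) {}

  // Resumable parse: handles module-level blocks as usual, then bookmarks
  // and skips one function block and returns. Returns false once the
  // input is exhausted.
  bool ParseModule() {
    while (Next < Blocks.size()) {
      const Block &B = Blocks[Next];
      if (B.IsFunction) {
        Bookmarks[B.Name] = Next++; // remember where the body lives
        return true;                // state (Next) is kept for the next call
      }
      ++Next; // types, metadata, etc. would be parsed here
    }
    return false;
  }

  // Returns the bookmarked location of `Name`, reading further into the
  // stream only if that function has not been seen yet; -1 if absent.
  long Materialize(const std::string &Name) {
    while (Bookmarks.count(Name) == 0)
      if (!ParseModule())
        return -1;
    return static_cast<long>(Bookmarks[Name]);
  }

  size_t BlocksConsumed() const { return Next; }
};
```

Materializing a function that was already bookmarked consumes no further input, which is what lets compilation of early functions overlap with download of later ones.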
>
> The prototype patch is attached and can also be viewed online at
> http://codereview.chromium.org/8393017/ . Feedback is welcome, as well as
> guidance from the relevant code owners/reviewers regarding what the next
> step needs to be toward committing this.
>
> Thanks,
> -Derek
> <bitcode_streaming_r144247.diff>