[llvm-commits] Proposal/patch: Enable bitcode streaming
Nick Lewycky
nicholas at mxc.ca
Fri Feb 3 23:33:58 PST 2012
Derek Schuff wrote:
> Thanks,
> I'll rebase the patch and run it through a final round of tests.
> 1) By "commit it" I assume you mean all the combined code?
Yep!
> 2) I'm not actually a committer yet. Can you fix that (or commit for me
> if I send you the updated patch)
I'll commit it for you. Once you've had a number of smaller patches go
through you can apply for commit access per
http://llvm.org/docs/DeveloperPolicy.html#commitaccess
Nick
>
> -Derek
>
> On Wed, Feb 1, 2012 at 4:32 PM, Nick Lewycky <nlewycky at google.com
> <mailto:nlewycky at google.com>> wrote:
>
> On 26 January 2012 11:19, Derek Schuff <dschuff at google.com
> <mailto:dschuff at google.com>> wrote:
>
> Thanks for the review. comments inline, and updated patches
> attached.
>
>
> Thanks.
>
> As for StreamableMemoryObject vs. StreamingMemoryObject; please just
> add pretty-much what you wrote in your email as a comment explaining
> the difference, and then commit it. It seems to me that once we have
> some experience with these APIs, we'll know what the right factoring
> of this code really ought to be.
>
> Nick
>
> On Wed, Jan 25, 2012 at 2:04 PM, Nick Lewycky
> <nlewycky at google.com <mailto:nlewycky at google.com>> wrote:
>
> On 20 January 2012 11:55, Derek Schuff <dschuff at google.com
> <mailto:dschuff at google.com>> wrote:
>
> And finally, the StreamingMemoryObject implementation,
> modified BitcodeReader, and modifed llvm-dis.cpp using
> the streaming interface.
> Please take a look
>
>
> Overall this looks good. I'm especially happy with some of
> the refactoring inside BitcodeReader! Comments:
>
> --- a/include/llvm/Bitcode/ReaderWriter.h
> +++ b/include/llvm/Bitcode/ReaderWriter.h
> @@ -21,31 +21,41 @@ namespace llvm {
> class MemoryBuffer;
> class ModulePass;
> class BitstreamWriter;
> + class DataStreamer;
> class LLVMContext;
> class raw_ostream;
>
> I realize these were unsorted when you got here, but please
> alphabetize them.
>
> Done.
>
>
> + /// If 'verify' is true, check that the file fits in the
> buffer.
> + static inline bool SkipBitcodeWrapperHeader(const
> unsigned char *&BufPtr,
> + const
> unsigned char *&BufEnd,
> + bool Verify) {
>
> I didn't really understand the comment. I think what you're
> doing is disabling the check that the buffer contained the
> whole header. Could you make it "bool SkipPastEnd" instead?
>
> There 2 checks: the first checks that the buffer is large enough
> to contain the whole header. That always runs. The check that's
> conditionalized on 'verify' checks that the buffer is large
> enough to contain the whole bitcode. This doesn't work with the
> streaming implementation since it does not allocate a buffer
> that fits the whole file ahead of time. It also always skips
> past the end of the header if found (since this is how it
> returns the size of the buffer), so SkipPastEnd would be a bad
> name for the variable. I clarified the comment and renamed the
> variable VerifyBufferSize. (Also on this pass I found and
> deleted some trailing whitespace).
>
>
> +DataStreamer *getDataFileStreamer(const std::string &Filename,
> + std::string *Err);
>
> Please line up the argument to the (.
>
> Done.
>
>
> --- a/include/llvm/Support/StreamableMemoryObject.h
> +++ b/include/llvm/Support/StreamableMemoryObject.h
> @@ -12,6 +12,9 @@
> #define STREAMABLEMEMORYOBJECT_H_
> #include "llvm/Support/MemoryObject.h"
> +#include <vector>
> +#include "llvm/ADT/OwningPtr.h"
> +#include "llvm/Support/DataStream.h"
>
> ADT, Support, then <system> headers. See
> http://llvm.org/docs/CodingStandards.html#scf_includes .
>
> +/// StreamingMemoryObject - interface to data which is
> actually streamed from
> +/// at DataStreamer. In addition to inherited members, it
> has the
> +/// dropLeadingBytes and setKnownObjectSize methods which
> are not applicable
> +/// to non-streamed objects
> +class StreamingMemoryObject : public StreamableMemoryObject {
>
> I think that's a full sentence missing a period. It feels
> *awfully weird* to have StreamableMemoryObject and
> StreamingMemoryObject, and both of them are interfaces. The
> comment doesn't seem to sufficiently explain what's going on
> here. (The DataStreamer can stream from a Streaming but not
> with a Streamable? What?)
>
>
> Yeah, this was kind of a tough naming problem. But there really
> does need to be these 2 different kinds of interfaces. The one I
> called StreamableMemoryObject (in which the data may or may not
> actually be streamed) needs to have extra methods over and above
> MemoryObject, which are directly due to the streamability:
> isValidAddress and isObjectEnd are needed because if we don't
> know the length of the stream ahead of time, then calling
> getExtent requires waiting until the entire stream is fetched.
> (getPointer is basically just there to support BLOBs, avoiding
> extra copies). Then you have RawMemoryObject, a non-streamed
> StreamableMemoryObject, and StreamingMemoryObject is an
> interface because there could be different implementations of
> StreamableMemoryObject (getting data from different sources).
> I'm open to ideas to simplify the situation.
>
>
> + // fetch enough bytes such that Pos can be read or EOF is
> reached
> + // (i.e. BytesRead > Pos). Return true if Pos can be read.
> + // Unlike most of the functions in BitcodeReader, returns
> true on success.
> + bool fetchToPos(size_t Pos) {
>
> Comment should start with a capital.
>
> + bool fetchToPos(size_t Pos) {
> + if (EOFReached) return Pos < ObjectSize;
> + while (Pos >= BytesRead) {
> + Bytes.resize(BytesRead + kChunkSize);
> + size_t bytes = Streamer->GetBytes(&Bytes[BytesRead +
> BytesSkipped],
> + kChunkSize);
>
> Why is kChunkSize so special? Why not ask for all the bytes
> up until Pos? The comment on DataStreamer::GetBytes doesn't
> give any reason not to ask for as many bytes as you want?
>
> The common case (actually the only case, currently) will
> actually be that the requested size is much smaller than the
> chunk size, and the chunk size just ensures that we batch them
> together rather than making a lot of potentially expensive
> requests into the streamer. The 'while' loop is just there
> instead of an 'if' to cover the corner case of a large request.
> I updated the comment
>
>
> +bool BitcodeReader::SuspendModuleParse() {
> + // save our current position
> + NextUnreadBit = Stream.GetCurrentBitNo();
> + return false;
> +}
>
> What's up with that returning bool?
>
> Originally it was going to check for error and use the same
> convention as the rest of the functions, but it ended up being
> simpler than I expected. I removed it entirely now.
>
>
> + // ParseModule will parse the next body in the stream
> and set its
> + // position in the DeferredFunctionInfo map
>
> Sentence needs period.
>
> Done.
>
>
> + unsigned char buf[16];
> + if (Bytes->readBytes(0, 16, buf, NULL) == -1)
> + return Error("Bitcode stream must be at least 16 bytes
> in length");
> +
> + if (!isBitcode(buf, buf + 16)) {
> + return Error("Invalid bitcode signature");
> + }
>
> So, uh, braces or no braces around one-line return
> statements? :-)
>
> LLVM style seems to be no braces, but Google style dies hard :)
> Fixed.
>
>
> +Module *llvm::getStreamedBitcodeModule(const std::string &name,
> + DataStreamer *streamer,
> + LLVMContext &Context,
> + std::string *ErrMsg) {
>
> These args don't line up.
>
> Done
>
>
> --- /dev/null
> +++ b/lib/Support/DataStream.cpp
> @@ -0,0 +1,96 @@
> +//===--- llvm/Support/DataStream.cpp - Lazy streamed Data
> -*- C++ -*-===//
>
> Don't include emacs mode markers (the -*- C++ -*- bit) on
> .cpp files, only on .h files.
>
> Done.
>
>
> +// Very simple stream backed by a file. Mostly useful for
> stdin and debugging;
> +// actual file access is probably still best done with mmap
> +class DataFileStreamer : public DataStreamer {
> + int Fd;
>
> Sentence seeking full stop.
>
> +DataStreamer *getDataFileStreamer(const std::string &Filename,
> + std::string
> *StrError) {
>
> Line up to the ( again.
>
> Done.
>
>
> + if (e != success) {
> + *StrError = std::string() + "Could not open " +
> Filename + ": "
> + + e.message() + "\n";
> + return NULL;
> + }
>
> Optional: std::string("Could not open ") + Filename, and
> also putting the + on the previous line instead of starting
> a line with the operator.
>
> Done.
>
>
> --- a/lib/Support/StreamableMemoryObject.cpp
> +++ b/lib/Support/StreamableMemoryObject.cpp
>
> In this file you added some spurious blank lines. Please
> don't do that.
>
> I think i got them all. Also fixed some argument alignment.
>
>
> +bool StreamingMemoryObject::isObjectEnd(uint64_t address) {
> + if (ObjectSize) return address == ObjectSize;
> + fetchToPos(address);
> + return address == BytesRead;
> +}
>
> Shouldn't that end with "return address == ObjectSize"? If
> the file is larger than 'address' bytes, won't this read up
> to address bytes, then stop, leaving BytesRead == address?
> ObjectSize on the other hand isn't set until EOF is reached.
>
>
> Yes, it should. although BytesRead is in practice unlikely to
> actually == address due to the chunk-fetching behavior. No doubt
> that's why this slipped through all the testing. Fixed (and
> covered the case where address == 0; it's never the end because
> 0 is an invalid stream size)
>
>
> +int StreamingMemoryObject::readBytes(uint64_t address,
> + uint64_t size,
> + uint8_t* buf,
> + uint64_t* copied) {
>
> Misaligned.
>
> Fixed.
>
>
> + //StreamableMemoryObject
> +
>
> Please remove.
>
> Done.
>
>
>
More information about the llvm-commits
mailing list