[llvm-commits] Proposal/patch: Enable bitcode streaming
Nick Lewycky
nlewycky at google.com
Wed Jan 25 14:04:23 PST 2012
On 20 January 2012 11:55, Derek Schuff <dschuff at google.com> wrote:
> And finally, the StreamingMemoryObject implementation, modified
> BitcodeReader, and modifed llvm-dis.cpp using the streaming interface.
> Please take a look
>
Overall this looks good. I'm especially happy with some of the refactoring
inside BitcodeReader! Comments:
--- a/include/llvm/Bitcode/ReaderWriter.h
+++ b/include/llvm/Bitcode/ReaderWriter.h
@@ -21,31 +21,41 @@ namespace llvm {
class MemoryBuffer;
class ModulePass;
class BitstreamWriter;
+ class DataStreamer;
class LLVMContext;
class raw_ostream;
I realize these were unsorted when you got here, but please alphabetize
them.
+ /// If 'verify' is true, check that the file fits in the buffer.
+ static inline bool SkipBitcodeWrapperHeader(const unsigned char *&BufPtr,
+ const unsigned char *&BufEnd,
+ bool Verify) {
I didn't really understand the comment. I think what you're doing is
disabling the check that the buffer contained the whole header. Could you
make it "bool SkipPastEnd" instead?
+DataStreamer *getDataFileStreamer(const std::string &Filename,
+ std::string *Err);
Please line up the argument to the (.
--- a/include/llvm/Support/StreamableMemoryObject.h
+++ b/include/llvm/Support/StreamableMemoryObject.h
@@ -12,6 +12,9 @@
#define STREAMABLEMEMORYOBJECT_H_
#include "llvm/Support/MemoryObject.h"
+#include <vector>
+#include "llvm/ADT/OwningPtr.h"
+#include "llvm/Support/DataStream.h"
ADT, Support, then <system> headers. See
http://llvm.org/docs/CodingStandards.html#scf_includes .
+/// StreamingMemoryObject - interface to data which is actually streamed
from
+/// at DataStreamer. In addition to inherited members, it has the
+/// dropLeadingBytes and setKnownObjectSize methods which are not
applicable
+/// to non-streamed objects
+class StreamingMemoryObject : public StreamableMemoryObject {
I think that's a full sentence missing a period. It feels *awfully weird*
to have StreamableMemoryObject and StreamingMemoryObject, and both of them
are interfaces. The comment doesn't seem to sufficiently explain what's
going on here. (The DataStreamer can stream from a Streaming but not with a
Streamable? What?)
+ // fetch enough bytes such that Pos can be read or EOF is reached
+ // (i.e. BytesRead > Pos). Return true if Pos can be read.
+ // Unlike most of the functions in BitcodeReader, returns true on
success.
+ bool fetchToPos(size_t Pos) {
Comment should start with a capital.
+ bool fetchToPos(size_t Pos) {
+ if (EOFReached) return Pos < ObjectSize;
+ while (Pos >= BytesRead) {
+ Bytes.resize(BytesRead + kChunkSize);
+ size_t bytes = Streamer->GetBytes(&Bytes[BytesRead + BytesSkipped],
+ kChunkSize);
Why is kChunkSize so special? Why not ask for all the bytes up until Pos?
The comment on DataStreamer::GetBytes doesn't give any reason not to ask
for as many bytes as you want?
+bool BitcodeReader::SuspendModuleParse() {
+ // save our current position
+ NextUnreadBit = Stream.GetCurrentBitNo();
+ return false;
+}
What's up with that returning bool?
+ // ParseModule will parse the next body in the stream and set its
+ // position in the DeferredFunctionInfo map
Sentence needs period.
+ unsigned char buf[16];
+ if (Bytes->readBytes(0, 16, buf, NULL) == -1)
+ return Error("Bitcode stream must be at least 16 bytes in length");
+
+ if (!isBitcode(buf, buf + 16)) {
+ return Error("Invalid bitcode signature");
+ }
So, uh, braces or no braces around one-line return statements? :-)
+Module *llvm::getStreamedBitcodeModule(const std::string &name,
+ DataStreamer *streamer,
+ LLVMContext &Context,
+ std::string *ErrMsg) {
These args don't line up.
--- /dev/null
+++ b/lib/Support/DataStream.cpp
@@ -0,0 +1,96 @@
+//===--- llvm/Support/DataStream.cpp - Lazy streamed Data -*- C++ -*-===//
Don't include emacs mode markers (the -*- C++ -*- bit) on .cpp files, only
on .h files.
+// Very simple stream backed by a file. Mostly useful for stdin and
debugging;
+// actual file access is probably still best done with mmap
+class DataFileStreamer : public DataStreamer {
+ int Fd;
Sentence seeking full stop.
+DataStreamer *getDataFileStreamer(const std::string &Filename,
+ std::string *StrError) {
Line up to the ( again.
+ if (e != success) {
+ *StrError = std::string() + "Could not open " + Filename + ": "
+ + e.message() + "\n";
+ return NULL;
+ }
Optional: std::string("Could not open ") + Filename, and also putting the +
on the previous line instead of starting a line with the operator.
--- a/lib/Support/StreamableMemoryObject.cpp
+++ b/lib/Support/StreamableMemoryObject.cpp
In this file you added some spurious blank lines. Please don't do that.
+bool StreamingMemoryObject::isObjectEnd(uint64_t address) {
+ if (ObjectSize) return address == ObjectSize;
+ fetchToPos(address);
+ return address == BytesRead;
+}
Shouldn't that end with "return address == ObjectSize"? If the file is
larger than 'address' bytes, won't this read up to address bytes, then
stop, leaving BytesRead == address? ObjectSize on the other hand isn't set
until EOF is reached.
+int StreamingMemoryObject::readBytes(uint64_t address,
+ uint64_t size,
+ uint8_t* buf,
+ uint64_t* copied) {
Misaligned.
+ //StreamableMemoryObject
+
Please remove.
Nick
> thanks,
> -Derek
>
>
> On Wed, Jan 18, 2012 at 4:05 PM, Derek Schuff <dschuff at google.com> wrote:
>
>> Hi Chris & Nick,
>> Attached is a very slightly updated version of patch number 1, and patch
>> number 2, with a new StreamableMemoryObject (derived from MemoryObject),
>> suitable for streaming usage.
>> One consequence of deriving from MemoryObject is that I had to make the
>> getExtent and readByte/readBytes methods of MemoryObject no longer const,
>> since they are definitely not const in the StreamableMemoryObject. This
>> resulted in having to remove const in several usages of MemoryObject. It
>> seemed less bad than adding a bunch of mutable data members in
>> StreamableMemoryObject or creating a near-duplicate of MemoryObject.
>>
>> The third patch is nearly ready as well, please let me know what you
>> think.
>> thanks,
>> -Derek
>>
>>
>> On Thu, Jan 12, 2012 at 3:09 PM, Derek Schuff <dschuff at google.com> wrote:
>>
>>>
>>>> Overall, my recommendation would be to split this into three patches:
>>>> the first patch would just increase the abstraction level of the bitcode
>>>> reader, by adding various predicates that you need and tidy things up. The
>>>> second would extend MemoryObject as needed to add the functionality that
>>>> you need. The third would actually switch the meat of the bitcode reader
>>>> to be lazy-streaming, and switch a tool to use it.
>>>>
>>>>
>>>
>>> Attached is patch 1 as listed above: a refactor of BistreamCursor to use
>>> an offset rather than raw pointers, and abstract the relevant operations
>>> into functions. For now BitstreamReader is the same. I've tested it locally
>>> and it works in isolation and is hopefully ready to apply. Patch 2, a
>>> subclass of MemoryObject (StreamableMemoryObject) to replace the
>>> BitstreamBytes class here, is upcoming next.
>>>
>>> thanks,
>>> -Derek
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120125/bb5fe28b/attachment.html>
More information about the llvm-commits
mailing list