[PATCH] Add test showing error in StreamingMemoryObject.setKnownObjectSize().

Tue Apr 14 13:48:12 PDT 2015

Rewrote fetchToPos to only worry about reading, and readBytes to check if input is truncated because ObjectSize is set. This simplifies fetchToPos considerably, and only marginally complicates readBytes.

================
Comment at: include/llvm/Support/StreamingMemoryObject.h:78
@@ +77,3 @@
+      if (ObjectSize && BytesRead > ObjectSize) {
+        BytesRead = ObjectSize;
+        EOOReached = true;
----------------
jvoung wrote:
> Just thinking out loud here to think through what's going on... it is a bit confusing that this has to deal with an ObjectSize, and the fact that it can be different from BytesRead is just more fun =)
> 
> If ObjectSize is known, and BytesRead goes above ObjectSize, doesn't that mean that it already read beyond the ObjectSize from the perspective of the Streamer? The amount to read is just kChunkSize. This will adjust the "position" of the Streamer for its next call to GetBytes... if there is a next call. The Streamer doesn't let you seek backward, so you can no longer make the next GetBytes start at exactly the ObjectSize boundary.
> 
> So, why is this okay? My impression is that this is okay because the Streamer is owned by this class, and so it should not be shared, so adjusting the position of Streamer beyond the ObjectSize doesn't matter too much.
> 
> Okay, so what does matter?
> 
> The bounds determined by BytesRead does matter, since that affects what readBytes() will do. readBytes() does not check ObjectSize... and that's what this extra check fixes. On the other hand readBytes() calls fetchToPos, so it is in a way checking ObjectSize (just tucked away here).
I agree that the concepts of ObjectSize, NextBytes, and EndOfStream() are confused by the way this code is written. Separating these concepts out (see comment below), so that things are (hopefully) cleaner.

================
Comment at: include/llvm/Support/StreamingMemoryObject.h:81
@@ +80,3 @@
+      } else if (bytes == 0) { // reached EOF/ran out of bytes
+        if (ObjectSize == 0)
+          ObjectSize = BytesRead;
----------------
jvoung wrote:
> Following up on Rafael's example, why not set ObjectSize = BytesRead here unconditionally? At this point, if ObjectSize != 0, then BytesRead <= ObjectSize.
> 
> Otherwise, if you call fetchToPos(Pos) afterwards with BytesRead < Pos < ObjectSize, then EOOReached is true and it will say that Pos is readable (since the check on line 71 is Pos < ObjectSize, instead of Pos < BytesRead).
> 
> Maybe that earlier condition on line 71 should just be Pos < BytesRead to be consistent with readBytes using "BytesRead" as the boundary at EOF, instead of ObjectSize as the boundary at EOF. It will also be consistent with the "return Pos < BytesRead;" at the end of this function on line 87.
> 
> On the other hand, getExtent() and isValidAddress() use ObjectSize, and it will be inconsistent if the stream is ever truncated to be less than the known ObjectSize, but the callers will just have to rely on what readBytes ultimately says.
First off, when ObjectSize != 0, BytesRead <= ObjectSize is only guaranteed to be true because setKnownObjectSize would have changed it if BytesRead > ObjectSize. In general, since we don't control when setKnownObjectSize is called, there is no real association between BytesRead and ObjectSize.

The fundamental problem is that we have two notions of EOF (BytesRead and ObjectSize), and their value is independent of each other.

This CL tries to reduce overhead by making them the same if setKnownObjectSize is called.

Alternatively, we can keep these notions separate. Hence, fetchToPos will always do a read and set NextBytes to the end of the read. However, then we need to modify readBytes to check that the requested bytes doesn't exceed ObjectSize.

Updating the CL to match this concept.

http://reviews.llvm.org/D8931

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/