[PATCH] D16277: Bitcode: use blob for string storage in the IR: trade a bit of space for faster reading

Filipe Cabecinhas via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 19 08:43:52 PST 2016


filcab added inline comments.

================
Comment at: include/llvm/Support/StreamingMemoryObject.h:34
@@ -30,9 +33,3 @@
                      uint64_t Address) const override;
-  const uint8_t *getPointer(uint64_t address, uint64_t size) const override {
-    // FIXME: This could be fixed by ensuring the bytes are fetched and
-    // making a copy, requiring that the bitcode size be known, or
-    // otherwise ensuring that the memory doesn't go away/get reallocated,
-    // but it's not currently necessary. Users that need the pointer (any
-    // that need Blobs) don't stream.
-    report_fatal_error("getPointer in streaming memory objects not allowed");
-    return nullptr;
+  const uint8_t *getPointer(uint64_t Address, uint64_t Size) const override {
+    Buffer.resize(Size);
----------------
tejohnson wrote:
> filcab wrote:
> > joker.eph wrote:
> > > filcab wrote:
> > > > What happens if there's two BLOBs in the stream? Wouldn't you overwrite one with the other?
> > > Yes the validity of the pointer returned last only till the next read from the stream. 
> > > The model is that there will be a copy made by the client anyway. But with blob won't "unpack" 6 bits elements to an array of unsigned, and then decode the 6 bits encoding to char, and then do the copy.
> > > 
> > > Note also that I haven't find another place than llvm-dis that uses this code-path.
> > Have you tried clang too?
> > 
> > Wouldn't this code (in clang) break (I added comments)?
> >   case SM_SLOC_BUFFER_ENTRY: {
> >     const char *Name = Blob.data();           // <- Getting a ref to the current blob
> >     unsigned Offset = Record[0];
> >     SrcMgr::CharacteristicKind
> >       FileCharacter = (SrcMgr::CharacteristicKind)Record[2];
> >     SourceLocation IncludeLoc = ReadSourceLocation(*F, Record[1]);
> >     if (IncludeLoc.isInvalid() &&
> >         (F->Kind == MK_ImplicitModule || F->Kind == MK_ExplicitModule)) {
> >       IncludeLoc = getImportLocation(F);
> >     }
> >     unsigned Code = SLocEntryCursor.ReadCode();
> >     Record.clear();
> >     unsigned RecCode
> >       = SLocEntryCursor.readRecord(Code, Record, &Blob);     // <- That old blob reference is now invalid
> > 
> >     if (RecCode != SM_SLOC_BUFFER_BLOB) {
> >       Error("AST record has invalid code");
> >       return true;
> >     }
> > 
> >     std::unique_ptr<llvm::MemoryBuffer> Buffer =
> >         llvm::MemoryBuffer::getMemBuffer(Blob.drop_back(1), Name);    // <- Use ref to first blob
> >     SourceMgr.createFileID(std::move(Buffer), FileCharacter, ID,
> >                            BaseOffset + Offset, IncludeLoc);
> >     break;
> >   }
> > 
> > "Nothing" in "llvm only" uses Blobs, basically (llvm-bcanalyzer does get one and dump it :-) ). But clang uses them a lot.
> The old streaming getPointer took a fatal error, so presumably the clang code wasn't invoking it during streaming, so wouldn't break with Mehdi's change. Or is the concern that other code like this may creep in for StreamingMemoryObject?
True, the Streaming version is not being used in clang for sure. And if it's used by someone else (PNaCl?), it's not a problem now, so it can't be a problem after Mehdi's patch.

Mehdi: Instead of keeping the assert like I was saying (which is basically there to block code paths), let's just do your change. But please add some documentation stating that the interface to getPointer only guarantees that it's valid until the next call to it, and that the caller needs to either use it before then, or copy it. That way it's explicitly stated in the comment.

Thank you.


http://reviews.llvm.org/D16277





More information about the llvm-commits mailing list