[PATCH] D16277: Bitcode: use blob for string storage in the IR: trade a bit of space for faster reading

Filipe Cabecinhas via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 19 04:05:50 PST 2016


filcab added a comment.

Have you tried running clang with this patch? If possible, with ASan on.


================
Comment at: include/llvm/Support/StreamingMemoryObject.h:34
@@ -30,9 +33,3 @@
                      uint64_t Address) const override;
-  const uint8_t *getPointer(uint64_t address, uint64_t size) const override {
-    // FIXME: This could be fixed by ensuring the bytes are fetched and
-    // making a copy, requiring that the bitcode size be known, or
-    // otherwise ensuring that the memory doesn't go away/get reallocated,
-    // but it's not currently necessary. Users that need the pointer (any
-    // that need Blobs) don't stream.
-    report_fatal_error("getPointer in streaming memory objects not allowed");
-    return nullptr;
+  const uint8_t *getPointer(uint64_t Address, uint64_t Size) const override {
+    Buffer.resize(Size);
----------------
joker.eph wrote:
> filcab wrote:
> > What happens if there's two BLOBs in the stream? Wouldn't you overwrite one with the other?
> Yes the validity of the pointer returned last only till the next read from the stream. 
> The model is that there will be a copy made by the client anyway. But with blob won't "unpack" 6 bits elements to an array of unsigned, and then decode the 6 bits encoding to char, and then do the copy.
> 
> Note also that I haven't find another place than llvm-dis that uses this code-path.
Have you tried clang too?

Wouldn't this code (in clang) break (I added comments)?
  case SM_SLOC_BUFFER_ENTRY: {
    const char *Name = Blob.data();           // <- Getting a ref to the current blob
    unsigned Offset = Record[0];
    SrcMgr::CharacteristicKind
      FileCharacter = (SrcMgr::CharacteristicKind)Record[2];
    SourceLocation IncludeLoc = ReadSourceLocation(*F, Record[1]);
    if (IncludeLoc.isInvalid() &&
        (F->Kind == MK_ImplicitModule || F->Kind == MK_ExplicitModule)) {
      IncludeLoc = getImportLocation(F);
    }
    unsigned Code = SLocEntryCursor.ReadCode();
    Record.clear();
    unsigned RecCode
      = SLocEntryCursor.readRecord(Code, Record, &Blob);     // <- That old blob reference is now invalid

    if (RecCode != SM_SLOC_BUFFER_BLOB) {
      Error("AST record has invalid code");
      return true;
    }

    std::unique_ptr<llvm::MemoryBuffer> Buffer =
        llvm::MemoryBuffer::getMemBuffer(Blob.drop_back(1), Name);    // <- Use ref to first blob
    SourceMgr.createFileID(std::move(Buffer), FileCharacter, ID,
                           BaseOffset + Offset, IncludeLoc);
    break;
  }

"Nothing" in "llvm only" uses Blobs, basically (llvm-bcanalyzer does get one and dump it :-) ). But clang uses them a lot.


http://reviews.llvm.org/D16277





More information about the llvm-commits mailing list