[PATCH] D16277: Bitcode: use blob for string storage in the IR: trade a bit of space for faster reading

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 19 08:37:30 PST 2016


> On Jan 19, 2016, at 4:05 AM, Filipe Cabecinhas <filcab+llvm.phabricator at gmail.com> wrote:
> 
> filcab added a comment.
> 
> Have you tried running clang with this patch? If possible, with ASan on.

I bootstrapped clang with ThinLTO enabled with this patch. Not with ASAN unfortunately.



> 
> 
> ================
> Comment at: include/llvm/Support/StreamingMemoryObject.h:34
> @@ -30,9 +33,3 @@
>                      uint64_t Address) const override;
> -  const uint8_t *getPointer(uint64_t address, uint64_t size) const override {
> -    // FIXME: This could be fixed by ensuring the bytes are fetched and
> -    // making a copy, requiring that the bitcode size be known, or
> -    // otherwise ensuring that the memory doesn't go away/get reallocated,
> -    // but it's not currently necessary. Users that need the pointer (any
> -    // that need Blobs) don't stream.
> -    report_fatal_error("getPointer in streaming memory objects not allowed");
> -    return nullptr;
> +  const uint8_t *getPointer(uint64_t Address, uint64_t Size) const override {
> +    Buffer.resize(Size);
> ----------------
> joker.eph wrote:
>> filcab wrote:
>>> What happens if there's two BLOBs in the stream? Wouldn't you overwrite one with the other?
>> Yes the validity of the pointer returned last only till the next read from the stream. 
>> The model is that there will be a copy made by the client anyway. But with blob won't "unpack" 6 bits elements to an array of unsigned, and then decode the 6 bits encoding to char, and then do the copy.
>> 
>> Note also that I haven't find another place than llvm-dis that uses this code-path.
> Have you tried clang too?

I looked up in clang as well, and it does not use this code path, Rafael told me it is used by PNaCl only.
 

> 
> Wouldn't this code (in clang) break (I added comments)?

Yes it is broken (if used with the StreamingMemoryObject which clang does not do). 
I can:

- leave it as is (as the StreamingMemoryObject is not used in clang)
- Use a list of vector that will be kept alive by the StreamingMemoryObject.
- Leave the assert in StreamingMemoryObject and move llvm-dis to the same reader as opt/llc/…

Any opinion?

— 
Mehdi

>  case SM_SLOC_BUFFER_ENTRY: {
>    const char *Name = Blob.data();           // <- Getting a ref to the current blob
>    unsigned Offset = Record[0];
>    SrcMgr::CharacteristicKind
>      FileCharacter = (SrcMgr::CharacteristicKind)Record[2];
>    SourceLocation IncludeLoc = ReadSourceLocation(*F, Record[1]);
>    if (IncludeLoc.isInvalid() &&
>        (F->Kind == MK_ImplicitModule || F->Kind == MK_ExplicitModule)) {
>      IncludeLoc = getImportLocation(F);
>    }
>    unsigned Code = SLocEntryCursor.ReadCode();
>    Record.clear();
>    unsigned RecCode
>      = SLocEntryCursor.readRecord(Code, Record, &Blob);     // <- That old blob reference is now invalid
> 
>    if (RecCode != SM_SLOC_BUFFER_BLOB) {
>      Error("AST record has invalid code");
>      return true;
>    }
> 
>    std::unique_ptr<llvm::MemoryBuffer> Buffer =
>        llvm::MemoryBuffer::getMemBuffer(Blob.drop_back(1), Name);    // <- Use ref to first blob
>    SourceMgr.createFileID(std::move(Buffer), FileCharacter, ID,
>                           BaseOffset + Offset, IncludeLoc);
>    break;
>  }
> 
> "Nothing" in "llvm only" uses Blobs, basically (llvm-bcanalyzer does get one and dump it :-) ). But clang uses them a lot.
> 
> 
> http://reviews.llvm.org/D16277
> 
> 
> 



More information about the llvm-commits mailing list