[PATCH] D99110: [Coverage] Load records immediately

Mon Mar 22 16:43:26 PDT 2021

tunz added a comment.

>>   std::string FuncRecords;
>>   ...
>>   for (SectionRef Section : *CoverageRecordsSections) {
>>     auto CoverageRecordsOrErr = Section.getContents();
>>     if (!CoverageRecordsOrErr)
>>       return CoverageRecordsOrErr.takeError();
>>     FuncRecords += CoverageRecordsOrErr.get();
>>     while (FuncRecords.size() % 8 != 0)
>>       FuncRecords += '\0';
>>   }
>>
>> FuncRecords is stored as a new string. When I removed this part, I could save about 10-20% of memory. And, there are some other similar things such as uncompressed data. So, I just came back to this approach.
>
> I don't understand what you replaced this with. For some context: this logic is only needed for non-MachO (non-Darwin) file formats, because the use of comdats splits function records into multiple sections. See 80cd518b809dd <https://reviews.llvm.org/rG80cd518b809ddb3caaf1dc59db3fbc6d1bcaf2e0>. Looking at this some more, I admit it's possible to optimize, but I don't see how this patch does it.

The patch I did was replacing std::string to std::vector<StringRef> so that it can use the existing buffer rather than creating a new buffer. This is just one example of the increasing memory behind the readonly buffer. I've also observed some other places increasing the memory such as RawCoverageFilenamesReader.

I was measuring coverage of 500~600 binaries whose total binary size is about 100GB. So, I thought it kind of makes sense that it's taking quite a lot of memory since it loads all of them and processes them. I'll try to profile the memory.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99110/new/

https://reviews.llvm.org/D99110