[PATCH] Initial code coverage mapping data, reader, writer + C interface for ProfileData

Thu Jul 17 09:40:42 PDT 2014

> On Jul 9, 2014, at 4:39 PM, Alex Lorenz <arphaman at gmail.com> wrote:
> 
> The updated version of the patch that fixes various typos, adds format version to the coverage mapping format, changes the storage of filenames to once per tu, and adds document that describes the actual coverage mapping format.
> 
> http://reviews.llvm.org/D4301
> 
> Files:
>  docs/CoverageMappingFormat.rst
>  docs/index.rst
>  include/llvm/ProfileData/CoverageMapping.h
>  include/llvm/ProfileData/CoverageMappingReader.h
>  include/llvm/ProfileData/CoverageMappingWriter.h
>  lib/ProfileData/CMakeLists.txt
>  lib/ProfileData/CoverageMapping.cpp
>  lib/ProfileData/CoverageMappingReader.cpp
>  lib/ProfileData/CoverageMappingWriter.cpp
>  lib/ProfileData/LLVMBuild.txt
> <D4301.11233.patch>

Here are my comments on the documentation. As I mentioned in my latest review of the code, I’d like to split the documentation out to a separate patch. I’m not going to go into details yet because this needs more work first.

My biggest comment is about the style of the document. You’ve written it as a fairly formal specification, but I don’t think that is the best way to do it. Formal specifications are not easy to read. They’re often necessary if someone might have to reimplement the code from scratch, using only the information in the document, but that is not the case here since the code is freely available. The most important goal of this document should be to explain the format so that anyone working on the code can easily understand it. For that purpose, you should focus on conveying the high-level structure and making the document readable.

I suggest you start by inverting the organization, moving the low-level details to the end and putting the high-level structures at the beginning.

Next, the Overview section needs to be expanded to give a summary of the context. Anyone who has not delved into the details of clang’s instrumentation-based profiling is going to be lost otherwise. I.E., why are there expressions for counters instead of just simple count values? (An example showing how to calculate the count for the “else” block of an if-statement would be a good idea here.) You should also explain the concepts that come up later in the document, such as “expansion regions”, “empty regions”, and “skipped regions”.