[cfe-dev] probable bug in clang source based coverage

Vedant Kumar via cfe-dev cfe-dev at lists.llvm.org
Mon Mar 20 14:39:16 PDT 2017


Hi,


> In the file above, there are 2 segments that are clear:
> [start_line, start_col => end_line, end_col]
> [2,1 => 6, 2] and [4, 4 => 5, 5]
> covering both the function and the if's body scopes.
> This leaves us with the following list:
>   [3,8,1,1,1],
>   [3,22,0,1,1],
>   [3,33,1,1,0]
> 
> My understanding is, that every segment, that has IsRegionEntry set
> should be a region opening, whereas if it's unset, it is closing the
> last open most-inner region, so:
>   [3,8,1,1,1] means we "open" new region at position 3,8 and
>   [3,22,0,1,1] means we "open" a nested region at position 3,22.

The `IsRegionEntry` field is set when a segment corresponds to a source region with a fresh profiling counter. Otherwise it is not set. I don't think it will be helpful to think of segments as "closing" a region.

Some examples:

- We need a fresh profiling counter for "argc > 1", because its execution count cannot be determined any other way. So `IsRegionEntry` is set.
- The same for "argc < 10".
- The final segment on line 3 is different, and does not have `IsRegionEntry` set. That's because its execution count can be derived from existing counters. 

The `IsRegionEntry` field is just used as a hint by llvm-cov to compute better line execution counts, and to determine when it is appropriate to create tooltips for regions within a line.


> Now
>   [3,33,1,1,0] would mean that at position 3,33 we are closing the
> last open region, that is, the one open at 3,22.
> but in this case we are missing the "closing mark" for region open at
> position 3,8.

If you choose to think of segments as 'closing' a region, consider that one may 'close' multiple regions.


> If the above is incorrect, please let me know how could I parse the
> segments from json file to have source code properly annotated with
> regions.

The design goal for the segment information was to "do all the work" of figuring out which execution counts correspond to which snippets of text. If you look at the SourceCoverageViewHTML::renderLine function in llvm-cov, you can see an example of how this is done (hopefully the comments help).

What kind of source annotations are you interested in? If it's similar to something llvm-cov does, I can provide more details about how it works.


> So far I am using a recursive function keeping track of the last
> "unclosed" segment at the current level and recursing whenever there
> is another one opening with the last one not closed.

It sounds like you may be running into issues with the lack of explicit "end-of-segment" information in the CoverageSegment data structure. That information is implicit in the order in which the segments are presented.

I suggest breaking each line into snippets of text, where each snippet corresponds to exactly one CoverageSegment. This is what SourceCoverageViewHTML::renderLine does.


> I hope this is clear enough :)

Thanks for your interest! Please keep us updated on how your tool develops.

vedant

> regards,
> Michał Pszona
> 
> 
> On Mon, Mar 20, 2017 at 7:11 PM, Vedant Kumar <vsk at apple.com> wrote:
>> The '&&' in your example belongs to the source region corresponding to the if condition. It is always 'reached' before '(argc > 1)' is reached, so the segment contains 'argc > 1) && ('. A new segment is pushed for 'argc < 10', because it isn't always reached when the '&&' is reached.
>> 
>> This is why the current implementation does not emit a new segment just for the ' && ' snippet: it would be redundant.
>> 
>> Also, there are no 'closing' segments, only new ones. Maybe this is just a terminology mix up.
>> 
>> Let me know if the segments should be made more precise in any other way. If you have access to bugs.llvm.org, I'd appreciate any bug reports there (please add llvm-cov to the title).
>> 
>> best,
>> vedant
>> 
>>> On Mar 20, 2017, at 5:37 AM, Michał Pszona via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>> 
>>> All,
>>> 
>>> I am parsing llvm-cov exports (json files) and I was surprised by some
>>> of the outputs.
>>> I was able to narrow it to the below example:
>>> 
>>> the source file:
>>> 
>>> int main (int argc, char* argv[])
>>> {
>>>  if ((argc > 1) && (argc < 10))
>>>  {
>>>  }
>>> }
>>> 
>>> the output:
>>> https://gist.github.com/meahow/84f39ed625d138b42ee1b9cde7613b6a
>>> 
>>> The confusing part is in segments section:
>>> 
>>> "segments": [
>>>  [2,1,1,1,1],
>>>  [3,8,1,1,1],
>>>  [3,22,0,1,1],
>>>  [3,33,1,1,0],
>>>  [4,4,0,1,1],
>>>  [5,5,1,1,0],
>>>  [6,2,0,0,0]
>>> ],
>>> 
>>> 
>>> It looks like there is unclosed region produced.
>>> There are two opening marks in line 3 (cols 8 and 22) and one closing.
>>> I believe that there should be another closing one in col 17.
>>> 
>>> It looks like this behavior is easily reproducible with multi
>>> statement conditions in if expression like this:
>>> 
>>> if ( (a > 1) && (a < 5))
>>> {}
>>> 
>>> Can someone confirm that this is a bug or if I'm not getting it right ?
>>> 
>>> 
>>> regards,
>>> Michał Pszona
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> 




More information about the cfe-dev mailing list