[PATCH] D69027: [llvm-dwarfdump][Statistics] Fix calculation of OffsetToFirstDefinition

Tue Oct 29 12:25:46 PDT 2019

krisb added a comment.

In D69027#1723448 <https://reviews.llvm.org/D69027#1723448>, @avl wrote:

> > If the other metric is valuable, I suggest implementing it in a separate patch. But could you give an example when calculating coverage against full scope provides additional knowledge which does not follow from currently existing metrics?
>
> suppose we have such case with incomplete debug info:
>
>   DW_TAG_lexical_block
>  
>   DW_AT_ranges  (0x00000000
>      [0x000000000000010, 0x0000000000000197))
>  
>   DW_TAG_variable
>     DW_AT_location  (0x0000018e
>        [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)
>
>
> And then will create a fix which improves debug info in such a way:
>
>   DW_TAG_lexical_block
>  
>   DW_AT_ranges  (0x00000000
>      [0x000000000000010, 0x0000000000000197))
>  
>   DW_TAG_variable
>     DW_AT_location  (0x0000018e
>        [0x0000000000000100,  0x0000000000000183): DW_OP_reg1 RBX)
>        [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)
>
>
> In both cases reported coverage there would be 100%. 
>  That hides original low coverage for that variable and makes invisible performed improvement.
>  Change in coverage looks like this: 100%->100%. If it instead would be 20%->30% then result of changes would be seen.

Doesn't the absolute number of covered bytes (i.e. ScopeBytesCovered) suite this purpose? Yes, we will see no change in coverage, but the absolute number will be changed from 14 to 97 bytes, which may be considered as an improvement. Does it make sense to you?

In D69027#1724440 <https://reviews.llvm.org/D69027#1724440>, @dblaikie wrote:

> We know that the variable should be live until the end of the enclosing scope (as per the language) but we don't know where it should start - so there are (if I'm unedrstanding this discussion correctyl) two statistics - one for each end of the heuristic range: one measuring from the start of the enclosing scope (the earliest the variable could be declared) and the other measuring from the first reported location (that's the statistic that's being discussed/fixed here).

In the master, we indeed have two statistic types: one calculates a number of bytes covered by variable's DW_AT_location (i.e ScopeBytesCovered) against a number of bytes of variable's scope from first definition (i.e ScopeBytesFromFirstDefinition); another - calculates 'coverage bucket' per variable dividing the number of variable's covered bytes by //number of bytes in full scope//, w/o adjustment to the first definition (see collectLocStats()). 
But I'm changing the way we calculate 'coverage bucket' in this patch to 'dividing by the number of bytes in //the scope adjusted to the first definition//' for consistency purpose. It's not clear to me why 'coverage buckets' is being measured this way and I'm not sure it was intentional.

Just to summarize the discussion above. If I understand correctly (Alexey @avl, please, correct me if I'm wrong here), Alexey suggests either

- to replace all the statistics that are calculated against the scope adjusted to the first definition with the statistics against full scope.

or

- to add new statistics that are calculated against full scope to provide additional opportunities to detect/track changes in debug info coverage.

The first option makes this patch meaningless. But, IMO, with this approach statistics will no longer answer the question about the quality of debug info. This is why I do not agree with it.
Regarding the second option, I do not have a strong opinion. But I think it's better to consider it not in the scope of this patch.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69027/new/

https://reviews.llvm.org/D69027