[PATCH] D69027: [llvm-dwarfdump][Statistics] Fix calculation of OffsetToFirstDefinition

Alexey Lapshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 6 05:48:54 PST 2019


avl added a comment.

> I'm not sure the bleakness is as much of a problem (but I didn't implement/don't personally use the statistics at all) - like all the other measures, this would be a relative one, not one where "100%" was a goal/meaningful (& the only reason to view it as a % would be to understand progress in a relative sense, I guess? Though measuring it on exactly the same code as an absolute value would be fine and represent progress too)

It seems that the %-coefficient was originally created not for tracking progress.
The idea is to measure completeness: if the coverage for every variable matches its coverage in the source code, then we have complete debug info. This coefficient is _not_ precise, but it shows direction.

If a variable is declared at the very beginning of its scope, we expect its coverage to be 100%. If we instead see a %-coefficient of 30%, that signals the debug info is _probably_ incomplete. The same information could be derived from the absolute values, but to see it we would have to repeat, for the absolute values, the same calculation already done for the coefficient.
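
To make the discussion concrete, here is a minimal sketch of such a coefficient (my own illustration in C++, not the actual llvm-dwarfdump code; VarStats, ScopeBytes, and ScopeBytesCovered are made-up names):

    #include <cstdint>

    struct VarStats {
      uint64_t ScopeBytes;        // size of the variable's enclosing scope
      uint64_t ScopeBytesCovered; // bytes covered by a location description
    };

    // Completeness signal in percent. For a variable declared at the very
    // start of its scope we expect ~100; a much smaller value hints that
    // the debug info is probably incomplete.
    unsigned coveragePercent(const VarStats &V) {
      if (V.ScopeBytes == 0)
        return 0;
      return static_cast<unsigned>(100 * V.ScopeBytesCovered / V.ScopeBytes);
    }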

Another issue is that absolute values change too often. Take a variable declared at the start of its scope, so its coverage in the source code is 100%. Say its calculated coverage is 200 bytes and the scope size is also 200 bytes. After some optimization, the scope grows to 250 bytes, and the calculated coverage becomes 250 bytes (the debug info covers the whole scope). We would see an increase from 200 to 250 bytes, yet in both cases the debug info is equally complete. Reported as 100%->100%, this reads as "no changes, everything is OK"; reported as 200->250, it would require an additional check (i.e., a false alarm).
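
Plugging the hypothetical 200/250-byte numbers from this example into the coveragePercent sketch above:

    int main() {
      VarStats Before{/*ScopeBytes=*/200, /*ScopeBytesCovered=*/200};
      VarStats After{/*ScopeBytes=*/250, /*ScopeBytesCovered=*/250};
      // Absolute view: covered bytes go 200 -> 250, which looks like a
      // change worth investigating, even though nothing regressed.
      // Percent view: coveragePercent(Before) == 100 and
      // coveragePercent(After) == 100, i.e. "no changes, everything is OK".
    }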

From that point of view, having the %-coefficient as a completeness signal looks useful.
In that case, it is essential that the coefficient stays in the range 0%-100%.
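
One way to enforce that range, as a hypothetical guard rather than a description of what the current code does, is to clamp the covered bytes to the scope size before dividing:

    // Hypothetical guard: never count more covered bytes than scope bytes,
    // so the coefficient stays within 0%-100% even if scope trimming
    // (e.g. subtracting OffsetToFirstDefinition) makes the recorded scope
    // smaller than the covered range.
    unsigned clampedCoveragePercent(uint64_t ScopeBytes, uint64_t CoveredBytes) {
      if (ScopeBytes == 0)
        return 0;
      uint64_t Covered = CoveredBytes < ScopeBytes ? CoveredBytes : ScopeBytes;
      return static_cast<unsigned>(100 * Covered / ScopeBytes);
    }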

> If we are going to keep adjusted scope values - I'd love to hear a proposal that explains how to do so in a way that makes sense to me in the face of arbitrary basic block order - so far as I can think of, the only "adjusted scope bytes" that would make sense to me would be one that looks at each basic block (which would require disassembly to even identify, unfortunately) and has an adjusted scope of "first byte with a valid location description to last byte with a valid location description" within a scope subrange within a basic block. Anything else seems not very meaningful to me due to basic block order being unspecified (& maybe based on profile driven optimization, etc).

I also think that calculating the adjusted scope just by trimming to the first reported location is not the right approach, because the real variable scope is unknown and because a variable's lifetime in optimized code may not match its scope. Calculating coverage over the variable's lifetime would be the most useful metric in that case (though it requires knowledge of the actual instructions).
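
For reference, my understanding of the trimming under discussion, as a rough sketch rather than the exact implementation:

    #include <cstdint>

    // Shrinks the scope by the distance from its start to the first byte
    // with a valid location description ("OffsetToFirstDefinition").
    uint64_t adjustedScopeBytes(uint64_t ScopeBytes, uint64_t OffsetToFirstDef) {
      // With arbitrary basic block layout, the first reported location can
      // sit anywhere in the scope, so this subtraction can trim away bytes
      // that are perfectly coverable - or nearly the whole scope.
      return OffsetToFirstDef < ScopeBytes ? ScopeBytes - OffsetToFirstDef : 0;
    }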


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69027/new/

https://reviews.llvm.org/D69027