[PATCH] D36627: dwarfdump: Add an option to collect debug info quality metrics

Fri Aug 25 15:37:40 PDT 2017

On Wed, Aug 16, 2017 at 12:04 PM Adrian Prantl via Phabricator <
reviews at reviews.llvm.org> wrote:

> aprantl marked 4 inline comments as done.
> aprantl added inline comments.
>
>
> ================
> Comment at: lib/DebugInfo/DWARF/DWARFDebugLoc.cpp:77-78
> +DWARFDebugLoc::getLocationList(unsigned Offset) const {
> +  DWARFDebugLoc::LocationList Searched;
> +  Searched.Offset = Offset;
> +  const LocationList *List =
> ----------------
> dblaikie wrote:
> > I believe C++11 lower_bound supports heterogenous lookup, so you don't
> need to create a LocationList to query - you can write an asymmetric
> comparison (unsigned < LocationList) & use that.
> >
> > I believe we use that to find a CU at an offset - the function's in
> DWARFContext, I think (just search for lower_bound in lib/DebugInfo/DWARF
> and you should find this example easily enough, probably)
> I added an operator< which is not quite what you suggested, but what all
> the other DWARF lookups were doing.
>

I'd suggest removing this and using heterogenous lookup, as in
DWARFUnitSection<T>::getUnitForOffset.

>
>
> ================
> Comment at: tools/llvm-dwarfdump/Statistics.cpp:64
> +  auto &Stats = Statistics[Prefix];
> +  Stats.VarsInFunction.insert({Name, 0});
> +  if (BytesInScope) {
> ----------------
> dblaikie wrote:
> > Name wouldn't be unambiguous - due to shadowing?
> Yes, this doesn't handle
> ```
> void f() {
>   { int i = 1; }
>   { int i = 2; }
> }
> ```
> correctly right now, and I'm also not quite sure how to fix this right
> now. We could give all lexical scopes unique names based on their nesting
> levels, but if there are two inlined instances of the same function:
>

Can you use the scope DIE's identity in memory (a pointer to the scope DIE)
to disambiguate? ('variable named "foo" in scope 0x42' - that sort of thing)

>
> ```
> void f(bool b) {
>   if (b) {
>     int i = 1;
>   } else {
>     int i = 2;
>   }
> }
> ```
>
> one inlined as `f(true)`, and one as `f(false)` with only one lexical
> block remaining in each inlined instance, I don't know how to handle that.
>
>
> ================
> Comment at: tools/llvm-dwarfdump/Statistics.cpp:67
> +    Stats.TotalVarWithLoc += (unsigned)HasLoc;
> +    // Turns out we have a lot of ranges that extend past the lexical
> scope.
> +    Stats.BytesCovered += std::min<uint64_t>(BytesInScope, BytesCovered);
> ----------------
> dblaikie wrote:
> > Should we gather statistics on that? (honestly there might be cases
> where it's more efficient to describe a variable as being in one location
> with one address range - even though the scope has a hole in it, for
> example (eg: scope says [a, b)+[c,d)+[e,f) but the variable location says
> '[a,b) it's in register foo, [c,f) it's in register bar' - I /think/ that's
> probably OK to do, the debugger/consumer shouldn't consult the variable if
> it's looking at an instruction in [d,e), right?)
> Yeah, it's not something I would collect always, but as a one-off it would
> be interesting. I recently fixed a bug related to this in r305599 and
> there's probably more opportunities.
>
>
> ================
> Comment at: tools/llvm-dwarfdump/Statistics.cpp:196
> +  llvm::errs() << "PC Ranges covered: "
> +               << (ScopeBytesCovered * 100) / ScopeBytesTotal << "%\n";
> +  return;
> ----------------
> dblaikie wrote:
> > This might not be the best statistic.
> >
> > Consider code like this
> >
> >   void f() {
> >     g();
> >     h j = k();
> >     return j;
> >   }
> >
> > optimized, 'j' should never have a location range that covers the entire
> scope of 'f'. There's no value for 'j' during 'g()'.
> >
> > So maybe it'd be necessary to track which bytes of a scope are covered
> by a location list, after the earliest byte in the scope that is covered.
> >
> > Separately you could track this statistic (which bytes of a scope are
> covered by the variable) - but not expect this statistic to approach 100%.
> This case even appears in my test case. The point of this number was to
> look at the delta between two compilers, and not for the ratio to ever
> approach 100%.
> I really like the idea of only starting at the first byte of coverage
> though, so changed the implementation to collect number instead.
>
>
> https://reviews.llvm.org/D36627
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170825/70acf5dd/attachment.html>