[llvm-dev] [DebugInfo] The current status of debug values using multiple machine locations

Tue Jan 26 10:36:07 PST 2021

On Tue, Jan 26, 2021 at 6:10 AM <stephen.tozer at sony.com> wrote:

> As David has said, the coverage % is not an especially meaningful number
> in general because we do not have a general method of determining the true
> upper bound of coverage for an optimised program. To answer your question
> as best I can though, here are the coverage numbers:
>
> Project              Variable availability      PC ranges covered
>                        Old     New   Delta       Old     New  Delta
> 7zip                79.48%  80.01%   0.53%    60.55%  60.73%  0.18%
> bullet              44.57%  45.21%   0.65%    55.55%  56.00%  0.46%
> ClamAV              88.89%  89.37%   0.48%    53.48%  53.57%  0.09%
> consumer-typeset    91.62%  91.44%  -0.19%    32.48%  32.48%  0.00%
> kimwitu++           68.34%  68.75%   0.41%    69.12%  69.82%  0.70%
> lencod              89.86%  90.77%   0.91%    48.41%  48.83%  0.42%
> mafft               89.26%  89.14%  -0.12%    57.89%  57.89%  0.00%
> SPASS               83.23%  83.26%   0.03%    52.61%  52.66%  0.05%
> sqlite3             73.55%  75.62%   2.07%    51.59%  52.01%  0.43%
> tramp3d-v4          54.77%  63.04%   8.27%    66.67%  68.16%  1.49%
>
> These numbers are not high resolution - the change is simply the
> difference of the rounded "old" and "new" numbers. Notably, the variable
> availability for some of the projects has actually gone down, as we have
> more variables being emitted to DWARF with 0% coverage (the DWARF emission
> of variables with 0% coverage is an issue in itself, but not one introduced
> or fixed by this patch). The PC bytes numbers are also slightly misleading,
> as the % is calculated as "the sum of PC bytes covered for each variable"
> divided by "the sum of PC bytes in the parent scope for each variable".
> This means that if, for example, we doubled the number of variables covered
> by the program but all of the new variables had slightly lower average
> coverage than the old variables, we would see this number decrease despite
> the clear increase in actual coverage.
>
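
To make the last point of the quoted paragraph concrete, here is a minimal Python sketch with made-up numbers (none of them come from the tables above; the helper name and byte counts are purely illustrative). It computes the ratio as described, the sum of PC bytes covered for each variable divided by the sum of PC bytes in the parent scope for each variable, and assumes that variables without any location are left out of both sums.

```python
def pc_coverage(variables):
    """Aggregate coverage: sum(covered bytes) / sum(parent scope bytes).

    `variables` is a list of (covered_bytes, scope_bytes) pairs, one per
    variable. Hypothetical helper, not part of any LLVM tool.
    """
    covered = sum(c for c, _ in variables)
    scope = sum(s for _, s in variables)
    return covered / scope


# Hypothetical "old" build: 100 variables, each covered for 60 of the
# 100 bytes in its parent scope.
old_vars = [(60, 100)] * 100

# Hypothetical "new" build: the same 100 variables plus 100 newly described
# ones, each covered for only 50 of 100 scope bytes.
new_vars = old_vars + [(50, 100)] * 100

old_ratio = pc_coverage(old_vars)  # 6000 / 10000
new_ratio = pc_coverage(new_vars)  # 11000 / 20000

# Raw byte counts, which is what the absolute comparison looks at.
old_bytes = sum(c for c, _ in old_vars)
new_bytes = sum(c for c, _ in new_vars)

print(f"ratio: {old_ratio:.2%} -> {new_ratio:.2%}")
print(f"bytes covered: {old_bytes} -> {new_bytes}")
```

With these invented inputs the ratio drops from 60% to 55% even though the number of covered bytes nearly doubles, which is the effect the quoted text warns about.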

Hmm, that seems like a somewhat unhelpful statistic - when you say "more
variables being emitted to DWARF with 0% coverage" - what do you mean by
that? Are we counting a variable with no location attribute as being 100%
covered, because it isn't partially covered? Could we instead count such
variables as 0% covered?

>
> As you can see, these numbers aren't as helpful as we'd like - for
> example, we could easily hit 100% coverage by choosing not to emit any
> variables that don't have a location for their entire scope, but this would
> not translate to a better debug experience. We could compare the number of
> available variables with the program at O0, but this also does not work out
> as it might first seem, because optimizations can *increase* the number
> of variables by inlining functions; for all of these projects, the number
> of variables at O2 is several times larger than the number at O0.
>
> Hopefully this summarizes why comparing the raw variable counts and PC
> bytes covered is, as far as I can tell, the best way of comparing the
> actual change in debug quality between the two patches.
> ------------------------------
> *From:* David Blaikie <dblaikie at gmail.com>
> *Sent:* 22 January 2021 19:45
> *To:* Owen Anderson <resistor at mac.com>
> *Cc:* Tozer, Stephen <stephen.tozer at sony.com>; llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] [DebugInfo] The current status of debug values
> using multiple machine locations
>
> It's hard to know the upper bound on what's possible.
>
> eg: code like this:
>
> int x = f1();
> f2(x);
> f2(4);
>
> With optimized code, there's no way to recover the value of 'x' during the
> second f2 call. We can compute an absolute upper bound that's certainly
> unreachable - by looking at the scope of variables (assuming our scope
> tracking is perfect - which, it's not bad, but can get weird under
> optimizations) and comparing the total scope bytes of variables to the
> bytes for which a location is described. We do have those two stats in
> llvm-dwarfdump --statistics. But generally we don't bother looking at that
> because it's a fair way off, and because of the limitations of any such
> measurement as I've described here. (We also don't currently track where a
> variable's scope starts - so the upper bound for "x" in
> "{ f1(); int x = ...; f1(); }" includes both calls to f1, even though the
> location shouldn't ever extend to cover the first f1 call.)
>
> On Fri, Jan 22, 2021 at 11:31 AM Owen Anderson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Stephen,
>
> Is it possible to quantify this coverage in absolute terms, at least the
> PC bytes portion? It would be helpful to understand how close this is
> bringing us to 100% coverage, for example.
>
> —Owen
>
> On Jan 22, 2021, at 7:23 AM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Following the previous discussion on the mailing list[0], I have been
> writing a series of patches that implement the proposed instructions[1],
> enabling multi-location debug values (debug value lists) in LLVM. Although
> the patches are still in review, the basic implementation is finished
> (except for the removal and replacement of the old DBG_VALUE instruction,
> as discussed on the mailing list[2]).
>
> Given below is the change in debug info output for the LLVM test-suite
> CTMark, from before and after the debug value list patch:
>
> Project               Available variables           PC bytes covered
>                       Old     New  Change         Old       New  Change
> 7zip                40252   40501   0.62%     7112336   7142255   0.42%
> bullet              32655   33296   1.96%     6272034   6323049   0.81%
> ClamAV               8795    8842   0.53%     5090634   5099634   0.18%
> consumer-typeset     4354    4356   0.05%     3171498   3171605   0.00%
> kimwitu++           30006   30177   0.57%     1736826   1755152   1.06%
> lencod              14176   14319   1.01%     6123957   6177106   0.87%
> mafft                6854    6859   0.07%    12045196  12046744   0.01%
> SPASS               38477   38492   0.04%     3396246   3399668   0.10%
> sqlite3             29479   30301   2.79%     7964547   8024747   0.76%
> tramp3d-v4          91732  105588  15.10%     7925131   8106167   2.28%
>
> As most of the patches have been approved, I am hopeful that the full set
> of patches will be merged into main in the near future. Part of the purpose
> of this email is to give notice of the upcoming merge, as the changes are
> significant and may conflict with any private changes concerning debug
> info. In terms of output, the patch should not change any existing variable
> locations; it should only add new locations for some variables. This may
> break tests that expect certain variables to be missing or optimized out,
> but should not be disruptive otherwise. If you want to test this patch,
> either to benchmark compiler performance, gather DWARF statistics, or test
> its merging with private changes, there is a single patch comprising the
> entirety of the current work on Phabricator[3].
>
> The other purpose of this email is to request further reviews on the
> patches, as all but 5 have been accepted and most of the remaining patches
> have been well-reviewed by now. Due to the size of the patches, there will
> likely be conflicts between any in-development debug-info work and these
> patches, creating extra work for any developers that need to update their
> patches to handle the new instruction; the sooner the patches land, the
> less of that rework should be needed. Landing them will also allow current
> and future work to take advantage of the new functionality to preserve more
> debug information.
>
> ------------------------------
> With the patch implementations essentially complete, we can see more
> precisely the effect of these patches. With respect to the results above,
> it is important to note that although this patch is functional it does not
> cover all the potential ground of this feature. The only direct improvement
> added by this patch is enabling the salvage of non-constant binary operator
> and GetElementPtr instructions within the existing salvage function during
> opt passes. This is a significant improvement, but more may come: follow-up
> patches can enable this improved salvaging during instruction selection,
> and enable the salvage of cmp and select instructions; there are also
> some gaps in the instruction selection implementation, such as resolving
> dangling debug value lists. Some of these are easy wins that can be
> implemented immediately after landing the patch, and some are longer term
> projects that can progress when this work has merged.
>
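> The patch currently leads to a modest improvement in most cases (roughly
> 0.5% to 3% more available variables in the table above), with some very
> small improvements (under 0.1% for consumer-typeset, mafft and SPASS) and
> one very large one (15.10% for tramp3d-v4). As mentioned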
> previously, this set of patches primarily lays the groundwork for more
> complex variable locations; once it has landed there will be further work
> to salvage from more places, as well as improving the handling of list
> dbg.values to better preserve variable locations through optimizations.
> Compilation times do not appear to be significantly affected; I've been
> measuring compile times for the CTMark projects in the llvm test-suite
> (using the best practices from the benchmarking guide[4]), but so far any
> change in compile time is much smaller than the measurement noise, so I
> don't have an exact number to give. I estimate from the results so far that
> the increase will be no more than 1% in the worst case, but it could be
> smaller - I'm testing further to verify.
>
> [0] https://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html
> [1] https://reviews.llvm.org/D82363
> [2] https://lists.llvm.org/pipermail/llvm-dev/2020-August/144589.html
> [3] https://reviews.llvm.org/D94631
> [4] https://llvm.org/docs/Benchmarking.html
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>