[llvm-dev] [DebugInfo] The current status of debug values using multiple machine locations
David Blaikie via llvm-dev
llvm-dev at lists.llvm.org
Tue Jan 26 10:36:07 PST 2021
On Tue, Jan 26, 2021 at 6:10 AM <stephen.tozer at sony.com> wrote:
> As David has said, the coverage % is not an especially meaningful number
> in general because we do not have a general method of determining the true
> upper bound of coverage for an optimised program. To answer your question
> as best I can though, here are the coverage numbers:
>
> Project Variable availability PC ranges covered
> Old New Delta Old New Delta
> 7zip 79.48% 80.01% 0.53% 60.55% 60.73% 0.18%
> bullet 44.57% 45.21% 0.65% 55.55% 56.00% 0.46%
> ClamAV 88.89% 89.37% 0.48% 53.48% 53.57% 0.09%
> consumer-typeset 91.62% 91.44% -0.19% 32.48% 32.48% 0.00%
> kimwitu++ 68.34% 68.75% 0.41% 69.12% 69.82% 0.70%
> lencod 89.86% 90.77% 0.91% 48.41% 48.83% 0.42%
> mafft 89.26% 89.14% -0.12% 57.89% 57.89% 0.00%
> SPASS 83.23% 83.26% 0.03% 52.61% 52.66% 0.05%
> sqlite3 73.55% 75.62% 2.07% 51.59% 52.01% 0.43%
> tramp3d-v4 54.77% 63.04% 8.27% 66.67% 68.16% 1.49%
>
> These numbers are not high resolution - the change is simply the
> difference of the rounded "old" and "new" numbers. Notably, the variable
> availability for some of the projects has actually gone down, as we have
> more variables being emitted to DWARF with 0% coverage (the DWARF emission
> of variables with 0% coverage is an issue in itself, but not one introduced
> or fixed by this patch). The PC bytes numbers are also slightly misleading,
> as the % is calculated as "the sum of PC bytes covered for each variable"
> divided by "the sum of PC bytes in the parent scope for each variable".
> This means that if, for example, we doubled the number of variables covered
> by the program but all of the new variables had slightly lower average
> coverage than the old variables, we would see this number decrease despite
> the clear increase in actual coverage.
>
Hmm, that seems like a somewhat unhelpful statistic - when you say "more
variables being emitted to DWARF with 0% coverage" - what do you mean by
that? Are we counting a variable with no location attribute as being 100%
covered, because it isn't partially covered? Could we instead count such
variables as 0% covered?
>
> As you can see, these numbers aren't as helpful as we'd like - for
> example, we could easily hit 100% coverage by choosing not to emit any
> variables that don't have a location for their entire scope, but this would
> not translate to a better debug experience. We could compare the number of
> available variables with the program at O0, but this also does not work out
> as it might first seem, because optimizations can *increase* the number
> of variables by inlining functions; for all of these projects, the number
> of variables at O2 is several times larger than the number at O0.
>
> Hopefully this summarizes why comparing the raw variable counts and PC
> bytes covered is, as far as I can tell, the best way of comparing the
> actual change in debug quality between the two patches.
> ------------------------------
> *From:* David Blaikie <dblaikie at gmail.com>
> *Sent:* 22 January 2021 19:45
> *To:* Owen Anderson <resistor at mac.com>
> *Cc:* Tozer, Stephen <stephen.tozer at sony.com>; llvm-dev at lists.llvm.org <
> llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] [DebugInfo] The current status of debug values
> using multiple machine locations
>
> It's hard to know the upper bound on what's possible.
>
> eg: code like this:
>
> int x = f1();
> f2(x);
> f2(4);
>
> With optimized code, there's no way to recover the value of 'x' during the
> second f2 call. We can compute an absolute upper bound that's certainly
> unreachable - by looking at the scope of variables (assuming our scope
> tracking is perfect - which, it's not bad, but can get weird under
> optimizations) and comparing total scope bytes of variables compared to the
> bytes for which a location is described. We do have those two stats in
> llvm-dwarfdump --statistics. But generally we don't bother looking at that
> because it's a fair way off and the limitations of any such measurement as
> I've described here. (we also don't currently track where a variable's
> scope starts - so the upper bound for "x" in "{ f1(); int x = ...; f1(); }"
> includes both calls to f1, even though the location shouldn't ever extend
> to cover the first f1 call)
>
> On Fri, Jan 22, 2021 at 11:31 AM Owen Anderson via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi Stephen,
>
> Is it possible to quantify this coverage in absolute terms, at least the
> PC bytes portion? It would be helpful to understand how close this is
> bringing us to 100% coverage, for example.
>
> —Owen
>
> On Jan 22, 2021, at 7:23 AM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Following the previous discussion on the mailing list[0], I have been
> writing a series of patches that implement the proposed instructions[1],
> enabling multi-location debug values (debug value lists) in LLVM. Although
> the patches are still in review, the basic implementation is finished
> (except for the removal and replacement of the old DBG_VALUE instruction,
> as discussed on the mailing list[2]).
>
> Given below is the change in debug info output for the LLVM test-suite
> CTMark, from before and after the debug value list patch:
>
> Project Available variables PC bytes covered
> Old New Change Old New Change
> 7zip 40252 40501 0.62% 7112336 7142255 0.42%
> bullet 32655 33296 1.96% 6272034 6323049 0.81%
> ClamAV 8795 8842 0.53% 5090634 5099634 0.18%
> consumer-typeset 4354 4356 0.05% 3171498 3171605 0.00%
> kimwitu++ 30006 30177 0.57% 1736826 1755152 1.06%
> lencod 14176 14319 1.01% 6123957 6177106 0.87%
> mafft 6854 6859 0.07% 12045196 12046744 0.01%
> SPASS 38477 38492 0.04% 3396246 3399668 0.10%
> sqlite3 29479 30301 2.79% 7964547 8024747 0.76%
> tramp3d-v4 91732 105588 15.10% 7925131 8106167 2.28%
>
> As most of the patches have been approved, I am hopeful that the full set
> of patches will be merged into main in the near future. Part of the purpose
> of this email is to give notice of the upcoming merge, as the changes are
> significant and may conflict with any private changes concerning debug
> info. In terms of output, the patch should not change any existing variable
> locations; it should only add new locations for some variables. This may
> break tests that expect certain variables to be missing or optimized out,
> but should not be disruptive otherwise. If you want to test this patch,
> either to benchmark compiler performance, gather DWARF statistics, or test
> its merging with private changes, there is a single patch comprising the
> entirety of the current work on Phabricator[3].
>
> The other purpose of this email is to request further reviews on the
> patches, as all but 5 have been accepted and most of the remaining patches
> have been well-reviewed by now. Due to the size of the patches, there will
> likely be conflicts between any in-development debug-info work and these
> patches, creating extra work for any developers that need to update their
> patches to handle the new instruction. It will also allow current and
> future work to take advantage of the new functionality to preserve more
> debug information.
>
> ------------------------------
> With the patch implementations essentially complete, we can see more
> precisely the effect of these patches. With respect to the results above,
> it is important to note that although this patch is functional it does not
> cover all the potential ground of this feature. The only direct improvement
> added by this patch is enabling the salvage of non-constant binary operator
> and GetElementPtr instructions within the existing salvage function during
> opt passes. This is a significant improvement, but more may come: follow-up
> patches can enable this improved salvaging during instruction selection,
> and enable the salvage of cmp and select instructions; there are also
> some gaps in the instruction selection implementation, such as resolving
> dangling debug value lists. Some of these are easy wins that can be
> implemented immediately after landing the patch, and some are longer term
> projects that can progress when this work has merged.
>
> The patch currently leads to a medium-sized improvement for most cases,
> with some very small and one very large improvement. As mentioned
> previously, this set of patches primarily lays the groundwork for more
> complex variable locations; once it has landed there will be further work
> to salvage from more places, as well as improving the handling of list
> dbg.values to better preserve variable locations through optimizations.
> Compilation times do not appear to be significantly affected; I've been
> measuring compile times for the CTMark projects in the llvm test-suite
> (using the best practices from the benchmarking guide[4]), but so far any
> change in compile time is much smaller than the measurement noise, so I
> don't have an exact number to give. I estimate from the results so far that
> the increase will be no more than 1% in the worst case, but it could be
> smaller - I'm testing further to verify.
>
> [0] https://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html
> [1] https://reviews.llvm.org/D82363
> [2] https://lists.llvm.org/pipermail/llvm-dev/2020-August/144589.html
> [3] https://reviews.llvm.org/D94631
> [4] https://llvm.org/docs/Benchmarking.html
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210126/073365de/attachment-0001.html>
More information about the llvm-dev
mailing list