[PATCH] D82129: [DebugInfo] Drop location ranges for variables which exist entirely outside the variable's scope

Orlando Cazalet-Hyams via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 18 15:22:04 PDT 2020


Orlando created this revision.
Orlando added reviewers: dblaikie, probinson, aprantl, vsk, djtodoro.
Orlando added projects: LLVM, debug-info.
Herald added subscribers: llvm-commits, mgrang, hiraditya, kristof.beyls.

This patch reduces file size by dropping variable locations a debugger user
will not see.

Background
----------

PR45889 [1] describes a bug where we get a variable location for an instruction
range that exists outside of the instruction range of the scope that it is
declared in.

We discovered the root cause of that particular problem lies in the handling of
parameters and argument values through SelectionDAG. Locations like these are
superfluous. Test llvm/test/DebugInfo/COFF/register-variables.ll shows the same
problem occuring as a result of a seemingly incorrect scope range (more on this
later). In cases like this the variable locations appear to be sane and it is
that the scope ranges are wrong or misleading.

A debugger user will not get to see these out of scope locations. Not only do
they increase the size of the binary in and of themselves, but having more than
one location will prevent us from emitting locations that cover their entire
scope as single locations. Single locations are desirable because they take up
less space than location lists with single entries.

Solution
--------

This patch drops variable locations which exist entirely outside of the
variable's scope. The way it is implemented is simple: after building the debug
entity history map we loop through it. For each variable we look at each entry.
If the entry opens a location range which does not intersect any of the
variable's scope's ranges then we mark it for removal. After visiting the
entries for each variable we also mark any clobbering entries which will no
longer be referenced for removal, and then finally erase the marked entries.
This all requires the ability to query the order of instructions, so before
this runs we number them.

Results
-------

Building CTMark with CMAKE_BUILD_TYPE=RelWithDebInfo without (base) and with
(patched) this patch yields a geomean binary size difference of -1.9%, with the
.debug_loc sections up to nearly 14% smaller in some cases. For a clang-3.4
build I see similar percentage savings to sqlite3 in the suite below.

  Program                                        base      patched   diff 
   test-suite :: CTMark/SPASS/SPASS.test          3971320   3814544  -3.9%
   test-suite...ark/tramp3d-v4/tramp3d-v4.test    14357584  13812000 -3.8%
   test-suite...TMark/7zip/7zip-benchmark.test    9190760   8885512  -3.3%
   test-suite :: CTMark/Bullet/bullet.test        7403096   7182752  -3.0%
   test-suite...:: CTMark/sqlite3/sqlite3.test    4077552   4003072  -1.8%
   test-suite :: CTMark/kimwitu++/kc.test         5117728   5038336  -1.6%
   test-suite...:: CTMark/ClamAV/clamscan.test    2545312   2526184  -0.8%
   test-suite :: CTMark/lencod/lencod.test        2643960   2631568  -0.5%
   test-suite...Mark/mafft/pairlocalalign.test    1530904   1526760  -0.3%
   test-suite...-typeset/consumer-typeset.test    1527120   1524040  -0.2%
   Geomean difference                                                -1.9%

My machine is not set up for precise performance measurements, but I observed
no significant change in compile time (0.0x% change in either direction).

The function validThroughout in DwarfDebug.cpp returns true when a location can
be considered a single location for a variable. Earlier I mentioned that single
locations are desirable. Here are some numbers from a clang-3.4 build that show
the patch improving our abillity to detect them:

                                      base    patched
  times validThroughout is called     3676550 3638269
  times validThroughout returns true  1470517 1541733
  percentage of calls returning true  40.0%   42.4%

For these benchmarks 'base' is clang at 772349de887 <https://reviews.llvm.org/rG772349de887839add823af70b0cdb37f3b47fbc3>, and 'patched' is that
commit with the patch applied on top.

Tests
-----

Added llvm/test/DebugInfo/X86/trim-var-locs.mir

Modified llvm/test/DebugInfo/X86/live-debug-variables.ll
Modified llvm/test/DebugInfo/ARM/PR26163.ll
In both tests an out of scope location is now removed. The remaining location
covers the entire scope of the variable allowing us to emit it as a single
location.

Modified llvm/test/DebugInfo/COFF/register-variables.ll
The scope range for variable 'c' ends before the variable location range
starts. Looking at the source and IR indicates that something may have gone
wrong with how we are tracking the lexical block scope that 'c' is declared
in, rather than generating bogus location ranges as in the other tests.
Regardless of the cause, a user will not get to see this variable in their
debugger so AFAICT the patch is working as intended.

Related future work?
--------------------

The simple instruction numbering added in this patch can be used to improve how
we detect single locations in validThroughout (DwarfDebug.cpp). With a small
change on top of this patch we can reduce binary size even further, and
potentially by a similar magnitude. In addition it allows us to replaces a
linear scan in validThroughout with a map lookup.

Summary
-------

This patch reduces the binary size of RelWithDebInfo builds by an average of
1.9%, and by nearly 4% in one case, across 11 applications by dropping variable
locations which a debugger user will never see. These locations exist in the
first place as a result of bugs in clang so there's an argument that this patch
should not land, and instead we should make a verifer pass (either in clang and
llc, or in llvm-dwarfdump), and focus on fixing the issues they reveal. OTOH
this patch has immediate and tangible wins, and doesn't preclude work on fixing
those fundamental issues.

What do you think?

Thank you for taking the time to read this,
Orlando

[1] https://bugs.llvm.org/show_bug.cgi?id=45889


https://reviews.llvm.org/D82129

Files:
  llvm/include/llvm/CodeGen/DbgEntityHistoryCalculator.h
  llvm/lib/CodeGen/AsmPrinter/DbgEntityHistoryCalculator.cpp
  llvm/lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp
  llvm/test/DebugInfo/ARM/PR26163.ll
  llvm/test/DebugInfo/COFF/register-variables.ll
  llvm/test/DebugInfo/X86/live-debug-variables.ll
  llvm/test/DebugInfo/X86/trim-var-locs.mir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D82129.271843.patch
Type: text/x-patch
Size: 20753 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200618/04c2b545/attachment.bin>


More information about the llvm-commits mailing list