[LLVMdev] What does "debugger tuning" mean?

Fri May 1 13:06:59 PDT 2015

This is basically a reboot of the previous thread titled
  About the "debugger target"
except that "target" was really too strong a term for what I had intended
to use this feature for.  "Debugger tuning" is more like it.  You don't
need to have read the previous thread; I'll recap here.

Fundamentally, Clang/LLVM uses DWARF as the specification for the _format_
of information provided by the compiler to a variety of "consumers," which
primarily means debuggers (but not exclusively).  [For a long time it was 
the only format supported by LLVM. Lately, Microsoft debug info has started
appearing, but being a less widely used format, the issues that DWARF runs 
into aren't a concern for that format.  So "debugger tuning" is unlikely
to be an issue for Microsoft debug info.]

DWARF is a permissive standard, meaning that it does not rigidly require
that source-language construct X must be described using the DWARF 
construct Y.  Instead, DWARF says something more like, "If you have a 
source construct that means something like X, here's a mechanism Y that 
you could use to describe it."  While this gives compilers a lot of nice 
flexibility, it does mean that there's a lot of wiggle room for how a 
compiler describes something and in how a debugger interprets that 
description.  Compilers and debuggers therefore need to do a bit of 
negotiation in determining how the debug-info "contract" will work, when 
it comes to nitty-gritty details.  DWARF itself (the standard, as well 
as the committee that owns the standard) refuses to get involved in this 
negotiation, referring to all that as "quality of implementation issues."

It is readily apparent that different debuggers have different ideas
about certain DWARF features, for example whether they are useful or
irrelevant, or whether a certain source construct should be described
this way or that way.  As these generally fall into the
quality-of-implementation (QOI) realm, the DWARF spec itself is no help,
and it comes down to a matter of opinion about whether "the debugger
should just know this" or "the compiler really ought to just emit it
that way."

Clang/LLVM is in the position of being a compiler that wants to support
several different debuggers, all of which have slightly different ideas
about what they want from the DWARF info for a program.  Our first line
of defense of course is the DWARF standard itself, but as we've seen,
that is not a universally definitive reference.

LLVM already emits DWARF slightly differently for different *targets*;
primarily Darwin, in a few cases PS4.  But in at least some cases, the
target is just a (somewhat unreliable) proxy for which *debugger* the
compiler expects to be consuming the DWARF.  The most instructive case
is the exact DWARF expression used to describe the location of a thread-
local variable.  DWARF v3 defined an operator to find the base address
of the thread-local storage area; however, GDB has never learned to
recognize it.  Therefore, for targets where we "know" GDB isn't used,
we can emit the standard operator; for targets where GDB *might* be
used, we need to emit the equivalent (non-standard) GNU operator.

It would be semantically more meaningful to base decisions like this on
whether we expected the debugger to be X or Y or Z.  Therefore I've
proposed (http://reviews.llvm.org/D8506) a "debugger tuning" option that
will make the reasoning behind these choices more obvious, and ultimately
give users a way to control the tuning themselves, when the platform's
default isn't what they want. (I'll have a follow-up patch exposing the
tuning option to the Clang driver.)
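
To make the idea concrete, here is a minimal sketch of how a tuning value
might be derived: the platform supplies a default, and an explicit user
choice overrides it.  Every name here (Platform, DebuggerTuning,
TuningRequest, defaultTuning, effectiveTuning) is made up for illustration
and is not the API in D8506; the per-platform defaults are my assumption
based on the Darwin/PS4 cases described above.

```cpp
// Hypothetical sketch; none of these names are LLVM's actual API.
enum class Platform { Darwin, PS4, Other };
enum class DebuggerTuning { GDB, LLDB, SCE };

// An optional user override (e.g. from a future driver option).
struct TuningRequest {
  bool Explicit;         // true if the user asked for a specific tuning
  DebuggerTuning Value;  // only meaningful when Explicit is true
};

// Assumed defaults: Darwin implies LLDB, PS4 implies the SCE debugger,
// and everything else is assumed to be debugged with GDB.
constexpr DebuggerTuning defaultTuning(Platform P) {
  return P == Platform::Darwin ? DebuggerTuning::LLDB
       : P == Platform::PS4    ? DebuggerTuning::SCE
                               : DebuggerTuning::GDB;
}

// The platform is only a proxy, so an explicit request always wins.
constexpr DebuggerTuning effectiveTuning(Platform P, TuningRequest R) {
  return R.Explicit ? R.Value : defaultTuning(P);
}
```

The point of the split is that code elsewhere asks "which debugger?" rather
than "which target?", so the reasoning behind each choice stays visible.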

So, what kinds of things should be based on the debugger tuning option?
Are there still things that should be based on the target platform?
Simplest to consider these questions together, because it is often clear
which criterion is important if you consider (a) the same debugger run
on different targets, versus (b) different debuggers running on the same
target.  Basically, if the same debugger on different targets wants to
have something a certain way, that's probably a debugger-tuning thing.
And if switching debuggers on the same target shouldn't change how the
DWARF looks, that's likely a platform-specific thing.

The most obvious example of a debugger-tuning consideration is the TLS
operator mentioned above. That's something that GDB insists on having.
(It turns out that the standard operator was defined in DWARF 3, so we
also have to emit the GNU operator if we're producing DWARF 2.  Tuning
considerations don't trump what the standard says.)
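
As a rough sketch of that rule: the choice of TLS operator depends on both
the tuning and the DWARF version being produced.  The opcode values are the
ones assigned to DW_OP_form_tls_address and DW_OP_GNU_push_tls_address; the
DebuggerTuning enum and the tlsAddressOp function are hypothetical names
used only for this example, not existing LLVM code.

```cpp
#include <cstdint>

// Opcode values as assigned by DWARF and by the GNU extension.
constexpr std::uint8_t DW_OP_form_tls_address = 0x9b;     // standard, DWARF 3+
constexpr std::uint8_t DW_OP_GNU_push_tls_address = 0xe0; // GNU extension

// Hypothetical tuning enum, for illustration only.
enum class DebuggerTuning { GDB, LLDB, SCE };

// Use the GNU operator when tuning for GDB (which does not recognize the
// standard one) or when producing DWARF 2 (which predates the standard
// operator).  Otherwise emit the standard operator.
constexpr std::uint8_t tlsAddressOp(DebuggerTuning T, unsigned DwarfVersion) {
  return (T == DebuggerTuning::GDB || DwarfVersion < 3)
             ? DW_OP_GNU_push_tls_address
             : DW_OP_form_tls_address;
}
```

Note that the version check is independent of the tuning: even an LLDB- or
SCE-tuned compile gets the GNU operator at DWARF 2.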

Another example would be .debug_pubnames and .debug_pubtypes sections.
Currently these default to omitted for Darwin and PS4, but included
everywhere else. My initial patch for "tuning" changes the PS4 platform
criterion to the SCE debugger predicate; quite likely the "not Darwin"
criterion ought to be "not LLDB" or in other words "on for GDB only."
And having the code actually reflect the correct semantic purpose seems
like an overall goodness.

An example of a target-dependent feature might be the .debug_aranges
section. As it happens, we don't emit this section by default, because
apparently no debugger finds it useful, although there's a command-line 
option (-gdwarf-aranges) for it.  But, for PS4 we do want to emit it, 
because we have non-debugger tools that find it useful.  We haven't yet 
done the work to make that change on llvm.org, but it's on the list.
I would conditionalize this on the target, not the debugger, because
the debugger is not why we want to generate the section.
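
The contrast between the last two examples can be sketched like this: one
decision is keyed on the debugger tuning, the other on the target.  The
names (Platform, DebuggerTuning, SectionPolicy, sectionPolicy) are
hypothetical, and the PS4 aranges default shown is the behavior I intend,
not something llvm.org does today.

```cpp
// Hypothetical sketch; none of these names are LLVM's actual API.
enum class Platform { Darwin, PS4, Other };
enum class DebuggerTuning { GDB, LLDB, SCE };

struct SectionPolicy {
  bool PubSections;  // emit .debug_pubnames / .debug_pubtypes
  bool Aranges;      // emit .debug_aranges
};

constexpr SectionPolicy sectionPolicy(Platform P, DebuggerTuning T,
                                      bool UserAskedForAranges) {
  return SectionPolicy{
      // Debugger tuning decision: "on for GDB only".
      T == DebuggerTuning::GDB,
      // Target decision: off by default, on if the user asked for it, and
      // (assumed, not yet upstream) on for PS4 because non-debugger tools
      // there consume it.
      UserAskedForAranges || P == Platform::PS4};
}
```

Changing the debugger on PS4 would flip the pubnames answer but leave the
aranges answer alone, which is exactly the test for which criterion applies.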

Okay, so I've been pretty long-winded about all this, can I possibly
codify it all into a reasonably succinct set of guidelines?  (which 
ought to be committed to the repo somewhere, although whether it's as
a lump of text in a docs webpage or a lump of commentary in some source
file is not clear; opinions welcome.)

o Emit standard DWARF if possible.
o Omitting standard DWARF features that nobody uses is fine.
  (example: DW_AT_sibling)
o Extensions are okay, but think about the circumstances where they 
  would be useful (versus just wasting space).  These are probably a
  debugger tuning decision, but might be a target-based decision.
  (example: DW_AT_APPLE_* attributes)
o If some debugger can't tolerate some piece of standard DWARF, that's
  a missing feature or a bug in the debugger.  Accommodating that in
  the compiler is a debugger tuning decision.
  (example: DW_OP_form_tls_address not understood by GDB)
o If some debugger has no use for some piece of standard DWARF, and
  it saves space to omit it, that's a debugger tuning decision.
  (example: .debug_pubnames/.debug_pubtypes sections)
o If a debugger wants things a certain way regardless of the target,
  that's probably a debugger tuning decision.
o If "system" software on a target (other than the debugger) wants
  things a certain way regardless of which debugger you're using,
  that's NOT a debugger tuning decision, but a target-based decision.
  (example: .debug_aranges section)

Let me know if this all seems reasonable, and especially if you have
a good idea where to keep the guidelines.
Thanks,
--paulr