[llvm-dev] [DebugInfo][RFC] Enabling "instruction referencing" variable locations for x86_64
Jeremy Morse via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 10 11:48:54 PST 2021
tl;dr: Is it okay to enable value tracking (aka "instruction
referencing") to preserve debug-info variable locations, by default,
for x86_64?
The high-level summary of instruction referencing is that, after
instruction selection, instead of describing variable locations as a
virtual register, we indirectly refer to the MachineInstr operand
where the variable's value is defined. Once compilation is done, some
standard SSA analysis is used to determine the locations that values
are in, and what locations contain variable values. The benefit is
improved variable location coverage because we no longer have to act
conservatively during register allocation. More in the original RFC
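To make the idea concrete, here is a toy sketch of tracking a value by its
defining instruction rather than by a register name. It is only an
illustration under heavy assumptions: it handles straight-line code only (no
control flow, no PHIs, no SSA analysis), and the names ValueID,
track_locations and find_value are hypothetical, not anything in LLVM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueID:
    """Hypothetical handle for a value: the instruction and operand defining it."""
    inst: int      # number of the defining instruction
    operand: int   # which operand of that instruction defines the value


def track_locations(instructions):
    """Walk a straight-line instruction list and record, after each
    instruction, which machine location holds which value."""
    where = {}     # location name -> ValueID currently held there
    history = []
    for num, (op, dst, src) in enumerate(instructions, start=1):
        if op == "def":
            # A new value is defined by operand 0 of instruction `num`.
            where[dst] = ValueID(num, 0)
        elif op == "copy":
            # A copy makes dst hold whatever value src holds (if any).
            if src in where:
                where[dst] = where[src]
            else:
                where.pop(dst, None)
        elif op == "clobber":
            # The location is overwritten by something we do not track.
            where.pop(dst, None)
        history.append(dict(where))
    return history


def find_value(where, value):
    """Return every location that currently holds `value`."""
    return sorted(loc for loc, held in where.items() if held == value)


# Toy program: the variable's value is defined in rax, copied to rbx,
# and then rax is reused for something else.
program = [
    ("def", "rax", None),
    ("copy", "rbx", "rax"),
    ("clobber", "rax", None),
]
history = track_locations(program)
variable_value = ValueID(1, 0)   # "the value defined by instruction 1, operand 0"
print(find_value(history[-1], variable_value))   # ['rbx']
```

The variable refers to ValueID(1, 0) throughout, so the copy and the later
reuse of rax do not lose the location: the value is simply found in rbx
afterwards.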
We've (Sony) tested this internally and it works well; Caroline kindly
tested it on some Google benchmarks and it seemed to work well
there too. It's currently in-tree and can be enabled by passing
-Xclang -fexperimental-debug-variable-locations to clang, or
-experimental-debug-variable-locations in LLVM options.
Probably the biggest caveat right now is that I'd like to only enable
for x86_64: this isn't due to any design limitation, it's just where
I've been doing all the testing, and it's largely untested on most
other architectures. I've run a stage2 cross compile to aarch64 for
clang as it's another popular arch, but that's all. There's a
migration guide in  of how other architectures can benefit: I would
suggest that other architectures opt into instruction referencing when
Here's a list of "remaining work", things I consider
incomplete about the new implementation:
* A bunch of tests need to be updated to check for the correct
outputs in this new mode, see D113194.
* Variadic variable locations aren't implemented -- this is probably
a week or two of work.
* I've got unit tests for maybe 85% of the important parts of the
"new" LiveDebugValues, but there are some gaps.
* It can be non-trivial for humans to interpret variable locations
in MIR; some printing improvements are needed.
There are some downsides that can be discussed too: most significantly
that it's slower on CTMark. The relevant configs there are
NewPM-ReleaseLTO-g and LegacyPM-ReleaseLTO-g, which are the only
configs that get optimisations and debug-info. Strictly speaking, this
slowdown is unavoidable because there's more information and more
accurate information being produced (one CTMark binary is almost 20%
larger due to extra debug-info), but it's still unfortunate. There are
some optimisations that can still be applied, and with instruction
referencing we can trivially compress all sequences of debug
instructions into a single instruction. Reid's experiments on
debug-instruction contribution to compile times (applied to IR not
MIR) suggest that could be a performance win. I should be able to
prototype this sometime soon.
* Support for i686 FP registers is nonexistent; because they form a
stack, they take extra effort to track, which I haven't done yet.
Normal DBG_VALUEs don't do particularly well either.
* In the original RFC I pointed out that we can define a debug
use-before-def as a missing location at the point of any instruction
not dominated by the def, and as a normal location at any instruction
dominated by the def. Turning on instruction referencing by default
means this interpretation is implicitly accepted.
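The use-before-def interpretation in the last item can be sketched as a
dominance check. This is a toy model, not the LiveDebugValues implementation:
the CFG, the (block, index) program points and the helper names dominators
and has_location are all assumptions made for illustration.

```python
def dominators(cfg, entry):
    """Iterative dominator sets for a CFG given as {block: [successors]}."""
    blocks = set(cfg)
    preds = {b: [p for p in cfg if b in cfg[p]] for b in blocks}
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            pred_doms = [dom[p] for p in preds[b]]
            common = set.intersection(*pred_doms) if pred_doms else set()
            new = {b} | common
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def has_location(dom, def_point, use_point):
    """True if the instruction at use_point is dominated by the def.

    Points are (block, index-within-block) pairs. Under the interpretation
    above, a dominated instruction sees a normal location and any other
    instruction sees a missing location."""
    def_block, def_idx = def_point
    use_block, use_idx = use_point
    if def_block == use_block:
        return use_idx > def_idx
    return def_block in dom[use_block]


# Diamond-shaped CFG: entry branches to left/right, which rejoin at exit.
cfg = {
    "entry": ["left", "right"],
    "left": ["exit"],
    "right": ["exit"],
    "exit": [],
}
dom = dominators(cfg, "entry")

# Def in 'left': visible later in 'left', but missing in 'exit', because
# 'exit' can also be reached through 'right'.
print(has_location(dom, ("left", 0), ("left", 1)))   # True
print(has_location(dom, ("left", 0), ("exit", 0)))   # False
```

A use placed earlier in the same block than the def is the plain
use-before-def case and reports a missing location in this model.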
In my opinion it's mature enough to turn on by default (for x86_64),
ideally in good time for LLVM14's branch date, and I'm confident that
the remaining work can be done by the branch date. What do other