[llvm] b4459b5 - [docs] Specify rules for updating debug locations

Thu Jun 18 14:05:54 PDT 2020

Author: Vedant Kumar
Date: 2020-06-18T14:05:45-07:00
New Revision: b4459b597a676065f666b75e46815758da09d4aa

URL: https://github.com/llvm/llvm-project/commit/b4459b597a676065f666b75e46815758da09d4aa
DIFF: https://github.com/llvm/llvm-project/commit/b4459b597a676065f666b75e46815758da09d4aa.diff

LOG: [docs] Specify rules for updating debug locations

Summary:
Restructure HowToUpdateDebugInfo.rst to specify rules for when
transformations should preserve, merge, or drop debug locations.

The goal is to have clear, well-justified rules that come with a few
examples and counter-examples, so that pass authors can pick the best
strategy for managing debug locations depending on the specific task at
hand.

I've tried to set down sensible rules here that mostly align with what
we already do in llvm today, and that take a diverse set of use cases
into account (interactive debugging, crash triage, SamplePGO).

Please *do* try to pick these rules apart and suggest clarifications or
improvements :).

Side note: Prior to 24660ea1, this document was structured as a long
list of very specific code transformations -- the idea being that we
would fill in what to do in each specific case. I chose to reorganize
the document as a list of actions to take because it drastically cuts
down on the amount of redundant exposition/explanation needed. I hope
that's fine...

Reviewers: jmorse, aprantl, dblaikie

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81198

Added: 
    

Modified: 
    llvm/docs/HowToUpdateDebugInfo.rst

Removed: 
    


################################################################################
diff  --git a/llvm/docs/HowToUpdateDebugInfo.rst b/llvm/docs/HowToUpdateDebugInfo.rst
index ef309d282ca3..f08722fb8eb9 100644

--- a/llvm/docs/HowToUpdateDebugInfo.rst
+++ b/llvm/docs/HowToUpdateDebugInfo.rst
@@ -18,11 +18,125 @@ info tests for arbitrary transformations.
 For more on the philosophy behind LLVM debugging information, see
 :doc:`SourceLevelDebugging`.
 
-IR-level transformations
-========================
+Rules for updating debug locations
+==================================
 
-Deleting an Instruction
------------------------
+.. _WhenToPreserveLocation:
+
+When to preserve an instruction location
+----------------------------------------
+
+A transformation should preserve the debug location of an instruction if the
+instruction either remains in its basic block, or if its basic block is folded
+into a predecessor that branches unconditionally. The APIs to use are
+``IRBuilder``, or ``Instruction::setDebugLoc``.
+
+The purpose of this rule is to ensure that common block-local optimizations
+preserve the ability to set breakpoints on source locations corresponding to
+the instructions they touch. Debugging, crash logs, and SamplePGO accuracy
+would be severely impacted if that ability were lost.
+
+Examples of transformations that should follow this rule include:
+
+* Instruction scheduling. Block-local instruction reordering should not drop
+  source locations, even though this may lead to jumpy single-stepping
+  behavior.
+
+* Simple jump threading. For example, if block ``B1`` unconditionally jumps to
+  ``B2``, *and* is its unique predecessor, instructions from ``B2`` can be
+  hoisted into ``B1``. Source locations from ``B2`` should be preserved.
+
+* Peephole optimizations that replace or expand an instruction, like ``(add X
+  X) => (shl X 1)``. The location of the ``shl`` instruction should be the same
+  as the location of the ``add`` instruction.
+
+* Tail duplication. For example, if blocks ``B1`` and ``B2`` both
+  unconditionally branch to ``B3`` and ``B3`` can be folded into its
+  predecessors, source locations from ``B3`` should be preserved.
+
+Examples of transformations for which this rule *does not* apply include:
+
+* LICM. E.g., if an instruction is moved from the loop body to the preheader,
+  the rule for :ref:`dropping locations<WhenToDropLocation>` applies.
+
+.. _WhenToMergeLocation:
+
+When to merge instruction locations
+-----------------------------------
+
+A transformation should merge instruction locations if it replaces multiple
+instructions with a single merged instruction, *and* that merged instruction
+does not correspond to any of the original instructions' locations. The API to
+use is ``Instruction::applyMergedLocation``.
+
+The purpose of this rule is to ensure that a) the single merged instruction
+has a location with an accurate scope attached, and b) to prevent misleading
+single-stepping (or breakpoint) behavior. Often, merged instructions are memory
+accesses which can trap: having an accurate scope attached greatly assists in
+crash triage by identifying the (possibly inlined) function where the bad
+memory access occurred. This rule is also meant to assist SamplePGO by banning
+scenarios in which a sample of a block containing a merged instruction is
+misattributed to a block containing one of the instructions-to-be-merged.
+
+Examples of transformations that should follow this rule include:
+
+* Merging identical loads/stores which occur on both sides of a CFG diamond
+  (see the ``MergedLoadStoreMotion`` pass).
+
+* Merging identical loop-invariant stores (see the LICM utility
+  ``llvm::promoteLoopAccessesToScalars``).
+
+* Peephole optimizations which combine multiple instructions together, like
+  ``(add (mul A B) C) => llvm.fma.f32(A, B, C)``.  Note that the location of
+  the ``fma`` does not exactly correspond to the locations of either the
+  ``mul`` or the ``add`` instructions.
+
+Examples of transformations for which this rule *does not* apply include:
+
+* Block-local peepholes which delete redundant instructions, like
+  ``(sext (zext i8 %x to i16) to i32) => (zext i8 %x to i32)``. The inner
+  ``zext`` is modified but remains in its block, so the rule for
+  :ref:`preserving locations<WhenToPreserveLocation>` should apply.
+
+* Converting an if-then-else CFG diamond into a ``select``. Preserving the
+  debug locations of speculated instructions can make it seem like a condition
+  is true when it's not (or vice versa), which leads to a confusing
+  single-stepping experience. The rule for
+  :ref:`dropping locations<WhenToDropLocation>` should apply here.
+
+* Hoisting identical instructions which appear in several successor blocks into
+  a predecessor block (see ``BranchFolder::HoistCommonCodeInSuccs``). In this
+  case there is no single merged instruction. The rule for
+  :ref:`dropping locations<WhenToDropLocation>` applies.
+
+.. _WhenToDropLocation:
+
+When to drop an instruction location
+------------------------------------
+
+A transformation should drop debug locations if the rules for
+:ref:`preserving<WhenToPreserveLocation>` and
+:ref:`merging<WhenToMergeLocation>` debug locations do not apply. The API to
+use is ``Instruction::setDebugLoc()``.
+
+The purpose of this rule is to prevent erratic or misleading single-stepping
+behavior in situations in which an instruction has no clear, unambiguous
+relationship to a source location.
+
+To handle an instruction without a location, the DWARF generator
+defaults to allowing the last-set location after a label to cascade forward, or
+to setting a line 0 location with viable scope information if no previous
+location is available.
+
+See the discussion in the section about
+:ref:`merging locations<WhenToMergeLocation>` for examples of when the rule for
+dropping locations applies.
+
+Rules for updating debug values
+===============================
+
+Deleting an IR-level Instruction
+--------------------------------
 
 When an ``Instruction`` is deleted, its debug uses change to ``undef``. This is
 a loss of debug info: the value of a one or more source variables becomes