[PATCH] D59790: [DebugInfo][Docs] Document how variable location metadata is transformed through target codegen

Tue Apr 2 19:08:47 PDT 2019

vsk added a comment.

Thanks so much for writing this down! Imho this is in pretty good shape (better than the drafts of similar docs I've thrown away). I've taken a look at the first third. I'll try to give a more detailed pass over the remaining bits later.

================
Comment at: docs/SourceLevelDebugging.rst:519

+Compiling debug information to CodeGen targets
+==============================================
----------------
Wdyt of keeping with the title of this patch and calling this 'How variable location metadata is used' (or 'transformed')?

================
Comment at: docs/SourceLevelDebugging.rst:522
+
+As well as preserving debug information during optimization, LLVM must also
+preserve debug information through backend code generation passes, and
----------------
Start with: 'LLVM preserves ... throughout mid-level and backend passes'? No need to mention file formats, per the patch summary.

================
Comment at: docs/SourceLevelDebugging.rst:532
+to a mapping from each instruction to a set of variable locations, which can
+then be encoded in the desired file format. Three transformations account for
+the most serious changes in debugging information, as explained below.
----------------
Isn't it the reverse? I thought we mapped variable locations to instruction ranges. Also, not sure it's necessary to mention encoding/serializing here.

================
Comment at: docs/SourceLevelDebugging.rst:533
+then be encoded in the desired file format. Three transformations account for
+the most serious changes in debugging information, as explained below.
+Instruction scheduling can significantly change the ordering of the program
----------------
Instead of mentioning serious changes, how about giving a very brief outline? Like: '''The major transformations which affect variable location fidelity include: <bulleted list>\n The following sections go into more detail about each transformation.'''

I really like that the sections are ordered by how they occur in the pipeline. Maybe mention that, so readers have an easier time forming a timeline of when things happen?

================
Comment at: docs/SourceLevelDebugging.rst:539
+---------------------------------------------------
+
+Within IR, variable locations or their memory address are always identified by
----------------
Can you start by mentioning that this transformation converts IR into MIR?

================
Comment at: docs/SourceLevelDebugging.rst:540
+
+Within IR, variable locations or their memory address are always identified by
+a Value. When instruction selection occurs and the IR becomes encoded in a MIR
----------------
Can we drop the 'or', or is there a need to distinguish 'location' from 'memory address'?

================
Comment at: docs/SourceLevelDebugging.rst:543
+function, there are many categories of location where the variable can be
+encoded. For example, a GEP may be folded into the memory operand of a machine
+instruction, or the operation of multiple IR instructions combined into one
----------------
Just: 'In MIR, variable locations may be identified in a number of different ways'?

================
Comment at: docs/SourceLevelDebugging.rst:544
+encoded. For example, a GEP may be folded into the memory operand of a machine
+instruction, or the operation of multiple IR instructions combined into one
+machine instruction (such as multiply-and-accumulate). To track variable
----------------
Can you split the second example out into its own sentence?

================
Comment at: docs/SourceLevelDebugging.rst:554
+otherwise transformed into a non-register, the variable location becomes
+undefined.
+
----------------
Can you clarify what is meant by undefined? Is this something that gets patched up later, or is the assignment lost?

================
Comment at: docs/SourceLevelDebugging.rst:556
+
+Now having a set of MIR locations for variables, machine pseudo-instructions
+corresponding to each ``llvm.dbg.value`` and ``llvm.dbg.addr`` intrinsic are
----------------
'After MIR locations are assigned to each variable'?

================
Comment at: docs/SourceLevelDebugging.rst:576
+The position at which the DBG_VALUEs are inserted should correspond to the
+positions of their matching ``llvm.dbg.value`` intrinsics in the IR block.
+To demonstrate some of this lowering, consider the following example:
----------------
Mention that LLVM makes a best-effort attempt to ensure that the positions of DBG_VALUEs in the instruction stream correspond to the order in which assignments/updates occur in the source? Maybe you already discuss this later, but this seems like a good place to bring it up.
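
To illustrate the ordering point, here is a rough hand-written sketch (not taken from the patch or from real compiler output; the value names, metadata numbers, and x86 opcodes are all made up for illustration, and the metadata definitions are elided). The idea is just that two updates to the same variable should keep their relative order after instruction selection:

```llvm
; IR: variable !12 is assigned twice, in source order.
  %add = add i32 %a, 1
  call void @llvm.dbg.value(metadata i32 %add, metadata !12, metadata !DIExpression()), !dbg !15
  %mul = mul i32 %add, %b
  call void @llvm.dbg.value(metadata i32 %mul, metadata !12, metadata !DIExpression()), !dbg !15

; Simplified MIR after instruction selection (illustrative only):
; each DBG_VALUE sits after the instruction defining its operand,
; and the two DBG_VALUEs appear in the same order as the intrinsics above.
  %2:gr32 = ADD32ri8 %0, 1, implicit-def dead $eflags
  DBG_VALUE %2, $noreg, !12, !DIExpression(), debug-location !15
  %3:gr32 = IMUL32rr %2, %1, implicit-def dead $eflags
  DBG_VALUE %3, $noreg, !12, !DIExpression(), debug-location !15
```

Something along these lines (with whatever the real output looks like) might make the 'best-effort' wording concrete for readers.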

Repository:
  rL LLVM

https://reviews.llvm.org/D59790