[LLVMdev] First-class debug info IR: MDLocation

Mon Oct 27 11:13:53 PDT 2014

> On 2014-Oct-27, at 01:19, Chandler Carruth <chandlerc at google.com> wrote:
> 
> Separate reply as the topics seem to have very little in common...
> 
> On Fri, Oct 24, 2014 at 4:16 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> Making debug info assembly readable and writable
> ================================================
> 
> Moreover, we're now in a place where it's trivial to express the
> "context" pointer structurally.  Here's the same debug info as above,
> using syntactic sugar to fill the "context" pointers:
> 
> FWIW, this doesn't make a huge difference to me in terms of readability other than avoiding ordering problems....
> 
> Bike sheds to paint
> ===================
> 
>  1. Should we trim some boilerplate?  E.g., it would be trivial to
>     change:
> 
>         !6 = metadata MDLocation(line: 43, column: 7, scope: !4)
> 
>     to:
> 
>         !6 = MDLocation(line: 43, column: 7, scope: !4)
> 
>     This would not complicate `LLParser`.  Thoughts?
> 
> If it is metadata, it should use 'metadata'.

FWIW, `metadata` is implied by the context of `!6 =`.  I don't
see this as different from dropping `metadata` from the reference,
`!dbg !6`.
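
To illustrate why the keyword is redundant, here is a minimal sketch in Python (not `LLParser` itself; the regular expression and the `parse_mdlocation` helper are hypothetical and only cover the single-line form shown above).  Making `metadata` optional costs one optional group:

```python
import re

# Hypothetical pattern for the proposed syntax; the `metadata` keyword
# is optional because `!N =` already implies a metadata definition.
_MDLOC = re.compile(
    r"^!(?P<id>\d+)\s*=\s*(?:metadata\s+)?MDLocation\((?P<fields>.*)\)\s*$"
)


def parse_mdlocation(line):
    """Parse one MDLocation definition into (node id, field dict)."""
    m = _MDLOC.match(line.strip())
    if m is None:
        raise ValueError("not an MDLocation definition: %r" % line)
    fields = {}
    for item in m.group("fields").split(","):
        if not item.strip():
            continue  # tolerate an empty field list
        name, _, value = item.partition(":")
        fields[name.strip()] = value.strip()
    return int(m.group("id")), fields


with_kw = parse_mdlocation(
    "!6 = metadata MDLocation(line: 43, column: 7, scope: !4)")
without_kw = parse_mdlocation(
    "!6 = MDLocation(line: 43, column: 7, scope: !4)")
```

Both spellings produce the same node id and fields, so nothing is lost by dropping the keyword.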

> The boilerplate isn't the problem here IMO.

Certainly not the biggest problem :).

>  2. Which of the two "end goal" syntaxes is better: flat, or
>     hierarchical?  Better for what?  Why?
> 
> Some points that might simplify things:
> 
> - The largest sources of overhead left for humans (once the fields are named and semantically de-obfuscated) are, IMO, the lack of symbolic constants from DWARF

Good point.  It wouldn't be hard to support proper names here
(like `DW_TAG_structure_type` instead of the `0x13` in my
example).
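
A rough sketch of what that could look like, in Python rather than the real printer/parser.  The tag values are the standard DWARF encodings, but the table is deliberately incomplete, and `format_tag`/`parse_tag` are hypothetical helper names:

```python
# A few DWARF tag encodings from the DWARF standard; not exhaustive.
DW_TAGS = {
    "DW_TAG_array_type": 0x01,
    "DW_TAG_pointer_type": 0x0F,
    "DW_TAG_compile_unit": 0x11,
    "DW_TAG_structure_type": 0x13,
    "DW_TAG_base_type": 0x24,
    "DW_TAG_subprogram": 0x2E,
}
TAG_NAMES = {value: name for name, value in DW_TAGS.items()}


def format_tag(value):
    """Print the symbolic name when known, else fall back to hex."""
    return TAG_NAMES.get(value, hex(value))


def parse_tag(text):
    """Accept either a symbolic name or a numeric literal."""
    if text in DW_TAGS:
        return DW_TAGS[text]
    return int(text, 0)  # base 0 accepts both "0x13" and "19"
```

Falling back to the raw number for unknown tags keeps the format round-trippable even when the table lags behind the standard.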

> and the lack of locality with the referencing instruction.
> 
> - I think there is likely a better inflection point in the tradeoff space between normalization and duplication. For example, I would be happy to see line and column repeated for every instruction *on* the instruction. Every time you save the reader an indirection through some "!019243" (which is totally unremarkable and hard to track) you win unless the size of the input is greatly changed.
> 
> - The conflict between humans and FileCheck is not as bad as I think you imagine. We have well-established techniques for handling cases where what the IR contains isn't useful for FileCheck, and/or what would be useful for FileCheck is terribly cumbersome to write: we have the printer inject comments which can then be used in FileCheck. I think the same technique would tremendously help both human *readers* (but not writers) and FileCheck with location information. My suggestion: print out a comment line of the form "# location: ..." for all the indirected information every time that indirected information changes, and print it *before* the first instruction with the new indirected information. If both line and column are attached directly to the instruction, this gives just a comment at the start of each function and after each file change within the function body. Enough to form reasonably bracketed FileCheck but not overly verbose, I suspect.

Interesting idea!  This would be easy to do.
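
Roughly, as a Python sketch rather than the actual assembly printer: the tuple layout, the `annotate` helper, and the way line and column are spelled on each instruction are all made up for illustration.  It keeps the "# location:" spelling from the suggestion, although IR comments start with `;`.

```python
def annotate(instructions):
    """Emit a location comment whenever the indirected info changes.

    `instructions` is a list of hypothetical
    (text, line, column, filename, scope) tuples.
    """
    out = []
    prev = None
    for text, line, column, filename, scope in instructions:
        indirect = (filename, scope)
        if indirect != prev:
            # Printed *before* the first instruction that uses it.
            out.append("# location: file: %s, scope: %s" % indirect)
            prev = indirect
        # Line and column stay directly on the instruction.
        out.append("  %s, line: %d, column: %d" % (text, line, column))
    return out


body = [
    ("%a = add i32 %x, %y", 43, 7, "a.c", "@f"),
    ("%b = mul i32 %a, %a", 44, 3, "a.c", "@f"),
    ("%c = sub i32 %b, %x", 12, 5, "a.h", "@f"),
]
annotated = annotate(body)
```

Here the three instructions get two comment lines: one at the start, and one where the file changes from `a.c` to `a.h`.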

> 
> - Once you've had an indirection from the actual IR structure to the debuginfo structure embedded down in the metadata, I agree that a structural form looks better.
> 
> Hope these thoughts help formulate even better looking IR. Not specifically trying to change any part of this patch or any other single patch.