[llvm-dev] RFC: Supporting macros in LLVM debug info

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 3 10:34:34 PST 2015


On Tue, Nov 3, 2015 at 10:19 AM, Aboud, Amjad <amjad.aboud at intel.com> wrote:

> > Not necessarily, if we kept the macros in order in the list of macros
> attached to the CU, which I imagine we would.
>
> OK, now I understand what you are aiming for. I do not really favor one
> over the other.
>
> But can you explain what the advantage of the parent approach over the
> children approach is?
>

Not too much in it, really. The only thing I'd wonder about is whether the
parent approach would work better for LTO or not. Bit of a toss-up perhaps.
If each file generally produces the same macros (i.e., no per-file macro
weirdness causing different sets of macros to come out of the same file)
then a parent->child structure should deduplicate fine under LTO, I think.

How is the macinfo referenced by the rest of the debug info?


> If anything, the children approach seems to be the one that reduces the
> LLVM IR size, doesn't it?
>

Hmm... yes, good point. I suppose the parent approach would involve twice as
many pointers to the relevant nodes (the child nodes move from the parent's
child list to the child's parent pointer, a zero-cost change, but then you
add another pointer to each child from the primary list).
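
To make the pointer count concrete, here is a rough Python sketch that tallies references for the example tree from the original proposal. The nested-list model and the function names are hypothetical, and it assumes every node (files included) is also listed on the compile unit in the parent layout, with a null parent still occupying a slot.

```python
# Hypothetical model: a macro is a string, a file is (name, [children]).

def count_nodes(children):
    """Count every macro node and file node in the nested structure."""
    total = 0
    for child in children:
        total += 1
        if isinstance(child, tuple):
            total += count_nodes(child[1])
    return total


def refs_children_style(top_level):
    # Each node is referenced once: from its parent's child list, or from
    # the compile unit's list if it is a top-level node.
    return count_nodes(top_level)


def refs_parent_style(top_level):
    # Each node holds one parent pointer, and the compile unit's ordered
    # list holds one more pointer to every node.
    return 2 * count_nodes(top_level)


example = [
    "M3",
    ("mainfile.c", [
        ("myfile2.h", ["M4"]),
        "M1",
        ("myfile.h", ["undef M1", "M1"]),
        "M2",
    ]),
]

print(refs_children_style(example))
print(refs_parent_style(example))
```

This only counts references; it says nothing about how well either layout deduplicates under LTO.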

OK - yeah, I'm fine with a top down design as you have it. (just took me a
little while to think through - since most of our structures are bottom up
to allow new things to be added later/merged during LTO, etc, but that
should be relatively uncommon in this case since we'll be emitting /all/
the macros in a given file (that are enabled, and differently enabled
features in the same program in different files should be relatively
uncommon))


>
>
> Regards,
>
> Amjad
>
>
>
> *From:* David Blaikie [mailto:dblaikie at gmail.com]
> *Sent:* Tuesday, November 03, 2015 18:46
>
> *To:* Aboud, Amjad
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] RFC: Supporting macros in LLVM debug info
>
>
>
>
>
>
>
> On Tue, Nov 3, 2015 at 12:16 AM, Aboud, Amjad <amjad.aboud at intel.com>
> wrote:
>
> > Do we really need to touch the AST? Or would it be reasonable to wire up
> the CGDebugInfo directly to the PPCallbacks, if it isn't already? (perhaps
> it is already wired up for other reasons?)
>
> This sounds like a good idea; I will check that approach.
>
> PPCallbacks is only an interface with nothing connected to it, but we will
> create a new class for macros, which implements PPCallbacks.
>
>
>
> Right - I was wondering if CGDebugInfo already implemented PPCallbacks or
> was otherwise being notified of PPCallback related things, possibly through
> a layer or two of indirection.
>
>
>
> So we can connect whatever we want to that class.
>
> The only drawback with this approach is that we can only test the
> frontend through the generated LLVM IR, i.e. the whole path, instead of
> having two tests: AST for testing the parser, and LLVM IR for testing the
> Sema.
>
>
>
> We don't usually do direct AST tests in Clang for debug info (or for many
> things, really) - we just do source -> llvm IR anyway, so that's nothing
> out of the ordinary.
>
>
>
>
>
> > I wonder if it'd be better to use a parent chain style approach (DIMacro
> has a DIMacroFile it refers to, each DIMacroFile has another one that it
> refers to, up to null)?
> > (does it ever make sense/need to have a DIMacroFile without any macros
> in it? I assume not?)
> First, it seems that GCC does emit a MacroFile that has no macros inside (I
> understand that it might not be useful, but I am not sure if we should
> ignore that or not).
>
>
>
> Yeah, that's weird - I'd sort of be inclined to skip it until we know what
> it's useful for.
>
>
>
> Second, I assume that you are suggesting the parent chain style instead of
> the current children style, right?
>
>
>
> Correct
>
>
>
> In this case, won’t it make the debug emitter code much more complicated
> to figure out the DFS tree,
>
>
>
> I don't quite imagine it would be more complicated - we would just be
> building the file parent chain as we go, and keeping the current macro file
> around to be used as the parent to any macros we create.
>
>
>
> which should be emitted for the macros, not to mention the macro order,
> which will be lost?
>
>
>
> Not necessarily, if we kept the macros in order in the list of macros
> attached to the CU, which I imagine we would.
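
A minimal Python sketch of the parent chain idea, assuming the compile unit keeps its macros in source order and each macro points at the file it came from. The class names `MacroFile` and `Macro` and the helper `emit` are hypothetical, not LLVM classes. It rebuilds the nested start_file/end_file sequence by comparing each macro's parent chain with the files currently open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MacroFile:
    name: str
    line: int                       # line of the #include (0 for the main file)
    parent: Optional["MacroFile"]   # None for the main source file


@dataclass(frozen=True)
class Macro:
    kind: str                       # "define" or "undef"
    line: int
    text: str
    parent: Optional[MacroFile]     # None for command-line macros


def chain(f: Optional[MacroFile]) -> List[MacroFile]:
    """Files from the outermost one down to f (empty for a null parent)."""
    out = []
    while f is not None:
        out.append(f)
        f = f.parent
    return out[::-1]


def emit(macros: List[Macro]) -> List[str]:
    """Rebuild the nested start_file/end_file sequence from parent chains."""
    lines: List[str] = []
    open_files: List[MacroFile] = []
    for m in macros:
        want = chain(m.parent)
        # Keep the files shared with the previous macro, close the rest.
        common = 0
        while (common < len(open_files) and common < len(want)
               and open_files[common] is want[common]):
            common += 1
        lines.extend("end_file" for _ in open_files[common:])
        del open_files[common:]
        for f in want[common:]:
            lines.append(f"start_file {f.line} {f.name}")
            open_files.append(f)
        lines.append(f"{m.kind} {m.line} {m.text}")
    lines.extend("end_file" for _ in open_files)
    return lines


main = MacroFile("mainfile.c", 0, None)
hdr2 = MacroFile("myfile2.h", 0, main)
hdr1 = MacroFile("myfile.h", 2, main)
macros = [
    Macro("define", 0, "M3 Value3", None),
    Macro("define", 0, "M4 Value4", hdr2),
    Macro("define", 1, "M1 Value1", main),
    Macro("undef", 4, "M1", hdr1),
    Macro("define", 5, "M1 NewValue1", hdr1),
    Macro("define", 3, "M2(x,y) ( (x) + (y) * Value2)", main),
]
result = emit(macros)
print("\n".join(result))
```

One consequence of this layout is visible in the sketch: a file that contains no macros is never reachable from the ordered list, so it produces no start_file/end_file pair.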
>
>
>
> Also, remember that the command line macros have no DIMacroFile parent.
>
>
>
> Fair - they could have the null parent, potentially.
>
>
>
> However, if you meant to use the parent chain in addition to the children
> list, then what extra information will it give us?
>
>
>
> >Might be good to start with dwarfdump support - seems useful regardless
> of anything else?
>
> I agree, and in fact, I already have this code implemented, will upload it
> for review soon.
>
>
>
> Cool
>
>
>
>
>
> Thanks,
>
> Amjad
>
>
>
> *From:* David Blaikie [mailto:dblaikie at gmail.com]
> *Sent:* Tuesday, November 03, 2015 00:32
> *To:* Aboud, Amjad
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] RFC: Supporting macros in LLVM debug info
>
>
>
>
>
>
>
> On Wed, Oct 28, 2015 at 7:56 AM, Aboud, Amjad via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
> I would like to implement macro debug info support in LLVM.
>
> Below you will find 4 parts:
>
> 1.      Background on what it means to debug macros.
>
> 2.      A brief explanation of how macro debug info is represented in
> DWARF 4.0.
>
> 3.      The suggested design.
>
> 4.      A full example: Source -> AST -> LLVM IR -> DWARF.
>
>
>
> Feel free to skip the first two parts if you already know the background.
>
> Please let me know if you have any comments or feedback on this approach.
>
>
>
> Thanks,
>
> Amjad
>
>
>
> *[Background]*
>
> There are two kinds of macro definitions:
>
> 1. Simple macro definition, e.g.  #define M1 Value1
>
> 2. Function-like macro definition, e.g. #define M2(x, y)  (x) + (y)
>
> A macro's scope starts at its "#define" directive and ends at the
> corresponding "#undef" directive.
>
>
>
> GDB supports debugging macros. This means it can evaluate a macro
> expression for any macro whose scope covers the location of the
> current breakpoint.
>
> For example:
>
> GDB command: print M2(3, 5)
>
> GDB Result: 8
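
As a rough illustration of what such an evaluation involves, the Python sketch below expands the function-like macro M2 from the text by plain textual substitution and then evaluates the result as arithmetic. The helper `expand_function_macro` is hypothetical, and Python's `eval` merely stands in for a debugger's expression evaluator; this is not how GDB is implemented.

```python
import re


def expand_function_macro(params, body, args):
    """Textually substitute arguments for parameters in a macro body."""
    if len(params) != len(args):
        raise ValueError("argument count does not match parameter count")
    if not params:
        return body
    mapping = dict(zip(params, args))
    # \b keeps 'x' from matching inside a longer identifier such as 'max'.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, params)) + r")\b")
    # A single pass, so substituted arguments are not rescanned for parameters.
    return pattern.sub(lambda m: mapping[m.group(1)], body)


# #define M2(x, y)  (x) + (y)
expanded = expand_function_macro(["x", "y"], "(x) + (y)", ["3", "5"])
print(expanded)

# Evaluate the expanded text as plain arithmetic, with no builtins available.
value = eval(expanded, {"__builtins__": {}}, {})
print(value)
```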
>
>
>
> GDB can evaluate the macro expression based on the ".debug_macinfo"
> section (DWARF 4.0).
>
>
>
> *[DWARF 4.0 ".debug_macinfo" section]*
>
> In this section there are 4 kinds of entries:
>
> 1.      DW_MACINFO_define
>
> 2.      DW_MACINFO_undef
>
> 3.      DW_MACINFO_start_file
>
> 4.      DW_MACINFO_end_file
>
>
>
> Note: There is a 5th kind of entry (DW_MACINFO_vendor_ext) for
> vendor-specific macro information, which we do not need to support.
>
>
>
> The first two entries contain the line number where the macro is
> defined/undefined, and a null-terminated string, which contains the macro
> name (followed by the replacement value in case of a definition, or by a
> list of parameters and then the replacement value in case of a
> function-like macro definition).
>
> The third entry contains the line where the file was included, followed by
> the file id (an index into the file table in the debug line section).
>
> The fourth entry contains nothing; it just closes the most recent open
> entry of the third kind (start_file).
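
The entry layout described above can be sketched as a small Python encoder. The type codes are the DWARF 4 values and the numeric operands are unsigned LEB128; the helper names (`define`, `undef`, `start_file`, `end_file`) are hypothetical and this is not LLVM's emitter.

```python
# DWARF 4 macinfo type codes.
DW_MACINFO_define = 0x01
DW_MACINFO_undef = 0x02
DW_MACINFO_start_file = 0x03
DW_MACINFO_end_file = 0x04


def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def define(line, text):
    # type code, line number, null-terminated "name value" string
    return bytes([DW_MACINFO_define]) + uleb128(line) + text.encode() + b"\0"


def undef(line, name):
    return bytes([DW_MACINFO_undef]) + uleb128(line) + name.encode() + b"\0"


def start_file(line, file_index):
    return bytes([DW_MACINFO_start_file]) + uleb128(line) + uleb128(file_index)


def end_file():
    return bytes([DW_MACINFO_end_file])


blob = start_file(0, 1) + define(1, "M1 Value1") + undef(4, "M1") + end_file()
print(blob.hex())
```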
>
>
>
> Macro definition and file inclusion entries must appear in the same order
> as they appear in the source file. All macro entries between a
> "start_file" entry and its matching "end_file" entry represent macros that
> appear directly or indirectly in the included file.
>
>
>
> Special cases:
>
> 1.      The main source file should be the first "start_file" entry in
> the sequence, and should have line number "0".
>
> 2.      Command line/Compiler definitions must also have line number "0"
> but must appear before the first "start_file" entry.
>
> 3.      Command line include files must also have line number "0" but
> will appear straight after the "start_file" of the main source.
>
>
>
> *[Design]*
>
> To support macros the following components need to be modified: Clang,
> LLVM IR, Dwarf Debug emitter.
>
>
>
> In clang, we need to handle these source directives:
>
> 1.      #define
>
> 2.      #undef
>
> 3.      #include
>
> The idea is to make use of the "PPCallbacks" class, which allows the
> preprocessor to notify the parser each time one of the above directives
> occurs.
>
> These are the callbacks that should be implemented:
>
> "MacroDefined", "MacroUndefined", "FileChanged", and "InclusionDirective".
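
A rough Python sketch of how such callbacks could build the nested structure. The class `MacroRecorder` and its method names are hypothetical stand-ins: they do not reproduce the real Clang signatures, and entering and leaving a file are modeled as two separate events for simplicity.

```python
class MacroRecorder:
    """Hypothetical stand-in for a PPCallbacks implementation."""

    def __init__(self):
        self.top_level = []               # nodes attached to the compile unit
        self._stack = [self.top_level]    # child lists of the files being read

    def macro_defined(self, line, name, value):
        self._stack[-1].append(("define", line, name, value))

    def macro_undefined(self, line, name):
        self._stack[-1].append(("undef", line, name))

    def file_entered(self, include_line, filename):
        children = []
        self._stack[-1].append(("file", include_line, filename, children))
        self._stack.append(children)

    def file_exited(self):
        if len(self._stack) == 1:
            raise RuntimeError("file_exited without a matching file_entered")
        self._stack.pop()


# Replay the events of the example from this message (compiler macros omitted).
rec = MacroRecorder()
rec.macro_defined(0, "M3", "Value3")          # -DM3=Value3
rec.file_entered(0, "mainfile.c")
rec.file_entered(0, "myfile2.h")              # -include myfile2.h
rec.macro_defined(0, "M4", "Value4")
rec.file_exited()
rec.macro_defined(1, "M1", "Value1")
rec.file_entered(2, "myfile.h")
rec.macro_undefined(4, "M1")
rec.macro_defined(5, "M1", "NewValue1")
rec.file_exited()
rec.macro_defined(3, "M2(x,y)", "( (x) + (y) * Value2)")
rec.file_exited()

print(rec.top_level)
```

Keeping a stack of open child lists means each callback only touches the innermost file, which is the same bookkeeping regardless of whether the result is stored in the AST or handed straight to the debug info generator.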
>
>
>
> AST will be extended to support two new DECL types: "MacroDecl" and
> "FileIncludeDecl".
>
>
>
> Do we really need to touch the AST? Or would it be reasonable to wire up
> the CGDebugInfo directly to the PPCallbacks, if it isn't already? (perhaps
> it is already wired up for other reasons?)
>
>
>
> Where "FileIncludeDecl" AST might contain other
> "FileIncludeDecl"/"MacroDecl" ASTs.
>
> These two new AST DECLs are not part of TranslationUnitDecl and are
> handled separately (see AST example below).
>
>
>
> In the LLVM IR, metadata debug info will be extended to support new DIs as
> well:
>
> "DIMacro", "DIFileInclude", and "MacroNode".
>
> The last is needed as we cannot use DINode as a base class of the "DIMacro"
> and "DIFileInclude" nodes.
>
>
>
> DIMacro will contain:
>
> ·        type (definition/undefinition).
>
> ·        line number (integer).
>
> ·        name (null terminated string).
>
> ·        replacement value  (null terminated string - optional).
>
>
>
> DIFileInclude will contain:
>
> ·        line number (integer).
>
> ·        file (DIFile).
>
> ·        macro list (MacroNodeArray) - optional.
>
>
>
> I wonder if it'd be better to use a parent chain style approach (DIMacro
> has a DIMacroFile it refers to, each DIMacroFile has another one that it
> refers to, up to null)?
> (does it ever make sense/need to have a DIMacroFile without any macros in
> it? I assume not?)
>
>
> Might be good to start with dwarfdump support - seems useful regardless of
> anything else?
>
>
>
>
>
> In addition, the DICompileUnit will contain a new optional field of macro
> list of type (MacroNodeArray).
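
With the children-style design above, producing the flat DWARF sequence is a depth-first walk over the compile unit's macro list. The Python sketch below assumes a hypothetical tuple encoding of the nodes and a `file_numbers` mapping standing in for the line table's file indices; the output format mimics the dwarfdump listing in the example later in this message.

```python
def dump(nodes, file_numbers, indent=0):
    """Depth-first walk producing one text line per macinfo entry."""
    pad = "  " * indent
    lines = []
    for node in nodes:
        kind = node[0]
        if kind == "file":
            _, line, filename, children = node
            lines.append(f"{pad}DW_MACINFO_start_file - lineno: {line} "
                         f"filenum: {file_numbers[filename]}")
            lines.extend(dump(children, file_numbers, indent + 1))
            lines.append(f"{pad}DW_MACINFO_end_file")
        elif kind == "define":
            _, line, name, value = node
            lines.append(f"{pad}DW_MACINFO_define - lineno: {line} "
                         f"macro: {name} {value}")
        elif kind == "undef":
            _, line, name = node
            lines.append(f"{pad}DW_MACINFO_undef - lineno: {line} macro: {name}")
        else:
            raise ValueError(f"unknown node kind: {kind}")
    return lines


# The example tree, with the compiler-defined macros omitted.
cu_macros = [
    ("define", 0, "M3", "Value3"),
    ("file", 0, "mainfile.c", [
        ("file", 0, "myfile2.h", [("define", 0, "M4", "Value4")]),
        ("define", 1, "M1", "Value1"),
        ("file", 2, "myfile.h", [
            ("undef", 4, "M1"),
            ("define", 5, "M1", "NewValue1"),
        ]),
        ("define", 3, "M2(x,y)", "( (x) + (y) * Value2)"),
    ]),
]
file_numbers = {"mainfile.c": 1, "myfile2.h": 2, "myfile.h": 3}

output = dump(cu_macros, file_numbers)
print("\n".join(output))
```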
>
>
>
> Finally, I assume that macro support should be disabled by default, and
> there should be a flag to enable this feature. I would say that we should
> introduce a new specific flag, e.g. "-gmacro", that could be used with
> "-g".
>
>
>
> *[Example]*
>
> Here is an example that demonstrates the macro support from
> Source->AST->LLVM IR->DWARF.
>
>
>
> Source
>
> =========================================================
>
> mainfile.c:
>
>
> --------------------------------------------------------------------------------------
>
> 1. #define M1 Value1
>
> 2. #include "myfile.h"
>
> 3. #define M2( x , y)   ( (x)    + (y)  * Value2)
>
>
> --------------------------------------------------------------------------------------
>
>
>
> myfile.h:
>
>
> --------------------------------------------------------------------------------------
>
> 1.
>
> 2.
>
> 3.
>
> 4. #undef M1
>
> 5. #define M1 NewValue1
>
>
> --------------------------------------------------------------------------------------
>
>
>
> myfile2.h:
>
>
> --------------------------------------------------------------------------------------
>
> 1. #define M4 Value4
>
>
> --------------------------------------------------------------------------------------
>
> =========================================================
>
>
>
> Command line:
>
> clang -c -g -gmacro -O0 -DM3=Value3 -include myfile2.h mainfile.c
>
>
>
>
>
> AST
>
> =========================================================
>
> MacroDecl 0xd6c5c0 <<invalid sloc>> <invalid sloc> __llvm__ defined
>
> MacroDecl 0xd6c618 <<invalid sloc>> <invalid sloc> __clang__ defined
>
>
>
> … <More compiler macros> …
>
>
>
> MacroDecl 0x11c01b0 <<invalid sloc>> <invalid sloc> M3 defined
>
> FileIncludeDecl 0x11c0208 <mainfile.c:1:1> col:1
>
> |-FileIncludeDecl 0x11c0238 <myfile2.h:1:1> col:1
>
> | `-MacroDecl 0x11c0268 <<invalid sloc>> <invalid sloc> M4 defined
>
> |-MacroDecl 0x11c02c0 <mainfile.c:1:9> col:9 M1 defined
>
> |-FileIncludeDecl 0x11c0318 <myfile.h:1:1> col:1
>
> | |-MacroDecl 0x11c0348 <line:4:8> col:8 M1 undefined
>
> | `-MacroDecl 0x11c03a0 <line:5:9> col:9 M1 defined
>
> `-MacroDecl 0x11c03f8 <mainfile.c:3:9> col:9 M2 defined
>
> TranslationUnitDecl 0xd6c078 <<invalid sloc>> <invalid sloc>
>
> |-TypedefDecl 0xd6c330 <<invalid sloc>> <invalid sloc> implicit __int128_t
> '__int128'
>
> |-TypedefDecl 0xd6c370 <<invalid sloc>> <invalid sloc> implicit
> __uint128_t 'unsigned __int128'
>
> |-TypedefDecl 0xd6c3c8 <<invalid sloc>> <invalid sloc> implicit
> __builtin_ms_va_list 'char *'
>
> `-TypedefDecl 0xd6c590 <<invalid sloc>> <invalid sloc> implicit
> __builtin_va_list 'struct __va_list_tag [1]'
>
> =========================================================
>
>
>
>
>
> LLVM IR
>
> =========================================================
>
> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>
> target triple = "x86_64-pc-linux"
>
>
>
> !llvm.dbg.cu = !{!0}
>
> !llvm.module.flags = !{!327}
>
> !llvm.ident = !{!328}
>
>
>
> !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer:
> "clang version 3.8.0 (trunk 251321)", isOptimized: false, runtimeVersion:
> 0, emissionKind: 1, enums: !2, macros: !3)
>
> !1 = !DIFile(filename: "mainfile.c", directory: "/")
>
> !2 = !{}
>
> !3 = !{!4, !5, … <More compiler macros> … , !312, !313}
>
> !4 = !DIMacro(macro type: DW_MACINFO_define, name: "__llvm__", value: !"1")
>
> !5 = !DIMacro(macro type: DW_MACINFO_define, name: "__clang__", value:
> !"1")
>
>
>
> … <More compiler macros> …
>
>
>
> !312 = !DIMacro(macro type: DW_MACINFO_define, name: "M3", value:
> !"Value3")
>
> !313 = !DIFileInclude(file: !314, nodes: !315)
>
> !314 = !DIFile(filename: "mainfile.c", directory: "/")
>
> !315 = !{!316, !320, !321, !326}
>
> !316 = !DIFileInclude(file: !317, nodes: !318)
>
> !317 = !DIFile(filename: "myfile2.h", directory: "/")
>
> !318 = !{!319}
>
> !319 = !DIMacro(macro type: DW_MACINFO_define, name: "M4", value:
> !"Value4")
>
> !320 = !DIMacro(macro type: DW_MACINFO_define, name: "M1", line: 1, value:
> !"Value1")
>
> !321 = !DIFileInclude(line: 2, file: !322, nodes: !323)
>
> !322 = !DIFile(filename: "myfile.h", directory: "/")
>
> !323 = !{!324, !325}
>
> !324 = !DIMacro(macro type: DW_MACINFO_undef, name: "M1", line: 4)
>
> !325 = !DIMacro(macro type: DW_MACINFO_define, name: "M1", line: 5, value:
> !"NewValue1")
>
> !326 = !DIMacro(macro type: DW_MACINFO_define, name: "M2(x,y)", line: 3,
> value: !"( (x) + (y) * Value2)")
>
> !327 = !{i32 2, !"Debug Info Version", i32 3}
>
> !328 = !{!"clang version 3.8.0 (trunk 251321)"}
>
> =========================================================
>
>
>
>
>
> DWARF
>
> =========================================================
>
> Command line: llvm-dwarfdump.exe -debug-dump=macro mainfile.o
>
>
> --------------------------------------------------------------------------------------
>
> mainfile.o:  file format ELF64-x86-64
>
>
>
> .debug_macinfo contents:
>
> DW_MACINFO_define - lineno: 0 macro: __llvm__ 1
>
> DW_MACINFO_define - lineno: 0 macro: __clang__ 1
>
>
>
> … <More compiler macros> …
>
>
>
> DW_MACINFO_define - lineno: 0 macro: M3 Value3
>
> DW_MACINFO_start_file - lineno: 0 filenum: 1
>
>   DW_MACINFO_start_file - lineno: 0 filenum: 2
>
>     DW_MACINFO_define - lineno: 0 macro: M4 Value4
>
>   DW_MACINFO_end_file
>
>   DW_MACINFO_define - lineno: 1 macro: M1 Value1
>
>   DW_MACINFO_start_file - lineno: 2 filenum: 3
>
>     DW_MACINFO_undef - lineno: 4 macro: M1
>
>     DW_MACINFO_define - lineno: 5 macro: M1 NewValue1
>
>   DW_MACINFO_end_file
>
>   DW_MACINFO_define - lineno: 3 macro: M2(x,y) ( (x) + (y) * Value2)
>
> DW_MACINFO_end_file
>
>
>
>
> --------------------------------------------------------------------------------------
>
> Command line: llvm-dwarfdump.exe -debug-dump=line mainfile.o
>
>
> --------------------------------------------------------------------------------------
>
> .debug_line contents:
>
>
>
> … <Other line table Info> …
>
>
>
>                 Dir  Mod Time   File Len   File Name
>
>                 ---- ---------- ---------- ---------------------------
>
> file_names[  1]    1 0x00000000 0x00000000 mainfile.c
>
> file_names[  2]    1 0x00000000 0x00000000 myfile2.h
>
> file_names[  3]    1 0x00000000 0x00000000 myfile.h
>
> =========================================================
>
>
>
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>