[llvm-dev] [cfe-dev] RFC: Supporting macros in LLVM debug info

Richard Smith via llvm-dev llvm-dev at lists.llvm.org
Fri Nov 13 14:41:58 PST 2015


On Fri, Nov 13, 2015 at 10:21 AM, David Blaikie via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> On Mon, Nov 9, 2015 at 4:00 AM, Aboud, Amjad <amjad.aboud at intel.com>
> wrote:
>
>> I found a way to skip representing macros in AST and create them directly
>> in CGDebugInfo through PPCallbacks during preprocessing.
>>
>> To do that, I needed to extend the ASTConsumer interface with this extra
>> method:
>>
>>   /// If the consumer is interested in notifications from the Preprocessor,
>>   /// for example notifications on macro definitions, it should return
>>   /// a pointer to a PPCallbacks here.
>>   /// The caller takes ownership of the returned pointer.
>>   virtual PPCallbacks *CreatePreprocessorCallbacks() { return nullptr; }
>>
>>
>> Then ParseAST can use it to add the preprocessor callbacks needed by the
>> AST consumer to the preprocessor:
>>
>>   S.getPreprocessor().addPPCallbacks(
>>       std::unique_ptr<PPCallbacks>(Consumer->CreatePreprocessorCallbacks()));
>>
>
> (CreatePreprocessorCallbacks, if that's the path we take, should return a
> unique_ptr itself rather than returning a raw ownership-passing pointer,
> but that's a minor API detail)
>
>
>>
>>
>> With this approach, the change in clang to support macros is very small.
>>
>>
>>
>> Do you agree with this approach?
>>
>
> Richard - what do you reckon's the right hook/path to get preprocessor
> info through to codegen (& CGDebugInfo in particular). Would a general
> purpose hook in the ASTConsumer be appropriate/useful?
>

ASTConsumer shouldn't know anything about the preprocessor; there's no
reason to think, in general, that the AST is being produced by
preprocessing and parsing some text. Perhaps adding a PreprocessorConsumer
interface akin to the existing SemaConsumer interface would be a better way
to go.
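
As a rough illustration of that separation, the sketch below uses small stand-in types rather than the real clang classes. PreprocessorConsumer, InitializePreprocessor, DebugInfoConsumer, PlainConsumer and connectConsumer are all hypothetical names, and the dynamic_cast is only a placeholder for whatever opt-in mechanism clang would actually use. It shows an ASTConsumer that stays unaware of the preprocessor, while a consumer that wants macro notifications opts in through a separate interface.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for clang::PPCallbacks; only the one hook this sketch needs.
struct PPCallbacks {
  virtual ~PPCallbacks() = default;
  virtual void MacroDefined(const std::string &) {}
};

// Stand-in for clang::Preprocessor: owns its callbacks and fires them.
class Preprocessor {
  std::vector<std::unique_ptr<PPCallbacks>> Callbacks;

public:
  void addPPCallbacks(std::unique_ptr<PPCallbacks> C) {
    Callbacks.push_back(std::move(C));
  }
  // Pretend a "#define Name ..." directive was just processed.
  void defineMacro(const std::string &Name) {
    for (auto &C : Callbacks)
      C->MacroDefined(Name);
  }
};

// Stand-in for clang::ASTConsumer: it knows nothing about the preprocessor.
struct ASTConsumer {
  virtual ~ASTConsumer() = default;
};

// Hypothetical opt-in interface for consumers that want preprocessor events.
struct PreprocessorConsumer {
  virtual ~PreprocessorConsumer() = default;
  virtual void InitializePreprocessor(Preprocessor &PP) = 0;
};

// Hypothetical consumer that records macro names, as a debug info
// generator might.
class DebugInfoConsumer : public ASTConsumer, public PreprocessorConsumer {
  struct Recorder : PPCallbacks {
    std::vector<std::string> &Out;
    explicit Recorder(std::vector<std::string> &Out) : Out(Out) {}
    void MacroDefined(const std::string &Name) override {
      Out.push_back(Name);
    }
  };

public:
  std::vector<std::string> Macros;
  void InitializePreprocessor(Preprocessor &PP) override {
    // Ownership of the callbacks passes to the preprocessor via unique_ptr.
    PP.addPPCallbacks(std::make_unique<Recorder>(Macros));
  }
};

// A consumer that does not care about the preprocessor at all.
struct PlainConsumer : ASTConsumer {};

// Driver side: only consumers that opted in are connected.
void connectConsumer(ASTConsumer &C, Preprocessor &PP) {
  if (auto *PC = dynamic_cast<PreprocessorConsumer *>(&C))
    PC->InitializePreprocessor(PP);
}
```

With this shape, a consumer fed from something other than preprocessed text simply never has InitializePreprocessor called on it.
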

>> Thanks,
>>
>> Amjad
>>
>>
>>
>> *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of *Aboud,
>> Amjad via llvm-dev
>> *Sent:* Thursday, November 05, 2015 16:56
>> *To:* David Blaikie
>>
>> *Cc:* llvm-dev at lists.llvm.org
>> *Subject:* Re: [llvm-dev] RFC: Supporting macros in LLVM debug info
>>
>>
>>
>> > Right - I was wondering if CGDebugInfo already implemented PPCallbacks
>> or was otherwise being notified of PPCallback related things, possibly
>> through a layer or two of indirection.
>>
>>
>>
>> I checked the approach of not representing macros in the AST and instead
>> communicating them directly from the Parser to CGDebugInfo.
>>
>> However, I could not find a way to initialize this communication.
>>
>> The only interfaces available through the Parser are Sema (to create an
>> AST) and ASTConsumer, while CGDebugInfo is only available in the
>> CodeGenModule, which is accessible from BackendConsumer, which implements
>> ASTConsumer.
>>
>>
>>
>> David, skipping the AST will save a lot of code, but I need help figuring
>> out how to communicate with the CGDebugInfo.
>>
>>
>>
>> Thanks,
>>
>> Amjad
>>
>>
>>
>> *From:* David Blaikie [mailto:dblaikie at gmail.com]
>> *Sent:* Tuesday, November 03, 2015 18:46
>> *To:* Aboud, Amjad
>> *Cc:* llvm-dev at lists.llvm.org
>> *Subject:* Re: [llvm-dev] RFC: Supporting macros in LLVM debug info
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Nov 3, 2015 at 12:16 AM, Aboud, Amjad <amjad.aboud at intel.com>
>> wrote:
>>
>> > Do we really need to touch the AST? Or would it be reasonable to wire
>> up the CGDebugInfo directly to the PPCallbacks, if it isn't already?
>> (perhaps it is already wired up for other reasons?)
>>
>> This sounds like a good idea; I will check that approach.
>>
>> PPCallbacks is only an interface with nothing connected to it, but we
>> will create a new class that implements PPCallbacks for macros.
>>
>>
>>
>> Right - I was wondering if CGDebugInfo already implemented PPCallbacks or
>> was otherwise being notified of PPCallback related things, possibly through
>> a layer or two of indirection.
>>
>>
>>
>> So we can connect whatever we want to that class.
>>
>> The only drawback with this approach is that we can only test the frontend
>> using the generated LLVM IR, i.e. the whole path, instead of having two
>> tests: AST for testing the parser, and LLVM IR for testing Sema.
>>
>>
>>
>> We don't usually do direct AST tests in Clang for debug info (or for many
>> things, really) - we just do source -> llvm IR anyway, so that's nothing
>> out of the ordinary.
>>
>>
>>
>>
>>
>> > I wonder if it'd be better to use a parent chain style approach
>> (DIMacro has a DIMacroFile it refers to, each DIMacroFile has another one
>> that it refers to, up to null)?
>> > (does it ever make sense/need to have a DIMacroFile without any macros
>> in it? I assume not?)
>> First, it seems that GCC does emit a MacroFile that has no macros inside
>> (I understand that it might not be useful, but I am not sure whether we
>> should ignore that or not).
>>
>>
>>
>> Yeah, that's weird - I'd sort of be inclined to skip it until we know
>> what it's useful for.
>>
>>
>>
>> Second, I assume that you are suggesting the parent chain style instead
>> of the current children style, right?
>>
>>
>>
>> Correct
>>
>>
>>
>> In this case, won’t it make the debug emitter code much more complicated
>> to figure out the DFS tree,
>>
>>
>>
>> I don't quite imagine it would be more complicated - we would just be
>> building the file parent chain as we go, and keeping the current macro file
>> around to be used as the parent to any macros we create.
>>
>>
>>
>> which should be emitted for the macros, not to mention the macro order,
>> which will be lost?
>>
>>
>>
>> Not necessarily, if we kept the macros in order in the list of macros
>> attached to the CU, which I imagine we would.
>>
>>
>>
>> Also, remember that the command line macros have no DIMacroFile parent.
>>
>>
>>
>> Fair - they could have the null parent, potentially.
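
To make the parent chain idea concrete, here is a minimal sketch under stated assumptions. MacroFile, Macro, chainOf and flatten are hypothetical names, not LLVM metadata classes, and the output strings are an ad hoc notation rather than real DWARF. Macros are kept in source order in one flat list, each pointing at its file (null for command line macros), and the start_file/end_file nesting is rebuilt by comparing each macro's parent chain with the files currently open. One consequence of this shape is that a file containing no macros never appears in the output.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parent-chain node: an included file points at its includer.
struct MacroFile {
  std::string Name;
  unsigned Line;           // line of the #include in the parent (0 for main)
  const MacroFile *Parent; // nullptr for the main source file
};

// Hypothetical macro record; File is nullptr for command line macros.
struct Macro {
  bool IsDefine;
  unsigned Line;
  std::string Text;
  const MacroFile *File;
};

// Returns the chain of files from the outermost file down to F.
static std::vector<const MacroFile *> chainOf(const MacroFile *F) {
  std::vector<const MacroFile *> Chain;
  for (; F; F = F->Parent)
    Chain.insert(Chain.begin(), F);
  return Chain;
}

// Walks the macros in order and emits a flat, properly nested sequence.
std::vector<std::string> flatten(const std::vector<Macro> &Macros) {
  std::vector<std::string> Out;
  std::vector<const MacroFile *> Open; // files currently "started"
  auto closeTo = [&](std::size_t Depth) {
    while (Open.size() > Depth) {
      Out.push_back("end_file");
      Open.pop_back();
    }
  };
  for (const Macro &M : Macros) {
    std::vector<const MacroFile *> Want = chainOf(M.File);
    // Keep the shared prefix of open files, close the rest, open what is new.
    std::size_t Common = 0;
    while (Common < Open.size() && Common < Want.size() &&
           Open[Common] == Want[Common])
      ++Common;
    closeTo(Common);
    for (std::size_t I = Common; I < Want.size(); ++I) {
      Out.push_back("start_file " + std::to_string(Want[I]->Line) + " " +
                    Want[I]->Name);
      Open.push_back(Want[I]);
    }
    Out.push_back((M.IsDefine ? "define " : "undef ") +
                  std::to_string(M.Line) + " " + M.Text);
  }
  closeTo(0);
  return Out;
}
```
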
>>
>>
>>
>> However, if you meant to use the parent chain in addition to the children
>> list, then what extra information will it give us?
>>
>>
>>
>> >Might be good to start with dwarfdump support - seems useful regardless
>> of anything else?
>>
>> I agree, and in fact I already have this code implemented and will upload
>> it for review soon.
>>
>>
>>
>> Cool
>>
>>
>>
>>
>>
>> Thanks,
>>
>> Amjad
>>
>>
>>
>> *From:* David Blaikie [mailto:dblaikie at gmail.com]
>> *Sent:* Tuesday, November 03, 2015 00:32
>> *To:* Aboud, Amjad
>> *Cc:* llvm-dev at lists.llvm.org
>> *Subject:* Re: [llvm-dev] RFC: Supporting macros in LLVM debug info
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Oct 28, 2015 at 7:56 AM, Aboud, Amjad via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> I would like to implement macro debug info support in LLVM.
>>
>> Below you will find 4 parts:
>>
>> 1.      Background on what it means to debug macros.
>>
>> 2.      A brief explanation of how to represent macro debug info in
>> DWARF 4.0.
>>
>> 3.      The suggested design.
>>
>> 4.      A full example: Source -> AST -> LLVM IR -> DWARF.
>>
>>
>>
>> Feel free to skip the first two parts if you think you know the background.
>>
>> Please let me know if you have any comments or feedback on this approach.
>>
>>
>>
>> Thanks,
>>
>> Amjad
>>
>>
>>
>> *[Background]*
>>
>> There are two kinds of macro definitions:
>>
>> 1. Simple macro definition, e.g.  #define M1 Value1
>>
>> 2. Function macro definition, e.g. #define M2(x, y)  (x) + (y)
>>
>> Macro scope starts with the "#define" directive and ends with the "#undef"
>> directive.
>>
>>
>>
>> GDB supports debugging macros. This means it can evaluate a macro
>> expression for any macro whose scope covers the current breakpoint
>> location.
>>
>> For example:
>>
>> GDB command: print M2(3, 5)
>>
>> GDB Result: 8
>>
>>
>>
>> GDB can evaluate the macro expression based on the ".debug_macinfo"
>> section (DWARF 4.0).
>>
>>
>>
>> *[DWARF 4.0 ".debug_macinfo" section]*
>>
>> In this section there are 4 kinds of entries:
>>
>> 1.      DW_MACINFO_define
>>
>> 2.      DW_MACINFO_undef
>>
>> 3.      DW_MACINFO_start_file
>>
>> 4.      DW_MACINFO_end_file
>>
>>
>>
>> Note: There is a 5th kind of entry for vendor specific macro information,
>> that we do not need to support.
>>
>>
>>
>> The first two entries contain the line number where the macro is
>> defined/undefined and a null-terminated string, which contains the macro
>> name (followed by the replacement value in the case of a definition, or by
>> a list of parameters and then the replacement value in the case of a
>> function macro definition).
>>
>> The third entry contains the line number where the file was included,
>> followed by the file id (an index into the file table in the debug line
>> section).
>>
>> The fourth entry contains nothing; it just closes the previous entry of
>> the third kind (start_file).
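
The entry layouts described above can be sketched as a small byte encoder. This is an illustration only: the helper names (appendULEB128, appendDefOrUndef, appendStartFile, appendEndFile) are hypothetical and this is not the LLVM emitter. It assumes the DWARF 4 encoding, in which each entry starts with a one-byte type code and the line number and file index are unsigned LEB128 values.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Entry type codes as listed for .debug_macinfo in DWARF 4.
enum : uint8_t {
  DW_MACINFO_define = 0x01,
  DW_MACINFO_undef = 0x02,
  DW_MACINFO_start_file = 0x03,
  DW_MACINFO_end_file = 0x04
};

using Bytes = std::vector<uint8_t>;

// Appends V as unsigned LEB128: 7 bits per byte, high bit set on all but
// the last byte.
void appendULEB128(Bytes &Out, uint64_t V) {
  do {
    uint8_t B = static_cast<uint8_t>(V & 0x7f);
    V >>= 7;
    if (V)
      B |= 0x80;
    Out.push_back(B);
  } while (V);
}

// define/undef: type code, line number, null-terminated string.
// For a definition, Text is the macro name (with any parameter list),
// a space, and the replacement value; for an undef it is just the name.
void appendDefOrUndef(Bytes &Out, uint8_t Kind, uint64_t Line,
                      const std::string &Text) {
  Out.push_back(Kind);
  appendULEB128(Out, Line);
  Out.insert(Out.end(), Text.begin(), Text.end());
  Out.push_back(0);
}

// start_file: type code, line of the inclusion, index into the file table.
void appendStartFile(Bytes &Out, uint64_t Line, uint64_t FileIndex) {
  Out.push_back(DW_MACINFO_start_file);
  appendULEB128(Out, Line);
  appendULEB128(Out, FileIndex);
}

// end_file: type code only, no operands.
void appendEndFile(Bytes &Out) { Out.push_back(DW_MACINFO_end_file); }
```

For instance, appendDefOrUndef(B, DW_MACINFO_define, 1, "M1 Value1") appends the type byte, a single-byte line number, the nine characters of the string, and the terminating zero.
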
>>
>>
>>
>> Macro definition and file inclusion entries must appear in the same order
>> as they appear in the source file. All macro entries between a
>> "start_file" entry and its "end_file" entry represent macros that appear
>> directly or indirectly in the included file.
>>
>>
>>
>> Special cases:
>>
>> 1.      The main source file should be the first "start_file" entry in
>> the sequence, and should have line number "0".
>>
>> 2.      Command line/Compiler definitions must also have line number "0"
>> but must appear before the first "start_file" entry.
>>
>> 3.      Command line include files, must also have line number "0" but
>> will appear straight after the "start_file" of the main source.
>>
>>
>>
>> *[Design]*
>>
>> To support macros the following components need to be modified: Clang,
>> LLVM IR, Dwarf Debug emitter.
>>
>>
>>
>> In clang, we need to handle these source directives:
>>
>> 1.      #define
>>
>> 2.      #undef
>>
>> 3.      #include
>>
>> The idea is to make use of the "PPCallbacks" class, which allows the
>> preprocessor to notify the parser each time one of the above directives
>> occurs.
>>
>> These are the callbacks that should be implemented:
>>
>> "MacroDefined", "MacroUndefined", "FileChanged", and "InclusionDirective".
>>
>>
>>
>> AST will be extended to support two new DECL types: "MacroDecl" and
>> "FileIncludeDecl".
>>
>>
>>
>> Do we really need to touch the AST? Or would it be reasonable to wire up
>> the CGDebugInfo directly to the PPCallbacks, if it isn't already? (perhaps
>> it is already wired up for other reasons?)
>>
>>
>>
>> A "FileIncludeDecl" AST node might contain other
>> "FileIncludeDecl"/"MacroDecl" AST nodes.
>>
>> These two new AST DECLs are not part of TranslationUnitDecl and are
>> handled separately (see AST example below).
>>
>>
>>
>> In the LLVM IR, metadata debug info will be extended to support new DIs
>> as well:
>>
>> "DIMacro", "DIFileInclude", and "MacroNode".
>>
>> The last is needed because we cannot use DINode as a base class of the
>> "DIMacro" and "DIFileInclude" nodes.
>>
>>
>>
>> DIMacro will contain:
>>
>> ·        type (definition/undefinition).
>>
>> ·        line number (integer).
>>
>> ·        name (null terminated string).
>>
>> ·        replacement value  (null terminated string - optional).
>>
>>
>>
>> DIFileInclude will contain:
>>
>> ·        line number (integer).
>>
>> ·        file (DIFile).
>>
>> ·        macro list (MacroNodeArray) - optional.
>>
>>
>>
>> I wonder if it'd be better to use a parent chain style approach (DIMacro
>> has a DIMacroFile it refers to, each DIMacroFile has another one that it
>> refers to, up to null)?
>> (does it ever make sense/need to have a DIMacroFile without any macros in
>> it? I assume not?)
>>
>>
>> Might be good to start with dwarfdump support - seems useful regardless
>> of anything else?
>>
>>
>>
>>
>>
>> In addition, the DICompileUnit will contain a new optional macro list
>> field of type MacroNodeArray.
>>
>>
>>
>> Finally, I assume that macro support should be disabled by default, and
>> there should be a flag to enable this feature. I would say that we should
>> introduce a new specific flag, e.g. "-gmacro", that could be used with
>> "-g".
>>
>>
>>
>> *[Example]*
>>
>> Here is an example that demonstrates the macro support from
>> Source->AST->LLVM IR->DWARF.
>>
>>
>>
>> Source
>>
>> =========================================================
>>
>> mainfile.c:
>>
>>
>> --------------------------------------------------------------------------------------
>>
>> 1. #define M1 Value1
>>
>> 2. #include "myfile.h"
>>
>> 3. #define M2( x , y)   ( (x)    + (y)  * Value2)
>>
>>
>> --------------------------------------------------------------------------------------
>>
>>
>>
>> myfile.h:
>>
>>
>> --------------------------------------------------------------------------------------
>>
>> 1.
>>
>> 2.
>>
>> 3.
>>
>> 4. #undef M1
>>
>> 5. #define M1 NewValue1
>>
>>
>> --------------------------------------------------------------------------------------
>>
>>
>>
>> myfile2.h:
>>
>>
>> --------------------------------------------------------------------------------------
>>
>> 1. #define M4 Value4
>>
>>
>> --------------------------------------------------------------------------------------
>>
>> =========================================================
>>
>>
>>
>> Command line:
>>
>> clang -c -g -gmacro -O0 -DM3=Value3 -include myfile2.h mainfile.c
>>
>>
>>
>>
>>
>> AST
>>
>> =========================================================
>>
>> MacroDecl 0xd6c5c0 <<invalid sloc>> <invalid sloc> __llvm__ defined
>>
>> MacroDecl 0xd6c618 <<invalid sloc>> <invalid sloc> __clang__ defined
>>
>>
>>
>> … <More compiler macros> …
>>
>>
>>
>> MacroDecl 0x11c01b0 <<invalid sloc>> <invalid sloc> M3 defined
>>
>> FileIncludeDecl 0x11c0208 <mainfile.c:1:1> col:1
>>
>> |-FileIncludeDecl 0x11c0238 <myfile2.h:1:1> col:1
>>
>> | `-MacroDecl 0x11c0268 <<invalid sloc>> <invalid sloc> M4 defined
>>
>> |-MacroDecl 0x11c02c0 <mainfile.c:1:9> col:9 M1 defined
>>
>> |-FileIncludeDecl 0x11c0318 <myfile.h:1:1> col:1
>>
>> | |-MacroDecl 0x11c0348 <line:4:8> col:8 M1 undefined
>>
>> | `-MacroDecl 0x11c03a0 <line:5:9> col:9 M1 defined
>>
>> `-MacroDecl 0x11c03f8 <mainfile.c:3:9> col:9 M2 defined
>>
>> TranslationUnitDecl 0xd6c078 <<invalid sloc>> <invalid sloc>
>>
>> |-TypedefDecl 0xd6c330 <<invalid sloc>> <invalid sloc> implicit
>> __int128_t '__int128'
>>
>> |-TypedefDecl 0xd6c370 <<invalid sloc>> <invalid sloc> implicit
>> __uint128_t 'unsigned __int128'
>>
>> |-TypedefDecl 0xd6c3c8 <<invalid sloc>> <invalid sloc> implicit
>> __builtin_ms_va_list 'char *'
>>
>> `-TypedefDecl 0xd6c590 <<invalid sloc>> <invalid sloc> implicit
>> __builtin_va_list 'struct __va_list_tag [1]'
>>
>> =========================================================
>>
>>
>>
>>
>>
>> LLVM IR
>>
>> =========================================================
>>
>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>
>> target triple = "x86_64-pc-linux"
>>
>>
>>
>> !llvm.dbg.cu = !{!0}
>>
>> !llvm.module.flags = !{!327}
>>
>> !llvm.ident = !{!328}
>>
>>
>>
>> !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer:
>> "clang version 3.8.0 (trunk 251321)", isOptimized: false, runtimeVersion:
>> 0, emissionKind: 1, enums: !2, macros: !3)
>>
>> !1 = !DIFile(filename: "mainfile.c", directory: "/")
>>
>> !2 = !{}
>>
>> !3 = !{!4, !5, … <More compiler macros> … , !312, !313}
>>
>> !4 = !DIMacro(macro type: DW_MACINFO_define, name: "__llvm__", value:
>> !"1")
>>
>> !5 = !DIMacro(macro type: DW_MACINFO_define, name: "__clang__", value:
>> !"1")
>>
>>
>>
>> … <More compiler macros> …
>>
>>
>>
>> !312 = !DIMacro(macro type: DW_MACINFO_define, name: "M3", value:
>> !"Value3")
>>
>> !313 = !DIFileInclude(file: !314, nodes: !315)
>>
>> !314 = !DIFile(filename: "mainfile.c", directory: "/")
>>
>> !315 = !{!316, !320, !321, !326}
>>
>> !316 = !DIFileInclude(file: !317, nodes: !318)
>>
>> !317 = !DIFile(filename: "myfile2.h", directory: "/")
>>
>> !318 = !{!319}
>>
>> !319 = !DIMacro(macro type: DW_MACINFO_define, name: "M4", value:
>> !"Value4")
>>
>> !320 = !DIMacro(macro type: DW_MACINFO_define, name: "M1", line: 1,
>> value: !"Value1")
>>
>> !321 = !DIFileInclude(line: 2, file: !322, nodes: !323)
>>
>> !322 = !DIFile(filename: "myfile.h", directory: "/")
>>
>> !323 = !{!324, !325}
>>
>> !324 = !DIMacro(macro type: DW_MACINFO_undef, name: "M1", line: 4)
>>
>> !325 = !DIMacro(macro type: DW_MACINFO_define, name: "M1", line: 5,
>> value: !"NewValue1")
>>
>> !326 = !DIMacro(macro type: DW_MACINFO_define, name: "M2(x,y)", line: 3,
>> value: !"( (x) + (y) * Value2)")
>>
>> !327 = !{i32 2, !"Debug Info Version", i32 3}
>>
>> !328 = !{!"clang version 3.8.0 (trunk 251321)"}
>>
>> =========================================================
>>
>>
>>
>>
>>
>> DWARF
>>
>> =========================================================
>>
>> Command line: llvm-dwarfdump.exe -debug-dump=macro mainfile.o
>>
>>
>> --------------------------------------------------------------------------------------
>>
>> mainfile.o:  file format ELF64-x86-64
>>
>>
>>
>> .debug_macinfo contents:
>>
>> DW_MACINFO_define - lineno: 0 macro: __llvm__ 1
>>
>> DW_MACINFO_define - lineno: 0 macro: __clang__ 1
>>
>>
>>
>> … <More compiler macros> …
>>
>>
>>
>> DW_MACINFO_define - lineno: 0 macro: M3 Value3
>>
>> DW_MACINFO_start_file - lineno: 0 filenum: 1
>>
>>   DW_MACINFO_start_file - lineno: 0 filenum: 2
>>
>>     DW_MACINFO_define - lineno: 0 macro: M4 Value4
>>
>>   DW_MACINFO_end_file
>>
>>   DW_MACINFO_define - lineno: 1 macro: M1 Value1
>>
>>   DW_MACINFO_start_file - lineno: 2 filenum: 3
>>
>>     DW_MACINFO_undef - lineno: 4 macro: M1
>>
>>     DW_MACINFO_define - lineno: 5 macro: M1 NewValue1
>>
>>   DW_MACINFO_end_file
>>
>>   DW_MACINFO_define - lineno: 3 macro: M2(x,y) ( (x) + (y) * Value2)
>>
>> DW_MACINFO_end_file
>>
>>
>>
>>
>> --------------------------------------------------------------------------------------
>>
>> Command line: llvm-dwarfdump.exe -debug-dump=line mainfile.o
>>
>>
>> --------------------------------------------------------------------------------------
>>
>> .debug_line contents:
>>
>>
>>
>> … <Other line table Info> …
>>
>>
>>
>>                 Dir  Mod Time   File Len   File Name
>>
>>                 ---- ---------- ---------- ---------------------------
>>
>> file_names[  1]    1 0x00000000 0x00000000 mainfile.c
>>
>> file_names[  2]    1 0x00000000 0x00000000 myfile2.h
>>
>> file_names[  3]    1 0x00000000 0x00000000 myfile.h
>>
>> =========================================================
>>
>>
>>
>> ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>>
>> This e-mail and any attachments may contain confidential material for
>> the sole use of the intended recipient(s). Any review or distribution
>> by others is strictly prohibited. If you are not the intended
>> recipient, please contact the sender and delete all copies.
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>>
>>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>