[PATCH] Have clang list the imported modules in the debug info

David Blaikie dblaikie at gmail.com
Tue Mar 17 15:39:27 PDT 2015


On Mar 17, 2015 3:28 PM, "Adrian Prantl" <aprantl at apple.com> wrote:
>
>
>> On Mar 16, 2015, at 6:47 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Mon, Mar 16, 2015 at 5:14 PM, Adrian Prantl <aprantl at apple.com> wrote:
>>>
>>>
>>> Thanks for the explanation David, I missed that it is entirely the
linker's (or some dwarf post-processor's) responsibility to find the module
files and link in the debug info from the .pcm files, so the debugger doesn’t
notice a difference.
>>
>>
>> I think there's still some confusion here. Sorry if I'm rehashing
something, but I'll try to explain how this all works.
>
>
> thanks!
>
>> Normal split DWARF:
>>
>> Compiler generates two files: .o and .dwo.
>> .dwo has static, non-relocatable debug info.
>> .o has a skeleton compile_unit that has the name of the .dwo file and a
hash to verify that the .dwo file isn't stale when the debugger reads it.
>> The .o files are all linked together, the .dwo files stay where they are.
>> The debugger reads the linked executable, finds the skeleton
compile_units contained therein, and finds/loads the .dwo files
>
> That makes total sense.
>
> Now, to eliminate the last remaining misconception: Does LLVM actually
emit the separate .dwo file currently?

Clang does emit the separate .dwo file.

> From looking at testcases like DebugInfo/X86/fission-cu.ll it appears as
if the relocatable and the non-relocatable output both end up in the .o
file. This is where I got the impression that there was another tool
involved that extracted the non-relocatable content from the .o into a
.dwo file, but maybe that’s just something we do for testing?

LLVM just puts everything in one file, then the clang driver runs a tool to
split them (objcopy, I believe, has a mode for doing this splitting). This
is just an implementation detail; we would do it directly in LLVM, but
teaching LLVM to output two object files simultaneously is hard.
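
As a rough sketch of that splitting step, and assuming the tool is GNU
objcopy with its --extract-dwo and --strip-dwo options (that is my
assumption about what the driver invokes; the helper name and file names
below are hypothetical), the two commands could be built like this:

```python
from pathlib import Path


def split_dwarf_commands(obj_path, objcopy="objcopy"):
    """Return the two argv lists that split a combined object file.

    Assumes GNU objcopy's --extract-dwo and --strip-dwo options; the
    helper name and the default tool name are illustrative only.
    """
    obj = Path(obj_path)
    dwo = obj.with_suffix(".dwo")
    # Step 1: copy the non-relocatable .dwo sections out into foo.dwo.
    extract = [objcopy, "--extract-dwo", str(obj), str(dwo)]
    # Step 2: remove those sections from foo.o, leaving the skeleton CU.
    strip = [objcopy, "--strip-dwo", str(obj)]
    return [extract, strip]


if __name__ == "__main__":
    for argv in split_dwarf_commands("foo.o"):
        print(" ".join(argv))
```

The sketch only builds the command lines rather than running them, so it
can be read without binutils installed. The order matters: the .dwo
sections have to be extracted before they are stripped from the .o file.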

- David

>
> -- adrian
>
>>
>> The scenario I have in mind for module debug info is this:
>> Module is compiled as an object file with debug info (this file is
actually a .dwo file, even if it has some other extension - it has the
non-relocatable debug info in it)
>> .o file has a comdat'd skeleton compile_unit describing the .dwo/module
file
>> <from here on no extra work is required, the linker and debugger just
act as normal>
>> The .o files are linked together, the skeleton compile_units get
deduplicated by the linker (comdat sections)
>> The debugger reads the linked executable, finds the skeleton
compile_units contained therein, and finds/loads the module files just as
.dwo files.
>>
>> There's no need for a debug-aware linker or any DWARF post-processing so
far as I understand it. No module-linking is required. Debugger reads the
modules directly, just as if they were .dwo files - they're just object
files in the filesystem like any other (that they have a different
extension isn't too important).
>>
>> Does this make sense?
>>
>>
>>>
>>>
>>>> On Mar 16, 2015, at 2:55 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On Mon, Mar 16, 2015 at 2:45 PM, Robinson, Paul <
Paul_Robinson at playstation.sony.com> wrote:
>>>>>
>>>>> Beyond the above (that using a new tag would mean this would go from
'free' to 'not free' for GDB) having a new top level tag is pretty
substantial (we only have two at the moment, and with our talk of modules
being a "bag of dwarf" might go back to having one top level tag? (it's not
clear to me from DWARF4 whether DW_TAG_module is currently a top-level tag,
I don't think it is?)
>>>>>
>>>>> The .debug_info section contains one or more compilation units,
partial units, or in DWARF 5, type units.  DW_TAG_module isn't a unit, if
you want it to be handled independently then it would need to be wrapped in
a DW_TAG_partial_unit.  You would probably then use DW_TAG_imported_unit to
refer to it, rather than DW_TAG_imported_module.
>>>>
>>>>
>>>> This makes a fair bit of sense - though the terminology's never going
to quite line up with modules, I suspect, and this would still require
modifying existing consumers (well, GDB) that can handle split-dwarf today,
I suspect (not sure how it'd handle partial_unit - maybe that does work? -
and still don't know how existing consumers would handle imported_unit
either - could be worth some testing, as it sounds sort of right out of
several less right options).
>>>
>>>
>>> The standard specifically recommends DW_TAG_partial_unit for #include
directives so that sounds like a comparatively good match. Partial units
were already introduced in DWARF3 so maybe GDB supports them. But even if
it doesn’t this shouldn’t necessarily be a problem (unless it crashes). The
DW_TAG_imported_unit since this is primarily useful for AST-based debuggers
that know how to import a module before expression evaluation.
>>>
>>> -- adrian
>>>
>>>> - David
>>>>>
>>>>> (Sorry about the top-quoting but Outlook can't handle HTML editing
properly.)
>>>
>>>
>>> Unfortunately the gmail client somewhat forces a thread to HTML — gmail
quotation markers mysteriously disappear in the plain text version
displayed by other mail clients.
>>>
>>>>> --paulr
>>>>>
>>>>>
>>>>>
>>>>> From: David Blaikie [mailto:dblaikie at gmail.com]
>>>>> Sent: Monday, March 16, 2015 1:36 PM
>>>>> To: Adrian Prantl
>>>>> Cc: Richard Smith; Eric Christopher; llvm cfe; Greg Clayton;
Robinson, Paul
>>>>> Subject: Re: [PATCH] Have clang list the imported modules in the
debug info
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Mar 16, 2015 at 1:24 PM, Adrian Prantl <aprantl at apple.com>
wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> On Mar 10, 2015, at 12:10 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 10, 2015 at 12:05 PM, Adrian Prantl <aprantl at apple.com>
wrote:
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Mar 9, 2015, at 5:16 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Mar 9, 2015 at 5:07 PM, Adrian Prantl <aprantl at apple.com>
wrote:
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Mar 9, 2015, at 2:14 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Mar 9, 2015 at 1:52 PM, Adrian Prantl <aprantl at apple.com>
wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Feb 24, 2015, at 3:06 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>>>>>>>>>
>>>>>>>>> On Tue, Feb 24, 2015 at 2:56 PM, Adrian Prantl <aprantl at apple.com>
wrote:
>>>>>>>>>>
>>>>>>>>>> On Feb 24, 2015, at 2:36 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 23, 2015 at 3:45 PM, Adrian Prantl <
aprantl at apple.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Feb 23, 2015, at 3:37 PM, David Blaikie <dblaikie at gmail.com>
wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Feb 23, 2015 at 3:32 PM, Adrian Prantl <
aprantl at apple.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Feb 23, 2015, at 3:14 PM, David Blaikie <
dblaikie at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Feb 23, 2015 at 3:08 PM, Adrian Prantl <
aprantl at apple.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Feb 23, 2015, at 2:59 PM, David Blaikie <
dblaikie at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Feb 23, 2015 at 2:51 PM, Adrian Prantl <
aprantl at apple.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> > On Jan 20, 2015, at 11:07 AM, David Blaikie <
dblaikie at gmail.com> wrote:
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > My vague recollection from the previous design
discussions was that these module references would be their own 'unit'
COMDAT'd so that we don't end up with the duplication of every module
reference in every unit linked together when linking debug info?
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > I think in my brain I'd been picturing this module
reference as being an extended fission reference (fission skeleton CU +
extra fields for users who want to load the Clang AST module directly and
skip the split CU).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Apologies for letting this rest for so long.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Your memory was of course correct and I didn’t follow up
on this because I had convinced myself that the fission reference would be
completely sufficient. Now that I’ve been thinking some more about it, I
don’t think that it is sufficient in the LTO case.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Here is the example from the
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> foo.o:
>>>>>>>>>>>>>>>>>> .debug_info.dwo
>>>>>>>>>>>>>>>>>>   DW_TAG_compile_unit
>>>>>>>>>>>>>>>>>>      // For DWARF consumers
>>>>>>>>>>>>>>>>>>      DW_AT_dwo_name
("/path/to/module-cache/MyModule.pcm")
>>>>>>>>>>>>>>>>>>      DW_AT_dwo_id   ([unique AST signature])
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> .debug_info
>>>>>>>>>>>>>>>>>>   DW_TAG_compile_unit
>>>>>>>>>>>>>>>>>>     DW_TAG_variable
>>>>>>>>>>>>>>>>>>       DW_AT_name "x"
>>>>>>>>>>>>>>>>>>       DW_AT_type (DW_FORM_ref_sig8) ([hash for MyStruct])
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In this example it is clear that foo.o imported MyModule
because its DWO skeleton is there in the same object file. But if we deal
with the result of an LTO compilation we will end up with many compile
units in the same .debug_info section, plus a bunch of skeleton compile
units for _all_ imported modules in the entire project. We thus lose the
ability to determine which of the compile units imported which module.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Why would we need to know which CU imported which
modules? (I can imagine some possible reasons, but wondering what you have
in mind)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When the debugger is stopped at a breakpoint and the user
wants to evaluate an expression, it should import the modules that are
available at this location, so the user can write the expression from
within the context of the breakpoint (e.g., without having to fully qualify
each type, etc).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm not sure how much current debuggers actually worry
about that - (& this may differ from lldb to gdb to other things, of
course). I'm pretty sure at least for GDB, a context in one CU is as good
as one in another (at least without split-dwarf, type units, etc - with
those sometimes things end up overly restrictive as the debugger won't
search everything properly).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> eg: if you have a.cpp: int main() { }, b.cpp: void func() {
} and you run 'start' in gdb (which breaks at the beginning of main) you
can still run 'p func()' to call the func, even though there's no
declaration of it in a.cpp, etc.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> LLDB would definitely care (as it is using clang for the
expression evaluation supporting these kinds of features is really
straightforward there). By importing the modules (rather than searching
through the DWARF), the expression evaluator gains access to additional
declarations that are not there in the DWARF, such as templates. But since
clang modules are not namespaces, we can’t generally “import the world” as
a debugger would usually do.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry, not sure I understand this last sentence - could you
explain further?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I imagine it would be rather limiting for the user if they
could only use expressions that are valid in this file from the file - it
wouldn't be uncommon to want to call a function from another
module/file/etc to aid in debugging.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Usually LLDB’s expression evaluator works by creating a clang
AST type out of a DWARF type and inserting it into its AST context. We
could pre-polulate it with the definitions from the imported modules (with
all sorts of benefits as described above), but that only works if no two
modules conflict. If the declaration can’t be found in any imported module,
LLDB would still import it from DWARF in the “traditional” fashion.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> But it would import it from DWARF in other TUs rather than use
the module info just because the module wasn't directly referenced from
this TU? That would seem strange to me. (you would lose debug info fidelity
(by falling back to DWARF even though there are modules with the full
fidelity info) unnecessarily, it sounds like)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I think it’s reasonable to expect full fidelity for everything
that is available in the current TU, and having the normal DWARF-based
debugging capabilities for everything beyond that. But we can only ever
provide full fidelity if we have the list of imports for the current TU.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Would it be reasonable to use the accelerator table/index to
lookup the types, then if the type is in the module you could use the
module rather than the DWARF stashed alongside it? (so the comdat'd
split-dwarf skeleton CU for the module would have an index to tell you what
names are inside it, but if you got an index hit you'd just look at the
module instead of loading the split-dwarf debug info in the referenced file)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I don’t think this approach would work for templates and
enumerator values;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Not sure why enumerator values are an issue - but templates (&
all manner of other things that don't make it into the index,
unfortunately), sure.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> they aren’t in the accelerator tables to begin with. It would
also be slower if the declaration is available in a module.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Though you're rapidly going to end up loading a lot of modules in
(as you go up & down a stack printing various things you'll cross into
other TUs & load more modules).
>>>>>>>>>
>>>>>>>>> For a standard DWARF consumer, it seems fine to just have a
comdat'd skeleton CU for a module without the need for other CUs to mention
which module CUs they reference (but I could be wrong here) & that's the
design we originally discussed.
>>>>>>>>>
>>>>>>>>> It would seem unfortunate to bloat every CU with a
non-deduplicable list of every module it references, but if that's
necessary for a serialized AST aware debugger, it might be fine to have it
as an option (so long as it can be turned off) & may still benefit from
that list not being the authoritative module reference, but a /very/ terse
reference to it so all the extra flags & stuff can be in the deduplicable
comdat (& to keep it as consistent as possible between the flag (on/off)
codepaths for this extra data). Maybe a FORM_block (?) of fixed-size hashes
of all the modules back-to-back, so it's as small as possible?
>>>>>>>>>
>>>>>>>>> But I wouldn't mind spending some more time discussing whether
there's a better way to keep these things streamlined/symmetric/the same
between modular and non-modular debug info.
>>>>>>>>
>>>>>>>> Sure!
>>>>>>>>
>>>>>>>> Now that we established that recording the list of imported
modules for every CU is useful for an AST-based debugger,
>>>>>>>>
>>>>>>>>
>>>>>>>> +Richard, just to see if he's got some ideas about how a debugger
might efficiently use modules to support debugger scenarios and whether or
not having a list of which modules are referenced from which contexts is
valuable in that.
>>>>>>>>
>>>>>>>> It still concerns me that this would create something of a
regression/oddity/difference between AST-based debug info (you wouldn't be
able to handle expressions referencing things in other TUs) and non-AST
based debug info (where I think the average user is used to not worrying
about what headers are included in the current file they're debugging when
they try to use a type or other identifier)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> If I understood you correctly, this is not actually the case. The
list of imported modules allows the AST-based debugger to import all the
modules that were imported by the CU that the current frame is in. This
enables the user to, e.g., type "p myVector->size()" even though
std::vector<MyClass>::size() was not used by the CU and is thus not
available in DWARF.
>>>>>>>>
>>>>>>>> If the user types “p foo” even though foo was not defined in any
imported module the debugger can — after failing to import foo via clang —
still fall back to looking up foo in DWARF and do what it always did.
>>>>>>>
>>>>>>>
>>>>>>> If you do the DWARF fallback then you'll get a pretty clear
inconsistency between templates and non-templates. If I have a function foo
and a function template foo_tmpl in one file, and I'm debugging in another
file I'll be able to call 'foo' (normal DWARF fallback/search) but not
foo_tmpl (if I'm calling a new instantiation of foo_tmpl - if I'm calling
an existing instantiation presumably the fallback would catch me). Seems
unfortunate/confusing, perhaps.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Good point, but my guess is that this wouldn’t be any worse than
the “why can’t I print the size() of this vector!?”-situations we have at
the moment.
>>>>>>
>>>>>>
>>>>>> Sure - it's strictly better in the sense that there are strictly
more expressions that can be evaluated, but seems incomplete is my point,
and maybe worth considering alternative designs that might be more-betterer.
>>>>>>
>>>>>>>
>>>>>>> In certain situations (i.e., non-templates) the debugger could use
the DWARF in the modules to print a message about which module to import.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> let’s talk about how to most efficiently represent this
information.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In the CU, using DW_TAG_imported_module appears to be the most
appropriate choice, even though there is some room for confusion since C++
using declarations are also represented this way. Inside the
DW_TAG_imported_module, we could use
>>>>>>>>>
>>>>>>>>> (1) a DW_AT_import that references the skeleton (I hope that is
the right terminology) CU for the module, the idea being that the skeleton
CU would contain all the details (flags, name, include dirs, hash, ...) and
be in a comdat'ed section.
>>>>>>>>
>>>>>>>>
>>>>>>>> I'd be concerned about overloading the terminology & confusing
other debuggers - they might try to follow the DW_AT_import and be
surprised that it doesn't refer to a DW_TAG_namespace tag.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> That’s a valid concern, and we probably should not be emitting
this if we have any evidence of, e.g., gdb crashes when encountering such a
construct. Then again, we would be using a DW_TAG_imported_module to
express what it is meant to express according to the DWARF spec (namely
importing a module)... but I admit that the tag also does have a very
specific meaning for C++, which we maybe shouldn’t overload.
>>>>>>>
>>>>>>>
>>>>>>> That's my concern, yes.
>>>>>>>
>>>>>>>>
>>>>>>>> The right thing here is probably to put aside my personal sense of
aesthetics and use a private _LLVM_ namespace for all new additions, and
then attempt to standardize an official DWARF version once we know what is
really needed and what isn't.
>>>>>>>
>>>>>>>
>>>>>>> I'd prefer this, yes. I mean the usual bar we use for language
features is that they're at least proposed for standardization before we
adopt them in clang - I wouldn't mind a similar bar here. If you want to
bring up this use of DW_TAG_imported_module with the DWARF committee & see
if it sounds reasonable (& test/inquire about GDB's behavior here).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I started a thread on dwarf-discuss to this end (
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org, the list
archives are only visible to subscribers, but anyone can subscribe).
>>>>>>
>>>>>>
>>>>>> Cool cool
>>>>>
>>>>>
>>>>>
>>>>> To paraphrase the replies that my question solicited: We are, perhaps
not very surprisingly, encouraged to follow the standard and use a
DW_TAG_imported_module that references a DW_TAG_module. If we, however,
choose to describe the module by using a skeleton DW_TAG_compile_unit, we
should be careful (my own words) about using a DW_TAG_imported_module until
that use is sanctioned by the standard.
>>>>>
>>>>>
>>>>>
>>>>> I see two possible ways to proceed in this spirit:
>>>>>
>>>>> a) Rename the module skeleton DW_TAG_compile_units to DW_TAG_module,
but keep all the comdat/split dwarf goodness from the original proposal
[1]. My understanding is that even though we are making clever use of the
split DWARF features, GDB would still need to be taught to follow
references to external files,
>>>>>
>>>>>
>>>>> Not sure what you're referring to here, perhaps a misunderstanding
about how split DWARF works.
>>>>>
>>>>> To the best of my knowledge, what we've talked about for module DWARF
debug info is actually just split-dwarf, no extra work required by DWARF
consumers*.
>>>>>
>>>>> * It's, admittedly, a little tricksy to include type unit references
in an object file that doesn't include the type unit at all - relying on it
being linked into the final executable. But DWARF doesn't really talk about
objects versus executables, etc - so, so long as the type unit is there in
the end, it's valid DWARF no matter how it got there (& should work fine
for existing consumers - they can't tell if the type unit was in every
object file that referenced the type or not once it's been linked and
deduplicated).
>>>>>
>>>>>>
>>>>>> so having it recognize a new tag in this context doesn’t appear to
be much additional effort (but others may provide more insight here).
>>>>>
>>>>>
>>>>> Beyond the above (that using a new tag would mean this would go from
'free' to 'not free' for GDB) having a new top level tag is pretty
substantial (we only have two at the moment, and with our talk of modules
being a "bag of dwarf" might go back to having one top level tag? (it's not
clear to me from DWARF4 whether DW_TAG_module is currently a top-level tag,
I don't think it is?)
>>>>>
>>>>>>
>>>>>> b) Emit an LLVM-specific DW_AT_LLVM_import attribute inside the
DW_TAG_imported_module (or vice versa) that refers to the skeleton
DW_TAG_compile_unit.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I think that option (a) is a bit more elegant and it is bending the
dwarf standard not quite as much and will make the dwarf output a bit more
readable.
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- adrian
>>>>>>
>>>>>>
>>>>>>
>>>>>> [1] Module debugging proposal for reference:
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> - David
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- adrian
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> But extension tags seems like the conservatively correct option
(not sure what GDB does on tags it doesn't recognize - I forget if it warns
or just completely ignores them, hopefully the latter)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> (2) David’s suggestion of using a custom form that records the
module hash directly is quite space-efficient, but it has the drawback of
not being resilient against small changes to the imported module
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That's going to be true of the normal fission info here (the
skeleton CU and the full CU in the .dwo file (or module) are associated by
hash) - granted, in the "loading an AST" mode, you can ignore those hashes
and rely on your custom attributes instead.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> , since clang’s module hash changes each time the module is
being rebuilt.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Clang's module hash only changes if the DWARF contents change -
it doesn't use a timestamp or anything. It seems like actually you're going
to want to fail to load even more aggressively - there are ways the AST
might've changed that the debug info doesn't reflect but are still
important (a type unreferenced in this module, but built into some other
code that is not built with debug info changes - no hash changes because
the debug info for that type is unreferenced here, but if you try to use it
you could have an incompatible layout, etc).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Agreed: If the module contents changed the debugger needs to
display a big flashing "here be dragons" warning.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is less of an issue if the hash is referring to a skeleton
CU in the same file, which contains all the detailed information.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Personally I’d prefer option 1 because it mostly uses the existing
mechanisms from DWARF. Here’s a visual guide to the options on the table:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> (1)
>>>>>>>>>
>>>>>>>>> foo.o (compiled with, let’s call it .. “-gmodule-imports”)
>>>>>>>>>
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>> .debug_info:
>>>>>>>>>
>>>>>>>>>   DW_TAG_compile_unit
>>>>>>>>>
>>>>>>>>>     DW_AT_name(“foo.c”)
>>>>>>>>>
>>>>>>>>>     DW_TAG_imported_module
>>>>>>>>>
>>>>>>>>>       DW_AT_import(DW_FORM_ref_addr 0x123)  // Could be a
FORM_ref_sig8 0x1234ABCDE as well.
>>>>>>>>>
>>>>>>>>>     DW_TAG_imported_module
>>>>>>>>>
>>>>>>>>>       DW_AT_import(...)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> .debug_info.dwo:
>>>>>>>>>
>>>>>>>>> // Skeleton CUs for modules imported by foo.o.
>>>>>>>>>
>>>>>>>>> 0x123:
>>>>>>>>>
>>>>>>>>>   DW_TAG_compile_unit
>>>>>>>>>
>>>>>>>>>     // Used by split-dwarf debuggers to find external type
definitions.
>>>>>>>>>
>>>>>>>>>
DW_AT_dwo_name(“/tmp/org.llvm.clang/ModuleCache/1234ABCDE/Foundation.pcm”)
>>>>>>>>>
>>>>>>>>>     DW_AT_dwo_id(“0x1234ABCDE”)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>     // Used by AST-based debuggers to import the module.
>>>>>>>>>
>>>>>>>>>     DW_AT_name(“Foundation”)
>>>>>>>>
>>>>>>>>
>>>>>>>> (side notes: the mixed indentation here makes it a bit hard to
read this example, and I'd make sure /all/ the extended attributes
(including the name here) use custom attribute names, not standard ones)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Agreed.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>     DW_AT_LLVM_sysroot(“/”)
>>>>>>>>>
>>>>>>>>>     DW_AT_LLVM_include_dir(“”)
>>>>>>>>>
>>>>>>>>>     DW_AT_LLVM_macros(“-DNDEBUG”)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> (2)
>>>>>>>>>
>>>>>>>>> .debug_info.dwo:
>>>>>>>>>
>>>>>>>>> (As above.)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> .debug_info:
>>>>>>>>>
>>>>>>>>>   DW_TAG_compile_unit
>>>>>>>>>
>>>>>>>>>     DW_AT_name(“foo.c”)
>>>>>>>>>
>>>>>>>>>     DW_AT_LLVM_imported_modules(DW_FORM_block 0x1234ABCDE
0xDEADBEEF 0x....)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Now I’m curious what option (3) will look like; the one that
we’ll actually implement!
>>>>>>>>
>>>>>>>>
>>>>>>>> ;)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- adrian
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>