<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Feb 24, 2015 at 2:56 PM, Adrian Prantl <span dir="ltr"><<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><div><div class="h5"><blockquote type="cite"><div>On Feb 24, 2015, at 2:36 PM, David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>> wrote:</div><br><div><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Feb 23, 2015 at 3:45 PM, Adrian Prantl <span dir="ltr"><<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><div><div><blockquote type="cite"><div>On Feb 23, 2015, at 3:37 PM, David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>> wrote:</div><br><div><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Feb 23, 2015 at 3:32 PM, Adrian Prantl <span dir="ltr"><<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><div><div><blockquote type="cite"><div>On Feb 23, 2015, at 3:14 PM, David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>> wrote:</div><br><div><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Feb 23, 2015 at 3:08 PM, Adrian Prantl <span dir="ltr"><<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc 
solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><span><blockquote type="cite"><div>On Feb 23, 2015, at 2:59 PM, David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>> wrote:</div><br><div><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Feb 23, 2015 at 2:51 PM, Adrian Prantl <span dir="ltr"><<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span><br>

> On Jan 20, 2015, at 11:07 AM, David Blaikie <<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>> wrote:<br>

><br>

> My vague recollection from the previous design discussions was that these module references would be their own 'unit' COMDAT'd so that we don't end up with the duplication of every module reference in every unit linked together when linking debug info?<br>

><br>

> I think in my brain I'd been picturing this module reference as being an extended fission reference (fission skeleton CU + extra fields for users who want to load the Clang AST module directly and skip the split CU).<br>

<br>

</span>Apologies for letting this rest for so long.<br>

<br>

Your memory was of course correct and I didn’t follow up on this because I had convinced myself that the fission reference would be completely sufficient. Now that I’ve been thinking some more about it, I don’t think that it is sufficient in the LTO case.<br>

<br>

Here is the example from <a href="http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html" target="_blank">http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-November/040076.html</a>:<br>

<br>

foo.o:<br>

.debug_info.dwo<br>

  DW_TAG_compile_unit<br>

     // For DWARF consumers<br>

     DW_AT_dwo_name ("/path/to/module-cache/MyModule.pcm")<br>

     DW_AT_dwo_id   ([unique AST signature])<br>

<br>

.debug_info<br>

  DW_TAG_compile_unit<br>

    DW_TAG_variable<br>

      DW_AT_name "x"<br>

      DW_AT_type (DW_FORM_ref_sig8) ([hash for MyStruct])<br>

<br>

In this example it is clear that foo.o imported MyModule because its DWO skeleton is there in the same object file. But if we deal with the result of an LTO compilation we will end up with many compile units in the same .debug_info section, plus a bunch of skeleton compile units for _all_ imported modules in the entire project. We thus lose the ability to determine which of the compile units imported which module.<br></blockquote><div><br>Why would we need to know which CU imported which modules? (I can imagine some possible reasons, but wondering what you have in mind)<br></div></div></div></div></div></blockquote><div><br></div></span><div>When the debugger is stopped at a breakpoint and the user wants to evaluate an expression, it should import the modules that are available at this location, so the user can write the expression from within the context of the breakpoint (e.g., without having to fully qualify each type).</div></div></div></blockquote><div><br>I'm not sure how much current debuggers actually worry about that (& this may differ from lldb to gdb to other things, of course). I'm pretty sure at least for GDB, a context in one CU is as good as one in another (at least without split-dwarf, type units, etc - with those sometimes things end up overly restrictive as the debugger won't search everything properly).<br><br>e.g.: if you have a.cpp: int main() { }, b.cpp: void func() { } and you run 'start' in gdb (which breaks at the beginning of main) you can still run 'p func()' to call the func, even though there's no declaration of it in a.cpp, etc.<br></div></div></div></div></div></blockquote><div><br></div></div></div>LLDB would definitely care (as it is using clang for the expression evaluation, supporting these kinds of features is really straightforward there). By importing the modules (rather than searching through the DWARF), the expression evaluator gains access to additional declarations that are not there in the DWARF, such as templates. 
But since clang modules are not namespaces, we can’t generally “import the world” as a debugger would usually do.</div></div></blockquote><div><br>Sorry, not sure I understand this last sentence - could you explain further?<br><br>I imagine it would be rather limiting for the user if they could only use expressions that are valid in this file from the file - it wouldn't be uncommon to want to call a function from another module/file/etc to aid in debugging.<br></div></div></div></div></div></blockquote><div><br></div></div></div><div>Usually LLDB’s expression evaluator works by creating a clang AST type out of a DWARF type and inserting it into its AST context. We could pre-populate it with the definitions from the imported modules (with all sorts of benefits as described above), but that only works if no two modules conflict. If the declaration can’t be found in any imported module, LLDB would still import it from DWARF in the “traditional” fashion.</div></div></div></blockquote><div><br>But it would import it from DWARF in other TUs rather than use the module info just because the module wasn't directly referenced from this TU? That would seem strange to me. (you would lose debug info fidelity (by falling back to DWARF even though there are modules with the full fidelity info) unnecessarily, it sounds like)<br></div></div></div></div></div></blockquote><div><br></div></div></div>I think it’s reasonable to expect full fidelity for everything that is available in the current TU, and to have the normal DWARF-based debugging capabilities for everything beyond that. But we can only ever provide full fidelity if we have the list of imports for the current TU.<span class=""><br><blockquote type="cite"><div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>Would it be reasonable to use the accelerator table/index to look up the types, then if the type is in the module you could use the module rather than the DWARF stashed alongside it? 
(so the comdat'd split-dwarf skeleton CU for the module would have an index to tell you what names are inside it, but if you got an index hit you'd just look at the module instead of loading the split-dwarf debug info in the referenced file)<br></div></div></div></div></div></blockquote><div><br></div></span><div>I don’t think this approach would work for templates and enumerator values;</div></div></div></blockquote><div><br>Not sure why enumerator values are an issue - but templates (& all manner of other things that don't make it into the index, unfortunately), sure.<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div> they aren’t in the accelerator tables to begin with. It would also be slower if the declaration is available in a module.</div></div></div></blockquote><div><br>Though you're rapidly going to end up loading a lot of modules in (as you go up & down a stack printing various things you'll cross into other TUs & load more modules).<br><br>For a standard DWARF consumer, it seems fine to just have a comdat'd skeleton CU for a module without the need for other CUs to mention which module CUs they reference (but I could be wrong here) & that's the design we originally discussed.<br><br>It would seem unfortunate to bloat every CU with a non-deduplicable list of every module it references, but if that's necessary for a serialized AST aware debugger, it might be fine to have it as an option (so long as it can be turned off) & may still benefit from that list not being the authoritative module reference, but a /very/ terse reference to it so all the extra flags & stuff can be in the deduplicable comdat (& to keep it as consistent as possible between the flag (on/off) codepaths for this extra data). Maybe a FORM_block (?) 
of fixed-size hashes of all the modules back-to-back, so it's as small as possible?<br><br>But I wouldn't mind spending some more time discussing whether there's a better way to keep these things streamlined/symmetric/the same between modular and non-modular debug info.<br><br>- David<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span class="HOEnZb"><font color="#888888"><div><br></div><div>-- adrian</div></font></span><span class=""><br><blockquote type="cite"><div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>- David<br><br><br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><font color="#888888"><div><br></div><div>-- adrian</div></font></span><span><br><blockquote type="cite"><div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><span><div><br></div><div>-- adrian<br><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><blockquote type="cite"><div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I think it really is necessary to put the info about the module imported into the compile unit that imported it. Or is there a way to do this using the fission capabilities that I’m not aware of?<br>
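<br>
To make the LTO concern above concrete, here is a rough Python model of it. This is only a sketch: SkeletonCU, CompileUnit, ObjectFile, lto_merge and modules_for are names made up for illustration, not LLVM or LLDB APIs, and the per-CU import list is the hypothetical extra data being proposed.<br>

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SkeletonCU:
    """Hypothetical stand-in for a fission skeleton CU that names a module."""
    dwo_name: str
    dwo_id: int


@dataclass
class CompileUnit:
    """Hypothetical stand-in for a regular CU in .debug_info."""
    name: str
    # None means "no import list was recorded for this CU".
    imported_dwo_ids: Optional[Tuple[int, ...]] = None


@dataclass
class ObjectFile:
    compile_units: List[CompileUnit]
    skeletons: List[SkeletonCU]


def lto_merge(objects: List[ObjectFile]) -> ObjectFile:
    """Concatenate all CUs and deduplicate skeletons by dwo_id, as COMDAT would."""
    cus = [cu for obj in objects for cu in obj.compile_units]
    unique = {}
    for obj in objects:
        for sk in obj.skeletons:
            unique.setdefault(sk.dwo_id, sk)
    return ObjectFile(cus, list(unique.values()))


def modules_for(obj: ObjectFile, cu: CompileUnit) -> List[str]:
    """Return the module files a debugger could attribute to this CU."""
    if cu.imported_dwo_ids is None:
        # Without a per-CU list, every skeleton in the file is a candidate.
        return [sk.dwo_name for sk in obj.skeletons]
    by_id = {sk.dwo_id: sk for sk in obj.skeletons}
    return [by_id[i].dwo_name for i in cu.imported_dwo_ids if i in by_id]


foo = ObjectFile([CompileUnit("foo.c")], [SkeletonCU("MyModule.pcm", 1)])
bar = ObjectFile([CompileUnit("bar.c")], [SkeletonCU("Other.pcm", 2)])
merged = lto_merge([foo, bar])

# After merging, foo.c can no longer be told apart from bar.c.
ambiguous = modules_for(merged, merged.compile_units[0])

# With a terse per-CU list of module ids, the attribution is recovered.
tagged = CompileUnit("foo.c", imported_dwo_ids=(1,))
precise = modules_for(merged, tagged)
```

In the unmerged foo.o the skeleton in the same file answers the question by itself; after the merge only the per-CU list does.<br>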

<span><font color="#888888"><br>

-- adrian<br>

</font></span><div><div><br>

><br>

> [rambling a bit more along those lines:<br>

> This would work fine in the case of the module (now an object file) containing all the static debug info<br>

> The future step, when we put IR/object code in a module to be linked into the final binary, we could put the skeleton CU in that object file that's being linked in (then we wouldn't need to COMDAT it) or, optionally, link in the debug info itself (skipping the indirection through the external file) if a standalone debug info executable was desired]<br>

<br>

<br>

<br>

><br>

> On Tue, Jan 20, 2015 at 9:39 AM, Adrian Prantl <<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>> wrote:<br>

> As a complementary part of the module debugging story, here is a proposal to list the imported modules in the debug info. This patch is not about efficiency, but rather enables a cool debugging feature:<br>

><br>

> Record the clang modules imported by the current compile unit in the debug info. This allows a module-aware debugger (such as LLDB) to @import all modules visible in the current context before evaluating an expression, thus making available all declarations in the current context (that originate from a module) and not just the ones that were actually used by the program.<br>

><br>

> This implementation uses existing DWARF mechanisms as much as possible by emitting a DW_TAG_imported_module that references a DW_TAG_module, which contains the information necessary for the debugger to rebuild the module. This is similar to how C++ using declarations are encoded in DWARF, with the difference that we're importing a module instead of a namespace.<br>

> The information stored for a module includes the umbrella directory, any config macros passed in via the command line that affect the module, and the filename of the raw .pcm file. Why include all these parameters when we have the .pcm file? Apart from module cache volatility, there is no guarantee that the debugger was linked against the same version of clang that generated the .pcm, so it may need to regenerate the module while importing it.<br>
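<br>
As a rough illustration of the record described above, the following Python sketch models a compile unit whose children include module imports. The class and field names (ModuleDIE, ImportedModuleDIE, umbrella_dir, and so on) are assumptions made for this example and do not correspond to actual DWARF attribute names or to the patch itself; must_rebuild is likewise a hypothetical helper.<br>

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class ModuleDIE:
    """Illustrative stand-in for a DW_TAG_module entry."""
    name: str
    umbrella_dir: str
    config_macros: Tuple[str, ...]
    pcm_path: str


@dataclass(frozen=True)
class ImportedModuleDIE:
    """Illustrative stand-in for a DW_TAG_imported_module entry."""
    module: ModuleDIE


@dataclass
class CompileUnitDIE:
    name: str
    children: List[object] = field(default_factory=list)


def imported_modules(cu: CompileUnitDIE) -> List[ModuleDIE]:
    """Collect the modules a debugger would import for this CU."""
    return [c.module for c in cu.children if isinstance(c, ImportedModuleDIE)]


def must_rebuild(pcm_present: bool, pcm_clang: str, debugger_clang: str) -> bool:
    """Rebuild if the cached .pcm is gone or came from a different clang."""
    return (not pcm_present) or pcm_clang != debugger_clang


my_module = ModuleDIE(
    name="MyModule",
    umbrella_dir="/path/to/MyModule",
    config_macros=("-DNDEBUG",),
    pcm_path="/path/to/module-cache/MyModule.pcm",
)
cu = CompileUnitDIE("foo.c", [ImportedModuleDIE(my_module)])
modules = imported_modules(cu)
```

The umbrella directory and config macros are what would let a debugger rebuild the module when must_rebuild returns True.<br>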

><br>

> Let me know what you think!<br>

> -- adrian<br>

><br>

><br>

><br>

<br>

</div></div></blockquote></div><br></div></div>

</div></blockquote></span></div><br></div></blockquote></div><br></div></div>

</blockquote></div><br></span></div></blockquote></div><br></div></div>

</div></blockquote></span></div><br></div></blockquote></div><br></div></div>

</div></blockquote></span></div><br></div></blockquote></div><br></div></div>