<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Jan 12, 2015 at 2:11 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div class="h5">On Mon, Jan 12, 2015 at 1:56 PM, Richard Smith <span dir="ltr"><<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div>On Fri, Jan 9, 2015 at 8:26 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div>On Fri, Jan 9, 2015 at 5:02 PM, Richard Smith <span dir="ltr"><<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div>On Fri, Jan 9, 2015 at 4:03 PM, Adrian Prantl <span dir="ltr"><<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div><div><blockquote type="cite"><div>On Jan 9, 2015, at 3:57 PM, Richard Smith <<a href="mailto:richard@metafoo.co.uk" target="_blank">richard@metafoo.co.uk</a>> wrote:</div><br><div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Jan 6, 2015 at 10:07 AM, 
Adrian Prantl <span dir="ltr"><<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><br>

> On Dec 12, 2014, at 8:47 PM, Adrian Prantl <<a href="mailto:aprantl@apple.com" target="_blank">aprantl@apple.com</a>> wrote:<br>

><br>

><br>

>> On Dec 12, 2014, at 5:37 PM, Argyrios Kyrtzidis <<a href="mailto:kyrtzidis@apple.com" target="_blank">kyrtzidis@apple.com</a>> wrote:<br>

>><br>

>><br>

>>> On Dec 12, 2014, at 4:33 PM, Eric Christopher <<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>> wrote:<br>

>>><br>

>>> Debug info for types isn't inherently a code generation concept. If you think about it, debug info for types is a stable (if lossy) serialization method for a module file. The line number etc for when there's code generated is a separate issue.<br>

>><br>

>> I see what you mean, but it is traditionally a codegen product with a particular use-case, and it’s not reasonable to force it on every clang client that only wants to parse code, like libclang, static analyzers, migrators, refactoring tools, etc., or on builds that didn’t ask for it.<br>

><br>

> Good point, I tend to forget about non-compiler users of clang modules.<br>

><br>

> If we do decide that having clang modules without debug info is desirable, and we want debug info to be generated lazily (only when needed), then putting it into a separate file is preferable, because it can then be captured as a dependency by build systems.<br>

><br>

> It looks like at this point everyone’s argument is really depending on an assumption that emitting debug info is expensive (or really cheap!, respectively), so my suggestion is to revisit this thread once I actually have some numbers on how long it takes to emit debug info and how much space it takes up. I’ll try to get that done soon.<br>

<br>

</div></div>Hi Argyrios,<br>

<br>

back from the break, here are the promised numbers to make our decision easier:<br>

<br>

I did an experiment where I patched clang to emit debug type info for each type (patch attached for the curious), and compiled an empty program that imports the Cocoa.h header. To compare the sizes I emitted the DWARF to a separate file:<br>

<br>

-rw-r--r--  1 adrian  staff  2151068 Dec 19 16:30 Foundation-3QM1BFEPXW18W.pcm<br>

-rw-r--r--  1 adrian  staff   110772 Dec 19 16:30 Foundation-3QM1BFEPXW18W.pcm.o<br>

<br>

here’s AppKit:<br>

<br>

-rw-r--r--  1 adrian  staff  3302744 Dec 19 16:40 AppKit-5HXLHEH4UB4M.pcm<br>

-rw-r--r--  1 adrian  staff   279080 Dec 19 16:40 AppKit-5HXLHEH4UB4M.pcm.o<br>

<br>

Across all the modules pulled in by Cocoa.h, the median ratio of DWARF size to pcm size is 5%; i.e., the DWARF would take up roughly 5% of the size of the individual modules.<br>

<br>
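As a quick sanity check on the two listings above, here is a minimal Python sketch that computes the DWARF-to-pcm size ratio for Foundation and AppKit. The byte counts are copied from the ls output; the function name dwarf_ratio is only illustrative. Note that the 5% figure is a median over all modules, so these two modules need not match it exactly.<br>

```python
# Sizes in bytes, copied from the ls listings quoted above.
sizes = {
    "Foundation": {"pcm": 2151068, "dwarf": 110772},
    "AppKit": {"pcm": 3302744, "dwarf": 279080},
}


def dwarf_ratio(pcm_bytes, dwarf_bytes):
    """Return the size of the DWARF object file as a fraction of the pcm size."""
    return dwarf_bytes / pcm_bytes


for name, s in sizes.items():
    # Prints one line per module with the ratio formatted as a percentage.
    print(f"{name}: {dwarf_ratio(s['pcm'], s['dwarf']):.1%}")
```

Foundation comes out at a little over 5% and AppKit at a little over 8%.<br>

<br>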

From these numbers I would argue that DWARF emission is comparatively cheap. To keep the implementation simple, I’d prefer to have everything in one file; this way we won’t have to introduce another layer of locking for creating the pcm.o files lazily, but if someone wants to point out that this is a lame excuse, be my guest ;-)<br>

[Another reason to argue for separate .pcm.o files is if we ever want to put something target-specific in there, such as code. Currently this is not the case,</blockquote><div><br></div><div>I certainly have plans to do this, as mentioned previously on this thread.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">and even if we did this, we would still benefit from having the DWARF type information shared between the several .pcm.o files]<br></blockquote><div><br></div><div>Is there any disadvantage to having the debug information for a module split over two .o files (one for the types and another for the inline functions / template instantiations)?</div></div></div></div></div></blockquote><div><br></div></div></div><div>I think that having it split is actually an advantage. By split I mean having the .pcm, which contains the AST and the DWARF for the types, and then several .pcm.o files, one per target, each containing e.g. IR for inline functions plus debug info; the debug info in the various targets refers to the shared DWARF type info in the .pcm. As far as the debug info is concerned, we would use the same mechanisms for the .pcm.o files as we would for any other object that imports the module.</div></div></div></blockquote><div><br></div></div></div><div>OK, I'm fine with that (though in our case I think we'll want to turn this feature off and put all the DWARF output into the same file as the inline functions etc). 
Do you have a plan for supporting debug fission with this mode?</div></div></div></div></blockquote></div></div><div><br>The way I was thinking is that this is, in some sense, fission already.<br><br>We would put a simple module skeleton compile unit that represents the module in each object file compiled using that module - comdat it so it's dedup'd by the linker, and that would reference the pcm.o file just like we reference .dwo files today - and in there we'd have all the usual debug_types.dwo, etc.<br><br>So this /is/ fission.<br><br>If we wanted to split the debug info out from the module, I don't think this would really change - we'd just point at that other file instead.<br><br>(& when we eventually have inline functions and their debug info in the module, we could drop the comdat and just put the skeleton CU in that object file to be linked in directly (and to contain the debug info for those inline functions, etc))<br><br>Does that sound reasonable/make sense - I can flesh out some of the DWARF terminology I've used if it's unclear.</div></div></div></div></blockquote><div><br></div></div></div><div>This is the answer I was hoping for / expecting, I just wanted to make sure that this had been considered. To my mind, this means that it's neither relevant nor necessary that the .pcm file is an ELF / MachO / COFF / etc. object file, all that matters is that it's a file that DWARF readers are able to read DWARF from (and a format that we can read Clang's PCM information from). 
Does that give us any additional flexibility regarding the format?</div></div></div></div></blockquote></div></div><div><br>My guess would be that this doesn't give us any additional flexibility today - I think GDB is the only implementation of Fission today and, while I don't know for sure, I don't have any reason to believe it can handle .dwo files in any format other than ELF (or perhaps generalized to any object file GDB can cope with on each platform it supports).<br> </div><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>One other change that I would like to be made with this one: fix llvm-bcanalyzer so that it can read whatever file format we end up using for .pcm files. We get several fringe benefits such as this from using bitcode, and it would be unfortunate to lose them.<br></div></div></div></div></blockquote></span><div><br>Would it be sufficient to teach llvm-readelf or something to have options (if it doesn't have them already) to dump a specific section to stdout and you'd just pipe that to bcanalyzer?<br></div></div></div></div></blockquote><div><br></div><div>That seems reasonable to me.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div></div><div>Adrian: have you looked at the file size increase for an empty module from adding this wrapper format and skeleton/empty DWARF information? 
That'd be an interesting data point (mostly just to assuage my concern here -- some builds will have thousands of these files loaded, and a few dozen KiB per PCM file adds up to a lot of address space).<br></div></div></div></div>
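To make the address-space concern above concrete, here is a rough back-of-the-envelope sketch in Python. The inputs (5000 modules, 32 KiB of wrapper overhead per PCM file) are purely hypothetical placeholders for "thousands" and "a few dozen KiB"; no measured overhead has been reported in this thread.<br>

```python
KIB = 1024
MIB = 1024 * KIB


def total_overhead_bytes(num_modules, overhead_kib_per_module):
    """Total extra bytes mapped if every loaded PCM file carries a fixed overhead."""
    return num_modules * overhead_kib_per_module * KIB


# Hypothetical inputs, chosen only for illustration.
num_modules = 5000
overhead_kib = 32

total = total_overhead_bytes(num_modules, overhead_kib)
print(f"{num_modules} modules x {overhead_kib} KiB = {total / MIB:.0f} MiB")
```

Under these assumed inputs the fixed overhead alone would come to roughly 156 MiB of mapped files.<br>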

</blockquote></span></div><br></div></div>

<br>_______________________________________________<br>

cfe-dev mailing list<br>

<a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a><br>

<a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br>

<br></blockquote></div><br></div></div>