r188739 - Revert "Revert "Revert "Revert "DebugInfo: Omit debug info for dynamic classes in TUs that do not have the vtable for that class""""

Greg Clayton gclayton at apple.com
Tue Dec 17 12:18:51 PST 2013


On Dec 17, 2013, at 12:07 PM, David Blaikie <dblaikie at gmail.com> wrote:

> 
> 
> 
> On Mon, Dec 16, 2013 at 5:18 PM, Greg Clayton <gclayton at apple.com> wrote:
> 
> On Dec 16, 2013, at 5:10 PM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> >
> >
> >
> > On Mon, Dec 16, 2013 at 4:42 PM, Adrian Prantl <aprantl at apple.com> wrote:
> >
> > On Dec 16, 2013, at 14:55, David Blaikie <dblaikie at gmail.com> wrote:
> >
> > >
> > >
> > >
> > > On Mon, Dec 16, 2013 at 2:44 PM, Adrian Prantl <aprantl at apple.com> wrote:
> > > Hi Chandler and David,
> > >
> > > unfortunately it looks more like case 1. This optimization breaks several assumptions that tools in our software stack depend on.
> > >
> > > It's a fairly substantial debug info size savings that seems worth investigating whether you can keep it enabled at least in
> >
> > this sentence is cut off in the middle :-)
> > But extrapolating from that: For (LTO) builds we still have type uniquing which gets us the same kind of improvement and more.
> >
> > Sure - in the final linked debug info, but you'll still suffer a disk/build-time penalty for all that & that only applies to LTO.
> >
> > I was trying to say "keep it enabled at least in some cases" - i.e., look at the cases where you're having trouble with this optimization and see if a more targeted approach to addressing those particular use cases can be employed.
> >
> > > - For example, it breaks dtrace, which on Darwin relies on being able to pull the (complete) CTF info (compact C type format) out of the DWARF in the .dSYM for a given module.
> > >
> > > I take it you're already using -fno-limit-debug-info for these scenarios, then? (are you using -flimit-debug-info at all?)
> >
> > Currently I believe that -flimit-debug-info is the default on Darwin. I don’t know what you mean by “these scenarios”; “we” can’t anticipate what programs users may want to probe using dtrace.
> >
> > Can dtrace be improved to read the rest of the debug info/can you ship debug info for the libraries in question?
> 
> No, dtrace uses CTF, which is an archived version of the struct/class layouts generated from the DWARF. It is the CTF tool that would be required to be updated.
> 
> OK - I'm unfamiliar with these tools.
>  
> Again, how would you go about finding a random type "foo" that is a declaration only? Where would you look? What executable file should one look in?
> 
> Fair question - though I do wonder why this question isn't already an issue. If the user's code only declares a pointer to one of these types Clang (under -flimit-debug-info) only emits the declaration of that type. Or if the user forward declares the type and never includes the header with the definition, we can only emit a declaration of the type. 
> 
> So how does this issue not already come up for you?
> 

It does come up, but the forward declaration is all that is needed for that type to exist. Forward declarations are usually emitted for pointers and reference types and they are all that is needed in order to recreate the type (in a clang AST like LLDB does, though this requirement is LLDB specific).

We solve it by leaving all forward declarations as forward declarations in the type that is parsed by the DWARF reader. Later, when we are displaying the type in an expression or when showing variables in the command line or in the Xcode GUI, we will look for the type in other shared libraries. Again, this is specific to a debugger that has a dynamic linker telling us all of the libraries involved. If we don't already have the debug info for other shared libraries, we don't go downloading all of them just to look for types, so a forward decl that is used in a pointer might end up remaining opaque.
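
To make the lookup order above concrete, here is a minimal sketch. Everything in it (TypeInfo, Module, FindComplete, ResolveType) is a hypothetical, simplified stand-in that I made up for illustration; it is not LLDB's actual code or API. It only shows the idea: keep the forward declaration, try to complete it lazily from modules whose debug info is already loaded, and leave it opaque otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins. These are NOT LLDB classes.
struct TypeInfo {
    std::string name;
    bool is_complete;       // false: only a forward declaration was emitted
    std::size_t byte_size;  // meaningful only when is_complete is true
};

struct Module {
    std::string path;
    bool debug_info_loaded;  // debug info is never fetched just to find a type
    std::map<std::string, TypeInfo> types;
};

// Return the complete definition of `name` from module `m`, or nullptr.
const TypeInfo *FindComplete(const Module &m, const std::string &name) {
    if (!m.debug_info_loaded) return nullptr;
    auto it = m.types.find(name);
    if (it == m.types.end() || !it->second.is_complete) return nullptr;
    return &it->second;
}

// Resolve lazily: try the referencing module first, then every other loaded
// module. A nullptr result means the type stays an opaque forward declaration.
const TypeInfo *ResolveType(const std::string &name, const Module &home,
                            const std::vector<Module> &loaded) {
    assert(!name.empty());
    if (const TypeInfo *t = FindComplete(home, name)) return t;
    for (const Module &m : loaded) {
        if (const TypeInfo *t = FindComplete(m, name)) return t;
    }
    return nullptr;
}
```

The cost of the loop is the point: the more types that are declaration-only in the current binary, the more often the debugger has to go searching through other modules, and the more often the search comes back empty.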

> >
> > >
> > > - Kernel extensions tend to inherit from base classes that are defined in a system framework (I/O Kit works this way for example).
> > >
> > > And the library where the base class is defined isn't built with debug info as a matter of course? Is that solvable at all?
> >
> > Of course - by not performing this optimization, the type info for the base class ends up in the user’s .o file, as it always used to. This is just one example to make the “user code that inherits from base classes defined in 3rd-party C++ library” scenario I mentioned yesterday more real.
> >
> > I meant is it possible to solve the problem of "this library is built without debug info" - in my experience libraries generally offer a dbg variant of the library for this purpose. Is this a possible solution to your problem? Or are you unable to ship the libraries in question with appropriate debug info & must rely on the user's builds to contain sufficient info?
> 
> We don't ship any system libraries to external folks here at Apple, so that isn't a solution we can deal with. The kernel is "special" in that it uses all sorts of interesting approaches and special tools for its build and it isn't going to change soon and/or quickly. We can work on it, but it will take time.
> 
> OK - so are you suggesting making a special case change so that kernel modules would build without this debug info space optimization on? It would seem a pity (but hey, it's your platform) to suffer 20% debug info size increase (yes, in non-LTO, non-linked debug info cases) for other scenarios just because this particularly esoteric one has an issue.

I really see this space savings as coming at a cost I am not willing to pay. Disk space is cheap. Debuggers want to have debug info for the variables that are in the current binary. If we are really looking to solve this issue, we should solve it in a way that lets us remove duplicate type definitions. This doesn't help the .o file case, but it solves the problem correctly instead of making concessions by not emitting stuff because we think it isn't going to be needed/wanted by the consumer.

>  
> 
> >
> > (notice that I first discovered this GCC optimization looking at basic_ifstream - the stream base is polymorphic and has an out-of-line vtable - this optimization allows code using C++ streams to avoid emitting /tons/ of stream related debug info and rely on it being present in the debug build of the standard library)
> 
> I can believe that there are some serious savings with this enabled.
> 
> Given that we have no good solution for finding the one true definition for a type besides "find all debug info files for all shared libraries and parse all that debug info until you can find a definition", it is hard to justify this extra cost.
> 
> I don't really understand why you don't already have this problem, though... especially if you're defaulting to -flimit-debug-info. Maybe not for this particular type that you're encountering a problem with today/with my optimization, but for many other types I imagine you would've seen declaration-only debug info quite a bit.

We do see this problem already, but this current change makes it much worse for LLDB because we are re-creating AST types from the DWARF. We have an AST context per executable file that contains all types from that executable and we create clang types from the DWARF for each DWARF type. Now information is missing that is required for us to make the type.
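
A small illustration of the kind of code I have in mind. The names (Shape, Triangle, CountSides) and the file split are hypothetical, and the comments describe my understanding of the change named in the subject line under the Itanium C++ ABI, where the vtable is emitted in the translation unit that defines the class's key function. Everything is shown in one block so it compiles on its own.

```cpp
#include <cassert>

// --- shape.h (hypothetical header, included by both files below) ---
struct Shape {
    virtual ~Shape();           // first non-inline virtual: the key function
    virtual int sides() const;
    int id = 0;
};

// --- shape.cpp (hypothetical, e.g. part of a library or framework) ---
// Defines the key function, so the vtable for Shape is emitted here. With the
// optimization on, this is the only TU that gets Shape's full definition in
// its debug info.
Shape::~Shape() {}
int Shape::sides() const { return 0; }

// --- user.cpp (hypothetical user code) ---
// Shape's vtable is not emitted in this TU, so with the optimization on its
// debug info would carry only a declaration of Shape. Triangle is defined
// here, but its base class layout (vptr, id) has to come from somewhere else.
struct Triangle : Shape {
    int sides() const override { return 3; }
};

int CountSides(const Shape *s) {
    assert(s != nullptr);
    return s->sides();
}
```

If the library that contains shape.cpp has no debug info available, there is nothing to build the Shape base of Triangle from when re-creating the type.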

Greg







More information about the cfe-commits mailing list