[lldb-dev] [BUG] Many lookup failures

Tue Dec 1 11:29:44 PST 2015

One other issue with removing debug info from the current binary for base classes that are virtual: the definition of the base class can change in libb.so while liba.so is still linked against an older version of class B from libb.so. For example:

class A : public B
{
    int m_a;
};

If A was linked against a B that looked like this:

class B
{
    virtual ~B();
    int m_b;
};

Then libb.so was rebuilt and B now looks like:

class B
{
    virtual ~B();
    virtual int foo();
    int m_b;
    int m_bb;
};

Then, when displaying an instance of "A" from the liba.so that was linked against the first version of B, we would actually show you the new version of "B", and everything would look as if it were using the new definition of B. But liba.so is actually linked against the old definition, and the code in class A would probably crash at some point due to the compilation mismatch. The user would never see what the original program was linked against, so they would have little chance of spotting the issue and realizing they need to recompile liba.so against libb.so. If full debug info is emitted, we would be able to show the original structure of B. This is not an issue that people are always going to run into, but it is a reason that I like to have all the info complete in the current binary.
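
To make the mismatch concrete, here is a minimal sketch that puts the two layouts side by side. The namespaces old_libb and new_libb and the helpers offset_of_m_a and print_layouts are hypothetical names introduced for illustration, and the members are made public only so the offsets can be measured. The exact numbers depend on the ABI, so treat the output as indicative rather than definitive:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-ins for the two builds of libb.so. In the real scenario
// both classes are simply called B and come from different builds.
namespace old_libb {
class B {
public:
    virtual ~B() {}
    int m_b = 0;
};
class A : public B {  // A as liba.so was actually compiled
public:
    int m_a = 0;
};
}  // namespace old_libb

namespace new_libb {
class B {
public:
    virtual ~B() {}
    virtual int foo() { return 0; }
    int m_b = 0;
    int m_bb = 0;
};
class A : public B {  // A as it would be laid out on top of the new B
public:
    int m_a = 0;
};
}  // namespace new_libb

// Byte offset of m_a inside a complete A object, measured at run time
// (offsetof is only conditionally supported for polymorphic classes).
template <typename T>
std::ptrdiff_t offset_of_m_a() {
    T obj;
    const char *base = reinterpret_cast<const char *>(&obj);
    const char *member = reinterpret_cast<const char *>(&obj.m_a);
    assert(member >= base);
    return member - base;
}

// Print what each build believes about the layout of A.
void print_layouts() {
    std::printf("old B: sizeof(A)=%zu offset(m_a)=%td\n",
                sizeof(old_libb::A), offset_of_m_a<old_libb::A>());
    std::printf("new B: sizeof(A)=%zu offset(m_a)=%td\n",
                sizeof(new_libb::A), offset_of_m_a<new_libb::A>());
}
```

Calling print_layouts() from main shows both views. On ABIs where the offset of m_a differs between the two builds, code compiled against the old B reads and writes m_a at a location that a debugger using the new B would attribute to a different member.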

Greg

> On Nov 30, 2015, at 3:32 PM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
> On Mon, Nov 30, 2015 at 3:29 PM, Greg Clayton <gclayton at apple.com> wrote:
> 
> > On Nov 30, 2015, at 2:54 PM, David Blaikie <dblaikie at gmail.com> wrote:
> >
> >
> >
> > On Mon, Nov 30, 2015 at 2:42 PM, Greg Clayton <gclayton at apple.com> wrote:
> > >
> > > This will print out the complete class definition that we have for "CG::Node" including ivars and methods. You should be able to see the inheritance structure and you might need to also dump the type info for each inherited class.
> > >
> > > Compilers have been trying to not output a bunch of debug info and in the process they started to omit class info for base classes. So if you have:
> > >
> > > class A : public B
> > > {
> > > };
> > >
> > > where class "B" has all sorts of interesting methods, the debug info will often look like:
> > >
> > > class B; // Forward declaration for class B
> > >
> > > class A : public B
> > > {
> > > };
> > >
> > > When this happens, we must make class A in a clang::ASTContext in DWARFASTParserClang and if "B" is a forward declaration, we can't leave it as a forward declaration or clang will assert and kill the debugger, so currently we just say "oh well, the compiler gave us lame debug info, and clang will crash if we don't fix this, so I am going to pretend we have a definition for class B and it contains nothing".
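
A minimal sketch of why an empty stand-in for B makes expression lookups fail, assuming a hypothetical member function method() on B. The namespaces with_definition and with_stub and the trait has_method are illustrative names only, not anything LLDB or clang provides:

```cpp
#include <type_traits>
#include <utility>

// What the program was really compiled from.
namespace with_definition {
class B {
public:
    int method() { return 0; }  // hypothetical "interesting" method
};
class A : public B {};
}  // namespace with_definition

// What a consumer sees if B is only forward declared in the debug info and
// gets replaced by a complete-but-empty definition.
namespace with_stub {
class B {};
class A : public B {};
}  // namespace with_stub

// Detects whether `t.method()` is a valid expression for an object of type T.
template <typename T, typename = void>
struct has_method : std::false_type {};

template <typename T>
struct has_method<T, decltype(void(std::declval<T &>().method()))>
    : std::true_type {};

// With the real B, A inherits method(); with the stub, member lookup on A
// finds nothing, which is the same kind of failure as "no member named ...".
static_assert(has_method<with_definition::A>::value,
              "A sees B's members when B is fully defined");
static_assert(!has_method<with_stub::A>::value,
              "A sees no members when B is an empty stub");
```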
> > >
> > > Why not look up the definition of B in the debug info at this point rather than making a stub/empty definition? (& if there is none, then, yes, I suppose an empty definition of B is as good as anything, maybe - it's going to produce some weird results, maybe)
> >
> > LLDB creates types using only the debug info from the current shared library, and we don't take a copy of a type from another shared library when creating the types for a given shared library. Why? LLDB has a global repository of modules (the class that represents an executable or shared library in LLDB). If Xcode, or any other IDE that can debug more than one thing at a time, has two targets, "a.out" and "b.out", they share all of the shared library modules, so that if debug info has already been parsed in the target for "a.out" for the shared library "liba.so" (or any other shared library), then the "b.out" target has the debug info already loaded for "liba.so" because "a.out" already loaded that module (LLDB runs in the same address space as our IDE). This means that all debug info in LLDB currently creates types using only the info in the current shared library. When we debug "a.out" again, we might have recompiled "liba.so" but not "libb.so", and we don't need to reload the debug info for "libb.so" if it hasn't changed; we just reload "liba.so" and its debug info. When we rerun a target (run a.out again), we don't need to spend any time reloading any shared libraries that haven't changed, since they are still in our global shared library cache. So to keep this global library cache clean, we don't allow types from one shared library (libb.so) to be loaded into another (liba.so); otherwise we wouldn't be able to reap the benefits of our shared library cache, as we would always need to reload debug info every time we run.
> >
> > Ah, right - I do remember you describing this to me before. Sorry I forgot.
> >
> > Wouldn't it be sufficient to just copy the definition when needed? If the type changes in an incompatible way in a dependent library, the user is up a creek already, aren't they? (eg: libb.so is rebuilt with a new, incompatible version of some type that liba.so uses, but liba.so is not rebuilt) Perhaps you wouldn't be responsible for rebuilding the liba.so cache until it's actually recompiled. Maybe?
> >
> 
> The fix to LLDB I want to do is to complete the type when we need to for base classes, but mark it with metadata. When we run expressions we create a new clang::ASTContext for each expression, and copy types over into it. The ASTImporter can be taught to look for the metadata on the class that says "I completed this class because I had to", and when copying it, we would grab the right type from the current version of libb.so. This keeps everyone happy: modules get their types with some classes completed but marked, and the expressions get the best version available in their AST contexts where if a complete version of the type is available we find it and copy it in place of the completed but incomplete version from the module AST.
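
The plan above can be sketched with a deliberately simplified model. Everything here (TypeInfo, Module, forcefully_completed, make_type, import_for_expression) is hypothetical and is not LLDB or clang API; the real change would live in the ASTImporter and operate on clang declarations:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a type held in a module's AST.
struct TypeInfo {
    std::string name;
    std::vector<std::string> members;
    bool forcefully_completed = false;  // "I completed this class because I had to"
};

// Hypothetical stand-in for one shared library and the types created from
// its own debug info only.
struct Module {
    std::string name;
    std::map<std::string, TypeInfo> types;
};

TypeInfo make_type(const std::string &name,
                   const std::vector<std::string> &members,
                   bool forcefully_completed) {
    TypeInfo t;
    t.name = name;
    t.members = members;
    t.forcefully_completed = forcefully_completed;
    return t;
}

// Copy a type from `origin` into an expression context. A genuine definition
// is copied as-is. A type that was only completed to keep clang happy is
// replaced by a real definition from any other module, if one exists;
// otherwise it stays complete but empty.
TypeInfo import_for_expression(const Module &origin,
                               const std::vector<const Module *> &all_modules,
                               const std::string &type_name) {
    const TypeInfo &local = origin.types.at(type_name);  // throws if unknown
    if (!local.forcefully_completed)
        return local;
    for (const Module *m : all_modules) {
        std::map<std::string, TypeInfo>::const_iterator it = m->types.find(type_name);
        if (it != m->types.end() && !it->second.forcefully_completed)
            return it->second;
    }
    return local;
}
```

The module's own copy of the type is never modified in this sketch, so the per-module cache stays valid; only the copy handed to the expression is upgraded.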
> 
> 
> > LLDB does have the ability, when displaying types, to grab types from the best source (other shared libraries); we just don't transplant those types into the versions held by the LLDB shared library objects (lldb_private::Module). We do currently assume that all classes that aren't pointers or references (or other types that can legally have forward declarations of structs or classes) are complete in our current model.
> >
> > There are modifications we can make to LLDB to deal with partial debug info, and with its possible absence when the debug info for other shared libraries is not present, but we haven't done this yet in LLDB.
> >
> > >
> > > I really don't like that the compiler thinks this is OK to do, but that is the reality and we have to deal with it.
> > >
> > > GCC's been doing it for a while longer than Clang & it represents a substantial space savings in debug info size - it'd be hard to explain to users why Clang's debug info is so much (20% or more) larger than GCC's when GCC's contains all the information required and GDB gives a good user experience with that information and LLDB does not.
> >
> > LLDB currently recreates types in a clang::ASTContext and this imposes much stricter rules on how we represent types which is one of the weaknesses of the LLDB approach to type representation as the clang codebase often asserts when it is not happy with how things are represented.
> >
> > Sure, but it seems like it's the cache that's the real issue/stumbling block here, rather than Clang's AST requirements. As Eric said, the DWARF is (usually) available (unless you aren't building your whole program with debug info, which is the case -fstandalone-debug (aka -fno-limit-debug-info) is intended for: "hey, I need this object file to have debug info that doesn't depend on any other file"), LLDB just isn't using it.
> 
> So that problem goes away with my ASTImporter changes as mentioned above where when we import a type from liba.so into the expression AST, we copy all complete types and any types marked with the "I was completed just to keep clang happy" metadata get imported from the best source available or just left complete but empty if the debug info is missing since that is the best we can do.
> 
> Yep, seems plausible to me. Looking forward to it, some day - maybe the Windows guys'll get to this before you do, not sure. But good to have a plan described/to work from whenever anyone decides this is their longest pole.
>  
> 
> > This does pay off IMHO in the complex expressions we can evaluate, where we can use flow control, define and use C++ lambdas, and write more than one statement when writing expressions. But it is definitely a tradeoff. GDB has its own custom type representation which can be better for dealing with the different kinds and completeness of debug info, but I am comfortable with our approach.
> >
> > So we need to figure out what the root problem is here before we can go further and talk about any additional solutions or fixes that may be required.
> >
> > For sure, for this particular user - perhaps there's some other reason they're seeing this behavior that's got nothing to do with this tangent. (but, as you say, judging by the specific situation/behavior, it's a fair guess/bet that it's this quirk/bug/mismatch of expectations)
> 
> Yes, something is failing and we need to fix the problem so users don't need to worry about it; it should just work, and the debug info should still be stored efficiently.
> 
> Greg
> 
> >
> > - Dave
> >
> >
> > Greg
> >
> > >
> > > So the best thing I can offer is that you must use -fno-limit-debug-info when compiling to stop the compiler from doing this, and things should be back to normal for you. If this isn't what is happening, let us know what the "image lookup -t" output looks like and we can see what we can do.
> > >
> > > Greg Clayton
> > > > On Nov 25, 2015, at 10:00 AM, Ramkumar Ramachandra via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > Basic things are failing.
> > > >
> > > > (lldb) p lhs
> > > > (CG::VarExpr *) $0 = 0x000000010d445ca0
> > > > (lldb) p lhs->rootStmt()
> > > > (CG::ExprStmt *) $1 = 0x000000010d446290
> > > > (lldb) p cg_pp_see_it(lhs->rootStmt())
> > > > (const char *) $2 = 0x000000010d448020 "%A = $3;"
> > > > (lldb) p cg_pp_see_it(def->rootStmt())
> > > > error: no member named 'rootStmt' in 'CG::Node'
> > > > error: 1 errors parsing expression
> > > > (lldb) p cg_pp_see_it(def)
> > > > error: no matching function for call to 'cg_pp_see_it'
> > > > note: candidate function not viable: no known conversion from
> > > > 'CG::Node *' to 'CG_Obj *' for 1st argument
> > > > error: 1 errors parsing expression
> > > >
> > > > It's total junk; why can't it see the inheritance VarExpr -> Node ->
> > > > CG_Obj? The worst part is that rootStmt() is a function defined on
> > > > Node!
> > > >
> > > > Ram
> > > > _______________________________________________
> > > > lldb-dev mailing list
> > > > lldb-dev at lists.llvm.org
> > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev