[Lldb-commits] [Diffusion] rL245090: Move all clang type system DWARF type parsing into ClangASTContext.cpp.

Greg Clayton via lldb-commits lldb-commits at lists.llvm.org
Thu Aug 27 11:54:47 PDT 2015


> On Aug 26, 2015, at 9:08 PM, Dawn Perchik via lldb-commits <lldb-commits at lists.llvm.org> wrote:
> 
> dawn added a subscriber: dawn.
> dawn added a comment.
> 
> Hi Greg,
> 
> While I really appreciate that you're making lldb less Clang specific (e.g. renaming ClangASTType to CompilerType), I'm concerned about this commit because it moves DWARF knowledge from the SymbolFileDWARF plugin into ClangASTContext.cpp, making it impossible to support other debug formats as plugins.

Impossible? Why is it impossible? This is the way DWARF is able to abstract AST specific type creation from raw DWARF info. Any other type system could create their own types. If you have a pascal::ASTContext that has all of its own types, it should be able to create types correctly in pascal::ASTContext using DWARF. It should also be able to do this without littering the SymbolFileDWARF code with:

#include "clang/..."
#include "clang/..."
#include "clang/..."
#include "clang/..."
#include "clang/..."

#include "swift/..."
#include "swift/..."
#include "swift/..."
#include "swift/..."
#include "swift/..."

#include "pascal/..."
#include "pascal/..."
#include "pascal/..."
#include "pascal/..."
#include "pascal/..."

void
SymbolFileDWARF::DoSomething()
{
    if (IsClang())
    {
        clang::ASTContext ....
    }
    else if (IsPascal())
    {
        pascal::ASTContext ....
    }
    else if (IsSwift())
    {
        swift::ASTContext ....
    }
}

This is wrong. SymbolFileDWARF implements a way to explore raw DWARF and it isn't be tied to a language or AST context format. It should defer that to a language specific type system, or in our case a lldb_private::TypeSystem subclass.

ClangASTContext inherits from lldb_private::TypeSystem. The DWARF can then get the correct type system from the compile unit language and have the type system parse the types into their native AST type format. We support both clang and Swift internally at Apple and I can't tell you how many headaches there have been merging the DWARF code when this wasn't the case. The code used to look like the example show above. Now it looks like:

Type *
SymbolFileDWARF::ParseType(const DWARFDIE &die)
{
    TypeSystem *type_system = die.GetTypeSystem();
    if (type_system)
        return type_system->ParseType (die);
    return nullptr;
}

All clang includes are gone from the DWARF parser. Same for Swift. 

Yes there is DWARF specific type parsing going on in ClangASTContext, but now any type system (language like c lanaguages, swift, others) can convert DWARF into types for their own language.

When the PDB support is added, you will need to make a way to convert the PDB stuff into Clang specific types as well. All these changes do are allow the different languages and their different AST classes to cleanly create types from pure DWARF.

> 
> I realize some DWARF had leaked outside of the DWARF plugin before this commit (like the enums for language and calling convention), but those could easily be converted to/from other debug formats.  This new code really requires that the debug info be DWARF.

I didn't want to spend the time to make an agnostic form of debug info that each SymbolFile would need to create in order to represent types so that TypeSystem subclasses could consume those to create types. Why? Performance. DWARF reading is already a bottleneck and adding any more conversion layers in between will only slow down type parsing. This new code _is_ for DWARF, so yes, it does require DWARF. There are 5 function added for DWARF parsing to lldb_private::TypeSystem:

class TypeSystem
{
    virtual lldb::TypeSP
    ParseTypeFromDWARF (const SymbolContext& sc,
                        const DWARFDIE &die,
                        Log *log,
                        bool *type_is_new_ptr);

    virtual Function *
    ParseFunctionFromDWARF (const SymbolContext& sc,
                            const DWARFDIE &die);

    virtual bool
    CompleteTypeFromDWARF (const DWARFDIE &die,
                           lldb_private::Type *type,
                           CompilerType &clang_type);

    virtual CompilerDeclContext
    GetDeclContextForUIDFromDWARF (const DWARFDIE &die);

    virtual CompilerDeclContext
    GetDeclContextContainingUIDFromDWARF (const DWARFDIE &die);
}


These functions could be abstracted into a class like ASTParserDWARF:

class ASTParserDWARF
{
    virtual lldb::TypeSP
    ParseType (const SymbolContext& sc,
               const DWARFDIE &die) = 0;

    virtual Function *
    ParseFunction (const SymbolContext& sc,
                   const DWARFDIE &die) = 0;

    virtual bool
    CompleteType (const DWARFDIE &die,
                  lldb_private::Type *type,
                  CompilerType &clang_type) = 0;

    virtual CompilerDeclContext
    GetDeclContextForUID (const DWARFDIE &die) = 0;

    virtual CompilerDeclContext
    GetDeclContextContainingUID (const DWARFDIE &die) = 0;

};

And then TypeSystem could ask for the DWARF parser:

class TypeSystem
{
    virtual ASTParserDWARF *GetDWARFParser();
};

But I didn't do that, I just added these functions to TypeSystem. We can make them not be pure virtual so that they don't need to be implemented by all lldb_private::TypeSystem subclasses. Adding support for new debug info formats would involve adding new functions to TypeSystem for your debug info format:

class PDBEntry;

class TypeSystem
{
    virtual lldb::TypeSP
    ParseTypeFromPDB (const SymbolContext& sc,
                      const PDBEntry &p) = 0;

    virtual Function *
    ParseFunctionFromPDB (const SymbolContext& sc,
                          const PDBEntry &p) = 0;

    virtual bool
    CompleteTypeFromPDB (const PDBEntry &p,
                           lldb_private::Type *type,
                           CompilerType &clang_type) = 0;

    virtual CompilerDeclContext
    GetDeclContextForUIDFromPDB (const PDBEntry &p) = 0;

    virtual CompilerDeclContext
    GetDeclContextContainingUIDFromPDB (const PDBEntry &p) = 0;
}

The fact is it is much cleaner having all code that converts raw debug info into AST specific types be written in the file that implements and can create these specific types. It also makes merges super easy as there is nothing to merge since all type parsing is in the ClangASTContext.cpp, SwiftASTContext.cpp or PascalASTContext.cpp. As long as the virtual interface in lldb_private::TypeSystem doesn't change, we are good to go.

So the main idea is to allow for languages to be merged into our codebase with minimal invasiveness by having all language specific functionality hidden behind an interface that allows for very easy merging. Yes the DWARF has leaked out of the DWARF plug-in, but one could say that clang previously leaked into the DWARF plug-in. So neither way was clean.

We can talk about approaches to type parsing like possibly making a TypeSystemDWARF and TypeSystemPDB class that TypeSystem can hand out with code like:


class TypeSystem
{
    virtual ASTParserDWARF *GetDWARFParser();
    virtual ASTParserPDB *GetPDBASTParser();
};

Then we can better abstract format specific type parsing in this way. Let me know what you think, and I hope this explains the motivation for the latest changes.

To sum up:
- Yes the changes were designed to be DWARF specific
- We needed to allow for easier merging
- I wanted all clang:: stuff out of SymbolFileDWARF

Greg 



More information about the lldb-commits mailing list