[lldb-dev] RFC: Simplifying SymbolFile interface
Pavel Labath via lldb-dev
lldb-dev at lists.llvm.org
Thu Jan 10 00:43:35 PST 2019
On 09/01/2019 20:59, Zachary Turner via lldb-dev wrote:
> The Native PDB symbol file plugin I think is mostly complete. It's at
> least almost as good as the old Windows-only PDB plugin in 90% of ways,
> while actually being significantly better in other ways (for example,
> there was a test that took over 2 minutes to run with the Windows-only
> PDB plugin, which now takes about 2 seconds to run with the native PDB
> plugin.
>
> While implementing this, I ran into several things that made my life
> quite difficult, and only later found out that I could have saved myself
> a lot of headache and time if the SymbolFile interface had been a little
> simpler and easier to understand.
>
> Specifically, I'd like to remove the heavy use of SymbolContext in the
> SymbolFile / SymbolVendor interface and replace it with more narrow and
> targeted parameter lists.
>
> Consider the case of someone calling FindTypes. In theory, today they
> can fill out any combination of Target, Module, Function, Block,
> CompileUnit, LineEntry, and Symbol. That's 2^7 different possible ways
> the function can be called. While obviously not all of these
> combinations make sense, the fact is that it greatly increases the API
> surface, which is bad for test coverage, bad for ease of understanding,
> bad for usability, and leads to a lot of dead code.
>
> For a person implementing this function for the first time, and who may
> not know all the details about how the rest of LLDB works, this is quite
> daunting because there's an inherent desire to implement the function
> faithfully "just in case", since they don't know all of the different
> ways the function might be called.
>
> This results in wasted time on the developer's part, because they end up
> implementing a bunch of functionality that is essentially dead code.
>
> We can certainly document for every single function "The implementor
> should be prepared to handle the case of fields X, Y, and Z being set,
> and handle it in such and such way", but I think it's easier to just
> change the interface to be more clear in the first place.
>
>
> Here are the cases I identified, and a proposal for how I could change
> the interface.
>
> 1) SymbolFile::ParseTypes(SymbolContext&)
> * In the entire codebase, this is only called with a CompileUnit
> set. We should change this to be ParseTypesForCompileUnit(CompileUnit&)
> so that the interface is self-documenting. A patch with this change is
> here [https://reviews.llvm.org/D56462]
>
> 2) SymbolFile::ParseDeclsForContext(CompilerDeclContext)
> * This is intended to only be used for parsing variables in a block.
> But should it be recursive? It's impossible to tell from the function
> name, so callers can't use it correctly and implementors can't implement
> it correctly. I spent 4 days trying to implement a generic version of
> this function for the NativePDB plugin only to find out that I only
> actually cared about block variables. I would propose changing this to
> ParseVariableDeclsForBlock(Block&).
>
> 3) These functions:
> * ParseCompileUnitLanguage(SymbolContext&)
> * ParseCompileUnitFunctions(SymbolContext&)
> * ParseCompileUnitLineTable(SymbolContext&)
> * ParseCompileUnitDebugMacros(SymbolContext&)
> * ParseCompileUnitSupportFiles(SymbolContext&)
>
> are only for CompileUnits (as the name implies. I propose changing the
> parameter from a SymbolContext& to a CompileUnit&.
>
> 4) SymbolFile::ParseFunctionBlocks(SymbolContext&)
> * This is intended to be used when the SymbolContexts m_function
> member is set. I propose changing this to
> SymbolFile::ParseFunctionBlocks(Function&).
>
> 5) SymbolFile::ParseVariablesForContext(CompilerDeclContext)
> * This function is only called with the the Context being a CompileUnit,
> Function, and Block. But does it need to be recursive? For a Function
> and Block it seems to be assumed to be recursive, and for a CompileUnit
> it seems to be assumed to not be recursive. For the former case, it's
> not actually clear how this function differs from ParseGlobalVariables,
> and for the latter case I would propose changing this to
> ParseImmedateVariablesForBlock(Block&).
>
> 6) SymbolFile::FindTypes(SymbolContext&).
> * This function is only called with the m_module field set, and since a
> SymbolFile is already tied to a module anyway, the parameter appears
> unnecessary. I propose changing this to SymbolFile::FindAllTypes()
>
> 7) SymbolFile::FindNamespace(SymbolContext&, ConstString, DeclContext*)
> is only called with default-constructed (i.e. null) SymbolContexts,
> making the first parameter unnecessary. I propose changing this to
> FindNamespace(ConstString, DeclContext*)
>
>
> 8) Module::FindTypes(SymbolContext &, ConstString, bool , size_t ,
> DenseSet<SymbolFile *> &, TypeList&):
>
> * After the change in #6, we can propagate this change upwards for
> greater benefit. The first parameter in
> Module::FindTypes(SymbolContext&, ...) now becomes unnecessary (and in
> fact, it was kind of unnecessary to begin with since in every case, the
> SymbolContext actually just had a single member set, which was equal to
> the this pointer of the Module from which this function was called). So
> I propose deleting this parameter and leaving the rest of the function
> the same.
>
>
> I think all of these changes combined will make life significantlly
> easier for new SymbolFile implementors, delete untested code paths, and
> overall be easier to understand and work with.
>
>
> Thoughts?
>
I like all of the proposed interface changes.
As for the names, I wonder if we really need to put the argument type in
the function name when that is already obvious from the type itself (so
I'd think ParseTypes(CompileUnit&) is sufficient and
ParseTypesForCompileUnit(CompileUnit&) is verbose).
pl
More information about the lldb-dev
mailing list