[lldb-dev] RFC: Simplifying SymbolFile interface

Pavel Labath via lldb-dev lldb-dev at lists.llvm.org
Thu Jan 10 00:43:35 PST 2019


On 09/01/2019 20:59, Zachary Turner via lldb-dev wrote:
> The Native PDB symbol file plugin I think is mostly complete.  It's at 
> least almost as good as the old Windows-only PDB plugin in 90% of ways, 
> while actually being significantly better in other ways (for example, 
> there was a test that took over 2 minutes to run with the Windows-only 
> PDB plugin, which now takes about 2 seconds to run with the native PDB 
> plugin.
> 
> While implementing this, I ran into several things that made my life 
> quite difficult, and only later found out that I could have saved myself 
> a lot of headache and time if the SymbolFile interface had been a little 
> simpler and easier to understand.
> 
> Specifically, I'd like to remove the heavy use of SymbolContext in the 
> SymbolFile / SymbolVendor interface and replace it with more narrow and 
> targeted parameter lists.
> 
> Consider the case of someone calling FindTypes.  In theory, today they 
> can fill out any combination of Target, Module, Function, Block, 
> CompileUnit, LineEntry, and Symbol.  That's 2^7 different possible ways 
> the function can be called.  While obviously not all of these 
> combinations make sense, the fact is that it greatly increases the API 
> surface, which is bad for test coverage, bad for ease of understanding, 
> bad for usability, and leads to a lot of dead code.
> 
> For a person implementing this function for the first time, and who may 
> not know all the details about how the rest of LLDB works, this is quite 
> daunting because there's an inherent desire to implement the function 
> faithfully "just in case", since they don't know all of the different 
> ways the function might be called.
> 
> This results in wasted time on the developer's part, because they end up 
> implementing a bunch of functionality that is essentially dead code.
> 
> We can certainly document for every single function "The implementor 
> should be prepared to handle the case of fields X, Y, and Z being set, 
> and handle it in such and such way", but I think it's easier to just 
> change the interface to be more clear in the first place.
> 
> 
> Here are the cases I identified, and a proposal for how I could change 
> the interface.
> 
> 1) SymbolFile::ParseTypes(SymbolContext&)
>    * In the entire codebase, this is only called with a CompileUnit 
> set.  We should change this to be ParseTypesForCompileUnit(CompileUnit&) 
> so that the interface is self-documenting.  A patch with this change is 
> here [https://reviews.llvm.org/D56462]
> 
> 2) SymbolFile::ParseDeclsForContext(CompilerDeclContext)
>    * This is intended to only be used for parsing variables in a block.  
> But should it be recursive?  It's impossible to tell from the function 
> name, so callers can't use it correctly and implementors can't implement 
> it correctly.  I spent 4 days trying to implement a generic version of 
> this function for the NativePDB plugin only to find out that I only 
> actually cared about block variables.  I would propose changing this to 
> ParseVariableDeclsForBlock(Block&).
> 
> 3) These functions:
>   * ParseCompileUnitLanguage(SymbolContext&)
>   * ParseCompileUnitFunctions(SymbolContext&)
>   * ParseCompileUnitLineTable(SymbolContext&)
>   * ParseCompileUnitDebugMacros(SymbolContext&)
>   * ParseCompileUnitSupportFiles(SymbolContext&)
> 
> are only for CompileUnits (as the name implies.  I propose changing the 
> parameter from a SymbolContext& to a CompileUnit&.
> 
> 4) SymbolFile::ParseFunctionBlocks(SymbolContext&)
>   * This is intended to be used when the SymbolContexts m_function 
> member is set.  I propose changing this to 
> SymbolFile::ParseFunctionBlocks(Function&).
> 
> 5) SymbolFile::ParseVariablesForContext(CompilerDeclContext)
> * This function is only called with the the Context being a CompileUnit, 
> Function, and Block.  But does it need to be recursive?  For a Function 
> and Block it seems to be assumed to be recursive, and for a CompileUnit 
> it seems to be assumed to not be recursive.  For the former case, it's 
> not actually clear how this function differs from ParseGlobalVariables, 
> and for the latter case I would propose changing this to 
> ParseImmedateVariablesForBlock(Block&).
> 
> 6) SymbolFile::FindTypes(SymbolContext&).
> * This function is only called with the m_module field set, and since a 
> SymbolFile is already tied to a module anyway, the parameter appears 
> unnecessary.  I propose changing this to SymbolFile::FindAllTypes()
> 
> 7) SymbolFile::FindNamespace(SymbolContext&, ConstString, DeclContext*) 
> is only called with default-constructed (i.e. null) SymbolContexts, 
> making the first parameter unnecessary.  I propose changing this to 
> FindNamespace(ConstString, DeclContext*)
> 
> 
> 8)   Module::FindTypes(SymbolContext &, ConstString, bool , size_t , 
> DenseSet<SymbolFile *> &, TypeList&):
> 
> * After the change in #6, we can propagate this change upwards for 
> greater benefit.  The first parameter in 
> Module::FindTypes(SymbolContext&, ...) now becomes unnecessary (and in 
> fact, it was kind of unnecessary to begin with since in every case, the 
> SymbolContext actually just had a single member set, which was equal to 
> the this pointer of the Module from which this function was called).  So 
> I propose deleting this parameter and leaving the rest of the function 
> the same.
> 
> 
> I think all of these changes combined will make life significantlly 
> easier for new SymbolFile implementors, delete untested code paths, and 
> overall be easier to understand and work with.
> 
> 
> Thoughts?
> 

I like all of the proposed interface changes.

As for the names, I wonder if we really need to put the argument type in 
the function name when that is already obvious from the type itself (so 
I'd think ParseTypes(CompileUnit&) is sufficient and 
ParseTypesForCompileUnit(CompileUnit&) is verbose).

pl


More information about the lldb-dev mailing list