[lldb-dev] RFC: Simplifying SymbolFile interface

Zachary Turner via lldb-dev lldb-dev at lists.llvm.org
Wed Jan 9 11:59:10 PST 2019


The Native PDB symbol file plugin I think is mostly complete.  It's at
least almost as good as the old Windows-only PDB plugin in 90% of ways,
while actually being significantly better in other ways (for example, there
was a test that took over 2 minutes to run with the Windows-only PDB
plugin, which now takes about 2 seconds to run with the native PDB plugin.

While implementing this, I ran into several things that made my life quite
difficult, and only later found out that I could have saved myself a lot of
headache and time if the SymbolFile interface had been a little simpler and
easier to understand.

Specifically, I'd like to remove the heavy use of SymbolContext in the
SymbolFile / SymbolVendor interface and replace it with more narrow and
targeted parameter lists.

Consider the case of someone calling FindTypes.  In theory, today they can
fill out any combination of Target, Module, Function, Block, CompileUnit,
LineEntry, and Symbol.  That's 2^7 different possible ways the function can
be called.  While obviously not all of these combinations make sense, the
fact is that it greatly increases the API surface, which is bad for test
coverage, bad for ease of understanding, bad for usability, and leads to a
lot of dead code.

For a person implementing this function for the first time, and who may not
know all the details about how the rest of LLDB works, this is quite
daunting because there's an inherent desire to implement the function
faithfully "just in case", since they don't know all of the different ways
the function might be called.

This results in wasted time on the developer's part, because they end up
implementing a bunch of functionality that is essentially dead code.

We can certainly document for every single function "The implementor should
be prepared to handle the case of fields X, Y, and Z being set, and handle
it in such and such way", but I think it's easier to just change the
interface to be more clear in the first place.


Here are the cases I identified, and a proposal for how I could change the
interface.

1) SymbolFile::ParseTypes(SymbolContext&)
  * In the entire codebase, this is only called with a CompileUnit set.  We
should change this to be ParseTypesForCompileUnit(CompileUnit&) so that the
interface is self-documenting.  A patch with this change is here [
https://reviews.llvm.org/D56462]

2) SymbolFile::ParseDeclsForContext(CompilerDeclContext)
  * This is intended to only be used for parsing variables in a block.  But
should it be recursive?  It's impossible to tell from the function name, so
callers can't use it correctly and implementors can't implement it
correctly.  I spent 4 days trying to implement a generic version of this
function for the NativePDB plugin only to find out that I only actually
cared about block variables.  I would propose changing this to
ParseVariableDeclsForBlock(Block&).

3) These functions:
 * ParseCompileUnitLanguage(SymbolContext&)
 * ParseCompileUnitFunctions(SymbolContext&)
 * ParseCompileUnitLineTable(SymbolContext&)
 * ParseCompileUnitDebugMacros(SymbolContext&)
 * ParseCompileUnitSupportFiles(SymbolContext&)

are only for CompileUnits (as the name implies.  I propose changing the
parameter from a SymbolContext& to a CompileUnit&.

4) SymbolFile::ParseFunctionBlocks(SymbolContext&)
 * This is intended to be used when the SymbolContexts m_function member is
set.  I propose changing this to SymbolFile::ParseFunctionBlocks(Function&).

5) SymbolFile::ParseVariablesForContext(CompilerDeclContext)
* This function is only called with the the Context being a CompileUnit,
Function, and Block.  But does it need to be recursive?  For a Function and
Block it seems to be assumed to be recursive, and for a CompileUnit it
seems to be assumed to not be recursive.  For the former case, it's not
actually clear how this function differs from ParseGlobalVariables, and for
the latter case I would propose changing this to
ParseImmedateVariablesForBlock(Block&).

6) SymbolFile::FindTypes(SymbolContext&).
* This function is only called with the m_module field set, and since a
SymbolFile is already tied to a module anyway, the parameter appears
unnecessary.  I propose changing this to SymbolFile::FindAllTypes()

7) SymbolFile::FindNamespace(SymbolContext&, ConstString, DeclContext*) is
only called with default-constructed (i.e. null) SymbolContexts, making the
first parameter unnecessary.  I propose changing this to
FindNamespace(ConstString, DeclContext*)


8)   Module::FindTypes(SymbolContext &, ConstString, bool , size_t ,
DenseSet<SymbolFile *> &, TypeList&):

* After the change in #6, we can propagate this change upwards for greater
benefit.  The first parameter in Module::FindTypes(SymbolContext&, ...) now
becomes unnecessary (and in fact, it was kind of unnecessary to begin with
since in every case, the SymbolContext actually just had a single member
set, which was equal to the this pointer of the Module from which this
function was called).  So I propose deleting this parameter and leaving the
rest of the function the same.


I think all of these changes combined will make life significantlly easier
for new SymbolFile implementors, delete untested code paths, and overall be
easier to understand and work with.


Thoughts?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20190109/a7ad6a0b/attachment-0001.html>


More information about the lldb-dev mailing list