[cfe-dev] [clang-tools-extra] A Clang extra tool to find missing includes

Luke Zarko lukezarko at gmail.com
Mon Oct 14 15:44:29 PDT 2013


Thanks for taking the time to look at this!

The ctags database seemed like a good way to get a simple backend up and
running. I agree that it is not an ideal tool for C++. Of the various
efforts going on to build this sort of database, is there one that is
generally preferred by the community? For example, there exists cldoc <
http://jessevdk.github.io/cldoc/>, which claims to serialize to an
XML-based representation. Maybe going from this representation to something
like JSON (since there's already parsing code for this checked in to
support compilation databases) would not be too bad.

I thought I'd normalized the function names to the surrounding Clang code,
but it looks like I was following an out-of-date standard. (Maybe someone
someday will work out a good way of getting clang-format to understand the
morphology and case rules for identifiers.)

As to whether this should be part of libclang or a separate tool, I will
need to learn more about the former before I can think of a good answer! I
regret that I have significantly less time to work on this now than when I
was building it as a summer project, so any large refactoring or split work
may be blocked for a while, but I am still happy to receive comments.

Thank you again for your time.

  Luke

On Fri, Oct 11, 2013 at 5:52 PM, Manuel Klimek <klimek at google.com> wrote:

> On Fri, Oct 11, 2013 at 10:00 AM, Manuel Klimek <klimek at google.com> wrote:
>
>> Hi Luke,
>>
>> I think in general this makes sense. One problem is that it's a really
>> large change, so be prepared for it to require a longer time for the review
>> - if there's any way to split it up into "manageable" chunks, that provide
>> incremental functionality (note that splitting up vertically will not help
>> at all), that would be super useful...
>>
>> One concern I have is the ctags database. I haven't seen useful ctags
>> tools for C++ yet, and I'd vote for rather implementing something simple
>> that works well for C++ on our own than importing a ctags parser. That
>> said, I've not yet looked at the code in detail, as the change is large
>> enough that it doesn't load in phabricator for me (I filed an upstream bug
>> ;)
>>
>
> A few more high level comments:
> - method names in clang should be camelCase() (
> http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
> )
> - dumping a "public domain" implementation of a ctags parser will not fly
> (according to Chandler) - if we want to support ctags, we'll probably
> reimplement it to make sure it's provided with a LLVM compatible license
> - why do we want a tool instead of providing an extension point in
> libclang to hook this directly into the parser's recovery?
>
> Cheers,
> /Manuel
>
>
>
>>
>> Cheers,
>> /Manuel
>>
>>
>>
>> On Tue, Sep 10, 2013 at 11:49 AM, Luke Zarko <lukezarko at gmail.com> wrote:
>>
>>> Crossposting from cfe-commits, as suggested.
>>>
>>> The code is up at http://llvm-reviews.chandlerc.com/D1538 .
>>>
>>> This is a tool that uses an external database, like a ctags file, to
>>> suggest missing includes to fix a broken translation unit. It can produce
>>> diagnostics like:
>>>
>>>   TESTFILE:6:9: note: Inferring that M names a namespace.
>>>   using ::M::X;
>>>         ^
>>>   TESTFILE:6:12: note: For M::X, include m_x_empty_class.h.
>>>   using ::M::X;
>>>              ^
>>>
>>> or:
>>>
>>>   TESTFILE:6:1: note: For M::Q::Y, include m_q_y_empty_class.h.
>>>   Y a;
>>>   ^
>>>   TESTFILE:6:1: note: Alternately, for N::Q::Y, include
>>> n_q_y_empty_class.h.
>>>
>>> It works by looping for a bounded number of times. On each iteration,
>>> the broken TU is parsed and passed through semantic analysis. When typos
>>> are encountered or when an incomplete type is detected where a complete
>>> type is required, the database is consulted to try and find a header with
>>> an appropriate declaration or definition. The header is inserted into the
>>> TU (currently using a fixed formatting rule).
>>>
>>> The primary components of the tool and their roles are:
>>>
>>> SemaSearchPlugin: An interface to a semantic database providing
>>> name-to-definition-file queries and limited name-to-kind (here, namespace)
>>> queries. It's useful to know if some fully-qualified name points to a
>>> namespace because it allows the tool to 'fake out' missing namespace
>>> declarations without confusing them with static members (Foo::Bar vs
>>> ClassFoo::InnerClass::Bar). The tool does this when a typo correction hook
>>> isn't able to find a name during the Sema::LookupNestedNameSpecifierName
>>> LookupKind.
>>>
>>> CtagsSemaSearchPlugin: A specialization for SemaSearchPlugin that works
>>> over Exuberant Ctags files. (Parsing these files is handled by the readtags
>>> library.)
>>>
>>> FileTracker: Keeps track of data associated with a particular TU being
>>> fixed across tool iterations. Holds a MemoryBuffer that contains the most
>>> recent version of the file that was rewritten (if rewriting occurred).
>>>
>>> FyiAction: Hooks MaybeDiagnoseMissingCompleteType and CorrectTypo calls
>>> from Sema.
>>>
>>> In the former case, since a complete type has a fully qualified name, a
>>> simple query can be made on the SemaSearchPlugin (eg, if we see that
>>> Foo::Bar is incomplete in a context where a complete type is necessary, we
>>> can query for that fully-qualified name). The database may contain multiple
>>> entries for a definition site; ranking these is the plugin's duty. The
>>> highest-ranked suggestion is used to add an include, while other
>>> suggestions may be displayed as diagnostic notes. (The ctags plugin simply
>>> returns an unordered vector.)
>>>
>>> In the latter case, the mapping between (un)qualified name and
>>> fully-qualified name is ambiguous. The tool searches DeclContexts from
>>> innermost to outermost, appending the typo's NestedNameSpecifier as
>>> appropriate (and taking care to first expand namespace aliases). If it
>>> fails a lookup in the database, it will then (as a heuristic) look in the
>>> namespaces of available 'using' declarations. Finally, it will try simple
>>> unqualified lookup.
>>>
>>> FyiDiagnosticHooks: Registers custom diagnostics, acts as a diagnostic
>>> sink to collect diagnostics (so that intermediate steps in the outer tool
>>> loop don't litter the terminal), and flags some warnings and notes as
>>> errors to force the tool to spin around again--for example, if the compiler
>>> generates a diag::warn_delete_incomplete, the tool should treat this as
>>> seriously as any other incomplete-type error.
>>>
>>> FyiPreprocessorHooks: Listens for file-entered and file-left events,
>>> watches for inclusion directives, and keeps track of insertion positions in
>>> target TUs. The rewrite and header-equality mechanisms currently assume a
>>> particular programming style (system first, then
>>> qualified-from-project-root) and that the missing headers will be quoted.
>>>
>>> FyiActionFactory: Manages the map from filenames to FileTrackers and the
>>> global iteration count. FyiActionFactory can save out changes to files
>>> (optionally with new suffixes or after moving the old file to a backup).
>>>
>>> FindYourIncludes: Contains the tool's entry point.
>>>
>>> More information on the various components is available in Doxygen
>>> comments throughout the source code.
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20131014/56c6772f/attachment.html>


More information about the cfe-dev mailing list