[cfe-dev] source code database

Nico Weber thakis at chromium.org
Tue Feb 28 20:37:21 PST 2012


clang should support much of what you ask for.

DXR ( https://wiki.mozilla.org/DXR ) is an existing attempt to use
clang to build a program database. https://github.com/nico/complete is
an old hack of mine that does the same thing, only worse, but since
there's a lot less code, it may be easier for a first look (relevant file:
https://github.com/nico/complete/blob/master/server/complete_plugin.cc).

Nico

On Tue, Feb 28, 2012 at 8:29 PM, James K. Lowden
<jklowden at schemamania.org> wrote:
> The "open clang projects" page refers to some potential uses of clang
> for tool-building.  A few of them require metadata from the
> lexer or parser.
>
> I'm interested in creating a framework for searching and reporting on
> large C++ code trees.  I wonder what work has already been done, and if
> the information I want is currently available from the clang front
> end.  I would begin by capturing the token metadata in SQLite, thereby
> making them accessible to a variety of applications.
>
>        Motivation
>
> Back when the VAX dinosaur was knee-high to a mammal, I used DEC's
> Source Code Analyzer (SCA)[1].  To this day, I have never seen or heard
> of anything as good.  ISTM clang could be used to create something
> better.
>
> What is "as good", and what would be better?
>
> SCA let the user:
>
> 1.  analyze arbitrary subsets of a source code tree
> 2.  dynamically restrict the range of queries on that subset
> 3.  distinguish among read, write, invoke, reference, and dereference
> 4.  define "interesting" cases for repeated use, including reports
>
>        Current Tools Fail
>
> Microsoft's tool lacks all these features.  cscope has some of them,
> but only for C.  (For example, cscope cannot search for a
> destructor or anything with a scope operator.)  VS parses C++, but the
> user cannot search for uses of e.g. operator<<.
>
> The free tools I've looked at don't really parse C++.  They
> parse the nonlanguage "C/C++".  Consequently they cannot hope to
> answer #3 above; they can't even distinguish between ::B and A::B.
> They also lack any kind of scripting language, preventing #4 and
> severely restricting the capability of #2.
>
> These problems are all answered by clang+SQL.  Or, might be, if clang
> is up to the job.
>
>        Required Metadata
>
> I'm sure the following is incomplete and that it is more
> comprehensive than what is available from any existing tool at any
> price.  Is it covered by clang at present?
>
> [spec]
>
> For any token
>
> 1.  namespace
> 2.  enclosing class/struct
> 3.  const, static
> 4.  linkage
> 5.  public, protected, or private (or none)
> 6.  declare, define, or use
> 7.  translation unit (file) and line number
>
> It should be possible to say in which lines of a file a given token
> is visible.
>
> For types
>
> 1.  class, struct, or enum
> 2.  derived from
> 3.  derived how (public/protected/private)
>
> For typedefs, the above must be available for all components of the
> lhs.
>
> For variables
>
> 1.  read, write, invoke, reference, and dereference
>    (A variable may be invoked if it holds a pointer to a function.)
> 2.  type: class, struct, typedef, or builtin
> 3.  const, static, or automatic
> 4.  (overrides can be derived)
> 5.  for uses, discarded Koenig lookups
>
> For functions
>
> 1.  for each parameter and return type, cf. "for variables", above
> 2.  invoke or reference
> 3.  (overrides can be derived)
> 4.  for invocations, discarded Koenig lookups
>
> For operators
>
> 1.  declare, define, reference, or invoke
> 2.  friendship (1 : many)
> 3.  for invocations, discarded Koenig lookups
>
> For the preprocessor
>
> 1.  define or use
> 2.  scope
> 3.  post-processing interpretation, as above
>
> [ceps]
>
> As I said, I would like to know if the above information is accessible
> from the clang "kit" and what, if anything, has been undertaken in this
> vicinity heretofore.  If clang can provide the information, the project
> I have in mind -- of writing a tool to collect it and keep it in a
> database -- is both useful and feasible.
>
> It's a big question, I know.  You can appreciate I'd want to know the
> feasibility first, before diving in.
>
> Thank you for your time.
>
> --jkl
>
> [1]
>  http://deathrow.vistech.net/HyperReader/docs/progtool/decst124a/scasys.bkb?Chunk=48&Referer=htt&Title=ALL-IN-1%20Anv%E4ndarhandbok
>
> P.S.  Prior to posting, I tried to read the mailing list archives.  I
> must not be the first to notice they're almost impossible to read
> because the text doesn't wrap in the browser.
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
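
To make the storage side of the quoted [spec] a bit more concrete, here is
a minimal sketch of how the per-token metadata might be laid out in SQLite
and queried. Everything in it is an assumption for illustration: the table
and column names (symbol, occurrence, role, ...) and the helper
find_occurrences are hypothetical, they are not the schema used by DXR or
by complete, and the rows are written by hand rather than extracted by
clang. It only shows that once each occurrence carries a qualified name
and a role, "writes to A::B, but not ::B, under src/" becomes a plain
query.

```python
import sqlite3

# Hypothetical schema: one row per symbol, one row per occurrence of it.
SCHEMA = """
CREATE TABLE symbol (
    id              INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,      -- unqualified spelling, e.g. 'B'
    qualified_name  TEXT NOT NULL,      -- e.g. '::B' or 'A::B'
    kind            TEXT NOT NULL,      -- variable, function, class, ...
    access          TEXT,               -- public/protected/private, NULL if none
    linkage         TEXT,
    is_const        INTEGER NOT NULL DEFAULT 0,
    is_static       INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE occurrence (
    symbol_id INTEGER NOT NULL REFERENCES symbol(id),
    file      TEXT NOT NULL,
    line      INTEGER NOT NULL,
    role      TEXT NOT NULL CHECK (role IN
        ('declare', 'define', 'read', 'write',
         'invoke', 'reference', 'dereference'))
);
"""


def build_demo_db():
    """Create an in-memory database with a few hand-written rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany(
        "INSERT INTO symbol (id, name, qualified_name, kind, access, linkage)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        [(1, "B", "::B", "variable", None, "external"),
         (2, "B", "A::B", "variable", "private", "external")],
    )
    db.executemany(
        "INSERT INTO occurrence VALUES (?, ?, ?, ?)",
        [(1, "src/main.cc", 10, "define"),
         (1, "src/main.cc", 42, "write"),
         (2, "src/a.h", 7, "declare"),
         (2, "src/a.cc", 15, "write"),
         (2, "src/a.cc", 31, "read"),
         (2, "test/a_test.cc", 12, "write")],
    )
    return db


def find_occurrences(db, qualified_name, role, path_prefix=""):
    """Return (file, line) pairs for one symbol and role.

    path_prefix restricts the search to part of the tree; the empty
    string matches every file.
    """
    rows = db.execute(
        "SELECT o.file, o.line FROM occurrence o"
        " JOIN symbol s ON s.id = o.symbol_id"
        " WHERE s.qualified_name = ? AND o.role = ?"
        "   AND substr(o.file, 1, length(?)) = ?"
        " ORDER BY o.file, o.line",
        (qualified_name, role, path_prefix, path_prefix),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(find_occurrences(db, "A::B", "write"))          # whole tree
    print(find_occurrences(db, "A::B", "write", "src/"))  # src/ only
    print(find_occurrences(db, "::B", "write"))           # the other B
```

Keeping the role on the occurrence row rather than on the symbol is what
lets a single symbol be declared in one place, written in another, and
read in a third. The harder part, deciding which role each use has, would
still have to come from the clang AST.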



