[cfe-dev] source code database
Nico Weber
thakis at chromium.org
Tue Feb 28 20:37:21 PST 2012
clang should support much of what you ask for.
DXR ( https://wiki.mozilla.org/DXR ) is an existing attempt to use
clang to build a program database. https://github.com/nico/complete is
an old hack of mine that does the same thing, only worse - but since
there's a lot less code, it may be easier for a first look (relevant file:
https://github.com/nico/complete/blob/master/server/complete_plugin.cc).
Nico
On Tue, Feb 28, 2012 at 8:29 PM, James K. Lowden
<jklowden at schemamania.org> wrote:
> The "open clang projects" page refers to some potential uses of clang
> for tool-building. A few of them require metadata from the
> lexer or parser.
>
> I'm interested in creating a framework for searching and reporting on
> large C++ code trees. I wonder what work has already been done, and if
> the information I want is currently available from the clang front
> end. I would begin by capturing the token metadata in SQLite, thereby
> making them accessible to a variety of applications.
>
> Motivation
>
> Back when the VAX dinosaur was knee-high to a mammal, I used DEC's
> Source Code Analyzer (SCA)[1]. To this day, I have never seen or heard
> of anything as good. ISTM clang could be used to create something
> better.
>
> What is "as good", and what would be better?
>
> SCA let the user:
>
> 1. analyze arbitrary subsets of a source code tree
> 2. dynamically restrict the range of queries on that subset
> 3. distinguish among read, write, invoke, reference, and dereference
> 4. define "interesting" cases for repeated use, including reports
>
> Current Tools Fail
>
> Microsoft's tool lacks all these features. cscope has some of them,
> but only for C. (For example, cscope cannot search for a
> destructor or anything with a scope operator.) VS parses C++, but the
> user cannot search for uses of e.g. operator<<.
>
> The free tools I've looked at don't really parse C++. They
> parse the nonlanguage "C/C++". Consequently they cannot hope to
> answer #3 above; they can't even distinguish between ::B and A::B.
> They also lack any kind of scripting language, preventing #4 and
> severely restricting the capability of #2.
>
> These problems are all answered by clang+SQL. Or, might be, if clang
> is up to the job.
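
A minimal sketch of how SCA features #2 and #4 might map onto SQL,
assuming the metadata has already been extracted somehow. The table
`refs`, the view `writes`, the helper `writes_under`, and all of the rows
are hypothetical and hand-written for illustration; nothing here comes
from clang itself. A view plays the role of a saved "interesting" case,
and a WHERE clause restricts the query to part of the tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE refs (
    symbol TEXT NOT NULL,     -- qualified name, so ::B and A::B stay distinct
    kind   TEXT NOT NULL,     -- read, write, invoke, reference, dereference
    file   TEXT NOT NULL,
    line   INTEGER NOT NULL
);
-- A saved "interesting case" (#4): every write to any symbol.
CREATE VIEW writes AS
    SELECT symbol, file, line FROM refs WHERE kind = 'write';
""")

# Hand-written sample rows standing in for real extractor output.
conn.executemany("INSERT INTO refs VALUES (?, ?, ?, ?)", [
    ("A::B", "read",  "src/a.cc",       10),
    ("A::B", "write", "src/a.cc",       12),
    ("::B",  "write", "src/b.cc",        7),
    ("A::B", "write", "test/a_test.cc",  3),
])

def writes_under(prefix):
    """Restrict the saved case to files below one directory (#2)."""
    cur = conn.execute(
        "SELECT symbol, file, line FROM writes "
        "WHERE file LIKE ? ORDER BY file, line",
        (prefix + "%",),
    )
    return cur.fetchall()

for row in writes_under("src/"):
    print(row)
```

The view can be reused from any client that can open the database,
which is the point of keeping the metadata in SQLite rather than in a
tool-specific format.
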
>
> Required Metadata
>
> I'm sure the following is incomplete, yet it is already more
> comprehensive than what is available from any existing tool at any
> price. Is it covered by clang at present?
>
> [spec]
>
> For any token
>
> 1. namespace
> 2. enclosing class/struct
> 3. const, static
> 4. linkage
> 5. public, protected, or private (or none)
> 6. declare, define, or use
> 7. translation unit (file) and line number
>
> It should be possible to say in which lines of a file a given token
> is visible.
>
> For types
>
> 1. class, struct, or enum
> 2. derived from
> 3. derived how (public/protected/private)
>
> For typedefs, the above must be available for all components of the
> lhs.
>
> For variables
>
> 1. read, write, invoke, reference, and dereference
> (A variable may be invoked if it holds a pointer to a function.)
> 2. type: class, struct, typedef, or builtin
> 3. const, static, or automatic
> 4. (overrides can be derived)
> 5. for uses, discarded Koenig lookups
>
> For functions
>
> 1. for each parameter and return type, cf. "for variables", above
> 2. invoke or reference
> 3. (overrides can be derived)
> 4. for invocations, discarded Koenig lookups
>
> For operators
>
> 1. declare, define, reference, or invoke
> 2. friendship (1 : many)
> 3. for invocations, discarded Koenig lookups
>
> For the preprocessor
>
> 1. define or use
> 2. scope
> 3. post-processing interpretation, as above
>
> [ceps]
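
One way part of the spec above could be laid out as tables, as a rough
sketch only. The tables `symbol`, `occurrence`, and `base`, their
columns, and the helper functions are all assumptions made for
illustration, and the sample rows are hand-written rather than produced
by clang. The sketch covers scope, declare/define/use, file and line,
and inheritance; it leaves out Koenig lookups, friendship, and the
preprocessor.

```python
import sqlite3

SCHEMA = """
CREATE TABLE symbol (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    namespace TEXT NOT NULL DEFAULT '',
    enclosing TEXT,              -- enclosing class/struct, NULL if none
    kind      TEXT NOT NULL,     -- class, struct, enum, variable, function
    access    TEXT,              -- public, protected, private, or NULL
    linkage   TEXT
);
CREATE TABLE occurrence (
    symbol_id INTEGER NOT NULL REFERENCES symbol(id),
    role      TEXT NOT NULL,     -- declare, define, or use
    use_kind  TEXT,              -- read, write, invoke, reference, dereference
    file      TEXT NOT NULL,
    line      INTEGER NOT NULL
);
CREATE TABLE base (
    derived_id INTEGER NOT NULL REFERENCES symbol(id),
    base_id    INTEGER NOT NULL REFERENCES symbol(id),
    access     TEXT NOT NULL     -- derived how: public/protected/private
);
"""

def build_example():
    """Create an in-memory database holding a few hand-written rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO symbol VALUES (?, ?, ?, ?, ?, ?, ?)", [
        (1, "B", "", None, "class", None,     "external"),  # ::B
        (2, "A", "", None, "class", None,     "external"),  # ::A
        (3, "B", "", "A",  "class", "public", "external"),  # A::B
        (4, "D", "", None, "class", None,     "external"),  # ::D
    ])
    db.executemany("INSERT INTO occurrence VALUES (?, ?, ?, ?, ?)", [
        (1, "define", None,        "b.h",     1),
        (3, "define", None,        "a.h",     4),
        (1, "use",    "reference", "main.cc", 9),
        (3, "use",    "reference", "main.cc", 10),
        (3, "use",    "reference", "d.h",     2),
    ])
    # D derives privately from A::B.
    db.execute("INSERT INTO base VALUES (4, 3, 'private')")
    return db

def uses_of(db, name, enclosing):
    """Return (file, line) for each use of `name` in the given enclosing scope."""
    cur = db.execute(
        "SELECT o.file, o.line FROM occurrence o "
        "JOIN symbol s ON s.id = o.symbol_id "
        "WHERE o.role = 'use' AND s.name = ? AND s.enclosing IS ? "
        "ORDER BY o.file, o.line",
        (name, enclosing),
    )
    return cur.fetchall()

def derived_from(db, base_id):
    """Return (derived class name, access) for each class deriving from base_id."""
    cur = db.execute(
        "SELECT s.name, b.access FROM base b "
        "JOIN symbol s ON s.id = b.derived_id "
        "WHERE b.base_id = ? ORDER BY s.name",
        (base_id,),
    )
    return cur.fetchall()

db = build_example()
print(uses_of(db, "B", None))   # uses of ::B only
print(uses_of(db, "B", "A"))    # uses of A::B only
print(derived_from(db, 3))
```

Because the enclosing scope is stored per symbol, a query for ::B does
not pick up uses of A::B, which is the distinction the "C/C++" tools
reportedly cannot make.
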
>
> As I said, I would like to know if the above information is accessible
> from the clang "kit" and what, if anything, has been undertaken in this
> vicinity heretofore. If clang can provide the information, the project
> I have in mind -- of writing a tool to collect it and keep it in a
> database -- is both useful and feasible.
>
> It's a big question, I know. You can appreciate I'd want to know the
> feasibility first, before diving in.
>
> Thank you for your time.
>
> --jkl
>
> [1]
> http://deathrow.vistech.net/HyperReader/docs/progtool/decst124a/scasys.bkb?Chunk=48&Referer=htt&Title=ALL-IN-1%20Anv%E4ndarhandbok
>
> P.S. Prior to posting, I tried to read the mailing list archives. I
> must not be the first to notice they're almost impossible to read
> because the text doesn't wrap in the browser.