[cfe-dev] source code database

Sean Silva silvas at purdue.edu
Tue Feb 28 21:05:34 PST 2012


This is right up your alley:
http://llvm.org/devmtg/2011-11/#talk2
video:
http://llvm.org/devmtg/2011-11/videos/Carruth_ClangMapReduce-desktop.mp4

On Tue, Feb 28, 2012 at 11:37 PM, Nico Weber <thakis at chromium.org> wrote:

> clang should support much of what you ask for.
>
> DXR ( https://wiki.mozilla.org/DXR ) is an existing attempt to use
> clang to build a program database. https://github.com/nico/complete is
> some old hack from me that does the same thing, only worse - but since there's
> a lot less code, maybe it's easier for a first look (relevant file:
> https://github.com/nico/complete/blob/master/server/complete_plugin.cc).
>
> Nico
>
> On Tue, Feb 28, 2012 at 8:29 PM, James K. Lowden
> <jklowden at schemamania.org> wrote:
> > The "open clang projects" page refers to some potential uses of clang
> > for tool-building.  A few of them require metadata from the
> > lexer or parser.
> >
> > I'm interested in creating a framework for searching and reporting on
> > large C++ code trees.  I wonder what work has already been done, and if
> > the information I want is currently available from the clang front
> > end.  I would begin by capturing the token metadata in SQLite, thereby
> > making them accessible to a variety of applications.
> >
> >        Motivation
> >
> > Back when the VAX dinosaur was knee-high to a mammal, I used DEC's
> > Source Code Analyzer (SCA)[1].  To this day, I have never seen or heard
> > of anything as good.  ISTM clang could be used to create something
> > better.
> >
> > What is "as good", and what would be better?
> >
> > SCA let the user:
> >
> > 1.  analyze arbitrary subsets of a source code tree
> > 2.  dynamically restrict the range of queries on that subset
> > 3.  distinguish among read, write, invoke, reference, and dereference
> > 4.  define  "interesting" cases for repeated use, including reports
> >
> >        Current Tools Fail
> >
> > Microsoft's tool lacks all these features.  cscope has some of them,
> > but only for C.  (For example, cscope cannot search for a
> > destructor or anything with a scope operator.)  VS parses C++, but the
> > user cannot search for uses of e.g. operator<<.
> >
> > The free tools I've looked at don't really parse C++.   They
> > parse the nonlanguage "C/C++".  Consequently they cannot hope to
> > answer #3 above; they can't even distinguish between ::B and A::B.
> > They also lack any kind of scripting language, preventing #4 and
> > severely restricting the capability of #2.
> >
> > These problems are all answered by clang+SQL.  Or, might be, if clang
> > is up to the job.
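
A minimal sketch of how the clang+SQL idea might cover points #2 through #4, assuming some clang-based indexer has already filled a table of symbol occurrences. The `usage` table, its columns, the `writes_to_a_b` view, the `writes_in` helper, and the sample rows are all hypothetical; none of this is output that clang produces today.

```python
import sqlite3

# Hypothetical table: one row per occurrence of a symbol.
# The rows inserted below are made-up sample data, not clang output.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE usage (
    qualified_name TEXT NOT NULL,     -- fully qualified, so ::B and A::B differ
    access         TEXT NOT NULL CHECK (access IN
                   ('read', 'write', 'invoke', 'reference', 'dereference')),
    file           TEXT NOT NULL,
    line           INTEGER NOT NULL
);
-- An "interesting" case saved for repeated use (point #4).
CREATE VIEW writes_to_a_b AS
    SELECT file, line FROM usage
    WHERE qualified_name = 'A::B' AND access = 'write';
""")
conn.executemany(
    "INSERT INTO usage VALUES (?, ?, ?, ?)",
    [
        ("::B", "read", "main.cc", 10),
        ("A::B", "write", "a.cc", 42),
        ("A::B", "read", "a.cc", 57),
        ("A::B", "write", "lib/b.cc", 7),
    ],
)

def writes_in(conn, path_prefix):
    """Restrict the saved query to part of the source tree (point #2)."""
    rows = conn.execute(
        "SELECT file, line FROM writes_to_a_b "
        "WHERE file LIKE ? || '%' ORDER BY file, line",
        (path_prefix,),
    )
    return rows.fetchall()

all_writes = writes_in(conn, "")      # whole tree
lib_writes = writes_in(conn, "lib/")  # only files under lib/
```

Storing the fully qualified name keeps `::B` and `A::B` apart, and the `access` column carries the read/write/invoke distinction of point #3. The view plays the role of a saved "interesting" query that can be narrowed per run.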
> >
> >        Required Metadata
> >
> > I'm sure the following is incomplete and that it is more
> > comprehensive than what is available from any existing tool at any
> > price.  Is it covered by clang at present?
> >
> > [spec]
> >
> > For any token
> >
> > 1.  namespace
> > 2.  enclosing class/struct
> > 3.  const, static
> > 4.  linkage
> > 5.  public, protected, or private (or none)
> > 6.  declare, define, or use
> > 7.  translation unit (file) and line number
> >
> > It should be possible to say in which lines of a file a given token
> > is visible.
> >
> > For types
> >
> > 1.  class, struct, or enum
> > 2.  derived from
> > 3.  derived how (public/protected/private)
> >
> > For typedefs, the above must be available for all components of the
> > lhs.
> >
> > For variables
> >
> > 1.  read, write, invoke, reference, and dereference
> >    (A variable may be invoked if it holds a pointer to a function.)
> > 2.  type: class, struct, typedef, or builtin
> > 3.  const, static, or automatic
> > 4.  (overrides can be derived)
> > 5.  for uses, discarded Koenig lookups
> >
> > For functions
> >
> > 1.  for each parameter and return type, cf. "for variables", above
> > 2.  invoke or reference
> > 3.  (overrides can be derived)
> > 4.  for invocations, discarded Koenig lookups
> >
> > For operators
> >
> > 1.  declare, define, reference, or invoke
> > 2.  friendship (1 : many)
> > 3.  for invocations, discarded Koenig lookups
> >
> > For the preprocessor
> >
> > 1.  define or use
> > 2.  scope
> > 3.  post-processing interpretation, as above
> >
> > [ceps]
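
One way the "for any token" and "for types" parts of the list above could map onto SQLite tables, as a rough sketch only. Every table and column name here (`symbol`, `occurrence`, `inheritance`, and so on) is an assumption made for illustration, the sample rows are invented, and the variable, function, operator, and preprocessor sections are left out.

```python
import sqlite3

# Hypothetical schema; table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE symbol (
    id              INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    kind            TEXT NOT NULL,      -- class, struct, enum, variable, ...
    namespace       TEXT,               -- NULL means the global namespace
    enclosing_class TEXT,
    is_const        INTEGER NOT NULL DEFAULT 0,
    is_static       INTEGER NOT NULL DEFAULT 0,
    linkage         TEXT,
    access          TEXT CHECK (access IN ('public', 'protected', 'private'))
);
CREATE TABLE occurrence (
    symbol_id INTEGER NOT NULL REFERENCES symbol(id),
    role      TEXT NOT NULL CHECK (role IN ('declare', 'define', 'use')),
    file      TEXT NOT NULL,
    line      INTEGER NOT NULL
);
CREATE TABLE inheritance (
    derived_id INTEGER NOT NULL REFERENCES symbol(id),
    base_id    INTEGER NOT NULL REFERENCES symbol(id),
    how        TEXT NOT NULL CHECK (how IN ('public', 'protected', 'private'))
);
"""

def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def derived_from(conn, base_name):
    """Return (derived name, how) pairs for types deriving from base_name."""
    rows = conn.execute(
        "SELECT d.name, i.how FROM inheritance AS i "
        "JOIN symbol AS d ON d.id = i.derived_id "
        "JOIN symbol AS b ON b.id = i.base_id "
        "WHERE b.name = ? ORDER BY d.name",
        (base_name,),
    )
    return rows.fetchall()

# Made-up sample rows standing in for what an indexer would record.
conn = open_db()
conn.executemany(
    "INSERT INTO symbol (id, name, kind, namespace) VALUES (?, ?, ?, ?)",
    [(1, "Base", "class", "app"), (2, "Derived", "struct", "app")],
)
conn.execute("INSERT INTO inheritance VALUES (2, 1, 'public')")
conn.execute("INSERT INTO occurrence VALUES (2, 'define', 'derived.h', 12)")
result = derived_from(conn, "Base")
```

A NULL `access` stands for the "(or none)" case in item 5, since the CHECK constraint only rejects non-NULL values outside the three keywords.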
> >
> > As I said, I would like to know if the above information is accessible
> > from the clang "kit" and what, if anything, has been undertaken in this
> > vicinity heretofore.  If clang can provide the information, the project
> > I have in mind -- of writing a tool to collect it and keep it in a
> > database -- is both useful and feasible.
> >
> > It's a big question, I know.  You can appreciate I'd want to know the
> > feasibility first, before diving in.
> >
> > Thank you for your time.
> >
> > --jkl
> >
> > [1]
> > http://deathrow.vistech.net/HyperReader/docs/progtool/decst124a/scasys.bkb?Chunk=48&Referer=htt&Title=ALL-IN-1%20Anv%E4ndarhandbok
> >
> > P.S.  Prior to posting, I tried to read the mailing list archives.  I
> > must not be the first to notice they're almost impossible to read
> > because the text doesn't wrap in the browser.
> > _______________________________________________
> > cfe-dev mailing list
> > cfe-dev at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev