[cfe-dev] New clangIndex library

Chris Lattner clattner at apple.com
Thu Jul 9 10:21:37 PDT 2009


On Jul 9, 2009, at 9:52 AM, Argyrios Kyrtzidis wrote:
> Hi all,
>
> If you are following the cfe-commits mailing list, you may have
> noticed that a new 'Index' library (referred to as 'clangIndex' from
> now on) recently "landed" in the clang repository.
> I'd like to give this new library a proper introduction.

This is a great introduction, Argiris. Can you please add it to the
web site somewhere?

-Chris

>
> ClangIndex is meant to provide the basic infrastructure for
> cross-translation-unit analysis and is primarily focused on
> indexing-related functionality.
> It provides an API for clients that need to accurately map the AST  
> nodes of the ASTContext to the locations in the source files.
> It also allows them to analyze information across multiple  
> translation units.
>
> As a "general rule", ASTContexts are considered the primary source  
> of information that a client wants about a translation unit.
> As a consequence, there will be no such class as an "indexing  
> database" that stores, for example, source locations of identifiers  
> separately from ASTContext.
> All the information that a client needs from a translation unit will  
> be extracted from the ASTContext.
>
>
> Entity
> ------
> To reason about Decls that are semantically the same but are
> contained in multiple ASTContexts, the 'Entity' class was introduced.
> An Entity is an ASTContext-independent "token" that can be created
> from a Decl (and, in the future, from a type name) in order to
> "resolve" it into a Decl belonging to another ASTContext. Some
> examples to make the concept of Entities clearer:
>
> t1.c:
> void foo(void);
> void bar(void);
>
> t2.c:
> void foo(void) {
> }
>
> Translation unit 't1.c' contains 2 Entities 'foo' and 'bar', while  
> 't2.c' contains 1 Entity 'foo'.
> Entities are uniqued in such a way that the Entity* pointer for  
> 't1.c/foo' is the same as the Entity* pointer for 't2.c/foo'.
> An Entity doesn't convey any information about the declaration; it
> is more like an opaque pointer used only to get the associated Decl
> out of an ASTContext so that the actual information for the
> declaration can be accessed.
> Another important aspect of Entities is that they can only be  
> created/associated for declarations that are visible outside the
> translation unit. This means that for:
>
> t3.c:
> static void foo(void);
>
> there can be no Entity (if you ask for the Entity* of the static  
> function 'foo' you'll get a null pointer).
> This is for two reasons:
> 1) To preserve the invariant that the same Entity* pointers refer to
>    the same semantic Decls. In the above example, t1.c/foo and
>    t2.c/foo are the same, while t3.c/foo is different.
> 2) The purpose of Entity is to get the same semantic Decl from
>    multiple ASTContexts. For a Decl that is not visible outside of its
>    own translation unit, you don't need an Entity since it won't
>    appear in another ASTContext.
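The uniquing rules for Entities can be illustrated with a small, self-contained toy model. This is only a sketch: ToyDecl, ToyEntity, ToyEntityTable and Linkage are hypothetical names invented for illustration and are not the clangIndex classes, and keying the table on the declaration name alone is a simplifying assumption that is only reasonable for plain C functions and globals.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration; NOT the clangIndex API.
enum class Linkage { External, Internal };

struct ToyDecl {
  std::string name;
  Linkage linkage;
};

// Opaque token: it carries no information about the declaration itself.
struct ToyEntity {};

class ToyEntityTable {
public:
  // Returns the unique token for an externally visible declaration, or
  // nullptr for a declaration that is private to its translation unit.
  const ToyEntity *get(const ToyDecl &decl) {
    assert(!decl.name.empty() && "a declaration needs a name to be uniqued");
    if (decl.linkage == Linkage::Internal)
      return nullptr;
    // Assumption: the name alone identifies the semantic declaration.
    std::unique_ptr<ToyEntity> &slot = entities_[decl.name];
    if (!slot)
      slot.reset(new ToyEntity());
    return slot.get();
  }

private:
  std::map<std::string, std::unique_ptr<ToyEntity>> entities_;
};
```

With this model, asking for the token of 'foo' as seen from t1.c and from t2.c yields the same pointer, 'bar' yields a different one, and the static 'foo' of t3.c yields a null pointer.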
>
>
> ASTLocation
> -----------
> Encapsulates a "point" in the AST tree of the ASTContext.
> It represents either a Decl*, or a Stmt* along with its immediate  
> Decl* parent.
> For example, clangIndex provides the references of 'foo' in the form
> of ASTLocations "pointing" at the expressions that reference 'foo'.
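The "either a Decl*, or a Stmt* along with its immediate Decl* parent" shape can be sketched as a small value type. Everything below is a hypothetical illustration (NodeDecl, NodeStmt and ToyASTLocation are made-up names, not the clangIndex declarations); the str() format merely imitates the bracketed part of the index-test output.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for clang's Decl and Stmt; illustration only.
struct NodeDecl {
  std::string kind; // e.g. "Var", "Function"
  std::string name;
};

struct NodeStmt {
  std::string kind; // e.g. "DeclRefExpr"
  std::string text;
};

class ToyASTLocation {
public:
  // Points at a declaration itself.
  explicit ToyASTLocation(const NodeDecl *decl)
      : parent_(decl), stmt_(nullptr) {
    assert(decl && "a location always has a declaration");
  }

  // Points at a statement, remembering the declaration that contains it.
  ToyASTLocation(const NodeDecl *parent, const NodeStmt *stmt)
      : parent_(parent), stmt_(stmt) {
    assert(parent && stmt && "a statement location needs both pointers");
  }

  bool isDecl() const { return stmt_ == nullptr; }
  bool isStmt() const { return stmt_ != nullptr; }

  // Renders the location in a form similar to the index-test output.
  std::string str() const {
    std::string out = "[Decl: " + parent_->kind + " " + parent_->name;
    if (stmt_)
      out += " | Stmt: " + stmt_->kind + " " + stmt_->text;
    return out + "]";
  }

private:
  const NodeDecl *parent_;
  const NodeStmt *stmt_; // null when the location is the Decl itself
};
```

Keeping the parent declaration next to the statement means a client can always tell which function or variable a referencing expression belongs to without walking the tree again.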
>
> ResolveLocationInAST
> --------------------
> A function that accepts an ASTContext and a SourceLocation which it  
> resolves into an ASTLocation.
>
> DeclReferenceMap
> ----------------
> Accepts an ASTContext and creates a mapping from NamedDecls to the  
> ASTLocations that reference them (in the same ASTContext).
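The idea of a per-ASTContext mapping from declarations to their reference sites can be sketched with a multimap. This is a toy under stated assumptions: ToyReferenceMap and RefSite are hypothetical names, declarations are identified by plain strings, and the list of uses passed to the constructor stands in for a walk over one AST.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: a reference site recorded as "file:line:column".
using RefSite = std::string;

class ToyReferenceMap {
public:
  // Builds the map from (declaration name, reference site) pairs. In a
  // real indexer these pairs would come from visiting one AST.
  explicit ToyReferenceMap(
      const std::vector<std::pair<std::string, RefSite>> &uses) {
    for (const auto &use : uses) {
      assert(!use.first.empty() && "only named declarations are mapped");
      refs_.emplace(use.first, use.second);
    }
  }

  // Returns every recorded site that references the given declaration,
  // in the order the sites were recorded.
  std::vector<RefSite> refsOf(const std::string &decl) const {
    std::vector<RefSite> out;
    auto range = refs_.equal_range(decl);
    for (auto it = range.first; it != range.second; ++it)
      out.push_back(it->second);
    return out;
  }

private:
  std::multimap<std::string, RefSite> refs_;
};
```

Because the map covers a single translation unit, finding references across several units amounts to building one such map per unit and querying each of them for the same declaration.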
>
> AST files
> ---------
> The precompiled headers implementation of clang
> (http://clang.llvm.org/docs/PCHInternals.html) is ideal for storing an
> ASTContext in a compact form that will be loaded later for AST
> analysis. An "AST file" refers to a translation unit that was
> "compiled" into a precompiled header file.
>
> index-test
> ----------
> A command-line tool that exercises the clangIndex API, useful for  
> testing the clangIndex features.
> As input it accepts multiple AST files (representing multiple  
> translation units) and a few options:
>
>    -point-at  [file:line:column]
> Resolves a [file:line:column] triplet into an ASTLocation from the
> first AST file. If no other option is specified, it prints the
> ASTLocation.
> It also prints a declaration's associated doxygen comment, if one is  
> available (courtesy of Doug).
>
>    -print-refs
> Prints the ASTLocations that reference the declaration that was
> resolved out of the [file:line:column] triplet.
>
>    -print-defs
> Prints the ASTLocations that define the resolved declaration.
>
>    -print-decls
> Prints the ASTLocations that declare the resolved declaration.
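The [file:line:column] triplet taken by -point-at can be split apart as in the following sketch. It is not the parsing code used by index-test; SourcePoint, isShortNumber and parsePoint are hypothetical helpers, and splitting from the right (so that a file name containing ':' still parses) is an assumption about what is convenient, not a documented behaviour of the tool.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type for a parsed "file:line:column" triplet.
struct SourcePoint {
  std::string file;
  unsigned line;
  unsigned column;
};

// True for 1 to 9 decimal digits, which always fits in an unsigned int.
static bool isShortNumber(const std::string &s) {
  if (s.empty() || s.size() > 9)
    return false;
  for (char c : s)
    if (c < '0' || c > '9')
      return false;
  return true;
}

// Splits "file:line:column" from the right. Returns false and leaves
// 'out' untouched when the input is malformed.
bool parsePoint(const std::string &text, SourcePoint &out) {
  std::size_t colSep = text.rfind(':');
  if (colSep == std::string::npos || colSep == 0)
    return false;
  std::size_t lineSep = text.rfind(':', colSep - 1);
  if (lineSep == std::string::npos || lineSep == 0)
    return false; // no second ':' or an empty file name
  std::string lineText = text.substr(lineSep + 1, colSep - lineSep - 1);
  std::string colText = text.substr(colSep + 1);
  if (!isShortNumber(lineText) || !isShortNumber(colText))
    return false;
  out.file = text.substr(0, lineSep);
  out.line = static_cast<unsigned>(std::stoul(lineText));
  out.column = static_cast<unsigned>(std::stoul(colText));
  assert(!out.file.empty());
  return true;
}
```

For the argument used in the example session, parsePoint("t1.c:4:23", p) fills p with the file "t1.c", line 4 and column 23.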
>
>
> Here's an example of using index-test:
>
> Suppose you have three files:
>
> foo.h:
> ---------------------
> extern int global_var;
>
> void foo_func(int param1);
> void bar_func(void);
> -----------------------
>
> t1.c:
> --------------------
> #include "foo.h"
>
> void foo_func(int param1) {
>   int local_var = global_var;
>   for (int for_var = 100; for_var < 500; ++for_var) {
>     local_var = param1 + for_var;
>   }
>   bar_func();
> }
> --------------------
>
> t2.c:
> ----------------------
> #include "foo.h"
>
> int global_var = 10;
>
> void bar_func(void) {
>   global_var += 100;
>   foo_func(global_var);
> }
> -------------------------
>
> You first get AST files out of t1.c and t2.c:
>
> $ clang-cc -emit-pch t1.c -o t1.ast
> $ clang-cc -emit-pch t2.c -o t2.ast
>
> Find the ASTLocation under this position of t1.c:
> .................
> void foo_func(int param1) {
>   int local_var = global_var;
>                       ^
> ...................
>
> $ index-test t1.ast -point-at t1.c:4:23
> [Decl: Var local_var | Stmt: DeclRefExpr global_var] <t1.c:4:19, t1.c:4:19>
>
> Find the declaration:
>
> $ index-test t1.ast -point-at t1.c:4:23 -print-decls
> [Decl: Var global_var] <foo.h:1:12, foo.h:1:12>
>
> Find the references:
>
> $ index-test t1.ast t2.ast -point-at t1.c:4:23 -print-refs
> [Decl: Var local_var | Stmt: DeclRefExpr global_var] <t1.c:4:19, t1.c:4:19>
> [Decl: Function bar_func | Stmt: DeclRefExpr global_var] <t2.c:6:3, t2.c:6:3>
> [Decl: Function bar_func | Stmt: DeclRefExpr global_var] <t2.c:7:12, t2.c:7:12>
>
> Find definitions:
>
> $ index-test t1.ast t2.ast -point-at t1.c:4:23 -print-defs
> [Decl: Var global_var] <t2.c:3:5, t2.c:3:18>
>
>
>
> This concludes the introduction to the clangIndex library. If you
> have any questions or comments, please let me know.
>
> -Argiris
