[cfe-dev] Announcing Crange

Karen Shaeffer shaeffer at neuralscape.com
Fri May 9 10:11:22 PDT 2014


On Fri, May 09, 2014 at 05:38:37PM +0530, Anurag wrote:
> Announcing Crange: https://github.com/crange/crange
> 
> Summary
> -------
> 
> Crange is a tool to index and cross-reference C/C++ source code. It
> can be used to generate a tags database that can help with:
> 
> * Identifier definitions
> * Identifier declarations
> * References
> * Expressions
> * Operators
> * Symbols
> * Source range
> 
> The source metadata collected by Crange can help with building tools
> to provide cross-referencing, syntax highlighting, code folding and
> deep source code search.
> 
> 
> Rationale
> ---------
> 
> I was looking for tools that can extract and index identifiers present
> in C/C++ source code and can work with large code bases.
> 
> Considering the amount of data Clang can generate while traversing
> very large C/C++ projects (like Linux), I decided against using a
> ctags/etags-style tags database. Crange uses an SQLite-based tags
> database to store identifiers and metadata, and uses SQLite's bulk
> insert capabilities wherever possible.
>
> I've used Python's multiprocessing library to parallelize translation
> unit traversal and metadata extraction from identifiers. It's possible
> to control the number of jobs using the -j command line option.
> 
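The bulk-insert idea described above can be sketched with Python's standard sqlite3 module. This is only an illustration, not Crange's actual code: the `tags` table, its columns, the helper names, and the sample rows are all hypothetical. It assumes the index is built after loading, in line with the "Generating indexes" step shown in the usage output.

```python
import sqlite3


def bulk_insert_tags(conn, rows):
    """Insert many (name, kind, file, line) rows in a single transaction."""
    # Hypothetical schema, chosen only for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        "name TEXT, kind TEXT, file TEXT, line INTEGER)"
    )
    with conn:  # commits once on success, rolls back on error
        conn.executemany(
            "INSERT INTO tags (name, kind, file, line) VALUES (?, ?, ?, ?)",
            rows,
        )


def build_index(conn):
    """Create the lookup index after loading, so inserts skip index upkeep."""
    conn.execute("CREATE INDEX IF NOT EXISTS tags_name ON tags (name)")


def find_tags(conn, name, kind):
    """Return (file, line) pairs for an identifier of the given kind."""
    cur = conn.execute(
        "SELECT file, line FROM tags "
        "WHERE name = ? AND kind = ? ORDER BY file, line",
        (name, kind),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Made-up rows; the file names and line numbers are placeholders.
    sample = [
        ("device_create", "declaration", "a.h", 10),
        ("device_create", "reference", "b.c", 42),
    ]
    bulk_insert_tags(conn, sample)
    build_index(conn)
    print(find_tags(conn, "device_create", "reference"))
```

Batching rows through `executemany` inside one transaction avoids a commit per row, which is usually the dominant cost when loading millions of entries.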
> 
> Usage example
> -------------
> 
> Generating a tags database for Linux 3.13.5
> 
>   $ cd linux-3.13.5
>   $ crtags -v -j 32 .
>   Parsing fs/xfs/xfs_bmap_btree.c (count: 1)
>   Indexing fs/xfs/xfs_bmap_btree.c (nodes: 379, qsize: 0)
>   ...
>   Parsing sound/soc/codecs/ak4641.h (count: 34348)
>   Generating indexes
> 
> This would create a new file named tags.db containing all the
> identified tags.
> 
> Search all declarations for identifier named device_create
> 
>   $ crange device_create
> 
> Search all references for identifier named device_create
> 
>   $ crange -r device_create
> 
> Not all command line options are available though (-b, -k etc.), as
> the tool is still in development.
> 
> Performance
> -----------
> 
> Running crtags on the Linux kernel v3.13.5 sources (containing 45K files,
> size 614MB) took a little less than 7 hours (415m10.974s) on a 32-CPU
> Xeon server with 16GB of memory and 32 jobs. The generated tags.db
> file was 22GB in size and contained 60,461,329 unique identifiers.


Hi Anurag,
Last time I looked, the Linux kernel was written in C. And I could use
cscope to create a cross-reference database of the entire Linux kernel
in about 10 minutes, using far fewer resources than you used. Admittedly,
cscope gets confused on a few of the complex macro usages. But cscope is
a very popular tool for Linux kernel development.

I suggest you consider the current state of your work a proof of concept
and think about how to improve your performance by more than an order of
magnitude.
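
For scale, the figures quoted above can be turned into rough rates. The snippet below is a back-of-the-envelope sketch: the crtags time and identifier count come from the announcement, while the cscope time is only the approximate "about 10 minutes" mentioned here, not a measurement. The variable and function names are made up for the example.

```python
def parse_minutes_seconds(text):
    """Convert a `time`-style string such as '415m10.974s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


crtags_seconds = parse_minutes_seconds("415m10.974s")  # from the announcement
cscope_seconds = 10 * 60       # "about 10 minutes", approximate
identifiers = 60_461_329       # unique identifiers reported for tags.db

hours = crtags_seconds / 3600
ids_per_second = identifiers / crtags_seconds
slowdown = crtags_seconds / cscope_seconds

print(f"{hours:.2f} h, {ids_per_second:.0f} identifiers/s, "
      f"{slowdown:.1f}x the quoted cscope time")
```

With these inputs the run comes to roughly 6.9 hours, or about 40 times the quoted cscope time, which is where the "more than an order of magnitude" target comes from.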

enjoy,
Karen
-- 
Karen Shaeffer                 Be aware: If you see an obstacle in your path,
Neuralscape Services           that obstacle is your path.        Zen proverb

> 
> Installation
> ------------
> 
>   $ sudo python setup.py install
> or
>   $ sudo pip install crange
> 
> Feedback
> --------
> 
> I would highly appreciate any feedback on improving this tool and
> making it more useful. Also, clang-ctags and the python-clang test suite
> were of great help, thank you guys!
> 
> Anurag
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
--- end quoted text ---


