[cfe-dev] Announcing Crange
Karen Shaeffer
shaeffer at neuralscape.com
Fri May 9 10:11:22 PDT 2014
On Fri, May 09, 2014 at 05:38:37PM +0530, Anurag wrote:
> Announcing Crange: https://github.com/crange/crange
>
> Summary
> -------
>
> Crange is a tool to index and cross-reference C/C++ source code. It
> can be used to generate a tags database that can help with:
>
> * Identifier definitions
> * Identifier declarations
> * References
> * Expressions
> * Operators
> * Symbols
> * Source range
>
> The source metadata collected by Crange can help with building tools
> to provide cross-referencing, syntax highlighting, code folding and
> deep source code search.
>
>
> Rationale
> ---------
>
> I was looking for tools that can extract and index identifiers present
> in C/C++ source code and can work with large code bases.
>
> Considering the amount of data Clang can generate while traversing
> very large C/C++ projects (such as Linux), I decided against using a
> ctags/etags-style tags database. Crange uses an SQLite-based tags
> database to store identifiers and metadata, and uses SQLite's bulk
> insert capabilities wherever possible.
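
For readers unfamiliar with bulk inserts in SQLite, here is a minimal sketch using Python's standard sqlite3 module. The table name, the columns, and the sample rows are all hypothetical, chosen only for illustration; Crange's actual schema may well differ.

```python
import sqlite3


def bulk_insert_tags(conn, rows):
    """Insert many (name, kind, file, line) rows inside one transaction."""
    # Hypothetical schema, not taken from Crange.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        "name TEXT, kind TEXT, file TEXT, line INTEGER)"
    )
    # 'with conn' commits once after the whole batch (or rolls back on error),
    # which avoids paying for a separate commit per row.
    with conn:
        conn.executemany(
            "INSERT INTO tags (name, kind, file, line) VALUES (?, ?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
# Made-up sample rows; file names and line numbers are placeholders.
bulk_insert_tags(conn, [
    ("device_create", "declaration", "example.h", 1),
    ("device_create", "reference", "example.c", 2),
])
count = conn.execute(
    "SELECT COUNT(*) FROM tags WHERE name = ?", ("device_create",)
).fetchone()[0]
```

Batching rows through executemany inside a single transaction is usually the main win, since each separate commit forces SQLite to sync to disk.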
>
> I've used Python's multiprocessing library to parallelize translation
> unit traversal and metadata extraction from identifiers. It's possible
> to control the number of jobs using the -j command-line option.
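
The job-controlled parallelism described above can be sketched roughly as follows. This is not Crange's code: the regex-based extract function is only a stand-in for real Clang AST traversal, and the function names and the --jobs long option are assumptions made for the example.

```python
import argparse
import re
from multiprocessing import Pool

IDENT = re.compile(r"[A-Za-z_]\w*")


def extract(source_text):
    """Stand-in for per-translation-unit work: sorted unique identifiers."""
    return sorted(set(IDENT.findall(source_text)))


def parse_jobs(argv):
    """Read the number of worker processes from a -j option (default 1)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-j", "--jobs", type=int, default=1)
    return parser.parse_args(argv).jobs


def index_all(sources, jobs):
    """Process each source in a pool of worker processes, in input order."""
    with Pool(processes=jobs) as pool:
        return pool.map(extract, sources)


if __name__ == "__main__":
    # Hypothetical inputs; a real indexer would pass file paths instead.
    units = ["int a = b;", "void f(int x);"]
    for idents in index_all(units, parse_jobs(["-j", "2"])):
        print(idents)
```

Each worker handles a whole translation unit, so the units need to be independent of one another for this pattern to scale with the job count.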
>
>
> Usage example
> -------------
>
> Generating tags database for Linux 3.13.5
>
> $ cd linux-3.13.5
> $ crtags -v -j 32 .
> Parsing fs/xfs/xfs_bmap_btree.c (count: 1)
> Indexing fs/xfs/xfs_bmap_btree.c (nodes: 379, qsize: 0)
> ...
> Parsing sound/soc/codecs/ak4641.h (count: 34348)
> Generating indexes
>
> This would create a new file named tags.db containing all the
> identified tags.
>
> Search all declarations for identifier named device_create
>
> $ crange device_create
>
> Search all references for identifier named device_create
>
> $ crange -r device_create
>
> Not all command-line options are available yet (-b, -k, etc.), as
> the tool is still in development.
>
> Performance
> -----------
>
> Running crtags on the Linux kernel v3.13.5 sources (containing 45K files,
> size 614MB) took a little less than 7 hours (415m10.974s) on a 32-CPU
> Xeon server with 16GB of memory and 32 jobs. The generated tags.db
> file was 22GB in size and contained 60,461,329 unique identifiers.
Hi Anurag,
Last time I looked, the Linux kernel was written in C. And I could use
cscope to create a cross-reference database of the entire Linux kernel
in about 10 minutes, using far fewer resources than you used. Admittedly,
cscope gets confused by a few of the complex macro usages. But cscope is
a very popular tool for Linux kernel development.
I suggest you consider the current state of your work a proof of concept
and think about how to improve your performance by more than an order of
magnitude.
enjoy,
Karen
--
Karen Shaeffer
Neuralscape Services
Be aware: If you see an obstacle in your path,
that obstacle is your path. Zen proverb
>
> Installation
> ------------
>
> $ sudo python setup.py install
> or
> $ sudo pip install crange
>
> Feedback
> --------
>
> I would highly appreciate any feedback on improving this tool and
> making it more useful. Also, clang-ctags and python-clang test suite
> were of great help, thank you guys!
>
> Anurag
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
--- end quoted text ---