[cfe-dev] Announcing "clang-ctags"

David Röthlisberger david at rothlis.net
Mon Jul 23 01:04:14 PDT 2012


Announcing "clang-ctags", a libclang-based ctags implementation written
in Python.

Source code: https://github.com/drothlis/clang-ctags

I took care to structure the commits in a tutorial-like fashion, so you
can start from the oldest commit and read forward:
https://github.com/drothlis/clang-ctags/commits/master/clang-ctags
As time permits I'll write up a tutorial presenting the material in a
more structured way.

Currently clang-ctags only supports the Emacs ("etags") format
(mainly because I haven't figured out how to write integration tests
for vim).
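
For readers who haven't looked inside a TAGS file, here is a rough
sketch of the Emacs format as I understand it: each source file gets a
section introduced by a form feed, a "filename,size" header, and one
line per tag. This is not the code from clang-ctags; the function name
and the tuple layout are made up for illustration.

```python
def format_etags_section(filename, tags):
    """Return one TAGS-file section for `filename`.

    `tags` is a list of (definition_text, name, line, byte_offset)
    tuples; this layout is an assumption of the sketch.
    """
    # Each tag line: definition text, DEL, tag name, SOH, "line,offset".
    entries = "".join(
        "%s\x7f%s\x01%d,%d\n" % (text, name, line, offset)
        for text, name, line, offset in tags
    )
    # The header records the size in bytes of the tag data that follows.
    size = len(entries.encode("utf-8"))
    return "\x0c\n%s,%d\n%s" % (filename, size, entries)
```

A TAGS file is then just the concatenation of one such section per
source file.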

WHY:

This seemed like the simplest tool I could write to get acquainted with
libclang, and still be useful.

https://github.com/drothlis/clang-ctags/blob/master/test/why.sh tests
some specific cases that the traditional etags doesn't handle well.

LESSONS LEARNED:

Using this tool is far more complicated than using existing ctags/etags
implementations. To process a source file you need its compilation
command line. There are several ways to obtain this; see:
https://github.com/drothlis/clang-ctags#compilation-command-line
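
As an illustration of one of those ways, here is a minimal sketch (not
taken from clang-ctags itself) of looking up a file's command line in a
compilation database. It assumes the usual compile_commands.json layout:
a list of entries with "directory", "file", and either "command" or
"arguments". The function name is hypothetical.

```python
import os
import shlex


def index_compile_commands(entries):
    """Map each source path to its compiler argument list.

    `entries` is the parsed content of compile_commands.json,
    e.g. json.load(open("compile_commands.json")).
    """
    commands = {}
    for entry in entries:
        # "file" may be relative to "directory"; join() leaves an
        # absolute "file" unchanged.
        source = os.path.normpath(
            os.path.join(entry["directory"], entry["file"]))
        if "arguments" in entry:
            args = list(entry["arguments"])
        else:
            # "command" is a single shell-quoted string.
            args = shlex.split(entry["command"])
        commands[source] = args
    return commands
```

The resulting argument list (minus the compiler name and the source
file itself) is roughly what you would hand to libclang when parsing.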

There are other complications. How do you process header files? You
don't have a compilation command line for headers. The approach I've
taken is to generate tags for a header file encountered while
processing a source file, but only if that header file was also
specified on the clang-ctags command line. This matches the way you
invoke traditional ctags tools, but instead of:
    find . -name '*.[ch]pp' | xargs ctags
you say:
    find . -name '*.[ch]pp' |
    xargs clang-ctags --compile-commands=compile_commands.json

(I also added a "--non-system-headers" flag to generate tags for all
header files encountered that are under the directory where clang-ctags
is invoked.)
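
The header rule can be sketched roughly as follows. This is an
illustration of the policy described above, not the actual clang-ctags
code; the function and parameter names are invented, and
`requested_files` is assumed to be a set of absolute paths built from
the command line.

```python
import os


def should_tag(path, requested_files, non_system_headers=False, root="."):
    """Decide whether to emit tags for a file seen during parsing."""
    path = os.path.abspath(path)
    # Default rule: only files named on the command line get tags.
    if path in requested_files:
        return True
    # Optional rule: also accept any header under the invocation directory.
    if non_system_headers:
        root = os.path.abspath(root)
        return os.path.commonpath([path, root]) == root
    return False
```

With this rule a system header such as /usr/include/stdio.h is skipped
in both modes unless you list it explicitly.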

In general the development was very straightforward, and clearly a tool
to index C++ with this accuracy wouldn't be feasible without clang. But
it still took far longer than I expected, and I'm beginning to
understand why we haven't seen more clang-based tools springing up.
(This is a fault of C++, not of clang! And I expect things will get
easier as more tooling is added, like the new support for compile
command databases.)

IPython (a Python shell with tab-completion) is great for discovering
the libclang API.

Deployment is going to be difficult -- until libclang and its Python
bindings are included in your system's clang packages, you'll have to
build clang from source. (A separate project, the clang_complete plugin
for vim, works around this by shipping a copy of cindex.py, but then you
have to make sure that your system's version of libclang matches
clang_complete's cindex.py.)

PERFORMANCE:

Running clang-ctags over the `lib` directory of the `clang`
source code (480 files totalling 470k lines of code) takes 37 minutes on
my 1.8GHz Intel Core i7. 98% of this time is the parsing done by
libclang itself. By comparison, GNU etags takes 0.5 *seconds* on the
same input.
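
To put those figures in perspective, a quick back-of-the-envelope
calculation using only the numbers quoted above (the variable names are
just for illustration):

```python
files = 480
clang_ctags_seconds = 37 * 60   # 37 minutes
etags_seconds = 0.5

# Average wall-clock time per source file.
per_file = clang_ctags_seconds / files

# Time attributable to libclang's parsing (98% of the total).
parse_seconds = 0.98 * clang_ctags_seconds

# How many times slower clang-ctags is than GNU etags on this input.
slowdown = clang_ctags_seconds / etags_seconds

print("%.1f s per file, %.0f s parsing, %.0fx slower than etags"
      % (per_file, parse_seconds, slowdown))
```

That works out to between four and five seconds per file, nearly all of
it spent inside libclang, and a slowdown of a few thousand times
relative to etags.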

CONCLUSION:

As a replacement for traditional ctags/etags, the disadvantages of
clang-ctags may outweigh the advantages. But it could be useful as
a base to build a more advanced indexing tool. :-)

Cheers
David Rothlisberger.



