[cfe-dev] RFC: Fuzzy parser for highlighting C++
Johannes Kapfhammer
kapf at student.ethz.ch
Wed Jul 30 10:28:52 PDT 2014
Hi all,
I am working on a Google Summer of Code project to use the clang lexer for
syntax highlighting. The intended usage is to highlight C++ for LaTeX
(papers, presentations), HTML (documentation, wikis) and other formats.
My goal is to provide a better alternative to Pygments (which highlights C++
on the llvm.org docs) or GNU Source-highlight. These tools can identify
keywords perfectly well, but aren't able to highlight types and functions.
To highlight such source snippets correctly, I wrote a fuzzy parser library on
top of the clang lexer. The clang parser cannot be used for this because
snippets need not be self-contained: they may use types or functions whose
definitions aren't included.
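As a rough illustration of the kind of heuristic a fuzzy parser can apply
without seeing any declarations, here is a toy sketch. It is not the LibFuzzy
code and does not use the clang lexer; the names lex, classify, Token, Kind
and Role are made up for this example, and the two rules are assumptions
chosen for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical categories for this sketch; not taken from clang or LibFuzzy.
enum class Kind { Identifier, Punct };
enum class Role { Type, Function, Variable, Other };

struct Token {
  std::string text;
  Kind kind;
  Role role;
};

// Toy stand-in for a real lexer: identifiers are runs of letters, digits and
// underscores that start with a letter or underscore; every other
// non-whitespace character becomes a one-character punctuation token.
std::vector<Token> lex(const std::string &src) {
  std::vector<Token> toks;
  std::size_t i = 0;
  while (i < src.size()) {
    const unsigned char c = static_cast<unsigned char>(src[i]);
    if (std::isspace(c)) {
      ++i;
    } else if (std::isalpha(c) || c == '_') {
      const std::size_t start = i;
      while (i < src.size() &&
             (std::isalnum(static_cast<unsigned char>(src[i])) ||
              src[i] == '_'))
        ++i;
      toks.push_back(Token{src.substr(start, i - start), Kind::Identifier,
                           Role::Other});
    } else {
      toks.push_back(Token{std::string(1, src[i]), Kind::Punct, Role::Other});
      ++i;
    }
  }
  return toks;
}

// Guess a role for each identifier from the token that follows it:
//   identifier followed by another identifier -> probably a type
//   identifier followed by '('                -> probably a function
//   anything else                             -> treated as a variable
// Punctuation tokens keep Role::Other.
void classify(std::vector<Token> &toks) {
  for (std::size_t i = 0; i < toks.size(); ++i) {
    if (toks[i].kind != Kind::Identifier)
      continue;
    const bool hasNext = i + 1 < toks.size();
    if (hasNext && toks[i + 1].kind == Kind::Identifier)
      toks[i].role = Role::Type;
    else if (hasNext && toks[i + 1].text == "(")
      toks[i].role = Role::Function;
    else
      toks[i].role = Role::Variable;
  }
}
```

For the snippet "Matrix m = identity(n);" these rules mark Matrix as a type,
identity as a function, and m and n as variables, even though neither Matrix
nor identity is declared anywhere. A keyword such as return in "return x;"
would be misclassified by this toy version; a real lexer already tells
keywords apart from identifiers.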
The fuzzy parser doesn't understand all language constructs of C++, but enough
to produce reasonably good highlighting. A sample output produced with
LaTeX can be found on GitHub [1] (136 KB). There's also more documentation
about clang-highlight [2] and the fuzzy parser [3].
I submitted my work for review on Phabricator [4] to get it into
clang/tools/extra.
The fuzzy parser is a general library that may have other potential uses
besides clang-highlight. clang-format internally has a similar fuzzy parser
that is currently more complete, but it is not written in a reusable way.
Another possible use would be autocompletion in editors.
Any opinions or suggestions about this project?
Best,
Johannes
[1] https://github.com/kapf/clang-highlight/blob/master/latex/fuzzyparser.pdf?raw=true
[2] https://github.com/kapf/clang-highlight/blob/master/docs/clang-highlight.rst
[3] https://github.com/kapf/clang-highlight/blob/master/docs/LibFuzzy.rst
[4] http://reviews.llvm.org/D4725