[cfe-dev] RFC: clang-doc proposal

Julie Hockett via cfe-dev cfe-dev at lists.llvm.org
Mon Dec 4 12:21:26 PST 2017

This proposal is to build a new Clang tool for generating C/C++
documentation, similar to other modern documentation generators such as
rustdoc.  This tool would be a modular and extensible documentation
generator for C/C++ code. It would also introduce a more elegant way of
documenting C/C++ code, while maintaining backwards compatibility with
Doxygen-style markup.

Today, Doxygen is the de facto standard for generating C/C++ documentation.
While widely used, the tool itself is a bit cumbersome, its output is both
aesthetically and functionally lacking, and its non-permissive license
combined with an outdated codebase makes any improvements difficult. This new
tool would aim to reduce the overhead of generating documentation by
integrating it into a Clang tool, while allowing existing comments to
continue to be used. It would also allow for relatively easy adaptation to
new language features, as it would be built on the Clang parser and would
use the Clang AST to generate documentation.

Proposed Tool

The proposed tool would consist of two parts. First, it would have a
frontend that consumes the Clang-generated AST and generates an
intermediate representation of the code and documentation structure,
including additional Markdown files. Second, it would have a set of backend
modules that consume that representation and output documentation in a
particular format (e.g. Markdown, HTML/website, etc.).

The frontend would be a new tool that uses the Clang parser, which can
already parse C/C++ documentation comments (using the -Wdocumentation option).
It can be easily used through the LibTooling interface, similarly to other
Clang tools such as clang-check or clang-format. The initial steps in this
project would be to build this tool using Clang's documentation parser.
This tool would be able to attach comments to functions, types, and
macros, and to resolve declaration references, both of which will be useful in
generating effective documentation. Since a good deal of existing C/C++
code uses the Doxygen documentation comment style, which is also supported
by Clang's parser (and Doxygen itself can use Clang to parse these comments),
this is the syntax we are going to support as well. In the future, we would
also like to support Markdown-style comments, akin to Apple Swift Markup.

For implementation, this tool will use the JSON Compilation Database format
to integrate with existing build systems. It would also have subcommands to
choose which parts of the code will be documented (e.g. all code, all
public signatures, all comment-documented signatures). Once the code is
processed, the tool will write out its internal representation of the
documentation in an intermediate format, encapsulating the
necessary information about the code, comments, and structure. This will
allow backend tools to take the output and transform it as necessary.
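
To make the selection step concrete, here is a minimal C++ sketch of how the three example scopes (all code, all public signatures, all comment-documented signatures) could be applied as a filter. The names DocScope, DeclSummary, shouldDocument, and selectDecls are hypothetical and not part of Clang or of this proposal; a real tool would derive this information from the AST rather than from a hand-filled struct.

```cpp
#include <string>
#include <vector>

// Hypothetical selection modes mirroring the three examples in the text.
enum class DocScope { All, PublicOnly, DocumentedOnly };

// Hypothetical, simplified stand-in for what the frontend would know
// about a declaration after walking the AST.
struct DeclSummary {
  std::string Name;
  bool IsPublic;
  bool HasDocComment;
};

// Returns true if the declaration should be documented under Scope.
bool shouldDocument(const DeclSummary &D, DocScope Scope) {
  switch (Scope) {
  case DocScope::All:
    return true;
  case DocScope::PublicOnly:
    return D.IsPublic;
  case DocScope::DocumentedOnly:
    return D.HasDocComment;
  }
  return false; // Not reached for valid enum values.
}

// Keeps only the declarations selected by Scope, preserving input order.
std::vector<DeclSummary> selectDecls(const std::vector<DeclSummary> &Decls,
                                     DocScope Scope) {
  std::vector<DeclSummary> Out;
  for (const DeclSummary &D : Decls)
    if (shouldDocument(D, Scope))
      Out.push_back(D);
  return Out;
}
```

Keeping the policy in one small predicate means a subcommand only has to pick a DocScope value; the traversal code stays the same.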

The backend modules would cover different possible outputs for the defined
intermediate representation. Each module will consume the representation
and output documentation in a specific format. Initially, we propose to
focus on a module that generates Markdown files, in order to make the first
version as simple as possible. Markdown files are automatically rendered on
a number of sites and systems, as well as being clear and uncluttered in
raw text form. It is also relatively easy to convert Markdown files into
other formats, making it a good starting target. An additional module would
target HTML/website output.
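
As an illustration of the backend idea, the following C++ sketch shows one possible shape for a backend module that turns a function record into Markdown. Everything here is an assumption made for the example: the ParamInfo and FunctionInfo structs, the Generator interface, and the Markdown layout are hypothetical and do not describe an existing Clang API or a decided design.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, heavily simplified intermediate representation.
struct ParamInfo {
  std::string Type;
  std::string Name;
  std::string Description;
};

struct FunctionInfo {
  std::string Name;
  std::string ReturnType;
  std::string Brief;
  std::vector<ParamInfo> Params;
};

// Hypothetical backend interface: one implementation per output format.
class Generator {
public:
  virtual ~Generator() = default;
  virtual std::string generate(const FunctionInfo &F) const = 0;
};

// Emits a heading, the signature, the brief, and a parameter list.
class MarkdownGenerator : public Generator {
public:
  std::string generate(const FunctionInfo &F) const override {
    std::ostringstream OS;
    OS << "### " << F.Name << "\n\n";
    OS << "`" << F.ReturnType << " " << F.Name << "(";
    for (std::size_t I = 0; I < F.Params.size(); ++I) {
      if (I != 0)
        OS << ", ";
      OS << F.Params[I].Type << " " << F.Params[I].Name;
    }
    OS << ")`\n\n";
    if (!F.Brief.empty())
      OS << F.Brief << "\n\n";
    if (!F.Params.empty()) {
      OS << "**Parameters**\n\n";
      for (const ParamInfo &P : F.Params)
        OS << "- `" << P.Name << "`: " << P.Description << "\n";
    }
    return OS.str();
  }
};
```

An HTML module would then be a second Generator subclass consuming the same records, which is the separation the two-part design is meant to provide.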

Intermediate Format

The frontend would process the code and comments into an intermediate
representation to be consumed by the backend. Internally, this
representation would be a set of classes and structs. Once the frontend has
finished, it would write this representation to a file. While existing
tools like Doxygen emit XML, XML is somewhat restrictive and bulky. Also,
in order to fully use XML, the tool would need to define the representation
twice (once for the internal classes/structs, once in the XML schema). So,
we are instead considering two possible formats for this intermediate step:
LLVM bitstream and JSON/YAML.

LLVM bitstream format is space-efficient, and Clang already has support for
writing it. It has the benefit of being similar to existing Clang
functionality, as Clang serializes its AST into the bitstream format (for
example, for precompiled headers and modules), and LLVM bitcode uses the
same container format. Using this format would
allow the tool to emit the representation with minimal manipulation or
additional parsing.

Alternatively, JSON/YAML, while less space-efficient than bitstream, are
human-readable and widely extensible. Neither has built-in schema or
namespacing support, so if the tool needed rules of that sort, it would need
to define them itself on the frontend and require that the backend modules
know them. While this would require a bit more parsing to emit on the
frontend and load on the backend, the representation would be able to stand
separately from the tool, and the backend modules would not necessarily
need an understanding of the LLVM bitstream to load it.
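
To give a feel for the JSON option, here is a small C++ sketch that writes one record as a JSON object using only the standard library. The FunctionRecord struct and the field names "name", "returnType", "params", and "comment" are assumptions chosen for illustration, not a proposed schema, and the escaping covers only the most common cases; a real tool would more likely rely on an existing serialization library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record; the real representation is not defined yet.
struct FunctionRecord {
  std::string Name;
  std::string ReturnType;
  std::vector<std::string> ParamNames;
  std::string Comment;
};

// Escapes quotes, backslashes, and a few common control characters.
// Other control characters are passed through unchanged, so this is
// not a complete JSON string encoder.
std::string escapeJson(const std::string &S) {
  std::string Out;
  for (char C : S) {
    switch (C) {
    case '"':  Out += "\\\""; break;
    case '\\': Out += "\\\\"; break;
    case '\n': Out += "\\n";  break;
    case '\r': Out += "\\r";  break;
    case '\t': Out += "\\t";  break;
    default:   Out += C;      break;
    }
  }
  return Out;
}

// Serializes one record as a compact JSON object with a fixed key order.
std::string toJson(const FunctionRecord &F) {
  std::string Out = "{\"name\":\"" + escapeJson(F.Name) + "\",";
  Out += "\"returnType\":\"" + escapeJson(F.ReturnType) + "\",";
  Out += "\"params\":[";
  for (std::size_t I = 0; I < F.ParamNames.size(); ++I) {
    if (I != 0)
      Out += ",";
    Out += "\"" + escapeJson(F.ParamNames[I]) + "\"";
  }
  Out += "],\"comment\":\"" + escapeJson(F.Comment) + "\"}";
  return Out;
}
```

The sketch also shows the cost mentioned above: the key names and structure live only in this code, so every backend module would have to agree on them separately.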


In addition to generating documentation from comments, a future extension
would be to automatically generate and insert boilerplate comments into the
code on demand. As the tool would have access to the AST, it could insert
comments into the code similar to how tools like clang-tidy and
clang-format adjust the code. Such generated comments would follow the
documentation style for comments, and so would generate basic, if not
wholly described, documentation, including information about parameters,
return types, class members, etc. For example, the following comment would be
generated for the function below:

/// Do Things
///
/// TODO: Write detailed description
///
/// \param value
/// \return int
int doThings(int value) { return value; }
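
One way such a stub could be produced is sketched below in C++. The FunctionSignature struct and both helper functions are hypothetical, and two details are assumptions made for the example rather than part of the proposal: the brief line is derived by splitting the camelCase function name, and empty `///` lines separate the paragraphs of the comment.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of a function; a real tool would read this
// information from the AST.
struct FunctionSignature {
  std::string Name;
  std::string ReturnType;
  std::vector<std::string> ParamNames;
};

// Turns a camelCase identifier into a title, e.g. "doThings" becomes
// "Do Things". This naming heuristic is an assumption of the sketch.
std::string titleFromName(const std::string &Name) {
  std::string Out;
  for (std::size_t I = 0; I < Name.size(); ++I) {
    unsigned char C = static_cast<unsigned char>(Name[I]);
    if (I == 0) {
      Out += static_cast<char>(std::toupper(C));
      continue;
    }
    if (std::isupper(C))
      Out += ' ';
    Out += Name[I];
  }
  return Out;
}

// Builds a Doxygen-style stub comment: brief, TODO, one \param line per
// parameter, and a \return line for non-void functions.
std::string makeStubComment(const FunctionSignature &F) {
  std::string Out = "/// " + titleFromName(F.Name) + "\n";
  Out += "///\n";
  Out += "/// TODO: Write detailed description\n";
  Out += "///\n";
  for (const std::string &P : F.ParamNames)
    Out += "/// \\param " + P + "\n";
  if (F.ReturnType != "void")
    Out += "/// \\return " + F.ReturnType + "\n";
  return Out;
}
```

Inserting the generated text at the right source location would be a separate step, presumably done with the same source-rewriting machinery that other Clang tools use.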

In addition, the parsing tool could be expanded to also parse
Markdown-style comments, using the Apple Swift Markup style as a reference.

Please let us know if you have comments or concerns about this proposal.

