[cfe-dev] RFC: clang-doc proposal

Gábor Horváth via cfe-dev cfe-dev at lists.llvm.org
Mon Mar 26 11:40:01 PDT 2018


Hi Julie,

I saw that the first bits already landed. Congratulations, great work! :)
I have a few questions.
Do you plan to include the backends in the clang (tools) repository?
I think it would be great to have at least one reference backend next to
the frontend.
Building an AST can be quite expensive. Do you plan to support generating
documentation as a by-product of building the code? Similar to how indexing
while building was proposed by Apple.
Do you plan for the intermediate representation to be self-contained, or
will the backends need access to the original files (available at the
original paths)?

Regards,
Gábor

On 4 December 2017 at 21:21, Julie Hockett via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> This proposal is to build a new Clang tool for generating C/C++
> documentation, similar to other modern documentation generators such as
> rustdoc.  This tool would be a modular and extensible documentation
> generator for C/C++ code. It also would introduce a more elegant way of
> documenting C/C++ code, as well as maintaining backwards-compatibility with
> Doxygen-style markup to generate documentation.
>
> Today, Doxygen is a de-facto standard for generating C/C++ documentation.
> While widely used, the tool itself is a bit cumbersome, its output is both
> aesthetically and functionally lacking, and the non-permissive license
> combined with an outdated codebase makes any improvements difficult. This
> new tool would aim to reduce the overhead of generating documentation,
> integrating it into a Clang tool as well as allowing existing comments to
> continue to be used. It would also allow for relatively easy adaptation to
> new language features, as it would be built on the Clang parser and would
> use the Clang AST to generate documentation.
>
> Proposed Tool
>
> The proposed tool would consist of two parts. First, it would have a
> frontend that consumes the Clang-generated AST and generates an
> intermediate representation of the code and documentation structure,
> including additional Markdown files. Second, it would have a set of backend
> modules that consume that representation and output documentation in a
> particular format (e.g. Markdown, HTML/website, etc.).
>
> The frontend would be a new tool that uses the Clang parser, which can
> already parse C/C++ documentation comments (using the -Wdocumentation
> option). It can be easily used through the LibTooling interface,
> similarly to other Clang tools such as clang-check or clang-format. The
> initial steps in this project would be to build this tool using Clang's
> documentation parser. This tool would be able to attach comments to
> functions, types, and macros and to resolve declaration references, both of
> which will be useful in generating effective documentation. Since a good
> deal of existing C/C++ code uses the Doxygen documentation comment style,
> which is also supported by Clang's parser (and Doxygen itself can use Clang
> to parse these comments), this is the syntax we are going to support as
> well. In the future, we would also like to support Markdown-style comments,
> akin to Apple Swift Markup.
>
> For implementation, this tool will use the JSON Compilation Database format
> to integrate with existing build systems. It would also have subcommands to
> choose which parts of the code will be documented (e.g. all code, all
> public signatures, all comment-documented signatures). Once the code is
> processed, the tool will write out its internal representation of the
> documentation in an intermediate format, encapsulating the
> necessary information about the code, comments, and structure. This will
> allow backend tools to take the output and transform it as necessary.
>
> The backend modules would cover different possible outputs for the defined
> intermediate representation. Each module will consume the representation
> and output documentation in a specific format. Initially, we propose to
> focus on a module that generates Markdown files, in order to make the first
> version as simple as possible. Markdown files are automatically rendered on
> a number of sites and systems, as well as being clear and uncluttered in
> raw text form. It is also relatively easy to convert Markdown files into
> other formats, making it a good starting target. An additional module would
> target HTML/website output.
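
A minimal sketch of the backend-module split described above, to make
the shape concrete. Everything here is hypothetical: FunctionDoc,
Backend, MarkdownBackend and makeBackend are illustrative names that do
not come from the proposal or from any existing Clang API, and the
Markdown layout is just one plausible choice.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one entry of the intermediate
// representation; the proposal leaves the real format open.
struct FunctionDoc {
  std::string Name;
  std::string Signature;
  std::string Brief;
};

// Hypothetical backend interface: one subclass per output format.
class Backend {
public:
  virtual ~Backend() = default;
  virtual std::string render(const std::vector<FunctionDoc> &Docs) const = 0;
};

// Emits one Markdown section per documented function.
class MarkdownBackend : public Backend {
public:
  std::string render(const std::vector<FunctionDoc> &Docs) const override {
    std::string Out;
    for (const FunctionDoc &D : Docs) {
      assert(!D.Name.empty() && "every documented entity needs a name");
      Out += "## " + D.Name + "\n\n";
      Out += "`" + D.Signature + "`\n\n";
      Out += D.Brief + "\n\n";
    }
    return Out;
  }
};

// Picks a backend by format name; returns nullptr for unknown formats.
std::unique_ptr<Backend> makeBackend(const std::string &Format) {
  if (Format == "md")
    return std::unique_ptr<Backend>(new MarkdownBackend());
  return nullptr;
}
```

With this shape, adding an HTML module would only mean adding another
Backend subclass and one more branch in makeBackend; the frontend and
the representation stay untouched.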
>
> Intermediate Format
>
> The frontend would process the code and comments into an output, to be
> consumed by the backend. Internally, this representation would be held
> as a set of classes and structs. Once the frontend has
> finished, it would write this representation to a file. While existing
> tools like Doxygen emit XML, XML is somewhat restrictive and bulky. Also,
> in order to fully use XML, the tool would need to define the representation
> twice (once for the internal classes/structs, once in the XML schema). So,
> we are instead considering two possible formats for this intermediate step:
> LLVM bitstream and JSON/YAML.
>
> LLVM bitstream format is space-efficient, and is natively written out by
> the Clang parser. It has the benefit of being similar to existing clang
> functionality, as the compiler frontend writes out its AST into the
> bitstream format to pass along to the LLVM backend. Using this format would
> allow the tool to emit the representation with minimal manipulation or
> additional parsing.
>
> Alternatively, JSON/YAML, while less space-efficient than bitstream, are
> human-readable and widely extensible. Neither has formal grammar or
> namespacing support, so if the tool needed rules of the sort it would need
> to define them itself on the frontend and require that the backend modules
> know them. While this would require a bit more parsing to emit on the
> frontend and load on the backend, the representation would be able to stand
> separately from the tool, and the backend modules would not necessarily
> need an understanding of the LLVM bitstream to load it.
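
To illustrate the JSON option, here is a minimal sketch of a frontend
struct being written out by hand. The FunctionInfo and ParamInfo types,
the field names in the output, and the toJson helper are all assumptions
made for this example; the proposal does not define a schema, and a real
tool would more likely use an existing serialization library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory representation of one documented function.
struct ParamInfo {
  std::string Type;
  std::string Name;
};

struct FunctionInfo {
  std::string Name;
  std::string ReturnType;
  std::vector<ParamInfo> Params;
  std::string Comment;
};

// Escapes only quotes, backslashes and newlines; other control
// characters are passed through unchanged in this sketch.
std::string escapeJson(const std::string &S) {
  std::string Out;
  for (char C : S) {
    switch (C) {
    case '"':  Out += "\\\""; break;
    case '\\': Out += "\\\\"; break;
    case '\n': Out += "\\n";  break;
    default:   Out += C;      break;
    }
  }
  return Out;
}

// Serializes one function entry as a compact JSON object.
std::string toJson(const FunctionInfo &F) {
  assert(!F.Name.empty() && "a function entry needs a name");
  std::string Out = "{\"name\":\"" + escapeJson(F.Name) + "\",";
  Out += "\"return\":\"" + escapeJson(F.ReturnType) + "\",";
  Out += "\"params\":[";
  for (std::size_t I = 0; I < F.Params.size(); ++I) {
    if (I != 0)
      Out += ",";
    Out += "{\"type\":\"" + escapeJson(F.Params[I].Type) + "\",";
    Out += "\"name\":\"" + escapeJson(F.Params[I].Name) + "\"}";
  }
  Out += "],\"comment\":\"" + escapeJson(F.Comment) + "\"}";
  return Out;
}
```

Note that the field names appear twice, once in the struct and once in
the emitter. That is the same duplication the proposal attributes to
XML, so the JSON/YAML route avoids a separate schema file but not the
need to keep the emitter and the backends in agreement.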
>
> Extensions
>
> In addition to generating documentation from comments, a future extension
> would be to automatically generate and insert boilerplate comments into the
> code on demand. As the tool would have access to the AST, it could insert
> comments into the code similar to how tools like clang-tidy and
> clang-format adjust the code. Such generated comments would follow the
> documentation style for comments, and so would generate basic, if not
> wholly described, documentation, including information about parameters,
> return types, class members, etc. For example, the following would be
> generated for the below function:
>
> /// Do Things
> ///
> /// TODO: Write detailed description
> ///
> /// \param value
> /// \return int
> int doThings(int value) { return value; }
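
A minimal sketch of how such a boilerplate comment could be assembled
once the function name, return type and parameter names are known. The
helpers titleFromIdentifier and makeBoilerplate are hypothetical, and
deriving the title by splitting the camelCase identifier is an
assumption read off the "Do Things" example, not something the proposal
states. A real implementation would pull these pieces from the Clang
AST rather than take them as strings.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Turns a camelCase identifier such as "doThings" into "Do Things".
std::string titleFromIdentifier(const std::string &Id) {
  assert(!Id.empty() && "identifier must not be empty");
  std::string Out;
  for (std::size_t I = 0; I < Id.size(); ++I) {
    unsigned char C = static_cast<unsigned char>(Id[I]);
    if (I == 0) {
      Out += static_cast<char>(std::toupper(C));
      continue;
    }
    if (std::isupper(C))
      Out += ' '; // start a new word at each capital letter
    Out += static_cast<char>(C);
  }
  return Out;
}

// Builds a Doxygen-style comment skeleton for one function.
std::string makeBoilerplate(const std::string &Name,
                            const std::string &ReturnType,
                            const std::vector<std::string> &ParamNames) {
  std::string Out = "/// " + titleFromIdentifier(Name) + "\n";
  Out += "///\n/// TODO: Write detailed description\n///\n";
  for (const std::string &P : ParamNames)
    Out += "/// \\param " + P + "\n";
  if (ReturnType != "void") // nothing to document for void functions
    Out += "/// \\return " + ReturnType + "\n";
  return Out;
}
```

Calling makeBoilerplate("doThings", "int", {"value"}) yields the comment
lines shown in the example above, without the function itself.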
>
> In addition, the parsing tool could be expanded to also parse
> Markdown-style comments, using the Apple Swift Markup style as a reference.
>
>
> Please let us know if you have comments or concerns about this proposal.
>
> Thanks!
>
> Julie