[cfe-dev] Adding more HTML-related facilities in Doxygen comment parsing

Mon Apr 28 11:35:53 PDT 2014

Hi Reid,

On Mon, Apr 28, 2014 at 7:27 PM, Reid Kleckner <rnk at google.com> wrote:
> What applications does this HTML5 validation enable?  I've tried to skim
> this thread to find the big picture, but I can't find it.

Clients that use parsed comments can always use the parsed markup -- it
is always safe to render.  But Doxygen includes HTML as an indivisible
part of its syntax.  If clients that render parsed comments (in an IDE,
as HTML, as PDF, etc.) would like to use the markup represented as HTML,
they must either trust the comments or sanitize the HTML first.
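
As a rough illustration of what "sanitize HTML first" could mean for a
client, here is a minimal C sketch of a tag allowlist.  The function name
`is_allowed_html_tag` and the list of tags are hypothetical, chosen only
for this example; they are not part of Clang or Doxygen, and a real
sanitizer would also have to deal with attributes and nesting.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative allowlist (an assumption for this sketch); which tags are
   acceptable is a policy decision for each client. */
static const char *const kAllowedTags[] = {
    "b", "i", "em", "strong", "code", "p", "br", "ul", "ol", "li"};

/* Returns true if `name` (assumed to be already lowercased) is a tag the
   client is willing to render; anything else should be escaped or dropped. */
bool is_allowed_html_tag(const char *name) {
  const size_t count = sizeof(kAllowedTags) / sizeof(kAllowedTags[0]);
  for (size_t i = 0; i < count; ++i) {
    if (strcmp(name, kAllowedTags[i]) == 0)
      return true;
  }
  return false;
}
```

With this sketch a renderer would keep `<em>` but refuse `<script>`.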

> Why does Clang need to validate the HTML, rather than simply associating
> comments with Decls and handing them over to a client who knows the details
> of Doxygen and HTML?

Clang needs to parse Doxygen in order to give useful warnings (most
notably \param not matching any actual parameter in the function, but
there are lots of others).  Since Clang needs to understand Doxygen
that much, it makes sense for Clang to parse all of it and represent
the parsing results in a cooked representation that is easily consumable
by external clients, so that those clients don't have to concern
themselves with parsing, only with further processing and/or
rendering.  We have two such intermediate representations -- a comment
AST, accessible through the C++ APIs and (a bit less so) through the
libclang APIs, and an XML representation with a well-defined schema that
is extended in a backwards-compatible way.
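
To make the \param case concrete, here is a small C example written for
this message (the function `sum_ints` is made up).  The doc comment names
a parameter `cnt`, but the declaration calls it `count`.  Compiling with
Clang's -Wdocumentation flag should produce a diagnostic along the lines
of "parameter 'cnt' not found in the function declaration".

```c
#include <stddef.h>

/* The \param name below is deliberately stale: it says `cnt`, while the
   function parameter is called `count`. */

/**
 * Sums the first \p count elements of \p values.
 *
 * \param values Array to read from.
 * \param cnt    Number of elements to add up.
 * \returns The sum of the elements.
 */
int sum_ints(const int *values, size_t count) {
  int total = 0;
  for (size_t i = 0; i < count; ++i)
    total += values[i];
  return total;
}
```

The code itself compiles and runs correctly; only the documentation is
wrong, which is why this kind of mistake needs a comment-aware check.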

With the C++ and libclang APIs, Clang also allows one to get the raw,
unparsed comment for a declaration and parse it using any other
parsing algorithm, or even treat it as a non-comment (e.g., as a pragma
to guide static analysis).
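
A sketch of the "treat it as a non-comment" idea, in plain C.  In a real
tool the raw text would come from libclang (for example via
clang_Cursor_getRawCommentText); here the client just receives it as a
string so that the example stays self-contained.  The "analysis:" marker
and the function name `raw_comment_has_directive` are hypothetical
conventions invented for this example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical convention: a raw comment containing "analysis: <directive>"
   is treated as a pragma for a static-analysis tool, not as documentation. */
bool raw_comment_has_directive(const char *raw_comment, const char *directive) {
  static const char marker[] = "analysis:";
  const char *p = strstr(raw_comment, marker);
  if (p == NULL)
    return false;

  /* Skip the marker and any blanks that follow it. */
  p += sizeof(marker) - 1;
  while (*p == ' ' || *p == '\t')
    ++p;

  size_t n = strlen(directive);
  if (strncmp(p, directive, n) != 0)
    return false;

  /* Require a word boundary so "nonnull" does not match "nonnullable". */
  char end = p[n];
  return end == '\0' || end == ' ' || end == '\t' || end == '\n' ||
         end == '\r' || end == '*';
}
```

Only the first "analysis:" marker in the comment is examined; a client
with richer needs would write a proper parser for its own syntax.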

Dmitri

-- 
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/