[cfe-dev] Adding more HTML-related facilities in Doxygen comment parsing

Mon Apr 28 08:25:23 PDT 2014

On 28/04/2014 15:14, Dmitri Gribenko wrote:
> On Mon, Apr 28, 2014 at 2:56 PM, Alp Toker <alp at nuanti.com> wrote:
>> ~20,000 LoC implementing XML schemas, HTML, JavaScript validators .. are all
>> so intertwined it's difficult to cut things down to provide the basic
>> comment callbacks and diagnostics users would benefit from.
> Alp, the way you have been putting this discussion is
> non-constructive.  You are trying to reuse Clang's comment parsing for
> some other purpose, yet unknown.  It seems that it is hard for you to
> factor the code (because it is tied to Clang's ASTs, for the purpose of
> providing diagnostics), but you start blaming the code and finding
> deficiencies when there are none.

You've convinced yourself that the existing code has no deficiencies, 
therefore any suggestions must be motivated by some unknown ulterior 
purpose.

That's not the case and I can tell you there's no conspiracy :-)

I do however make an incisive observation that's perhaps not easy to 
hear. Users are looking for two things:

1) Fast -Wdocumentation that basically just ensures that \param and 
\return match what's in the declarator, and perhaps at most extracts the 
\brief. Why not use the Regex class and do this in 100 lines of code in 
ParseDecl.h? If the regex fails, no big deal.

2) Efficient and flexible callbacks to support documentation tooling and
interactive IDEs. We're totally failing to expose useful callbacks
at present. That's because it's all getting hard-coded into a massive
monolithic world view of how and when docs should be consumed that nobody
is really using. If we sit down and instead decide which ASTConsumer
interface to put this into, that's perhaps no more than 50 lines of code.
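
To make (1) concrete, here is a rough sketch of the kind of check I mean.
It is only an illustration: it uses std::regex as a stand-in for the Regex
class so that it builds on its own, and the function name undeclaredParams
is hypothetical, not anything that exists in clang today.

```cpp
#include <regex>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not a Clang API: returns the names documented with
// \param or @param in `comment` that are not in the `declared` list.
std::vector<std::string>
undeclaredParams(const std::string &comment,
                 const std::vector<std::string> &declared) {
  // Matches "\param name" or "@param name", optionally with a direction
  // such as [in], [out] or [in,out] between the command and the name.
  static const std::regex paramRe(
      R"([\\@]param(?:\s*\[[a-z, ]+\])?\s+([A-Za-z_][A-Za-z0-9_]*))");

  const std::set<std::string> known(declared.begin(), declared.end());
  std::vector<std::string> unknown;
  for (std::sregex_iterator it(comment.begin(), comment.end(), paramRe), end;
       it != end; ++it) {
    const std::string name = (*it)[1].str();
    if (!known.count(name))
      unknown.push_back(name); // documented, but absent from the declarator
  }
  return unknown;
}
```

If the pattern matches nothing, the result is simply empty and no warning
is produced, which is the "no big deal" failure mode.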

We're failing miserably at both of these right now, and both are things 
a compiler should actually provide.
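
For (2), the callback surface could be about this small. Again, a sketch
under assumptions: DocCommentCallbacks, DocumentedDecl and
visitDocComments are made-up names for illustration, and the plain structs
stand in for whatever declaration and comment types the real interface
would pass.

```cpp
#include <string>
#include <vector>

// Hypothetical interface, not an existing Clang class.
struct DocCommentCallbacks {
  virtual ~DocCommentCallbacks() = default;
  // Invoked once per documented declaration with the raw comment text.
  virtual void handleDocComment(const std::string &declName,
                                const std::string &rawText) = 0;
};

// Stand-in for a declaration and the comment attached to it.
struct DocumentedDecl {
  std::string name;
  std::string comment; // empty when the declaration is undocumented
};

// Producer side: forwards every non-empty comment to the callbacks.
void visitDocComments(const std::vector<DocumentedDecl> &decls,
                      DocCommentCallbacks &callbacks) {
  for (const DocumentedDecl &d : decls)
    if (!d.comment.empty())
      callbacks.handleDocComment(d.name, d.comment);
}

// Example consumer: records which declarations carry documentation.
struct NameCollector : DocCommentCallbacks {
  std::vector<std::string> names;
  void handleDocComment(const std::string &declName,
                        const std::string &) override {
    names.push_back(declName);
  }
};
```

A documentation tool or IDE would implement the callbacks and decide for
itself how much parsing of the raw text it wants to do.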

There is instead this "comment AST" which doesn't serve any mainstream 
use case but appears to be part of a Doxygen-like tool you're building. 
We cannot even enable it on our own build bots on llvm.org. Despite the 
prescribed terminology these documents aren't really an abstract syntax 
tree, nor are they even part of clang's AST. There are curiously named 
Lex, Parse and Sema classes that mimic clang, and even a set of RAV 
templates, presumably to visit documents that rarely have a depth 
greater than one level?

This application should be split out into an external plugin or tool 
while we look for a quick solution to (1) and (2). As an external tool 
it will help validate clang's programming interfaces.

There is no criticism here, I'm just calling it as I see it. Could we get
back to discussing how to split this out in an orderly manner?

Cheers,
Alp.

-- 
http://www.nuanti.com
the browser experts