[cfe-dev] Adding more HTML-related facilities in Doxygen comment parsing

Dmitri Gribenko gribozavr at gmail.com
Mon Apr 28 08:38:08 PDT 2014


On Mon, Apr 28, 2014 at 4:25 PM, Alp Toker <alp at nuanti.com> wrote:
>
> On 28/04/2014 15:14, Dmitri Gribenko wrote:
>>
>> On Mon, Apr 28, 2014 at 2:56 PM, Alp Toker <alp at nuanti.com> wrote:
>>>
>>> ~20,000 LoC implementing XML schemas, HTML, JavaScript validators ... are
>>> all so intertwined it's difficult to cut things down to provide the basic
>>> comment callbacks and diagnostics users would benefit from.
>>
>> Alp, the way you have been putting this discussion is
>> non-constructive.  You are trying to reuse Clang's comment parsing for
>> some other purpose, as yet unknown.  It seems that it is hard for you to
>> factor the code (because it is tied to Clang's ASTs, for the purpose of
>> providing diagnostics), but you start blaming the code and finding
>> deficiencies when there are none.
>
>
> You've convinced yourself that the existing code has no deficiencies,
> therefore any suggestions must be motivated by some unknown ulterior
> purpose.
>
> That's not the case and I can tell you there's no conspiracy :-)
>
> I do however make an incisive observation that's perhaps not easy to hear.
> Users are looking for two things:
>
> 1) Fast -Wdocumentation that basically just ensures that \param and \return
> match what's in the declarator, and perhaps at most extracts the \brief. Why
> not use the Regex class and do this in 100 lines of code in ParseDecl.h? If
> the regex fails, no big deal.

Why not re-implement C++ parsing with a regex, then?  This sounds
like the same kind of argument to me.  Will your regex run in linear
time like the current parser does?  You cannot extract \brief, by the
way, without skipping HTML tags, unescaping Doxygen escape sequences,
and probably handling tons of other quirks that I cannot remember on
the spot.
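
To make the \brief point concrete, here is a minimal sketch.  It is not
Clang's comment parser; the function names naiveBrief and stripMarkup are
hypothetical, and the escape list is only a subset of what Doxygen
accepts.  It shows that a one-line regex returns the raw markup, so a
second pass over the text is still needed before the result is usable.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Naive approach: take everything after "\brief" up to the end of the line.
// Returns an empty string when no \brief command is found.
std::string naiveBrief(const std::string &comment) {
  static const std::regex briefRe(R"(\\brief[ \t]+([^\n]*))");
  std::smatch m;
  if (!std::regex_search(comment, m, briefRe))
    return std::string();
  return m[1].str();
}

// Hypothetical clean-up pass: drop anything that looks like an HTML tag and
// unescape a few Doxygen escapes (assumed subset: \\ \@ \& \$ \# \< \> \% \").
std::string stripMarkup(const std::string &text) {
  static const std::string escapable = "\\@&$#<>%\"";
  std::string out;
  for (std::size_t i = 0; i < text.size(); ++i) {
    char c = text[i];
    if (c == '\\' && i + 1 < text.size() &&
        escapable.find(text[i + 1]) != std::string::npos) {
      out += text[++i];            // keep the escaped character only
    } else if (c == '<') {
      std::size_t close = text.find('>', i);
      if (close == std::string::npos)
        out += c;                  // unterminated '<': keep it as text
      else
        i = close;                 // skip the whole tag
    } else {
      out += c;
    }
  }
  return out;
}
```

For a comment such as "\brief Checks <em>strictly</em> that a \< b." the
regex alone yields the text with the <em> tags and the \< escape still in
it; only after the second pass does it read "Checks strictly that a < b."
Even this sketch ignores multi-line brief paragraphs and other commands
embedded in the text.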

> 2) Efficient and flexible callbacks to support documentation tooling and
> IDEs with interactive. We're totally failing to expose useful callbacks at
> present. That's because it's all getting hard-coded into a massive monolithic
> world view of how and when docs should be consumed that nobody is really
> using.

I want to reassure you that this is used.

> If we sit down and instead decide which ASTConsumer interface to put
> this into that's perhaps no more than 50 lines of code.

Please explain how a callback-based interface would be better than an
AST interface.

> There is instead this "comment AST" which doesn't serve any mainstream use
> case but appears to be part of a Doxygen-like tool you're building.

> We cannot even enable it on our own build bots on llvm.org.

-Wdocumentation is enabled on all my clang_fast buildbots, lldb and
lld buildbots.

> Despite the prescribed terminology these documents aren't really an
> abstract syntax tree,

They are an abstract syntax tree.

> nor are they even part of clang's AST. There are curiously named Lex,
> Parse and Sema classes that mimic clang, and even a set of RAV templates,
> presumably to visit documents that rarely have a depth greater than one
> level?

I don't see a problem with this.  Would you rather see one monolithic
class where all processing, starting from unescaping Doxygen escapes
to matching \param to declarations, is jammed together?

> This application should be split out into an external plugin or tool while
> we look for a quick solution to (1) and (2). As an external tool it will
> help validate clang's programming interfaces.
>
> There is no criticism here, just calling what I see. Could we get back to
> discussing how to split this out in an orderly manner?

Again you are talking about splitting this, and I have yet to see how
Clang would benefit from this.

Dmitri

-- 
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/


