[cfe-dev] [GSoC] Doxygen documentation with clang

Tue Mar 11 06:22:48 PDT 2014

On Tue, Mar 11, 2014 at 12:36 PM, Philipp Moeller <bootsarehax at gmail.com> wrote:
> Dmitri Gribenko <gribozavr at gmail.com>
> writes:
>
>> On Mon, Mar 10, 2014 at 5:59 PM, Philipp Moeller <bootsarehax at gmail.com> wrote:
>>> For a good application I would like to define a certain set of
>>> milestones we want to achieve. If you have anything specific in mind,
>>> please let me know.
>>
>> Based on the discussion so far, I think this could be used as a draft plan:
>>
>> - attaching comments to macros;
>> - parsing the reference syntax (recognising that the text from here to
>> there is a possible reference, which we will need to resolve).
>> Implementing Comment AST representation for unresolved references.
>> Designing and implementing the XML representation for unresolved
>> references.
>> - resolving links to decls within the TU.  The result should probably
>> be a Decl* or a USR.  The USR should be available in the XML;
>> - defining a schema for a DB to store information about possible link
>> targets (declarations and macros);
>> - populating DB with information from TUs in the project;
>> - resolving links to decls cross-TU using the DB.  The result should
>> be a USR, and maybe the source file name + source location.
>>
>> Does this sound reasonable?  What do you think?
>
> The first three stages seem very self-contained and we can probably add
> them independently.

Maybe only the first two stages?  In order to resolve links, we should
have parsed something first...
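
To make the dependency concrete, here is a rough sketch of stages two and
three. It is in Python only for brevity; the real work would be in Clang's
comment parser. Everything here is an assumption for illustration: the
regex only recognises explicit \ref / @ref commands, UnresolvedRef,
parse_refs and resolve_in_tu are hypothetical names, and the USR strings
are made up.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: only explicit \ref / @ref commands count as possible references.
REF_RE = re.compile(r"[\\@]ref\s+([A-Za-z_][A-Za-z0-9_:]*)")


@dataclass
class UnresolvedRef:
    """Hypothetical comment AST node for a possible reference."""
    text: str                   # spelling of the referenced name
    begin: int                  # offset of the name's first character
    end: int                    # one past the name's last character
    usr: Optional[str] = None   # set by resolution; None while unresolved


def parse_refs(comment: str) -> List[UnresolvedRef]:
    """Stage two: mark every span that may be a reference."""
    return [UnresolvedRef(m.group(1), m.start(1), m.end(1))
            for m in REF_RE.finditer(comment)]


def resolve_in_tu(refs: List[UnresolvedRef],
                  tu_decls: Dict[str, str]) -> List[UnresolvedRef]:
    """Stage three: look each name up among the decls of the current TU.

    tu_decls maps a declaration name to its USR. Names that are not found
    stay unresolved, which leaves them for a later cross-TU lookup.
    """
    for ref in refs:
        ref.usr = tu_decls.get(ref.text)
    return refs
```

Resolution only consumes what parsing produced, so stage three cannot land
before stage two, while stage two is useful on its own (for example, for
emitting unresolved references into the XML).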

>> This already looks like a lot of work, so I am not sure if actually
>> writing a tool that is going to produce HTML or LaTeX is going to fit
>> in...  Maybe only a skeleton of such a tool.
>
> I agree. The proposal I'll upload talks about a very basic HTML and
> possibly a LaTeX generator to outline how the functionality can be used
> to build a more general purpose tool.

Sounds good.

>> It just involves a completely different code path, through
>> Preprocessor.  I don't expect implementing it to be too hard, but
>> probably not trivial either, and probably involving a lot of plumbing
>> through everywhere.
>
> Sounds like a perfect first task to tackle.

I agree.  How much experience do you have with the Clang codebase?

>> Sorry, I did not explain clearly.  Just to clear any possible misunderstandings:
>> - XML format is only for comments, not C, C++, Objective-C ASTs.
>> - XML format is not reversible to comment ASTs.
>>
>> Currently clients already have a TranslationUnit when they query it
>> for the XML representation of the comment.  XML is optimised for the
>> IDE use case, where the XML will be rendered into some rich text view
>> in the IDE.  If the client needs extra information, it can query
>> it with very little overhead, because the TranslationUnit is already
>> in memory, and all the parsing and semantic analysis work was done.
>>
>> OTOH, if we decide on a more offline approach, where comments in
>> XML format are stored after the TranslationUnit is destroyed, then we
>> either need to store more indexing info out-of-band, or add optional
>> pieces to the XML with that information.
>
> Thanks, I was under the impression that the XML should represent at
> least a subset of the AST and that the XML should be the sole input to a
> documentation generator. This would obviously require it to contain much
> more information.

I think an indexing DB would be a more promising approach.  If we
stuff everything into XML, then certainly, that information can be
used by the documentation tool, but probably not by other tools.
OTOH, if we have a DB with a more-or-less extensible schema, we can
eventually build other tools to assist editors (go to definition, find
usages, etc.).
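
As a starting point for discussion, a minimal sketch of what such a DB
could look like, using Python's sqlite3 module only for brevity. The table
and column names, the helper functions and the USR strings are all
assumptions for illustration, not a proposed final schema. The SQL is kept
to plain CREATE/DELETE/INSERT/SELECT so that it stays portable.

```python
import sqlite3

# Hypothetical schema: one row per possible link target (decl or macro).
SCHEMA = """
CREATE TABLE IF NOT EXISTS link_target (
    usr       TEXT PRIMARY KEY,
    name      TEXT NOT NULL,
    kind      TEXT NOT NULL,
    file      TEXT NOT NULL,
    line      INTEGER NOT NULL,
    column_no INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS link_target_name ON link_target (name);
"""


def open_db(path=":memory:"):
    """Open (or create) the index DB and make sure the schema exists."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_targets(db, targets):
    """Store link targets found while indexing one TU.

    Each target is (usr, name, kind, file, line, column_no). Rows with the
    same USR are deleted first, so reindexing a TU updates its entries.
    """
    targets = list(targets)
    with db:  # one transaction per TU
        db.executemany("DELETE FROM link_target WHERE usr = ?",
                       [(t[0],) for t in targets])
        db.executemany(
            "INSERT INTO link_target "
            "(usr, name, kind, file, line, column_no) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            targets)


def resolve(db, name):
    """Cross-TU lookup: return (usr, file, line, column_no) candidates."""
    return db.execute(
        "SELECT usr, file, line, column_no FROM link_target "
        "WHERE name = ? ORDER BY usr",
        (name,)).fetchall()
```

New tools could then add their own tables (for example, one for usages)
next to link_target without touching the documentation tool's queries.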

>>>>> 3.3.3 Database + Web-server
>>>>> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
>>>>>
>>>>> A special case for HTML. Provide a database and a web-frontend that
>>>>> can be hosted. Seems interesting for fast search functions and live
>>>>> documentation updates. clang-server where are you?
>>>>
>>>> This looks like a very promising approach that does not just provide
>>>> the same functionality as Doxygen does, but introduces new value.
>>>> This can actually become the foundation for the clang-server itself!
>>>> The basic functionality for live updates -- tracking dependencies
>>>> between source files, indexing and reindexing will be useful for both
>>>> documentation server and clang-server.
>>>
>>> The main question here seems to be how to represent the persistent AST:
>>>
>>> - relational DB
>>> - NoSQL
>>> - graph DB
>>>
>>> all seem like they could work and I don't have a clear idea how any
>>> of them is going to perform.
>>
>> Do you have any previous experience with databases, or a particular
>> preference?  I guess that if we use a portable subset of sqlite, then
>> the tool would be able to run on a wide variety of systems, make it
>> extremely easy to set up the tool, and leave a possibility of using a
>> more heavyweight database in future if needed.
>
> Most of my database work has been with sqlite and it seems the most
> portable of all the options and is also the least hassle for users.
>
> I'll allocate some time in the schema design phase of the database to
> research some alternatives more closely.

Looking forward to future collaboration.

Dmitri

-- 
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/